Subject: [HACKERS] fulltext parser strange behave
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 11/19/2007 10:58:34 AM
Andrew Dunstan <andrew@dunslane.net> writes:
> Here's a patch that fixes the patterns for numeric entities, tag names,
> and removes the upper case 'X' case in the special case for an XML
> prolog. There are still some oddities, but I decided against making
> heroic efforts to fix them. It's probably less important if the patterns
> are slightly too liberal (e.g. accepting <a href="qwe<qwe>"> ) than if
> they don't recognize what they are alleged to recognize.
I don't approve of the changes to the exposed token type names, but
the state machine changes seem sane first-glance.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
Subject: [HACKERS] fulltext parser strange behave
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 11/19/2007 11:39:10 AM
Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> I don't approve of the changes to the exposed token type names, but
>> the state machine changes seem sane first-glance.
> Well, I think it's just plain wrong to describe as HTML tags and
> entities things that just aren't.
Maybe, but "HTML-type" is an unhelpful description. Isn't there a more
general markup standard that subsumes both HTML and XML? (I seem to
recall that SGML might be that, but not sure.)
regards, tom lane
Subject: [HACKERS] fulltext parser strange behave
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 11/19/2007 1:37:48 PM
Peter Eisentraut <peter_e@gmx.net> writes:
> On Monday, 19 November 2007, Tom Lane wrote:
>> Maybe, but "HTML-type" is an unhelpful description. Isn't there a more
>> general markup standard that subsumes both HTML and XML? (I seem to
>> recall that SGML might be that, but not sure.)
> I think "XML tag" would actually cover anything that would be valid as an HTML
> tag.
+1 for "XML tag", then.
regards, tom lane