Subject: Example non-Latin words for text search parser docs?
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 10/21/2007 4:29:53 PM
I'm afraid my English-centricity is showing, but I could use a little
help filling in the missing examples in the table here:
http://developer.postgresql.org/pgdocs/postgres/textsearch-parsers.html
I'm not sure of a suitable example all-non-ASCII-letters word, and
even less sure of how to represent it in SGML. (I remember we had
quite a bit of trouble dealing with accented letters in people's names,
for instance.)
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
Subject: Example non-Latin words for text search parser docs?
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 10/21/2007 6:56:31 PM
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> and even less sure of how to represent it in SGML. (I remember we had
>> quite a bit of trouble dealing with accented letters in people's
>> names, for instance.)
> Yeah, that will prove difficult.
This problem largely goes away if we redefine the word categories as
under discussion in the -hackers thread: with any of the proposed
alternatives it'd be pretty easy to make up real words that are easily
representable in SGML.
regards, tom lane
Subject: Example non-Latin words for text search parser docs?
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 10/23/2007 5:27:25 PM
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> I'm afraid my English-centricity is showing, but I could use a little
>> help filling in the missing examples in the table here:
>> http://developer.postgresql.org/pgdocs/postgres/textsearch-parsers.html
>> I'm not sure of a suitable example all-non-ASCII-letters word,
> It's easy to find an example -- I went to the English Wikipedia,
> searched for "elephant", then clicked on the Russian link at the left.
> It gives you "Слоновые", which I see on my terminal as a series of black
> squares :-) so there's not a single Latin letter in it.
Given the just-applied changes in the definition of a "word", we no
longer need a totally-not-ASCII sample word. But I wonder if anyone
has a better idea than the føø that I made up on the
spot...
regards, tom lane
Subject: Example non-Latin words for text search parser docs?
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 10/23/2007 6:29:49 PM
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Actually I was wondering if we should use actual words. So instead of
> "foo" we could use "elephant" for asciiword and "Éléphant" (French) for
> word. And for the hword, "sous-espèces" (which appears on the French
> Wikipedia) would do.
Hmm ... I see a potential problem with that, which is that if someone
happened to be viewing the page on something that dropped the accents,
or even just made them too small to be easily readable, the examples
wouldn't make any sense at all.
I have no problem with "elephant" as a sample asciiword, but for the
sample non-ascii word I'd suggest something that (a) is clearly not
English and (b) as much as possible, everybody knows has an accent.
At least in large parts of the US, something like "mañana" would
work nicely.
Anyway, feel free to hack on it --- I'm getting a bit weary of looking
at that chapter.
regards, tom lane
Subject: Example non-Latin words for text search parser docs?
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 10/25/2007 9:58:38 AM
Alvaro Herrera <alvherre@commandprompt.com> writes:
> The hword_asciipart I'm not 100% sure about. I used this:
> militar in the context político-militar, or postgresql in the
> context postgresql-beta1
Hmm ... I went and looked at the page on developer.postgresql.org,
and it's just as I feared: with slightly bleary morning eyes, the
accents over the i's are not obvious, and so you have to look *real*
close before you get the point of the examples. It doesn't help that
'politico' with no accent is exactly how the phrase would be spelled
in English, and so it's easy to not see the accent because you're not
expecting one. The other examples seem alright, but I think that one's
a bad choice.
regards, tom lane
Subject: Example non-Latin words for text search parser docs?
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 10/25/2007 11:24:38 AM
Alvaro Herrera <alvherre@commandprompt.com> writes:
> How about "lógico-matemática"?
Works for me.
regards, tom lane