Group: pgsql.docs


Subject: Example non-Latin words for text search parser docs?
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 10/21/2007 4:29:53 PM
I'm afraid my English-centricity is showing, but I could use a little
help filling in the missing examples in the table here:

http://developer.postgresql.org/pgdocs/postgres/textsearch-parsers.html

I'm not sure of a suitable example all-non-ASCII-letters word, and even
less sure of how to represent it in SGML. (I remember we had quite a bit
of trouble dealing with accented letters in people's names, for instance.)

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not match

Subject: Example non-Latin words for text search parser docs?
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 10/21/2007 6:56:31 PM
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> and even less sure of how to represent it in SGML. (I remember we had
>> quite a bit of trouble dealing with accented letters in people's
>> names, for instance.)
> Yeah, that will prove difficult.

This problem largely goes away if we redefine the word categories as
under discussion in the -hackers thread: with any of the proposed
alternatives it'd be pretty easy to make up real words that are easily
representable in SGML.

regards, tom lane

Subject: Example non-Latin words for text search parser docs?
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 10/23/2007 5:27:25 PM
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> I'm afraid my English-centricity is showing, but I could use a little
>> help filling in the missing examples in the table here:
>> http://developer.postgresql.org/pgdocs/postgres/textsearch-parsers.html
>> I'm not sure of a suitable example all-non-ASCII-letters word,
> It's easy to find an example -- I went to the english Wikipedia,
> searched for "elephant", then clicked on the russian link at the left.
> It gives you "Слоновые", which I see on my terminal as a series of black
> squares :-) so there's not a single latin letter in it.

Given the just-applied changes in the definition of a "word", we no
longer need a totally-not-ASCII sample word. But I wonder if anyone
has a better idea than the føø that I made up on the spot...

regards, tom lane

Subject: Example non-Latin words for text search parser docs?
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 10/23/2007 6:29:49 PM
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Actually I was wondering if we should use actual words. So instead of
> "foo" we could use "elephant" for asciiword and "Éléphant" (french) for
> word. And for the hword, "sous-espèces" (which appears on the French
> Wikipedia) would do.

Hmm ... I see a potential problem with that, which is that if someone
happened to be viewing the page on something that dropped the accents,
or even just made them too small to be easily readable, the examples
wouldn't make any sense at all.

I have no problem with "elephant" as a sample asciiword, but for the
sample non-ascii word I'd suggest something that (a) is clearly not
English and (b) as much as possible, everybody knows has an accent.
At least in large parts of the US, something like "mañana" would work
nicely.

Anyway, feel free to hack on it --- I'm getting a bit weary of looking
at that chapter.

regards, tom lane

Subject: Example non-Latin words for text search parser docs?
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 10/25/2007 9:58:38 AM
Alvaro Herrera <alvherre@commandprompt.com> writes:
> The hword_asciipart I'm not 100% sure about. I used this:
> militar in the context político-militar, or postgresql in the
> context postgresql-beta1

Hmm ... I went and looked at the page on developer.postgresql.org,
and it's just as I feared: with slightly bleary morning eyes, the
accents over the i's are not obvious, and so you have to look *real*
close before you get the point of the examples. It doesn't help that
'politico' with no accent is exactly how the phrase would be spelled
in English, and so it's easy to not see the accent because you're not
expecting one.

The other examples seem alright, but I think that one's a bad choice.

regards, tom lane

Subject: Example non-Latin words for text search parser docs?
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 10/25/2007 11:24:38 AM
Alvaro Herrera <alvherre@commandprompt.com> writes:
> How about "lógico-matemática"?

Works for me.

regards, tom lane
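
For readers following along, here is a rough Python sketch of the two questions raised in this thread: which sample words (or hyphenated parts) are ASCII-only, and how an accented word might be written with numeric character references so it survives in SGML source. This is not the PostgreSQL parser and not necessarily how the docs ended up encoding the examples. The helper names (`is_ascii_word`, `sgml_char_refs`, `describe`) are made up for illustration, and numeric references are only one possible representation.

```python
def is_ascii_word(word: str) -> bool:
    """Hypothetical helper: True if the word consists only of ASCII letters."""
    return word.isascii() and word.isalpha()


def sgml_char_refs(word: str) -> str:
    """Replace each non-ASCII character with a numeric reference like &#241;."""
    return word.encode("ascii", "xmlcharrefreplace").decode("ascii")


def describe(word: str) -> dict:
    """Summarize a sample word, treating '-' as the separator of its parts."""
    parts = word.split("-")
    return {
        "word": word,
        # True only if every part is made of ASCII letters.
        "ascii_only": all(is_ascii_word(p) for p in parts),
        # The parts that are pure ASCII, e.g. "militar" in "político-militar".
        "ascii_parts": [p for p in parts if is_ascii_word(p)],
        # An ASCII-safe spelling of the whole word.
        "sgml": sgml_char_refs(word),
    }


if __name__ == "__main__":
    # Sample words taken from the thread above.
    for sample in ["elephant", "mañana", "político-militar", "lógico-matemática"]:
        print(describe(sample))
```

Under this sketch, "elephant" comes out as ASCII-only, "mañana" does not, and "político-militar" has exactly one ASCII-only part, "militar", which matches the example proposed above.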