Subject: Hash Index Build Patch
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 9/26/2007 4:06:28 PM
Tom Raney <twraney@comcast.net> writes:
> Alvaro Herrera wrote:
>> Just wondering, wouldn't it be enough to obtain a tuple count estimate
>> by using reltuples / relpages * RelationGetNumberOfBlocks, like the
>> planner does?
> We thought of that and the verdict is still out whether it is more
> costly to scan the entire relation to get the accurate count or use the
> estimate and hope for the best with the possibility of splits occurring
> during the build. If we use the estimate and it is completely wrong
> (with the actual tuple count being much higher) the sort will provide no
> benefit and it will behave as did the original code.
I think this argument is *far* too weak to justify an extra pass over
the relation. The planner-style calculation is quite unlikely to give a
major underestimate of the rowcount. It might overestimate, e.g. if the
relation is bloated by dead tuples, but an error in that direction won't
kill you.
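
For concreteness, the calculation Alvaro describes can be sketched
roughly as below. This is only an illustration, not the planner's actual
code: the function name estimate_tuples and its parameters are
hypothetical, with reltuples and relpages standing in for the values
stored at the last VACUUM/ANALYZE and cur_blocks for what
RelationGetNumberOfBlocks would report now. Returning zero for a
never-analyzed relation is an assumption of the sketch.

```c
/*
 * Hypothetical sketch of a planner-style row count estimate:
 * scale the stored tuple density (reltuples / relpages) by the
 * relation's current size in blocks.
 */
static double
estimate_tuples(double reltuples, unsigned relpages, unsigned cur_blocks)
{
    /* No stored statistics: the sketch simply gives up (assumption). */
    if (relpages == 0)
        return 0.0;

    /* tuples per page at last VACUUM/ANALYZE, times pages present now */
    return (reltuples / relpages) * cur_blocks;
}
```

With reltuples = 1000 and relpages = 10, a relation that has since grown
to 20 blocks is estimated at 2000 tuples. If that growth is mostly dead
tuples, the true count is lower, which is the overestimate case.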
regards, tom lane