Group: pgsql.patches


Subject: Hash Index Build Patch
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 9/26/2007 4:06:28 PM

Tom Raney <twraney@comcast.net> writes:
> Alvaro Herrera wrote:
>> Just wondering, wouldn't it be enough to obtain a tuple count estimate
>> by using reltuples / relpages * RelationGetNumberOfBlocks, like the
>> planner does?

> We thought of that and the verdict is still out whether it is more
> costly to scan the entire relation to get the accurate count or use the
> estimate and hope for the best with the possibility of splits occurring
> during the build.  If we use the estimate and it is completely wrong
> (with the actual tuple count being much higher) the sort will provide no
> benefit and it will behave as did the original code.

I think this argument is *far* too weak to justify an extra pass over
the relation.  The planner-style calculation is quite unlikely to give
a major underestimate of the rowcount.  It might overestimate, e.g. if the
relation is bloated by dead tuples, but an error in that direction won't
kill you.

			regards, tom lane
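
For readers following the thread, the planner-style estimate discussed here (reltuples / relpages * RelationGetNumberOfBlocks) can be sketched roughly as below. This is only an illustration, not PostgreSQL source: the function name estimate_tuple_count and its parameters are hypothetical, with reltuples and relpages standing in for the values stored in pg_class and cur_blocks for the current physical block count. Returning 0 when relpages is zero is an assumption made to keep the sketch short; it is not how the planner handles a relation with no statistics.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the PostgreSQL tree.
 *
 * Scales the stored tuple density (reltuples / relpages, as of the last
 * VACUUM/ANALYZE) by the relation's current size in blocks.  If the table
 * has grown since the statistics were gathered, the estimate grows with it.
 * Dead-tuple bloat inflates cur_blocks, so the error tends toward
 * overestimation, which is the harmless direction for sizing the build.
 */
double
estimate_tuple_count(double reltuples, uint32_t relpages, uint32_t cur_blocks)
{
    double density;

    /* Assumption for this sketch: no statistics means "no estimate". */
    if (relpages == 0)
        return 0.0;

    density = reltuples / (double) relpages;   /* tuples per page */
    return density * (double) cur_blocks;      /* scale to current size */
}
```

For instance, with reltuples = 1000 and relpages = 10 recorded at the last ANALYZE, and the relation now occupying 15 blocks, the sketch yields an estimate of 1500 tuples without reading any heap page.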