Subject: TB-sized databases
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 11/29/2007 10:45:31 AM
Gregory Stark <stark@enterprisedb.com> writes:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
>> Tom's previous concerns were along the lines of "How would know what to
>> set it to?", given that the planner costs are mostly arbitrary numbers.
> Hm, that's only kind of true.
The units are not the problem. The problem is that you are staking
non-failure of your application on the planner's estimates being
pretty well in line with reality. Not merely in line enough that
it picks a reasonably cheap plan, but in line enough that if it
thinks plan A is 10x more expensive than plan B, then the actual
ratio is indeed somewhere near 10.
Given that this list spends all day every day discussing cases where the
planner is wrong, I'd have to think that that's a bet I wouldn't take.
You could probably avoid this risk by setting the cutoff at something
like 100 or 1000 times what you really want to tolerate, but how
useful is it then?
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly
Subject: TB-sized databases
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 12/6/2007 11:13:18 AM
Michael Stone <mstone+postgres@mathom.us> writes:
> OTOH, the planner can really screw up queries on really large databases.
> IIRC, the planner can use things like unique constraints to get some
> idea, e.g., of how many rows will result from a join. Unfortunately,
> the planner can't apply those techniques to certain constructs common in
> really large db's (e.g., partitioned tables--how do you do a unique
> constraint on a partitioned table?) I've got some queries that the
> planner thinks will return on the order of 10^30 rows for that sort of
> reason. In practice, the query may return 10^3 rows, and the difference
> between the seq scan and the index scan is the difference between a
> query that takes a few seconds and a query that I will never run to
> completion. I know the goal would be to make the planner understand
> those queries better,
Indeed, and if you've got examples where it's that far off, you should
report them.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
Subject: TB-sized databases
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 12/6/2007 12:55:38 PM
Matthew <matthew@flymine.org> writes:
> ... For this query, Postgres would perform a nested loop,
> iterating over all rows in the small table, and doing a hundred index
> lookups in the big table. This completed very quickly. However, adding the
> LIMIT meant that suddenly a merge join was very attractive to the planner,
> as it estimated the first row to be returned within milliseconds, without
> needing to sort either table.
> The problem is that Postgres didn't know that the first hit in the big
> table would be about half-way through, after doing a index sequential scan
> for half a bazillion rows.
Hmm. IIRC, there are smarts in there about whether a mergejoin can
terminate early because of disparate ranges of the two join variables.
Seems like it should be straightforward to fix it to also consider
whether the time-to-return-first-row will be bloated because of
disparate ranges. I'll take a look --- but it's probably too late
to consider this for 8.3.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
Subject: TB-sized databases
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 12/6/2007 1:34:47 PM
Matthew <matthew@flymine.org> writes:
> On Thu, 6 Dec 2007, Tom Lane wrote:
>> Hmm. IIRC, there are smarts in there about whether a mergejoin can
>> terminate early because of disparate ranges of the two join variables.
> Very cool. Would that be a planner cost estimate fix (so it avoids the
> merge join), or a query execution fix (so it does the merge join on the
> table subset)?
Cost estimate fix. Basically what I'm thinking is that the startup cost
attributed to a mergejoin ought to account for any rows that have to be
skipped over before we reach the first join pair. In general this is
hard to estimate, but for mergejoin it can be estimated using the same
type of logic we already use at the other end.
After looking at the code a bit, I'm realizing that there's actually a
bug in there as of 8.3: mergejoinscansel() is expected to be able to
derive numbers for either direction of scan, but if it's asked to
compute numbers for a DESC-order scan, it looks for a pg_stats entry
sorted with '>', which isn't gonna be there. It needs to know to
look for an '<' histogram and switch the min/max. So the lack of
symmetry here is causing an actual bug in logic that already exists.
That makes the case for fixing this now a bit stronger ...
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
Subject: TB-sized databases
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 12/6/2007 8:55:02 PM
I wrote:
> Hmm. IIRC, there are smarts in there about whether a mergejoin can
> terminate early because of disparate ranges of the two join variables.
> Seems like it should be straightforward to fix it to also consider
> whether the time-to-return-first-row will be bloated because of
> disparate ranges.
I've posted a proposed patch for this:
http://archives.postgresql.org/pgsql-patches/2007-12/msg00025.php
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
Subject: TB-sized databases
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 12/6/2007 11:14:10 PM
Ron Mayer <rm_pg@cheapcomplexdevices.com> writes:
> Tom Lane wrote:
>> There's something fishy about this --- given that that plan has a lower
>> cost estimate, it should've picked it without any artificial
>> constraints.
> I think the reason it's not picking it was discussed back in this thread
> too.
> http://archives.postgresql.org/pgsql-performance/2005-03/msg00675.php
> http://archives.postgresql.org/pgsql-performance/2005-03/msg00684.php
> My offset 0 is forcing the outer join.
> [Edit: Ugh - meant cartesian join - which helps this kind of query.]
Ah; I missed the fact that the two relations you want to join first
don't have any connecting WHERE clause.
The concern I mentioned in the above thread was basically that I don't
want the planner to go off chasing Cartesian join paths in general ---
they're usually useless and would result in an exponential explosion
in the number of join paths considered in many-table queries.
However, in this case the reason that the Cartesian join might be
relevant is that both of them are needed in order to form an inner
indexscan on the big table. I wonder if we could drive consideration
of the Cartesian join off of noticing that. It'd take some rejiggering
around best_inner_indexscan(), or somewhere in that general vicinity.
Way too late for 8.3, but something to think about for next time.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at
http://www.postgresql.org/about/donate
|