Subject: How to speed up min/max(id) in 50M rows table?
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 10/12/2007 6:03:46 PM
henk de wit <henk53602@hotmail.com> writes:
> The plan looks like this:
> "Result (cost=3D0.37..0.38 rows=3D1 width=3D0) (actual time=3D184231.636..=
> 184231.638 rows=3D1 loops=3D1)"
> " InitPlan"
> " -> Limit (cost=3D0.00..0.37 rows=3D1 width=3D8) (actual time=3D18423=
> 1.620..184231.622 rows=3D1 loops=3D1)"
> " -> Index Scan Backward using trans_payment_id_index on transact=
> ions (cost=3D0.00..19144690.58 rows=3D51122691 width=3D8) (actual time=3D1=
> 84231.613..184231.613 rows=3D1 loops=3D1)"
> " Filter: (payment_id IS NOT NULL)"
> "Total runtime: 184231.755 ms"
The only way I can see for that to be so slow is if you have a very
large number of rows where payment_id is null --- is that the case?
There's not a lot you could do about that in existing releases :-(.
In 8.3 it'll be possible to declare the index as NULLS FIRST, which
moves the performance problem from the max end to the min end ...
> select min(time) from transactions where payment_id =3D 67
> There are indexes on both the time (a timestamp with time zone) and payment=
> _id (a bigint) columns.
Creating indexes at random with no thought about how the system could
use them is not a recipe for speeding up your queries. What you'd need
to make this query fast is a double-column index on (payment_id, time)
so that a forward scan on the items with payment_id = 67 would
immediately find the minimum time entry. Neither of the single-column
indexes offers any way to find the desired entry without scanning over
lots of unrelated entries.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
Subject: How to speed up min/max(id) in 50M rows table?
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 10/12/2007 6:17:41 PM
I wrote:
> The only way I can see for that to be so slow is if you have a very
> large number of rows where payment_id is null --- is that the case?
> There's not a lot you could do about that in existing releases :-(.
Actually, there is a possibility if you are willing to change the query:
make a partial index that excludes nulls. Toy example:
regression=# create table fooey(f1 int);
CREATE TABLE
regression=# create index fooeyi on fooey(f1) where f1 is not null;
CREATE INDEX
regression=# explain select max(f1) from fooey;
QUERY PLAN
---------------------------------------------------------------
Aggregate (cost=36.75..36.76 rows=1 width=4)
-> Seq Scan on fooey (cost=0.00..31.40 rows=2140 width=4)
(2 rows)
regression=# explain select max(f1) from fooey where f1 is not null;
QUERY PLAN
-----------------------------------------------------------------------------------------------
Result (cost=0.03..0.04 rows=1 width=0)
InitPlan
-> Limit (cost=0.00..0.03 rows=1 width=4)
-> Index Scan Backward using fooeyi on fooey (cost=0.00..65.55 rows=2129 width=4)
Filter: (f1 IS NOT NULL)
(5 rows)
Probably the planner ought to be smart enough to figure this out without
the explicit WHERE in the query, but right now it isn't.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
Subject: How to speed up min/max(id) in 50M rows table?
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 10/12/2007 6:51:03 PM
henk de wit <henk53602@hotmail.com> writes:
> I see, that sounds very interesting too. As you might have noticed, I'm not=
> an expert on this field but I'm trying to learn. I was under the impressio=
> n that the last few incarnations of postgresql automatically combined singl=
> e column indexes for cases where a multi-column index would be needed in ea=
> rlier releases.
It's possible to combine independent indexes for resolving AND-type
queries, but the combination process does not preserve ordering, so
it's useless for this type of situation.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend
|