Subject: Is my database now too big?
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 10/7/2007 12:19:12 PM
"Darren Reed" <darrenr+postgres@fastmail.net> writes:
> * pg_dump fails to run, causing an "out of memory" error (I don't
> understand
> why this happens, but it does);
Let's see the exact command to pg_dump and the exact failure message.
It might be useful to add -v so we can get some sense of where in the
process it fails.
> * restarting postgres across a reboot results in tables disappearing
> without any error messages being given out (this is *really
> disturbing*.)
Hm, system reboot you mean? To be honest I think this is likely pilot
error, along the lines of the postmaster you are running after the
reboot is using a different data directory than the one you were talking
to before. Comparing the output of "SHOW data_directory;" before and
after might be illuminating.
I remember a vaguely similar incident awhile back in which it turned out
that the DBA had put the data directory on a soft-mounted NFS server,
which sometimes hadn't come up by the time Postgres started, and so
sometimes he got the NFS server's version of the database and sometimes
he got a version that was on the local disk underneath the NFS mount
point. *That* was a mess :-(, and that was one reason why the PGDG
and Red Hat init scripts for PG no longer do an automatic initdb if
they don't find a valid data directory where it's supposed to be.
Exactly how are you starting the postmaster, anyway, and what is the
underlying platform here? Did you roll-your-own Postgres build or
is this a prepackaged distribution? If the latter, whose?
While I'm asking questions, is it just the tables that disappear or
is it all your database objects (including functions etc)? Do
*all* your tables disappear or just some of them? What do you
mean by "disappear" exactly --- do you get 'relation "foo" does not
exist', or some weirder error, or are the tables present but empty?
> "wc -l" of the output from the last successuful dump_all is around
> 8million lines, spread across half a dozen or so tables.
As already pointed out, this database is tiny. I'm not sure what your
problem is, but I'm quite sure it's not "database is too big".
> I'm currently using 8.1.3, will I fare any better with a more recent
> version?
You should certainly be on 8.1.10 just on general principles, though
I don't immediately recall any bugs that seem like they could be
related. There have been plenty of bugs fixed since 8.1.3 though.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
Subject: Is my database now too big?
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 10/15/2007 12:03:10 PM
Darren Reed <darrenr@fastmail.net> writes:
> # /usr/pkg/bin/psql -U postgres template1
> psql: FATAL: out of memory
> DETAIL: Failed on request of size 20.
I'm starting to think there is something very broken about your machine :-(.
Have you run any hardware diagnostics on it lately? The level of
flakiness you're seeing starts to suggest bad RAM to me.
Anyway, the above error should have also produced a map of per-context
memory usage in the postmaster log (ie, postmaster stderr). If you
could show us that, it might be revealing.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
Subject: Is my database now too big?
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 10/15/2007 5:56:11 PM
Darren Reed <darrenr@fastmail.net> writes:
> Tom Lane wrote:
>> Anyway, the above error should have also produced a map of per-context
>> memory usage in the postmaster log (ie, postmaster stderr). If you
>> could show us that, it might be revealing.
> MessageContext: 267378688 total in 43 blocks; 1768 free (45 chunks);
> 267376920 used
> The standout problem is the "MessageContext" count.
Indeed. And there shouldn't even be anything in MessageContext until
the first client command has been received. Maybe you have something
in ~/.psqlrc that you haven't told us about?
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
Subject: Is my database now too big?
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 10/15/2007 10:04:31 PM
Darren Reed <darrenr+postgres@fastmail.net> writes:
> Tom Lane wrote:
>> Indeed. And there shouldn't even be anything in MessageContext until
>> the first client command has been received. Maybe you have something
>> in ~/.psqlrc that you haven't told us about?
> That's easy - I don't even have one of these files!
Then the behavior you showed is impossible ;-)
There is *something* that is completely broken about your machine,
and the rest of us really don't have enough context to tell what.
You haven't told us anything about the hardware or operating system,
or how you built or obtained the Postgres executables.
I don't think you should dismiss the possibility of a hardware problem,
especially since the failures aren't 100% reproducible (AFAICT from your
previous remarks). We've seen more than one case where Postgres
stressed a system more than anything else that was being run, and
thereby exposed hardware problems that didn't manifest otherwise.
For instance, a bit of bad RAM up near the end of physical memory might
not get used at all until Postgres starts eating up memory.
Another line of thought is that you built Postgres with a buggy compiler
and thereby got buggy executables. Have you tried running the PG
regression tests?
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly
Subject: Is my database now too big?
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 10/16/2007 12:15:33 AM
Darren Reed <darrenr+postgres@fastmail.net> writes:
> I'm starting to wonder if it is a combination of:
> - the operating system (NetBSD 4.99.20)
Um ... what was your motivation for choosing that? According to
http://www.netbsd.org/releases/release-map.html
a .99 release number signifies "an alpha quality distribution. It isn't
even guaranteed to compile." It looks like NetBSD 4 is currently
up to an RC2 release, which is probably not what you've got there ...
but even if you were running the RC2 code I'd question the sanity
of insisting on a back-rev Postgres release on top of bleeding edge
operating system.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend
|