Group: pgsql.admin


Subject: Problem with PITR Past Particular WAL File
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 10/24/2007 8:12:56 AM
Craig McElroy <craig.mcelroy@contegix.com> writes:
> Now, if I include one more WAL file in the recovery, ...

>> Oct 23 22:22:29 db01b postgres[92]: [ID 748848 local0.info]
>> [5706-1] LOG: archive recovery complete
>> Oct 23 22:27:06 db01b postgres[91]: [ID 748848 local0.info] [1-1]
>> LOG: startup process (PID 92) was terminated by signal 11

Can you get a stack trace from the core dump reported here? It looks
like a bug in post-replay cleanup, but there's not enough detail to say
more.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to majordomo@postgresql.org so that your
       message can get through to the mailing list cleanly

Subject: Problem with PITR Past Particular WAL File
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 10/24/2007 2:26:34 PM
Craig McElroy <craig.mcelroy@contegix.com> writes:
>> Can you get a stack trace from the core dump reported here?

> Certainly, how can that be obtained?

    $ gdb /path/to/postgres-executable /path/to/core-file
    gdb> bt
    gdb> quit

If you don't find a corefile in $PGDATA (or wherever your system puts
core files) then you probably need to restart the postmaster with
"ulimit -c unlimited" to allow producing a core.

If the "bt" output is just numbers and no symbols then it won't be of
any use; in that case you'll need to find or build non-stripped Postgres
executables.

regards, tom lane
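
The gdb steps above can also be run non-interactively. What follows is
only a minimal sketch, assuming a POSIX shell and gdb on the PATH. The
function names print_backtrace and bt_has_symbols are hypothetical
helpers made up for illustration (they are not part of PostgreSQL or
gdb), and the paths are placeholders exactly as in the message above.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Print a backtrace from a core file without an interactive session.
# -batch and -ex are standard gdb command-line options.
#   $1 = path to the postgres executable
#   $2 = path to the core file
print_backtrace() {
    gdb -batch -ex bt "$1" "$2"
}

# Read "bt" output on stdin and succeed if at least one frame names a
# function. gdb prints "?? ()" for frames it cannot resolve, which is
# what a stripped executable typically produces.
bt_has_symbols() {
    grep -E '^#[0-9]+ ' | grep -F -v -q '?? ()'
}

# Example use (paths are placeholders):
#   print_backtrace /path/to/postgres-executable /path/to/core-file \
#       | bt_has_symbols || echo "no symbols; need non-stripped executables"
```

If no core file shows up at all, run "ulimit -c unlimited" in the shell
that starts the postmaster and restart it, as described above, then
reproduce the crash before trying the helpers.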

Subject: Problem with PITR Past Particular WAL File
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 10/25/2007 11:59:34 AM
Craig McElroy <craig.mcelroy@contegix.com> writes:
> Core was generated by `/usr/local/pgsql/bin/postgres -D /pgdata01/data'.
> Program terminated with signal 11, Segmentation fault.
> #0 0x080b8ee0 in entrySplitPage ()
> #1 0x080baccf in ginInsertValue ()
> #2 0x080b81b7 in gin_xlog_cleanup ()
> #3 0x080af4ce in StartupXLOG ()
> #4 0x080c04ca in BootstrapMain ()
> #5 0x08186b2f in StartChildProcess ()
> #6 0x081889eb in PostmasterMain ()
> #7 0x0814ee9e in main ()

Hm, I wonder if this is explained by a bug already fixed in 8.2.5:

2007-06-04 11:59 teodor

    * src/backend/access/gin/: gindatapage.c, ginentrypage.c, ginget.c,
    ginvacuum.c, ginxlog.c (REL8_2_STABLE): Fix bundle bugs of GIN:
    - Fix possible deadlock between UPDATE and VACUUM queries. Bug
      never was observed in 8.2, but it still exist there. HEAD is more
      sensitive to bug after recent "ring" of buffer improvements.
    - Fix WAL creation: if parent page is stored as is after split then
      incomplete split isn't removed during replay. This happens rather
      rare, only on large tables with a lot of updates/inserts.
    - Fix WAL replay: there was wrong test of XLR_BKP_BLOCK_* for left
      page after deletion of page. That causes wrong rightlink field:
      it pointed to deleted page.
    - add checking of match of clearing incomplete split
    - cleanup incomplete split list after proceeding

    All of this chages doesn't change on-disk storage, so backpatch...
    But second point may be an issue for replaying logs from previous
    version.

Teodor, can you comment on whether this stack trace looks like it could
be related to that fix? Craig, can you retry your test scenario on
8.2.5?

regards, tom lane