Group: pgsql.patches


Subject: HOT synced with HEAD
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 9/16/2007 1:40:28 PM
BTW, I am still looking for a reason for the hard-prune logic to live. It seems to complicate matters far more than it's worth --- in particular the way that the WAL replay representation is set up seems confusing and fragile. (If prune_hard is set, the "redirect" entries mean something completely different.) There was some suggestion that VACUUM FULL has to have it, but unless I see proof of that I'm thinking of taking it out.

			regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match

Subject: HOT synced with HEAD
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 9/16/2007 2:16:54 PM
"Pavan Deolasee" <pavan.deolasee@gmail.com> writes:
> On 9/16/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> BTW, I'm in process of taking out the separate HEAPTUPLE_DEAD_CHAIN
>> return value from HeapTupleSatisfiesVacuum.

> I agree. I myself suggested doing so earlier in the discussion (I actually
> even removed this before I sent out the add-on patch last night, but then
> reverted back because I realized at least it is required at one place)
> The place where I thought its required is to avoid marking an index tuple dead
> even though the corresponding root tuple is dead and the root tuple was
> HOT updated. But seems like you have already put in a different mechanism
> to handle that. So we should be able to get rid of HEAPTUPLE_DEAD_CHAIN.

Yeah, actually this depends in part on having HeapTupleIsHotUpdated include a check on XMIN_INVALID; otherwise testing that wouldn't be a full substitute for what tqual.c had been doing.

Something else I was just looking at: in the pruning logic, SetRedirect and SetDead operations are done at the same time that we record them for the eventual WAL record creation, but SetUnused operations are postponed and only done at the end. This seems a bit odd/nonorthogonal. Furthermore it's costing extra code complexity: if you were to SetUnused immediately then you wouldn't need that bitmap thingy to prevent you from doing it twice.

I think that the reason it's like that is probably because of the problem of potential multiple visits to a DEAD heap-only tuple, but it looks to me like that's not really an issue given the current form of the testing for aborted tuples, which I have as

	if (HeapTupleSatisfiesVacuum(tuple->t_data, global_xmin, buffer)
		== HEAPTUPLE_DEAD && !HeapTupleIsHotUpdated(tuple))
		heap_prune_record_unused(nowunused, nunused, unusedbitmap,
								 rootoffnum);

ISTM that if HeapTupleIsHotUpdated is false, we could simply SetUnused immediately, because any potential chain members after this one must be dead too, and they'll get reaped by this same bit of code --- there is no need to preserve the chain. (The only case we're really covering here is XMIN_INVALID, and any later chain members must certainly be XMIN_INVALID as well.) When the HOT-chain is later followed, we'll detect chain break anyway, so I see no need to postpone clearing the item pointer.

			regards, tom lane
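
The argument for setting the item pointer unused immediately can be sketched as a small standalone C model. This is only a toy under stated assumptions, not code from the patch: ToyItem, toy_is_hot_updated() and toy_prune_aborted() are hypothetical names, the flag values are made up, and the "dead" field stands in for a HeapTupleSatisfiesVacuum() result of HEAPTUPLE_DEAD. It also assumes the test is applied only to heap-only tuples, as the mention of "a DEAD heap-only tuple" suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy flag bits; the values are invented for this sketch. */
#define TOY_HOT_UPDATED   0x01u  /* tuple claims to have a HOT successor */
#define TOY_XMIN_INVALID  0x02u  /* inserting transaction aborted */
#define TOY_HEAP_ONLY     0x04u  /* tuple has no index entry of its own */

typedef enum { ITEM_NORMAL, ITEM_UNUSED } ToyItemState;

typedef struct {
    ToyItemState state;
    unsigned     flags;
    bool         dead;   /* stand-in for "HeapTupleSatisfiesVacuum says DEAD" */
} ToyItem;

/* An XMIN_INVALID tuple is treated as not-HotUpdated, whatever its flag says. */
static bool toy_is_hot_updated(const ToyItem *it)
{
    return (it->flags & TOY_HOT_UPDATED) != 0 &&
           (it->flags & TOY_XMIN_INVALID) == 0;
}

/*
 * Visit every item on the toy page once. A dead heap-only tuple that is
 * not HotUpdated is set unused on the spot; nothing is deferred and no
 * bitmap is consulted. Returns the number of items set unused.
 */
static size_t toy_prune_aborted(ToyItem *items, size_t nitems)
{
    size_t nunused = 0;

    assert(items != NULL || nitems == 0);
    for (size_t i = 0; i < nitems; i++) {
        ToyItem *it = &items[i];

        if (it->state != ITEM_NORMAL)
            continue;               /* already cleared: cannot be done twice */
        if ((it->flags & TOY_HEAP_ONLY) != 0 &&
            it->dead && !toy_is_hot_updated(it)) {
            it->state = ITEM_UNUSED;
            nunused++;
        }
    }
    return nunused;
}
```

In this model every aborted chain member is reaped by the same test, including one whose HOT_UPDATED flag is still set, and a second pass over the page finds nothing left to clear.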

Subject: HOT synced with HEAD
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 9/16/2007 3:16:11 PM
"Pavan Deolasee" <pavan.deolasee@gmail.com> writes:
> So are you suggesting we go back to the earlier way of handling
> aborted tuples separately ? But then we can not do that by simply
> checking for !HeaptupleIsHotUpdated. There could be several aborted
> tuples at the end of the chain of which all but one are marked HotUpdated.
> Or are you suggesting we also check for XMIN_INVALID for detecting
> aborted tuples ?

Yeah. As the code stands, anything that's XMIN_INVALID will be considered not-HotUpdated (look at the macro...). So far I've seen no place where there is any value in following a HOT chain past such a tuple --- do you see any? Every descendant tuple must be XMIN_INVALID as well ...

			regards, tom lane

Subject: HOT synced with HEAD
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 9/16/2007 9:19:05 PM
Hmm ... so all that logic to prune just one tuple chain is dead code, because heap_page_prune_defrag() ignores its pruneoff argument and always passes InvalidOffsetNumber down to heap_page_prune().

While this is certainly a fairly trivial bug, it makes me wonder whether the prune-one-chain logic has ever been active and whether there is any real evidence for having it at all. Was this error introduced in some recent refactoring, or has it always been like that? Given the way that the logic works now, in particular that we always insist on doing defrag, what point is there in not pruning all the chains when we can?

			regards, tom lane
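
The kind of bug described here, a wrapper that accepts an argument but never forwards it, can be illustrated with a toy in standalone C. None of this is the patch's code: the toy_ functions are hypothetical, the "pruning" is reduced to counting chains, and TOY_INVALID_OFFSET merely stands in for InvalidOffsetNumber.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_INVALID_OFFSET 0   /* stand-in for InvalidOffsetNumber */

/*
 * Toy pruning routine: with a valid root offset it handles that one chain,
 * otherwise it handles every chain on the page. Returns chains processed.
 */
static size_t toy_page_prune(size_t nchains, size_t rootoff)
{
    if (rootoff != TOY_INVALID_OFFSET) {
        assert(rootoff <= nchains);
        return 1;               /* single-chain path */
    }
    return nchains;             /* whole-page path */
}

/* The situation as described: pruneoff is accepted but silently dropped. */
static size_t toy_prune_defrag_as_described(size_t nchains, size_t pruneoff)
{
    (void) pruneoff;
    return toy_page_prune(nchains, TOY_INVALID_OFFSET);
}

/* What forwarding the argument would look like, for comparison. */
static size_t toy_prune_defrag_forwarding(size_t nchains, size_t pruneoff)
{
    return toy_page_prune(nchains, pruneoff);
}
```

With the first wrapper the single-chain branch can never be reached, whatever the caller passes, which is why that logic counts as dead code.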

Subject: HOT synced with HEAD
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 9/17/2007 10:03:08 AM
"Pavan Deolasee" <pavan.deolasee@gmail.com> writes:
> No, I don't think we would ever need to follow a HOT chain past
> the aborted tuple. The only thing that worries about this setup though
> is the dependency on hint bits being set properly. But the places
> where this would be used right now for detecting aborted dead tuples,
> apply HeapTupleSatisfiesVacuum on the tuple before checking
> for HeapTupleIsHotUpdated, so we are fine.

Practically all the places that check that have just done a tqual.c test, so they can count on the INVALID bits to be up-to-date. If not, it's still OK, it just means that they might uselessly advance to the next (former) chain member. There is always a race condition in these sorts of things: for instance, a tuple could go from INSERT_IN_PROGRESS to DEAD at any instant, if its inserting transaction rolls back. So you have to have adequate defenses in place anyway, like the xmin/xmax comparison.

> Or should we just check for XMIN_INVALID explicitly at those places ?

I went back and forth on that, but on balance a single macro seems better.

Meanwhile I've started looking at the vacuum code, and it seems that v16 has made that part of the patch significantly worse. VACUUM will fail to count tuples that are removed by pruning, which seems like something it should report somehow. And you've introduced a race condition: as I just mentioned, it's perfectly possible that the second call of HeapTupleSatisfiesVacuum gets a different answer than what the prune code saw, especially in lazy VACUUM (in VACUUM FULL it'd suggest that someone released lock early ... but we do have to cope with that).

The comments you added seem to envision a more invasive patch that gets rid of the second HeapTupleSatisfiesVacuum pass altogether, but I'm not sure how practical that is, and am not real inclined to try to do it right now anyway ...

			regards, tom lane
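
The two defenses mentioned above, a single macro that treats XMIN_INVALID tuples as not-HotUpdated and the xmin/xmax comparison when stepping to the next chain member, can be sketched together in standalone C. This is a toy model under assumptions, not PostgreSQL code: ToyTuple, TOY_IS_HOT_UPDATED and toy_follow_chain() are hypothetical, the flag values are invented, and transaction ids are plain integers.

```c
#include <assert.h>
#include <stddef.h>

/* Toy flag bits; the values are invented for this sketch. */
#define TOY_HOT_UPDATED   0x01u
#define TOY_XMIN_INVALID  0x02u
#define TOY_NO_NEXT       ((size_t) -1)

typedef struct {
    unsigned xmin;    /* id of the inserting transaction */
    unsigned xmax;    /* id of the updating transaction, 0 if none */
    unsigned flags;
    size_t   next;    /* slot of the successor version, or TOY_NO_NEXT */
} ToyTuple;

/* One macro for both conditions: flag set AND xmin not known invalid. */
#define TOY_IS_HOT_UPDATED(t) \
    ((((t)->flags & TOY_HOT_UPDATED) != 0) && \
     (((t)->flags & TOY_XMIN_INVALID) == 0))

/*
 * Follow a toy HOT chain from 'start' and return the slot of the last
 * member reached. The walk stops at a tuple that is not HotUpdated, and
 * also when the next slot's xmin does not match the current xmax, which
 * would mean the slot now holds an unrelated tuple (a broken chain).
 */
static size_t toy_follow_chain(const ToyTuple *tuples, size_t ntuples,
                               size_t start)
{
    size_t cur = start;

    assert(start < ntuples);
    /* Bound the walk so a malformed chain cannot loop forever. */
    for (size_t steps = 0; steps < ntuples; steps++) {
        const ToyTuple *t = &tuples[cur];
        size_t next = t->next;

        if (!TOY_IS_HOT_UPDATED(t))
            break;                      /* end of chain, or aborted tuple */
        if (next == TOY_NO_NEXT || next >= ntuples)
            break;                      /* no usable successor */
        if (tuples[next].xmin != t->xmax)
            break;                      /* chain break detected */
        cur = next;
    }
    return cur;
}
```

In this sketch a stale flag costs at most one useless step, since the xmin/xmax check still catches a reused slot; that is the sense in which the race is tolerable as long as such a defense is in place.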