Subject: HOT synced with HEAD
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 9/16/2007 1:40:28 PM
BTW, I am still looking for a reason for the hard-prune logic to live.
It seems to complicate matters far more than it's worth --- in
particular the way that the WAL replay representation is set up seems
confusing and fragile. (If prune_hard is set, the "redirect" entries
mean something completely different.) There was some suggestion that
VACUUM FULL has to have it, but unless I see proof of that I'm thinking
of taking it out.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
Subject: HOT synced with HEAD
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 9/16/2007 2:16:54 PM
"Pavan Deolasee" <pavan.deolasee@gmail.com> writes:
> On 9/16/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> BTW, I'm in process of taking out the separate HEAPTUPLE_DEAD_CHAIN
>> return value from HeapTupleSatisfiesVacuum.
>>
> I agree. I myself suggested doing so earlier in the discussion (I actually
> even removed this before I sent out the add-on patch last night, but then
> reverted back because I realized it is required in at least one place).
> The place where I thought it's required is to avoid marking an index tuple
> dead even though the corresponding root tuple is dead and the root tuple was
> HOT updated. But seems like you have already put in a different mechanism
> to handle that. So we should be able to get rid of HEAPTUPLE_DEAD_CHAIN.
Yeah, actually this depends in part on having HeapTupleIsHotUpdated
include a check on XMIN_INVALID; otherwise testing that wouldn't be a
full substitute for what tqual.c had been doing.
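
To make that concrete, here is a minimal standalone sketch of a HOT-updated test that also rejects XMIN_INVALID tuples. The struct, the DEMO_ flag names and their bit values are simplified stand-ins assumed for illustration; they are not the real htup.h definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits; names and values are assumptions, not htup.h's. */
#define DEMO_XMIN_INVALID  0x0001   /* inserting transaction known aborted */
#define DEMO_HOT_UPDATED   0x0002   /* tuple has a HOT successor */

/* Stripped-down, hypothetical stand-in for a heap tuple header. */
typedef struct DemoTupleHeader
{
    uint16_t    infomask;
} DemoTupleHeader;

/*
 * True only if the tuple was HOT-updated AND its xmin is not known to be
 * invalid.  An aborted tuple is therefore reported as not-HotUpdated even
 * if its HOT_UPDATED bit is set.
 */
bool
demo_is_hot_updated(const DemoTupleHeader *tup)
{
    return (tup->infomask & DEMO_HOT_UPDATED) != 0 &&
           (tup->infomask & DEMO_XMIN_INVALID) == 0;
}
```

With the check folded into the test like this, callers that ask "is it HotUpdated?" stop at an aborted tuple without needing a separate return code.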
Something else I was just looking at: in the pruning logic, SetRedirect
and SetDead operations are done at the same time that we record them for
the eventual WAL record creation, but SetUnused operations are
postponed and only done at the end. This seems a bit odd/nonorthogonal.
Furthermore it's costing extra code complexity: if you were to SetUnused
immediately then you wouldn't need that bitmap thingy to prevent you
from doing it twice. I think that the reason it's like that is probably
because of the problem of potential multiple visits to a DEAD heap-only
tuple, but it looks to me like that's not really an issue given the
current form of the testing for aborted tuples, which I have as
    if (HeapTupleSatisfiesVacuum(tuple->t_data, global_xmin, buffer)
        == HEAPTUPLE_DEAD && !HeapTupleIsHotUpdated(tuple))
        heap_prune_record_unused(nowunused, nunused, unusedbitmap,
                                 rootoffnum);
ISTM that if HeapTupleIsHotUpdated is false, we could simply SetUnused
immediately, because any potential chain members after this one must
be dead too, and they'll get reaped by this same bit of code --- there
is no need to preserve the chain. (The only case we're really covering
here is XMIN_INVALID, and any later chain members must certainly be
XMIN_INVALID as well.) When the HOT-chain is later followed, we'll
detect chain break anyway, so I see no need to postpone clearing the
item pointer.
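
As a rough sketch of the idea, assuming a toy page model rather than the real pruning code: if the item is marked unused on the spot, a later visit sees the unused state and skips it, so no bitmap is needed to avoid doing it twice. All DemoItem names are hypothetical, and WAL recording is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical line-pointer states for a toy page model. */
typedef enum { DEMO_ITEM_NORMAL, DEMO_ITEM_UNUSED } DemoItemState;

typedef struct DemoItem
{
    DemoItemState state;
    bool        dead;           /* what the visibility test would report */
    bool        hot_updated;    /* false for XMIN_INVALID tuples */
} DemoItem;

/*
 * Mark every dead, not-HOT-updated item unused immediately.
 * Returns the number of items newly marked unused.
 */
size_t
demo_prune_aborted(DemoItem *items, size_t nitems)
{
    size_t      nunused = 0;

    for (size_t i = 0; i < nitems; i++)
    {
        /* Already reclaimed: the item state itself prevents double work. */
        if (items[i].state == DEMO_ITEM_UNUSED)
            continue;

        if (items[i].dead && !items[i].hot_updated)
        {
            items[i].state = DEMO_ITEM_UNUSED;
            nunused++;
        }
    }
    return nunused;
}
```

Running the function a second time over the same array reclaims nothing more, which is the property the bitmap was standing in for.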
regards, tom lane
Subject: HOT synced with HEAD
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 9/16/2007 3:16:11 PM
"Pavan Deolasee" <pavan.deolasee@gmail.com> writes:
> So are you suggesting we go back to the earlier way of handling
> aborted tuples separately? But then we cannot do that by simply
> checking for !HeapTupleIsHotUpdated. There could be several aborted
> tuples at the end of the chain of which all but one are marked HotUpdated.
> Or are you suggesting we also check for XMIN_INVALID for detecting
> aborted tuples?
Yeah. As the code stands, anything that's XMIN_INVALID will be
considered not-HotUpdated (look at the macro...). So far I've seen no
place where there is any value in following a HOT chain past such a
tuple --- do you see any? Every descendant tuple must be XMIN_INVALID
as well ...
regards, tom lane
Subject: HOT synced with HEAD
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 9/16/2007 9:19:05 PM
Hmm ... so all that logic to prune just one tuple chain is dead code,
because heap_page_prune_defrag() ignores its pruneoff argument and
always passes InvalidOffsetNumber down to heap_page_prune().
While this is certainly a fairly trivial bug, it makes me wonder whether
the prune-one-chain logic has ever been active and whether there is any
real evidence for having it at all. Was this error introduced in some
recent refactoring, or has it always been like that? Given the way that
the logic works now, in particular that we always insist on doing
defrag, what point is there in not pruning all the chains when we can?
regards, tom lane
Subject: HOT synced with HEAD
From: tgl@sss.pgh.pa.us (Tom Lane)
Date: 9/17/2007 10:03:08 AM
"Pavan Deolasee" <pavan.deolasee@gmail.com> writes:
> No, I don't think we would ever need to follow a HOT chain past
> the aborted tuple. The only thing that worries me about this setup, though,
> is the dependency on hint bits being set properly. But the places
> where this would be used right now for detecting aborted dead tuples
> apply HeapTupleSatisfiesVacuum on the tuple before checking
> for HeapTupleIsHotUpdated, so we are fine.
Practically all the places that check that have just done a tqual.c
test, so they can count on the INVALID bits to be up-to-date. If not,
it's still OK, it just means that they might uselessly advance to the
next (former) chain member. There is always a race condition in these
sorts of things: for instance, a tuple could go from INSERT_IN_PROGRESS
to DEAD at any instant, if its inserting transaction rolls back. So you
have to have adequate defenses in place anyway, like the xmin/xmax
comparison.
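
For illustration, a minimal sketch of that kind of defense on a toy tuple array: the chain walk stops as soon as the next member's xmin does not match the current member's xmax. The DemoTuple layout and every name here are assumptions for the example, not the real heap structures.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t DemoXid;

#define DEMO_NO_NEXT ((size_t) -1)

/* Hypothetical, simplified tuple: just the fields the check needs. */
typedef struct DemoTuple
{
    DemoXid     xmin;   /* inserting transaction */
    DemoXid     xmax;   /* updating transaction, 0 if none */
    size_t      next;   /* index of next chain member, or DEMO_NO_NEXT */
} DemoTuple;

/*
 * Follow the chain from 'start' and return the index of the last member
 * that is validly linked.  A link is trusted only if the successor's xmin
 * equals the predecessor's xmax; otherwise the slot is assumed to hold an
 * unrelated tuple and the walk stops.
 */
size_t
demo_follow_chain(const DemoTuple *tuples, size_t ntuples, size_t start)
{
    size_t      cur = start;

    /* Bound the walk so a corrupt, cyclic chain cannot loop forever. */
    for (size_t steps = 0; steps < ntuples; steps++)
    {
        size_t      next = tuples[cur].next;

        if (next == DEMO_NO_NEXT || next >= ntuples)
            break;
        if (tuples[next].xmin != tuples[cur].xmax)
            break;              /* chain break detected */
        cur = next;
    }
    return cur;
}
```

Because the link is validated at every step, a stale hint bit costs at most one wasted look at the next slot, not a wrong answer.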
> Or should we just check for XMIN_INVALID explicitly at those places?
I went back and forth on that, but on balance a single macro seems better.
Meanwhile I've started looking at the vacuum code, and it seems that v16
has made that part of the patch significantly worse. VACUUM will fail
to count tuples that are removed by pruning, which seems like something
it should report somehow. And you've introduced a race condition: as
I just mentioned, it's perfectly possible that the second call of
HeapTupleSatisfiesVacuum gets a different answer than what the prune
code saw, especially in lazy VACUUM (in VACUUM FULL it'd suggest that
someone released lock early ... but we do have to cope with that).
The comments you added seem to envision a more invasive patch that gets
rid of the second HeapTupleSatisfiesVacuum pass altogether, but I'm not
sure how practical that is, and am not real inclined to try to do it
right now anyway ...
regards, tom lane