From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | daveg <daveg(at)sonic(dot)net> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: error: could not find pg_class tuple for index 2662 |
Date: | 2011-08-01 17:23:49 |
Message-ID: | 27557.1312219429@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
daveg <daveg(at)sonic(dot)net> writes:
> On Sun, Jul 31, 2011 at 11:44:39AM -0400, Tom Lane wrote:
>> I think we need to start adding some instrumentation so we can get a
>> better handle on what's going on in your database. If I were to send
>> you a source-code patch for the server that adds some more logging
>> printout when this happens, would you be willing/able to run a patched
>> build on your machine?
> Yes we can run an instrumented server so long as the instrumentation does
> not interfere with normal operation. However, scheduling downtime to switch
> binaries is difficult, and generally needs to be happen on a weekend, but
> sometimes can be expedited. I'll look into that.
OK, attached is a patch against 9.0 branch that will re-scan pg_class
after a failure of this sort occurs, and log what it sees in the tuple
header fields for each tuple for the target index. This should give us
some useful information. It might be worthwhile for you to also log the
results of
select relname,pg_relation_filenode(oid) from pg_class
where relname like 'pg_class%';
in your script that does VACUUM FULL, just before and after each time it
vacuums pg_class. That will help in interpreting the relfilenodes in
the log output.
> My observations so far are:
> - the error occurs at commit of vacuum full of pg_class
> - in these cases error hits autovacuum after it waited for a lock on pg_class
> - in these two cases there was a new process startup while the vacuum was
> running. Don't know if this is relevant.
Interesting. We'll want to know whether that happens every time.
> - while these hit autovacuum, the error does hit other processs (just not in
> these sessions). Unknown if autovacuum is a required component.
Good question. Please consider setting log_autovacuum_min_duration = 0
so that the log also traces all autovacuum activity.
regards, tom lane
Attachment | Content-Type | Size |
---|---|---|
log_missing_indexes-1.patch | text/x-patch | 3.9 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2011-08-01 17:31:22 | Re: Compressing the AFTER TRIGGER queue |
Previous Message | Simon Riggs | 2011-08-01 17:08:23 | Re: One-Shot Plans |