"failed to re-find parent key" question

on 05.03.2007 18:53:45 by Roger Pan

Hi!

We use PostgreSQL 8.1.2 on Solaris 9 to maintain very important business data. The PostgreSQL database has now been interrupted:

> more postgresql-2007-03-05_210154.log
LOG: could not bind socket for statistics collector: Cannot assign requested address
LOG: database system was interrupted while in recovery at 2007-03-05 20:26:30 CST
HINT: This probably means that some data is corrupted and you will have to use the last backup for recovery.
LOG: checkpoint record is at 114/FDB86500
LOG: redo record is at 114/FDB2B0F8; undo record is at 0/0; shutdown FALSE
LOG: next transaction ID: 8817742; next OID: 106734149
LOG: next MultiXactId: 60550; next MultiXactOffset: 14674685
LOG: database system was not properly shut down; automatic recovery in progress
LOG: redo starts at 114/FDB2B0F8
LOG: record with zero length at 115/249891E8
LOG: redo done at 115/249891B8
PANIC: failed to re-find parent key in "1560660"
LOG: startup process (PID 14266) was terminated by signal 6
LOG: aborting startup due to startup process failure
LOG: logger shutting down

I know the "failed to re-find parent key" problem has been fixed in a newer release. My question is how we can recover the data in this case. The difficulty is that the disk holding the PostgreSQL data is full.

Many thanks!

Roger



Re: "failed to re-find parent key" question

on 06.03.2007 05:29:46 by Tom Lane

"Roger Pan" writes:
> We use PostgreSQL 8.1.2 on Solaris 9 to maintain very
> important business data.

If it's as important as all that, you should make more of an effort to
keep up-to-date with PG minor releases...

> PANIC: failed to re-find parent key in "1560660"
> I know the "failed to re-find parent key" problem has been fixed in
> a newer release. My question is how we can recover the data in this
> case. The difficulty is that the disk holding the PostgreSQL data is full.

A quick and dirty solution would be to do pg_resetxlog, but the problem
is that it's difficult to predict how much corruption or data loss would
result. If the data is really worth an effort to save, you might consider
making a hacked-up build in which this PANIC is reduced to a WARNING,
which you use just long enough to boot up and shut down. I think it'd
work to change (in src/backend/access/nbtree/nbtinsert.c)

    /* Check for error only after writing children */
    if (pbuf == InvalidBuffer)
        elog(ERROR, "failed to re-find parent key in \"%s\"",
             RelationGetRelationName(rel));

    /* Recursively update the parent */
    _bt_insertonpg(rel, pbuf, stack->bts_parent,
                   0, NULL, new_item, stack->bts_offset,
                   is_only);

to

    /* Check for error only after writing children */
    if (pbuf == InvalidBuffer)
        elog(WARNING, "failed to re-find parent key in \"%s\"",
             RelationGetRelationName(rel));
    else
        /* Recursively update the parent */
        _bt_insertonpg(rel, pbuf, stack->bts_parent,
                       0, NULL, new_item, stack->bts_offset,
                       is_only);
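
The added "else" matters because elog(WARNING, ...) returns to its caller, whereas elog(ERROR, ...) does not. Without the "else", execution would fall through and call _bt_insertonpg with an invalid buffer. The control flow can be sketched in a standalone form as below. This is only an illustration: INVALID_BUFFER, report_warning, insert_into_parent and finish_split are hypothetical stand-ins, not PostgreSQL functions.

```c
#include <stdio.h>

/* Hypothetical stand-ins for illustration only; none of these are PostgreSQL APIs. */
#define INVALID_BUFFER 0

static int warnings_reported = 0;   /* how many warnings were logged */
static int parent_inserts = 0;      /* how many parent updates were attempted */

/* Behaves like a WARNING-level report: it logs and then returns to the caller. */
static void report_warning(const char *index_name)
{
    fprintf(stderr, "WARNING: failed to re-find parent key in \"%s\"\n",
            index_name);
    warnings_reported++;
}

/* Stand-in for the recursive parent update; it must never see an invalid buffer. */
static void insert_into_parent(int pbuf)
{
    (void) pbuf;
    parent_inserts++;
}

/* Same shape as the patched check: warn and skip, otherwise update the parent. */
static void finish_split(int pbuf, const char *index_name)
{
    if (pbuf == INVALID_BUFFER)
        report_warning(index_name);
    else
        insert_into_parent(pbuf);
}
```

Calling finish_split(INVALID_BUFFER, "1560660") logs the warning and leaves the parent untouched, while a call with any valid buffer goes through to insert_into_parent. The skipped parent update is why the affected index has to be rebuilt afterwards.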

Once the hacked build has started up and shut down cleanly, restart with a
standard postmaster and REINDEX the index(es) identified by the warning messages.

After that, think about an update ;-)

regards, tom lane

PS: if you try this, I'd *strongly* suggest first making a
filesystem-level backup of all of the $PGDATA tree, so that you aren't
any worse off if it doesn't work.
