Postgres server goes in recovery mode repeatedly
Hi,
We are using Postgres 8.4, and it has gone into recovery mode a couple of
times. The server process seems to fork another child process, which is
another postgres server running under the same data directory; after some
time it goes away while the old server is still running. There were a few
load issues on the server, but the load did not go above 32.
We are running openSUSE 10.2 x86_64 with 32 GB of physical memory.
Checking the logs, I found that there is a segmentation fault:
Sep 26 05:39:54 pace kernel: postgres[28694]: segfault at 0000000000000030 rip 000000000066ba8c rsp 00007fffd364da30 error 4
gdb shows this:
Reading symbols from /lib64/libdl.so.2...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libm.so.6...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libc.so.6...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libnss_files.so.2...done.
Loaded symbols for /lib64/libnss_files.so.2
0x00002ad6d7b8c2b3 in __select_nocancel () from /lib64/libc.so.6
(gdb)
Any suggestions as to what is causing this segmentation fault?
Re: Postgres server goes in recovery mode repeatedly
kunal sharma wrote:
> Hi,
> We are using Postgres 8.4, and it has gone into recovery mode a couple
> of times. The server process seems to fork another child process, which
> is another postgres server running under the same data directory; after
> some time it goes away while the old server is still running. There
> were a few load issues on the server, but the load did not go above 32.
>
> We are running openSUSE 10.2 x86_64 with 32 GB of physical memory.
> Checking the logs, I found that there is a segmentation fault:
>
>
> Sep 26 05:39:54 pace kernel: postgres[28694]: segfault at
> 0000000000000030 rip 000000000066ba8c rsp 00007fffd364da30 error 4
>
> gdb dump shows this
>
> Reading symbols from /lib64/libdl.so.2...done.
> Loaded symbols for /lib64/libdl.so.2
> Reading symbols from /lib64/libm.so.6...done.
> Loaded symbols for /lib64/libm.so.6
> Reading symbols from /lib64/libc.so.6...done.
> Loaded symbols for /lib64/libc.so.6
> Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
> Loaded symbols for /lib64/ld-linux-x86-64.so.2
> Reading symbols from /lib64/libnss_files.so.2...done.
> Loaded symbols for /lib64/libnss_files.so.2
> 0x00002ad6d7b8c2b3 in __select_nocancel () from /lib64/libc.so.6
> (gdb)
>
>
>
>
Please try to get a backtrace from gdb.
cheers
andrew
--
Sent via pgsql-hackers mailing list (pgsql-hackers [at] postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [ADMIN] Postgres server goes in recovery mode repeatedly
kunal sharma <ksharma.linux [at] gmail.com> writes:
> We are using Postgres 8.4 and it has been found going into recovery
8.4.what? (If not 8.4.1, an update would be the first thing to try.)
> Checking the logs, I found that there is a segmentation fault:
> Sep 26 05:39:54 pace kernel: postgres[28694]: segfault at 0000000000000030
> rip 000000000066ba8c rsp 00007fffd364da30 error 4
> gdb dump shows this
> Reading symbols from /lib64/libdl.so.2...done.
> Loaded symbols for /lib64/libdl.so.2
> Reading symbols from /lib64/libm.so.6...done.
> Loaded symbols for /lib64/libm.so.6
> Reading symbols from /lib64/libc.so.6...done.
> Loaded symbols for /lib64/libc.so.6
> Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
> Loaded symbols for /lib64/ld-linux-x86-64.so.2
> Reading symbols from /lib64/libnss_files.so.2...done.
> Loaded symbols for /lib64/libnss_files.so.2
> 0x00002ad6d7b8c2b3 in __select_nocancel () from /lib64/libc.so.6
> (gdb)
A segfault inside select() seems fairly unlikely. I suspect that you
used the wrong executable or otherwise got the wrong result here.
Please double-check, and next time show the whole stack trace ("bt")
not just the top function.
regards, tom lane
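
The kernel log line quoted above already carries some information that can be read without gdb. The following is a minimal sketch in Python, not anything posted in the thread: it parses that line and decodes the x86 page-fault error code. The function name decode_segfault, the field names, and the "below 4096" threshold for near_null are assumptions made for illustration; the error value is parsed as hexadecimal, which is how the kernel prints it as far as I know.

```python
import re

# Low bits of the x86 page-fault error code printed after "error".
PF_PROTECTION = 0x1  # set: protection violation; clear: page not present
PF_WRITE = 0x2       # set: write access; clear: read access
PF_USER = 0x4        # set: fault raised while running in user mode

SEGFAULT_RE = re.compile(
    r"(?P<prog>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"rip (?P<rip>[0-9a-f]+) rsp (?P<rsp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_segfault(line):
    """Split a kernel 'segfault at ...' log line into decoded fields."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a recognised segfault line")
    err = int(m.group("err"), 16)
    addr = int(m.group("addr"), 16)
    return {
        "program": m.group("prog"),
        "pid": int(m.group("pid")),
        "fault_address": addr,
        "instruction_pointer": int(m.group("rip"), 16),
        "access": "write" if err & PF_WRITE else "read",
        "mode": "user" if err & PF_USER else "kernel",
        "cause": "protection violation" if err & PF_PROTECTION else "page not present",
        # Illustrative threshold: addresses inside the first page are
        # usually a NULL pointer plus a small field offset.
        "near_null": addr < 4096,
    }


line = ("Sep 26 05:39:54 pace kernel: postgres[28694]: segfault at "
        "0000000000000030 rip 000000000066ba8c rsp 00007fffd364da30 error 4")
info = decode_segfault(line)
for key, value in info.items():
    print(key, "=", value)
```

Read this way, the line describes a user-mode read of an unmapped address (0x30) by PID 28694, which is what a NULL pointer dereference at a small offset usually looks like. The faulting instruction pointer (0x66ba8c) is also nowhere near the libc address gdb reported (0x2ad6d7b8c2b3), which fits the suspicion that the gdb output above does not belong to the crash.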
Re: Postgres server goes in recovery mode repeatedly
gdb backtrace:
(gdb) bt full
#0  0x00002ad6d7b8c2b3 in __select_nocancel () from /lib64/libc.so.6
No symbol table info available.
#1  0x00000000005a39bc in ServerLoop () at postmaster.c:1304
        timeout = {tv_sec = 55, tv_usec = 352000}
        rmask = {fds_bits = {24, 0 <repeats 15 times>}}
        selres = <value optimized out>
        readmask = {fds_bits = {24, 0 <repeats 15 times>}}
        nSockets = 5
        now = 1254241068
        last_touch_time = 1254238950
        __func__ = "ServerLoop"
#2  0x00000000005a4dba in PostmasterMain (argc=3, argv=0xb1e3d0) at postmaster.c:1040
        fpidfile = (FILE *) 0x3
        opt = <value optimized out>
        status = <value optimized out>
        userDoption = 0x1 <Address 0x1 out of bounds>
        __func__ = "PostmasterMain"
#3  0x0000000000553b5e in main (argc=3, argv=0xb1e3d0) at main.c:188
No locals.
(gdb)
Re: [ADMIN] Postgres server goes in recovery mode repeatedly
kunal sharma <ksharma.linux [at] gmail.com> writes:
> gdb backtrace:
> (gdb) bt full
> #0 0x00002ad6d7b8c2b3 in __select_nocancel () from /lib64/libc.so.6
> No symbol table info available.
> #1 0x00000000005a39bc in ServerLoop () at postmaster.c:1304
> timeout = {tv_sec = 55, tv_usec = 352000}
I think what you're showing us is a stack trace of an idle postmaster
process, not the process that crashed.
regards, tom lane
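
The reasoning here can be checked mechanically: a postmaster that is simply waiting for connections sits in select() called from ServerLoop, which is exactly the shape of the trace quoted above. The sketch below is a rough heuristic in Python under that assumption; the helper names frame_functions and looks_like_idle_postmaster are hypothetical, and it only inspects the frame lines of gdb's "bt" output, nothing else.

```python
import re

# Matches gdb frame lines such as
#   "#1 0x00000000005a39bc in ServerLoop () at postmaster.c:1304"
# and captures the frame number and the function name.
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-f]+ in )?(\S+) \(", re.MULTILINE)


def frame_functions(bt_text):
    """Return function names from a gdb backtrace, innermost frame first."""
    return [m.group(2) for m in FRAME_RE.finditer(bt_text)]


def looks_like_idle_postmaster(bt_text):
    """Heuristic: innermost frame is a select() variant called by ServerLoop."""
    funcs = frame_functions(bt_text)
    return len(funcs) >= 2 and "select" in funcs[0] and funcs[1] == "ServerLoop"


trace = """\
#0 0x00002ad6d7b8c2b3 in __select_nocancel () from /lib64/libc.so.6
#1 0x00000000005a39bc in ServerLoop () at postmaster.c:1304
#2 0x00000000005a4dba in PostmasterMain (argc=3, argv=0xb1e3d0) at postmaster.c:1040
#3 0x0000000000553b5e in main (argc=3, argv=0xb1e3d0) at main.c:188
"""
print(frame_functions(trace))
print(looks_like_idle_postmaster(trace))
```

A trace that matches this pattern says nothing about the crash itself. The process the kernel reported (PID 28694 in the log line) had already died, so its stack would have to come from a core file of that process rather than from attaching gdb to the running postmaster.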