SATA RAID5 kernel hang on 2.6.11-1.14_FC3

on 12.05.2005 07:37:16 by Tim Harvey

Greetings,

I'm attempting to run a RAID5 array with the following configuration:

ASUS TUSI-M Motherboard, Celeron 1GHz, 256MB SDRAM
Promise SATAII-150-TX4 4-port RAID controller
4 x Seagate ST300 300GB SATA NCQ drives
Linux FC3 system with 2.6.11-1.14_FC3

I've been trying to determine the cause of a kernel hang that occurs when I
start to transfer a large amount of data to the array over an NFS/SMB mount.

After anywhere from a few seconds to hours of large transfers, the console
starts endlessly spewing the following:

do_IRQ: stack overflow: 312
[] do_IRQ+0x83/0x85
[] common_interrupt+0x1a/0x20
[] cfq_set_request+0x1b2/0x4fd
[] autoremove_wake_function+0x0/0x37
[] cfq_set_request+0x0/0x4fd
[] elv_set_request+0x20/0x23
[] get_request+0x21a/0x56e
[] __make_request+0x15b/0x629
[] generic_make_request+0x19e/0x279
[] autoremove_wake_function+0x0/0x37
[] autoremove_wake_function+0x0/0x37
[] handle_stripe+0xf7e/0x16a3 [raid5]
[] raid5_build_block+0x65/0x70 [raid5]
[] get_active_stripe+0x29e/0x560 [raid5]
[] make_request+0x349/0x539 [raid5]
[] autoremove_wake_function+0x0/0x37
[] mempool_alloc+0x72/0x2a9
[] autoremove_wake_function+0x0/0x37
[] generic_make_request+0x19e/0x279
[] autoremove_wake_function+0x0/0x37
[] autoremove_wake_function+0x0/0x37
[] bio_clone+0xa1/0xa6
[] __map_bio+0x30/0xc8 [dm_mod]
[] __clone_and_map+0xcd/0x309 [dm_mod]
[] __split_bio+0x9d/0x10b [dm_mod]
[] dm_request+0x5f/0x88 [dm_mod]
[] generic_make_request+0x19e/0x279
[] autoremove_wake_function+0x0/0x37
[] autoremove_wake_function+0x0/0x37
[] prep_new_page+0x5c/0x5f
[] autoremove_wake_function+0x0/0x37
[] submit_bio+0x4b/0xc5
[] autoremove_wake_function+0x0/0x37
[] bio_add_page+0x29/0x2f
[] _pagebuf_ioapply+0x164/0x2d9 [xfs]
[] pagebuf_iorequest+0x33/0x14a [xfs]
[] _pagebuf_find+0xd9/0x2f3 [xfs]
[] _pagebuf_map_pages+0x64/0x84 [xfs]
[] xfs_buf_get_flags+0xc4/0x108 [xfs]
[] pagebuf_iostart+0x53/0x8c [xfs]
[] xfs_buf_read_flags+0x4f/0x6c [xfs]
[] xfs_trans_read_buf+0x1b9/0x31b [xfs]
....

I would like to give a more detailed report, but I'm not really sure what to do
next. It would seem that something in the RAID5 code is recursing endlessly? I'm
not quite sure how to proceed, as I'm not getting an oops and the system is
non-responsive (other than spewing the endless stack dump).

Any help would be greatly appreciated,

Tim

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: SATA RAID5 kernel hang on 2.6.11-1.14_FC3

on 12.05.2005 11:27:17 by Joshua Baker-LePain

On Wed, 11 May 2005 at 10:37pm, Tim Harvey wrote

> Greetings,
>
> I'm attempting to run a RAID5 array with the following configuration:
>
> ASUS TUSI-M Motherboard, Celeron 1GHz, 256MB SDRAM
> Promise SATAII-150-TX4 4-port RAID controller
> 4 x Seagate ST300 300GB SATA NCQ drives
> Linux FC3 system with 2.6.11-1.14_FC3
>
*snip*
> do_IRQ: stack overflow: 312
*snip*
> [] xfs_buf_get_flags+0xc4/0x108 [xfs]
> [] pagebuf_iostart+0x53/0x8c [xfs]
> [] xfs_buf_read_flags+0x4f/0x6c [xfs]
> [] xfs_trans_read_buf+0x1b9/0x31b [xfs]
> ...

It looks like you were using XFS as your filesystem? XFS has had some
issues with 4K stacks, although I thought they were lessened in more
recent kernels. You may want to ask over on the XFS list.
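[Editor's note: a quick diagnostic sketch, not part of the original thread. The
"do_IRQ: stack overflow" message suggests the kernel was built with 4K stacks
(the i386 CONFIG_4KSTACKS option, which Fedora kernels of this era enabled by
default); the deep XFS -> dm -> md raid5 call chain in the trace is exactly the
kind of path known to overflow them. One way to check is to inspect the config
file for the running kernel; the path below is the standard Fedora location and
may differ on other distributions.]

```shell
# Check whether the running kernel was built with 4K stacks.
# CONFIG_4KSTACKS=y means each kernel thread gets a 4K stack;
# rebuilding with it unset ("# CONFIG_4KSTACKS is not set") gives
# 8K stacks and more headroom for deep XFS/dm/md call chains.
grep CONFIG_4KSTACKS "/boot/config-$(uname -r)"
```

If the option is set, testing a kernel rebuilt with 8K stacks (or a distro
kernel variant that ships them) would help confirm whether stack exhaustion,
rather than a raid5 bug, is the trigger.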

--
Joshua Baker-LePain
Department of Biomedical Engineering
Duke University