RAID 10 resync leading to attempt to access beyond end of device

On 14.02.2007 23:08:56 by John Stilson

Hi,

I'm experiencing what appears to be a kernel bug in the raid10 driver:
immediately after a resync completes, an access beyond the end of the
rebuilt disk is attempted, which causes the disk to be failed.

The system is a single-processor dual-core Xeon 3000 at 1.86GHz. It has
four 250GB drives, two each on the two channels of an Intel ICH7. It's
running Fedora Core 4 with a custom-compiled, unpatched 2.6.20 kernel. I
can provide the kernel itself, config, etc. on request.

Here is the full dmesg output from when /dev/sdc1, part of /dev/md0,
was intentionally failed and re-added using mdadm /dev/md0 -f /dev/sdc1,
mdadm /dev/md0 -r /dev/sdc1, mdadm /dev/md0 -a /dev/sdc1.

Feb 14 16:20:18 testsvr kernel: raid10: Disk failure on sdc1, disabling
device.
Feb 14 16:20:18 testsvr kernel: Operation continuing on 3 devices
Feb 14 16:20:18 testsvr kernel: RAID10 conf printout:
Feb 14 16:20:18 testsvr kernel: --- wd:3 rd:4
Feb 14 16:20:18 testsvr kernel: disk 0, wo:0, o:1, dev:sda9
Feb 14 16:20:18 testsvr kernel: disk 1, wo:0, o:1, dev:sdb1
Feb 14 16:20:18 testsvr kernel: disk 2, wo:1, o:0, dev:sdc1
Feb 14 16:20:18 testsvr kernel: disk 3, wo:0, o:1, dev:sdd1
Feb 14 16:20:18 testsvr kernel: RAID10 conf printout:
Feb 14 16:20:18 testsvr kernel: --- wd:3 rd:4
Feb 14 16:20:18 testsvr kernel: disk 0, wo:0, o:1, dev:sda9
Feb 14 16:20:18 testsvr kernel: disk 1, wo:0, o:1, dev:sdb1
Feb 14 16:20:18 testsvr kernel: disk 3, wo:0, o:1, dev:sdd1
Feb 14 16:20:20 testsvr kernel: md: unbind<sdc1>
Feb 14 16:20:20 testsvr kernel: md: export_rdev(sdc1)
Feb 14 16:20:23 testsvr kernel: md: bind<sdc1>
Feb 14 16:20:23 testsvr kernel: RAID10 conf printout:
Feb 14 16:20:23 testsvr kernel: --- wd:3 rd:4
Feb 14 16:20:23 testsvr kernel: disk 0, wo:0, o:1, dev:sda9
Feb 14 16:20:23 testsvr kernel: disk 1, wo:0, o:1, dev:sdb1
Feb 14 16:20:23 testsvr kernel: disk 2, wo:1, o:1, dev:sdc1
Feb 14 16:20:23 testsvr kernel: disk 3, wo:0, o:1, dev:sdd1
Feb 14 16:20:23 testsvr kernel: md: recovery of RAID array md0
Feb 14 16:20:23 testsvr kernel: md: minimum _guaranteed_ speed: 1000
KB/sec/disk.
Feb 14 16:20:23 testsvr kernel: md: using maximum available idle IO
bandwidth (but not more than 40000 KB/sec) for recovery.
Feb 14 16:20:23 testsvr kernel: md: using 128k window, over a total of
8040320 blocks.
Feb 14 16:23:45 testsvr kernel: md: md0: recovery done.
Feb 14 16:23:45 testsvr kernel: attempt to access beyond end of device
Feb 14 16:23:45 testsvr kernel: sdc1: rw=1, want=901904331651136,
limit=16081002
Feb 14 16:23:45 testsvr kernel: raid10: Disk failure on sdc1, disabling
device.
Feb 14 16:23:45 testsvr kernel: Operation continuing on 3 devices
Feb 14 16:23:45 testsvr kernel: RAID10 conf printout:
Feb 14 16:23:45 testsvr kernel: --- wd:3 rd:4
Feb 14 16:23:45 testsvr kernel: disk 0, wo:0, o:1, dev:sda9
Feb 14 16:23:45 testsvr kernel: disk 1, wo:0, o:1, dev:sdb1
Feb 14 16:23:45 testsvr kernel: disk 2, wo:1, o:0, dev:sdc1
Feb 14 16:23:45 testsvr kernel: disk 3, wo:0, o:1, dev:sdd1
Feb 14 16:23:45 testsvr kernel: RAID10 conf printout:
Feb 14 16:23:45 testsvr kernel: --- wd:3 rd:4
Feb 14 16:23:45 testsvr kernel: disk 0, wo:0, o:1, dev:sda9
Feb 14 16:23:45 testsvr kernel: disk 1, wo:0, o:1, dev:sdb1
Feb 14 16:23:45 testsvr kernel: disk 3, wo:0, o:1, dev:sdd1

I made the kernel oops in handle_bad_sector() in ll_rw_blk.c to try to
get a backtrace; however, the backtrace looks mildly suspicious, so I
think it may not be a good indicator. Here it is anyway:

Feb 13 14:25:23 testsvr kernel: Oops: 0000 [#1]
Feb 13 14:25:23 testsvr kernel: SMP
Feb 13 14:25:23 testsvr kernel: CPU: 0
Feb 13 14:25:23 testsvr kernel: EIP: 0060:[] Not tainted VLI

Feb 13 14:25:23 testsvr kernel: EFLAGS: 00010296 (2.6.19.1 #3)
Feb 13 14:25:23 testsvr kernel: EIP is at handle_bad_sector+0x96/0xf0
Feb 13 14:25:23 testsvr kernel: eax: 00000039 ebx: 00000001 ecx:
f6a7c9c0 edx: 00000082
Feb 13 14:25:23 testsvr kernel: esi: 00000000 edi: f6a7c9c0 ebp:
f7451e58 esp: f7451df4
Feb 13 14:25:23 testsvr kernel: ds: 007b es: 007b ss: 0068
Feb 13 14:25:23 testsvr kernel: Process md0_raid10 (pid: 2267, ti=f7450000
task=f6ed2550 task.ti=f7450000)
Feb 13 14:25:23 testsvr kernel: Stack: c044c950 f7451e2c 00000001 00000102
f7ee0208 00f5606a 00000000 00000002
Feb 13 14:25:23 testsvr kernel: f7fb0408 eac0d400 00000001 00000102
f7ee0208 00000001 31646473 00000000
Feb 13 14:25:23 testsvr kernel: f6e80000 00000086 c0124ce1 00000086
f6e81bc0 f7fb0408 f7ee0208 f6a7c9c0
Feb 13 14:25:23 testsvr kernel: Call Trace:
Feb 13 14:25:23 testsvr kernel: [] __mod_timer+0x8e/0xa5
Feb 13 14:25:23 testsvr kernel: []
generic_make_request+0x64/0x21e
Feb 13 14:25:23 testsvr kernel: [] kobject_release+0x0/0x17
Feb 13 14:25:23 testsvr kernel: [] scsi_request_fn+0x15b/0x36e
Feb 13 14:25:23 testsvr kernel: []
generic_unplug_device+0x1b/0x2a
Feb 13 14:25:23 testsvr kernel: [] unplug_slaves+0x5c/0xa2
Feb 13 14:25:23 testsvr kernel: [] raid10d+0x564/0xc79
Feb 13 14:25:23 testsvr kernel: [] schedule+0x31e/0x8ed
Feb 13 14:25:23 testsvr kernel: [] schedule_timeout+0x72/0xb0
Feb 13 14:25:23 testsvr kernel: [] schedule_timeout+0x72/0xb0
Feb 13 14:25:23 testsvr kernel: [] md_thread+0x40/0x103
Feb 13 14:25:23 testsvr kernel: []
autoremove_wake_function+0x0/0x4b
Feb 13 14:25:23 testsvr kernel: [] md_thread+0x0/0x103
Feb 13 14:25:23 testsvr kernel: [] kthread+0xfc/0x100
Feb 13 14:25:23 testsvr kernel: [] kthread+0x0/0x100
Feb 13 14:25:23 testsvr kernel: [] kernel_thread_helper+0x7/0x10

Any help would be appreciated. I'm available to try any test -- this is a
test server that I can perform any kind of wild test on.

-John
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: RAID 10 resync leading to attempt to access beyond end of device

On 15.02.2007 00:37:47 by NeilBrown

On Wednesday February 14, john9601@gmail.com wrote:
> Feb 14 16:23:45 testsvr kernel: attempt to access beyond end of device
> Feb 14 16:23:45 testsvr kernel: sdc1: rw=1, want=901904331651136,
> limit=16081002

That 'want=' value is an enormous number! 52 bits. Looks a lot like an
uninitialised variable somewhere.

What does
grep . /sys/block/md*/md/dev-*/offset

show while the resync is running? How about

grep . /sys/block/md*/md/dev-*/size

And can you give me the output of "mdadm --detail" on the array?

Thanks,
NeilBrown

Re: RAID 10 resync leading to attempt to access beyond end of device

On 15.02.2007 19:02:25 by John Stilson

Ok, I tried the patch and got a kernel BUG this time (the BUG_ON(k == conf->copies)?)

-John

Feb 15 12:52:35 testsvr kernel: md: recovery of RAID array md0
Feb 15 12:52:35 testsvr kernel: md: minimum _guaranteed_ speed: 1000
KB/sec/disk.
Feb 15 12:52:35 testsvr kernel: md: using maximum available idle IO
bandwidth (but not more than 40000 KB/sec) for recovery.
Feb 15 12:52:35 testsvr kernel: md: using 128k window, over a total of
8040320 blocks.
Feb 15 12:55:57 testsvr kernel: ------------[ cut here ]------------
Feb 15 12:55:57 testsvr kernel: kernel BUG at drivers/md/raid10.c:1804!
Feb 15 12:55:57 testsvr kernel: invalid opcode: 0000 [#1]
Feb 15 12:55:57 testsvr kernel: SMP
Feb 15 12:55:57 testsvr kernel: Modules linked in:
Feb 15 12:55:57 testsvr kernel: CPU: 0
Feb 15 12:55:57 testsvr kernel: EIP: 0060:[] Not tainted VLI
Feb 15 12:55:57 testsvr kernel: EFLAGS: 00010246 (2.6.20test1 #3)
Feb 15 12:55:57 testsvr kernel: EIP is at sync_request+0x43d/0x928
Feb 15 12:55:57 testsvr kernel: eax: c2330e14 ebx: c2330dc0 ecx:
00000003 edx: 00000000
Feb 15 12:55:57 testsvr kernel: esi: f68b30c0 edi: f782d4c0 ebp:
00000002 esp: f7397e58
Feb 15 12:55:57 testsvr kernel: ds: 007b es: 007b ss: 0068
Feb 15 12:55:57 testsvr kernel: Process md0_resync (pid: 2589,
ti=f7396000 task=f7ade030 task.ti=f7396000)
Feb 15 12:55:57 testsvr kernel: Stack: f7397eac 00000000 00000024
00f55e00 00000000 f717fa00 00000000 00000000
Feb 15 12:55:57 testsvr kernel: 00000080 00000000 00000000
00000000 00000003 00000100 00000000 00000001
Feb 15 12:55:57 testsvr kernel: c020307c 00443eb0 00000000
00f55f00 00000000 00000400 c036b7ab 00f55e00
Feb 15 12:55:57 testsvr kernel: Call Trace:
Feb 15 12:55:57 testsvr kernel: [] __next_cpu+0x12/0x1f
Feb 15 12:55:57 testsvr kernel: [] sync_request+0x0/0x928
Feb 15 12:55:57 testsvr kernel: [] md_do_sync+0x581/0xa07
Feb 15 12:55:57 testsvr kernel: [] md_thread+0x0/0xdc
Feb 15 12:55:57 testsvr kernel: [] md_thread+0xc6/0xdc
Feb 15 12:55:57 testsvr kernel: [] complete+0x38/0x47
Feb 15 12:55:57 testsvr kernel: [] kthread+0xab/0xcf
Feb 15 12:55:57 testsvr kernel: [] kthread+0x0/0xcf
Feb 15 12:55:57 testsvr kernel: [] kernel_thread_helper+0x7/0x10
Feb 15 12:55:57 testsvr kernel: =======================
Feb 15 12:55:57 testsvr kernel: Code: 4f 04 8b 01 f0 ff 80 9c 00 00 00
f0 ff 03 31 ed 8d 43 34 eb 0c 8b 4c 24 30 39 08 74 09 45 83 c0 10 3b
6f 1c 7c ef 3b 6f 1c 75 04 <0f> 0b eb fe 8b 4b 38 c1 e5 04 89 71 08
89 59 3c c7 41 34 ba b6
Feb 15 12:55:57 testsvr kernel: EIP: []
sync_request+0x43d/0x928 SS:ESP 0068:f7397e58


On 2/14/07, John Stilson wrote:
> Wow thanks for the quick response. I will try this tomorrow morning
> and let you know.
>
> -John
>
> On 2/14/07, Neil Brown wrote:
> >
> > Thanks for the extra detail. I think I've nailed it.
> > Does this fix it for you?
> >
> > Thanks,
> > NeilBrown
> >
> > Signed-off-by: Neil Brown
> >
> > ### Diffstat output
> > ./drivers/md/raid10.c | 4 +++-
> > 1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff .prev/drivers/md/raid10.c ./drivers/md/raid10.c
> > --- .prev/drivers/md/raid10.c 2007-02-15 13:57:34.000000000 +1100
> > +++ ./drivers/md/raid10.c 2007-02-15 15:20:04.000000000 +1100
> > @@ -420,7 +420,7 @@ static sector_t raid10_find_virt(conf_t
> > if (dev < 0)
> > dev += conf->raid_disks;
> > } else {
> > - while (sector > conf->stride) {
> > + while (sector >= conf->stride) {
> > sector -= conf->stride;
> > if (dev < conf->near_copies)
> > dev += conf->raid_disks - conf->near_copies;
> > @@ -1747,6 +1747,7 @@ static sector_t sync_request(mddev_t *md
> > for (k=0; k<conf->copies; k++)
> > if (r10_bio->devs[k].devnum == i)
> > break;
> > + BUG_ON(k == conf->copies);
> > bio = r10_bio->devs[1].bio;
> > bio->bi_next = biolist;
> > biolist = bio;
> > @@ -1973,6 +1974,7 @@ static int run(mddev_t *mddev)
> > conf->far_offset = fo;
> > conf->chunk_mask = (sector_t)(mddev->chunk_size>>9)-1;
> > conf->chunk_shift = ffz(~mddev->chunk_size) - 9;
> > + mddev->size &= ~(conf->chunk_mask >> 1);
> > if (fo)
> > conf->stride = 1 << conf->chunk_shift;
> > else {
> >
>

Re: RAID 10 resync leading to attempt to access beyond end of device

On 15.02.2007 19:23:38 by John Stilson

Oh, an additional piece of information I just realized I had not put
in my original email: this failure only happens intermittently -- on
50%-75% of rebuilds.

-John

On 2/15/07, John Stilson wrote:
> Ok tried the patch and got a kernel BUG this time (BUG_ON(k == conf->copies)?)
>
> -John

(unknown)

On 15.02.2007 19:28:20 by Derek Yeung

help


(unknown)

On 15.02.2007 19:53:38 by Derek Yeung

unsubscribe linux-raid


Re: RAID 10 resync leading to attempt to access beyond end of device

On 16.02.2007 03:25:51 by NeilBrown

On Thursday February 15, john9601@gmail.com wrote:
> Ok tried the patch and got a kernel BUG this time (BUG_ON(k == conf->copies)?)

Thanks... obviously I missed some subtlety. I think I have it right now.
I've tested this against a setup which I think is sufficiently
identical to yours this time (now that I know what the important
parameter is: device size), but if you could test it too, that would
be great.

This patch is in place of the previous patch.

Thanks,
NeilBrown


Signed-off-by: Neil Brown

### Diffstat output
./drivers/md/raid10.c | 39 +++++++++++++++++++++------------------
1 file changed, 21 insertions(+), 18 deletions(-)

diff .prev/drivers/md/raid10.c ./drivers/md/raid10.c
--- .prev/drivers/md/raid10.c 2007-02-15 13:57:34.000000000 +1100
+++ ./drivers/md/raid10.c 2007-02-16 13:23:55.000000000 +1100
@@ -420,7 +420,7 @@ static sector_t raid10_find_virt(conf_t
if (dev < 0)
dev += conf->raid_disks;
} else {
- while (sector > conf->stride) {
+ while (sector >= conf->stride) {
sector -= conf->stride;
if (dev < conf->near_copies)
dev += conf->raid_disks - conf->near_copies;
@@ -1747,6 +1747,8 @@ static sector_t sync_request(mddev_t *md
for (k=0; k<conf->copies; k++)
if (r10_bio->devs[k].devnum == i)
break;
+
+ BUG_ON(k == conf->copies);
bio = r10_bio->devs[1].bio;
bio->bi_next = biolist;
biolist = bio;
@@ -1967,19 +1969,30 @@ static int run(mddev_t *mddev)
if (!conf->tmppage)
goto out_free_conf;

+ conf->mddev = mddev;
+ conf->raid_disks = mddev->raid_disks;
conf->near_copies = nc;
conf->far_copies = fc;
conf->copies = nc*fc;
conf->far_offset = fo;
conf->chunk_mask = (sector_t)(mddev->chunk_size>>9)-1;
conf->chunk_shift = ffz(~mddev->chunk_size) - 9;
+ size = mddev->size >> (conf->chunk_shift-1);
+ sector_div(size, fc);
+ size = size * conf->raid_disks;
+ sector_div(size, nc);
+ /* 'size' is now the number of chunks in the array */
+ /* calculate "used chunks per device" in 'stride' */
+ stride = size * conf->copies;
+ sector_div(stride, conf->raid_disks);
+ mddev->size = stride << (conf->chunk_shift-1);
+
if (fo)
- conf->stride = 1 << conf->chunk_shift;
- else {
- stride = mddev->size >> (conf->chunk_shift-1);
+ stride = 1;
+ else
sector_div(stride, fc);
- conf->stride = stride << conf->chunk_shift;
- }
+ conf->stride = stride << conf->chunk_shift;
+
conf->r10bio_pool = mempool_create(NR_RAID10_BIOS, r10bio_pool_alloc,
r10bio_pool_free, conf);
if (!conf->r10bio_pool) {
@@ -2009,8 +2022,6 @@ static int run(mddev_t *mddev)

disk->head_position = 0;
}
- conf->raid_disks = mddev->raid_disks;
- conf->mddev = mddev;
spin_lock_init(&conf->device_lock);
INIT_LIST_HEAD(&conf->retry_list);

@@ -2052,16 +2063,8 @@ static int run(mddev_t *mddev)
/*
* Ok, everything is just fine now
*/
- if (conf->far_offset) {
- size = mddev->size >> (conf->chunk_shift-1);
- size *= conf->raid_disks;
- size <<= conf->chunk_shift;
- sector_div(size, conf->far_copies);
- } else
- size = conf->stride * conf->raid_disks;
- sector_div(size, conf->near_copies);
- mddev->array_size = size/2;
- mddev->resync_max_sectors = size;
+ mddev->array_size = size << (conf->chunk_shift-1);
+ mddev->resync_max_sectors = size << conf->chunk_shift;

mddev->queue->issue_flush_fn = raid10_issue_flush;
mddev->queue->backing_dev_info.congested_fn = raid10_congested;

Re: RAID 10 resync leading to attempt to access beyond end of device

On 19.02.2007 18:16:39 by John Stilson

Hey Neil,

I tested this new patch and it seems to work! I'm going to do some
more vigorous testing, and I'll let you know if any more issues bubble
out. Thanks!

-John

On 2/15/07, Neil Brown wrote:
> On Thursday February 15, john9601@gmail.com wrote:
> > Ok tried the patch and got a kernel BUG this time (BUG_ON(k == conf->copies)?)
>
> Thanks.... obviously I missed some subtlety. I think I have it right
> now.
> I've tested this against a setup which I think is sufficiently
> identical to yours this time (now that I know what the important
> parameters are: device size), but if you could test it too, that would
> be great.
>
> This patch is in place of the previous patch.
>
> Thanks,
> NeilBrown