Linux Software RAID Bitmap Question

Anyone have a good explanation for the use of bitmaps?

Anyone on the list use them?

http://gentoo-wiki.com/HOWTO_Gentoo_Install_on_Software_RAID#Data_Scrubbing

Provides an explanation on that page.

I believe Neil stated that using bitmaps does incur a 10% performance
penalty. If one's box never (or rarely) crashes, is a bitmap needed?

The one question I had regarding a bitmap is as follows:

The mismatch_cnt file.

If I have bitmaps turned on for my RAID DEVICES, is it possible that the
'mismatch_cnt' will be updated when it finds a bad block?

That would be VERY nice instead of running a check all the time.

Justin.
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo [at] vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Justin Piszcz [ Sun, 25 February 2007 15:05 ] [ ID #1639388 ]

Re: Linux Software RAID Bitmap Question

On Sunday February 25, jpiszcz [at] lucidpixels.com wrote:
> Anyone have a good explanation for the use of bitmaps?
>
> Anyone on the list use them?
>
> http://gentoo-wiki.com/HOWTO_Gentoo_Install_on_Software_RAID#Data_Scrubbing
>
> Provides an explanation on that page.
>
> I believe Neil stated that using bitmaps does incur a 10% performance
> penalty. If one's box never (or rarely) crashes, is a bitmap needed?

I think I said it "can" incur such a penalty. The actual cost is very
dependent on work-load.
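
Because the cost depends on the workload, one way to settle the question is
to time the same job on the same array with and without a bitmap. The sketch
below assumes mdadm's --grow mode with its --bitmap= option and uses /dev/md4
purely as an example device. The helper name bitmap_cmd is made up for
illustration, and it only prints the command so that it can be reviewed
before being run as root.

```shell
#!/bin/sh
# Hypothetical helper: print the mdadm command that switches an array's
# write-intent bitmap to the given mode. Nothing is executed here.
#   $1 = md device (for example /dev/md4)
#   $2 = internal | none | /absolute/path/to/an/external/bitmap/file
bitmap_cmd() {
    dev=$1
    mode=$2
    case "$mode" in
        internal|none|/*) ;;   # accepted modes
        *) echo "bitmap_cmd: bad mode '$mode'" >&2; return 1 ;;
    esac
    echo "mdadm --grow $dev --bitmap=$mode"
}

bitmap_cmd /dev/md4 internal   # add an internal bitmap
bitmap_cmd /dev/md4 none       # remove it again
```

Running the workload once after each command gives a per-workload figure
rather than a rule of thumb.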

>
> The one question I had regarding a bitmap is as follows:
>
> The mismatch_cnt file.
>
> If I have bitmaps turned on for my RAID DEVICES, is it possible that the
> 'mismatch_cnt' will be updated when it finds a bad block?
>
> That would be VERY nice instead of running a check all the time.

When md finds a bad block (read failure) it either fixes it (by
successfully over-writing the correct data) or fails the drive.

The count of the times that this has happened is available via
/sys/block/mdX/md/errors

If you use version-1 superblocks, then this count is maintained
throughout the life of the array. If you use v0.90, the count is
zeroed whenever you assemble the array.

This count is completely separate from the 'mismatch_cnt'.
'mismatch_cnt' refers to what md finds when it checks whether redundant
information (copies or parity) is consistent. This does not happen at all
during normal operation. It only happens when you ask for a 'check'
or 'repair' operation. It might also happen when the array
automatically performs a 'sync' after an unclean shutdown.
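
A 'check' can be requested through sysfs, and the result read back from
mismatch_cnt once it has finished. The following is a minimal sketch, assuming
the sync_action and mismatch_cnt files under /sys/block/mdX/md/ and an array
named md0. SYSFS_ROOT and the two function names are hypothetical, introduced
only so the functions can be tried against a scratch directory.

```shell
#!/bin/sh
# Sketch: start a consistency check and read back the mismatch count.
# SYSFS_ROOT is a hypothetical override; leave it unset on a real system.
SYSFS_ROOT=${SYSFS_ROOT:-/sys}

# Ask md to run a read-only 'check' pass on the array (needs root).
start_check() {
    echo check > "$SYSFS_ROOT/block/$1/md/sync_action"
}

# Print the mismatch count recorded by the most recent check or repair.
mismatch_count() {
    cat "$SYSFS_ROOT/block/$1/md/mismatch_cnt"
}

# Usage on a real array (md0 is an assumed name):
#   start_check md0
#   cat /proc/mdstat        # watch progress until the check completes
#   mismatch_count md0
```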

And all this has very little to do with bitmaps.
So I'm afraid I don't understand your question.

NeilBrown
NeilBrown [ Mon, 26 February 2007 05:22 ] [ ID #1640007 ]

Re: Linux Software RAID Bitmap Question

Neil Brown wrote:
> When md finds a bad block (read failure) it either fixes it (by
> successfully over-writing the correct data) or fails the drive.
>
> The count of the times that this has happened is available via
> /sys/block/mdX/md/errors
>
What kernel provides this? I have systems running everything from 2.6.15
to 2.6.20-get14, and there is no such file in any of them. There is a
per-device errors file one level down, but that presumably wouldn't be
in the superblock.

Do I have to go from 0.90 to v1 or later superblocks to get this, and if
so is that a safe thing to do?

--
bill davidsen <davidsen [at] tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979

Bill Davidsen [ Tue, 27 February 2007 18:42 ] [ ID #1641050 ]

Re: Linux Software RAID Bitmap Question

On Tuesday February 27, davidsen [at] tmr.com wrote:
> Neil Brown wrote:
> > When md finds a bad block (read failure) it either fixes it (by
> > successfully over-writing the correct data) or fails the drive.
> >
> > The count of the times that this has happened is available via
> > /sys/block/mdX/md/errors
> >
> What kernel provides this? I have systems running everything from 2.6.15
> to 2.6.20-get14, and there is no such file in any of them. There is a
> per-device errors file one level down, but that presumably wouldn't be
> in the superblock.

Sorry, I did get that wrong. As you say it is a per-device field:
/sys/block/mdX/md/dev-*/errors

which makes sense because it is individual devices that get errors,
not whole arrays. And it *is* stored in the superblock for v1. The
superblock has a per-device section which is potentially different on
each device. The corrected-error count is stored there.
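
The per-device counters can be listed with a short loop over the dev-*
directories. This is a sketch only: md0 is an assumed array name, and
SYSFS_ROOT and member_errors are hypothetical names added so the loop can be
pointed at a scratch directory instead of the real /sys.

```shell
#!/bin/sh
# Hypothetical override; leave unset to read the real /sys.
SYSFS_ROOT=${SYSFS_ROOT:-/sys}

# Print "<member> <corrected-error count>" for each member of an array.
member_errors() {
    for f in "$SYSFS_ROOT/block/$1/md"/dev-*/errors; do
        [ -r "$f" ] || continue          # glob matched nothing readable
        member=${f%/errors}              # strip the trailing /errors
        member=${member##*/dev-}         # keep the name after dev-
        echo "$member $(cat "$f")"
    done
}

member_errors md0
```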

>
> Do I have to go from 0.90 to v1 or later superblocks to get this, and if
> so is that a safe thing to do?

It is not currently easy to convert a v0.90 array to use v1
superblocks. I should put that on my todo list as it isn't
conceptually hard.

NeilBrown
NeilBrown [ Tue, 27 February 2007 22:04 ] [ ID #1641058 ]

Re: Linux Software RAID Bitmap Question

Neil Brown wrote:
> On Tuesday February 27, davidsen [at] tmr.com wrote:
>
>> Neil Brown wrote:
>>
>>> When md finds a bad block (read failure) it either fixes it (by
>>> successfully over-writing the correct data) or fails the drive.
>>>
>>> The count of the times that this has happened is available via
>>> /sys/block/mdX/md/errors
>>>
>>>
>> What kernel provides this? I have systems running everything from 2.6.15
>> to 2.6.20-get14, and there is no such file in any of them. There is a
>> per-device errors file one level down, but that presumably wouldn't be
>> in the superblock.
>>
>
> Sorry, I did get that wrong. As you say it is a per-device field:
> /sys/block/mdX/md/dev-*/errors
>
> which makes sense because it is individual devices that get errors,
> not whole arrays. And it *is* stored in the superblock for v1. The
> superblock has a per-device section which is potentially different on
> each device. The corrected-error count is stored there.
>
>
OK, I can add...
>> Do I have to go from 0.90 to v1 or later superblocks to get this, and if
>> so is that a safe thing to do?
>>
>
> It is not currently easy to convert a v0.90 array to use v1
> superblocks. I should put that on my todo list as it isn't
> conceptually hard.
Unfortunately the thing I'd like most is hard (RAID5E), but if/when the
SB conversion is available I would give it a test drive.

--
bill davidsen <davidsen [at] tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979

Bill Davidsen [ Wed, 28 February 2007 04:00 ] [ ID #1642312 ]

Re: Linux Software RAID Bitmap Question

On Mon, 26 Feb 2007, Neil Brown wrote:

> On Sunday February 25, jpiszcz [at] lucidpixels.com wrote:
> > I believe Neil stated that using bitmaps does incur a 10% performance
> > penalty. If one's box never (or rarely) crashes, is a bitmap needed?
>
> I think I said it "can" incur such a penalty. The actual cost is very
> dependent on work-load.

i did a crude benchmark recently... to get some data for a common setup
i use (external journals and bitmaps on raid1, xfs fs on raid5).

emphasis on "crude":

time sh -c 'tar xf /var/tmp/linux-2.6.20.tar; sync'

xfs journal   raid5 bitmap   times
internal      none           0.18s user 2.14s system 2% cpu 1:27.95 total
internal      internal       0.16s user 2.16s system 1% cpu 2:01.12 total
raid1         none           0.07s user 2.02s system 2% cpu 1:20.62 total
raid1         internal       0.14s user 2.01s system 1% cpu 1:55.18 total
raid1         raid1          0.14s user 2.03s system 2% cpu 1:20.61 total


raid5:
- 4x seagate 7200.10 400GB on marvell MV88SX6081
- mdadm --create --level=5 --raid-devices=4 /dev/md4 /dev/sd[abcd]1

raid1:
- 2x maxtor 6Y200P0 on 3ware 7504
- two 128MiB partitions starting at cyl 1
- mdadm --create --level=1 --raid-disks=2 --auto=yes --assume-clean /dev/md1 /dev/sd[fg]1
- mdadm --create --level=1 --raid-disks=2 --auto=yes --assume-clean /dev/md2 /dev/sd[fg]2
- md1 is used for external xfs journal
- md2 has an ext3 filesystem for the external md4 bitmap

xfs:
- mkfs.xfs issued before each run using the defaults (aside from -l logdev=/dev/md1)
- mount -o noatime,nodiratime[,logdev=/dev/md1]

system:
- dual opteron 848 (2.2ghz), 8GiB ddr 266
- tyan s2882
- 2.6.20
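
To turn the wall-clock totals in the table into relative slowdowns, a small
helper along these lines can be used. The function names to_seconds and
slowdown_pct are made up for this sketch, which assumes times in the
"M:SS.ss" form printed above.

```shell
#!/bin/sh
# Convert a "M:SS.ss" wall-clock time to seconds.
to_seconds() {
    echo "$1" | awk -F: '{ printf "%.2f\n", $1 * 60 + $2 }'
}

# Percentage slowdown of $2 relative to the baseline $1 (both "M:SS.ss").
slowdown_pct() {
    base=$(to_seconds "$1")
    with=$(to_seconds "$2")
    awk -v b="$base" -v w="$with" 'BEGIN { printf "%.1f\n", (w - b) / b * 100 }'
}

slowdown_pct 1:27.95 2:01.12   # internal journal: no bitmap vs internal bitmap
slowdown_pct 1:20.62 1:55.18   # raid1 journal:    no bitmap vs internal bitmap
slowdown_pct 1:20.62 1:20.61   # raid1 journal:    no bitmap vs bitmap on raid1
```

With the figures above, the internal bitmap costs roughly 38% and 43% on this
workload, while the bitmap kept on the separate raid1 makes no measurable
difference.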

-dean
dean gaudet [ Wed, 28 February 2007 20:30 ] [ ID #1642319 ]

Re: Linux Software RAID Bitmap Question

On Wed, 28 Feb 2007, dean gaudet wrote:

> On Mon, 26 Feb 2007, Neil Brown wrote:
>
>> On Sunday February 25, jpiszcz [at] lucidpixels.com wrote:
>>> I believe Neil stated that using bitmaps does incur a 10% performance
>>> penalty. If one's box never (or rarely) crashes, is a bitmap needed?
>>
>> I think I said it "can" incur such a penalty. The actual cost is very
>> dependent on work-load.
>
> i did a crude benchmark recently... to get some data for a common setup
> i use (external journals and bitmaps on raid1, xfs fs on raid5).
>
> emphasis on "crude":
>
> time sh -c 'tar xf /var/tmp/linux-2.6.20.tar; sync'
>
> xfs journal   raid5 bitmap   times
> internal      none           0.18s user 2.14s system 2% cpu 1:27.95 total
> internal      internal       0.16s user 2.16s system 1% cpu 2:01.12 total
> raid1         none           0.07s user 2.02s system 2% cpu 1:20.62 total
> raid1         internal       0.14s user 2.01s system 1% cpu 1:55.18 total
> raid1         raid1          0.14s user 2.03s system 2% cpu 1:20.61 total
>
>
> raid5:
> - 4x seagate 7200.10 400GB on marvell MV88SX6081
> - mdadm --create --level=5 --raid-devices=4 /dev/md4 /dev/sd[abcd]1
>
> raid1:
> - 2x maxtor 6Y200P0 on 3ware 7504
> - two 128MiB partitions starting at cyl 1
> - mdadm --create --level=1 --raid-disks=2 --auto=yes --assume-clean /dev/md1 /dev/sd[fg]1
> - mdadm --create --level=1 --raid-disks=2 --auto=yes --assume-clean /dev/md2 /dev/sd[fg]2
> - md1 is used for external xfs journal
> - md2 has an ext3 filesystem for the external md4 bitmap
>
> xfs:
> - mkfs.xfs issued before each run using the defaults (aside from -l logdev=/dev/md1)
> - mount -o noatime,nodiratime[,logdev=/dev/md1]
>
> system:
> - dual opteron 848 (2.2ghz), 8GiB ddr 266
> - tyan s2882
> - 2.6.20
>
> -dean
>

I like your crude benchmark, it works well. I have not tested with a
bitmap, but it does not seem worth it to use those.

Here are my results (for disk-speed comparison only):

All filesystems are XFS with mount options similar to yours.

Raid1 Dual 74GB Raptor (older models, no NCQ) [no bitmap]
# time sh -c 'tar xf linux-2.6.20.tar; sync'

real 0m40.226s
user 0m0.200s
sys 0m1.515s

Raid5 Quad 150 Raptor (NCQ) [no bitmap]
# time sh -c 'tar xf linux-2.6.20.tar; sync'

real 0m21.721s
user 0m0.174s
sys 0m1.541s

Raid5 Six 400GB Sata Drives (some NCQ, some not) [no bitmap]

# time sh -c 'tar xf linux-2.6.20.tar; sync'
real 1m7.322s
user 0m0.194s
sys 0m1.492s


Justin Piszcz [ Wed, 28 February 2007 20:44 ] [ ID #1642320 ]