error injection
Hi
I am trying to dynamically add error injection to my virtual
disk (LVM) for testing and debugging purposes. I saw the "faulty" personality
module in the kernel and was wondering if there was any documentation
on its usage. I am not looking to set up a RAID but a simple mapped
device. So the basic use case is that I need to be able to dynamically
add/remove error sectors and also be able to have granular error
configuration like read error, read+write error, etc.
thanks in advance
Jojy
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo [at] vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: error injection
On Wed, 28 Sep 2011 12:48:37 -0700 Jojy Varghese <jojy.varghese [at] gmail.com>
wrote:
> Hi
> I am trying to dynamically add error injection to my virtual
> disk(LVM) for testing+ debugging purpose. I saw "faulty" personality
> module in the kernel and was wondering if there was any documentation
> on its usage. I am not looking to set up a RAID but a simple mapped
> device. So the basic use case is that I need to be able to dynamically
> add/remove error sectors and also be able to have granular error
> configuration like read error, read+write error etc.
>
> thanks in advance
> Jojy
The 'faulty' md personality is described briefly in the 'md.4' man page, which
is included in the mdadm distribution.
I've included the relevant part below.
Configuring the type of faults is described in mdadm.8 under the '-p
--layout=' section. So you can adjust the settings using mdadm --grow.
So:
  mdadm -B /dev/md0 -l faulty -n1 /dev/sda
will build a 'faulty' device which provides access to /dev/sda, but
introduces faults. Initially no faults will be introduced.
  mdadm -G /dev/md0 --layout=rt400
will tell md0 to generate a read error every 400 requests, but not to
remember the error - rt == read-transient
  --layout=rp400
will create a persistent error every 400 reads: subsequent reads of the same
block will produce the same error. At most 50 persistent errors can be
recorded.
  mdadm -G /dev/md0 --layout=clear
will stop producing new errors.
  mdadm -G /dev/md0 --layout=flush
will forget all persistent errors.
from md.4:
   FAULTY
       The FAULTY md module is provided for testing purposes. A faulty array
       has exactly one component device and is normally assembled without a
       superblock, so the md array created provides direct access to all of
       the data in the component device.

       The FAULTY module may be requested to simulate faults to allow testing
       of other md levels or of filesystems. Faults can be chosen to trigger
       on read requests or write requests, and can be transient (a subsequent
       read/write at the address will probably succeed) or persistent
       (subsequent read/write of the same address will fail). Further, read
       faults can be "fixable" meaning that they persist until a write request
       at the same address.

       Fault types can be requested with a period. In this case, the fault
       will recur repeatedly after the given number of requests of the
       relevant type. For example if persistent read faults have a period of
       100, then every 100th read request would generate a fault, and the
       faulty sector would be recorded so that subsequent reads on that sector
       would also fail.

       There is a limit to the number of faulty sectors that are remembered.
       Faults generated after this limit is exhausted are treated as
       transient.

       The list of faulty sectors can be flushed, and the active list of
       failure modes can be cleared.
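The semantics described in the md.4 excerpt can be illustrated with a small
toy model. This is only a sketch of the behaviour as described in the text,
not the kernel implementation: the class name FaultyReadModel, its fields, and
the choice not to count reads of already-bad sectors toward the period are all
assumptions made for illustration. The limit of 50 is the figure quoted
earlier in this message.

```python
from dataclasses import dataclass, field

MAX_REMEMBERED = 50  # limit quoted in this thread; treated here as an assumption


@dataclass
class FaultyReadModel:
    """Toy model of periodic read faults; NOT the kernel implementation."""
    period: int                 # inject a fault on every `period`-th read
    persistent: bool = True     # True ~ rp (remember sector), False ~ rt
    reads_seen: int = 0
    bad_sectors: set = field(default_factory=set)

    def read(self, sector: int) -> bool:
        """Return True if the read succeeds, False if it faults."""
        if sector in self.bad_sectors:
            return False  # a persistent fault was recorded for this sector
        self.reads_seen += 1
        if self.reads_seen % self.period != 0:
            return True
        # The period has elapsed: inject a fault and remember it if there is
        # room; once the limit is reached the fault is effectively transient.
        if self.persistent and len(self.bad_sectors) < MAX_REMEMBERED:
            self.bad_sectors.add(sector)
        return False

    def flush(self) -> None:
        """Forget all remembered sectors (the effect described for 'flush')."""
        self.bad_sectors.clear()
```

With period=100 and persistent=True, the 100th read in this model fails and
later reads of that same sector keep failing until flush() is called; with
persistent=False the same sector succeeds again on the next read.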
from mdadm.8:
              When setting the failure mode for level faulty, the options are:
              write-transient, wt, read-transient, rt, write-persistent, wp,
              read-persistent, rp, write-all, read-fixable, rf, clear, flush,
              none.

              Each failure mode can be followed by a number, which is used as
              a period between fault generation. Without a number, the fault
              is generated once on the first relevant request. With a number,
              the fault will be generated after that many requests, and will
              continue to be generated every time the period elapses.

              Multiple failure modes can be current simultaneously by using
              the --grow option to set subsequent failure modes.

              "clear" or "none" will remove any pending or periodic failure
              modes, and "flush" will clear any persistent faults.
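For scripted tests it can help to build these --layout arguments
programmatically. The following is a minimal sketch: the helper name
faulty_layout_cmd is hypothetical, the mode names are copied from the mdadm.8
excerpt above, and the function only builds an argument list (it does not run
mdadm, which would need root and a real array).

```python
import shlex

# Mode names exactly as listed in the mdadm.8 excerpt quoted in this message.
FAULTY_MODES = {
    "write-transient", "wt", "read-transient", "rt",
    "write-persistent", "wp", "read-persistent", "rp",
    "write-all", "read-fixable", "rf", "clear", "flush", "none",
}


def faulty_layout_cmd(device, mode, period=None):
    """Build (but do not run) an `mdadm --grow` command for a faulty array.

    Hypothetical helper for illustration only.
    """
    if mode not in FAULTY_MODES:
        raise ValueError(f"unknown failure mode: {mode!r}")
    if period is not None and period <= 0:
        raise ValueError("period must be a positive integer")
    # A mode may be followed by a number that acts as the period.
    layout = mode if period is None else f"{mode}{period}"
    return ["mdadm", "--grow", device, f"--layout={layout}"]


if __name__ == "__main__":
    # Print the commands instead of executing them.
    for mode, period in [("rt", 400), ("rp", 400), ("clear", None)]:
        print(shlex.join(faulty_layout_cmd("/dev/md0", mode, period)))
```

Since multiple failure modes can be active at once, several such commands
could be issued one after another, followed by "clear" and "flush" to return
the device to a fault-free state.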
NeilBrown
Re: error injection
Thanks Neil. I tried setting my sda7 partition to generate write
errors every 40 bytes (writing 1 byte at a time). I did:
1. Create an array with:
mdadm -C /dev/md/me0 -l faulty -n1 /dev/sda7
After this step I can see /dev/md127, and when I do mdadm -D /dev/md127, I get:
/dev/md127:
        Version : 1.2
  Creation Time : Wed Sep 28 17:35:50 2011
     Raid Level : faulty
     Array Size : 969410424 (924.50 GiB 992.68 GB)
   Raid Devices : 1
  Total Devices : 1
    Persistence : Superblock is persistent

    Update Time : Wed Sep 28 17:35:50 2011
          State : clean
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           Name : eng-dev16.lab.local:me0  (local to host eng-dev16.lab.local)
           UUID : 96f4be10:312f9574:f40107aa:d9f278ba
         Events : 0

    Number   Major   Minor   RaidDevice State
       0       8        7        0      active sync   /dev/sda7
2. Set write fault level with:
mdadm -G /dev/md/me0 --layout=wp40
After this, when I write > 40 bytes into /dev/md127, I don't get any
I/O errors. I am sure I am doing something wrong here.
Any help is much appreciated.
Thanks
Jojy
Re: error injection
On Wed, 28 Sep 2011 17:59:49 -0700 Jojy Varghese <jojy.varghese [at] gmail.com>
wrote:
> Thanks Neil. I tried setting my sda7 partition to generate write
> errors every 40 bytes(writing 1 byte at a time). I did :
md doesn't see byte writes. It sees sectors or more - usually whole pages or
groups of pages.
>
> 1. Create a array with:
> mdadm -C /dev/md/me0 -l faulty -n1 /dev/sda7
-C will write a superblock to /dev/sda7 which you don't really want. It
doesn't hurt, but I always used -B (--build) to avoid any metadata.
>
> After this step I can see /dev/md127 and when i do a mdadm -D /dev/md127, i get:
>
> /dev/md127:
> Version : 1.2
> Creation Time : Wed Sep 28 17:35:50 2011
> Raid Level : faulty
> Array Size : 969410424 (924.50 GiB 992.68 GB)
> Raid Devices : 1
> Total Devices : 1
> Persistence : Superblock is persistent
>
> Update Time : Wed Sep 28 17:35:50 2011
> State : clean
> Active Devices : 1
> Working Devices : 1
> Failed Devices : 0
> Spare Devices : 0
>
> Name : eng-dev16.lab.local:me0 (local to host eng-dev16.lab.local)
> UUID : 96f4be10:312f9574:f40107aa:d9f278ba
> Events : 0
>
> Number Major Minor RaidDevice State
> 0 8 7 0 active sync /dev/sda7
>
>
> 2. Set write fault level with:
>
> mdadm -G /dev/md/me0 --layout=wp40
>
>
>
> After this when i write > 40 bytes into /dev/md127, i dont get any
> I/O errors. I am sure i am doing something wrong here.
When you write to /dev/md127 it will just go into the page cache and
eventually be flushed to the device in one write.
Use O_DIRECT or O_SYNC and it will be flushed out more quickly, but always
write at least 512 bytes at a time.
NeilBrown
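The advice above (bypass the page cache delay with O_SYNC and write at least
512 bytes at a time) might be exercised with something like the sketch below.
The function name write_sectors_sync and its return shape are hypothetical.
It uses O_SYNC only, since O_DIRECT additionally requires aligned buffers.
Note that it overwrites data on whatever path it is given, so it should only
be pointed at a scratch device or file.

```python
import os

SECTOR = 512  # minimum write size suggested in the reply above


def write_sectors_sync(path, count, fill=b"\xa5"):
    """Write `count` 512-byte blocks with O_SYNC; return (written, errors).

    `errors` collects (block_index, errno) for each write that raised
    OSError, which is how an injected fault would be expected to surface
    in user space. WARNING: this overwrites the start of `path`.
    """
    block = fill * SECTOR
    written, errors = 0, []
    fd = os.open(path, os.O_WRONLY | os.O_SYNC)
    try:
        for i in range(count):
            try:
                # Each synchronous write is pushed to the device before
                # returning, instead of sitting in the page cache.
                os.pwrite(fd, block, i * SECTOR)
                written += 1
            except OSError as exc:
                errors.append((i, exc.errno))
    finally:
        os.close(fd)
    return written, errors
```

For example, after `mdadm -G /dev/md/me0 --layout=wp40`, calling
write_sectors_sync("/dev/md127", 100) as root would be expected to report
some failed blocks in `errors`, though exactly which writes fail depends on
how the kernel groups the requests.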
>
>
> Any help is much appreciated.
>
> Thanks
> Jojy
Re: error injection
Thanks Neil. Also, is there any way to find the current fault blocks being set?
Re: error injection
On Wed, 28 Sep 2011 19:06:17 -0700 Jojy Varghese <jojy.varghese [at] gmail.com>
wrote:
> Thanks Neil. Also, is there any way to find the current fault blocks being set?
>
No. All you can get is what is shown in "/proc/mdstat".
It wouldn't be too hard to add something to /proc/mdstat or /sys/.... to show
that information.
NeilBrown
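Since /proc/mdstat is the only place to look, a test script may want to pull
out just the lines for one array. The sketch below assumes the usual
/proc/mdstat layout, where each array starts with a "<name> :" line followed
by indented detail lines. The helper name mdstat_stanza is hypothetical, and
this thread does not say which fields the faulty personality prints there, so
the function returns the raw lines rather than parsing specific values.

```python
def mdstat_stanza(text, name):
    """Return the lines of /proc/mdstat text that describe array `name`.

    A stanza is assumed to start at a line beginning with '<name> :' and to
    continue over the indented, non-empty lines that follow it.
    """
    stanza = []
    for line in text.splitlines():
        if line.startswith(f"{name} :"):
            stanza = [line]           # first line of the wanted array
        elif stanza and line[:1].isspace() and line.strip():
            stanza.append(line)       # indented continuation line
        elif stanza:
            break                     # blank line or next array ends it
    return stanza
```

It could be used as mdstat_stanza(open("/proc/mdstat").read(), "md127"),
comparing the result before and after a `mdadm -G ... --layout=...` call to
see what changes.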