I am an idiot.

Hi guys...

Yes I am an idiot. I was changing the chunk size of my RAID5 array
last night from 64kb to 256kb and left it running overnight. During
the night we had a power outage.

This is where the idiot part comes in. The backup file is on a
filesystem that's part of the RAID5 array, so obviously I am unable to
start it. I completely forgot the filesystem I specified for
--backup-file was part of the same array.

Once you're all done pointing and laughing, can you let me know if I
am totally screwed? I've a lot of data here that I -really- don't
want to lose...

Please help..

Idiot.

--
Alex Boag-Munroe

Lack of planning on your part does not constitute an emergency on mine.
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo [at] vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alex Boag-Munroe [ Thu, 04 March 2010 12:30 ] [ ID #2034006 ]

Re: I am an idiot.

On 04/03/2010 11:30, Alex Boag-Munroe wrote:
> Hi guys...
>
> Yes I am an idiot. I was changing the chunk size of my RAID5 array
> last night from 64kb to 256kb and left it running overnight. During
> the night we had a power outage.
>
> This is where the idiot part comes in. The backup file is on a
> filesystem that's part of the RAID5 array, so obviously I am unable to
> start it. I completely forgot the filesystem I specified for
> --backup-file was part of the same array.
>
> Once you're all done pointing and laughing, can you let me know if I
> am totally screwed? I've a lot of data here that I -really- don't
> want to lose...
>
> Please help..
>
> Idiot.
>
> --
> Alex Boag-Munroe
>
> Lack of planning on your part does not constitute an emergency on mine.

OK, I was done pointing and laughing, until I saw your signature. Did
you choose that on purpose or did Gmail pick it for you?

I'm afraid I can't help with your problem, except to say that I've a
feeling you ought to be able to manually restart the half-reshaped array
without the backup file, so the worst case ought to be that you might
lose one backup file's worth of data. However, kernel and mdadm versions
together with output of `mdadm --detail` of your md device and `mdadm
--examine` of its constituent devices will help those more knowledgeable
than me tell you what to do next. If you're lucky the boss, Neil Brown,
will help but I imagine he's asleep right now since he lives in
Australia and it's the middle of the night there.

Best of luck,

John.
John Robinson [ Thu, 04 March 2010 13:01 ] [ ID #2034007 ]

Re: I am an idiot.

Hi John, thanks so much for your reply.

That is my signature and I stand by it, hence the whole "me idiot" and
not DEMANDING I get help etc.

mdadm is version 3.1.1. New developments. I found a post on the
internet where Neil recommended to someone to recreate the array
without erasing it. Which I have done, mdadm starts the array and
mdadm -D shows that almost a terabyte of space is in use.

However, mdadm -D also shows a chunk size of 512k, which is neither
the 64k original chunk nor the 512k I asked for.

Kernel version is gentoo-sources-2.6.33.

Output of mdadm --examine for /dev/sda5 through /dev/sdd5:

/dev/sda5:
Magic : a92b4efc
Version : 0.90.00
UUID : 17862986:014cb4c0:ffe6e849:786ed339 (local to host ncc-1701-e)
Creation Time : Thu Mar 4 13:10:24 2010
Raid Level : raid5
Used Dev Size : 974767616 (929.61 GiB 998.16 GB)
Array Size : 2924302848 (2788.83 GiB 2994.49 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 1

Update Time : Thu Mar 4 13:10:29 2010
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Checksum : b951290 - correct
Events : 3

Layout : left-symmetric
Chunk Size : 512K

Number Major Minor RaidDevice State
this 0 8 5 0 active sync /dev/sda5

0 0 8 5 0 active sync /dev/sda5
1 1 8 21 1 active sync /dev/sdb5
2 2 8 37 2 active sync /dev/sdc5
3 3 8 53 3 active sync /dev/sdd5
/dev/sdb5:
Magic : a92b4efc
Version : 0.90.00
UUID : 17862986:014cb4c0:ffe6e849:786ed339 (local to host ncc-1701-e)
Creation Time : Thu Mar 4 13:10:24 2010
Raid Level : raid5
Used Dev Size : 974767616 (929.61 GiB 998.16 GB)
Array Size : 2924302848 (2788.83 GiB 2994.49 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 1

Update Time : Thu Mar 4 13:10:29 2010
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Checksum : b9512a2 - correct
Events : 3

Layout : left-symmetric
Chunk Size : 512K

Number Major Minor RaidDevice State
this 1 8 21 1 active sync /dev/sdb5

0 0 8 5 0 active sync /dev/sda5
1 1 8 21 1 active sync /dev/sdb5
2 2 8 37 2 active sync /dev/sdc5
3 3 8 53 3 active sync /dev/sdd5
/dev/sdc5:
Magic : a92b4efc
Version : 0.90.00
UUID : 17862986:014cb4c0:ffe6e849:786ed339 (local to host ncc-1701-e)
Creation Time : Thu Mar 4 13:10:24 2010
Raid Level : raid5
Used Dev Size : 974767616 (929.61 GiB 998.16 GB)
Array Size : 2924302848 (2788.83 GiB 2994.49 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 1

Update Time : Thu Mar 4 13:10:29 2010
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Checksum : b9512b4 - correct
Events : 3

Layout : left-symmetric
Chunk Size : 512K

Number Major Minor RaidDevice State
this 2 8 37 2 active sync /dev/sdc5

0 0 8 5 0 active sync /dev/sda5
1 1 8 21 1 active sync /dev/sdb5
2 2 8 37 2 active sync /dev/sdc5
3 3 8 53 3 active sync /dev/sdd5
/dev/sdd5:
Magic : a92b4efc
Version : 0.90.00
UUID : 17862986:014cb4c0:ffe6e849:786ed339 (local to host ncc-1701-e)
Creation Time : Thu Mar 4 13:10:24 2010
Raid Level : raid5
Used Dev Size : 974767616 (929.61 GiB 998.16 GB)
Array Size : 2924302848 (2788.83 GiB 2994.49 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 1

Update Time : Thu Mar 4 13:10:29 2010
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Checksum : b9512c6 - correct
Events : 3

Layout : left-symmetric
Chunk Size : 512K

Number Major Minor RaidDevice State
this 3 8 53 3 active sync /dev/sdd5

0 0 8 5 0 active sync /dev/sda5
1 1 8 21 1 active sync /dev/sdb5
2 2 8 37 2 active sync /dev/sdc5
3 3 8 53 3 active sync /dev/sdd5

Booting with RAID autodetection reports that there's no valid 0.9 superblock.
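
For anyone comparing the four superblocks by eye, here is a minimal Python sketch that parses `mdadm --examine` text of the shape shown above and checks two things: that every member reports the same chunk size, and that Array Size equals (Raid Devices - 1) x Used Dev Size, as expected for RAID5. The function names and the shortened SAMPLE text are hypothetical; only the field names and values come from the output above.

```python
# Hypothetical helper: parse `mdadm --examine` text and cross-check members.
SAMPLE = """\
/dev/sda5:
Raid Level : raid5
Used Dev Size : 974767616 (929.61 GiB 998.16 GB)
Array Size : 2924302848 (2788.83 GiB 2994.49 GB)
Raid Devices : 4
Events : 3
Chunk Size : 512K
/dev/sdb5:
Raid Level : raid5
Used Dev Size : 974767616 (929.61 GiB 998.16 GB)
Array Size : 2924302848 (2788.83 GiB 2994.49 GB)
Raid Devices : 4
Events : 3
Chunk Size : 512K
"""


def parse_examine(text):
    """Split --examine output into {device: {field: value}}."""
    devices = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("/dev/") and line.endswith(":"):
            # A line such as "/dev/sda5:" opens a new per-device section.
            current = devices.setdefault(line[:-1], {})
        elif current is not None and " : " in line:
            key, value = line.split(" : ", 1)
            current[key.strip()] = value.strip()
    return devices


def leading_int(value):
    """Return the first whitespace-separated token of a field as an int."""
    return int(value.split()[0])


def chunk_sizes(devices):
    """Set of distinct chunk sizes reported across all members."""
    return {fields["Chunk Size"] for fields in devices.values()}


def raid5_size_consistent(fields):
    """RAID5 capacity is (members - 1) times the per-device size."""
    data_disks = leading_int(fields["Raid Devices"]) - 1
    expected = data_disks * leading_int(fields["Used Dev Size"])
    return expected == leading_int(fields["Array Size"])


if __name__ == "__main__":
    devs = parse_examine(SAMPLE)
    print("chunk sizes seen:", chunk_sizes(devs))
    for name, fields in devs.items():
        print(name, "size consistent:", raid5_size_consistent(fields))
```

With the figures above, 3 x 974767616 = 2924302848, so the sizes are self-consistent. That says nothing about whether the 512K chunk size matches the data actually on the disks.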

--
Alex Boag-Munroe

Lack of planning on your part does not constitute an emergency on mine.
Alex Boag-Munroe [ Thu, 04 March 2010 13:22 ] [ ID #2034008 ]

Re: I am an idiot.

Oops. Where I said "it isn't the 512k chunk I asked for" I meant 256k chunk.

Thanks again

--
Alex Boag-Munroe

Lack of planning on your part does not constitute an emergency on mine.
Alex Boag-Munroe [ Thu, 04 March 2010 13:23 ] [ ID #2034009 ]

Re: I am an idiot.

Alex Boag-Munroe wrote:
> Hi guys...
>
> Yes I am an idiot. I was changing the chunk size of my RAID5 array
> last night from 64kb to 256kb and left it running overnight. During
> the night we had a power outage.
>
> This is where the idiot part comes in. The backup file is on a
> filesystem that's part of the RAID5 array, so obviously I am unable to
> start it. I completely forgot the filesystem I specified for
> --backup-file was part of the same array.
>
> Once you're all done pointing and laughing, can you let me know if I
> am totally screwed? I've a lot of data here that I -really- don't
> want to lose...
>
> Please help..
>
> Idiot.
>

Firstly I will say that I have never faced this situation, so please
wait for someone more knowledgeable to reply before trying.

Supposing the reshape cannot be continued after a power failure (I am
not sure about that)...

My idea is that the reshape progresses linearly so one of the two
filesystems (either the original one or backup) should be accessible. If
the power failed when the reshape was within the first filesystem, the
second filesystem should be somehow accessible, if it failed when the
reshape was within the second filesystem, the first filesystem should be
somehow accessible.

In this situation I guess you need to go the hard route: you will
probably need to recreate the array with all the drives specified
in exactly the same order, using all the original options (you can get
the info from every drive with mdadm --examine /dev/sdXY), with the
chunk size set to either 64k or 256k (try both), and specifying
--assume-clean so that it does not start to resync. Then set it
--readonly before doing anything else. After that you will probably be
able to experiment with mounting one of the two filesystems.
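
To make the suggestion above concrete, here is a small Python sketch that only prints the two candidate command pairs and runs nothing. The member order is taken from the --examine output elsewhere in this thread; /dev/md1 is an assumption inferred from "Preferred Minor : 1", and the helper name is hypothetical. Bear in mind that --create rewrites the superblocks on every member, so this is not something to run casually.

```python
import shlex

# Assumptions for illustration: member order as listed by --examine,
# and /dev/md1 guessed from "Preferred Minor : 1".
MEMBERS = ["/dev/sda5", "/dev/sdb5", "/dev/sdc5", "/dev/sdd5"]
MD_DEVICE = "/dev/md1"


def build_commands(md_device, members, chunk_kib):
    """Return (create, readonly) argument lists; nothing is executed."""
    create = [
        "mdadm", "--create", md_device,
        "--assume-clean",            # do not start a resync
        "--metadata=0.90",
        "--level=5",
        f"--raid-devices={len(members)}",
        "--layout=left-symmetric",
        f"--chunk={chunk_kib}",      # chunk size in KiB
        *members,
    ]
    readonly = ["mdadm", "--readonly", md_device]
    return create, readonly


if __name__ == "__main__":
    for chunk in (64, 256):
        for cmd in build_commands(MD_DEVICE, MEMBERS, chunk):
            # Print only, so the commands can be reviewed before any use.
            print(shlex.join(cmd))
```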

Thinking again, I guess there is a situation that will prevent you from
seeing either filesystem: the case where 64k prevents you from seeing
the good filesystem and 256k prevents you from seeing the LVM metadata :-(
You use LVM, right? In that case you might need to "find" your filesystem
by mounting the device with progressively increasing offsets from the
beginning, without the help of LVM. And this will work only if your good
partition in LVM was contiguous (LVM allows holes).
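
Rather than attempting a mount at every offset, the search could be narrowed by scanning for a filesystem signature first. The Python sketch below assumes the filesystem is ext2/3/4 (the thread does not say which one is in use) and looks for the ext superblock magic 0xEF53, which sits 56 bytes into the superblock, itself 1024 bytes after the start of the filesystem. The function name and the 512-byte step are hypothetical choices.

```python
import struct

EXT_MAGIC = 0xEF53    # ext2/3/4 superblock magic (stored little-endian)
SB_OFFSET = 1024      # superblock starts 1024 bytes into the filesystem
MAGIC_OFFSET = 56     # offset of the magic field inside the superblock


def find_ext_candidates(data, step=512):
    """Return offsets in `data` where an ext filesystem might begin."""
    hits = []
    # Stop early enough that the 2-byte magic read stays inside `data`.
    last_start = len(data) - SB_OFFSET - MAGIC_OFFSET - 1
    for start in range(0, last_start, step):
        pos = start + SB_OFFSET + MAGIC_OFFSET
        (magic,) = struct.unpack_from("<H", data, pos)
        if magic == EXT_MAGIC:
            hits.append(start)
    return hits


if __name__ == "__main__":
    # Synthetic 8 KiB image with a fake superblock magic for a filesystem
    # that starts at byte 4096.
    image = bytearray(8192)
    struct.pack_into("<H", image, 4096 + SB_OFFSET + MAGIC_OFFSET, EXT_MAGIC)
    print(find_ext_candidates(bytes(image)))
```

A hit is only a candidate: backup superblocks and stray data carry the same two bytes, so each offset would still need checking with a read-only mount (for example `mount -o ro,loop,offset=N`) before trusting it.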

Anyway, wait for other replies.

Good luck

Asdo


Asdo [ Thu, 04 March 2010 13:25 ] [ ID #2034010 ]

Re: I am an idiot.

I may not be able to help, but I'm sure you'd get much better
support & help if you chose a proper subject summarizing the problem.

If possible, I suggest you clone each disk to another disk before
proceeding, just in case something else goes wrong.

Maybe you recreated the array with the wrong chunk size of 512
instead of your desired 256?

--
Majed B.
majedb [ Thu, 04 March 2010 13:25 ] [ ID #2034011 ]

Re: I am an idiot. (power failure during chunk resize, no backupfile)

On Thu, 4 Mar 2010 12:22:53 +0000
Alex Boag-Munroe <boagenator [at] gmail.com> wrote:

> mdadm is version 3.1.1. New developments. I found a post on the
> internet where Neil recommended to someone to recreate the array
> without erasing it. Which I have done, mdadm starts the array and
> mdadm -D shows that almost a terabyte of space is in use.
>
> However, mdadm -D also shows a chunk size of 512k, which is neither
> the 64k original chunk nor the 512k I asked for.
>

You didn't, did you?
Oh no, you did! Two idiotic things.....

Think about what you just did.

You have an array where part has one chunk size and part has another chunk
size. The only way you can get the data out is to know where in the array
the chunk size changes and tell the kernel that fact.
And what have you just done? You told mdadm to --create a new array,
replacing the metadata, so the record of the most important piece of
information, the point where the chunk size changed, just got erased.
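
To illustrate why that boundary matters, here is a minimal Python sketch of how a logical byte offset maps onto a member disk for a 4-disk left-symmetric RAID5. The same logical offset lands on a different disk and device offset depending on the chunk size, so reading a region with the wrong chunk size returns the wrong blocks. The function names are hypothetical, and the second function assumes, purely for illustration, that everything below the boundary already uses the new chunk size; the real direction and position of the reshape are exactly what the old metadata recorded.

```python
KIB = 1024


def raid5_ls_locate(offset, chunk, disks):
    """Map a logical byte offset to (data disk index, byte offset within
    that member's data area) for left-symmetric RAID5."""
    chunk_no, within = divmod(offset, chunk)
    stripe, data_idx = divmod(chunk_no, disks - 1)
    # Left-symmetric: parity starts on the last disk and moves one disk
    # to the left on each stripe; data follows the parity disk, wrapping.
    parity = (disks - 1) - stripe % disks
    disk = (parity + 1 + data_idx) % disks
    return disk, stripe * chunk + within


def locate_during_reshape(offset, boundary, new_chunk, old_chunk, disks):
    """Pick the chunk size by which side of the boundary the offset is on.
    Assumption: the region below `boundary` has already been reshaped."""
    chunk = new_chunk if offset < boundary else old_chunk
    return raid5_ls_locate(offset, chunk, disks)


if __name__ == "__main__":
    offset = 300 * KIB
    print("64K chunks: ", raid5_ls_locate(offset, 64 * KIB, 4))
    print("256K chunks:", raid5_ls_locate(offset, 256 * KIB, 4))
```

For the offset 300 KiB, the 64K layout gives disk 0 at device offset 110592, while the 256K layout gives disk 1 at device offset 45056.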

If you happen to have an 'mdadm -E' output of the device from before you
re-created the array that might be useful. If you don't, it will be very
hard to make this work.

What you need to do is:
- get that information
- assemble the array without allowing reshape to restart. You can
  probably do this by writing an appropriate set of things to files
  in /sys - it would have been much easier without the --create
- mount the filesystem read-only
- copy out the backup file
- unmount, unassemble
- re-assemble with the backup file and let the reshape complete

The last step is possibly the hardest as it really requires writing new
metadata to exactly match the metadata that you erased, and that is not
easy to do - it will require some hacking in C.

If you have the option of copying the whole array elsewhere, then that would
be easiest.
- find out where chunk size changes
- assemble array read-only via writes to sysfs
- copy all the data out
- mount filesystem, find backup file, apply backup over copied data
- mount newly copied file system and be sure all is OK
- make brand new array on original disks.

I can talk you through assembling the array via sysfs if you get to that part.

NeilBrown
NeilBrown [ Thu, 04 March 2010 21:45 ] [ ID #2034015 ]