LVM cannot stop/remove LV, cannot stop/delete dead RAID situation.

on 25.08.2009 16:45:38 by Benjamin ESTRABAUD

Hi,

I am having an issue with LVM and RAID in some failure cases:

It seems that any operation on an LVM LV (lvchange, lvremove, etc.)
requires the LV's metadata to be readable.

If an LVM LV is set up on top of a RAID 0 or RAID 5, for instance,
and two disks are lost from the array, the array dies.

Now that the array is dead, I would like to create a new RAID 0 or 5
using the remaining live disks and some new ones. To do that, I first
need to stop the dead array with mdadm.

However, because the LVM LV still exists, it holds a handle on the
dead RAID, as shown below:

# /opt/soma/bin/mdadm/mdadm --stop /dev/md/d0
raid manager: fail to stop array /dev/md/d0: Device or resource busy
Perhaps a running process, mounted filesystem or active volume group?
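
(For reference, whatever holds a block device open through the device
mapper shows up in sysfs under /sys/block/<device>/holders, and likewise
under each partition's subdirectory; the device and holder names below
are illustrative, not captured from this system:)

# ls /sys/block/md_d0/md_d0p2/holders
dm-2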

The volume group created on this RAID's physical volume is:

/dev/1367191

The LVM LV that was created on this RAID is at the location:

/dev/1367191/1367259

Trying to stop or remove the LV, as shown below, yields an error too:
the LVM metadata cannot be accessed because the RAID this LV sits on
is dead:

# lvchange -an /dev/1367191/1367259
/dev/1367191/1367259: read failed after 0 of 65536 at 0: Input/output error
/dev/md/d0p2: read failed after 0 of 65536 at 0: Input/output error
/dev/md/d0p3: read failed after 0 of 2048 at 0: Input/output error
/dev/1367191/1367259: read failed after 0 of 65536 at 0: Input/output error
/dev/1367191/1367259: read failed after 0 of 65536 at 0: Input/output error
/dev/1367191/1367259: read failed after 0 of 65536 at 0: Input/output error
/dev/md/d0: read failed after 0 of 65536 at 110275985408: Input/output error
/dev/md/d0: read failed after 0 of 65536 at 110275985408: Input/output error
/dev/md/d0p1: read failed after 0 of 1024 at 50266112: Input/output error
/dev/md/d0p2: read failed after 0 of 65536 at 0: Input/output error
/dev/md/d0p2: read failed after 0 of 65536 at 0: Input/output error
/dev/md/d0p2: read failed after 0 of 65536 at 0: Input/output error
/dev/md/d0p3: read failed after 0 of 512 at 110175191040: Input/output error
/dev/md/d0p3: read failed after 0 of 512 at 110175293440: Input/output error
/dev/md/d0p3: read failed after 0 of 512 at 0: Input/output error
/dev/md/d0p3: read failed after 0 of 512 at 4096: Input/output error
/dev/md/d0p3: read failed after 0 of 2048 at 0: Input/output error
Volume group "1367191" not found

# lvremove -f /dev/1367191/1367259
/dev/1367191/1367259: read failed after 0 of 65536 at 0: Input/output error
/dev/md/d0p2: read failed after 0 of 65536 at 0: Input/output error
/dev/md/d0p3: read failed after 0 of 2048 at 0: Input/output error
/dev/1367191/1367259: read failed after 0 of 65536 at 0: Input/output error
/dev/1367191/1367259: read failed after 0 of 65536 at 0: Input/output error
/dev/1367191/1367259: read failed after 0 of 65536 at 0: Input/output error
/dev/md/d0: read failed after 0 of 65536 at 110275985408: Input/output error
/dev/md/d0: read failed after 0 of 65536 at 110275985408: Input/output error
/dev/md/d0p1: read failed after 0 of 1024 at 50266112: Input/output error
/dev/md/d0p2: read failed after 0 of 65536 at 0: Input/output error
/dev/md/d0p2: read failed after 0 of 65536 at 0: Input/output error
/dev/md/d0p2: read failed after 0 of 65536 at 0: Input/output error
/dev/md/d0p3: read failed after 0 of 512 at 110175191040: Input/output error
/dev/md/d0p3: read failed after 0 of 512 at 110175293440: Input/output error
/dev/md/d0p3: read failed after 0 of 512 at 0: Input/output error
/dev/md/d0p3: read failed after 0 of 512 at 4096: Input/output error
/dev/md/d0p3: read failed after 0 of 2048 at 0: Input/output error
Volume group "1367191" not found

-------------

The problem with the above is that LVM depends on the RAID being
healthy to perform any operation, while MD refuses to stop an array
that still has an open handle on it, so we cannot tear down either
the RAID or the LVM without rebooting the system.

Rebooting works, but I cannot afford a reboot in this case, so I was
wondering if anybody knew where to start looking to force the handle
LVM holds on the RAID to go away, maybe in the LVM admin programs
(lvremove, lvchange) or in the dm driver itself?

Thanks a million in advance for your advice.

Ben - MPSTOR.

Re: LVM cannot stop/remove LV, cannot stop/delete dead RAID situation.

on 26.08.2009 13:20:45 by Goswin von Brederlow

Benjamin ESTRABAUD writes:

> Hi,
>
> I am having an issue with LVM and RAID in some failure cases:
>
> It seems that any operation on an LVM LV (lvchange, lvremove, etc.)
> requires the LV's metadata to be readable.
>
> If an LVM LV is set up on top of a RAID 0 or RAID 5, for instance,
> and two disks are lost from the array, the array dies.
>
> Now that the array is dead, I would like to create a new RAID 0 or 5
> using the remaining live disks and some new ones. To do that, I first
> need to stop the dead array with mdadm.
>
> However, because the LVM LV still exists, it holds a handle on the
> dead RAID, as shown below:
>
>...
>
> The problem with the above is that LVM depends on the RAID being
> healthy to perform any operation, while MD refuses to stop an array
> that still has an open handle on it, so we cannot tear down either
> the RAID or the LVM without rebooting the system.
>
> Rebooting works, but I cannot afford a reboot in this case, so I was
> wondering if anybody knew where to start looking to force the handle
> LVM holds on the RAID to go away, maybe in the LVM admin programs
> (lvremove, lvchange) or in the dm driver itself?
>
> Thanks a million in advance for your advice.
>
> Ben - MPSTOR.

You can remove the LVM devices yourself:

man dmsetup
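
For example (a minimal sketch; take the mapping name and numbers from
the actual output rather than this illustrative one):

# dmsetup ls
1367191-1367259 (253, 4)
# dmsetup remove -f 1367191-1367259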

Regards
Goswin

Re: LVM cannot stop/remove LV, cannot stop/delete dead RAID situation.

on 26.08.2009 14:20:52 by Benjamin ESTRABAUD

Dear Goswin,

Thank you for your reply; I was not aware of this tool (dmsetup).

After using the following commands:

dmsetup remove -f -j 253 -m 0
dmsetup remove -f -j 253 -m 1
dmsetup remove -f -j 253 -m 2
dmsetup remove -f -j 253 -m 4
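
(Here -j is the device-mapper major number and -m the minor of each
mapping; both can be read from dmsetup's own listing. The output below
is an illustrative sketch, not captured from this system:)

# dmsetup info -c
Name             Maj Min Stat Open Targ Event  UUID
1367191-1367259  253   4 L--w    1    1      0 LVM-...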

I was able to remove all of the LVM LVs; they disappeared from
/proc/partitions, and from /sys/block too.

The /dev/xxxx/yyyy links seemed to remain for a moment, but they
disappeared shortly after the commands above.

This actually worked pretty well: within 30 seconds of removing these,
I was able to stop the RAID that had the stuck handle!
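
(That is, with the device-mapper handles gone, the stop command from
before succeeds:)

# /opt/soma/bin/mdadm/mdadm --stop /dev/md/d0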

Thank you very much for your help!

Ben - MPSTOR.

Goswin von Brederlow wrote:
> Benjamin ESTRABAUD writes:
>
>
>> Hi,
>>
>> I am having an issue with LVM and RAID in some failure cases:
>>
>> It seems that any operation on an LVM LV (lvchange, lvremove, etc.)
>> requires the LV's metadata to be readable.
>>
>> If an LVM LV is set up on top of a RAID 0 or RAID 5, for instance,
>> and two disks are lost from the array, the array dies.
>>
>> Now that the array is dead, I would like to create a new RAID 0 or 5
>> using the remaining live disks and some new ones. To do that, I first
>> need to stop the dead array with mdadm.
>>
>> However, because the LVM LV still exists, it holds a handle on the
>> dead RAID, as shown below:
>>
>> ...
>>
>> The problem with the above is that LVM depends on the RAID being
>> healthy to perform any operation, while MD refuses to stop an array
>> that still has an open handle on it, so we cannot tear down either
>> the RAID or the LVM without rebooting the system.
>>
>> Rebooting works, but I cannot afford a reboot in this case, so I was
>> wondering if anybody knew where to start looking to force the handle
>> LVM holds on the RAID to go away, maybe in the LVM admin programs
>> (lvremove, lvchange) or in the dm driver itself?
>>
>> Thanks a million in advance for your advice.
>>
>> Ben - MPSTOR.
>>
>
> You can remove the LVM devices yourself:
>
> man dmsetup
>
> Regards
> Goswin
>
>
