Potential data rollback/corruption after drive failure and re-appearance

Hi,

I am testing the following scenario: a simple RAID1 md array with drives A
and B. Assume that drive B fails, but the array remains operational and
keeps servicing I/O. After a while, the machine is rebooted. After the
reboot, drive B comes back, but drive A is now inaccessible. Assembling the
array with both drives results in a degraded array containing only drive B.
However, B holds the array's data as of the moment B failed, not the array's
latest data. So the data effectively rolls back in time.

Testing a similar scenario with RAID5: drives A, B and C. Drive C fails, and
the RAID5 array becomes degraded but stays operational. After a reboot, B and
C are accessible, but A has disappeared. Assembling the array fails unless
--force is given. With --force, the array comes up, but the data is, of
course, corrupted.

Is this behavior intentional?

Suppose I want to protect against this by first examining the MD superblocks
(--examine). I want to find the most up-to-date drive and check what array
state it reports. Which part of the "mdadm --examine" output should I use to
find the most up-to-date drive? The "Update Time" or the "Events" counter?
Or perhaps something else?
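
To make the check I have in mind concrete, here is a minimal Python sketch. It assumes the "Events" counter is the field to compare, which is exactly what I am asking about, so treat that as an unconfirmed assumption. The sample text only imitates the layout of "mdadm --examine" output; the device names and values are made up, and parse_events() and freshest() are hypothetical helper names.

```python
import re

# Illustrative text only: the layout imitates "mdadm --examine" output,
# but the device names and values are made up for this sketch.
SAMPLE = {
    "/dev/sda1": (
        "    Update Time : Mon Oct 17 08:30:00 2011\n"
        "          State : clean\n"
        "         Events : 42\n"
    ),
    "/dev/sdb1": (
        "    Update Time : Mon Oct 17 08:10:00 2011\n"
        "          State : clean\n"
        "         Events : 19\n"
    ),
}


def parse_events(examine_text):
    """Return the Events counter found in one device's text, or None."""
    match = re.search(r"^\s*Events\s*:\s*(\S+)", examine_text, re.MULTILINE)
    if match is None:
        return None
    value = match.group(1)
    # Assumption: some older metadata formats print the counter as "hi.lo",
    # two 32-bit halves of one 64-bit value.
    if "." in value:
        hi, lo = value.split(".", 1)
        return (int(hi) << 32) | int(lo)
    return int(value)


def freshest(examine_by_device):
    """Return (device with the highest Events counter, all parsed counters)."""
    counters = {dev: parse_events(text) for dev, text in examine_by_device.items()}
    known = {dev: ev for dev, ev in counters.items() if ev is not None}
    if not known:
        return None, counters
    return max(known, key=known.get), counters


if __name__ == "__main__":
    best, counters = freshest(SAMPLE)
    for dev, events in sorted(counters.items()):
        print(dev, "Events =", events)
    print("highest Events counter:", best)
```

In a real check the text for each device would come from running "mdadm --examine" on it. A drive whose counter is lower than another member's would be treated as stale, and the array would not be started from it alone.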

Thanks,
Moshe Melnikov

Moshe Melnikov [ Mon, 17 October 2011 08:46 ] [ ID #2065732 ]