IBM DS6000 Series Server User Manual


 
DS6000 Series: Concepts and Architecture
If two array sites are used to make a RAID-10 array and the array sites contain spares, then
six DDMs are used to make two RAID-0 arrays which are mirrored. If spares do not exist on
the array sites then eight DDMs are used to make two RAID-0 arrays which are mirrored.
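The DDM counts above can be checked with a small calculation. This is only a sketch: the function name and parameters are hypothetical, and the figure of four DDMs per array site is inferred from the eight-DDM case described above.

```python
# Inferred from the text: two array sites without spares yield eight DDMs.
DDMS_PER_ARRAY_SITE = 4


def raid10_layout(array_sites: int = 2, spares_per_site: int = 0) -> dict:
    """Return how many DDMs hold data in a RAID-10 array (illustrative only)."""
    total = array_sites * DDMS_PER_ARRAY_SITE
    spares = array_sites * spares_per_site
    members = total - spares
    if members % 2:
        raise ValueError("RAID-10 needs an even number of member DDMs")
    # The members are split evenly into two mirrored RAID-0 arrays.
    return {"members": members, "per_raid0": members // 2, "spares": spares}


with_spares = raid10_layout(array_sites=2, spares_per_site=1)
without_spares = raid10_layout(array_sites=2, spares_per_site=0)
```

With one spare per array site the result is six member DDMs (three per RAID-0 array); without spares it is eight (four per RAID-0 array).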
Drive failure
When a disk drive module (DDM) fails in a RAID-10 array, the controller starts an operation to
reconstruct the data from the failed drive onto one of the spare drives. The spare that is used
is chosen based on a smart algorithm that looks at the location of the spares and the size and
location of the failed DDM. Remember, a RAID-10 array is effectively a RAID-0 array that is
mirrored. Thus when a drive fails in one of the RAID-0 arrays we can rebuild the failed drive
by reading the data from the equivalent drive in the other RAID-0 array.
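The rebuild described above can be illustrated with a toy model. This is a minimal sketch and not the DS6000 microcode: drives are modelled as lists of byte blocks, and the helper name rebuild_onto_spare is an assumption made for illustration.

```python
from typing import List

Drive = List[bytes]  # a drive modelled as an ordered list of data blocks


def rebuild_onto_spare(surviving_raid0: List[Drive], failed_index: int) -> Drive:
    """Return the contents for the spare, read from the equivalent drive
    in the surviving RAID-0 array (hypothetical helper)."""
    mirror_partner = surviving_raid0[failed_index]
    # Copy block by block from the mirror partner onto the spare.
    return [bytes(block) for block in mirror_partner]


# Two mirrored RAID-0 arrays of three drives each (the six-DDM case).
raid0_a = [[b"a0", b"a1"], [b"b0", b"b1"], [b"c0", b"c1"]]
raid0_b = [list(drive) for drive in raid0_a]

# Drive 1 of raid0_a fails; rebuild it from drive 1 of raid0_b.
spare = rebuild_onto_spare(raid0_b, failed_index=1)
raid0_a[1] = spare
```

No parity calculation is involved: the spare simply receives a copy of the equivalent drive in the other RAID-0 array.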
While this data reconstruction is going on, the controller can still service read and write
requests to the array from the hosts. There may be some degradation in performance while
the sparing operation is in progress because some controller and switched network resources
are being used to do the reconstruction. Due to the switched architecture of the DS6000, this
effect will be minimal. Read requests for data on the failed drive should not be affected
because they can all be directed to the good RAID-0 array.
Write operations will not be affected. Performance of the RAID-10 array returns to normal
when the data reconstruction onto the spare device completes. The time taken for sparing
can vary, depending on the size of the failed DDM and on the workload on the array and the
controller.
3.3.3 Spare creation
There are four array sites in each enclosure of the DS6000. The first and third array sites
created on each loop are used to supply spares. This normally means that two spares will be
created in the server enclosure and two spares in the first expansion enclosure. Spares are
created as the array sites are created, which occurs when the DDMs are installed. After four
spares have been created for the entire storage unit, no more spares are normally needed.
On the ESS 800 the spare creation policy was to have four DDMs on each SSA (Serial
Storage Architecture) loop for each DDM type. This meant that on a specific SSA loop, it was
possible to have 12 spare DDMs, if you chose to populate a loop with three different DDM
types. The DS6000 is designed to avoid this. Where DDMs of different sizes but the same
RPM exist in the complex, the spares are taken from the array sites with the larger
DDMs. This means that in most cases the DS6000 continues to have only four spares for
the entire complex, regardless of DDM size intermix.
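The difference between the two sparing policies comes down to simple arithmetic, sketched below. The function names are hypothetical, and the code covers only the cases the text describes (DDMs of the same RPM; the normal total of four spares).

```python
from typing import List

# Stated in the text: normally four spares for the entire DS6000 storage unit.
DS6000_TYPICAL_SPARES = 4


def ess800_spares_per_loop(ddm_types_on_loop: int, spares_per_type: int = 4) -> int:
    """ESS 800 policy: four spare DDMs per SSA loop for each DDM type."""
    return ddm_types_on_loop * spares_per_type


def ds6000_spare_size_gb(ddm_sizes_gb_same_rpm: List[int]) -> int:
    """DS6000 policy for mixed sizes at the same RPM: spares come from
    the array sites with the larger DDMs."""
    return max(ddm_sizes_gb_same_rpm)


ess_spares = ess800_spares_per_loop(ddm_types_on_loop=3)
spare_size = ds6000_spare_size_gb([73, 146])
```

For a loop populated with three DDM types, the ESS 800 policy gives 12 spares, while the DS6000 normally stays at four spares in total, drawn from the larger DDMs.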
Floating spares
The DS6000 implements a smart floating technique for spare DDMs. A spare floats in the
following sense: when a DDM fails, the data it contained is rebuilt onto a spare, and when
the failed disk is then replaced, the replacement disk becomes the new spare. The data is
not copied back to the original position that the failed DDM occupied. The DS6000
microcode may choose to allow the hot spare to remain where it has been moved, but it may
instead choose to move the spare to a more optimal position. This is done to better
balance the spares across the DA pairs and enclosures. It may be preferable for a DDM that
is currently in use as an array member to be converted to a spare. In this case the data
on that DDM is migrated in the background onto an existing spare. This process does not
fail the disk that is being migrated, though it does reduce the number of available spares
in the DS6000 until the migration process is complete.
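The floating behavior can be pictured as a change of roles between disk slots. The sketch below is illustrative only: the Slot class, the role names, and both helper functions are assumptions, and the microcode's optional rebalancing of spares is not modelled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Slot:
    name: str
    role: str  # "member", "spare", or "failed"


def find(slots: List[Slot], name: str) -> Slot:
    for slot in slots:
        if slot.name == name:
            return slot
    raise KeyError(name)


def fail_and_rebuild(slots: List[Slot], failed_name: str) -> str:
    """Mark a member as failed and promote a spare to array member."""
    failed = find(slots, failed_name)
    spare = next((s for s in slots if s.role == "spare"), None)
    if spare is None:
        raise RuntimeError("no spare available")
    failed.role = "failed"
    spare.role = "member"  # the rebuilt data now lives on the former spare
    return spare.name


def replace_failed_disk(slots: List[Slot], name: str) -> None:
    """The replacement disk becomes the spare; no data is copied back."""
    slot = find(slots, name)
    if slot.role != "failed":
        raise ValueError(f"{name} has not failed")
    slot.role = "spare"


slots = [Slot("d0", "member"), Slot("d1", "member"), Slot("s0", "spare")]
rebuilt_on = fail_and_rebuild(slots, "d1")
replace_failed_disk(slots, "d1")
roles = [s.role for s in slots]
```

After the replacement, the slot that held the failed DDM is the spare and the former spare stays an array member, so the spare has "floated" to a new position.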
A smart process may be used to ensure that the larger or higher RPM DDMs act as spares.
This is preferable because if we were to rebuild the contents of a 73 GB DDM onto a 146 GB