IBM DS8000 Computer Drive User Manual


 
DS8000 Series: Concepts and Architecture
On the ESS 800 the spare creation policy was to have four DDMs on each SSA loop for each
DDM type. This meant that on a specific SSA loop it was possible to have 12 spare DDMs if
you chose to populate a loop with three different DDM sizes. With the DS8000 the intention is
to not do this. A minimum of one spare is created for each array site defined until the following
conditions are met:
- A minimum of 4 spares per DA pair
- A minimum of 4 spares of the largest capacity array site on the DA pair
- A minimum of 2 spares of capacity and RPM greater than or equal to the fastest array site
  of any given capacity on the DA pair
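The three conditions above can be expressed as a small check. The following is only a sketch under stated assumptions, not the DS8000 microcode: the names `Spare`, `ArraySite`, and `spare_conditions_met` are hypothetical, "spares of the largest capacity array site" is read as "spares at least as large as the largest array site", and the third condition is read as applying to each capacity present on the DA pair.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spare:
    capacity_gb: int
    rpm: int


@dataclass(frozen=True)
class ArraySite:
    capacity_gb: int
    rpm: int


def spare_conditions_met(spares, sites):
    """Return True when all three spare conditions hold for one DA pair.

    Hypothetical helper; the reading of conditions 2 and 3 is an assumption.
    """
    if not sites:
        return True

    # Condition 1: at least 4 spares on the DA pair.
    if len(spares) < 4:
        return False

    # Condition 2: at least 4 spares as large as the largest array site.
    largest = max(site.capacity_gb for site in sites)
    if sum(1 for sp in spares if sp.capacity_gb >= largest) < 4:
        return False

    # Condition 3: for each capacity, find the fastest array site, then
    # require at least 2 spares that match or exceed it in capacity and RPM.
    fastest_rpm = {}
    for site in sites:
        current = fastest_rpm.get(site.capacity_gb, 0)
        fastest_rpm[site.capacity_gb] = max(current, site.rpm)
    for capacity, rpm in fastest_rpm.items():
        covering = sum(
            1 for sp in spares if sp.capacity_gb >= capacity and sp.rpm >= rpm
        )
        if covering < 2:
            return False

    return True
```

Under this reading, a DA pair with 146 GB 10k RPM and 73 GB 15k RPM array sites is not satisfied by four 146 GB 10k RPM spares alone, because no spare is as fast as the 15k RPM site.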
Floating spares
The DS8000 implements a smart floating technique for spare DDMs. On an ESS 800, the spare
floats: when a DDM fails and its data is rebuilt onto a spare, the disk that later replaces
the failed DDM becomes the new spare. The data is not migrated to another DDM, such as the
DDM in the position the failed DDM originally occupied. In other words, on an ESS 800 there
is no post-repair processing.
The DS8000 microcode may choose to allow the hot spare to remain where it has been
moved, but it may instead choose to migrate the spare to a more optimal position. This is
done to better balance the spares across the DA pairs, the loops, and the enclosures. It
may be preferable that a DDM that is currently in use as an array member be converted to a
spare. In this case the data on that DDM is migrated in the background onto an existing
spare. This process does not fail the disk that is being migrated, though it does reduce the
number of available spares in the DS8000 until the migration process is complete.
A smart process will be used to ensure that the larger or higher RPM DDMs always act as
spares. This is preferable because if we were to rebuild the contents of a 146 GB DDM onto a
300 GB DDM, then approximately half of the 300 GB DDM would be wasted, since that space is
not needed. Because the failed 146 GB DDM will be replaced with a new 146 GB DDM, the
DS8000 microcode will most likely migrate the data back onto the recently replaced 146 GB
DDM. When this process completes, the 146 GB DDM will rejoin the array and the 300 GB DDM
will become the spare again. Another example would be a 73 GB 15k RPM DDM that fails and is
rebuilt onto a 146 GB 10k RPM DDM. The data has now moved to a slower DDM, so the array
has a mix of RPMs, which is not desirable. The replacement DDM will be the same as the
failed DDM, so again a smart migration of the data will be performed once suitable spares
have become available.
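The two examples above can be captured in a minimal decision sketch. It is an illustration only: the names `DDM`, `wasted_gb`, and `should_migrate_back` are hypothetical, and the rule (migrate back when the spare is larger than the failed DDM or runs at a different RPM, provided the replacement can hold the data at the original speed) is an assumption drawn from the text rather than the actual microcode logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DDM:
    capacity_gb: int
    rpm: int


def wasted_gb(failed, spare):
    """Capacity left unused on the spare after a rebuild."""
    return max(spare.capacity_gb - failed.capacity_gb, 0)


def should_migrate_back(failed, spare, replacement):
    """Decide whether data should move from the spare to the replacement.

    Assumed rule: migrate when the spare wastes capacity or introduces
    an RPM mismatch, and the replacement matches or exceeds the failed DDM.
    """
    replacement_fits = (
        replacement.capacity_gb >= failed.capacity_gb
        and replacement.rpm >= failed.rpm
    )
    spare_mismatched = (
        spare.capacity_gb > failed.capacity_gb or spare.rpm != failed.rpm
    )
    return replacement_fits and spare_mismatched


# First example from the text: 146 GB DDM rebuilt onto a 300 GB spare.
failed = DDM(146, 10000)
spare = DDM(300, 10000)
print(wasted_gb(failed, spare))                   # 154
print(should_migrate_back(failed, spare, failed))  # True
```

When the spare is identical to the failed DDM, the sketch returns False, which corresponds to simply letting the spare float.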
Hot pluggable DDMs
Replacement of a failed drive does not affect the operation of the DS8000 because the drives
are fully hot pluggable. Because each disk plugs into a switch, there is no loop break
associated with the removal or replacement of a disk. In addition, there is no potentially
disruptive loop initialization process.
4.6.5 Predictive Failure Analysis® (PFA)
The drives used in the DS8000 incorporate Predictive Failure Analysis (PFA) and can
anticipate certain forms of failures by keeping internal statistics of read and write errors. If the
error rates exceed predetermined threshold values, the drive will be nominated for
replacement. Because the drive has not yet failed, data can be copied directly to a spare
drive. This avoids using RAID recovery to reconstruct all of the data onto the spare drive.
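The threshold idea behind PFA can be illustrated with a short sketch. Everything here is assumed for illustration: the function names, the counters, and the default threshold values are hypothetical, since the real statistics and thresholds are internal to the drive and are not given in this text.

```python
def exceeds_threshold(errors, operations, threshold):
    """Return True when the observed error rate is above the threshold."""
    if operations == 0:
        return False
    return errors / operations > threshold


def nominate_for_replacement(
    read_errors,
    reads,
    write_errors,
    writes,
    read_threshold=1e-6,   # assumed value, for illustration only
    write_threshold=1e-6,  # assumed value, for illustration only
):
    """Nominate the drive if either error rate exceeds its threshold."""
    return exceeds_threshold(read_errors, reads, read_threshold) or (
        exceeds_threshold(write_errors, writes, write_threshold)
    )


def recovery_action(drive_failed, nominated):
    """Pick how the spare gets populated.

    A drive that has already failed needs a RAID rebuild; a drive that is
    only nominated can still be read, so its data is copied directly.
    """
    if drive_failed:
        return "raid_rebuild"
    if nominated:
        return "copy_to_spare"
    return "none"
```

The point of the last function is the distinction the text draws: a nominated drive is still readable, so a direct copy replaces the more expensive RAID reconstruction.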