IBM SC34-7012-01 Server User Manual


 
Chapter 1. Recovery and restart facilities
Problems that occur in a data processing system can include failures of
communication protocols, data sets, programs, or hardware. These problems are
potentially more severe in online systems than in batch systems, because the data
is processed in an unpredictable sequence from many different sources.
Online applications therefore require a system with special mechanisms for
recovery and restart that batch systems do not require. These mechanisms ensure
that each resource associated with an interrupted online application returns to a
known state so that processing can restart safely. Together with suitable operating
procedures, these mechanisms should provide automatic recovery from failures
and allow the system to restart with the minimum of disruption.
The two main recovery requirements of an online system are:
- To maintain the integrity and consistency of data
- To minimize the effect of failures
CICS provides a facility, called the recovery manager, to meet these two
requirements. The CICS recovery manager provides the recovery and restart
functions that are needed in an online system.
Maintaining the integrity of data
Data integrity means that the data is in the form you expect and has not been
corrupted. The objective of recovery operations on files, databases, and similar data
resources is to maintain and restore the integrity of the information.
Recovery must also ensure that related changes are consistent: they are made
as a whole or not at all. (The term resources, as used in this book, refers to
data resources unless stated otherwise.)
Logging changes
One way of maintaining the integrity of a resource is to keep a record, or log, of all
the changes made to a resource while the system is executing normally. If a failure
occurs, the logged information can help recover the data.
An online system can use the logged information in two ways:
1. It can be used to back out incomplete or invalid changes to one or more
resources. This is called backward recovery, or backout. For backout, it is
necessary to record the contents of a data element before it is changed. These
records are called before-images. In general, backout is applicable to processing
failures that prevent one or more transactions (or a batch program) from
completing.
2. It can be used to reconstruct changes to a resource, starting with a backup copy
of the resource taken earlier. This is called forward recovery. For forward
recovery, it is necessary to record the contents of a data element after it is
changed. These records are called after-images.
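
The two uses of logged information described above can be illustrated with a
minimal sketch. It is not how CICS implements logging; the names LogRecord,
LoggedStore, and forward_recover are hypothetical and exist only for this
illustration. The sketch assumes a resource modeled as an in-memory
dictionary, with each update logging both a before-image and an after-image.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

# Sentinel meaning "the key did not exist before the change" (illustrative only).
_MISSING = object()


@dataclass
class LogRecord:
    """One logged change to a resource (hypothetical record layout)."""
    key: str
    before: Any  # before-image: contents before the change
    after: Any   # after-image: contents after the change


class LoggedStore:
    """A toy resource that logs every change made to it."""

    def __init__(self, data: Optional[Dict[str, Any]] = None) -> None:
        self.data: Dict[str, Any] = dict(data or {})
        self.log: List[LogRecord] = []

    def update(self, key: str, value: Any) -> None:
        # Record the before-image and after-image, then apply the change.
        before = self.data.get(key, _MISSING)
        self.log.append(LogRecord(key, before, value))
        self.data[key] = value

    def backout(self) -> None:
        """Backward recovery: undo logged changes, newest first."""
        for rec in reversed(self.log):
            if rec.before is _MISSING:
                self.data.pop(rec.key, None)
            else:
                self.data[rec.key] = rec.before
        self.log.clear()


def forward_recover(backup: Dict[str, Any], log: List[LogRecord]) -> Dict[str, Any]:
    """Forward recovery: reapply after-images, oldest first, to a backup copy."""
    restored = dict(backup)
    for rec in log:
        restored[rec.key] = rec.after
    return restored


# Usage: two related changes are made, then recovered in each direction.
backup = {"A": 100, "B": 50}          # backup copy taken earlier
store = LoggedStore(backup)
store.update("A", 70)
store.update("B", 80)

recovered = forward_recover(backup, store.log)  # rebuild from backup + after-images
store.backout()                                 # restore state using before-images
```

In this sketch, backout applies before-images in reverse order so that the
resource returns to its state before the incomplete work began, while forward
recovery applies after-images in their original order to a backup copy.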
© Copyright IBM Corp. 1982, 2010