
Sun Microsystems 5310 NAS Server User Manual


Chapter 2 NAS Head 2-73

After reviewing the case, engineering may make specific recommendations and

modifications, or they may recommend that you proceed with the filesystem repair.

For instructions on how to complete a filesystem repair, see “Filesystem check

procedure” under Diagnostic Procedures at the end of this document.

Reoccurrence of filesystem-related error messages / mount problems after repair

If you have run a filesystem check until no errors were reported, or recreated a

volume, this should permanently resolve the filesystem errors. If the errors return,

the source of the problem remains. The most likely source is a hardware problem. A

good first step is to replace the system board memory and the RAID controller, or

failing that, the entire system. Once the source of the problem has been resolved, it

will be necessary to proceed according to the “Filesystem check procedure” under

Diagnostic Procedures at the end of this document.

Checkpoint database problems reported in system log

Can’t delete checkpoints

The indication of a checkpoint database problem is either a hard error (e.g. cannot

write) in the system log when attempting to delete a checkpoint, or an error message

which specifically states “error in checkpoint database”. As the checkpoint

filesystem is read-only, and treated as a separate filesystem in many ways, this

problem must be addressed at the filesystem level, specifically via the chkpntabort

command and a file system check.

It is generally recommended that this issue be escalated for assistance in accurately

identifying the problem, and also to locate the source of the problem. The messages

can vary considerably from the above, and similar checkpoint-related messages

could lead one down the wrong path toward applying an unnecessarily severe

solution.

A diagnostic email, with all attachments, is required to escalate this type of issue.

The primary source of information for this case is the system log. The diagnostic

should be captured as close as possible to the time the messages occur, so that they

may be seen in context in the system log. Also, collect as much information as

possible about the circumstances surrounding the failure, for example when the

messages first appeared, what was happening at the time, and what symptoms users reported.
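
To help view the messages in context before escalating, a saved text copy of the system log could be searched for the two indications described above. The following is only a minimal sketch: the function name, the assumption that the log has been exported to a plain text file, and the two search phrases (taken from the wording in this section) are illustrative, and actual messages can vary considerably.

```python
import re
import sys

# Phrases taken from the description above. These are assumptions about
# what to search for; real log messages can differ.
PATTERNS = [
    re.compile(r"error in checkpoint database", re.IGNORECASE),
    re.compile(r"cannot write", re.IGNORECASE),
]


def find_checkpoint_errors(log_lines, context=2):
    """Return (line_number, excerpt) pairs for lines matching any pattern.

    line_number is 1-based; excerpt is the matching line together with
    up to `context` lines before and after it.
    """
    hits = []
    for i, line in enumerate(log_lines):
        if any(p.search(line) for p in PATTERNS):
            start = max(0, i - context)
            end = min(len(log_lines), i + context + 1)
            hits.append((i + 1, log_lines[start:end]))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python find_chkpnt_errors.py saved_syslog.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as f:
        lines = f.read().splitlines()
    for number, excerpt in find_checkpoint_errors(lines):
        print(f"--- match at line {number} ---")
        print("\n".join(excerpt))
```

The surrounding lines are kept with each match because the diagnosis depends on seeing the messages in context. A match is not a confirmed diagnosis; it is only a starting point for the escalation.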

Typically in this case, it is necessary to abort checkpoints on the volume. This is done

from the CLI. After verifying the diagnosis with engineering, access the CLI and

enter “chkpntabort <volumename>”. StorEdge will prompt for confirmation.

Answering “y” or “yes” to the prompt will result in the immediate deletion of all

checkpoints. A file system check is required as soon as possible after aborting
