IBM SPSS Amos 21 Laptop User Manual

Example 17

exclude only persons whose incomes you do not know. Similarly, in computing the

sample covariance between age and income, you would exclude an observation only if

age is missing or if income is missing. This approach to missing data is sometimes

called pairwise deletion.
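
As a rough illustration of pairwise deletion, the sketch below computes a mean and a covariance from whatever cases are available for each statistic. The function names and the small age/income lists are hypothetical and are not taken from the manual; missing values are represented by None.

```python
def observed_mean(values):
    """Mean of the values that are actually present (None marks a missing value)."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("no observed values")
    return sum(present) / len(present)


def pairwise_cov(x, y):
    """Sample covariance using only the cases where both x and y are observed."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in pairs) / (n - 1)


# Hypothetical data: five persons, with one age and one income missing.
age = [25, 32, None, 41, 58]
income = [30.0, None, 52.0, 61.0, 75.0]

mean_income = observed_mean(income)      # uses the 4 persons with known income
cov_age_income = pairwise_cov(age, income)  # uses the 3 persons with both values
```

Note that the two statistics rest on different subsets of the sample: four cases for the mean, three for the covariance.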

A third approach is data imputation, replacing the missing values with some kind

of guess, and then proceeding with a conventional analysis appropriate for complete

data. For example, you might compute the mean income of the persons who reported

their income, and then attribute that income to all persons who did not report their

income. Beale and Little (1975) discuss methods for data imputation, which are

implemented in many statistical packages.
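
The mean-substitution idea described above can be sketched in a few lines. This is only the simplest form of imputation, not necessarily any of the methods discussed by Beale and Little (1975); the function name and the income values are hypothetical.

```python
def impute_mean(values):
    """Replace each missing value (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


# Hypothetical incomes; two persons did not report theirs.
income = [30.0, None, 52.0, 61.0, None, 75.0]
income_imputed = impute_mean(income)
```

Every imputed case receives the same value, so the completed variable shows less spread than the reported incomes alone would suggest.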

Amos does not use any of these methods. Even in the presence of missing data, it

computes maximum likelihood estimates (Anderson, 1957). For this reason, whenever

you have missing data, you may prefer to use Amos for a conventional analysis, such as

a simple regression analysis (as in Example 4) or the estimation of means (as in Example 13).
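
The passage does not describe how Amos computes these estimates. As a general illustration only, maximum likelihood with incomplete data under a normality assumption lets each case contribute the likelihood of the values it actually has. The sketch below evaluates such a log-likelihood for two variables; the function names, parameter layout, and data are assumptions made for this example, and no optimizer is included.

```python
import math


def normal_logpdf(v, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * math.log(2 * math.pi * var) - 0.5 * (v - mean) ** 2 / var


def case_loglik(x, y, mu, sigma):
    """Log-likelihood of one case; mu = (mu_x, mu_y), sigma = ((sxx, sxy), (sxy, syy))."""
    (sxx, sxy), (_, syy) = sigma
    if x is not None and y is not None:
        # Both values observed: bivariate normal density.
        det = sxx * syy - sxy * sxy
        dx, dy = x - mu[0], y - mu[1]
        quad = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        return -math.log(2 * math.pi) - 0.5 * math.log(det) - 0.5 * quad
    if x is not None:
        return normal_logpdf(x, mu[0], sxx)  # only x observed: marginal of x
    if y is not None:
        return normal_logpdf(y, mu[1], syy)  # only y observed: marginal of y
    return 0.0  # nothing observed: the case carries no information


def sample_loglik(cases, mu, sigma):
    """Sum of the per-case contributions over the whole sample."""
    return sum(case_loglik(x, y, mu, sigma) for x, y in cases)


# Hypothetical cases: one complete, one with y missing.
cases = [(0.0, 0.0), (0.0, None)]
total = sample_loglik(cases, mu=(0.0, 0.0), sigma=((1.0, 0.0), (0.0, 1.0)))
```

Maximizing this sum over the means, variances, and covariance would give the estimates; no case is discarded and no value is filled in.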

Note that there is one kind of missing data that Amos cannot deal

with. (Neither can any other general approach to missing data, including the three

mentioned above.) Sometimes the very fact that a value is missing conveys

information. It could be, for example, that people with very high incomes tend (more

than others) not to answer questions about income. Failure to respond may thus convey

probabilistic information about a person’s income level, beyond the information

already given in the observed data. If this is the case, the approach to missing data that

Amos uses is inapplicable.
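
A small simulation can make this situation concrete. The sketch below assumes, purely for illustration, that people above the median income fail to report it 60% of the time and everyone else 10% of the time. The incomes, the probabilities, and the seed are all invented and do not come from the manual.

```python
import random

rng = random.Random(42)  # fixed seed, an arbitrary choice for repeatability

# Hypothetical right-skewed incomes for 5,000 people.
incomes = [rng.lognormvariate(10.0, 0.5) for _ in range(5000)]
cutoff = sorted(incomes)[len(incomes) // 2]  # median income


def is_reported(income):
    """Nonresponse depends on the income itself (assumed probabilities)."""
    p_missing = 0.6 if income > cutoff else 0.1
    return rng.random() >= p_missing


reported = [x for x in incomes if is_reported(x)]

true_mean = sum(incomes) / len(incomes)
reported_mean = sum(reported) / len(reported)
print(f"mean of all incomes:      {true_mean:,.0f}")
print(f"mean of reported incomes: {reported_mean:,.0f}")
```

Under these assumptions the mean of the reported incomes falls below the mean of all incomes, because the missingness depends on the very value that is missing.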

Amos assumes that data values that are missing are missing at random. It is not

always easy to know whether this assumption is valid or what it means in practice

(Rubin, 1976). On the other hand, if the missing at random condition is satisfied, Amos

provides estimates that are efficient and consistent. By contrast, the methods

mentioned previously do not provide efficient estimates, and provide consistent

estimates only under the stronger condition that missing data are missing completely

at random (Little and Rubin, 1989).

About the Data

For this example, we have modified the Holzinger and Swineford (1939) data used in

Example 8. The original dataset (in the SPSS Statistics file Grnt_fem.sav) contains the

scores of 73 girls on six tests, for a total of 438 data values. To obtain a dataset with

missing values, each of the 438 data values in Grnt_fem.sav was deleted with

probability 0.30.
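
The manual does not say how the deletion was carried out. One plausible way to produce such a dataset is to delete each value independently with probability 0.30, as sketched below. The scores are random placeholders rather than the contents of Grnt_fem.sav, and the function name and seeds are assumptions.

```python
import random


def delete_at_random(rows, p=0.30, seed=0):
    """Return a copy of rows in which each value is set to None with probability p."""
    rng = random.Random(seed)
    return [[None if rng.random() < p else v for v in row] for row in rows]


# Placeholder scores with the same shape as the example: 73 cases by 6 tests.
placeholder_rng = random.Random(1)
scores = [[placeholder_rng.randint(0, 50) for _ in range(6)] for _ in range(73)]

with_missing = delete_at_random(scores, p=0.30, seed=0)
n_values = sum(len(row) for row in scores)
n_missing = sum(v is None for row in with_missing for v in row)
print(f"{n_missing} of {n_values} values deleted")
```

Because each deletion ignores both the value itself and every other variable, data produced this way are missing completely at random.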
