Chapter 6
Figure 6-1
Specifying missing values for a continuous variable
Reading in mixed data.
Note that when you are reading in fields with numeric storage (either
integer, real, time, timestamp, or date), any non-numeric values are set to null or system missing.
This is because, unlike some applications, SPSS Modeler does not allow mixed storage types within a field. To
avoid this, any fields with mixed data should be read in as strings by changing the storage type in
the source node or external application as necessary.
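The effect described above can be illustrated outside the product with a minimal Python sketch. It is not SPSS Modeler code: the helper `to_real` and the sample field are hypothetical, and the sketch only mimics the idea that a field with numeric storage cannot hold non-numeric values, whereas a string field keeps them.

```python
from typing import List, Optional


def to_real(raw: str) -> Optional[float]:
    """Coerce one raw value to real storage (hypothetical helper).

    Values that cannot be parsed as a number become None,
    standing in for null / system missing.
    """
    try:
        return float(raw)
    except ValueError:
        return None


# Hypothetical field that mixes numbers with text codes.
raw_field: List[str] = ["12.5", "7", "N/A", "unknown", "3"]

# Read with numeric storage: the text codes are lost.
as_numeric = [to_real(value) for value in raw_field]

# Read as strings: every original value is preserved.
as_string = list(raw_field)

print(as_numeric)
print(as_string)
```

Reading the field as strings keeps codes such as "N/A" available, so they can later be declared as blanks or recoded deliberately rather than being silently converted to null.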
Reading empty strings from Oracle.
When reading from or writing to an Oracle database, be aware
that, unlike SPSS Modeler and unlike most other databases, Oracle treats and stores empty string
values as equivalent to null values. This means that the same data extracted from an Oracle
database may behave differently than when extracted from a file or another database, and the data
may return different results.
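As a rough illustration of why results can differ, the following Python sketch compares the same record as it might arrive from a file (empty string kept) and from Oracle (empty string stored as null). The records and the helper `empty_to_null` are hypothetical; the sketch does not connect to any database, and converting empty strings to null is only one possible way to make the two sources comparable.

```python
from typing import Any, Dict


def empty_to_null(row: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the row with empty strings replaced by None (hypothetical helper)."""
    return {name: (None if value == "" else value) for name, value in row.items()}


# Hypothetical record as read from a flat file: the empty string survives.
from_file = {"id": 1, "comment": ""}

# The same record after a round trip through Oracle: the empty string is null.
from_oracle = {"id": 1, "comment": None}

# Compared directly, the two sources disagree.
matches_raw = from_file == from_oracle

# After normalizing empty strings to null, they agree.
matches_normalized = empty_to_null(from_file) == from_oracle

print(matches_raw, matches_normalized)
```

Any test for empty strings or count of null values applied to the two raw records would give different answers, which is the behavior the paragraph above warns about.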
Handling Missing Values
You should decide how to treat missing values in light of your business or domain knowledge. To
reduce training time and increase accuracy, you may want to remove blanks from your data set. On
the other hand, the presence of blank values may lead to new business opportunities or additional
insights. In choosing the best technique, you should consider the following aspects of your data:
Size of the data set
Number of fields containing blanks
Amount of missing information
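The aspects listed above can be measured before choosing a technique. The following is a minimal Python sketch, not an SPSS Modeler feature: the function `profile_missing`, the sample records, and the choice to treat `None` and the empty string as blanks are all assumptions made for illustration.

```python
from typing import Any, Dict, List, Sequence


def profile_missing(records: List[Dict[str, Any]],
                    blanks: Sequence[Any] = (None, "")) -> Dict[str, Any]:
    """Summarize data set size, fields containing blanks, and amount of missing data."""
    fields = sorted({name for record in records for name in record})
    # Count blank values per field; a field absent from a record counts as blank.
    blank_counts = {
        field: sum(1 for record in records if record.get(field) in blanks)
        for field in fields
    }
    total_cells = len(records) * len(fields)
    missing_cells = sum(blank_counts.values())
    return {
        "rows": len(records),
        "fields_with_blanks": sum(1 for count in blank_counts.values() if count > 0),
        "missing_cells": missing_cells,
        "missing_fraction": missing_cells / total_cells if total_cells else 0.0,
        "blank_counts": blank_counts,
    }


# Hypothetical sample data.
records = [
    {"age": 34, "income": 52000, "region": "north"},
    {"age": None, "income": 61000, "region": ""},
    {"age": 45, "income": None, "region": "south"},
    {"age": 29, "income": 48000, "region": "east"},
]

summary = profile_missing(records)
print(summary)
```

A summary like this gives a quick basis for the decision: a large data set with few blanks may tolerate removing the affected records, while a small data set or one with blanks spread across many fields may not.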