have been resolved adequately. Similarly, the evaluation phase can lead you to reevaluate your
original business understanding, and you may decide that you have been trying to answer the
wrong question. At this point, you can revise your business understanding and proceed through
the rest of the process again with a better target in mind.
The second key point is the iterative nature of data mining. You will rarely, if ever, simply
plan a data mining project, complete it, and then pack up your data and go home. Data mining to
address your customers’ demands is an ongoing endeavor. The knowledge gained from one cycle
of data mining will almost invariably lead to new questions, new issues, and new opportunities
to identify and meet your customers’ needs. Those new questions, issues, and opportunities can
usually be addressed by mining your data once again. This process of mining and identifying new
opportunities should become part of the way you think about your business and a cornerstone of
your overall business strategy.
This introduction provides only a brief overview of the CRISP-DM process model. For
complete details on the model, consult the following resources:
The CRISP-DM Guide, which can be accessed along with other documentation from the
\Documentation folder on the installation disk.
The CRISP-DM Help system, available from the Start menu or by clicking CRISP-DM Help on
the Help menu in IBM® SPSS® Modeler.
Types of Models
IBM® SPSS® Modeler offers a variety of modeling methods taken from machine learning,
artificial intelligence, and statistics. The methods available on the Modeling palette allow you
to derive new information from your data and to develop predictive models. Each method has
certain strengths and is best suited for particular types of problems.
The SPSS Modeler Applications Guide provides examples for many of these methods, along
with a general introduction to the modeling process. This guide is available as an online tutorial,
and also in PDF format. For more information, see the topic Application Examples in Chapter 1
on p. 5.
Modeling methods are divided into three categories:
Classification
Association
Segmentation
Classification Models
Classification models use the values of one or more input fields to predict the value of one or
more output, or target, fields. Some examples of these techniques are: decision trees (C&R Tree,
QUEST, CHAID and C5.0 algorithms), regression (linear, logistic, generalized linear, and Cox
regression algorithms), neural networks, support vector machines, and Bayesian networks.
Classification models help organizations to predict a known result, such as whether a customer
will buy or leave or whether a transaction fits a known pattern of fraud. Modeling techniques
include machine learning, rule induction, subgroup identification, statistical methods, and multiple
model generation.
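
The idea of using input fields to predict a target field can be sketched in a few lines of Python. The example below is a minimal illustration only, not the algorithm that SPSS Modeler uses: it fits a one-level decision tree (a "decision stump"), which is far simpler than C&R Tree, QUEST, CHAID, or C5.0. The field names (tenure_months, complaints, churn), the records, and the function names are all hypothetical and chosen for this sketch.

```python
# Illustrative sketch: a one-level decision tree ("decision stump") in plain Python.
# The records, field names, and function names are invented for this example.

def majority(values):
    """Return the most frequent value; ties are broken alphabetically."""
    return max(sorted(set(values)), key=values.count)

def fit_stump(records, input_fields, target_field):
    """Find the single field/threshold split that misclassifies the fewest records."""
    best = None
    for field in input_fields:
        for threshold in sorted({r[field] for r in records}):
            left = [r[target_field] for r in records if r[field] <= threshold]
            right = [r[target_field] for r in records if r[field] > threshold]
            if not left or not right:
                continue  # a split must put records on both sides
            left_label, right_label = majority(left), majority(right)
            errors = (sum(v != left_label for v in left)
                      + sum(v != right_label for v in right))
            if best is None or errors < best["errors"]:
                best = {"field": field, "threshold": threshold,
                        "left_label": left_label, "right_label": right_label,
                        "errors": errors}
    return best

def predict(stump, record):
    """Predict the target value for one record using the fitted split."""
    if record[stump["field"]] <= stump["threshold"]:
        return stump["left_label"]
    return stump["right_label"]

# Hypothetical training data: two input fields and one target field (churn).
records = [
    {"tenure_months": 2,  "complaints": 3, "churn": "yes"},
    {"tenure_months": 5,  "complaints": 1, "churn": "yes"},
    {"tenure_months": 8,  "complaints": 2, "churn": "yes"},
    {"tenure_months": 14, "complaints": 0, "churn": "no"},
    {"tenure_months": 24, "complaints": 2, "churn": "no"},
    {"tenure_months": 36, "complaints": 0, "churn": "no"},
]

stump = fit_stump(records, ["tenure_months", "complaints"], "churn")
new_customer = {"tenure_months": 3, "complaints": 0}
prediction = predict(stump, new_customer)
```

Here the fitted rule plays the role of the model and predict scores a new record against it. A full decision tree algorithm applies the same kind of split repeatedly and uses more refined criteria for choosing each split.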