IBM 15 Switch User Manual


 
Chapter 4
A Strategy for Data Mining
As with most business endeavors, data mining is much more effective if done in a planned,
systematic way. Even with cutting-edge data mining tools, such as IBM® SPSS® Modeler, the
majority of the work in data mining requires a knowledgeable business analyst to keep the process
on track. To guide your planning, answer the following questions:
What substantive problem do you want to solve?
What data sources are available, and what parts of the data are relevant to the current problem?
What kind of preprocessing and data cleaning do you need to do before you start mining
the data?
What data mining technique(s) will you use?
How will you evaluate the results of the data mining analysis?
How will you get the most out of the information you obtained from data mining?
The typical data mining process can become complicated very quickly. There is a lot to keep track
of: complex business problems, multiple data sources, varying data quality across data sources,
an array of data mining techniques, different ways of measuring data mining success, and so on.
To stay on track, it helps to have an explicitly defined process model for data mining. The
process model helps you answer the questions listed earlier in this section, and makes sure the
important points are addressed. It serves as a data mining road map so that you will not lose your
way as you dig into the complexities of your data.
The data mining process suggested for use with SPSS Modeler is the Cross-Industry Standard
Process for Data Mining (CRISP-DM). As you can tell from the name, this model is designed as a
general model that can be applied to a wide variety of industries and business problems.
The CRISP-DM Process Model
The general CRISP-DM process model includes six phases that address the main issues in data
mining. The six phases fit together in a cyclical process designed to incorporate data mining
into your larger business practices.