IBM 15 Switch User Manual


 
Chapter 13
Performance Considerations for Streams and Nodes
You can design your streams to maximize performance by arranging the nodes in the most
efficient configuration, by enabling node caches when appropriate, and by paying attention to
other considerations as detailed in this section.
Aside from the considerations discussed here, additional and more substantial performance
improvements can typically be gained by making effective use of your database, particularly
through SQL optimization.
Order of Nodes
Even when you are not using SQL optimization, the order of nodes in a stream can affect
performance. The general goal is to minimize downstream processing; therefore, when you
have nodes that reduce the amount of data, place them near the beginning of the stream. IBM®
SPSS® Modeler Server can apply some reordering rules automatically during compilation to bring
forward certain nodes when it can be proven safe to do so. (This feature is enabled by default.
Check with your system administrator to make sure it is enabled in your installation.)
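The effect of node order can be illustrated outside the product with a minimal Python sketch. The functions derive and select below are hypothetical stand-ins for a derive step and a select step, not SPSS Modeler APIs, and the records are made-up sample data. Because the select condition does not depend on the derived field, moving it forward is safe and produces the same output with far fewer derive calls.

```python
# Hypothetical stand-ins for stream nodes; these are not SPSS Modeler APIs.
derive_calls = 0


def derive(record):
    """A per-record derive step; counts how many records it processes."""
    global derive_calls
    derive_calls += 1
    return {**record, "ratio": record["spend"] / record["visits"]}


def select(record):
    """A select step that keeps one region; it does not use the derived field."""
    return record["region"] == "north"


# Made-up sample data: 1000 records, one in ten from the "north" region.
records = [
    {
        "region": "north" if i % 10 == 0 else "south",
        "spend": 100.0 + i,
        "visits": 1 + i % 5,
    }
    for i in range(1000)
]

# Order A: derive first, then select. Every record is derived.
derive_calls = 0
late = [r for r in map(derive, records) if select(r)]
calls_derive_first = derive_calls

# Order B: select first, then derive. Only the kept records are derived.
derive_calls = 0
early = [derive(r) for r in records if select(r)]
calls_select_first = derive_calls

print(calls_derive_first, calls_select_first)
print(late == early)
```

Both orders yield the same records, but the second runs the derive step only on the records that survive the select.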
When using SQL optimization, you want to maximize its availability and efficiency. Since
optimization halts when the stream contains an operation that cannot be performed in the database,
it is best to group SQL-optimized operations together at the beginning of the stream. This strategy
keeps more of the processing in the database, so less data is carried into IBM® SPSS® Modeler.
The following operations can be done in most databases. Try to group them at the beginning of
the stream:
Merge by key (join)
Select
Aggregate
Sort
Sample
Append
Distinct operations in include mode, in which all fields are selected
Filler operations
Basic derive operations using standard arithmetic or string manipulation (depending on which operations are supported by the database)
Set-to-flag
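As a rough illustration of why keeping select, aggregate, and sort operations in the database reduces the data carried into the client, the following Python sketch uses the standard sqlite3 module with an in-memory table. The sales table, its columns, and the threshold are assumptions made up for this example; the SQL that SPSS Modeler actually generates is not shown here and may differ.

```python
import sqlite3

# In-memory database with a made-up "sales" table (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north" if i % 2 == 0 else "south", float(i)) for i in range(10000)],
)

# Approach A: fetch every row, then select, aggregate, and sort in the client.
all_rows = conn.execute("SELECT region, amount FROM sales").fetchall()
totals = {}
for region, amount in all_rows:
    if amount >= 5000:  # select
        totals[region] = totals.get(region, 0.0) + amount  # aggregate
client_result = sorted(totals.items())  # sort

# Approach B: express the same select, aggregate, and sort in SQL,
# so that only the summarized rows leave the database.
db_result = conn.execute(
    "SELECT region, SUM(amount) FROM sales "
    "WHERE amount >= 5000 GROUP BY region ORDER BY region"
).fetchall()

rows_fetched_client = len(all_rows)
rows_fetched_db = len(db_result)

print(rows_fetched_client, rows_fetched_db)
print(client_result == db_result)

conn.close()
```

The two approaches give the same totals, but the first transfers every row of the table while the second transfers one row per region.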
© Copyright IBM Corporation 1994, 2012.