Intel® IXP42X Product Line and IXC1100 Control Plane Processors: Intel XScale® Processor

Intel® IXP42X Product Line of Network Processors and IXC1100 Control Plane Processor

DM September 2006

190 Order Number: 252480-006US

Note that the prefetches are issued in the reverse order of their usage. If there is a cache conflict and data is evicted from the cache, then only the data from the first prefetch is lost.

3.10.4.4.9 Loop Interchange

As mentioned earlier, the sequence in which data is accessed affects cache thrashing. Usually, it is best to access data in a spatially contiguous address range. However, arrays of data may have been laid out such that indexed elements are not physically next to each other. Consider the following C code, which places array elements in row-major order.

    for(j=0; j<NMAX; j++)
        for(i=0; i<NMAX; i++)
        {
            /* inner loop steps through rows: one full row per access */
            prefetch(A[i+1][j]);
            sum += A[i][j];
        }

In the above example, A[i][j] and A[i+1][j] are not adjacent in memory; they are a full row apart. This situation causes an increase in bus traffic when prefetching loop data. In some cases where the loop mathematics are unaffected, the problem can be resolved by induction variable interchange. The above example becomes:

    for(i=0; i<NMAX; i++)
        for(j=0; j<NMAX; j++)
        {
            /* inner loop steps along a row: consecutive addresses */
            prefetch(A[i][j+1]);
            sum += A[i][j];
        }

3.10.4.4.10 Loop Fusion

Loop fusion is the process of combining multiple loops that reuse the same data into one loop. The advantage is that the reused data is immediately accessible from the data cache. Consider the following example:

    for(i=0; i<NMAX; i++)
    {
        prefetch(A[i+1], b[i+1], c[i+1]);
        A[i] = b[i] + c[i];
    }
    for(i=0; i<NMAX; i++)
    {
        prefetch(D[i+1], c[i+1], A[i+1]);
        D[i] = A[i] + c[i];
    }

The second loop reuses the data elements A[i] and c[i]. Fusing the loops together produces:

    for(i=0; i<NMAX; i++)
    {
        prefetch(D[i+1], A[i+1], c[i+1], b[i+1]);
        ai = b[i] + c[i];   /* keep the sum in a local for reuse */
        A[i] = ai;
        D[i] = ai + c[i];
    }
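The fusion described above can be checked with a small self-contained sketch. It is illustrative only and not taken from the manual: the helper names (separate_loops, fused_loop, fusion_matches_reference), the array size, and the test data are hypothetical, and the PREFETCH macro assumes the GCC/Clang builtin __builtin_prefetch as a stand-in for the manual's prefetch(). It falls back to a no-op on other compilers.

```c
#include <assert.h>
#include <stddef.h>

#define NMAX 16 /* assumed size, for illustration only */

/* Assumption: GCC/Clang builtin stands in for the manual's prefetch(). */
#if defined(__GNUC__)
#define PREFETCH(p) __builtin_prefetch(p)
#else
#define PREFETCH(p) ((void)0)
#endif

/* Two separate loops: the second loop re-reads A[i] and c[i]. */
static void separate_loops(int *A, int *D, const int *b, const int *c)
{
    size_t i;
    for (i = 0; i < NMAX; i++)
        A[i] = b[i] + c[i];
    for (i = 0; i < NMAX; i++)
        D[i] = A[i] + c[i];
}

/* Fused loop: b[i] + c[i] is computed once and reused for D[i]. */
static void fused_loop(int *A, int *D, const int *b, const int *c)
{
    size_t i;
    assert(A && D && b && c);
    for (i = 0; i < NMAX; i++) {
        if (i + 1 < NMAX) { /* do not form addresses past the arrays */
            PREFETCH(&b[i + 1]);
            PREFETCH(&c[i + 1]);
        }
        int ai = b[i] + c[i];
        A[i] = ai;
        D[i] = ai + c[i];
    }
}

/* Returns 1 when both versions produce identical A and D arrays. */
int fusion_matches_reference(void)
{
    int b[NMAX], c[NMAX], A1[NMAX], D1[NMAX], A2[NMAX], D2[NMAX];
    size_t i;
    for (i = 0; i < NMAX; i++) {
        b[i] = (int)i;
        c[i] = 2 * (int)i + 1;
    }
    separate_loops(A1, D1, b, c);
    fused_loop(A2, D2, b, c);
    for (i = 0; i < NMAX; i++)
        if (A1[i] != A2[i] || D1[i] != D2[i])
            return 0;
    return 1;
}
```

Calling fusion_matches_reference() from a test harness confirms that the fused loop computes the same A and D as the two original loops. Whether fusion actually improves performance depends on the cache configuration and should be measured on the target.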

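Returning to the loop interchange of section 3.10.4.4.9, the following sketch shows why the traversal order matters for a row-major C array and that interchanging the induction variables leaves the result unchanged. It is a minimal illustration under stated assumptions, not code from the manual: the function names, the array size, and the fill pattern are hypothetical, and PREFETCH again assumes the GCC/Clang builtin __builtin_prefetch in place of the manual's prefetch().

```c
#include <assert.h>
#include <stddef.h>

#define NMAX 8 /* assumed size, for illustration only */

/* Assumption: GCC/Clang builtin stands in for the manual's prefetch(). */
#if defined(__GNUC__)
#define PREFETCH(p) __builtin_prefetch(p)
#else
#define PREFETCH(p) ((void)0)
#endif

/* Original order: the inner loop varies i, so successive accesses
 * are one full row (NMAX elements) apart in memory. */
long sum_column_first(int A[NMAX][NMAX])
{
    long sum = 0;
    size_t i, j;
    for (j = 0; j < NMAX; j++)
        for (i = 0; i < NMAX; i++)
            sum += A[i][j];
    return sum;
}

/* Interchanged order: the inner loop varies j, so successive accesses
 * touch consecutive addresses. */
long sum_row_first(int A[NMAX][NMAX])
{
    long sum = 0;
    size_t i, j;
    for (i = 0; i < NMAX; i++)
        for (j = 0; j < NMAX; j++) {
            if (j + 1 < NMAX) /* stay inside the current row */
                PREFETCH(&A[i][j + 1]);
            sum += A[i][j];
        }
    return sum;
}

/* Distance, in elements, between A[0][0] and A[1][0]. */
size_t row_stride_elems(void)
{
    static int A[NMAX][NMAX];
    return (size_t)((char *)&A[1][0] - (char *)&A[0][0]) / sizeof(int);
}

/* Returns 1 when both traversal orders yield the same sum. */
int interchange_preserves_sum(void)
{
    int A[NMAX][NMAX];
    size_t i, j;
    for (i = 0; i < NMAX; i++)
        for (j = 0; j < NMAX; j++)
            A[i][j] = (int)(i * NMAX + j);
    return sum_column_first(A) == sum_row_first(A);
}
```

row_stride_elems() returns NMAX, which is the distance the column-first loop jumps on every inner iteration. The row-first loop advances by a single element instead. The interchange is valid here only because summation does not depend on the order of the accesses.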