Texas Instruments TMS320 DSP Computer Hardware User Manual


 
www.ti.com
2.3 Data Memory

    void PRE_filter1(int input[], int length, int *z)
    {
        int i, tmp;

        for (i = 0; i < length; i++) {
            /* z[0] and z[1] hold the two previous input samples */
            tmp = input[i] - z[0] + (13 * z[1] + 16) / 32;
            z[1] = z[0];
            z[0] = input[i];
            input[i] = tmp;
        }
    }

This technique of replacing references to global data with references to parameters illustrates a general
technique that can be used to make virtually any code reentrant. One simply defines a "state object" as
one that contains all of the state necessary for the algorithm; a pointer to this state is passed to the
algorithm (along with the input and output data).

    typedef struct PRE_Obj {    /* state obj for pre-emphasis alg */
        int z0;
        int z1;
    } PRE_Obj;

    void PRE_filter2(PRE_Obj *pre, int input[], int length)
    {
        int i, tmp;

        for (i = 0; i < length; i++) {
            tmp = input[i] - pre->z0 + (13 * pre->z1 + 16) / 32;
            pre->z1 = pre->z0;
            pre->z0 = input[i];
            input[i] = tmp;
        }
    }

Although the C code looks more complicated than our original implementation, its performance is
comparable, it is fully reentrant, and its performance can be configured on a "per data object" basis. Since
each state object can be placed in any data memory, it is possible to place some objects in on-chip
memory and others in external memory. The pointer to the state object is, in effect, the function's private
"data page pointer." All of the function's data can be efficiently accessed by a constant offset from this
pointer.
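
As a minimal sketch of the multi-channel case described above, the same filter function can serve two independent channels, each with its own state object. The names chanA, chanB, and process_frames are hypothetical and are not part of the original text. Where each object actually lives (on-chip or external memory) would be decided by the application through toolchain-specific placement, which is not shown here.

```c
/* State object for the pre-emphasis algorithm (same layout as above). */
typedef struct PRE_Obj {
    int z0;    /* previous input sample */
    int z1;    /* input sample before that */
} PRE_Obj;

/* Reentrant filter: all state is reached through pre, none is global. */
void PRE_filter2(PRE_Obj *pre, int input[], int length)
{
    int i, tmp;

    for (i = 0; i < length; i++) {
        tmp = input[i] - pre->z0 + (13 * pre->z1 + 16) / 32;
        pre->z1 = pre->z0;
        pre->z0 = input[i];
        input[i] = tmp;
    }
}

/* Hypothetical per-channel state objects. The application, not the
 * algorithm, chooses which memory each one is placed in. */
static PRE_Obj chanA = { 0, 0 };
static PRE_Obj chanB = { 0, 0 };

/* Hypothetical helper: filter one frame per channel, in place.
 * Each call touches only the state object it is handed. */
void process_frames(int frameA[], int frameB[], int length)
{
    PRE_filter2(&chanA, frameA, length);
    PRE_filter2(&chanB, frameB, length);
}
```

Because neither call reads or writes the other channel's state, the two calls could equally be issued from different threads, provided each state object is used by only one thread at a time.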

Notice that while performance is comparable to our original implementation, the code is slightly larger and slower
because of the state object redirection. Directly referencing global data is often more efficient than
referencing data via an address register. On the other hand, the decrease in efficiency can usually be
factored out of the time-critical loop and into the loop-setup code. Thus, the incremental performance cost
is minimal and the benefit is that this same code can be used in virtually any system, independent of
whether the system must support a single channel or multiple channels, or whether it is preemptive or
non-preemptive.

"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root
of all evil." (Donald Knuth, "Structured Programming with go to Statements," Computing Surveys, Vol. 6,
No. 4, December 1974, page 268.)
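
The remark above about moving the cost of the state-object indirection out of the time-critical loop can be sketched as follows. This is an illustrative variant, not code from the original text: the name PRE_filter3 is hypothetical, and it assumes that input[] does not overlap the state object. The state is read into local variables during loop setup and written back once after the loop, so the loop body itself contains no accesses through pre.

```c
/* State object for the pre-emphasis algorithm (same layout as above). */
typedef struct PRE_Obj {
    int z0;
    int z1;
} PRE_Obj;

/* Hypothetical variant of PRE_filter2 with the indirection hoisted.
 * Assumption: input[] does not alias *pre. */
void PRE_filter3(PRE_Obj *pre, int input[], int length)
{
    int i, tmp;
    int z0 = pre->z0;    /* loop setup: read state once */
    int z1 = pre->z1;

    for (i = 0; i < length; i++) {
        /* loop body works only on locals and the data buffer */
        tmp = input[i] - z0 + (13 * z1 + 16) / 32;
        z1 = z0;
        z0 = input[i];
        input[i] = tmp;
    }

    pre->z0 = z0;        /* loop exit: write state back once */
    pre->z1 = z1;
}
```

An optimizing compiler may perform this transformation on its own, but because input[] and the members of *pre are all of type int, it may be unable to prove they do not overlap; writing the hoist explicitly removes that dependency on the compiler.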

The performance difference between on-chip data memory and off-chip memory (even 0 wait-state
SRAM) is so large that every algorithm vendor designs their code to operate as much as possible within
the on-chip memory. Since the performance gap is expected to increase dramatically in the next 3-5
years, this trend will continue for the foreseeable future. The TMS320C6000 series, for example, incurs a
25 wait state penalty for external SDRAM data memory access. Future processors may see this penalty
increase to 80 or even 100 wait states!

SPRU352G, June 2005, Revised February 2007, General Programming Guidelines, 19