Writing ARM and Thumb Assembly Language
2-24 Copyright © 2000, 2001 ARM Limited. All rights reserved. ARM DUI 0068B
Converting to Thumb
Because B is the only Thumb instruction that can be executed conditionally, the gcd
algorithm must be written with conditional branches in Thumb code.
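As a rough illustration of what a branch-only version looks like, the following Python sketch interprets a hypothetical seven-instruction sequence in which every decision is made by a conditional branch. The listing, the label names (gcd, less, end) and the helper run_thumb_gcd are assumptions made for this example. The interpreter models only the comparison result, not the real processor flags.

```python
# Hypothetical seven-instruction, branch-only gcd sequence (an assumption for
# illustration). Each entry is (mnemonic, operands...).
PROGRAM = [
    ("CMP", "r0", "r1"),        # 0: gcd
    ("BEQ", "end"),             # 1
    ("BLT", "less"),            # 2
    ("SUB", "r0", "r0", "r1"),  # 3
    ("B", "gcd"),               # 4
    ("SUB", "r1", "r1", "r0"),  # 5: less
    ("B", "gcd"),               # 6
]

# Label -> instruction index; "end" is the first index past the sequence.
LABELS = {"gcd": 0, "less": 5, "end": 7}


def run_thumb_gcd(a, b):
    """Interpret PROGRAM for positive integers a and b.

    Returns (gcd, number of instructions executed).
    """
    regs = {"r0": a, "r1": b}
    diff = 0       # result of the last CMP (r0 - r1), standing in for the flags
    pc = 0         # index of the next instruction
    executed = 0
    while pc < len(PROGRAM):
        op, *args = PROGRAM[pc]
        executed += 1
        pc += 1
        if op == "CMP":
            diff = regs[args[0]] - regs[args[1]]
        elif op == "SUB":
            regs[args[0]] = regs[args[1]] - regs[args[2]]
        elif op == "B" or (op == "BEQ" and diff == 0) or (op == "BLT" and diff < 0):
            pc = LABELS[args[0]]   # branch taken
    return regs["r0"], executed
```

In this model only the three branch instructions depend on the comparison result. The two SUB instructions always execute when reached, which is why the choice between them has to be made by branching around them.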
Like the ARM conditional branch implementation, the Thumb code requires seven
instructions. However, because Thumb instructions are only 16 bits long, the overall
code size is 14 bytes, compared to 16 bytes for the smaller ARM implementation.
In addition, on a system using 16-bit memory the Thumb version runs faster than the
second ARM implementation because only one memory access is required for each
Thumb instruction, whereas each ARM instruction requires two fetches.
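The size and fetch figures above follow from simple arithmetic, sketched here in Python. The helper names are hypothetical. The ARM instruction count of four is inferred from the 16-byte size quoted above (16 bytes / 4 bytes per instruction) and is not stated directly.

```python
def code_size_bytes(instruction_count, instruction_bits):
    """Total code size in bytes for fixed-width instructions."""
    return instruction_count * instruction_bits // 8


def fetches_per_instruction(instruction_bits, bus_bits):
    """Memory accesses needed to fetch one instruction (ceiling division)."""
    return -(-instruction_bits // bus_bits)


# Thumb: seven 16-bit instructions.
thumb_size = code_size_bytes(7, 16)

# ARM: four 32-bit instructions (count inferred from the 16-byte figure).
arm_size = code_size_bytes(4, 32)

# On 16-bit memory, each Thumb instruction takes one access, each ARM one two.
thumb_fetches = fetches_per_instruction(16, 16)
arm_fetches = fetches_per_instruction(32, 16)
```

Under these assumptions the Thumb version is 14 bytes and the ARM version is 16 bytes, with one and two fetches per instruction respectively on a 16-bit bus.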
Branch prediction and caches
To optimize code for execution speed, you need detailed knowledge of the instruction
timings, branch prediction logic, and cache behavior of your target system. Refer to
the ARM Architecture Reference Manual and the technical reference manuals for
individual processors for full information.
Table 2-3 All instructions conditional
r0: a  r1: b  Instruction          Cycles (ARM7)
1      2      CMP r0, r1           1
1      2      SUBGT r0, r0, r1     1 (not executed)
1      1      SUBLT r1, r1, r0     1
1      1      BNE gcd              3
1      1      CMP r0, r1           1
1      1      SUBGT r0, r0, r1     1 (not executed)
1      1      SUBLT r1, r1, r0     1 (not executed)
1      1      BNE gcd              1 (not executed)
                                   Total = 10
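The cycle total in Table 2-3 can be reproduced with a small model, sketched here in Python. The function name trace_conditional_gcd is hypothetical. The cycle costs are assumptions read from the table itself: one cycle for every instruction, executed or not, and three cycles for a taken branch.

```python
def trace_conditional_gcd(a, b):
    """Model the loop CMP / SUBGT / SUBLT / BNE for positive integers a and b.

    Returns (gcd, total_cycles, rows). Each row is
    (r0, r1, instruction, cycles), with register values after the instruction.
    """
    r0, r1 = a, b
    rows = []
    while True:
        # CMP r0, r1: the comparison result is kept for the rest of the
        # iteration, because the conditional SUBs do not update the flags.
        diff = r0 - r1
        rows.append((r0, r1, "CMP r0, r1", 1))

        # SUBGT r0, r0, r1: takes effect only if r0 > r1, costs 1 cycle either way.
        if diff > 0:
            r0 -= r1
        rows.append((r0, r1, "SUBGT r0, r0, r1", 1))

        # SUBLT r1, r1, r0: takes effect only if r0 < r1.
        if diff < 0:
            r1 -= r0
        rows.append((r0, r1, "SUBLT r1, r1, r0", 1))

        # BNE gcd: 3 cycles when taken, 1 cycle when it falls through.
        taken = diff != 0
        rows.append((r0, r1, "BNE gcd", 3 if taken else 1))
        if not taken:
            break

    total_cycles = sum(row[3] for row in rows)
    return r0, total_cycles, rows
```

Calling trace_conditional_gcd(1, 2) yields eight rows whose cycle counts sum to 10, matching the table. Other inputs can be traced the same way, but the result is only as accurate as the assumed cycle costs.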