UPSD3422_06 Datasheet, PDF (25/293 Pages) STMicroelectronics – Turbo Plus Series Fast Turbo 8032 MCU with USB and Programmable Logic
uPSD34xx
8032 MCU core performance enhancements
four MCU clocks). But it is also important to understand PFQ operation on multi-cycle
instructions.
5.2
PFQ example, multi-cycle instructions
Let us look at a string of two-byte, two-cycle instructions in Figure 9 on page 25. Three
instructions, A, B, and C, are executed sequentially in this example. Each of the time
divisions in the figure is one machine cycle of four clocks, and there are six phases
to reference in this discussion. Each instruction is pre-fetched into the PFQ in advance of
execution by the MCU. Prior to Phase 1, the PFQ has pre-fetched the two instruction bytes
(A1 and A2) of Instruction A. During Phase 1, both bytes are loaded into the MCU
execution unit. Also in Phase 1, the PFQ is pre-fetching Instruction B (bytes B1 and B2) from
program memory. In Phase 2, the MCU is processing Instruction A internally while the PFQ
is pre-fetching Instruction C. In Phase 3, both bytes of Instruction B are loaded into the MCU
execution unit and the PFQ begins to pre-fetch bytes for the next instruction. In Phase 4,
Instruction B is processed. Phases 5 and 6 repeat the pattern for Instruction C: its two
bytes (C1 and C2) are loaded in Phase 5 and it is processed in Phase 6.
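
The phase sequence described above can be sketched in a few lines of Python. This is only an illustrative model, not part of the datasheet: the function name `mcu_timeline` is hypothetical, and the model assumes every instruction is two-byte, two-cycle and that the PFQ never stalls the MCU.

```python
CLOCKS_PER_MACHINE_CYCLE = 4  # from the text: one machine cycle is four MCU clocks


def mcu_timeline(instructions):
    """Return (phase, activity) pairs for a run of two-byte, two-cycle instructions.

    Simplified model (an assumption): each instruction spends one phase
    loading both bytes from the PFQ into the execution unit and one phase
    being processed, and the PFQ always has the next instruction ready.
    """
    timeline = []
    phase = 1
    for name in instructions:
        timeline.append((phase, f"load {name}1 and {name}2 into execution unit"))
        timeline.append((phase + 1, f"process {name}"))
        phase += 2
    return timeline


timeline = mcu_timeline(["A", "B", "C"])
for phase, activity in timeline:
    print(f"Phase {phase}: {activity}")

# Each phase is one machine cycle, so the clock count follows directly.
total_clocks = len(timeline) * CLOCKS_PER_MACHINE_CYCLE
print(f"{len(timeline)} phases = {total_clocks} MCU clocks")
```

Under these assumptions, the three instructions occupy six phases, matching the six phases of Figure 9.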
The uPSD34xx MCU instructions are an exact 1/3 scale of all standard 8032 instructions
with regard to the number of MCU clocks per instruction. Figure 10 on page 26 shows the
equivalent instruction sequence from the example above on a standard 8032 for comparison.
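
The 1/3 scaling can be checked with simple arithmetic. The sketch below assumes the classic 12-clock machine cycle for a standard 8032, a figure that is not stated in this section; the 4-clock machine cycle is taken from the text. The helper name `clocks_per_instruction` is hypothetical.

```python
UPSD34XX_CLOCKS_PER_MACHINE_CYCLE = 4        # from the text
STANDARD_8032_CLOCKS_PER_MACHINE_CYCLE = 12  # assumption: classic 8032 timing


def clocks_per_instruction(machine_cycles, clocks_per_machine_cycle):
    """Total MCU clocks for an instruction with the given machine-cycle count."""
    return machine_cycles * clocks_per_machine_cycle


for machine_cycles in (1, 2, 4):
    fast = clocks_per_instruction(machine_cycles, UPSD34XX_CLOCKS_PER_MACHINE_CYCLE)
    slow = clocks_per_instruction(machine_cycles, STANDARD_8032_CLOCKS_PER_MACHINE_CYCLE)
    print(f"{machine_cycles}-cycle instruction: "
          f"uPSD34xx {fast} clocks, standard 8032 {slow} clocks")
```

Because the machine-cycle count per instruction is the same on both cores in this model, the clock ratio is 4:12 for every instruction.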
5.3
Aggregate performance
The stream of two-byte, two-cycle instructions in Figure 9 on page 25, running on a 40 MHz,
5 V uPSD34xx, will yield 5 MIPS. The stream of one- or two-byte, one-cycle instructions in
Figure 7 on page 23 yields 10 MIPS on the same MCU. Effective performance
will depend on a number of things: the MCU clock frequency; the mixture of instruction
types (bytes and cycles) in the application; the amount of time an empty PFQ stalls the MCU
(mix of instruction types and misses on Branch Cache); and the operating voltage. A 5 V
uPSD34xx device operates with four memory wait states, but a 3.3 V device operates with
five memory wait states, yielding 8 MIPS peak compared to 10 MIPS peak for the 5 V device. The
same number of wait states will apply to both program fetches and data READ/WRITEs
unless otherwise specified in the SFR named BUSCON.
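
The 5 MIPS and 10 MIPS figures for the 40 MHz, 5 V device follow from the 4-clock machine cycle. The sketch below reproduces that arithmetic and extends it to a mixed instruction stream. The function names and the 50/50 mix are hypothetical, and the model deliberately ignores PFQ stalls, Branch Cache misses, and wait-state effects, so it gives an upper bound rather than a prediction.

```python
def peak_mips(clock_mhz, cycles_per_instruction, clocks_per_machine_cycle=4):
    """Peak MIPS for a stream of identical instructions, assuming no PFQ stalls."""
    return clock_mhz / (clocks_per_machine_cycle * cycles_per_instruction)


def mixed_mips(clock_mhz, mix, clocks_per_machine_cycle=4):
    """Upper-bound MIPS for an instruction mix, assuming no PFQ stalls.

    `mix` maps machine cycles per instruction to the fraction of executed
    instructions with that cycle count; the fractions should sum to 1.
    """
    average_cycles = sum(cycles * fraction for cycles, fraction in mix.items())
    return clock_mhz / (clocks_per_machine_cycle * average_cycles)


one_cycle = peak_mips(40, 1)   # stream of one-cycle instructions (Figure 7 case)
two_cycle = peak_mips(40, 2)   # stream of two-cycle instructions (Figure 9 case)
half_and_half = mixed_mips(40, {1: 0.5, 2: 0.5})  # hypothetical 50/50 mix

print(f"one-cycle stream: {one_cycle:.1f} MIPS")
print(f"two-cycle stream: {two_cycle:.1f} MIPS")
print(f"50/50 mix:        {half_and_half:.2f} MIPS")
```

A real application would land below the mixed figure by an amount that depends on how often the PFQ runs empty.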
In general, a 3X aggregate performance increase is expected over any standard 8032
application running at the same clock frequency.
Figure 9. PFQ operation on multi-cycle instructions
(Timing diagram: three 2-byte, 2-cycle instructions on uPSD34xx. Each phase is one 4-clock machine cycle.)
PFQ: pre-fetch Inst A (before Phase 1); pre-fetch Inst B and C (Phases 1 and 2); pre-fetch next Inst and continue to pre-fetch (Phase 3 onward).
MCU execution: previous instruction; Phase 1, load A1 and A2; Phase 2, process A; Phase 3, load B1 and B2; Phase 4, process B; Phase 5, load C1 and C2; Phase 6, process C; next Inst.