PXS20RM Datasheet, PDF (382/1368 Pages) Freescale Semiconductor, Inc – PXS20 Microcontroller
e200z4d Core Complex Overview
NOTES:
1 Vector to [p_rstbase[0:29]] || 0b0.
2 Autovectored external and critical input interrupts use this IVOR. Vectored interrupts supply an interrupt vector
offset directly.
17.4 Microarchitecture summary
The e200z4d processor utilizes a five-stage pipeline for instruction execution. These stages operate in an
overlapped fashion, allowing single clock-cycle instruction execution for most instructions. The stages are
as follows:
1. Instruction fetch
2. Instruction decode/register file read/effective address calculation
3. Execute 0/memory access 0
4. Execute 1/memory access 1
5. Register write-back
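The overlap of the five stages listed above can be sketched with a small timing model. This is an idealized illustration only: it assumes one instruction enters the pipeline per cycle with no stalls, and it ignores any multiple-issue capability. The stage labels and function names are shorthand invented for the sketch, not names from the manual.

```python
# Stage labels are shorthand invented for this sketch, not names from the manual.
STAGES = ("IF", "DEC", "EX0", "EX1", "WB")


def stage_schedule(n_instructions):
    """Map each instruction index to the cycle (from 0) in which it occupies each stage.

    Assumes an ideal pipeline: one instruction enters per cycle, no stalls.
    """
    return [
        {stage: i + depth for depth, stage in enumerate(STAGES)}
        for i in range(n_instructions)
    ]


def total_cycles(n_instructions):
    """Total cycles elapsed when the last instruction completes write-back."""
    if n_instructions == 0:
        return 0
    return n_instructions + len(STAGES) - 1
```

In this model ten instructions complete in 14 cycles rather than 50, so the steady-state rate approaches one instruction per clock cycle.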
The integer execution units consist of a 32-bit arithmetic unit, a logic unit, a 32-bit barrel shifter, a
mask-insertion unit, a condition register manipulation unit, a count-leading-zeros unit, a 32 × 32 hardware
multiplier array, and result feed-forward hardware. Integer unit 1 also supports hardware division.
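As an illustration of what the barrel shifter and mask-insertion unit compute together, the following is a minimal reference model of a rotate-then-mask-insert operation in the style of the Power Architecture rlwimi instruction. That these units serve this instruction is an assumption of the sketch; the text above does not name specific instructions. Bit numbering follows the Power convention, with bit 0 as the most significant bit.

```python
WORD_MASK = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate a 32-bit value left by n bit positions."""
    x &= WORD_MASK
    n &= 31
    return ((x << n) | (x >> (32 - n))) & WORD_MASK


def mask32(mb, me):
    """Ones from bit mb through bit me (bit 0 = MSB); wraps around if mb > me."""
    from_mb = WORD_MASK >> mb
    up_to_me = (WORD_MASK << (31 - me)) & WORD_MASK
    return (from_mb & up_to_me) if mb <= me else (from_mb | up_to_me)


def rlwimi(ra, rs, sh, mb, me):
    """Rotate rs left by sh, then insert it into ra under the mask mb..me."""
    m = mask32(mb, me)
    return (rotl32(rs, sh) & m) | (ra & ~m & WORD_MASK)
```

For example, rlwimi(0xAAAAAAAA, 0x12345678, 8, 24, 31) replaces only the low byte of the first operand with the top byte of the second.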
Most arithmetic and logical operations execute in a single cycle. The exceptions are multiply, which is
implemented with a 2-cycle pipelined hardware array, and the divide instructions. The count-leading-zeros
unit operates in a single clock cycle.
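The result produced by a count-leading-zeros operation can be described with a short reference model. The sketch below follows the semantics of the Power Architecture cntlzw instruction (a zero input yields 32); it describes the result only and says nothing about how the hardware unit is built.

```python
def cntlzw(x):
    """Count leading zero bits of a 32-bit word; returns 32 when the word is zero."""
    x &= 0xFFFFFFFF
    return 32 - x.bit_length()
```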
The instruction unit contains a program counter incrementer and dedicated branch address adder to
minimize delays during change-of-flow operations. Sequential prefetching is performed to ensure a supply
of instructions into the execution pipeline. Branch target prefetching using the BTB is performed to
accelerate taken branches. Prefetched instructions are placed into an 8-entry instruction buffer, with each
entry capable of holding a single 32-bit instruction or a pair of 16-bit instructions.
Branch target addresses are calculated in parallel with branch instruction decode. Conditional branches
that are not taken execute in a single clock cycle. Branches with successful BTB target prefetching have
an effective execution time of one clock cycle if correctly predicted. All other taken branches have an
execution time of two clock cycles.
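The branch timings stated above can be collected into a simple cost model. This is a sketch under stated assumptions: the function and parameter names are hypothetical, and the cases the text does not quantify (for example, a BTB prediction that turns out to be wrong) are deliberately not modeled.

```python
def branch_cycles(taken, btb_predicted_correctly=False):
    """Cycles for one branch, using only the three cases given in the text."""
    if not taken:
        return 1  # not-taken conditional branch
    if btb_predicted_correctly:
        return 1  # taken, target prefetched via the BTB and correctly predicted
    return 2      # all other taken branches


def average_branch_cycles(p_taken, p_btb_correct):
    """Expected cycles per branch.

    p_taken:       fraction of branches that are taken
    p_btb_correct: fraction of taken branches with a successful, correct BTB prefetch
    """
    not_taken = (1.0 - p_taken) * branch_cycles(False)
    taken_hit = p_taken * p_btb_correct * branch_cycles(True, True)
    taken_miss = p_taken * (1.0 - p_btb_correct) * branch_cycles(True, False)
    return not_taken + taken_hit + taken_miss
```

With 60% of branches taken and half of those served correctly by the BTB, the model gives an average of 1.3 cycles per branch.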
Memory load and store operations are provided for byte, half-word, word (32-bit), and double-word data
with automatic zero or sign extension of byte and half-word load data as well as optional byte reversal of
data. These instructions can be pipelined to allow effective single-cycle throughput. Load and store
multiple word instructions allow low-overhead context save and restore operations. The load/store unit
contains a dedicated effective address adder to allow effective address generation to be optimized. There
is a single load-to-use bubble for load instructions.
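The data formatting options and the load-to-use bubble described above can be illustrated as follows. This is a minimal sketch: the helper names are hypothetical, and the bubble counter assumes a simple in-order, one-instruction-per-cycle model in which a bubble occurs only when an instruction reads a register loaded by the instruction immediately before it.

```python
def zero_extend(value, bits):
    """Zero-extend a bits-wide value into a 32-bit register image."""
    return value & ((1 << bits) - 1)


def sign_extend(value, bits):
    """Sign-extend a bits-wide value into a 32-bit register image."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return ((value ^ sign) - sign) & 0xFFFFFFFF


def byte_reverse32(word):
    """Reverse the byte order of a 32-bit word."""
    return int.from_bytes((word & 0xFFFFFFFF).to_bytes(4, "big"), "little")


def load_use_bubbles(program):
    """Count load-to-use bubbles in a list of (dest_reg, source_regs, is_load) tuples."""
    bubbles = 0
    for prev, cur in zip(program, program[1:]):
        prev_dest, _, prev_is_load = prev
        _, cur_sources, _ = cur
        if prev_is_load and prev_dest in cur_sources:
            bubbles += 1
    return bubbles
```

Under this model, placing one independent instruction between a load and its first use removes the bubble.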
The condition register unit supports the condition register (CR) and condition register operations defined
by the architecture. The condition register consists of eight 4-bit fields that reflect the results of certain
operations, such as move, integer and floating-point compare, arithmetic, and logical instructions. It also
provides a mechanism for testing and branching.
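The layout of eight 4-bit fields can be sketched as below. The bit names (LT, GT, EQ, SO) and the placement of field 0 in the most significant four bits follow the general Power Architecture convention and are not taken from the text above; the helper names are hypothetical.

```python
# Bit meanings within one 4-bit CR field (Power Architecture convention).
LT, GT, EQ, SO = 0b1000, 0b0100, 0b0010, 0b0001


def get_cr_field(cr, n):
    """Extract field n (0..7) from a 32-bit CR image; field 0 is the top nibble."""
    return (cr >> (4 * (7 - n))) & 0xF


def set_cr_field(cr, n, value):
    """Return a CR image with field n replaced by the low four bits of value."""
    shift = 4 * (7 - n)
    return (cr & ~(0xF << shift) & 0xFFFFFFFF) | ((value & 0xF) << shift)


def compare_signed(a, b, so=0):
    """Field value a signed compare of a against b would produce."""
    if a < b:
        flags = LT
    elif a > b:
        flags = GT
    else:
        flags = EQ
    return flags | (SO if so else 0)
```

A conditional branch then tests a single bit of one field, for example the EQ bit of field 0 after a compare.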
17-12
PXS20 Microcontroller Reference Manual, Rev. 1
Freescale Semiconductor