English
Language : 

PIC32MX440F256H-80I Datasheet, PDF (37/646 Pages) Microchip Technology – 64/100-Pin General Purpose and USB 32-Bit Flash Microcontrollers
2.2.1 EXECUTION UNIT
The PIC32MX3XX/4XX Family core execution unit
implements a load/store architecture with single-cycle
ALU operations (logical, shift, add, subtract) and an
autonomous multiply/divide unit. The core contains
thirty-two 32-bit general purpose registers used for
integer operations and address calculation. One
additional register file shadow set (containing thirty-two
registers) is added to minimize context switching
overhead during interrupt/exception processing. The
register file consists of two read ports and one write
port and is fully bypassed to minimize operation latency
in the pipeline.
The execution unit includes:
• 32-bit adder used for calculating the data address
• Address unit for calculating the next instruction
address
• Logic for branch determination and branch target
address calculation
• Load aligner
• Bypass multiplexers used to avoid stalls when
executing instructions streams where data
producing instructions are followed closely by
consumers of their results
• Leading Zero/One detect unit for implementing the
CLZ and CLO instructions
• Arithmetic Logic Unit (ALU) for performing bitwise
logical operations
• Shifter and Store Aligner
PIC32MX3XX/4XX
2.2.2 MULTIPLY/DIVIDE UNIT (MDU)
The PIC32MX3XX/4XX Family core includes a
multiply/divide unit (MDU) that contains a separate
pipeline for multiply and divide operations. This pipeline
operates in parallel with the integer unit (IU) pipeline
and does not stall when the IU pipeline stalls. This
allows MDU operations to be partially masked by
system stalls and/or other integer unit instructions.
The high-performance MDU consists of a 32x16 booth
recoded multiplier, result/accumulation registers (HI
and LO), a divide state machine, and the necessary
multiplexers and control logic. The first number shown
(‘32’ of 32x16) represents the rs operand. The second
number (‘16’ of 32x16) represents the rt operand. The
PIC32MX core only checks the value of the latter (rt)
operand to determine how many times the operation
must pass through the multiplier. The 16x16 and 32x16
operations pass through the multiplier once. A 32x32
operation passes through the multiplier twice.
The MDU supports execution of one 16x16 or 32x16
multiply operation every clock cycle; 32x32 multiply
operations can be issued every other clock cycle.
Appropriate interlocks are implemented to stall the
issuance of back-to-back 32x32 multiply operations.
The multiply operand size is automatically determined
by logic built into the MDU.
Divide operations are implemented with a simple 1 bit
per clock iterative algorithm. An early-in detection
checks the sign extension of the dividend (rs) operand.
If rs is 8 bits wide, 23 iterations are skipped. For a 16-
bit-wide rs, 15 iterations are skipped, and for a 24-bit-
wide rs, 7 iterations are skipped. Any attempt to issue
a subsequent MDU instruction while a divide is still
active causes an IU pipeline stall until the divide
operation is completed.
Table 2-1 lists the repeat rate (peak issue rate of cycles
until the operation can be reissued) and latency
(number of cycles until a result is available) for the
PIC32MX core multiply and divide instructions. The
approximate latency and repeat rates are listed in
terms of pipeline clocks.
© 2008 Microchip Technology Inc.
Preliminary
DS61143E-page 35