DS693 Datasheet, PDF (11/13 Pages) Xilinx, Inc – Integrated into Xilinx Embedded Development Kit
LogiCORE IP Virtex-5 APU Floating-Point Unit (v1.01a)
Arrays and pointers can often limit the compiler’s ability to allocate variables to registers. If a design has small
arrays of floating-point values, better performance may be possible if a small number of individual variables are
declared instead (for example, float a0, a1, a2 instead of float a[3]) and the loops that index into them are unrolled.
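The two styles can be compared with a minimal sketch. The function names and the coefficient values below are hypothetical and chosen only for illustration; whether the scalar form is actually faster depends on the compiler and its optimization settings.

```c
#include <assert.h>
#include <stddef.h>

/* Array version: indexing through a[] inside a loop may force the
   coefficients to live in memory instead of floating-point registers. */
float weighted_sum_array(const float x[3])
{
    assert(x != NULL);
    float a[3] = {0.25f, 0.5f, 0.25f};  /* illustrative coefficients */
    float acc = 0.0f;
    for (int i = 0; i < 3; i++) {
        acc += a[i] * x[i];
    }
    return acc;
}

/* Scalar version: individual variables, loop unrolled by hand, so the
   compiler is free to keep every value in a register. */
float weighted_sum_scalars(float x0, float x1, float x2)
{
    const float a0 = 0.25f, a1 = 0.5f, a2 = 0.25f;
    return a0 * x0 + a1 * x1 + a2 * x2;
}
```

Both functions compute the same result; only the second avoids array indexing entirely.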
Core Parameters
The FPU core has three parameters that influence its implementation, as shown in Table 3.
Table 3: APU FPU Virtex-5 Core Parameters

Parameter           Default  Value and Meaning
                             0                     1                                           2
C_DOUBLE_PRECISION  1        Single-precision FPU  Double-precision FPU (default)              Not applicable
C_USE_RLOCS         0        No area constraints   Use area constraints for PowerPC0 (bottom)  Use area constraints for PowerPC1 (top)
C_LATENCY_CONF      1        High-speed variant    Low-latency variant (default)               Not applicable
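As a rough illustration of the parameter ranges in Table 3, the sketch below models the three parameters as a C structure with a default constructor and a range check. The type and function names are hypothetical and are not part of any Xilinx tool or driver API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the three core parameters of Table 3. */
typedef struct {
    int c_double_precision;  /* 0 = single precision, 1 = double precision */
    int c_use_rlocs;         /* 0 = none, 1 = PowerPC0 (bottom), 2 = PowerPC1 (top) */
    int c_latency_conf;      /* 0 = high-speed variant, 1 = low-latency variant */
} fpu_params;

/* Returns the defaults listed in the Default column of Table 3. */
fpu_params fpu_params_default(void)
{
    fpu_params p;
    p.c_double_precision = 1;
    p.c_use_rlocs = 0;
    p.c_latency_conf = 1;
    return p;
}

/* Checks that every parameter holds a value that Table 3 defines. */
bool fpu_params_valid(const fpu_params *p)
{
    assert(p != NULL);
    return (p->c_double_precision == 0 || p->c_double_precision == 1)
        && (p->c_use_rlocs >= 0 && p->c_use_rlocs <= 2)
        && (p->c_latency_conf == 0 || p->c_latency_conf == 1);
}
```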
The default implementation values are conservative. FPU-enabled systems generated by Base System Builder will
use a 3:1 CPU:FPU clock ratio by default, with the low-latency FPU variant. To obtain the highest possible
performance, C_USE_RLOCS should be set to 1 or 2 so that the FPU is implemented using AREA_GROUP
constraints. In Virtex-5 FXT devices with a single PowerPC block, this parameter should be set to 1 to enable
placement. In Virtex-5 FXT devices with two PowerPC blocks, this parameter should be set to 1 if the FPU is to be
placed next to the block PPC440_X0Y0; it should be set to 2 if the FPU is to be placed next to the block
PPC440_X0Y1.
In the unlikely event that these constraints cause conflicts with other IP blocks in the system, the constraints can be
omitted, but note that these constraints must be present to obtain maximum performance (200 MHz clock frequency
on -1 silicon). Attempting to achieve 200 MHz operation with C_USE_RLOCS set to 0 will likely result in significant
timing failures.
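The placement rule for C_USE_RLOCS described above can be sketched as a small lookup. The function name is hypothetical, and returning 0 for an unrecognized block name is an assumption of this sketch rather than documented tool behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Maps the PowerPC block the FPU should sit next to onto a C_USE_RLOCS
   value: 1 for PPC440_X0Y0, 2 for PPC440_X0Y1. Any other name yields 0
   (no area constraints), which is an assumption made for this sketch. */
int rlocs_for_ppc_block(const char *block_name)
{
    assert(block_name != NULL);
    if (strcmp(block_name, "PPC440_X0Y0") == 0) {
        return 1;
    }
    if (strcmp(block_name, "PPC440_X0Y1") == 0) {
        return 2;
    }
    return 0;
}
```

For a device with a single PowerPC block, the text calls for a value of 1, which this sketch produces when the block is named PPC440_X0Y0.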
The C_LATENCY_CONF parameter controls the latency of the floating-point operators. If the FPU is required to
run at half the speed of the CPU clock, this parameter should be set to 0 to obtain the highest possible
operating frequency. If the FPU is running at some lower ratio of the CPU clock speed (for example, at one third,
as in the default Base System Builder configuration), or if the CPU itself is operating well below its rated
maximum frequency, better overall performance may be obtained by setting this parameter to 1 (the default).
Setting C_USE_RLOCS to 1 or 2 when C_LATENCY_CONF is 1 will have no effect (no constraints will be generated).
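The guidance in this paragraph can be condensed into two small helpers. This is a simplified sketch with hypothetical function names: it considers only the CPU:FPU clock ratio and ignores the case of a CPU running well below its rated frequency, which also favors the low-latency variant.

```c
#include <assert.h>
#include <stdbool.h>

/* Suggests a C_LATENCY_CONF value from the CPU:FPU clock ratio.
   A 2:1 ratio (FPU at half the CPU clock) calls for the high-speed
   variant (0); slower FPU clocks such as 3:1 may do better with the
   low-latency variant (1). */
int suggest_latency_conf(int cpu_to_fpu_ratio)
{
    assert(cpu_to_fpu_ratio >= 2);
    return (cpu_to_fpu_ratio == 2) ? 0 : 1;
}

/* Area constraints are generated only when C_USE_RLOCS is 1 or 2 and
   the high-speed variant (C_LATENCY_CONF = 0) is selected. */
bool area_constraints_generated(int c_use_rlocs, int c_latency_conf)
{
    assert(c_use_rlocs >= 0 && c_use_rlocs <= 2);
    assert(c_latency_conf == 0 || c_latency_conf == 1);
    return c_use_rlocs != 0 && c_latency_conf == 0;
}
```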
Table 4 shows the latencies of the various operations supported by the FPU.
Table 4: FPU Operator Latencies and Frequencies

Instruction    C_LATENCY_CONF = 0 (high speed)  C_LATENCY_CONF = 1 (low latency)
               Single    Double                 Single    Double
Add, Subtract  5         6                      3         4
Multiply       4         6                      3         4
Divide         29        60                     16        60
Square Root    29        59                     16        59
Convert        5         6                      3         4
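To see how the latencies in Table 4 interact with the clock ratio, the sketch below converts an operator latency into CPU clock cycles. It assumes the table values are counted in FPU clock cycles and ignores any overhead for moving operands across the APU interface; both points are assumptions of this sketch, and the names are hypothetical.

```c
#include <assert.h>

enum fpu_op { FPU_ADD_SUB, FPU_MUL, FPU_DIV, FPU_SQRT, FPU_CONVERT, FPU_OP_COUNT };

/* Latencies from Table 4, indexed as
   [operation][C_LATENCY_CONF][0 = single, 1 = double]. */
static const int fpu_latency[FPU_OP_COUNT][2][2] = {
    [FPU_ADD_SUB] = {{5, 6},   {3, 4}},
    [FPU_MUL]     = {{4, 6},   {3, 4}},
    [FPU_DIV]     = {{29, 60}, {16, 60}},
    [FPU_SQRT]    = {{29, 59}, {16, 59}},
    [FPU_CONVERT] = {{5, 6},   {3, 4}},
};

/* Scales a Table 4 latency by the CPU:FPU clock ratio to estimate the
   latency in CPU clock cycles (assumes the table counts FPU cycles). */
int fpu_latency_in_cpu_cycles(enum fpu_op op, int is_double,
                              int c_latency_conf, int cpu_to_fpu_ratio)
{
    assert((int)op >= 0 && (int)op < FPU_OP_COUNT);
    assert(is_double == 0 || is_double == 1);
    assert(c_latency_conf == 0 || c_latency_conf == 1);
    assert(cpu_to_fpu_ratio >= 1);
    return fpu_latency[op][c_latency_conf][is_double] * cpu_to_fpu_ratio;
}
```

Under these assumptions, a single-precision add takes 5 x 2 = 10 CPU cycles with the high-speed variant at a 2:1 ratio and 3 x 3 = 9 CPU cycles with the low-latency variant at 3:1. A double-precision divide takes 60 FPU cycles in either variant, so it costs 120 CPU cycles at 2:1 and 180 at 3:1.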
DS693 March 1, 2011
www.xilinx.com
Product Specification