TMS320DM643AZDK5 Datasheet, Texas Instruments: Video/Imaging Fixed-Point Digital Signal Processor
TMS320DM643
www.ti.com
SPRS269D – FEBRUARY 2005 – REVISED OCTOBER 2010
to the same execute packet as the previous instruction, or whether it should be executed in the following
clock as a part of the next execute packet. Fetch packets are always 256 bits wide; however, the execute
packets can vary in size. The variable-length execute packets are a key memory-saving feature,
distinguishing the C64x CPUs from other VLIW architectures. The C64x™ VelociTI.2™ extensions add
enhancements to the TMS320C62x™ DSP VelociTI™ architecture. These enhancements include:
• Register file enhancements
• Data path extensions
• Quad 8-bit and dual 16-bit extensions with data flow enhancements
• Additional functional unit hardware
• Increased orthogonality of the instruction set
• Additional instructions that reduce code size and increase register flexibility
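The relationship between fixed-size fetch packets and variable-size execute packets described above can be sketched in a few lines of Python. This is an illustrative model only, not TI code. It assumes the general C6000 convention, which is not stated on this page, that each 32-bit instruction carries a parallel bit (p-bit) and that a p-bit of 1 means the next instruction belongs to the same execute packet. The function name `split_execute_packets` is hypothetical.

```python
def split_execute_packets(p_bits):
    """Group the 8 instruction slots of one 256-bit fetch packet into execute packets.

    p_bits[i] == 1 means slot i+1 executes in parallel with slot i
    (assumed C6000 p-bit convention). Returns a list of lists of slot indices.
    """
    if len(p_bits) != 8:
        raise ValueError("a 256-bit fetch packet holds eight 32-bit instructions")
    packets, current = [], []
    for slot, p in enumerate(p_bits):
        current.append(slot)
        if p == 0:  # chain ends here: the next slot starts a new execute packet
            packets.append(current)
            current = []
    if current:
        # Modeling assumption: an execute packet does not span fetch packets,
        # so any open chain is closed at the fetch-packet boundary.
        packets.append(current)
    return packets


if __name__ == "__main__":
    # Three parallel instructions, one alone, two parallel, then two alone.
    print(split_execute_packets([1, 1, 0, 0, 1, 0, 0, 0]))
```

With all p-bits cleared, the same fetch packet yields eight single-instruction execute packets; with the first seven set, it yields one execute packet of eight instructions.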
The CPU features two sets of functional units. Each set contains four units and a register file. One set
contains functional units .L1, .S1, .M1, and .D1; the other set contains units .D2, .M2, .S2, and .L2. The
two register files each contain thirty-two 32-bit registers, for a total of 64 general-purpose registers. In addition to
supporting the packed 16-bit and 32-/40-bit fixed-point data types found in the C62x™ VelociTI™ VLIW
architecture, the C64x™ register files also support packed 8-bit data and 64-bit fixed-point data types. The
two sets of functional units, along with two register files, compose sides A and B of the CPU [see the
functional block and CPU (DSP core) diagram, and Figure 2-1]. The four functional units on each side of
the CPU can freely share the 32 registers belonging to that side. Additionally, each side features a "data
cross path"—a single data bus connected to all the registers on the other side, by which the two sets of
functional units can access data from the register files on the opposite side. The C64x CPU pipelines
data-cross-path accesses over multiple clock cycles. This allows the same register to be used as a
data-cross-path operand by multiple functional units in the same execute packet. All functional units in the
C64x CPU can access operands via the data cross path. A register file can service all the functional units on
its own side of the CPU in a single clock cycle. On the C64x CPU, a
delay clock is introduced whenever an instruction attempts to read a register via a data cross path if that
register was updated in the previous clock cycle.
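The cross-path delay rule above can be illustrated with a small cycle-counting model in Python. It is a sketch under a strong simplifying assumption: every result is treated as written in the cycle in which its execute packet issues, so multi-cycle instruction latencies are ignored. The function name `count_cycles` and the packet representation are hypothetical.

```python
def count_cycles(packets):
    """Return the clock cycles needed to issue a sequence of execute packets.

    Each packet is a dict with two sets of register names:
      "writes":      registers updated by the packet
      "cross_reads": registers read through a data cross path
    Assumption: a result is written in the cycle its packet issues.
    """
    cycles = 0
    previous_writes = set()
    for packet in packets:
        # A cross-path read of a register updated in the previous cycle
        # costs one delay clock before the packet can issue.
        if packet["cross_reads"] & previous_writes:
            cycles += 1
        cycles += 1
        previous_writes = set(packet["writes"])
    return cycles


if __name__ == "__main__":
    program = [
        {"writes": {"A1"}, "cross_reads": set()},
        {"writes": {"B2"}, "cross_reads": {"A1"}},  # A1 updated one cycle earlier
        {"writes": {"B3"}, "cross_reads": {"A1"}},  # A1 is no longer "fresh"
    ]
    print(count_cycles(program))
```

In this model, a same-side read of a freshly written register never adds a delay clock; only cross-path reads do.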
In addition to the C62x™ DSP fixed-point instructions, the C64x™ DSP includes a comprehensive
collection of quad 8-bit and dual 16-bit instruction set extensions. These VelociTI.2™ extensions allow the
C64x CPU to operate directly on packed data to streamline data flow and increase instruction set
efficiency. This is a key factor for video and imaging applications.
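To make "operating directly on packed data" concrete, the following Python sketch models a quad 8-bit add and a dual 16-bit add on 32-bit register values, in the spirit of the C64x ADD4 and ADD2 instructions. It is a behavioral illustration only: each lane wraps independently and no carry crosses a lane boundary. The helper names `add4` and `add2` are chosen for this example.

```python
def _packed_add(a, b, lane_bits):
    """Add two 32-bit values lane by lane; each lane wraps on its own."""
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, 32, lane_bits):
        lane_sum = ((a >> shift) & lane_mask) + ((b >> shift) & lane_mask)
        result |= (lane_sum & lane_mask) << shift  # drop the carry out of the lane
    return result


def add4(a, b):
    """Quad 8-bit add without saturation (models the behavior of ADD4)."""
    return _packed_add(a, b, 8)


def add2(a, b):
    """Dual 16-bit add without saturation (models the behavior of ADD2)."""
    return _packed_add(a, b, 16)


if __name__ == "__main__":
    print(hex(add4(0x01FF7F80, 0x01010180)))
    print(hex(add2(0xFFFF0001, 0x00010001)))
```

A single packed instruction therefore replaces four (or two) unpack, add, and repack sequences, which is where the benefit for pixel-oriented video and imaging code comes from.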
Another key feature of the C64x CPU is the load/store architecture, where all instructions operate on
registers (as opposed to data in memory). Two sets of data-addressing units (.D1 and .D2) are
responsible for all data transfers between the register files and memory. An address generated by a .D unit
from one register file can be used to load data into, or store data from, the other register file. The C64x .D
units can load and store bytes (8 bits), half-words (16 bits), and words (32 bits) with a single instruction.
With the new data path extensions, they can also load and store doublewords (64 bits) with a single
instruction. Furthermore, the non-aligned load and store
instructions allow the .D units to access words and doublewords on any byte boundary. The C64x CPU
supports a variety of indirect addressing modes using either linear- or circular-addressing with 5- or 15-bit
offsets. All instructions are conditional, and most can access any one of the 64 registers. Some registers,
however, are singled out to support specific addressing modes or to hold the condition for conditional
instructions (if the condition is not automatically "true").
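Two of the addressing features above, non-aligned access and circular addressing, can be sketched in Python as follows. This is a simplified model under stated assumptions, not a description of the hardware: memory is a flat little-endian byte array, and a circular block is assumed to span 2^(N+1) bytes so that only the low N+1 address bits are modified. The function names and the parameter `n` are hypothetical.

```python
def load_nonaligned(memory, address, size):
    """Read `size` bytes (4 for a word, 8 for a doubleword) at any byte address.

    Assumption: little-endian byte order.
    """
    if address < 0 or address + size > len(memory):
        raise IndexError("access outside modeled memory")
    return int.from_bytes(memory[address:address + size], "little")


def circular_update(address, offset, n):
    """Advance `address` by `offset` inside a circular block of 2**(n + 1) bytes.

    Only the low n + 1 bits change; the upper bits keep pointing at the block.
    """
    block_mask = (1 << (n + 1)) - 1
    return (address & ~block_mask) | ((address + offset) & block_mask)


if __name__ == "__main__":
    memory = bytes(range(16))
    print(hex(load_nonaligned(memory, 3, 4)))     # word starting at byte 3
    print(hex(circular_update(0x8000000C, 8, 3)))  # wraps inside a 16-byte block
```

Linear addressing corresponds to simply adding the offset to the address; the circular form differs only in masking the result back into the block.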
The two .M functional units perform all multiplication operations. Each of the C64x .M units can perform
two 16 × 16-bit multiplies or four 8 × 8-bit multiplies per clock cycle. The .M unit can also perform 16 ×
32-bit multiply operations, dual 16 × 16-bit multiplies with add/subtract operations, and quad 8 × 8-bit
multiplies with add operations. In addition to standard multiplies, the C64x .M units include hardware for
bit-count, rotate, Galois field multiply, and bidirectional variable shift operations.
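The "dual 16 × 16-bit multiplies with add" and "quad 8 × 8-bit multiplies with add" operations amount to small dot products on packed register values. The Python sketch below models them in the spirit of the C64x DOTP2 (signed 16-bit) and DOTPU4 (unsigned 8-bit) instructions. It is an arithmetic illustration only: results are returned as unbounded Python integers, so the width of the hardware destination register is not modeled. The function names are chosen for this example.

```python
def _signed16(value):
    """Interpret a 16-bit field as a two's-complement signed integer."""
    return value - 0x10000 if value & 0x8000 else value


def dotp2(a, b):
    """Sum of two signed 16 x 16-bit products of the packed halves of a and b."""
    high = _signed16((a >> 16) & 0xFFFF) * _signed16((b >> 16) & 0xFFFF)
    low = _signed16(a & 0xFFFF) * _signed16(b & 0xFFFF)
    return high + low


def dotpu4(a, b):
    """Sum of four unsigned 8 x 8-bit products of the packed bytes of a and b."""
    total = 0
    for shift in range(0, 32, 8):
        total += ((a >> shift) & 0xFF) * ((b >> shift) & 0xFF)
    return total


if __name__ == "__main__":
    # a packs (3, -2), b packs (4, 5): 3*4 + (-2)*5
    print(dotp2(0x0003FFFE, 0x00040005))
    print(dotpu4(0x01020304, 0x01010101))
```

Operations of this shape form the inner loop of filters and transforms, which is why performing two or four multiply-accumulate steps per .M unit per clock cycle matters for throughput.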
The two .S and .L functional units perform a general set of arithmetic, logical, and branch functions with
results available every clock cycle. The arithmetic and logical functions on the C64x CPU include single
32-bit, dual 16-bit, and quad 8-bit operations.
Copyright © 2005–2010, Texas Instruments Incorporated