ADSP-TS201S_06 Datasheet, PDF (4/48 Pages) Analog Devices – TigerSHARC-R Embedded Processor

ADSP-TS201S

The TigerSHARC DSP uses a Static Superscalar™† architecture. This
architecture is superscalar in that the ADSP-TS201S processor's core
can execute simultaneously from one to four 32-bit instructions encoded
in a very large instruction word (VLIW) instruction line using the
DSP's dual compute blocks. Because the DSP does not perform instruction
re-ordering at runtime (the programmer selects which operations will
execute in parallel prior to runtime), the order of instructions is
static.

With few exceptions, an instruction line, whether it contains one, two,
three, or four 32-bit instructions, executes with a throughput of one
cycle in a 10-deep processor pipeline.
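
As a rough illustration of the throughput figure above, the following Python sketch counts cycles for a run of instruction lines. It assumes an idealized pipeline with no stalls, so the first line needs the full pipeline depth to complete and every later line completes one cycle after its predecessor. The function names are hypothetical and the no-stall condition is an assumption, not a datasheet guarantee.

```python
PIPELINE_DEPTH = 10  # pipeline depth stated in the text


def ideal_cycles(num_lines, depth=PIPELINE_DEPTH):
    """Cycles to complete num_lines instruction lines, assuming no stalls."""
    if num_lines <= 0:
        return 0
    # The first line fills the pipeline; each further line adds one cycle.
    return depth + (num_lines - 1)


def instructions_per_cycle(num_lines, instrs_per_line, depth=PIPELINE_DEPTH):
    """Average instructions completed per cycle over the whole run."""
    return num_lines * instrs_per_line / ideal_cycles(num_lines, depth)
```

For long runs the fill cost becomes negligible, so the average approaches the number of instructions packed into each line.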

For optimal DSP program execution, programmers must follow the DSP's
set of instruction parallelism rules when encoding an instruction line.
In general, the selection of instructions that the DSP can execute in
parallel each cycle depends on the instruction line resources each
instruction requires and on the source and destination registers used
in the instructions. The programmer has direct control of three core
components: the IALUs, the compute blocks, and the program sequencer.

The ADSP-TS201S processor, in most cases, has a two-cycle execution
pipeline that is fully interlocked, so whenever a computation result is
unavailable for another operation dependent on it, the DSP
automatically inserts one or more stall cycles as needed. Efficient
programming with dependency-free instructions can eliminate most
computational and memory transfer data dependencies.
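
The interlock behavior described above can be sketched with a toy scheduling model in Python. It assumes every result becomes readable a fixed number of cycles after its instruction line issues (two by default, loosely mirroring the two-cycle execution pipeline), and it ignores every other hazard. The `schedule` function, its register names, and the fixed latency are illustrative assumptions, not a description of the actual hardware.

```python
def schedule(lines, latency=2):
    """Toy interlock model.

    lines: list of (dests, srcs) pairs, each a collection of register names.
    Returns (issue_cycles, total_stalls).
    Assumption: a result is readable `latency` cycles after its line issues.
    """
    ready_at = {}  # register name -> first cycle its value can be read
    cycle = 0
    issue_cycles = []
    stalls = 0
    for dests, srcs in lines:
        # A line cannot issue before all of its source operands are ready.
        earliest = max([ready_at.get(r, 0) for r in srcs], default=0)
        if earliest > cycle:
            stalls += earliest - cycle  # stall cycles inserted automatically
            cycle = earliest
        issue_cycles.append(cycle)
        for r in dests:
            ready_at[r] = cycle + latency
        cycle += 1
    return issue_cycles, stalls
```

In this model, a line that reads a result produced by the line immediately before it costs one stall cycle, while placing an independent line between the two removes the stall.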

In addition, the ADSP-TS201S processor supports SIMD operations two
ways: SIMD compute blocks and SIMD computations. The programmer can
load both compute blocks with the same data (broadcast distribution) or
different data (merged distribution).
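
The two distribution modes can be pictured with a short Python sketch. The text only states that merged distribution gives the blocks different data; the even split used here (lower half to one block, upper half to the other) is an assumption made for illustration, as are the function name and the mode strings.

```python
def distribute(words, mode):
    """Return (x_block_data, y_block_data) for one memory read.

    Assumption: "merged" splits the words in half between the two blocks.
    """
    if mode == "broadcast":
        # Both compute blocks receive the same data.
        return list(words), list(words)
    if mode == "merged":
        # Each compute block receives a different part of the data.
        half = len(words) // 2
        return list(words[:half]), list(words[half:])
    raise ValueError("mode must be 'broadcast' or 'merged'")
```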

DUAL COMPUTE BLOCKS

The ADSP-TS201S processor has compute blocks that can execute
computations either independently or together as a single-instruction,
multiple-data (SIMD) engine. The DSP can issue up to two compute
instructions per compute block each cycle, instructing the ALU,
multiplier, shifter, or CLU to perform independent, simultaneous
operations. Each compute block can execute eight 8-bit, four 16-bit,
two 32-bit, or one 64-bit SIMD computations in parallel with the
operation in the other block. These computation units support IEEE
32-bit single-precision floating-point, extended-precision 40-bit
floating-point, and 8-, 16-, 32-, and 64-bit fixed-point processing.
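
To make the lane counts above concrete, the following Python sketch performs a packed addition on a 64-bit operand split into 8-, 16-, 32-, or 64-bit lanes. It is a minimal model that assumes wraparound arithmetic within each lane; saturation and status flags are not modeled, and the function names are hypothetical.

```python
def lane_count(lane_bits):
    """Number of lanes in a 64-bit operand: 8, 4, 2, or 1."""
    if lane_bits not in (8, 16, 32, 64):
        raise ValueError("lane width must be 8, 16, 32, or 64 bits")
    return 64 // lane_bits


def packed_add(a, b, lane_bits):
    """Add two 64-bit operands lane by lane (wraparound assumed)."""
    lanes = lane_count(lane_bits)
    mask = (1 << lane_bits) - 1
    result = 0
    for i in range(lanes):
        shift = i * lane_bits
        lane_sum = ((a >> shift) & mask) + ((b >> shift) & mask)
        # Masking discards the carry so it cannot spill into the next lane.
        result |= (lane_sum & mask) << shift
    return result
```

The same operands give different results for different lane widths: adding 0x0001 to 0x00FF yields 0x0000 with 8-bit lanes, because the carry is dropped at the lane boundary, and 0x0100 with 16-bit lanes.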

The compute blocks are referred to as X and Y in assembly syntax, and
each block contains four computational units (an ALU, a multiplier, a
64-bit shifter, and a 128-bit CLU) and a 32-word register file.

• Register File: each compute block has a multiported 32-word, fully
  orthogonal register file used for transferring data between the
  computation units and data buses and for storing intermediate
  results. Instructions can access the registers in the register file
  individually (word-aligned), in sets of two (dual-aligned), or in
  sets of four (quad-aligned).

• ALU: the ALU performs a standard set of arithmetic operations in both
  fixed- and floating-point formats. It also performs logic operations.

• Multiplier: the multiplier performs both fixed- and floating-point
  multiplication and fixed-point multiply and accumulate.

• Shifter: the 64-bit shifter performs logical and arithmetic shifts,
  bit and bit stream manipulation, and field deposit and extraction
  operations.

• Communications Logic Unit (CLU): this 128-bit unit provides trellis
  decoding (for example, Viterbi and Turbo decoders) and executes
  complex correlations for CDMA communication applications (for
  example, chip-rate and symbol-rate functions).

† Static Superscalar is a trademark of Analog Devices, Inc.

Using these features, the compute blocks can:

• Provide 8 MACS per cycle peak and 7.1 MACS per cycle sustained 16-bit
  performance and provide 2 MACS per cycle peak and 1.8 MACS per cycle
  sustained 32-bit performance (based on FIR)

• Execute six single-precision floating-point or execute 24 fixed-point
  (16-bit) operations per cycle, providing 3.6G FLOPS or 14.4G/s
  regular operations performance at 600 MHz

• Perform two complex 16-bit MACS per cycle

• Execute eight trellis butterflies in one cycle
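
The throughput figures in the list above follow from multiplying operations per cycle by the 600 MHz clock. The short Python sketch below reproduces that arithmetic and also expresses the sustained MAC rates as a fraction of peak; the helper names are hypothetical.

```python
CLOCK_HZ = 600_000_000  # 600 MHz, as quoted in the list above


def per_second(ops_per_cycle, clock_hz=CLOCK_HZ):
    """Operations per second for a given per-cycle rate."""
    return ops_per_cycle * clock_hz


def sustained_fraction(sustained, peak):
    """Sustained rate as a fraction of the peak rate."""
    return sustained / peak


flops = per_second(6)         # six floating-point operations per cycle
fixed_ops = per_second(24)    # 24 fixed-point (16-bit) operations per cycle
```

Here `flops` works out to 3.6 billion operations per second and `fixed_ops` to 14.4 billion, matching the quoted figures. The sustained FIR rates correspond to 7.1/8 (about 89 percent) of peak for 16-bit data and 1.8/2 (90 percent) for 32-bit data.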

DATA ALIGNMENT BUFFER (DAB)

The DAB is a quad-word FIFO that enables loading of quad-word data from
nonaligned addresses. Normally, load instructions must be aligned to
their data size so that quad words are loaded from a quad-aligned
address. Using the DAB significantly improves the efficiency of some
applications, such as FIR filters.
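
One way to picture what such a buffer does is the Python sketch below. It models memory as a list of 32-bit words, allows fetches only from quad-aligned addresses, and assembles a nonaligned quad word from the buffered quad word plus the next aligned one. The class, its fields, and the buffering policy are assumptions made for illustration; the datasheet does not describe the DAB's internal operation at this level.

```python
class AlignmentBuffer:
    """Toy model of a quad-word alignment buffer (hypothetical)."""

    def __init__(self, memory):
        self.memory = memory  # list of words; must cover every quad read
        self.base = None      # aligned address of the buffered quad word
        self.quad = None      # the buffered quad word
        self.fetches = 0      # count of aligned memory fetches

    def _fetch(self, aligned_addr):
        if aligned_addr % 4 != 0:
            raise ValueError("memory fetches must be quad-aligned")
        self.fetches += 1
        return self.memory[aligned_addr:aligned_addr + 4]

    def load_quad(self, addr):
        """Return four consecutive words starting at any word address."""
        base = addr - addr % 4
        offset = addr - base
        if self.base != base:
            self.quad = self._fetch(base)
            self.base = base
        if offset == 0:
            return list(self.quad)
        # Combine the tail of the buffered quad with the head of the next.
        nxt = self._fetch(base + 4)
        out = self.quad[offset:] + nxt[:offset]
        self.quad, self.base = nxt, base + 4
        return out
```

Under this model, a stream of nonaligned loads at consecutive quad-word strides costs one aligned fetch per load after the first, which is the kind of access pattern a FIR filter generates when it slides over its input.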

DUAL INTEGER ALU (IALU)

The ADSP-TS201S processor has two IALUs that provide powerful address
generation capabilities and perform many general-purpose integer
operations. The IALUs are referred to as J and K in assembly syntax and
have the following features:

• Provide memory addresses for data and update pointers

• Support circular buffering and bit-reverse addressing

• Perform general-purpose integer operations, increasing programming
  flexibility

• Include a 31-word register file for each IALU

As address generators, the IALUs perform immediate or indirect (pre-
and post-modify) addressing. They perform modulus and bit-reverse
operations with no constraints placed on memory addresses for the
modulus data buffer placement. Each IALU can specify either a single-,
dual-, or quad-word access from memory.
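
The modulus (circular buffer) and bit-reverse address calculations mentioned above can be sketched in Python as follows. The function names and argument order are hypothetical, and the sketch shows only the arithmetic, not the IALU register set or instruction syntax.

```python
def circular_post_modify(index, modifier, base, length):
    """Post-modify circular addressing.

    Returns (address_used, updated_index). The address used is the current
    index; the index is then advanced and wrapped inside the buffer
    [base, base + length). The base address needs no special alignment.
    """
    address = index
    updated = base + (index - base + modifier) % length
    return address, updated


def bit_reverse(value, bits):
    """Reverse the low `bits` bits of value (bit-reverse addressing)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result
```

For example, with a five-word buffer starting at address 10, stepping by 2 from index 13 uses address 13 and wraps the index back to 10. Reversing three-bit indices 0 through 7 gives the order 0, 4, 2, 6, 1, 5, 3, 7, which is the reordering an 8-point radix-2 FFT needs.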

Rev. C | Page 4 of 48 | December 2006
