SH7760 Datasheet, PDF (134/1345 Pages) Renesas Technology Corp – SuperH™ RISC engine
The instruction execution sequence is expressed as a combination of the execution patterns shown
in figure 5.2. One instruction is separated from the next by the number of machine cycles for its
issue rate. Normally, execution, data access, and write-back stages cannot be overlapped onto the
same stages of another instruction; the only exception is when two instructions are executed in
parallel under parallel-executability conditions. See (a) to (d) in figure 5.3 for some simple
examples.
Latency is the interval between issue and completion of an instruction, and is also the interval
between the execution of two instructions with an interdependent relationship. When there is
interdependency between two instructions fetched simultaneously, the latter of the two is stalled
for the following number of cycles:
• (Latency) cycles when there is flow dependency (read-after-write)
• (Latency − 1) or (latency − 2) cycles when there is output dependency (write-after-write):
  - Single/double-precision FDIV or FSQRT is the preceding instruction: (latency − 1) cycles
  - Another FE group instruction is the preceding instruction: (latency − 2) cycles
• Five or two cycles when there is anti-flow dependency (write-after-read), as in the following
cases:
  - FTRV is the preceding instruction: 5 cycles
  - Double-precision FADD, FSUB, or FMUL is the preceding instruction: 2 cycles
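The stall rules listed above can be sketched as a small lookup function. This is an illustrative model only, not part of the datasheet: the names `Dependency` and `stall_cycles` and the `double_precision` flag are hypothetical, the output-dependency branch assumes the preceding instruction belongs to the FE group, and the latency values passed in the usage lines are placeholders rather than figures taken from the instruction tables.

```python
from enum import Enum


class Dependency(Enum):
    FLOW = "read-after-write"
    OUTPUT = "write-after-write"
    ANTI_FLOW = "write-after-read"


# Preceding instructions singled out by the rules above.
DIV_SQRT = {"FDIV", "FSQRT"}
DOUBLE_ARITH = {"FADD", "FSUB", "FMUL"}


def stall_cycles(dep, preceding, latency, double_precision=False):
    """Stall cycles for the latter of two interdependent instructions
    fetched simultaneously (hypothetical helper, see assumptions above)."""
    if dep is Dependency.FLOW:
        return latency
    if dep is Dependency.OUTPUT:
        # Assumes the preceding instruction is in the FE group.
        if preceding in DIV_SQRT:
            return latency - 1
        return latency - 2
    if dep is Dependency.ANTI_FLOW:
        if preceding == "FTRV":
            return 5
        if double_precision and preceding in DOUBLE_ARITH:
            return 2
        # The text lists no stall figure for other anti-flow cases.
        raise ValueError("no anti-flow stall listed for this instruction")
    raise ValueError("unknown dependency type")


# Placeholder latencies, chosen only to exercise each branch.
print(stall_cycles(Dependency.FLOW, "FADD", 3))
print(stall_cycles(Dependency.OUTPUT, "FDIV", 12))
print(stall_cycles(Dependency.ANTI_FLOW, "FMUL", 6, double_precision=True))
```

Raising an error for unlisted anti-flow cases, rather than returning zero, keeps the sketch from asserting anything the text does not state.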
In the case of flow dependency, the latency may in exceptional cases be increased or decreased,
depending on the combination of sequential instructions (figure 5.3 (e)):
• When a floating-point computation is followed by a floating-point register store, latency of the
floating-point computation may be decreased by one cycle.
• If there is a load of the shift amount immediately before an SHAD or SHLD instruction,
latency of the load is increased by one cycle.
• If an instruction with latency of less than two cycles, including write-back to a floating-point
register, is followed by a double-precision floating-point instruction, FIPR, or FTRV, latency
of the first instruction is increased to two cycles.
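The three exceptions above can be expressed as adjustments to the nominal latency of the first instruction. The following is a minimal sketch under stated assumptions: `Pairing` and `adjusted_latency` are hypothetical names, the three cases are treated as mutually exclusive, and the first case applies the full one-cycle reduction even though the text only says the latency "may be" decreased.

```python
from enum import Enum, auto


class Pairing(Enum):
    NONE = auto()
    FP_COMPUTE_THEN_FP_STORE = auto()
    SHIFT_AMOUNT_LOAD_THEN_SHAD_SHLD = auto()
    FP_WRITEBACK_THEN_DOUBLE_FIPR_FTRV = auto()


def adjusted_latency(latency, pairing):
    """Latency of the first instruction after the flow-dependency exceptions
    (hypothetical helper, see assumptions above)."""
    if pairing is Pairing.FP_COMPUTE_THEN_FP_STORE:
        # Best case: the text says the latency "may be" decreased by one.
        return latency - 1
    if pairing is Pairing.SHIFT_AMOUNT_LOAD_THEN_SHAD_SHLD:
        return latency + 1
    if pairing is Pairing.FP_WRITEBACK_THEN_DOUBLE_FIPR_FTRV:
        # Only latencies below two cycles are raised, and only up to two.
        return max(latency, 2)
    return latency


# Placeholder latencies, not values from the instruction tables.
print(adjusted_latency(3, Pairing.FP_COMPUTE_THEN_FP_STORE))
print(adjusted_latency(1, Pairing.FP_WRITEBACK_THEN_DOUBLE_FIPR_FTRV))
```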
The number of cycles in a pipeline stall due to flow dependency will vary depending on the
combination of interdependent instructions or the fetch timing (see figure 5.3 (e)).
Output dependency occurs when the destination operands are the same in a preceding FE group
instruction and a following LS group instruction.
For the stall cycles of an instruction with output dependency, the longest latency to the last write-
back among all the destination operands must be applied instead of "latency" (see figure 5.3 (f)). A
stall due to output dependency with respect to FPSCR, which reflects the result of a floating-point
operation, never occurs. For example, when FADD follows FDIV with no dependency between
Rev. 1.0, 02/03, page 84 of 1294