GMS30C2116 Datasheet, PDF (24/322 Pages) Hynix Semiconductor – USERS MANUAL
CHAPTER 1
(3) Delayed Load Instructions
Load instructions read operands from memory into processor registers for subsequent
operation by other instructions. Because memory typically operates at much slower speeds
than processor clock rates, the loaded operand is not immediately available to subsequent
instructions in an instruction pipeline. The data dependency is illustrated in Figure 1.3.
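[Figure 1.3 diagram: instructions 1 to 4, with instruction 1 the load, each passing through pipeline stages F, A, M and W. An annotation marks the point at which the data from the load becomes available.]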
Figure 1.3: Data Dependency Resulting From a Load Instruction
In this illustration, the operand loaded by instruction 1 is not available for use in the A cycle
(ALU, or Arithmetic/Logic Unit, operation) of instruction 2. One way to handle this
dependency is to delay the pipeline by inserting additional clock cycles into the execution
of instruction 2 until the loaded data becomes available. This approach
introduces delays that increase the cycles-per-instruction factor.
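The effect of such stalls on the cycles-per-instruction factor can be estimated with a short calculation. The sketch below is illustrative only: the function name effective_cpi and the workload figures (25% loads, half of them followed by a dependent instruction, one stall cycle each) are assumptions, not values taken from this manual.

```python
def effective_cpi(base_cpi, load_fraction, dependent_fraction, stall_cycles):
    """Average cycles per instruction when dependent loads stall the pipeline."""
    # Only loads whose result is needed by the very next instruction stall.
    stalls_per_instruction = load_fraction * dependent_fraction * stall_cycles
    return base_cpi + stalls_per_instruction


# Hypothetical workload: 25% loads, half of them followed by a dependent
# instruction, one stall cycle per dependent load.
print(effective_cpi(1.0, 0.25, 0.5, 1))  # prints 1.125
```

Under these assumed figures, an ideal factor of 1.0 rises to 1.125, that is, one extra cycle for every eight instructions.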
In many RISC designs the technique used to handle this data dependency is to recognize
and make visible to compilers the fact that all load instructions have an inherent latency or
load delay. Figure 1.3 illustrates a load delay or latency of one instruction. The instruction
that immediately follows the load is in the load delay slot. If the instruction in this slot does
not require the data from the load, then no pipeline delay is required.
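The one-instruction load delay can be checked with a small timing model of the four stages named in Figure 1.3. This is a minimal sketch under two assumptions that the text does not state explicitly: one instruction enters the pipeline per clock cycle, and the loaded data is usable from the cycle after the load's M stage. The helper names are hypothetical.

```python
STAGES = ["F", "A", "M", "W"]  # stage names as used in Figure 1.3


def stage_cycle(instr_number, stage):
    """Clock cycle (1-based) in which an instruction occupies a stage,
    assuming one instruction issues per cycle and nothing stalls."""
    return instr_number + STAGES.index(stage)


def load_delay_slots(load_number=1):
    """Count the instructions after a load that cannot yet use its data.

    Assumption: the data is usable from the cycle after the load's M stage.
    """
    data_ready = stage_cycle(load_number, "M") + 1
    slots = 0
    consumer = load_number + 1
    # A following instruction is in a delay slot if its A stage comes too early.
    while stage_cycle(consumer, "A") < data_ready:
        slots += 1
        consumer += 1
    return slots


print(load_delay_slots())  # prints 1
```

In this model instruction 2 reaches its A stage in cycle 3, while the load is still in its M stage, so only instruction 2 falls in the load delay slot.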
If this load delay is made visible to software, a compiler can arrange instructions to ensure
that there is no data dependency between a load instruction and the instruction in the load delay slot.
The simplest way of ensuring that there is no data dependency is to insert a No Operation
(NOP) instruction to fill the slot, as follows:
    Load    R1, A
    Load    R2, B
    NOP                     <= This instruction fills the delay slot
    ADD     R3, R1, R2
Although filling the delay slot with NOP instructions eliminates the need for hardware-
controlled pipeline stalls in this case, it still is not a very efficient use of the pipeline stream
since these additional NOP instructions increase code size and perform no useful work. (In
practice, however, this technique need not have much negative impact on performance.)
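The NOP-filling approach described above can be sketched as a simple pass over an instruction list. This is an illustrative sketch, not the method of any particular compiler: the tuple representation (opcode, destination, sources) and the function name fill_delay_slots_with_nops are hypothetical, and a load delay of one instruction is assumed.

```python
def fill_delay_slots_with_nops(program):
    """Insert a NOP after each load whose next instruction reads the loaded register.

    Each instruction is a tuple (opcode, destination, sources). This
    representation is hypothetical and used only for illustration.
    """
    result = []
    for index, instr in enumerate(program):
        result.append(instr)
        opcode, dest, _sources = instr
        # Only a load that is followed by another instruction can create the hazard.
        if opcode != "Load" or index + 1 >= len(program):
            continue
        next_sources = program[index + 1][2]
        if dest in next_sources:
            # The delay slot instruction needs the loaded data: pad with a NOP.
            result.append(("NOP", None, ()))
    return result


program = [
    ("Load", "R1", ("A",)),
    ("Load", "R2", ("B",)),
    ("ADD", "R3", ("R1", "R2")),
]
print([instr[0] for instr in fill_delay_slots_with_nops(program)])
# prints ['Load', 'Load', 'NOP', 'ADD']
```

The first load needs no NOP because the instruction in its delay slot, the second load, does not use R1. Only the second load, whose delay slot would otherwise hold the ADD that reads R2, is padded.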
A more effective solution to handling the data dependency is to fill the load delay slot with
a useful instruction. Good optimizing compilers can usually accomplish this, especially if