






























- Data dependences between instructions create an interlock relationship among them. This can be resolved through a compiler-based static scheduling approach.
- Using a compiler or a postprocessor, we can increase the separation between interlocked instructions.

| Instruction | Operands | Effect |
|-------------|----------|--------|
| Add         | R0, R1   | $R0 \leftarrow (R0) + (R1)$ |
| Move        | R1, R5   | $R1 \leftarrow (R5)$ |
| Load        | R2, M(α) | $R2 \leftarrow (\text{Memory}(\alpha))$ |
| Load        | R3, M(β) | $R3 \leftarrow (\text{Memory}(\beta))$ |
| Multiply    | R2, R3   | $R2 \leftarrow (R2) \times (R3)$ |

- Consider the above code. Here, the multiply instruction cannot be initiated until the preceding load is complete. This data dependence stalls the pipeline for 3 clock cycles.
- The two load instructions are independent of the add and move instructions, so we can move them ahead to increase the spacing between them and the multiply instruction.
- After modification we get:

| Instruction | Operands |
|-------------|----------|
| Load        | R2, M(α) |
| Load        | R3, M(β) |
| Add         | R0, R1   |
| Move        | R1, R5   |
| Multiply    | R2, R3   |

Prepared by Mr. EBIN PM, AP, IESCE (EDULINE)
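The effect of this rearrangement can be checked with a small model. The sketch below is illustrative only: it assumes an in-order, single-issue pipeline, and the `LATENCY` values and the `count_stalls` helper are hypothetical, chosen so that the original order reproduces the 3-cycle stall mentioned above.

```python
# Hypothetical latency model: cycles from issue until the result is usable.
LATENCY = {"Load": 4, "Add": 1, "Move": 1, "Multiply": 1}


def count_stalls(program):
    """Count stall cycles for in-order, single-issue execution.

    Each instruction is a tuple (opcode, destination, source registers).
    """
    ready_at = {}  # register -> cycle at which its new value is available
    cycle = 0      # earliest cycle at which the next instruction may issue
    stalls = 0
    for op, dest, sources in program:
        # An instruction waits until every source register is ready.
        issue = max([cycle] + [ready_at.get(reg, 0) for reg in sources])
        stalls += issue - cycle
        ready_at[dest] = issue + LATENCY[op]
        cycle = issue + 1
    return stalls


original = [
    ("Add", "R0", ["R0", "R1"]),
    ("Move", "R1", ["R5"]),
    ("Load", "R2", []),
    ("Load", "R3", []),
    ("Multiply", "R2", ["R2", "R3"]),
]

# Same instructions with the two loads moved ahead of the add and move.
scheduled = [original[2], original[3], original[0], original[1], original[4]]

print(count_stalls(original), count_stalls(scheduled))
```

Under these assumed latencies the reordered sequence stalls for 1 cycle instead of 3. A different latency model would give different counts, but the spacing argument stays the same.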







## Scoreboarding

- Unlike schemes that issue instructions out of order, scoreboarding issues instructions in order (in-order issue).
- Scoreboarding is a hardware mechanism that maintains an execution rate of one instruction per cycle by executing each instruction as soon as its operands are available and no hazard conditions prevent it.
- Every instruction goes through a scoreboard, where a record of its data dependences is constructed at instruction issue.
- A system with a scoreboard is assumed to have several functional units, each reporting its status information to the scoreboard.


- If the scoreboard determines that an instruction cannot execute immediately, it executes another waiting instruction, keeps monitoring hardware unit status, and decides when the stalled instruction can proceed to execute.
- All hazard detection and resolution are centralized in the scoreboard.
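The bookkeeping a scoreboard performs can be sketched in a few lines. This is a simplified illustration, not a model of any particular machine: the `Scoreboard` class, its method names, and the unit names are assumptions made here, and WAR checks and cycle-accurate timing are left out.

```python
class Scoreboard:
    """Tracks functional-unit status and pending register writes."""

    def __init__(self, units):
        self.busy = {unit: False for unit in units}  # functional unit -> in use?
        self.writer = {}                             # register -> unit that will write it

    def can_issue(self, unit, dest):
        # Structural hazard: the unit is busy.
        # WAW hazard: the destination already has a pending write.
        return not self.busy[unit] and dest not in self.writer

    def issue(self, unit, dest, sources):
        # Record the units this instruction must wait for (RAW dependences)
        # before claiming the destination register.
        waiting_on = {self.writer[s] for s in sources if s in self.writer}
        self.busy[unit] = True
        self.writer[dest] = unit
        return waiting_on

    def complete(self, unit):
        # The unit writes its result and reports itself free again.
        self.busy[unit] = False
        self.writer = {r: u for r, u in self.writer.items() if u != unit}


sb = Scoreboard(["load", "multiplier", "adder"])
sb.issue("load", "R3", [])                              # Load R3, M(β)
mul_waits = sb.issue("multiplier", "R2", ["R2", "R3"])  # depends on the load
add_waits = sb.issue("adder", "R0", ["R0", "R1"])       # independent
print(mul_waits, add_waits)
```

Here the multiply is issued but must wait for the load unit, while the add has an empty wait set and can execute at once, which is the "execute another waiting instruction" behaviour described above.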

| TOMASULO'S APPROACH                                                           | SCOREBOARDING                                         |
|-------------------------------------------------------------------------------|-------------------------------------------------------|
| Register renaming is used to eliminate WAR and WAW hazards                    | It must wait for WAR and WAW hazards to clear         |
| Hazard detection and execution control is distributed to each functional unit | Hazard detection and execution control is centralized |
| Forwards results directly to the functional units                             | Result is forwarded to the register file              |
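To make the first row of the comparison concrete, the following sketch shows the renaming idea on its own. It is an illustration under stated assumptions: the `rename` function and the `T0, T1, ...` tags are hypothetical, and the sketch does not model the reservation stations or result forwarding of Tomasulo's approach.

```python
from itertools import count


def rename(program):
    """Give every destination a fresh tag so WAR/WAW name conflicts vanish.

    Each instruction is a tuple (opcode, destination, source registers).
    """
    latest = {}      # architectural register -> tag holding its newest value
    fresh = count()  # hypothetical unlimited supply of tags T0, T1, ...
    renamed = []
    for op, dest, sources in program:
        # Sources read whichever tag currently holds the register's value;
        # registers never written in this program keep their own name.
        srcs = [latest.get(s, s) for s in sources]
        tag = f"T{next(fresh)}"
        latest[dest] = tag
        renamed.append((op, tag, srcs))
    return renamed


program = [
    ("Add", "R0", ["R0", "R1"]),
    ("Move", "R1", ["R5"]),        # WAR with the Add on R1
    ("Load", "R0", []),            # WAW with the Add on R0
    ("Add", "R4", ["R0", "R1"]),   # true (RAW) dependences on Load and Move
]
for instruction in rename(program):
    print(instruction)
```

After renaming, the Move and Load write to `T1` and `T2` rather than to registers the first Add still uses, so only the true dependences of the last Add (on `T2` and `T1`) remain.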







## Static Arithmetic Pipelines

- The ALU performs fixed-point and floating-point operations separately.
- The fixed-point unit is also called the integer unit. The floating-point unit can be built either as part of the central processor or on a separate coprocessor.
- These arithmetic units perform scalar operations. The pipelining in scalar arithmetic pipelines is controlled by software loops. Vector arithmetic units can be designed with pipeline hardware directly under firmware or hardwired control.






Consider as an example the multiplication of two 8-bit integers $A \times B = P$, where $P$ is the 16-bit product. This fixed-point multiplication can be written as the summation of eight partial products, $P = P_0 + P_1 + \cdots + P_7$, as shown below:

$$
\begin{array}{rrl}
         & 10110101         & = A \\
\times   & 10010011         & = B \\
\hline
         & 10110101         & = P_0 \\
         & 101101010        & = P_1 \\
         & 0000000000       & = P_2 \\
         & 00000000000      & = P_3 \\
         & 101101010000     & = P_4 \\
         & 0000000000000    & = P_5 \\
         & 00000000000000   & = P_6 \\
+        & 101101010000000  & = P_7 \\
\hline
         & 0110011111101111 & = P
\end{array}
$$
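The worked example above can be reproduced programmatically. The following is a minimal sketch; the `partial_products` helper is a name introduced here for illustration, and it only mirrors the shift-and-add arithmetic, not the pipeline hardware that would sum the partial products.

```python
def partial_products(a, b, width=8):
    """Return the shifted partial products P_0 .. P_{width-1} of a * b.

    P_j is a shifted left by j places when bit j of b is 1, and 0 otherwise.
    """
    return [(a << j) if (b >> j) & 1 else 0 for j in range(width)]


A = 0b10110101
B = 0b10010011

parts = partial_products(A, B)
P = sum(parts)  # the product is the sum of the eight partial products

for j, p in enumerate(parts):
    print(f"P{j} = {p:b}")
print(f"P  = {P:016b}")
```

Only bits 0, 1, 4, and 7 of `B` are set, so only `P0`, `P1`, `P4`, and `P7` are nonzero, matching the table.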





| Machine type                      | Scalar base machine of k pipeline stages | Superscalar machine of degree m |
|-----------------------------------|------------------------------------------|---------------------------------|
| Machine pipeline cycle            | 1 (base cycle)                           | 1                               |
| Instruction issue rate            | 1                                        | m                               |
| Instruction issue latency         | 1                                        | 1                               |
| Simple operation latency          | 1                                        | 1                               |
| ILP to fully utilize the pipeline | 1                                        | m                               |




- In this design, the processor can issue two instructions per cycle if there is no resource conflict and no data dependence problem.
- There are essentially two pipelines in the design.
- Both pipelines have four processing stages labeled fetch, decode, execute, and store, respectively.
- Each pipeline essentially has its own fetch unit, decode unit, and store unit.
- The two instruction streams flowing through the two pipelines are retrieved from a single source stream (the I-cache).
- The fan-out from a single instruction stream is subject to resource constraints and a data dependence relationship among the successive instructions.


- Four functional units (multiplier, adder, logic unit, and load unit) are available for use in the execute stage. These functional units are shared by the two pipelines on a dynamic basis.
- The multiplier itself has three pipeline stages, the adder has two stages, and the others each have only one stage.
- There is a lookahead window with its own fetch and decoding logic. This window is used for instruction lookahead in case out-of-order instruction issue is desired to achieve better pipeline throughput.
- It requires complex logic to schedule multiple pipelines simultaneously. The aim is to avoid pipeline stalling and minimize pipeline idle time.
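The issue condition for the two pipelines (no resource conflict and no data dependence) can be expressed as a simple check. The sketch below is a conservative illustration rather than the actual issue logic: the `can_dual_issue` function is hypothetical, it assumes each shared functional unit accepts at most one new instruction per cycle, and it treats any register overlap that involves a write as a dependence.

```python
def can_dual_issue(first, second):
    """Decide whether two consecutive instructions may issue in the same cycle.

    Each instruction is a tuple (functional unit, destination, source registers);
    `first` precedes `second` in program order.
    """
    unit1, dest1, srcs1 = first
    unit2, dest2, srcs2 = second
    resource_conflict = unit1 == unit2  # both need the same shared unit
    raw = dest1 in srcs2                # second reads what first writes
    war = dest2 in srcs1                # second overwrites a source of first
    waw = dest1 == dest2                # both write the same register
    return not (resource_conflict or raw or war or waw)


add = ("adder", "R0", ["R0", "R1"])
load = ("load", "R2", [])
multiply = ("multiplier", "R2", ["R2", "R3"])
other_add = ("adder", "R4", ["R5", "R6"])

print(can_dual_issue(add, load))       # different units, no shared registers
print(can_dual_issue(load, multiply))  # multiply reads R2 written by the load
print(can_dual_issue(add, other_add))  # both need the adder
```

Only the first pair qualifies; the second is blocked by a data dependence and the third by a resource conflict, the two constraints named in the text.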























