Review of Last Lecture

RISC Design Principles
- Smaller is faster: 32 registers, fewer instructions
- Keep it simple: rigid syntax
RISC-V Registers: s0-s11, t0-t6, x0
- No data types, just ** **, operations determine how they are interpreted
Memory is byte-addressed
- no types → no automatic pointer arithmetic
RISC-V Instructions
- Arithmetic: add, sub, addi, mult, div
- Data Transfer: lw, sw, lb, sb, lbu
- Branching: beq, bne, bge, blt, jal, j, jalr, jr;
- Bitwise: and, or, xor, andi, ori, xori
- Shifting: sll, srl, sra, slli, srli, srai
  (i = “immediate” constant integer)

Sign Extension Practice

How do we know if a binary two’s complement number is negative?

Look at the most significant bit!

Sign Extension

If we want to take an 8-bit two’s complement number and make it a 9-bit number, how would we do so?

We replicate the most significant bit!

e.g.
0b0000 0010(+2) -> 0b0 0000 0010(+2)
0b1111 1110 (-2) -> 0b1 1111 1110 (-2)

Arithmetic Sign Extension

When doing math, immediate values are sign extended.

# -1 = 0xFFF 
addi t0, x0, -1      # t0->[-1]->[0xFFFFFFFF]
addi t0, x0, 0x0FF   # t0->[0x000000FF]
addi t0, x0, 0xF77   # t0->[0xFFFFFF77]

Loading Sign Extension

For assembly, this happens when we pull data out of memory
Byte in memory: 0b1111 1110 (-2)
load byte -> Register contents:
0b XXXX XXXX XXXX XXXX XXXX XXXX 1111 1110 What do we do with the X values?
Sign extend!
0b 1111 1111 1111 1111 1111 1111 1111 1110

Normal (signed) loads sign extend the most significant bit

Memory: 0b1000 1111
Load Byte -> 0b1111 1111 1111 1111 1111 1111 1000 1111
Memory: 0b0000 1111
Load Byte -> 0b0000 0000 0000 0000 0000 0000 0000 1111

Offset loads also sign extend:
Memory = [0x00008011] (address in s0)
Assume system is little endian
lb t0, 0(s0) -> loading 0b00010001
0b0000 0000 0000 0000 0000 0000 0001 000
lb t0, 1(s0) -> loading 0b10000000
0b1111 1111 1111 1111 1111 1111 1000 0000
lbu t0, 1(s0) -> loading 0b10000000
0b0000 0000 0000 0000 0000 0000 1000 0000

Pseudo-Instructions

Assembly Instructions

A low-level programming language where the program instructions match a particular architecture’s operations
- Code can be compiled into different assembly languages, but an assembly language can only run on hardware that supports it
But sometimes, for the programmer’s benefit, it’s useful to have additional instructions that aren’t really implemented by the hardware
- Instead translated into real instructions
Example: mv dst,reg1
translates into: addi dst,reg1,0

More Pseudo-Instruction

Load Immediate (li)
- li dst, im
- Loads 32-bit immediate into dst
- utilizes: addi, lui(load upper immediate)
Load Address (la)
- la dst, label
- Loads address of specified label into dst
- translates to: auipc dst, <offset to label>(add upper immediate program counter)
No Operation(nop)
- nop
- Do nothing
- translates to: addi x0, x0, 0

Pseudo-Instructions are useful

Even thej instruction is actually a pseudo-Instruction
- We will see what this converts to later this lecture
Pseudo-Instructions are core to writing RISC assembly code and you will see them in any RISC assembly code you read

Full list of RISC-V supported pseudo instructions is on the greensheet

C to RISC-V Practice

Keep in mind that whenever you are changing C to RISCV, the number one thing you should keep in mind that you should write everything out step by step and you convert that line by line to do your proper RISCV instructions. And at the very end, if you see an opportunity, you can clean it up(sharpen your code)

Let’s put our all of our new RISC-V knowledge to use in an example: “Fast String Copy”
C code is as follows:
/* Copy string from p to q */
char *p, *q;
while((*q++ = *p++) != ‘\0’) ;
What do we know about its structure?
- Single while loop
- Exit condition is an equality test

Start with code skeleton:

# copy String p to q
# p->s0, q->s1 (char* pointers)
Loop: lb    t0, 0(s0)      # t0 = *p
      sb    t0, 0(s1)      # *q = t0
      addi  s0, s0, 1      # p = p + 1
      addi  s1, s1, 1      # q = q + q
      beq   t0, x0, Exit   # if *p==0, go to Exit
      j Loop               # go to Loop
Exit: # N chars in p => N*6 instructions

In lb t0, 0(s0) what if lb sign extends?
- Not a problem because sb only writes a single byte.(The sign extension is ignored)

• Alternate code using bne:(optimize)

# copy String p to q 
# p→s0, q→s1 (char* pointers) 
Loop: lb   t0,0(s0)     # t0 = *p
	  sb   t0,0(s1)     # *q = t0
	  addi s0,s0,1      # p = p + 1
	  addi s1,s1,1      # q = q + 1
	  bne t0,x0,Loop    # if *p==0, go to Loop 
Exit: # N chars in p => N*5 instructions

Functions in Assembly

Six Steps of Calling a Function

Put arguments in a place where the function can access them
Transfer control to the function
The function will acquire any (local) storage resources it needs
The function performs its desired task \
The function puts return value in an accessible place and “cleans up”
Control is returned to you

1 and 5: Where should we put the arguments and return values?

Registers way faster than memory, so use them whenever possible
a0–a7: eight argument registers to pass parameters
a0–a1: two argument registers also used to return values
- Order of arguments matters
- If need extra space, use memory (the stack!)

Example: function in assembly

void main(void) { 
	a = 3; 
	b = a+1; 
	a = add(a, b); 
	... 
} 

int add(int a, int b) { 
	return a+b; 
}

main: 
	addi a0, x0, 3 
	addi a1, a0, 1 
	jal ra, add 
	... 
add:
	add a0, a0, a1 
	jr ra

More Registers

a0–a7: eight argument registers to pass parameters
a0–a1: two registers to return values
sp: “stack pointer”
- Holds the current memory address of the “bottom” of the stack

2 and 6: How do we Transfer Control?

Jump(j)
- j label
Jump and Link (jal) (Used to invoke a function)
- jal dst label
Jump and Link Register (jalr) (Used to invoke a function)
- jalr dst src imm
“and Link”: Saves the location of instruction in a register before jumping
Jump Register (jr) – jr src (Used to return from a function(src = ra))
ra = return address register, used to save where a function is called from so we can get back

Function Call Example

How do we get back?

Well, option one is that we could always put a label here at address 1016, and we can just j back to that label. But that kind of annoying and ugly because that means that for every single time we are going to make a funcion we need to write a new label which is really annoying and not that clean.
So instead, let’s try to use a memory address.This means for instructions like jal and jalr, instead of using the label, they look at a register, and this register should hold an address of where you want to jump to so before jumping into a function, you need to put the address of the instruction that you want to come back to into ra register. So ra stands for return address, because when you return from the function, you want to return to that address
Would we know this before compiling?

Both for accessibility reasons and for security reasons it’s good that you don’t. So how can we fix this problem?
Well, this is why we introduced the jal function. jal stands for jump and link. Jump is jumping to somewhere. Link is linking the return address. And what is the return address should be? When we link, what’s happening is we’re actually putting the address of the next instruction inside of this register.
What jal or jalr does is, the first thing that is does is it takes the current pc (program counter) and it adds four. And then it puts that address inside of this register.
Why does it adds four? Memory is byte addressed, and every RISCV instruction is one word (or four bytes) long.

J is a pseudo-instruction explained

jal syntax: jal dst label
You supply the register used to link
- When calling a function you use ra
What happens if you specify x0?
- jal x0 label
- x0 always contains 0, so attempts to write to it do nothing
- So jal x0 label is just jumping without linking
j label is a pseudo-instruction for jal x0 label
- Similarly jr is a pseudo-instruction for jalr following the same idea

Review Question

A: Invalid Syntax(jal requires a label)
B: Invalid Syntax(branch requires a label)
C: Correct !
D: Invalid Syntax(j requires a label)
E: Would return properly though it would overwrite ra after doing so

3: Local storage for variables

Stack pointer (sp) holds the address of the bottom of the stack
- Decrement it (recall stack grows downwards)
- Then use store word to write to a variable
- To “clean up”, just increment the stack pointer

1
2
3

# store t0 to the stack 
addi sp, sp, -4 
sw t0, 0(sp)

Function Calling Conventions

Which registers can we use?

Problem: how does the function know which registers are safe to use?

Function A may have been using t0 when it called Function B!

Example: sumSquare

1
2
3

int sumSquare(int x, int y) { 
	return mult(x,x)+ y; 
}

What do we need to save?
- Call to mult will overwrite ra, so save it
- Reusing a1 to pass 2nd argument to mult, but need current value (y) later, so save a1

Calling Conventions

CalleR: the calling function
CalleE: the function being called
Register Conventions: A set of generally accepted rules as to which registers will be unchanged after a procedure call (jal) and which may have changed

Caller and Callee

Saved Register (Callee Saved)

These registers are expected to be the same before and after a function call
- If calleE uses them, it must restore values before returning
- This means save the old values, use the registers, then reload the old values back into the registers
s0-s11(saved registers)
sp(stack pointer)
- If not in same place, the caller won’t be able to properly access its own stack variables

You should be putting important values in the S registers.However both the caller and the callee might be using the s registers. How can we resolve this conflict of interest?
Well, the first thing that the Callee should do before starts executing any of its code is it should save any of the s registers that is wishes to use on the stack. This means the caller have some important value in the s registers and it wants to make sure and it nervous that if it calls a function that is going to override it. The callee is smart and says “Don’t worry,everything will seem fine.” The callee wants to use the s0-s11, but sees that its caller is also using them. So the callee will take the caller’s value of s0-s1 and put them on the stack. And then the callee is free to use s0-s11 . Then, right before function B returns control back, it will bring those values back into s registers. And then it will immediately return back to the caller.

Volatile Registers (Caller Saved)

These registers can be freely changed by the calleE
- If calleR needs them, it must save those values before making a procedure call
t0-t6(temporary registers)
a0-a7(return address and arguments)
ra(return address)
- These will change if calleE invokes another function (nested function means calleE is also a calleR)

If the caller cares about the argument registers, then it should save it’s values (or registers) before it calls the function. In addition, we also have the ra register as well.The caller wants to make sure that it’s not going to be get and overwritten by a future function.
So above are registers that can be freely changed by the callee because caller already saved them

Register Conventions

Each register is one of two types:

Caller saved
- The callee function can use them freely
  (if needed, the caller had to save them before invoking and will restore them afterwards)
Callee saved
- The callee function must save them before modifying them, and restore them before returning (avoid using them at all, and no need to save)

This is a contract agreed upon by all functions

Calling Convention on Greencard

How do we save registers? The stack!

Stack Before, During, After Call

Basic Structure of a Function

Prologue 
	func_label: 
	addi sp,sp, -framesize 
	sw ra, <framesize-4>(sp) 
	#store other callee saved registers 
	#save other regs if need be 
Body    (call other functions…)
	… 
Epilogue
	#restore other regs if need be  
	#restore other callee saved registers 
	lw ra, <framesize-4>(sp) 
	addi sp,sp, framesize 
	jr ra

Stack during function execution

Need to save old values of s0 and s1

Example: Using Saved Registers

Example: Using Volatile Registers

Register Conventions Summary

One more time for luck:
- CalleR must save any volatile registers it is using onto the stack before making a procedure call
- CalleR can trust saved registers to maintain values
- CalleE must “save” any saved registers it intends to use by putting them on the stack before overwriting their values
Notes:
- CalleR and calleE only need to save the appropriate registers they are using (not all!)
- Don’t forget to restore the values later

Summary

Example function with calling convention

int Leaf(int g, int h, int i, int j) { 
	int f; 
	f = (g+h)-(i+j); 
	return f; 
}

Leaf: 
addi sp,sp,-8 
# allocate stack 
sw s1,4(sp) # save s1 
sw s0,0(sp) # save s0 

add s0,a0,a1 # s0 = g+h 
add s1,a2,a3 # s1 = i+j 
sub a0,s0,s1 
# return value a0 = s0-s1 

lw s0,0(sp) # restore s0 
lw s1,4(sp) # restore s1 
addi sp,sp,8 # free stack 
jr ra # return

Choosing Your Registers

Minimize register footprint
- Optimize to reduce number of registers you need to save by choosing which registers to use in a function
- Only save when you absolutely have to
Function does NOT call another function
- Use only t0-t6 and there is nothing to save!
Function calls other function(s)
- Values you need throughout go in s0-s11, others go in t0-t6
- At each function call, check number arguments and return values for whether you or not you need to save

Different register choices could reduce effort

int Leaf(int g, int h, int i, int j) { 
	int f; 
	f = (g+h)-(i+j); 
	return f; 
}

Leaf: 
# nothing to save on stack 
add t0,a0,a1 # t0 = g+h 
add t1,a2,a3 # t1 = i+j 
sub a0,t0,t1 
# return value a0 = t0-t1 
# nothing to restore from stack 
jr ra        # return

Be lazy! Use register choices that minimize saving to the stack. It makes your program faster too…

Summary

Pseudo-instructions
Functions is assembly
- Six steps of calling a function
  1. Place arguments
  2. Jump to function
  3. Create local storage (Prologue)
  4. Perform desired task
  5. Place return value and clean up storage (Epilogue)
  6. Jump back to caller
Calling conventions
- Need a method for knowing which registers can be trusted across function calls
- Caller-saved registers (Volatile Registers)
  - Saved by caller if needed
  - Free to use by callee
- Callee-saved registers (Saved Registers)
  - Saved by callee if needed
  - Safe across function calls for caller

Review of Last Lecture

Sign Extension Practice

Sign in Two’s Complement

Sign Extension

Arithmetic Sign Extension

Loading Sign Extension

Pseudo-Instructions

Assembly Instructions

More Pseudo-Instruction

Pseudo-Instructions are useful

C to RISC-V Practice

Functions in Assembly

Six Steps of Calling a Function

1 and 5: Where should we put the arguments and return values?

Example: function in assembly

More Registers

2 and 6: How do we Transfer Control?

Function Call Example

J is a pseudo-instruction explained

Review Question

3: Local storage for variables

Function Calling Conventions

Which registers can we use?

Example: sumSquare

Calling Conventions

Caller and Callee

Saved Register (Callee Saved)

Volatile Registers (Caller Saved)

Register Conventions

Calling Convention on Greencard

How do we save registers? The stack!

Stack Before, During, After Call

Basic Structure of a Function

Stack during function execution

Example: Using Saved Registers

Example: Using Volatile Registers

Register Conventions Summary

Summary

Example function with calling convention

Choosing Your Registers

Different register choices could reduce effort

Summary