More RISC-V,RISC-V Functions
Review of Last Lecture
RISC Design Principles
- Smaller is faster: 32 registers, fewer instructions
- Keep it simple: rigid syntax
RISC-V Registers:
s0-s11, t0-t6, x0
- No data types, just ** **, operations determine how they are interpreted
Memory is byte-addressed
- no types → no automatic pointer arithmetic
RISC-V Instructions
- Arithmetic:
add, sub, addi, mult, div
- Data Transfer:
lw, sw, lb, sb, lbu
- Branching:
beq, bne, bge, blt, jal, j, jalr, jr;
- Bitwise:
and, or, xor, andi, ori, xori
- Shifting:
sll, srl, sra, slli, srli, srai
(i = “immediate” constant integer)
- Arithmetic:
Sign Extension Practice
Sign in Two’s Complement
How do we know if a binary two’s complement number is negative?
- Look at the most significant bit!
Sign Extension
If we want to take an 8-bit two’s complement number and make it a 9-bit number, how would we do so?
- We replicate the most significant bit!
e.g.
0b0000 0010(+2) -> 0b0 0000 0010(+2)
0b1111 1110 (-2) -> 0b1 1111 1110 (-2)
Arithmetic Sign Extension
When doing math, immediate values are sign extended.
1 | # -1 = 0xFFF |
Loading Sign Extension
- For assembly, this happens when we pull data out of memory
- Byte in memory: 0b1111 1110 (-2)
- load byte -> Register contents:
0b XXXX XXXX XXXX XXXX XXXX XXXX 1111 1110 What do we do with the X values?
Sign extend!
0b 1111 1111 1111 1111 1111 1111 1111 1110
Normal (signed) loads sign extend the most significant bit
Memory: 0b1000 1111
Load Byte -> 0b1111 1111 1111 1111 1111 1111 1000 1111Memory: 0b0000 1111
Load Byte -> 0b0000 0000 0000 0000 0000 0000 0000 1111
Offset loads also sign extend:
Memory = [0x00008011]
(address in s0)
Assume system is little endian
lb t0, 0(s0) -> loading 0b00010001
0b0000 0000 0000 0000 0000 0000 0001 000
lb t0, 1(s0) -> loading 0b10000000
0b1111 1111 1111 1111 1111 1111 1000 0000
lbu t0, 1(s0) -> loading 0b10000000
0b0000 0000 0000 0000 0000 0000 1000 0000
Pseudo-Instructions
Assembly Instructions
- A low-level programming language where the program instructions match a particular architecture’s operations
- Code can be compiled into different assembly languages, but an assembly language can only run on hardware that supports it
- But sometimes, for the programmer’s benefit, it’s useful to have additional instructions that aren’t really implemented by the hardware
- Instead translated into real instructions
- Example: mv dst,reg1
translates into: addi dst,reg1,0
More Pseudo-Instruction
- Load Immediate (li)
li dst, im
- Loads 32-bit immediate into
dst
- utilizes:
addi, lui
(load upper immediate)
- Load Address (
la
)la dst, label
- Loads address of specified label into
dst
- translates to:
auipc dst, <offset to label>
(add upper immediate program counter)
- No Operation(nop)
nop
- Do nothing
- translates to:
addi x0, x0, 0
Pseudo-Instructions are useful
- Even the
j
instruction is actually a pseudo-Instruction- We will see what this converts to later this lecture
- Pseudo-Instructions are core to writing RISC assembly code and you will see them in any RISC assembly code you read
Full list of RISC-V supported pseudo instructions is on the greensheet
C to RISC-V Practice
Keep in mind that whenever you are changing C to RISCV, the number one thing you should keep in mind that you should write everything out step by step and you convert that line by line to do your proper RISCV instructions. And at the very end, if you see an opportunity, you can clean it up(sharpen your code)
Let’s put our all of our new RISC-V knowledge to use in an example: “Fast String Copy”
C code is as follows:
/* Copy string from p to q */
char *p, *q;
while((*q++ = *p++) != ‘\0’) ;
What do we know about its structure?
- Single
while
loop - Exit condition is an equality test
- Single
Start with code skeleton:
1
2
3
4
5
6
7
8
9# copy String p to q
# p->s0, q->s1 (char* pointers)
Loop: lb t0, 0(s0) # t0 = *p
sb t0, 0(s1) # *q = t0
addi s0, s0, 1 # p = p + 1
addi s1, s1, 1 # q = q + q
beq t0, x0, Exit # if *p==0, go to Exit
j Loop # go to Loop
Exit: # N chars in p => N*6 instructionsIn
lb t0, 0(s0)
what iflb
sign extends?- Not a problem because
sb
only writes a single byte.(The sign extension is ignored)
- Not a problem because
• Alternate code using bne:(optimize)
1 | # copy String p to q |
Functions in Assembly
Six Steps of Calling a Function
- Put arguments in a place where the function can access them
- Transfer control to the function
- The function will acquire any (local) storage resources it needs
- The function performs its desired task \
- The function puts return value in an accessible place and “cleans up”
- Control is returned to you
1 and 5: Where should we put the arguments and return values?
- Registers way faster than memory, so use them whenever possible
- a0–a7: eight argument registers to pass parameters
- a0–a1: two argument registers also used to return values
- Order of arguments matters
- If need extra space, use memory (the stack!)
Example: function in assembly
1 | void main(void) { |
1 | main: |
More Registers
a0–a7
: eight argument registers to pass parametersa0–a1
: two registers to return valuessp
: “stack pointer”- Holds the current memory address of the “bottom” of the stack
2 and 6: How do we Transfer Control?
- Jump(j)
- j label
- Jump and Link (jal) (Used to invoke a function)
- jal dst label
- Jump and Link Register (jalr) (Used to invoke a function)
- jalr dst src imm
- “and Link”: Saves the location of instruction in a register before jumping
- Jump Register (jr) – jr src (Used to return from a function(src = ra))
- ra = return address register, used to save where a function is called from so we can get back
Function Call Example
How do we get back?
Well, option one is that we could always put a label here at address 1016, and we can just j back to that label. But that kind of annoying and ugly because that means that for every single time we are going to make a funcion we need to write a new label which is really annoying and not that clean.
So instead, let’s try to use a memory address.This means for instructions like jal and jalr, instead of using the label, they look at a register, and this register should hold an address of where you want to jump to so before jumping into a function, you need to put the address of the instruction that you want to come back to intora
register. So ra stands for return address, because when you return from the function, you want to return to that addressWould we know this before compiling?
Both for accessibility reasons and for security reasons it’s good that you don’t. So how can we fix this problem?
Well, this is why we introduced thejal
function.jal
stands for jump and link. Jump is jumping to somewhere. Link is linking the return address. And what is the return address should be? When we link, what’s happening is we’re actually putting the address of the next instruction inside of this register.
Whatjal
orjalr
does is, the first thing that is does is it takes the current pc (program counter) and it adds four. And then it puts that address inside of this register.
Why does it adds four? Memory is byte addressed, and every RISCV instruction is one word (or four bytes) long.
J is a pseudo-instruction explained
jal
syntax:jal dst label
- You supply the register used to link
- When calling a function you use ra
- What happens if you specify x0?
jal x0 label
x0
always contains 0, so attempts to write to it do nothing- So jal x0 label is just jumping without linking
j label
is a pseudo-instruction forjal x0 label
- Similarly
jr
is a pseudo-instruction forjalr
following the same idea
- Similarly
Review Question
- A: Invalid Syntax(
jal
requires a label) - B: Invalid Syntax(branch requires a label)
- C: Correct !
- D: Invalid Syntax(
j
requires a label) - E: Would return properly though it would overwrite ra after doing so
3: Local storage for variables
- Stack pointer (sp) holds the address of the bottom of the stack
- Decrement it (recall stack grows downwards)
- Then use store word to write to a variable
- To “clean up”, just increment the stack pointer
1 | # store t0 to the stack |
Function Calling Conventions
Which registers can we use?
- Problem: how does the function know which registers are safe to use?
Function A may have been using t0 when it called Function B!
Example: sumSquare
1 | int sumSquare(int x, int y) { |
- What do we need to save?
- Call to
mult
will overwritera
, so save it - Reusing
a1
to pass 2nd argument tomult
, but need current value (y
) later, so savea1
- Call to
Calling Conventions
CalleR: the calling function
CalleE: the function being called
Register Conventions: A set of generally accepted rules as to which registers will be unchanged after a procedure call (
jal
) and which may have changed
Caller and Callee
Saved Register (Callee Saved)
- These registers are expected to be the same before and after a function call
- If calleE uses them, it must restore values before returning
- This means save the old values, use the registers, then reload the old values back into the registers
- s0-s11(saved registers)
- sp(stack pointer)
- If not in same place, the caller won’t be able to properly access its own stack variables
You should be putting important values in the S registers.However both the caller and the callee might be using the s registers. How can we resolve this conflict of interest?
Well, the first thing that the Callee should do before starts executing any of its code is it should save any of the s registers that is wishes to use on the stack. This means the caller have some important value in the s registers and it wants to make sure and it nervous that if it calls a function that is going to override it. The callee is smart and says “Don’t worry,everything will seem fine.” The callee wants to use thes0
-s11
, but sees that its caller is also using them. So the callee will take the caller’s value ofs0
-s1
and put them on the stack. And then the callee is free to uses0
-s11
. Then, right before function B returns control back, it will bring those values back into s registers. And then it will immediately return back to the caller.
Volatile Registers (Caller Saved)
- These registers can be freely changed by the calleE
- If calleR needs them, it must save those values before making a procedure call
- t0-t6(temporary registers)
- a0-a7(return address and arguments)
- ra(return address)
- These will change if calleE invokes another function (nested function means calleE is also a calleR)
If the caller cares about the argument registers, then it should save it’s values (or registers) before it calls the function. In addition, we also have the
ra
register as well.The caller wants to make sure that it’s not going to be get and overwritten by a future function.
So above are registers that can be freely changed by the callee because caller already saved them
Register Conventions
Each register is one of two types:
- Caller saved
- The callee function can use them freely
(if needed, the caller had to save them before invoking and will restore them afterwards)
- The callee function can use them freely
- Callee saved
- The callee function must save them before modifying them, and restore them before returning (avoid using them at all, and no need to save)
This is a contract agreed upon by all functions
Calling Convention on Greencard
How do we save registers? The stack!
Stack Before, During, After Call
Basic Structure of a Function
1 | Prologue |
Stack during function execution
- Need to save old values of s0 and s1
Example: Using Saved Registers
Example: Using Volatile Registers
Register Conventions Summary
- One more time for luck:
- CalleR must save any volatile registers it is using onto the stack before making a procedure call
- CalleR can trust saved registers to maintain values
- CalleE must “save” any saved registers it intends to use by putting them on the stack before overwriting their values
- Notes:
- CalleR and calleE only need to save the appropriate registers they are using (not all!)
- Don’t forget to restore the values later
Summary
Example function with calling convention
1 | int Leaf(int g, int h, int i, int j) { |
1 | Leaf: |
Choosing Your Registers
- Minimize register footprint
- Optimize to reduce number of registers you need to save by choosing which registers to use in a function
- Only save when you absolutely have to
- Function does NOT call another function
- Use only
t0-t6
and there is nothing to save!
- Use only
- Function calls other function(s)
- Values you need throughout go in
s0-s11
, others go int0-t6
- At each function call, check number arguments and return values for whether you or not you need to save
- Values you need throughout go in
Different register choices could reduce effort
1 | int Leaf(int g, int h, int i, int j) { |
1 | Leaf: |
Be lazy! Use register choices that minimize saving to the stack. It makes your program faster too…
Summary
Pseudo-instructions
Functions is assembly
- Six steps of calling a function
- Place arguments
- Jump to function
- Create local storage (Prologue)
- Perform desired task
- Place return value and clean up storage (Epilogue)
- Jump back to caller
- Six steps of calling a function
Calling conventions
- Need a method for knowing which registers can be trusted across function calls
- Caller-saved registers (Volatile Registers)
- Saved by caller if needed
- Free to use by callee
- Callee-saved registers (Saved Registers)
- Saved by callee if needed
- Safe across function calls for caller