C Introduction, Pointers
Overview
Prerequisites
- Official prerequisites: “Some” C experience is required before CS61C
- C++ or Java okay
Introduction
C is not a “very high level” language, nor a “big” one, and is not specialized to any particular area of application. But its absence of restrictions and its generality make it more convenient and effective for many tasks than supposedly more powerful languages. –Kernighan and Ritchie
- With C we can write programs that allow us to exploit underlying features of the architecture
Basic C Concepts
Concept | Description |
---|---|
Compiler | Creates usable program from C source code |
Typed variables | Must declare the kind of data the variable will contain |
Typed functions | Must declare the kind of data returned from the function |
Header files (.h) | Allow you to declare functions and variables in separate files |
Structs | Groups of related values |
Enums | Lists of predefined values |
Pointers | Aliases to other variables |
Compilation
- C is a compiled language
- C compilers map C programs into architecture-specific machine code (string of 0s and 1s)
- Unlike Java, which converts to architecture-independent bytecode (run by JVM)
- Unlike Python, which directly interprets the code
- Main difference is when your program is mapped to low-level machine instructions
Compilation Advantages
- Excellent run-time performance: generally much faster than Python or Java for comparable code because the compiler optimizes for the given architecture
- Fair compilation time: enhancements in compilation procedure (Makefiles) allow us to recompile only the modified files
Compilation Disadvantages
- Compiled files, including the executable, are architecture-specific (CPU type and OS)
- Executable must be rebuilt on each new system
- i.e. “porting your code” to a new architecture
- “Edit->Compile->Run [repeat]” iteration cycle can be slow
- e.g. when debugging, it is not “Edit->Run”, you have to compile after edit, which can take some time
Variable Types
Typed Variables in C
- You must declare the type of data a variable will hold
- Declaration must come before or simultaneously with assignment
Type | Description | Examples |
---|---|---|
int | signed integer | 5, -12, 0 |
short | smaller signed integer | |
long | larger signed integer | |
char | single text character or symbol | 'a', 'D' |
float | floating point non-integer numbers | 0.0, 1.618, -1.4 |
double | greater precision FP number | |
- Integer sizes are machine dependent!
- Common size is 4 or 8 bytes (32/64-bit), but can’t ever assume this
- Can add `unsigned` before `int` or `char`
Characters
- Encode characters as numbers, same as everything!
- ASCII standard defines 128 different characters and their numeric encodings http://www.asciitable.com
- A `char` representing the character 'a' contains the value 97
- `char c = 'a'` or `char c = 97` are both valid
- A `char` takes up 1 byte of space
- 7 bits is enough to store a char ($2^7 = 128$), but we add a bit to round up to 1 byte since computers usually deal with multiples of bytes
Typecasting in C (1/2)
- C is a “weakly” typed language
- You can explicitly typecast from any type to any other:
    int i = -1;
    if (i < 0)
        printf("This will print\n");
    if ((unsigned int) i < 0)
        printf("This will not print\n");
- This is possible because everything is stored as bits!
- Can be seen as changing the “programmer’s perspective” of the variable
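- Can typecast almost anything, even if it doesn’t make sense (standard C rejects casting a whole struct to an `int`, but casting the address of the struct is accepted):
    struct node n;
    int i = (int) &n;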
- More freedom, but easier to shoot yourself in the foot
Typed Functions in C
- You have to declare the type of data you plan to return from a function
- Return type can be any C variable type or `void` for no return value
- Place on the left of function name
- Also necessary to define types for function arguments
- Declaring the “prototype” of a function allows you to use it
    #include <stdio.h>

    // function prototypes
    int my_func(int, int);
    void sayHello(void);

    // function definitions
    int my_func(int x, int y) {
        sayHello();
        return x * y;
    }

    void sayHello(void) {
        printf("Hello\n");
    }
Structs in C
- Way of defining compound data types
- A structured group of variables, possibly including other structs
- Think of it as an instruction to C on how to arrange a bunch of bits in a bucket
1 | typedef struct { |
Structs Alignment and Padding in C
1 | struct foo { |
- Structs provide enough space and align the data with padding!
- The actual layout on a 32-bit architecture would be:
- 4 bytes for a
- 1 byte for b
- 3 unused bytes (pad the rest of them so that we have aligned accesses)
- 4 bytes for c
- sizeof(struct foo) == 12
Unions in C
- A “union” is also an instruction in C on how to arrange a bunch of bits
    union foo {
        int a;
        char b;
        union foo *c;
    };
- Provides enough space for the largest element
    union foo f;
    f.a = 0xDEADB33F; // treat f as an integer and store that value
    f.c = &f;         // treat f as a pointer of type "union foo *" and store the address of f in itself
Differences between C and Java
 | C | Java |
---|---|---|
Type of Language | Function Oriented | Object Oriented |
Programming Unit | Function | Class = Abstract Data Type |
Compilation | Creates machine-dependent code | Creates machine-independent code |
Execution | Loads and executes program | JVM interprets bytecode |
Hello World | …… | …… |
Memory management | Manual(malloc, free) | Automatic(garbage collection) |
C Syntax and Control Flow
Operators
- arithmetic: +, -, *, /, %
- assignment: =
- augmented assignment: +=, -=, *=, /=, %=, &=, |=, ^=, <<=, >>=
- bitwise logic: ~, &, |, ^
- bitwise shifts: <<, >>
- boolean logic: !, &&, ||
- equality testing: ==, !=
- subexpression grouping: ( )
- order relations: <, <=, >, >=
- increment and decrement: ++ and --
- member selection: ., ->
- conditional evaluation: ? :
Generic C Program Layout
1 | //-------handled by Preprocessor------- |
- Typically the return value of `main()` is the exit code
C Syntax: main
- To get arguments to the main function, use:
int main(int argc, char *argv[])
- What does this mean?
- argc (argument count) contains the number of strings on the command line (the executable path counts as one, plus one for each argument).
- argv (argument value) is an array containing pointers to the arguments as strings (more on pointers later)
Example
$ ./foo hello 87
- Here argc = 3 and the array argv contains pointers to the following strings:
    argv[0] = "./foo"
    argv[1] = "hello"
    argv[2] = "87"
- We will cover pointers and strings later.
C Syntax: Variable Declarations
- All variable declarations must appear before they are used (e.g. at the beginning of a block of code)
- A variable may be initialized in its declaration; if not, it holds garbage!
- Variables of the same type may be declared on the same line
- Examples of declarations:
- Correct:
int x;
int a, b = 10, c;
int a, *b = NULL, c;
- Incorrect:
short x = 1, float y = 1.0;
z = 'c';
C Syntax: True or False
- No explicit Boolean type in C (unlike Java)
- What evaluates to FALSE in C?
- 0 (integer)
- NULL (a special kind of pointer: more on this later)
- What evaluates to TRUE in C?
- Anything that isn’t false is true
- Same idea as in Scheme: only #f is false, anything else is true
C Syntax: Control Flow
if-else
1 | if (expression){ statement } |
while
1 | while (expression) { |
for
1 | for (initialize; check; update){ |
switch
1 | switch (expression) { |
switch and break
- Case statement (switch) requires proper placement of `break` to work properly
- “Fall through” effect: will execute all cases until a `break` is found
    switch (ch) {
    case '+': … /* does + and - */
    case '-': … break;
    case '*': … break;
    default: …
    }
- In certain cases, can take advantage of this!
Has there been an update to ANSI C?
Yes! There have been a few.
We use “C99” or “C9x” std
- Use option “gcc -std=c99” at compilation
References
Highlights:
- Declarations in `for` loops, like Java (#15)
- Java-like // comments (to end of line) (#10)
- Variable-length non-global arrays (#33)
- `<stdint.h>` for explicit integer types (#38)
- `<stdbool.h>` for boolean logic definitions (#35)
Pointers
Address vs. Value
- Consider memory to be a single huge array
- Each cell/entry of the array has an address
- Each cell also stores some value
- Don’t confuse the address referring to a memory location with the value stored there
Pointers
- A pointer is a variable that contains an address
- An address refers to a particular memory location, usually also associated with a variable name
- Name comes from the fact that you can say that it points to a memory location
Pointer Syntax
- `int *x;`
- Declare variable `x` as the address of an `int`
- `x = &y;`
- Assign address of `y` to `x`
- `&` called the “address operator” in this context
- `z = *x;`
- Assigns the value at address in `x` to `z`
- `*` called the “dereference operator” in this context
Example
Pointer Types
- Pointers are used to point to one kind of data (int, char, a struct, etc.)
- Pointers to pointers? Oh yes! (e.g. `int **pp`, where `pp` is called a double pointer)
- Exception is the type `void *`, which can point to anything (generic pointer)
- Use sparingly to help avoid program bugs and other bad things!
- Functions can return pointers
1 | char *foo () { |
- How do we get a function to change a value?
- Pass “by reference”: function accepts a pointer and then modifies value by dereferencing it
1 | void addOne(int *p) { |
Pointers in C
Why use pointers?
- When passing a large struct or array, it’s easier/faster to pass a pointer than a copy of the whole thing
- In general, pointers allow cleaner, more compact code
Careful: Pointers are likely the single largest source of bugs in C
- Most problematic with dynamic memory management, which we will cover later
- Dangling references and memory leaks
Pointer Bugs
- Local variables in C are not initialized; they may contain anything (a.k.a. “garbage”)
- Declaring a pointer just allocates space to hold the pointer – it does not allocate the thing being pointed to!
Below are bad examples:
1 | //TWO BAD EXAMPLES |
Summary
- C is an efficient (compiled) language, but leaves safety to the programmer
- Weak type safety, variables not auto-initialized
- Use pointers with care: common source of bugs!
- Pointer is a C version (abstraction) of a data address
- Each memory location has an address and a value stored in it
- `*` “follows” a pointer to its value
- `&` gets the address of a value
- C functions “pass by value”