C Arrays, Strings, More Pointers
Structure definition:
- Creates a variable type `struct foo`, then declare variables of that type

```c
struct foo {
    /* fields */
};
struct foo name1;
struct foo *name2;
```

Joint struct definition and typedef
- `typedef` is a keyword that gives an existing type a new name.
- Don’t need to name the struct in this case (the tag `foo` may be omitted)

```c
typedef struct foo {
    /* fields */
} bar;
bar name;
```
C Operators
Operator Precedence
Precedence | Operator | Description | Associativity |
---|---|---|---|
1 | ++ -- | Suffix/postfix increment and decrement | Left-to-right |
1 | () | Function call | Left-to-right |
1 | [] | Array subscripting | Left-to-right |
1 | . | Structure and union member access | Left-to-right |
1 | -> | Structure and union member access through pointer | Left-to-right |
1 | (type){list} | Compound literal (C99) | Left-to-right |
2 | ++ -- | Prefix increment and decrement | Right-to-left |
2 | + - | Unary plus and minus | Right-to-left |
2 | ! ~ | Logical NOT and bitwise NOT | Right-to-left |
2 | (type) | Type cast | Right-to-left |
2 | * | Indirection (dereference) | Right-to-left |
2 | & | Address-of | Right-to-left |
2 | sizeof | Size-of | Right-to-left |
2 | _Alignof | Alignment requirement (C11) | Right-to-left |
- Use parentheses to override the default precedence
- Equality test (`==`) binds more tightly than the logic operators (`&`, `|`, `&&`, `||`)
  - `x & 1 == 0` means `x & (1 == 0)` instead of `(x & 1) == 0`
Assignment and Equality
- One of the most common errors for beginning C programmers
```c
a = b   // is assignment
a == b  // is equality test
```
Operator Precedence
- Prefix (`++p`) takes effect immediately
- Postfix (`p++`) takes effect last
Arrays
- Modern machines are “byte-addressable”
- Hardware’s memory composed of 8-bit storage cells, each has a unique address
- A C pointer is just an abstracted memory address
- Type declaration tells compiler how many bytes to fetch on each access through pointer
- E.g., 32-bit integer stored in 4 consecutive 8-bit bytes
- But we actually want “word alignment”
- Some processors will not let you access a 32-bit value unless it sits on a 4-byte boundary
- Others will just be very slow if you try to access “unaligned” memory
Sizeof()
Integer and pointer sizes are machine dependent. How do we tell?
Use the `sizeof()` operator
- Returns size in bytes of variable or data type name

Examples:

```c
int x;
int *y;
sizeof(x);    // 4 (32-bit int)
sizeof(int);  // 4 (32-bit int)
sizeof(y);    // 4 (32-bit addr)
sizeof(char); // 1 (always)
```
Acts differently with arrays and structs (to be explained later)
- Arrays: returns size of whole array
- Structs: returns size of one instance of struct (sum of sizes of all struct variables + padding)
Struct Alignment
1 | struct hello { |
- Assume the default alignment rule is “32b architecture”
- char: 1 byte, no alignment needed
- short: 2 bytes, ½ word aligned
- int: 4 bytes, word aligned
- Pointers are the same size as int
Rearrange to shrink the size
To align the data under the 32-bit rules above, the compiler inserts padding: unused bytes between the addresses of the structure's fields. Because of this, a structure can be considerably larger than the sum of its fields. Rearranging the fields can shrink the size.
Array Basics
Declaration:
- `int ar[2];` declares a 2-element integer array (just a block of memory)
- `int ar[] = {795, 635};` declares and initializes a 2-element integer array

Accessing elements:
- `ar[num]` returns the $num^{th}$ element
- Zero-indexed
Pitfall: An array in C does not know its own length, and its bounds are not checked!
- We can accidentally access off the end of an array
- We must pass the array and its size to any procedure that is going to manipulate it
Mistakes with array bounds can cause segmentation faults and bus errors
- Be careful! These are VERY difficult to find (You’ll learn how to debug these in lab)
Accessing an Array
- Array size n: access entries 0 to n-1
- Use separate variable for array declaration & array bound to be reused (eg: no hard-coding)
1 | //----------Bad Pattern--------- |
Arrays and Pointers
- Arrays are (almost) identical to pointers
  - `char *buffer` and `char buffer[]` are nearly identical declarations
  - Differ in subtle ways: initialization, `sizeof()`, etc.
- Key Concept: An array variable looks like a pointer to the first ($0^{th}$) element
  - `ar[0]` same as `*ar`; `ar[2]` same as `*(ar+2)`
  - We can use pointer arithmetic to conveniently access arrays
- An array variable is read-only (no assignment) (i.e. cannot use “`ar = <anything>`”)
Example
- `ar[i]` is treated as `*(ar+i)`
- To zero an array, the following three ways are equivalent:
for(i=0; i<SIZE; i++) {ar[i] = 0;}
for(i=0; i<SIZE; i++) {*(ar+i) = 0;}
for(p=ar; p<ar+SIZE; p++) {*p = 0;}
- These use pointer arithmetic, which we will get to shortly
Arrays Stored Differently Than Pointers
Arrays and Functions
Declared arrays are only allocated while the scope is valid:

```c
//-------BAD--------
char *foo() {
    char string[32];
    ……
    return string; // string's storage is gone once foo returns
}
//-------BAD--------
```

An array is passed to a function as a pointer:

```c
// The first arg is really int *ar
// Must explicitly pass the size
int foo(int ar[], unsigned int size) {
    …… ar[size-1] ……
}
```

Array size gets lost when passed to a function
What prints in the following code:
```c
int foo(int array[], unsigned int size) {
    ……
    printf("%zu\n", sizeof(array)); // sizeof(int *)
}
int main(void) {
    int a[10], b[5];
    …… foo(a, 10) ……
    printf("%zu\n", sizeof(a)); // 10 * sizeof(int)
}
```
String
C Strings
- A string in C is just an array of characters
  - `char letters[] = "abc";`
  - `const char letters[] = {'a', 'b', 'c', '\0'};`
- But how do we know when the string ends? (because arrays in C don’t know their size)
  - Last character is followed by a 0 byte (`\0`) (a.k.a. “null terminator”)
  - This means you need an extra space in your array
- How do you tell how long a C string is? Count until you reach the null terminator

```c
int strlen(char s[]) {
    int n = 0;
    while (s[n] != 0) { n++; }
    return n;
}
```

- Danger: What if there is no null terminator?
C String Standard Functions
- Accessible with `#include <string.h>`
- `int strlen(char *string);`
  - Returns the length of string (not including the null terminator)
- `int strcmp(char *str1, char *str2);`
  - Returns 0 if `str1` and `str2` are identical
  - How is this different from `str1 == str2`? `==` compares the two addresses, while `strcmp` compares the characters they point to
  - On a mismatch, the result reflects the first non-matching character: typically the character in `str1` minus the character in `str2`, i.e. the difference between their ASCII values (only the sign is guaranteed by the standard)
- `char *strcpy(char *dst, char *src);`
  - Copies the contents of string `src` to the memory at `dst`. Caller must ensure that `dst` has enough memory to hold the data to be copied
  - Note: `dst = src` only copies the pointer (the address)
Example
Value of the following expressions?

Exp | Val | Description |
---|---|---|
sizeof(s1) | 10 | - |
strlen(s1) | 2 | - |
s1 == s2 | 0 | Point to different locations |
strcmp(s1,s2) | 0 | - |
strcmp(s1,s3) | 4 | (s1>s3) e,f,g,h,i |
strcmp(s1,s4) | -6 | (s1<s4) i,j,k,l,m,n,o |
More Pointers
Pointer Arithmetic
$pointer \pm number$
- e.g. $pointer + 1$ adds 1 something to the address
Compare what happens: (assume `a` at address 100)

Pointer arithmetic should be used cautiously

A pointer is just a memory address, so we can add to/subtract from it to move through an array
- `p+1` correctly advances the address by `sizeof(*p)` bytes
  - i.e. it points to the next array element
What about an array of structs?
- Struct declaration tells C the size to use, so handled like basic types
What is valid pointer arithmetic?
- Add an integer to a pointer
- Subtract 2 pointers (in the same array, result is the index difference between two elements)
- Compare pointers (<, <=, ==, !=, >, >=)
- Compare pointer to NULL (indicates that the pointer points to nothing)
Everything else is illegal since it makes no sense:
- Adding two pointers
- Multiplying pointers
- Subtract pointer from integer
Increment and Dereference
When multiple prefix operators are present, they are applied from right to left
- `*--p` decrements `p`, returns the value at that address
  - `--` binds to `p` before `*` and takes effect first
- `++*p` increments `*p` and returns that value
  - `*` binds first (get the value), then the increment takes effect immediately

Postfix increment/decrement operators have precedence over prefix operators (e.g. `*`)
- BUT the increment/decrement takes effect last because it is a postfix. The “front” of the expression is returned.
- `*p++` returns `*p`, then increments `p`
  - `++` binds to `p` before `*`, but takes effect last
- `(*p)++` returns `*p`, then increments the value in memory
  - Post-increment happens last
Pointer Misc
Pointers and Allocation
- When you declare a pointer (e.g. `int *ptr;`), it doesn’t actually point to anything yet
  - It points somewhere (garbage; don’t know where)
  - Dereferencing it is undefined behavior and will usually cause an error
- Option 1: Point to something that already exists
  - `int *ptr, var; var = 5; ptr = &var;`
  - `var` has space implicitly allocated for it (declaration)
- Option 2: Allocate room in memory for new thing to point to (next lecture)
Pointers and Structures
```c
#include <stddef.h>

// Variable declarations:
struct Point {
    int x;
    int y;
    // Cannot contain an instance of itself, but can point to one
    struct Point *p;
};

int main(void) {
    struct Point pt1 = {1, 2, NULL};
    struct Point pt2 = {3, 4, NULL};
    struct Point *ptaddr = &pt1;

    // Valid operations:
    // 1. dot notation
    int h = pt1.x;
    pt2.y = pt1.y;
    // 2. arrow notation
    h = ptaddr->x;
    h = (*ptaddr).x;  // equivalent to ptaddr->x
    // 3. copy
    pt1 = pt2;        // copies contents
    (void)h;
    return 0;
}
```
Pointers to Pointers
- Pointer to a pointer, declared as `**h`
……
Summary
- Pointers and array variables are very similar
- Can use pointer or array syntax to index into arrays
- Strings are null-terminated arrays of characters
- Pointer arithmetic moves the pointer by the size of the thing it’s pointing to
- Pointers are the source of many bugs in C, so handle with care