APS105 L21 - Strings

#APS105 Slides Lecture

Strings in C

Format Specifier of String is %s

When using printf(), we can print a string by using the format specifier %s. The matching argument is a pointer to the first character of the string.

#include <stdio.h>
#include <stdlib.h>
int main(void) {
	const char *s = "Hello world"; // const because a string literal must not be modified
	s[0] = 'h'; // compile error: s points to const char, so we cannot assign through it
	printf("s: %s\n", s);
	return EXIT_SUCCESS;
}

We can modify a string on the stack

If we do:

char s[] = /* string literal */

C copies the char values onto the stack for s, which allows us to modify the string.

Example

#include <stdio.h>
#include <stdlib.h>
int main(void) {
	char s[] = "Hello world";
	s[0] = 'h';
	printf("s: %s\n", s);
	return EXIT_SUCCESS;
}

C knows how many bytes to copy from a string literal

That is, if we don't specify the length of the array, C automatically sizes it to fit the string literal, including the null byte at the end.

char s[] = "Hello";
// length of array s is 6, 1 byte for null byte and 5 for characters

We can create a larger array than needed

If we do this, the extra elements will be null bytes. For example:

char s[8] = "Hello";
// this statement is equivalent to:
char s[8] = {'H', 'e', 'l', 'l', 'o', '\0', '\0', '\0'};

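If we set the length of the array to exactly the number of characters (for example, char s[5] = "Hello";), there is no room for the null byte, so the array is not a valid string. Printing it with %s is undefined behavior: in practice, the output continues past the array until it reaches a byte in memory that is 0, appending whatever happens to be stored there. An array shorter than that cannot hold all the characters, and the compiler will warn about or reject the initializer.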
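To illustrate the missing null byte, here is a minimal sketch. The helper name copy_terminated is hypothetical. It builds an array holding exactly the five characters (the same contents that char s[5] = "Hello" would produce), prints it safely with a precision, and copies it into a larger buffer where a null byte can be appended.

```c
#include <stdio.h>
#include <string.h>

// Hypothetical helper: copies an unterminated 5-byte array into out
// and appends the null byte that the original array has no room for.
void copy_terminated(char out[6]) {
	// Same contents as char s[5] = "Hello": five characters, no '\0'.
	char s[5] = {'H', 'e', 'l', 'l', 'o'};

	// A precision limits how many bytes %s reads, so this is safe
	// even though s is not null-terminated.
	printf("%.5s\n", s);

	memcpy(out, s, sizeof(s));
	out[sizeof(s)] = '\0';
}
```

Calling copy_terminated(out) with char out[6] leaves a proper string in out that can be printed with a plain %s.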

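Note that a variable declared as const char *s is a pointer, not an array, so sizeof(s) is the size of a pointer (8 bytes on a typical 64-bit system) no matter how long the string is.

That is to say, #define ARRAY_LENGTH(array) (sizeof((array))/sizeof((array)[0])) always evaluates to 8 on such a system when applied to a char pointer, not to the length of the string. Use strlen() to get the length of a string through a pointer.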
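The difference between an array and a pointer under sizeof can be sketched as follows. The three function names are hypothetical and exist only for illustration. The pointer size depends on the platform, so the code makes no assumption that it is exactly 8.

```c
#include <stddef.h>
#include <string.h>

#define ARRAY_LENGTH(array) (sizeof((array)) / sizeof((array)[0]))

// The array holds the characters themselves, so its length is known.
size_t length_of_array(void) {
	char s[] = "Hello";
	return ARRAY_LENGTH(s); // 5 characters + 1 null byte
}

// The pointer only holds an address, so sizeof measures the pointer.
size_t size_of_pointer(void) {
	const char *s = "Hello";
	return sizeof(s); // 8 on a typical 64-bit system
}

// strlen() counts characters up to the null byte and works for both.
size_t length_of_string(const char *s) {
	return strlen(s);
}
```

Note that an array passed to a function is converted to a pointer, so ARRAY_LENGTH only gives the array length in the scope where the array itself is declared.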

Shorthand for printing a string with printf: puts(s)

printf("%s\n", s);
puts(s);
// these lines are equivalent: puts() appends a newline after the string

Do NOT use scanf() for strings

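It is very hard to use it correctly. scanf("%s", s) does add a null byte to the end of the string, but it has no way to know the size of the array it is writing to, so input longer than the array overflows it. It also stops at the first whitespace character, so it cannot read a line that contains spaces. A field width such as %9s limits how many characters are stored, but it must be hard-coded to match the array size.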
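If scanf() has to be used for a string anyway, a field width is one way to bound the write. The following is a minimal sketch, not a recommendation. The helper name read_word and the 10-byte buffer size are assumptions chosen for illustration, and it takes a FILE * (via fscanf()) so that it can read from any stream.

```c
#include <stdio.h>

// Hypothetical helper: reads one whitespace-delimited word into buf,
// which must have room for at least 10 bytes.
// Returns 1 on success, or EOF if no word could be read.
int read_word(FILE *in, char buf[10]) {
	// %9s stores at most 9 characters, then adds the null byte.
	return fscanf(in, "%9s", buf);
}
```

read_word(stdin, buf) behaves like scanf("%9s", buf). Any characters of a longer word beyond the first 9 stay in the input and are picked up by the next read.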

The gets(s) function circumvents some issues with scanf, but it is unsafe

gets(s) reads all characters until a newline, including spaces. However, it has no size parameter, so input longer than the array is written beyond the memory we've allocated, which is known as a buffer overflow. This is a security vulnerability, and gets() was removed from the language in C11.

Use fgets() instead of gets() or scanf()

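fgets() reads at most size - 1 characters, stops after a newline (which it keeps in the string), and always adds a null byte to the end. Any extra characters are not discarded: they stay in the input stream for the next read.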

Syntax: char* fgets(char *str, int size, FILE *stream)
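A minimal sketch of reading a line with fgets() follows. The helper name read_line is hypothetical. It also removes the trailing newline that fgets() stores when the whole line fits in the buffer.

```c
#include <stdio.h>
#include <string.h>

// Hypothetical helper: reads one line into buf (at most size - 1 characters)
// and removes the trailing newline if one was stored.
// Returns buf, or NULL on end of file or error.
char *read_line(FILE *in, char *buf, int size) {
	if (fgets(buf, size, in) == NULL) {
		return NULL;
	}
	buf[strcspn(buf, "\n")] = '\0'; // overwrite '\n' (or the existing '\0')
	return buf;
}
```

For example, with char line[16], calling read_line(stdin, line, sizeof(line)) stores at most 15 characters of the line. If the input line is longer, the remainder is returned by the next call.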

Dynamically allocating memory for a string using getline()

ssize_t getline(char **bufferp, size_t *sizep, FILE *stream);
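getline() is a POSIX function, not part of standard C, so it may not be available on every system. When *bufferp is NULL and *sizep is 0, it allocates a buffer large enough for the whole line (growing it as needed) and returns the number of characters read, or -1 on end of file or error. The sketch below assumes a POSIX system, and the helper name read_any_line is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L // getline() is POSIX, not standard C
#include <stdio.h>
#include <stdlib.h>

// Hypothetical helper: returns a heap-allocated line without the trailing
// newline, or NULL on end of file or error. The caller must free() it.
char *read_any_line(FILE *in) {
	char *line = NULL; // NULL with size 0 asks getline() to allocate
	size_t size = 0;
	ssize_t len = getline(&line, &size, in);
	if (len == -1) {
		free(line); // getline() may allocate even when it fails
		return NULL;
	}
	if (len > 0 && line[len - 1] == '\n') {
		line[len - 1] = '\0';
	}
	return line;
}
```

A caller would use it as char *line = read_any_line(stdin); and call free(line) once the string is no longer needed.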