Strings and related functions (8)

Managing strings in C can be quite complicated. There are no classes in the language, so what I call a string is simply an array of characters. The standard library provides many functions to manage them.

We have already seen arrays. We have also seen that writing {'h', 'e', 'l', 'l', 'o', '\0'} and "hello" gives the same array of characters.

Say my name !

In order to create a string we just have to create a pointer/array of type char :

char* <identifier> = <string>
char <identifier>[] = <string>
char* name = "Linus";

Let's create a program printing this name !

#include <stdio.h>

int main(void) {
    char* name = "Linus";
    printf("%s\n", name);

    return 0;
}
Linus

Character 0 as the last

To mark the end of a string, instead of storing its size in a variable, the compiler helps us by appending a '\0' character to every string literal.

When we walk through the string, we know when we have to stop :

| 'L' | 'i' | 'n' | 'u' | 's' | '\0' |
  ⇑     ⇑     ⇑     ⇑     ⇑     ⇑
  ok    ok    ok    ok    ok    oh, stop !

Imagine printing each character one by one without knowing the string's length :

for (int i = 0; name[i] != '\0'; ++i) {
    printf("%c", name[i]);
}

The loop stops when it reaches the terminating character, which is always '\0'.

Are 0 and '\0' the same for strings ?

Yes, because characters are in fact small integers, and '\0' has the value 0. Sometimes you will see a plain 0 instead of the character '\0'.
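To make the terminator visible, here is a small sketch. The two helper names are made up for this illustration; they only inspect an array initialised with "Linus".

```c
#include <stddef.h>

/* Hypothetical helper: number of bytes occupied by the array.
   The five letters are followed by one terminator. */
size_t bytes_in_linus(void) {
    char name[] = "Linus";
    return sizeof name;
}

/* Hypothetical helper: returns 1 when the byte stored after the
   letters compares equal to both '\0' and 0, otherwise 0. */
int terminator_is_zero(void) {
    char name[] = "Linus";
    return name[5] == '\0' && name[5] == 0;
}
```

Called from main, bytes_in_linus() gives 6 and terminator_is_zero() gives 1: the array is one byte longer than the visible text, and that extra byte is the integer 0.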

Modifying a string

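Maybe you tried to do this, as we would for a simple variable :

char name[] = "Hubert Reeves";
name = "John Conway"; /* error: an array cannot be assigned */

But direct assignment does not work on an array ! With a char* the assignment compiles, but it only makes the pointer refer to another string; no character is copied. To change the characters themselves, we have to use a function from the standard library.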

For string manipulation, we include <string.h>.

#include <string.h>
#include <assert.h>

int main(void) {
    char name[] = "Hubert Reeves";
    assert(strcmp(name, "Hubert Reeves") == 0);

    strcpy(name, "John Conway");
    assert(strcmp(name, "John Conway") == 0);

    return 0;
}

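Here we used strcpy(char* dest, const char* src) to copy the new characters into name. The destination must be large enough to hold the copied string and its '\0'. Here "John Conway" is shorter than "Hubert Reeves", so it fits.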
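Since strcpy() does not check the size of its destination, one possible precaution is to compare lengths first. This is only a sketch, and copy_if_fits is a hypothetical helper, not a standard function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dest only when src and its
   terminator fit in dest_size bytes.
   Returns 1 when the copy was done, 0 when dest was left untouched. */
int copy_if_fits(char* dest, size_t dest_size, const char* src) {
    if (strlen(src) + 1 > dest_size) {
        return 0;
    }
    strcpy(dest, src);
    return 1;
}
```

With char name[] = "Hubert Reeves", calling copy_if_fits(name, sizeof name, "John Conway") performs the copy, while a longer text such as "Srinivasa Ramanujan" is refused.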

What's strcmp() ? Why don't we use name == <string> ?

The == operator does not look at the characters: applied to strings, it only compares their addresses. To compare the contents we have to use a function from the standard library, int strcmp(const char*, const char*). If it returns 0, both strings are the same.
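The difference can be sketched with two tiny functions. Both names are invented for this example.

```c
#include <string.h>

/* Hypothetical helper: 1 when both strings hold the same characters. */
int same_content(const char* a, const char* b) {
    return strcmp(a, b) == 0;
}

/* Hypothetical helper: 1 when both pointers refer to the same address.
   This is all the == operator checks. */
int same_address(const char* a, const char* b) {
    return a == b;
}
```

Given char first[] = "Linus" and char second[] = "Linus", same_content(first, second) is 1 but same_address(first, second) is 0, because the two arrays live at different places in memory.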

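Why is char <identifier>[] used and not char* <identifier> ?

There is a small but important difference between the two syntaxes. char* <identifier> = <string> points to a string literal, which must not be modified (doing so is undefined behavior). char <identifier>[] = <string> creates an array holding a copy of the characters, which we are free to change. So when you have to modify the string's value, use the brackets. Here, strcpy() changes the value, so we use the syntax with brackets.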
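As a small illustration, here is a function that changes one character in place. set_first_char is a hypothetical name; the point is that it may only be given a modifiable array, never a pointer to a string literal.

```c
/* Hypothetical helper: overwrites the first character of a modifiable
   string. An empty string is left unchanged so that the terminator
   is never replaced. */
void set_first_char(char* str, char c) {
    if (str[0] != '\0') {
        str[0] = c;
    }
}
```

After char name[] = "Linus"; set_first_char(name, 'M'); the array holds "Minus". Passing a variable declared as char* name = "Linus" would compile, but the write would land in a string literal, which is undefined behavior.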

We cannot use the addition operator to concatenate strings. But a function exists, provided the destination array has enough room for the result :

#include <string.h>
#include <assert.h>
#include <stdio.h>

int main(void) {
    char name[32] = "Hubert"; /* room for the appended text and its '\0' */
    char surname[] = " Reeves";

    strcat(name, surname);
    assert(strcmp(name, "Hubert Reeves") == 0);

    return 0;
}

The functions we have seen, along with a few others :

  • char* strcpy(char* dest, const char* src)

    Copies the string src into dest. Returns dest, which now holds the copy.
  • int strcmp(const char* str1, const char* str2)

    Compares two strings and returns 0 when they are the same.
  • size_t strlen(const char* str)

    Returns the string's length, not counting the final '\0'.
  • char* strcat(char* dest, const char* to_add)

    Appends the string to_add to the end of dest. Returns dest, which now holds the result.
  • char* strstr(const char* target, const char* to_search)

    Returns a pointer to the first occurrence of the substring to_search in target, or NULL when it is not found.

      #include <string.h>
      #include <stdio.h>
    
      int main(void) {
          char* text = "I love apples";
          char* to_search = "apple";
    
          if (strstr(text, to_search) == NULL) {
              return 1;
          } 
    
          printf("'%s' was found in '%s'\n", to_search, text);
    
          return 0;
      }
    
      'apple' was found in 'I love apples'
    
  • char* strchr(const char* target, int to_search)

    Works like strstr, but searches for the first occurrence of a single character.
  • int sprintf(char* target, const char* format, ...)

    Works like printf, but writes the formatted text into the string target instead of printing it. It is declared in <stdio.h>, and target must be large enough for the result.

      char text[256];
      char* fruit = "bananas";
      sprintf(text, "Do you like %s ?", fruit);
      printf("%s\n", text);
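
The functions listed above combine easily. As a sketch, here are two small helpers built on strchr(), strlen() and strcat(). Both names are hypothetical and nothing like them exists in <string.h>.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: index of the first occurrence of c in str,
   or -1 when c does not appear. */
long index_of(const char* str, char c) {
    const char* found = strchr(str, c);
    if (found == NULL) {
        return -1;
    }
    /* Subtracting two pointers into the same string gives a distance. */
    return (long)(found - str);
}

/* Hypothetical helper: appends to_add to dest only when the result,
   terminator included, fits in dest_size bytes.
   Returns 1 on success, 0 when dest was left untouched. */
int append_if_fits(char* dest, size_t dest_size, const char* to_add) {
    if (strlen(dest) + strlen(to_add) + 1 > dest_size) {
        return 0;
    }
    strcat(dest, to_add);
    return 1;
}
```

For instance, index_of("I love apples", 'l') is 2. With char name[16] = "Hubert", append_if_fits(name, sizeof name, " Reeves") succeeds, whereas the same call on a 7-byte array is refused instead of overflowing.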
    

Exercises

  1. Write a program that concatenates two command line arguments to create a name, then prints it.
  2. Rewrite the strlen() and strcmp() functions

    For those who have read the strcmp documentation page: here we don't care which string is greater or lower. We return 0 if the strings are equal, and 1 if they are not.


Solutions

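  1. Write a program that concatenates two command line arguments to create a name, then prints it.

     #include <string.h>
     #include <stdio.h>
    
     int main(int argc, char** argv) {
         char name[256];
    
         /* Two arguments are needed, and both must fit in name */
         if (argc < 3 || strlen(argv[1]) + strlen(argv[2]) + 2 > sizeof name) {
             fprintf(stderr, "Usage : program <first> <second>\n");
             return 1;
         }
    
         strcpy(name, argv[1]);
         strcat(name, " ");
         strcat(name, argv[2]);
    
         printf("Name : %s\n", name);
    
         return 0;
     }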
    

    Or

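     #include <stdio.h>
    
     int main(int argc, char** argv) {
         char name[256] = "";
    
         /* Two arguments are needed */
         if (argc < 3) {
             return 1;
         }
    
         /* snprintf is the bounded variant of sprintf:
            it never writes more than sizeof name bytes */
         snprintf(name, sizeof name, "%s %s", argv[1], argv[2]);
         printf("Name : %s\n", name);
    
         return 0;
     }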
    
  2. Rewrite the strlen() and strcmp() functions

     #include <stddef.h>
     #include <assert.h>
    
     /* Named my_* so they do not clash with the standard library functions */
     size_t my_strlen(const char* str) {
         size_t i = 0;
         for (; str[i] != '\0'; ++i);
         return i;
     }
    
     int my_strcmp(const char* str1, const char* str2) {
         size_t str1_length = my_strlen(str1);
         if (str1_length != my_strlen(str2)) {
             return 1;
         }
    
         for (size_t i = 0; i < str1_length; ++i) {
             if (str1[i] != str2[i]) {
                 return 1;
             }
         }
    
         return 0;
     }
    
     int main(void) {
         const char* str = "hello";
         assert(my_strlen(str) == 5);
         assert(my_strcmp(str, "hello") == 0);
         assert(my_strcmp(str, "help") == 1);
    
         return 0;
     }
    

    Your program can be different; that is not a problem as long as there is no memory error and it behaves the same way.