5.5. Sizeof and storage allocation

The sizeof operator returns the size in bytes of its operand. Whether the result of sizeof is unsigned int or unsigned long is implementation defined—which is why the declaration of malloc above ducked the issue by omitting any parameter information; normally you would use the stdlib.h header file to declare malloc correctly. Here is the last example done portably:

#include <stdlib.h>     /* declares malloc() */
float *fp;

fp = (float *)malloc(sizeof(float));

The operand of sizeof only has to be parenthesized if it's a type name, as it was in the example. If you are using the name of a data object instead, then the parentheses can be omitted, but they rarely are.

#include <stdlib.h>

int *ip, ar[100];
ip = (int *)malloc(sizeof ar);

In the last example, the array ar is an array of 100 ints; after the call to malloc (assuming that it was successful), ip will point to a region of store that can also be treated as an array of 100 ints.

The fundamental unit of storage in C is the char, and by definition

sizeof(char)

is equal to 1, so you could allocate space for an array of ten chars with

malloc(10)

while to allocate room for an array of ten ints, you would have to use

malloc(sizeof(int[10]))

If malloc can't find enough free space to satisfy a request it returns a null pointer to indicate failure. For historical reasons, the stdio.h header file contains a defined constant called NULL which is traditionally used to check the return value from malloc and some other library functions. An explicit 0 or (void *)0 could equally well be used.

As a first illustration of the use of malloc, here's a program which reads up to MAXSTRING strings from its input and sorts them into alphabetical order using the library strcmp routine. The strings are terminated by a ‘\n’ character. The sort is done by keeping an array of pointers to the strings and simply exchanging the pointers until the order is correct. This saves having to copy the strings themselves, which improves the efficiency somewhat.

The example is done first using fixed size arrays, then another version uses malloc and allocates space for the strings at run time. Unfortunately, the array of pointers is still fixed in size: a better solution would use a linked list or similar data structure to store the pointers and would have no fixed arrays at all. At the moment, we haven't seen how to do that.

The overall structure is this:

while(number of strings read < MAXSTRING
      && input still remains){

              read next string;
}
sort array of pointers;
print array of pointers;
exit;

A number of functions are used to implement this program:

char *next_string(char *destination)

Read a line of characters terminated by ‘\n’ from the program's input. The first MAXLEN-1 characters are written into the array pointed to by destination.

If the first character read is EOF, return a null pointer, otherwise return the address of the start of the string (destination). On return, destination always points to a null-terminated string.

void sort_arr(const char *p_array[])

p_array is an array of pointers to characters. The array can be arbitrarily long; its end is indicated by the first element containing a null pointer.

sort_arr rearranges the pointers so that they point to strings which are in alphabetical order when the array is traversed in index order.

void print_arr(const char *p_array[])

Like sort_arr, but prints the strings in index order.

It will help to understand the examples if you remember that in an expression, an array's name is converted to the address of its first element. Similarly, for a two-dimensional array (such as strings below), the expression strings[1][2] has type char, but strings[1] has type ‘array of char’ and is therefore converted to the address of its first element: it is equivalent to &strings[1][0].

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXSTRING       50      /* max no. of strings */
#define MAXLEN          80      /* max length of strings */

void print_arr(const char *p_array[]);
void sort_arr(const char *p_array[]);
char *next_string(char *destination);

int main(void){
      /*
       * leave room for null at end; const matches the
       * parameter type of sort_arr and print_arr
       */
      const char *p_array[MAXSTRING+1];

      /* storage for strings */
      char strings[MAXSTRING][MAXLEN];

      /* count of strings read */
      int nstrings;

      nstrings = 0;
      while(nstrings < MAXSTRING &&
              next_string(strings[nstrings]) != 0){

              p_array[nstrings] = strings[nstrings];
              nstrings++;
      }
      /* terminate p_array */
      p_array[nstrings] = 0;

      sort_arr(p_array);
      print_arr(p_array);
      exit(EXIT_SUCCESS);
}

void print_arr(const char *p_array[]){
      int index;
      for(index = 0; p_array[index] != 0; index++)
              printf("%s\n", p_array[index]);
}


void sort_arr(const char *p_array[]){
      int comp_val, low_index, hi_index;
      const char *tmp;

      for(low_index = 0;
              p_array[low_index] != 0 &&
                              p_array[low_index+1] != 0;
                      low_index++){

              for(hi_index = low_index+1;
                      p_array[hi_index] != 0;
                              hi_index++){

                      comp_val=strcmp(p_array[hi_index],
                              p_array[low_index]);
                      if(comp_val >= 0)
                              continue;
                      /* swap strings */
                      tmp = p_array[hi_index];
                      p_array[hi_index] = p_array[low_index];
                      p_array[low_index] = tmp;
              }
      }
}



char *next_string(char *destination){
      char *cp;
      int c;

      cp = destination;
      while((c = getchar()) != '\n' && c != EOF){
              if(cp-destination < MAXLEN-1)
                      *cp++ = c;
      }
      *cp = 0;
      if(c == EOF && cp == destination)
              return(0);
      return(destination);
}
Example 5.10

It is no accident that next_string returns a pointer. We can now dispense with the strings array by getting next_string to allocate its own storage.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXSTRING       50      /* max no. of strings */
#define MAXLEN          80      /* max length of strings */

void print_arr(const char *p_array[]);
void sort_arr(const char *p_array[]);
char *next_string(void);

int main(void){
      /* const matches the parameter type of sort_arr and print_arr */
      const char *p_array[MAXSTRING+1];
      int nstrings;

      nstrings = 0;
      while(nstrings < MAXSTRING &&
              (p_array[nstrings] = next_string()) != 0){

              nstrings++;
      }
      /* terminate p_array */
      p_array[nstrings] = 0;

      sort_arr(p_array);
      print_arr(p_array);
      exit(EXIT_SUCCESS);
}

void print_arr(const char *p_array[]){
      int index;
      for(index = 0; p_array[index] != 0; index++)
              printf("%s\n", p_array[index]);
}


void sort_arr(const char *p_array[]){
      int comp_val, low_index, hi_index;
      const char *tmp;

      for(low_index = 0;
              p_array[low_index] != 0 &&
                      p_array[low_index+1] != 0;
                      low_index++){

              for(hi_index = low_index+1;
                      p_array[hi_index] != 0;
                              hi_index++){

                      comp_val=strcmp(p_array[hi_index],
                              p_array[low_index]);
                      if(comp_val >= 0)
                              continue;
                      /* swap strings */
                      tmp = p_array[hi_index];
                      p_array[hi_index] = p_array[low_index];
                      p_array[low_index] = tmp;
              }
      }
}

char *next_string(void){
      char *cp, *destination;
      int c;

      destination = (char *)malloc(MAXLEN);
      if(destination != 0){
              cp = destination;
              while((c = getchar()) != '\n' && c != EOF){
                      if(cp-destination < MAXLEN-1)
                              *cp++ = c;
              }
              *cp = 0;
              if(c == EOF && cp == destination){
                      /* nothing was read: release the unused storage */
                      free(destination);
                      return(0);
              }
      }
      return(destination);
}
Example 5.11

Finally, for the extremely brave, here is the whole thing with even p_array allocated using malloc. Further, most of the array indexing is rewritten to use pointer notation. If you are feeling queasy, skip this example. It is hard. One word of explanation: char **p means a pointer to a pointer to a character. Many C programmers find this hard to deal with.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXSTRING       50      /* max no. of strings */
#define MAXLEN          80      /* max length of strings */

void print_arr(const char **p_array);
void sort_arr(const char **p_array);
char *next_string(void);

int main(void){
      /* const matches the parameter type of sort_arr and print_arr */
      const char **p_array;
      int nstrings;   /* count of strings read */

      p_array = (const char **)malloc(
                      sizeof(const char *[MAXSTRING+1]));
      if(p_array == 0){
              printf("No memory\n");
              exit(EXIT_FAILURE);
      }

      nstrings = 0;
      while(nstrings < MAXSTRING &&
              (p_array[nstrings] = next_string()) != 0){

              nstrings++;
      }
      /* terminate p_array */
      p_array[nstrings] = 0;

      sort_arr(p_array);
      print_arr(p_array);
      exit(EXIT_SUCCESS);
}

void print_arr(const char **p_array){
      while(*p_array)
              printf("%s\n", *p_array++);
}


void sort_arr(const char **p_array){
      const char **lo_p, **hi_p, *tmp;

      for(lo_p = p_array;
              *lo_p != 0 && *(lo_p+1) != 0;
                                      lo_p++){
              for(hi_p = lo_p+1; *hi_p != 0; hi_p++){

                      if(strcmp(*hi_p, *lo_p) >= 0)
                              continue;
                      /* swap strings */
                      tmp = *hi_p;
                      *hi_p = *lo_p;
                      *lo_p = tmp;
              }
      }
}



char *next_string(void){
      char *cp, *destination;
      int c;

      destination = (char *)malloc(MAXLEN);
      if(destination != 0){
              cp = destination;
              while((c = getchar()) != '\n' && c != EOF){
                      if(cp-destination < MAXLEN-1)
                              *cp++ = c;
              }
              *cp = 0;
              if(c == EOF && cp == destination){
                      /* nothing was read: release the unused storage */
                      free(destination);
                      return(0);
              }
      }
      return(destination);
}
Example 5.12

To further illustrate the use of malloc, another example program follows which can cope with arbitrarily long strings. It simply reads strings from its standard input, looking for a newline character to mark the end of the string, then prints the string on its standard output. It stops when it detects end-of-file. The characters are put into an array, the end of the string being indicated (as always) by a zero. The newline is not stored, but used to detect when a full line of input should be printed on the output. The program doesn't know how long the string will be, so it starts by allocating ten characters—enough for a short string.

When the string outgrows its current allocation, malloc is called to allocate room for the current string plus ten more characters. The current characters are copied into the new space, the old storage is released, and the program continues using the new storage.

To release storage allocated by malloc, the library function free is used. If you don't release storage when it isn't needed any more, it just hangs around taking up space. Using free allows it to be ‘given away’, or at least re-used later.

The program reports errors by using fprintf, a close cousin of printf. The only difference between them is that fprintf takes an additional first argument which indicates where its output should go. There are two constants of the right type for this purpose defined in stdio.h. Using stdout indicates that the program's standard output is to be used; stderr refers to the program's standard error stream. On some systems both may be the same, but other systems do make the distinction.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define GROW_BY 10      /* string grows by 10 chars */

int main(void){
      char *str_p, *next_p, *tmp_p;
      int ch, need, chars_read;

      if(GROW_BY < 2){
              fprintf(stderr,
                      "Growth constant too small\n");
              exit(EXIT_FAILURE);
      }

      str_p = (char *)malloc(GROW_BY);
      if(str_p == NULL){
              fprintf(stderr,"No initial store\n");
              exit(EXIT_FAILURE);
      }

      next_p = str_p;
      chars_read = 0;
      while((ch = getchar()) != EOF){
              /*
               * Completely restart at each new line.
               * There will always be room for the
               * terminating zero in the string,
               * because of the check further down,
               * unless GROW_BY is less than 2,
               * and that has already been checked.
               */
              if(ch == '\n'){
                      /* indicate end of line */
                      *next_p = 0;
                      printf("%s\n", str_p);
                      free(str_p);
                      chars_read = 0;
                      str_p = (char *)malloc(GROW_BY);
                      if(str_p == NULL){
                              fprintf(stderr,"No initial store\n");
                              exit(EXIT_FAILURE);
                      }
                      next_p = str_p;
                      continue;
              }
              /*
               * Have we reached the end of the current
               * allocation ?
               */
              if(chars_read == GROW_BY-1){
                      *next_p = 0;    /* mark end of string */
                      /*
                       * use pointer subtraction
                       * to find length of
                       * current string.
                       */
                      need = next_p - str_p +1;
                      tmp_p = (char *)malloc(need+GROW_BY);
                      if(tmp_p == NULL){
                              fprintf(stderr,"No more store\n");
                              exit(EXIT_FAILURE);
                      }
                      /*
                       * Copy the string using library.
                       */
                      strcpy(tmp_p, str_p);
                      free(str_p);
                      str_p = tmp_p;
                      /*
                       * and reset next_p, character count
                       */
                      next_p = str_p + need-1;
                      chars_read = 0;
              }
              /*
               * Put character at end of current string.
               */
              *next_p++ = ch;
              chars_read++;
      }
      /*
       * EOF - but do unprinted characters exist?
       */
      if(next_p != str_p){
              *next_p = 0;
              fprintf(stderr,"Incomplete last line\n");
              printf("%s\n", str_p);
      }
      exit(EXIT_SUCCESS);
}
Example 5.13

That may not be a particularly realistic example of how to handle arbitrarily long strings—for one thing, the maximum storage demand is twice the amount needed for the longest string—but it does actually work. It also costs rather a lot in terms of copying around. Both problems could be reduced by using the library realloc function instead.

A more sophisticated method might use a linked list, implemented with the use of structures, as described in the next chapter. That would have its drawbacks too though, because then the standard library routines wouldn't work for a different method of storing strings.

5.5.1. What sizeof can't do

One common mistake made by beginners is shown below:

#include <stdio.h>
#include <stdlib.h>

const char arr[] = "hello";
const char *cp = arr;
int main(void){

      printf("Size of arr %lu\n", (unsigned long)
                      sizeof(arr));
      printf("Size of *cp %lu\n", (unsigned long)
                      sizeof(*cp));
      exit(EXIT_SUCCESS);
}
Example 5.14

The numbers printed will not be the same. The first will, correctly, identify the size of arr as 6: five characters followed by a null. The second one will always, on every system, print 1. That's because the type of *cp is const char, which can only have a size of 1, whereas the type of arr is different: array of const char. The confusion arises because the operand of sizeof is one of the few places where an array is not converted into a pointer first. It is never possible, using sizeof, to find out how long an array a pointer points to; you must have a genuine array name instead.

5.5.2. The type of sizeof

Now comes the question of just what this does:

sizeof ( sizeof (anything legal) )

That is to say, what type does the result of sizeof have? The answer is that it is implementation defined, and will be either unsigned long or unsigned int, depending on your implementation. There are two safe things to do: either always cast the result to unsigned long, as the examples have done, or use the defined type size_t provided in the <stddef.h> header file. For example:

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

int main(void){
      size_t sz;
      sz = sizeof(sz);
      printf("size of sizeof is %lu\n",
              (unsigned long)sz);
      exit(EXIT_SUCCESS);
}
Example 5.15