5.5. Sizeof and storage allocation
The sizeof operator returns the size in bytes of its operand. Whether the result of sizeof is unsigned int or unsigned long is implementation defined, which is why the declaration of malloc above ducked the issue by omitting any parameter information; normally you would use the stdlib.h header file to declare malloc correctly. Here is the last example done portably:
    #include <stdlib.h>     /* declares malloc() */

    float *fp;

    fp = (float *)malloc(sizeof(float));
The operand of sizeof only has to be parenthesized if it's a type name, as it was in the example. If you are using the name of a data object instead, then the parentheses can be omitted, but they rarely are.
    #include <stdlib.h>

    int *ip, ar[100];

    ip = (int *)malloc(sizeof ar);
In the last example, the array ar is an array of 100 ints; after the call to malloc (assuming that it was successful), ip will point to a region of store that can also be treated as an array of 100 ints.
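The two operand forms can be compared directly. The following is only a sketch: the macro ARRAY_LEN and the three function names are invented for illustration and are not part of any library. It casts each result to unsigned long, since the exact type of a sizeof result varies between implementations.

```c
/* Illustrative sketch; ARRAY_LEN and the function names are hypothetical. */

/* Element count of a genuine array: total bytes divided by bytes per element. */
#define ARRAY_LEN(a) (sizeof (a) / sizeof (a)[0])

static int ar[100];

/* Operand is a data object, so the parentheses may be omitted. */
unsigned long ar_bytes(void)
{
    return (unsigned long) sizeof ar;
}

/* Operand is a type name, so the parentheses are required. */
unsigned long ar_bytes_from_type(void)
{
    return (unsigned long) sizeof(int[100]);
}

/* 100 on every implementation, whatever sizeof(int) happens to be. */
unsigned long ar_count(void)
{
    return (unsigned long) ARRAY_LEN(ar);
}
```

Both byte counts are equal, and both are 100 times sizeof(int); only the element count is the same on every system.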
The fundamental unit of storage in C is the char, and by definition sizeof(char) is equal to 1, so you could allocate space for an array of ten chars with

    malloc(10)

while to allocate room for an array of ten ints, you would have to use

    malloc(sizeof(int[10]))
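As a small sketch of the second form in use, the function below (make_squares is a made-up name, not a library routine) allocates room for ten ints and fills them in. The size sizeof(int[10]) is the same as 10 * sizeof(int); which spelling to use is a matter of taste.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: allocate an array of ten ints and store
 * the squares 0, 1, 4, ... 81 in it. Returns a null pointer if
 * malloc fails. The caller is responsible for the storage.
 */
int *make_squares(void)
{
    int *ip;
    int i;

    ip = (int *)malloc(sizeof(int[10]));
    if(ip == 0)
        return(0);

    for(i = 0; i < 10; i++)
        ip[i] = i * i;      /* index it just like an array */

    return(ip);
}
```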
If malloc can't find enough free space to satisfy a request it returns a null pointer to indicate failure. For historical reasons, the stdio.h header file contains a defined constant called NULL which is traditionally used to check the return value from malloc and some other library functions. An explicit 0 or (void *)0 could equally well be used.
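A minimal sketch of that check follows. The function copy_string is hypothetical, written only to show the pattern: call malloc, test the result against NULL, and use the storage only if the test passes.

```c
#include <stdio.h>      /* one of the headers that defines NULL */
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a freshly allocated copy of src,
 * or a null pointer if malloc could not find the space.
 */
char *copy_string(const char *src)
{
    char *dest;

    /* one extra char for the terminating null */
    dest = (char *)malloc(strlen(src) + 1);
    if(dest == NULL)
        return(NULL);   /* report the failure to the caller */

    strcpy(dest, src);
    return(dest);
}
```

Writing the test as dest == 0 would behave identically; NULL simply makes the intention more obvious to a reader.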
As a first illustration of the use of malloc, here's a program which reads up to MAXSTRING strings from its input and sorts them into alphabetical order using the library strcmp routine. The strings are terminated by a ‘\n’ character. The sort is done by keeping an array of pointers to the strings and simply exchanging the pointers until the order is correct. This saves having to copy the strings themselves, which improves the efficiency somewhat.
The example is done first using fixed size arrays, then another version uses malloc and allocates space for the strings at run time. Unfortunately, the array of pointers is still fixed in size: a better solution would use a linked list or similar data structure to store the pointers and would have no fixed arrays at all. At the moment, we haven't seen how to do that.
The overall structure is this:
    while(number of strings read < MAXSTRING
            && input still remains){

        read next string;
    }
    sort array of pointers;
    print array of pointers;
    exit;
A number of functions are used to implement this program:
char *next_string(char *destination)
- Read a line of characters terminated by ‘\n’ from the program's input. The first MAXLEN-1 characters are written into the array pointed to by destination. If the first character read is EOF, return a null pointer, otherwise return the address of the start of the string (destination). On return, destination always points to a null-terminated string.
void sort_arr(const char *p_array[])
- p_array[] is an array of pointers to characters. The array can be arbitrarily long; its end is indicated by the first element containing a null pointer. sort_arr sorts the pointers so that they point to strings which are in alphabetical order when the array is traversed in index order.
void print_arr(const char *p_array[])
- Like sort_arr, but prints the strings in index order.
It will help to understand the examples if you remember that in an expression, an array's name is converted to the address of its first element. Similarly, for a two-dimensional array (such as strings below), the expression strings[1][2] has type char, but strings[1] has type ‘array of char’ which is therefore converted to the address of the first element: it is equivalent to &strings[1][0].
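That conversion can be checked with a few lines of code. This is only a sketch; the array contents, the sizes NROWS and ROWLEN, and both function names are invented for the purpose.

```c
#define NROWS   3
#define ROWLEN  8

static char strings[NROWS][ROWLEN] = { "zero", "one", "two" };

/*
 * strings[row] has type 'array of char', so in this assignment
 * it is converted to a pointer to its first element.
 * Returns 1 if that pointer equals &strings[row][0].
 */
int row_is_first_element_address(int row)
{
    char *p;

    p = strings[row];
    return(p == &strings[row][0]);
}

/* strings[row][2] is a single char: no conversion involved. */
char third_char_of_row(int row)
{
    return(strings[row][2]);
}
```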
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define MAXSTRING       50      /* max no. of strings */
    #define MAXLEN          80      /* max length of strings */

    void print_arr(const char *p_array[]);
    void sort_arr(const char *p_array[]);
    char *next_string(char *destination);

    int main(void){
        /* leave room for null at end */
        const char *p_array[MAXSTRING+1];

        /* storage for strings */
        char strings[MAXSTRING][MAXLEN];

        /* count of strings read */
        int nstrings;

        nstrings = 0;
        while(nstrings < MAXSTRING &&
                next_string(strings[nstrings]) != 0){

            p_array[nstrings] = strings[nstrings];
            nstrings++;
        }
        /* terminate p_array */
        p_array[nstrings] = 0;

        sort_arr(p_array);
        print_arr(p_array);
        exit(EXIT_SUCCESS);
    }

    void print_arr(const char *p_array[]){
        int index;

        for(index = 0; p_array[index] != 0; index++)
            printf("%s\n", p_array[index]);
    }

    void sort_arr(const char *p_array[]){
        int comp_val, low_index, hi_index;
        const char *tmp;

        for(low_index = 0;
                p_array[low_index] != 0 &&
                    p_array[low_index+1] != 0;
                low_index++){

            for(hi_index = low_index+1;
                    p_array[hi_index] != 0;
                    hi_index++){

                comp_val=strcmp(p_array[hi_index],
                        p_array[low_index]);
                if(comp_val >= 0)
                    continue;
                /* swap strings */
                tmp = p_array[hi_index];
                p_array[hi_index] = p_array[low_index];
                p_array[low_index] = tmp;
            }
        }
    }

    char *next_string(char *destination){
        char *cp;
        int c;

        cp = destination;
        while((c = getchar()) != '\n' && c != EOF){
            if(cp-destination < MAXLEN-1)
                *cp++ = c;
        }
        *cp = 0;
        if(c == EOF && cp == destination)
            return(0);
        return(destination);
    }

Example 5.10
It is no accident that next_string returns a pointer. We can now dispense with the strings array by getting next_string to allocate its own storage.
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define MAXSTRING       50      /* max no. of strings */
    #define MAXLEN          80      /* max length of strings */

    void print_arr(const char *p_array[]);
    void sort_arr(const char *p_array[]);
    char *next_string(void);

    int main(void){
        const char *p_array[MAXSTRING+1];
        int nstrings;

        nstrings = 0;
        while(nstrings < MAXSTRING &&
                (p_array[nstrings] = next_string()) != 0){

            nstrings++;
        }
        /* terminate p_array */
        p_array[nstrings] = 0;

        sort_arr(p_array);
        print_arr(p_array);
        exit(EXIT_SUCCESS);
    }

    void print_arr(const char *p_array[]){
        int index;

        for(index = 0; p_array[index] != 0; index++)
            printf("%s\n", p_array[index]);
    }

    void sort_arr(const char *p_array[]){
        int comp_val, low_index, hi_index;
        const char *tmp;

        for(low_index = 0;
                p_array[low_index] != 0 &&
                    p_array[low_index+1] != 0;
                low_index++){

            for(hi_index = low_index+1;
                    p_array[hi_index] != 0;
                    hi_index++){

                comp_val=strcmp(p_array[hi_index],
                        p_array[low_index]);
                if(comp_val >= 0)
                    continue;
                /* swap strings */
                tmp = p_array[hi_index];
                p_array[hi_index] = p_array[low_index];
                p_array[low_index] = tmp;
            }
        }
    }

    char *next_string(void){
        char *cp, *destination;
        int c;

        destination = (char *)malloc(MAXLEN);
        if(destination != 0){
            cp = destination;
            while((c = getchar()) != '\n' && c != EOF){
                if(cp-destination < MAXLEN-1)
                    *cp++ = c;
            }
            *cp = 0;
            if(c == EOF && cp == destination)
                return(0);
        }
        return(destination);
    }

Example 5.11
Finally, for the extremely brave, here is the whole thing with even p_array allocated using malloc. Further, most of the array indexing is rewritten to use pointer notation. If you are feeling queasy, skip this example. It is hard. One word of explanation: char **p means a pointer to a pointer to a character. Many C programmers find this hard to deal with.
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define MAXSTRING       50      /* max no. of strings */
    #define MAXLEN          80      /* max length of strings */

    void print_arr(const char **p_array);
    void sort_arr(const char **p_array);
    char *next_string(void);

    int main(void){
        const char **p_array;
        int nstrings;   /* count of strings read */

        p_array = (const char **)malloc(
                sizeof(char *[MAXSTRING+1]));
        if(p_array == 0){
            printf("No memory\n");
            exit(EXIT_FAILURE);
        }

        nstrings = 0;
        while(nstrings < MAXSTRING &&
                (p_array[nstrings] = next_string()) != 0){

            nstrings++;
        }
        /* terminate p_array */
        p_array[nstrings] = 0;

        sort_arr(p_array);
        print_arr(p_array);
        exit(EXIT_SUCCESS);
    }

    void print_arr(const char **p_array){
        while(*p_array)
            printf("%s\n", *p_array++);
    }

    void sort_arr(const char **p_array){
        const char **lo_p, **hi_p, *tmp;

        for(lo_p = p_array;
                *lo_p != 0 && *(lo_p+1) != 0;
                lo_p++){

            for(hi_p = lo_p+1; *hi_p != 0; hi_p++){

                if(strcmp(*hi_p, *lo_p) >= 0)
                    continue;
                /* swap strings */
                tmp = *hi_p;
                *hi_p = *lo_p;
                *lo_p = tmp;
            }
        }
    }

    char *next_string(void){
        char *cp, *destination;
        int c;

        destination = (char *)malloc(MAXLEN);
        if(destination != 0){
            cp = destination;
            while((c = getchar()) != '\n' && c != EOF){
                if(cp-destination < MAXLEN-1)
                    *cp++ = c;
            }
            *cp = 0;
            if(c == EOF && cp == destination)
                return(0);
        }
        return(destination);
    }

Example 5.12
To further illustrate the use of malloc, another example program follows which can cope with arbitrarily long strings. It simply reads strings from its standard input, looking for a newline character to mark the end of the string, then prints the string on its standard output. It stops when it detects end-of-file. The characters are put into an array, the end of the string being indicated (as always) by a zero. The newline is not stored, but used to detect when a full line of input should be printed on the output. The program doesn't know how long the string will be, so it starts by allocating ten characters, enough for a short string.
If the string is more than ten characters long, malloc is called to allocate room for the current string plus ten more characters. The current characters are copied into the new space, the old storage previously allocated is released and the program continues using the new storage.
To release storage allocated by malloc, the library function free is used. If you don't release storage when it isn't needed any more, it just hangs around taking up space. Using free allows it to be ‘given away’, or at least re-used later.
The program reports errors by using fprintf, a close cousin of printf. The only difference between them is that fprintf takes an additional first argument which indicates where its output should go. There are two constants of the right type for this purpose defined in stdio.h. Using stdout indicates that the program's standard output is to be used; stderr refers to the program's standard error stream. On some systems both may be the same, but other systems do make the distinction.
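The extra first argument is easy to see in isolation. In this sketch, report is a made-up wrapper, not a library function; it passes whichever stream it is given straight through to fprintf.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write msg followed by a newline to the
 * stream 'where'. Returns whatever fprintf returns: the number
 * of characters written, or a negative value on error.
 */
int report(FILE *where, const char *msg)
{
    return(fprintf(where, "%s\n", msg));
}
```

Calling report(stdout, "done") behaves like printf("%s\n", "done"), while report(stderr, "No more store") sends the message to the standard error stream instead.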
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define GROW_BY 10      /* string grows by 10 chars */

    int main(void){
        char *str_p, *next_p, *tmp_p;
        int ch, need, chars_read;

        if(GROW_BY < 2){
            fprintf(stderr,
                "Growth constant too small\n");
            exit(EXIT_FAILURE);
        }

        str_p = (char *)malloc(GROW_BY);
        if(str_p == NULL){
            fprintf(stderr,"No initial store\n");
            exit(EXIT_FAILURE);
        }

        next_p = str_p;
        chars_read = 0;
        while((ch = getchar()) != EOF){
            /*
             * Completely restart at each new line.
             * There will always be room for the
             * terminating zero in the string,
             * because of the check further down,
             * unless GROW_BY is less than 2,
             * and that has already been checked.
             */
            if(ch == '\n'){
                /* indicate end of line */
                *next_p = 0;
                printf("%s\n", str_p);
                free(str_p);
                chars_read = 0;
                str_p = (char *)malloc(GROW_BY);
                if(str_p == NULL){
                    fprintf(stderr,"No initial store\n");
                    exit(EXIT_FAILURE);
                }
                next_p = str_p;
                continue;
            }
            /*
             * Have we reached the end of the current
             * allocation ?
             */
            if(chars_read == GROW_BY-1){
                *next_p = 0;    /* mark end of string */
                /*
                 * use pointer subtraction
                 * to find length of
                 * current string.
                 */
                need = next_p - str_p +1;
                tmp_p = (char *)malloc(need+GROW_BY);
                if(tmp_p == NULL){
                    fprintf(stderr,"No more store\n");
                    exit(EXIT_FAILURE);
                }
                /*
                 * Copy the string using library.
                 */
                strcpy(tmp_p, str_p);
                free(str_p);
                str_p = tmp_p;
                /*
                 * and reset next_p, character count
                 */
                next_p = str_p + need-1;
                chars_read = 0;
            }
            /*
             * Put character at end of current string.
             */
            *next_p++ = ch;
            chars_read++;
        }
        /*
         * EOF - but do unprinted characters exist?
         */
        if(str_p - next_p){
            *next_p = 0;
            fprintf(stderr,"Incomplete last line\n");
            printf("%s\n", str_p);
        }
        exit(EXIT_SUCCESS);
    }

Example 5.13
That may not be a particularly realistic example of how to handle arbitrarily long strings (for one thing, the maximum storage demand is twice the amount needed for the longest string) but it does actually work. It also costs rather a lot in terms of copying around. Both problems could be reduced by using the library realloc function instead.
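One way the growth step might look with realloc is sketched below. This is an assumption about how the idea could be applied, not a rewrite of Example 5.13: append_char and its parameters are invented names. The library's realloc takes care of any copying and of releasing the old storage; calling it with a null pointer behaves like malloc.

```c
#include <stdlib.h>

#define GROW_BY 10      /* string grows by 10 chars */

/*
 * Hypothetical helper: append ch to the string at *str_p,
 * keeping it null terminated. *len is the current string
 * length and *cap the number of chars allocated; start with
 * *str_p == NULL and both counts zero.
 * Returns 1 on success, 0 if realloc fails, in which case the
 * existing string is left untouched and still allocated.
 */
int append_char(char **str_p, unsigned long *len,
        unsigned long *cap, int ch)
{
    char *tmp_p;

    /* need room for ch and the terminating zero */
    if(*len + 1 >= *cap){
        tmp_p = (char *)realloc(*str_p, *cap + GROW_BY);
        if(tmp_p == NULL)
            return(0);
        *str_p = tmp_p;
        *cap += GROW_BY;
    }
    (*str_p)[(*len)++] = ch;
    (*str_p)[*len] = 0;
    return(1);
}
```

The result of realloc is stored in a temporary first: assigning it straight to *str_p would lose the only pointer to the old storage if the call failed.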
A more sophisticated method might use a linked list, implemented with the use of structures, as described in the next chapter. That would have its drawbacks too though, because then the standard library routines wouldn't work for a different method of storing strings.
5.5.1. What sizeof can't do
One common mistake made by beginners is shown below:
    #include <stdio.h>
    #include <stdlib.h>

    const char arr[] = "hello";
    const char *cp = arr;

    int main(void){
        printf("Size of arr %lu\n", (unsigned long)
                sizeof(arr));
        printf("Size of *cp %lu\n", (unsigned long)
                sizeof(*cp));
        exit(EXIT_SUCCESS);
    }

Example 5.14
The numbers printed will not be the same. The first will, correctly, identify the size of arr as 6; five characters followed by a null. The second one will always, on every system, print 1. That's because the type of *cp is const char, which can only have a size of 1, whereas the type of arr is different: array of const char. The confusion arises because this is one of the few places where the use of an array is not converted into a pointer first (the operand of the unary & operator is another). It is never possible, using sizeof, to find out how long an array a pointer points to; you must have a genuine array name instead.
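When only a pointer is available, the length has to come from somewhere else. For strings, the library strlen function finds it at run time by looking for the terminating null; for other arrays the usual approach is to pass the element count alongside the pointer. The sketch below contrasts the three measurements; the function names are invented for illustration.

```c
#include <string.h>

static const char arr[] = "hello";

/* A genuine array name: sizeof sees all 6 chars, null included. */
unsigned long array_size(void)
{
    return (unsigned long) sizeof(arr);
}

/* Only a pointer: sizeof(*cp) is the size of one char, always 1. */
unsigned long pointee_size(const char *cp)
{
    return (unsigned long) sizeof(*cp);
}

/* strlen counts up to, but not including, the null at run time. */
unsigned long string_length(const char *cp)
{
    return (unsigned long) strlen(cp);
}
```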
5.5.2. The type of sizeof
Now comes the question of just what this does:
sizeof ( sizeof (anything legal) )
That is to say, what type does the result of sizeof have? The answer is that it is implementation defined, and will be either unsigned long or unsigned int, depending on your implementation. There are two safe things to do: either always cast the result to unsigned long, as the examples have done, or use the defined type size_t provided in the <stddef.h> header file. For example: