6.7. Initialization

Now that we have seen all of the data types supported by C, we can look at the subject of initialization. C allows ordinary variables, structures, unions and arrays to be given initial values in their definitions. Old C had some strange rules about this, reflecting an unwillingness by compiler writers to work too hard. The Standard has rationalized this, and now it is possible to initialize things as and when you want.

There are basically two sorts of initialization: at compile time, and at run time. Which one you get depends on the storage duration of the thing being initialized.

Objects with static duration are declared either outside functions, or inside them with the keyword extern or static as part of the declaration. These can only be initialized at compile time.

Any other object has automatic duration, and can only be initialized at run time. The two categories are mutually exclusive.

Although they are related, storage duration and linkage (see Chapter 4) are different and should not be confused.

Compile-time initialization can only be done using constant expressions; run-time initialization can be done using any expression at all. The Old C restriction, that only simple variables (not arrays, structures or unions) could be initialized at run time, has been lifted.
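As a minimal sketch of the difference, the fragment below gives two static objects constant-expression initializers and then initializes automatic objects from arbitrary expressions. The names LIMIT, table_size, size_ptr and scaled_length are invented for illustration only.

```c
#include <string.h>

#define LIMIT 10

/* Static duration: each initializer must be a constant expression. */
static int table_size = LIMIT * 2;
static int *size_ptr = &table_size;     /* an address constant */

/* Hypothetical helper: its automatic objects are initialized at run time,
   so any expression will do, including a function call. */
int scaled_length(const char *text, int factor)
{
      int len = (int)strlen(text);
      int result = len * factor + *size_ptr;

      return result;
}
```

Replacing the initializer of table_size with a function call would be rejected, because the compiler must be able to work out the value of a static object's initializer before the program runs.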

6.7.1. Constant expressions

There are a number of places where constant expressions must be used. The definition of what constitutes a constant expression is relatively simple.

A constant expression is evaluated by the compiler, not at run time, and may be used anywhere that a constant may be used. It may not contain any assignment, increment or decrement operations, function calls or comma operators, unless they appear within the operand of sizeof. That exception may seem odd, but sizeof only needs to determine the type of its operand, not its value.

If real numbers are evaluated at compile-time, then the Standard insists that they are evaluated with at least as much precision and range as will be used at run-time.

A more restricted form, called the integral constant expression, also exists. This has integral type and only involves operands that are integer constants, enumeration constants, character constants, sizeof expressions and real constants that are the immediate operands of casts. Any cast operators are only allowed to convert arithmetic types to integral types. As with the previous note on sizeof expressions, since they don't have to be evaluated, just their type determined, no restrictions apply to their contents.

The arithmetic constant expression is like the integral constant expression, but allows real constants to be used and restricts the use of casts to converting one arithmetic type to another.

The address constant is a pointer to an object that has static storage duration or a pointer to a function. You can get these by using the & operator or through the usual conversions of array and function names into pointers when they are used in expressions. The operators [], ., ->, & (address of) and * (pointer dereference) as well as casts of pointers can all be used in the expression as long as they don't involve accessing the value of any object.
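The following sketch puts one of each kind of constant expression to work. All of the names (ROWS, COLS, grid, classify and so on) are made up for the illustration; the small accessor functions exist only so that the values can be inspected.

```c
enum { ROWS = 3, COLS = 4 };

/* Integral constant expression used as an array size. */
static int grid[ROWS * COLS];

/* Still an integral constant expression: the real constant is the
   immediate operand of a cast to an integral type. */
static int truncated = (int)2.75;

/* Arithmetic constant expression: real constants are allowed. */
static double half_rows = ROWS / 2.0;

/* Address constants: & applied to a static object, and an array name
   plus an integral constant expression. */
static int *third = &grid[2];
static int *next_row = grid + COLS;

int classify(int n)
{
      switch (n) {
      case ROWS * COLS:       /* case labels need integral constant expressions */
              return 1;
      case sizeof(char):      /* sizeof(char) is always 1 */
              return 2;
      default:
              return 0;
      }
}

int truncated_value(void) { return truncated; }
double half_rows_value(void) { return half_rows; }
long third_offset(void) { return (long)(third - grid); }
long next_row_offset(void) { return (long)(next_row - grid); }
```

Note that &grid[2] uses both [] and &, yet never reads the value of any element of grid, which is why it still qualifies as an address constant.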

6.7.2. More initialization

The various kinds of constant expression are permitted in different places. Integral constant expressions are particularly important because they are the only kind of expression that may be used to specify the size of an array or the value of a case label. Initializers are less restricted; you are allowed to use: arithmetic constant expressions; null pointer constants or address constants; an address constant for an object plus or minus an integral constant expression. Of course, whether a particular kind of constant expression is appropriate depends on the type of the thing being initialized.

Here is an example using several initialized variables:

#include <stdio.h>
#include <stdlib.h>

#define NMONTHS 12

int month = 0;

short month_days[] =
      {31,28,31,30,31,30,31,31,30,31,30,31};

char *mnames[] ={
      "January", "February",
      "March", "April",
      "May", "June",
      "July", "August",
      "September", "October",
      "November", "December"
};

int main(void){

      int day_count = month;

      for(day_count = month; day_count < NMONTHS;
              day_count++){
              printf("%d days in %s\n",
                      month_days[day_count],
                      mnames[day_count]);
      }
      exit(EXIT_SUCCESS);
}
Example 6.14

Initializing ordinary variables is easy: put = expression after the variable name in a declaration, and the variable is initialized to the value of the expression. As with all objects, whether you can use any expression, or just a constant expression, depends on its storage duration.

Initializing arrays is easy for one-dimensional arrays. Just put a list of the values you want, separated by commas, inside curly brackets. The example shows how to do it. If you don't give a size for the array, then the number of initializers will determine the size. If you do give a size, then there must be at most that many initializers in the list. Too many is an error; too few will explicitly initialize only the first elements of the array, and the remaining elements are then initialized implicitly, as if they had static duration, which means to zero.
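A short sketch of both cases follows: an array whose size comes from its initializer list, and a sized array with fewer initializers than elements. The function names are hypothetical, and the sketch relies on the Standard's rule that elements without an initializer in a partly initialized array are set to zero.

```c
#include <stddef.h>

/* No size given: the twelve initializers make this an array of 12. */
static short month_days[] =
      {31,28,31,30,31,30,31,31,30,31,30,31};

/* The usual idiom for recovering the number of elements. */
size_t month_count(void)
{
      return sizeof month_days / sizeof month_days[0];
}

int days_in_year(void)
{
      int total = 0;
      size_t i;

      for(i = 0; i < month_count(); i++)
              total += month_days[i];
      return total;
}

int sum_partial(void)
{
      /* Size given, too few initializers: partial[2] to partial[4]
         are set to zero. */
      int partial[5] = {1, 2};
      int total = 0;
      size_t i;

      for(i = 0; i < 5; i++)
              total += partial[i];
      return total;
}
```

Here month_count() gives 12 and sum_partial() gives 3, since only the first two elements of partial are non-zero.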

You could build up a string like this:

char str[] = {'h', 'e', 'l', 'l', 'o', 0};

but because it is so often necessary to do that, it is also permitted to use a quoted string literal to initialize an array of chars:

char str[] = "hello";

In that case, the null at the end of the string will also be included if there is room, or if no size was specified. Here are examples:

/* no room for the null */
char str[5] = "hello";

/* room for the null */
char str[6] = "hello";
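The effect on the size and contents of the array can be checked with a small sketch like this one. The array and function names are invented for the illustration; note that the array called exact holds five letters and no null, so it must not be passed to functions such as strlen that expect a string.

```c
#include <string.h>

static char unsized[] = "hello";  /* six elements: five letters and the null */
static char exact[5] = "hello";   /* five letters, no room for the null */
static char roomy[8] = "hello";   /* null included, the rest set to zero too */

size_t unsized_size(void) { return sizeof unsized; }
size_t unsized_length(void) { return strlen(unsized); }

/* Compare exactly five characters; exact is not a string. */
int exact_holds_letters(void)
{
      return memcmp(exact, "hello", 5) == 0;
}

int roomy_tail_is_zero(void)
{
      return roomy[5] == 0 && roomy[6] == 0 && roomy[7] == 0;
}
```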

The example program used string literals for a different purpose: there they were being used to initialize an array of character pointers; a very different prospect.

For structures that have automatic duration, an expression of the right type can be used to initialize them, or else a bracketed list of constant expressions must be used:

#include <stdio.h>
#include <stdlib.h>

struct s{
      int a;
      char b;
      char *cp;
}ex_s = {
      1, 'a', "hello"
      };

int main(void){
      struct s first = ex_s;
      struct s second = {
              2, 'b', "byebye"
              };

      exit(EXIT_SUCCESS);
}
Example 6.15

Only the first member of a union can be initialized.
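A brief sketch of what that means in practice: the single initializer is converted to the type of whichever member is declared first. The union tags and function names here are hypothetical.

```c
union int_first{
      int i;
      double d;
};

union real_first{
      double d;
      int i;
};

/* The initializer always goes to the first member. */
static union int_first n = { 42 };    /* sets n.i */
static union real_first r = { 42 };   /* sets r.d; 42 is converted to double */

int first_int(void) { return n.i; }
double first_real(void) { return r.d; }
```

The order in which the members are declared therefore decides which one an initializer can reach.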

If a structure or union contains unnamed members, whether unnamed bitfields or padding for alignment, they are ignored in the initialization process; they don't have to be counted when you provide the initializers for the real members of the structure.

For objects that contain sub-objects within them, there are two ways of writing the initializer. It can be written out with an initializer for each member:

struct s{
      int a;
      struct ss{
              int c;
              char d;
      }e;
}x[] = {
      1, 2, 'a',
      3, 4, 'b'
      };
Example 6.16

which will assign 1 to x[0].a, 2 to x[0].e.c, 'a' to x[0].e.d, 3 to x[1].a, and so on.

It is much safer to use internal braces to show what you mean, or one missed value will cause havoc.

struct s{
      int a;
      struct ss{
              int c;
              char d;
      }e;
}x[] = {
      {1, {2, 'a'}},
      {3, {4, 'b'}}
      };
Example 6.17

Always fully bracket initializers—that is much the safest thing to do.
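To see the kind of havoc a missed value can cause, here is a sketch in which the same slip (forgetting the 'a') is made in both styles. The names are invented for the illustration, and a compiler may well warn about the missing braces in the first array.

```c
struct inner{
      int c;
      char d;
};

struct outer{
      int a;
      struct inner e;
};

/* 'a' forgotten, no internal braces: every later value slides along
   by one member, so flat[0].e.d gets 3 and flat[1].a gets 4. */
static struct outer flat[] = {
      1, 2,
      3, 4, 'b'
      };

/* The same slip with full bracketing: only braced[0].e.d is affected,
   and it is simply set to zero. */
static struct outer braced[] = {
      {1, {2}},
      {3, {4, 'b'}}
      };

int flat_first_d(void) { return flat[0].e.d; }
int flat_second_a(void) { return flat[1].a; }
int flat_second_c(void) { return flat[1].e.c; }
int braced_first_d(void) { return braced[0].e.d; }
int braced_second_a(void) { return braced[1].a; }
int braced_second_c(void) { return braced[1].e.c; }
```

With the braces in place, the damage is confined to the one member whose value was left out.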

It is the same for arrays as for structures:

float y[4][3] = {
      {1, 3, 5},      /* y[0][0], y[0][1], y[0][2] */
      {2, 4, 6},      /* y[1][0], y[1][1], y[1][2] */
      {3, 5, 7}       /* y[2][0], y[2][1], y[2][2] */
};
Example 6.18

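that gives full initialization to the first three rows of y. The fourth row, y[3], has no explicit initializers, so its elements are initialized implicitly to zero, in the same way as objects with static duration.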

Unless they have an explicit initializer, all objects with static duration are given implicit initializers—the effect is as if the constant 0 had been assigned to their components. This is in fact widely used—it is an assumption made by most C programs that external objects and internal static objects start with the value zero.
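The sketch below relies on exactly that assumption. None of the objects has an explicit initializer, and all of the names are hypothetical.

```c
#include <stddef.h>

static int counter;                     /* starts at 0 */
static double ratios[3];                /* every element starts at 0.0 */
static char *names[2];                  /* every element is a null pointer */
static struct { int a; float f; } pair; /* every member starts at zero */

int statics_start_at_zero(void)
{
      return counter == 0
              && ratios[0] == 0.0 && ratios[1] == 0.0 && ratios[2] == 0.0
              && names[0] == NULL && names[1] == NULL
              && pair.a == 0 && pair.f == 0.0f;
}

/* An internal static object: initialized to zero once, before the
   program starts, and it keeps its value between calls. */
int next_id(void)
{
      static int id;

      return ++id;
}
```

The first call of next_id() returns 1 and the second returns 2, because id is not re-initialized each time the function is entered.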

Initialization of objects with automatic duration is only guaranteed if their compound statement is entered ‘at the top’. Jumping into the middle of one may result in the initialization not happening—this is often undesirable and should be avoided. It is explicitly noted by the Standard with regard to switch statements, where providing initializers in declarations cannot be of any use; this is because a declaration is not linguistically a ‘statement’ and only statements may be labelled. As a result it is not possible for initializers in switch statements ever to be executed, because the entry to the block containing them must be below the declarations!
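A small sketch of the switch case, using a hypothetical function called describe: the declaration of scratch is in scope throughout the body of the switch, but control always jumps straight to a case label, so the initializer is never executed and the variable must be assigned before it is used. Some compilers warn about the unreachable initializer.

```c
int describe(int n)
{
      switch(n){
              int scratch = 99;       /* jumped over: 99 is never stored */
      case 0:
              scratch = 10;           /* assign before use */
              return scratch;
      default:
              scratch = 20;
              return scratch + n;
      }
}
```

Declaring scratch before the switch statement, in the enclosing block, is the straightforward way to get a working initializer.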

A declaration inside a function (block scope) can, using various techniques outlined in Chapter 4 and Chapter 8, be made to refer to an object that has either external or internal linkage. If you've managed to do that, and it's not likely to happen by accident, then you can't initialize the object as part of that declaration. Here is one way of trying it:

int x;                        /* external linkage */
int main(void){
      extern int x = 5;       /* forbidden */
}

Our test compiler didn't notice that one, either.
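For comparison, here is a sketch of the permitted arrangement, with hypothetical names: the initializer belongs to the definition at file scope, and the block-scope declaration merely refers to that object.

```c
int shared_total = 5;           /* external linkage, initialized here */

int read_shared_total(void)
{
      extern int shared_total;  /* refers to the object above; no initializer */

      return shared_total;
}
```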