5.2. Arrays

Like other languages, C uses arrays as a way of describing a collection of variables with identical properties. The group has a single name for all of the members, with the individual members being selected by an index. Here's an array being declared:

double ar[100];

The name of the array is ar and its members are accessed as ar[0] through to ar[99] inclusive, as Figure 5.1 shows.

Diagram showing an array consisting of elements labelled 'ar[0]', 'ar[1]', etc., up to 'ar[99]'.
Figure 5.1. 100 element array

Each of the hundred members is a separate variable whose type is double. Without exception, all arrays in C are numbered from 0 up to one less than the bound given in the declaration. This is a prime cause of surprise to beginners—watch out for it. For simple examples of the use of arrays, look back at earlier chapters where several problems are solved with their help.

One important point about array declarations is that they don't permit the use of varying subscripts. The numbers given must be constant expressions which can be evaluated at compile time, not run time. For example, this function incorrectly tries to use its argument in the size of an array declaration:

void f(int x){
      char var_sized_array[x];        /* FORBIDDEN */
}

It's forbidden because the value of x is unknown when the program is compiled; it's a run-time, not a compile-time, value.

To tell the truth, it would be easy to support arrays whose first dimension is variable, but neither Old C nor the Standard permits it, although we do know of one Very Old C compiler that used to do it.

5.2.1. Multidimensional arrays

Multidimensional arrays can be declared like this:

int three_dee[5][4][2];
int t_d[2][3];

The use of the brackets gives a clue to what is going on. If you refer to the precedence table given in Section 2.8.3 (Table 2.9), you'll see that [] associates left to right and that, as a result, the first declaration gives us a five-element array called three_dee. The members of that array are each a four-element array whose members are each an array of two ints. We have declared arrays of arrays, as Figure 5.2 shows for two dimensions.

Diagram showing a two dimensional array, with the 'outer' array having two elements labelled 't_d[0]' and 't_d[1]', each with three elements within it, labelled 't_d[0][0]', etc.
Figure 5.2. Two-dimensional array, showing layout

In the diagram, you will notice that t_d[0] is one element, immediately followed by t_d[1] (there is no break). It so happens that both of those elements are themselves arrays of three integers. Because of C's storage layout rules, t_d[1][0] is immediately after t_d[0][2]. It would be possible (but very poor practice) to access t_d[1][0] by making use of the lack of array-bound checking in C, and to use the expression t_d[0][3]. That is not recommended—apart from anything else, if the declaration of t_d ever changes, then the results will be likely to surprise you.

That's all very well, but does it really matter in practice? Not much, it's true; but it is interesting to note that in terms of actual machine storage layout the rightmost subscript ‘varies fastest’. This has an impact when arrays are accessed via pointers. Otherwise, they can be used just as would be expected; expressions like these are quite in order:

three_dee[1][3][1] = 0;
three_dee[4][3][1] += 2;

The second of those is interesting for two reasons. First, it accesses the very last member of the entire array—although the subscripts were declared to be [5][4][2], the highest usable subscript is always one less than the one used in the declaration. Second, it shows where the combined assignment operators are a real blessing. For the experienced C programmer it is much easier to tell that only one array member is being accessed, and that it is being incremented by two. Other languages would have to express it like this:

three_dee[4][3][1] = three_dee[4][3][1] + 2;

It takes a conscious effort to check that the same array member is being referenced on both sides of the assignment. It makes things easier for the compiler too: there is only one array indexing calculation to do, and this is likely to result in shorter, faster code. (Of course a clever compiler would notice that the left- and right-hand sides look alike and would be able to generate equally efficient code, but not all compilers are clever and there are lots of special cases where even clever compilers are unable to make use of the information.)

It may be of interest to know that although C offers support for multidimensional arrays, they aren't seen particularly often in practice. One-dimensional arrays are present in most programs, if for no other reason than that's what strings are. Two-dimensional arrays are seen occasionally, and arrays of higher order than that are most uncommon. One of the reasons is that the array is a rather inflexible data structure, and the ease of building and manipulating other types of data structures in C means that they tend to replace arrays in the more advanced programs. We will see more of this when we look at pointers.