8.2. Declarations, Definitions and Accessibility
Chapter 4 introduced the concepts of scope and linkage, showing how they can be combined to control the accessibility of things throughout a program. We deliberately gave a vague description of exactly what constitutes a definition on the grounds that it would give you more pain than gain at that stage. Eventually it has to be spelled out in detail, which we do in this chapter. Just to make things interesting, we need to throw in storage class too.
You'll probably find the interactions between these various elements to be both complex and confusing: that's because they are! We try to eliminate some of the confusion and give some useful rules of thumb in Section 8.2.5 below—but to understand them, you still need to read the stuff in between at least once.
For a full understanding, you need a good grasp of three distinct but related concepts. The Standard calls them:
- duration
- scope
- linkage
and describes what they mean in a fairly readable way (for a standard). Scope and linkage have already been described in Chapter 4, although we do present a review of them below.
8.2.1. Storage class specifiers
There are five keywords under the category of storage class specifiers, although one of them, typedef, is there more out of convenience than utility; it has its own section later since it doesn't really belong here. The ones remaining are auto, extern, register, and static.
Storage class specifiers help you to specify the type of storage used for data objects. Only one storage class specifier is permitted in a declaration (this makes sense, as there is only one way of storing things) and if you omit the storage class specifier in a declaration, a default is chosen. The default depends on whether the declaration is made outside a function (external declarations) or inside a function (internal declarations). For external declarations the default storage class specifier will be extern and for internal declarations it will be auto. The only exception to this rule is the declaration of functions, whose default storage class specifier is always extern.
The positioning of a declaration, the storage class specifiers used (or their defaults) and, in some cases, preceding declarations of the same name, can all affect the linkage of a name, although fortunately not its scope or duration. We will investigate the easier items first.
8.2.1.1. Duration
The duration of an object describes whether its storage is allocated once only, at program start-up, or is more transient in its nature, being allocated and freed as necessary.
There are only two types of duration of objects: static duration and automatic duration. Static duration means that the object has its storage allocated permanently; automatic duration means that the storage is allocated and freed as necessary. It's easy to tell which is which: you only get automatic duration if
- the declaration is inside a function
- and the declaration does not contain the static or extern keywords
- and the declaration is not the declaration of a function
(if you work through the rules, you'll find that the formal parameters of a function always meet all three requirements—they are always ‘automatic’).
Although the presence of static in a declaration unambiguously ensures that it has static duration, it's interesting to see that it is by no means the only way. This is a notorious source of confusion, but we just have to accept it.
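As a small sketch of that point, here is an object that has static duration without the word static appearing anywhere, next to an automatic one. The names call_count, record_call and fresh_local are invented for this illustration and are not part of the text.

```c
/* File scope, no storage class specifier: static duration all the same. */
int call_count;

/* The counter keeps its value from one call to the next. */
int record_call(void)
{
    call_count++;
    return call_count;
}

/* 'local' is automatic: it is created and initialized afresh on each call. */
int fresh_local(void)
{
    int local = 0;

    local++;
    return local;
}
```

Successive calls of record_call keep counting upwards, whereas fresh_local starts again from zero each time.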
Data objects declared inside functions are given the default storage class specifier of auto unless some other storage class specifier is used. In the vast majority of cases, you don't want these objects to be accessible from outside the function, so you want them to have no linkage. Either the default, auto, or the explicit register storage class specifier results in an object with no linkage and automatic duration. Neither auto nor register can be applied to a declaration that occurs outside a function.
The register storage class is quite interesting, although it is tending to fall into disuse nowadays. It suggests to the compiler that it would be a good idea to store the object in one or more hardware registers in the interests of speed. The compiler does not have to take any notice of this, but to make things easy for it, register variables do not have an address (the & address-of operator is forbidden) because some computers don't support the idea of addressable registers. Declaring too many register objects may slow the program down, rather than speed it up, because the compiler may either have to save more registers on entrance to a function, often a slow process, or there won't be enough registers remaining to be used for intermediate calculations. Determining when to use registers will be a machine-specific choice and should only be taken when detailed measurements show that a particular function needs to be speeded up. Then you will have to experiment. In our opinion, you should never declare register variables during program development. Get the program working first, then measure it, then, maybe, judicious use of registers will give a useful increase in performance. But that work will have to be repeated for every type of processor you move the program to; even within one family of processors the characteristics are often different.
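For the record, this is roughly what a register declaration of an ordinary local variable looks like. It is only a sketch: the function name sum_to is made up for the illustration, and whether the hint has any effect at all depends on the compiler.

```c
/* Sum the integers 1..n; 'i' and 'total' are merely suggested for registers. */
int sum_to(int n)
{
    register int i;
    register int total = 0;

    for (i = 1; i <= n; i++)
        total += i;

    /*
     * Writing   int *p = &i;   here would be rejected, because the
     * address-of operator may not be applied to a register object.
     */
    return total;
}
```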
A final note on register variables: this is the only storage class specifier that may be used in a function prototype or function definition. In a function prototype, the storage class specifier is simply ignored; in a function definition it is a hint that the actual parameter should be stored in a register if possible. This example shows how it might be used:
```c
#include <stdio.h>
#include <stdlib.h>

void func(register int arg1, double arg2);

int main(void){
        func(5, 2);
        exit(EXIT_SUCCESS);
}

/*
 * Function illustrating that formal parameters
 * may be declared to have register storage class.
 */
void func(register int arg1, double arg2){

        /*
         * Illustrative only - nobody would do this
         * in this context.
         * Cannot take address of arg1, even if you want to
         */
        double *fp = &arg2;

        while(arg1){
                printf("res = %f\n", arg1 * (*fp));
                arg1--;
        }
}
```
Example 8.1
So, the duration of an object depends on the storage class specifier used, whether it's a data object or function, and the position (block or file scope) of the declaration concerned. The linkage is also dependent on the storage class specifier, what kind of object it is and the scope of the declaration. Table 8.1 and Table 8.2 show the resulting storage duration and apparent linkage for the various combinations of storage class specifiers and location of the declaration. The actual linkage of objects with static duration is a bit more complicated, so use these tables only as a guide to the simple cases and take a look at what we say later about definitions.
Storage Class Specifier | Function or Data Object | Linkage | Duration |
---|---|---|---|
static | either | internal | static |
extern | either | probably external | static |
none | function | probably external | static |
none | data object | external | static |

Table 8.1. External declarations (outside a function)
The table above omits the register and auto storage class specifiers because they are not permitted in file-scope (external) declarations.
Storage Class Specifier | Function or Data Object | Linkage | Duration |
---|---|---|---|
register | data object only | none | automatic |
auto | data object only | none | automatic |
static | data object only | none | static |
extern | either | probably external | static |
none | data object | none | automatic |
none | function | probably external | static |

Table 8.2. Internal declarations (inside a function)
Internal static variables retain their values between calls of the function that contains them, which is useful in certain circumstances (see Chapter 4).
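A minimal sketch of such a variable follows. The function next_id is a hypothetical name chosen for the illustration; it hands out a new number on each call.

```c
/* Hands out 1, 2, 3, ... on successive calls. */
int next_id(void)
{
    /*
     * Static duration but no linkage: the object is invisible
     * outside this function, starts at 0 and keeps its value
     * between calls.
     */
    static int last;

    last++;
    return last;
}
```

Removing the static keyword would make last an automatic variable with an indeterminate initial value, so the function would no longer be meaningful.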
8.2.2. Scope
Now we must look again at the scope of the names of objects, which defines when and where a given name has a particular meaning. The different types of scope are the following:
- function scope
- file scope
- block scope
- function prototype scope
The easiest is function scope. This only applies to labels, whose names are visible throughout the function where they are declared, irrespective of the block structure. No two labels in the same function may have the same name, but because the name only has function scope, the same name can be used for labels in every function. Labels are not objects—they have no storage associated with them and the concepts of linkage and duration have no meaning for them.
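The following sketch shows both properties of labels. The two functions and the label name found are invented for the illustration, and both functions assume that they are given positive arguments.

```c
/* Returns the smallest multiple of step in 1..limit, or -1 if there is none. */
int first_multiple(int step, int limit)
{
    int i;

    for (i = 1; i <= limit; i++) {
        if (i % step == 0)
            goto found;     /* label is visible from inside the nested block */
    }
    return -1;
found:
    return i;
}

/* Returns the first square above lower among 1*1..limit*limit, or -1. */
int first_square_above(int lower, int limit)
{
    int i;

    for (i = 1; i <= limit; i++) {
        if (i * i > lower)
            goto found;     /* same label name, different function: no clash */
    }
    return -1;
found:
    return i * i;
}
```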
Any name declared outside a function has file scope, which means that the name is usable at any point from the declaration on to the end of the source code file containing the declaration. Of course it is possible for these names to be temporarily hidden by declarations within compound statements. As we know, function definitions must be outside other functions, so the name introduced by any function definition will always have file scope.
A name declared inside a compound statement, or as a formal parameter to a function, has block scope and is usable up to the end of the associated } which closes the compound statement. Any declaration of a name within a compound statement hides any outer declaration of the same name until the end of the compound statement.
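Hiding can be seen in a short sketch like this one, where the same name is declared at file scope and in two nested blocks. The names value, file_value and shadow_demo are assumptions made for the example.

```c
int value = 10;             /* file scope */

/* No inner declaration is visible here, so this is the file-scope object. */
int file_value(void)
{
    return value;
}

int shadow_demo(void)
{
    int value = 20;         /* hides the file-scope value */
    int seen;

    {
        int value = 30;     /* hides the 20 until the closing brace */
        seen = value;       /* 30 */
    }
    return seen + value;    /* 30 + 20: the outer block's value is back */
}
```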
A special and rather trivial example of scope is function prototype scope where a declaration of a name extends only to the end of the function prototype. That means simply that this is wrong (same name used twice):
void func(int i, int i);
and this is all right:
void func(int i, int j);
The names declared inside the parentheses disappear outside them.
The scope of a name is completely independent of any storage class specifier that may be used in its declaration.
8.2.3. Linkage
We will briefly review the subject of linkage here, too.
Linkage is used to determine what makes the same name declared in different scopes refer to the same thing. An object only ever has one name, but in many cases we would like to be able to refer to the same object from different scopes. A typical example is the wish to be able to call printf from several different places in a program, even if those places are not all in the same source file.
The Standard warns that declarations which refer to the same thing must all have compatible type, or the behaviour of the program will be undefined. A full description of compatible type is given later; for the moment you can take it to mean that, except for the use of the storage class specifier, the declarations must be identical. It's the responsibility of the programmer to get this right, though there will probably be tools available to help you check this out.
The three different types of linkage are:
- external linkage
- internal linkage
- no linkage
In an entire program, built up perhaps from a number of source files and libraries, if a name has external linkage, then every instance of that name refers to the same object throughout the program.
For something which has internal linkage, it is only within a given source code file that instances of the same name will refer to the same thing.
Finally, names with no linkage refer to separate things.
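Although linkage matters most across several source files, the idea can be sketched in one. In the fragment below (all names are invented for the illustration) the declarations of shared_total in different scopes all refer to one object, while the two scratch variables are unrelated despite sharing a name.

```c
int shared_total;               /* external linkage */

int add_to_total(int n)
{
    extern int shared_total;    /* refers to the file-scope object above */

    shared_total += n;
    return shared_total;
}

int read_total(void)
{
    extern int shared_total;    /* the same object again */

    return shared_total;
}

int scratch_a(void)
{
    int scratch = 1;            /* no linkage */

    return scratch;
}

int scratch_b(void)
{
    int scratch = 2;            /* no linkage: a separate object */

    return scratch;
}
```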
8.2.4. Linkage and definitions
Every data object or function that is actually used in a program (except as the operand of a sizeof operator) must have one and only one corresponding definition. This is actually very important, although we haven't really come across it yet because most of our examples have used only data objects with automatic duration, whose declarations are axiomatically definitions, or functions which we have defined by providing their bodies.
This ‘exactly one’ rule means that for objects with external linkage there must be exactly one definition in the whole program; for things with internal linkage (confined to one source code file) there must be exactly one definition in the file where it is declared; for things with no linkage, whose declaration is always a definition, there is exactly one definition as well.
Now we try to draw everything together. The real questions are
- How do I get the sort of linkage that I want?
- What actually constitutes a definition?
We need to look into linkage first, then definitions.
How do you get the appropriate linkage for a particular name? The rules are a little complicated.
1. A declaration outside a function (file scope) which contains the static storage class specifier results in internal linkage for that name. (The Standard requires that function declarations which contain static must be at file scope, outside any block.)
2. If a declaration contains the extern storage class specifier, or is the declaration of a function with no storage class specifier (or both), then:
   - If there is already a visible declaration of that identifier with file scope, the resulting linkage is the same as that of the visible declaration;
   - otherwise the result is external linkage.
3. If a file scope declaration is neither the declaration of a function nor contains an explicit storage class specifier, then the result is external linkage.
4. Any other form of declaration results in no linkage.
5. In any one source code file, if a given identifier has both internal and external linkage then the result is undefined.
These rules were used to derive the ‘linkage’ columns of Table 8.1 and Table 8.2, without the full application of rule 2—hence the use of the ‘probably external’ term. Rule 2 allows you to determine the precise linkage in those cases.
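Rule 2 is the one that most often surprises people, so here is a sketch of it at work. The identifiers hidden, read_hidden and doubled are invented for the example.

```c
static int hidden = 7;      /* rule 1: internal linkage */
extern int hidden;          /* rule 2: a declaration is already visible,
                               so the linkage stays internal */

static int doubled(void);   /* rule 1: internal linkage */

int read_hidden(void)       /* rule 2, nothing visible before: external */
{
    return hidden;
}

int doubled(void)           /* rule 2: takes the internal linkage of the
                               declaration above, despite having no static */
{
    return hidden * 2;
}

/*
 * The opposite order would fall foul of rule 5:
 *
 *     extern int clash;
 *     static int clash;
 *
 * since clash would then have both external and internal linkage.
 */
```

In this file only read_hidden can be called by name from other source files; hidden and doubled are private to it.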
What makes a declaration into a definition?
- Declarations that result in no linkage are also definitions.
- Declarations that include an initializer are always definitions; this includes the ‘initialization’ of functions by providing their body. Declarations with block scope may only have initializers if they also have no linkage.
- Otherwise, the declaration of a name with file scope and with either no storage class specifier or with the static storage class specifier is a tentative definition. If a source code file contains one or more tentative definitions for an object, then if that file contains no actual definitions, a default definition is provided for that object as if it had an initializer of 0. (Structures and arrays have all their elements initialized to 0.) Functions do not have tentative definitions.
A consequence of the foregoing is that unless you also provide an initializer, declarations that explicitly include the extern storage class specifier do not result in a definition.
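The different cases can be put side by side in a short sketch. All of the names used here (tally, slots, limit, sum_defaults) are assumptions made for the illustration.

```c
int tally;              /* tentative definition */
int tally;              /* a second tentative definition: allowed */
extern int tally;       /* declaration only: extern and no initializer */

static int slots[3];    /* tentative definition, internal linkage */

int limit = 5;          /* initializer present: an actual definition */

/*
 * Neither tally nor slots has an actual definition in this file,
 * so each is treated as if it had been initialized to 0.
 */
int sum_defaults(void)
{
    return tally + slots[0] + slots[1] + slots[2];
}
```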
8.2.5. Realistic use of linkage and definitions
The rules that determine the linkage and definition associated with declarations look quite complicated. The combinations used in practice are nothing like as bad; so let's investigate the usual cases.
The three types of accessibility that you will want of data objects or functions are:
- throughout the entire program,
- restricted to one source file,
- restricted to one function (or perhaps a single compound statement).
For the three cases above, you will want external linkage, internal linkage, and no linkage respectively. The recommended practice for the first two cases is to declare all of the names in each of the relevant source files before you define any functions. The recommended layout of a source file would be as shown in Figure 8.1.
The external linkage declarations would be prefixed with extern, the internal linkage declarations with static. Here's an example.
```c
/* example of a single source file layout */
#include <stdio.h>

/* Things with external linkage:
 * accessible throughout program.
 * These are declarations, not definitions, so
 * we assume their definition is somewhere else.
 */
extern int important_variable;
extern int library_func(double, int);

/*
 * Definitions with external linkage.
 */
extern int ext_int_def = 0;     /* explicit definition */
int tent_ext_int_def;           /* tentative definition */

/*
 * Things with internal linkage:
 * only accessible inside this file.
 * The use of static means that they are also
 * tentative definitions.
 */
static int less_important_variable;
static struct{
        int member_1;
        int member_2;
}local_struct;

/*
 * Also with internal linkage, but not a tentative
 * definition because this is a function.
 */
static void lf(void);

/*
 * Definition with internal linkage.
 */
static float int_link_f_def = 5.3;

/*
 * Finally definitions of functions within this file
 */

/*
 * This function has external linkage and can be called
 * from anywhere in the program.
 */
void f1(int a){}

/*
 * These two functions can only be invoked by name from
 * within this file.
 */
static int local_function(int a1, int a2){
        return(a1 * a2);
}

static void lf(void){
        /*
         * A static variable with no linkage,
         * so usable only within this function.
         * Also a definition (because of no linkage)
         */
        static int count;
        /*
         * Automatic variable with no linkage but
         * an initializer
         */
        int i = 1;

        printf("lf called for time no %d\n", ++count);
}

/*
 * Actual definitions are implicitly provided for
 * all remaining tentative definitions at the end of
 * the file
 */
```
Example 8.2
We suggest that you re-read the preceding sections to see how the rules have been applied in Example 8.2.