2.3. The Textual Structure of Programs

2.3.1. Program Layout

The examples so far have used the sort of indentation and line layout that is common in languages belonging to the same family as C. They are ‘free format’ languages and you are expected to use that freedom to lay the program out in a way that enhances its readability and highlights its logical structure. Space (including horizontal tab) characters can be used for indentation anywhere except inside identifiers, keywords and other tokens, such as strings and multi-character operators, without any effect on the meaning of the program. New lines work in the same way as space and tab except on preprocessor command lines, which have a line-by-line structure.

If a line is getting too long for comfort there are two things you can do. Generally it will be possible to replace one of the spaces by a newline and simply use two lines instead, as this example shows.

/* a long line */
a = fred + bill * ((this / that) * sqrt(3.14159));
/* the same line */
a = fred + bill *
        ((this / that) *
        sqrt(3.14159));
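
As a rough check that layout alone does not change meaning, the sketch below writes one expression twice, once on a single line and once folded over three. The function and parameter names are hypothetical, and integer arithmetic stands in for the sqrt call so that no maths library is needed.

```c
/* Hypothetical helper: the whole expression on one line. */
int on_one_line(int fred, int bill, int num, int den)
{
    return fred + bill * ((num / den) * 7);
}

/* The same expression with two of its spaces replaced by newlines. */
int on_three_lines(int fred, int bill, int num, int den)
{
    return fred + bill *
            ((num / den) *
            7);
}
```

Called with the same arguments, the two functions should return the same value, since the compiler sees an identical sequence of tokens in each.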

If you're unlucky it may not be possible to break the lines like that. The preprocessor suffers most from the problem, because of its reliance on single-line ‘statements’. To help, it's useful to know that the sequence ‘backslash newline’ becomes invisible to the C translation system. As a result, the sequence is valid even in unusual places such as the middle of identifiers, keywords, strings and so on. Only trigraphs are processed before this step.

/*
 * Example of the use of line joining
 */
#define IMPORTANT_BUT_LONG_PREPROCESSOR_TEXT \
printf("this is effectively all ");\
printf("on a single line ");\
printf("because of line-joining\n");
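
The sketch below illustrates how thoroughly ‘backslash newline’ disappears. The macro and function names are made up for the illustration; the macro is continued over three lines, and an identifier is deliberately split in the middle, which is legal but not something to imitate in real code.

```c
/* A macro body continued over three lines by line joining. */
#define SUM_OF_THREE(a, b, c) \
    ((a) + \
     (b) + \
     (c))

int sum_demo(void)
{
    /* The backslash-newline vanishes, so this declares `total`. */
    int to\
tal = SUM_OF_THREE(1, 2, 3);
    return total;
}
```

Note that the continuation of the split identifier has to start in the first column; any indentation there would separate ‘to’ and ‘tal’ into two tokens.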

The only time that you might want to use this way of breaking lines (outside of preprocessor control lines) is to prevent long strings from disappearing off the right-hand side of a program listing. New lines are not permitted inside strings and character constants, so you might think that the following is a good idea.

/* not a good way of folding a string */
printf("This is a very very very\
long string\n");

That will certainly work, but for strings it is preferable to make use of the string-joining feature introduced by the Standard:

/* This string joining will not work in Old C */
printf("This is a very very very"
       "long string\n");

The second example allows you to indent the continuation portion of the string without changing its meaning; adding indentation in the first example would have put the indentation into the string.

Incidentally, both examples contain what is probably a mistake. There is no space in front of the ‘long’ in the continuation string, which will contain the sequence ‘verylong’ as a result. Did you notice?
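
The difference between the two ways of folding a string can be checked with a small sketch like the one below. The function names are hypothetical, and the missing space has been put back in. The first two functions should yield identical strings, while the third picks up its indentation as part of the string.

```c
#include <string.h>

/* Adjacent string literals are joined; indentation stays outside them. */
const char *joined(void)
{
    return "This is a very very very "
           "long string\n";
}

/* Backslash-newline folding with the continuation in column one. */
const char *folded(void)
{
    return "This is a very very very \
long string\n";
}

/* The same folding, but indented: the spaces end up in the string. */
const char *folded_indented(void)
{
    return "This is a very very very \
    long string\n";
}

/* Returns 1 if the two strings have the same contents, 0 otherwise. */
int same_text(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

Here same_text(joined(), folded()) should give 1, and same_text(joined(), folded_indented()) should give 0.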

2.3.2. Comment

Comment, as has been said already, is introduced by the character pair /* and terminated by */. It is translated into a single space wherever it occurs and so it follows exactly the same rules that spaces do. It's important to realize that it doesn't simply disappear, which it used to do in Old C, and that it is not possible to put comment into strings or character constants. Comment in such a place becomes part of the string or constant:

/*"This is comment"*/
"/*The quotes mean that this is a string*/"

Old C was a bit hazy about what the deletion of comment implied. You could argue that

int/**/egral();

should have the comment deleted and so be taken by the compiler to be a call of a function named integral. The Standard C rule is that comment is to be read as if it were a space, so the example must be equivalent to

int egral();

which declares a function egral that returns type int.
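
A small sketch, assuming a Standard C compiler, shows both rules at work. The function names other than egral are invented for the illustration. The empty comment separates int from egral just as a space would, and comment delimiters inside a string are ordinary characters that count towards its length.

```c
#include <stddef.h>
#include <string.h>

/* The comment acts as a space: this defines egral, returning int. */
int/**/egral(void)
{
    return 42;
}

/* Inside a string, the delimiters are kept as ten ordinary characters. */
size_t literal_length(void)
{
    return strlen("/* text */");
}
```

If comment were simply deleted, the first definition would not compile as written, because ‘integral’ would be left without a type in front of it.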

2.3.3. Translation Phases

The various character translation, line joining, comment recognition and other early phases of translation must be specified to occur in a certain order. The Standard says that the translation is to proceed as if the phases occurred in this order (there are more phases, but these are the important ones):

  1. Trigraph translation.
  2. Line joining.
  3. Translate comment to space (but not in strings or character constants). At this stage, multiple white spaces may optionally be condensed into one.
  4. Translate the program.

Each stage is completed before the next is started.
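
The ordering has visible consequences. In the sketch below (the function name is hypothetical), the opening and closing comment delimiters are each split by ‘backslash newline’. Because line joining is completed before comment is recognized, the pieces are reassembled first and the whole thing is then treated as comment. This is a demonstration of the rule, not a recommended style.

```c
int phases_demo(void)
{
    int value = 1; /\
* line joining runs first, so this text is comment *\
/
    return value;
}
```

If comment were recognized before lines were joined, the stray ‘/’ and ‘*’ characters would be syntax errors; instead the function should compile and return 1.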