2.9. Constants

2.9.1. Integer constants

The normal integral constants are obvious: things like 1, 1034 and so on. You can put l or L at the end of an integer constant to force it to be long. To make the constant unsigned, one of u or U can be used to do the job.

Integer constants can be written in hexadecimal by preceding the constant with 0x or 0X and using the upper or lower case letters a, b, c, d, e, f in the usual way.

Be careful about octal constants. They are indicated by starting the number with 0 and only using the digits 0, 1, 2, 3, 4, 5, 6, 7. It is easy to write 015 by accident, or out of habit, and not to realize that it is not in decimal. The mistake is most common with beginners, because experienced C programmers already carry the scars.

The Standard has now invented a new way of working out what type an integer constant is. In the old days, if the constant was too big for an int, it got promoted to a long (without warning). Now, the rule is that a plain decimal constant will be fitted into the first in this list

int   long   unsigned long

that can hold the value.

Plain octal or hexadecimal constants will use this list

int   unsigned int   long   unsigned long

If the constant is suffixed by u or U:

unsigned int   unsigned long

If it is suffixed by l or L:

long   unsigned long

and finally, if it is suffixed by both u or U and l or L, it can only be an unsigned long.

All that was done to try to give you ‘what you meant’; what it does mean is that it is hard to work out exactly what the type of a constant expression is if you don't know something about the hardware. Hopefully, good compilers will warn when a constant is promoted up to another length and the U or L etc. is not specified.

A nasty bug hides here:

printf("value of 32768 is %d\n", 32768);

On a 16-bit two's complement machine, 32768 will be a long by the rules given above. But printf is only expecting an int as an argument (the %d indicates that), so the type of the argument is just wrong and the behaviour is undefined. For the ultimate in safety-conscious programming, you should cast such cases to the right type:

printf("value of 32768 is %d\n", (int)32768);

It might interest you to note that there are no negative constants; writing -23 is an expression involving a positive constant and an operator.

Character constants actually have type int (for historical reasons) and are written by placing a sequence of characters between single quote marks:

'a'
'b'
'like this'

Wide character constants are written just as above, but prefixed with L:

L'a'
L'b'
L'like this'

Regrettably it is valid to have more than one character in the sequence, giving a machine-dependent result. Single characters are the best from the portability point of view, resulting in an ordinary integer constant whose value is the machine representation of the single character. The introduction of extended characters may cause you to stumble over this by accident; if '<a>' is a multibyte character (encoded with a shift-in shift-out around it) then '<a>' will be a plain character constant, but containing several characters, just like the more obvious 'abcde'. This is bound to lead to trouble in the future; let's hope that compilers will warn about it.

To ease the way of representing some special characters that would otherwise be hard to get into a character constant (or hard to read; does ' ' contain a space or a tab?), there is what is called an escape sequence which can be used instead. Table 2.10 shows the escape sequences defined in the Standard.

Sequence	Represents
`\a`	audible alarm
`\b`	backspace
`\f`	form feed
`\n`	newline
`\r`	carriage return
`\t`	tab
`\v`	vertical tab
`\\`	backslash
`\'`	quote
`\"`	double quote
`\?`	question mark

Table 2.10. C escape sequences

It is also possible to use numeric escape sequences to specify a character in terms of the internal value used to represent it. These take the form \ooo or \xhhhh, where ooo is up to three octal digits and hhhh is any number of hexadecimal digits. A common version of it is '\033', which is used by those who know that on an ASCII based machine, octal 33 is the ESC (escape) code. Beware that the hexadecimal version will absorb any number of valid following hexadecimal digits; if you want a string containing the character whose value is hexadecimal ff followed by a letter f, then the safe way to do it is to use the string joining feature:

"\xff" "f"

The string

"\xfff"

only contains one character, with all three of the fs eaten up in the hexadecimal sequence.

Some of the escape sequences aren't too obvious, so a brief explanation is needed. To get a single quote as a character constant you type '\'', to get a question mark you may have to use '\?'; not that it matters in that example, but to get two of them in there you can't use '??', because the sequence ??' is a trigraph! You would have to use '\?\?'. The escape \" is only necessary in strings, which will come later.

There are two distinct purposes behind the escape sequences. It's obviously necessary to be able to represent characters such as single quote and backslash unambiguously: that is one purpose. The second purpose applies to the following sequences which control the motions of a printing device when they are sent to it, as follows:

\a: Ring the bell if there is one. Do not move.
\b: Backspace.
\f: Go to the first position on the ‘next page’, whatever that may mean for the output device.
\n: Go to the start of the next line.
\r: Go back to the start of the current line.
\t: Go to the next horizontal tab position.
\v: Go to the start of the line at the next vertical tab position.

For \b, \t, \v, if there is no such position, the behaviour is unspecified. The Standard carefully avoids mentioning the physical directions of movement of the output device which are not necessarily the top to bottom, left to right movements common in Western cultural environments.

It is guaranteed that each escape sequence has a unique integral value which can be stored in a char.

2.9.2. Real constants

These follow the usual format:

1.0
2.
.1
2.634
.125
2.e5
2.e+5
.125e-3
2.5e5
3.1E-6

and so on. For readability, even if part of the number is zero, it is a good idea to show it:

1.0
0.1

The exponent part gives the power of ten by which the rest of the number is multiplied, so

3.0e3

is equivalent in value to the integer constant

3000

As you can see, the e can also be E. These constants all have type double unless they are suffixed with f or F to mean float or l or L to mean long double.

For completeness, here is the formal description of a real constant:

A real constant is one of:

- A fractional constant followed by an optional exponent.
- A digit sequence followed by an exponent.

In either case followed by an optional one of f, l, F, L, where:

A fractional constant is one of:
- An optional digit sequence followed by a decimal point followed by a digit sequence.
- A digit sequence followed by a decimal point.

An exponent is:
- e or E followed by an optional + or - followed by a digit sequence.

A digit sequence is an arbitrary combination of one or more digits.
