2.6. Real types
It's easier to deal with the real types first because there's less to say about them and they don't get as complicated as the integer types. The Standard breaks new ground by laying down some basic guarantees on the precision and range of the real numbers; these are found in the header file float.h which is discussed in detail in Chapter 9. For some users this is extremely important information, but it is of a highly technical nature and is likely only to be fully understood by numerical analysts.
The varieties of real numbers are these:
    float, double, long double
Each of the types gives access to a particular way of representing real numbers in the target computer. If it only has one way of doing things, they might all turn out to be the same; if it has more than three, then C has no way of specifying the extra ones. The type float is intended to be the small, fast representation corresponding to what FORTRAN would call REAL. You would use double for extra precision, and long double for even more.
The main points of interest are that in the increasing ‘lengths’ of float, double and long double, each type must give at least the same range and precision as the previous type. For example, taking the value in a double and putting it into a long double must result in the same value.
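If you want to see that guarantee in terms of the limits published in float.h, here is a minimal sketch. The two function names are made up for illustration; the macros FLT_DIG, DBL_DIG, LDBL_DIG, FLT_MAX, DBL_MAX and LDBL_MAX are the standard ones from float.h.

```c
#include <float.h>

/* Hypothetical helper: returns 1 if each wider type offers at least the
   decimal precision and the range of the narrower one. */
int real_types_are_ordered(void)
{
    return FLT_DIG <= DBL_DIG && DBL_DIG <= LDBL_DIG
        && FLT_MAX <= DBL_MAX && DBL_MAX <= LDBL_MAX;
}

/* Hypothetical helper: returns 1 if widening f to double and then to
   long double leaves the value unchanged. (A NaN, on systems that have
   one, never compares equal to anything, so it would give 0 here.) */
int widening_keeps_value(float f)
{
    double d = f;          /* float to double */
    long double ld = d;    /* double to long double */

    return d == f && ld == d;
}
```

On a conforming implementation both functions should return 1 for any ordinary value, for instance widening_keeps_value(0.1f).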
There is no requirement for the three types of ‘real’ variables to differ in their properties, so if a machine only has one type of real arithmetic, all of C's three types could be implemented in the same way. None the less, the three types would be considered to be different from the point of view of type checking; it would be ‘as if’ they really were different. That helps when you move the program to a system where the three types really are different—there won't suddenly be a set of warnings coming out of your compiler about type mismatches that you didn't get on the first system.
In contrast to more ‘strongly typed’ languages, C permits expressions to mix all of the scalar types: the various flavours of integers, the real numbers and also the pointer types. When an expression contains a mixture of arithmetic (integer and real) types there are implicit conversions invoked which can be used to work out what the overall type of the result will be. These rules are quite important and are known as the usual arithmetic conversions; it will be worth committing them to memory later. The full set of rules is described in Section 2.8; for the moment, we will investigate only the ones that involve mixing float, double and long double to see if they make sense.
The only time that the conversions are needed is when two different types are mixed in an expression, as in the example below:
    int f(void){
        float f_var;
        double d_var;
        long double l_d_var;

        f_var = 1;
        d_var = 1;
        l_d_var = 1;
        d_var = d_var + f_var;
        l_d_var = d_var + f_var;
        return(l_d_var);
    }
Example 2.1
There are a lot of forced conversions in that example. Getting the easiest of them out of the way first, let's look at the assignments of the constant value 1 to each of the variables. As the section on constants will point out, that 1 has type int, i.e. it is an integer, not a real constant. The assignment converts the integer value to the appropriate real type, which is easy to cope with.
The interesting conversions come next. The first of them is on the line
    d_var = d_var + f_var;
What is the type of the expression involving the + operator? The answer is easy when you know the rules. Whenever two different real types are involved in an expression, the lower precision type is first implicitly converted to the higher precision type and then the arithmetic is performed at that precision. The example involves both a double and a float, so the value of f_var is converted to type double and is then added to the value of the double d_var. The result of the expression is naturally of type double too, so it is clearly of the correct type to assign to d_var.
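To see that the addition really is carried out at the higher precision, here is a small sketch. It assumes the common IEEE 754 formats, where a float has a 24-bit significand and a double has 53 bits; that is an assumption about your machine, not something the Standard promises. The function names are invented for the illustration.

```c
/* Hypothetical helper: f_var is converted to double first, then the
   addition is performed in double. */
double add_as_double(double d_var, float f_var)
{
    return d_var + f_var;
}

/* Hypothetical helper: both operands are float, so the result has type
   float; storing it in a float variable discards any extra precision
   the hardware may have used internally. */
float add_as_float(float a, float b)
{
    float sum = a + b;

    return sum;
}
```

On such a system, add_as_double(1.0, 16777216.0f) gives 16777217.0, because a double has room for the extra digit. The call add_as_float(1.0f, 16777216.0f) gives 16777216.0 instead, since 16777217 cannot be represented in a 24-bit significand and the sum is rounded.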
The second of the additions is a little bit more complicated, but still perfectly O.K. Again, the value of f_var is converted and the arithmetic performed with the precision of double, forming the sum of the two variables. Now there's a problem. The result (the sum) is double, but the assignment is to a long double. Once again the obvious procedure is to convert the lower precision value to the higher one, which is done, and then make the assignment.
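If your compiler supports the later C11 revision of the language, you can ask it directly what type a mixed expression has. The sketch below uses the C11 _Generic selection, which is not part of the Standard described in this book; the macro REAL_TYPE_NAME and the two functions are names invented for the illustration.

```c
/* Requires a C11 compiler: _Generic picks a result based on the type of
   the expression, without evaluating it. */
#define REAL_TYPE_NAME(expr) _Generic((expr), \
    float: "float", \
    double: "double", \
    long double: "long double", \
    default: "not a real type")

/* Hypothetical helper: names the type of a double + float expression. */
const char *type_of_double_plus_float(void)
{
    float f_var = 1;
    double d_var = 1;

    return REAL_TYPE_NAME(d_var + f_var);
}

/* Hypothetical helper: names the type of a long double + float expression. */
const char *type_of_long_double_plus_float(void)
{
    float f_var = 1;
    long double l_d_var = 1;

    return REAL_TYPE_NAME(l_d_var + f_var);
}
```

The first function returns the string "double" and the second "long double". In l_d_var = d_var + f_var the sum therefore has type double, and it is the assignment that converts it to long double. The selection also compiles on machines where two of the types share a representation, which shows that they are still treated as distinct types.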
So we've taken the easy ones. The difficult thing to see is what to do when forced to assign a higher precision result to a lower precision destination. In those cases it may be necessary to lose precision, in a way specified by the implementation. Basically, the implementation must specify whether and in what way it rounds or truncates. Even worse, the destination may be unable to hold the value at all. The Standard says that in these cases loss of precision may occur; if the destination is unable to hold the necessary value—say by attempting to add the largest representable number to itself—then the behaviour is undefined, your program is faulty and you can make no predictions whatsoever about any subsequent behaviour.
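One defensive approach is to check the range before narrowing a value. The sketch below does this for double to float, using FLT_MAX from float.h. The function names are hypothetical, and the remark about 0.1 assumes IEEE 754 formats.

```c
#include <float.h>

/* Hypothetical helper: returns 1 if d lies within the range of float, so
   that converting it cannot be out of range. Infinities and NaNs, on
   systems that have them, fail the comparison and give 0. */
int fits_in_float(double d)
{
    return d >= -FLT_MAX && d <= FLT_MAX;
}

/* Hypothetical helper: returns 1 if d can be narrowed to float and
   widened again without changing its value, 0 otherwise. */
int survives_round_trip(double d)
{
    float f;

    if (!fits_in_float(d))
        return 0;        /* narrowing would be undefined behaviour */
    f = (float)d;        /* in range, but may be rounded or truncated */
    return (double)f == d;
}
```

With IEEE 754 formats, survives_round_trip(0.5) gives 1 because 0.5 is exact in both types, whereas survives_round_trip(0.1) gives 0 because precision is lost in the float. A value such as 1e300 is rejected by fits_in_float before any conversion is attempted.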
It is no mistake to re-emphasize that last statement. What the Standard means by undefined behaviour is exactly what it says. Once a program's behaviour has entered the undefined region, absolutely anything can happen. The program might be stopped by the operating system with an appropriate message, or just as likely nothing observable would happen and the program be allowed to continue with an erroneous value stored in the variable in question. It is your responsibility to prevent your program from exhibiting undefined behaviour. Beware!
Summary of real arithmetic
- Arithmetic with any two real types is done at the highest precision of the members involved.
- Assignment involves loss of precision if the receiving type has a lower precision than the value being assigned to it.
- Further conversions are often implied when expressions mix other types, but they have not been described yet.
2.6.1. Printing real numbers
The usual output function, printf, can be used to format real numbers and print them. There are a number of ways to format these numbers, but we'll stick to just one for now. Table 2.4 below shows the appropriate format description for each of the real types.
    Type           Format
    float          %f
    double         %f
    long double    %Lf
Table 2.4. Format codes for real numbers
Here's an example to try:
    #include <stdio.h>
    #include <stdlib.h>

    #define BOILING 212     /* degrees Fahrenheit */

    int main(void){
        float f_var;
        double d_var;
        long double l_d_var;
        int i;

        i = 0;
        printf("Fahrenheit to Centigrade\n");
        while(i <= BOILING){
            l_d_var = 5*(i-32);         /* integer arithmetic, then converted */
            l_d_var = l_d_var/9;        /* division done in long double */
            d_var = l_d_var;            /* may lose precision */
            f_var = l_d_var;            /* may lose more precision */
            printf("%d %f %f %Lf\n", i, f_var, d_var, l_d_var);
            i = i+1;
        }
        exit(EXIT_SUCCESS);
    }
Example 2.2
Try that example on your own computer to see what results you get.
Exercise 2.10. Which type of variable can hold the largest range of values?
Exercise 2.11. Which type of variable can store values to the greatest precision?
Exercise 2.12. Are there any problems possible when assigning a float or double to a double or long double?
Exercise 2.13. What could go wrong when assigning, say, a long double to a double?
Exercise 2.14. What predictions can you make about a program showing ‘undefined behaviour’?