C++ Newsletter/Tutorial Issue 9
Issue #009
April, 1996
Contents
- Introduction to Templates Part 1 - Function Templates
- New C++ Feature - Member Templates
- Introduction to Stream I/O Part 4 - Tie()
- Using C++ as a Better C Part 9 - Extern "C"
- Correction
INTRODUCTION
In this issue we'll start discussing C++ templates, and present a new aspect of them known as member templates. There will also be a continuation of the series on stream I/O, along with a discussion of how to combine C++ code with code written in other languages.
INTRODUCTION TO C++ TEMPLATES PART 1 - FUNCTION TEMPLATES
In issue #007 we talked about the use of inline functions. Suppose that you wish to compute the maximum of two quantities, and you define a C macro for this:
#define max(a, b) ((a) > (b) ? (a) : (b))
This works OK until a case like:
max(x++, y++)
comes along. An inline function:
inline int max(int a, int b)
{
return a > b ? a : b;
}
solves this problem. But what if you want a max() function for a variety of numeric types? You can define a slew of function prototypes:
int max(int, int);
long max(long, long);
double max(double, double);
and rely on function overloading to sort things out. Or you can define one function that might work in all cases:
long double max(long double, long double);
since nothing can be bigger than a long double, right? This last approach fails because there's no guarantee that a long double can represent every value of the other types exactly. For example, nothing guarantees that a long double carries more bits of precision than a long has value bits, and where it does not, converting a long to a long double can result in loss of precision.
In C++ there is another way to approach this problem, using what are called parameterized types or templates. We can define a function template:
template <class T> inline T max(T a, T b)
{
return a > b ? a : b;
}
The prefix "template <class T>" is used to declare a template. T is a template parameter, a type argument to the template. When this template is used:
int a = 37;
int b = 47;
int i = max(a, b);
the type value of T will be "int", because a and b are integers. If instead I had said:
double a = 37.53;
double b = -47.91;
double d = max(a, b);
then T would have the type value "double". The process of generating an actual function from a function template is known as "instantiation". Note also that "const T&" may be used instead of "T"; we will be discussing this point in a future issue of the newsletter.
This template will also work on non-numeric types, so long as they have the ">" operator defined. For example:
class A {
public:
int operator>(const A&); // use "bool" return type
// instead, if available
};
A a;
A b;
A c = max(a, b);
Templates are a powerful but complex feature, about which we will have more to say. Languages like C or Java, which do not have templates, typically use macros or rely on base class pointers and virtual functions to synthesize some of the properties of templates.
Templates in C++ are a more ambitious attempt to support "generic programming" than some previous efforts found in other programming languages. Support for generic programming in C++ is considered by some to be as important a language goal for C++ as is support for object-oriented programming (using base/derived classes and virtual functions; see newsletter issue #008). An example of heavy template use can be found in STL, the Standard Template Library.
NEW C++ FEATURE - MEMBER TEMPLATES
Chapter 14 of the draft ANSI/ISO C++ standard describes something called member templates. This feature is fairly new and not yet widely available, but is worth mentioning here.
Member templates are simply a generalization of templates such that a template can be a class member. For example:
#include <stdio.h>
template <class A, class B> struct Pair {
A a;
B b;
Pair(const A& ax, const B& bx) : a(ax), b(bx) {}
template <class T, class U> Pair(const Pair<T,U>& p) :
a(p.a), b(p.b) {}
};
int main()
{
Pair<short, float> x(37, 12.34);
Pair<long, long double> y(x);
printf("%ld %Lg\n", y.a, y.b);
return 0;
}
This is an adaptation of a class found in the Standard Template Library. Note that an object of class Pair<long, long double> is constructed from an object of class Pair<short, float>. By using a template constructor it is possible to construct a Pair from any other Pair, assuming that conversions from T to A and from U to B are supported. Without the availability of template constructors one could only declare constructors with fixed types like "Pair(int)" or else use the template arguments to Pair itself, as in "Pair(A, B)".
Class templates can be members as well, so it's possible to have usage like:
template <class T> struct A {
template <class U> struct B {/* stuff */};
};
A<double>::B<long> ab;
In this example, the type value of T within the nested template declaration would be "double", while the value of U would be "long".
There are a few restrictions on member templates. A destructor for a class cannot be defined as a function template, nor may a function template member of a class be virtual.
INTRODUCTION TO STREAM I/O PART 4 - TIE()
In issue #008 we talked about copying files using a variety of methods. One example that was presented was this one:
#include <iostream.h>
int main()
{
char c;
cin.unsetf(ios::skipws);
while (cin >> c)
cout << c;
return 0;
}
Jerry Schwarz suggested that it might be worth discussing the tie() function and its effect on the performance of this code. Specifically, if we slightly change the above code to:
#include <iostream.h>
int main()
{
char c;
cin.tie(0);
cin.unsetf(ios::skipws);
while (cin >> c)
cout << c;
return 0;
}
it runs about 8X faster with one popular C++ compiler, and about 18X with another.
The difference has to do with buffering and flushing of streams. When input is requested, for example with:
cin >> c
there may be output pending in the buffer for the output stream. The input stream is therefore tied to the output stream such that a request for input will cause pending output to be flushed. Flushing output is expensive, typically triggering a flush() call and a write() system call (on UNIX systems). Disabling the linkage between the input and output streams gets rid of this overhead.
To further illustrate this point, consider another example:
#include <iostream.h>
int main()
{
char buf[100];
//cin.tie(0);
cin.unsetf(ios::skipws);
cout.unsetf(ios::unitbuf);
cout << "What is your name? ";
cin >> buf;
return 0;
}
It's common for output to be flushed after every output operation (unit buffering) if going to a terminal (screen or window). So setting cin.tie(0) will not necessarily change observable behavior, because output will be flushed immediately in all cases.
To affect behavior in this example, one also needs to disable unit buffering for the stream, achieved by saying:
cout.unsetf(ios::unitbuf);
Once this is done, cin.tie(0) will change behavior in a visible way. If the input stream is untied, then the prompt in the example above will not come out before input is requested from the user, leading to confusion.
Note also that current libraries vary in their behavior. The above example works for one library that was tried, but for another, there appears to be no way to disable unit buffering under any circumstances, when output is to a terminal. The draft ANSI/ISO C++ standard calls for unit buffering to be set for error output ("cerr").
If tie() is called with no argument, it returns a pointer to the stream currently tied to. For example:
cout << (void*)cin.tie() << "\n";
cout << (void*)(&cout) << "\n";
give identical results if cin is currently tied to cout.
Copying files a character at a time has other pitfalls. One has to be careful in assessing the buffering and function call overhead for anything done on a per-character basis. There is yet another way of copying files by character, using streambufs, that we'll present in a future issue.
USING C++ AS A BETTER C PART 9 - EXTERN "C"
One issue that always comes up with programming languages is how to mix code written in one language with code written in another.
For example, suppose that you're writing C++ code and wish to call C functions. A common case of this would be to access C functions that manipulate C-style strings, for example strcmp() or strlen(). So as a first try, we might say:
extern size_t strlen(const char*);
and then use the function. This will work, at least at compile time, but will probably give a link error about an unresolved symbol.
The reason for the link error is that a typical C++ compiler will modify the name of a function or object ("mangle" it), for example to include information about the types of the arguments. As an example, a common scheme for mangling the function name strlen(const char*) would result in:
strlen__FPCc
There are two purposes for this mangling. One is to support function overloading. For example, the following two functions cannot both be called "f" in the object file symbol table:
int f(int);
int f(double);
The other purpose is type checking at link time. Suppose that overloading were not an issue, and in one compilation unit we have:
extern void f(double);
and we use this function, and its name in the object file is just "f". And suppose that in another compilation unit the definition is found, as:
void f(char*) {}
This will silently do the wrong thing -- a double will be passed to a function requiring a char*. Mangling the names of functions eliminates this problem, because a linker error will instead be triggered. This technique goes by the name "type safe linkage".
So to be able to call C functions, we need to disable name mangling. The way of doing this is to say:
extern "C" size_t strlen(const char*);
or:
extern "C" {
size_t strlen(const char*);
int strcmp(const char*, const char*);
}
This usage is commonly seen in header files that are used both by C and C++ programs. The extern "C" declarations are conditional based on whether C++ is being compiled instead of C.
Because name mangling is disabled with a declaration of this type, usage like:
extern "C" {
int f(int);
int f(double);
}
is illegal (because both functions would have the name "f").
Note that extern "C" declarations do not specify the details of what must be done to allow C++ and C code to be mixed. Name mangling is commonly part of the problem to be solved, but only part.
There are other issues with mixing languages that are beyond the scope of this presentation. The whole area of calling conventions, such as the order of argument passing, is a tricky one. For example, if every C++ compiler used the same mangling scheme for names, this would not necessarily result in object code that could be mixed and matched.
CORRECTION
In issue #008 we talked about copying files and said this about one of the examples of copying files using C:
This approach works on text files. Unfortunately, however, for binary files, an attempt to copy a 10406-byte file resulted in output of only 383 bytes. Why? Because EOF is itself a valid character that can occur in a binary file. If set to -1, then this is equivalent to 255 or 0377 or 0xff, a perfectly legal byte in a file.
This isn't quite the case. A common mistake when copying files in C is to use a char instead of an int with getc() and putc(). If a char is used, then the explanation above is correct, because with a binary file EOF interpreted as a character is one of the 256 valid bit patterns that a char can hold.
But with an int this is not a problem. getc(), and its functional equivalent fgetc(), return an unsigned char converted to an int. So the int can represent all character values 0-255, along with the EOF marker (typically -1).
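It turns out that the example failed because of a ^Z (Control-Z) character in the file. ^Z was once used as an end-of-file marker for DOS text files on PCs, and some C libraries on those systems still treat it as end of file when a file is read in text mode.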
Thanks to David Nelson for mentioning this.
ACKNOWLEDGEMENTS
Thanks to Nathan Myers, Eric Nagler, David Nelson, Terry Rudd, Jonathan Schilling, and Clay Wilson for help with proofreading.
SUBSCRIPTION INFORMATION / BACK ISSUES
To subscribe to the newsletter, send mail to majordomo@world.std.com with this line as its message body:
subscribe c_plus_plus
Back issues are available via FTP from:
rmii.com /pub2/glenm/newslett
or on the Web at:
There is also a Java newsletter. To subscribe to it, say:
subscribe java_letter
using the same majordomo@world.std.com address.
-------------------------
Copyright (c) 1996 Glen McCluskey. All Rights Reserved.
This newsletter may be further distributed provided that it is copied in its entirety, including the newsletter number at the top and the copyright and contact information at the bottom.
Glen McCluskey & Associates
Professional C++ Consulting
Internet: glenm@glenmccl.com
Phone: (800) 722-1613 or (970) 490-2462
Fax: (970) 490-2463
FTP: rmii.com /pub2/glenm/newslett (for back issues)
Web: http://www.rmii.com/~glenm