C++ Newsletter/Tutorial Issue 16
Issue #016
October, 1996
Contents
- Newsletter Index
- Introduction to Exception Handling Part 2 - Throwing an Exception
- Using C++ as a Better C Part 16 - Anonymous Unions
- Notes From ANSI/ISO - Template Compilation Model Part 1
- Newsletter Writing
- Introduction to STL Part 3 - Sets
NEWSLETTER INDEX
An index to all the C++ and Java newsletters can be found on the Web at:
The index will be updated as new issues come out.
INTRODUCTION TO EXCEPTION HANDLING PART 2 - THROWING AN EXCEPTION
In the last issue we introduced C++ exception handling. In this issue we'll go into more detail about throwing exceptions.
Throwing an exception transfers control to an exception handler. For example:
void f()
{
    throw 37;
}

void g()
{
    try {                   // try block
        f();
    }
    catch (int i) {         // handler or catch clause
    }
}
In this example the exception with value 37 is thrown, and control passes to the handler. A throw transfers control to the nearest handler of an appropriate type, where "nearest" is judged by the stack frames and try blocks that have been dynamically entered and not yet exited.
Typically an exception that is thrown is of class type rather than a simple constant like "37". Throwing a class object instance allows for more sophisticated usage such as conveying additional information about the nature of an exception.
A class object instance that is thrown is treated similarly to a function argument or the operand of a return statement. A temporary copy of the instance may be made at the throw point, just as temporaries are sometimes used in function argument passing. The class's copy constructor, whether user-written or compiler-generated, is used to initialize the temporary, and the class's destructor is used to destroy it. The temporary persists as long as there is a handler being executed for the given exception. As in other parts of the C++ language, some compilers may be able in some cases to eliminate the temporary.
An example:
#include <iostream>
using namespace std;

class Exc {
    const char* s;          // message text; not owned by the object
public:
    Exc(const char* e) {s = e; cerr << "ctor called\n";}
    Exc(const Exc& e) {s = e.s; cerr << "copy ctor called\n";}
    ~Exc() {cerr << "dtor called\n";}
    const char* geterr() const {return s;}
};

void check_date(int date)
{
    if (date < 1900)
        throw Exc("date < 1900");
    // other processing
}

int main()
{
    try {
        check_date(1879);
    }
    catch (const Exc& e) {
        cerr << "exception was: " << e.geterr() << "\n";
    }
    return 0;
}
If you run this program, you can trace through the various stages of throwing the exception, including the actual throw, making a temporary copy of the class instance, and the invocation of the destructor on the temporary.
It's also possible to have "throw" with no argument, as in:
    catch (const Exc& e) {
        cerr << "exception was: " << e.geterr() << "\n";
        throw;
    }
What does this mean? Such usage rethrows the exception, using the already-established temporary. The exception thrown is the most recently caught one not yet finished. A caught exception is one where the parameter of the catch clause has been initialized, and for which the catch clause has not yet been exited.
So in the example above, "throw;" would rethrow the exception represented by "e". Because there is no outer catch clause to catch the rethrown exception, a special library function terminate() is called. If an exception is rethrown, and there is no exception currently being handled, terminate() is called as well.
In the next issue we'll talk more about how exceptions are handled in a catch clause.
USING C++ AS A BETTER C PART 16 - ANONYMOUS UNIONS
Here's a simple one. In C++ this usage is legal:
struct A {
    union {
        int x;
        double y;
        char* z;
    };
};
whereas in C you'd have to say:
struct A {
    union {
        int x;
        double y;
        char* z;
    } u;
};
giving the union a name. With the C++ approach, you can treat the union members as though they were members of the enclosing struct.
Of course, the members still belong to the union, meaning that they share memory space and only one is active at a given time.
NOTES FROM ANSI/ISO - TEMPLATE COMPILATION MODEL PART 1
Jonathan Schilling, jls@sco.com
From the time templates were first introduced to C++, a problem area has been defining how templates are compiled at the source level. At the most recent C++ standards meeting in Stockholm in July, a full specification of this was made for the first time.
The crux of the issue is whether template function definitions (regular functions or member functions) are compiled separately, or must be visible within the translation units containing instantiations. Consider first the most basic source arrangement (throughout, .h and .C are used to represent header file and source file extensions, but they may be different on any given system):
file1.h:
    template <class T>
    T max(T a, T b) {
        return a > b ? a : b;
    }
caller.C:
    #include "file1.h"
    void c(float x, float y) {
        float z = max(x, y);
        ...
    }
The template function definition is included in the header file that declares the function. This is the simplest method, and up to now has been the only fully portable method; the original Standard Template Library implementation used this technique almost exclusively.
However, there is a natural reluctance to have all implementation code in header files, and so the next simplest arrangement is to move the template definitions to regular source files, and have the header pull them in:
file2.h:
    template <class T> T max(T a, T b);
    #include "file2.C"
file2.C:
    template <class T>
    T max(T a, T b) {
        return a > b ? a : b;
    }
where caller.C is the same as before (except for including file2.h rather than file1.h). The use of a regular source file for the template definition here is mostly an illusion, since file2.C is never compiled by itself but rather as part of the compilation of caller.C. But it does at least suggest a separation of interface and implementation.
A variation on this scheme that is used in some compilers permits you to leave out the explicit #include in the header:
file2a.h:
    template <class T> T max(T a, T b);
with file2.C and caller.C the same as before. Here, the compiler implicitly knows by some rule where to find the corresponding .C file (usually it looks in the same directory as the .h, for a .C file with the same base name), and pulls it into the translation unit being compiled. But again, the .C file is not itself compiled.
All of these methods belong to the "inclusion" model of template compilation. It is the model that almost all current C++ compilers provide. It is relatively simple to implement and simple to understand, but while it has sufficed in practice, there are some serious flaws with it. Most of these are due to the template definition code getting introduced into the instantiating context, with unexpected name leakage as a result. Consider the following example:
file3.h:
    template <class T> void f(T);
    #include "file3.C"
file3.C:
    void g(int);
    template <class T> void f(T t) {
        g(0);
    }
caller3.C:
    #include "file3.h"
    void g(long);
    void h() {
        f(3.14);
        g(1);       // hijacked!
    }
Clearly the writer of caller3.C expected the g(1) call to refer to the g(long) in the same source file. But instead, the g(int) in file3.C is visible as well, and is a better match on overloading resolution. While use of namespaces can alleviate some of these problems, similar things can happen due to macros:
caller3a.C:
    #define g some_other_name
    #include "file3.h"
    void h() {
        f(3.14);
    }
This time, the call g(0) in file3.C, which is clearly intended to refer to the g(int) in that file, gets altered by the macro defined in the context of caller3a.C.
None of these problems would occur if file3.C were separately compiled, because there would be no possibility of its context and the caller's context becoming intermingled in unexpected ways. While these kinds of context problems can also occur in inline functions (see C++ Newsletter #015), the potential for damage with templates is much greater, given the centrality of templates to modern C++ libraries and applications.
A separate compilation model for templates was envisioned as part of the C++ language template design from the start, but was never specified in any detailed way. The first attempt to (partially) implement it (Cfront 3.0) ran into difficulties, and subsequently compiler vendors shied away from it. The first attempts to specify it in the draft ANSI/ISO standard were criticized as poorly specified, hard to use, and hard to implement efficiently. A series of contentious discussions and reversals ensued, but now, by way of invention and compromise, a version of separate compilation that is hoped to be clear and reasonably efficient has been specified. In addition, the de facto existing "inclusion" model is also permitted by the standard (but the implicit inclusion method, illustrated by file2a.h above, will not be, except as a vendor extension).
We'll take a look at this new separate compilation model in the next issue of the newsletter.
NEWSLETTER WRITING
Would your company find it valuable to have a customized newsletter on C++ or Java or related topics, similar to this newsletter that you're reading right now? If so, please contact me (glenm@glenmccl.com) for further information. Custom newsletters can consist of technical material furnished from outside, internal material with outside editing, or a combination of the two.
INTRODUCTION TO STL PART 3 - SETS
In the last issue we talked about several STL container types, namely vectors, lists, and deques. STL also has set and multiset, where set is a collection of unique values, and multiset is a set with possible non-unique values, that is, keys (elements) of the set may appear more than one time. Sets are maintained in sorted order at all times.
To illustrate the use of sets, consider the following example:
#include <iostream>
#include <set>
using namespace std;

int main()
{
    typedef set<int, less<int> > SetInt;
    //typedef multiset<int, less<int> > SetInt;
    SetInt s;
    for (int i = 0; i < 10; i++) {
        s.insert(i);
        s.insert(i * 2);
    }
    SetInt::iterator iter = s.begin();
    while (iter != s.end()) {
        cout << *iter << " ";
        iter++;
    }
    cout << endl;
    return 0;
}
This example is for set, but the usage for multiset is almost identical. The first item to consider is the line:
typedef set<int, less<int> > SetInt;
This establishes a type "SetInt", which is a set operating on ints, and which uses the template "less<int>" defined in <functional> to order the keys of the set. In other words, set takes two type arguments, the first of which is the underlying type of the set, and the second a comparison class (here generated from the less template) that defines how ordering is to be done in the set.
Next, we use insert() to insert keys in the set. Note that the loop tries to insert some keys more than once, for example "4"; a set silently ignores the repeated key, while a multiset stores it again.
Then we establish an iterator pointing at the beginning of the set, and iterate over the elements, outputting each in turn. The code for multiset is identical save for the typedef declaration.
The output for set is:
0 1 2 3 4 5 6 7 8 9 10 12 14 16 18
and for multiset:
0 0 1 2 2 3 4 4 5 6 6 7 8 8 9 10 12 14 16 18
STL also provides bitsets, which are packed arrays of binary values. These are not the same as "vector<bool>", which is a vector of Booleans.
ACKNOWLEDGEMENTS
Thanks to Nathan Myers, Eric Nagler, David Nelson, Terry Rudd, Jonathan Schilling, John Spicer, and Clay Wilson for help with proofreading.
SUBSCRIPTION INFORMATION / BACK ISSUES
To subscribe to the newsletter, send mail to majordomo@world.std.com with this line as its message body:
subscribe c_plus_plus
Back issues are available via FTP from:
rmi.net /pub2/glenm/newslett
or on the Web at:
There is also a Java newsletter. To subscribe to it, say:
subscribe java_letter
using the same majordomo@world.std.com address.
-------------------------
Copyright (c) 1996 Glen McCluskey. All Rights Reserved.
This newsletter may be further distributed provided that it is copied in its entirety, including the newsletter number at the top and the copyright and contact information at the bottom.
Glen McCluskey & Associates
Professional C++ Consulting
Internet: glenm@glenmccl.com
Phone: (800) 722-1613 or (970) 490-2462
Fax: (970) 490-2463
FTP: rmi.net /pub2/glenm/newslett (for back issues)
Web: http://rmi.net/~glenm