C++ Newsletter/Tutorial Issue 17
Issue #017
November, 1996
Contents
- Introduction to Exception Handling Part 3 - Stack Unwinding
- Notes From ANSI/ISO - Template Compilation Model Part 2
- Using C++ as a Better C Part 17 - Empty Classes
- Transitioning to C++
- Introduction to STL Part 4 - Maps
INTRODUCTION TO EXCEPTION HANDLING PART 3 - STACK UNWINDING
In the last issue we talked about throwing exceptions. Before discussing how exceptions are handled, we need to talk about an intermediate step, stack unwinding.
The exception handling mechanism is dynamic in that a record is kept of the flow of program execution, for example via stack frames and program counter mapping tables. When an exception is thrown, control transfers to the nearest suitable handler. "Nearest" in this sense means the nearest dynamically surrounding try block containing a handler that matches the type of the thrown exception. We will talk more about exception handlers in a future issue.
Transfer of control from the point at which an exception is thrown to the exception handler implies jumping out of one program context into another. What about cleanup of the old program context? For example, what about local class objects that have been allocated? Are their destructors called?
The answer is "yes". All fully constructed stack-allocated ("automatic") objects created since the try block was entered will have their destructors invoked, in reverse order of construction. Let's look at an example:
#include <iostream>
using namespace std;
class A {
    int x;
public:
    A(int i) {x = i; cerr << "ctor " << x << endl;}
    ~A() {cerr << "dtor " << x << endl;}
};
void f()
{
    A a1(1);
    throw "this is a test";
    A a2(2);    // never reached
}
int main()
{
    try {
        A a3(3);
        f();
        A a4(4);    // never reached
    }
    catch (const char* s) {
        cerr << "exception: " << s << endl;
    }
    return 0;
}
Output of this program is:
ctor 3
ctor 1
dtor 1
dtor 3
exception: this is a test
In this example, we enter the try block in main(), allocate a3, then call f(). f() allocates a1, then throws an exception, which will transfer control to the catch clause in main().
The a1 and a3 objects have their destructors called, a1 first and then a3, before the handler runs. a2 and a4 do not, because they were never constructed.
It's possible to have class objects containing other class objects, or arrays of class objects, with partial construction taking place followed by an exception being thrown. In this case, only the constructed subobjects will be destructed.
NOTES FROM ANSI/ISO - TEMPLATE COMPILATION MODEL PART 2
Jonathan Schilling, jls@sco.com
In the last issue, we looked at the "inclusion" model of template compilation, which is the one used by most compilers in practice but which is lacking in several respects. As a reminder, here was the example that illustrated name leakage in the inclusion model:
file3.h:
template <class T> void f(T);
#include "file3.C"
file3.C:
void g(int);
template <class T> void f(T t) {
    g(0);
}
caller3.C:
#include "file3.h"
void g(long);
void h() {
    f(3.14);
    g(1);   // should call g(long), but calls g(int) instead
}
Now we'll look at the newly-specified template separate compilation model that has recently been added to the standard. There isn't space here to go into a full description of the new rules, and in fact the complexity of this subject rapidly approaches infinity! But here are some of the key highlights:
- Names in template functions are divided into those that are dependent upon the template arguments, and those that are not. This distinction is made syntactically, making it easier for people and compilers to understand.
- Names in template functions that are not dependent upon the template arguments are resolved only in the template definition context (an example would be g(0) in file3.C above).
- Names in template functions that are dependent upon the template arguments are resolved either in the template instantiation context (using external names that may be found in object code symbol tables) or in the template definition context. In the case of nested or transitive instantiations, no "intermediate context" is available.
- Instantiation of template functions is made "position independent", meaning that if the meaning of a program changes depending upon where instantiations are placed, program behavior is undefined.
- Instantiations may be performed at either compile- or link-time. If the choice makes a difference, program behavior is undefined. An implementation is allowed to place compilation-order restrictions on separately-compiled templates.
- Separate compilation of templates is not done by default: the template declaration or definition must use the new keyword "export" in order for it to happen. Otherwise the inclusion method is used. This will provide upward compatibility of existing template code.
Some of these changes involve the template instantiation model (see C++ Newsletter #010) more than the template source model, but are necessary to make separate compilation workable.
Here's the example from above, made into a separately-compiled template:
file4.h:
template <class T> void f(T);
file4.C:
void g(int);
export template <class T> void f(T t) {
    g(0);
}
caller4.C:
#include "file4.h"
void g(long);
void h() {
    f(3.14);
    g(1);   // now, this calls g(long); g(int) not visible
}
In this model, file4.C is compiled explicitly, as is caller4.C, and the source in file4.C is not pulled into the header, explicitly or implicitly. The source is otherwise identical to the inclusion model except for the addition of the "export" keyword. (The meaning of this keyword is somewhat similar to the existing "extern" keyword, and some people wanted to reuse that keyword rather than introduce a new one. After some debate, the committee decided at its recently concluded November meeting not to overload "extern". Also note that the keyword may be placed on either the template declaration or the template definition; this flexibility may help library vendors in shipping products that can be used with either template compilation model.)
One area of uncertainty is how much the "no intermediate context" limitation will affect real code. Here's an example where it matters:
ic1.h:
export template <class T>
void g(const T&);
ic1.C:
export template <class T>
void g(const T& t)
{
    length(t);  // how does length get found?
}
ic2.h:
export template <class T> void f(T);
ic2.C:
#include "ic1.h"
template <class T>
class Container { ... };
export template <class T>
int length (const Container<T>&) { ... }
export template <class T> void f(T t)
{
    Container<T> s;
    g(s);
}
ic3.C:
#include "ic2.h"
class A { ... };
void m() {
    A a;
    f(a);   // this starts the instantiations
}
This is a case of transitive instantiation, where m() instantiates f(A) which instantiates g(Container<A>). Within g(), length(t) is a dependent name lookup, so it can find length either in the definition context (ic1.C) or in the instantiation context (ic3.C). But it's in neither. It's in ic2.C, which is considered "intermediate context". Thus this example would not compile as is, and would have to be recoded to use the inclusion method (basically, drop each "export" and include the .C files in the .h files).
It is an open question how common this kind of intermediate context problem will be. One analysis found no cases of it in the template-intensive Standard Template Library, which may be encouraging. As with many of the new inventions of the C++ standardization process, only time will tell.
USING C++ AS A BETTER C PART 17 - EMPTY CLASSES
Here's a simple one. In C, an empty struct like:
struct A {};
is invalid, whereas in C++ usage like:
struct A {};
or:
class B {};
is perfectly legal. This type of construct is useful when developing a skeleton or placeholder for a class.
An empty class has size greater than zero. Two class objects of empty classes will have distinct addresses, as in:
class A {};
void f()
{
    A* p1 = new A;
    A* p2 = new A;
    // p1 != p2 at this point ...
}
There are still one or two C++ compilers that generate C code as their "assembly" language. To handle an empty class, they will generate a dummy member, so for example:
class A {};
becomes:
struct A {
    char __dummy;
};
in the C output.
TRANSITIONING TO C++
One of the services we offer is advice and support to organizations transitioning to C++ and object-oriented development. Some of the aspects of the service include:
- Advice on which compilers to use
- Support via e-mail
- Development of internal company newsletters
- Training and development of tutorial information
- Advice on which language features to use and avoid
- Object-oriented design / designing for performance
If you'd like more information about how these services could be applied in your organization, please send mail to glenm@glenmccl.com.
INTRODUCTION TO STL PART 4 - MAPS
In the previous issue we talked a bit about STL sets. In this issue we'll discuss another data structure, maps. A map is something like an associative array or hash table, in that each element consists of a key and an associated value. A map must have unique keys, whereas with a multimap keys may be duplicated.
To see how maps work, let's look at a simple application that counts word frequency. Words are input one per line and the total count of each is output.
#include <iostream>
#include <string>
#include <map>
using namespace std;
int main()
{
    typedef map<string, long, less<string> > MAP;
    typedef MAP::value_type VAL;
    MAP counter;
    char buf[256];
    while (cin >> buf)
        counter[buf]++;
    MAP::iterator it = counter.begin();
    while (it != counter.end()) {
        cout << (*it).first << " " << (*it).second << endl;
        it++;
    }
    return 0;
}
This is a short but somewhat tricky example. We first set up a typedef for:
map<string, long, less<string> >
which is an instance of the map template with three template arguments. The first is the type of the key, in this example a string. The second is the type of the value associated with the key, in this case a long integer used as a counter. Finally, because the keys of the map are maintained in sorted order, we provide a comparison function object type, less<string>, which orders the keys using operator< (see issue #016 for another example of this).
Another typedef we establish but do not use in this simple example is the VAL type, which is an instance of the pair template, "pair<const string,long>". pair is used internally within STL, and in this case is used to represent a map element key/value pair. So VAL represents an element in the map.
We then read whitespace-separated words from the input and count each word in the map. The statement:
counter[buf]++;
does several things. First of all, buf is a char array (which decays to a char*), not a string, and must be converted via a constructor. What we've said is equivalent to:
counter[string(buf)]++;
operator[] is overloaded for maps, and in this case the key is used to look up the element and return a long&, that is, a reference to the underlying value. If the key is not yet in the map, operator[] first inserts it with a value of zero. The value is then incremented.
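Finally, we iterate over the map entries, using an iterator. Note that:
(*it).first
can be replaced by:
it->first
only if the library's iterators provide operator->. Standard-conforming libraries do, but some older STL implementations overload only "*", so the (*it) form is the more portable of the two. When * is applied to "it", it returns a reference to a pair<const string,long> object, that is, the underlying type of elements in the map. We then reference "first" and "second", fields in pair, to retrieve keys and values for output.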
For input:
a
b
c
a
b
output is:
a 2
b 2
c 1
There are some complex ideas here, but map is a very powerful feature worth mastering.
ACKNOWLEDGEMENTS
Thanks to Nathan Myers, Eric Nagler, David Nelson, Terry Rudd, Jonathan Schilling, Elaine Siegel, John Spicer, and Clay Wilson for help with proofreading.
SUBSCRIPTION INFORMATION / BACK ISSUES
To subscribe to the newsletter, send mail to majordomo@world.std.com with this line as its message body:
subscribe c_plus_plus
Back issues are available via FTP from:
rmi.net /pub2/glenm/newslett
or on the Web at:
There is also a Java newsletter. To subscribe to it, say:
subscribe java_letter
using the same majordomo@world.std.com address.
-------------------------
Copyright (c) 1996 Glen McCluskey. All Rights Reserved.
This newsletter may be further distributed provided that it is copied in its entirety, including the newsletter number at the top and the copyright and contact information at the bottom.
Glen McCluskey & Associates
Professional C++ Consulting
Internet: glenm@glenmccl.com
Phone: (800) 722-1613 or (970) 490-2462
Fax: (970) 490-2463
FTP: rmi.net /pub2/glenm/newslett (for back issues)
Web: http://rainbow.rmi.net/~glenm