Kara Ottewell's STL Tutorial 1

Version 1.3 © Kara Ottewell

Table of Contents
Introduction
Templates ite Domum
Function Templates
Class Templates
What has the STL ever done for us ?
Vector
Deque
List
Allocators
Sequence Adapters
Stack
Queue
Priority Queue
Strings
Iterators
Iterator Adapters
We are searching for the Associative Container
Map and Multimap
Set and Multiset
Algorithms and Functions
Algorithms
Functions
STL Related Web Pages
Bibliography


Warning for the humour-impaired: Any strange references that you can't understand are almost certainly a skit on Monty Python's "Life of Brian"
These notes formed part of an internal course on the STL which I was asked to give to my colleagues at Yezerski Roper


1. Introduction
The C++ Standard Template Library, generally referred to as the STL, saves you, the programmer, from having to re-invent the wheel. This course is aimed at programmers who have reasonable familiarity with the C++ programming language, and know about classes, constructors and the like. Avoiding esoteric functionality, the examples have been written to clearly demonstrate STL features. The sample programs also use Standard Library (as distinct from Standard Template Library) features like fstream and iostream, providing examples of how to use these. A discussion of the Standard Library as a whole is beyond the scope of this document - see Stroustrup and others in the bibliography for more information.
What motivated people to write the STL ? Many people felt that C++ classes were inadequate in situations requiring containers for user defined types, and methods for common operations on them. For example, you might need self-expanding arrays, which can easily be searched, sorted, added to or removed from without messing about with memory reallocation and management. Other Object-Oriented languages used templates to implement this sort of thing, and hence they were incorporated into C++.
Driving forces behind the STL include Alexander Stepanov and Meng Lee at Hewlett-Packard in Palo Alto, California, Dave Musser at General Electric's Research Center in Schenectady, New York, Andrew Koenig, and of course "Mr C++" himself, Bjarne Stroustrup at AT&T Bell Laboratories.
The example programs are known to work on Alpha/VAX VMS 6.2 onwards using DEC C++ 5.6, and Windows NT 4.0 SP3 with Visual C++ 5.0 , and Windows 2000/Windows XP with Visual C++ 6.0 . Platform-specific #pragmas have been guarded with #ifdef _VMS or #ifdef _WIN32. To build under VMS use the MAKE.COM command file. Just give a program name like example_1_1 as its argument and it will look for files with extension .CXX, .CPP, .C , in that order. If you provide the extension then it uses that. On Alphas, output files get an _ALPHA suffix. Here is an example:
$ @MAKE EXAMPLE_1_1 ! On an Alpha DEV$DISK:[PHIL.WWW.STL] CC/PREFIX=ALL EXAMPLE_1_1.C -> EXAMPLE_1_1.OBJ_ALPHA LINK EXAMPLE_1_1 -> EXAMPLE_1_1.EXE_ALPHA $ @MAKE EXAMPLE_1_2 ! Now on a VAX DEV$DISK:[PHIL.WWW.STL] CXX/ASSUME=(NOHEADER_TYPE_DEFAULT)/EXCEPTIONS/TEMPLATE_DEFINE=(LOCAL) EXAMPLE_1_2.CXX -> EXAMPLE_1_2.OBJ CXXLINK EXAMPLE_1_2 -> EXAMPLE_1_2.EXE
A slight buglet introduced in DEC C++ 5.6 for Alpha VMS means that you might get a warning in the CXXLINK step.
%LINK-W-NUDFSYMS, 1 undefined symbol: %LINK-I-UDFSYM, WHAT__K9BAD_ALLOCXV %LINK-W-USEUNDEF, undefined symbol WHAT__K9BAD_ALLOCXV referenced in psect __VTBL_9BAD_ALLOC offset %X00000004 in module MEMORY file SYS$COMMON:[SYSLIB]LIBCXXSTD.OLB;1
The undefined symbol is harmless and never referenced, but you can obtain the official patch from ftp.service.digital.com. Download and run cxxae01056.a-dcx_axpexe to unpack it, then whilst logged on as SYSTEM use @SYS$UPDATE:VMSINSTAL to install it.
Download individual sample programs using Right Click and "Save" on their links, or download all the examples in a .zip file. For Windows NT or Windows 2000, the Developer Studio files are provided in the .zip distribution.
The first two programs are implementations of expanding, integer arrays which are sorted into value order. Example 1.1 is in ANSI C, and 1.2 is C++using the STL. I have tried to make the C program as general as possible by using typedef, but this is still not really adequate, as we will see.
Right Click & save example_1_1.c
{cke_protected}{C}%3C!%2D%2D%23include%20file%3D%22example_1_1.c%22%2D%2D%3E
In the this program (see Phil's C Course for an introduction to C) I used a typedef for the type of item I want to store and sort, and a get_array_space function to allocate and/or expand the available space. Even with very basic error handling get_array_space is rather ungainly. It handles different types of data if I change the typedef, but if I wanted more than one type of data storing, or even more than one buffer of the same type, I would have to write a unique get_array_space_type function for each. The compare_values function would also have to be rewritten, though this is also the case in the C++ code, for user defined types or pointers to values.
Right Click & save example_1_2.cxx
{cke_protected}{C}%3C!%2D%2D%23include%20file%3D%22example_1_2.cxx%22%2D%2D%3E
Contrast the C++ program, Example 1.2, with the previous code. Using the STL, we invoke the vector template class, which allows us to store any data type we like in what is essentially a contiguous, self-expanding, random access array.
2. Templates ite Domum
This is (incorrect) Latin for "Templates Go Home !" and represents the ambivalence that some new C++ programmers feel towards this language feature. I hope that you will be persuaded of its usefulness after reading this section.
C++ supports a number of OOP (Object Oriented Programming) concepts. Broadly speaking, it supports encapsulation through the member functions and private or protected data members, inheritance by allowing classes to be derived from other classes and abstract base classes, and polymorphism through virtual functions, function signatures and templates. Templates achieve polymorphism by allowing us to define classes or functions in a generic way, and let the compiler/linker generate an instantiation of the function or class using the actual types we require.
The STL is built, as its name suggests, on the C++ template feature. There are two types of template: function templates and class templates. Both perform similar roles in that they allow functions or classes to be specified in a generic form, enabling the function or class to be generated for any data type - user defined or built-in.
At first sight this might not appear to be very different from macros in the C language. In the C program Example 1.1 we could have made it more flexible by using a macro to define the comparison function.
#define COMPARE_VALUES( value_type ) \ value_type compare_values_##value_type( const void *a, const void *b ) \ {const value_type *first, *second; \ first = (value_type *)a; second = (value_type *)b; return( *first - *second );} COMPARE_VALUES( float ) /* Generate function for floats, */ COMPARE_VALUES( double ) /* doubles and */ COMPARE_VALUES( int ) /* ints */ . /* Pick comparison function */ qsort( v, nitems, sizeof(array_type), compare_values_int );
The same method can be used for structure generation. There are a number of drawbacks to this approach. You have to explicitly generate functions for all the types you want to use, and there is no type checking in this particular case, so you could easily pass compare_values_float when you meant to compare integers, and you would have to be rigorous about your naming convention. In addition, some people would argue that macros are not as transparent, since you can't see what they expand into until compilation time.
Templates avoid these problems. Because they are built into the language, they are able to provide full type safety checking and deduce the types of their arguments automatically, generating the code for whatever arguments are used. C++ allows you to overload operators like < for user-defined types so the same template definition often suffices for built-in and user-defined classes. The following two sections discuss the syntax and use of template functions and classes.
Function Templates
Template functions have the following form:
template < template-argument-list > function-definition
The template-argument-list is one or more type-names within the scope of the template definition. In template functions the first argument is always a type, as in this code fragment.
template T mymin( T v1, T v2) { return( (v1 < v2) ? v1 : v2 ); }
The above example uses class T but you can also use things like template if you want to constrain a particular parameter to be of some known type. For example, if you were doing some sort of buffer class, you might want people to be able to pre-allocate a particular size of buffer,
template class MyBuffer { protected: T MyBufferObjects[iMaxObjects]; public: & x(const T & x) { array[0] = x; return *this; } MyBuffer void func(const T & buffer) { // Whatever } }; class StringBuffer : public MyBuffer // Defaults to 256 { // Whatever } class ScreenBuffer : public MyBuffer { // Whatever }
You should be able to use the typename keyword in your function (or class) declaration, like this
// This may well give an error but is perfectly legal template T mymin( T v1, T v2) { return( (v1 < v2) ? v1 : v2 ); }
Stroustrup favours the class T format because it means fewer keystrokes. Personally I would prefer the typename T form, but won't use it because it will give errors with some compilers.
Those of use who started programming using proper languages like Fortran are used to the idea of the compiler selecting the correct function variant :-) Not many people using the Fortran MAX function bother to specify IMAX0,JMAX0,KMAX0 and so on. The compiler selects the specific function according to the arguments. Remember that the class T type doesn't have to be a class. It can be a built-in type like int or float. The C++ compiler always tries to find a "real" function with matching arguments and return type before generating a function from the template, as in the following program.
Right Click & save example_2_1.cxx
{cke_protected}{C}%3C!%2D%2D%23include%20file%3D%22example_2_1.cxx%22%2D%2D%3E
The "real" function signature matched the float case, and was used in preference to the template. The using namespace std line is necessary on Windows if we wish to avoid prefixing STL features with std::, e.g. std::cin. It is not necessary with VMS and DEC C++ 5.6, though it may be with future versions of DEC C++.
Class Templates
Template classes have the following form:
template < template-argument-list > class-definition
The template-argument-list is one or more type-name within the scope of the template definition. For example
Right Click & save example_2_2.cxx
{cke_protected}{C}%3C!%2D%2D%23include%20file%3D%22example_2_2.cxx%22%2D%2D%3E
3. What has the STL ever done for us ?
"Well, yes, vectors, I mean obviously the vectors are good ..."
"Don't forget queues Reg, I mean, where would we be without properly organized queues and iterators for any data type ?" General murmurs of agreement
"Yes, alright, apart from vectors and queues ..."
"Sorts Reg - I hated having to code up a new sort routine for every class." Here, here, etc.
"Right. So apart from vectors, queues and associated containers classes, iterators, various useful algorithms and functions, what has the STL ever done for us ?"
"Memory allocators Reg. We can allocate container memory using any scheme we like, and change it without rewriting all the code. It keeps things in order" Reg loses his temper
"Order ? Order ? Oh shut up!"
At the end of this section, the waffle above should start making sense to you, but is unlikely to become more humorous as a result of your studies.
There are three types of sequence containers in the STL. These, as their name suggests, store data in linear sequence. They are the vector, deque and list:
vector
deque
list
To choose a container, decide what sort of operations you will most frequently perform on your data, then use the following table to help you.
Time overhead of operations on sequence containers
Operation Vector Deque List
Access 1st Element Constant Constant Constant
Access last Element Constant Constant Constant
Access "random" element Constant Constant Linear
Add/Delete at Beginning Linear Constant Constant
Add/Delete at End Constant Constant Constant
Add/Delete at "random" Linear Linear Constant
Each container has attributes suited to particular applications. The subsections and code samples below should further clarify when and how to use each type of sequence container.
Throughout this tutorial, I have given the #include file needed to use a feature immediately after the subsection heading. Note that some of the header names have changed since earlier versions of the STL, and the .h suffix has been dropped. Older books may refer to, for example, , which you should replace with  . If you include ANSI C headers, they should have the .h, e.g. . C++ versions of the ANSI C headers, prefixed by the letter "c" and minus the .h are becoming more widely available, but not all implementations currently support them, e.g. .
On OpenVMS systems a reference copy of the source code for the STL can be found in SYS$COMMON:[CXX$LIB.REFERENCE.CXXL$ANSI_DEF] . So for look in there for the file VECTOR.; . For Windows, go into Visual Studio, click on the "binocular search" button on the tool bar, then under the "Index" tab, type vector header file (replace vector with your choice if header file, of course), hit  , then click on the entry in the "Select topic to display" list at the bottom.
Vector
#include
We introduced the vector in Example 1.2, where we used it instead of an array. The vector class is similar to an array, and allows array-type syntax, e.g. my_vector[2] . A vector is able to access elements at any position (referred to as "random" access in the preceding table) with a constant time overhead, O(1). Insertion or deletion at the end of a vector is "cheap". As with the string, no bounds checking is performed when you use operator [].
Insertions and deletions anywhere other than at the end of the vector incur overhead O(N), N being the number of elements in the vector, because all the following entries have to be shuffled along to make room for the new entries, the storage being contiguous. Memory overhead of a vector is very low and comparable to a normal array.
The table below shows some of the main vector functions.
Some Vector Access Functions Purpose ---------------------------- ------- begin() Returns iterator pointing to first element end() Returns iterator pointing _after_ last element push_back(...) Add element to end of vector pop_back(...) Destroy element at end of vector swap( , ) Swap two elements insert( , ) Insert new element size() Number of elements in vector capacity() Element capacity before more memory needed empty() True if vector is empty [] Random access operator
The next example shows a vector in use.
Right Click & save example_3_1.cxx{cke_protected}{C}%3C!%2D%2D%23include%20file%3D%22example_3_1.cxx%22%2D%2D%3E
Note how the element sort takes v.begin() and v.end() as range arguments. This is very common in the STL, and you will meet it again. The STL provides specialized variants of vectors: the bitset and valarray. The former allows a degree of array-like addressing for individual bits, and the latter is intended for numeric use with real or integer quantities. To use them, include the or header files (these are not always supported in current STL implementations). Be careful if you erase() or insert() elements in the middle of a vector. This can invalidate all existing iterators. To erase all elements in a vector use the clear() member function.
Deque
#include
The double-ended queue, deque (pronounced "deck") has similar properties to a vector, but as the name suggests you can efficiently insert or delete elements at either end.
The table shows some of the main deque functions.
Some Deque Access Functions Purpose --------------------------- ------- begin() Returns iterator pointing to first element end() Returns iterator pointing _after_ last element push_front(...) Add element to front of deque pop_front(...) Destroy element at front of deque push_back(...) Add element to end of deque pop_back(...) Destroy element at end of deque swap( , ) Swap two elements insert( , ) Insert new element size() Number of elements in deque capacity() Element capacity before more memory needed empty() True if deque is empty [] Random access operator
A deque, like a vector, is not very good at inserting or deleting elements at random positions, but it does allow random access to elements using the array-like [] syntax, though not as efficiently as a vector or array. Like the vector an erase() or insert() in the middle can invalidate all existing iterators.
The following program shows a deque representing a deck of cards. The queue is double-ended, so you could modify it to cheat and deal off the bottom :-)
Right Click & save example_3_2.cxx
{cke_protected}{C}%3C!%2D%2D%23include%20file%3D%22example_3_2.cxx%22%2D%2D%3E
The card game is a version of pontoon, the idea being to get as close to 21 as possible. Aces are counted as one, picture cards as 10. Try to modify the program to do smart addition and count aces as 10 or 1; use a vector to store your "hand" and give alternative totals.
Notice the check on the state of the input stream after reading in the character response. This is needed because if you hit, say, Z, the input stream will be in an error state and the next read will return immediately, causing a loop if you don't clear cin to a good state.
List
#include
Lists don't provide [] random access like an array or vector, but are suited to applications where you want to add or remove elements to or from the middle. They are implemented as double linked list structures in order to support bidirectional iterators, and are the most memory-hungry standard container, vector being the least so. In compensation, lists allow low-cost growth at either end or in the middle.
Here are some of the main list functions.
Some List Access Functions Purpose -------------------------- ------ begin() Returns iterator pointing to first element end() Returns iterator pointing _after_ last element push_front(...) Add element to front of list pop_front(...) Destroy element at front of list push_back(...) Add element to end of list pop_back(...) Destroy element at end of list swap( , ) Swap two elements erase(...) Delete elements insert( , ) Insert new element size() Number of elements in list capacity() Element capacity before more memory needed empty() True if list is empty sort() Specific function because sort routines expect random access iterators
Right Click & save example_3_3.cxx
{cke_protected}{C}%3C!%2D%2D%23include%20file%3D%22example_3_3.cxx%22%2D%2D%3E
The loop over elements starts at yrl.begin() and ends just before yrl.end(). The STL .end() functions return iterators pointing just past the last element, so loops should do a != test and not try to dereference this, most likely invalid, position. Take care not to reuse (e.g. ++) iterators after they have been used with erase() - they will be invalid. Other iterators, however, are still valid after erase() or insert().
Container Caveats
Be aware that copy constructors and copy assignment are used when elements are added to and (in the case of the vector and deque) deleted from containers, respectively. To refresh your memories, copy constructor and copy assignment member functions look like this example:
class MyClass { public: . // Copy constructor MyClass( const MyClass &mc ) { // Initialize new object by copying mc. // If you have *this = mc , you'll call the copy assignment function } // Copy assignment MyClass & operator =( const MyClass &mcRHS ) { // Avoid self-assignment if ( this != &mcRHS ) { // Be careful not to do *this = mcRHS or you'll loop . } return( *this ); } };
When you put an object in a container, the copy constructor will be called. If you erase elements, then destructors and copy assignments (if other elements need to be shuffled down) will be called. see supplementary example, RefCount.cxx for a demonstration of this.
Another point to bear in mind is that, if you know in advance how many elements you are going to add to a container, you can reserve() space, avoiding the need for the STL to reallocate or move the container.
vector things; things.reserve( 30000 ); for ( ... ) { things.push_back( nextThing ); }
The above code fragment reserves enough space for 30000 objects up front, and produced a significant speed up in the program.
Allocators
Allocators do exactly what it says on the can. They allocate raw memory, and return it. They do not create or destroy objects. Allocators are very "low level" features in the STL, and are designed to encapsulate memory allocation and deallocation. This allows for efficient storage by use of different schemes for particular container classes. The default allocator, alloc, is thread-safe and has good performance characteristics. On the whole, it is best to regard allocators as a "black box", partly because their implementation is still in a state of change, and also because the defaults work well for most applications. Leave well alone !
4. Sequence Adapters
Sequence container adapters are used to change the "user interface" to other STL sequence containers, or to user written containers if they satisfy the access function requirements. Why might you want to do this ? Well, if you wanted to implement a stack of items, you might at first decide to base your stack class on the list container - let's call it ListStack - and define public member functions for push(), pop(), empty() and top(). However, you might later decide that another container like a vector might be better suited to the task. You would then have to define a new stack class, with the same public interface, but based on the vector, e.g. VectorStack, so that other programmers could choose a list or a vector based queue. It is obvious that the number of names for what is essentially the same thing start to mushroom. In addition, this approach rules out the programmer using his or her own underlying class as the container.
Container adapters neatly solve this by presenting the same public interface irrespective of the underlying container. Being templatized, they avoid name proliferation. Provided the container type used supports the operations required by the adapter class (see the individual sections below) you can use any type for the underlying implementation. It is important to note that the adapters provide a restricted interface to the underlying container, and you cannot use iterators with adapters.
Stack
#include
The stack implements a Last In First Out, or LIFO structure, which provide the public functions push(), pop(), empty() and top(). Again, these are self explanatory - empty returns a bool value which is true if the stack is empty. To support this functionality stack expects the underlying container to support push_back(), pop_back(), empty() or size() and back()
Container Function Stack Adapter Function ------------------ ---------------------- back() top() push_back() push() pop_back() pop() empty() empty() size() size()
You would be correct in surmising that you can use vector, deque or list as the underlying container type. If you wanted a user written type as the container, then if provided the necessary public interfaces, you could just "plug" it into a container adapter.
Example 4.1 demonstrates a stack implemented with a vector of pointers to char. Note that the syntax of using container adapters differs from that shown in Saini and Musser or Nelson's book, and is based on the December 1996 Working Paper of the ANSIC++ Draft Standard.
Right Click & save example_4_1.cxx
{cke_protected}{C}%3C!%2D%2D%23include%20file%3D%22example_4_1.cxx%22%2D%2D%3E
Note how the stack declaration uses two arguments. The first is the type of object stored, and the second is a container of the same type of object.
Queue
#include
A queue implements a First In First Out, or FIFO structure, which provides the public functions push(), pop(), empty(), back() and front() ( empty() returns a bool value which is true if the queue is empty). To support these, queue expects the underlying container to have push_back(), pop_front(), empty() or size() and back()
Container Function Queue Adapter Function ------------------ --------------------- front() front() back() back() push_back() push() pop_front() pop() empty() empty() size() size()
You can use deque or list as the underlying container type, or a user-written type. You can't use a vector because vector doesn't support pop_front(). You could write a pop_front() function for vector, but this would be inefficient because removing the first element would require a potentially large memory shuffle for all the other elements, taking time O(N).
The following code shows how to use a queue.
Right Click & save example_4_2.cxx
{cke_protected}{C}%3C!%2D%2D%23include%20file%3D%22example_4_2.cxx%22%2D%2D%3E
Note how we haven't given a second argument in the queue declaration, but used the default deque given in the header file.
Priority Queue
#include
A priority_queue, defined in the header, is similar to a queue, with the additional capability of ordering the objects according to a user-defined priority. The order of objects with equal priority is not really predictable, except of course, they will be grouped together. This might be required by an operating system process scheduler, or batch queue manager. The underlying container has to support push_back(), pop_back(), empty(), front(), plus a random access iterator and comparison function to decide priority order.
Container Function Priority Queue Adapter Function ------------------ ------------------------------- front() top() push_back() push() pop_back() pop() empty() empty() size() size() [] random iterators Required to support heap ordering operations
Hence a vector or a deque can be used as the underlying container, or a suitable user-provided class.
The next sample program demonstrates a priority_queue implemented with a vector of task objects. Note that the syntax of using container adapters differs from that shown in Saini and Musser or Nelson's book.
Right Click & save example_4_3.cxx
{cke_protected}{C}%3C!%2D%2D%23include%20file%3D%22example_4_3.cxx%22%2D%2D%3E
Example 4.3 program shows a user-defined comparison function object (discussed later), the only member of the PrioritizeTasks class. This is used to determine the relative priority of tasks and, like the << operator, is made a friend of the TaskObject class so that it can access the private data members. When the tasks are pulled off the priority_queue, they are in our notional execution order, highest priority first.
5. Strings
#include
A member of the C++ standards committee was allegedly told that if strings didn't appear pretty darn quickly, then there was going to be a lynching. There hasn't been a lynching, and whilst we can debate the merits of this, I think there is general agreement that it is a good thing to have strings at last. Those of us who started programming with proper languages, like Fortran, have long criticized the rather ugly syntax of C string manipulation - "What ? You have to call a function to add two strings ?" being a typical comment.
The C++ string template class is built on the basic_string template. Providing much of the functionality of the container classes like vector, it has built in routines for handling character set conversion, and wide characters, like NT's Unicode. The string class also provides a variety of specialized search functions for finding substrings. The characteristics of the character set stored in the string are described by the char_traits structure within the string, there being a different definition of this for each type of character set. On the whole you needn't concern yourself too much with these details if you are using strings of ASCII characters. Strings, like the vector, expand as you add to them, which is much more convenient than C-style strings, where you either have to know how big they will be before you use them, or malloc and realloc. The largest possible string that can be accommodated is given by the max_size() access function.
Some String Access Functions Purpose ---------------------------- ------- find(...) Find substring or character, start at start find_first_of(...) Find first occurrence of any characters in given set, starting from start of string find_last_of(...) Find last occurrence of any characters in given set, starting from start of string find_not_first_of(...) Find first occurrence of characters _not_ in given set, starting from start of string find_last_not_of(...) Find last occurrence of characters _not_ in given set, starting from start of string rfind(...) Find substring or character, start at end size() Number of elements in vector [] Random access to return a single character - no bounds checking at(...) Random access to return a single character - with bounds checking + Concatenate strings swap( , ) Swap two strings insert( , ) Insert a string at the specified position replace(...) Replace selected substring with another string
The string provides the highest level of iterator functionality, including [] random access. Hence all relevant standard algorithms work with string. You can sort, reverse, merge and so on. Specialized versions of some algorithms, like swap(), are provided for strings to take advantage of certain optimization techniques. The operator [] allows you to access a single character in a string, but without any bounds checking. Use the at() function if you want bounds checking. The operator+ allows easy string concatenation, so you can now do things like
string firstname, lastname, name; . name = firstname + " " + lastname;
or
name = firstname; name += " "; name += lastname;
Easily understandable documentation on the string class is still a bit thin on the ground at the moment, so I have compiled some sample code to illustrate the main facilities.
Right Click & save example_5_1.cxx
{cke_protected}{C}%3C!%2D%2D%23include%20file%3D%22example_5_1.cxx%22%2D%2D%3E
The next program puts some of the string functions to use in a simple expression evaluator, which takes arithmetic-style expressions. It also shows the at() function, which unlike operator[], at() will throw an out_of_rangeexception for a bad index. Try calculating the rest energy of an electron and proton (E = m*c^2).
Right Click & save example_5_2.cxx
{cke_protected}{C}%3C!%2D%2D%23include%20file%3D%22example_5_2.cxx%22%2D%2D%3E
The expression evaluator above introduces maps, discussed later. Here they are used to allow us to get a numeric value from the symbolic name stored in a string.
6. Iterators
#include // Don't normally need to include this yourself
An iterator you will already be familiar with is a pointer into an array.
char name[] = "Word"; char ch, *p; p = name; // or &name[0] if you like ch = p[3]; // Use [] for random access ch = *(p+3);// Equivalent to the above *p = 'C'; // Write "through" p into name while ( *p && *p++ != 'r' ); // Read name through p, look for letter 'r'
Looking at the above code sample shows how flexible and powerful iterators can be. The above code fragment uses p in at least 5 different ways. We take it for granted that the compiler will generate the appropriate offset for array elements, using the size of a single element.
The STL iterators you've already met are those returned by the begin() and end() container access functions, that let you loop over container elements. For example:
list l; list::iterator liter; // Iterator for looping over list elements for ( liter = l.begin(); liter != l.end(); ++liter ) { *liter = 0; }
The end-of-loop condition is slightly different to normal. Usually the end condition would be a less than < comparison, but as you can see from the table of iterator categories below, not all iterators support <, so we increment the iterator from begin() and stop just before it becomes equal to end(). It is important to note that, for virtually all STL purposes, end() returns an iterator "pointing" to an element just after the last element, which it is not safe to dereference, but is safe to use in equality tests with another iterator of the same type. The pre-increment ++ operator can sometimes yield better performance because there is no need to create a temporary copy of the previous value, though the compiler usually optimizes this away.
Iterators are a generalized abstraction of pointers, designed to allow programmers to access different container types in a consistent way. To put it more simply, you can think of iterators as a "black box" between containers and algorithms. When you use a telephone to directly dial someone in another country, you don't need to know how the other phone system works. Provided it supports certain basic operations, like dialling, ringing, reporting an engaged tone, hanging up after the call, then you can talk to the remote person. Similarly, if a container class supports the minimum required iterator types for an algorithm, then that algorithm will work with the container.
This is important because it means that you can use algorithms such as the sort and random_shuffle we've seen in earlier examples, without their authors having to know anything about the containers they are acting on, provided we support the type of iterator required by that algorithm. The sort algorithm, for example, only needs to know how to move through the container elements, how to compare them, and how to swap them. There are 5 categories of iterator:
Random access iterators
Bidirectional iterators
Forward iterators
Input iterators
Output iterators
They are not all as powerful in terms of the operations they support - most don't allow [] random access, as we've seen with the difference between vector and list. The following is a summary of the iterator hierarchy, most capable at the top, operations supported on the right.
Iterator Type Operations Supported ^ +-------------------------------------+ / \ \ == != < > >= [] / / Access \ \ =*p / /-----------\ \-------------------------/ /Bidirectional\ \ == != ++ -- / / \ \ *p= -> =*p / /-----------------\ \-------------------/ / Forward \ \ == != ++ / / \ \ *p= -> =*p / /-----------+-----------\ \-------+-----/ /Input | Output\ \ == !=| ++ / / | \ \++ ->|*p=/ +--------------+--------------+ \ =*p| / \ | / \ |/ \ / v
The Iterator Hierarchy
The higher layers have all the functionality of the layers below, plus some extra. Only random iterators provide the ability to add or subtract an integer to or from the iterator, like *(p+3). If you write an iterator it must provide all the operations needed for its category, e.g. if it is a forward iterator it must provide ==, !=, ++, *p=, -> and =*p. Remember that ++p and p++ are different. The former increments the iterator then returns a reference to itself, whereas the latter returns a copy of itself then increments.
Operators must retain their conventional meaning, and elements must have the conventional copy semantics. In a nutshell, this means that the copy operation must produce an object that, when tested for equality with the original item, must match. Because only random iterators support integer add and subtract, all iterators except output iterators provide a distance() function to find the "distance" between any two iterators. The type of the value returned is
template typename iterator_traits::difference_type
This is useful if, for example, you find() a value in a container, and want to know the "position" of the element you've found.
map< key_type, data_type >::iterator im; map< key_type, data_type >::difference_type dDiff; im = my_map.find( key ); dDiff = distance( my_map.begin(), im );
Of course, this operation might well be inefficient if the container doesn't support random access iterators, since in that case it will have to "walk through" the elements comparing the iterators.
Just as you can declare pointers to const objects, you can have iterators to const elements. The const_ prefix is used for this purpose, e.g.
vector::iterator i; // Similar to my_type *i vector::const_iterator i; // Similar to const my_type *i
The iterator_traits for a particular class is a collection of information, like the "iterator tag" category, which help the STL "decide" on the best algorithm to use when calculating distances. The calculation is trivial for random iterators, but if you only have forward iterators then it may be a case of slogging through a linked list to find the distance. If you write a new class of container, then this is one of the things you must be aware of. As it happens, the vector, list, deque, map and set all provide at least Bidirectional iterators, but if you write a new algorithm, you should not assume any capability better than that which you really need. The lower the category of iterator you use in your algorithm, the wider the range of containers your algorithm will work with.
Although the input and output iterators seem rather poor in capability, in fact they do add the useful ability to be able to read and write containers to or from files. This is demonstrated in the program below, and again in Example 7.2.
Right Click & save example_6_1.cxx
{cke_protected}{C}%3C!%2D%2D%23include%20file%3D%22example_6_1.cxx%22%2D%2D%3E
The result of the possible restrictions on an iterator is that most algorithms have two iterators as their arguments, or (perhaps less safely) an iterator and a number of elements count. In particular, when you are using iterators, you need to be aware that it isn't a good idea to test an iterator against NULL, or to test if one iterator is greater than another. Testing for equality or inequality is safe except for output iterators, which is why the loops in the example code use iterator != x.end() as their termination test.
Iterator Adapters
Like the container adapters, queue, priority_queue and stack, iterators have adapters too. There are three types:
Reverse iterators
Insert iterators
Raw storage iterators
The reverse iterator reverses the behaviour of the ++ and -- operators, so you can write code like this:
vector v; vector::reverse_iterator ir; . for ( ir = v.rbegin(); ir != v.rend(); ++ir ) { // Whatever, going from end to start of vector x = *ir; }
Standard containers all provide rbegin() and rend() functions to support this kind of thing.
The insertions iterators will, depending on the type of container, allow insertion at the front, back or middle of the elements, using front_insert_iterator, back_insert_iterator or insert_iterator. Because you might just as well use container.push_back() and so forth, their main use is as the return value of functions like front_inserter(), back_inserter and inserter, which modify how a particular algorithm should work.
Raw storage iterators are used for efficiency when performing operations like copying existing container elements to regions of uninitialized memory, such as that obtained by the STL functions get_temporary_buffer and return_temporary_buffer. Look in the header for examples of iterator use.
7. We are searching for the Associative Container
"We've already got one !"
Mumbles of "Ask them what it looks like"
"Well what does it look like ?"
"It's a verra naice !"
There are four types of associative container in the STL. Associative containers are used to store objects that we want to be able to retrieve using a key. We could use a map as a simple token/value database, where a token might be a character string, and the value might be an integer.
Associative containers store items in key order, based on a user-supplied comparison function, and the multi variants allow duplicate keys. Lookup is O(logN), N being the number of items stored. The associative containers are:
map
multimap
set
multiset
All four associative containers store the keys in sorted order to facilitate fast traversal. For built-in types the Compare function can simply be a suitable STL function object, e.g. map > If you are storing pointers to objects rather than the objects themselves, then you will need to provide your own comparison function object even for built-in types. The multi variant of the container is able to store more than one entry with the same key, whereas map and set can only have one entry with a particular key.
In Stroustrup's book he shows how to make a hash_map variant of map. When working well, even with large data sets this can perform lookups in O(1) time, compared to O(logN) performance from the map. However, hashing can exhibit pathological behaviour if many keys hash to the same value, and if hash table expansion is required, that can be a slow operation.
Map and Multimap
#include
A map is used to store key-value pairs, with the values retrieved using the key. The multimap allows duplicate keys, whereas maps insist on unique keys. Items in a map are, when they are dereferenced through an iterator for example, returned as a pair, which is a class defined in the header. The pair has two members, first and second which are the key and the data respectively. The pair is used throughout the STL when a function needs to return two values.
Some Map Access Functions Purpose ------------------------- ------- begin() Returns iterator pointing to first element end() Returns iterator pointing _after_ last element swap( , ) Swap two elements insert( , ) Insert a new element size() Number of elements in map max_size() Maximum possible number of elements in map empty() True if map is empty [] "Subscript search" access operator
In the sample program below, which uses first and second, a list of tokens and values in the form
pi = 3.1415926535898 c = 299792459.0
are read in from the file tokens.dat, then you are prompted to enter a token name for which the corresponding value is displayed. Because map supports the [] subscript operator, you can access the stored value using the key as the subscript.
Right Click & save example_7_1.cxx
{cke_protected}{C}%3C!%2D%2D%23include%20file%3D%22example_7_1.cxx%22%2D%2D%3E
In Example 5.2 we used the following lookup method with a map
value = symbol_values[item];
This is fine where we know that item is definitely in symbol_values[], but generally you must use the find(...) function and test against end(), which is the value returned if the key doesn't exist.
map< key_type, data_type >::iterator i i = my_map.find( key ); if ( i != my_map.end() ) { // Got it }
Several variants of an insert() function exist for the map. The single argument version of insert allows you test whether the item was already in the map by returning a pair< iterator, bool >. If successful the .second bool value will be true and the iterator will "point" at the inserted item. On failure .second will be false and the iterator will point at the duplicate key that caused the insertion to fail.
The map can only store one value against each key. Because each key can only appear once, if you try and add a second instance of the same key, then that will supercede the existing one. Edit the tokens.dat file used by Example 7.1 and convince yourself that this is the case. In situations where this restriction is not acceptable, the multimap should be used.
Set and Multiset
#include
The set stores unique keys only, i.e. the key is the value. Here are some of the set access functions:
Some Set Access Functions Purpose ------------------------- ------ begin() Returns iterator pointing to first element end() Returns iterator pointing _after_ last element swap( , ) Swap two elements insert( , ) Insert a new element size() Number of elements in set max_size() Maximum possible number of elements in set empty() True if set is empty
Like map, set supports the insert() function. Entries are kept in order, but you can provide your own comparison function to determine that order. Useful algorithms operating on a set are includes(), set_union(), set_intersection(), set_difference() and set_symmetric_difference(). The set supports bidirectional iterators, all set iterators are const_iterators, even if you declare them as set::iterator, so watch out for that.
Right Click & save example_7_2.cxx
{cke_protected}{C}%3C!%2D%2D%23include%20file%3D%22example_7_2.cxx%22%2D%2D%3E
This example also shows the use of an output iterator which in this case is directing the output to cout, but this could just as well be a file. The set can only store unique keys, hence if you try and insert a second instance of the same key a failure will result. The single argument version of insert( const value_type& ) returns pair( it, true or false) with the same meaning as for map. Remember that you can define what equality means, so the situation may arise where two "identical" set elements have different data members. In situations where this restriction is not acceptable, the multiset should be used.
8. Algorithms and Functions
We've already met and used several of the STL algorithms and functions in the example programs, sort(...) being one of them. In the STL, algorithms are all template functions, parameterized by iterator types. For example, sort(...) might be implemented like this:
template inline void sort (RandomAccessIter first, RandomAccessIter last) { if (!(first == last)) { // Do the sort } }
Because algorithms only depend on the iterator type they need no knowledge of the container they are acting on. This allows you to write your own container and, if it satisfies the iterator requirements, the STL algorithms will work with a container type "unknown" when they were written. Because the algorithms are all templatized, the compiler should be able to generate inline code to do the operation just as efficiently as if you had "hand coded" the routine.
Many algorithms need a little help from the programmer to determine, for example, whether one item in a container is greater, less than or equal to another. This is where function objects come into the picture. Function objects are used similarly to function pointers in C. We have seen one example of function pointers in qsort used by Example 1.1. In OSF/Motif programs we often need to supply a "callback function" that will be executed when a particular event is seen by a "Widget", e.g.someone clicking on a button widget. In C we might do this:
typedef void (*CallbackProcedure)( Context c, Event e, Userdata u); struct { CallbackProcedure callback; Userdata udat; } CallbackRecord; . /* My function to be called when button is pressed */ void ButtonPressedCB( Context c, Event e, Userdata u); . CallbackRecord MyCallbackList[] = { {ButtonPressedCB, MyData}, {NULL,NULL} }; . SetButtonCallback( quit_button, MyCallbackList );
Problems with this approach are the lack of type safety, the overhead associated with indirecting function calls, lack of inline optimization, and problems with interpreting "user data" which tends to end up being a (void *) pointer. STL function objects avoid these problems because they are templatized, so provided the function object is fully defined at the point of use, the compiler can generate the code inline at compilation time. User data can be kept in the function object, and maintains type safety. You can, of course, pass function pointers to algorithms if you wish, but in the following sections it should become apparent that function objects are generally a better idea.
Algorithms
#include
We all know that an algorithm is abstract logical, arithmetical or computational procedure that, if correctly applied, ensures the solution of a problem. But what is an STL algorithm ? STL algorithms are template functions parameterized by the iterator types they require. In the iterators section I likened iterators to a "black box" that allowed algorithms to act on any container type which supported the correct iterators, as shown in the diagram below.
+------------------+ +------------------+ | +---------------------+ | | Algorithms | <-- Iterators --> | Containers | | Algorithms | <-- Iterators --> | Containers | | +---------------------+ | +------------------+ +------------------+
The "abstraction layer" provided by iterators decouples algorithms from containers, and vastly increases the capability of the STL. Not all containers support the same level of iterators, so there are often several variants of the same algorithm to allow it to work across a range of containers without sacrificing speed benefits for containers with more capable iterators. The appropriate version of the algorithm is automatically selected by the compiler using the iterator tag mechanism mentioned earlier. It does this by using the iterator_category() template function within a "jacket definition" of the algorithm, and a rather convoluted mechanism which you don't really need to worry about (see Mark Nelson's book pages 338-346 if you really want to know the grisly details).
Something you should worry about is whether the containers you are using support the iterators you need for an algorithm. The best way to determine this is to use a reference book, or look at the header file. For example, the min_element algorithm will have a definition similar to this:
template ForwardIterator min_element (ForwardIterator first, ForwardIterator last);
Hence it requires a container that supports at least forward iterators. It can lead to strange errors if you try an algorithm with a container that doesn't provide the necessary iterators, because the compiler will try and generate code from the various templates and get confused before collapsing with an error. On VMS, this might result in a lot of
%CXX-E-PARMTYPLIST, Ill-formed parameter type list. %CXX-E-BADTEMPINST, Previous error was detected during the instantiation of .. %CXX-E-OVERLDFAIL, In this statement, the argument list .. matches no ..
errors. In Windows you tend to get lots of
.. <`template-parameter-1',`template-parameter-2', .. .. could not deduce template argument for .. .. does not define this operator or a conversion to a type acceptable to the predefined operator
Try compiling the example code and see what errors you get.
Right Click & save example_8_1.cxx
{cke_protected}{C}%3C!%2D%2D%23include%20file%3D%22example_8_1.cxx%22%2D%2D%3E
There are 60 different algorithms in 8 main categories in the STL. See Stroustrup pages 507-511 for a full list of all the functions.
Nonmodifying Sequence Operations - these extract information, find, position within or move through elements but don't change them, e.g. find() .
Modifying Sequence Operations - these are miscellaneous functions that do change the element they act on, e.g. swap(), transform(), fill(), for_each() .
Sorted Sequences - sorting and bound checking functions, e.g. sort(), lower_bound() .
Set Algorithms - create sorted unions, intersections and so on, e.g. set_union(), set_intersection() .
Heap Operations - e.g. make_heap(), push_heap(), sort_heap() .
Minimum and Maximum - e.g. min(), max(), min_element(), max_element() .
Permutations - e.g. next_permutation(), prev_permutation() .
Numeric - include Some of the algorithms, like unique() (which tries to eliminate adjacent duplicates) or replace(), can't simply eliminate or replace elements because they have no knowledge of what the elements are. What they actually do is shuffle the unwanted elements to the end of the sequence and return an iterator pointing just past the "good" elements, and it is then up to you to erase() the others if you want to. To get round this several algorithms have an _copy suffix version, which produces a new sequence as its output, containing only the required elements.
Algorithms whose names end with the _if suffix, only perform their operation on objects that meet certain criteria. To ascertain whether the necessary conditions, known as predicates, have been met, you pass a function object returning a bool value. There are two types of predicate: Predicate and BinaryPredicate. Predicates dereference a single item to test, whereas BinaryPredicates dereference two items which they might compare for instance.
template void count_if( InputIterator first, InputIterator last, Predicate pred,Size& n);
This will return the number of objects in the range first to just before last that match the Predicate function object pred , which takes one argument - a reference to the data type you are checking. The next algorithm requires a BinaryPredicate.
template ForwardIterator adjacent_find (ForwardIterator first, ForwardIterator last, BinaryPredicate binary_pred);
This will look in the range first to just before last for two adjacent objects that "match". If no match is found, then it returns last. Because a match is determined by the BinaryPredicate function object binary_pred , which takes two arguments (references to the appropriate data types), you can match on any conditions you like. In fact, there are two versions of adjacent_find : one just requires first and last and uses the == operator to determine equality, and the one above which gives you more control over the match test.
With the information above, you should now be able to look at an algorithm in the header file or reference manual, and determine what sort of function object, if any, you need to provide, and what sort of iterators your container must support if the algorithm is to be used on it.
Functions and Function Objects
#include
Function objects are the STL's replacement for traditional C function pointers, and if you look at the STL algorithms, they are written as they would be if function objects were function pointers. This is why you can use function pointers (or plain, old functions) with the correct argument signature if you wish, though function objects offer several advantages, as we will see.
We have already seen a function object used in Example 4.3 to compare two task objects. This is superior to the qsort style comparison function in many respects. The function object provides type safety, allows an object's copy constructors to be used if necessary (rather than just doing a binary copy), and doesn't require that the objects be contiguous in memory - it can use the appropriate iterators to walk through the objects. Function objects can be used to "tuck away" data that would otherwise have to be global, or passed in a "user data" pointer. The usual template features like inline optimization and automatic code generation for different data types also apply.
"Why a 'function object'?", you might ask. What would be wrong with using a function template like this:
template bool is_less_than( const T &x, const T &y ) { return x < y; }; . sort( first, last, is_less_than() );
Unfortunately this is not legal C++. You can't instantiate a template function in this way (Stroustrup page 855). The correct thing to do is to declare a template class with operator() in it.
template class is_less_than { // A function object bool operator()( const T &x, const T &y ) { return x < y;} }; . sort( first, last, is_less_than() );
This is legal because we have instantiated the function object for MyObjects. There are three main uses for function objects within the STL. The comparison and predicate function objects which we have met already return a bool value indicating the result of a comparison, e.g. one object greater than another, or telling an algorithm whether to perform a conditional action, e.g. remove all objects with a particular attribute. The numeric function objects perform operations like addition, subtraction, multiplication or division. These usually apply to numeric types, but some, like +, can be used with strings. Several function objects are built in to the STL, such as plus, minus, multiplies, divides, modulus, negate, equal_to, not_equal_to, greater, and so on. See the header file for a complete list. If the data type defines the correct operator, you can use the pre-defined template function objects like this:
some_algorithm( first, last, greater() );
so you don't always need to create your own function object from scratch. Try and use one of the library versions if it is available. This saves effort, reduces the chances of error and improves portability. To increase the flexibility of STL function objects, adapters are provided which allow us to compose slightly modified function objects from the standard ones. If we wanted to find values greater than 1997 in a vector of integers we would use a binder to take advantage of the greater() function, which takes two arguments, to compare each value with 1997.
iter = find_if( v.begin(), v.end(), bind2nd(greater(),1997) );
Other adapters exist to allow negation of predicates, calling of member functions, or use of "real" function pointers with binders. This topic is covered in some detail by Stroustrup in pages 518 onwards.
9. STL Related Web Pages
Here are a few of the URL's I've collected relating to the STL and C++ draft standard. Use AltaVista http://www.altavista.digital.com/ and search for STL Tutorial, or the Yahoo! http://www.yahoo.com/ Standard Template Library section for more links.
Bjarne Stroustrup's Homepage A man who needs no introduction, he has many links to other useful C++ and STL sites
http://www.research.att.com/~bs/homepage.html
Mumit's STL Newbie Guide Mumit Khan's informative STL introduction is full of examples
http://abel.hive.no/C++/STL/stlnew.html
Standard Template Library Dave Musser's Web Page. Highly recommended
http://www.cs.rpi.edu/~musser/stl.html
The ISO/ANSI C++ Draft Jason Merrill's HTML of the 02-Dec-1996 draft
http://www.cygnus.com/misc/wp/index.html
C++ Standard Template LibraryAnother great tutorial, by Mark Sebern
http://www.objectplace.com/te
December 1996 Working Paper of the ANSI C++ Draft Standard
http://www.cygnus.com/misc/wp/dec96pub/
The Standard Template Library Silicon Graphics STL Reference Manual, (some nonstandard features)
http://www.sgi.com/tech/stl/stl_index.html
Ready-made Components for use with the STL Collected by Boris Fomitche
http://www.metabyte.com/~fbp/stl/components.html
Sites of interest to C++ users by Robert Davies, this is packed full of very useful web site URLs for both the beginner and more advanced C++ programmer
http://webnz.com/robert/cpp_site.html#Learn
10. Bibliography
[Amazon Books] This is an approved Amazon.com Associates site, and if you click on the titles below it will take you straight to the Amazon Books order page for that book. Amazon.com offers a safe option for Internet purchases: The Netscape Secure Commerce Server, which encrypts any information you type in. Click here to read their policy on privacy and security.
The C++ Programming Language (Third Edition) by Bjarne Stroustrup, Pub. Addison-Wesley, ISBN 0-201-88954-4
300 more pages than 2nd edition, much of the new material concerns the STL
STL Tutorial and Reference Guide C++ Programming with the Standard Template Library by David R. Musser and Atul Saini, Pub. Addison-Wesley, ISBN 0-201-63398-1
Fairly useful in conjunction with Nelson
C++ Programmer's Guide to the Standard Template Library by Mark Nelson, Pub. IDG Books Worldwide, ISBN 1-56884-314-3
Plenty of examples and more readable than most of the other books
The Annotated C++ Reference Manual (known as the ARM) by Margaret A. Ellis and Bjarne Stroustrup, Pub. Addison-Wesley, ISBN 0-201-51459-1
Explains templates - a "must have" book for anyone doing C++ programming
Data Structures and Algorithms in C++ by Adam Drozdek, Pub. PWS Publishing Company, ISBN 0-534-94974-6
Not about the STL, but useful for understanding the implementation
Standard Template Library : A Definitive Approach to C++ Programming Using STL by P. J. Plauger, Alexander A. Stepanov, Meng Lee, Pub. Prentice Hall, ISBN 0-134-37633-1
Back to Top
You can edit text on your website by double clicking on a text box on your website. Alternatively, when you select a text box a settings menu will appear. Selecting 'Edit Text' from this menu will also allow you to edit the text within this text box. Remember to keep your wording friendly, approachable and easy to understand as if you were talking to your customer