Iterators interact seamlessly with built-in C++ types. In particular, native C++ pointers are treated as iterators to C++ arrays. Naturally, all containers in the Standard C++ Library define an iterator type, i.e., a nested type iterator that represent their respective pointer-like type.
Iterator Categories Iterators fall into categories. This is because different algorithms impose different requirements on an iterator they use. For example, the find() algorithms needs an iterator that can be advanced by incrementing it, whereas the reverse() algorithm needs an iterator that can be decremented as well, etc. Ultimately, there are five categories of iterators in STL and Standard C++ Library:
Using the knowledge of an iterator's category one can provide optimized implementations of an algorithm. The advance() operation is an example. It increments (or decrements for negative n) an iterator. template
<class Iterator, class Distance>Obviously, there are many ways to do this. For a C++ array one would simply perform pointer arithmetic, i.e., add n to the C++ pointer.
inline void advance (Iterator& i, Distance n);
i += n;For a list, Iterators must step through the sequence and advance step-by-step.
if (n >= 0)Observation 1: The iterator category, which is an abstraction that represents a set of requirements to an iterator, is information related to an iterator. It is useful for providing optimized versions of an operation like advance();
while (n--)
++i;
else
while (n++)
--i;
Certain type information is related to an iterator. One of the questions that must be answered when designing the iterators in STL and the Standard C++ Library is: how can information related to a type be expressed in C++? Let's defer this discussion for a moment until we discuss other information related to an iterator.
The Distance Type An operation like advance() obviously needs an argument that indicates how far to advance the iterator:
templateThe type of this distance argument must represent the distance between any two iterators. Hence the distance type depends on the iterator type. For C++ pointers the distance type is the C++ type ptrdiff_t, which can represent the differenc between any two C++ pointers. Also, ptrdiff_t is the distance type of all other iterators in STL and Standard Library. However, the distance type in STL and the Standard C++ Library is not limited to ptrdiff_t.
inline void advance (Iterator& i, Distance n);
input output
| |
v v
+-----------------------+
|
v
forward
|
v
bidirectional
|
v
random access
The Value Type An iterator can be dereferenced. It then returns a reference to a value stored in a container. The type of this referenced value also depends on the respective iterator. For example, if the iterator refers to a container holding integers, the value type will be int. More generally, if the iterator refers to a container that stores elements of an arbitrary type T, the value type will be T.
Observation 2: Each iterator has two related types, its value type and its distance type.
So, we have found more information related to an iterator type. What is this information used for?
Value type and distance type are sometimes needed to implement algorithms. In STL and Standard C++ Library algorithms are separated from containers, i.e., an algorithm takes an iterator and uses it to access the container. No information about the container itself is available to an algorithm. This clear separation of containers and algorithms is the basic idea of Generic Programming, which is the key design idea behind the STL.
To implement algorithms only in terms of iterators it is often necessary to infer both the value type and the distance type from an iterator. Here is an example of a fictitious reverse algorithm:
template <class BidirectionalIterator>Note that distance_type and value_type aren't declared anywhere. How do we know what they are? We need to find a way to make available the distance type and value type associated to the iterator type.
void reverse(BidirectionalIterator first,
BidirectionalIterator last)
{
distance_type n = distance(first, last);
--n;
while(n>0)
{
value_type tmp = *first;
*first++ = *--last;
*last = tmp;
n-=2;
}
}
Also, we want to implement optimized versions of operations like advance() for different iterator categories. Here we would want to make use of the related iterator category information. Assuming the category could be represented as a C++ type a tentative implementation could look like this:
template <class InputIterator, class Distance>Again, it is not entirely obvious how we can represent and eventually find the category information associated to an iterator type. We'll explain how this is done shortly.
inline void advance (InputIterator& i, Distance n)
{
__advance(i, n, iterator_category);
}
template <class RandomAccessIterator, class Distance>
inline void __advance (RandomAccessIterator& i,
Distance n,
random_access_iterator_category)
{
i += n;
}
template <class BidirectionalIterator,
class Distance>
inline void __advance (BidirectionalIterator& i,
Distance n,
bidirectional_iterator_category)
{
if (n>=0)
while (n--) ++i;
else
while (n++) --i;
}
THE REQUIREMENTS Basically, the requirements to the design of iterators in the STL and the Standard C++ Library are:
Algorithms shall be generic, i.e., they shall be implemented only in terms of iterators. Hence we need to deduce type information associated with an iterator type, like the value and distance type, directly from the iterator type itself.
Operations shall be optimized if possible. In other words, for reasons of efficiency we want to implement polymorphic operations that exhibit different behavior depending on some property of their argument types. For example, different versions of the advance() function, optimized for each iterator category, should be available.
There are additional requirements that stem from other design principles of the STL and the Standard C++ Library:
Everything must be resolved at compile-time rather than at runtime. The goal is to make components as fast as possible at run-time. Hence, the function dispatch of polymorphic operations mentioned above can be performed at compile-time.
C++ pointers are iterators on a C++ array. They stand on equal footing with all other iterator types. Hence, whatever solution we choose, it must work for C++ pointers as well.
The way in which information is associated to an iterator type is generic. For example, the fact that an iterator is associated to a certain value type has nothing to do with the fact that the iterator falls into a certain category.
In the July 1996 issue, we explained the solution to the above requirements that was used in the original STL. Now, we'll explore the solution chosen for the Standard C++ Library.
THE SOLUTION CHOSEN FOR THE STANDARD C++ LIBRARY In the Standard C++ Library iterators have traits. The traits technique is a means to associate information with class types as well as with nonclass types. See the second sidebar for more information about the traits technique in general. Here are the iterator traits:
template <class Iterator>The iterator traits are a class template to be instantiated with an iterator type that provides all the information associated with the respective iterator type as typedefs. Therefore, all iterators must define nested types, like Iterator::value_type, for each piece of related information. The value and distance type of an iterator are types in the sense of the C++ type system. Similarly, the information about the iterator category is represented by C++ types as well: There are tag classes for each of the iterator categories. They are empty structures named input_iterator_tag, output_iterator_tag, and so on. Since native C++ pointers can be treated as iterators, we need traits for C++ pointers, as well. This is provided as a partial specialization of the iterator_traits template:
struct iterator_traits
{
typedef Iterator::distance_type distance_type;
typedef Iterator::value_type value_type;
typedef Iterator::iterator_category iterator_category;
};
template <class T>Now let us see how iterator traits help to meet the requirements. Here is our ctitious reverse function again:
struct iterator_traits
{
typedef ptrdiff_t distance_type;
typedef T value_type;
typedef random_access_iterator_tag iterator_category;
};
template <class BidirectionalIterator>Now that we have iterator traits it is easy to determine the mysterious distance and value type from the iterator type: The distance type is iterator_traits
void reverse(BidirectionalIterator first,
BidirectionalIterator last)
{
distance_type n = distance(first, last);
--n;
while(n>0)
{ value_type tmp = *first;
*first++ = *--last;
*last = tmp;
n-=2;
}
}
template <class InputIterator, class Distance>The global advance() function calls the overloaded versions of an __advance() function, which takes an iterator tag as function argument. The iterator tags are of different types. Each type of iterator tag represents an iterator category. Consequently, there are five types of iterator tags:
inline void advance (InputIterator& i, Distance n)
{
__advance(i,n,iterator_traits<InputIterator>::iterator_category());
}
template <class RandomAccessIterator, class Distance>
inline void __advance (RandomAccessIterator& i,
Distance n,
random_access_iterator_tag)
{
i += n;
}
template <class BidirectionalIterator, class Distance>
inline void __advance (BidirectionalIterator& i,
Distance n,
bidirectional_iterator_tag)
{
if (n >= 0)
while (n--)
++i;
else
while (n++)
--i;
}
Note that function overloading is always resolved at compile-time. No runtime overhead is imposed.
In addition, note that there are iterator traits for C++ pointers. Thus, the solution works for C++ pointers as well, as stated in the original requirements we listed above.
Consider the count algorithm:
template<class InputIterator, class T>The distance type is used here as the return type of an algorithm. Without iterator traits the count algorithm cannot deduce its return type from its iterator type. This explains why the count algorithm in STL looks like this:
iterator_traits<InputIterator>::distance_type
count (InputIterator first,
InputIterator last,
const T& val)
{
iterator_traits<InputIterator>::distance_type n = 0;
while (first != last)
if (*first++ == val)
++n;
return n;
}
templateThe distance type is explicitly specified by an additional template parameter. As template parameters can only be automatically deduced from the function argument types and never from the function return type, the result is returned via a fourth function argument. This way a call to the count algorithm in STL is error-prone and looks counterintuitive:
void count (InputIterator first,
InputIterator last,
const T& value,
Distance& n)
{
while (first != last)
if (*first++ == value)
++n;
}
list<string> lst;Compare to the call of the count algorithm in the Standard C++ Library:
int i = 0; // you'd better not forget to initialize !!!
count(lst.begin(),lst.end(),"Hello",i)
list<string> lst;Conclusion: The most convincing advantage of iterator traits compared to the old technique is that types associated to an iterator type can be deduced from the iterator alone. Hence associated types can be used as return types of function arguments, which improves the count algorithm.*
int i = count(lst.begin(),lst.end(),"Hello")
Of course there is more to iterator traits. They improve the design of iterators in the general. Consider for instance the way the iterator category is retrieved in STL. It is via the global function
iterator_category().For each type of iterator, including C++ pointers, there is an overloaded versions of iterator_category(). There is one version taking C++ pointers, and five function templates taking iterator types that are derived from certain iterator base classes. For sake of brevity, we will spare you the details of those base classes. See our July '96 article for reference [1].
template <class T, class Distance>The iterator_category() functions are empty. Their only purpose is to allow the selection of the associated iterator tag type, which is then used for function overload resolution.
inline input_iterator_tag iterator_category
(const input_iterator<T,Distance>&)
{
return input_iterator_tag();
}
// etc. for output_iterator, forward_iterator,
// and bidirectional_iterator
template <class T, class Distance>
inline random_access_iterator_tag iterator_category
(const random_access_iterator<T,Distance>&)
{
return random_access_iterator_tag();
}
template <class T>
inline random_access_iterator_tag iterator_category
(const T*)
{
return random_access_iterator_tag();
}
It is certainly more intuitive to do a compile-time function dispatch via a type, i.e., iterator_traits