Part of ARTICLE "Iterators in the Standard C++ Library"
Klaus Kreft & Angelika Langer
Angelika Langer

Iterators Iterators are pointer-like objects that allow programs to step through the elements of a container sequentially without exposing the underlying representation. Iterators can be advanced from one element to the next by incrementing them. Some iterators can also be decremented or allow arbitrary jumps from one element to another, as we will see later. When they are dereferenced, iterators yield a reference to a container element. In addition, they can be compared to each other for equality or inequality.

Iterators interact seamlessly with built-in C++ types. In particular, native C++ pointers are treated as iterators to C++ arrays. Naturally, all containers in the Standard C++ Library define an iterator type, i.e., a nested type iterator that represent their respective pointer-like type.

Iterator Categories Iterators fall into categories. This is because different algorithms impose different requirements on an iterator they use. For example, the find() algorithms needs an iterator that can be advanced by incrementing it, whereas the reverse() algorithm needs an iterator that can be decremented as well, etc. Ultimately, there are five categories of iterators in STL and Standard C++ Library:

An iterator category is an abstraction. It represents a set of requirements to an iterator.

Using the knowledge of an iterator's category one can provide optimized implementations of an algorithm. The advance() operation is an example. It increments (or decrements for negative n) an iterator. template

<class Iterator, class Distance>
inline void advance (Iterator& i, Distance n);
Obviously, there are many ways to do this. For a C++ array one would simply perform pointer arithmetic, i.e., add n to the C++ pointer.
i += n;
For a list, Iterators must step through the sequence and advance step-by-step.
if (n >= 0)
while (n--)
++i;
else
while (n++)
--i;
Observation 1: The iterator category, which is an abstraction that represents a set of requirements to an iterator, is information related to an iterator. It is useful for providing optimized versions of an operation like advance();

Certain type information is related to an iterator. One of the questions that must be answered when designing the iterators in STL and the Standard C++ Library is: how can information related to a type be expressed in C++? Let's defer this discussion for a moment until we discuss other information related to an iterator.

TYPES ASSOCIATED WITH AN ITERATOR
There are two types that might vary depending on the iterator type:

The Distance Type An operation like advance() obviously needs an argument that indicates how far to advance the iterator:

template
inline void advance (Iterator& i, Distance n);
The type of this distance argument must represent the distance between any two iterators. Hence the distance type depends on the iterator type. For C++ pointers the distance type is the C++ type ptrdiff_t, which can represent the differenc between any two C++ pointers. Also, ptrdiff_t is the distance type of all other iterators in STL and Standard Library. However, the distance type in STL and the Standard C++ Library is not limited to ptrdiff_t.
ITERATOR CATEGORIES
There are five categories of Iterators in STL and the Standard C++ Library: Each category adds new features to the previous one. The iterator categories obey the following order:

               input                  output
                 |                       |
                 v                       v
                 +-----------------------+
                             |
                             v
                          forward
                             |
                             v
                       bidirectional
                             |
                             v
                       random access

The Value Type An iterator can be dereferenced. It then returns a reference to a value stored in a container. The type of this referenced value also depends on the respective iterator. For example, if the iterator refers to a container holding integers, the value type will be int. More generally, if the iterator refers to a container that stores elements of an arbitrary type T, the value type will be T.

Observation 2: Each iterator has two related types, its value type and its distance type.

So, we have found more information related to an iterator type. What is this information used for?

Value type and distance type are sometimes needed to implement algorithms. In STL and Standard C++ Library algorithms are separated from containers, i.e., an algorithm takes an iterator and uses it to access the container. No information about the container itself is available to an algorithm. This clear separation of containers and algorithms is the basic idea of Generic Programming, which is the key design idea behind the STL.

To implement algorithms only in terms of iterators it is often necessary to infer both the value type and the distance type from an iterator. Here is an example of a fictitious reverse algorithm:

template <class BidirectionalIterator>
void reverse(BidirectionalIterator first,
    BidirectionalIterator last)
{
    distance_type n = distance(first, last);
    --n;
    while(n>0)
    {
        value_type tmp = *first;
        *first++ = *--last;
        *last = tmp;
        n-=2;
    }
}
Note that distance_type and value_type aren't declared anywhere. How do we know what they are? We need to find a way to make available the distance type and value type associated to the iterator type.

Also, we want to implement optimized versions of operations like advance() for different iterator categories. Here we would want to make use of the related iterator category information. Assuming the category could be represented as a C++ type a tentative implementation could look like this:

template <class InputIterator, class Distance>
inline void advance (InputIterator& i, Distance n)
{
    __advance(i, n, iterator_category);
}

template <class RandomAccessIterator, class Distance>
inline void __advance (RandomAccessIterator& i,
    Distance n,
    random_access_iterator_category)
{
    i += n;
}

template <class BidirectionalIterator,
    class Distance>
inline void __advance (BidirectionalIterator& i,
    Distance n,
    bidirectional_iterator_category)
{
    if (n>=0)
    while (n--) ++i;
    else
    while (n++) --i;
}
Again, it is not entirely obvious how we can represent and eventually find the category information associated to an iterator type. We'll explain how this is done shortly.

THE REQUIREMENTS Basically, the requirements to the design of iterators in the STL and the Standard C++ Library are:

Algorithms shall be generic, i.e., they shall be implemented only in terms of iterators. Hence we need to deduce type information associated with an iterator type, like the value and distance type, directly from the iterator type itself.

Operations shall be optimized if possible. In other words, for reasons of efficiency we want to implement polymorphic operations that exhibit different behavior depending on some property of their argument types. For example, different versions of the advance() function, optimized for each iterator category, should be available.

There are additional requirements that stem from other design principles of the STL and the Standard C++ Library:

Everything must be resolved at compile-time rather than at runtime. The goal is to make components as fast as possible at run-time. Hence, the function dispatch of polymorphic operations mentioned above can be performed at compile-time.

C++ pointers are iterators on a C++ array. They stand on equal footing with all other iterator types. Hence, whatever solution we choose, it must work for C++ pointers as well.

The way in which information is associated to an iterator type is generic. For example, the fact that an iterator is associated to a certain value type has nothing to do with the fact that the iterator falls into a certain category.

In the July 1996 issue, we explained the solution to the above requirements that was used in the original STL. Now, we'll explore the solution chosen for the Standard C++ Library.

THE SOLUTION CHOSEN FOR THE STANDARD C++ LIBRARY In the Standard C++ Library iterators have traits. The traits technique is a means to associate information with class types as well as with nonclass types. See the second sidebar for more information about the traits technique in general. Here are the iterator traits:

template <class Iterator>
struct iterator_traits
{
     typedef Iterator::distance_type distance_type;
     typedef Iterator::value_type value_type;
     typedef Iterator::iterator_category iterator_category;
};
The iterator traits are a class template to be instantiated with an iterator type that provides all the information associated with the respective iterator type as typedefs. Therefore, all iterators must define nested types, like Iterator::value_type, for each piece of related information. The value and distance type of an iterator are types in the sense of the C++ type system. Similarly, the information about the iterator category is represented by C++ types as well: There are tag classes for each of the iterator categories. They are empty structures named input_iterator_tag, output_iterator_tag, and so on. Since native C++ pointers can be treated as iterators, we need traits for C++ pointers, as well. This is provided as a partial specialization of the iterator_traits template:
template <class T>
struct iterator_traits
{
     typedef ptrdiff_t distance_type;
     typedef T value_type;
     typedef random_access_iterator_tag iterator_category;
};
Now let us see how iterator traits help to meet the requirements. Here is our ctitious reverse function again:
template <class BidirectionalIterator>
void reverse(BidirectionalIterator first,
     BidirectionalIterator last)
{
     distance_type n = distance(first, last);
     --n;
     while(n>0)
     { value_type tmp = *first;
          *first++ = *--last;
          *last = tmp;
          n-=2;
     }
}
Now that we have iterator traits it is easy to determine the mysterious distance and value type from the iterator type: The distance type is iterator_traits ::distance_type, and the value type is iterator_traits ::value_type. Also, we wanted to offer versions of the advance() function that are optimized for each iterator category. Here is how this function dispatch is done using iterator traits:
template <class InputIterator, class Distance>
inline void advance (InputIterator& i, Distance n)
{
__advance(i,n,iterator_traits<InputIterator>::iterator_category());
}

template <class RandomAccessIterator, class Distance>
inline void __advance (RandomAccessIterator& i,
          Distance n,
          random_access_iterator_tag)
{
     i += n;
}
template <class BidirectionalIterator, class Distance>
inline void __advance (BidirectionalIterator& i,
          Distance n,
          bidirectional_iterator_tag)
{
     if (n >= 0)
          while (n--)
               ++i;
     else
          while (n++)
               --i;
}
The global advance() function calls the overloaded versions of an __advance() function, which takes an iterator tag as function argument. The iterator tags are of different types. Each type of iterator tag represents an iterator category. Consequently, there are five types of iterator tags: These are empty types; their only purpose it to allow the above described function overloading.

Note that function overloading is always resolved at compile-time. No runtime overhead is imposed.

In addition, note that there are iterator traits for C++ pointers. Thus, the solution works for C++ pointers as well, as stated in the original requirements we listed above.

COMPARING THE STL TECHNIQUE TO THE STANDARD LIBRARY TECHNIQUE
The solution using iterator traits meets the requirements listed above. So did the solution in STL. What did we gain? Why was it changed? The iterator traits have an edge over the old solution. Let's see why.

Consider the count algorithm:

template<class InputIterator, class T>
iterator_traits<InputIterator>::distance_type
count (InputIterator first,
        InputIterator last,
        const T& val)
{
    iterator_traits<InputIterator>::distance_type n = 0;
    while (first != last)
        if (*first++ == val)
            ++n;
    return n;
}
The distance type is used here as the return type of an algorithm. Without iterator traits the count algorithm cannot deduce its return type from its iterator type. This explains why the count algorithm in STL looks like this:
template
void count (InputIterator first,
        InputIterator last,
        const T& value,
        Distance& n)
{
    while (first != last)
        if (*first++ == value)
            ++n;
}
The distance type is explicitly specified by an additional template parameter. As template parameters can only be automatically deduced from the function argument types and never from the function return type, the result is returned via a fourth function argument. This way a call to the count algorithm in STL is error-prone and looks counterintuitive:
list<string> lst;
int i = 0; // you'd better not forget to initialize !!!
count(lst.begin(),lst.end(),"Hello",i)
Compare to the call of the count algorithm in the Standard C++ Library:
list<string> lst;
int i = count(lst.begin(),lst.end(),"Hello")
Conclusion: The most convincing advantage of iterator traits compared to the old technique is that types associated to an iterator type can be deduced from the iterator alone. Hence associated types can be used as return types of function arguments, which improves the count algorithm.*

Of course there is more to iterator traits. They improve the design of iterators in the general. Consider for instance the way the iterator category is retrieved in STL. It is via the global function

iterator_category().For each type of iterator, including C++ pointers, there is an overloaded versions of iterator_category(). There is one version taking C++ pointers, and five function templates taking iterator types that are derived from certain iterator base classes. For sake of brevity, we will spare you the details of those base classes. See our July '96 article for reference [1].

template <class T, class Distance>
inline input_iterator_tag iterator_category
        (const input_iterator<T,Distance>&)
{
    return input_iterator_tag();
}

// etc. for output_iterator, forward_iterator,
// and bidirectional_iterator

template <class T, class Distance>
inline random_access_iterator_tag iterator_category
        (const random_access_iterator<T,Distance>&)
{
    return random_access_iterator_tag();
}

template <class T>
inline random_access_iterator_tag iterator_category
        (const T*)
{
    return random_access_iterator_tag();
}
The iterator_category() functions are empty. Their only purpose is to allow the selection of the associated iterator tag type, which is then used for function overload resolution.

It is certainly more intuitive to do a compile-time function dispatch via a type, i.e., iterator_traits ::itertor_category, opposed to the return type of a function, iterator_category(). It somehow suggests that the function has "functionality," i.e., performs something at runtime, which in fact it doesn't.