Noeud:Containers and Iterators, Noeud « Next »:, Noeud « Previous »:How the STL is Structured, Noeud « Up »:libstdc++ and the Standard Template Library



Containers and Iterators

Here we introduce the different containers available, as well as talk briefly about how iterators work with regard to containers. There are two groups of containers to be aware of: sequence and sorted associative containers (we'll look at iterators in a minute).

Sequence containers are exactly what they say - a sequence of elements. The elements are stored according to how they are inserted, or until you make some change (like performing a sort) to the ordering of elements. So creating a sequence container with ten elements means that when you access the first element, you are looking at the first element that was inserted; the second position is the second element you inserted, and so on.

Three sequence containers are provided: vector, deque1 and list. vectors and deques store elements contiguously in memory, so inserting elements can be expensive because (for example) if an element is inserted at the start of a vector, the remaining elements have to be reallocated. The advantage of this is that elements can be accessed very quickly via their index (like an array).

For example, consider the following vector containing characters 'a' through 'e':

     
                ----.-----.-----.-----.-----.-----.----
                 ...|  a  |  b  |  c  |  d  |  e  |  ... memory
                ----'-----'-----'-----'-----'-----'----
     Index:            0     1     2     3     4
     
     

Suppose that we want to insert a new character, 'z', between elements 2 and 3; to do so would involve reallocating memory from element 3 onwards to cope with the new insertion because elements are stored contiguously. This takes time of course; the elements that sit to the right of the newly inserted element need to be reallocated. Thus, the more elements you have, the longer it takes to insert new elements near the start of the vector. Inserting the new character is represented diagrammatically below:

                                              elements 'd' and 'e'
                                             need to be reallocated
                                new element    |
                                      |     |-----|
                                      V     V     V
             ----.-----.-----.-----.-----.-----.-----.----
             ... |  a  |  b  |  c  |  z  |  d  |  e  |  ... memory
             ----'-----'-----'-----'-----'-----'-----'----
     Index:         0     1     2     3     4     5
     
     
...and because the elements have been reallocated ('d' is now located at index 4 etc.), we can retrieve them via their index in constant time. Deleting elements is also costly, for the same reason - if you take out element 3 ('z'), 'd' and 'e' need to be reallocated so that they occupy their old positions again. As compensation however, if we want element 3, we know exactly where it is (just like using an array), so although it may take time to insert and remove elements, retrieval is fast.

TODO: deque explanation w. diagrams etc.

lists on the other hand provide quick insertion and deletion times because elements are not stored contiguously in memory, but as a consequence you cannot access elements via an index - you have to move through the list looking at each element to determine what it contains.

TODO: list explanation w. diagrams etc.

Sorted associative containers on the other hand are stored via keys which aim to make retrieval quicker than sequence containers. You do not access them by their location (as you would a sequence container), but instead access them by their key. So if you had a sorted associative container with the key being a string and the value being an integer (for example a name:phone-number pairing), you would retrive the integer value by accessing the relevant key. For example if you had an element that had the key "John Smith" and the value 456123, you get the number by asking for the element with key "John Smith".

There are four sorted associative containers available, split into two categories: set and map. With sets, the data items are the keys themselves; maps on the other hand store data as a key:value pairing. set and map allow unique keys only; multiset and multimap allow duplicate keys.

TODO: set explanation w. diagrams etc.

TODO: map explanation w. diagrams etc.

Before we proceed, a note about time complexities. In the following table, we use the term O to represent big-oh notation. That is, we use O to represent the worst-case scenario for some computation. For example, if you have a dynamic array of characters and you want to search for an element in the array, the time complexity for the search will be O(n) - or linear time - in otherwords, we know that we'll not have to make more than n tests to find some arbitrary element. The different concepts that we'll refer to at various points in this chapter involve looking at constant, linear and logarithmic time. Here's a brief explanation of each one:

Time Discussion
Constant The best time we can hope for and acheive; constant time operations are performed in O(1) time.
Linear Achieved in O(n) time - for example, searching an unordered linked list for an element would take linear time.
Logarithmic Operations are performed in O(log n) time. This is an improvment on linear time operations, and a classic example would be a binary tree.

Now time complexity is out of the way, the following table summarises the different containers available along with their benefits:

Container Insertion Time Deletion Time Reference Invalidation2 Retrieval Time
vector For elements at the end: O(n) if there is no space for the element, and all elements need to be reallocated; O(1) otherwise. For all other elements: O(n) For elements at the end - constant O(1) time; for all other elements: O(n) Yes Constant
deque For elements at the beginning and end: O(n) if there is no space for the element, and all elements need to be reallocated; O(1) otherwise. For all other elements: O(n). For elements at the beginning and end: O(1). For all other elements: O(n). Yes Constant
list Constant: elements can be inserted anywhere in O(1) time. Constant: elements can be deleted anywhere in O(1) time. No O(n)
set, multiset O(log n) Where there are i elements to be deleted, the time complexity is O(log n+i) No O(log n)
map, multimap O(log n) Where there are i elements to be deleted, the time complexity is O(log n+i) No O(log n)

Before moving on, let's look at some real-world examples of practical applications that would use STL containers.

Application Container(s) Explanation
Telephone Directory map, multimap A simple telephone directory would contain an address with a corresponding telephone number. You'd search the directory for an address, and the address would give you the phone number. Simple, really, but the important reason for using map is that it enables us to search in logarithmic time. For a telephone directory of many entries, this is very important: a directory holding 16 million entries would require no more than 24 steps.
Order Processing System vector, deque Consider a simple order processing system where orders are received and placed in some kind of queue. vector and deque are perfect for the job: they enable us to store elements contiguously and as a result, we can model the way in which orders are received by queuing them. deque would probably be our best solution: we add order placements to the end of the deque in the order that they arrive; and we remove them from the front, like a FIFO (first-in, first-out) queue.

The following sections provide programming examples for each of these containers. Depending on what problem you are working with, different containers will provide different advantages. Just for the record, STL provides a number of container adaptors, namely stack, queue and priority_queue - the names of which should give you some idea of what they do. A special container is also provided called bitset. However, none of these containers are dealt with in this chapter; see Further Reading for reference material.


Notes de bas de page

  1. deque stands for double-ended queue and is pronounced deck as in 'deck of cards'.

  2. from insertion or deletion