Generic Algorithms
The library containers define a surprisingly small set of operations. Rather than adding lots of functionality to each container, the library provides a set of algorithms, most of which are independent of any particular container type. These algorithms are generic: They operate on different types of containers and on elements of various types.
Overview
Most of the algorithms are defined in the algorithm
header. The library also defines a set of generic numeric algorithms that are defined in the numeric
header.
In general, the algorithms do not work directly on a container. Instead, they operate by traversing a range of elements bounded by two iterators. If we want find a substring in string container, we can use the s.find
function, however the vector does not provide method find, how could we do?
The generic algorithm provides a function called find can do this:
int main() {
vector<int> k = { 5,4,3,2,1 };
// result will denote the element we want if it's in vec, or vec.cend() if not
auto pos1 = find(k.cbegin(), k.cend(), 6);
auto pos2 = find(k.cbegin(), k.cend(), 2);
cout << (pos1 == k.cend()) << " " << *pos2 << endl;
return 0;
}
//out put
//1 2
The first two arguments to find are iterators denoting a range of elements, and the third argument is a value. find compares each element in the given range to the given value. It returns an iterator to the first element that is equal to that value. If there is no match, find returns its second iterator to indicate failure.(in this case id k.cend())
How the Algorithms Work
Conceptually, we can list the steps find must take:
- It accesses the first element in the sequence.
- It compares that element to the value we want.
- If this element matches the one we want, find returns a value that identifies this element.
- Otherwise, find advances to the next element and repeats steps 2 and 3.
- find must stop when it has reached the end of the sequence.
- If find gets to the end of the sequence, it needs to return a value indicating that the element was not found. This value and the one returned from step 3 must have compatible types.
None of these operations depends on the type of the container that holds the elements.
Iterators Make the Algorithms Container Independent
All but the second step in the find function can be handled by iterator operations: The iterator dereference operator gives access to an element’s value;
But Algorithms Do Depend on Element-Type Operations
For example, step 2, uses the element type’s ==
operator to compare each element to the given value.
A First Look at the Algorithms
The library provides more than 100 algorithms. Fortunately, like the containers, the algorithms have a consistent architecture. Understanding this architecture makes learning and using the algorithms easier than memorizing all 100+ of them.
With only a few exceptions, the algorithms operate over a range of elements. We’ll refer to this range as the “input range”
. The algorithms that take an input range always use their first two parameters to denote that range. These parameters are iterators denoting the first
and one past the last elements
to process.
Read-Only Algorithms
A number of the algorithms read, but never write to, the elements in their input range. The find
function is one such algorithm, as is the count
function which count a value how often. Another read-only algorithm is accumulate
, which is defined in the numeric
header. The first two parameters specify a range of elements to sum. The third is an initial value for the sum.
int main() {
vector<int> k = { 5,4,5,2,1 };
// output is 2
cout << count(k.cbegin(), k.cend(), 5) << endl;
// output is initvalue 2 add all vector equal to 2+17=19
cout << accumulate(k.cbegin(), k.cend(), 2) << endl;
return 0;
}
The type of the third argument to accumulate determines which addition operator is used and is the type that accumulate returns.
The fact that accumulate uses its third argument as the starting point for the summation has an important implication: It must be possible to add
the element type to the type of the sum. That is, the elements in the sequence must match or be convertible to the type of the third argument.
Another read-only algorithm is equal
, which lets us determine whether two sequences hold the same values. It compares each element from the first sequence to the corresponding element in the second. It returns true if the corresponding elements are equal, false otherwise. The algorithm takes three iterators: The first two (as usual) denote the range of elements in the first sequence; the third denotes the first element in the second sequence:
int main() {
vector<int> v1 = { 1,2,3,4,5 };
vector<int> v2 = { 1,2,3,4,5 };
// ouput is 1
cout << equal(v1.cbegin() - 1, v1.cend(), v2.cbegin() - 1) << endl;
return 0;
}
Because equal
operates in terms of iterators, we can call equal to compare elements in containers of different types. Moreover, the element types also need not be the same so long as we can use == to compare the element types.
int main() {
vector<string> vs = { "123","234" };
list<char*> lc = { "123","234" };
// out put is 1
cout << equal(vs.cbegin(), vs.cend(), lc.cbegin()) << endl;
return 0;
}
NOTICE: Algorithms that take a single iterator denoting a second sequence assume that the second sequence is at least as large at the first.
Algorithms That Write Container Elements
Some algorithms assign new values to the elements in a sequence. When we use an algorithm that assigns to elements, we must take care to ensure that the sequence into which the algorithm writes is at least as large as the number of elements we ask the algorithm to write. Remember, algorithms do not perform container operations, so they have no way themselves to change the size of a container.
the fill
algorithm takes a pair of iterators that denote a range and a third argument that is a value. fill assigns the given value to each element in the input sequence:
int main() {
vector<string> vs(10);
fill(vs.begin(), vs.end(), "A");
return 0;
}
Some algorithms take an iterator that denotes a separate destination. These algorithms assign new values to the elements of a sequence starting at the element denoted by the destination iterator. For example, the fill_n
function takes a single iterator, a count, and a value. It assigns the given value to the specified number of elements starting at the element denoted to by the iterator
.
int main() {
vector<string> vs(10);
fill_n(vs.begin(), vs.size(), "A");
return 0;
}
One way to ensure that an algorithm has enough elements to hold the output is to use an insert iterator
. An insert iterator is an iterator that adds elements to a container. back_inserter
, which is a function defined in the iterator
header.
back_inserter
takes a reference to a container and returns an insert iterator bound to that container. When we assign through that iterator, the assignment calls push_back
to add an element with the given value to the container:
int main() {
vector<string> k;
auto it = back_inserter(k);
*it = "hello";
cout << *k.cbegin() << endl;
return 0;
}
// output
// hello
We frequently use back_inserter to create an iterator to use as the destination of an algorithm. For example:
int main() {
vector<string> k;
fill_n(back_inserter(k), 5, "A");
for (auto m : k) {
cout << m;
}
cout << endl;
return 0;
}
// output
// AAAAA
The copy
algorithm is another example of an algorithm that writes to the elements of an output sequence denoted by a destination iterator. This algorithm takes three iterators. The first two denote an input range; the third denotes the beginning of the destination sequence. The value returned by copy is the (incremented) value of its destination iterator. That is, ret will point just past the last element copied into a2.
int main() {
int a1[] = { 0,1,2,3,4,5,6,7,8,9 };
int a2[sizeof(a1) / sizeof(*a1)]; // a2 has the same size as a1
// ret points just past the last element copied into a2
auto ret = copy(begin(a1), end(a1), a2); // copy a1 into a2
cout << *(ret - 1) << endl; // output is 9
return 0;
}
the replace
algorithm reads a sequence and replaces every instance of a given value with another value. This algorithm takes four parameters: two iterators denoting the input range, and two values. It replaces each element that is equal to the first value with the second:
int main() {
vector<string> k;
fill_n(back_inserter(k), 5, "A");
replace(k.begin(), k.end(), "A", "B");
//output BBBBB
for (auto m : k) {
cout << m;
}
cout << endl;
return 0;
}
If we want to leave the original sequence unchanged, we can call replace_copy
. That algorithm takes a third iterator argument denoting a destination in which to write the adjusted sequence:
replace_copy(k.begin(), k.end(), back_inserter(m), "A", "B");
Algorithms That Reorder Container Elements
Some algorithms rearrange the order of elements within a container. An obvious example of such an algorithm is sort
. A call to sort arranges the elements in the input range into sorted order using the element type’s <
operator.
int main() {
vector<int> vec = { 5,4,3,4,1,2 };
sort(vec.begin(), vec.end());
return 0;
}
To eliminate the duplicated words, we will first sort the vector so that duplicated words appear adjacent to each other. Once the vector is sorted, we can use another library algorithm, named unique
, to reorder the vector so that the unique elements appear in the first part of the vector. Because algorithms cannot do container operations, we’ll use the erase member of vector to actually remove the elements:
void elimDups(vector<int>& vec) {
sort(vec.begin(), vec.end());
auto end_unique = unique(vec.begin(), vec.end());
vec.erase(end_unique, vec.end());
}
You can see more details about unique
in here. Shortly, the removal is done by replacing the duplicate elements by the next element that is not a duplicate, and signaling the new size of the shortened range by returning an iterator to the element that should be considered its new past-the-end element.
#include<bits/stdc++.h>
using namespace std;
void printVec(vector<int>& vec) {
for (auto veci : vec) {
cout << veci << " ";
}
cout << endl;
}
void elimDups(vector<int>& vec) {
// sorted
sort(vec.begin(), vec.end());
printVec(vec);
// uniqued
auto end_unique = unique(vec.begin(), vec.end());
printVec(vec);
// remove duplicates
vec.erase(end_unique, vec.end());
printVec(vec);
}
int main() {
vector<int> vec = { 5,4,3,4,1,2,1 };
elimDups(vec);
return 0;
}
the end_unique
denotes one past the last unique element.
Customizing Operations
Many of the algorithms compare elements in the input sequence. By default, such algorithms use either the element type’s <
or ==
operator. The library also defines versions of these algorithms that let us supply our own operation to use in place of the default operator.
For example, the sort algorithm uses the element type’s <
operator. However, we might want to sort a sequence into a different order from that defined by <
, or our sequence might have elements of a type (such as Sales_data) that does not have a <
operator. In both cases, we need to override the default behavior of sort.
Passing a Function to an Algorithm
As one example, assume that we want to sort string order by length, and then alphabetically within each size. To reorder the vector by length, we’ll use a second, overloaded version of sort. This version of sort takes a third argument that is a predicate.
A predicate is an expression that can be called and that returns a value that can be used as a condition. The predicates used by library algorithms are either unary predicates (meaning they have a single parameter) or binary predicates (meaning they have two parameters).
#include<bits/stdc++.h>
using namespace std;
bool isShort(const string& str1, const string& str2) {
if (str1.size() != str2.size()) {
return str1.size() < str2.size();
}
else {
return str1 < str2;
}
}
int main() {
vector<string> vec = { "bbcd","abcd","c" };
sort(vec.begin(), vec.end(), isShort);
for (auto i = vec.begin(); i != vec.end(); i++) {
cout << *i << " ";
}
cout << endl;
return 0;
}
Lambda Expressions
The predicates we pass to an algorithm must have exactly one or two parameters, depending on whether the algorithm takes a unary or binary predicate, respectively. However, sometimes we want to do processing that requires more arguments than the algorithm’s predicate allows.
As a example, if we want find the first element in the string vector that has the given size, we can use the library find_if
algorithm to find an element that has a particular size. Like find
, the find_if
algorithm takes a pair of iterators denoting a range. Unlike find
, the third argument to find_if
is a predicate
. The find_if
algorithm calls the given predicate on each element in the input range. It returns the first element for which the predicate returns a nonzero value, or its end iterator if no such element is found. However, find_if
takes a unary predicate—any function we pass to find_if must have exactly one parameter that can be called with an element from the input sequence. There is no way to pass a second argument representing the size.
We can pass any kind of callable object to an algorithm. The only callables we’ve used so far are functions and function pointers. There are two other kinds of callables: classes that overload the function-call
operator, and lambda expressions.
The format of lambda expressions is [capture list] (parameter list) -> return type { function body }
.
- capture list is an (often empty) list of local variables defined in the enclosing function
- return type, parameter list, and function body are the same as in any ordinary function.
We’re now ready to solve our original problem, which is to write a callable expression that we can pass to find_if
. We want an expression that will compare the length of each string in the input sequence with the value of the sz
parameter in the biggies function.
int main() {
vector<string> vec = { "bbcd","abcd","sd","c" };
size_t sz = 2;
auto wc = find_if(vec.begin(), vec.end(), [sz](const string& str) {
return str.size() <= sz;
});
cout << *wc << endl; // output is sd
return 0;
}
Lambda Captures and Returns
When we define a lambda, the compiler generates a new (unnamed) class type that corresponds to that lambda. For now, what’s useful to understand is that when we pass a lambda to a function, we are defining both a new type and an object of that type: The argument is an unnamed object of this compiler-generated class type. Similarly, when we use auto to define a variable initialized by a lambda, we are defining an object of the type generated from that lambda.
Similar to parameter passing, we can capture
variables by value or by reference. As with a parameter passed by value, it must be possible to copy such variables. Unlike parameters, the value of a captured variable is copied when the lambda is created, not when it is called:
int v = 10;
auto k = [v]() {return v; };
v = 100;
cout << k() << endl; // output is 10
We can also define lambdas that capture variables by reference. For example:
int v = 10;
auto k = [v]() {return v; };
v = 100;
cout << k() << endl; // output is 10
Rather than explicitly listing the variables we want to use from the enclosing function, we can let the compiler infer which variables we use from the code in the lambda’s body. To direct the compiler to infer the capture list, we use an &
or =
in the capture list. The &
tells the compiler to capture by reference, and the =
says the values are captured by value.
int v = 10;
auto k = [=]() {return v; };
v = 100;
cout << k() << endl; // output is 10
int v = 10;
auto k = [&]() {return v; };
v = 100;
cout << k() << endl; // output is 100
If we want to capture some variables by value and others by reference, we can mix implicit and explicit captures:
int v = 10, d = 20;
auto k = [=, &d]() {return v + d; };
v = 100;
d = 50;
cout << k() << endl; // output is 60
By default, a lambda may not change the value of a variable that it copies by value. If we want to be able to change the value of a captured variable, we must follow the parameter list with the keyword mutable
. Lambdas that are mutable may not omit the parameter list:
int v = 10;
auto k = [v]()mutable {v = 100; return v; };
cout << k() << endl; // output is 100
cout << v << endl; // output is 10
The lambdas we’ve written so far contain only a single return statement. We can use -> type
specify return type.
auto k = [=, &d]()->int {return v + d; };
Binding Arguments
It is usually straightforward to use a function in place of a lambda that has an empty capture list. As we’ve seen, we can use either a lambda or our isShort
function to order the vector on word length. However, it is not so easy to write a function to replace a lambda that captures local variables. For example, the lambda that we used in the call to find_if compared a string with a given size. We can easily write a function to do the same work:
bool check_size(const string &s, string::size_type sz){
return s.size() >= sz;
}
However, we can’t use this function as an argument to find_if. Because find_if
takes a unary predicate. The lambda that biggies passed to find_if used its capture list to store sz
. In order to use check_size
in place of that lambda, we have to figure out how to pass an argument to the sz
parameter.
We can solve the problem of passing a size argument to check_size by using a new library function named bind
, which is defined in the functional header. The bind
function can be thought of as a general-purpose function adaptor. It takes a callable object and generates a new callable that “adapts” the parameter list of the original object. The form is auto newCallable = bind(callable, arg_list);
.
The arguments in arg_list
may include names of the form placeholders::_n
, where n is an integer. These arguments are “placeholders” representing the parameters of newCallable
. They stand “in place of” the arguments that will be passed to newCallable
. The number n
is the position of the parameter in the generated callable: _1 is the first parameter in newCallable
, _2 is the second, and so forth.
auto check_new = bind(check_size, placeholders::_1, 2);
Using bind, we can replace our original lambda-based call to find_if
:
auto wc = find_if(vec.begin(), vec.end(), check_new);
Revisiting Iterators
In addition to the iterators that are defined for each of the containers, the library defines several additional kinds of iterators in the iterator header. These iterators include
- Insert iterators: These iterators are bound to a container and can be used to insert elements into the container.
- Stream iterators: These iterators are bound to input or output streams and can be used to iterate through the associated IO stream.
- Reverse iterators: These iterators move backward, rather than forward. The library containers, other than forward_list, have reverse iterators.
- Move iterators: These special-purpose iterators move rather than copy their elements.
Insert Iterators
An inserter is an iterator adaptor that takes a container and yields an iterator that adds elements to the specified container. There are three kinds of inserters. Each differs from the others as to where elements
are inserted:
- back_inserter creates an iterator that uses
push_back
. - front_inserter creates an iterator that uses
push_front
. - inserter creates an iterator that uses insert. This function takes a second argument, which must be an iterator into the given container. Elements are inserted ahead of the element denoted by the given iterator.
It is important to understand that when we call inserter(c, iter), we get an iterator that, when used successively, inserts elements ahead of the element originally denoted by iter. That is, if it is an iterator generated by inserter, then an assignment such as *it = va1;
behaves as:
it = c.insert(it, val); // it points to the newly added element
++it; // increment it so that it denotes the same element as before
iostream Iterators
Even though the iostream types are not containers, there are iterators that can be used with objects of the IO types. An istream_iterator
reads an input stream, and an ostream_iterator
writes an output stream. This part as a expand you can read chapter10 section 4 in c++ primer 5th.
Structure of Generic Algorithms
The most fundamental property of any algorithm is the list of operations it requires from its iterator(s). Some algorithms, such as find
, require only the ability to access an element through the iterator, to increment the iterator, and to compare two iterators for equality. Others, such as sort
, require the ability to read, write, and randomly access elements. The iterator operations required by the algorithms are grouped into five iterator categories listed in next table. Each algorithm specifies what kind of iterator must be supplied for each of its iterator parameters.
Type | Characteristics |
---|---|
Input iterator | Read, Increment only |
Output iterator | Write, Increment only |
Forward iterator | Read and write, Increment only |
Bidirectional iterator | Read and write, Increment and Descrement |
Random_access iterator | Read and write, full iterator arithmetic |
The Iterator Categories
- Input iterators: can read elements in a sequence. An input iterator must provide
- Equality and inequality operators (
==
,!=
) to compare two iterators - Prefix and postfix increment (
++
) to advance the iterator - Dereference operator (
*
) to read an element; dereference may appear only on the right-hand side of an assignment - The arrow operator (
->
) as a synonym for(* it).member
—that is, dereference the iterator and fetch a member from the underlying object.
- Equality and inequality operators (
- Output iterators: can be thought of as having complementary functionality to input iterators; they write rather than read elements. Output iterators must provide
- Prefix and postfix increment (
++
) to advance the iterator - Dereference (*), which may appear only as the left-hand side of an assignment (Assigning to a dereferenced output iterator writes to the underlying element.)
- Prefix and postfix increment (
- Forward iterators: can read and write a given sequence. They move in only one direction through the sequence. Forward iterators support all the operations of both input iterators and output iterators.
- Bidirectional iterators: can read and write a sequence forward or backward. In addition to supporting all the operations of a forward iterator, a bidirectional iterator also supports the prefix and postfix decrement (
--
) operators. - Random-access iterators: provide constant-time access to any position in the sequence. These iterators support all the functionality of bidirectional iterators.
Container-Specific Algorithms
Unlike the other containers, list
and forward_list
define several algorithms as members. In particular, the list types define their own versions of sort
, merge
, remove
, reverse
, and unique
. you can see details in here.