Loop over all the elements in a container and call a function with each element and additional values as arguments
In part 1 of this series we looked at the simplest case we could, a case that is tailor-made for std::for_each – calling a function with each element of a container as an argument. That particular case is easy, but it obviously doesn’t cover everything so we’ll move on to a (slightly) more complex example.
Our previous problem statement was: “loop over all the elements in a container and call a function with each element as an argument”: We’ll extend the problem so that now the function doesn’t just take a single argument of the same type as the element, it takes an additional argument.
As before we have a few declarations:
class T
{
...
};
void f_T_int( T const& t, int i )
std::vector< T > v_T;
Notice that f_T_int
takes an object of type T
and an int
.
First of all, our raw loop version of the solution.
Solution #0
int j( functionReturningInt() );
for( std::vector< T >::iterator i( std::begin( v_T ) );
i != std::end( v_T );
++i )
{
f_T_int( *i, j );
}
Solution #0 has the same set of pros and cons as solution #0 did in part 1.
As before, there is a range-based for solution:
Solution #1
for( T const& t : v_T )
{
f_T_int( t, j );
}
Again, similar pros and cons to the range-based for in part 1.
How about using for_each? In part 1 , turning the explicit loop into an application of std::for_each was easy. std::for_each expects to be given a function (and I’ll include “function object” in the definition of function) that takes a single argument. In part 1 that was exactly the function we wished to call. Now, however, we want to call a function that takes an additional parameter – as well as taking an object of type T
it takes an integer.
Back in the bad old days before C++11 we would set up a function object like this:
Solution #2
class C_T_int
{
public:
C_T_int( int j ) : j_( j )
{
}
void operator ()( T const& t )
{
f_T_int( t, j_ );
}
private:
int j_;
};
std::for_each(
std::begin( v_T ),
std::end( v_T ),
C_T_int( j ) );
The function object C_T_int
takes our additional argument (the integer) in its constructor, then defines operator ()
to accept the argument from our container. When operator ()
is called it has both arguments that it needs to be able to call f_T_int
.
The function object allows us to use std::for_each, and the line of code that calls std::for_each is very straightforward, however solution #1 has a massive “con”: the overhead is horrible, there is lots of boilerplate.
If only we had some way of generating the function object without having to write all of the boilerplate. The C++11 standard provides us with exactly that: lamda functions. As the standard says “Lambda expressions provide a concise way to create simple function objects”. Using lambdas leads to:
Solution #3
std::for_each(
std::begin( v_T ),
std::end( v_T ),
[ j ]( T const& t )
{
f_T_int( t, j );
} );
(This is a minimal introduction to lambda functions, there is much more to them than I am going to cover. Herb Sutter’s talk Lambdas, Lambdas Everywhere has much more information.)
The lambda function consists of three parts, contained in three different types of brackets. It starts with square brackets – [] – moves on to round brackets – () – then finishes off with curly brackets – {}.
The three parts correspond to the three essential features of the function object we created for solution #2:
- The value(s) passed into the constructor of the function object appear in the square brackets – []
- The value(s) passed into
operator ()
appear in the round brackets – ()
- The code in
operator ()
appears in the curly brackets – {}
Notice the unusual looking piece of syntax at the end:
} );
That occurs because the lambda is being passed as an argument to std::for_each.
This first four solutions have one thing in common, they each contain a block of code looking something like this:
{
f_T_int( t, j );
}
The difference is in the surrounding boilerplate. Assuming that we ignore solution #2, the hand-coded function object (which is sufficiently horrible that I feel justified in dismissing it), all of these solutions keep the code to be executed each time during the loop in the same place as the loop – they keep everything together (although if you take that to the extreme then we end up with one enormous “main” function – it’s another entry in both the “pros”” and “cons” column). The range-based for and std::for_each solutions separate out the loop from what happens each time around the loop. Looking at the code and seeing a range-based for or std::for_each tells us that exceptions and early returns aside, we will iterate over every member of the given range.
The block of code can easily be extended to do something else. As before, this falls into both “pros” and “cons”.
C++11 gives us another option, a way of taking a function with N arguments and turning it into a function with M arguments (where M < N). For those of you who think that this sounds like currying, you’re right – it’s like currying.
Solution #4
using namespace std::placeholders;
std::for_each(
std::begin( v_T ),
std::end( v_T ),
std::bind( f_T_int, _1, j ) );
std::bind started life as boost::bind and was adopted into the C++11 standard. In solution #4 it takes three arguments:
f_T_int
– The function we ultimately want to call.
_1
– The first argument to be passed to f_T_int
. _1
is a special value, a placeholder. std::bind is going to produce a function object – in this case a function object that takes a single argument. That single argument (the first argument) will be passed on to f_T_int
in the position that _1
is in.
j
– The second argument to be passed to f_T_int
. In this case it’s the value of the variable j
.
In this example, std::bind took a function that takes two parameters – f_T_int
and turned it into a function object that takes a single parameter by binding one of the arguments to j
. This single parameter function is exactly what we want for std::for_each.
(As with the lambdas, I am skipping over many, many details about std::bind.)
(Aside – I am not normally a fan of using
statements, and in particular I won’t use them much in this series because I want to be very clear about where everything is coming from, however if I were to keep this rule up I would have to write std::placeholders::_1 and std::placeholders::_2 and that is too ugly even for me.)
We can also use adobe::for_each with lambda or std::bind:
Solution #5
adobe::for_each( v_T, [ j ]( T const& t )
{
f_T_int( t, j );
} );
Solution #6
adobe::for_each( v_T, std::bind( f_T_int, _1, j ) );
So we have our original raw loop solution (#0) and 6 other possibilities:
- The raw loop – we are trying to get away from this.
- Range based for loop
- Old style C++ function object
- std::for_each with a lambda
- std::for_each with std::bind
- adobe::for_each with a lambda
- adobe::for_each with std::bind
#0 is the version we are trying to get away from, #2 has so much boilerplate that I am prepared to drop it without further consideration. std::for_each and adobe::for_each are variations on a theme, and the choice between them will depend on (a) whether you are prepared to include the Adobe Source Libraries in your project and (b) whether you want to operate over the entire container. That leaves three solutions:
- Range based for loop
- std::for_each with a lambda
- std::for_each with std::bind
Range-based for and lambda both have a block of code that (more or less) matches the block of code in the original raw solution:
{
f_T_int( t, j );
}
Range-based for lambda differ in the boilerplate surrounding this block of code, although there are still similarities (for example, they both have T const& t
declarations).
The main difference is that the range-based for models one type of loop – std::for_each (yes, you can achieve the same effect as other algorithms but it takes additional code). For solving this particular problem that isn’t an issue, that is exactly the type of loop we want. Later in this series we’ll be looking at other algorithms.
The fact that range-based for and lambda both have a block of code means that they can both put extra things into that block of code. In fact, they could put the definition of f_T_int
right into the loop block itself. std::bind doesn’t let us do that – once we have bound a particular function the only thing that can be done each time around the loop is to call that function. Let’s look at the pros and cons of the code block vs. std::bind:
Code block pros:
- Keeps the code together – the loop code and the action to take on each iteration is in the same place.
- Makes it easy to change the operation that is taking place on each iteration of the loop.
- Easy to step through in a debugger
- Clear error messages (at least as clear as C++ error messages ever are)
- “Normal” syntax – basically the same syntax as if we were calling the function once.
Code block cons:
- Keeps the code together – the loop code and the action to take on each iteration is in the same place.
- Too easy to change. Making something easy to change makes it more likely to be changed. Keeping code working as it is maintained is a problem.
- Difficult to test the body of the loop separately from the loop itself.
std::bind pros:
- Forces you to split the code into separate functions and manage complexity. The loop is separate from the action taken on each iteration.
- Easy to test the body of the loop separately from the loop itself.
std::bind cons:
- Forces you to split the code into separate functions.
- Tricky to step through in the debugger.
- Error messages are verbose and almost incomprehensible.
Wrap up
I like std::bind. I think that currying is an elegant solution to many problems, and I like the fact that std::bind forces me to split the looping construct from the action taken on each iteration. I have to come up with a good function name which means I have to think through the problem fully and clearly. Naming is difficult but the payoff of having a well named function is worth it. I like the fact that I can test the function executed for each element independently of the loop.
Sadly (for me anyway), I think I am in a minority of one here. Bjarne Stroustrup writes about bind:
These binders … were heavily used in the past, but most uses seem to be more easily expressed using lambdas.
The C++ Programming Language – Fourth Edition, page 967.
I think that the title of Herb Sutter’s talk Lambdas, Lambdas Everywhere is intended as an expression of optimism rather than as a dire warning.
My biggest concern with lambdas is that they’re going to be used in the same way that function bodies are now – to contain vast amounts of deeply nested code.
I took the title of this series “No raw loops” from a talk by Sean Parent at Going Native 2013. In that same talk (starting at around 29:10) he makes suggestions about when and how to use range-based for loops and lambda functions. The summary is, “keep the body short”, he suggests limiting the body to the composition of two functions with an operator. If I was confident that lambda and range-based for bodies were never going to exceed this level of complexity I would be more positive about them. For myself, having some limits on what I can do often leads to a better organized, more thought through design.