C++ Morsels: std::for_each functors member variables

So, picture the situation. You need to iterate the contents of an STL container, performing a given operation on each one.

Perfect. std::for_each is your friend. It can take either a plain unary function (taking a single argument of the type of the item in the container) or a functor -- a special-purpose object that must have an operator () method (known as the function-call operator), taking a single argument of the container's element type. A functor could have other methods, but for_each won't call them; all it cares about is operator (). More interestingly, a functor can also have data members. Again, for_each will disregard them, but your operator () implementation can use them to store temporary data, and you can pre-load context data into them prior to calling for_each, to pass additional arguments.
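
As a minimal sketch of pre-loading context into a functor's data members, the hypothetical ScaleBy functor below carries a multiplier that its operator () applies to every element in place. The names ScaleBy and scaleAll are illustrative assumptions, not part of the original post.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical functor: the factor is "context" loaded before for_each runs.
class ScaleBy
{
public:
    explicit ScaleBy(int factor) : _factor(factor) {}
    // Taking the element by reference lets for_each modify it in place.
    void operator () (int &item) const {item *= _factor;}
private:
    int _factor;
};

// Illustrative helper: multiplies every element of values by factor.
void scaleAll(std::vector<int> &values, int factor)
{
    std::for_each(values.begin(), values.end(), ScaleBy(factor));
}
```

Here the functor only reads its data member, so it does not matter which copy of it for_each ends up using.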

But what if you want to collect data during the iteration of the container, and pass that data back to the caller? See, for_each doesn't really allow you to return any state out of the unary function/functor ("Its return value, if any, is ignored"). However, the for_each documentation states that "For_each returns the function object after it has been applied to each element." Ahh, now this sounds promising. Let's have the functor collect the info in the functor's data members, and then we can fetch them out after for_each is done. Here's some example code of how that might look. I've written this from scratch without trying to compile it, there could be typos. I'm emphasizing property-type access to private member storage here, as that's something that the for_each/functor technique is especially effective on.


class Contained
{
public:
    Contained() : _shouldBeCounted(false) {}
    bool getShouldBeCounted(void) const {return(_shouldBeCounted);}
    void setShouldBeCounted(bool newShouldBeCounted) {_shouldBeCounted = newShouldBeCounted;}
private:
    bool _shouldBeCounted;
};

class CountingFunctor
{
public:
    CountingFunctor() : _counter(0) {}
    int getCounter(void) const {return(_counter);}
    void operator () (Contained item) {if(item.getShouldBeCounted()) _counter++;}
private:
    int _counter;
};

Now let's use that code. We're assuming we have an STL container of some sort of Contained-class objects, called Container. Some of these objects have their _shouldBeCounted property set to true, some don't. Our goal is to count how many say they should be counted. It ought to be this easy (but it isn't):


CountingFunctor CountAllWhoShouldBe;
std::for_each(Container.begin(), Container.end(), CountAllWhoShouldBe);
std::cout << CountAllWhoShouldBe.getCounter() << " items should be counted." << std::endl;

And behold, the code will compile and run. It's just that the answer printed will be zero. Humph. You might think instead we need to getCounter() on the result returned from for_each(), but you'll find it's the exact same object (CountAllWhoShouldBe) that you passed to for-each. Remember "For_each returns the function object after it has been applied to each element"? So, what's the problem?

You may notice that nowhere did the for_each documentation ever say that the functor object you passed to for_each was the one for_each actually USED during the iteration. It isn't. Look again at the part of the documentation that begins "The behavior of this template function is equivalent to". See how the function/functor argument is defined as "Function f"? This is a Call By Value. I quote "...bound to the corresponding variable in the function (frequently by copying the value into a new memory region). If the function or procedure is able to assign values to its parameters, only its local copy is assigned -- that is, anything passed into a function call is unchanged in the caller's scope when the function returns." You see, we passed a unique instance of CountingFunctor named CountAllWhoShouldBe, but during the call to for_each, an identical COPY of CountAllWhoShouldBe was made, and our operator () is now changing the _counter of the local COPY, not the original CountAllWhoShouldBe. for_each then happily returns us the ORIGINAL upon completion. The original, with its unchanged _counter still set to 0.

How do we get around this problem? By references. We can't force for_each to take its function/functor by reference unfortunately, but we can make _counter be a reference to a variable that exists before, during and persists after the scope of for_each. The Contained class is unchanged (whew!) but CountingFunctor does get more complicated.


class CountingFunctor
{
public:
    CountingFunctor(int &extCounter) : _counterRef(extCounter) {_counterRef = 0;}
    int getCounter(void) const {return(_counterRef);}
    void operator () (Contained item) {if(item.getShouldBeCounted()) _counterRef++;}
private:
    int &_counterRef;
};

To utilize this reference technique, we must use the initializer-list form of initialization of the _counterRef. [See C++ Morsels: Initializer List Execution Order for more gotchas about initializer-list ordering.] A C++ reference variable must be constructed from an existing object of the same type right at creation time. You can't assign its identity later on, because the operator = assigns values to the referred-to variable rather than assigning what variable the reference refers to. Which is why we assign extCounter to _counterRef in the CountingFunctor's constructor's initializer-list, and then immediately assign 0 to the contents of the reference in the inline body of the constructor.
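
A minimal sketch of that distinction, using a hypothetical helper function (assignThroughReference is an illustrative name, not part of the code above): assigning to a reference after it has been bound changes the referred-to variable, while the reference itself keeps referring to the same object.

```cpp
// Hypothetical helper illustrating reference binding versus assignment.
int assignThroughReference()
{
    int first = 1;
    int second = 2;
    int &ref = first;   // binding: happens once, at creation
    ref = second;       // assignment: copies second's value into first
    second = 99;        // no effect on first; ref is still bound to first
    return first;       // first holds the value copied from second
}
```

The same rule is why _counterRef has to be bound in the initializer-list, and why the assignment in the constructor body zeroes the external int instead of re-pointing the reference.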

Now, even if we make a copy of a CountingFunctor object, the copy will have an identical reference that points back to the same int that the original instance points to. This int existed prior to the creation of the CountingFunctor, and will exist after the demise of both the original and clone, so we know we will have access to its contents. The new invocation code looks like:


int actualCounter;
CountingFunctor CountAllWhoShouldBe(actualCounter); // pass actualCounter to be the actual counter
std::for_each(Container.begin(), Container.end(), CountAllWhoShouldBe);
std::cout << CountAllWhoShouldBe.getCounter() << " items were counted according to CountAllWhoShouldBe." << std::endl;
std::cout << actualCounter << " items were counted according to actualCounter." << std::endl;

Obviously, the two answers reported SHOULD be the same, since in effect, they are querying the same variable -- one via the original instance, and one via the reference embedded in the CountingFunctor.
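
If a C++11 (or later) standard library is available, which this post does not assume, std::ref from <functional> is another way to have for_each operate on the caller's functor instead of a copy. The sketch below reuses the value-member version of CountingFunctor; countWithStdRef is a hypothetical helper added purely for illustration.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

class Contained
{
public:
    Contained() : _shouldBeCounted(false) {}
    bool getShouldBeCounted(void) const {return(_shouldBeCounted);}
    void setShouldBeCounted(bool newShouldBeCounted) {_shouldBeCounted = newShouldBeCounted;}
private:
    bool _shouldBeCounted;
};

// The original, value-member functor: no reference member needed here.
class CountingFunctor
{
public:
    CountingFunctor() : _counter(0) {}
    int getCounter(void) const {return(_counter);}
    void operator () (const Contained &item) {if(item.getShouldBeCounted()) _counter++;}
private:
    int _counter;
};

// Hypothetical helper (assumes C++11): std::ref wraps the functor in a
// std::reference_wrapper, so the copy for_each makes still forwards every
// call to the one CountingFunctor declared here.
int countWithStdRef(const std::vector<Contained> &items)
{
    CountingFunctor counter;
    std::for_each(items.begin(), items.end(), std::ref(counter));
    return counter.getCounter();
}
```

The wrapper is what gets copied, not the functor, so the count accumulates in the caller's own object.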

This is pretty uninformed

The std::for_each returns a copy of the functor it applied to each element of the list. Therefore you could simply write

CountAllWhoShouldBe = std::for_each( begin, end, CountAllWhoShouldBe );

No need for any of this!

Fortunately, you can avoid all this hassle, and just use the object returned by for_each. Did you actually try that? (It works, by the way. A quick perusal of the implementation of for_each ought to convince you of that...)

Thanks for posting this

Great post, really clear and totally gets the point across. I really appreciate the explanation.

Instead of all this, why not

Instead of all this, why not pass in a temporary to for_each and then take the result (note that for_each returns your functor to you), i.e.

Functor f = std::for_each(container.begin(), container.end(), Functor());

Then f will be a copy of the temporary, which contains the results of the operation.
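
A sketch of that suggestion with the article's classes filled in for the generic Functor. The helper name countWithReturnedFunctor is a hypothetical one chosen for illustration.

```cpp
#include <algorithm>
#include <vector>

class Contained
{
public:
    Contained() : _shouldBeCounted(false) {}
    bool getShouldBeCounted(void) const {return(_shouldBeCounted);}
    void setShouldBeCounted(bool newShouldBeCounted) {_shouldBeCounted = newShouldBeCounted;}
private:
    bool _shouldBeCounted;
};

class CountingFunctor
{
public:
    CountingFunctor() : _counter(0) {}
    int getCounter(void) const {return(_counter);}
    void operator () (const Contained &item) {if(item.getShouldBeCounted()) _counter++;}
private:
    int _counter;
};

// Hypothetical helper: passes a temporary functor and keeps the functor
// that for_each returns, which carries the state accumulated during iteration.
int countWithReturnedFunctor(const std::vector<Contained> &items)
{
    CountingFunctor result = std::for_each(items.begin(), items.end(), CountingFunctor());
    return result.getCounter();
}
```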

Return-by-value

You might think instead we need to getCounter() on the result returned from for_each(), but you'll find it's the exact same object (CountAllWhoShouldBe) that you passed to for-each.

This works perfectly fine (tested w/ gcc 4.3.2):
#include <algorithm>
#include <iostream>
#include <vector>

// Contained and CountingFunctor (the first, non-reference version) as defined in the article.

int main(void)
{
    std::vector<Contained> v(10);
    for(std::size_t i=0;i<v.size();++i) v[i].setShouldBeCounted(i%2==0);

    CountingFunctor f1;
    CountingFunctor f2=std::for_each(v.begin(),v.end(),f1);
    std::cout << "f1.getCounter()==" << f1.getCounter() << std::endl;
    std::cout << "f2.getCounter()==" << f2.getCounter() << std::endl;

    return 0;
}

$ ./testforeach
f1.getCounter()==0
f2.getCounter()==5

---

for_each then happily returns us the ORIGINAL upon completion.

No, it returns a copy of the (altered) copy, as required by the definitions of local variables and return-by-value. If you get anything else, your STL or compiler is broken.

Consider a similar, but simpler, situation:
int f(int i)
{
    i+=1;
    return i;
}

...

cout << f(5) << endl;

Does this print 5 or 6? Although it is generally considered bad form to alter (local) parameters, many older programs would be broken if the answer is anything but 6.

Typo fixes

Thanks to my friend Jeff Frontz for a couple of typo fixes. The reference version of CountingFunctor mistakenly returned _counter instead of _counterRef (just didn't search/replace it when adding the reference) and the operator () was missing the () in both examples.
