C++ Morsels: Initializer List Execution Order

Here's a little morsel that might save someone a bit of trouble.

C++ objects can have sub-objects ("has-a") and parent classes ("is-a"). This morsel isn't concerned with base-class sub-objects, only with the member sub-objects declared by our own class. Those members can be initialized at construction time by an initializer list. Anything not initialized there has to be assigned in the constructor body, but the initializer list is preferred where possible (for a variety of reasons, including performance and code cleanliness).

Here's a link to the 2005-10-19 dated "Working Draft, Standard for Programming Language C++" document. It's a draft of a future C++ specification, but it's as good a place as any to cite the behaviour I'm referring to. On page 232, we find section 12.6.2 "Initializing bases and members", which gives us the BNF for the ctor-initializer, or Constructor Initializer. Basically, imagine a class Foo:


class Foo
{
public:
    Foo(int valueInt, double valueReal)
        : _internalValueInt(valueInt), _internalValueReal(valueReal) { }
private:
    double _internalValueReal;
    int _internalValueInt;
};

So, the Foo constructor initializes both the Real and Int internal members from the supplied arguments. No problem. Let's look at a more complicated version though. I've renamed the int member to _internalValueScale to reflect that it might be some sort of scaling factor to be used in combination with the real quantity. (Let's assume we want to have the scaled value as well as the original factors available for later.)


class Foo
{
public:
    Foo(int valueInt, double valueReal)
        : _internalValueScale(valueInt),
          _internalValueReal(valueReal),
          _internalScaledReal(_internalValueReal * _internalValueScale) { }
private:
    double _internalValueReal;
    double _internalScaledReal;
    int _internalValueScale;
};

Looks pretty straightforward and should be just fine, right? Wrong. Here's where DRY (Don't Repeat Yourself) should have somehow been applied to the C++ standard long ago. See, there are TWO places where internal data members are listed in order, and only ONE of them can be relevant for determining the initialization execution order. The first place is obvious to everyone who has written an initialization-list (aka ctor-initializer). The ctor-initializer itself can list the members and you can list them in whatever order you want (the standard calls these mem-initializers).


: _internalValueScale(valueInt), _internalValueReal(valueReal), _internalScaledReal(_internalValueReal * _internalValueScale)

By looking at it, you REALLY want to believe that the order you write them in (delimited by commas) is the order they execute in, right? It's because you're used to code executing in the order you write it. If you wrote one line of code with several statements on it, delimited by semicolons:

foo(); bar(); baz(); fubar=12;

You'd expect the left-to-right order you wrote them in to be (normally, ignoring reordering optimizations on independent statements) the order they execute in. And you'd be right, for plain code.

But for a ctor-initializer, that doesn't hold up. You see, you won't always HAVE a ctor-initializer. It's optional. So, the language standard must rely on something ELSE to determine the concrete order that encapsulated objects are initialized in. Something that's ALWAYS pertinent and available. Let's refer to page 234 of the standard cited above, 12.6.2, paragraph 5, the third dashed item reads "Then, non-static data members shall be initialized in the order they were declared in the class definition (again regardless of the order of the mem-initializers)." (Emphasis mine.)
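The rule can be observed without touching any uninitialized data. Here is a minimal sketch: Tracer, Ordered and constructionOrder are hypothetical names invented for this illustration, and each Tracer member simply records its label when it is constructed. The mem-initializers are deliberately written in reverse, which GCC and Clang will typically flag with a -Wreorder warning.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: appends its label to a log when it is constructed.
struct Tracer
{
    Tracer(std::vector<std::string>& log, const std::string& label)
    {
        log.push_back(label);
    }
};

// Hypothetical class: members are DECLARED first, second, third,
// but the mem-initializers are WRITTEN third, second, first.
class Ordered
{
public:
    explicit Ordered(std::vector<std::string>& log)
        : _third(log, "third"), _second(log, "second"), _first(log, "first") { }
private:
    Tracer _first;
    Tracer _second;
    Tracer _third;
};

// Returns the labels in the order the members were actually constructed.
std::vector<std::string> constructionOrder()
{
    std::vector<std::string> log;
    Ordered object(log);
    (void)object; // only the construction side effect matters here
    return log;
}

// Checks that declaration order won, not mem-initializer order.
void checkConstructionOrder()
{
    const std::vector<std::string> order = constructionOrder();
    assert(order.size() == 3);
    assert(order[0] == "first" && order[1] == "second" && order[2] == "third");
}
```

Calling constructionOrder() yields the labels in declaration order, whatever order the mem-initializers appear in.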

So, our example above, which happily thinks it's calculating _internalScaledReal last and relies on _internalValueScale having already been set, will fail because the class definition:


double _internalValueReal;
double _internalScaledReal;
int _internalValueScale;

shows _internalScaledReal declared (and therefore initialized) before _internalValueScale. As it stands, _internalScaledReal ends up with garbage: _internalValueScale has not been initialized yet at the point where it is read, so its value is indeterminate (and reading it is formally undefined behaviour). If you want the desired order, change the class definition to read


double _internalValueReal;
int _internalValueScale;
double _internalScaledReal;

Oh, and comment your constructor code to explain your required side-effect that you're relying on.
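For completeness, here is a sketch of the whole corrected class with the members declared in the order they need to be initialized. It also shows one way to sidestep the dependency entirely: compute the scaled value from the constructor arguments rather than from sibling members. The scaledReal() accessor and the checkFoo() function are additions made purely for this illustration.

```cpp
#include <cassert>

class Foo
{
public:
    Foo(int valueInt, double valueReal)
        : _internalValueReal(valueReal),
          _internalValueScale(valueInt),
          // Computed from the constructor arguments, not from sibling
          // members, so this stays correct even if someone later
          // reorders the declarations below.
          _internalScaledReal(valueReal * valueInt) { }

    // Hypothetical accessor, added only so the result can be inspected.
    double scaledReal() const { return _internalScaledReal; }

private:
    // Declaration order IS the initialization order; keep it in sync
    // with the mem-initializer list above.
    double _internalValueReal;
    int _internalValueScale;
    double _internalScaledReal;
};

// Small check: 2.5 scaled by 3 is exactly 7.5 in a double.
void checkFoo()
{
    Foo foo(3, 2.5);
    assert(foo.scaledReal() == 7.5);
}
```

Writing the mem-initializers in the same order as the declarations also keeps the compiler's reorder warning quiet, so any future mismatch stands out.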

One might argue that the standard COULD have the default order be that of the class definition, and let the mem-initializers override that order. That would make some sense, but it adds complexity and exceptions, so it's arguable that it was undesirable in the C++ standards mindset. Either way, watch your back, this one can bite you.

Order is highly over-rated

The leeway for the compiler to rearrange your code as long as the observable state is unaffected is a big deal. This, coupled with C++'s intentional blind spot for threading ("let the libraries handle it"), causes immeasurable grief when trying to write portable, multithreaded code in C++ (since other threads are, by the language's definition, unobservable). For example, most attempts to make a multi-threaded Singleton Pattern (typically implemented as the Double Checked Locking Pattern) fail due to the potential for the compiler to do something completely different from what you intended. Heisenbugs abound.

Note: Wikipedia notes "In a nod to the concept of 'Code Smells', this pattern has also been known as the Stinkleton pattern. (coined by Robert Penner)."

Ouch!

I wonder how this works with C++ over RPC. With RPC, the order is very specific. If you don't know the order of the initializers, then the RPC data could be overwritten by an initializer, or the structure could be uninitialized.

Is there some kind of sync/flush function for initializers? (Everything must be initialized before proceeding.) For example:
foo(); flush_ctor_init(); bar(); flush_ctor_init(); baz(); flush_ctor_init();
Or is using a junk operation enough?
foo(); 1; bar(); 1; baz(); 1;
(The "1" is a perfectly valid operation that does absolutely nothing.)

Between inconsistent template support between compilers, ambiguous initialization order, and those totally screwed up << and >> for I/O, it is no wonder my C++ code looks like C...
