Okay so, hello everyone.
Welcome to the talk.
What C++ developers should know
And in parenthesis, the linker.
Alternate title for this talk,
things I never wanted to know about globals and linkers
but was forced to find out anyway.
That's probably a more accurate summary
of how I came by this knowledge.
I'm not a build engineer.
I'm not a linker dev.
I actually work in high frequency trading.
So I came by this knowledge kind of incidentally.
But I think that's okay.
Most C++ developers in my experience
are not experts on the linker.
So, maybe this will make the talk a bit more relatable.
Okay, so globals.
I've never seen code,
you know a code base,
and thought to myself wow there's not enough globals here.
If there were only more globals,
this code would be so much better, please.
So the talk's not an endorsement of yes,
sprinkle globals everywhere once you know how to use them.
But in my experience there are things
that are best done using globals.
So here's three examples that
I've used firsthand often.
So logging intrusive performance profiling,
and polymorphic factories.
So, logging is kind of a canonical example.
So let me just pick on intrusive performance profiling.
You know, typically you'll have macros in your code
that record cycle counters or other information.
They can be enabled and disabled by macros.
They have to record that information somewhere.
And you don't really want to pass down
references to this object
where they get recorded all the way down
and up your code base.
It doesn't make sense.
It doesn't effect the behavior of your code.
So, it's just easier to use a global and
less intrusive for the surrounding code.
So, right off the bat,
most people, when you talk about globals,
they go straight to singletons.
And actually globals and singletons are different,
or at least in the definitions and usages that I've dug up.
So, a singleton is a class that can only have
up to one instance.
Okay, it's a property of the class.
A global is a variable that has global scope
or global access.
You know, more technically in C++ parlance,
you could say it's a variable that
is at namespace scope.
Sometimes a variable can have global access
without it being at namespace scope.
So these two are actually orthogonal.
You could have any one or both without the other.
A singleton that's not a global is rare.
But that could exist as well.
But lots of globals are not singletons.
And in general,
I'll come back to this point.
But singletons are even worse than globals.
And I discourage them.
And I'll explain a bit why, later.
Okay, so compiler versus linker.
So C++ developers in my experience
really, really like the compiler.
They like to talk about the compiler,
they like to argue about the compiler.
You know, the compiler's getting updated all the time.
We had C++11, then 14, then 17.
It's really interesting.
But the linker,
it does get worked on, it does get updated.
But it's sort of this thing that
has been around for a very long time.
C++ was always designed to work with C linkers.
That was a big part of the original design.
C++ code written today
you could probably use linkers,
maybe before I was born as an exaggeration,
but probably not by much.
So, this leads to weirdness.
And if you don't believe me that C++ developers
really like the compiler more than the linker,
anytime you Google for help,
you'll see that any question that involves
the C++ standard and compiling,
and the compiler,
you will find 10 people on Stack Overflow
with 100,000 rep
arguing about the correct interpretation of the standard.
When you Google for help with the linker,
usually you can't find any answer.
And when you do find an answer,
it's got like two up votes or something like that.
So, it's a lot harder to find out
how you should be doing things.
Okay, so here's my example.
So here's code that looks pretty innocent.
Or at least it may look innocent to some of you
if you haven't encountered this problem before.
So let's start with a static library.
So, I show the header and the cpp contents
in the same code block, just for simplicity.
So, in the header file,
we're just including a string,
we've got our pragma once,
we declare a global with extern.
And then in the cpp file we define the global
after declaring it in the header file.
So far so good.
Okay, so now we have a shared library.
And if you're little fuzzy
on static versus shared libraries,
I'll also talk a little bit
about the difference there shortly.
So what does a shared library do?
Well, it has a function that basically just
gets this global that it gets from the static library.
So the shared library depends on the static library.
And then finally we have an executable.
The executable depends on both the shared
and static libraries.
And it's just gonna try to print the string.
It's gonna try to print the string that it gets
directly from the static.
And it's gonna try to print the string that it gets
directly from the dynamic library, the shared library.
Okay, so I'm curious.
How many people would be able to
build this from memory?
I'm just curious.
A few people, not too many.
I mean, I had to look up all these commands.
I mean, we're used to just writing some C++ code
and then we just fire off our build command.
We have a build engineer at our company.
He tells us what to do or we use CMake.
And I was very careful here.
Many commands that you might issue the compiler,
and by the compiler I mean basically,
g++ or clang++.
They actually invoke the compiler and then the linker,
depending on what you're doing.
So here, I've separated everything.
So we had three things that we were interested in.
The static, the shared, and the executable.
So, I have six commands.
One to compile the object code,
and one to produce the actual target.
So, first I compile the cpp file.
Then I produce a static library
with this command called ar.
Next, I compile the code
for the shared library.
I link it against the static library.
So, that's what that looks like.
And then finally, I compile the code
for the main executable, the object code.
And I link the executable against
the static and shared libraries.
So, when I do this, and I run it,
any guesses as to what prints out
or what happens?
Watching this program will segfault.
Again, if you've encountered this problem before,
maybe you kind of had a feel where this was going.
But this program, this code,
seems very, very innocent.
And yet, this segfaults.
And it actually only segfaults if you build it this way.
If you only use static libraries,
if you only use shared libraries,
you're gonna be okay.
Which in some ways makes it worse
because you find the problem later.
And this is probably a good place to point out that
as far as this stuff is concerned,
a class static variable is still a global.
So that seems like a very reasonable thing
from a software engineering point of view.
If you are writing a class,
let's say the class parses JSON.
In JSON true is a literal that has to be parsed out
as a boolean value.
Maybe you would have a class static constant string
which is set to true.
And that seems totally reasonable and harmless
but that can cause these segfaults.
And it does.
let's try to understand what's actually happening.
So, I suspect that a lot of C++ programmers
have written a class like this at some point in their lives.
Personally, I call it Informer,
because it informs on everything that it does.
So here, when it's constructed, it will print.
And it will print its address.
When it destructs it will also print and print its address.
And it also has a get method
which just returns its address too, why not.
So I make that change,
and I also will change my main.
I mean, there's no string to print out anymore.
So I'll just print out the address.
And let's try to see what's going on
with these global objects.
Okay, so we're not doing anything interesting anymore,
like allocating memory.
So we don't segfault.
But here's what the program prints out.
One expectation might have been
that you'd have two copies of the global,
which was actually a very reasonable expectation,
but it's wrong.
There aren't two copies of the global.
But the constructor runs twice in the same spot.
And then the destructor runs twice in the same spot.
So, it's not very hard to see that
this is gonna cause big problems.
If you have a string,
what's gonna happen is,
you're gonna allocate some memory,
set some pointers,
then you're gonna allocate again
and set the same pointers to the new allocation.
So that memory is gone forever now, you leaked it.
And then you're going to call delete
on the pointers in the string.
And then you're gonna do that a second time
with the same pointer.
So that causes a segfault.
That's not a good thing to do.
Okay so, this is where we take a quick detour
and try to talk a little bit about the linker.
how does a linker work?
And this is
very, very simplified,
I'm assuming. (laughing)
I don't know.
I assume it's simplified.
This is kind of like a spherical cow model of the linker.
You provide a link line.
The link line has a bunch of targets.
The linker marches from left to right.
And you can think of the linker as like
it keeps a little notepad or a table.
And every time that it encounters a target,
that target has a symbol table.
And the symbol table basically says to the linker look,
here's a list of all the symbols that I need,
whose definitions I don't have
but I refer to in code.
And here's a list of things that I do have
that I provide in case anybody else needs it.
And it marches left to right.
And every time a symbol is needed,
it adds it to its little list.
And every time that a needed symbol is provided
by a new target,
it then does one of two things.
And this is the difference between
static and dynamic libraries.
If the library is static,
then it basically copies and pastes the object code
into whatever target you're trying to make.
So when you look against the static library,
you copy and paste the code,
it becomes part of your executable.
That's why you don't need to ship a static library with
If it's a dynamic library,
then it basically just checks the box,
and it politely says,
thank you, this symbol is gonna come from
this shared library.
And then it tells another entity,
or depends on another entity called the loader.
And the loader will fetch this at runtime
by loading the shared library
the moment the executable runs.
So that's our static and dynamic libraries.
Okay, so I mentioned symbol tables.
Maybe we should take a quick look at a symbol table.
So the easiest way to look at a symbol table
is to use a little command line application called nm.
And usually you'll want to pass
the double dash demangle flag
so you can read the output more easily.
This is symbol table...
Actually not sure for which version of main.
But the idea here,
you can see there's a list of symbols,
and they're marked with various letters on the left.
Some of these letters are more important or less.
I'm just gonna focus on three of them on this slide.
So the U is undefined symbol.
So this is basically saying I need this.
So for instance, the static,
sorry, the main executable
doesn't define this global, g_informer.
It needs to get it from somewhere else.
W is a weak symbol.
So this is a symbol that's defined.
But it's basically one that's defined with,
like a function that would be defined with inline.
And inline, as we know,
let's you have multiple definitions of the same thing.
So the W is related to that.
And T is just a regular defined symbol.
Again, for our purposes here,
the difference between W and T is not too important.
But basically, needed symbols and provided symbols.
That's the key thing in this symbol table.
if we just take this very, very simple mental model,
and we apply it to our example,
what do we know?
Well, when we build the shared library,
we build it against the static library.
And that means that the code from the static library
gets copied and pasted into the shared,
or at least the code with regards to the global.
When we build the executable,
if I were to go back and you looked at the link line,
you might notice that I linked in the static first,
and then the dynamic.
And remember I said the linker just marches
from left to right.
So the moment that it encounters a symbol
that's provided in multiple places,
it's just gonna take the first one that it meets.
And if it encounters it again,
it doesn't care.
So the main executable
is going to get that symbol from the static,
the way that I built it before.
So it's also going to get a copy.
We're actually gonna come back to what a copy means exactly
in this case.
So you know,
you could ask well now that we know that the link order
what if we change the order of arguments?
So here, what if I link dynamic and then static?
Now when I run it, everything just works.
So, I mean this is again,
another way in which the problem can basically be hidden.
If you get all of your linker order arguments right,
you may be able to prevent these issues.
But you're probably not staring at your
linker command line order.
So that's probably not a good way to solve the problem.
Okay, so let's try to dig a little deeper
and figure out what's actually going on.
Where does a global actually get initialized?
So here I use objdump
and I just dump the assembly.
And there's a lot of stuff.
But if you search through carefully,
you will find basically something like
what I have here.
So, there's this section
which is kind of suggestively named
global var init.
And this is a section which is used for things
that have to happen immediately
when the shared library is loaded.
And one of the things that you can clearly see there
is a callq,
a call to the constructor of Informer.
Now, the shared library in both of our examples,
the one that worked and the one that didn't,
it was built the same way, right.
So the shared library will always try to initialize
Okay, now how about the executable?
Well, the executable
will sometimes try to initialize it.
And what I mean by that is
if you link against the static first,
and then you run objdump,
you will find code like this in the executable.
Again, it's very similar.
It calls the constructor,
one thing that you might also notice here
is that it actually loads the destructor
into this kind of variable cxa_atexit,
that makes sure the destructor gets called
when the program ends.
If you change the link order so that
the shared library is first,
the executable will not have this code.
And then it doesn't run it.
So, we actually don't get two copies of the global.
What we get are...
You only have one copy of the global,
but the problem is there's more than one entity
in your program that thinks it's their job
to construct it destruct it.
In a way, this is worse.
Instead of having two correct globals,
which is not great if you actually want it to be a global.
But now you have a global whose constructor ran twice
in the exact same spot in memory
and then destructed.
That's pretty bad.
there are a lot of
things you could say here, right.
Like you could say,
well only write globals in shared libraries.
Or only use either static or shared linking, don't mix.
You could say all those things.
But what I would rather say is
what if we just wrote the code
so that this could never happen?
Because a lot of times the people who might write the code
are pretty far removed
from the people responsible for building it.
And at a large company,
maybe you originally only used it as a shared library,
you never found this problem,
someone downstream of you used it as a static library.
Robust code is better.
a friend just sent me this link
a day or two ago.
So, in Boost.Log, there's this caveat.
"If your application consists of more than one module,
"that use Boost.Log,
"the library must be built as a shared object."
So I haven't actually dug into the code.
I don't know if this is exactly the reason.
But it sure smells like it.
logging is a very common use case for globals.
I mean, I would prefer to see this written, right.
You can build against Boost.Log however you want
and everything's gonna be fine.
Okay, so what do we do about this?
So in C++17, it's actually really simple.
We have the keyword inline.
You're probably more used to thinking of inline as
well we can just put this in the header.
That's true but,
it also means that regardless
of whether you put it in the header
or the cpp,
and I'll come back to that point,
you only get one copy of the global.
And this just works.
I'm gonna come back to why this just works.
Okay, what if you're not on 17?
I'm not on 17.
I'm sure many people here are not.
So here's a lazy solution.
You basically have a static local
and your function returns a reference to it.
And that's it.
So this kind of popularized in the Meyers singleton.
So there's some good things about it,
there's some bad things about it.
It's lazy, so it always get constructed when you need it.
But if you're replacing an existing global,
you know have to add these parens everywhere.
That could be annoying.
there are some performance questions with this.
So first of all,
you're gonna have a branch every single time
you access this global forever.
There's also gonna be some extra code there
to actually atomically ensure
that this is initialized only once
in a thread safe manner.
some people this may be an issue,
for some it may not.
But whenever you do something lazily
in a complex program,
you are taking the chance that
it's going to happen at a really inconvenient time.
So if you care a lot about performance,
lazy may not be great.
So, I've actually had plenty of issues with
lazy things happening at the worst possible time.
And it's not good.
Okay, so what's one way to solve this?
Well we could just stop being lazy.
So this is a neat trick
that's very simple
but I haven't really seen
suggested very many places or anywhere.
So you basically still have your function,
but now we just declare a very simple
static global reference.
So this is basically just a pointer.
And we initialize it with a function.
This just sort of makes everything work out.
The function is forced to be initialized.
And now this is not lazy anymore,
which might be good or bad.
And also, you're going to access through the reference,
not through the function.
So there's no boolean check anymore.
The reference is basically a pointer.
You have a pointer straight to your global.
You just access it, no boolean.
Now if you really, really, really care about
performance when accessing a global,
which might be a code smell in the first place,
this mess is a little faster,
produces slightly better assembly.
And so what on earth am I doing here?
Well, before I said hey inline just works,
I'll explain why later.
So if you take a step back and think about it,
even before 17,
linkers must have known how to do this correctly.
And the reason why is because templates,
have to be defined in header files>
And a template, a class template,
could have a static variable.
And that static variable has to be handled correctly.
So there's actually already a facility in linkers built in
that lets you correctly build,
correctly link as long as you use
a static of a class template.
But there's no easy way to access it.
And it's a little funny in a way
because templates are implicitly inline.
So there's actually this implicit inline keyword here
that we weren't able to make explicit until 17.
As you can see,
this is a lot of boiler plate.
It's kind of annoying but
like I said, it is a little faster than
the previous slide.
- [Man] Are you missing a static?
- [Man] Are you missing the static keyword?
Oh, man that's a pretty bad bug, yeah.
So that should be a static std string, g_str.
all of these techniques will
basically help your globals not be double constructor,
If you're on 17, just use inline.
If you're not on 17,
pick one of the three.
the last thing that I'm gonna talk about,
kind of briefly I guess is
So, if you just use one of those three tricks,
it's actually very easy to write code
that will always get constructed and destructed properly,
no matter how you build,
for one global.
The bigger, nasty problem is
what's called Static Initialization Order Fiasco.
So in C++ there's no guarantee
in what order different translation units are loaded.
So because of that,
once you have globals that depend on other globals,
particularly in their constructors and their destructors,
you really have a mess.
So, unfortunately there is no silver bullet to this
that I know of.
But what I'm gonna do is basically point out
a few pitfalls and alternatives and
if you keep the number of globals in your application
small to start with,
then applying these judiciously
should probably help you avoid most SIOF problems.
Okay so, the best thing to do is to just not have
interdependent globals as much as possible.
And earlier in the talk,
I mentioned singletons versus globals,
and I said singletons even worse than globals.
So what I mean by that is that
when you write a singleton,
you're taking all this useful functionality
and you're locking it inside of a thing
that can only have one instance.
And that's very hard to justify.
I mean, if you have a logger,
why are you writing the logger as a singleton?
There's no fundamental reason why you can't have two,
or three, or four, or five loggers in a program.
Maybe you want to have one global logger for convenience.
So here's an example.
Let's say I'm writing an IntrusiveProfileData.
This is intended to be used as a global
so that I can write my intrusive profiling macros.
So in the first line,
one option would be to just use the global logger
and log something in the constructor.
But now I have a dependency between two things that
are going to be global.
Another option is,
you could just take the local logger class and say well
because this intrusive performance data is itself a global,
I'm just gonna give it its own logger instance and
that may or may not be okay
depending on your use case.
But, it just completely removes that dependency, so.
You'd have like one logging file for your main program.
You'd have a different one for your intrusive profiler.
And in many situations, that's okay.
And that really actually makes things simple, simpler.
so when people talk about inline coming in 17,
a lot of people say well it's great for header only
libraries to define globals.
But actually I'm gonna say
I think you should always define your globals
in header files.
That may seem a bit weird.
But because of SIOF, right.
Consider that if you define a global in a cpp file,
and you don't provide any kind of like special hook
or function to force initialization of that global,
there's no way that anyone else
can ever use
that global in their globals constructor.
It's not possible.
Because there's no way they can force that global
to initialize first.
If you define it in a header file,
then you just pound include it.
And because it's gonna show up before your global,
it will always get initialized before yours.
Okay, a couple of quick code examples.
So, laziness and destructors.
You have to be really careful.
Lazy globals are great for some things.
They're not so great when you want to so something
in a destructor.
So here's an example where we have a lazy logger, okay.
So it says constructed, logging, destructed.
So in this sample program I have another global called Foo.
Okay, Foo f,
and then my main program tries to do some logging.
So what actually
happens here right?
Foo gets constructed first
because it's first.
Then main runs.
Main tries to use the logger,
it does some logging.
Now, at this point we fall off the end of main
and all the globals get destructed.
So, logger was constructed second.
So it will be destroyed first.
So the logger gets destroyed,
and then after that,
Foo gets destroyed.
And that means that Foo is now trying to log
to a logger that doesn't exist anymore.
And that's no good.
So there's that caveat with laziness.
If you have a header
that defines globals,
you should include it in your own header.
And again, to see
what I mean by that,
let's look at this example.
I've condensed this all into one file
but it could be a multi file example
where the translation units happen to be
set up in this order.
What happens here basically is that
I've decided to pound include iostream in my cpp file.
But in my header file, I define the global.
And what happens is that
actually the program will see my global
before the iostream globals.
And if my constructor happens to use that global,
yet another way to get a segfault.
So, the advantage of header files
is they enforce ordering.
And when you're dealing with globals,
that's very, very important.
Okay, so the end of the talk conclusion.
Just knowing the very basics of how a linker works,
having like a rough mental model,
being comfortable with nm and objdump
to debug these sort of things,
Anytime you're creating a non trivial global,
so something that's not an integer or a boolean,
use one of the techniques that I discussed.
Either if you're on 17, just use inline.
Otherwise, use something else.
Be aware of SIOF.
Try to leverage header files to avoid
dependency ordering problems.
And just think very carefully before you introduce
dependencies between globals.
Because you're gonna have
to think about that dependency forever
if you introduce it.
So maybe just better to not, so.
Thanks very much.
And, any questions?
- [Man] One small, additional point.
As a useful thing,
you can also just not have your globals destructed.
If you do the lazy pattern with new
to put 'em on the heap,
your destructors aren't run.
And it's not really a memory leak
because the pointers still there,
and the entire process is coming down
so the OS will reap it.
It is substantially safer
than having your destructors run.
So a couple of thoughts on that.
I'm not a huge fan of that personally.
It makes it harder to find real memory issues.
But even aside from that,
memory is not the only resource.
And actually, very common when you're writing a fast logger
is you're not gonna log things immediately.
You're gonna have a buffer.
That buffer needs to be flushed.
And that will happen in the destructor.
But I agree.
Depending on your use case,
that may be the way to go.
- [Man] Yeah, just one addition.
The way you described a linker,
does the ordering resolution is specific to
- [Man] The linker doesn't work that way.
And actually LLD doesn't work that way either.
Is that the clang linker?
I forgot the name. Yeah.
Yeah, so the clang linker introduced a small wrinkle
where it still goes left to right as far as I know.
But it picks up
defined symbols before they're actually needed.
So you don't need to worry that
every single symbol table that defines something
is after where that symbol is needed.
But as far as I know that's the only difference.
- [Man] Yeah, exactly.
With traditional Unix linkers,
you can get in the situation where you have to
specify a library multiple times.
- [Man] On Windows and with LLD you don't have that.
Right, I actually meant to prefix the talk.
I'm kind of sorry.
So, these issues are,
as people probably realized,
Linux or MacOS specific,
I think that the Windows situation for this stuff
is actually a lot better,
that it's harder to create these globals
foolishly on Windows.
But I don't really know.
I'm not on Windows much, so.
- [Man] Hello, just wanted to mention one,
maybe slightly neat idea.
But now with the constexpr,
that overhead where we have to
ensure that the constructor's only called once.
That overhead doesn't need to be paid
because it's essentially the old C style type
Yeah, but I don't think you can mark the constructor
for something like a string constexpr right?
'Cause it can do a heap allocation.
- [Man] Yeah, for the constructor it wouldn't be possible.
it might not be possible in all cases.
But in the case that it can,
it's now a way to also be able to not pay for that.
Yes, yeah that's true.
- [Man] Hi, I have still feelings that this is a problem
of a build system.
Because, I find it elegant way to demonstrate
the problems with global variables.
But you have duplicated symbols
because you link static library to dynamic,
and then you link again the static library
to the executable.
So if you have the functions there
and you use both in executable and dynamic library,
you get build error to what I understand.
So I think the root of the problem is
linking twice to the static library.
Yeah, I mean.
It's one of the roots in the sense that
if you didn't do that,
you wouldn't observe the problem.
The problem is that basically if you don't do that,
and this is actually,
the build engineer at my company said something
very similar to this,
the problem is that like then what you're basically saying,
and he says this happens a lot in practice that,
whoever's distributing the shared library
will say okay I need this library with the global in it.
But now it's your responsibility as the end user
to provide it.
Because it has globals and we're not gonna link it in
because you might get duplicates.
So now it's your problem.
So, you can do things this way
and if you fully control everything from end-to-end,
it's one way to go about it.
But I don't think it's optimal.
I think each
target could be built by someone else.
And each target should include its own dependencies.
And if you think that those two are both good requirements,
then this kind of thing can happen.
But again it's a matter of approach.
I just prefer to write code that will always work.
Thank you. Yeah, not a problem, yep.
- [Man] In the very first example,
when I saw that we have this object
initialized twice from
the main executable and from the shared library,
this address of this object.
What address is it?
Is it in the segment of
the main executable?
Is it in the segment of dynamic library?
Or is it something completely different?
So in this example,
the segment rule...
In the example that breaks,
the global will live in the main executable.
And I didn't really have time to get into this,
but basically when the loader loads a shared library,
it will just quietly discard any symbols
that it's already seen before.
And the global itself is a symbol.
So this is actually why you don't get two copies
of the global.
The global will load the shared library.
It sees the global symbol.
It's already got the global symbol.
It ignores it.
But then it still runs the initialization code.
And since that global symbols already defined,
it runs it on the same one.
So, in short it would have been in the static,
in the example,
sorry, in the main executable in the example that I gave.
The one from the shared library
would physically exist inside the space
of the shared library.
But it would never have been used by the program.
- [Man] Saying with roles of the dynamic library
That is another caveat that
I didn't mention up front, maybe I should.
I mean, I didn't get into dlopen with this talk.
I've used it before but I'm not very familiar.
So I actually don't know the answer to your question.
Okay, thanks very much.