
- Welcome everybody, and thank you so much for being here
for "Multithreading Is the Answer. What Is the Question?",
a title which I wish I could take credit for,
but my cofounder Barbara was the one
who came up with it.
I'd like to take a second to set the stage
and talk a little bit about where we're going to go.
As we explore what multithreading is,
we're going to discuss some of the terminology
that's involved in the practice of multithreading,
talk about the problems you can solve with it,
and the problems that you shouldn't try to solve with it
which I think is a topic that is often overlooked
when we discuss multithreading.
Then I'm gonna talk about how to review multithreaded code
and share with you a better way
for implementing a multithreaded design,
a better abstraction for dealing with shared data
in a multithreaded program.
A lot of people will say oh multithreading,
that's really complicated.
Well it is and it isn't, but part of the reason
why it can be complicated is because
it's difficult to talk about.
When Barbara and I were working on CopperSpice,
the open source, cross-platform GUI library
whose project we cofounded,
we had a section where we needed to design some code
that would be thread safe, and we weren't in control
of how it would be called externally
because we are in a library,
so we don't know what the user will do.
We discovered that communicating about
a multithreaded piece of code was very challenging because,
for one thing, the terminology can be confusing,
and people can disagree on what it means,
and this is what happened in our case.
We had both done a great deal of concurrent programming.
Mine had been more on the multithreading side
of the equation.
Hers had been more on the multi-user database side
of the equation, and in a multi-user environment
you deal with locks, and in a multithreading environment
you deal with locks, and they are completely different,
but we use the same word.
So it's really important to get your terminology straight.
So let's start there.
No offense to anyone in the room named Bob
but the generic C++ programmer named Bob decides
I have something that I'd like to do,
and I'd like to solve this problem using multithreading.
There's your first mistake.
If you choose multithreading as the solution
before you decide that you must, you're making a mistake
because you have more opportunities
to shoot yourself in the foot,
because every multithreaded program can be reduced
to one race condition which crashes.
You'll have a few memory leaks here and there,
and a high-profile customer will find the bug
you didn't, unless you're extraordinarily careful.
However, it really helps to be working on a problem
that is naturally solvable with multithreading because
if you try to fit multithreading
into an environment where it doesn't fit,
you'll discover that it doesn't make sense,
and you're working at cross purposes with the mechanism.
So really, really ask yourself is this a problem
that I actually want to solve with threads?
Is there some better way, and often the answer is yes.
In order to have a proper definition of all these terms,
I'm gonna start by defining multithreading itself
which, for our purposes, is the ability of a program
to execute multiple instructions at the same time.
Sometimes you might be sharing time on a single CPU
with other threads, but the real complexity doesn't come in
until you have multiple cores all processing
different stages of execution simultaneously
and having effects on the state of the entire program.
Sometimes people like to distinguish
between concurrency and parallelism because
concurrency has a connotation
that threads are working together
to solve a particular problem, and parallelism is
when the threads don't have to work together.
If you can end up in that situation it's great,
because as we all know working solo is simpler
than working with someone else.
Although the solution may not be as good.
There's also multitasking which is really common
in most operating systems.
Any operating system worth its salt
is a multitasking environment these days,
and this is where you get into things like time slicing.
I'm not gonna go too much into that, but it's important
to understand that multitasking is not multithreading.
Let's have a quick little quiz.
Who here thinks that threading makes a program faster?
Anybody?
Okay.
Not necessarily.
It can, but it doesn't always.
Who wants to write a multithreaded program
so it can be really stable and reliable?
I'm not seeing too many hands, and I'm seeing some snickers.
I'm on the side of the snickers here.
If you're smart enough, then multithreading will be easy.
I'm gonna bash that one right outta the gate.
There is no sufficient level of intelligence
that makes multithreading trivial.
Threading might be useful for error recovery.
Well, it turns out it's not.
Threading can actually make error recovery
more complex, because now you don't know
what the state of something is
if one thread has had a problem.
But neither of the last two is true either.
Multithreading is not hard, and it is not easy.
It is just a different discipline with its own set
of constraints and requirements,
and you need to be fluent in its language.
A thread is nothing more than work
that can run on one core at a time.
A thread is part of a process.
A process consists of multiple threads potentially,
and each thread has its own stack.
So in C++, all the state of the stack is local
to your thread, and pretty much the entire
global state of memory is shared.
This is contrasted with a process
which is really a separate program.
It's a separate memory space.
It has its own copy of everything.
We deal with processes all the time in our build systems.
Make runs a separate process to run the compiler,
things like that.
Threads in the same process share a lot of resources.
The core: a lot of people have sort of
an intuitive understanding of what a core is.
To a reasonable approximation, the number of cores
is really just the number of things
that can be happening simultaneously in a processor.
Not all cores are equal in some environments.
I'll be talking in this session
about places where all the cores are equal,
but there are cases where they're not.
Things like NUMA and asymmetric multiprocessing.
Those are more specialized topics.
More cores does not mean faster.
Our CI system has six cores because it's faster
than the eight core version of the same chip.
Because when they added two extra cores,
there was more heat generated,
and so they had to slow down all of the cores to compensate.
So it turns out the six-core machine is actually capable of
processing things much faster than the eight-core machine.
The tricky bit, if there can be said to be
only one tricky bit in multithreading,
is dealing with resources correctly.
A resource is, in generic terms, something which
you don't want two threads to access simultaneously.
Common things are a location in memory.
We're fortunate in C++.
Since C++11, we have a well-defined memory model,
so we can actually reason about the behavior
of the memory in the process.
File handles are not usually thread safe,
and you may have objects that were written in your code
that are not thread safe.
You must make sure not to access them
from multiple threads simultaneously.
If you do, you have a race condition.
The exact definition of a race condition,
as it applies to C++, is this: for a race condition
to occur, you have two accesses to the same location
in memory, and at least one of them is a write.
It's important to understand that constraint, because
multiple reads are okay, but if there is a single write,
all other simultaneous access is forbidden.
If you have two simultaneous accesses
and one of them is a write,
you will have undefined behavior.
I cannot stress this enough.
It's not a "you might get the wrong answer."
It's not an "oh well, if two threads increment the same value
at the same time, it might increment it twice."
No, it is undefined behavior pure and simple.
And there are architectures where you can get
truly bizarre behavior if you have a race condition
in your program, so it is of the utmost importance
to prevent race conditions.
It's not simply a nice property.
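To make that concrete, here is a minimal sketch of exactly the situation just described, two threads incrementing the same value at the same time; the names are illustrative:

```cpp
#include <thread>

int counter = 0;   // one shared location in memory

int main() {
    // Two threads, simultaneous accesses, both writes:
    // a race condition, and therefore undefined behavior,
    // not merely a possibly wrong count.
    std::thread t1([] { for (int i = 0; i < 100000; ++i) { ++counter; } });
    std::thread t2([] { for (int i = 0; i < 100000; ++i) { ++counter; } });
    t1.join();
    t2.join();
}
```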
As I mentioned, the stack belongs to a single thread.
It's the area of memory we use for fixed-size objects.
The heap is shared among all the threads.
Fibers are another thing that you'll see
in various environments.
They're more of a lightweight thread.
We have these in Boost.Fiber.
They're not that commonly used.
They haven't been standardized into the language yet,
and given the fact that coroutines are in the works,
fibers may not make it too far
into the standardization process at any point.
I'm not quite sure about that.
You'll also in a few languages run into
what are called green threads for reasons
that should be lost in the mists of time,
but these are threads that are implemented
in user space rather than by the operating system,
and they have a lot of consequences.
When your environment uses green threads,
you have to be very careful about how you interact because
if you block, you may be potentially blocking
a large number of other threads simultaneously.
This one is not that relevant to C++,
but it's good to be aware of if you're interoperating
with other systems in your environment
that they may have constraints on threading
that we don't have to work under.
Given all that, what do we actually wanna do
with multithreading now that we've defined
some of the basic terms?
When do you reach for multithreading in your toolbox?
Well the things I look for are, if you have a task
that can easily be split into independent steps.
This is a nice property to have in any problem,
but it also makes it tractable to work with
in a multithreaded implementation, because you might
be able to split it up into independent pieces
that can run simultaneously.
It really helps if each step has a very clear
input and output, or you know
for a fact that you're extremely CPU-bound,
and you're going to be using the entire CPU,
and you would like to use more than one CPU.
It also can be very nice for a process
that has a large read-only data set that never changes,
but it has to be manipulated in various ways.
This is a very natural place
to use a multithreaded solution.
Or stream processing: it's quite common that
if you can set up your program as sort of
a data flow diagram, where you have data streaming
in one side, passing through a set of transformations,
and streaming out the other side,
it can be very natural to set up each of those nodes
along the stream graph as a separate thread,
and you can increase your performance considerably there.
These are the problems where you look at them
and you say ah, awesome, multithreading will fit.
I can solve this using threads.
It will be great.
Then there are the problems where
you have to use multithreading and you wish you didn't,
and these are the ones that create
interest and complexity in your code.
If you have a task where the performance
would be completely unacceptable as a single thread,
so you have to use more than one core, and you have
a large data set that has to be kept synchronized
between all of the tasks in your program,
you need to consider multithreading even if it's awkward.
It's also very common that this shows up in places
where you don't know the workload ahead of time,
and you need an independent array of workers
all of which are capable of doing
whichever task comes along at a given time.
If you have a lot of resources,
like you're an operating system, you have disk drives,
you have network sockets, you have all sorts of things
that you're managing concurrent access to,
you must be multithreaded.
A non-multithreaded operating system
is completely useless at this point.
It's hard because you have a large number
of resources to manage, and you can't allow contention,
or you can't allow race conditions on the resources.
You have to manage contention, and there aren't
a lot of tools for managing the complexity there.
Another case is when you have external clients
making requests to you: you're a web server
or a database server, anything like that.
You need to handle whatever comes your way
as quickly as possible and get it out of the way
for the next request, and you really probably oughta
be looking at a multithreaded solution.
At a prior job that I had,
I had a real-life example of this.
I was writing a streaming video server as a web server,
and the prototype had really bad performance.
It was unable to serve the traffic that we needed.
For production we said okay,
well, let's make it as fast as possible.
I said, well, how fast is possible?
Should I make this multithreaded?
No, first let me check if I can meet my performance goals
without making it multithreaded.
And I simply optimized the code and refactored
some of the internal data structures, and all of a sudden
I was able to saturate the network bandwidth
that was available to us with 12% of one core.
My job's done.
I'm I/O bound.
I will get no benefit out of multithreading this code
because I'm able to serve bits as fast as the network
can take them away, so what difference does it make
how many cores I'm running on to do this?
Look for this.
When you find this, be glad
and avoid multithreading if you can.
What I'm going to be talking about here is sort of
the sweet spot of the generic multithreading environment.
As I mentioned, there are some other areas
that multithreading is used, but this is sort of
the desktop slash mobile slash cloud typical environment
where we have some number of cores
that you can count, possibly by taking your shoes off,
and they're all similar cores.
The memory's fairly close together.
We're not dealing with a super computer
with the memory over in the other building.
So if you have that situation,
you may need additional complexity in your design.
In order to talk about multithreading,
I'd like to bring in an example,
and the example is a restaurant kitchen.
We have a couple of chefs, and each chef is a thread.
It can do work independently.
And we have two knives.
Each chef has their own knife, so they can work
with the knife as needed and do whatever they wanna do.
We say okay, we have a big order that came in.
We need 50 fruit salads,
so let's have each chef make 25 fruit salads.
Now as an aside, I really enjoy questions.
I enjoy an interactive discussion as we go through this,
so if anybody has questions as I'm going through this,
please feel free to raise your hand.
The great thing about C++11 and the threading library
that came in with it is that the code actually fits on a slide,
because we have a simple enough threading system,
and the library is concise enough that this isn't fake code.
This is real code on a slide.
So I set up a thread.
I name it chef one.
That's the name of the variable that's going to represent
this thread, and I just pass a lambda
to the constructor of std::thread.
I say this is the work you will do when you start,
and I'm going to count to 25 and make a fruit salad.
We do the same thing for chef two, and then at the end,
we join both, and what that means is
we're going to wait in the main thread
until both chefs have finished their work.
Then we know that they've completed,
all the fruit salads are ready, and we can serve them.
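As a rough reconstruction of the slide code being described here (FruitSalad and make_fruit_salad are hypothetical stand-ins for whatever the actual slide used):

```cpp
#include <thread>

// Hypothetical stand-ins for the types on the slide.
struct FruitSalad { };
FruitSalad make_fruit_salad() { return FruitSalad{}; }

int main() {
    // Each chef is a thread; the lambda is the work it will do.
    std::thread chef1([] {
        for (int i = 0; i < 25; ++i) {
            make_fruit_salad();
        }
    });

    std::thread chef2([] {
        for (int i = 0; i < 25; ++i) {
            make_fruit_salad();
        }
    });

    // Wait in the main thread until both chefs have finished.
    chef1.join();
    chef2.join();
}
```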
Any optimization that anybody can see here?
Does this seem reasonable?
Yeah, we seem to be in agreement on this one.
I'm sorry?
- [Audience Member] The chefs might work
at different speeds.
- The chefs might work at different speeds.
That is true.
That is very true.
Now in this case since they're doing exactly the same work,
they'll probably be roughly the same speed,
but it's a consideration that we need to take into account.
In this case, it won't really affect anything.
What it will mean is, we'll still get the same result,
but if one chef works at half the speed of the other,
we're not really getting that much advantage
from multithreading because the slower chef
will stall everything, so that's a good point.
Thank you for that.
Now let's make 50 apple pies.
Now we need an oven.
We have a resource.
Remember, you can't use the same resource
for multiple threads simultaneously
at least in the definition of resource
I'm using in this presentation.
Let's say okay, let's have each chef make 25 apple pies.
So we set up an oven.
This is a global resource.
And we set up a mutex to protect this oven.
Each chef is going to do this 25 times:
make the crust, put the apples in the pie,
get it ready to go in the oven, then lock the mutex
so that they know that they're the only accessor
of the oven, and bake the pie.
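A minimal sketch of this design, again with hypothetical Pie and Oven types standing in for the slide code:

```cpp
#include <mutex>
#include <thread>

// Hypothetical types standing in for the slide code.
struct Pie {
    void makeCrust() { }
    void addApples() { }
};
struct Oven {
    void bake(Pie &) { }
};

Oven oven;             // the global shared resource
std::mutex ovenMutex;  // protects the oven

void chef() {
    for (int i = 0; i < 25; ++i) {
        Pie pie;
        pie.makeCrust();   // independent work, no lock needed
        pie.addApples();

        // Lock the mutex so this chef is the only accessor of the oven.
        std::lock_guard<std::mutex> lock(ovenMutex);
        oven.bake(pie);
    }
}
```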
Any problems with this design?
(audience member speaking quietly)
Excellent observation.
You could be preparing some more crust
while you're waiting for the oven to be available,
so this is an example of,
well, this is a fairly simple example.
We have one resource and two threads,
and already we're getting into issues where
we're going to be blocking, waiting
for this resource to become available.
We have contention.
Yes sir?
(audience member speaking quietly)
I'm sorry I couldn't hear that.
(audience member speaking quietly)
(laughs) He has an oven which
handles two pies simultaneously.
That would certainly be an addition to this.
Some resources can be used
by some finite number of threads simultaneously.
It's not common, but it does occur.
You'd have a slightly different design and behavior here,
but that's a good example.
Let's make things even a little more complex.
Let's take the idea that you could be making a crust
while the pie is baking, and let's say
instead of having both chefs do exactly the same work,
let's have one chef make the pies
and the other chef put them in the oven.
Now why would this be useful?
Anybody care to guess as to why this would be helpful?
Yes?
- [Audience Member] If they take about the same time,
then that minimizes the time that someone's waiting
(audience member speaking quietly)
- If they take about the same amount of time,
then that minimizes the amount of time
that somebody's sitting and waiting and doing nothing.
That could absolutely be true.
Another, yes?
(audience member speaking quietly)
Excellent, you're getting at where I was going.
Specialization, one chef might be better at making pies.
The other chef might be better at baking them.
That's a good point, and I'll come back to that.
Yes?
(audience member speaking quietly)
It'll prevent deadlock because you only have one chef
trying to put a pie in the oven.
That's well-stated, but in this example,
since we only have one resource, and each chef
locks the mutex, then uses the resource,
and unlocks the mutex, can we have a deadlock?
(audience member answering quietly)
There may be blocking, that's true.
That's not a deadlock because it's a different situation.
You'll still eventually make progress.
You just may have to wait for the other chef
to be done with the oven.
Livelock, yes, you can get into
a livelock situation if you have enough threads
all working on this simultaneously.
That is true.
You can hit a livelock.
There's no deadlock here.
Like I said, terminology is exceedingly important
in this area.
(audience member speaking quietly)
Yes, yes, that is, so thank you for bringing in a term
that I actually didn't have in my slide deck.
That's helpful.
In this example, if one chef is fast enough at making pies
that they're always locking, acquiring the lock on the oven
before the other chef can get their pie in the oven,
then you have starvation.
So one chef could end up completing their work much faster
than the other one because they have better access
to the shared resource, so that's a great example.
In this case since both of them are doing
a fixed amount of work, you can't have starvation
for an indefinite amount of time, but it can definitely
make it so that one thread will complete
much faster than the other.
And that can be a property that you don't want
in your system, so thank you for bringing that up.
Getting back to the next idea of one chef prepares pies
and the second chef is going to bake them.
In order to do this, we're going to need some way
to move the pies from one chef to the other.
So I am going to, for sake of argument,
say that we have a thread-safe queue.
There are plenty of talks that you can go to
about how to implement a thread-safe queue.
I'm talking more abstractly about the way to plug together
these multithreading concepts into a working solution.
So we have a conveyor belt that will take the pies
from one chef to the other, so now our two chefs
do completely different things.
Chef one instantiates a pie, does everything
that needs to be done to get it ready for the oven
and puts it on the conveyor belt.
They give the pie away.
And this is a key attribute to look for.
If you can design your code
so that you can give away ownership of something
once you're done working on it, it is much, much easier
to work with in a multithreaded environment.
Because whenever you have shared data,
remember, you can have a race condition.
We don't want two processes, two threads,
accessing the same data at the same time.
Well if one thread creates something,
does all of the work on it that it needs to do,
and then gives it away to another thread,
you can never have shared data.
In this case, we're using move to give away our ownership
of this pie to the conveyor belt.
And the other chef just stands there waiting
for a pie to come off the conveyor belt,
pulls it off the belt, puts it in the oven,
waits for the oven to finish, puts another pie in.
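Here is a sketch of this design; the ConveyorBelt class is a deliberately minimal blocking queue standing in for the thread-safe queue we assumed we got from elsewhere, not an implementation from the talk:

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <utility>

struct Pie { };                           // hypothetical
struct Oven { void bake(Pie &) { } } oven;

// A minimal blocking queue, standing in for a real thread-safe queue.
template <typename T>
class ConveyorBelt {
  public:
    void push(T item) {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_queue.push(std::move(item));
        }
        m_ready.notify_one();
    }

    T pop() {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_ready.wait(lock, [this] { return ! m_queue.empty(); });
        T item = std::move(m_queue.front());
        m_queue.pop();
        return item;
    }

  private:
    std::mutex m_mutex;
    std::condition_variable m_ready;
    std::queue<T> m_queue;
};

ConveyorBelt<Pie> conveyorBelt;

void chefOne() {                          // prepares pies
    for (int i = 0; i < 50; ++i) {
        Pie pie;
        // ... make the crust, add the apples ...
        conveyorBelt.push(std::move(pie));  // give the pie away
    }
}

void chefTwo() {                          // bakes pies
    for (int i = 0; i < 50; ++i) {
        Pie pie = conveyorBelt.pop();     // blocks until a pie arrives
        oven.bake(pie);
    }
}
```

Notice that neither chef ever touches a pie the other chef can still see: ownership moves through the queue.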
Any questions on this phrasing of the problem,
and why this is potentially an improvement
over the previous solution?
Right there.
(audience member speaking quietly)
I've moved the work of synchronizing the resource
from the pie to the conveyor belt.
Very well spotted, that's true.
However, in my universe
of this restaurant kitchen,
there's no way I can allow two chefs
to use the oven simultaneously.
But there might be ways that I could allow
one to put a pie on the conveyor belt
and the other to take it off at the same time.
There are certainly lockless queues where we can move data
between threads without either thread having to wait,
so I may have improved my solution by getting myself access
to more tools to apply to this problem.
Does that address your question?
Thank you, any other comments on that?
I'm sorry?
(audience member speaking quietly)
You might make too many pies for the conveyor belt to hold.
Yes, so that's the opposite of starvation,
where one thread gets ahead of the process, and you might
have to have some way to throttle the first thread,
or allow an arbitrarily sized buffer that can grow
to hold the work backlog
that the second thread has incurred.
Yes?
- [Audience Member] The data sharing is still here
it's just hidden inside the queue which brings up
a great point about multithreading in general.
If you can make it somebody else's problem, do it.
You got a thread-safe queue from somebody,
maybe a lock-free queue.
Good for you, you don't see data sharing anymore.
- Yes, that's very well stated.
And in general, the best strategy
for writing a safe multithreaded data structure
is get it from someone else because they've probably
broken it in at least the ways they could discover,
so you have a better chance.
Yes, if you acquire a known safe thread-safe queue
from someone else, then by all means
find a place to use it in your program
to reduce your larger problem,
to using somebody else's solution
to an already solved problem.
It's very helpful, so thank you for that.
Can we hit a deadlock in this case?
Not if our thread-safe queue is implemented correctly,
because it's the only place where both threads
operate simultaneously.
Can we make this more optimal?
Well, we've talked about several ways in which
it could be better or worse than the preceding design
depending upon your constraints,
what the costs are for the various resources,
what the costs are for synchronization.
Are there any race conditions?
No, this design is correct.
Again, assuming the code you got from somebody else
to implement a thread-safe queue works properly.
You're done.
Now let's deal with the complexity
of a real commercial kitchen.
This is what you have.
You have orders coming in all the time for random things.
You have resources.
I'm sorry, I'm gonna get slightly ahead of myself.
Let's first deal with a slightly less complex
commercial kitchen, where we have two things
that we need to do, so more of a catering setup.
We need 25 fruit salads and 25 chicken salads.
Which of these strategies sounds like a better option?
We could have each chef make a fruit salad,
clean up his work station, make his chicken salad,
clean up his work station again and do that 25 times.
Does that sound like a good use of time?
What about one chef makes 25 fruit salads,
the other chef makes 25 chicken salads?
Does that sound advantageous?
I get a thumbs up.
Would you care to venture a reason
why this would be a good solution?
- [Audience Member] Because we won't spend any time
on cleaning the kitchen.
- We don't spend any time cleaning the kitchen.
Well maybe once at the end,
but we save a lot of time during the middle.
Absolutely.
- [Audience Member] It's good if they take
about the same time.
- It's a good solution--
- [Audience Member] It's bad if the times
are very different.
- Very good.
It's a good solution if they take about the same time.
If a fruit salad is much faster to make
than a chicken salad, it may not be the best solution.
It might be, depending upon what you're optimizing for.
If what you wanna do is to get the most work done
in a particular amount of time it might be a good solution
even though the chicken salads will take longer to come out.
They can be reasonable.
Or we can do what we don't want to have to do
in a multithreaded program and maintain some shared state,
where both chefs start out working on one thing,
and then they switch over to the other
once enough has been made.
This way if one chef works slightly faster than the other
at the beginning, we'll still end up with the right number,
and we'll have the lowest latency
to having all 50 salads available.
Does that seem reasonable?
Again, there's no right answer here.
These are trade-offs.
It depends on what you're optimizing for
in your particular case.
Now, let's talk about a real kitchen.
We have a lot of resources.
They all do different things.
They're all unique.
They can't be substituted for one another.
Anybody can just walk up to the counter
and order whatever they would like at any time.
How do you optimize for this?
You don't know the workload.
You don't know how long your chefs are going to take.
You don't know what's gonna be busy
when you go to try and fill an order.
What do you do?
This is the real multithreading problem.
This is not the simple one
that you like to solve with multithreading.
This is the one you have when you have no other choice.
So let's start working through this and develop this example
and see how we can apply the tools in C++11 to this,
so that we can make some sense of it.
Then in the next section, I'll show you
some additional tools that we can build
on top of the language to reduce the complexity
of this solution and make it more tractable.
First let's set up our resources.
So we have an oven, and now we have two, because
we need the Viking oven for baking pies,
and we need the brick oven for baking pizza.
The brick oven will be used for both the pizza
and the garlic knots, so now I have a resource
that's shared among multiple different things
that I might be working on as well.
And I have an ice cream maker.
We declare mutexes for each one of these,
so that we can manage the resource contention
and enforce that only one chef
will use them at a given time.
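A sketch of what those declarations might look like; the types here are hypothetical:

```cpp
#include <mutex>

// Hypothetical resource types for this kitchen.
struct Oven {
    template <typename Dish>
    void bake(Dish &) { }
};
struct IceCreamMaker { };

Oven vikingOven;               // for baking pies
Oven brickOven;                // shared by the pizza and the garlic knots
IceCreamMaker iceCreamMaker;

// One mutex per resource, so we can manage contention and
// enforce that only one chef uses each resource at a time.
std::mutex vikingOvenMutex;
std::mutex brickOvenMutex;
std::mutex iceCreamMakerMutex;
```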
I have a bunch of classes.
I have a class hierarchy of the various foods
that people might want to eat.
What I really wanna do is eat some food.
That's the ultimate goal of this exercise.
Everything else is just machinery.
Always identify the output in a multithreaded program
and try to work back from there to see
what machinery you need underlying the solution.
In order to eat something, I need to consume food,
so I'm gonna consume it by rvalue reference because
once I'm done with it, it's no longer edible.
So I'm just going to print a message that says
I was successfully served my food.
And I'm going to set up a couple of tickets.
We're going to use the future and promise system in C++11.
I look at the future and promise system
as the equivalent of the ticket you get
when you order a sandwich at a sandwich shop.
It gets split into two halves.
One half goes to the chef and tells the chef
what work I would like you to do.
This is the sandwich I want.
I get the other half.
Later on when the work is done, we can pair them up.
I get my result.
I get my food.
So this is the future promise system,
and this is the value of doing this
rather than some other mechanism
for getting data back from another thread.
You'll note that these are working with unique pointers.
The T inside the future and promise are unique pointers.
Unique pointer is the best feature they added
for the threading library in C++11, because unique pointer
is a fantastic tool for preventing shared data,
and if you can, prevent shared data.
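In code, the ticket might look like this minimal sketch, with a hypothetical Food base class:

```cpp
#include <future>
#include <memory>

struct Food { virtual ~Food() = default; };   // hypothetical base class

// The ticket, split into two halves: the promise half goes to the
// chef, the future half stays with the patron, and get() pairs
// them up later.  The T inside both halves is a unique pointer.
std::promise<std::unique_ptr<Food>> chefTicket;
std::future<std::unique_ptr<Food>>  myTicket = chefTicket.get_future();
```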
So I've got a couple of patrons,
and they're going to come up,
and they're going to order random things
in a random order and then eat them.
Note that when we call eat, we're passing it
the ticket that we got back:
we're using the future, and we're calling get.
And when we call get, the patron is going to sit
at their table and wait until that particular food
is available.
Then it will show up in your local thread, and you can eat it.
Or if the chef gave up and there was a problem
like the oven caught on fire and it couldn't finish
making the food, when you sit to try to eat the food,
what you eat instead will be an exception.
And you can catch that exception and do
whatever you'd like with it at that point,
but this is a very useful tool because it means that,
remember I was mentioning, multithreading
does not always make your program more reliable.
At least we have a way to manage the unreliability because
we will get an exception if the other thread
fails to complete its promise.
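A sketch of the patron's side of this, assuming the same hypothetical Food type: eat consumes the food by rvalue reference, and get either delivers the food or rethrows the chef's exception.

```cpp
#include <future>
#include <iostream>
#include <memory>

struct Food { virtual ~Food() = default; };   // hypothetical

// Consume the food by rvalue reference: once eaten, it is gone.
void eat(std::unique_ptr<Food> &&) {
    std::cout << "I was successfully served my food\n";
}

void patron(std::future<std::unique_ptr<Food>> myTicket) {
    try {
        // get() waits at the table until the food is ready,
        // then moves it into this thread.
        eat(myTicket.get());
    } catch (const std::exception &error) {
        // If the chef gave up, the promise delivers an exception
        // instead, and get() rethrows it here.
        std::cout << "No food today: " << error.what() << "\n";
    }
}
```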
We're going to set up an order.
That is a thing that can be passed around
so that we know what to work on.
And we set up a flag that says is the restaurant still open
because we need to know when to send the chefs home.
And since we can't send the chefs home,
we need to set a flag in a place that they can see it
when they're done working on an order
and leave and go home of their own accord.
The chefs actually become very simple.
We just take another order off the queue and we work on it.
This looks fairly straightforward.
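A sketch of that chef loop, modeling an order as std::function and reusing the ConveyorBelt-style queue from the earlier sketch; both are assumptions, not the talk's code:

```cpp
#include <atomic>
#include <functional>

// An order is just a wrapper for work that is to be done,
// a work unit, modeled here as std::function.
using Order = std::function<void()>;

ConveyorBelt<Order> orders;               // the order queue
std::atomic<bool> restaurantOpen{true};   // when to send the chefs home

void chef() {
    // Take another order off the queue and work on it; when the
    // restaurant closes, leave of your own accord.  (A complete
    // system would also need to wake a chef blocked in pop().)
    while (restaurantOpen) {
        Order order = orders.pop();
        order();
    }
}
```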
Well what does an order actually do?
An order is nothing more than a wrapper
for work that is to be done.
This is also sometimes called a work unit
in a threading system.
So when I want to order some pizza,
I'm going to set up a future promise pair.
I'm gonna set up this chef ticket that I have,
and then I'm going to encode in the order
all the work that I want done by the chef.
So I want the chef to go make a new pizza, add the sauce,
add the cheese, go acquire the lock on the brick oven,
bake the pizza and then give the pizza back to me.
This whole setup returns immediately because
I'm not actually asking the work to be done.
I'm just saying this is the work I would like
to have done later by someone else.
So this order pizza method returns immediately
and returns the future that says at some point in the future
when the chef gets around to it
and all of this work is done,
you will have a pizza in front of you.
That's our promise.
If you're in C++14, you can capture the promise by move,
which means you don't need a shared pointer to do this.
C++11 lambdas didn't quite have
all this functionality available.
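Here is a sketch of order_pizza along those lines, in the C++11 style with a shared pointer to the promise, using the hypothetical brick oven resources and order queue from the earlier sketches:

```cpp
#include <future>
#include <memory>
#include <mutex>

// Hypothetical Pizza type matching the description.
struct Pizza {
    void addSauce() { }
    void addCheese() { }
};

std::future<std::unique_ptr<Pizza>> order_pizza() {
    // In C++11 the promise lives in a shared_ptr so that the
    // copyable std::function wrapper can hold the lambda; C++14
    // init-capture would let you move the promise in directly.
    auto chefTicket = std::make_shared<std::promise<std::unique_ptr<Pizza>>>();
    auto myTicket   = chefTicket->get_future();

    // Encode all the work we want the chef to do; none of it runs yet.
    orders.push([chefTicket] {
        std::unique_ptr<Pizza> pizza(new Pizza);
        pizza->addSauce();
        pizza->addCheese();
        {
            // Acquire the lock on the brick oven, bake the pizza.
            std::lock_guard<std::mutex> lock(brickOvenMutex);
            brickOven.bake(*pizza);
        }
        chefTicket->set_value(std::move(pizza));   // give the pizza back
    });

    // Returns immediately; the future pays off later,
    // when a chef gets around to the work.
    return myTicket;
}
```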
What are the consequences of this?
Well okay, we have a single queue.
So we might have contention on the queue of orders.
What could we do about that?
Well if we split it up into multiple queues,
one for each chef, then it can potentially be much faster
because each chef has to only check one queue.
But then what if the chef is sitting there waiting for work,
and there's stuff to do in another chef's queue?
So now you go look at the other chef's queue
and steal the work from them.
This is work stealing,
and it's a very common attribute in thread pools.
Implementing a work queue correctly and efficiently
involves a lot of complexity.
Again try to get one.
Don't write it yourself.
There's a lot of tricks to doing this properly.
Our chef really shouldn't be sitting there
waiting for a pizza to bake because we're wasting
an entire thread just sitting there doing nothing,
staring at an oven, waiting for it to be done.
If you go back a little bit,
this locking, it's kind of arbitrary.
There's a thing called a brick oven mutex,
and there's the thing called a brick oven,
and you have to lock one before you use the other.
There oughta be a better way.
So think on that.
Before I move on to the second part,
I'll take a few minutes to talk about
some miscellaneous advice for how threads
should be pulled together in a program and what to look for
when you're writing multithreading code.
This is a really common mistake.
If you have too many threads runnable at a given time,
the operating system's scheduling can become very slow.
This is usually not what you want.
When people start working with threads they think okay,
I have 500 things I wanna do, so I'll start a thread
for each one, and that's not what you want.
What you want is some number of threads roughly proportional
to the number of cores on your system.
Depending upon your design, it might be one to one,
or it might be two to one.
I've seen some cases where that's useful.
Where you have one set of threads that's doing
a very complex CPU-bound operation,
and you have another set of threads
that handles all the I/O-bound operations,
and each of those should be represented on a single core,
so you need two threads for each core in the total system.
So you really wanna aim for one or two
active threads per core.
More is not really a great thing.
We wanna really move the blocking calls
out to ancillary threads, so you really wanna see
if you can partition your work into CPU-bound
and I/O-bound components and then split this apart
into separate pools of threads.
This can really improve your performance
in areas like network servers that are doing
a lot of processing but also doing a lot of I/O.
Too much shared data, I know I've said this a few times.
I can't repeat it enough times.
If you have shared data,
you have the potential for a race condition.
The less potential you have for a race condition
the more correct your code will be.
So really try hard when formulating your problem
to reduce the set of shared data that you have to work with,
reduce the size of the shared data
that you have to work with, maybe even compute
some values individually in each thread
rather than fetching them from a common location.
It will often be an improvement.
Reduce the number of threads that need shared data
if you do have some need for sharing.
It's really the same idea as encapsulation
that we've all been doing as object-oriented programmers
for many years for those of you that actually use
the object-oriented part of C++.
Try to compartmentalize your system so that
the number of parts of the system that have access
to shared data simultaneously is the minimum feasible.
This should be your main design goal.
It is often more efficient to reduce
the amount of shared data, even at the cost of duplicated work.
If you do have shared data
and you can make it read-only, huge win.
Read-only data is better for caching.
It's also better for reasoning
about the behavior of the system.
Often times you can set up a system where you create
a large shared data structure initially, but then
once the process is up and running, the individual threads
only need read access to the shared data structure.
If you can do this, do so, mark it const,
make it very explicit that this data is read-only,
and you will know that once you've reached that point
in your program where the data is no longer written to,
your code is race condition free.
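A minimal sketch of that pattern, with a hypothetical recipe list: the data is written once during startup, then the threads see it only as const.

```cpp
#include <string>
#include <thread>
#include <vector>

// Hypothetical loader for a large shared data structure.
std::vector<std::string> loadRecipes() {
    return { "apple pie", "pizza", "fruit salad" };
}

int main() {
    // Written exactly once, at startup, before any threads exist.
    const std::vector<std::string> recipes = loadRecipes();

    // From here on the data is read-only, so any number of threads
    // can read it simultaneously with no race condition.
    std::thread worker([&recipes] {
        for (const std::string &recipe : recipes) {
            (void) recipe;   // read-only use of the shared data
        }
    });
    worker.join();
}
```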
And we aim for an environment
where you can look at a piece of code and say
I know a priori that piece of code is race condition free
because I can reason about its behavior,
and I know what it's doing when it accesses shared data.
If you don't have that property,
then you have to work through each example separately,
and I'll talk more about that in the second half.
As I just said, the property that we're aiming for is
as little shared writeable data as possible.
That is the number one concern that should drive
your data structure design in your program.
Now when you're reviewing multithreaded code,
there are many things to look for.
And there are better ways to arrange your code,
your data and your access to the data.
Stick around for part two.
If any of this interests you or if you're interested
in any of the other projects that we've worked on,
you can definitely come find us on YouTube.
Please do subscribe to our YouTube channel.
We have content coming out every two weeks about C++,
multithreading, CopperSpice and the various
other projects that we work on.
It's of general interest to the C++ community
even if you're not using our libraries.
Some of the things that we've worked on include
CopperSpice which, as I mentioned,
is a cross-platform GUI library.
We've also refactored out of that the CsSignal library
which is a thread aware signal delivery mechanism.
Developing the CsSignal library is actually
a large part of what drove this talk, because we encountered
some of the challenges in working with a library
that needs to be thread safe in this context.
There's also the CsString library,
which is our standalone Unicode-aware string library
with both UTF-8 and UTF-16 support.
And libGuarded which will be the subject
of the second half of this talk.
Some additional programs that we've worked on.
KitchenSink is a demo application
if you're interested in CopperSpice and how it works.
Diamond, which is a very lightweight,
cross-platform programmer's editor
written in CopperSpice.
And DoxyPress which is a documentation generator
that uses clang on the front end
for correct documentation of C++ source code.
All of our information is available online,
and you can contact us through email.
We welcome comments and questions.
Now I'll open the floor to questions here in the room.
Comments, observations?
All right, well in the next session, I'll be going over
the entire design of libGuarded which is a system
for managing access to shared data
in a multithreaded environment without losing your mind
and without invoking race conditions.
I hope you stick around for that because I'll be going
much deeper into this example and showing you how
to reduce the complexity of what we've developed
in our multithreading vocabulary at this point.
Thank you.
(audience applauding)