- Good morning.
My name is Stoyan and first of all,
thank you for waking up early to come to this talk.
Today we'll be chatting about data-oriented design
and I decided to do this talk because,
when looking at all the resources, all the talks,
all the blocks on data-oriented design,
I was unable to find a good example
of people showing a real-world production system
made with data-oriented design,
comparing it with objected-oriented programming
and outside of games.
So, I've been in the video games industry
for about 10 years now and I've been doing games,
I've been doing mostly, however, game technology.
I work at a company called Coherent Labs
where we create the technology for games,
so it's very likely that if you're playing
and if you're a gamer, you machine has executed
some of the code we'll be looking at.
For the past six and a half years,
we've been working on products based on Chromium,
WebKit, and now our in-house proprietary game UI
and the browser engine.
If you're not familiar with the terms,
Chromium is the open-source project
that is at the heart of the Chrome browser,
of Slack, of Skype, and many other applications
that are probably running on your machines,
and WebKit is the original project that the core
of Chromium is forked from and works in Safari,
on IOS, on Macs, so if you're an Apple user,
you're using it as well.
So, we have these successful projects
and around two and a half years ago
when I was working on optimizing and improving
the software based on these technologies,
the whole time I felt like I was in a cage
and that cage was the architecture of WebKit,
which is also the architecture in many ways of Chromium,
of those layout engines.
So, we gather up with my co-founders
and did the only sensible thing, of course,
which is to start from scratch,
leave all that behind and build
our own in-house browser engine for games,
and there was a very big question for us:
can a browser engine be successful
with data-oriented design?
We didn't know that.
We had successful projects with data-oriented design,
our graphics engine was built with that,
we knew about a lot of games that have been successful
by employing the design,
but we didn't know of anything with the scale
that we were looking at and we didn't know if it scaled
to a software that is very object-oriented in its core.
If you look at all those open-source projects,
they are all object-oriented and this has many reasons:
it's part legacy, it's part the way that the standard
of HTML rendering requires you to do some things.
So, to see how that went, let's do a very quick demo.
Let's run a test page that I did with Chromium.
This is a build that I have here.
Okay, it's a very simple example.
These are 3000 rectangles.
All that they do is they move left and right
and they change their color
and at the center there's a big rectangle
whose only idea is for us to gain an insight
on the performance.
So, if you're a keen gamer or somebody who works
a lot on performance, you already see that the frame rate
is not so great, and the exact number is...
It's a bit difficult to see.
It's around 13, 15-ish frames per second.
So, to zoom that a bit I'll use a very old school method
of doing this, I'll paste it in Paint and zoom it
and the reason is that the Windows magnifier
actually reduces the performance of the Windows presenter,
so we will skew our test if we did it with the magnifier.
So, we get 15 point seven frames per second.
It's very far from the 60 frames that we would like
to have for a smooth animation.
Okay, let's see what we did.
If you look down on the right,
we got around 50 frames.
Actually, in my hotel room, it was 70,
but who knows what's happening?
So, we have a significant improvement on the performance.
And the answer to that question happens to be yes,
we were successful, and we'll see how that was implemented.
So, today's agenda is we'll be looking
at the basic issues of object-oriented programming.
I'll just glance at the things that we believe
can be done better.
We'll look at the basics of data-oriented design.
There are a lot of free sources.
I have a final slide with references
that you can look up later.
The idea for today will be to design a specific system,
to look at a design made with object-oriented programming,
then to redo it with data-oriented design
and see the difference between those two.
What's so wrong with object-oriented programming?
Why people have a problem with it?
I believe that a picture speaks 1000 words,
so this is an inheritance hierarchy.
It's a bit difficult to follow, right?
So, you've got diamond inheritance and whatever.
So, object-oriented programming at it's gist
marries the operations with the data they're done on
and it's not a really happy marriage,
because in the end, you end up with those giant objects
with a lot of unrelated data in them
that you poke in different parts of your software.
If we imagine a game and you have an enemy,
you might have a data about its physics,
some data about it's AI, some data about it's rendering,
and so on, but every time you access that object,
for instance to draw it, you're actually polluting
your cache and getting a lot of data from the other stuff
that you're not using right now,
and object-oriented programming also has the feature
of hiding state in many places,
and by hiding I mean that you have all these Booleans,
all these hidden ifs in your methods
that, in the end, increase substantially
the complexity of your code.
I know those things have a real impact on performance,
scalability, modifiability, and testability
and these will be the four quality attributes
we'll be analyzing.
Data-oriented design takes a different approach.
It puts the data first.
You can imagine it's in the title.
So, if we look at the lower side of our slide,
in object-oriented programming you usually have
this logical entity with all its fields,
all the stuff that logically belongs to that object.
In data-oriented design,
we do a bit of a different approach:
we separate the data according to its usage.
So, we might have data A to C with data A with fields
A to C that is used in system Alpha,
we get data B from fields from D to F,
we use that in Alpha and in Beta,
and we transform that data to a new collection
that is used down the pipeline.
So, the gist of data-oriented design is to separate
the data from the logic.
We try to use the data where necessary
without polluting our caches
and trying to keep our logic simpler.
So, we embrace the data, we don't try to hide it.
We avoid hidden usage.
We'll see how.
It's an interesting technique.
We try to avoid virtual calls because, very often,
we just don't need them, and data-oriented design is one
of those things that, at its core, are simple,
but they're very difficult to master,
a bit like the game of Go.
It promotes domain knowledge.
You cannot separate well your data
and build a good system if you don't know that data,
if you don't know the problem that you're solving
and the system you're designing.
And as I mentioned, there are references at the end
of the talk for additional information.
So, let's design a system and the system I have selected
is a system for animations, from the browser engines
that I showed earlier, and to give you an idea
about what is possible with such a system,
this is a very small example from one of our samples,
but it gives you the idea that with animations
you can do a lot of things in CSS,
so the example that we showed earlier for the performance
was just moving rectangles and changing their color,
but you can do more complex things like transformations,
moving text, opacity, and so on,
so you can modify different properties.
The details of the system are not that important.
It's more important for us to gain an insight
on how to create it.
The definition of the animation is again very simple.
We have textual definition, we have our key frames,
everybody who's doing animations know what the key frame is.
It's I want the animation to start here and end here
and has this property here and here
and we have the duration of the animation.
In a real world system, there are other properties,
obviously, but these are the basics and the others
don't change the design that much.
We already saw that we can modify
different types of properties
and this will have a real impact on our design.
This means that when we implement this in C plus plus,
we'll have to interpolate different types
of C plus plus structures or objects or whatever.
We can have numbers, we can have colors,
we can have transformations, and so on.
And another important design is that this definition
is not static.
We want to allow the user at runtime,
to modify the animations.
So, let's try to implement this
with object-oriented programming
and for the example today, I've selected Chromium,
and I don't know how many people are familiar
with the Chromium code base,
but I really, really encourage you to go and study it.
It's an incredible piece of software
done by one of the best engineers in the world,
so there's a lot to learn.
Of course, it's a giant code base,
many millions line of code,
but when you get the gist of it, it's very cool.
So, Chromium actually has two animation systems.
We'll be looking at just one of them.
The other is similar in its design,
so all the things that I'm talking apply to that as well,
and it very closely follows the HTML standard,
so that's why probably they have this object-oriented design
in the first place.
As it's object-oriented and we are creating animations,
what do we expect to have?
A class called animation.
Right, that's exactly what I looked up in the code.
There is a class called animation.
That's great, but wow.
It inherits six other non-trivial classes.
At least it's final so we know there are no others
down the line.
And you remember the picture that I showed
in the beginning about the inheritance hierarchy?
It's actually the inheritance hierarchy of this class.
So, up you get a lot of inheritance.
It's already difficult to follow, right?
Okay, so to gain a better understanding of the system,
we'll be looking at the most common operation,
which is ticking or updating an animation,
that is from the time and the definition
to generate the new value for this frame
of the property that we're changing,
the new color or the new position of our elements,
and we'll delve into the code of Chromium.
It starts very as we would expect.
There is a method called service animations.
It takes time at the second line.
It takes the current time and it calls update
on every animation pointer that is in our collection.
There is a collection of animations needing update,
so clear so far.
However, we already see some strange things happening.
They are not walking over the animations
needing update vector.
They are copying all the pointers and walking over
and iterating over another temporary vector,
and the reason for that is that, in this call stack,
but also probably in other call stacks,
they are deleting from the original vector.
So, we have some unclear lifetime semantics.
It is very difficult for us now to know
when those animations get created and, most importantly,
when they get deleted.
If they are writing this code, there is a reason for that.
So, let's look at the update method,
what's happening right there.
Fairly simple, it takes the reason,
but the very first line is the hidden state
I was talking to you about.
If there is no timeline, which is some pointer, whatever,
they return false.
So, the animation will do nothing in some condition,
but we're already paying a big cost just for this if
because the animations are all located on the heap,
so we're just touching that field just to see
if it's set or not and we are paying for a cache miss
and we might be having a lot
of branch miss predictions as well,
because if we have a good mix of active
and inactive animations, the branch predictor
will probably get confused,
so we're paying some performance cost
and we're exponentially the states and the complexity
of our system.
Okay, so if we have this content,
which is another condition, what do we do?
We delegate, actually, the animation to another class,
which is held in the declaration of our animation
as a member.
So, Chromium does garbage collection in C plus plus,
but you can think of this member type
as a shared pointer.
So, we are again paying the cost of a possible cache miss
and this animation effect read only
is actually the definition of our animation.
So, I'll be skipping some of the call stacks.
I had a lot more slides, but had to remove them
for time constraints.
We'll be skipping the call stacks
and looking just at the interesting parts.
Next up, we're going to update our value,
so we skipped a bit and we see some interesting stuff
that happens a lot in object-oriented programming.
In some cases, the animations have to call an event.
So, if there is an event delegate,
we'll call an event condition,
and here we have brought coupling
between our animation system and the event system.
Moreover, we have no idea what the on event condition
which will destroy all the whole data
that we have in our caches,
execute whatever it's doing,
and they you'll have to reload all the data
that we just threw away.
So, a possible very heavy performance heap.
There are probably also some instruction cache misses
because we don't know where this code is going
in our giant source.
So, we have calculated all this and comes
an interesting part that I mentioned in the beginning.
We want to be able to interpolate different
C plus plus types because our animation
could be on a color, could be on a number, whatever.
So however, we still want to keep that
in our system somehow, so how do we do that
in classic C plus plus,
in classic object-oriented programming?
We want the method that is different
depending on the type that we want to interpolate
and they have different sizes.
Yep, virtual functions on abstract class.
That's exactly how they do it.
So, there is a class interpolation,
which is abstract and it is inherited
for every possible type that can be interpolated
and the virtual interpolate method does the magic.
Unfortunately, this type of dynamic type erasure
in C plus plus is very costly to do.
You have the cost of allocating all those interpolations
and in our example earlier, that's 6000 animation,
so 6000 of those, and you have the very likely cache miss
when you call the virtual,
and a very likely instruction cache miss,
depending on the types of properties you're animating.
Finally, we have calculated the new value,
the interpolation did its magic.
We have to apply that new value to the properties,
to the objects that are being animated,
and again we do it in a very straightforward manner.
We have a member variable target,
which is the element that will receive the new value.
And again, we bring some coupling between the systems
because now our animation system,
whose only job was to calculate new values
according to time, knows about the document object model,
about the styling system, so that it can call
the appropriate method of the element.
In our case, the element is called set needs animation
style recalc, and if we look at the implementation,
we'll see something very, very scary.
This is again a symptom of object-oriented programming,
I believe, where you call a method,
but as it is a black box, we have no idea what's going on,
so you might end paying a very big cost on that
and we're indeed paying a very big cost
because set need style recalc, which is called there,
actually walks up the document object model tree
for every element and sets a Boolean,
and on each call up this tree,
we'll very likely be getting a cache miss,
so we were doing simple animations,
now we're walking up the DOM tree,
we're again changing the context of our program.
So, to recap very quickly, to do our animation,
we've used more than six non-trivial classes,
the objects contain smart pointers to the other systems,
they are coupled with the other systems,
for the interpolation we use abstract classes,
and we are basically drinking the poison
of object-oriented programming all the way.
All right, now let's try do it with data-oriented design.
Let's clear our mind, forget what we saw,
and return to the drawing board.
So, let's design our system around the most common operation
and this is the operation of ticking,
of updating the animation.
It happens like 99% of the time,
we have other operations obviously,
like adding, removing, posing an animation, whatever,
but the most important one,
the one that defines our performance, is the tick.
For this tick, we have two very simple pieces
of data as input.
We have the definition, obviously, of the animation,
and we have time, and as output, if you think about it,
we also have some very simple piece of data.
We need the properties that have changed,
we need the new property values,
and we need a mapping the elements
that will receive those properties.
And again, as a cornerstone of data-oriented design,
we will work on this system like there are many animations.
With object-oriented programming and the thing that we saw
earlier, it works as if there is just one animation,
so you call update on the animation, it does what it does,
but this is not what's actually happening.
You don't have one animation, you have dozens,
hundreds, even thousands.
So, how will our tick work?
It can be done very simply.
Let's create a class called animation controller
or whatever name we find interesting.
It will contain an array of our animation states,
it will take time as input.
Those states are generated from definitions,
and it will output exactly what we need:
An array, a vector of the changed properties,
an array of the elements that have changed
with the links back to the new properties.
And here, the element pointers are just DOM pointers
for our system.
It doesn't know what they are.
It doesn't directly call to the document object model.
With that data as output, we can leave it
to the next systems in our pipeline
to decide what to do.
So, the events system can walk these changed properties,
these changed elements, and decide to call
the events where necessary.
The styling system can walk over these elements
and decide what to do, knowing their new properties.
So, we've reached a lot of the coupling,
a lot of the separation of concerns.
How do we build those animation states?
Well, it's very easy.
Again, go flat, forget about methods,
just have plain old data style struct
with all the data that we need.
The details are not important.
This is the runtime data that we need,
so the start time, the post time of the animation,
and here is a bit of a twist:
we also have all the animation definition copied
for each animation state,
and we did this for two reasons:
the first one is that empirically,
we found out that the performance is better this way
and second, it makes some operations very easy.
Imagine you want to change the duration
of a certain animation.
If the definitions were shared pointers,
you would have to implement some sort of copy on write
or other EDM to do that,
but now that they are separate,
you change just the property for that animation,
all the others are completely unaffected.
So, think about how that might help us,
for instance, doing this in multiple threads.
Okay, so we have these properties.
However, we haven't solved the problem of having
to interpolate different types of different
C plus plus types.
Remember, they were doing it with the abstract class,
but we don't want to pay the cost for that.
So, to avoid type erasure, C plus plus has a very good way
to do it statically: we can use a template.
We can use a template because we know exactly all the types
of properties that we will be modifying.
There are no new ones.
So, we now have different types
templated by the animation key frame
and we would like to iterate over them
to run the animation.
However, obviously they have different sizes in memory,
so we cannot put them in a nice, neat vector
and walk it over, but as I said, we know each on of them,
so what we can do is very simple.
Let's have a vector for each type.
Now, they are the same size,
so we have as many vectors as types as needed,
and when we want to interpolate them, it's very simple.
We go over all the types A, all the vectors of that type,
then we go to the second one, then we go to the third one.
So, we are now having an order of magnitude
less cache misses than we had with the object-oriented case.
The reason is that we will be having a cache miss
when we change the type of the vector we are iterating on.
However, and this is again a bit of the main knowledge,
but we know that we'll usually have a handful of properties
that are really animated, so we'll end up with a bunch
of very large vectors that we can linearly iterate
and the prefecture of the CPU will do its magic
and we'll not be paying the cost of cache misses
and we'll end up with a lot of vectors
that are simply empty and we'll skip over them.
So, instead of having one cache miss for each animation,
as it was in the object-oriented case, so O N,
we're actually having one cache miss for each type
of animation, which is an order of magnitude less.
Actually, in the performance example I showed earlier,
there were 6000 animations
because there were 3000 rectangles,
but they were changing the color and the position,
so these are different animations,
so in the object-oriented case,
we got 6000 cache misses each frame.
In this case, we're just walking over the color
and the position, so we'll be getting
just two cache misses, probably.
And on the implementation level, it's very easy.
We can template our decontamination function,
according to the property and just have
the same code running.
I mentioned that object-oriented programming
tends to hide a lot of state.
We saw that with the activeness of the animations,
so you can have some that are enabled,
some that are disabled, and in the easiest way
of implementation, you can have a Boolean is active
or in the case of Chromium, it was if there is a timeline,
which is basically the same thing,
and do if this, then that, and whatever.
And in our system, the activeness, inactiveness
is the most important state, the most important Boolean
that we have, so we can employ something
that we call existence-based predication.
We just have to erase.
One array is for the active animations,
one is for the inactive animations.
The act of enabling an animation is just moving data
from one to the other or vice versa.
In this way, when we iterate and have to apply an operation
on our active animations,
we know that there is no hidden state,
we can apply the exact same code,
the exact same transformation on all of them.
This is a bit difficult to do for every possible state
that you have, so you can prioritize
by the most important,
by the one that has the most volatility,
or you can try to reduce the number of states.
So, to look a bit at the code that's final,
the details are not that important,
but if you look at it, it's very simple
and in many ways beautiful because there are no ifs,
there are no branches, and you can read it like a book.
So, we interpolate the key frames,
we interpolate the values,
everything works as we would expect.
And this get interpolated value function
that I have highlighted is just a template
that will do the right thing and call the right function
for the property we're changing.
And though this code can be safely in our C plus plus file,
it won't pollute with templates all over the place,
and will also keep the code together, very likely,
in our final executable,
so instruction cache misses are very unlikely.
Okay however, our system is now working.
It's very high performance, we are very happy with it,
but we want to add an API to the user
and we want it to be a convenient API.
And here, object-oriented style of classes
with methods and so on is really, really convenient.
You can give the user just an animation object
and she can call play, pause, and whatever.
So, we want to give that to the user.
So, let's create our animation object.
Now is the time.
It has all of the methods that we would like,
but contains only one simple piece of data
and that's an animation handle, just an ID,
and the first all the operations to there functions
the animation controller.
So, our animation class is very DOM,
but it gives this convenient API to the user
while we can still rework our internal data
and make our system as performant as possible,
and we are using a simple handle to simplify the lifetime
of those objects because now we know that the lifetime
is completely controlled by our controller.
There is no shared ownership over this data.
And to implement the DOM API in the controller
is very simple, you just have all the methods
that take the ID.
So, to quickly recap the concepts between OOP
and data-oriented design that we saw,
in the Chromium case we were using an inheritance
of six classes and all the logic was crammed
in this animation object and its links.
In D O D, we did a very simple flat animation state struct,
templated, and we had other methods that worked on that.
We used a list of dynamically allocated interpolation,
so abstract classes, but in C plus plus you can do better.
You can use templates and if you know the types
that you're having, you can just have an array
for each one of them.
OOP used Boolean flags for activeness
and again, by flags it's not just a Boolean,
it can be the presence or not of a pointer, whatever.
And in data-oriented design, we used different arrays
for the different states.
And to give an API to the user,
with object-oriented programming we directly give
the API of the class.
However, our user just wants to play an animation
and immediately gets 100 methods
that doesn't know what to do.
We solve that by giving a very clear API
only with the stuff that is needed.
And finally with object-orientation,
you reach out to the other systems in the program directly.
With data-oriented design, we just output tables
with the new data and let the other guys
in the pipeline decide what to do with them.
So, the key points with data-oriented design:
Try to keep your data flat,
try to use existence-based predication
to get rid of this state all over the place,
try to use ID-based handles to reduce the pointers
and to simplify the lifetime of your objects,
and try to use table-based output
because this table-based output will actually be the input
for the next thing down the pipeline.
Let's analyze how we did.
Okay, the first thing we'll be looking at is performance
and I'll be looking at the performance of this system.
It's six times faster.
So, this is a hidden treasure for us.
We did not change the algorithm,
we did not invent some very clever way to do animation,
we're executing and doing pretty much the same thing.
Only by reorganizing how the data is used,
how the code is structured, we've gained six times
better performance on this system, and think about it,
it's very likely that you have such systems
in your code bases as well
and the treasure is waiting to be found.
Next up, let's talk about scalability.
How could we hypothetically scale those systems
and hereby scaling, I mean a very simple thing,
making them work on multiple threads.
Well, in the object-oriented case,
it will be very difficult because we will have to account
for all those dependencies for the data races
that are very probable in the system,
so we don't know when we say to the target
to change its style if some other thread is doing that,
we don't know if all the methods we are calling
are thread-safe and we have to make sure they are.
Another problem with the hidden state
is that if we hypothetically solve these issues
and run, let's say, half the animation on core A,
half the animation core B,
if they have a very different runtime profile,
so if core A is lucky and gets a lot of inactive animations,
then it will be idle for a lot of time.
We won't have an even distribution of work.
With data-oriented design, things look simpler.
Our animation states are completely independent
of each other, we duplicated some data to achieve that,
so what we can do is just let's cut our vectors in half,
half goes here, half goes on the other core,
every one of them outputs its data,
and in the end, in a classic fore-conjoined fashion,
we can merge that data together
and leave it to the next system.
What about testability?
Well again, with the object-oriented, it looks difficult.
I wouldn't be the one that tries the only test
for all of this because I will have to mock a lot of things,
for instance, the events, the document object model
just to test my animation system.
Moreover, I have a lot of hidden states,
so a lot of ifs and elses and so on.
So, if I want to cover all of them,
I have this combinatorial explosion of tests
that I would have to do.
In the data-oriented cases, again, looks a bit more simpler
just because we have really isolated the animation system
from everything else.
We have our very clear input, our very clear output,
so as long as that contract is not breached
and the right input gives you the right output,
you are fine, the system works,
and the other systems down the line have the responsibility
to use that output in the correct way.
For the modifiability, such large and complex objects
throughout the hierarchy stand to harden,
and by that I mean they become so complicated
that the programmer is very much inclined
not to modify them, so it's very difficult for us to,
for instance, to experiment with changing the layout
of the object or moving some data from one class
to the other, just because everything will break
and we have to fix a lot of things.
So, those things are usually left to big re-factorings
that in the end, never happen.
In the data-oriented case, as we are completely owning
our data and it's very clear,
we have a lot more wiggle room.
We could separate our animation state into arrays,
our animation could continue to use both of those,
but we can send part of that data to another system
and we can try and reorder some of the functions,
it's a bit clearer.
However, there is one thing that object-oriented programming
excels at and this is what I call quick modifications.
Imagine that we add a new state,
like if it's Tuesday, the animation runs twice as fast.
With object-oriented programming, it's very easy to do.
If Tuesday, do whatever, else do the other thing.
So, we add a new state and it just works.
If we are to keep the pure data-oriented design,
this is a lot more difficult to do
because we would have to reanalyze our data,
we would have to look for ways to keep this state
out of our main kernel of the function
that is running the animation,
and it will probably take us a lot more time,
but in the end, it will probably pay off.
Obviously, data-oriented design is a tool
like object-oriented programming is,
like everything we do in programming is,
and as every tool has, it has its place,
but it also has some downsides
and the biggest downside, I believe, is that the current
data separation is actually very hard to do.
I mentioned that earlier, but data-oriented design
is easy to explain, but very difficult to master
and very difficult, in the end, to really achieve
what you're looking for;
and here comes a bit the domain knowledge
that is good for you to have.
If you don't know very well the problem that you're solving,
it's very difficult for you to design the data in the way
that it will be optimal, so know your problem.
As existence-based predication is also a bit difficult
to achieve if you have a lot of those states
because you start having these multi-dimensional arrays
with this and this and whatever
and my advice there is try to reduce the state.
If you've done GPU programming,
there are great examples there.
You can try and reduce the state by executing
the same operation on multiple data or other techniques.
Just get rid of those ifs and elses.
Quick modifications can be tough.
That, I mentioned already.
And we as programmers might have to unlearn a thing or two.
So, we have been wired with this object-oriented type
of thinking since forever, actually,
so in the beginning especially, it's difficult to start
and rethink everything in this new framework.
And unfortunately, the language C plus plus
is a bit difficult to use in some cases,
is not always your friend.
So, should we get rid
of object-oriented programming altogether?
I don't think so.
It is a tool, as well.
It has its place in our code bases
and it's the antithesis of my title.
But sometimes we have no choice.
You might be using third party libraries
that use object-oriented programming, so you're out of luck.
You might have API requirements that make you need
to use object-oriented programming.
You might have seen that, but object-oriented programming
is not a synonym for classes and methods and so on.
With data-oriented design, I still had classes,
I still had structs, it was that I put them first,
I put the data first and not the operations, not the code.
Polymorphism and interfaces have to be kept under control.
I don't believe that their role is in a very
low-level system, like interpolating a couple of numbers
in the animations, but they have their place
in high-level systems, in client-facing APIs,
they are amazing if you do a type of plugin
and you really need some dynamic polymorphism.
And remember that C plus plus has amazing facilities
for static polymorphism.
You can go a long way with templates
or just when you know all the types that you have,
you can generate everything, actually.
It works surprisingly well.
Finally, are there some changes in C plus plus
that will make it easier for us to write
data-oriented design code?
I have a dream, which will probably never come true,
but there are languages that allow you
to change the memory layout of objects very easily
and to experiment, for instance, with array of structures,
structure of arrays, and not having to rewrite
half of your code for that.
I don't see that coming, but we can always hope.
I didn't show that, but in games,
very often data-oriented design goes hand in hand
with entity component systems,
and even if you don't need the whole flexibility
of an entity component system,
it will be great if it was possible for us
without paying a big cost to reorder
and to split our classes into components,
and this is actually doable in C plus plus
and I'm talking about doing it without a pointer indirection
because if you do it with pointer indirection,
it's very easy, but it requires a lot of custom code
that it's a bit of a mess.
For me, the ranges proposal looks very exciting.
I believe it will be a great addition
and it will help a lot with some of the constructs,
especially to make the code more clear, more readable,
and this is a bit of a pet peeve of mine,
but unordered maps and unordered sets are not that great
because they do allocations.
I think that we need in the standard another hash table
that has some relaxed requirements
and will allow us to use another scheme of addressing
and reduce those allocations.
So, obviously there are a lot of open-source solutions,
we wrote our own hash table,
and actually reduced the allocation count
in one of our versions by 10%,
just by changing the unordered maps and sets
with a scheme that uses open addressing;
and this again brings you back to the know your machine
and design software for the machine that will run it.
Don't design for a hypothetical abstract machine
like it's in the standard because it doesn't exist.
So to conclude, object-oriented programming
is not a silver bullet, but data-oriented design
is not one either.
The thing is that you must use your best judgment.
They are just tools in your toolbox,
so they have their places.
It's your job to find them.
We have around 10 minutes for questions, so please.
- [Man 1] Yeah, you're advocating and compile time
polymorphism, and I like that as well,
and then there are always people who say to me,
yeah, but then you probably will stamp out a lot of codes
and from your experience in also knowing the machine,
when do you hit problems in that area?
That there's too much code, so that you get
instruction cache misses,
and do you know of tools, how I can see whether
those problems occur?
- Yes, this is a concern.
So, what we try to do is keep the templates
inside implementation files.
In the animation system, this is doable.
If we see an explosion of templates,
this is a problem that we try to cope to case by case,
so you have two problems that you might explode
your build time, which is obviously very bad
for productivity, and the other thing is what you mentioned,
and we have seen in some cases,
the compiler producing a lot of binary,
and increasing our instruction cache misses.
There is no formula to solve that, to be perfectly honest.
Either you have to reduce the templates
or you can rely on some black magic
and this is what we've done in some places, honestly.
So, if you have the fortune on working on a platform
that really easily shows you where the instruction
cache misses happen, which we have access to,
so on game consoles, for instance, it's very easy to see.
You can go there and try to reorder your functions
or you can try to use link time optimization.
Actually, in our experience, link time optimization
helps a lot with some of those cases,
but sometimes you're out of luck and you have version one
where everything is fine and then you change a line
somewhere in code and you see five percent slower,
just because Clang has reordered some stuff,
and when you're working really close to the maximum
possible performance, it's a thing of life.
I don't have a formula to solve this
all the time, unfortunately.
- [Man 1] Great, thank you very much.
- Thank you.
- [Man 2] Hi.
- Hi. - You've described
and architecture for this system that includes data
and operations on that data
and I'm not familiar with this domain frame much,
but I'll take your word for it that it's a good choice.
But with those structures, you may want to maintain
certain variance in the data, for example,
and you can do that with constructors or methods.
- Yes. - And you've also described
that for certain parts of your system,
like the user facing part or the client facing part,
you may want to have those abstractions.
- Yes. - And so,
I guess my biggest question is what exactly do you consider
to be object-oriented programming
and why is what you've described not that?
Because I think really you're just talking about choosing
different abstractions for different parts of your system,
and I don't really think that that contradicts
this idea of OOP unless you have a different idea
of what that is.
- Got it.
Yes, it's different way of solving the same problem,
so it's the same in variance, that's true.
However, we have turned a bit around the way it is done
because with object-oriented programming,
what I believe it means and what usually people
end up doing is having this object that we saw animation
that contains all the data, all the invariance,
and all the operations for one animation, right?
That's what we saw.
However, this does not map well to the hardware.
It also does not map well to the complexity of our system
because we end up with these giant clouds
that knows everything and can do everything.
- [Man 2] Let me try to refine my question a little bit.
So, the title of your talk is OOP is Dead,
and I know you yourself sort of weakened that stance
at the end a little bit,
but what exactly is dead is what I'm asking?
What precisely is OOP that you want to die?
- The notion of having these giant logical classes
that contain unrelated data and I believe they have
to be reworked in different collections of data
used by the different systems.
Did that answer your question?
- [Man 2] That just sorta sounds to me like bad engineering
versus good engineering and I don't know
if you can call that OOP.
- Well, a lot of code base, unfortunately,
end up with the thing that I showed
because when you have the class animation,
every developer usually goes and adds a new Boolean
or adds a new data or adds a new method
and you can call it bad engineering,
you can call it however you want,
but it's a fact of life.
No matter how much you try
and definitely the Chromium and WebKit are done
by some of the best engineers,
but in the end, you end up with this giant
inheritance hierarchy and so on.
- [Man 3] Hi, so if I understand correctly,
there's a difference of about 15 to 20 milliseconds
per frame in your initial demo versus the Chromium demo.
Do you have a sense of where those 15
to 20 milliseconds were going?
We talked a lot about virtual indirections
and cache misses, but we didn't really talk about
how many milliseconds per frame those were taking up,
just that they were occurring.
So for example, Chrome's GPU process is heavily sand boxed
and it has to actually communicate via IPC
just to even read the frame buffer,
and so that's a potentially dominating source of cycles
that obviously your program doesn't have.
So, my demo ran a bit slower than I was expecting,
to be honest, but it's probably because
the PC's not plugged, but I have an idea.
This is definitely...
It has some affect on the performance.
That's why when I do the final comparison,
I compare just this system.
The first one was just a way to engage
and to see that the overall system works better
and there are a lot of reasons for it to work better.
The GPU process, honestly, is probably not the biggest one.
The biggest one is usually the less cache misses,
the simpler call stacks.
I don't know if you're familiar with the code bases
of Chromium, et cetera, but they have very long call stacks
and there are reasons for them to do that.
I get that.
But for instance, the animation system could be made.
It just does the same thing and it's a lot faster
and the milliseconds are here,
so six, eight, and this is on this PC versus one, one.
- [Man 2] So yeah, that kinda jump me to my next question
which is that this simple demo is simple
and Chromium is not simple
and so, do you think that the data-oriented design
scales well with actual required design complexity?
For example, you got probably 20 other sub-systems
that need to hook into the animation system
and modify it or be notified when things happen
I don't know, but
- Yeah, it does that.
- Yeah. - It does that.
And one other point on the performance,
here, we usually see on PC around four to five times
better performance and this was an artificial test,
but I'm talking about real UIs and so on,
but this is because this PC has eight megabytes
of health-free cache.
I don't know what that is.
Okay? - Okay.
All right, yeah.
- If you go to a platform that has a lot less cache,
like a mobile phone or a console,
the difference in performance goes way up.
- [Man 3] Okay.
- [Man 4] Hi.
- [Man 4] So yeah, continuing on the line
of what the previous person on this microphone was saying,
I don't think you're being entirely fair
to object-oriented programming and not all implementations
of object-oriented programming feature all of those features
you listed. - Sure.
- [Man 4] And also, yeah, this is a case for, obviously,
the classic approach with object-oriented programming
is not applicable, but actually, I wanted to ask you
a bit about this: six point 12 on what type of CPU is this?
- On this?
- [Man 4] On this one, yeah, because honestly,
I think that it would be even more
on an ARM CPU, for example. - Oh, yeah.
- [Man 4] That have much lower tolerance for cache misses.
So, perhaps use the ARM CPU statistics for your next
- Yeah, I mentioned that, but the problem was
that I had to build Chromium for ARM
and I didn't want to do that,
but definitely, if you have less cache on your processor,
this number goes up,
and to answer the first part, yes,
I'm sure that there are in the world well-written
object-oriented systems that have their place.
That's what I say at the end.
- [Man 5] Hi, I can see some challenges in debugging
a data-driven system like this.
I wonder if you had some challenges
and how did you manage to solve them?
- Yes, there are some challenges.
The biggest challenge is we are accustomed to
we see a bug, we go into the debugger,
we have the subject, we can inspect all of the data,
but when you're doing data-oriented design,
you might know which data belongs to which logical entity
and what we usually do is that, in the bug,
we annotate our data so that it's easy for us
to relate it to other stuff in they systems.
That's what we do usually.
I don't have a lot of
- [Man 5] Do you think there could be a tool
that would help debugging that would access the data
because the data, if you have one entity,
the data is scattered all over the place,
so for a human to understand it, it's way harder.
- Sure, I haven't seen generic tools.
I have seen a lot of tools ad hoc like the ones
that we have just to solve these,
so we annotate the data, have some extensions
for the debugger to be a bit easier for us to find it.
I've seen a lot of game companies,
especially when things go multi-threaded,
it's even more difficult to find it.
A lot of companies have some type of graph
that they visualize or try to do it that way,
but I'm not aware and I cannot really think
of a generic tool that will really help.
It's very domain-specific.
- Thank you.
- [Man 6] What is you're experience
with long-term support and maintenance using this approach?
My biggest concern is that, as time goes,
you add more data to the structures
and you keep multiple coop is,
so the data would be like data upload,
so in a few years, you pretty much has to reorganize
all the layers and adjust them to the architecture
the CPU's using to the caches
and pretty much redoing the whole thing.
Well, we pay attention to what we do in the first place
and second, I believe that this goes a bit on the slide
that I said about the systems that tend to harden.
I believe that it's easier to re-work
and eventually split the data if it is done in this way
than if it is done in a giant hierarchy of classes.
So as time goes by, requirements change
and the system has to change.
We've had so far very good experience,
so we've worked for years on Chromium's base
and for some years on this stuff
and actually, the maintenance time
and the solving a bug and implementing a new future time
is an order of magnitude faster
and of course, it might be related to the fact
that Chromium is millions of lines of code.
Ours is very big, but it's not that big, so it depends.
I believe that in time, this is easier to maintain.
- Thank you.
- [Man 7] You talked about that there
was read-only common state between all of these classes
that you needed to duplicate across all of this state.
Could you talk about the challenge as to why
that couldn't be simply pulled out
and read-only state separated from mutable state?
Because it's not read-only.
The fact is that the user can modify that definition
at any time, so one way to do it
is to do a copy on write because you might have
multiple animation that refer that.
The other is to copy it.
So, as we have little data, it's an empiric finding
that we did that it's easier and faster
for us to copy the data.
As it changes, that fact of life might change
and you saw in the Chromium base, you might have noticed
that the class was called animation effect read only,
it's not read only.
They have cost methods that do cost cache,
remove the cost, and change them, so that's not true.
So, it's mutable in their state as well.
Unfortunately, we are out of time.
I'll be here, so everybody who has questions
is welcome and thank you again for your time.