<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Molecular Musings</title>
	<atom:link href="http://molecularmusings.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://molecularmusings.wordpress.com</link>
	<description>Development blog of the Molecule Engine</description>
	<lastBuildDate>Wed, 22 May 2013 17:32:34 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='molecularmusings.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Molecular Musings</title>
		<link>http://molecularmusings.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://molecularmusings.wordpress.com/osd.xml" title="Molecular Musings" />
	<atom:link rel='hub' href='http://molecularmusings.wordpress.com/?pushpress=hub'/>
		<item>
		<title>Adventures in data-oriented design &#8211; Part 3b: Internal References</title>
		<link>http://molecularmusings.wordpress.com/2013/05/17/adventures-in-data-oriented-design-part-3b-internal-references/</link>
		<comments>http://molecularmusings.wordpress.com/2013/05/17/adventures-in-data-oriented-design-part-3b-internal-references/#comments</comments>
		<pubDate>Fri, 17 May 2013 13:53:42 +0000</pubDate>
		<dc:creator>Stefan Reinalter</dc:creator>
				<category><![CDATA[C++]]></category>
		<category><![CDATA[Graphics]]></category>
		<category><![CDATA[data ownership]]></category>
		<category><![CDATA[data-oriented design]]></category>
		<category><![CDATA[game engine]]></category>
		<category><![CDATA[handles]]></category>
		<category><![CDATA[IDs]]></category>
		<category><![CDATA[molecule engine]]></category>

		<guid isPermaLink="false">http://molecularmusings.wordpress.com/?p=440</guid>
		<description><![CDATA[As promised in the last blog post, today we are going to take a look at how Molecule handles internal references to data owned by some other system in the engine. First, a quick recap of reasons why we don&#8217;t &#8230; <a href="http://molecularmusings.wordpress.com/2013/05/17/adventures-in-data-oriented-design-part-3b-internal-references/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=molecularmusings.wordpress.com&#038;blog=24330678&#038;post=440&#038;subd=molecularmusings&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>As promised in the <a title="Adventures in data-oriented design – Part 3a: Ownership" href="http://molecularmusings.wordpress.com/2013/05/02/adventures-in-data-oriented-design-part-3a-ownership/">last blog post</a>, today we are going to take a look at how Molecule handles internal references to data owned by some other system in the engine.</p>
<p><span id="more-440"></span>First, a quick recap of why we don&#8217;t want to use pointers for referencing data:</p>
<ul>
<li>With raw pointers, ownership is sometimes unclear. If I&#8217;m handed a pointer, do I need to delete the instance? Who owns it? How long can I hold on to it?<br />
This very quickly leads to double-deletes and/or dangling pointers. Both are kinds of bugs which can be hard to find if you&#8217;re unlucky.</li>
<li>The above can be somewhat alleviated by using a shared_ptr&lt;&gt; or some reference-counting mechanism, but now we have added additional overhead which isn&#8217;t really necessary. Ownership is still unclear &#8211; when is the data behind the reference-counted pointer actually freed? Who else holds on to it?</li>
<li>How are pointers replicated or copied, e.g. across the network? You always have to have some kind of serialization mechanism in place because you cannot just send pointers across the network &#8211; the address they contain won&#8217;t make sense in a different address space.</li>
<li>Pointers don&#8217;t support relocation. Ultimately, the system that owns the data should also be responsible for managing the data&#8217;s memory. Therefore, a system might want to move things around in memory, e.g. for run-time defragmentation. Notifying each and every instance that might hold a pointer to system-internal data is tedious and error-prone.</li>
</ul>
<p>So, let us now take a closer look at how we can store internal references without running into the above-mentioned problems.</p>
<h1>Handles</h1>
<p>In the Molecule Engine, <em>handles</em> are used to refer to internal data. That is, they refer to data owned by some system <strong>directly</strong>, and not via some indirection. This is also the reason why they are called <strong>internal references</strong>.</p>
<p>What are handles? Basically, they are indices into the data, but with a twist. One can think of handles as &#8220;smart indices&#8221;. But before going into detail about handles, which problems do plain indices already solve?</p>
<ul>
<li>You cannot accidentally call <em>delete</em> or <em>free()</em> on an index. Furthermore, if a system only deals with indices as input and output parameters, it should be clear that the system also owns the data.</li>
<li>Indices can be easily replicated and copied. They also support data relocation out-of-the-box: if we want to access the data at e.g. index 3, it doesn&#8217;t matter where the data itself resides, as long as it remains in the same order. It can reside at address 0xA000 or 0xB000 or someplace else &#8211; data[3] will give us the data we want.</li>
</ul>
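<p>To make the relocation argument a bit more tangible, here is a minimal sketch. The <em>Item</em> type and the <em>Relocate()</em> and <em>ValueAt()</em> helpers are hypothetical names made up for this illustration, not part of Molecule. The whole block is copied to a new allocation with the order of items left intact, so an index keeps referring to the same item while a raw pointer into the old block would not.</p>
<pre class="brush: cpp; title: ; notranslate">
#include &lt;cassert&gt;
#include &lt;cstddef&gt;
#include &lt;cstring&gt;
#include &lt;vector&gt;

// Hypothetical item type, used only for this illustration.
struct Item
{
  int value;
};

// Copies the whole block into a new allocation, keeping the order of items intact.
std::vector&lt;Item&gt; Relocate(const std::vector&lt;Item&gt;&amp; items)
{
  std::vector&lt;Item&gt; moved(items.size());
  if (!items.empty())
  {
    std::memcpy(moved.data(), items.data(), items.size() * sizeof(Item));
  }
  return moved;
}

// Looks up an item by index; this works no matter where the block currently lives.
int ValueAt(const std::vector&lt;Item&gt;&amp; items, std::size_t index)
{
  assert(index &lt; items.size());
  return items[index].value;
}
</pre>
<p>After relocating, <em>ValueAt(moved, 3)</em> yields the same value as before the move, even though the address of the item has changed.</p>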
<p>Of course, there are things which are not supported by plain indices:</p>
<ul>
<li>We cannot detect access to stale/deleted data. We might try to access the data at index 3, but that might have been freed already since our last access.</li>
<li>Whole data blocks can be moved around in memory, but the order of individual data items cannot be changed, because that would mess up our indices.</li>
</ul>
<p>Handles help us with the first problem, but also don&#8217;t support arbitrary relocation of individual data items. This is what <em>IDs</em> or <strong>external references</strong> are for, but those will be the topic of the next post.</p>
<p>The question remains: how do we turn indices into handles that can detect access to already deleted data?</p>
<p>The idea is quite simple: instead of only using an index, a handle also stores the generation in which the index was created. The generation is simply a per-slot counter that gets incremented each time the data item in that slot is deleted. The generation is stored both inside the handle, and for each data item. Whenever we want to access data using a handle, the handle&#8217;s generation and the data item&#8217;s generation need to match.</p>
<h1>An example</h1>
<p>Going back to our example from the last post, let us assume our render backend provides space for 4k vertex buffers. New vertex buffers are allocated using a pool-allocator/free-list internally, and users only deal with a <em>VertexBufferHandle</em>.</p>
<p>Initially, our pool of vertex buffers is empty, and all generations are set to zero.</p>
<pre>
4096 Vertex buffers:
+----+----+----+----+----+----+
| VB | VB | VB | .. | VB | VB |
+----+----+----+----+----+----+

4096 Generations:
+----+----+----+----+----+----+
| 0  | 0  | 0  | .. | 0  | 0  |
+----+----+----+----+----+----+
</pre>
<p>The first time we allocate a vertex buffer, the handle will contain an index of 0, and a generation of 0. Future vertex buffer handles will have a different index, and also a generation of 0.<br />
Assume we now destroy the first vertex buffer we allocated. The generation of the slot that contained it will increment, yielding the following layout:</p>
<pre>
+----+----+----+----+----+----+
| VB | VB | VB | .. | VB | VB |
+----+----+----+----+----+----+
+----+----+----+----+----+----+
| 1  | 0  | 0  | .. | 0  | 0  |
+----+----+----+----+----+----+
</pre>
<p>If we now want to access that vertex buffer using the handle, we check its generation against the one stored with our vertex buffer, and find that they don&#8217;t match – meaning that we tried to access already deleted data.</p>
<p>In code, this situation looks somewhat like the following:</p>
<pre class="brush: cpp; title: ; notranslate">
VertexBufferHandle handle = backend::CreateVertexBuffer(...);

// some more vertex buffers created in the meantime

// at a later point in time, we destroy the vertex buffer...
backend::DestroyVertexBuffer(handle);
// ...but somebody, somewhere, still holds the same handle

backend::AccessVertexBuffer(handle);  </pre>
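<p>How the backend could detect this situation is sketched below. This is only a rough illustration under a few assumptions, not the actual Molecule implementation: the <em>VertexBufferPool</em> class, its member names, the placeholder <em>VertexBuffer</em> payload and the choice of returning a null pointer for stale handles are all made up for this example (a real backend might rather assert).</p>
<pre class="brush: cpp; title: ; notranslate">
#include &lt;cassert&gt;
#include &lt;cstdint&gt;

struct VertexBuffer { unsigned int vertexCount; };  // placeholder payload

struct VertexBufferHandle
{
  uint32_t index : 12;
  uint32_t generation : 20;
};

// Hypothetical pool: fixed-size array, free-list of indices, one generation per slot.
class VertexBufferPool
{
public:
  static constexpr uint32_t CAPACITY = 4096;

  VertexBufferPool() : m_freeCount(CAPACITY)
  {
    for (uint32_t i = 0; i &lt; CAPACITY; ++i)
    {
      m_generations[i] = 0;
      m_freeIndices[i] = CAPACITY - 1 - i;  // index 0 is handed out first
    }
  }

  VertexBufferHandle Create(unsigned int vertexCount)
  {
    assert(m_freeCount &gt; 0);
    const uint32_t index = m_freeIndices[--m_freeCount];
    m_buffers[index].vertexCount = vertexCount;

    VertexBufferHandle handle;
    handle.index = index;
    handle.generation = m_generations[index];
    return handle;
  }

  // A handle is valid only if its generation matches the generation of its slot.
  bool IsValid(VertexBufferHandle handle) const
  {
    return handle.generation == m_generations[handle.index];
  }

  void Destroy(VertexBufferHandle handle)
  {
    assert(IsValid(handle));
    // Bump the slot generation, wrapping at 20 bits to match the handle layout.
    m_generations[handle.index] = (m_generations[handle.index] + 1) &amp; 0xFFFFF;
    m_freeIndices[m_freeCount++] = handle.index;
  }

  // Returns a null pointer if the handle refers to already deleted data.
  VertexBuffer* Access(VertexBufferHandle handle)
  {
    return IsValid(handle) ? &amp;m_buffers[handle.index] : nullptr;
  }

private:
  VertexBuffer m_buffers[CAPACITY];
  uint32_t m_generations[CAPACITY];
  uint32_t m_freeIndices[CAPACITY];
  uint32_t m_freeCount;
};
</pre>
<p>With such a pool, accessing a handle after its buffer was destroyed fails the generation check, even if the slot has been reused for a new buffer in the meantime: the new handle has the same index, but a different generation.</p>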
<h1>Handle implementations</h1>
<p>One thing we haven&#8217;t talked about yet is how handles can be implemented. As almost always, the simplest solutions are the best, so a trivial struct will suffice in this case:</p>
<pre class="brush: cpp; title: ; notranslate">
struct Handle
{
  uint32_t index;
  uint32_t generation;
};
</pre>
<p>In practice, you normally would not spend a full 32-bit integer each on the index and the generation, but rather use <a href="http://en.wikipedia.org/wiki/Bit_field" title="Bit fields">bitfields</a> instead. In the case of our vertex buffer handles, we need 12 bits for storing indices in the range [0, 4095], which leaves 20 bits for the generation if we want our handles to be 32-bit integers. Hence, our handles would look more like the following:</p>
<pre class="brush: cpp; title: ; notranslate">
struct Handle
{
  uint32_t index : 12;
  uint32_t generation : 20;
};
</pre>
<p>This means that the generation overflows after 1048576 vertex buffers have been deleted <strong>in the same slot</strong> in our pool. Theoretically, this means that we could wrongly access a vertex buffer via an old handle that was generated more than 1048576 vertex buffer create/delete cycles ago, in that very slot. In practice this should never happen, unless we store an old handle for ages, create/delete buffers like crazy, and never access the buffer using that handle in the meantime.<br />
Yet, depending on the number of bits you are willing to spend, this <em>can</em> happen, so it is something to keep in mind.</p>
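<p>The wrap-around can be illustrated with a few lines of code. The following is just a sketch; <em>GENERATION_COUNT</em> and <em>GenerationAfter()</em> are names invented for this example. It simulates a number of create/delete cycles in a single slot and returns the generation stored for that slot afterwards.</p>
<pre class="brush: cpp; title: ; notranslate">
#include &lt;cstdint&gt;

struct Handle
{
  uint32_t index : 12;
  uint32_t generation : 20;
};

// Number of distinct generation values that fit into 20 bits.
constexpr uint32_t GENERATION_COUNT = 1u &lt;&lt; 20;  // 1048576

// Simulates 'cycles' create/delete cycles in one slot and returns the
// generation stored for that slot afterwards.
uint32_t GenerationAfter(uint32_t cycles)
{
  Handle slot;
  slot.index = 0;
  slot.generation = 0;
  for (uint32_t i = 0; i &lt; cycles; ++i)
  {
    // Explicitly wrap at 20 bits, which is what the bit-field can hold.
    slot.generation = (slot.generation + 1) &amp; (GENERATION_COUNT - 1);
  }
  return slot.generation;
}
</pre>
<p>After exactly 1048576 cycles the slot is back at generation 0, so a handle created in generation 0 and never used since would pass the check again.</p>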
<p>Last but not least, another nice thing about handles that I mentioned in the previous blog post is that they use less memory than a pointer. Most handles can store their index and generation in a single 32-bit integer, which means they need half the amount of memory compared to 64-bit pointers. Additionally, we really only need to store the generation inside a handle for detecting access to stale data. We should not need that in a retail build, hence handles can be as small as 16-bit integers in those builds, if your indices only need to be in the range [0, 65535].</p>
<h1>A generic implementation</h1>
<p>In Molecule, I use a generic handle implementation which defines the underlying data types according to certain build rules, and also <em>static_asserts</em> whether the bits fit into that type. The basic struct is as follows:</p>
<pre class="brush: cpp; title: ; notranslate">
template &lt;size_t N1, size_t N2&gt;
struct GenericHandle
{
  // uint16_t or uint32_t, depending on build type, realized using preprocessor-#ifs
  uint32_t index : N1;
  uint32_t generation : N2;
};
</pre>
<p>All handle types then become simple typedefs, e.g.:</p>
<pre class="brush: cpp; title: ; notranslate">
typedef GenericHandle&lt;12, 20&gt; VertexBufferHandle;
</pre>
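<p>The <em>static_assert</em> mentioned above is not shown in the struct, so here is a sketch of how such a check could look. The <em>HandleStorage</em> typedef and the <em>MAX_INDEX</em> and <em>MAX_GENERATION</em> constants are assumptions added for this example, and the build-dependent switch between 16-bit and 32-bit storage is left out.</p>
<pre class="brush: cpp; title: ; notranslate">
#include &lt;cstddef&gt;
#include &lt;cstdint&gt;

// Assumption: a build that stores the generation, hence 32 bits of storage.
typedef uint32_t HandleStorage;

template &lt;std::size_t N1, std::size_t N2&gt;
struct GenericHandle
{
  static_assert(N1 &gt; 0 &amp;&amp; N2 &gt; 0, "Both bit counts must be non-zero.");
  static_assert(N1 + N2 &lt;= sizeof(HandleStorage) * 8,
    "Index and generation bits do not fit into the storage type.");

  // Largest values representable with the given number of bits.
  static constexpr uint32_t MAX_INDEX = (1u &lt;&lt; N1) - 1u;
  static constexpr uint32_t MAX_GENERATION = (1u &lt;&lt; N2) - 1u;

  HandleStorage index : N1;
  HandleStorage generation : N2;
};

typedef GenericHandle&lt;12, 20&gt; VertexBufferHandle;

// GenericHandle&lt;16, 20&gt; would be rejected at compile time as soon as it is
// instantiated, because 36 bits do not fit into 32 bits of storage.
</pre>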
<p>And that concludes today&#8217;s post! In the next part in the series, we will discuss how external references also allow for moving individual data items around in memory, without user code having to care about that.</p>
]]></content:encoded>
			<wfw:commentRss>http://molecularmusings.wordpress.com/2013/05/17/adventures-in-data-oriented-design-part-3b-internal-references/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/913fa98c5b06dd01be79595be8d6cc4c?s=96&#38;d=http%3A%2F%2F0.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">molecularmusings</media:title>
		</media:content>
	</item>
		<item>
		<title>Adventures in data-oriented design &#8211; Part 3a: Ownership</title>
		<link>http://molecularmusings.wordpress.com/2013/05/02/adventures-in-data-oriented-design-part-3a-ownership/</link>
		<comments>http://molecularmusings.wordpress.com/2013/05/02/adventures-in-data-oriented-design-part-3a-ownership/#comments</comments>
		<pubDate>Thu, 02 May 2013 13:22:13 +0000</pubDate>
		<dc:creator>Stefan Reinalter</dc:creator>
				<category><![CDATA[C++]]></category>
		<category><![CDATA[Graphics]]></category>
		<category><![CDATA[data ownership]]></category>
		<category><![CDATA[data-oriented design]]></category>
		<category><![CDATA[game engine]]></category>
		<category><![CDATA[handles]]></category>
		<category><![CDATA[IDs]]></category>
		<category><![CDATA[molecule engine]]></category>

		<guid isPermaLink="false">http://molecularmusings.wordpress.com/?p=434</guid>
		<description><![CDATA[One thing I have noticed during the development of the Molecule Engine, is that defining clear ownership over data can tremendously help with following a data-oriented design approach, and vice versa. Defining ownership initially requires people to think more about &#8230; <a href="http://molecularmusings.wordpress.com/2013/05/02/adventures-in-data-oriented-design-part-3a-ownership/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=molecularmusings.wordpress.com&#038;blog=24330678&#038;post=434&#038;subd=molecularmusings&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>One thing I have noticed during the development of the Molecule Engine, is that defining clear ownership over data can tremendously help with following a data-oriented design approach, and vice versa.</p>
<p><span id="more-434"></span>Defining ownership initially requires people to think more about who owns data, who creates and destroys instances, but totally pays off in terms of maintenance, performance, and debuggability. I would like to go back to one of my favourite examples, because it is easily understood by everyone: rendering a bunch of static meshes.</p>
<h1>Mesh rendering</h1>
<p>The example we are looking at is the following:</p>
<ul>
<li>A level contains any number of static meshes. A static mesh consists of a vertex buffer, an index buffer, and a bunch of triangle groups describing the indices used by each group/submesh in the mesh. We call a struct/class holding that information a <strong>Mesh</strong>.</li>
<li>Each of these meshes can be rendered with a different shader and material, hence those are not part of the mesh, but are used by something called a <strong>MeshInstance</strong>, a <strong>MeshComponent</strong>, or similar. We need this distinction because the same mesh can be instantiated with different shaders and materials several times in a level.</li>
</ul>
<p>If you do not care much about memory, performance, and keeping your co-workers sane, the easiest solution is to allocate each <b>Mesh</b> individually using <i>new</i>, and store a pointer to a <b>Mesh</b> inside the <b>MeshInstance</b>. Because several such instances can now refer to the same <b>Mesh</b>, you either hold a <i>shared_ptr&lt;Mesh&gt;</i> or similar, or build some other means of reference-counting into the <b>Mesh</b> class and store a plain old raw pointer. And while you&#8217;re at it, vertex buffers and index buffers are also reference-counted, because you can and it&#8217;s nice and OOP-esque. Problem solved.</p>
<p>Well, yes, and no.</p>
<p>There are several major disadvantages with using such an approach:</p>
<ul>
<li><strong>Ownership</strong>: Who owns instances of <b>Mesh</b>? Whenever a <b>MeshInstance</b> is deleted or goes out-of-scope, it simply decrements the reference-count of the referenced object. Can anybody tell <i>when</i> the actual object gets destroyed?<br />
The same is true for vertex buffers and index buffers. When are those deleted, exactly?</li>
<li><strong>Performance</strong>: You&#8217;re handing out raw pointers, <em>shared_ptr&lt;&gt;</em> or anything along those lines to users of your class. This makes it very hard (if not impossible) to move things around in memory, and still let all pointers point to the correct object. Hence, most of your objects are scattered all over the heap, causing tons of cache-misses upon accessing them, because subsystems cannot rearrange objects in memory as they best see fit.</li>
<li><strong>Debuggability</strong>: Raw pointers just scream &#8220;dangling pointer&#8221;. Oh, you correctly released a <b>Mesh</b> reference, the original object got destroyed, but you are still holding on to your <b>MeshInstance</b>? Well, the memory manager allocated a new instance at the exact same spot in memory in the meantime, and you&#8217;re now working with stale data without noticing. Guess it&#8217;s not your lucky day.</li>
</ul>
<p>Of course we can do better.</p>
<h1>A better solution</h1>
<p>The first thing to think about is: who uses the data, who owns, creates and destroys it? Let&#8217;s look at vertex and index buffers first.</p>
<p>The only thing that should ever be doing API calls in a rendering engine is the <i>render backend</i>. In Molecule parlance, the rendering backend is just a namespace with tons of free functions, which talk directly to D3D11, OpenGL, DX9, or other APIs. Porting the graphics module means porting the backend, and additionally exposing platform-specific functionality, but that&#8217;s about it.</p>
<p>The backend is also responsible for <a title="Order your draw calls" href="http://realtimecollisiondetection.net/blog/?p=86">queueing and sorting draw calls according to 64-bit keys</a>, and hence is the only thing in the engine binding vertex buffers, setting render states, performing actual draw calls, etc.</p>
<p>Because the backend is the only thing touching that data, it would benefit the most if all low-level rendering-related data such as vertex and index buffers were as close together in memory as possible. Therefore, the backend itself should also be responsible for creating/destroying those buffers, taking ownership over them.</p>
<p>In addition, we do not want to return raw pointers to our internals, because we want to be able to track accesses to stale data – no more dangling pointers! Furthermore, by giving the user a simple identifier such as an integer instead of a pointer, the question of &#8220;Do I own this? Do I need to free this?&#8221; never actually arises.</p>
<h1>Simplicity trumps everything</h1>
<p>So what is the simplest solution for getting something that is as close together as possible in memory? An array.</p>
<p>This is what Molecule uses. The rendering backend simply holds an array of 4096 vertex and 4096 index buffers. Of course those numbers are configurable, but do you ever need more than 4k vertex buffers in flight <i>at the same time</i>? If so, you have a worst-case scenario of <i>at least</i> 4k distinct draw calls in a certain frame, which is unreasonable anyway (in terms of performance).</p>
<p>Instead of a pointer, you can now simply return a 16-bit integer that can be used to uniquely identify vertex and index buffers in your array – it is nothing more than an index into the array. Not only is the question of ownership no longer a question (you cannot <i>delete</i> or <i>free()</i> a 16-bit integer, nor decrement its reference-count), you can also build in a mechanism for tracking whether a given integer refers to an existing object or not – this is what is often referred to as a <a title="Noel Llopis on managing data relationships" href="http://gamesfromwithin.com/managing-data-relationships">handle</a>. Depending on the maximum number of instances, a 16-bit int might suffice, or you can always go to 32-bit.</p>
<p>That being said, the interface for creating and destroying vertex and index buffers in Molecule looks like this:</p>
<pre class="brush: cpp; title: ; notranslate">
namespace backend
{
  VertexBufferHandle CreateVertexBuffer(unsigned int vertexCount, unsigned int stride, const void* initialData);
  VertexBufferHandle CreateDynamicVertexBuffer(unsigned int vertexCount, unsigned int stride);
  void DestroyVertexBuffer(VertexBufferHandle handle);

  IndexBufferHandle CreateIndexBuffer(unsigned int indexCount, IndexBuffer::Format::Enum format, const void* initialData);
  void DestroyIndexBuffer(IndexBufferHandle handle);
}
</pre>
<h1>Referencing mesh data</h1>
<p>Thinking about mesh data ownership, we can come up with a similarly simple solution for referencing/storing that data as well.</p>
<p>In the Molecule Engine, a thing called the <i>render world</i> holds all data which is tied to the graphics module, the main things being stuff that is pulled in from resource packages, such as meshes, skeletons, animations, particle systems, graphics-related components, etc.</p>
<p>Similar to the fixed-size vertex and index buffers that are being held by the render backend, the render world stores e.g. an array of all meshes contained in a resource package. Because all other rendering-related data is also owned by the render world, we can reference that by using handles as well.</p>
<p>This means that a Mesh now looks like this:</p>
<pre class="brush: cpp; title: ; notranslate">
struct Mesh
{
  VertexBufferHandle m_vertexBuffer;
  IndexBufferHandle m_indexBuffer;
  TriangleGroupHandle m_triangleGroups;
  uint16_t m_triangleGroupCount;
};
</pre>
<p>No reference-counting, no <i>shared_ptr&lt;&gt;</i>, no raw pointers. A <b>Mesh</b> is trivially copyable, and can be moved around in memory by using <i>memcpy()</i>. How do you hold on to a <b>Mesh</b>? What does a <b>MeshInstance</b> look like?</p>
<p>It&#8217;s simple: you hold on to a <b>Mesh</b> by copying it. A <b>MeshComponent</b> just stores a copy of the <b>Mesh</b>, along with handles for shaders, materials, and so on.</p>
<p>In practice, <b>MeshComponents</b> themselves are owned by the render system which is responsible for rendering them, but that is something for another blog post.</p>
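<p>As a rough sketch of what this could look like in code: the handle typedefs below are plain 16-bit stand-ins, and <em>ShaderHandle</em>, <em>MaterialHandle</em>, <em>MakeComponent()</em> and <em>Relocate()</em> are hypothetical names chosen for this example, so do not take them as the engine&#8217;s actual types.</p>
<pre class="brush: cpp; title: ; notranslate">
#include &lt;cstddef&gt;
#include &lt;cstdint&gt;
#include &lt;cstring&gt;
#include &lt;type_traits&gt;

// Assumption: plain integers standing in for the real handle types.
typedef uint16_t VertexBufferHandle;
typedef uint16_t IndexBufferHandle;
typedef uint16_t TriangleGroupHandle;
typedef uint16_t ShaderHandle;    // hypothetical
typedef uint16_t MaterialHandle;  // hypothetical

struct Mesh
{
  VertexBufferHandle m_vertexBuffer;
  IndexBufferHandle m_indexBuffer;
  TriangleGroupHandle m_triangleGroups;
  uint16_t m_triangleGroupCount;
};

// Holds a copy of the mesh, plus handles to everything else it needs.
struct MeshComponent
{
  Mesh m_mesh;
  ShaderHandle m_shader;
  MaterialHandle m_material;
};

static_assert(std::is_trivially_copyable&lt;Mesh&gt;::value, "Mesh must be trivially copyable.");
static_assert(std::is_trivially_copyable&lt;MeshComponent&gt;::value, "MeshComponent must be trivially copyable.");

MeshComponent MakeComponent(const Mesh&amp; mesh, ShaderHandle shader, MaterialHandle material)
{
  MeshComponent component;
  component.m_mesh = mesh;  // referencing a mesh means copying it
  component.m_shader = shader;
  component.m_material = material;
  return component;
}

// Moves components to a different (non-overlapping) place in memory.
void Relocate(MeshComponent* destination, const MeshComponent* source, std::size_t count)
{
  std::memcpy(destination, source, count * sizeof(MeshComponent));
}
</pre>
<p>Because everything inside a component is either a handle or a plain integer, the owning system is free to shuffle components around without anybody having to patch up pointers.</p>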
<h1>Conclusion</h1>
<p>Let us quickly recap:</p>
<ul>
<li>Vertex and index buffers are owned by the render backend. No raw pointers are handed out, only handles. Handles are an opaque data type, and the user should not (and does not) know how to interpret the given integer.</li>
<li>Mesh instances are owned by the render world. Meshes are referenced simply by copying them, because that gives you all the data you need in order to <b>do</b> something with it.</li>
<li>There are no reference-counting mechanisms, no raw pointers, and most importantly no dangling pointers. The system automatically identifies accesses to stale data. In addition, most handles occupy <b>less</b> memory than pointers, especially on 64-bit systems.</li>
<li>Mesh instances, MeshComponents, and many other components are merely data containers, and as such can be freely moved around in memory, without having to worry about ownership, construction/deletion, etc.</li>
</ul>
<p>In the next posts, we will take a closer look at what Molecule uses for referencing data that is moved around in memory by subsystems responsible for updating/rendering it. One such system is the one responsible for rendering meshes, where it&#8217;s crucial that data is accessed in a cache-friendly fashion. Specifically, we will go into detail about internal references (= handles), and external references (= IDs).</p>
]]></content:encoded>
			<wfw:commentRss>http://molecularmusings.wordpress.com/2013/05/02/adventures-in-data-oriented-design-part-3a-ownership/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/913fa98c5b06dd01be79595be8d6cc4c?s=96&#38;d=http%3A%2F%2F0.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">molecularmusings</media:title>
		</media:content>
	</item>
		<item>
		<title>Input Evaluation SDK available for download</title>
		<link>http://molecularmusings.wordpress.com/2013/03/11/input-evaluation-sdk-available-for-download/</link>
		<comments>http://molecularmusings.wordpress.com/2013/03/11/input-evaluation-sdk-available-for-download/#comments</comments>
		<pubDate>Mon, 11 Mar 2013 16:50:43 +0000</pubDate>
		<dc:creator>Stefan Reinalter</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[core library]]></category>
		<category><![CDATA[evaluation sdk]]></category>
		<category><![CDATA[game engine]]></category>
		<category><![CDATA[input library]]></category>
		<category><![CDATA[molecule]]></category>
		<category><![CDATA[molecule engine]]></category>

		<guid isPermaLink="false">http://molecularmusings.wordpress.com/?p=432</guid>
		<description><![CDATA[I&#8217;m proud to announce that the first evaluation SDK for our input technology is now available! A new version of the core technology has also been released, with some minor additions and improvements. Check out www.molecular-matters.com for more information on &#8230; <a href="http://molecularmusings.wordpress.com/2013/03/11/input-evaluation-sdk-available-for-download/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=molecularmusings.wordpress.com&#038;blog=24330678&#038;post=432&#038;subd=molecularmusings&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>I&#8217;m proud to announce that the first evaluation SDK for our input technology is now available! A new version of the core technology has also been released, with some minor additions and improvements.</p>
<p>Check out <a title="Molecular Matters" href="http://www.molecular-matters.com">www.molecular-matters.com</a> for more information on the input library. Further SDKs will follow during the next few months.</p>
]]></content:encoded>
			<wfw:commentRss>http://molecularmusings.wordpress.com/2013/03/11/input-evaluation-sdk-available-for-download/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/913fa98c5b06dd01be79595be8d6cc4c?s=96&#38;d=http%3A%2F%2F0.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">molecularmusings</media:title>
		</media:content>
	</item>
		<item>
		<title>Adventures in data-oriented design &#8211; Part 2: Hierarchical data</title>
		<link>http://molecularmusings.wordpress.com/2013/02/22/adventures-in-data-oriented-design-part-2-hierarchical-data/</link>
		<comments>http://molecularmusings.wordpress.com/2013/02/22/adventures-in-data-oriented-design-part-2-hierarchical-data/#comments</comments>
		<pubDate>Fri, 22 Feb 2013 15:46:19 +0000</pubDate>
		<dc:creator>Stefan Reinalter</dc:creator>
				<category><![CDATA[C++]]></category>
		<category><![CDATA[Graphics]]></category>
		<category><![CDATA[data-oriented]]></category>
		<category><![CDATA[DOD]]></category>
		<category><![CDATA[game engine]]></category>
		<category><![CDATA[hierarchical data]]></category>
		<category><![CDATA[molecule]]></category>
		<category><![CDATA[skeletal animation]]></category>

		<guid isPermaLink="false">http://molecularmusings.wordpress.com/?p=424</guid>
		<description><![CDATA[One task that is pretty common in game development is to transform data according to some sort of hierarchical layout. Today, we want to take a look at probably the most well-known example of such a task: transforming joints according &#8230; <a href="http://molecularmusings.wordpress.com/2013/02/22/adventures-in-data-oriented-design-part-2-hierarchical-data/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=molecularmusings.wordpress.com&#038;blog=24330678&#038;post=424&#038;subd=molecularmusings&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>One task that is pretty common in game development is to transform data according to some sort of hierarchical layout. Today, we want to take a look at probably the most well-known example of such a task: transforming joints according to a skeleton hierarchy.</p>
<p><span id="more-424"></span>Specifically, we will look at the difference between a simple OOP-based by-the-book implementation, and a more data-oriented design. We are going to cover the differences both in terms of implementation and performance. And once more we will see that data-oriented design and object-oriented programming do not contradict each other!</p>
<p><strong>Skeletal animation<br />
</strong></p>
<p>Before we can start thinking of a possible implementation, we need to cover some ground first by quickly identifying the steps needed for skeletal animation.</p>
<p>Typically, an animated mesh such as a character consists of several <strong>joints</strong> that are connected with each other in a hierarchical fashion. The animation data specifies <strong>animation curves</strong> for each of the different components that make up the transformation of a joint, e.g. x-, y- and z-values for translation (and scaling, if supported), and x-, y-, z- and w-values for the rotation (stored as a quaternion).</p>
<div id="attachment_425" class="wp-caption alignnone" style="width: 514px"><a href="http://molecularmusings.wordpress.com/2013/02/22/adventures-in-data-oriented-design-part-2-hierarchical-data/skeleton/" rel="attachment wp-att-425"><img class="size-full wp-image-425" alt="Skeleton consisting of 96 joints" src="http://molecularmusings.files.wordpress.com/2013/02/skeleton.jpg?w=584"   /></a><p class="wp-caption-text">Skeleton consisting of 96 joints</p></div>
<p>The data of the animation curves of each joint is stored in their <strong>local frame</strong> (= their local coordinate system) because that is easier to animate, and leads to more natural-looking animations. This means that whenever you sample the animation curves, all the transformations are stored in the local coordinate frame, and need to be transformed into the global coordinate frame in order to render the joints or the skinned character. Hence, sampling the animation curves gives us a transformation for each joint, called the <strong>local pose</strong>.</p>
<p>Note that if we want to blend between different joint transformations (typically done using <a title="LERP" href="http://en.wikipedia.org/wiki/Lerp_%28computing%29">LERP</a> or <a title="SLERP" href="http://en.wikipedia.org/wiki/Slerp">SLERP</a>), those blends are performed on local poses. After blending, layering, etc. have finished, all joints are transformed into their <strong>global pose</strong>. This is done using the joint hierarchy.</p>
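<p>As a rough illustration of blending on local poses, the following sketch linearly interpolates the translation part of two local poses, joint by joint. The <em>Vec3</em> type and the <em>Lerp</em> and <em>BlendTranslations</em> helpers are made up for this example and are not part of Molecule; rotations would be blended analogously on quaternions using normalized LERP or SLERP.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// hypothetical minimal vector type, for illustration only
struct Vec3
{
  float x, y, z;
};

// linear interpolation: returns a for t = 0, and b for t = 1
inline Vec3 Lerp(const Vec3& a, const Vec3& b, float t)
{
  return Vec3{ a.x + (b.x - a.x) * t, a.y + (b.y - a.y) * t, a.z + (b.z - a.z) * t };
}

// blends the per-joint translations of two local poses, joint by joint
inline std::vector<Vec3> BlendTranslations(const std::vector<Vec3>& poseA, const std::vector<Vec3>& poseB, float t)
{
  // both poses must belong to the same skeleton
  assert(poseA.size() == poseB.size());

  std::vector<Vec3> result(poseA.size());
  for (std::size_t i = 0; i < poseA.size(); ++i)
  {
    result[i] = Lerp(poseA[i], poseB[i], t);
  }
  return result;
}
```

<p>Note that the blend never looks at the hierarchy, which is why it can be done entirely on local poses before the global pose is built.</p>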
<p>To recap, these are the steps that need to be performed:</p>
<ol>
<li>Work out at what time we want to sample the animation curves, and sample them at that time. The sampling time is influenced by the speed of the animation, whether it&#8217;s looping, etc. The result of sampling is called the <strong>local pose</strong>.</li>
<li>Optionally blend between different local poses to smooth the animation.</li>
<li>Transform the <strong>local pose</strong> into the <strong>global pose</strong> according to the hierarchy.</li>
</ol>
<p>I deliberately left out other steps such as ragdolls, steps that can affect the global pose and need a round-trip to the local pose (IK, FK), and others, in order to focus on the key parts of this post.</p>
<p><strong>Demo setup</strong></p>
<p>The demo we will be using to measure the difference between different implementations simply performs the above steps for 1000 characters. Our timings will measure how long it takes to perform <em>step 3</em> (traversing the hierarchy and transforming the joints&#8217; poses) a thousand times. We will use the character consisting of 96 joints (shown above) for this demo.</p>
<div id="attachment_427" class="wp-caption alignnone" style="width: 594px"><a href="http://molecularmusings.wordpress.com/2013/02/22/adventures-in-data-oriented-design-part-2-hierarchical-data/characters/" rel="attachment wp-att-427"><img class="size-full wp-image-427" alt="An army of 1000 characters, all sampled at a different point in time" src="http://molecularmusings.files.wordpress.com/2013/02/characters.jpg?w=584&#038;h=328" width="584" height="328" /></a><p class="wp-caption-text">An army of 1000 characters, all sampled at a different point in time</p></div>
<p><strong>Initial implementation</strong></p>
<p>A simple, naive implementation of a hierarchical, skeletal data structure could look like the following:</p>
<pre class="brush: cpp; title: ; notranslate">
struct Joint
{
  math::matrix4x4_t globalPose;
  math::matrix4x4_t localPose;
  std::vector&lt;Joint*&gt; children;
};
</pre>
<p>Simple enough: each joint stores a local pose, a global pose, and an array of child joints. A skeleton then becomes an array of joints, and that&#8217;s it. In order to transform the skeleton starting from the root, a simple recursive function can be used:</p>
<pre class="brush: cpp; title: ; notranslate">
class Skeleton
{
public:
  void LocalToGlobalPose(void)
  {
    // start at the root
    LocalToGlobalPose(&amp;joints[0], math::MatrixIdentity());
  }

private:
  void LocalToGlobalPose(Joint* joint, math::matrix4x4_arg_t parentTransform)
  {
    joint-&gt;globalPose = math::MatrixMul(parentTransform, joint-&gt;localPose);

    // propagate the transformation to the children
    for (size_t i=0; i &lt; joint-&gt;children.size(); ++i)
    {
      LocalToGlobalPose(joint-&gt;children[i], joint-&gt;globalPose);
    }
  }

  Joint* joints;
};
</pre>
<p>The above traverses the hierarchy starting from the root, and propagates the transformation of a parent joint down its children. This is done by first storing the resulting global pose of the joint, and then passing it along to the children. The code is short, and should be pretty self-explanatory.</p>
<p>So the question is, how long does it take to build the global pose for 1000 characters using the above code? The answer: <strong>3.8ms</strong>. We cannot decide yet whether that&#8217;s good or bad, but we can surely do better, otherwise I wouldn&#8217;t post about it.</p>
<p><strong>A data-oriented approach</strong></p>
<p>The first thing to observe here is simple, and should not come as a surprise to most of you: instead of storing the children for each joint, we can turn the problem on its head, and just store a parent joint for each joint instead. So instead of storing a <em>std::vector&lt;Joint*&gt; children</em> we store a <em>Joint* parent</em> instead.</p>
<p>The next optimization is also simple: we store all joints of a skeleton in an array anyway, so why not store indices instead of pointers? Assuming no skeleton has more than 65536 joints (which is a safe assumption I would say), we can store a <em>uint16_t parent</em> instead of the pointer previously mentioned.</p>
<p>The benefits?</p>
<ul>
<li>Storing a <em>uint16_t</em> instead of a pointer needs less space, especially on 64-bit systems. We save memory, and have less memory to access when building the global pose.</li>
<li>A complete skeleton is now trivially copyable, and can be moved around in memory using e.g. <em>memcpy</em>. Similarly, a whole skeleton can be loaded from disk using a single binary read without having to worry about pointer-fixups or similar. This alone is worth the data transformation.</li>
</ul>
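<p>To make both points a bit more tangible, here is a small sketch. The names <em>JointWithPointer</em>, <em>JointWithIndex</em> and <em>CopyJoints</em> are invented for this example, and the exact sizes depend on the platform and compiler.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// hypothetical joint layouts, only the hierarchy link is shown
struct JointWithPointer
{
  JointWithPointer* parent;
};

struct JointWithIndex
{
  std::uint16_t parent;
};

// a 16-bit index does not take more space than a pointer on common platforms
static_assert(sizeof(JointWithIndex) <= sizeof(JointWithPointer), "index should not be larger than a pointer");

// plain indices keep the type trivially copyable, so memcpy is well-defined
static_assert(std::is_trivially_copyable<JointWithIndex>::value, "joints must be trivially copyable");

// copies a whole hierarchy in one go. the indices stay valid in the copy,
// whereas raw pointers would still point into the source array.
inline void CopyJoints(JointWithIndex* dst, const JointWithIndex* src, std::size_t count)
{
  assert(dst != nullptr && src != nullptr);
  std::memcpy(dst, src, count * sizeof(JointWithIndex));
}
```

<p>The same reasoning applies to loading from disk: the bytes read from a file can be used as they are, because an index means the same thing no matter where the array ends up in memory.</p>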
<p>The question that remains is: how do we change the code so that we still traverse the hierarchy in the proper order, and how do we get rid of the recursion?</p>
<p>In order to fix that, there is one key observation to make: as long as we always traverse the hierarchy in the same order, we can allocate and store the joints in that exact order, and walk through the array of joints <strong>linearly</strong> by flattening the hierarchy. Looking at the recursive code above, we can see that the code traverses the hierarchy in what is called <a title="Tree traversal" href="http://en.wikipedia.org/wiki/Tree_traversal">depth-first order</a>.<br />
Now imagine traversing the hierarchy, numbering the joints that are visited with a monotonically increasing number. Joint 0 would be the root, joint 1 would be the next in the hierarchy following a depth-first traversal, and so on. This means that if we iterate through the array of joints, a joint <em>i</em> can only have a joint <em>j</em> as its parent with <em>j &lt; i</em>. This greatly simplifies the traversal code, and gets rid of the recursion.</p>
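<p>One possible way to produce such a flattened array from a pointer-based hierarchy is sketched below. <em>SourceJoint</em> and <em>FlattenHierarchy</em> are hypothetical names, and letting the root store itself as parent is just a convention chosen for this example.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// hypothetical pointer-based joint, reduced to the hierarchy information
struct SourceJoint
{
  std::vector<SourceJoint*> children;
};

namespace detail
{
  // visits a joint before its children (depth-first, pre-order),
  // so every parent receives a smaller index than any of its children
  inline void Flatten(const SourceJoint* joint, std::uint16_t parentIndex, std::vector<std::uint16_t>& parents)
  {
    // indices must fit into 16 bits
    assert(parents.size() < 65536u);
    const std::uint16_t ownIndex = static_cast<std::uint16_t>(parents.size());
    parents.push_back(parentIndex);

    for (std::size_t i = 0; i < joint->children.size(); ++i)
    {
      Flatten(joint->children[i], ownIndex, parents);
    }
  }
}

// returns the parent index of each joint, numbered in depth-first order.
// the root ends up at index 0 and stores itself as parent (an assumption of this sketch).
inline std::vector<std::uint16_t> FlattenHierarchy(const SourceJoint* root)
{
  assert(root != nullptr);

  std::vector<std::uint16_t> parents;
  detail::Flatten(root, 0, parents);
  return parents;
}
```

<p>The recursion is still there, but it only runs once when the skeleton is built (or offline in a tool), not every time a pose is transformed.</p>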
<p>One last thing left to do is change the layout from AoS (array-of-structures) to SoA (structure-of-arrays), leaving us with the following:</p>
<pre class="brush: cpp; title: ; notranslate">
class Skeleton
{
private:
  uint16_t* hierarchy;
  math::matrix4x4_t* localPoses;
  math::matrix4x4_t* globalPoses;
};
</pre>
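<p>This change has the benefit that whenever we access e.g. the parent of a joint using <em>hierarchy[i]</em>, the parents of the following joints are read into the cache with the same cache line (up to 32 indices share a line, assuming 64-byte cache lines). Accessing <em>localPoses[i]</em> benefits in a similar way: the poses are stored contiguously and read strictly linearly, which is an access pattern that hardware prefetchers handle well.<br />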
So what does the transformation code look like? It&#8217;s become surprisingly simple:</p>
<pre class="brush: cpp; title: ; notranslate">
void LocalToGlobalPose(void)
{
  // the root has no parent
  globalPoses[0] = localPoses[0];

  for (unsigned int i=1; i &lt; NUM_JOINTS; ++i)
  {
    const uint16_t parentJoint = hierarchy[i];
    globalPoses[i] = math::MatrixMul(globalPoses[parentJoint], localPoses[i]);
  }
}
</pre>
<p>Beautiful, and doesn&#8217;t get any simpler.</p>
<p>As stated above, this works because <em>parentJoint &lt; i</em>, which means we will only access global poses which have been computed already. The recursive traversal that we had before is gone and is now implicitly given by the positions of joints in the array, which we numbered in a depth-first fashion.</p>
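<p>Because the loop relies on every parent index being smaller than the index of the joint itself, it can be worth checking this once when a skeleton is built or loaded. The following is a sketch of such a check; <em>ParentsPrecedeChildren</em> is a name made up for this example.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// returns true if every joint except the root (index 0) refers to a parent
// that is stored earlier in the array, i.e. hierarchy[i] < i for all i >= 1
inline bool ParentsPrecedeChildren(const std::uint16_t* hierarchy, std::size_t numJoints)
{
  assert(hierarchy != nullptr);

  for (std::size_t i = 1; i < numJoints; ++i)
  {
    if (hierarchy[i] >= i)
    {
      return false;
    }
  }
  return true;
}
```

<p>Any ordering that passes this check works with the linear loop; the depth-first numbering is simply one convenient way of obtaining such an ordering.</p>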
<p>What difference in performance does this make? With the same number of transformations, this code now takes <strong>3.1ms</strong>. Compared to the <strong>3.8ms</strong> we had, that&#8217;s more than <strong>18%</strong> less time, or a speed-up of roughly 1.2x (3.8ms / 3.1ms)!</p>
<p><strong>Conclusion</strong></p>
<p>Not only did our few simple changes result in a nice increase in performance, they also vastly simplified the code itself. And we didn&#8217;t have to abandon OOP principles; we didn&#8217;t even have to change the calling code. We still have our skeleton class, and call <em>LocalToGlobalPose()</em> on it.</p>
<p>Concise, fast, and simple.</p>
<br />Filed under: <a href='http://molecularmusings.wordpress.com/category/c/'>C++</a>, <a href='http://molecularmusings.wordpress.com/category/graphics/'>Graphics</a>]]></content:encoded>
			<wfw:commentRss>http://molecularmusings.wordpress.com/2013/02/22/adventures-in-data-oriented-design-part-2-hierarchical-data/feed/</wfw:commentRss>
		<slash:comments>19</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/913fa98c5b06dd01be79595be8d6cc4c?s=96&#38;d=http%3A%2F%2F0.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">molecularmusings</media:title>
		</media:content>

		<media:content url="http://molecularmusings.files.wordpress.com/2013/02/skeleton.jpg" medium="image">
			<media:title type="html">Skeleton consisting of 96 joints</media:title>
		</media:content>

		<media:content url="http://molecularmusings.files.wordpress.com/2013/02/characters.jpg" medium="image">
			<media:title type="html">An army of 1000 characters, all sampled at a different point in time</media:title>
		</media:content>
	</item>
		<item>
		<title>Memory allocation strategies: a growing stack-like (LIFO) allocator</title>
		<link>http://molecularmusings.wordpress.com/2013/01/29/memory-allocation-strategies-a-growing-stack-like-lifo-allocator/</link>
		<comments>http://molecularmusings.wordpress.com/2013/01/29/memory-allocation-strategies-a-growing-stack-like-lifo-allocator/#comments</comments>
		<pubDate>Tue, 29 Jan 2013 18:38:23 +0000</pubDate>
		<dc:creator>Stefan Reinalter</dc:creator>
				<category><![CDATA[Core]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[game engine]]></category>
		<category><![CDATA[memory allocator]]></category>
		<category><![CDATA[memory system]]></category>
		<category><![CDATA[molecule engine]]></category>
		<category><![CDATA[virtual memory]]></category>

		<guid isPermaLink="false">http://molecularmusings.wordpress.com/?p=412</guid>
		<description><![CDATA[Continuing from where we left off last time, I would like to discuss how we can build growing allocators using a virtual memory system. This post describes how to build a stack-like allocator that can automatically grow up to a &#8230; <a href="http://molecularmusings.wordpress.com/2013/01/29/memory-allocation-strategies-a-growing-stack-like-lifo-allocator/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=molecularmusings.wordpress.com&#038;blog=24330678&#038;post=412&#038;subd=molecularmusings&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>Continuing from where we left off last time, I would like to discuss how we can build growing allocators using a virtual memory system. This post describes how to build a stack-like allocator that can automatically grow up to a given maximum size.</p>
<p><span id="more-412"></span>As explained in <a title="Memory allocation strategies interlude: virtual memory" href="http://molecularmusings.wordpress.com/2012/10/02/memory-allocation-strategies-interlude-virtual-memory/">one of the last posts</a>, virtual memory allows us to reserve address space without allocating any physical memory for it. We can use this to our benefit, and build an allocator that reserves address space up-front, and allocates physical memory whenever we need it. In that regard, the allocator behaves similarly to the non-growing stack allocator, but is not restricted to working within a fixed region of memory.</p>
<p>Such an allocator is extremely useful for situations such as loading level data, where the amount of memory needed varies between different levels. Using a growing allocator saves development time because there is no need to constantly change the upper bound of a fixed-size allocator whenever a certain level exceeds the previous worst-case memory usage.</p>
<h1>An example</h1>
<p>Consider a game where the maximum amount of memory that is spent on level-resident resources is defined to be 128 MB. What we want to have is an allocator that reserves address space for e.g. 256 MB (to be on the safe side), and allocates physical memory in 1 MB blocks whenever needed.</p>
<p>The basic steps of building such an allocator are simple:</p>
<ul>
<li>Reserve address space for 256 MB.</li>
<li>Store the start and end of the reserved address space.</li>
<li>Store the start and end of the physical address space (=0 bytes in the beginning).</li>
<li>For each allocation:
<ul>
<li>Work out the required amount of physical memory.</li>
<li>If there is still enough physical memory left, bump the start of the physical address space (see above).</li>
<li>If there is not enough memory left, allocate 1 MB pages at the end of the current physical address space until the allocation fits.</li>
<li>Update the start and end of the physical address space.</li>
</ul>
</li>
</ul>
<p>In code this could look like the following:</p>
<pre class="brush: cpp; title: ; notranslate">
GrowingStackAllocator::GrowingStackAllocator(unsigned int maxSizeInBytes, unsigned int growSize)
  : m_virtualStart(virtualMemory::ReserveAddressSpace(maxSizeInBytes))
  , m_virtualEnd(m_virtualStart + maxSizeInBytes)
  , m_physicalCurrent(m_virtualStart)
  , m_physicalEnd(m_virtualStart)
  , m_growSize(growSize)
{
}
</pre>
<p>In the constructor, we just reserve address space and store the start and end of it.</p>
<h1>Allocating memory</h1>
<pre class="brush: cpp; title: ; notranslate">
void* GrowingStackAllocator::Allocate(size_t size, size_t alignment, size_t offset)
{
  const size_t allocationSize = size;

  // work out proper allocation sizes and possible offsets
  ...

  m_physicalCurrent = core::pointerUtil::AlignTop(m_physicalCurrent + offset, alignment) - offset;

  // is there enough physical memory left?
  if (m_physicalCurrent + size &gt; m_physicalEnd)
  {
    // out of physical memory. check if there is still address space left from which new physical pages can be allocated.
    // remember that virtual memory must always be allocated in page-size chunks, that is why we round the needed size to
    // the next grow-size multiple.
    const size_t neededPhysicalSize = bitUtil::RoundUpToMultiple(size, m_growSize);
    if (m_physicalEnd + neededPhysicalSize &gt; m_virtualEnd)
    {
      // the allocation doesn't fit into the address space, we're out of memory
      return nullptr;
    }

    // allocate new memory pages at the end of our currently allocated pages
    virtualMemory::AllocatePhysicalMemory(m_physicalEnd, neededPhysicalSize);
    m_physicalEnd += neededPhysicalSize;
  }

  // store book-keeping information if necessary
  ...

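  // return allocation to user, and bump the current pointer past it
  // so that the next allocation does not hand out the same memory
  void* userPtr = m_physicalCurrent;
  m_physicalCurrent += size;
  return userPtr;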
}
</pre>
<p>As described above, we allocate new pages whenever there is not enough physical memory left for the allocation to fit into. In that case, new pages are allocated at the end of the address range that already has physical memory committed to it. If there is not enough space left in the address space dedicated to this allocator, then we are out of memory.</p>
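<p>The <em>virtualMemory</em> functions used above are not shown in this post. As a rough sketch of what they could map to, here is a possible implementation for Linux-like POSIX systems using <em>mmap</em>, <em>mprotect</em> and <em>madvise</em>; on Windows, the same could be done using <em>VirtualAlloc</em> (MEM_RESERVE, MEM_COMMIT) and <em>VirtualFree</em> (MEM_DECOMMIT, MEM_RELEASE). The signatures, return types and the <em>ReleaseAddressSpace</em> function are assumptions made for this example and not necessarily those of Molecule. All addresses and sizes are expected to be multiples of the page size.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>

namespace virtualMemory
{
  // reserves a range of address space without making it accessible
  inline char* ReserveAddressSpace(std::size_t size)
  {
    void* ptr = mmap(nullptr, size, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return (ptr == MAP_FAILED) ? nullptr : static_cast<char*>(ptr);
  }

  // makes a page-aligned part of a reserved range readable and writable.
  // the OS backs the pages with physical memory once they are touched.
  inline bool AllocatePhysicalMemory(char* address, std::size_t size)
  {
    assert(address != nullptr);
    return mprotect(address, size, PROT_READ | PROT_WRITE) == 0;
  }

  // hands the pages back to the OS, but keeps the address range reserved
  inline bool FreePhysicalMemory(char* address, std::size_t size)
  {
    assert(address != nullptr);
    if (madvise(address, size, MADV_DONTNEED) != 0)
    {
      return false;
    }
    return mprotect(address, size, PROT_NONE) == 0;
  }

  // gives the whole reserved range back to the OS (hypothetical addition)
  inline bool ReleaseAddressSpace(char* address, std::size_t size)
  {
    assert(address != nullptr);
    return munmap(address, size) == 0;
  }
}
```

<p>Reserving with <em>PROT_NONE</em> has the nice side effect that touching memory that has not been committed yet triggers an access violation instead of silently working.</p>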
<p>The granularity with which the allocator grows is specified by the user in the constructor. It is a typical tradeoff between allocation performance and wasted memory. If the allocator grows in 64 KB pages, it has to do lots of small allocations via the virtual memory system, but wastes less than 64 KB. In comparison, growing in 1 MB pages leads to far fewer allocations, but can leave almost 1 MB of memory that is backed by physical memory but never used.</p>
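<p>To put some numbers on this tradeoff, the following sketch computes how many grow steps and how much committed-but-unused memory result from a given usage. <em>GrowthCost</em> and <em>ComputeGrowthCost</em> are hypothetical helpers written for this example, and the model assumes that the allocator commits exactly one grow-size block per step.</p>

```cpp
#include <cassert>
#include <cstddef>

// hypothetical result type for this example
struct GrowthCost
{
  std::size_t growSteps;    // number of calls into the virtual memory system
  std::size_t wastedBytes;  // committed, but unused memory
};

// rounds value up to the next multiple of 'multiple'
inline std::size_t RoundUpToMultiple(std::size_t value, std::size_t multiple)
{
  assert(multiple != 0);
  return ((value + multiple - 1) / multiple) * multiple;
}

// simplified model: memory is committed in blocks of growSize until usedBytes fit
inline GrowthCost ComputeGrowthCost(std::size_t usedBytes, std::size_t growSize)
{
  const std::size_t committed = RoundUpToMultiple(usedBytes, growSize);
  return GrowthCost{ committed / growSize, committed - usedBytes };
}
```

<p>For 10340 KB of used memory (10 MB plus 100 KB), growing in 64 KB blocks takes 162 steps and wastes 28 KB, while growing in 1 MB blocks takes only 11 steps but leaves 924 KB committed and unused.</p>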
<h1>Freeing memory</h1>
<p>One thing that is left to discuss is how the allocator behaves when freeing allocations. The primary thing to consider is whether the allocator should return physical memory pages whenever enough allocations have been freed. Depending on the allocation behaviour, that can be either good or bad.</p>
<p>When freeing level-resident resources, we mostly have several large allocations, and want to release physical memory as soon as possible. In situations where the allocator is used to make many small allocations, it could be more beneficial to keep physical memory that is committed to the address space. Otherwise, the allocator could suffer performance penalties whenever allocations straddle a page boundary, causing pages to be constantly allocated, freed, allocated again, and so on.</p>
<p>The way this is handled in Molecule is that whenever an allocation is freed, the allocator does not return physical memory to the OS. Instead, physical memory which is no longer needed must be explicitly released by calling <strong>Purge()</strong>, which is an extra method offered by growing allocators. In case of the growing stack allocator, it simply checks which pages are no longer needed, and returns them to the OS:</p>
<pre class="brush: cpp; title: ; notranslate">
void GrowingStackAllocator::Purge(void)
{
  // we need to free all physical memory pages which are no longer needed by any allocation while making sure that we don't free the page
  // we're currently pointing to (remember that virtual memory only works in page-size granularity).
  // we do this by rounding the current physical memory pointer to the next grow-size boundary, and freeing all physical memory from there.
  char* addressToFree = pointerUtil::AlignTop(m_physicalCurrent, m_growSize);
  const unsigned int sizeToFree = safe_static_cast&lt;unsigned int&gt;(m_physicalEnd - addressToFree);
  virtualMemory::FreePhysicalMemory(addressToFree, sizeToFree);

  m_physicalEnd = addressToFree;
}
</pre>
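<p>To see which pages <em>Purge()</em> would return in a concrete case, the pointer arithmetic can be replayed with plain offsets. The sketch below is a model only: <em>AlignTop</em> is re-implemented here for integers, and <em>PurgeRange</em> and <em>ComputePurgeRange</em> are hypothetical helpers that are not part of the allocator.</p>

```cpp
#include <cassert>
#include <cstddef>

// rounds an offset up to the next multiple of alignment (alignment must be a power of two)
inline std::size_t AlignTop(std::size_t offset, std::size_t alignment)
{
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (offset + alignment - 1) & ~(alignment - 1);
}

// hypothetical description of the range that would be handed back to the OS
struct PurgeRange
{
  std::size_t begin;  // first offset to free, relative to the start of the address space
  std::size_t size;   // number of bytes to free
};

// mirrors the arithmetic of Purge(): everything from the first grow-size boundary
// at or above physicalCurrent up to physicalEnd is no longer needed
inline PurgeRange ComputePurgeRange(std::size_t physicalCurrent, std::size_t physicalEnd, std::size_t growSize)
{
  const std::size_t begin = AlignTop(physicalCurrent, growSize);
  assert(begin <= physicalEnd);
  return PurgeRange{ begin, physicalEnd - begin };
}
```

<p>With a grow size of 1 MB, 4 MB committed and the current pointer at 1.5 MB, the range from 2 MB to 4 MB is freed, while the block containing the current pointer stays committed.</p>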
<h1>Conclusion</h1>
<p>Having growing allocators that make use of the virtual memory system allows us to specify a safe upper-bound for the amount of memory needed during development without wasting physical memory. Furthermore, knowing in which address range certain allocations lie can be a tremendous help when debugging.</p>
<br />Filed under: <a href='http://molecularmusings.wordpress.com/category/core/'>Core</a>]]></content:encoded>
			<wfw:commentRss>http://molecularmusings.wordpress.com/2013/01/29/memory-allocation-strategies-a-growing-stack-like-lifo-allocator/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/913fa98c5b06dd01be79595be8d6cc4c?s=96&#38;d=http%3A%2F%2F0.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">molecularmusings</media:title>
		</media:content>
	</item>
		<item>
		<title>Game Connection Paris 2012 slides</title>
		<link>http://molecularmusings.wordpress.com/2012/12/03/game-connection-paris-2012-slides/</link>
		<comments>http://molecularmusings.wordpress.com/2012/12/03/game-connection-paris-2012-slides/#comments</comments>
		<pubDate>Mon, 03 Dec 2012 11:33:09 +0000</pubDate>
		<dc:creator>Stefan Reinalter</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Game Connection]]></category>
		<category><![CDATA[GC paris]]></category>

		<guid isPermaLink="false">http://molecularmusings.wordpress.com/?p=403</guid>
		<description><![CDATA[The slides for both the master class and the session I held at the Game Connection in Paris are now available: Master class: Memory Management Strategies (PPT, PDF). Session: Debugging memory stomps and other atrocities (PPT, PDF). A big &#8220;Thank &#8230; <a href="http://molecularmusings.wordpress.com/2012/12/03/game-connection-paris-2012-slides/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=molecularmusings.wordpress.com&#038;blog=24330678&#038;post=403&#038;subd=molecularmusings&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>The slides for both the master class and the session I held at the Game Connection in Paris are now available:</p>
<p><strong>Master class:</strong> Memory Management Strategies (<a href="http://molecularmusings.wordpress.com/2012/12/03/game-connection-paris-2012-slides/gc2012_memory_management_strategies_master_class-2/" rel="attachment wp-att-407">PPT</a>, <a href="http://molecularmusings.wordpress.com/2012/12/03/game-connection-paris-2012-slides/gc2012_memory_management_strategies_master_class/" rel="attachment wp-att-406">PDF</a>).</p>
<p><strong>Session:</strong> Debugging memory stomps and other atrocities (<a href="http://molecularmusings.wordpress.com/2012/12/03/game-connection-paris-2012-slides/gc2012_debugging_memory_stomps_session-2/" rel="attachment wp-att-405">PPT</a>, <a href="http://molecularmusings.wordpress.com/2012/12/03/game-connection-paris-2012-slides/gc2012_debugging_memory_stomps_session/" rel="attachment wp-att-404">PDF</a>).</p>
<p>A big &#8220;Thank you!&#8221; to all the people who attended. I really enjoyed working with you. Looking forward to seeing some of you again next year!</p>
<br />Filed under: <a href='http://molecularmusings.wordpress.com/category/uncategorized/'>Uncategorized</a>]]></content:encoded>
			<wfw:commentRss>http://molecularmusings.wordpress.com/2012/12/03/game-connection-paris-2012-slides/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/913fa98c5b06dd01be79595be8d6cc4c?s=96&#38;d=http%3A%2F%2F0.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">molecularmusings</media:title>
		</media:content>
	</item>
		<item>
		<title>Core Evaluation SDK available for download</title>
		<link>http://molecularmusings.wordpress.com/2012/11/08/core-evaluation-sdk-available-for-download/</link>
		<comments>http://molecularmusings.wordpress.com/2012/11/08/core-evaluation-sdk-available-for-download/#comments</comments>
		<pubDate>Thu, 08 Nov 2012 00:49:23 +0000</pubDate>
		<dc:creator>Stefan Reinalter</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[core library]]></category>
		<category><![CDATA[evaluation sdk]]></category>
		<category><![CDATA[game engine]]></category>
		<category><![CDATA[molecule]]></category>
		<category><![CDATA[molecule engine]]></category>

		<guid isPermaLink="false">http://molecularmusings.wordpress.com/?p=398</guid>
		<description><![CDATA[After lots of work I&#8217;m proud to finally announce that the first evaluation SDK for our core technology is now available! Check out www.molecular-matters.com for more information on the core library. Further SDKs will follow during the next few months. &#8230; <a href="http://molecularmusings.wordpress.com/2012/11/08/core-evaluation-sdk-available-for-download/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=molecularmusings.wordpress.com&#038;blog=24330678&#038;post=398&#038;subd=molecularmusings&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>After lots of work I&#8217;m proud to finally announce that the first evaluation SDK for our core technology is now available!</p>
<p>Check out <a title="Molecular Matters" href="http://www.molecular-matters.com">www.molecular-matters.com</a> for more information on the core library. Further SDKs will follow during the next few months.</p>
<br />Filed under: <a href='http://molecularmusings.wordpress.com/category/uncategorized/'>Uncategorized</a>]]></content:encoded>
			<wfw:commentRss>http://molecularmusings.wordpress.com/2012/11/08/core-evaluation-sdk-available-for-download/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/913fa98c5b06dd01be79595be8d6cc4c?s=96&#38;d=http%3A%2F%2F0.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">molecularmusings</media:title>
		</media:content>
	</item>
		<item>
		<title>Join my master class at Game Connection Paris</title>
		<link>http://molecularmusings.wordpress.com/2012/10/30/join-my-master-class-at-game-connection-paris/</link>
		<comments>http://molecularmusings.wordpress.com/2012/10/30/join-my-master-class-at-game-connection-paris/#comments</comments>
		<pubDate>Tue, 30 Oct 2012 20:18:15 +0000</pubDate>
		<dc:creator>Stefan Reinalter</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Game Connection]]></category>
		<category><![CDATA[Master Class]]></category>

		<guid isPermaLink="false">http://molecularmusings.wordpress.com/?p=396</guid>
		<description><![CDATA[I&#8217;m happy to announce that I&#8217;ll be holding a master class about memory management strategies at Game Connection, which will take place Nov. 28-30 in Paris. In this 7h master class we will discuss a wealth of different topics regarding &#8230; <a href="http://molecularmusings.wordpress.com/2012/10/30/join-my-master-class-at-game-connection-paris/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=molecularmusings.wordpress.com&#038;blog=24330678&#038;post=396&#038;subd=molecularmusings&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>I&#8217;m happy to announce that I&#8217;ll be holding a <a title="Memory Management Strategies Master Class" href="http://www.game-connection.com/gameconn/content/memory-management-strategies">master class about memory management strategies</a> at Game Connection, which will take place Nov. 28-30 in Paris. In this 7h master class we will discuss a wealth of different topics regarding memory management, and take a much more detailed look at things I&#8217;ve written about in my blog. I&#8217;ll try to create a good mixture between technical details and practice sessions.</p>
<p>I&#8217;ll also be giving a <a title="Debugging Memory Stomps" href="http://www.game-connection.com/gameconn/content/debugging-memory-stomps-and-other-atrocities">talk about debugging memory stomps</a> on the following day.</p>
<p>It would be a pleasure to meet up with some of you there!</p>
<br />Filed under: <a href='http://molecularmusings.wordpress.com/category/uncategorized/'>Uncategorized</a>]]></content:encoded>
			<wfw:commentRss>http://molecularmusings.wordpress.com/2012/10/30/join-my-master-class-at-game-connection-paris/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/913fa98c5b06dd01be79595be8d6cc4c?s=96&#38;d=http%3A%2F%2F0.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">molecularmusings</media:title>
		</media:content>
	</item>
		<item>
		<title>Memory allocation strategies interlude: virtual memory</title>
		<link>http://molecularmusings.wordpress.com/2012/10/02/memory-allocation-strategies-interlude-virtual-memory/</link>
		<comments>http://molecularmusings.wordpress.com/2012/10/02/memory-allocation-strategies-interlude-virtual-memory/#comments</comments>
		<pubDate>Tue, 02 Oct 2012 10:20:58 +0000</pubDate>
		<dc:creator>Stefan Reinalter</dc:creator>
				<category><![CDATA[Core]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[game engine]]></category>
		<category><![CDATA[memory allocator]]></category>
		<category><![CDATA[memory system]]></category>
		<category><![CDATA[molecule engine]]></category>
		<category><![CDATA[virtual memory]]></category>

		<guid isPermaLink="false">http://molecularmusings.wordpress.com/?p=391</guid>
		<description><![CDATA[Before we can delve into the inner workings of growing allocators, I would like to explain the concept of virtual memory and discuss what it is, why it is needed, and what we can use it for. What is virtual &#8230; <a href="http://molecularmusings.wordpress.com/2012/10/02/memory-allocation-strategies-interlude-virtual-memory/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=molecularmusings.wordpress.com&#038;blog=24330678&#038;post=391&#038;subd=molecularmusings&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>Before we can delve into the inner workings of growing allocators, I would like to explain the concept of virtual memory and discuss what it is, why it is needed, and what we can use it for.</p>
<h1><span id="more-391"></span>What is virtual memory?</h1>
<p>Speaking in simple terms, virtual memory provides an extra indirection when accessing memory. It gives each process its own virtual address space, which lets each process think that it&#8217;s alone in the system. It lets us write programs without having to worry about which other programs have allocated memory, or which physical memory we need to access. Even though virtual memory is often also mentioned in conjunction with paging to hard disk (=swapping), this is <strong>not</strong> what we are interested in!</p>
<p>Consider a system having 4 GB of RAM that consists of four physical 1 GB RAM units. We are able to allocate more than 1 GB of contiguous memory, even though there&#8217;s no actual physical memory unit that is larger than 1 GB. This works thanks to virtual memory.<br />
Similarly, the allocations we make inside an application return <strong>virtual addresses</strong> which are valid inside <strong>our process</strong>. Such an address could be 0x80000000, and another process can also have allocations residing at 0x80000000, and yet everything works thanks to virtual memory.</p>
<h1>Address translation</h1>
<p>Before touching actual physical memory, the virtual address needs to be translated. This <em>virtual address to physical address translation</em> is taken care of by the <a title="Memory management unit" href="http://en.wikipedia.org/wiki/Memory_management_unit">MMU of the CPU</a>. Modern CPUs also have a <a title="Translation look-aside buffer" href="http://en.wikipedia.org/wiki/Translation_lookaside_buffer">TLB</a>, a cache of recent translations which is used to speed up this process.</p>
<p>Traditionally, this translation is done on a <strong>page-by-page basis</strong>. This means that on the OS-level, memory can only be allocated in so-called <a title="Page" href="http://en.wikipedia.org/wiki/Page_%28computer_memory%29">pages</a> of a certain size. As an example, Windows 7 has a default page-size of 4 KB. Consequently, this also means that whenever you allocate memory directly from the OS, you can only allocate it with page-size granularity.</p>
<p>The details of address translation are actually quite involved, and here is a very good post describing this process for x86 and Cell architectures: <a title="Memory Address Translation" href="http://www.altdevblogaday.com/2011/07/24/memory-address-translation/">Memory Address Translation</a>.</p>
<h1>Allocating pages</h1>
<p>As an example, allocating just 10 bytes of memory using <a title="VirtualAlloc" href="http://msdn.microsoft.com/en-us/library/windows/desktop/aa366887%28v=vs.85%29.aspx">VirtualAlloc</a> (the low-level allocation function on Windows) will allocate a whole page, that is 4096 bytes. You can access all of the 4096 bytes without triggering an access violation, even though you only requested 10 bytes.</p>
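<p>To make the page-size granularity concrete, the rounding can be sketched as follows. This is only an illustration: the helper <em>RoundUpToPageSize</em> is a hypothetical name, and the page size is assumed to be a power of two.</p>
<pre class="brush: cpp; title: ; notranslate">
#include &lt;cstddef&gt;

// hypothetical helper: rounds a requested size up to the next multiple of the page size
// assumes pageSize is a power of two, e.g. 4096
inline std::size_t RoundUpToPageSize(std::size_t size, std::size_t pageSize)
{
  return (size + pageSize - 1) &amp; ~(pageSize - 1);
}
</pre>
<p>With a page size of 4096 bytes, a request for 10 bytes rounds up to 4096 bytes, and a request for 4097 bytes rounds up to 8192 bytes.</p>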
<p>Of course, page sizes differ across platforms (consoles). Some platforms even offer more than just one page-size. The reason for this is that the TLB is normally of limited size, so increasing the page-size can lead to fewer TLB misses (similar to cache misses), resulting in an increase in performance. However, larger pages can also lead to more wasted memory if you&#8217;re not careful. Thus, it&#8217;s a typical space/time-tradeoff.</p>
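<p>Because the page size differs between platforms, it is usually queried at run-time rather than hard-coded. The following is a minimal sketch under my own assumptions: <em>QueryPageSize</em> is a hypothetical helper, the Windows branch uses <em>GetSystemInfo</em>, and the other branch assumes a POSIX system offering <em>sysconf</em>. It only reports the default page size, not any larger page sizes a platform might offer.</p>
<pre class="brush: cpp; title: ; notranslate">
#include &lt;cstddef&gt;

#if defined(_WIN32)
  #include &lt;windows.h&gt;
#else
  #include &lt;unistd.h&gt;
#endif

// hypothetical helper: returns the default page size reported by the OS, in bytes
inline std::size_t QueryPageSize()
{
#if defined(_WIN32)
  SYSTEM_INFO info;
  GetSystemInfo(&amp;info);
  return static_cast&lt;std::size_t&gt;(info.dwPageSize);
#else
  // assumption: a POSIX system
  return static_cast&lt;std::size_t&gt;(sysconf(_SC_PAGESIZE));
#endif
}
</pre>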
<p>Normally, the general-purpose allocator of the runtime library (e.g. malloc/free) takes care of requesting pages from the OS, coalescing nearby allocations into contiguous regions, putting several small allocations on the same page, etc. However, as soon as we want to implement our own general-purpose allocator or any other custom allocation scheme, we need to be aware of such details, and cannot rely on malloc/free for our purposes.</p>
<p>Furthermore, knowing about such low-level details enables us to use a wealth of new debugging techniques like using <a title="Page protection" href="http://msdn.microsoft.com/en-us/library/windows/desktop/aa366786%28v=vs.85%29.aspx">protected pages</a>, <a title="Guard pages" href="http://msdn.microsoft.com/en-us/library/windows/desktop/aa366549%28v=vs.85%29.aspx">guard pages</a>, etc. As an example, some pages could be marked read-only in order to find memory stomps, race conditions (writes on shared data), and more. Guard pages serve as a one-shot alarm for memory page access, and are used, for example, for growing an application&#8217;s stack. Applications like <a title="PageHeap" href="http://msdn.microsoft.com/en-us/library/windows/hardware/ff549561%28v=vs.85%29.aspx">PageHeap</a> use those features for finding memory accesses beyond an allocation&#8217;s boundary.</p>
<h1>Use cases</h1>
<p>MMU, TLB, pages, address translation, memory protection&#8230; that all sounds wonderful, but what can we do with it?</p>
<p>Because the OS clearly distinguishes between <strong>reserving</strong> address space (see MEM_RESERVE for the <a title="VirtualAlloc" href="http://msdn.microsoft.com/en-us/library/windows/desktop/aa366887%28v=vs.85%29.aspx">VirtualAlloc</a> function) and <strong>allocating</strong> physical memory for address space (see MEM_COMMIT), we can build allocators that can grow to a specified upper limit, but only allocate the memory they actually need.</p>
<p>This is very, very useful when implementing growing allocators, because we can reserve a contiguous region of memory (=virtual address space), but only commit physical memory to it whenever we need it.</p>
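<p>The reserve/commit split can be sketched roughly as follows. This is not Molecule&#8217;s implementation: the three helper names are hypothetical, the Windows branch uses VirtualAlloc with MEM_RESERVE and MEM_COMMIT as described above, and the other branch is my assumption of a rough POSIX approximation (mapping the range without access rights first, then enabling access with mprotect).</p>
<pre class="brush: cpp; title: ; notranslate">
#include &lt;cstddef&gt;

#if defined(_WIN32)
  #include &lt;windows.h&gt;
#else
  #include &lt;sys/mman.h&gt;
#endif

// reserves a contiguous range of virtual address space without making it accessible
inline void* ReserveAddressSpace(std::size_t size)
{
#if defined(_WIN32)
  return VirtualAlloc(nullptr, size, MEM_RESERVE, PAGE_NOACCESS);
#else
  void* ptr = mmap(nullptr, size, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  return (ptr == MAP_FAILED) ? nullptr : ptr;
#endif
}

// makes part of a previously reserved range usable; ptr is assumed to be page-aligned
inline bool CommitMemory(void* ptr, std::size_t size)
{
#if defined(_WIN32)
  return VirtualAlloc(ptr, size, MEM_COMMIT, PAGE_READWRITE) != nullptr;
#else
  return mprotect(ptr, size, PROT_READ | PROT_WRITE) == 0;
#endif
}

// releases the whole reserved range again
inline void ReleaseAddressSpace(void* ptr, std::size_t size)
{
#if defined(_WIN32)
  (void)size; // MEM_RELEASE always frees the complete reservation
  VirtualFree(ptr, 0, MEM_RELEASE);
#else
  munmap(ptr, size);
#endif
}
</pre>
<p>A growing allocator built on top of this would reserve its upper limit once, e.g. 64 MB, and call the commit function for the next few pages each time it runs out of committed memory.</p>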
<p>Virtual memory addressing is supported by all common desktop OSs (Windows, Linux, Mac), almost all consoles (cannot go into details because of NDAs) and even on mobiles like the iPhone. How different growing allocators can be built using virtual memory on those platforms will be the topic of the next posts!</p>
]]></content:encoded>
			<wfw:commentRss>http://molecularmusings.wordpress.com/2012/10/02/memory-allocation-strategies-interlude-virtual-memory/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/913fa98c5b06dd01be79595be8d6cc4c?s=96&#38;d=http%3A%2F%2F0.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">molecularmusings</media:title>
		</media:content>
	</item>
		<item>
		<title>Memory allocation strategies: a pool allocator</title>
		<link>http://molecularmusings.wordpress.com/2012/09/17/memory-allocation-strategies-a-pool-allocator/</link>
		<comments>http://molecularmusings.wordpress.com/2012/09/17/memory-allocation-strategies-a-pool-allocator/#comments</comments>
		<pubDate>Mon, 17 Sep 2012 18:16:56 +0000</pubDate>
		<dc:creator>Stefan Reinalter</dc:creator>
				<category><![CDATA[Core]]></category>
		<category><![CDATA[C++]]></category>
		<category><![CDATA[game engine]]></category>
		<category><![CDATA[memory allocator]]></category>
		<category><![CDATA[memory pool]]></category>
		<category><![CDATA[memory system]]></category>
		<category><![CDATA[molecule engine]]></category>
		<category><![CDATA[pool allocator]]></category>

		<guid isPermaLink="false">http://molecularmusings.wordpress.com/?p=377</guid>
		<description><![CDATA[As promised last time, today we will see how pool allocators can help with allocating/freeing allocations of a certain size, in any order, in O(1) time. Use cases Pool allocators are extremely helpful for allocating/freeing objects of a certain size &#8230; <a href="http://molecularmusings.wordpress.com/2012/09/17/memory-allocation-strategies-a-pool-allocator/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=molecularmusings.wordpress.com&#038;blog=24330678&#038;post=377&#038;subd=molecularmusings&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>As promised <a title="Memory allocation strategies: a stack-like (LIFO) allocator" href="http://molecularmusings.wordpress.com/2012/08/27/memory-allocation-strategies-a-stack-like-lifo-allocator/">last time</a>, today we will see how pool allocators can help with allocating/freeing allocations of a certain size, in any order, in O(1) time.</p>
<p><span id="more-377"></span></p>
<h1>Use cases</h1>
<p>Pool allocators are extremely helpful for allocating/freeing objects of a certain size which often have to be created/destroyed dynamically, such as weapon bullets, entity instances, rigid bodies, etc.</p>
<p>Most of those objects are created/destroyed in completely random order, due to their dynamic nature. Therefore, it is desirable to be able to allocate/free memory with as little fragmentation as possible. Pool allocators are extremely well suited for that.</p>
<h1>How does it work?</h1>
<p>Simply put, a pool allocator allocates a chunk of memory <strong>once</strong>, and divides that memory into slots/bins/pools which fit exactly M instances of size N. As an example, consider we want to have a maximum of 256 bullets in flight at the same time, each bullet having a size of 32 bytes. Thus, the pool allocator would allocate 256*32 = 8192 bytes once, dividing it into slots which are then used for allocating/freeing objects of size 32.</p>
<p>But how are those allocations made? How can we guarantee O(1) time? How can allocations be made in any order, without fragmentation?</p>
<p><strong>Freelists</strong></p>
<p>The answer to all of the above boils down to the use of what is called a <a title="Free list" href="http://en.wikipedia.org/wiki/Free_list">free list</a>. Free lists internally store a linked list of free slots inside the allocated memory. Storing them in place is crucial &#8211; there is no <em>std::vector</em>, <em>std::list</em>, or similar that keeps track of free slots. It is all stored <strong>inside</strong> the pre-allocated pool of memory.</p>
<p>The way it is usually done is the following: each slot (32 bytes in our example) in the pool of memory is connected to the next slot simply by storing the pointer to the next slot in the first few bytes of the slot.</p>
<p>Assuming our pool of memory sits at location <strong>0x0</strong> in memory, the layout would be something like the following:</p>
<pre>         +---------+---------+---------+---------+
         | 0x20    | 0x40    | 0x60    | nullptr |
         +---------+---------+---------+---------+

         ^         ^         ^         ^
         |         |         |         |
address: 0x0       0x20      0x40      0x60</pre>
<p>The blocks denote the slots in memory, the bottom row shows the address in memory. As can be seen, the memory at <strong>0x0</strong> would contain a pointer to <strong>0x20</strong>, which would contain a pointer to <strong>0x40</strong>, and so on. We have just formed an intrusive linked list in our memory pool.</p>
<p>There is one thing to note here: as long as slots are free, we can store <strong>anything we want</strong> inside those 32 bytes. When a slot is in use, we don&#8217;t need to store anything, because that slot is occupied anyway and so no longer part of our free list. All we have to do is remove a slot from the free list whenever it is allocated, and add it to the linked-list again whenever it is freed.</p>
<h1>Implementation</h1>
<p>Let us take a look at the inner workings of a free list:</p>
<pre class="brush: cpp; title: ; notranslate">
class Freelist
{
public:
  Freelist(void* start, void* end, size_t elementSize, size_t alignment, size_t offset);

  inline void* Obtain(void);

  inline void Return(void* ptr);

private:
  Freelist* m_next;
};
</pre>
<p>The only member we need to store is <em>m_next</em>, a pointer to a currently free slot in our memory pool. Because a free slot holds nothing but the pointer to the next free slot, the <em>Freelist</em> type also simply acts as an alias for the slots in memory.</p>
<p>Leaving alignment and offset requirements out of the equation for now, initializing a free list is quite simple:</p>
<pre class="brush: cpp; title: ; notranslate">
union
{
  void* as_void;
  char* as_char;
  Freelist* as_self;
};

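// start at the beginning of the memory range; its first slot becomes the head of the free list
as_void = start;
m_next = as_self;
as_char += elementSize;

// number of slots that fit into the range [start, end)
const size_t numElements = static_cast&lt;size_t&gt;(static_cast&lt;char*&gt;(end) - static_cast&lt;char*&gt;(start)) / elementSize;

// initialize the free list - make every m_next of each element point to the next element in the list
Freelist* runner = m_next;
for (size_t i=1; i&lt;numElements; ++i)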
{
  runner-&gt;m_next = as_self;
  runner = as_self;
  as_char += elementSize;
}

runner-&gt;m_next = nullptr;
</pre>
<p>With the intrusive linked-list in place, allocating/freeing memory really becomes an O(1) operation, and is just ordinary linked-list manipulation code:</p>
<pre class="brush: cpp; title: ; notranslate">
inline void* Freelist::Obtain(void)
{
  // is there an entry left?
  if (m_next == nullptr)
  {
    // we are out of entries
    return nullptr;
  }

  // obtain one element from the head of the free list
  Freelist* head = m_next;
  m_next = head-&gt;m_next;
  return head;
}

inline void Freelist::Return(void* ptr)
{
  // put the returned element at the head of the free list
  Freelist* head = static_cast&lt;Freelist*&gt;(ptr);
  head-&gt;m_next = m_next;
  m_next = head;
}
</pre>
<p>I moved the free list implementation to its own class so it can be used by both the non-growing and growing variant of the pool allocator. Furthermore, it is handy for other things, too.</p>
<p><strong>Alignment requirements</strong></p>
<p>In order to satisfy alignment requirements, we need to offset our slots into the memory pool once, so all slots can satisfy the same offset and alignment requirements. The disadvantage of this approach is that a pool allocator can never satisfy more than one offset requirement, but such cases should be very rare.</p>
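<p>One way of computing that initial offset can be sketched like this. It is a sketch under stated assumptions, not the code used in Molecule: <em>AlignWithOffset</em> is a hypothetical helper, and the alignment is assumed to be a power of two.</p>
<pre class="brush: cpp; title: ; notranslate">
#include &lt;cstddef&gt;
#include &lt;cstdint&gt;

// hypothetical helper: returns the first address p &gt;= start for which (p + offset) is a multiple of alignment
// assumes alignment is a power of two
inline char* AlignWithOffset(char* start, std::size_t alignment, std::size_t offset)
{
  const std::uintptr_t mask = static_cast&lt;std::uintptr_t&gt;(alignment) - 1;
  const std::uintptr_t address = reinterpret_cast&lt;std::uintptr_t&gt;(start) + offset;
  const std::uintptr_t aligned = (address + mask) &amp; ~mask;
  return reinterpret_cast&lt;char*&gt;(aligned - offset);
}
</pre>
<p>The first slot would then start at the returned address. All following slots only keep the same alignment if the slot size is a multiple of the alignment, so the slot size may have to be rounded up accordingly.</p>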
<p>The way it is done in Molecule is that users have to provide their maximum object size and maximum alignment requirement when constructing the pool allocator. The allocator can then satisfy all <em>alignments &lt;= maximumAlignment</em> and <em>object sizes &lt;= maximumSize</em>, and simply asserts in all other cases. This at least allows the user to allocate objects of different sizes (such as 4, 8, 12, 16) out of the same pool, with different alignments (such as 4, 8, 16, 32), if desired.</p>
<p>Control is in the user&#8217;s hands, so it is up to the user to either use different pools for different allocations (often recommended), or live with some wasted memory inside the memory pool.</p>
<p><strong>Usage</strong></p>
<p>Usage is simple. The following shows a free list which is able to satisfy allocations up to a size of 32 bytes and an alignment of 8 bytes. Note that the free list takes a range of memory in which it initializes itself. This ensures that free lists can be used for allocations on the stack, on the heap, and by different allocators (non-growing and growing).</p>
<pre class="brush: cpp; title: ; notranslate">
ME_ALIGNED_SYMBOL(char memory[1024], 8) = {};

core::Freelist freelist(memory, memory+1024, 32, 8, 0);

// allocates a slot of 32 bytes, aligned to an 8-byte boundary
void* object0 = freelist.Obtain();

// allocates another slot of 32 bytes, aligned to an 8-byte boundary
void* object1 = freelist.Obtain();

// obtained slots can be returned in any order
freelist.Return(object1);
freelist.Return(object0);
</pre>
<p>The pool allocator described in this post simply has a <strong>Freelist</strong> instance as member, and forwards all <strong>Allocate()</strong> calls to <strong>freelist.Obtain()</strong>, and all <strong>Free()</strong> calls to <strong>freelist.Return()</strong>. Additionally, it asserts that allocation sizes and alignment requests fit the initial maximum sizes provided by the user.</p>
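<p>Such a forwarding pool allocator might look roughly like the following sketch. The names <em>PoolAllocator</em>, <em>Allocate()</em> and <em>Free()</em> with these exact signatures are my assumptions for illustration, and the <em>Freelist</em> shown here is a simplified stand-in that leaves out alignment and offset handling. The memory range is assumed to be suitably aligned already, and the slot size to be a multiple of the maximum alignment.</p>
<pre class="brush: cpp; title: ; notranslate">
#include &lt;cassert&gt;
#include &lt;cstddef&gt;

// simplified stand-in for the Freelist class (no alignment or offset handling)
class Freelist
{
public:
  Freelist(void* start, void* end, std::size_t elementSize)
    : m_next(nullptr)
  {
    char* slot = static_cast&lt;char*&gt;(start);
    const std::size_t numElements = static_cast&lt;std::size_t&gt;(static_cast&lt;char*&gt;(end) - slot) / elementSize;

    // push the slots in reverse order, so the first slot ends up at the head of the list
    for (std::size_t i = numElements; i &gt; 0; --i)
    {
      Return(slot + (i - 1) * elementSize);
    }
  }

  void* Obtain()
  {
    if (m_next == nullptr)
    {
      return nullptr;
    }

    Freelist* head = m_next;
    m_next = head-&gt;m_next;
    return head;
  }

  void Return(void* ptr)
  {
    Freelist* head = static_cast&lt;Freelist*&gt;(ptr);
    head-&gt;m_next = m_next;
    m_next = head;
  }

private:
  Freelist* m_next;
};

// hypothetical non-growing pool allocator which forwards to the free list
class PoolAllocator
{
public:
  PoolAllocator(void* start, void* end, std::size_t maxElementSize, std::size_t maxAlignment)
    : m_freelist(start, end, maxElementSize)
    , m_maxElementSize(maxElementSize)
    , m_maxAlignment(maxAlignment)
  {
    // each free slot must be able to hold the pointer to the next free slot
    assert(maxElementSize &gt;= sizeof(void*));
  }

  void* Allocate(std::size_t size, std::size_t alignment)
  {
    // requests must fit the maximum values provided at construction
    assert(size &lt;= m_maxElementSize);
    assert(alignment &lt;= m_maxAlignment);
    return m_freelist.Obtain();
  }

  void Free(void* ptr)
  {
    m_freelist.Return(ptr);
  }

private:
  Freelist m_freelist;
  std::size_t m_maxElementSize;
  std::size_t m_maxAlignment;
};
</pre>
<p>Every allocation occupies a full slot regardless of the requested size, which is where the wasted memory mentioned above comes from when smaller objects share a pool.</p>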
<h1>Outlook</h1>
<p>The pool allocator was the last remaining non-growing allocator we haven&#8217;t discussed yet. Starting with the next post, we will take a look at how to implement growing allocators by means of virtual memory and allocating physical pages from the OS.</p>
]]></content:encoded>
			<wfw:commentRss>http://molecularmusings.wordpress.com/2012/09/17/memory-allocation-strategies-a-pool-allocator/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/913fa98c5b06dd01be79595be8d6cc4c?s=96&#38;d=http%3A%2F%2F0.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">molecularmusings</media:title>
		</media:content>
	</item>
	</channel>
</rss>
