× {{alert.msg}} Never ask again
Get notified about new tutorials RECEIVE NEW TUTORIALS

Improve performance by using an explicit capacity when instantiating dynamic arrays

Michael Safyan
Mar 27, 2016
<p>A dynamic array data structure is type of list data structure in which the data is stored contiguously in memory and, unlike an ordinary array, is able to expand or contract according to the memory needed (by, under-the-hood, allocating and copying into arrays of a more appropriate size as necessary). Dynamic arrays provide fast (O(1)) random access and also good spatial locality for CPU caches, improving the performance of iteration. In C++, the <a href="http://en.cppreference.com/w/cpp/container/vector">std::vector&lt;&gt;</a> class provides a standard implementation of a dynamic array; in Java, the <a href="https://docs.oracle.com/javase/7/docs/api/java/util/ArrayList.html">ArrayList&lt;&gt;</a> class provides a dynamic array.</p> <p>A common performance anti-pattern in code is to copy data from a list into a dynamic array data structure without specifying an explicit capacity upfront (even when the size of the original list is known). This leads to many needless allocations and copies of the data as the target dynamic array is repeatedly expanded until it reaches a suitable size.</p> <p>That is, it is not uncommon to see code like this:</p> <pre><code class="language-cpp">std::vector&lt;string&gt; ExtractNames(const std::vector&lt;Person&gt;&amp; people) { std::vector&lt;string&gt; result; for (const auto&amp; person : people) { result.push_back(person.name()); } return result; }</code></pre> <p>Or, similarly, in Java:</p> <pre><code class="language-java">List&lt;String&gt; extractNames(List&lt;Person&gt; people) { List&lt;String&gt; result = new ArrayList&lt;&gt;(); for (String person : people) { result.add(person.getName()); } return result; }</code></pre> <p>In both cases, the size of the resulting list is known, so there is no excuse for failing to indicate the required capacity. In C++, the required capacity can be specified using the <a href="http://en.cppreference.com/w/cpp/container/vector/reserve">reserve()</a> function; in Java, the constructor accepts the capacity.</p> <p>That is, simply calling:</p> <pre><code class="language-cpp">result.reserve(people.size());</code></pre> <p>... in the C++ example before the loop, and using:</p> <pre><code class="language-java"> List&lt;String&gt; result = new ArrayList&lt;&gt;(people.size()); </code></pre> <p>... instead of:</p> <pre><code class="language-java"> List&lt;String&gt; result = new ArrayList&lt;&gt;(); </code></pre> <p>... in the Java example, can provide huge savings. Similar savings can be had in other programming languages and with other dynamic array implementations that allow the dynamic array's capacity to be specified. Keep this in mind the next time you create and populate a dynamic array from a source where the resulting size is knowable.</p>
comments powered by Disqus