customallocators.ppt 140KB Jun 23 2011 07:22:22 AM
Custom STL Allocators
Pete Isensee
Xbox Advanced Technology Group
(2)
Topics
• Allocators: What are They Good For?
• Writing Your First Allocator
• The Devil in the Details
• Allocator Pitfalls
– State
– Syntax
– Testing
(3)
Containers and Allocators
• STL containers allocate memory
– e.g. vector (contiguous), list (nodes)
– string is a container, for this talk
• Allocators provide a standard interface for container memory use
• If you don’t provide an allocator, one is provided for you
(4)
Example
• Default Allocator
list<int> b;
// same as:
list< int, allocator<int> > b;
• Custom Allocator
#include "MyAlloc.h"
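The equivalence of the two list declarations can be checked at compile time. A minimal sketch, assuming a C++11 or later compiler for std::is_same (newer than this talk); the function names are made up for illustration:

```cpp
#include <cassert>
#include <list>
#include <memory>
#include <type_traits>

// Returns true when list<int> and list<int, allocator<int> > are one and
// the same type. The function name is hypothetical.
bool default_allocator_is_implicit()
{
    typedef std::list<int> Implicit;
    typedef std::list<int, std::allocator<int> > Explicit;
    return std::is_same<Implicit, Explicit>::value;
}

// Small self-check for the helper above.
void check_default_allocator()
{
    assert(default_allocator_is_implicit());
}
```

Because the allocator is part of the container's type, a list that uses a custom allocator is a different type from list<int> and cannot be passed where list<int> is expected.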
(5)
The Good
• Original idea: abstract the notion of near and far memory pointers
• Expanded idea: allow customization of container allocation
• Good for
– Size: Optimizing memory usage (pools, fixed-size allocators)
– Speed: Reducing allocation time (single-threaded, one-time free)
(6)
Example Allocators
• No heap locking (single thread)
• Avoiding fragmentation
• Aligned allocations (_aligned_malloc)
• Fixed-size allocations
• Custom free list
• Debugging
• Custom heap
(7)
The Bad
• No realloc()
• Requires advanced C++ compilers
• C++ Standard hand-waving
• Generally library-specific
– If you change STL libraries you may need to rewrite allocators
• Generally not cross-platform
– If you change compilers you may need to rewrite allocators
(8)
The Ugly
• Not quite real objects
– Allocators with state may not work as expected
• Gnarly syntax
– map<int,char> m;
– map<int,char,less<int>,
(9)
Pause to Reflect
• “Premature optimization is the root of all evil” – Donald Knuth
• Allocators are a last-resort, low-level optimization
• Especially for games, allocators can be the perfect optimization
• Written correctly, they can be introduced w/o many code changes
(10)
Writing Your First Allocator
• Create MyAlloc.h
• #include <memory>
• Copy or derive from the default allocator
• Rename “allocator” to “MyAlloc”
• Resolve any helper functions
(11)
Writing Your First Allocator
• Demo
• Visual C++ Pro 7.0 (13.00.9466)
• Dinkumware STL (V3.10:0009)
• 933MHz PIII w/ 512MB
• Windows XP Pro 2002
• Launch Visual Studio
(12)
Two key functions
• Allocate
• Deallocate
• That’s all!
(13)
Conventions
template< typename T > class allocator
{
typedef size_t size_type;
typedef T* pointer;
typedef const T* const_pointer;
typedef T value_type;
(14)
Allocate Function
• pointer allocate( size_type n,
allocator<void>::const_pointer p = 0)
– n is the number of items T, NOT bytes
– returns pointer to enough memory to hold n * sizeof(T) bytes
– returns raw bytes; NO construction
– may throw an exception (std::bad_alloc)
– default calls ::operator new
(15)
Deallocate function
• void deallocate( pointer p,size_type n )
– p must come from allocate()
– p must be raw bytes; already destroyed
– n must match the n passed to allocate()
– default calls ::operator delete(void*)
– Most implementations allow and ignore NULL p; you should too
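The two functions can be sketched as a complete, if trivial, allocator. This is a minimal sketch assuming a C++11 or later library, where std::allocator_traits supplies the remaining members (rebind, construct, destroy) that older libraries require you to write by hand. The global counter g_live_blocks is a hypothetical addition that only makes the calls observable:

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical counter: number of blocks handed out and not yet returned.
static std::size_t g_live_blocks = 0;

template <typename T>
class MyAlloc
{
public:
    typedef T value_type;
    typedef T* pointer;
    typedef std::size_t size_type;

    MyAlloc() {}
    template <typename U> MyAlloc(const MyAlloc<U>&) {}

    // n counts objects of type T, not bytes. The memory comes back raw:
    // no constructor runs here.
    pointer allocate(size_type n)
    {
        pointer p = static_cast<pointer>(::operator new(n * sizeof(T)));
        ++g_live_blocks;
        return p;
    }

    // p must come from allocate(); the objects in it are already destroyed.
    void deallocate(pointer p, size_type)
    {
        if (p == 0) return;            // allow and ignore NULL
        --g_live_blocks;
        ::operator delete(p);
    }
};

template <typename T, typename U>
bool operator==(const MyAlloc<T>&, const MyAlloc<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const MyAlloc<T>&, const MyAlloc<U>&) { return false; }

// Returns how many blocks are outstanding while a small vector is alive.
std::size_t live_blocks_while_vector_alive()
{
    std::vector<int, MyAlloc<int> > v;
    v.push_back(1);
    v.push_back(2);
    assert(v.size() == 2);
    return g_live_blocks;
}
```

After the function returns the vector is gone, so g_live_blocks should be back at zero. The exact count while the vector is alive depends on the library, which is why the sketch does not assume a specific number.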
(16)
A Custom Allocator
• Demo
• That’s it!
• Not quite: the devil is in the details
– Construction
– Destruction
– Example STL container code
(17)
Construction
• Allocate() doesn’t call constructors
• Why? Performance
• Allocators provide a construct function
void construct(pointer p, const T& t) { new( (void*)p ) T(t); }
• Placement new
– Doesn’t allocate memory
(18)
Destruction
• Deallocate() doesn’t call destructors
• Allocators provide a destroy function
void destroy( pointer p ) { ((T*)p)->~T(); }
• Direct destructor invocation
– Doesn’t deallocate memory
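The split between allocation, construction, destruction and deallocation can be traced step by step. A minimal sketch: Tracked is a hypothetical type that counts its live instances so each step is visible.

```cpp
#include <cassert>
#include <new>

// Hypothetical type that counts live instances, so construction and
// destruction can be observed separately from allocation.
struct Tracked
{
    static int live;
    int value;
    explicit Tracked(int v) : value(v) { ++live; }
    Tracked(const Tracked& other) : value(other.value) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Walks through the four steps in order and returns the value that was
// stored in the in-place object while it was alive.
int lifetime_demo()
{
    const int start = Tracked::live;

    // 1. allocate: raw bytes only, no constructor runs
    void* raw = ::operator new(sizeof(Tracked));
    assert(Tracked::live == start);

    // 2. construct: placement new copy-constructs into the raw bytes
    Tracked prototype(42);
    Tracked* p = new (raw) Tracked(prototype);
    assert(Tracked::live == start + 2);    // prototype plus the copy

    const int seen = p->value;

    // 3. destroy: direct destructor call, the memory is still ours
    p->~Tracked();
    assert(Tracked::live == start + 1);    // only prototype remains

    // 4. deallocate: hand the raw bytes back
    ::operator delete(raw);
    return seen;
}
```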
(19)
Example: Vector
template< typename T, typename A > class vector {
A a; // allocator
pointer pFirst; // first object
pointer pEnd; // 1 beyond end of allocated storage
pointer pLast; // 1 beyond last constructed object
(20)
Example: Reserve
vector::reserve( size_type n ) {
size_type count = size(); // save before pFirst changes
pointer p = a.allocate( n, 0 );
// loop on a.construct() to copy
// loop on a.destroy() to tear down
a.deallocate( pFirst, capacity() );
pFirst = p;
pLast = p + count;
pEnd = p + n;
(21)
Performance is paramount
• Reserve
– Single allocation
– Doesn’t default construct anything
– Deals properly with real objects
• No memcpy
• Copy constructs new objects
• Destroys old objects
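The reserve sequence can be sketched in a compilable form. MiniVec is a hypothetical, heavily reduced stand-in for vector with no copy support and no exception safety; it goes through std::allocator_traits, which assumes C++11 or later, rather than calling a.construct() and a.destroy() directly.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

template <typename T, typename A = std::allocator<T> >
class MiniVec
{
    typedef std::allocator_traits<A> Traits;
    A a;          // allocator
    T* pFirst;    // first object
    T* pLast;     // 1 beyond last constructed object
    T* pEnd;      // 1 beyond end of allocated storage

    MiniVec(const MiniVec&);             // not copyable in this sketch
    MiniVec& operator=(const MiniVec&);

public:
    MiniVec() : a(), pFirst(0), pLast(0), pEnd(0) {}
    ~MiniVec()
    {
        for (T* q = pFirst; q != pLast; ++q) Traits::destroy(a, q);
        if (pFirst) Traits::deallocate(a, pFirst, capacity());
    }

    std::size_t size() const { return pLast - pFirst; }
    std::size_t capacity() const { return pEnd - pFirst; }
    const T& operator[](std::size_t i) const { return pFirst[i]; }

    void reserve(std::size_t n)
    {
        if (n <= capacity()) return;
        const std::size_t count = size();        // save before pFirst changes
        T* p = Traits::allocate(a, n);           // single allocation, raw bytes
        for (std::size_t i = 0; i < count; ++i)
            Traits::construct(a, p + i, pFirst[i]);   // copy construct, no memcpy
        for (std::size_t i = 0; i < count; ++i)
            Traits::destroy(a, pFirst + i);           // tear down old objects
        if (pFirst) Traits::deallocate(a, pFirst, capacity());
        pFirst = p;
        pLast = p + count;
        pEnd = p + n;
    }

    void push_back(const T& t)
    {
        if (pLast == pEnd) reserve(capacity() ? capacity() * 2 : 4);
        Traits::construct(a, pLast, t);
        ++pLast;
    }
};

// Fills a MiniVec, forces a reallocation, and returns the joined contents.
std::string mini_vec_demo()
{
    MiniVec<std::string> v;
    v.reserve(2);
    v.push_back("near");
    v.push_back("far");
    v.reserve(16);                       // copy, destroy, deallocate path
    assert(v.size() == 2 && v.capacity() == 16);
    return v[0] + "/" + v[1];
}
```

Using std::string as the element type is deliberate: it owns memory itself, so a memcpy-based reserve would leave two objects owning the same buffer.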
(22)
Rebind
• Allocators don’t always allocate T
list<Obj> ObjList; // allocates nodes
• How? Rebind
template<typename U> struct rebind { typedef allocator<U> other; };
• To allocate an N given type T
Alloc<T> a;
T* t = a.allocate(1); // allocs sizeof(T)
typename Alloc<T>::template rebind<N>::other na; // typename/template needed when T is a template parameter
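What a node-based container does with rebind can be sketched as follows. Node is a hypothetical node layout, not the one any particular library uses, and node_bytes is a made-up helper; static_assert and std::is_same assume C++11 or later.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>

template <typename T>
struct Alloc
{
    typedef T value_type;
    template <typename U> struct rebind { typedef Alloc<U> other; };

    Alloc() {}
    template <typename U> Alloc(const Alloc<U>&) {}

    T* allocate(std::size_t n)
    { return static_cast<T*>(::operator new(n * sizeof(T))); }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};

// Hypothetical list node, standing in for whatever a real list<T> uses.
template <typename T>
struct Node { Node* prev; Node* next; T value; };

// Given an allocator for T, obtain one for Node<T>, allocate a single node,
// release it, and report how many bytes that allocation covered.
template <typename T>
std::size_t node_bytes(const Alloc<T>& element_alloc)
{
    // Dependent name, so both typename and template are required here.
    typedef typename Alloc<T>::template rebind<Node<T> >::other NodeAlloc;
    static_assert(std::is_same<NodeAlloc, Alloc<Node<T> > >::value,
                  "rebind should name Alloc<Node<T> >");

    NodeAlloc na(element_alloc);      // template copy constructor
    Node<T>* n = na.allocate(1);      // room for one node, not one T
    assert(n != 0);
    na.deallocate(n, 1);
    return sizeof(Node<T>);
}
```

The conversion from Alloc<T> to NodeAlloc is the reason the template copy constructor matters for allocators with state.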
(23)
Allocator Pitfalls
• To Derive or Not to Derive
• State
– Copy ctor and template copy ctor
– Allocator comparison
• Syntax issues
• Testing
(24)
To Derive or Not To Derive
• Deriving from std::allocator
– Dinkumware derives (see <xdebug>)
– Must provide rebind, allocate, deallocate
– Less code; easier to see differences
• Writing from scratch
– Allocator not designed as base class
– Josuttis and Austern write from scratch
– Better understanding
(25)
Allocators with State
• State = allocator member data
• Default allocator has no data
• C++ Std says (paraphrasing 20.1.5):
– Vendors encouraged to support allocators with state
– Containers may assume that allocators don’t have state
(26)
State Recommendations
• Be aware of compatibility issues across STL vendors
• list::splice() or C::swap() will indicate if your vendor supports stateful allocators
– Dinkumware: yes
– STLport: no
(27)
State Implications
• Container size increase
• Must provide allocator:
– Constructor(s)
• Default may be private if parameters required
– Copy constructor
– Template copy constructor
– Global comparison operators (==, !=)
• No assignment operators required
• Avoid static data; generates one per T
(28)
Heap Allocator Example
template< typename T > class Halloc {
Halloc(); // could be private
explicit Halloc( HANDLE hHeap );
Halloc( const Halloc& ); // copy
template< typename U > // templatized
Halloc( const Halloc<U>& ); // copy
(29)
Template Copy Constructor
• Can’t see private data
template< typename U >
Halloc( const Halloc<U>& a ) :
m_hHeap( a.m_hHeap ) {} // error
• Solutions
– Provide public data accessor function
– Or allow access to other types U
template <typename U> friend class Halloc;
(30)
Allocator comparison
• Example
template< typename T, typename U >
bool operator==( const Alloc<T>& a, const Alloc<U>& b )
{ return a.state == b.state; }
• Provide both == and !=
• Should be global funcs, not members
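The pieces of a stateful allocator (state, template copy constructor, friend access, global comparison) can be put together as below. This is a sketch under stated assumptions: Heap is a hypothetical stand-in for a real heap handle such as a Windows HANDLE, memory still comes from ::operator new, and the minimal interface relies on a C++11 or later library.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <new>

// Hypothetical stand-in for a heap handle; it only counts live blocks.
struct Heap
{
    std::size_t blocks;
    Heap() : blocks(0) {}
};

template <typename T>
class Halloc
{
public:
    typedef T value_type;

    explicit Halloc(Heap* heap) : m_heap(heap) {}

    // Template copy constructor: used when a container rebinds the allocator.
    template <typename U>
    Halloc(const Halloc<U>& other) : m_heap(other.m_heap) {}

    T* allocate(std::size_t n)
    {
        T* p = static_cast<T*>(::operator new(n * sizeof(T)));
        ++m_heap->blocks;
        return p;
    }
    void deallocate(T* p, std::size_t)
    {
        if (p == 0) return;
        --m_heap->blocks;
        ::operator delete(p);
    }

    // Public accessor used by the global comparison operators.
    const Heap* heap() const { return m_heap; }

private:
    template <typename U> friend class Halloc;   // lets Halloc<U> read m_heap
    Heap* m_heap;
};

// Equal means: memory from one can be released through the other.
template <typename T, typename U>
bool operator==(const Halloc<T>& a, const Halloc<U>& b)
{ return a.heap() == b.heap(); }
template <typename T, typename U>
bool operator!=(const Halloc<T>& a, const Halloc<U>& b)
{ return !(a == b); }

// Builds a list on one heap and checks copies and comparisons.
bool halloc_demo()
{
    Heap heapA, heapB;
    Halloc<int> a(&heapA);
    Halloc<char> b(a);              // different T, same heap
    Halloc<int> c(&heapB);

    std::list<int, Halloc<int> > x(a);
    x.push_back(1);
    assert(heapA.blocks >= 1);      // nodes were charged to heapA
    assert(heapB.blocks == 0);

    return (a == b) && (a != c);
}
```

The allocator is passed as a named variable rather than a temporary: writing x( Halloc<int>(&heapA) ) would be parsed as a function declaration.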
(31)
Syntax: Typedefs
• Prefer typedefs
• Offensive
list< int, Alloc< int > > b;
• Better
// .h
typedef Alloc< int > IAlloc;
typedef list< int, IAlloc > IntList;
// .cpp
(32)
Syntax: Construction
• Containers accept allocators via ctors
IntList b( IAlloc( x,y,z ) );
• If none specified, you get the default
IntList b; // calls IAlloc()
• Map/multimap requires pairs (the value type is pair< const K,T >)
Alloc< pair< const K,T > > a;
map< K, T, less<K>,
Alloc< pair< const K,T > > > m( less<K>(), a );
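The typedef and constructor syntax for a map can be sketched end to end. Assumptions: Alloc is a minimal allocator written for this example, and its id member is a hypothetical piece of state used only to recognize the copy stored in the container. The allocator is declared for pair<const int, char>, which is map's value_type.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <new>
#include <utility>

template <typename T>
struct Alloc
{
    typedef T value_type;
    int id;   // hypothetical state, only here to identify copies

    Alloc() : id(0) {}
    explicit Alloc(int i) : id(i) {}
    template <typename U> Alloc(const Alloc<U>& other) : id(other.id) {}

    T* allocate(std::size_t n)
    { return static_cast<T*>(::operator new(n * sizeof(T))); }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};
template <typename T, typename U>
bool operator==(const Alloc<T>& a, const Alloc<U>& b) { return a.id == b.id; }
template <typename T, typename U>
bool operator!=(const Alloc<T>& a, const Alloc<U>& b) { return a.id != b.id; }

// .h: typedefs keep the declarations readable
typedef std::pair<const int, char> IntCharPair;   // map's value_type
typedef Alloc<IntCharPair> PairAlloc;
typedef std::map<int, char, std::less<int>, PairAlloc> IntCharMap;

// .cpp: the comparator comes first, then the allocator
char map_demo()
{
    PairAlloc a(5);
    IntCharMap m(std::less<int>(), a);
    m[7] = 'x';
    m.insert(IntCharPair(3, 'y'));
    assert(m.get_allocator().id == 5);   // the map kept a copy of a
    assert(m.size() == 2);
    return m[7];
}
```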
(33)
Syntax: Other Containers
• Container adaptors accept containers via constructors, not allocators
Alloc<T> a;
deque< T, Alloc<T> > d(a);
stack< T, deque<T,Alloc<T> > > s(d);
• String example
Alloc<T> a;
basic_string< T, char_traits<T>, Alloc<T> > s(a);
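The adaptor and string cases can be sketched with concrete types. Alloc here is a minimal allocator written for this example with a hypothetical id member; the element types (int, char) are chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <new>
#include <stack>
#include <string>

template <typename T>
struct Alloc
{
    typedef T value_type;
    int id;   // hypothetical state, only here to identify copies

    Alloc() : id(0) {}
    explicit Alloc(int i) : id(i) {}
    template <typename U> Alloc(const Alloc<U>& other) : id(other.id) {}

    T* allocate(std::size_t n)
    { return static_cast<T*>(::operator new(n * sizeof(T))); }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};
template <typename T, typename U>
bool operator==(const Alloc<T>& a, const Alloc<U>& b) { return a.id == b.id; }
template <typename T, typename U>
bool operator!=(const Alloc<T>& a, const Alloc<U>& b) { return a.id != b.id; }

typedef Alloc<int> IAlloc;
typedef std::deque<int, IAlloc> IntDeque;
typedef std::stack<int, IntDeque> IntStack;
typedef std::basic_string<char, std::char_traits<char>, Alloc<char> > AString;

// The allocator goes to the underlying container; the adaptor is then
// constructed from (a copy of) that container.
int adaptor_demo()
{
    IAlloc a(9);
    IntDeque d(a);
    IntStack s(d);
    s.push(10);
    s.push(20);
    return s.top();
}

// Strings take the allocator directly, like any other container.
std::size_t string_demo()
{
    Alloc<char> a(3);
    AString s(a);
    s += "custom allocators";
    assert(s.get_allocator().id == 3);
    return s.size();
}
```

Note that AString is not std::string: the two are distinct types and do not convert to each other implicitly.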
(34)
Testing
• Test the normal case
• Test with all containers (don’t forget string, hash containers, stack, etc.)
• Test with different objects T, particularly those w/ non-trivial dtors
• Test edge cases like list::splice
• Verify that your version is better!
• Allocator test framework:
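A minimal sketch of what such tests might look like; this is not the framework the slide refers to. CountingAlloc, AllocStats and exercise are hypothetical names, and the only property checked is that every block allocated by a container is eventually returned.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <new>
#include <string>
#include <vector>

// Hypothetical bookkeeping shared by every CountingAlloc<T>.
struct AllocStats { static long outstanding; static long total; };
long AllocStats::outstanding = 0;
long AllocStats::total = 0;

template <typename T>
struct CountingAlloc
{
    typedef T value_type;
    CountingAlloc() {}
    template <typename U> CountingAlloc(const CountingAlloc<U>&) {}

    T* allocate(std::size_t n)
    {
        T* p = static_cast<T*>(::operator new(n * sizeof(T)));
        ++AllocStats::outstanding;
        ++AllocStats::total;
        return p;
    }
    void deallocate(T* p, std::size_t)
    {
        --AllocStats::outstanding;
        ::operator delete(p);
    }
};
template <typename T, typename U>
bool operator==(const CountingAlloc<T>&, const CountingAlloc<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAlloc<T>&, const CountingAlloc<U>&) { return false; }

// Fill, erase, copy and clear one container type, then let it go out of
// scope. Returns the number of blocks that were not given back.
template <typename Container>
long exercise(const typename Container::value_type& sample)
{
    const long before = AllocStats::outstanding;
    {
        Container c;
        for (int i = 0; i < 100; ++i) c.insert(c.end(), sample);
        c.erase(c.begin());
        Container copy(c);
        c.clear();
        assert(!copy.empty());
    }
    return AllocStats::outstanding - before;
}

// Runs the same exercise over several containers; 0 means nothing leaked.
long run_allocator_tests()
{
    typedef std::string Obj;   // element type with a non-trivial destructor
    long leaked = 0;
    leaked += exercise<std::vector<Obj, CountingAlloc<Obj> > >("element");
    leaked += exercise<std::list<Obj, CountingAlloc<Obj> > >("element");
    leaked += exercise<std::basic_string<char, std::char_traits<char>,
                                         CountingAlloc<char> > >('x');
    return leaked;
}
```

A fuller test pass would add the remaining containers, edge cases such as list::splice, and timing against the default allocator.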
(35)
Case Study
• In-place allocator
– Hand off existing memory block
– Dole out allocations from the block
– Never free
• Example usage
typedef InPlaceAlloc< int > IPA;
void* p = malloc( 1024 );
list< int, IPA > x( IPA( p, 1024 ) );
x.push_back( 1 );
free( p );
(36)
In-Place Allocator
• Problems
– Fails w/ multiple concurrent copies
– No copy constructor
– Didn’t support comparison
– Didn’t handle containers of void*
• Correct implementation
– Reference counted
– Copy constructor implemented
– Comparison operators
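One way the corrected design might look, as a sketch rather than the implementation from the talk. Assumptions: std::shared_ptr (C++11) provides the reference count, Block is a hypothetical bookkeeping struct, and the caller's block is aligned at least as strictly as any type allocated from it, which holds for malloc and ordinary types. Containers of void* are not addressed here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <list>
#include <memory>
#include <new>

// Shared bookkeeping for one caller-supplied block. Every copy of the
// allocator points at the same Block, so all copies share one cursor.
struct Block
{
    char* base;
    std::size_t size;
    std::size_t used;
    Block(void* p, std::size_t n)
        : base(static_cast<char*>(p)), size(n), used(0) {}
};

template <typename T>
class InPlaceAlloc
{
public:
    typedef T value_type;

    InPlaceAlloc(void* p, std::size_t bytes)
        : m_block(std::make_shared<Block>(p, bytes)) {}

    template <typename U>
    InPlaceAlloc(const InPlaceAlloc<U>& other) : m_block(other.m_block) {}

    T* allocate(std::size_t n)
    {
        // Round the cursor up to T's alignment, then bump it.
        const std::size_t align = alignof(T);
        const std::size_t start = (m_block->used + align - 1) / align * align;
        if (start > m_block->size ||
            n > (m_block->size - start) / sizeof(T))
            throw std::bad_alloc();          // block exhausted
        m_block->used = start + n * sizeof(T);
        return reinterpret_cast<T*>(m_block->base + start);
    }

    void deallocate(T*, std::size_t) {}      // never free

    std::size_t used() const { return m_block->used; }
    const Block* block() const { return m_block.get(); }

private:
    template <typename U> friend class InPlaceAlloc;
    std::shared_ptr<Block> m_block;          // reference-counted state
};

template <typename T, typename U>
bool operator==(const InPlaceAlloc<T>& a, const InPlaceAlloc<U>& b)
{ return a.block() == b.block(); }
template <typename T, typename U>
bool operator!=(const InPlaceAlloc<T>& a, const InPlaceAlloc<U>& b)
{ return !(a == b); }

// Mirrors the example usage, with the list destroyed before the block is freed.
bool in_place_demo()
{
    typedef InPlaceAlloc<int> IPA;
    void* p = std::malloc(1024);
    if (p == 0) return false;
    bool ok = false;
    {
        IPA a(p, 1024);
        std::list<int, IPA> x(a);
        x.push_back(1);
        x.push_back(2);
        ok = x.front() == 1 && x.back() == 2
             && a.used() > 0                 // the list's copy moved our cursor
             && x.get_allocator() == a;
    }
    std::free(p);
    return ok;
}
```

Sharing the cursor through a reference-counted Block is what keeps multiple concurrent copies consistent: the container's rebound copy and the caller's original both advance the same position.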
(37)
In-Place Summary
• Speed
– Scenario: add x elements, remove half
– About 50x faster than default allocator!
• Advantages
– Fast; no overhead; no fragmentation
– Whatever memory you want
• Disadvantages
– Proper implementation isn’t easy
(38)
Recommendations
• Allocators: a last resort optimization
• Base your allocator on <memory>
• Beware porting issues (both compilers and STL vendor libraries)
• Beware allocators with state
• Test thoroughly
(39)
Recommendations part II
• Use typedefs to simplify life
• Don’t forget to write
– Rebind
– Copy constructor
– Templatized copy constructor
– Comparison operators
(40)
References
• C++ Standard section 20.1.5, 20.4.1
• Your STL implementation: <memory>
• GDC Proceedings: References section
• Game Gems III