aboutsummaryrefslogtreecommitdiff
path: root/libstdc++-v3/docs/html/ext/ballocator_doc.txt
diff options
context:
space:
mode:
Diffstat (limited to 'libstdc++-v3/docs/html/ext/ballocator_doc.txt')
-rw-r--r--libstdc++-v3/docs/html/ext/ballocator_doc.txt374
1 files changed, 0 insertions, 374 deletions
diff --git a/libstdc++-v3/docs/html/ext/ballocator_doc.txt b/libstdc++-v3/docs/html/ext/ballocator_doc.txt
deleted file mode 100644
index 2173b618f4f..00000000000
--- a/libstdc++-v3/docs/html/ext/ballocator_doc.txt
+++ /dev/null
@@ -1,374 +0,0 @@
- BITMAPPED ALLOCATOR
- ===================
-
-2004-03-11 Dhruv Matani <dhruvbird@HotPOP.com>
-
----------------------------------------------------------------------
-
-As this name suggests, this allocator uses a bit-map to keep track of
-the used and unused memory locations for it's book-keeping purposes.
-
-This allocator will make use of 1 single bit to keep track of whether
-it has been allocated or not. A bit 1 indicates free, while 0
-indicates allocated. This has been done so that you can easily check a
-collection of bits for a free block. This kind of Bitmapped strategy
-works best for single object allocations, and with the STL type
-parameterized allocators, we do not need to choose any size for the
-block which will be represented by a single bit. This will be the size
-of the parameter around which the allocator has been
-parameterized. Thus, close to optimal performance will result. Hence,
-this should be used for node based containers which call the allocate
-function with an argument of 1.
-
-The bitmapped allocator's internal pool is exponentially
-growing. Meaning that internally, the blocks acquired from the Free
-List Store will double every time the bitmapped allocator runs out of
-memory.
-
---------------------------------------------------------------------
-
-The macro __GTHREADS decides whether to use Mutex Protection around
-every allocation/deallocation. The state of the macro is picked up
-automatically from the gthr abstration layer.
-
-----------------------------------------------------------------------
-
-What is the Free List Store?
-----------------------------
-
-The Free List Store (referred to as FLS for the remaining part of this
-document) is the Global memory pool that is shared by all instances of
-the bitmapped allocator instantiated for any type. This maintains a
-sorted order of all free memory blocks given back to it by the
-bitmapped allocator, and is also responsible for giving memory to the
-bitmapped allocator when it asks for more.
-
-Internally, there is a Free List threshold which indicates the Maximum
-number of free lists that the FLS can hold internally
-(cache). Currently, this value is set at 64. So, if there are more
-than 64 free lists coming in, then some of them will be given back to
-the OS using operator delete so that at any given time the Free List's
-size does not exceed 64 entries. This is done because a Binary Search
-is used to locate an entry in a free list when a request for memory
-comes along. Thus, the run-time complexity of the search would go up
-given an increasing size, for 64 entries however, lg(64) == 6
-comparisons are enough to locate the correct free list if it exists.
-
-Suppose the free list size has reached it's threshold, then the
-largest block from among those in the list and the new block will be
-selected and given back to the OS. This is done because it reduces
-external fragmentation, and allows the OS to use the larger blocks
-later in an orderly fashion, possibly merging them later. Also, on
-some systems, large blocks are obtained via calls to mmap, so giving
-them back to free system resources becomes most important.
-
-The function _S_should_i_give decides the policy that determines
-whether the current block of memory should be given to the allocator
-for the request that it has made. That's because we may not always
-have exact fits for the memory size that the allocator requests. We do
-this mainly to prevent external fragmentation at the cost of a little
-internal fragmentation. Now, the value of this internal fragmentation
-has to be decided by this function. I can see 3 possibilities right
-now. Please add more as and when you find better strategies.
-
-1. Equal size check. Return true only when the 2 blocks are of equal
- size.
-
-2. Difference Threshold: Return true only when the _block_size is
- greater than or equal to the _required_size, and if the _BS is >
- _RS by a difference of less than some THRESHOLD value, then return
- true, else return false.
-
-3. Percentage Threshold. Return true only when the _block_size is
- greater than or equal to the _required_size, and if the _BS is >
- _RS by a percentage of less than some THRESHOLD value, then return
- true, else return false.
-
-Currently, (3) is being used with a value of 36% Maximum wastage per
-Super Block.
-
---------------------------------------------------------------------
-
-1) What is a super block? Why is it needed?
-
- A super block is the block of memory acquired from the FLS from
- which the bitmap allocator carves out memory for single objects and
- satisfies the user's requests. These super blocks come in sizes that
- are powers of 2 and multiples of 32 (_Bits_Per_Block). Yes both at
- the same time! That's because the next super block acquired will be
- 2 times the previous one, and also all super blocks have to be
- multiples of the _Bits_Per_Block value.
-
-2) How does it interact with the free list store?
-
- The super block is contained in the FLS, and the FLS is responsible
- for getting / returning Super Bocks to and from the OS using
- operator new as defined by the C++ standard.
-
----------------------------------------------------------------------
-
-How does the allocate function Work?
-------------------------------------
-
-The allocate function is specialized for single object allocation
-ONLY. Thus, ONLY if n == 1, will the bitmap_allocator's specialized
-algorithm be used. Otherwise, the request is satisfied directly by
-calling operator new.
-
-Suppose n == 1, then the allocator does the following:
-
-1. Checks to see whether the a free block exists somewhere in a region
- of memory close to the last satisfied request. If so, then that
- block is marked as allocated in the bit map and given to the
- user. If not, then (2) is executed.
-
-2. Is there a free block anywhere after the current block right upto
- the end of the memory that we have? If so, that block is found, and
- the same procedure is applied as above, and returned to the
- user. If not, then (3) is executed.
-
-3. Is there any block in whatever region of memory that we own free?
- This is done by checking (a) The use count for each super block,
- and if that fails then (b) The individual bit-maps for each super
- block. Note: Here we are never touching any of the memory that the
- user will be given, and we are confining all memory accesses to a
- small region of memory! This helps reduce cache misses. If this
- succeeds then we apply the same procedure on that bit-map as (1),
- and return that block of memory to the user. However, if this
- process fails, then we resort to (4).
-
-4. This process involves Refilling the internal exponentially growing
- memory pool. The said effect is achieved by calling _S_refill_pool
- which does the following:
- (a). Gets more memory from the Global Free List of the
- Required size.
- (b). Adjusts the size for the next call to itself.
- (c). Writes the appropriate headers in the bit-maps.
- (d). Sets the use count for that super-block just allocated
- to 0 (zero).
- (e). All of the above accounts to maintaining the basic
- invariant for the allocator. If the invariant is
- maintained, we are sure that all is well.
- Now, the same process is applied on the newly acquired free blocks,
- which are dispatched accordingly.
-
-Thus, you can clearly see that the allocate function is nothing but a
-combination of the next-fit and first-fit algorithm optimized ONLY for
-single object allocations.
-
-
--------------------------------------------------------------------------
-
-How does the deallocate function work?
---------------------------------------
-
-The deallocate function again is specialized for single objects ONLY.
-For all n belonging to > 1, the operator delete is called without
-further ado, and the deallocate function returns.
-
-However for n == 1, a series of steps are performed:
-
-1. We first need to locate that super-block which holds the memory
- location given to us by the user. For that purpose, we maintain a
- static variable _S_last_dealloc_index, which holds the index into
- the vector of block pairs which indicates the index of the last
- super-block from which memory was freed. We use this strategy in
- the hope that the user will deallocate memory in a region close to
- what he/she deallocated the last time around. If the check for
- belongs_to succeeds, then we determine the bit-map for the given
- pointer, and locate the index into that bit-map, and mark that bit
- as free by setting it.
-
-2. If the _S_last_dealloc_index does not point to the memory block
- that we're looking for, then we do a linear search on the block
- stored in the vector of Block Pairs. This vector in code is called
- _S_mem_blocks. When the corresponding super-block is found, we
- apply the same procedure as we did for (1) to mark the block as
- free in the bit-map.
-
-Now, whenever a block is freed, the use count of that particular super
-block goes down by 1. When this use count hits 0, we remove that super
-block from the list of all valid super blocks stored in the
-vector. While doing this, we also make sure that the basic invariant
-is maintained by making sure that _S_last_request and
-_S_last_dealloc_index point to valid locations within the vector.
-
---------------------------------------------------------------------
-
-
-Data Layout for a Super Block:
-==============================
-
-Each Super Block will be of some size that is a multiple of the number
-of Bits Per Block. Typically, this value is chosen as Bits_Per_Byte X
-sizeof(unsigned int). On an X86 system, this gives the figure
-8 X 4 = 32. Thus, each Super Block will be of size 32 X Some_Value.
-This Some_Value is sizeof(value_type). For now, let it be called 'K'.
-Thus, finally, Super Block size is 32 X K bytes.
-
-This value of 32 has been chosen because each unsigned int has 32-bits
-and Maximum use of these can be made with such a figure.
-
-Consider a block of size 32 ints.
-In memory, it would look like this:
-
----------------------------------------------------------------------
-| 136 | 0 | 4294967295 | Data-> Space for 32-ints |
----------------------------------------------------------------------
-
-The first Columns represents the size of the Block in bytes as seen by
-the Bitmap Allocator. Internally, a global free list is used to keep
-track of the free blocks used and given back by the bitmap
-allocator. It is this Free List Store that is responsible for writing
-and managing this information. Actually the number of bytes allocated
-in this case would be: 4 + 4 + 4 + 32*4 = 140 bytes, but the first 4
-bytes are an addition by the Free List Store, so the Bitmap Allocator
-sees only 136 bytes. These first 4 bytes about which the bitmapped
-allocator is not aware hold the value 136.
-
-What do the remaining values represent?
----------------------------------------
-
-The 2nd 4 in the expression is the sizeof(unsigned int) because the
-Bitmapped Allocator maintains a used count for each Super Block, which
-is initially set to 0 (as indicated in the diagram). This is
-incremented every time a block is removed from this super block
-(allocated), and decremented whenever it is given back. So, when the
-used count falls to 0, the whole super block will be given back to the
-Free List Store.
-
-The value 4294967295 represents the integer corresponding to the
-bit representation of all bits set: 11111111111111111111111111111111.
-
-The 3rd 4 is size of the bitmap itself, which is the size of 32-bits,
-which is 4-bytes, or 1 X sizeof(unsigned int).
-
-
---------------------------------------------------------------------
-
-Another issue would be whether to keep the all bitmaps in a separate
-area in memory, or to keep them near the actual blocks that will be
-given out or allocated for the client. After some testing, I've
-decided to keep these bitmaps close to the actual blocks. this will
-help in 2 ways.
-
-1. Constant time access for the bitmap themselves, since no kind of
- look up will be needed to find the correct bitmap list or it's
- equivalent.
-
-2. And also this would preserve the cache as far as possible.
-
-So in effect, this kind of an allocator might prove beneficial from a
-purely cache point of view. But this allocator has been made to try
-and roll out the defects of the node_allocator, wherein the nodes get
-skewed about in memory, if they are not returned in the exact reverse
-order or in the same order in which they were allocated. Also, the
-new_allocator's book keeping overhead is too much for small objects
-and single object allocations, though it preserves the locality of
-blocks very well when they are returned back to the allocator.
-
--------------------------------------------------------------------
-
-Expected overhead per block would be 1 bit in memory. Also, once
-the address of the free list has been found, the cost for
-allocation/deallocation would be negligible, and is supposed to be
-constant time. For these very reasons, it is very important to
-minimize the linear time costs, which include finding a free list
-with a free block while allocating, and finding the corresponding
-free list for a block while deallocating. Therefore, I have decided
-that the growth of the internal pool for this allocator will be
-exponential as compared to linear for node_allocator. There, linear
-time works well, because we are mainly concerned with speed of
-allocation/deallocation and memory consumption, whereas here, the
-allocation/deallocation part does have some linear/logarithmic
-complexity components in it. Thus, to try and minimize them would
-be a good thing to do at the cost of a little bit of memory.
-
-Another thing to be noted is the the pool size will double every time
-the internal pool gets exhausted, and all the free blocks have been
-given away. The initial size of the pool would be sizeof(unsigned
-int)*8 which is the number of bits in an integer, which can fit
-exactly in a CPU register. Hence, the term given is exponential growth
-of the internal pool.
-
----------------------------------------------------------------------
-
-After reading all this, you may still have a few questions about the
-internal working of this allocator, like my friend had!
-
-Well here are the exact questions that he posed:
-
-1) The "Data Layout" section is cryptic. I have no idea of what you
- are trying to say. Layout of what? The free-list? Each bitmap? The
- Super Block?
-
- The layout of a Super Block of a given size. In the example, a super
- block of size 32 X 1 is taken. The general formula for calculating
- the size of a super block is 32*sizeof(value_type)*2^n, where n
- ranges from 0 to 32 for 32-bit systems.
-
-2) And since I just mentioned the term `each bitmap', what in the
- world is meant by it? What does each bitmap manage? How does it
- relate to the super block? Is the Super Block a bitmap as well?
-
- Good question! Each bitmap is part of a Super Block which is made up
- of 3 parts as I have mentioned earlier. Re-iterating, 1. The use
- count, 2. The bit-map for that Super Block. 3. The actual memory
- that will be eventually given to the user. Each bitmap is a multiple
- of 32 in size. If there are 32*(2^3) blocks of single objects to be
- given, there will be '32*(2^3)' bits present. Each 32 bits managing
- the allocated / free status for 32 blocks. Since each unsigned int
- contains 32-bits, one unsigned int can manage upto 32 blocks'
- status. Each bit-map is made up of a number of unsigned ints, whose
- exact number for a super-block of a given size I have just
- mentioned.
-
-3) How do the allocate and deallocate functions work in regard to
- bitmaps?
-
- The allocate and deallocate functions manipulate the bitmaps and have
- nothing to do with the memory that is given to the user. As I have
- earlier mentioned, a 1 in the bitmap's bit field indicates free,
- while a 0 indicates allocated. This lets us check 32 bits at a time
- to check whether there is at lease one free block in those 32 blocks
- by testing for equality with (0). Now, the allocate function will
- given a memory block find the corresponding bit in the bitmap, and
- will reset it (ie. make it re-set (0)). And when the deallocate
- function is called, it will again set that bit after locating it to
- indicate that that particular block corresponding to this bit in the
- bit-map is not being used by anyone, and may be used to satisfy
- future requests.
-
-----------------------------------------------------------------------
-
-(Tech-Stuff, Please stay out if you are not interested in the
-selection of certain constants. This has nothing to do with the
-algorithm per-se, only with some vales that must be chosen correctly
-to ensure that the allocator performs well in a real word scenario,
-and maintains a good balance between the memory consumption and the
-allocation/deallocation speed).
-
-The formula for calculating the maximum wastage as a percentage:
-
-(32 X k + 1) / (2 X (32 X k + 1 + 32 X c)) X 100.
-
-Where,
- k => The constant overhead per node. eg. for list, it is 8
- bytes, and for map it is 12 bytes.
- c => The size of the base type on which the map/list is
- instantiated. Thus, suppose the the type1 is int and type2 is
- double, they are related by the relation sizeof(double) ==
- 2*sizeof(int). Thus, all types must have this double size
- relation for this formula to work properly.
-
-Plugging-in: For List: k = 8 and c = 4 (int and double), we get:
-33.376%
-
-For map/multimap: k = 12, and c = 4 (int and double), we get:
-37.524%
-
-Thus, knowing these values, and based on the sizeof(value_type), we
-may create a function that returns the Max_Wastage_Percentage for us
-to use.
-
-