DragonFly BSD
DragonFly kernel List (threaded) for 2003-09

Re: SLAB allocator now the default.


From: Matthew Dillon <dillon@xxxxxxxxxxxxxxxxxxxx>
Date: Sun, 28 Sep 2003 09:59:56 -0700 (PDT)

:Hiten,
:
:> 	Matt,
:>
:> 	I haven't looked at the Slab-alloc code recently, but I am
:> 	wondering if you are planning to remove the power-of-2 alignment
:> 	stuff out of malloc()?
:>
:
:Doesn't power of two alignment make it easier on some CPUs internally to
:access data?  If a person wanted to access a word overlapping two words'
:alignment would it not take two bus accesses to get that data [or two
:cache reads... what have you]?
:
:Do the benefits of not having power of two alignment outweigh this?
:...
:
:> 	From the many papers I have been reading in the last couple of
:> 	weeks, it seems that power-of-2 alignment seems to cause a lot
:> 	of performance degradation in SMP cases, and cache issues.

    It depends on which power of 2 you are talking about.  Generally speaking,
    there is a benefit to be had when data objects fit entirely in cache
    lines.  A cache line is typically 8, 16, or 32 bytes wide depending on the 
    architecture.  There is also a benefit to the location of the initial
    data access within the cache line... that is, accessing the first word
    of a multi-word burst being loaded into the cache line from external
    memory will often unclog instruction flow earlier, but whether
    the 'first' word is the low address of the cache line or the high address
    depends on the architecture.  e.g. cache lines are loaded backwards on
    MIPS.
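
    As a rough illustration of the "fits entirely in a cache line" point,
    the sketch below counts how many cache lines an object touches for a
    given start address.  The helper name cache_lines_touched() and the
    32-byte line size used in the examples are assumptions made for
    illustration only; they are not taken from any kernel source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of cache lines touched by an object of
 * 'size' bytes starting at byte address 'addr', for a cache line of
 * 'line' bytes.  'line' must be a nonzero power of 2.
 */
static size_t
cache_lines_touched(uintptr_t addr, size_t size, size_t line)
{
    assert(size > 0);
    assert(line != 0 && (line & (line - 1)) == 0);

    uintptr_t first = addr / line;              /* line holding the first byte */
    uintptr_t last = (addr + size - 1) / line;  /* line holding the last byte */

    return (size_t)(last - first + 1);
}
```

    With an assumed 32-byte line, a 16-byte object at offset 0 or 16 stays
    within one line, while the same object at offset 24 straddles two, and
    a 4-byte word at offset 30 also needs two line accesses.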

    Larger alignments can create performance penalties, and that is the
    penalty being discussed above.  By larger alignments I am
    talking about the case where, say, you try to allocate 800 bytes and the
    allocation is thrown into a 1K block (which is what the old kernel
    malloc did).  If you are trying to allocate 1024 bytes then presumably
    you intend to use all 1K and you might as well 1K align it since there
    is no data loss and no likely performance loss either.  Also, once you
    reach PAGE_SIZE you almost always want to take advantage of the VM system
    to allocate whole pages.  The slab allocator does this for power-of-2
    sized requests beyond PAGE_SIZE but does NOT page-align oddly sized
    requests (like a 6K request) beyond PAGE_SIZE, at least until the requests
    get large (greater than 16K).
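
    To make the 800-byte case concrete, the sketch below compares rounding
    a request up to the next power of 2 against rounding it up to a
    multiple of a smaller chunk size.  The function names and the 64-byte
    chunk used in the example are hypothetical; the chunk sizes the slab
    allocator actually uses are not stated here.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next power of 2 (n must be nonzero). */
static size_t
round_pow2(size_t n)
{
    size_t p = 1;

    assert(n > 0);
    while (p < n)
        p <<= 1;
    return p;
}

/*
 * Round n up to a multiple of 'chunk'.  'chunk' is a hypothetical
 * allocation granularity and must be a nonzero power of 2.
 */
static size_t
round_chunk(size_t n, size_t chunk)
{
    assert(chunk != 0 && (chunk & (chunk - 1)) == 0);
    return (n + chunk - 1) & ~(chunk - 1);
}
```

    Under these assumptions an 800-byte request rounds to 1024 bytes with
    power-of-2 sizing (224 bytes unused) but to 832 bytes with 64-byte
    chunking (32 bytes unused).  A 1024-byte request is unchanged either
    way, which is why nothing is lost by 1K-aligning it.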

    So keeping the power-of-2-allocation-is-power-of-2-aligned characteristic
    is reasonable for power-of-2-sized requests.
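
    One way to express that characteristic as a check is sketched below.
    The names and the 8-byte minimum alignment for non-power-of-2 sizes
    are assumptions for illustration, not values from the allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ASSUMED_MIN_ALIGN 8  /* hypothetical minimum alignment */

/* True if n is a nonzero power of 2. */
static int
is_pow2(size_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Alignment assumed for a request: power-of-2 sizes are aligned to
 * their own size, anything else only to the assumed minimum.
 */
static size_t
assumed_alignment(size_t size)
{
    return is_pow2(size) ? size : ASSUMED_MIN_ALIGN;
}

/* True if addr is a multiple of align (align must be a power of 2). */
static int
is_aligned(uintptr_t addr, size_t align)
{
    assert(is_pow2(align));
    return (addr & (align - 1)) == 0;
}
```

    For example, a 1024-byte request would be expected at an address such
    as 0x4000, whereas an 800-byte request could legitimately land at
    0x4100.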

					-Matt
					Matthew Dillon 
					<dillon@xxxxxxxxxxxxx>


