SLUB: Place kmem_cache_cpu structures in a NUMA aware way

author	Christoph Lameter <clameter@sgi.com>	2007-10-16 01:26:08 -0700
committer	Linus Torvalds <torvalds@woody.linux-foundation.org>	2007-10-16 09:43:01 -0700
commit	4c93c355d5d563f300df7e61ef753d7a064411e9 (patch)
tree	24bcdbed58a51c69640da9c8e220dd5ce0c054a7 /include
parent	ee3c72a14bfecdf783738032ff3c73ef6412f5b3 (diff)

SLUB: Place kmem_cache_cpu structures in a NUMA aware way

The kmem_cache_cpu structures introduced earlier are currently an array
embedded in struct kmem_cache. This means that most kmem_cache_cpu
structures are on the wrong node on systems with a larger number of nodes.
These are performance critical structures since the per cpu information has
to be touched for every alloc and free in a slab.

In order to place the kmem_cache_cpu structure optimally we put an array
of pointers to kmem_cache_cpu structs in kmem_cache (similar to SLAB).
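
The layout change can be sketched in plain C as follows. This is a minimal
userspace illustration, not the kernel code: the field list of
kmem_cache_cpu is simplified, NR_CPUS is set to a small hypothetical value,
and the struct names kmem_cache_embedded and kmem_cache_indirect as well as
the accessor get_cpu_slab() are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4	/* hypothetical CPU count for this sketch */

/* Simplified stand-in for the per cpu slab state. */
struct kmem_cache_cpu {
	void **freelist;
	void *page;
	int node;
};

/*
 * Old layout: the per cpu structures live inside the cache descriptor,
 * so all of them sit on whichever node holds the descriptor.
 */
struct kmem_cache_embedded {
	unsigned long flags;
	struct kmem_cache_cpu cpu_slab[NR_CPUS];
};

/*
 * New layout: the descriptor holds only pointers. Each structure can be
 * allocated on the node of its cpu, and the pointer array itself is not
 * written after setup.
 */
struct kmem_cache_indirect {
	unsigned long flags;
	struct kmem_cache_cpu *cpu_slab[NR_CPUS];
};

/* One extra pointer load on the alloc and free paths. */
static struct kmem_cache_cpu *get_cpu_slab(struct kmem_cache_indirect *s,
					   int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return s->cpu_slab[cpu];
}
```

The cost of the indirection is the additional read of the pointer array,
which is the item listed under Cons below.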

However, the kmem_cache_cpu structures can now be allocated in a more
intelligent way.

We would like to put per cpu structures for the same cpu but different
slab caches together in the same cachelines to save space and decrease the
cache footprint. However, the slab allocator itself only controls
allocations per node. We therefore set up a simple per cpu array for every
processor with 100 per cpu structures, which is usually enough to place
them all correctly. If we run out then we fall back to kmalloc_node. This
also solves the bootstrap problem since we do not have to use slab
allocator functions early in boot to get memory for the small per cpu
structures.
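
The pool-with-fallback scheme described above can be sketched in userspace
C like this. It is an illustration under stated assumptions, not the patch
itself: a static two-dimensional array stands in for the per cpu area,
malloc() and free() stand in for kmalloc_node() and kfree(), and the names
pool, pool_free, init_pool(), alloc_cpu_struct() and free_cpu_struct() are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_CPUS 2		/* hypothetical CPU count */
#define NR_KMEM_CACHE_CPU 100	/* pool size named in the text */

struct kmem_cache_cpu {
	void **freelist;	/* reused as the "next free" link while unused */
	void *page;
	int node;
};

/* Stand-in for the per cpu area: one densely packed pool per cpu. */
static struct kmem_cache_cpu pool[NR_CPUS][NR_KMEM_CACHE_CPU];
static struct kmem_cache_cpu *pool_free[NR_CPUS];

/* True if c points into the static pool of the given cpu. */
static int in_pool(int cpu, const struct kmem_cache_cpu *c)
{
	uintptr_t p = (uintptr_t)c;

	return p >= (uintptr_t)&pool[cpu][0] &&
	       p < (uintptr_t)&pool[cpu][0] + sizeof(pool[cpu]);
}

/* Chain all pool entries of one cpu into a free list, lowest slot first. */
static void init_pool(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	pool_free[cpu] = NULL;
	for (int i = NR_KMEM_CACHE_CPU - 1; i >= 0; i--) {
		pool[cpu][i].freelist = (void **)pool_free[cpu];
		pool_free[cpu] = &pool[cpu][i];
	}
}

/* Take a structure from the pool, or fall back to the heap when empty. */
static struct kmem_cache_cpu *alloc_cpu_struct(int cpu)
{
	struct kmem_cache_cpu *c = pool_free[cpu];

	if (c)
		pool_free[cpu] = (struct kmem_cache_cpu *)c->freelist;
	else
		c = malloc(sizeof(*c));	/* stand-in for kmalloc_node() */
	if (c) {
		c->freelist = NULL;
		c->page = NULL;
		c->node = 0;
	}
	return c;
}

/* Return a structure to the pool it came from, or free a fallback one. */
static void free_cpu_struct(int cpu, struct kmem_cache_cpu *c)
{
	if (!in_pool(cpu, c)) {
		free(c);	/* stand-in for kfree() */
		return;
	}
	c->freelist = (void **)pool_free[cpu];
	pool_free[cpu] = c;
}
```

Because the pool is static, the first 100 requests per cpu need no
allocator call at all, which is what makes the scheme usable early in boot.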

Pro:
	- NUMA aware placement improves memory performance
	- All global structures in struct kmem_cache become readonly
	- Dense packing of per cpu structures reduces cacheline
	  footprint in SMP and NUMA.
	- Potential avoidance of exclusive cacheline fetches
	  on the free and alloc hotpath since multiple kmem_cache_cpu
	  structures are in one cacheline. This is particularly important
	  for the kmalloc array.

Cons:
	- Additional reference to one read only cacheline (per cpu
	  array of pointers to kmem_cache_cpu) in both slab_alloc()
	  and slab_free().

[akinobu.mita@gmail.com: fix cpu hotplug offline/online path]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: "Pekka Enberg" <penberg@cs.helsinki.fi>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Diffstat (limited to 'include')

-rw-r--r--	include/linux/slub_def.h

1 files changed, 6 insertions, 3 deletions

