diff options
author | Haicheng Li <haicheng.li@linux.intel.com> | 2010-05-24 14:32:52 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2010-05-25 08:07:02 -0700 |
commit | 4eaf3f64397c3db3c5785eee508270d62a9fabd9 (patch) | |
tree | bfd986a7e974876755ea6fe0de394199c68e2e36 /mm/memory_hotplug.c | |
parent | 1f522509c77a5dea8dc384b735314f03908a6415 (diff) | |
download | linux-4eaf3f64397c3db3c5785eee508270d62a9fabd9.tar.gz |
mem-hotplug: fix potential race while building zonelist for new populated zone
Add global mutex zonelists_mutex to fix the possible race: CPU0 CPU1 CPU2 (1) zone->present_pages += online_pages; (2) build_all_zonelists(); (3) alloc_page(); (4) free_page(); (5) build_all_zonelists(); (6) __build_all_zonelists(); (7) zone->pageset = alloc_percpu(); In step (3,4), zone->pageset still points to boot_pageset, so bad things may happen if 2+ nodes are in this state. Even if only 1 node is accessing the boot_pageset, (3) may still consume too much memory to fail the memory allocations in step (7). Besides, atomic operation ensures alloc_percpu() in step (7) will never fail since there is a new fresh memory block added in step(6). [haicheng.li@linux.intel.com: hold zonelists_mutex when build_all_zonelists] Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Reviewed-by: Andi Kleen <andi.kleen@intel.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'mm/memory_hotplug.c')
-rw-r--r-- | mm/memory_hotplug.c | 11 |
1 files changed, 3 insertions, 8 deletions
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 089cc97aed3c..a4cfcdc00455 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -389,11 +389,6 @@ int online_pages(unsigned long pfn, unsigned long nr_pages) int nid; int ret; struct memory_notify arg; - /* - * mutex to protect zone->pageset when it's still shared - * in onlined_pages() - */ - static DEFINE_MUTEX(zone_pageset_mutex); arg.start_pfn = pfn; arg.nr_pages = nr_pages; @@ -420,14 +415,14 @@ int online_pages(unsigned long pfn, unsigned long nr_pages) * This means the page allocator ignores this zone. * So, zonelist must be updated after online. */ - mutex_lock(&zone_pageset_mutex); + mutex_lock(&zonelists_mutex); if (!populated_zone(zone)) need_zonelists_rebuild = 1; ret = walk_system_ram_range(pfn, nr_pages, &onlined_pages, online_pages_range); if (ret) { - mutex_unlock(&zone_pageset_mutex); + mutex_unlock(&zonelists_mutex); printk(KERN_DEBUG "online_pages %lx at %lx failed\n", nr_pages, pfn); memory_notify(MEM_CANCEL_ONLINE, &arg); @@ -441,7 +436,7 @@ int online_pages(unsigned long pfn, unsigned long nr_pages) else zone_pcp_update(zone); - mutex_unlock(&zone_pageset_mutex); + mutex_unlock(&zonelists_mutex); setup_per_zone_wmarks(); calculate_zone_inactive_ratio(zone); if (onlined_pages) { |