summary refs log tree commit diff
path: root/mm/madvise.c
diff options
context:
space:
mode:
authorzhong jiang <zhongjiang@huawei.com>2019-11-15 17:34:36 -0800
committerLinus Torvalds <torvalds@linux-foundation.org>2019-11-15 18:33:59 -0800
commit82072962973008201b817fae1896512977dd5083 (patch)
treed3349c9bac8fbe1e93bffa632f9310056ef03dec /mm/madvise.c
parenta85dfc305a21acfc48fa28a0fa0a0cb6ad496120 (diff)
downloadlinux-82072962973008201b817fae1896512977dd5083.tar.gz
mm: fix trying to reclaim unevictable lru page when calling madvise_pageout
Recently, I hit the following issue when running upstream.

  kernel BUG at mm/vmscan.c:1521!
  invalid opcode: 0000 [#1] SMP KASAN PTI
  CPU: 0 PID: 23385 Comm: syz-executor.6 Not tainted 5.4.0-rc4+ #1
  RIP: 0010:shrink_page_list+0x12b6/0x3530 mm/vmscan.c:1521
  Call Trace:
   reclaim_pages+0x499/0x800 mm/vmscan.c:2188
   madvise_cold_or_pageout_pte_range+0x58a/0x710 mm/madvise.c:453
   walk_pmd_range mm/pagewalk.c:53 [inline]
   walk_pud_range mm/pagewalk.c:112 [inline]
   walk_p4d_range mm/pagewalk.c:139 [inline]
   walk_pgd_range mm/pagewalk.c:166 [inline]
   __walk_page_range+0x45a/0xc20 mm/pagewalk.c:261
   walk_page_range+0x179/0x310 mm/pagewalk.c:349
   madvise_pageout_page_range mm/madvise.c:506 [inline]
   madvise_pageout+0x1f0/0x330 mm/madvise.c:542
   madvise_vma mm/madvise.c:931 [inline]
   __do_sys_madvise+0x7d2/0x1600 mm/madvise.c:1113
   do_syscall_64+0x9f/0x4c0 arch/x86/entry/common.c:290
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

madvise_pageout() accesses the specified range of the vma and isolates
them, then runs shrink_page_list() to reclaim its memory.  But it also
isolates the unevictable pages to reclaim.  Hence, we can catch the
cases in shrink_page_list().

The root cause is that we scan the page tables instead of specific LRU
list.  and so we need to filter out the unevictable lru pages from our
end.

Link: http://lkml.kernel.org/r/1572616245-18946-1-git-send-email-zhongjiang@huawei.com
Fixes: 1a4e58cce84e ("mm: introduce MADV_PAGEOUT")
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'mm/madvise.c')
-rw-r--r--mm/madvise.c16
1 files changed, 12 insertions, 4 deletions
diff --git a/mm/madvise.c b/mm/madvise.c
index 2be9f3fdb05e..94c343b4c968 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -363,8 +363,12 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 		ClearPageReferenced(page);
 		test_and_clear_page_young(page);
 		if (pageout) {
-			if (!isolate_lru_page(page))
-				list_add(&page->lru, &page_list);
+			if (!isolate_lru_page(page)) {
+				if (PageUnevictable(page))
+					putback_lru_page(page);
+				else
+					list_add(&page->lru, &page_list);
+			}
 		} else
 			deactivate_page(page);
 huge_unlock:
@@ -441,8 +445,12 @@ regular_page:
 		ClearPageReferenced(page);
 		test_and_clear_page_young(page);
 		if (pageout) {
-			if (!isolate_lru_page(page))
-				list_add(&page->lru, &page_list);
+			if (!isolate_lru_page(page)) {
+				if (PageUnevictable(page))
+					putback_lru_page(page);
+				else
+					list_add(&page->lru, &page_list);
+			}
 		} else
 			deactivate_page(page);
 	}