summaryrefslogtreecommitdiffstats
path: root/kernel/Documentation/vm/unevictable-lru.txt
diff options
context:
space:
mode:
Diffstat (limited to 'kernel/Documentation/vm/unevictable-lru.txt')
-rw-r--r--kernel/Documentation/vm/unevictable-lru.txt128
1 files changed, 26 insertions, 102 deletions
diff --git a/kernel/Documentation/vm/unevictable-lru.txt b/kernel/Documentation/vm/unevictable-lru.txt
index 3be0bfc47..fa3b52708 100644
--- a/kernel/Documentation/vm/unevictable-lru.txt
+++ b/kernel/Documentation/vm/unevictable-lru.txt
@@ -467,7 +467,13 @@ mmap(MAP_LOCKED) SYSTEM CALL HANDLING
In addition the mlock()/mlockall() system calls, an application can request
that a region of memory be mlocked supplying the MAP_LOCKED flag to the mmap()
-call. Furthermore, any mmap() call or brk() call that expands the heap by a
+call. There is one important and subtle difference here, though. mmap() + mlock()
+will fail if the range cannot be faulted in (e.g. because mm_populate fails)
+and returns with ENOMEM while mmap(MAP_LOCKED) will not fail. The mmaped
+area will still have properties of the locked area - aka. pages will not get
+swapped out - but major page faults to fault memory in might still happen.
+
+Furthermore, any mmap() call or brk() call that expands the heap by a
task that has previously called mlockall() with the MCL_FUTURE flag will result
in the newly mapped memory being mlocked. Before the unevictable/mlock
changes, the kernel simply called make_pages_present() to allocate pages and
@@ -525,83 +531,20 @@ map.
try_to_unmap() is always called, by either vmscan for reclaim or for page
migration, with the argument page locked and isolated from the LRU. Separate
-functions handle anonymous and mapped file pages, as these types of pages have
-different reverse map mechanisms.
-
- (*) try_to_unmap_anon()
-
- To unmap anonymous pages, each VMA in the list anchored in the anon_vma
- must be visited - at least until a VM_LOCKED VMA is encountered. If the
- page is being unmapped for migration, VM_LOCKED VMAs do not stop the
- process because mlocked pages are migratable. However, for reclaim, if
- the page is mapped into a VM_LOCKED VMA, the scan stops.
-
- try_to_unmap_anon() attempts to acquire in read mode the mmap semaphore of
- the mm_struct to which the VMA belongs. If this is successful, it will
- mlock the page via mlock_vma_page() - we wouldn't have gotten to
- try_to_unmap_anon() if the page were already mlocked - and will return
- SWAP_MLOCK, indicating that the page is unevictable.
-
- If the mmap semaphore cannot be acquired, we are not sure whether the page
- is really unevictable or not. In this case, try_to_unmap_anon() will
- return SWAP_AGAIN.
-
- (*) try_to_unmap_file() - linear mappings
-
- Unmapping of a mapped file page works the same as for anonymous mappings,
- except that the scan visits all VMAs that map the page's index/page offset
- in the page's mapping's reverse map priority search tree. It also visits
- each VMA in the page's mapping's non-linear list, if the list is
- non-empty.
-
- As for anonymous pages, on encountering a VM_LOCKED VMA for a mapped file
- page, try_to_unmap_file() will attempt to acquire the associated
- mm_struct's mmap semaphore to mlock the page, returning SWAP_MLOCK if this
- is successful, and SWAP_AGAIN, if not.
-
- (*) try_to_unmap_file() - non-linear mappings
-
- If a page's mapping contains a non-empty non-linear mapping VMA list, then
- try_to_un{map|lock}() must also visit each VMA in that list to determine
- whether the page is mapped in a VM_LOCKED VMA. Again, the scan must visit
- all VMAs in the non-linear list to ensure that the pages is not/should not
- be mlocked.
-
- If a VM_LOCKED VMA is found in the list, the scan could terminate.
- However, there is no easy way to determine whether the page is actually
- mapped in a given VMA - either for unmapping or testing whether the
- VM_LOCKED VMA actually pins the page.
-
- try_to_unmap_file() handles non-linear mappings by scanning a certain
- number of pages - a "cluster" - in each non-linear VMA associated with the
- page's mapping, for each file mapped page that vmscan tries to unmap. If
- this happens to unmap the page we're trying to unmap, try_to_unmap() will
- notice this on return (page_mapcount(page) will be 0) and return
- SWAP_SUCCESS. Otherwise, it will return SWAP_AGAIN, causing vmscan to
- recirculate this page. We take advantage of the cluster scan in
- try_to_unmap_cluster() as follows:
-
- For each non-linear VMA, try_to_unmap_cluster() attempts to acquire the
- mmap semaphore of the associated mm_struct for read without blocking.
-
- If this attempt is successful and the VMA is VM_LOCKED,
- try_to_unmap_cluster() will retain the mmap semaphore for the scan;
- otherwise it drops it here.
-
- Then, for each page in the cluster, if we're holding the mmap semaphore
- for a locked VMA, try_to_unmap_cluster() calls mlock_vma_page() to
- mlock the page. This call is a no-op if the page is already locked,
- but will mlock any pages in the non-linear mapping that happen to be
- unlocked.
-
- If one of the pages so mlocked is the page passed in to try_to_unmap(),
- try_to_unmap_cluster() will return SWAP_MLOCK, rather than the default
- SWAP_AGAIN. This will allow vmscan to cull the page, rather than
- recirculating it on the inactive list.
-
- Again, if try_to_unmap_cluster() cannot acquire the VMA's mmap sem, it
- returns SWAP_AGAIN, indicating that the page is mapped by a VM_LOCKED
- VMA, but couldn't be mlocked.
+functions handle anonymous and mapped file and KSM pages, as these types of
+pages have different reverse map lookup mechanisms, with different locking.
+In each case, whether rmap_walk_anon() or rmap_walk_file() or rmap_walk_ksm(),
+it will call try_to_unmap_one() for every VMA which might contain the page.
+
+When trying to reclaim, if try_to_unmap_one() finds the page in a VM_LOCKED
+VMA, it will then mlock the page via mlock_vma_page() instead of unmapping it,
+and return SWAP_MLOCK to indicate that the page is unevictable: and the scan
+stops there.
+
+mlock_vma_page() is called while holding the page table's lock (in addition
+to the page lock, and the rmap lock): to serialize against concurrent mlock or
+munlock or munmap system calls, mm teardown (munlock_vma_pages_all), reclaim,
+holepunching, and truncation of file pages and their anonymous COWed pages.
try_to_munlock() REVERSE MAP SCAN
@@ -617,29 +560,15 @@ all PTEs from the page. For this purpose, the unevictable/mlock infrastructure
introduced a variant of try_to_unmap() called try_to_munlock().
try_to_munlock() calls the same functions as try_to_unmap() for anonymous and
-mapped file pages with an additional argument specifying unlock versus unmap
+mapped file and KSM pages with a flag argument specifying unlock versus unmap
processing. Again, these functions walk the respective reverse maps looking
-for VM_LOCKED VMAs. When such a VMA is found for anonymous pages and file
-pages mapped in linear VMAs, as in the try_to_unmap() case, the functions
-attempt to acquire the associated mmap semaphore, mlock the page via
-mlock_vma_page() and return SWAP_MLOCK. This effectively undoes the
-pre-clearing of the page's PG_mlocked done by munlock_vma_page.
-
-If try_to_unmap() is unable to acquire a VM_LOCKED VMA's associated mmap
-semaphore, it will return SWAP_AGAIN. This will allow shrink_page_list() to
-recycle the page on the inactive list and hope that it has better luck with the
-page next time.
-
-For file pages mapped into non-linear VMAs, the try_to_munlock() logic works
-slightly differently. On encountering a VM_LOCKED non-linear VMA that might
-map the page, try_to_munlock() returns SWAP_AGAIN without actually mlocking the
-page. munlock_vma_page() will just leave the page unlocked and let vmscan deal
-with it - the usual fallback position.
+for VM_LOCKED VMAs. When such a VMA is found, as in the try_to_unmap() case,
+the functions mlock the page via mlock_vma_page() and return SWAP_MLOCK. This
+undoes the pre-clearing of the page's PG_mlocked done by munlock_vma_page.
Note that try_to_munlock()'s reverse map walk must visit every VMA in a page's
reverse map to determine that a page is NOT mapped into any VM_LOCKED VMA.
-However, the scan can terminate when it encounters a VM_LOCKED VMA and can
-successfully acquire the VMA's mmap semaphore for read and mlock the page.
+However, the scan can terminate when it encounters a VM_LOCKED VMA.
Although try_to_munlock() might be called a great many times when munlocking a
large region or tearing down a large address space that has been mlocked via
mlockall(), overall this is a fairly rare event.
@@ -667,11 +596,6 @@ Some examples of these unevictable pages on the LRU lists are:
(3) mlocked pages that could not be isolated from the LRU and moved to the
unevictable list in mlock_vma_page().
- (4) Pages mapped into multiple VM_LOCKED VMAs, but try_to_munlock() couldn't
- acquire the VMA's mmap semaphore to test the flags and set PageMlocked.
- munlock_vma_page() was forced to let the page back on to the normal LRU
- list for vmscan to handle.
-
shrink_inactive_list() also diverts any unevictable pages that it finds on the
inactive lists to the appropriate zone's unevictable list.