  1. Sep 09, 2024
  2. Sep 04, 2024
  3. Jul 10, 2024
      tools/mm: introduce a tool to assess swap entry allocation for thp_swapout · 95139d94
      Barry Song authored
      Both Ryan and Chris have been utilizing the small test program to aid in
      debugging and identifying issues with swap entry allocation.  While a real
      or intricate workload might be more suitable for assessing the correctness
      and effectiveness of the swap allocation policy, a small test program
      presents a simpler means of understanding the problem and initially
      verifying the improvements being made.
      
      Let's endeavor to integrate it into tools/mm.  Although it presently only
      accommodates 64KB and 4KB, I'm optimistic that we can expand its
      capabilities to support multiple sizes and simulate more complex systems
      in the future as required.
      
      In outline, the tool does the following:
      
      1. Use MADV_PAGEOUT for rapid swap-out, exercising the swap allocation
         code heavily in a short time.
      
      2. Use MADV_DONTNEED to simulate the behavior of libc and Java heap in
         freeing memory, as well as for munmap, app exits, or OOM killer
         scenarios.  This ensures new mTHP is always generated, released or
         swapped out, similar to the behavior on a PC or Android phone where
         many applications are frequently started and terminated.
      
      3. Swap in with or without the "-a" option to observe how fragments
         due to swap-in and the incoming swap-in of large folios will impact
         swap-out fallback.
      
      Due to 2, we ensure a certain proportion of mTHP.  Similarly, because of
      3, we maintain a certain proportion of small folios, as we don't support
      large folios swap-in, meaning any swap-in will immediately result in small
      folios.  Therefore, with both 2 and 3, we automatically achieve a system
      containing both mTHP and small folios.  Additionally, 1 provides the
      ability to continuously swap them out.
      
      We can also use "-s" to add a dedicated small folios memory area.
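
The two madvise() calls from points 1 and 2 can be sketched in a few lines. This is a hedged illustration only, not the test program from this commit (which is written in C): the function name `exercise_region` and the 64 KiB size are hypothetical, Python 3.8+ on Linux is assumed, and the numeric fallback 21 for MADV_PAGEOUT is the asm-generic Linux value, used because Python's mmap module may not export that constant.

```python
import mmap

# Assumption: Linux asm-generic value of MADV_PAGEOUT; used only if the
# running Python does not export the constant itself.
MADV_PAGEOUT = getattr(mmap, "MADV_PAGEOUT", 21)


def exercise_region(size=64 * 1024):
    """Fault in one private anonymous region, then page it out and drop it.

    Returns a dict saying whether each madvise() call was accepted.
    """
    region = mmap.mmap(
        -1,
        size,
        flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
        prot=mmap.PROT_READ | mmap.PROT_WRITE,
    )
    results = {}
    try:
        # Write to every byte so that all pages are actually allocated.
        region[:] = b"\xa5" * size
        for name, advice in (
            ("pageout", MADV_PAGEOUT),         # step 1: request swap-out
            ("dontneed", mmap.MADV_DONTNEED),  # step 2: free like libc/munmap
        ):
            try:
                region.madvise(advice)
                results[name] = True
            except OSError:
                # Older kernels or restricted sandboxes may reject the advice.
                results[name] = False
    finally:
        region.close()
    return results


if __name__ == "__main__":
    print(exercise_region())
```

Whether MADV_PAGEOUT really moves pages to swap depends on the kernel and on swap being configured, so the sketch reports only whether each call was accepted.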
      
      [akpm@linux-foundation.org: thp_swap_allocator_test.c needs mman.h, per Kairui Song]
      Link: https://lkml.kernel.org/r/20240622071231.576056-2-21cnbao@gmail.com
      
      
      Signed-off-by: Barry Song <v-songbaohua@oppo.com>
      Acked-by: Chris Li <chrisl@kernel.org>
      Tested-by: Chris Li <chrisl@kernel.org>
      Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
      Tested-by: Ryan Roberts <ryan.roberts@arm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "Huang, Ying" <ying.huang@intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Kairui Song <kasong@tencent.com>
      Cc: Kalesh Singh <kaleshsingh@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  4. Feb 22, 2024
      tools/mm: add thpmaps script to dump THP usage info · 2444172c
      Ryan Roberts authored
      With the proliferation of large folios for file-backed memory, and more
      recently the introduction of multi-size THP for anonymous memory, it is
      becoming useful to be able to see exactly how large folios are mapped into
      processes.  For some architectures (e.g.  arm64), if most memory is mapped
      using contpte-sized and -aligned blocks, TLB usage can be optimized so
      it's useful to see where these requirements are and are not being met.
      
      thpmaps is a Python utility that reads /proc/<pid>/smaps,
      /proc/<pid>/pagemap and /proc/kpageflags to print information about how
      transparent huge pages (both file and anon) are mapped to a specified
      process or cgroup.  It aims to help users debug and optimize their
      workloads.  In future we may wish to introduce stats directly into the
      kernel (e.g.  smaps or similar), but for now this provides a short term
      solution without the need to introduce any new ABI.
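
To make the role of those proc files concrete, here is a minimal sketch of decoding a single /proc/<pid>/pagemap entry and testing the THP bit of a /proc/kpageflags entry. It is not taken from thpmaps itself; the helper names `decode_pagemap_entry` and `is_thp` are hypothetical, and the bit positions are the ones given in the kernel's pagemap documentation.

```python
import struct

# Bit layout of one 64-bit /proc/<pid>/pagemap entry.
PM_PFN_MASK = (1 << 55) - 1   # bits 0-54: page frame number (if present)
PM_FILE = 1 << 61             # page is file-mapped or shared anonymous
PM_SWAP = 1 << 62             # page is swapped out
PM_PRESENT = 1 << 63          # page is present in RAM

# Bit 22 of a /proc/kpageflags entry marks a page that is part of a THP.
KPF_THP = 1 << 22


def decode_pagemap_entry(raw):
    """Decode the 8 raw bytes of one pagemap entry (native byte order)."""
    (entry,) = struct.unpack("=Q", raw)
    present = bool(entry & PM_PRESENT)
    return {
        "present": present,
        "swapped": bool(entry & PM_SWAP),
        "file": bool(entry & PM_FILE),
        # The low bits hold a PFN only for present pages, and the kernel
        # reports 0 there to callers without sufficient privilege.
        "pfn": (entry & PM_PFN_MASK) if present else None,
    }


def is_thp(kpageflags_entry):
    """Return True if a kpageflags value has the THP bit set."""
    return bool(kpageflags_entry & KPF_THP)


if __name__ == "__main__":
    sample = struct.pack("=Q", PM_PRESENT | 0x1234)
    print(decode_pagemap_entry(sample))
```

A real reader would seek to `(vaddr // page_size) * 8` in pagemap and to `pfn * 8` in kpageflags before reading each 8-byte entry.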
      
      Run with help option for a full listing of the arguments:
      
          # ./thpmaps --help
      
      --8<--
      usage: thpmaps [-h] [--pid pid | --cgroup path] [--rollup]
                     [--cont size[KMG]] [--inc-smaps] [--inc-empty]
                     [--periodic sleep_ms]
      
      Prints information about how transparent huge pages are mapped, either
      system-wide, or for a specified process or cgroup.
      
      When run with --pid, the user explicitly specifies the set of pids to
      scan.  e.g.  "--pid 10 [--pid 134 ...]".  When run with --cgroup, the user
      passes either a v1 or v2 cgroup and all pids that belong to the cgroup
      subtree are scanned.  When run with neither --pid nor --cgroup, the full
      set of pids on the system is gathered from /proc and scanned as if the
      user had provided "--pid 1 --pid 2 ...".
      
      A default set of statistics is always generated for THP mappings. 
      However, it is also possible to generate additional statistics for
      "contiguous block mappings" where the block size is user-defined.
      
      Statistics are maintained independently for anonymous and file-backed
      (pagecache) memory and are shown both in kB and as a percentage of either
      total anonymous or total file-backed memory as appropriate.
      
      THP Statistics
      --------------
      
      Statistics are always generated for fully- and contiguously-mapped THPs
      whose mapping address is aligned to their size, for each <size> supported
      by the system.  Separate counters describe THPs mapped by PTE vs those
      mapped by PMD.  (Although note a THP can only be mapped by PMD if it is
      PMD-sized):
      
      - anon-thp-pte-aligned-<size>kB
      - file-thp-pte-aligned-<size>kB
      - anon-thp-pmd-aligned-<size>kB
      - file-thp-pmd-aligned-<size>kB
      
      Similarly, statistics are always generated for fully- and contiguously-
      mapped THPs whose mapping address is *not* aligned to their size, for each
      <size> supported by the system.  Due to the unaligned mapping, it is
      impossible to map by PMD, so there are only PTE counters for this case:
      
      - anon-thp-pte-unaligned-<size>kB
      - file-thp-pte-unaligned-<size>kB
      
      Statistics are also always generated for mapped pages that belong to a THP
      but where the THP is *not* fully- and contiguously-mapped.  These
      "partial" mappings are all counted in the same counter regardless of the
      size of the THP that is partially mapped:
      
      - anon-thp-pte-partial
      - file-thp-pte-partial
      
      Contiguous Block Statistics
      ---------------------------
      
      An optional, additional set of statistics is generated for every
      contiguous block size specified with `--cont <size>`.  These statistics
      show how much memory is mapped in contiguous blocks of <size> and also
      aligned to <size>.  A given contiguous block must all belong to the same
      THP, but there is no requirement for it to be the *whole* THP.  Separate
      counters describe contiguous blocks mapped by PTE vs those mapped by PMD:
      
      - anon-cont-pte-aligned-<size>kB
      - file-cont-pte-aligned-<size>kB
      - anon-cont-pmd-aligned-<size>kB
      - file-cont-pmd-aligned-<size>kB
      
      As an example, if monitoring 64K contiguous blocks (--cont 64K), there are
      a number of sources that could provide such blocks: a fully- and
      contiguously-mapped 64K THP that is aligned to a 64K boundary would
      provide 1 block.  A fully- and contiguously-mapped 128K THP that is
      aligned to at least a 64K boundary would provide 2 blocks.  Or a 128K THP
      that maps its first 100K, but contiguously and starting at a 64K boundary
      would provide 1 block.  A fully- and contiguously-mapped 2M THP would
      provide 32 blocks.  There are many other possible permutations.
      
      options:
        -h, --help           show this help message and exit
        --pid pid            Process id of the target process. May be issued
                             multiple times to scan multiple processes. --pid
                             and --cgroup are mutually exclusive. If neither
                             are provided, all processes are scanned to
                             provide system-wide information.
        --cgroup path        Path to the target cgroup in sysfs. Iterates
                             over every pid in the cgroup and its children.
                             --pid and --cgroup are mutually exclusive. If
                             neither are provided, all processes are scanned
                             to provide system-wide information.
        --rollup             Sum the per-vma statistics to provide a summary
                             over the whole system, process or cgroup.
        --cont size[KMG]     Adds stats for memory that is mapped in
                             contiguous blocks of <size> and also aligned to
                             <size>. May be issued multiple times to track
                             multiple sized blocks. Useful to infer e.g.
                             arm64 contpte and hpa mappings. Size must be a
                             power-of-2 number of pages.
        --inc-smaps          Include all numerical, additive
                             /proc/<pid>/smaps stats in the output.
        --inc-empty          Show all statistics including those whose value
                             is 0.
        --periodic sleep_ms  Run in a loop, polling every sleep_ms
                             milliseconds.
      
      Requires root privilege to access pagemap and kpageflags.
      --8<--
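
The block counts in the --cont 64K example of the help text above follow from simple alignment arithmetic. The sketch below is a simplified model, not the script's actual logic: the helper name `count_cont_blocks` is hypothetical, and it assumes a single contiguously mapped range that lies within one THP, described only by its start offset and length in bytes.

```python
KB = 1024


def count_cont_blocks(start, length, block):
    """Count block-sized, block-aligned chunks fully inside [start, start+length)."""
    # Round the start up to the next block boundary.
    first = -(-start // block) * block
    end = start + length
    return max(0, (end - first) // block)


if __name__ == "__main__":
    block = 64 * KB
    print(count_cont_blocks(0, 64 * KB, block))       # aligned 64K THP
    print(count_cont_blocks(0, 128 * KB, block))      # aligned 128K THP
    print(count_cont_blocks(64 * KB, 100 * KB, block))  # first 100K of a 128K THP
    print(count_cont_blocks(0, 2048 * KB, block))     # 2M THP
```

Under these assumptions the four calls give 1, 2, 1 and 32 blocks, matching the figures quoted in the help text, and a 64K range that starts 4K past a boundary yields no block at all.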
      
      Example command to summarise fully and partially mapped THPs and 64K
      contiguous blocks over all VMAs in all processes in the system
      (--inc-empty forces printing stats that are 0):
      
          # ./thpmaps --cont 64K --rollup --inc-empty
      
      --8<--
      anon-thp-pmd-aligned-2048kB:      139264 kB ( 6%)
      file-thp-pmd-aligned-2048kB:           0 kB ( 0%)
      anon-thp-pte-aligned-16kB:             0 kB ( 0%)
      anon-thp-pte-aligned-32kB:             0 kB ( 0%)
      anon-thp-pte-aligned-64kB:         72256 kB ( 3%)
      anon-thp-pte-aligned-128kB:            0 kB ( 0%)
      anon-thp-pte-aligned-256kB:            0 kB ( 0%)
      anon-thp-pte-aligned-512kB:            0 kB ( 0%)
      anon-thp-pte-aligned-1024kB:           0 kB ( 0%)
      anon-thp-pte-aligned-2048kB:           0 kB ( 0%)
      anon-thp-pte-unaligned-16kB:           0 kB ( 0%)
      anon-thp-pte-unaligned-32kB:           0 kB ( 0%)
      anon-thp-pte-unaligned-64kB:           0 kB ( 0%)
      anon-thp-pte-unaligned-128kB:          0 kB ( 0%)
      anon-thp-pte-unaligned-256kB:          0 kB ( 0%)
      anon-thp-pte-unaligned-512kB:          0 kB ( 0%)
      anon-thp-pte-unaligned-1024kB:         0 kB ( 0%)
      anon-thp-pte-unaligned-2048kB:         0 kB ( 0%)
      anon-thp-pte-partial:              63232 kB ( 3%)
      file-thp-pte-aligned-16kB:        809024 kB (47%)
      file-thp-pte-aligned-32kB:         43168 kB ( 3%)
      file-thp-pte-aligned-64kB:         98496 kB ( 6%)
      file-thp-pte-aligned-128kB:        17536 kB ( 1%)
      file-thp-pte-aligned-256kB:            0 kB ( 0%)
      file-thp-pte-aligned-512kB:            0 kB ( 0%)
      file-thp-pte-aligned-1024kB:           0 kB ( 0%)
      file-thp-pte-aligned-2048kB:           0 kB ( 0%)
      file-thp-pte-unaligned-16kB:       21712 kB ( 1%)
      file-thp-pte-unaligned-32kB:         704 kB ( 0%)
      file-thp-pte-unaligned-64kB:         896 kB ( 0%)
      file-thp-pte-unaligned-128kB:      44928 kB ( 3%)
      file-thp-pte-unaligned-256kB:          0 kB ( 0%)
      file-thp-pte-unaligned-512kB:          0 kB ( 0%)
      file-thp-pte-unaligned-1024kB:         0 kB ( 0%)
      file-thp-pte-unaligned-2048kB:         0 kB ( 0%)
      file-thp-pte-partial:               9252 kB ( 1%)
      anon-cont-pmd-aligned-64kB:       139264 kB ( 6%)
      file-cont-pmd-aligned-64kB:            0 kB ( 0%)
      anon-cont-pte-aligned-64kB:       100672 kB ( 4%)
      file-cont-pte-aligned-64kB:       161856 kB ( 9%)
      --8<--
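
Output in this form is straightforward to post-process. The following sketch parses rollup lines into a dictionary; the function name `parse_rollup` and the regular expression are assumptions based only on the sample output shown above, not on any format guarantee made by the script.

```python
import re

# One statistic per line: "<name>:   <value> kB (<percent>%)"
LINE_RE = re.compile(r"^(\S+):\s+(\d+) kB \(\s*(\d+)%\)$")


def parse_rollup(text):
    """Map each counter name to a (kilobytes, percent) tuple."""
    stats = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:  # silently skip anything that is not a statistic line
            name, kb, pct = match.groups()
            stats[name] = (int(kb), int(pct))
    return stats


if __name__ == "__main__":
    sample = """
    anon-thp-pmd-aligned-2048kB:      139264 kB ( 6%)
    file-thp-pte-aligned-16kB:        809024 kB (47%)
    anon-cont-pmd-aligned-64kB:       139264 kB ( 6%)
    """
    for name, (kb, pct) in parse_rollup(sample).items():
        print(name, kb, pct)
```

In the sample above, anon-cont-pmd-aligned-64kB equals anon-thp-pmd-aligned-2048kB (139264 kB), as expected given that a PMD-mapped 2M THP consists entirely of aligned 64K blocks.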
      
      Link: https://lkml.kernel.org/r/20240116141235.960842-1-ryan.roberts@arm.com
      
      
      Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
      Tested-by: Barry Song <v-songbaohua@oppo.com>
      Cc: Alistair Popple <apopple@nvidia.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: William Kucharski <william.kucharski@oracle.com>
      Cc: Zenghui Yu <yuzenghui@huawei.com>
      Cc: Zi Yan <ziy@nvidia.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  5. Oct 18, 2023
  6. Sep 05, 2023
      tools/mm: fix undefined reference to pthread_once · 7f33105c
      Xie XiuQi authored
      Commit 97d5f2e9 ("tools api fs: More thread safety for global
      filesystem variables") introduced calls to pthread_once, so libpthread
      must be added at link time.  Otherwise 'make -C tools/mm' fails with
      the following link error:
      
        gcc -Wall -Wextra -I../lib/ -o page-types page-types.c ../lib/api/libapi.a
        ~/linux/tools/lib/api/fs/fs.c:146: undefined reference to `pthread_once'
        ~/linux/tools/lib/api/fs/fs.c:147: undefined reference to `pthread_once'
        ~/linux/tools/lib/api/fs/fs.c:148: undefined reference to `pthread_once'
        ~/linux/tools/lib/api/fs/fs.c:149: undefined reference to `pthread_once'
        ~/linux/tools/lib/api/fs/fs.c:150: undefined reference to `pthread_once'
        /usr/bin/ld: ../lib/api/libapi.a(libapi-in.o):~/linux/tools/lib/api/fs/fs.c:151:
        more undefined references to `pthread_once' follow
        collect2: error: ld returned 1 exit status
        make: *** [Makefile:22: page-types] Error 1
      
      Link: https://lkml.kernel.org/r/20230831034205.2376653-1-xiexiuqi@huaweicloud.com
      
      
      Fixes: 97d5f2e9 ("tools api fs: More thread safety for global filesystem variables")
      Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  7. Apr 16, 2023
  8. Mar 29, 2023
  9. Feb 03, 2023
  10. Jan 19, 2023