    hugetlb: drop ref count earlier after page allocation · b65a4eda
    Mike Kravetz authored
    When discussing the possibility of inflated page ref counts, Muchun Song
    pointed out this potential issue [1].  It is true that any code could
    potentially take a reference on a compound page after allocation and
    before it is converted to and put into use as a hugetlb page.
    Specifically, this could be done by any users of get_page_unless_zero.
    
    There are three areas of concern within hugetlb code.
    
    1) When adding pages to the pool.  In this case, new pages are
       allocated and added to the pool by calling put_page to invoke the hugetlb
       destructor (free_huge_page).  If there is an inflated ref count on the
       page, it will not be immediately added to the free list.  It will only
       be added to the free list when the temporary ref count is dropped.
       This is deemed acceptable and will not be addressed.
    
    2) A page is allocated for immediate use, normally as a surplus page or
       migration target.  In this case, the user of the page will also hold a
       reference.  There is no issue as this is just like normal page ref
       counting.
    
    3) A page is allocated and MUST be added to the free list to satisfy a
       reservation.  One such example is gather_surplus_pages as pointed out
       by Muchun in [1].  More specifically, this case covers callers of
       enqueue_huge_page where the page reference count must be zero.  This
       patch covers this third case.
    
    Three routines call enqueue_huge_page when the page reference count could
    potentially be inflated.  They are: gather_surplus_pages,
    alloc_and_dissolve_huge_page and add_hugetlb_page.
    
    add_hugetlb_page is called on error paths when a huge page cannot be
    freed due to the inability to allocate vmemmap pages.  In this case, the
    temporarily inflated ref count is not an issue.  When the ref is dropped
    the appropriate action will be taken.  Instead of VM_BUG_ON if the ref
    count does not drop to zero, simply return.
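
The add_hugetlb_page behaviour described above can be sketched in userspace
as follows.  This is only an illustrative model, not the kernel code: struct
demo_page, demo_add_page and demo_put_page are hypothetical names, and
"the appropriate action" on the final put is simplified here to placing the
page on the free list.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a huge page; not the kernel's struct page. */
struct demo_page {
    int refcount;       /* number of references currently held */
    bool on_free_list;  /* models membership in the hugetlb free list */
};

/*
 * Models the error path: drop our reference and enqueue the page only if
 * the count reached zero.  A remaining (temporary) reference is tolerated
 * by returning early instead of asserting.
 */
static bool demo_add_page(struct demo_page *p)
{
    p->refcount--;
    if (p->refcount != 0)
        return false;   /* someone else still holds a ref: simply return */
    p->on_free_list = true;
    return true;
}

/* Models the temporary holder dropping its reference later on. */
static void demo_put_page(struct demo_page *p)
{
    if (--p->refcount == 0)
        p->on_free_list = true; /* simplified "appropriate action" */
}

/* Walks through the inflated ref count scenario. */
static void demo_selfcheck(void)
{
    struct demo_page p = { .refcount = 2, .on_free_list = false };

    assert(!demo_add_page(&p));   /* inflated count: not enqueued yet */
    demo_put_page(&p);            /* temporary ref goes away */
    assert(p.on_free_list);       /* page is handled at that point */
}
```

Calling demo_selfcheck() from a small main exercises the inflated case; a
page whose count starts at 1 is enqueued directly by demo_add_page.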
    
    In gather_surplus_pages and alloc_and_dissolve_huge_page the caller
    expects a page (or pages) to be put on the free lists.  In this case we
    must ensure there are no temporary ref counts.  We do this by calling
    put_page_testzero() earlier and not using pages without a zero ref count.
    The temporary page flag (HPageTemporary) is used in such cases so that as
    soon as the inflated ref count is dropped the page will be freed.
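
As a rough illustration of the early put_page_testzero() approach, the
following userspace sketch models a freshly allocated page that may have
picked up a speculative reference.  All names (struct model_page and the
model_* helpers) are hypothetical, and the final-put handling is a
simplification of what the hugetlb destructor does; the patch itself should
be consulted for the real logic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of a huge page; not the kernel type. */
struct model_page {
    atomic_int refcount;  /* 1 right after allocation */
    bool temporary;       /* models the HPageTemporary flag */
    bool on_free_list;    /* models the hugetlb free list */
    bool freed;           /* models return to the page allocator */
};

/* Models put_page_testzero(): drop one ref, report whether it hit zero. */
static bool model_put_testzero(struct model_page *p)
{
    return atomic_fetch_sub(&p->refcount, 1) == 1;
}

/* Models get_page_unless_zero(): take a ref only if the count is nonzero. */
static bool model_get_unless_zero(struct model_page *p)
{
    int old = atomic_load(&p->refcount);

    while (old != 0) {
        if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
            return true;
    }
    return false;
}

/* Models the final put: temporary pages are freed, others are enqueued. */
static void model_put_page(struct model_page *p)
{
    if (!model_put_testzero(p))
        return;
    if (p->temporary)
        p->freed = true;
    else
        p->on_free_list = true;
}

/*
 * Drop the allocation reference early.  Returns true if the count is zero
 * and the page may be enqueued; otherwise marks the page temporary so it is
 * freed as soon as the other holder drops its reference.
 */
static bool model_prepare_for_enqueue(struct model_page *p)
{
    if (model_put_testzero(p))
        return true;
    p->temporary = true;
    return false;
}

/* Walks through the inflated ref count scenario. */
static void model_selfcheck(void)
{
    struct model_page p = { .refcount = 1 };

    assert(model_get_unless_zero(&p));      /* speculative ref taken */
    assert(!model_prepare_for_enqueue(&p)); /* not usable for the pool */
    model_put_page(&p);                     /* speculative ref dropped */
    assert(p.freed && !p.on_free_list);
}
```

In this model a caller that gets false from model_prepare_for_enqueue would
skip the page rather than place it on the free list, which mirrors the "not
using pages without a zero ref count" rule stated above.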
    
    [1] https://lore.kernel.org/linux-mm/CAMZfGtVMn3daKrJwZMaVOGOaJU+B4dS--x_oPmGQMD=c=QNGEg@mail.gmail.com/
    
    Link: https://lkml.kernel.org/r/20210809184832.18342-3-mike.kravetz@oracle.com
    
    
    Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Mina Almasry <almasrymina@google.com>
    Cc: Muchun Song <songmuchun@bytedance.com>
    Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
    Cc: Oscar Salvador <osalvador@suse.de>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>