    9p: add a per-client fcall kmem_cache · 91a76be3
    Dominique Martinet authored
    Having a dedicated cache for the fcall allocations helps reduce
    end-to-end latency.
    
    The slab allocator automatically merges caches whose items have the
    same size, so we do not need to try to share a cache between
    different clients that negotiated the same msize.
    
    Since the msize is negotiated with the server, only allocate the cache
    after that negotiation has happened. Allocations made before then, and
    allocations of a different size (e.g. zero-copy fcall), are made with
    kmalloc directly.
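
    The allocation policy described above can be sketched in userspace C.
    This is only an illustrative model, not the patch itself: the struct
    and function names (obj_cache, client, fcall, fcall_init, fcall_fini)
    are hypothetical, and malloc/free stand in for kmem_cache_alloc and
    kmalloc. The buffer remembers which allocator it came from so that it
    can be released the same way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kernel kmem_cache: one fixed object size. */
struct obj_cache {
    size_t obj_size;
    unsigned long live;            /* objects currently handed out */
};

/* Hypothetical client state: the cache stays NULL until msize is negotiated. */
struct client {
    size_t msize;
    struct obj_cache *fcall_cache;
};

/* Hypothetical fcall buffer: records where sdata was allocated from. */
struct fcall {
    void *sdata;
    size_t capacity;
    struct obj_cache *cache;       /* NULL means plain malloc (kmalloc in-kernel) */
};

static void *cache_alloc(struct obj_cache *c)
{
    void *p = malloc(c->obj_size);
    if (p)
        c->live++;
    return p;
}

static void cache_free(struct obj_cache *c, void *p)
{
    assert(c->live > 0);
    c->live--;
    free(p);
}

/* Use the per-client cache only when it exists and the size matches msize. */
static int fcall_init(struct client *c, struct fcall *fc, size_t alloc_size)
{
    if (c->fcall_cache && alloc_size == c->msize) {
        fc->sdata = cache_alloc(c->fcall_cache);
        fc->cache = c->fcall_cache;
    } else {
        fc->sdata = malloc(alloc_size);
        fc->cache = NULL;
    }
    if (!fc->sdata)
        return -1;
    fc->capacity = alloc_size;
    return 0;
}

/* Release the buffer through whichever allocator produced it. */
static void fcall_fini(struct fcall *fc)
{
    if (!fc->sdata)
        return;
    if (fc->cache)
        cache_free(fc->cache, fc->sdata);
    else
        free(fc->sdata);
    fc->sdata = NULL;
}
```

    In this model, a request made before the cache is set up, or one asking
    for a size other than msize, falls through to the plain allocator.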
    
    Some figures on two beefy VMs with Connect-IB (sriov) / trans=rdma,
    with ior running 32 processes in parallel doing small 32-byte IOs:
     - no alloc (4.18-rc7 request cache): 65.4k req/s
     - non-power of two alloc, no patch: 61.6k req/s
     - power of two alloc, no patch: 62.2k req/s
     - non-power of two alloc, with patch: 64.7k req/s
     - power of two alloc, with patch: 65.1k req/s
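
    To compare the figures above, the relative change between two
    throughput values can be computed as below. The helper name
    relative_change_pct is hypothetical and only illustrates the
    arithmetic; it is not part of the patch.

```c
/*
 * Relative change of `value` against `baseline`, in percent.
 * Both arguments are throughputs in the same unit (here, k req/s).
 */
static double relative_change_pct(double value, double baseline)
{
    return (value - baseline) / baseline * 100.0;
}
```

    For example, relative_change_pct(64.7, 61.6) gives the gain from the
    patch in the non-power of two case, roughly 5%, and
    relative_change_pct(65.1, 65.4) gives the remaining gap to the
    4.18-rc7 request cache in the power of two case, under 1%.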
    
    Link: http://lkml.kernel.org/r/1532943263-24378-2-git-send-email-asmadeus@codewreck.org
    
    
    Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
    Acked-by: Jun Piao <piaojun@huawei.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Greg Kurz <groug@kaod.org>