- Dec 07, 2021
-
-
Pavel Begunkov authored
BUG: KASAN: use-after-free in io_submit_one+0x496/0x2fe0 fs/aio.c:1882 CPU: 2 PID: 15100 Comm: syz-executor873 Not tainted 5.16.0-rc1-syzk #1 Hardware name: Red Hat KVM, BIOS 1.13.0-2.module+el8.3.0+7860+a7792d29 04/01/2014 Call Trace: [...] refcount_dec_and_test include/linux/refcount.h:333 [inline] iocb_put fs/aio.c:1161 [inline] io_submit_one+0x496/0x2fe0 fs/aio.c:1882 __do_sys_io_submit fs/aio.c:1938 [inline] __se_sys_io_submit fs/aio.c:1908 [inline] __x64_sys_io_submit+0x1c7/0x4a0 fs/aio.c:1908 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3a/0x80 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae __blkdev_direct_IO_async() returns errors from bio_iov_iter_get_pages() directly, in which case upper layers won't be expecting ->ki_complete to be called by the block layer and will terminate the request. However, there is also bio_endio() leading to a second ->ki_complete and a double free. Fixes: 54a88eb8 ("block: add single bio async direct IO helper") Reported-by:
George Kennedy <george.kennedy@oracle.com> Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/c9eb786f6cef041e159e6287de131bec0719ad5c.1638907997.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- Dec 03, 2021
-
-
Jakub Kicinski authored
cgroup.h (therefore swap.h, therefore half of the universe) includes bpf.h which in turn includes module.h and slab.h. Since we're about to get rid of that dependency we need to clean things up. v2: drop the cpu.h include from cacheinfo.h, it's not necessary and it makes riscv sensitive to ordering of include files. Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de> Acked-by:
Krzysztof Wilczyński <kw@linux.com> Acked-by:
Peter Chen <peter.chen@kernel.org> Acked-by:
SeongJae Park <sj@kernel.org> Acked-by:
Jani Nikula <jani.nikula@intel.com> Acked-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/all/20211120035253.72074-1-kuba@kernel.org/ # v1 Link: https://lore.kernel.org/all/20211120165528.197359-1-kuba@kernel.org/ # cacheinfo discussion Link: https://lore.kernel.org/bpf/20211202203400.1208663-1-kuba@kernel.org
-
- Nov 05, 2021
-
-
Jens Axboe authored
We have new helpers for this, use them rather than the slower inode size reads. This makes the read/write path consistent with most of the rest of block as well. Signed-off-by:
Jens Axboe <axboe@kernel.dk> Reviewed-by:
Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/a72767cd-3c6d-47f7-80f4-aa025a17b2cb@kernel.dk Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- Oct 27, 2021
-
-
Pavel Begunkov authored
If we know that a iocb is async we can optimise bio_set_polled() a bit, add a new helper bio_set_polled_async(). Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/8fa137885164a5d05fadcff4c3521da8d5a83d00.1635337135.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Now __blkdev_direct_IO() serves only multi-bio I/O, thus remove not used anymore single bio refcounting optimisations. Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/88eb488aae9ed4852a30f3a7132f296f56e43b80.1635337135.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
With addition of __blkdev_direct_IO_async(), __blkdev_direct_IO() now serves only multio-bio I/O, which we don't poll. Now we can remove anything related to I/O polling from it. Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b8c597a6b7ee612df394853bfd24726aee5b898e.1635337135.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Nobody cares about iov iterators state if we return -EIOCBQUEUED, so as the we now have __blkdev_direct_IO_async(), which gets pages only once, we can skip expensive iov_iter_advance(). It's around 1-2% of all CPU spent. Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/a6158edfbfa2ae3bc24aed29a72f035df18fad2f.1635337135.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- Oct 25, 2021
-
-
Jens Axboe authored
The second argument was only used by the USB gadget code, yet everyone pays the overhead of passing a zero to be passed into aio, where it ends up being part of the aio res2 value. Now that everybody is passing in zero, kill off the extra argument. Reviewed-by:
Darrick J. Wong <djwong@kernel.org> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
As with __blkdev_direct_IO_simple(), we can implement direct IO more efficiently if there is only one bio. Add __blkdev_direct_IO_async() and blkdev_bio_end_io_async(). This patch brings me from 4.45-4.5 MIOPS with nullblk to 4.7+. Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/f0ae4109b7a6934adede490f84d188d53b97051b.1635006010.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- Oct 21, 2021
-
-
Pavel Begunkov authored
Don't use shifting by a magic number 9 but replace with a more descriptive SHIFT_SECTOR. Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/068782b9f7e97569fb59a99529b23bb17ea4c5e2.1634755800.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Combine pos and len checks and mark unlikely. Also, don't reexpand if it's not truncated. Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/fff34e613aeaae1ad12977dc4592cb1a1f5d3190.1634755800.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- Oct 19, 2021
-
-
Jens Axboe authored
We get all sorts of unreliable and funky results since the bio is designed to align on a cacheline, which it does not when inlined like this. Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- Oct 18, 2021
-
-
Christoph Hellwig authored
Use the proper helper to read the block device size. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Kees Cook <keescook@chromium.org> Reviewed-by:
Jan Kara <jack@suse.cz> Reviewed-by:
Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20211018101130.1838532-25-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
struct io_comp_batch contains a list head and a completion handler, which will allow completions to more effciently completed batches of IO. For now, no functional changes in this patch, we just define the io_comp_batch structure and add the argument to the file_operations iopoll handler. Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
This generates a lot better code for me, and bumps performance from 7650K IOPS to 7750K IOPS. Looking at profiles for the run and running perf diff, it confirms that we're now sending a lot less time there: 6.38% -2.80% [kernel.vmlinux] [k] blkdev_direct_IO Taking it from the 2nd most cycle consumer to only the 9th most at 3.35% of the CPU time. Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
bdev = &BDEV_I(file->f_mapping->host)->bdev Getting struct block_device from a file requires 2 memory dereferences as illustrated above, that takes a toll on performance, so cache it in yet unused file->private_data. That gives a noticeable peak performance improvement. Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/8415f9fe12e544b9da89593dfbca8de2b52efe03.1634115360.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
Replace the blk_poll interface that requires the caller to keep a queue and cookie from the submissions with polling based on the bio. Polling for the bio itself leads to a few advantages: - the cookie construction can made entirely private in blk-mq.c - the caller does not need to remember the request_queue and cookie separately and thus sidesteps their lifetime issues - keeping the device and the cookie inside the bio allows to trivially support polling BIOs remapping by stacking drivers - a lot of code to propagate the cookie back up the submission path can be removed entirely. Signed-off-by:
Christoph Hellwig <hch@lst.de> Tested-by:
Mark Wunderlich <mark.wunderlich@intel.com> Link: https://lore.kernel.org/r/20211012111226.760968-15-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
Switch the boolean spin argument to blk_poll to passing a set of flags instead. This will allow to control polling behavior in a more fine grained way. Signed-off-by:
Christoph Hellwig <hch@lst.de> Tested-by:
Mark Wunderlich <mark.wunderlich@intel.com> Link: https://lore.kernel.org/r/20211012111226.760968-10-hch@lst.de [axboe: adapt to changed io_uring iopoll] Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
If an iocb is split into multiple bios we can't poll for both. So don't even bother to try to poll in that case. Signed-off-by:
Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211012111226.760968-3-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
Simplify the ioctl path and match the code structure on the compat side. Signed-off-by:
Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211012104450.659013-4-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- Sep 24, 2021
-
-
Ming Lei authored
When running ->fallocate(), blkdev_fallocate() should hold mapping->invalidate_lock to prevent page cache from being accessed, otherwise stale data may be read in page cache. Without this patch, blktests block/009 fails sometimes. With this patch, block/009 can pass always. Also as Jan pointed out, no pages can be created in the discarded area while you are holding the invalidate_lock, so remove the 2nd truncate_bdev_range(). Cc: Jan Kara <jack@suse.cz> Signed-off-by:
Ming Lei <ming.lei@redhat.com> Reviewed-by:
Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20210923023751.1441091-1-ming.lei@redhat.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- Sep 07, 2021
-
-
Christoph Hellwig authored
Add a new block/fops.c for all the file and address_space operations that provide the block special file support. Signed-off-by:
Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210907141303.1371844-2-hch@lst.de [axboe: correct trailing whitespace while at it] Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-