
ZoL+FIO Randwrite nvme bs=8k@32io = 148MiB/s?

Sorry for the title, but it's a very short summary of the BS that I'm looking into. The situation:

I'm using ZoL 2.1.5 (from jonathonf's PPA) on Ubuntu (tried both 20.04 and 22.04).
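For reference, the exact kernel module and userland versions can be confirmed with standard OpenZFS commands (listed here only for completeness):

zfs version
cat /sys/module/zfs/version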

I have the following NVMe disks:

  • Kingston KC2500 1TB (/dev/nvme0n1), formatted as 512 (with nvme format -l 0)
  • Samsung 983 DCT M.2 960GB (/dev/nvme6n1), formatted as 512 (with nvme format -l 0); both verified as shown below
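To double-check which LBA format each namespace is actually using, nvme-cli can dump the supported formats and mark the active one (shown here only as a sanity check; the grep just trims the output):

nvme id-ns /dev/nvme0n1 -H | grep -i "lba format"
nvme id-ns /dev/nvme6n1 -H | grep -i "lba format"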

The pastebin linked below contains all commands; here is the short output:

RAW device:

fio -name=rndw8k32 -ioengine=libaio -direct=1 -buffered=0 -invalidate=1 -filesize=30G -numjobs=1 -bs=8k -iodepth=32 -rw=randwrite -filename=/dev/nvme0n1
  WRITE: bw=1600MiB/s (1678MB/s), 1600MiB/s-1600MiB/s (1678MB/s-1678MB/s), io=30.0GiB (32.2GB), run=19202-19202msec

fio -name=rndw8k32 -ioengine=libaio -direct=1 -buffered=0 -invalidate=1 -filesize=30G -numjobs=1 -bs=8k -iodepth=32 -rw=randwrite -filename=/dev/nvme6n1
  WRITE: bw=1180MiB/s (1237MB/s), 1180MiB/s-1180MiB/s (1237MB/s-1237MB/s), io=30.0GiB (32.2GB), run=26031-26031msec

Now let's create a stripe out of the first disk:

zpool create -o ashift=9 -O compression=lz4 -O atime=off -O recordsize=64k nvme /dev/nvme0n1
fio -name=rndw8k32 -ioengine=libaio -direct=1 -buffered=0 -invalidate=1 -filesize=30G -numjobs=1 -bs=8k -iodepth=32 -rw=randwrite -filename=/nvme/temp.tmp
  WRITE: bw=147MiB/s (154MB/s), 147MiB/s-147MiB/s (154MB/s-154MB/s), io=30.0GiB (32.2GB), run=209618-209618msec
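For reference, the effective pool and dataset settings can be checked with standard property queries (ashift is a pool property, the rest are dataset properties):

zpool get ashift nvme
zfs get recordsize,compression,atime nvme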

OK, maybe the record size is to blame:

zpool create -o ashift=9 -O compression=lz4 -O atime=off -O recordsize=8k nvme /dev/nvme0n1
fio -name=rndw8k32 -ioengine=libaio -direct=1 -buffered=0 -invalidate=1 -filesize=30G -numjobs=1 -bs=8k -iodepth=32 -rw=randwrite -filename=/nvme/temp.tmp
  WRITE: bw=349MiB/s (366MB/s), 349MiB/s-349MiB/s (366MB/s-366MB/s), io=30.0GiB (32.2GB), run=87922-87922msec

What the actual hell? The same picture on the 2nd NVMe. If I use recordsize=64k and fio bs=64k, I get normal speed. If I use recordsize=64k and fio bs=8k, I get bullshit speed. If I use recordsize=8k and fio bs=8k, I get bullshit speed.
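To put that in IOPS terms: 1600 MiB/s at bs=8k is roughly 205k write IOPS on the raw device, while 147 MiB/s through ZFS is only about 18.8k IOPS. Watching the pool while fio runs shows how much I/O actually hits the vdev (standard zpool command, run alongside the benchmark):

zpool iostat -v nvme 1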

https://pastebin.com/0RH6gLM9

Maybe the problem is that I'm using a file and comparing a file against a raw device? Well, ext4 gives me:

For an 8k block size:

fio -name=rndw8k32 -ioengine=libaio -direct=1 -buffered=0 -invalidate=1 -filesize=30G -numjobs=1 -bs=8k -iodepth=32 -rw=randwrite -filename=/mnt/temp.tmp
  WRITE: bw=569MiB/s (597MB/s), 569MiB/s-569MiB/s (597MB/s-597MB/s), io=30.0GiB (32.2GB), run=53989-53989msec

For a 64k block size:

fio -name=rndw8k32 -ioengine=libaio -direct=1 -buffered=0 -invalidate=1 -filesize=30G -numjobs=1 -bs=64k -iodepth=32 -rw=randwrite -filename=/mnt/tmp.tmp
  WRITE: bw=2137MiB/s (2241MB/s), 2137MiB/s-2137MiB/s (2241MB/s-2241MB/s), io=30.0GiB (32.2GB), run=14373-14373msec

Just in case, I have also tested it after reformatting the NVMe with

nvme format /dev/nvme0n1 -l 1

and using ashift=12. With recordsize=8k and bs=8k I get:

zpool create -o ashift=12 -O compression=lz4 -O atime=off -O recordsize=8k nvme /dev/nvme0n1 -f
fio -name=rndw8k32 -ioengine=libaio -direct=1 -buffered=0 -invalidate=1 -filesize=30G -numjobs=1 -bs=8k -iodepth=32 -rw=randwrite -filename=/nvme/temp.tmp
  WRITE: bw=192MiB/s (202MB/s), 192MiB/s-192MiB/s (202MB/s-202MB/s), io=30.0GiB (32.2GB), run=159853-159853msec

and with recordsize=64k (fio bs still 8k) I get:

zpool create -o ashift=12 -O compression=lz4 -O atime=off -O recordsize=64k nvme /dev/nvme0n1 -f
fio -name=rndw8k32 -ioengine=libaio -direct=1 -buffered=0 -invalidate=1 -filesize=30G -numjobs=1 -bs=8k -iodepth=32 -rw=randwrite -filename=/nvme/temp.tmp
  WRITE: bw=495MiB/s (519MB/s), 495MiB/s-495MiB/s (519MB/s-519MB/s), io=30.0GiB (32.2GB), run=62035-62035msec

details: https://pastebin.com/GDGgSMmR
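In case someone wants to reproduce this, the whole matrix boils down to a loop like the one below (same pool name, device and fio options as above; destroying and recreating the pool between runs is my assumption, not something taken from the pastebins):

for rs in 8k 64k; do
  zpool destroy nvme 2>/dev/null   # ignore the error on the first run, when no pool exists yet
  zpool create -f -o ashift=12 -O compression=lz4 -O atime=off -O recordsize=$rs nvme /dev/nvme0n1
  fio -name=rndw8k32 -ioengine=libaio -direct=1 -buffered=0 -invalidate=1 \
      -filesize=30G -numjobs=1 -bs=8k -iodepth=32 -rw=randwrite -filename=/nvme/temp.tmp
done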

So, what am I missing in my tests? How come ZFS makes my NVMe THAT much slower? Just in case: the whole NVMe was zeroed before the tests (about a day prior).

