Native ZFS VDEV for Object Storage (OpenZFS Summit)

(zettalane.com)

50 points | by suprasam 4 hours ago

4 comments

  • PunchyHamster 11 minutes ago
    FS metrics without a random I/O benchmark are nearly meaningless. Sequential read is the best case for basically every file system, and in this case it's essentially "how fast can you get things from S3."
  • curt15 1 hour ago
    How does this relate to the work presented a few years ago by the ZFS devs using S3 as object storage? https://youtu.be/opW9KhjOQ3Q?si=CgrYi0P4q9gz-2Mq
    • magicalhippo 4 minutes ago
      Just going by the submitted article, it seems very similar in what it achieves, but seems to be implemented slightly differently. As I recall, the Delphix solution did not use a character device to communicate with the user-space S3 service, and it relied on a local NVMe-backed write cache to make 16 kB blocks performant by coalescing them into large objects (10 MB IIRC).

      This solution instead seems to rely on 1 MB blocks, storing those directly as objects and eliminating the intermediate caching and indirection layer. That means a larger number of objects but less local overhead.

      Delphix's rationale for 16 kB blocks was that their primary use case was PostgreSQL database storage. I presume this is geared toward other workloads.

      And, importantly since we're on HN, Delphix's user-space service was written in Rust, as I recall; this one uses Go.
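
      To put rough numbers on the object-count trade-off between the two layouts, here is a back-of-the-envelope sketch. The 1 TiB pool size is an assumed example value, the sizes are treated as binary units for simplicity, and the 10 MB object size is only an "IIRC" figure, so treat the results as illustrative rather than as measurements of either implementation.

```python
# Rough object-count arithmetic for the two layouts discussed in this comment.
# Assumptions: a 1 TiB pool (example value), binary units, and the
# approximate "10 MB" coalesced object size recalled from memory.

KIB = 1024
MIB = 1024 * KIB
TIB = 1024 * 1024 * MIB


def object_count(data_bytes: int, object_bytes: int) -> int:
    """Number of objects needed to hold data_bytes, rounding up."""
    return -(-data_bytes // object_bytes)


def blocks_per_object(object_bytes: int, block_bytes: int) -> int:
    """How many whole blocks fit into one coalesced object."""
    return object_bytes // block_bytes


pool = 1 * TIB

# Layout A: each 1 MiB block is stored directly as its own object.
direct = object_count(pool, 1 * MIB)

# Layout B: 16 KiB blocks are coalesced into roughly 10 MiB objects.
coalesced = object_count(pool, 10 * MIB)
packed = blocks_per_object(10 * MIB, 16 * KIB)

print(f"direct 1 MiB objects:     {direct}")
print(f"coalesced 10 MiB objects: {coalesced}")
print(f"16 KiB blocks per object: {packed}")
```

      Under these assumptions the direct layout needs about ten times as many objects, which is the "larger number of objects" side of the trade-off; the coalesced layout pays for its smaller object count with the local cache and indirection layer.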

    • tw04 11 minutes ago
      AFAIK it was never released, and it used FUSE; it wasn't native.
  • doktor2u 1 hour ago
    That’s brilliant! Always amazed at how zfs keeps morphing and stays relevant!
  • glemion43 37 minutes ago
    I do not get it.

    Why would I use zfs for this? Isn't the power of zfs that it's a filesystem with checksum and stuff like encryption?

    Why would I use it for s3?

    • mustache_kimono 30 minutes ago
      > Why would I use it for s3?

      You have it the wrong way around. Here, ZFS uses many small S3 objects as the storage substrate, rather than physical disks. The value proposition is that this should definitely be cheaper, and perhaps more durable, than EBS.

      See s3backer, a FUSE implementation of something similar: https://github.com/archiecobbs/s3backer

      See the prior in-kernel ZFS work by Delphix, which AFAIK was closed by Delphix management: https://www.youtube.com/watch?v=opW9KhjOQ3Q

      BTW this appears to be closed too!
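
      For anyone unsure what "objects as the storage substrate" means in practice, here is a minimal sketch of the general idea: a byte range on a virtual disk is split into fixed-size blocks, and each block maps to one object key. The key format, the `vdev0` prefix, and both function names are hypothetical, made up for illustration; this is not the scheme used by the submitted project, s3backer, or Delphix.

```python
# Minimal sketch: map byte ranges on a virtual disk to fixed-size objects.
# Everything here (key layout, prefix, function names) is a hypothetical
# illustration, not the actual scheme of any project mentioned in the thread.

BLOCK_SIZE = 1024 * 1024  # 1 MiB per object, matching the block size discussed


def object_key(offset: int, prefix: str = "vdev0") -> str:
    """Return the (hypothetical) key of the object holding this byte offset."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    block_index = offset // BLOCK_SIZE
    return f"{prefix}/{block_index:016x}"


def split_range(offset: int, length: int) -> list[tuple[str, int, int]]:
    """Split a byte range into (key, offset_within_object, nbytes) pieces."""
    pieces = []
    end = offset + length
    while offset < end:
        within = offset % BLOCK_SIZE
        nbytes = min(BLOCK_SIZE - within, end - offset)
        pieces.append((object_key(offset), within, nbytes))
        offset += nbytes
    return pieces


# An 8-byte read that straddles the first block boundary touches two objects.
for key, within, nbytes in split_range(BLOCK_SIZE - 4, 8):
    print(key, within, nbytes)
```

      A real implementation would also need to handle partial-object writes, caching, and consistency, none of which this sketch attempts.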

    • bakies 30 minutes ago
      I've built a massive storage server that I want to run the S3 protocol on. It's already running ZFS. This is exactly what I want.

      zfs-share already implements SMB and NFS.