Data storage



ZFS is currently the only filesystem that checksums data live, so it can detect silent data corruption that slips past the error-detection mechanisms of the underlying disk drive(s). From "ZFS saved my data. Right now.":

  • Two of my disks (c10t0d0 and c9t0d0) are happily giving me garbage back instead of my data. Without knowing it.
    Thanks to ZFS' checksumming, we can detect this, even though the drive thinks everything is ok.
    No other storage device, RAID array, NAS or file system I know of can do this. Not even the increasingly hyped (and admittedly cool-looking) Drobo [1].

  • Because both drives are configured as a mirror, bad data from one device can be corrected by reading good data from the other device. This is the "applications are unaffected" and "no known data errors" part.
    Again, it's the checksums that enable ZFS to distinguish good data blocks from bad ones, and therefore enabling self-healing while the system is reading stuff from disk.
    As a result, even though both disks are not functioning properly, my data is still safe, because (luckily, albeit with millions of blocks per disk, statistics is on my side here) the erroneous blocks don't overlap in terms of what pieces of data they store.
    Again, no other storage technology can do this. RAID arrays only kick in when the disk drives as a whole are inaccessible or when a drive diagnoses itself to be broken. They do nothing against silent data corruption, which is what we see here and what all people on this planet that don't use ZFS (yet) can't see (yet). Until it's too late.
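
The mechanism described in the quote can be sketched as a toy model. This is not how ZFS is implemented: the ToyMirror class, the in-memory "disks", and the use of SHA-256 are all assumptions made for illustration. It only shows the idea that a checksum stored apart from the data lets a mirror tell a good copy from a bad one and repair the bad one on read.

```python
import hashlib


def checksum(data: bytes) -> str:
    # SHA-256 is an illustrative choice, not a claim about ZFS defaults.
    return hashlib.sha256(data).hexdigest()


class ToyMirror:
    """Hypothetical two-way mirror: two in-memory 'disks' plus checksums
    that are kept separately from the data blocks."""

    def __init__(self):
        self.disks = [{}, {}]
        self.sums = {}

    def write(self, block_no: int, data: bytes) -> None:
        self.sums[block_no] = checksum(data)
        for disk in self.disks:
            disk[block_no] = data

    def read(self, block_no: int) -> bytes:
        expected = self.sums[block_no]
        for disk in self.disks:
            data = disk[block_no]
            if checksum(data) == expected:
                # Self-heal: overwrite any copy that fails verification.
                for other in self.disks:
                    if checksum(other[block_no]) != expected:
                        other[block_no] = data
                return data
        raise IOError(f"block {block_no}: no copy matches its checksum")


mirror = ToyMirror()
mirror.write(0, b"block zero")
mirror.write(1, b"block one")

# Each disk silently returns garbage for a different block.
mirror.disks[0][0] = b"garbage"
mirror.disks[1][1] = b"garbage"

# Reads still return good data because the bad blocks do not overlap.
assert mirror.read(0) == b"block zero"
assert mirror.read(1) == b"block one"

# After the reads, the damaged copies have been repaired.
assert mirror.disks[0][0] == b"block zero"
assert mirror.disks[1][1] == b"block one"
```

If both copies of the same block were corrupted, read() would raise instead of returning bad data, which matches the point in the quote that the data is only safe because the erroneous blocks do not overlap.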



InfoWorld quote:

It's not every day that the computer industry delivers the level of innovation found in Sun's ZFS. The fluidity, the malleability, and the scalability of ZFS far surpass any file system available now on any platform. More and more advances in the science of IT are based on simply multiplying the status quo. ZFS breaks all the rules here, and it arrives in an amazingly well-thought-out and nicely implemented solution.

We're talking about a file system that can address 256 quadrillion zettabytes of storage, and that can handle a maximum file size of 16 exabytes. For reference, a zettabyte is equal to one billion terabytes. In order to bend your mind around what ZFS is and what it can do, you need to toss out just about everything you know about file systems and start over.
-- InfoWorld
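
The figures in the quote line up with a 128-bit address space and 64-bit file sizes if the units are read as binary multiples. That reading is an assumption here, not something the quote states. A quick check of the arithmetic:

```python
# Binary multiples (assumed interpretation of the quote's units).
PEBI = 2 ** 50   # "quadrillion", read as 2^50
EXBI = 2 ** 60   # exabyte, read as 2^60 bytes
ZEBI = 2 ** 70   # zettabyte, read as 2^70 bytes

max_pool_bytes = 256 * PEBI * ZEBI   # "256 quadrillion zettabytes"
max_file_bytes = 16 * EXBI           # "16 exabytes"

assert max_pool_bytes == 2 ** 128
assert max_file_bytes == 2 ** 64

# In decimal units, a zettabyte (10^21) is one billion terabytes (10^12).
assert 10 ** 21 // 10 ** 12 == 10 ** 9
```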


Recommended hardware components

See my posts on:

My research so far is on the computers page.

Hardware implications for file servers

AMD home fileserver with OpenSolaris

2009-Oct, with an update in November. German article with Google Translate link at the top. Hardware selection, installing OpenSolaris, building the ZFS pool, installing Windows on VirtualBox.

Green, efficient storage server

AMD platform. Average consumption: 80W. 2009-Oct, Europe.

20TB ZFS file server


In case you didn't get the news; RAID is dead. Suspiciously looming over its body stands ZFS.

Michael Shadle's Recipe for ZFS Home Storage

"With 2TB disks out now, even more capacity could be had. 16 data disks capable and 2 disks for mirrored boot. It's quite quiet too."

Inexpensive ZFS home fileserver

Hardware list as of 2009-May.

A good enough ZFS NAS

Focused on low power, quietness, and reasonable cost. 2009-March.

Simon Breden's ZFS fileserver

2008-March. A newer hardware selection is available (2008-Dec).

"Green" ZFS system

2008-Dec. Focused on low power (50-55W), yet achieves average speeds of around 32MB/sec for writes and around 40MB/sec for reads. Built with components available in China.

ZFS on DroboPro

In this scenario, the DroboPro takes care of hardware-level reliability; it is attached via iSCSI and formatted with ZFS, which only does checksumming. The problem is that when ZFS does detect a bad sector, it has no alternate disk from which to read a good copy (or parity data), unless you configure multiple volumes in the Drobo and present each volume to ZFS as a disk. In that case, what is the gain vs. using ZFS directly?

ZFS and USB disks

  • USB is limited to 10MB/s, which means scrubbing large drives will take a long time
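
To put the 10MB/s figure in perspective, here is a rough estimate of scrub time. The drive sizes are hypothetical examples, and the sketch assumes the drive is full and that throughput is the only bottleneck; a scrub of a partly empty pool would read less data.

```python
MB = 10 ** 6
TB = 10 ** 12


def scrub_hours(data_bytes: float, throughput_bytes_per_s: float) -> float:
    """Hours needed to read data_bytes once at a steady throughput."""
    return data_bytes / throughput_bytes_per_s / 3600


# Hypothetical drive sizes, read at the 10MB/s quoted above.
for size_tb in (1, 2):
    hours = scrub_hours(size_tb * TB, 10 * MB)
    print(f"{size_tb} TB at 10 MB/s: {hours:.1f} hours")
```

At that rate a full 1 TB drive needs more than a day for a single pass.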