Matt Connolly's Blog

my brain dumps here…

ZFS = Data integrity

So, for a while now, I’ve been experiencing crappy performance from a Western Digital Green drive (WD15EARS) I have in a ZFS mirror storing my Time Machine backups (using OpenIndiana and Netatalk).

Yesterday, the drive started reporting errors. Unfortunately, the system hung – that’s not so cool – ZFS is supposed to keep working when a drive fails… Aside from that, when I rebooted, the system automatically started a scrub to verify data integrity, and after about 10 minutes:

  pool: rpool
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scan: resilver in progress since Thu Mar 10 10:19:42 2011
    1.68G scanned out of 1.14T at 107M/s, 3h5m to go
    146K resilvered, 0.14% done
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         DEGRADED     0     0     0
          mirror-0    DEGRADED     0     0     0
            c8t1d0s0  DEGRADED     0     0    24  too many errors  (resilvering)
            c8t0d0s0  ONLINE       0     0     0
        cache
          c12d0s0     ONLINE       0     0     0

errors: No known data errors

Check it out. It’s found 24 checksum errors on the Western Digital drive, but so far no data errors, because the same blocks were correct on the other drive in the mirror.
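If you'd rather not eyeball the table, here is a rough Python sketch that pulls the non-zero CKSUM counts out of status text laid out like the output above. The function name `checksum_errors` is made up for illustration, it does not run `zpool` itself, and it assumes the column order shown here (NAME, STATE, READ, WRITE, CKSUM) with plain integer counts.

```python
def checksum_errors(status_text):
    """Return {device_name: cksum_count} for rows with a non-zero CKSUM column.

    Assumes the layout pasted above: a header row starting with NAME STATE,
    then one row per device, ending at the "errors:" line.
    """
    errors = {}
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            in_config = True          # device rows start after the header
            continue
        if in_config and fields and fields[0] == "errors:":
            break                     # end of the device table
        # Device rows have at least: name, state, read, write, cksum
        if in_config and len(fields) >= 5 and fields[4].isdigit():
            count = int(fields[4])
            if count > 0:
                errors[fields[0]] = count
    return errors


# The config table from the status output above, pasted in as a string.
sample = """\
        NAME          STATE     READ WRITE CKSUM
        rpool         DEGRADED     0     0     0
          mirror-0    DEGRADED     0     0     0
            c8t1d0s0  DEGRADED     0     0    24  too many errors  (resilvering)
            c8t0d0s0  ONLINE       0     0     0
        cache
          c12d0s0     ONLINE       0     0     0

errors: No known data errors
"""

print(checksum_errors(sample))  # {'c8t1d0s0': 24}
```

Rows like `cache` that have no counters are skipped, and so is any count that isn't a plain integer.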

That’s obvious, right? But what other operating systems can tell the difference between the right and wrong data when they’re both there? ZFS can because it keeps a checksum for every block, so the copy that doesn’t match its checksum is the wrong one. Most RAID systems only detect a total drive failure, but don’t deal with incorrect data coming off the drive!
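To make the idea concrete, here is a toy Python sketch of a two-way mirror that keeps a checksum apart from the data and uses it to pick the good copy and repair the bad one. This is not how ZFS is implemented: the `MirrorSketch` class and its methods are invented for illustration, SHA-256 stands in for whatever checksum the pool actually uses, and the read checks every copy each time, which is closer to what a scrub does than to a normal read.

```python
import hashlib


class MirrorSketch:
    """Toy two-way mirror that stores a checksum separately from each block."""

    def __init__(self):
        self.disks = [{}, {}]   # block id -> bytes, one dict per "drive"
        self.checksums = {}     # block id -> expected SHA-256 hex digest

    def write(self, block_id, data):
        # Record the checksum once, then write the same data to both drives.
        self.checksums[block_id] = hashlib.sha256(data).hexdigest()
        for disk in self.disks:
            disk[block_id] = data

    def read(self, block_id):
        """Return (data, bad_disks), repairing any copy that fails its checksum."""
        expected = self.checksums[block_id]
        good = None
        bad_disks = []
        for index, disk in enumerate(self.disks):
            data = disk[block_id]
            if hashlib.sha256(data).hexdigest() == expected:
                if good is None:
                    good = data
            else:
                bad_disks.append(index)
        if good is None:
            # Both copies are wrong: this is a real data error.
            raise IOError(f"block {block_id}: no valid copy on any disk")
        for index in bad_disks:
            self.disks[index][block_id] = good  # overwrite the bad copy
        return good, bad_disks


mirror = MirrorSketch()
mirror.write(1, b"backup data")
mirror.disks[0][1] = b"backup dat\x00"  # simulate silent corruption on drive 0

data, bad = mirror.read(1)
print(data, bad)  # b'backup data' [0]
```

A plain mirror without checksums would see two different copies and have no way to tell which one to trust. With the checksum, the bad drive gets counted and fixed, and the data handed back is still correct.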

Sure, backing up over a network (Time Machine’s sparse image stuff) is *way* slower than a directly connected FireWire drive, but in my opinion it’s well worth doing it this way for the data integrity that you don’t get on a single USB or FireWire drive.

Thank you ZFS for keeping my data safe. B*gger off Western Digital for making crappy drives. I’m off to get a replacement today… what will it be? Samsung or Seagate?

4 responses to “ZFS = Data integrity”

  1. floppyDJ 10 March, 2011 at 23:10

    Make it a seagate, though I haven’t bought a samsung drive in over 5 years, maybe they are better now!

    and hey 😉

  2. mattconnolly 11 March, 2011 at 17:46

    And after that I did a `zpool scrub` to verify the entire pool again just to be sure. Ended up with 110 checksum errors:

      pool: rpool
     state: DEGRADED
    status: One or more devices has experienced an unrecoverable error.  An
    	attempt was made to correct the error.  Applications are unaffected.
    action: Determine if the device needs to be replaced, and clear the errors
    	using 'zpool clear' or replace the device with 'zpool replace'.
       see: http://www.sun.com/msg/ZFS-8000-9P
     scan: scrub repaired 422K in 15h23m with 0 errors on Fri Mar 11 06:28:31 2011
    config:
    
    	NAME          STATE     READ WRITE CKSUM
    	rpool         DEGRADED     0     0     0
    	  mirror-0    DEGRADED     0     0     0
    	    c8t1d0s0  DEGRADED     0     0   110  too many errors
    	    c8t0d0s0  ONLINE       0     0     0
    	cache
    	  c12d0s0     ONLINE       0     0     0
    
    errors: No known data errors
    

    Again, none of my data has been compromised, thanks to the good copy on the other drive in the mirror.

  3. Mcof 15 August, 2012 at 04:31

    I’d just like to note that HDD brands are a poor way of judging quality. The major companies all have varying production facilities and designs per model, in part because they acquired others (such as Seagate buying Maxtor).

    • mattconnolly 15 August, 2012 at 14:50

      True that. Nonetheless, my experience so far is that I’ve only seen these corruption errors on Western Digital drives, and ZFS is the only file system that tells me.
