Should you consider upgrading to a NAS?

Chuq von Rospach has made the jump from managing buckets of disks to running a centralized NAS and has posted about both the good and bad aspects of doing so. 

About 14 months ago I had to make the jump from using 2 Terabyte disks to 3 Terabytes and I realized I was within a few months of needing to start the migration to 4TB drives and that the trend was accelerating. Once you grow past 4TB drives, it gets complicated fast, and I wasn’t looking forward to that.

Indeed. Now that I’m fully moved into my Synology, I’m pretty happy with things. Now, I’m thinking about how to do a decent offsite sync with another. Maybe I’ll set up another NAS in Europe next year and get the ultimate in geographic redundancy…

Helium may be the new awesome in big storage

Solid state devices are rapidly replacing spinning platters in lots of end-user machines—our laptops and handhelds most especially. But, there’s still a huge need for spinning disks when you need to store lots of data, especially in a data center or a NAS. Western Digital is onto a new way to squeeze more disk platters into the same physical volume by using Helium. Nifty.

Seagate introduces Ethernet as a drive interface

Imagine a storage setup where instead of putting drives into some sort of array enclosure that acts as a NAS or is plugged into a SAN, you simply add drives directly to the network and let them sort it out. Ars reports on Seagate’s new Kinetic architecture.

Bit rot concerns

Jackson Faddis asks on Twitter:

with your storage scheme do you worry about bit rot? Are you using ZFS or any other file system to protect against file corruption?

I am indeed concerned about bit rot, at least for the RAW photo and video files that can’t be recreated. I was one of the people who were massively disappointed when Apple dropped the planned support for ZFS in Snow Leopard. Smarter people than I have all sorts of ideas on where we need to be going with filesystems; all I have to add to that is I can’t wait.

My current coping strategy for bit rot is the same as for accidental deletion and corruption by tools: defense in layers. Multiple copies in various forms means that if one gets corrupted, maybe another copy somewhere else isn’t. To help this along, I don’t overwrite older disk archives that go to the safe deposit box, I make new ones. For example, my entire archive (all 14TB+) is getting a fresh set of backup drives for the safe deposit box this December, but the older ones will stay parked there.

And the most important data, the stuff I’ve processed and that is living on my local filesystems, gets cloned off at intervals onto a fresh drive in addition to getting Time Machined.

It’s not a great answer, but it’s my current strategy. Even if bit-rot goes away as a concern, I’ll still use the defense in layers approach to defend against other risks.

Notes:

  1. @knweiss noted on Twitter that diglloydTools Integrity Checker can help with detecting bit rot. I have been considering adding something like that to my setup, and will probably do so sooner rather than later thanks to the nudge.
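
As a rough illustration of what a checksum-based integrity check does, here is a minimal Python sketch. It is not how diglloydTools works internally (that isn't described here); the function names and the choice of SHA-256 are assumptions made for the example. The idea is to record a digest for every file once, then re-hash later and flag anything that no longer matches.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_changed(root, manifest):
    """Return relative paths that are missing or whose digest differs."""
    root = Path(root)
    changed = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or file_digest(target) != expected:
            changed.append(rel_path)
    return changed
```

The manifest would need to be saved somewhere outside the archive itself (as JSON, say). A mismatch only says the bytes changed, not why, so this is most useful on archives of original files that are never supposed to be edited.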

NAS vs Thunderbolt RAID

Suraj Rai asks on Twitter:

any thoughts on Thunderbolt devices like Promise instead of a NAS. Speed-wise presumably much faster than Gigabit cap?

Oh yes, Thunderbolt devices like the Promise RAID blow through the Gigabit cap with ease. Sequential read/write performance on a directly connected Promise Pegasus is about 5-6 times what you can get over Gigabit Ethernet, using AnandTech’s numbers.
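
To put that ratio in terms of waiting time, here is a back-of-the-envelope sketch in Python. It assumes the theoretical Gigabit line rate of 125 MB/s (real-world throughput is lower once protocol overhead is counted) and simply applies the 5-6x multiplier quoted above; the 1TB transfer size is an arbitrary example.

```python
# Gigabit Ethernet line rate: 1,000,000,000 bits/s = 125 MB/s (assumed, no overhead)
GIGABIT_BYTES_PER_SEC = 1_000_000_000 / 8


def transfer_hours(size_bytes, bytes_per_sec):
    """Hours needed to move size_bytes at a sustained rate."""
    return size_bytes / bytes_per_sec / 3600


one_tb = 10**12  # example transfer size, decimal terabyte
nas_hours = transfer_hours(one_tb, GIGABIT_BYTES_PER_SEC)
for speedup in (5, 6):
    das_hours = transfer_hours(one_tb, GIGABIT_BYTES_PER_SEC * speedup)
    print(f"{speedup}x: {das_hours:.2f} h direct vs {nas_hours:.2f} h over Gigabit")
```

Under those assumptions a terabyte takes a bit over two hours across Gigabit Ethernet and well under half an hour on the directly attached array.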

If the sole criterion is speed, then a directly attached RAID wins every time, no doubt about it. The downside is that once you format a RAID on a Pegasus or the like, you can’t expand it later. In my case, I want to be able to access my archives at high speed, but the ability to expand my archive volume without forcing a copy to a new array trumps absolute performance.

I actually see the two, NAS and directly attached RAID, as being complementary in many ways, especially in a layered data environment like I have. I wouldn’t be surprised if I end up with some sort of Thunderbolt array attached to a new Mac Pro in the future. But that disk would be for working on live projects, not for archival purposes.

The way my data system works (right now)

Martin Irwin asks on Twitter:

If possible, would you consider making a very quick diagram of your entire backup/work system? A picture is a thousand words ;)

Sure, here’s a quick go at it hacked together in OmniGraffle. It’s simplified a bit, especially when it comes to the specifics of my photo workflow, which is a topic for another time. But, as far as where the 14TB+ of my data goes, here’s a good overview:

In essence, data goes in one way or another through my primary machine, currently a Retina MacBook Pro. Every photo I take (no matter if I push it through Aperture, Photo Mechanic, or Lightroom) gets copied into my photo archive on my NAS in YYYY/MM/DD folders. Likewise, video clips and time-lapse data end up in my footage archive on the NAS on a project-by-project basis.
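
The YYYY/MM/DD layout is easy to generate programmatically. The following Python sketch shows one way to build a destination path from a capture date; the function name, mount point, and filename are hypothetical, and where the date comes from (EXIF data, file modification time, or something else) is left to the caller.

```python
from datetime import date
from pathlib import PurePosixPath


def archive_path(archive_root, shot_on, filename):
    """Build <root>/YYYY/MM/DD/<filename> for a photo taken on shot_on."""
    return PurePosixPath(archive_root) / f"{shot_on:%Y/%m/%d}" / filename


# Hypothetical mount point and filename, for illustration only
print(archive_path("/Volumes/photo", date(2013, 6, 15), "IMG_0001.CR2"))
```

Zero-padded months and days keep the folders sorted chronologically in any file browser, which is part of what makes a year-based split of the archive straightforward.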

As I said, I use a variety of tools to work through images. The final destination for all my finished images, however, is an Aperture library on my laptop. That gets backed up, along with all the other data on my drive, to a Time Machine volume on the NAS and also gets cloned on a semi-frequent basis to an external drive using Carbon Copy Cloner.

The final step is offsite backups. Since the photo and footage volumes from the NAS are organized by year, it’s pretty easy to keep a set of disks for each year, and the only one that needs frequent attention is the current year. Every once in a while, I use a fresh set of disks for these archives.

RAID is not backup

A flurry of email, partially caused by my glossing over things a bit too glibly in my last post (since corrected), indicates that it’s time to say this more clearly again:

RAID is not backup.

If you’re thinking about getting or have bought a RAID or a NAS so that you don’t have to make backups, reconsider. Reconsider now. While being able to mitigate a single disk failure is a strong point of a good array, arrays are still subject to catastrophic failure of multiple disks or some other hardware component. They also can’t help you with accidental or purposeful deletion of data, lightning bolts, or a beverage spill.

For planning purposes, you should think of an array as a way to get a bigger or faster volume. And that volume needs to be backed up. David Magda wrote a great definition in an email exchange:

A backup is a coherent copy of the data on independent media

Back in the days of old, this was a pile of floppy disks that could recreate data on a hard drive. Then it became a smaller pile of DVDs. If you were big time, you used tape drives. These days, it’s easy to use a single disk to back up onto if your dataset is less than 4TB.

He went on to add that you have to be able to use that copy to restore from in order for it to be considered any good. Agreed.

Handling data backups with a NAS

Steve Kalwarf asks:

I’m curious how you handle backups of the data on your Synology box? Ever since I had a series of ReadyNAS power supply failures, I’ve been reluctant to have all my data stuck in a black box without some way to recover it if the hardware fails.

Your reluctance is totally understandable. While modern RAID systems do increase reliability to a certain point, they don’t go that far. For example, it’s not unheard of for a second disk to fail while recovering an array from the first disk failure. In addition, the added number of moving parts works against you, reliability-wise. Without going too far down a rathole, it’s best to think of an array as a bigger and/or faster disk that’s somewhat more bulletproof than a single drive, but certainly not infallible.

With that in mind, I treat all arrays—NAS or locally attached RAID—as single units that can and likely will fail at some point. I might feel safer about them than a single drive, but they need backing up. The sheer amount of data you can put on a NAS makes this a bit difficult, but I maintain a set of external drives that I rotate through my safe deposit box. This would be insane except for the fact that all my photo and video files are organized by year/month/day. Turns out that it’s easy to dedicate a year to an external disk.
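
Whether a given year still fits on one external disk is easy to check. This Python sketch assumes the top-level folders of the archive are named by four-digit year, as in the layout described above; the function names are made up for the example, and anything that isn't an all-digit folder is ignored.

```python
from collections import defaultdict
from pathlib import Path


def bytes_per_year(archive_root):
    """Sum file sizes under each top-level year folder of the archive."""
    totals = defaultdict(int)
    for year_dir in Path(archive_root).iterdir():
        # Skip stray files and folders that are not named like a year
        if not (year_dir.is_dir() and year_dir.name.isdigit()):
            continue
        for p in year_dir.rglob("*"):
            if p.is_file():
                totals[year_dir.name] += p.stat().st_size
    return dict(totals)


def years_over_capacity(totals, disk_bytes):
    """Return the years whose data no longer fits on a single disk."""
    return sorted(year for year, size in totals.items() if size > disk_bytes)
```

Any year reported by `years_over_capacity` would need a larger disk or a split across two; everything else maps cleanly to one disk per year.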

In the medium term, I’m seriously considering two NAS devices for my home-based studio. I’d use one as my primary data store and then schedule a regular sync between the two. Even if I do this, however, I’ll maintain an offsite backup as the files I’m backing up can’t be recreated or bought again. If the bulk of your data is stuff you can recreate, then it’s not as important to go to that length.

Also in the medium term, I’d like to integrate with something like Glacier or CrashPlan as an offsite backup. The only challenge there is either getting a faster network connection to push 14+TB of data or getting organized enough to seed it with disks I send in.
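
To see why the network connection is the sticking point, here is a quick estimate in Python. The uplink speeds are hypothetical examples, not measurements of any particular connection, and the calculation assumes the link is fully saturated around the clock with no overhead.

```python
def upload_days(size_tb, uplink_mbit_per_sec):
    """Days needed to upload size_tb (decimal TB) over a saturated uplink."""
    size_bits = size_tb * 10**12 * 8
    seconds = size_bits / (uplink_mbit_per_sec * 10**6)
    return seconds / 86_400


# Hypothetical uplink speeds in Mbit/s
for mbit in (5, 20, 100):
    print(f"{mbit:>3} Mbit/s: {upload_days(14, mbit):6.1f} days for 14TB")
```

Even at the optimistic end, an initial upload runs for about two weeks straight, and at a few megabits per second it stretches to most of a year, which is why seeding with shipped disks is attractive.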

Thoughts on the new Mac Pro

Nathan Ingraham from The Verge sent an email the other day asking:

We’re running a report on Apple’s new Mac Pro and we wanted to talk to some professionals in the fields of audio, video, photography, and so forth to see if the new machine is a compelling option for them.

It’s a good question. I answered him and today The Verge published an article titled: The new Mac Pro: will professionals embrace Apple’s brave, expensive vision of the desktop? In it, I’m quoted as saying:

"I think you’re going to hear a lot of people make a big deal about the externalization of storage," Davidson tells The Verge. “It’s an easy thing to pick on. Sort of like dropping floppy drives and DVD drives. But I’m not concerned about that at all.”

Why am I not concerned? Everyone I know who deals in media quickly gets to the point where internal drives don’t cut it and you end up moving to an external RAID enclosure connected via eSATA or Thunderbolt, or you move to a NAS. Either way, I’m totally fine with the days of spinning disks in our primary machines being over.

Also, the small size has some distinct advantages, the article notes. I know that the TED media team has talked about being able to take a Mac Pro in carry-on luggage to an event. That’ll be awesome.

Read the full article on The Verge.

Yep, I’d buy the Synology

Yesterday, I got a chance to have lunch with long-time friend Ryan Irelan and catch up. Over an incredibly great brunch at Tasty & Alder—quickly becoming one of my favorite restaurants—Ryan asked the same question in person that’s been asked online about the Synology DiskStation I’m setting up: Would I spend my own money on it?

I don’t have the long term experience to answer the question, but certainly an opinion is forming and it’s a favorable one.

My primary comparison point is the ReadyNAS Pioneer Pro I’ve been using as my primary archive drive for the last year or so. Before that, I had Drobos as well as various other enclosures, RAID and otherwise. Both the ReadyNAS and Synology devices do the same basic thing: provide you expandable fault-tolerant storage that you can throw a bucket of drives at and continue to upgrade.

Both offer good performance. Both offer a ton of system plug-ins that will turn them into web servers, media servers, and so much more. Once you load them up with drives, the cost differences aren’t significant enough to be a decision driver.

At this point, two things really stand out:

  1. The Synology is quiet. So very quiet. The only time I notice it is when the disks spin up after they’ve spun down. Other than that, it doesn’t make its presence known. The ReadyNAS is loud enough that I almost immediately moved it into my back closet.
  2. With the Synology, I’ve got the option of adding another ten drives seamlessly to the current volume by adding on up to two DX513 expansion units that connect to the main unit via eSATA. In other words, I can easily more than double the capacity of the unit without rebuilding volumes or moving a ton of data across the network.

Both of these points are important to me, but it’s that last one that really is nice. My data needs are only going up and my ReadyNAS is maxed out with 4TB drives and is 80% full, which means that I’ve already been looking at a data migration to a new device. I don’t want to have to do that yet again anytime soon if I can avoid it.

So, yah, while I don’t have enough experience to give an unqualified endorsement, the Synology DS1813+ is what I’d spend my own money on based on what I know so far.

Synology DS1813+ first look

A couple of weeks ago, the folks at Synology got in touch and wanted to know if I’d like to test drive one of their big beast NAS setups. Given my insatiable need for storage these days, an upcoming decision on how to upgrade my current setup, and a glowing endorsement from Marco, I jumped at the chance. What arrived shortly after was an eight-bay Synology DS1813+ chassis stuffed with 3TB Western Digital Red drives.

Setup was straightforward enough, pretty much as plug and play as you can probably get with a device like this. The only hiccup I had along the way was that auto discovery of the device on my network by the setup tool was blocked by my Mac’s built-in firewall. Once I sorted that out and disabled the firewall (temporarily), I was able to hook up and set things up.

The result is twenty-something Terabytes of fresh storage. My first order of business was to set up a Time Machine share. That’s easy enough to do, as all you need is a share that’s accessible via AFP. The trick to doing this right, at least in my opinion, is to set the share up under a user that has a quota so that I can limit my Time Machine backups to 2TB instead of letting them slowly grow without bound.
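
For a sense of where "twenty-something" comes from, here is a small Python sketch of the arithmetic for eight 3TB drives. The post doesn't say which RAID level is in use, so the number of drives given over to redundancy is an assumption passed in as a parameter, and filesystem overhead is ignored.

```python
def usable_tb(drive_count, drive_tb, parity_drives=1):
    """Usable decimal TB when parity_drives' worth of space goes to redundancy."""
    if drive_count <= parity_drives:
        raise ValueError("need more drives than parity drives")
    return (drive_count - parity_drives) * drive_tb


def tb_to_tib(tb):
    """Convert decimal terabytes (10**12 bytes) to binary tebibytes (2**40 bytes)."""
    return tb * 10**12 / 2**40


for parity in (0, 1, 2):
    tb = usable_tb(8, 3, parity)
    print(f"{parity} parity drive(s): {tb} TB, or about {tb_to_tib(tb):.1f} TiB")
```

The raw total is 24TB; with one drive's worth of redundancy it drops to 21TB, which an operating system that counts in binary units would report as roughly 19.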

As far as performance goes, it’s as fast as you can want it to be. In the little bit of testing I’ve done, it is able to completely saturate a Gigabit Ethernet connection, which means that the bottleneck is the network getting data to and from the device, and that is the limiting factor you want to have in a NAS. I’ll be interested to see what the performance looks like after I get my primary photo archive copied over.

The last thing I’ll note for now is I can confirm Marco’s statement about how quiet the Synology runs. I set it up in my living room just to play with and frankly, I don’t even notice it from a few feet away. I’ll probably move it into my closet at some point, but that’ll only be because I’ll want it out of the way, not because it’s annoying or even noticeable.

Over the coming weeks, I’ll post thoughts and observations on how things go and will put them under the same tag as this post.