The Cowboy

Monday, August 11, 2008

RAID 5: These aren't the droids you're looking for.

In my earlier post on Windows Home Server's Drive Extender vs RAID, a lot of what I said was a good example of somethin' my pappy once told me:

"Good judgment comes from experience, and a whole lotta that comes from bad judgment."

Well, rest assured that I've had my share of bad judgment, as I'm sure others have, given some of the feedback I've received.  Charlie pointed out to me some of the other really good reasons why you're better off just using Drive Extender rather than going with RAID5:

While RAID5 hardware is fast, it is also pretty much universally incompatible. In other words, if your RAID5 controller fails you will have to find an IDENTICAL controller to replace it or your array will be unreadable.
For those playing along at home, this is bad. If you think your data is valuable enough to keep around, and you're worried enough about failure that you're going to do something about it, think about what else will fail--not can, but will--hard drives, controller cards, motherboards, RAM, network adapters, the power.  I'd mention the possibility of meteor strikes, but ... I'm gonna play the odds with that one.

When I think about how each of those failing is going to affect my ability to even recover data, the controller card is a nasty wildcard. There is no standard layout for how each controller card uses all that storage, so a dead card is going to need replacing with the exact same controller.

All drives in a RAID5 array have to be the same size.
Hey sure, no problem. I can get 750 GB drives the same size in the future... Uh, except there is no guarantee that the drives will be exactly the same size (even if they are labeled the same), and it's possible that other changes in the future could render them so obsolete that it could be hard to even source drives of a particular size.

Furthermore, as you progress into the future, with Drive Extender, you can simply buy any size of drive that fits your budget, and add it to the server. No worries about how big, what brand, how fast, etc.
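To put a number on the same-size problem, here's a rough sketch. It assumes the textbook RAID5 rule that every member only contributes as much as the smallest drive, with one drive's worth going to parity. The function names and drive sizes are made up for illustration, and real controllers may behave differently.

```python
def raid5_usable_gb(drive_sizes_gb):
    """Textbook RAID5 capacity: (n - 1) * smallest member."""
    if len(drive_sizes_gb) < 3:
        raise ValueError("RAID5 needs at least three drives")
    return (len(drive_sizes_gb) - 1) * min(drive_sizes_gb)


def pooled_raw_gb(drive_sizes_gb):
    """A pool of independent disks simply adds the sizes up."""
    return sum(drive_sizes_gb)


# Hypothetical: three older drives plus one bigger, newer one.
drives = [750, 750, 750, 1000]
print(raid5_usable_gb(drives))  # 2250
print(pooled_raw_gb(drives))    # 3250
```

Under that rule the extra 250 GB on the bigger drive just sits there in the array, while a pool counts every gigabyte (before any duplication is taken off the top).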

Drive Extender gives you the flexibility to NOT duplicate files that don’t need to be duplicated (making it MORE efficient than RAID 5 in some cases).
And this is a really, really good point too. Sometimes, I have data that I really don't care as much about, or I'm less worried about it being backed up and online.  With Drive Extender, you can mark some content as not requiring duplication. Like maybe TV shows I've recorded on Media Center. Or those videos of my in-laws' trip to Cleveland. You know what I mean.

RAID5 rebuild times with a single drive failure are often as long as your 3.5 month ordeal of recovery!
Well that's certainly true--I took all summer to restore my data, and I had no idea until I got it back if I would.

RAID deals at the block level and thus knows nothing about the interesting file information (metadata, etc…).  Thus it will NEVER be able to be smart about your storage.  But Drive Extender is file based and tons of innovation can happen (will happen!) leveraging this fact.
This is the point that I think is even more important, and yet so many folks will either ignore it or not really understand it.  RAID is a system of aggregating independent disks and presenting the result to the OS as a single large drive. It works at a block level, meaning that your RAID system really knows nothing about files, just blocks. And because it's unaware of files, there are so many optimizations that can't be done with it. With file-level data redundancy, the system knows a lot more about the data, and can begin to make decisions that would be much more beneficial.  Think, down the line in a couple of years, as I add a few more hard drives to my growing server--which already contains drives that are connected by FireWire, SATA, ATA and USB (*sigh*)--it's possible that the technology could realize that I rarely ever touch some files (or specifically, some types of data) and it could migrate those files to the slower media, and keep the faster media for things that need it.
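Here's a minimal sketch of what that kind of file-aware decision could look like. To be clear, this is not something Drive Extender does; FileRecord, pick_tier and the 180-day cutoff are all hypothetical names and numbers, just to show that the decision needs per-file information a block-level system doesn't have.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileRecord:
    path: str
    last_access: datetime


def pick_tier(record, now, cold_after=timedelta(days=180)):
    """Return 'slow' for files untouched longer than cold_after, else 'fast'."""
    return "slow" if now - record.last_access > cold_after else "fast"


now = datetime(2008, 8, 11)
files = [
    FileRecord("Videos/in-laws-trip.avi", datetime(2007, 6, 1)),
    FileRecord("Documents/taxes.xls", datetime(2008, 8, 1)),
]
placement = {f.path: pick_tier(f, now) for f in files}
print(placement)
```

The video nobody has opened in over a year lands on the slow tier, and the spreadsheet touched last week stays on the fast one.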

Now, my momma said that I should always end by sayin' somthin' nice, so I'll leave ya with this:

So is there anything good about RAID5 or are you just some sort of once-scorned-RAID-hater?
I've said it before but it bears repeatin': RAID--in general--is great for performance, and RAID5 adds availability to that.  My desktop at home has 1.4 terabytes of space, striped (that's RAID0). My desktop at work has 2.8 terabytes of storage, again, striped. Sure, sure, with 4 750 GB drives, and using striped storage, I'm subject to failure at 4x the rate I was before (probably more, cowboy math ain't that great... :D). On those systems, I'm not leaving anything there that I can't afford to lose, either because I've backed it up, or it's recreate-able.

 

But WHOOOOOSH, it sure goes fast!
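For anyone who wants to check the cowboy math on striping, here's a small sketch. It assumes drives fail independently (same-batch drives may well not), and the 5% per-drive figure is a made-up number for illustration, not a measured failure rate.

```python
def any_drive_fails(per_drive_probability, drive_count):
    """Chance that at least one of drive_count independent drives fails."""
    return 1 - (1 - per_drive_probability) ** drive_count


p = 0.05  # hypothetical chance that one drive fails in a given year
single = any_drive_fails(p, 1)
striped = any_drive_fails(p, 4)
print(round(striped / single, 2))  # 3.71
```

Under those assumptions a four-drive stripe comes out a bit under 4x as likely to lose everything as a single drive. Correlated failures would push that number the other way.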


 

Windows Home Server's Drive Extender vs RAID

I use Windows Home Server at home to store *everything*... it's really quite a fantastic product. It has a feature called Drive Extender, which Wikipedia describes nicely:

Windows Home Server Drive Extender is a file-based replication system that provides three key capabilities:

  • Multi-disk redundancy so that if any given disk fails, data is not lost
  • Arbitrary storage expansion by supporting any type of hard disk drive (Serial ATA, USB, FireWire etc.) in any mixture and capacity
  • A single folder namespace (no drive letters)

Users (specifically those who configure a family's home server) deal with storage at two levels: Shared Folders and Disks. The only concepts relevant regarding disks is whether they have been "added" to the home server's storage pool or not and whether the disk appears healthy to the system or not.
Shared Folders have a name, a description, permissions, and a flag indicating whether duplication (redundancy) is on or off for that folder.

If duplication is on for a Shared Folder (which is the default on multi-disk Home Server systems and not applicable to single disk systems) then the files in that Shared Folder are duplicated and the effective storage capacity is halved. However, in situations where a user may not want data duplicated (e.g. TV shows that have been archived to a Windows Home Server from a system running Windows Media Center), Drive Extender provides the capability to not duplicate such files if the server is short on capacity or manually mark a complete content store as not for duplication.
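The per-folder flag described above is easy to model. This is only a sketch of the bookkeeping, assuming a duplicated folder takes exactly twice its size in the pool; SharedFolder and pool_space_used_gb are illustrative names, not part of any Windows Home Server API, and the folder sizes are made up.

```python
from dataclasses import dataclass


@dataclass
class SharedFolder:
    name: str
    size_gb: int
    duplicated: bool  # the per-folder duplication flag


def pool_space_used_gb(folders):
    """Duplicated folders are stored twice, the rest once."""
    return sum(f.size_gb * (2 if f.duplicated else 1) for f in folders)


folders = [
    SharedFolder("Photos", 200, duplicated=True),
    SharedFolder("Recorded TV", 500, duplicated=False),
]
print(pool_space_used_gb(folders))  # 900
```

Turning duplication on for the TV folder as well would take the same data from 900 GB to 1400 GB of pool space, which is the trade-off the flag lets you make folder by folder.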

Here at Microsoft, we have an internal mailing list for WHS, and every once in a while, someone asks one of the following questions:

Isn't RAID better than Drive Extender?
Why should I use Drive Extender instead of RAID?
Which RAID card should I buy?
How good is software RAID5?

I try to ignore those threads, but when the responses start coming in about the merits of RAID vs simply using DE, I end up getting itchy, and chime in. The topic came up again, this last weekend, and I recycled an old response, and it started looking like a good blog post... so here's the skinny.

First, the reason why you don’t want Software RAID 5

First, there’s a big gap between software RAID5 and hardware RAID5. Software RAID5 is slow. Damn Slow. Faster than that… maybe pretty damn slow. Not a great solution. You won’t be happy at the end of the day (see section below “Why you don’t want RAID 5.”)

Hardware RAID5 is fast. Zippity fast. So is how fast you will lose your data.

Why you don’t want RAID 5

RAID 5 is not about data integrity… it’s about performance and availability

If you want your data to be safe, replicate it. Back it up. Put it in more than one place at a time.

If you use RAID5, you still need to back up your data.  RAID5 is designed so that after a single drive failure your data is preserved and stays available (but slower) until you get another drive in place, at which point it rebuilds the missing volume.
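The rebuild trick is XOR parity. Here's a minimal sketch of the idea with three data blocks and one parity block; real RAID5 rotates parity across all the drives and works on much larger stripes, and the xor_blocks helper and the block contents are just for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"howdy", b"pards", b"yeeha"]  # one block per data drive
parity = xor_blocks(data)              # what the array stores as parity

# The second drive dies: rebuild its block from the survivors plus parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])  # True
```

Notice that the rebuild needs every surviving block. Lose a second drive before the rebuild finishes and there's no longer enough information to work out either missing block.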

Here’s the kicker. What happens when a drive fails, and you are not there? If the system is in use, it’s going to get really really busy, and all of the drives in the array are going to get a lot of use.

When that hard drive fails (and you are not planning for IF but WHEN), and the others pick up the slack, the chances of losing a second drive go thru the roof. What will you lose if a second drive goes?

This is common, especially in a server/computer in a home environment, where the drives may not be busy most of the time.

One other contributing factor to multiple drive failure in RAID5 is that people tend to use the same brand of drives, often from the same batch (i.e., you bought them at the same time).

My personal experience with RAID5

I had a server running RAID5 at home, and it ran perfectly for over a year (actually, close to two). One night after I went to bed, a drive failed. 3 minutes later another failed. This was a 2 terabyte RAID array.

I came down in the morning to my worst nightmare. Every bit of ‘valuable data’ I had in the world was now gone. In desperation I scoured the internet, and finally found a piece of software that (for $40!) could recreate every file that I still had data for, if a little slowly. I rushed out and bought three 750 GB drives, and started to restore everything I had lost. The restore process took 3 and a half MONTHS, running full time, around the clock. The good news is that I was able to get one of the failed drives spinning again, and I lost a total of one file.

What did I learn?

RAID5 doesn't back up my data. Sadly, I thought it was safer. Worse than that, it was actually less safe. A single drive failure would have meant nothing. Add another drive, and keep chugging. Potentially, it may have taken a few hours to rebuild the lost volume, but I could have been using it while it did.

A second drive failure would have meant I was offline for the time it took to restore--if I actually had a backup. Still, not bad, considering that would have been less than the 3.5 months.

But a two-drive failure (which is fairly likely)--without a backup--is a nightmare.

If you value your data, replicate. I now have a home server with six 250 GB drives and three 750 GB drives, and the data that I value is replicated (and the really valuable data is foldershare’d to a friend’s house, and vice versa, giving us offsite backups too). Sure, it’s not as ‘space efficient’ but at least I can deal with a drive failure.

RAID 1 (mirroring) is the only RAID where a failure doesn’t increase drive activity drastically--well, reads are all going to one drive now, but if you had 8 drives mirrored in 4 sets, typical access won’t cause all the drives to get busier.

RAID 0, of course, is purely about speed. Half the safety at twice the speed. That’s what I use in my desktops, where I want it fast. (I of course back up anything that I’m not willing to lose to the server.)

What is my Advice?

Know this: if you are using hard drives, one day, you will experience a drive failure. Not 'might', but 'will'.
How you are affected depends on your choices.

Determine how valuable your data is.
Stop thinking about the price of the hard drives. Disk Space is very cheap. It got cheaper while you were reading this. It's the stuff you store that's not.

Are you planning for the inevitable, or playing the odds?
I can talk all day about why DE is better than RAID, or why one particular strategy is better than the other. At the end of the discussion, you're still the one making your decision, and you're probably pretty smart. (You're reading my blog.) Ask yourself: why are you doing what you are doing?

I'll leave the rest to your imagination.


 

Wednesday, July 23, 2008

Interesting thing found at OSCON: Taint

I attended a session this morning called "PHP Taint Tool: It Ain't a Parser" by Luke Welling. Luke introduced a tool he's working on at OmniTI that is designed to assist in sniffing out where the potential for untrusted input is handled. From the session description:

... You want to see where untrusted input can propagate taint within the application. In complex logic that might mean chasing many possible execution paths. Using an automatic tool to try to follow these paths without running all possible input variations is called static analysis. ... The Taint tool allows the PHP engine to do as much as possible, then cuts in at the last stage to analyze the compiled opcodes and trace possible flow of execution.

The Taint tool presents opcodes in a readable way, making it clear what lines of source got compiled into specific opcodes. It also performs a static analysis on the code, following the opcodes to attempt to trace all possible code branches and mark lines that tainted data can be passed to.

Essentially, the tool uses the parts of the PHP engine to compile PHP code to opcodes, and then tracks where data comes and goes, and highlights the code that handles data that *could* be tainted--that is, input from the user either by POST or GET parameters.  This provides a facility for a developer to identify the lines that they should closely review to ensure that they are not accidentally introducing security holes (like cross-site-scripting opportunities). 
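To give a feel for what "tracking where data comes and goes" means, here's a toy sketch of taint propagation. It is not the OmniTI tool and doesn't touch PHP opcodes: it assumes a straight-line program with no branches, written as hypothetical (line, target, operands) tuples, and treats $_GET and $_POST as the only taint sources.

```python
TAINT_SOURCES = {"$_GET", "$_POST"}


def tainted_lines(statements):
    """statements: list of (line_no, target, operands) in execution order.

    Returns the line numbers whose result can carry user input.
    """
    tainted_vars = set(TAINT_SOURCES)
    flagged = []
    for line_no, target, operands in statements:
        if any(op in tainted_vars for op in operands):
            tainted_vars.add(target)
            flagged.append(line_no)
        else:
            tainted_vars.discard(target)  # overwritten with clean data
    return flagged


program = [
    (1, "$name", ["$_GET"]),            # $name = $_GET['name'];
    (2, "$title", []),                  # $title = 'Howdy';
    (3, "$msg", ["$title", "$name"]),   # $msg = $title . $name;
    (4, "$name", []),                   # $name = 'stranger';
    (5, "$out", ["$name"]),             # $out = $name;
]
print(tainted_lines(program))  # [1, 3]
```

Lines 1 and 3 get flagged for review; line 5 doesn't, because $name was overwritten with a constant on line 4. A real tool has the harder job of following every branch the code could take.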

Now, it's not-quite-ready for prime-time, but it's getting close, and the folks over at OmniTI intend to release it as open source when they are ready.  When this gets released, I'll be really excited, as it looks like it could be really good for hunting down security holes.

I also attended Rasmus Lerdorf's (the Yahoo PHP guy) tutorial on "PHP: Architecture, Scalability, and Security", which was really quite good too. He demonstrated a tool they have at Yahoo (the name of which I can't remember now...grrr): he points it at a web page, and it starts throwing a large library of strings at it that may uncover security problems, but it does it from the client side.  Unfortunately, he's not releasing it, not because he doesn't want to let folks find and fix their bugs, but because the release of such a tool could bring about Internet Armageddon--it would likely find exploitable problems in the vast majority of the Internet.
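The client-side idea can be tried on your own code without any special tool. This sketch is not Rasmus's tool and doesn't go near a live web page: it feeds two made-up probe strings into two hypothetical rendering functions and checks whether they come back unescaped. html.escape is from the Python standard library; everything else here is an illustrative name.

```python
import html

PROBE_STRINGS = ['<script>alert(1)</script>', '"><img src=x onerror=alert(1)>']


def unsafe_greeting(name):
    return "<p>Howdy, " + name + "</p>"


def safe_greeting(name):
    return "<p>Howdy, " + html.escape(name) + "</p>"


def reflected_unescaped(render):
    """Return the probe strings that come back in the output verbatim."""
    return [probe for probe in PROBE_STRINGS if probe in render(probe)]


print(len(reflected_unescaped(unsafe_greeting)))  # 2
print(len(reflected_unescaped(safe_greeting)))    # 0
```

Any probe that comes back verbatim marks a spot where user input reaches the page without escaping, which is exactly the kind of line worth a closer look.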

Both approaches to finding application holes are useful, and it's clear from both talks that this is still a really large problem that developers need to address.

(I've had a problem with spam comments; I'll be addressing that soon, so if you see comments turned off you can drop me an email: garretts...at...microsoft...dot...com)


 

Hey, are you at OSCON?

This week I'm at OSCON in Portland, OR. I like what their site says about it:

"OSCON is the crossroads of all things open source, bringing together the best, brightest, and most interesting people to explore what's new, and to champion the cause of open principles and open source adoption across the computing industry."

It really is exactly that. It seems like I've met so many people here, and have had so many great conversations, it's like time slows right down, and the universe is conspiring to squeeze everything it can into just a few days.

I'm having a great time here, and with so much going on, I feel like a kid in a candy store. The biggest trouble I'm having is picking what sessions I want to attend, as there are just so many worthwhile ones.  However, given the work I'm currently doing with PHP, I think I'll stick pretty close to the PHP-related sessions for the most part.

The last couple of years, Microsoft has had a fair number of people here, and this year is no exception. I keep bumping into people I know... Hey, if you're reading this, and you see me, stop and say hello!

You can recognize me by my picture.

 

Monday, July 21, 2008

Blame it on your lying, cheating, cold dead-beating, two-timing, double-dealing mean mistreating, loving heart

Ever notice how folks who blog sporadically (uh, like me!) always apologize for not blogging for a while, and then re-affirm their dedication to blogging regularly? And often, accompanying their apology, is also a reason. I was going to "Blame it on the Rain" but the very thought of quoting Milli Vanilli makes me shudder.

So, instead, Patty gets to explain it for me.  Well, now that I think about it, it really doesn't explain anything. But I was listening to that song last night, and the lyrics stuck in my head.

..... Aaaaaanyway...

The worst part about not blogging for weeks on end is that I can't just ramble on as if you know what I've been up to for the last few weeks, but I'll try to catch ya up:

Over the last several weeks, I've been moving my focus from doing "Program Management" tasks to more "Software Developer" tasks. You see, during the last year, I've discovered that I'm a Developer. Deep down, that's what I do best. Focusing in that direction is already paying off, and I'm finding that I'm accomplishing far more than I had before.

So, rather than focus on simply facilitating, I've been actually compiling, debugging, coding... aaaahhh. It's so nice.

And the best part: all the work that I'm doing is dedicated to getting Apache and PHP working much better on the Windows platform. I may just possibly have the absolute best job at Microsoft.


(Don't forget the updated .sig...)

 

Thursday, April 10, 2008

A funny thing happened on the way to ApacheCon

Back in January, I invited the Apache Software Foundation to attend the Windows Server 2008 Application Compatibility Labs, here on our campus in Redmond.  In order to get as many developers as possible to attend, we even paid for flights and accommodations for some members.

The week that Apache was here was so valuable for both groups--the product groups got to see and understand some of the issues that the Apache projects have run into, and the Apache folks were able to get their hands on the developers who built the system.

Bill Rowe and I had hammered out some details before I actually sent the invitation out. Along with posting it on some of the Apache mailing lists, I also posted the invitation on my own blog so that others could see what we're up to. And, as was to be expected, there was a wide variety of comments posted--both positive, and ... less positive.

My favorite though, was:

"Microsoft should go to Apache developers and see if Windows Server 2008 works correctly with Apache, not the other way around."

In some ways, that would have been somewhat impractical--when the Apache folks visited us, they had the opportunity to meet with engineers and program managers from many different groups, in addition to getting access to the hardware in the lab and the expertise of the folks who run that.  For us to pick up the 20 or so people from the product groups that they actually met with, and drag them all out to all the locations where Apache developers are--which is pretty much everywhere--would not have been possible.

Still, I felt it would be more than valuable for me to go to ApacheCon, so that I had the opportunity to meet with Apache developers where they roam. When Bill was in Redmond, he invited me to the Apache Hackathon--the couple of days at the beginning of the conference when developers can hang out and code.  So, a snappy 10hr flight later, here I am at ApacheCon in Amsterdam.

The Apache Foundation is an interesting community--or rather community of communities.  It's not just one project (the http server is what most people think of when they hear Apache), but literally dozens of top level projects, and a whole bunch more in the 'incubator' (where baby projects are cultivated until it is clear that they will have ongoing support and development).  The hackathon is just a large room with tables where folks can come in, sit down, open their laptops and start coding. It's actually a lot quieter than I imagined it would be.  Naturally, the folks in each community tend to gravitate together and discuss their projects.

As I'm not really on any project, I've been bouncing around chatting up different groups, getting their perspective of their own little chunk of Apache.  Most of the people I've talked to aren't surprised at all that I'm here--which is definitely a change from conferences a year ago--and are excited to hear about our efforts.

Now, for the funny thing.  I booked my hotel a few weeks back, using the internal travel system here at Microsoft.  The hotel that the conference is at was booked, so I looked for one nearby.  Unfortunately, the tool doesn't let me search for hotels near another hotel, and I didn't know what else was close that I could search near (and my inability to read Dutch didn't help). So I'd use the tool to show me where the hotels were, switch to http://local.live.com to see how close each one was, and, if it was close, switch back to the travel tool to check availability. There was not much available. I guess I was distracted while I was doing it, because I ended up booking a hotel right next to the airport, which is in no way close to the conference. I spent the night in that hotel, called the wonderful travel support folks, who found me a hotel where I needed to be, and moved there the next morning. Lesson learned: next time I travel to the Netherlands, I'm asking Hank to find me a hotel.

 

Monday, March 24, 2008

How a cowboy spends two days in Boston: Drupalcon 2008

Howdy y'all,

I was recently in Boston, and managed to spend a couple of days at Drupalcon, where Port25 was a silver level sponsor for the event.  The herd was over 800 attendees--all focused on Drupal.  Needless to say, I was duly impressed.

What's Drupal?

Drupal, written in PHP, is an open source content management platform. It's equipped with a powerful blend of features, and supports a variety of websites ranging from personal weblogs to large community-driven portals.  Drupal has been rapidly displacing a large number of other PHP based content management systems, and has an active community along with broad vendor support.

Over the last year or so, Microsoft has been working hard to improve PHP's support on Windows.  With the hard work from the SQL Server team, who recently published a new CTP of the native SQL Server PHP driver, the FastCGI work that the IIS team has done, and of course Zend, who we've been coordinating with--PHP is rapidly getting the support and attention it deserves.

So... Drupalcon?

Ah yes. From the humble beginnings in 2004, where 10 people attended the first Drupalcon, it's grown into a massive twice-yearly event (one in North America, and one in Europe) with over 800 attendees, plus sponsors. I was truly stunned at the sheer size of the event--I would have assumed a much smaller affair.

Kieran Lal hosted a session early on Monday morning, in which he told how to get the most out of Drupalcon--and really, it was applicable to any conference, and I really enjoyed it. Between that session and the first keynote, I hung out, and got to know a bunch of folks. 

Who are the people in your neighborhood?

Drupalcon was really quite special--of all the conferences I've been to, Drupalcon was home to the most friendly folk I've ever seen.  Everybody was really fun to talk to, and they all were excited to hear about Microsoft's effort in making PHP run great on Windows.

I spent about 45 minutes talking to Larry Garfield about expanding support for databases in Drupal.  Larry has done a tremendous amount of work for Drupal 7 on database abstraction--it's going to be pretty cool, trust me.

I managed a few minutes of Kieran Lal's time, which was quite amazing, as he seemed to be doing a million things at once during the conference, and barely had a spare moment to catch his breath.  We talked about the future of Drupal, and how Microsoft could get involved, and I think we're both pretty excited about the future. 

Dries Buytaert gave his traditional "State of Drupal" presentation (video can be found here), which contained a couple real eye openers:

Drupal 6 had over 100,000 downloads in the first month of release, that's 2x over Drupal 5. Wow. That's pretty amazing.

Drupal 7 (and beyond) appears to have one of the most well thought out plans in place--I can't recall another open source project that has such a detailed road map.

Then, I came home...

Aside from the jet-lag and the shortness of the trip, I enjoyed the conference immensely.  We've been playing with Drupal in our lab over the last several months, and it's clear that the time has been well spent--Drupal is not only an emerging phenomenon, but the future looks even brighter.  I reckon you'll be seeing many more posts from me in the future about it.