[NTLUG:Discuss] Re: looking for raid & controller advice -- "FRAID" card = "software RAID"

Bryan J. Smith b.j.smith at ieee.org
Sat Dec 4 19:05:26 CST 2004


On Sat, 2004-12-04 at 15:06, Kevin Brannen wrote:
> Because speed is not the issue or the goal.

Then RAID-5 is fine.

> Sorry, I really should have mentioned that!  The file server I'd build
> only has to serve 2 computers, over the Gb ethernet card as I
> mentioned.  So the NIC will be the bottleneck.

Actually, RAID-5 _can_ be the bottleneck.  Remember, swap and /var will
be going to that RAID-5 array, in addition to other things.

> Nevertheless, my goal is data safety and capacity.

3Ware Escalade 7000/8000 series are "very safe" at RAID-5.  Now if it
was an older Escalade 6000 series, I'd recommend against RAID-5.  But
the 7000+ are fine.

> The 2 client machines control CD duplicators, but even burning at 48X 
> shouldn't tax the file server.

Er, um, I don't recommend recording over a network.  I'd copy the .iso
locally first.  It's worth the overhead in transfer time IMHO.

> Presently, when the 2 clients talk to each other, they can transfer a
> 500MB image to the other in about 20s.  Since it takes almost 3
> minutes to burn a full CD, you can see that speed is not an issue --
> even if both are going at once.

RAID-5 _reads_ are _very_fast_ on 3Ware cards.  For reads, it's
basically an N-1 (where N = number of disks in the array) RAID-0 array.

RAID-5 _writes_ are the performance consideration, not reads.

But even in that case, a 4-disc RAID-0+1 is still 33% faster than a
4-disc RAID-5 on a 3Ware card.
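Roughly sketched (with an assumed 250GB per disc, purely for
illustration, and ignoring any particular controller), the
capacity-versus-write-cost trade-off between the two 4-disc layouts
works out like this:

```python
# Rough tally for 4 equal discs -- a sketch, not a benchmark.
DISKS = 4
SIZE_GB = 250  # assumed per-disc size, for illustration only

# RAID-0+1: stripe across half the discs, mirror onto the other half.
raid01_capacity = (DISKS // 2) * SIZE_GB   # 500 GB usable
raid01_write_ops = 2                       # each block written twice (stripe + mirror)

# RAID-5: one disc's worth of capacity goes to parity.
raid5_capacity = (DISKS - 1) * SIZE_GB     # 750 GB usable
raid5_write_ops = 4                        # small write = read old data + old parity,
                                           # then write new data + new parity

print(f"RAID-0+1: {raid01_capacity} GB usable, {raid01_write_ops} disc ops per write")
print(f"RAID-5:   {raid5_capacity} GB usable, {raid5_write_ops} disc ops per small write")
```

So RAID-5 buys you 50% more usable space, and RAID-0+1 buys you cheaper
writes -- which is why the read/write mix matters so much.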

> They will be consistently serving large files (500MB+), so read cache
> won't matter; write cache may not either when you consider 500MB
> files, though they will be doing reading much more than writing.

If reading is mainly what you are doing, then RAID-5 is fine.

> If I were to go with 0+1, which I don't think I need, I'd have to get 
> the -8 version of the card, because I want to be able to approach TB 
> capacity over the next couple of years.  I've got almost 200GB now and 
> am growing faster than planned, so I am concerned about size.

There's always the option to do what I do at home (on an 8-channel card):
  2-disc RAID-1 System + 6-disc RAID-5 Data

The big "performance hit" isn't so much the Data (which can vary), but
the System filesystems like swap, /tmp, /var, etc...  You should _avoid_
putting swap, /tmp and /var on RAID-5 if you can.

> Because of the reliability concerns, I'm thinking hard about doing 
> RAID-5 with a hot-spare; which seems wasteful to me initially, until I 
> remember that I've just lost 2 drives in the last week, and now will 
> have to spend a day or more reloading images from old CDs.  Grrr!

Yes, hot spares are nice.  So is RAID-6 (two parity slices).
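To put numbers on it (again assuming 250GB discs purely for
illustration), a hot spare and a second parity slice cost you the same
usable space on a 6-disc array:

```python
# Usable capacity of a 6-disc array under different safety options.
# 250 GB per disc is an assumed figure, for illustration only.
N, SIZE_GB = 6, 250

raid5 = (N - 1) * SIZE_GB            # one parity slice:      1250 GB
raid5_hotspare = (N - 2) * SIZE_GB   # parity + idle spare:   1000 GB
raid6 = (N - 2) * SIZE_GB            # two parity slices:     1000 GB

# Same usable space either way -- but RAID-6 survives a second failure
# *during* a rebuild, while a hot spare only shortens the window
# before the rebuild starts.
print(raid5, raid5_hotspare, raid6)
```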

> OK, let's ignore 0+1 for a minute and discuss RAID-5. :-)
> A 7506-8 is in $390 area, a 9500S-8 is in the $440 area (both sub $500 
> cards BTW :-).  Is the 9500S worth the extra $50?  If yes, that's 
> probably $50 well spent and within my budget.  Your thoughts?

Yes, the 9500S is worth the extra dough.  It's basically the 8500 series
with an SDRAM buffer added.  The SRAM is used first, but the SDRAM
serves as a read buffer, as well as a write buffer if necessary.

An ASIC is still driving it all, so it's very fast.

> Also, if I were to go with the 9500S-8, I only see SATA versions.  I 
> haven't heard any good SATA success stories on Linux yet.

Okay, this is where you seem to be missing something.

With an "intelligent" RAID controller, the System/OS _never_ sees the
storage controllers.  *NEVER*  The System/OS _only_ talks to the
on-board intelligence, the ASIC in the case of 3Ware.  The ASIC is the
_only_ thing that handles ATA transfers -- that's it!

As such, you _never_ have ATA/SATA compatibility issues!  Because Linux
_never_ talks to the on-board SATA controllers or drives!

This is the _power_ of an "intelligent" RAID controller.  _It_ controls
the ATA/SATA negotiation.  3Ware works with Hitachi, Maxtor, etc... and
signs NDAs and puts code into its proprietary, on-board firmware to
properly handle new drive logic, etc...

Better yet, if you ever have a 3Ware card that goes belly up, _all_
newer 3Ware cards are backward compatible.  I.e., I've moved arrays from
5000 series cards to 6000 series cards to 7000 series cards, etc...

Same deal, if you go 8000 series you can move the arrays to 9000 series.

> Not on any newsgroups, from friends, anywhere.  (maybe that means I
> don't read enough :-)  Does the 9500S deal with that and just present
> an interface to the Linux kernel so I shouldn't care?

Correct.  The on-board ASIC drives the SATA controllers/drives, _not_
Linux.  Linux _never_ sees the SATA controllers/drives.

The "FRAID" cards, on the other hand, _do_ expose the "raw"
controller/drives.  In fact, the FRAID cards (Promise, Silicon Image,
etc...) _are_ the _same_ thing as these "vendor" ATA cards that _suck_
at releasing specifications so _good_ Linux drivers can be written.

> But that is why I've been focusing on EIDE controllers.

The problem isn't SATA itself; it's that the SATA controller and hard
drive vendors don't release full specifications.  Going ATA doesn't
really solve that, but most newer ATA controllers are typically
backward compatible with older ones, which already have Linux drivers.

But that doesn't matter with an "intelligent" RAID card.  To the OS,
the "intelligent" RAID card is a basic block device.  It's just memory
mapped I/O.  All that is required is a simple driver, one that lets the
OS read/write to that memory mapped I/O.

And when bytes come to or are requested from that range, that's when the
ASIC takes over.  And it does the rest.

I've been using the 3Ware Escalade 8000 (SATA) series cards for almost 2
years now on Linux.  0 issues.

> Hmm, OK, but I think I definitely need someone to help me on the SATA 
> question above. :-)

Linux doesn't even know the 3Ware Escalade 8000 series is SATA.  Why?
The 3Ware Escalade 7000 and 8000 series are the _exact_same_logic_!
The only difference is the end-device, ATA v. SATA.

Linux _never_ sees this!  It only sees and uses the "3w-xxxx" driver to
talk to the ASIC.

Other examples include the DPT (later, Adaptec) 2400A ATA RAID
controller, which was just an adaptation of DPT's various SCSI RAID
controllers.  They _all_ use an i960.  So DPT created a unified
"dpt_i2o" driver (using Intel's i960 I2O, Intelligent I/O, support).

The Linux kernel has _no_idea_ what drives are behind the i960
microcontroller.  It just uses I2O calls to read/write to the memory
mapped I/O.

"Intelligent" RAID cards are _very_simple_ to write drivers for.  Again,
the on-board "intelligence" basically makes it look like a set of memory
addresses that can be written to/from.

> OK, thanks!  I understand all the RAID concepts, the implementation 
> details are what I'm trying to learn quickly.

I'm going to send you something off-list.  Do _not_ share it.

> Excellent!  The RAID-5 array will not be the boot area, but the huge 
> data area.  I'll probably do RAID-1 for the boot drive with a pair of 
> 80G drives I already have, as the MB has that onboard.

You _can_ use a $125 3Ware Escalade 7006-2 or 8006-2 for this drive.

> Understand, but since I want 3 drives in there, plus the parity, plus a 
> hot spare, or so I'm now thinking after sleeping on it; I think I want 
> the 8 channel card, so if I have the room to expand, why not?

I've done all sorts of real-time upgrades.  I've never resized, but
I've taken an entire volume completely down, changed it, and then
brought up a new organization and copied the data back -- with_out_
rebooting the system, all on a _single_ 3Ware card.

> I meant "less reliable" in the terms of less likely to be able to 
> recover if something goes wrong, which has been my experience, and you 
> seem to confirm.  Or so I will infer from your statement. :-)

Not really.  Journals can actually be worse.  It's a long story.

> I was looking for a 4-8 channel EIDE controller card last night that 
> didn't do RAID and was having bad luck.  If you know of any of this ilk, 
> that would be appreciated.  I agree, no need spending the extra $75 for 
> software I might not use.

The RAIDCore cards are popular, but they are still FRAID cards.  But at
least they support Linux and work with its LVM.

> OK, so I could use that for software RAID if I want to go there.  Still 
> not sure what I want to do there, but I've got a few days to read, 
> think, and plan.

Software RAID-0 is not only easy and supported directly in Linux LVM2,
it also doesn't "cost" you any overhead.

Software RAID-1 causes you to send 2x the data over the interconnect.

Software RAID-5 is where things really get bad, because you're pumping
_all_ data over the system interconnect so the XORs can be calculated
by your CPU (the XOR operation itself is the _least_ concern ;-).
That's the real performance killer -- a 10-fold hit in some cases.
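The XOR itself really is cheap -- a toy three-disc stripe (hypothetical
byte values) shows both the parity computation and how a lost disc gets
rebuilt:

```python
# RAID-5 parity in miniature: a hypothetical 3-data-disc stripe.
# The XOR is trivial for any CPU; the real cost in software RAID-5
# is shipping every byte of the stripe over the system interconnect.
d0 = bytes([0x11, 0x22, 0x33])
d1 = bytes([0x44, 0x55, 0x66])
d2 = bytes([0x77, 0x88, 0x99])

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

parity = xor(xor(d0, d1), d2)

# Disc d1 dies: rebuild it from the survivors plus the parity slice.
rebuilt = xor(xor(d0, d2), parity)
assert rebuilt == d1
print("recovered:", rebuilt.hex())
```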

> For about the same money I'd spend on the fileserver, I could buy 2 
> 7506-4LP cards with 6 disks, and put 1 controller & 3 disks in each 
> client computer as RAID-5, then just have them sync up every night 
> (which I was doing anyway),

Consider an out-of-band network connection between the 2 fileservers.

I.e., 2 NICs in each system, with one connected to a cross-over between
them.

> instead of building a file server for both clients to hit like a NAS.

That's an idea.

> Do the scales tip towards this setup (RAID-5 individually plus mirror 
> between machines)?  Or towards the fileserver (RAID-5 plus HS)?  [While 
> the file server would give me an opportunity for Linux advocacy, it's a 
> fringe benefit and not the goal, so don't let that enter into the equation.]

Two independent filesystems give you additional failover/redundancy.

> Such interesting things to think about. :-)

I guess.  I've done clustered filesystems as well, with NFS too.

-- 
Bryan J. Smith                                    b.j.smith at ieee.org 
-------------------------------------------------------------------- 
Subtotal Cost of Ownership (SCO) for Windows being less than Linux
Total Cost of Ownership (TCO) assumes experts for the former, costly
retraining for the latter, omitted "software assurance" costs in 
compatible desktop OS/apps for the former, no free/legacy reuse for
latter, and no basic security, patch or downtime comparison at all.
