[NTLUG:Discuss] Data Storage recommendations wanted.
Kipton Moravec
kip at kdream.com
Fri Nov 17 09:01:13 CST 2006
On Thu, 2006-11-16 at 18:51 -0600, Robert Pearson wrote:
> On 11/16/06, ntlug at thorshammer.org <ntlug at thorshammer.org> wrote:
> > > It would be cheaper to buy a hard drive load the data onto it, unplug
> > > the hard drive and put it in an antistatic bag and place the hard drive
> > > in storage.
> >
> > You should look into redundant disks for reliability (RAID, copy to two disks, etc). It sounds like the core of the service is the ability to produce a document after X amount of time with the ability to deliver 100% of all documents requested. So if the media the doc is stored on is lost, having a backup to pull it off of would seem good. Having redundant storage (either duplicate CD/DVDs or duplicate disks) will increase the odds against total failure. You can factor that into the cost of doing business, maybe even only duplicating 'important' docs (if there is such a classification).
> >
> > As an FYI, I think large institutions that need to keep records for a long time or 'forever' (banks, medical, etc) will put it to a tape and send it to a tape storage company (like Iron Mountain, Recall, etc). Those places have climate controlled facilities designed to hold media indefinitely. Since the tape is not reused, it won't have any problems with tape reuse like warping, jamming, breaking, etc.
> >
> > Robert
>
> This business scenario presents some interesting, and challenging
> situations that offer a real opportunity for innovation.
> Here they are as I see them:
>
> 1-10 TB of Disk - mostly turned off until needed
>
> RAID - with mirroring provides a safety margin
>
> No RAID - JBOD. Information must be stored in multiple locations to
> provide the safety margin of RAID. If the Information is stored in
> more than two JBOD areas the safety margin is better than RAID.
>
> 1-10 TB of DVD - requires an excellent cataloging system since all
> DVDs will not be online all the time
>
> Straight to DVD - a number of ways to copy the source to the DVD
>
> DVD to CD as required - standard procedure
>
> Tape Backup - defense against local, remote and global disasters
>
> No Tape Backup - could CD/DVD provide this with a 2-5 year life span?
> The 2-5 year archive life is misleading because the "roll-forward" of
> the CDs/DVDs to be kept must begin before the expiration deadline. The
> CD/DVD management solution will have to be provided. Some very
> expensive backup software, like TSM, will do this now but it is way
> too expensive for this scenario.
>
> _______
I took all the info I got from all of you to him yesterday evening and
got some clarification.
First, the data in this application must be archived for 3 years by law.
I thought it was 5 or 7, like financial records.
The current business is scanning paper records (evidence) and putting
them onto CDs for long-term backup. He thought CDs lasted longer than a
couple of years. Fortunately he has been doing this less than a year, so
no CD is older than 1 year. Doing some more research, he found that it
makes a lot of difference where the CD was manufactured and what it
is made of, and that certain types of CDs will last 5 years if
maintained properly. He is looking into what to do about the stuff
already done.
I don't like this because it is only 1 copy. If the CD gets scratched,
the data is gone.
The data is also a lot less than I thought. He is estimating 100 to 200
CDs per firm per year. Right now he has only one firm, and a second firm
is a possibility in a couple of months. He is putting each client's
paperwork for the firm on an individual CD at the moment. So the CDs are
not full. So if we estimate 500MB (my guess) of data per CD and 200 CDs
we are looking at only 100GB per firm per year. (It probably is less.)
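
To make the arithmetic above easy to redo as the numbers change, here is a rough sketch in Python. The 500MB per CD, the 100 to 200 CDs per year, and the 3-year retention come from this post; the function names and the use of decimal units (1GB = 1000MB) are assumptions made only for illustration.

```python
# Back-of-the-envelope storage estimate using the figures from this post.
# Assumption: decimal units, 1 GB = 1000 MB.

MB_PER_GB = 1000


def yearly_gb(cds_per_year, mb_per_cd=500, firms=1):
    """Estimated data added per year, in GB."""
    return cds_per_year * mb_per_cd * firms / MB_PER_GB


def retained_gb(cds_per_year, years_kept=3, mb_per_cd=500, firms=1):
    """Estimated data on hand once the retention window is full, in GB."""
    return yearly_gb(cds_per_year, mb_per_cd, firms) * years_kept


if __name__ == "__main__":
    for cds in (100, 200):
        print(f"{cds} CDs/year: {yearly_gb(cds):.0f} GB per firm per year, "
              f"{retained_gb(cds):.0f} GB per firm over 3 years")
```

At the upper estimate this gives 100GB per firm per year, or about 300GB per firm once three years of records are being held. If the CDs really hold less than 500MB each, the totals shrink in proportion.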
I am liking the JBOD in different locations the best at the moment. We
could keep the data on the firm's server for their immediate access. Two
inexpensive servers with 300GB of disk, located in different locations
with low-cost Internet connections, could keep the data on-line in the
event the firm lost it. (This would probably hold him for the first
year. Then it could be reevaluated depending on how the business is
going.)
In addition, a calendar year's data could be backed up to off-line
long-term storage. The long-term storage could be a hard drive put in a
safety deposit box.
Now the data is in four physically different locations. If a firm's data
is lost, it can be recreated from one of our two servers. If one of our
servers goes down, the data is still on the other server; once the failed
server is repaired or replaced, any data it lost can be recovered from
the other server. In the unlikely event both servers go down and all the
data with them at the same time, the copy in the safety deposit box
could be used.
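
Recovering from another location only works if the copies actually match. The following is a minimal sketch, not part of the plan described above, of one way to check that: build a checksum list for each copy and compare them. The function names are hypothetical, and it assumes each location's copy is reachable as an ordinary directory tree.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            with path.open("rb") as fh:
                # Read in 1 MiB chunks so large scans do not fill memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[path.relative_to(root).as_posix()] = digest.hexdigest()
    return manifest


def compare(reference, other):
    """Return (missing, changed) file lists for `other` relative to `reference`."""
    missing = sorted(name for name in reference if name not in other)
    changed = sorted(name for name in reference
                     if name in other and reference[name] != other[name])
    return missing, changed
```

Calling build_manifest() on each copy and then compare() on the two results lists the files that are absent from, or differ in, the second copy. Running that before the yearly drive goes into the box would catch a bad copy while the other three locations still have the data.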
This data is old records. By the time it goes to archive, the odds that
it will be needed again are very small. The case is closed and appeals
are unlikely. Also, if the case is opened again, there is no problem if it
takes a couple of days to get the data.
Thanks for all the suggestions. Discussing the pluses and minuses helps
make the direction to take clearer. This is going to evolve over time.
He is not ready to take the big jump yet and spend a lot of money and
become what he calls "hardware poor". (Having a lot of hardware and no
money or business to support it.)
Kip
--
Kipton Moravec <kip at kdream.com>