Endy:Data storage

From OpenWetWare

Revision as of 13:10, 21 June 2007

1. The problem, briefly stated

We need an easy, secure and efficient way to store all our files:

  • individual user files (backup)
  • shared project files (centralized storage and backup)
  • old user and project files (centralized storage and backup)

2. Current specifications of our backup system

Most people use Bionet to store their files.

Bionet

  • Consists of two fiber channel disk filers (bionet and bionet2, located on the 3rd floor of building 68) that have about 3TB of usable storage shared among several labs. Data on this system is mirrored to a NearStore R200 in building NE47 (as of 2007-06-04).
  • Lab storage space (as of 2007-06-19):
    • total: 200GB
    • available: 88GB
  • Problems with bionet:
    • Not enough space to back up microscope images
    • As of 2007-06-04, bionet and bionet2 are out of the support contract; R200 is under a support contract paid for by the Biology Department. This means that if the bionet filers fail, there should still be a copy of the data on R200 in building NE47.

R200

  • Has 16TB of total storage.
  • Lab storage space (as of 2007-06-19):
    • total: 120GB
    • available: 120GB
  • Problems with R200:
    • Not enough space to back up microscope images
    • Not backed up anywhere: no off-site backup and no snapshots (only a single copy of the data exists). This means that this space should not be used as primary storage.

3. Ideal specifications of our future backup system

(lab and individual data storage, sharing and backup needs - please list what you would like to have available)

  • Capacity: we want to be able to store all files in a single location
  • Easy: automatic backup
  • Secure: the backup system shouldn't be located in building 68, in case of a fire
  • Efficient: backing up or retrieving files should be speedy
  • Affordable

Types of data

  • Individual user data and project data (except microscope images)
    • Current data (~100GB?) can be stored on Bionet (backed up automatically)
    • Old data (~100GB?) can be archived on one of the existing lab servers (e.g., shmoo which may need an additional hard drive) but needs backup space.
  • Microscope data
    • Current and old data (~170GB) can be stored on the Mac G5 attached to the scope (may need an additional hard drive)
    • New data (6 months from June 2007: Samantha ~100GB, Jason ~500GB, Francois ?GB) needs both additional primary storage (~800GB total) and backup space.
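The estimates in this list can be tallied with a short script. This is only a sketch: the figures are the rough guesses given above (several are marked with question marks on this page), the category labels are made up for illustration, and the unknown figure for Francois is left as unknown rather than filled in.

```python
# Estimated storage needs in GB, copied from the list above.
# None marks a figure that this page leaves unknown.
needs_gb = {
    "current user/project data": 100,
    "old user/project data": 100,
    "existing microscope data": 170,
    "new microscope data (Samantha)": 100,
    "new microscope data (Jason)": 500,
    "new microscope data (Francois)": None,
}

# Sum only the figures we actually have.
known_total = sum(v for v in needs_gb.values() if v is not None)
unknown = [name for name, v in needs_gb.items() if v is None]

print(f"Known total: ~{known_total} GB")
print(f"Not yet estimated: {', '.join(unknown)}")
```

Any storage option considered below should be compared against the known total plus whatever allowance is chosen for the missing estimate.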

4. Potential solutions

BioMicro

  • Primary storage: 200GB on bionet in bldg 68 mirrored to R200 in bldg NE47.
  • Backup storage: ~170GB on R200 in NE47.

BioMicro plus own storage

  • Primary storage: 200GB on bionet in bldg 68 mirrored to R200 in bldg NE47.
  • Backup storage: a NAS box with a RAID 5 array would provide ~1.5TB of usable storage. Cost: ~$1,000.

Own storage

  • Primary storage: build or buy a storage server with a RAID array (e.g., 4 x 500GB drives in RAID 5 configuration would provide about 1.5TB of usable storage space). Cost: ~$1,500. Host it in the BioMicro Center on the 3rd floor of building 68.
  • Backup storage:
    • Either get a second identical server or a NAS box and host it in Tech Square (NE47). Cost: ~$1,500 for a server or ~$1,000 for a NAS box.
    • Or use MIT TSM (if an appropriate service is available; limited to 300GB per machine as of June 2007). Cost: unknown monthly fee (currently $7.50/month for 300GB).
  • Biosupport would provide maintenance for free.
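The usable-capacity figure quoted for the RAID 5 example (4 x 500GB giving about 1.5TB) follows from the fact that RAID 5 spends one drive's worth of space on parity. A minimal sketch of that arithmetic, with a hypothetical helper name, ignoring filesystem overhead and the difference between decimal and binary gigabytes:

```python
def raid5_usable_gb(num_drives: int, drive_gb: float) -> float:
    """Usable capacity of a RAID 5 array of equal-sized drives.

    One drive's worth of space is consumed by parity, so the usable
    space is (n - 1) * drive size. Filesystem overhead is ignored.
    """
    if num_drives < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    return (num_drives - 1) * drive_gb


# The server example above: 4 x 500GB drives.
print(raid5_usable_gb(4, 500))  # 1500 GB, i.e. about 1.5TB
```

The same formula gives 750GB for a box with 4 x 250GB drives.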

Shelf for R200

At $7 per GB, an 8TB shelf (minimum increment available) would cost on the order of $50K.
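The shelf estimate can be checked directly. This sketch assumes decimal units (1TB = 1,000GB), which is an assumption; with 1,024GB per TB the result is slightly higher.

```python
PRICE_PER_GB = 7    # dollars per GB, from the text above
SHELF_TB = 8        # minimum shelf increment, from the text above
GB_PER_TB = 1000    # assumption: decimal units

shelf_cost = PRICE_PER_GB * SHELF_TB * GB_PER_TB
print(f"${shelf_cost:,}")  # $56,000
```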

Reference

The win.mit.edu Domain

MIT TSM Backup Service

  • Monthly service charge: $7.50 per month per computer
  • Storage limit: 300GB
    • a soft limit, some users go over
    • an approximate figure because it includes both "active" and "inactive" files but this is offset by data compression
  • TSM software is required to use the service and is available for Windows, Mac and Linux (free to MIT community per site license)
  • Backups are stored on one of the TSM backup servers in buildings W91 and E40 (no mirroring)
  • Types of backup:
    • Scheduled: everything by default but can be configured to exclude directories
    • Manual: nothing by default, need to specify which directories to back up
  • Inactive files (old versions of current files and deleted files) are kept for 30 days using incremental storage (only changes are stored)
  • Need a separate account for each computer to be backed up
  • Performance will vary, depending on the time of day, network conditions and the machine itself
  • 5,000 users, 250,000 files restored per quarter
  • 128-bit encryption available
  • coming soon (as of June 2007 there are no plans to introduce this service within the next 2-3 months):
    • free service (for personal use): 10-20GB
    • enhanced service (for DLC use): 1TB and up, offsite mirroring, will be expensive, etc
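To compare the TSM pricing above with a one-time hardware purchase, the monthly fee can be projected for a given amount of data. This is a rough sketch with hypothetical helper names. It assumes the data could be spread across several computers, each with its own account and each staying under the 300GB soft limit; whether that is practical is not established on this page.

```python
import math

MONTHLY_FEE = 7.50   # dollars per computer per month, from the list above
LIMIT_GB = 300       # soft storage limit per computer, from the list above


def tsm_monthly_cost(data_gb: float) -> float:
    """Monthly TSM cost if data_gb is split across enough accounts."""
    accounts = math.ceil(data_gb / LIMIT_GB)
    return accounts * MONTHLY_FEE


def months_to_match(one_time_cost: float, data_gb: float) -> float:
    """Months of TSM fees that add up to a one-time hardware cost."""
    return one_time_cost / tsm_monthly_cost(data_gb)


# Example: ~800GB of data compared with a ~$1,000 NAS box.
print(tsm_monthly_cost(800))                  # 22.5 (3 accounts)
print(round(months_to_match(1000, 800), 1))   # 44.4
```

The comparison leaves out drive replacement and maintenance time for the hardware option.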

Misc

Network Attached Storage

A Tale of Two Terabyte NAS Boxes

Buffalo Technology

  • Buffalo TeraStation Home
    • Example disk configuration: 4 x 250GB IDE (750GB in RAID5)
    • Protocols: FTP, SMB
    • USB 2.0 port for external hard drive (backup or additional storage)
    • Review by PC Magazine
      • Bottom line: Flexible and reliable storage for everyone on your network. Print sharing is a plus, as is expandable USB disk storage.
      • Pros: Offers RAID level data protection; easy-to-configure shared and private storage for all workgroup members; print sharing is a plus.
      • Cons: Large footprint. No logging or reporting features.
    • Review by ExtremeTech
    • TeraStation wiki
  • Buffalo TeraStation Pro
    • Released in March 2006
    • S-ATA drives

Infrant ReadyNAS

  • ReadyNAS NV
  • ReadyNAS NV+
  • Infrant ReadyNAS NV+ and 1100: Small steps forward - review
    • comes with a 5+5-user license for EMC's Retrospect for Windows and Macintosh client backup software
    • The NV+ is a slight improvement over the NV, with most of the value coming in the Retrospect backup client bundle
    • Since both the NV and NV+ use the same processor and have the same memory, the performance difference I saw is more due to better drives in the NV+ and newer firmware than anything else
    • with the lowest price at time of review at $831 for a driveless NV+ and $517 for an NV, you might be better served by using the $300 to buy drives
  • Infrant ReadyNAS NV Review
    • X-RAID (Expandable RAID) allows adding capacity without deleting existing data, and automatically adjusts the RAID level and formatted capacity to match the available drives
  • ReadyNAS NV - user review
  • ReadyNAS NV - AnandTech review
  • ReadyNAS NV - PracticallyNetworked.com review

LaCie NAS

  • LaCie Ethernet Disk
    • 1TB $740, 2TB $1,050 (June 2007)
    • Rack format
    • Powered by Windows XP® Embedded
    • Gigabit Ethernet
  • LaCie Ethernet Disk RAID
    • 1TB $840, 2TB $1,160, 4TB $3,000 (June 2007)
    • 4 removable, hot-swappable drives (spare drives: 250GB $210, 500GB $390 - June 2007)
    • Gigabit Ethernet