If managing your corporate data for the long term isn't currently on your mind, it should be, and in several different ways: cost, performance, business continuity, and compliance.
Cost versus Performance
First, let's talk about cost and performance. You want to manage your database infrastructure so it can support your growing data needs within budget, while providing acceptable performance to your users. SANs (storage area networks) have enabled us to meet these competing goals over the last decade, and, as I mentioned in a previous column, SAN vendors are offering innovative new technologies to push on-disk storage even further. Some interesting new strategies are also helping organizations achieve a more balanced mix of cost versus performance through the use of "tiered storage."
Under a tiered storage approach, we might place all of our most current data on our fastest disk system, such as a fast, well-cached SAN with LUNs (logical units) configured as RAID (redundant array of independent disks) 10 with many spindles. Slightly older data, say 9 to 18 months or 12 to 24 months old, might be pushed to a slower SAN configuration on LUNs configured as RAID 5 with just a few spindles. Since we only query this data once or twice per quarter, we can tolerate longer query times. Archival data more than two years old might then be pushed to a RAID array of SATA (serial ATA) drives, the kind online retailers sell for less than $80 per terabyte. You might never need to involve a tape drive and its associated storage and handling issues.
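To make the idea concrete, here is a minimal Python sketch of an age-based tiering rule. It is only an illustration under stated assumptions: the tier names, the 12- and 24-month cutoffs, and the function names are hypothetical choices based on the example above, not a feature of any particular SAN or database product.

```python
from datetime import date

# Hypothetical tiers and age cutoffs in months, loosely following the
# example in the text. Real cutoffs depend on how often the data is queried.
TIERS = [
    (12, "fast SAN, RAID 10"),    # data younger than 12 months
    (24, "slower SAN, RAID 5"),   # data 12 to 24 months old
]
ARCHIVE_TIER = "SATA RAID array"  # data 24 months old or older


def age_in_months(record_date: date, today: date) -> int:
    """Return the whole months elapsed between record_date and today."""
    months = (today.year - record_date.year) * 12 + (today.month - record_date.month)
    if today.day < record_date.day:
        months -= 1  # the current month is not yet complete
    return max(months, 0)  # treat future-dated records as brand new


def storage_tier(record_date: date, today: date) -> str:
    """Pick a storage tier for a record based only on its age."""
    age = age_in_months(record_date, today)
    for cutoff, tier in TIERS:
        if age < cutoff:
            return tier
    return ARCHIVE_TIER
```

With `today` set to July 1, 2010, a record dated March 15, 2010 lands on the fast SAN, one dated January 10, 2009 lands on the slower SAN, and one dated June 30, 2007 goes to the SATA archive. In practice you would probably drive the rule from query frequency as well as age, but age alone is enough to show the shape of the policy.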
New technologies have also widened the extremes of the cost-versus-performance range. For example, if performance is your biggest concern, you can now put your most I/O-intensive databases onto solid-state drives (SSDs). They're much more expensive, but offer 10x to 30x performance improvements over hard disks. If cost is your biggest concern, and tapes are part of the mix, many tape vendors continue to push capacity to new extremes. For example, in January, IBM and Fujifilm announced the new Nanocubic tape format capable of storing 35 terabytes of data per tape, roughly a 25x improvement in storage density.
Managing your data is about more than just providing speedy answers to questions from OLTP (online transaction processing) and OLAP (online analytical processing) databases, file access, and email systems. I've long been an advocate for the idea that the asset in information technology isn't the equipment; it is the data itself. This is never more evident than when an important business system unexpectedly goes offline. The hardware itself is replaceable and interchangeable. What is truly important to the business is the data that resided on that hardware.
While some businesses are ahead of the curve and have developed fully realized business continuity plans, most have not. Many business continuity plans are also very shortsighted. Members of one corporate IT department I spoke with were very proud of their new failover data center. If the company ever had a catastrophe, they had a completely redundant data center that could pick up where the other one failed, located right across the street. But consider this: when a flood hits or the power goes out for a week, both data centers are affected. If your business is large enough to require redundant data centers, then it is large enough to locate them far enough apart that location-based disasters don't strike both at the same time.
The smaller your business, the more likely it is that your data is vulnerable. Again, new advances in technology are helping. Virtualization facilitates quick "snapshots" of entire servers, along with their data, through P2V (physical-to-virtual) features. Using P2V, you could take a snapshot of a set of database servers directly to a USB hard disk, toss the hard disk into your briefcase, and restore it in a different location. Then, there are cloud-based backups. While I don't advocate them for every IT use case, a cloud-based backup can offer a small business the perfect mix of business continuity features without requiring much time, budget, or know-how.
For additional insights from Kevin Kline on best practices for data management and retention, read the July E-Edition of DBTA.