
Data Archiving

by Lee LeClair
06/10/2009
As seen in Inside Tucson Business

Data archiving. It certainly does not sound exciting and, honestly, it isn't, but it could affect your business in a lot of different ways, so it bears some consideration for the prudent business owner. Data archiving is commonly understood as deciding how long you are going to keep your data. While I am talking specifically about business data retention here, you will see that it affects personal data as well.

Most people are familiar with data backups (a copy of your key data in case your primary systems become damaged) and often mistake data backups for archiving. A data backup is a copy of data at a point in time, whereas a data archive is the storage of data over intervals of time. So a data backup is like the copies of tax forms you keep while you're working on them, while data archives are more like your file cabinet of tax returns for the past seven years.

Things to consider for archiving include what data you actually need to archive, what format to store it in, what media to store it on, and how to retrieve it if and when you need it. The first item to think about is what data you actually need to archive. Some businesses have clear rules regarding what they must archive; they are the lucky ones. Most have to figure out what they need to save, and in these days of litigation and laws like Sarbanes-Oxley and HIPAA, that can be pretty tough. It also leads to the next issue, which is what format to store data in.

Most businesses use special backup or archiving software that consolidates and compresses data so that it is easier and more economical to store. The downside is that the business owner is now dependent on the viability of that software or, in the case of open-source solutions, of its format. Will that software or format be around in 10 years, or 5, or even 3? Things disappear fast in technology and, as the poor economy is showing, some businesses disappear fast too. In addition, what format is the data itself stored in? Consider that 15 years ago your company might have been using WordPerfect and running your mail on Netscape Mail server on an old version of SunOS. Could someone recover that data today? Will JPEGs and GIFs continue to be supported 10 years into the future, or will PNG have taken over by then? SATA drives have supplanted IDE and will likely be supplanted themselves shortly.
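One way to make the format question concrete is to take an inventory of what is actually sitting in an archive. The following is a minimal Python sketch, not a definitive tool: the function names are hypothetical, and the `OPEN_FORMATS` allowlist is purely an assumption for illustration that each business would need to replace with its own judgment.

```python
from collections import Counter
from pathlib import Path

# Assumption for illustration only: extensions this sketch treats as
# openly documented. Adjust to your own policy.
OPEN_FORMATS = {".txt", ".csv", ".pdf", ".png", ".xml", ".html"}


def inventory_formats(root: Path) -> Counter:
    """Count the files under root by lowercase file extension."""
    return Counter(
        p.suffix.lower() or "(none)"
        for p in root.rglob("*")
        if p.is_file()
    )


def formats_to_review(counts: Counter, allowed=OPEN_FORMATS) -> dict:
    """Return the extension counts that fall outside the allowlist."""
    return {ext: n for ext, n in sorted(counts.items()) if ext not in allowed}
```

Running `formats_to_review(inventory_formats(Path("/path/to/archive")))` would list the extensions worth a second look, such as old word processor files that may need converting while software that reads them still exists. A file extension is only a hint about the real format, so treat the result as a starting point.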

A similar issue afflicts the media used to physically store archived data. In the earliest days, humanity used stone tablets to store information, but media longevity has been heading downhill ever since with the moves to leather, papyrus, paper, and finally magnetic media. If you thought optical media like CDs and DVDs last forever, research it and you'll see that oxidation can occur in less than 10 years on the thin metal layer inside most discs. Plus, formats like DVD are now being supplanted by Blu-ray; will there even be players for the media you have saved up?

Finally, verify that you have a good and tested recovery and retrieval system in place for whatever decisions you make.
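As one illustration of what "tested" can mean in practice, the sketch below records a checksum for every file at archive time and later checks a restored copy against that record. It is a minimal Python sketch under stated assumptions: the function names are hypothetical, SHA-256 is simply one reasonable choice of checksum, and it does not depend on any particular backup or archiving product.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_restore(manifest: dict, restored_root: Path) -> list:
    """Return a list of problems in a restored copy; empty means it matches."""
    problems = []
    for rel_path, expected in manifest.items():
        candidate = restored_root / rel_path
        if not candidate.is_file():
            problems.append(f"missing: {rel_path}")
        elif sha256_of(candidate) != expected:
            problems.append(f"changed: {rel_path}")
    return problems
```

The idea is to call `build_manifest` when the archive is created, store the result alongside it (and ideally somewhere else too), then periodically restore to a scratch location and run `verify_restore`. A matching checksum shows the bytes survived; it does not show that you still have software able to open them.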

I wish I could offer solutions to all of these issues, but I can only draw attention to them and provide some broad advice. For long-term storage, consider a dual-pronged approach with actual printed text (and all its space and damage issues) as well as electronic storage in openly supported formats. As for media, optical is still the longest lasting I know of, but the formats and media must be monitored, rotated, and possibly updated at some point. All of these things bring you back to taking the time to consider what data you REALLY need to archive.

Lee Le Clair is the CTO at Ephibian. His Tech Talk column appears the third week of each month in Inside Tucson Business.