It is a common misconception (and one we hear a lot) that digital preservation is simply a backup of your data, stored in the cloud somewhere or on a USB stick stowed away in a drawer.
This rings true for me on a personal level. How many of us store our own data in various cloud storage services, on external hard drives or on some other storage device? And when we come to access that photo or file years down the line, do we know where it is? Is that data still accessible and usable?
This challenge is not unique to individuals, but something faced by every organization in some form. A Forrester Report on digital fragility highlights two examples where organizations lost critical data because file formats had evolved to the point that they could no longer open the old files.
“Data preservation means more than just making a backup copy of your data; it means protecting your data in a secure environment for long-term access and reuse.” – Stanford University
There are hundreds of file formats in common use across the world, and they are constantly changing. This is just one reason why digital preservation is so much more than a backup.
An introduction to digital preservation
In simple terms, digital preservation is the process of ensuring future access to digital files and assets, regardless of whether they are born-digital or digitized versions. The aim is to keep those files accessible and usable even as formats and technologies evolve.
Three main elements of digital preservation:
- Usability: You need to ensure that your files remain usable as file formats evolve. This is often not considered until it is too late, when an organization realizes it can no longer open a file. Digital preservation tools can regularly check your data and migrate it to the latest and most appropriate formats.
- Searchability: It is also important to ensure that all the associated metadata is properly recorded. In a normal backup you could potentially go in and edit each individual file, but this is not sustainable in the long term. In the future, will it be possible to search through your data properly so that you can look for specific authors or information contained within it?
- Accessibility: As discussed, data held in a traditional backup can be hard to access later. You need to consider the possibility of data degradation, as well as the long-term accessibility of wherever your data has been stored.
These three elements must also be treated as a continuous challenge rather than a one-off fix. Digital preservation is a complex process of actively managing files over time to ensure future access, and of continuing to ask questions such as whether the file formats are still supported and whether the data is still accessible.
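To make the "are these formats still supported?" question concrete, here is a minimal Python sketch that identifies a file's format from its leading bytes rather than trusting its extension. The signature list is a small illustrative sample, and the function names `identify_format` and `identify_file` are assumptions for this example, not part of any particular preservation product.

```python
from pathlib import Path

# Well-known leading byte signatures ("magic numbers") for a few common formats.
# This list is deliberately short; a real registry would be far larger.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"PK\x03\x04", "ZIP container (also used by DOCX, XLSX, EPUB)"),
]


def identify_format(header: bytes) -> str:
    """Return a format name for the given leading bytes, or 'unknown'."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def identify_file(path: Path) -> str:
    """Read the first few bytes of a file and identify its format."""
    with path.open("rb") as handle:
        return identify_format(handle.read(16))
```

Running something like this on a schedule, and reviewing anything reported as "unknown" or as a format you no longer have software for, is one simple way to keep asking the question over time.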
Additional details to be considered include:
- Appraisal as to the worthiness of preserving that piece of data
- Identification of the data
- Integrity verification
- Characterization of content
- Sustainability assurance
- Authenticity verification
- Access allowance and logging
- The addition of metadata about the preservation process
Within many organizations, a small number of people are expected to carry out all of these activities. In these cases, automation in digital preservation can help lower the burden on these small teams and help them focus on other value-adding activities.
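As one example of what such automation might look like, integrity verification is often handled by recording a checksum for every file and re-checking it later. The sketch below uses SHA-256 from Python's standard library; the helper names (`sha256_of`, `build_manifest`, `verify_manifest`) and the manifest layout are assumptions made for illustration only.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file under root (by relative path) to its checksum."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return a list of problems: files that are missing or have changed."""
    problems = []
    for name, expected in manifest.items():
        target = root / name
        if not target.is_file():
            problems.append(f"missing: {name}")
        elif sha256_of(target) != expected:
            problems.append(f"changed: {name}")
    return problems
```

A team could call `build_manifest` when files are first ingested, store the result alongside them, and run `verify_manifest` periodically so that silent corruption or accidental deletion is noticed early rather than years later.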
Where do I start?
Digital preservation is not a new concept, but it is one of increasing importance as we generate larger volumes of digital assets, in a wider range of file formats and from more data sources than ever before. You might need to preserve PDFs, emails, social media messages, voice recordings, instant messenger posts or even entire websites.
As we’ve discussed, digital preservation is not just backing up your files in multiple locations. A good digital backup plan is important, but it isn’t the only part of preservation. Backups alone are not enough to preserve your data for the long term, and they do not usually include file format normalization.
Here are some useful things to consider when planning to preserve your digital assets:
- What are you looking to achieve with digital preservation?
- Are you documenting culturally important artifacts for future generations to understand more about this period?
- Are you preserving the data for a set period in line with regulatory requirements, so that the ability to implement a robust retention schedule workflow is important?
- What are the different file formats and data sources you need to include in your digital preservation strategy?
- Who needs access from across your organization and community?
- How many digital assets do you need to preserve?
- What is the number of files involved?
- How much data do you need to manage?
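The questions about file formats, file counts and data volume can often be answered with a quick inventory of what you already hold. The following is a minimal sketch, assuming your assets sit under a single folder; the function name `summarize` is hypothetical, and grouping by file extension is only a rough stand-in for proper format identification.

```python
from collections import defaultdict
from pathlib import Path


def summarize(root: Path) -> dict[str, tuple[int, int]]:
    """Return {extension: (file_count, total_bytes)} for all files under root."""
    totals = defaultdict(lambda: [0, 0])
    for path in root.rglob("*"):
        if path.is_file():
            # Normalize case so ".PDF" and ".pdf" are counted together.
            extension = path.suffix.lower() or "(none)"
            totals[extension][0] += 1
            totals[extension][1] += path.stat().st_size
    return {ext: (count, size) for ext, (count, size) in sorted(totals.items())}
```

For example, `for ext, (count, size) in summarize(Path("archive")).items(): print(ext, count, size)` would print one line per extension, where `archive` is a placeholder for your own folder.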
Digital preservation is an ongoing set of processes that information professionals will need to continually refine, add to and change over time. It is about keeping things accessible for long periods of time, which in the long term can mean keeping them alive “forever.” The policy that is suitable on day one is not the policy that will work in year five, so an evolving strategy is essential. Human decision making and purpose-built technology combined are the key to successful digital preservation.