Protecting Data

When it comes to data, most people are aware of the importance of protection – we are constantly reminded to select secure passwords, not to share them, to make them difficult to guess. But what about making sure the data doesn’t get lost?

I’m currently in the middle of setting up a development environment for my new client, and have had to work through exactly this question. It is just as important to ensure that the data continues to exist as it is to ensure that the wrong people cannot access it. Depending on the stage of the project, in fact, this concern might be greater than preventing unauthorized access. For example, during initial development of any large application, it is critically important that regular backups be made. That way, when a change turns out to damage the project, you can still go back to a previous version.

Separating changes by environment, and keeping the ability to retrieve older versions of data, leads to the following simplified structure (for a software project in particular):

Production

This should be backed up as often as it changes – that is, the programs should be backed up whenever a change is made to the Production version, and the data should be backed up on a daily basis at a minimum.
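One way to notice that the Production programs have changed is to compare a fingerprint of the program files against the one recorded at the last backup. The following is only a minimal sketch of that idea: the function names `tree_fingerprint` and `has_changed` are hypothetical, and it assumes the programs live in a single directory tree small enough to read in full.

```python
import hashlib
from pathlib import Path


def tree_fingerprint(root):
    """Return a SHA-256 digest covering every file path and its contents under root."""
    root = Path(root)
    digest = hashlib.sha256()
    # Sort the files so the digest does not depend on directory listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")  # separator between the file name and its contents
        digest.update(path.read_bytes())
    return digest.hexdigest()


def has_changed(root, previous_fingerprint):
    """True when the tree no longer matches the fingerprint stored at the last backup."""
    return tree_fingerprint(root) != previous_fingerprint
```

A scheduled job could store the fingerprint alongside each backup and trigger a new backup only when `has_changed` returns True.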

Testing (Quality Assurance)

This environment is for the purpose of ensuring that the changes being made to the Production version are complete and stable – essentially, a last check before making the final push to production. As such, it may have minimal amounts of data, or at least, data which does not change frequently. A normal backup schedule here would be to create a version every time the programs are changed, and to back up the data at the same time.

Testing (Integration)

This environment usually undergoes frequent changes, often several times per day. Backups of such environments should run on a schedule, often daily. However, a separate backup can often be avoided by folding this environment into the backups already being made of the project’s source code, as discussed below.

Development

Backups here are the responsibility of the developers working on the project, and should generally be made whenever major changes are under way. In principle, the developers’ discretion can be relied upon, though in practice it often becomes reliable only after the first loss of data. We learn the hard way, but we do learn.

Source Control

No program should be developed without source control, which keeps every developer’s copy of the project up to date, especially when multiple developers are involved. It is usually also the means by which versions of the program are pushed to the different environments: developers start from what’s in the code repository, Testing (Integration) uses it as a basis, and Testing (Quality Assurance) and Production each take specific versions from it. As a result, the code repository should be backed up at least once a day; without that backup, restoring everything after a system loss becomes exceedingly difficult. Additionally, source control shows not only what the code looked like at a particular point in time, but also how it changed over time, which makes it a little easier to locate newly introduced bugs.
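The schedules described for each environment can be written down as a small policy table. This is a sketch rather than a finished scheduler: the `BackupPolicy` class, the environment keys, and the `is_backup_due` function are all hypothetical names, and "developer discretion" is modelled simply as having no automatic trigger.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class BackupPolicy:
    """When one environment should be backed up (hypothetical structure)."""
    on_program_change: bool       # back up whenever the programs change
    max_age: Optional[timedelta]  # longest allowed gap between backups; None = no timer


# Schedules taken from the environment descriptions in this post.
POLICIES = {
    "production": BackupPolicy(on_program_change=True, max_age=timedelta(days=1)),
    "qa": BackupPolicy(on_program_change=True, max_age=None),
    "integration": BackupPolicy(on_program_change=False, max_age=timedelta(days=1)),
    "development": BackupPolicy(on_program_change=False, max_age=None),  # discretion
    "repository": BackupPolicy(on_program_change=False, max_age=timedelta(days=1)),
}


def is_backup_due(env, last_backup, now, programs_changed=False):
    """Decide whether the named environment needs a backup right now."""
    policy = POLICIES[env]
    if policy.on_program_change and programs_changed:
        return True
    if policy.max_age is not None and now - last_backup >= policy.max_age:
        return True
    return False
```

For example, `is_backup_due("production", last_backup, datetime.now(), programs_changed=True)` is true regardless of how recent the last backup was, while the development environment never triggers automatically under this table.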

Summary

Only a proper backup system lets you both avoid a total loss of data and, when a loss does occur, restore your systems to a point as close as possible to the one immediately before the loss. Naturally, the backups need to be protected from unauthorized access, but it is the fact that they exist, preferably in a location distant from their origin, that ensures your business’s ability to continue its operations even when the unexpected happens.
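As a rough illustration of keeping a copy away from its origin, the sketch below writes a timestamped archive, checks that it can be read back, and copies it to a second location. The function name `create_backup` and its parameters are hypothetical, and `remote_dir` is assumed to be a mounted path standing in for genuinely distant storage; a real setup would likely use a dedicated backup tool or service instead.

```python
import shutil
import tarfile
from datetime import datetime, timezone
from pathlib import Path


def create_backup(source_dir, local_dir, remote_dir):
    """Archive source_dir, check the archive is readable, and copy it elsewhere."""
    source = Path(source_dir)
    local = Path(local_dir)
    remote = Path(remote_dir)  # assumption: a mounted path acting as off-site storage
    local.mkdir(parents=True, exist_ok=True)
    remote.mkdir(parents=True, exist_ok=True)

    # A UTC timestamp in the file name keeps older versions side by side.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    archive = local / f"{source.name}-{stamp}.tar.gz"

    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source, arcname=source.name)

    # Reading the member list back is a basic check that the archive opens at all.
    with tarfile.open(archive, "r:gz") as tar:
        if not tar.getnames():
            raise RuntimeError(f"backup archive {archive} is empty")

    # Copy the verified archive to the second location and return its path there.
    return Path(shutil.copy2(archive, remote))
```

Listing the archive only shows that it opens; periodically restoring a backup to a scratch location is a stronger test that it is usable.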