
There's a lot more to backup than you thought

Which kind will you choose?

By Chris Mellor, 23 Oct 2014

If backup is too often an afterthought, have you considered its even poorer relation, archive? You haven’t? What a surprise.

The fact is, businesses and their infrastructure depend upon common services for general efficiency, management focus and cost control. All organisations do, and it would be a nightmare if they didn't.

Typically, there are many departments and line-of-business units in an enterprise which share common services, including:

  • Accounting
  • HR
  • Legal
  • Facilities management
  • IT

There is a dynamic tension between having common services and specialised ones. Specialisation is chosen where delivering a better outcome is more important, and more profitable.

Businesses ask themselves where the cut-off boundary should be. Is it better to have a single sales force or one divided into sections focusing on different vertical markets, customer types or channel partners?

Will we let departments and lines of business decide on their own IT or provide a centralised IT service?

That question has been largely decided: a central, shared IT service is the backbone of most organisations' computing needs.

The same tension also plays out inside departments such as IT. Should our computing be centralised or distributed?

Will our IT provide better services if we have separate physical servers devoted to different tasks or common servers that are virtualised to run specific tasks inside virtual machines?

Should we have a single storage resource, or silo, or multiple specialised ones for mainframe computing, structured databases, unstructured information and so on?

In with the new

Most organisations adopt a hybrid approach, reflecting their development over time as new generations of IT systems have appeared. They may have mainframes, servers with dedicated storage, virtualised servers with shared networked storage, specialised content-delivery servers and the like.

These all need to have the data they hold protected against file damage, storage-media failure, accidental deletion, system failure and even data-centre failure. There is a common backup need – but is it best provided by a common data-protection service or by specialised systems?

So the fundamental question applies here too.

Specialised systems can be customised for different applications and reflect their needs, but they need to be individually sourced, implemented and managed, which means extra cost and complexity.

As our IT systems develop, with new systems being added to run alongside the older ones, backing up data from these multiple systems is becoming ever more complicated.

There is also the question of where the intelligence should sit. Should the backup smarts, as it were, be located in the servers themselves or in an attached backup media server, either of which sends backup data streams to target devices?

Should the receiving devices also run the intelligent backup software, if necessary using agents on the source systems to trigger and produce backup streams?

Having the main backup software running on the target device is simpler because you don’t need to source and manage a separate backup server or multiple backup software products for the various kinds of server environments you operate.

Keep it simple

On the other hand – and there always seems to be another hand in this debate – dedicated backup software for specific server environments can provide additional features that you might need, such as virtual machine format conversion.

Simplicity says place the intelligence in the backup appliance, but there is another problem to consider.

Data needs both to be protected against some kind of failure and preserved for the longer term, or archived. Backup requires a fast restore to recover from failure to minimise the impact on ongoing business operations.

Archive needs to be lower cost: there is a large amount of data but it can tolerate a more leisurely access than backup.

Should you have two data protection silos overall, one for backup and a second for archival data? Again, the pursuit of simplicity suggests a unified system.

Good to share

Providing a unified backup and archive service looks more and more appealing. You can have multiple backup software agents running on separate computing platforms and delivering data to a shared disk-based target device or appliance, which may deduplicate the backup data to store it more efficiently.

Then, as data ages or the device fills, you can siphon data off to an archive, storing it on tape or perhaps an object storage system. This lowers its storage cost and keeps it available for audits against compliance regimes or business-intelligence analytics.
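As a rough illustration of that "ages or fills" rule, here is a minimal Python sketch. The names (BackupSet, select_for_archive), the 90-day age limit and the 80 per cent high-water mark are all assumptions made up for this example, not settings from any real product.

```python
from dataclasses import dataclass


@dataclass
class BackupSet:
    name: str
    age_days: int
    size_gb: float


def select_for_archive(sets, capacity_gb, max_age_days=90, high_water=0.8):
    """Pick backup sets to move to the archive tier.

    Hypothetical policy: archive everything older than max_age_days, then
    keep archiving the oldest sets until disk usage is at or below the
    high-water mark.
    """
    oldest_first = sorted(sets, key=lambda s: s.age_days, reverse=True)
    used_gb = sum(s.size_gb for s in oldest_first)
    to_archive = []
    for s in oldest_first:
        too_old = s.age_days > max_age_days
        too_full = used_gb > capacity_gb * high_water
        if not (too_old or too_full):
            break  # every remaining set is younger and the device has room
        to_archive.append(s)
        used_gb -= s.size_gb
    return to_archive


sets = [
    BackupSet("mon", 120, 100),
    BackupSet("tue", 30, 400),
    BackupSet("wed", 10, 400),
    BackupSet("thu", 1, 100),
]
print([s.name for s in select_for_archive(sets, capacity_gb=1000)])
```

In this made-up case the oldest set goes because of its age and the next oldest goes because the device is still over its high-water mark. A real system would also have to weigh retention rules and compliance holds, which this sketch ignores.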

There are three general ways of providing a combined system:

  • Basic backup-to-disk with deduplication on the target system, the approach pioneered by Data Domain
  • Combined system providing multi-source backup to disk and archiving to tape beyond that with overall management
  • An appliance with integrated backup and archival software

This is not an either/or decision; each option has its own merits and is appropriate for different needs or businesses. As with aeroplanes or delivery trucks, one size does not suit all.

The basic deduping backup-to-disk target is the workhorse of today's backup environment and has spread into virtually every company’s IT shop. Most storage suppliers – EMC (Data Domain), Exagrid, Dell, Fujitsu, HP, IBM and Quantum – offer this basic product category, which can range from a simple appliance to a capable scale-out system.

Deduplication removes duplicated blocks of information, which are common in backup data streams. It results in a 10:1 or better data-reduction ratio, which offers a far lower cost/GB of storage than for primary or secondary disk array storage.
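To make the idea concrete, here is a minimal Python sketch of block-level deduplication. It assumes fixed 4KB blocks and SHA-256 fingerprints purely for illustration; it is not how any particular vendor's appliance works, and the function names are invented for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size, for illustration only


def dedupe(stream: bytes, store: dict) -> list:
    """Split a backup stream into blocks and keep each unique block once.

    Returns a "recipe": the ordered list of block fingerprints needed to
    rebuild the stream from the shared store.
    """
    recipe = []
    for offset in range(0, len(stream), BLOCK_SIZE):
        block = stream[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # stored only if not seen before
        recipe.append(digest)
    return recipe


def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original stream from its recipe."""
    return b"".join(store[digest] for digest in recipe)


def reduction_ratio(logical_bytes: int, store: dict) -> float:
    """Bytes sent for backup divided by bytes actually held on disk."""
    stored_bytes = sum(len(block) for block in store.values())
    return logical_bytes / stored_bytes


# Ten identical full backups of four distinct blocks (toy data).
data = b"".join(bytes([i]) * BLOCK_SIZE for i in range(4))
store = {}
recipes = [dedupe(data, store) for _ in range(10)]
print(reduction_ratio(10 * len(data), store))  # 10.0
```

The 10:1 figure here falls straight out of backing up unchanged data ten times. Real ratios depend on how much the data changes between backups and on how the blocks are cut.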

The fact that data is stored on disk means that retrieval of small to medium-sized files is faster than from tape.

A disadvantage is that the device is relatively dumb and you need backup software or agents on the source systems. Another is that the devices fill up so you have to either buy another, delete old data or archive it to tape, which means another data protection system to manage.

The combined disk backup/tape archiving system with common management gets over this problem, albeit with a more complicated system as it has to cope with more diverse needs.

It may offer itself as a virtual tape target for mainframe systems, provide unique facilities for non-x86 Unix servers, have intricate facilities for virtual machine backup, and offer the ability to write backup data sets to a back-end tape library using archiving software.

Science of appliances

HDS acquired Sepaton’s Virtuoso backup-to-disk deduplicating systems, which occupy this product category, as does Fujitsu’s Eternus CS8000, which uses Quantum-sourced deduplication and scales up to 22PB (before deduplication) by adding nodes to build multi-box resources.

This CS8000 can also manage the capacity of attached tape libraries, giving the sysadm flexibility in positioning backup data on disk, deduplicated disk or tape.

Where such capable and typically high-end systems are not needed, simpler backup-to-disk appliances with a scale-up architecture can appeal. Examples are Fujitsu’s Eternus CS800 or EMC’s Data Domain Series. These can be combined with any backup software on the market.

Finally, if users focus on one backup software only, then an integrated appliance may offer greater simplicity. It will be easier to implement, operate and scale, but may lack the specialised abilities of the combined systems.

Such appliances offer backup and archive processes, with over-arching data management software providing a single interface and management resource.

For example, Fujitsu’s Eternus CS200c appliance features:

  • CommVault Simpana backup/archive management software
  • Local disk and/or SSD storage
  • Deduplication for storage cost efficiency
  • Support for physical and virtualised server data source systems
  • Tape management features for archiving to tape

We could think of this as a three-way combination of backup media server, target backup-to-disk device and backup/archive data management software.

Planes, trains and automobiles

One reason for the plethora of backup devices and types is that the needs of data source devices are so different. To say that all need data protection is not that helpful.

As an illustration, the citizens of a country all have a common need to travel, but the modes of transport vary hugely, from walking, bicycles, cars and trains to ships, planes and even space shuttles. The moral is that you share a mode of travel, or backup, when you can and specialise when you must.

If we generalise we might envisage three main customer type needs:

  • Small business system with little or no archive need
  • Centralised combined backup and archive appliance with management software
  • Consolidated enterprise systems with mixed mainframe, Unix and x86 computing support and a backup, archive, disk and tape storage environment

In vehicle delivery terms we could classify these as the panel van, the basic truck and the articulated lorry approach.

A small business’s needs could be satisfied by a simple and basic backup-to-disk deduplicating target. This could also meet the needs of larger enterprises' remote and branch offices, although these might be better served by smaller versions of more capable integrated backup/archive appliances that can communicate to the central data centre.

They can also, for example, deduplicate across the totality of the business’s backup data set, gaining better cost efficiency.

The consolidated enterprise need could be satisfied by all three kinds of backup appliance or by using the high-end systems and feeding a central system with data from remote and branch offices.

There will never be a single backup and archive system that meets all data protection needs, from those of small businesses to mid-level ones and global-scale enterprises, and serving everything from mainframes to non-x86 Unix servers and commodity x86 servers, both physical and virtual.

Clouding the issue

Then there is the cloud, which we will consider only briefly here for reasons of space.

If you can use the public cloud then a starting approach would be to say that you need a gateway system to collect the backup data on your premises and then send it to the cloud.

As restoring anything but small files from the cloud takes a relatively long time you could envisage a backup appliance having cloud storage gateway functionality added to it. Recent restore needs are met by backup files on the appliance, and are hence fast. Restores of older data that is less often needed could come from the cloud.
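A back-of-the-envelope Python sketch shows why the split makes sense. The 30-day local retention window, the link speeds and the function names are assumptions chosen for the example, and the arithmetic ignores protocol overhead and any retrieval delay imposed by the cloud provider.

```python
def restore_source(age_days: int, local_retention_days: int = 30) -> str:
    """Recent backups come from the on-premises appliance, older ones from the cloud."""
    return "appliance" if age_days <= local_retention_days else "cloud"


def transfer_hours(size_gb: float, link_mbps: float) -> float:
    """Idealised transfer time: decimal gigabytes over a link rated in Mbit/s."""
    bits = size_gb * 8e9
    seconds = bits / (link_mbps * 1e6)
    return seconds / 3600


print(restore_source(7), restore_source(90))
print(round(transfer_hours(1000, 100), 1))     # 1TB over a 100Mbit/s WAN link
print(round(transfer_hours(1000, 10_000), 2))  # 1TB over a 10Gbit/s LAN
```

On these assumed figures a 1TB restore takes roughly 22 hours over the WAN link against about 13 minutes over the local network, which is the gap a gateway-equipped appliance is meant to hide for recent data.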

In fact, we could do a neat split and say backup is local and archiving is in the cloud, with Amazon or Azure substituting for an on-premises tape library.

Holding backup data on the appliance on premises might better suit your need for secure storage of sensitive data, especially where there are geographic restrictions on its dispersal.

As we have seen, there are always specialised options for backup, as well as seductive all-embracing alternatives that have the appeal of simplicity, such as backing up and archiving everything in the cloud.

Be suspicious. In this area one size almost never, ever fits all. Use common backup and archive services when you can, when they make sense, and not to excess. Specialise when it makes sense but there is no need to hyper-specialise.

A central backup and archiving appliance core will generally make sense, with the cloud reserved for potential archiving and specialised backup systems for the corner cases that are outside the appliance corral.®

The Register - Independent news and views for the tech community. Part of Situation Publishing