Data de-duplication, the method of reducing data storage needs by eliminating redundant data from a device, is gaining attention because of its ability to reduce the size of the backup storage repository by an average of twenty-to-one.
This is the view of John Hope-Bailie, technical director of Demand Data. He says the ability to store so much more data in less space is also vindicating the growing acceptance of disk-based backup systems. It allows backed-up data to be retained on disk for longer instead of being transferred to tape, which improves the performance of restores because disk is the faster medium.
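To put the twenty-to-one figure in concrete terms, the arithmetic can be sketched as below. This is only an illustration: the function names and the 100 TB backup set are hypothetical and do not come from Demand Data, and real ratios depend heavily on the data being backed up.

```python
def space_saved(ratio: float) -> float:
    """Fraction of raw capacity saved at a de-duplication ratio (20 means 20:1)."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return 1.0 - 1.0 / ratio


def stored_size(logical_size: float, ratio: float) -> float:
    """Physical space needed to hold logical_size of backups at the given ratio."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return logical_size / ratio


if __name__ == "__main__":
    # Hypothetical example: 100 TB of backups at the article's average 20:1 ratio.
    print(f"{space_saved(20):.0%} of raw capacity saved")
    print(f"{stored_size(100, 20):.1f} TB needed on disk")
```

On these assumptions a twenty-to-one ratio means the repository needs about one twentieth of the raw capacity, which is what makes longer retention on disk affordable.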
“Market research analysts confirm that the technology of backup de-duplication is no longer in its infancy, having progressed well into the ‘hype cycle’.”
Hope-Bailie says the idea of the hype cycle was conceived in 1995 by a prominent analyst/research house in the US as a commentary on the common pattern of human response (‘hype’) to technology.
“A hype cycle is a graphical way to track multiple technologies within an IT domain or technology portfolio. It characterises the response to the emergence of a technology from an initial ‘over-enthusiasm’ through a period of ‘disillusionment’ to an eventual understanding of the technology’s relevance and role in a marketplace – the ‘plateau of high adoption’.
“Distinct indicators of market, investment and adoption activities are associated with each phase,” he says. “Many industry watchers maintain that de-duplication technology has passed through the ‘trough of disillusionment’ and has entered the final phase.”
He likens the process to runners halfway through the opening phase of a long-distance race: “The players in the de-duplication arena have settled down after an initial spurt and are taking stock of who is ahead of whom. The stakes are high and a ‘winner-takes-all’ mentality seems to pervade. The model for victory seems a simple one: he with the most features and the best performance wins.”
According to Hope-Bailie, backup de-duplication is a spinoff from the D2D (disk-to-disk) backup approach.
“D2D backup boasted several advantages over tape-based backup, mainly related to overcoming the natural shortcomings of streaming tape media. D2D benefits include the ability to sustain multiple streaming sessions to the same media, and the ability to locate the position of a record very quickly for a restore,” he says.
“The problem with D2D was the fact that although disk was becoming cheaper, it was still more expensive than tape cartridges, so the idea of keeping a full weekly-monthly-yearly backup regime stored on disk was untenable. With the arrival of de-duplication, this became both possible and affordable. Thus de-duplication was initially an enabler for the large-scale uptake of D2D.”
Hope-Bailie adds that de-duplication techniques applied in this way are referred to as ‘target-based’, and most offerings available today are of this type. These include ‘in-line’ de-duplication – in which the data is de-duped as it is ingested – and ‘post-processing’, in which the de-duplication is applied retrospectively, after the backup has been written to disk.
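The difference between the two target-based approaches can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any particular product works: the `DedupStore` class, the fixed four-byte chunks and the use of SHA-256 fingerprints are all hypothetical choices made for illustration.

```python
import hashlib

CHUNK_SIZE = 4  # assumed, unrealistically small chunk size for demonstration


class DedupStore:
    """Hypothetical backup target that keeps one copy of each unique chunk."""

    def __init__(self):
        self.blocks = {}   # fingerprint -> chunk bytes (unique chunks only)
        self.recipes = []  # per backup: ordered list of chunk fingerprints
        self.staging = []  # raw backups waiting for post-processing

    def _dedupe(self, data: bytes) -> int:
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            self.blocks.setdefault(digest, chunk)  # store only if unseen
            recipe.append(digest)
        self.recipes.append(recipe)
        return len(self.recipes) - 1  # backup id

    def ingest_inline(self, data: bytes) -> int:
        """In-line: de-dupe while the data is being ingested."""
        return self._dedupe(data)

    def ingest_raw(self, data: bytes) -> None:
        """Post-processing, step 1: land the backup on disk untouched."""
        self.staging.append(data)

    def post_process(self) -> list:
        """Post-processing, step 2: de-dupe the staged backups afterwards."""
        ids = [self._dedupe(data) for data in self.staging]
        self.staging.clear()
        return ids

    def restore(self, backup_id: int) -> bytes:
        return b"".join(self.blocks[d] for d in self.recipes[backup_id])

    def stored_bytes(self) -> int:
        unique = sum(len(c) for c in self.blocks.values())
        staged = sum(len(d) for d in self.staging)
        return unique + staged


if __name__ == "__main__":
    monday, tuesday = b"AAAABBBBAAAA", b"AAAABBBBCCCC"

    inline = DedupStore()
    inline.ingest_inline(monday)
    inline.ingest_inline(tuesday)
    print("in-line bytes stored:", inline.stored_bytes())

    post = DedupStore()
    post.ingest_raw(monday)
    post.ingest_raw(tuesday)
    print("before post-processing:", post.stored_bytes())
    post.post_process()
    print("after post-processing:", post.stored_bytes())
```

In this sketch both approaches end up storing the same unique chunks. The trade-off it illustrates is that the post-processing target needs enough disk to hold the full backups until the retrospective pass has run, while the in-line target does the fingerprinting work during ingest.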