Big data. Like many monikers for emerging paradigms, the term is often thrown around loosely in technical circles. It represents an exciting development in the management and analysis of large data sets, promising to revolutionise how we relate to and interact with the wealth of information at our disposal.
What is often underestimated, however, is the sheer capacity required to realise this ideal.
Although we are beginning to develop solutions that will draw intelligent insights from large amounts of data in real time, is the necessary hardware available to us?
The reality is that current data centre infrastructures will be unable to cope with the growth of Big Data. With several petabytes of information created every day across the globe, existing platforms will simply be unable to store the data and process it into meaningful analysis without delay.
This conflict occurs in two distinct areas.
The first is storage. Although exceptions to the rule certainly exist, the vast majority of data repository hardware currently available to us is not optimised for the kind of information that will inform Big Data systems.
According to industry analysts, unstructured resources such as video, voice, images and social media content will soon become the primary source of business intelligence.
Storing information of this nature using systems developed for traditionally structured data, such as relational databases, is expensive and inefficient. In response, new solutions – such as object storage systems, which compile information into flexible objects instead of rigid data compartments – will need to be deployed on a wide scale.
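The contrast can be sketched in a few lines of Python. This is a toy model, not any vendor's API: an object pairs an opaque payload with whatever metadata the caller supplies, addressed by an ID rather than fitted into a fixed table schema.

```python
import uuid

class ObjectStore:
    """Toy object store: opaque payloads plus free-form metadata,
    addressed by an ID rather than by a fixed table schema."""

    def __init__(self):
        self._objects = {}

    def put(self, payload: bytes, **metadata) -> str:
        # Any metadata keys are allowed - there is no schema to migrate
        # when a new kind of unstructured content arrives.
        object_id = str(uuid.uuid4())
        self._objects[object_id] = {"payload": payload, "metadata": metadata}
        return object_id

    def get(self, object_id: str) -> dict:
        return self._objects[object_id]

store = ObjectStore()
oid = store.put(b"<video bytes>", content_type="video/mp4", duration_s=120)
print(store.get(oid)["metadata"]["content_type"])  # video/mp4
```

A relational table would force every row into the same columns; here a video, a voice recording and a tweet can sit side by side, each carrying only the metadata that makes sense for it.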
When this information is stored it must be processed to yield intelligence that can be used to improve and sharpen business results.
The administration of unstructured data is considerably different from that of the structured formats technologists are used to. Without the organised configuration that databases offer, new processing technologies such as Apache Hadoop must be deployed.
Available as an open source platform, Hadoop makes use of large clusters to glean insights from massive amounts of unstructured data. This is just one of the solutions currently available to early proponents of Big Data.
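Hadoop's programming model, MapReduce, can be illustrated with the classic word count over free text. The pure-Python sketch below mimics the map and reduce phases on a single machine; Hadoop's contribution is to run the same two phases in parallel across a large cluster.

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every document.
    for doc in documents:
        for word in doc.lower().split():
            yield word, 1

def reduce_phase(pairs):
    # Shuffle + reduce: group the pairs by key and sum the counts.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

docs = ["big data needs big clusters", "clusters process data"]
print(reduce_phase(map_phase(docs)))
```

Because the map step treats each document independently and the reduce step only needs pairs grouped by key, both phases parallelise naturally – which is precisely what makes the model suitable for massive amounts of unstructured input.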
Despite this, large data centres as they exist today simply do not have the configurations in place to perform tasks of this nature. To solve the challenge, virtualisation services or new computing facilities must be implemented to handle these workloads.
Although the physical obstacles to Big Data might appear considerable, we do have the answers at our disposal. In time, computing infrastructures will be reimagined to support increasingly agile analytics engines – it is simply a question of ‘when?’.