What is Hadoop?

Hadoop is a framework with an ecosystem of tools built around it for storing and processing big data. This O'Reilly Strata article provides a good summary: http://strata.oreilly.com/2012/02/what-is-apache-hadoop.html

But how is this relevant to Krescendo? After all, most of the data we manage fits well into conventional SQL structures, and although it is high volume, it is far from the petabytes that large commercial consumer apps need to deal with.

Hadoop tools become applicable as we evolve our service infrastructure. Candidate contexts include File Management, where we will look at how our Lucene indexing compares with, or works alongside, MapReduce; extracting information from larger data archives, such as our FlexReporting repositories; and our centralised Audit Trail facility, perhaps taking advantage of the HBase and Flume components.
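
To make the MapReduce idea concrete, here is a minimal in-memory sketch of the map, shuffle and reduce steps, written in plain Python rather than against the Hadoop API. The record layout (a `user` and an `action` field) and the function names are hypothetical and are not taken from our Audit Trail schema; the sketch only illustrates the shape of the computation that Hadoop would distribute across a cluster.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and yield its (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


# Hypothetical mapper: emit one count per audit record, keyed by user.
def audit_mapper(record):
    yield record["user"], 1


# Hypothetical reducer: total the counts for one user.
def count_reducer(key, values):
    return sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence on an in-memory list."""
    pairs = map_phase(records, audit_mapper)
    return reduce_phase(shuffle(pairs), count_reducer)


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample = [
        {"user": "alice", "action": "login"},
        {"user": "bob", "action": "export"},
        {"user": "alice", "action": "logout"},
    ]
    print(run_job(sample))
```

In a real Hadoop job the mapper and reducer would run on separate nodes and the shuffle would move data over the network, which is where the approach pays off for archives too large to process on a single machine.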