Big Data Tools

Big data is a term that describes the large volume of data – both structured and unstructured – that inundates a business on a day-to-day basis. But it’s not the amount of data that’s important. It’s what organizations do with the data that matters. Big data can be analyzed for insights that lead to better decisions and strategic business moves.

[http://www.sas.com/en_us/insights/big-data/what-is-big-data.html]

The world’s technological per-capita capacity to store information has roughly doubled every 40 months since the 1980s; as of 2012, every day 2.5 exabytes (2.5×10^18 bytes) of data are created.

Big data “size” is a constantly moving target, as of 2012 ranging from a few dozen terabytes to many petabytes of data.

In 2012, Gartner updated its definition as follows: “Big data is high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization.” Gartner’s definition of the 3Vs is still widely used.

[https://en.wikipedia.org/wiki/Big_data]

However, as a rule of thumb, data volumes at the petabyte (PB) scale and above are considered Big Data.

1 PB = 1,000,000,000,000,000 B = 10^15 bytes = 1,000 terabytes (TB).
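The conversion above can be checked with a few lines of Python. This is a minimal sketch that assumes decimal (SI) units, as the text does; the `UNITS` table and the `convert` helper are illustrative names, not part of any library.

```python
# Decimal (SI) byte units, matching the text: 1 PB = 10**15 bytes.
# UNITS and convert() are hypothetical names used only for illustration.
UNITS = {
    "B": 1,
    "KB": 10**3,
    "MB": 10**6,
    "GB": 10**9,
    "TB": 10**12,
    "PB": 10**15,
    "EB": 10**18,
}


def convert(value, from_unit, to_unit):
    """Convert a quantity from one decimal byte unit to another."""
    return value * UNITS[from_unit] / UNITS[to_unit]


print(convert(1, "PB", "TB"))    # 1000.0 terabytes in one petabyte
print(convert(2.5, "EB", "PB"))  # 2500.0 petabytes in 2.5 exabytes
```

Binary units (1 PiB = 2^50 bytes) would give slightly different figures; the sketch deliberately sticks to the powers of ten used in this post.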


Data, especially Big Data, can be classified into three major classes:

a. Structured Data: Data that has a well-defined format (a fixed schema).

e.g. data stored in a database, CSV files, XLS files.

b. Semi-Structured Data: Data that does not follow a rigid format but still carries some organizing structure.

e.g. emails, log files, Word documents.

c. Unstructured Data: Data that does not have any format associated with it.

e.g. image files, audio files, video files.
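As a rough illustration of the three classes, the sketch below labels a file by its extension. It is only a heuristic under stated assumptions: the extension-to-class mapping is taken from the examples listed above, the specific extensions chosen (such as `.eml` for emails) are assumptions, and `classify_by_extension` is a hypothetical helper. Real classification would depend on the content, not just the file name.

```python
import os

# Hypothetical mapping derived from the examples in the list above.
# The extensions are illustrative assumptions, not an exhaustive rule.
CLASS_BY_EXTENSION = {
    ".csv": "structured",
    ".xls": "structured",
    ".eml": "semi-structured",
    ".log": "semi-structured",
    ".docx": "semi-structured",
    ".jpg": "unstructured",
    ".mp3": "unstructured",
    ".mp4": "unstructured",
}


def classify_by_extension(filename):
    """Return the data class suggested by a file's extension."""
    extension = os.path.splitext(filename)[1].lower()
    return CLASS_BY_EXTENSION.get(extension, "unknown")


for name in ["sales.csv", "server.log", "holiday.jpg", "notes"]:
    print(name, "->", classify_by_extension(name))
```

Files with no extension, or one missing from the table, fall through to "unknown" rather than being forced into a class.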
