Mohammad Mahdi Mohammadi, Bijan Raahemi, Fatemeh Cheraghchi, Wael Obidallah, Elnaz Bigdeli
Proceedings of the 24th Annual International Conference on Computer Science and Software Engineering, IBM CASCON, 323-325
Publication year: 2014

Abstract:

The exponential growth of data, especially over the internet, has led to a dramatic rise in unstructured and semi-structured data, in addition to traditional (structured) data. Since relational databases and their associated tools were designed to interact with structured data, companies such as Google and Yahoo faced challenges in dealing with unstructured and semi-structured data. When the volume of data exceeds the processing capacity of existing algorithms, it is considered Big Data. Hadoop is a popular technology for analyzing Big Data. Tools available on the Hadoop platform help analysts create complex queries and run machine learning algorithms in a parallel and distributed fashion. The goal of this workshop is to provide participants with hands-on experience in analyzing Big Data, installing Hadoop on Linux-based machines (PCs running Ubuntu), and running examples on the Hadoop framework.