Please use this identifier to cite or link to this item: http://ir.juit.ac.in:8080/jspui/jspui/handle/123456789/6402
Title: Design and Implementation of Clustering Algorithm for Big Data Analytics
Authors: Thakur, Parminder Singh
Sharma, Ankit
Kumar, Pradeep [Guided by]
Keywords: Data mining
Clustering algorithm
Hadoop Framework
Streaming via Flume
Issue Date: 2017
Publisher: Jaypee University of Information Technology, Solan, H.P.
Abstract: This project implements clustering on Big Data sets. Data analysis is a crucial part of formulating new policies and strategies that help an organization succeed in competitive markets: it allows the organization to better understand its customer base, solve hard problems, and thus gain a competitive advantage. With almost every aspect of human life connected to the internet, data analysis is no longer limited to business analysis and has gained importance in fields such as smart healthcare and national security. The vast and diverse nature of Big Data sets poses new challenges, because traditional methods of knowledge discovery from databases are not equipped to handle data at this scale. Distributed and parallel computing frameworks are key to such analysis. The purpose of clustering algorithms is to make sense of, and extract value from, large sets of structured and unstructured data. Clustering enables a data analyst to obtain a snapshot of data of huge volume and complexity, which can be used to impose logical structure on that data. Clustering can thus provide structure and insight into the nature of the data at hand and form the basis of further analysis. To handle Big Data clustering, various limitations of present clustering techniques need to be mitigated by understanding the frameworks used to handle Big Data sets and by analyzing data clustering. The target is to analyze these challenges in order to redesign and implement a clustering algorithm that is suitable for Big Data sets, based on today's available technologies and frameworks. Furthermore, a possible future path toward more advanced algorithms is outlined.
Appears in Collections: B.Tech. Project Reports

Files in This Item:
File: Design and Implementation of Clustering Algorithm for Big Data Analytics.pdf
Size: 1.58 MB
Format: Adobe PDF