Healthcare Data Pipeline

Please use this identifier to cite or link to this item: http://www.ir.juit.ac.in:8080/jspui/jspui/handle/123456789/9935

Title:	Healthcare Data Pipeline
Authors:	Dabral, Shivam; Mohana, Rajni [Guided by]
Keywords:	Python programming; Apache Spark; Pseudocode; Healthcare
Issue Date:	2023
Publisher:	Jaypee University of Information Technology, Solan, H.P.
Abstract:	Big data processing is of interest to many companies around the globe as they try to harness the power of their data. Nference Labs Private Limited, for example, is working to use healthcare data to provide people with better medical support. This project explores techniques, engines, and frameworks that can generate useful data from raw data effectively and efficiently. Several techniques were examined and compared on the basis of published research papers. The results suggested using Apache Spark as the computation engine. The data files were stored in Parquet format with Snappy compression so that the data occupies less space. The overall aim was to develop an efficient data generation pipeline that can handle terabytes of data.
Description:	Enrolment No. 191273
URI:	http://ir.juit.ac.in:8080/jspui/jspui/handle/123456789/9935
Appears in Collections:	B.Tech. Project Reports

Files in This Item:

File	Description	Size	Format
Healthcare Data Pipeline.pdf		1.64 MB	Adobe PDF