Best blog on big data technologies for the third quarter (July-September) of 2018 in a survey conducted among 500 papers. (Blog written by Akshaya Nair, a final-year student of Computer Science & Engineering.)
Be A Big Data Analyst!
What is BIG DATA?
The more the world changes, the more of those changes are captured and recorded as data. For a weather forecaster, for example, the amount of data collected around the world about local conditions is substantial. In one way or another, this weather data reflects the attributes of big data.
Most data gathered today is unstructured, and it requires different storage and processing from the data held in traditional relational databases.
This is exactly where BIG DATA technologies come in.
Classification of Big Data
- Structured: Data that has a fixed, well-defined format. E.g., data within databases.
- Semi-structured: Data that has some organization but no rigid format. E.g., log files.
- Unstructured: Data that has no predefined format. E.g., image, audio, and video files.
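The three classes above can be illustrated with a minimal Python sketch. The sample records (city temperatures, a sensor log line) are made up for illustration, and the log layout is an assumption, not a standard.

```python
import csv
import io
import re

# Structured: fixed columns, like a database table (hypothetical sample data).
structured = io.StringIO("city,temp_c\nPune,31.5\nOslo,4.0\n")
rows = list(csv.DictReader(structured))

# Semi-structured: a log line has recognizable parts but no rigid schema.
# The layout "date time level rest" is assumed for this example only.
log_line = '2018-09-01 12:00:03 WARN sensor=42 msg="reading delayed"'
date, time, level, rest = re.match(r"(\S+) (\S+) (\w+) (.*)", log_line).groups()

# Unstructured: raw bytes whose meaning cannot be read without a decoder.
# These are the first four bytes of the PNG file signature.
image_bytes = bytes([0x89, 0x50, 0x4E, 0x47])

print(rows[0]["city"], level, len(image_bytes))
```

Structured data can be queried by column name right away, the log line needs a pattern to pull out its fields, and the image bytes say nothing until a suitable decoder interprets them.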
Processing of Big Data
- Identification of suitable storage for Big Data: The first step of Big Data analysis is identifying suitable storage for the data. In the Big Data world, HDFS (the Hadoop Distributed File System) is one of the most widely preferred file systems for storing Big Data.
- Data ingestion: Data ingestion refers to taking data from the source and placing it in a location where it can be processed.
- Data cleaning and processing (exploratory data analysis): After getting the data into HDFS, we should clean it and bring it into a format that can be processed.
- Visualization of the data: Data visualization is the presentation of processed data in a pictorial or graphical format.
- Applying machine learning algorithms (if required): Modern processing and visualization of Big Data provide a strong platform for machine learning algorithms, helping companies achieve better results with techniques such as clustering, classification, outlier detection, and product recommendation.
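The steps above can be sketched on a tiny scale in plain Python. This is only an illustration under stated assumptions: the weather readings are invented, an in-memory string stands in for data that would really live in HDFS, the function names are hypothetical, and a simple z-score check stands in for the outlier detection step (the threshold of 1.5 is an arbitrary choice for this small sample).

```python
import csv
import io
import statistics

# Hypothetical raw readings; a real pipeline would read these from HDFS.
RAW = """station,temp_c
A,21.5
B,22.1
C,
D,20.9
E,48.0
F,21.8
"""

def ingest(text):
    """Ingestion: bring raw records to a place where they can be processed."""
    return list(csv.DictReader(io.StringIO(text)))

def clean(rows):
    """Cleaning: drop rows with missing values and convert text to numbers."""
    return [(r["station"], float(r["temp_c"])) for r in rows if r["temp_c"].strip()]

def visualize(data):
    """Visualization: a crude text bar chart, one '#' per 2 degrees."""
    return [f"{name} {'#' * int(temp // 2)} {temp}" for name, temp in data]

def outliers(data, threshold=1.5):
    """Outlier detection: flag readings whose z-score exceeds the threshold."""
    temps = [t for _, t in data]
    mean = statistics.mean(temps)
    stdev = statistics.pstdev(temps)
    return [name for name, t in data if stdev and abs(t - mean) / stdev > threshold]

data = clean(ingest(RAW))
for line in visualize(data):
    print(line)
print("Outliers:", outliers(data))
```

Station C is dropped during cleaning because its reading is missing, and station E stands out as the unusual reading. On real Big Data the same stages would run on distributed tools rather than in a single Python process.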
Large amounts of data are generated every second, and in today's tech world, learning to explore these growing volumes of data (BIG DATA) might get you a career. All that is needed to start are core aptitudes such as curiosity and agility. Next comes experience with programming languages such as Python, R, Java, and C, among others. Knowledge of Hadoop MapReduce is very important if you want to be a high-performance data software engineer. An overall understanding of Hadoop can be gained through experience with large-scale distributed systems. So if you want to acquire well-rounded expertise in big data, go for it right now. Check out online courses for big data specialization at sites like Coursera and get hired by top companies.