Theory and Practice of Data Processing Concepts and Techniques

Data processing is a collection of tasks that includes scrubbing, wrangling, enriching, and partitioning data. Processing is critical for performing efficient analysis and extracting valuable knowledge from data. However, processing data in real time and at large scale is highly challenging. This course covers data processing concepts and methodologies. It explains the techniques and theoretical foundations of cleaning, transforming, enriching, and partitioning massive-scale data. Additionally, it discusses available data processing technologies.

Course Objectives

  • Provide strong knowledge of data processing concepts and methods.
  • Provide knowledge of the theory of data processing concepts.
  • Explain data processing algorithms.
  • Provide a strong understanding of how to ensure data quality.
  • Provide a solid background on data processing technologies.

What is in it for the Participants?

  • Learning theoretical aspects of data processing.
  • Learning concepts and techniques of data processing.
  • Learning how to handle the variety and volume of data during preprocessing.
  • Being able to preprocess different types of data, including structured data (SQL tables) and unstructured data such as text, images, geographical data, audio, and video.
  • Learning methods and algorithms for assessing data quality, and exploring best practices for data cleaning.
  • Being able to identify unnecessary or insignificant data and prepare datasets that are relevant to the analysis.
  • Learning how to enrich data by integrating a wide variety of data relevant to the subjects of analysis.
  • Learning to develop solutions for combining various data for enrichment.
  • Being able to use appropriate methods and algorithms for partitioning data.