Theory and Practice of Real-time Data Processing
Many organizations today need real-time data processing because it lets them analyze and understand signals such as customer sentiment instantly. However, real-time applications are challenging to develop, deploy, and control on distributed infrastructure. This course covers a wide range of topics in real-time data processing technologies, including their theoretical aspects. It also covers several implementation tasks for real-time applications, ranging from configuring a real-time data processing cluster to writing application code that runs processing tasks. In addition, a real-world end-to-end data processing scenario will be simulated during the course.
- Provide a strong conceptual knowledge of real-time data processing technologies.
- Provide a solid understanding of the theoretical foundations of stream processing technologies.
- Provide practical knowledge of setting up single-node and multi-node Apache Storm clusters.
- Provide practical knowledge of maintaining and managing a real-time data processing cluster.
- Provide guidelines on tuning the runtime performance of Apache Storm, a real-time data processing technology.
- Present and describe a simulation of a real-world data processing scenario.
What is in it for the Participants?
- Learning the basic concepts of data processing technologies including architectural styles, data management, and data algorithms.
- Learning the core architecture and components of data processing technologies including Apache Storm, Apache Spark, Apache Flink, Apache Tez, Apache Hama, and Apache Hadoop MapReduce.
- Learning the theoretical foundations of real-time data processing technologies.
- Being able to configure single-node and multi-node Apache Storm clusters.
- Being able to deploy and manage real-time processing jobs on a single node or in a distributed cluster.
- Being able to administer and maintain an Apache Storm cluster.
- Learning about the challenging issues of running an Apache Storm cluster.