
Data Engineering

This training presents, through detailed explanations and real-world examples, the roles, responsibilities, architectures, tools, and best practices essential for modern data platforms. Hands-on labs provide participants the opportunity to work with leading technologies such as Snowflake, Apache Spark, dbt, and Airflow to build, optimize, and orchestrate scalable data pipelines.

Day 1: Data Engineering Foundations

  • Understand the core responsibilities of data engineers and key architectural patterns
  • Explore Snowflake as a modern cloud data warehouse and learn its deployment models
  • Learn best practices in data modeling, including relational vs dimensional models and schema design
  • Configure a Snowflake environment with proper security and access control
  • Design and create initial data models using dimensional modeling concepts
  • Set up dbt for version-controlled data transformation workflows
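The Day 1 labs build dimensional models in Snowflake and dbt; the core idea — a central fact table of measures joined to descriptive dimension tables — can be sketched in plain Python. Table and column names below are illustrative placeholders, not course materials:

```python
# A minimal star schema: one fact table referencing two dimension tables.
# Names (dim_customer, fact_orders, ...) are illustrative placeholders.
dim_customer = {1: {"name": "Acme Corp", "region": "EMEA"},
                2: {"name": "Globex", "region": "APAC"}}
dim_date = {20240101: {"year": 2024, "quarter": "Q1"}}

fact_orders = [  # each row holds foreign keys plus an additive measure
    {"customer_id": 1, "date_id": 20240101, "amount": 120.0},
    {"customer_id": 2, "date_id": 20240101, "amount": 80.0},
    {"customer_id": 1, "date_id": 20240101, "amount": 50.0},
]

def revenue_by_region(facts, customers):
    """Aggregate a fact-table measure along a dimension attribute."""
    totals = {}
    for row in facts:
        region = customers[row["customer_id"]]["region"]
        totals[region] = totals.get(region, 0.0) + row["amount"]
    return totals

print(revenue_by_region(fact_orders, dim_customer))
# {'EMEA': 170.0, 'APAC': 80.0}
```

In the labs, the same join-and-aggregate pattern is expressed as SQL in dbt models against Snowflake tables.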

Day 2: Data Processing and Workflow Management

  • Dive deeper into Apache Spark for efficient batch processing and performance tuning
  • Learn data ingestion and transformation patterns with Spark and dbt
  • Design, schedule, and monitor workflows using Apache Airflow for orchestration
  • Implement optimized Spark data transformations with DataFrames and RDDs
  • Build and orchestrate an end-to-end batch pipeline using Spark, dbt, and Airflow
  • Apply benchmarking and code review techniques to evaluate pipeline efficiency
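Airflow expresses a pipeline as a DAG of tasks executed in dependency order. Stripped of scheduling, retries, and operators, that core idea can be sketched with the standard library; the task names here are illustrative, mirroring the extract–transform–load chain built in the Day 2 lab:

```python
from graphlib import TopologicalSorter  # stdlib since Python 3.9

# Each task maps to the set of upstream tasks it depends on,
# the same wiring an Airflow DAG declares with extract >> transform >> load.
dag = {
    "load_warehouse": {"run_dbt_models"},
    "run_dbt_models": {"spark_transform"},
    "spark_transform": {"extract_raw"},
    "extract_raw": set(),
}

def run_pipeline(dag):
    """Execute each task only after all of its upstream tasks have finished."""
    order = list(TopologicalSorter(dag).static_order())
    for task in order:
        print(f"running {task}")  # a real runner would invoke the task here
    return order

run_pipeline(dag)
# runs extract_raw, then spark_transform, then run_dbt_models, then load_warehouse
```

Airflow adds scheduling, retries, backfills, and monitoring on top of this ordering, which is what the lab exercises explore.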

Day 3: Advanced Topics and Capstone Project

  • Explore real-time data processing using Spark Streaming and event-driven architectures
  • Extract and visualize data using tools like Tableau or Looker with Snowflake as the backend
  • Complete a capstone project integrating multiple tools, followed by peer feedback and presentation
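Spark Streaming's windowed aggregations group events into time buckets before aggregating. The underlying tumbling-window idea can be sketched in plain Python, assuming a stream of (timestamp, event) pairs; the sample events are illustrative:

```python
# Tumbling-window count: bucket events into fixed, non-overlapping windows,
# the same grouping Spark Structured Streaming performs at scale.
events = [  # (epoch_seconds, event_type) -- illustrative sample data
    (0, "click"), (3, "click"), (7, "view"),
    (11, "click"), (14, "view"), (21, "click"),
]

def tumbling_window_counts(events, window_seconds):
    """Count events per fixed-size window, keyed by window start time."""
    counts = {}
    for ts, _ in events:
        window_start = (ts // window_seconds) * window_seconds
        counts[window_start] = counts.get(window_start, 0) + 1
    return counts

print(tumbling_window_counts(events, 10))
# {0: 3, 10: 2, 20: 1}
```

A real streaming engine additionally handles late-arriving events, watermarks, and incremental state, which is where the Day 3 material picks up.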

This course is available online and onsite, and is fully customizable to your needs.
*The course is also available in French.


Learning outcomes:

This training will equip data engineering professionals to design and manage scalable data systems, automate workflows, and transform raw data into actionable insights. It will also enable your teams to optimize data management processes, improve data quality, accelerate analysis, and reduce operational costs.

Your profile and prerequisites:

  • Data Engineers
  • Software researchers interested in data engineering and management systems.

With knowledge of:

  • Programming skills (1+ year experience).
  • Basic SQL knowledge.
  • Comfort with command line.
  • Familiarity with Python is a plus.

Learning outcomes:

You will learn how to effectively apply SRE hard and soft skills in your work and architecture.

  1. Understand what SRE is, why it is important, and learn how it can be applied in practice with the Digital Highway for Software Delivery.
  2. Learn how to understand the inner workings of your application in production by applying SLO engineering principles and Observability.
  3. Learn how to continuously deliver software into production and how to embrace the shift-right paradigm through Continuous Verification and Rollbacks.
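SLO engineering turns a reliability target into an error budget — the fraction of requests (1 − SLO target) that are allowed to fail. A minimal calculation of the budget remaining, with illustrative numbers:

```python
def error_budget_remaining(slo_target, total_requests, failed_requests):
    """Fraction of the error budget still unspent (negative means it is blown)."""
    budget = (1.0 - slo_target) * total_requests  # allowed failures for the period
    return (budget - failed_requests) / budget

# A 99.9% availability SLO over 1,000,000 requests allows ~1,000 failures.
remaining = error_budget_remaining(0.999, 1_000_000, 400)
print(f"{remaining:.0%} of the error budget remains")
# 60% of the error budget remains
```

Teams then spend the remaining budget deliberately, e.g. on riskier releases, and slow down when it nears zero.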

Your profile and prerequisites:

  • Software engineers 
  • DevOps engineers
  • System engineers
  • ML Architects

With knowledge of:

  • Software Engineering skills (OOP, Scripting, ad hoc code, …)
  • System Engineering skills (OS, Network, Deployment, Security, Monitoring, …)
  • Advantageous: Performance Analysis, Release Engineering, APM/Infra Monitoring, Distributed/Reliable Architecture Design