Academy

Effective SRE

At Digital Innovation Academy, we empower individuals and organizations to thrive in this era of rapid technological advancement. Our academy embodies the fusion of minds, talents, and cutting-edge knowledge, ensuring you have the skills, tools, and mindset necessary to succeed in the digital age. 
 
Digital Architects Zurich, one of the SDN cells, is a leading expert in domains such as CI/CD, DevOps, SRE, Observability, and Continuous Verification. As an integral arm of the academy, Digital Architects Zurich brings a wealth of knowledge, practical insights, and hands-on experience to the table.

Objectives

This course will take you through all aspects of modern Site Reliability Engineering.
The principles and practices of effective SRE covered here range from simple deployment to Continuous Delivery & Verification, with Observability-based SLO Engineering and Operations Efficiency.

Benefits: Learn how to effectively apply SRE hard and soft skills in your work and architecture.
Target Audience: Software engineers, DevOps engineers, System engineers, ML Architects…
Pre-requisites: Basic knowledge in biology and physiology, clinical analysis, medical research

The Program

Module 1: SRE fundamentals

  • Effective SRE: Principles & Challenges Digital Highway Blueprint
  • DevOps Culture
  • Fundamentals of Cloud Native Apps: Containers and microservices

 

Key learnings

  • Introduction to Continuous Delivery, SLO Engineering & Operations Efficiency
  • Foster continuous learning

Module 2: SLOs, SLIs & monitoring

  • Service Level Objectives (SLOs) and Service Level Indicators (SLIs)
  • Monitoring: Past to Future
  • Observability, Monitoring and Automation

Key learnings

  • Specify meaningful SLOs
  • Find SLIs that fit your SLOs
  • Get hands-on experience with SLIs and SLOs

Module 3: Operations Efficiency

  • Operations Efficiency: Automation and Emergency Response
  • Efficiency and Performance Tracking
  • All about Dashboards
  • Introduction to Error Budgeting

Key learnings

  • Effective Dashboards for optimal Operations
  • Set alerts to ensure Reliability
  • Formulation of Error Budgets and Error Budget Alerting

Module 4: AIOps

  • Introduction to AIOps
  • Microservices APM: real-time AI-driven alerts
  • Performance Analysis

Key learnings

  • Judge achievement of SLOs
  • Automate Emergency Response

Module 5: Best practices

  • Effective SRE Best Practices
  • Reliability Architecture Patterns
  • Q&A Session of the Week & Exploring SRE across companies

Key learnings

  • Structure your architecture with reliable design choices
  • Anticipate potential problems

Module 6: CI/CD

  • Deep dive: DevOps and Continuous Delivery (CD) Philosophy
  • CI/CD Practices, Smart Automation: Gains in Efficiency and Reliability
  • How to Build a Monitoring System for CD Pipelines

Key learnings

  • CI/CD and Automation: a DevOps mentality for Release Engineering (RE)

Module 7: Continuous verification

  • Continuous Verification (CV) in CD Pipeline
  • Automation of CV with AI
  • How to implement CV & Rollback

Key learnings

  • Learn how to implement CV in CD Pipelines
  • Automate Rollback on CV Exceptions

Module 8: Elastic provisioning

  • Distributed Scheduling and Reliability Perspective
  • Elastic Provisioning: Change Management and Capacity Planning
  • Provision & Manage any Infrastructure

Key learnings

  • Danger-aware, data-driven Provisioning
  • Best Practices for Reliability Assurance
  • Understand the impact of downtime on Job Scheduling and automate Relaunching

Module 9: Service & Automation

  • Security Layers for Effective SRE
  • Automated release system with transparent reports for troubleshooting

Key learnings

  • Structure your architecture to be security-aware at every step
  • Streamline code reviews into development workflow

Module 10: On-call & SRE culture

  • Being On-Call: Managing Operational Load, Leveraging Automatic Alerts and Dashboards
  • Blameless Postmortem Culture
  • The New On-Call Checklist

Key learnings

  • Clear on-call scheduling, escalation paths, incident management procedures

Interested in shaping your own tailored training?

Contact us at dia@digital-innovation-partner.ch