Course Outline
Foundations of Cloud Operations on AWS
- Operational roles and responsibilities within the cloud environment.
- AWS account structure, organizations, and multi-account strategies.
- Core operational services: CloudWatch, CloudTrail, and AWS Config.
Infrastructure as Code and Provisioning
- Principles of Infrastructure as Code (IaC) and immutable infrastructure.
- Provisioning using Terraform and AWS CloudFormation.
- Managing state, modules, and environment promotion.
CI/CD and Deployment Strategies
- Designing CI/CD pipelines for cloud-native applications.
- Blue/green, canary, and rolling deployment techniques.
- Automating rollbacks, health checks, and release validation.
Monitoring, Observability, and Alerting
- Metrics, logs, and traces: shipping, storing, and analyzing data.
- Using CloudWatch, X-Ray, and third-party observability tools.
- Defining Service Level Objectives (SLOs) and Service Level Indicators (SLIs), alerting policies, and on-call practices.
Security Operations and Identity Management
- IAM best practices, least-privilege access, and cross-account access.
- Secrets management, Key Management Service (KMS), and secure parameter stores.
- Operational security: patching strategies, vulnerability scanning, and audit trails.
Resilience, Backup, and Disaster Recovery
- Designing for fault tolerance and high availability.
- Backup strategies, snapshot automation, and restore procedures.
- Disaster recovery planning and runbook creation.
Cost Optimization and Governance
- Cost visibility: billing, tagging, and cost allocation strategies.
- Rightsizing, Reserved Instances and Savings Plans, and budgeting controls.
- Governance: policies, guardrails, and automation for compliance.
Containers, Serverless, and Runtime Operations
- Operational considerations for ECS, EKS, and Lambda.
- Service discovery, autoscaling, and resource limits.
- Logging, tracing, and debugging containerized workloads.
Incident Response, Playbooks, and Chaos Engineering
- Runbook-driven incident response and postmortem practices.
- Automating remediation and self-healing patterns.
- Introduction to chaos experiments for validating resilience.
Hands-on Workshop: Operate a Sample Workload
- Deploy a sample application using IaC and a CI/CD pipeline.
- Implement monitoring, alerts, and an automated remediation script.
- Simulate incidents and practice runbook-based response.
Summary and Next Steps
Requirements
- A fundamental understanding of cloud concepts and networking principles.
- Familiarity with the Linux command line and scripting.
- Experience with version control systems (Git) and basic CI/CD concepts.
Audience
- Cloud operations engineers.
- Site Reliability Engineers (SREs) and platform engineers.
- DevOps engineers and technical team leads.
21 Hours
Testimonials (1)
I found out new and interesting things about Lambda and Serverless.