As a Senior DevOps & Site Reliability Engineer focused on AWS and Azure, you will manage, automate, and optimize hybrid cloud infrastructure and application deployment processes across both ecosystems. You will combine strong DevOps practices with dedicated Site Reliability Engineering (SRE) principles to keep systems resilient, scalable, secure, and well-optimized. Your responsibilities will include:

**Infrastructure & Automation**:
- Designing, deploying, and managing scalable, fault-tolerant infrastructure and services across AWS and Azure.
- Developing and maintaining infrastructure automation using tools like Terraform, AWS CloudFormation, or Azure ARM Templates/Bicep.
- Ensuring infrastructure designs adhere to the principles of the AWS and Azure Well-Architected Frameworks.
- Building and optimizing robust CI/CD pipelines to facilitate rapid and reliable application deployment.

**Site Reliability Engineering (SRE) & Observability**:
- Implementing monitoring, logging, and alerting solutions to proactively identify and resolve production issues.
- Leading incident response efforts, conducting root cause analyses, and improving system reliability.
- Monitoring application and infrastructure performance, identifying bottlenecks, and implementing optimizations.
- Defining and tracking Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to meet SLAs.

**Security & Collaboration**:
- Integrating security best practices into the DevOps workflow (DevSecOps) and managing access controls.
- Collaborating with software development, QA, and product teams to ensure smooth releases and troubleshoot issues.

In terms of qualifications, you should have:
- 5+ years of experience in DevOps, SRE, or a similar cloud infrastructure role, with experience in managing both AWS and Azure environments.
- Deep working knowledge of core services in AWS and Azure.
- Strong scripting skills in Python, Bash, or Go, and hands-on experience with IaC tools.
- Proficiency with CI/CD platforms and version control systems.
- Experience setting up and managing modern logging, monitoring, and alerting tools.
- Practical experience implementing and auditing systems against the AWS and Azure Well-Architected Frameworks.

Preferred qualifications include cloud certifications and experience with containerization and orchestration.