Posted May 2, 2026
As a Lead Infra - SRE at our company, you will play a crucial role in ensuring the reliability, availability, and performance of our infrastructure. Your proactive problem-solving approach and commitment to continuous improvement will be key to your success in this role. Here is a breakdown of your responsibilities:
- Design, implement, and maintain scalable and reliable infrastructure solutions for our applications and services.
- Monitor system performance and reliability, identifying and resolving issues proactively to minimize downtime.
- Collaborate with development teams to integrate SRE practices into the software development lifecycle.
- Develop and maintain automation tools and scripts to streamline operations and improve efficiency.
- Implement and manage incident response processes for timely resolution of incidents.
- Conduct post-incident reviews to identify root causes and implement preventive measures.
- Stay updated with industry trends and best practices in SRE and infrastructure management.
- Mentor and guide junior team members to foster a culture of learning and collaboration.

Your mandatory skills should include:
- Strong knowledge of Site Reliability Engineering (SRE) principles.
- Proficiency in cloud platforms (e.g., AWS, Azure, GCP) and container orchestration technologies (e.g., Kubernetes).
- Experience with infrastructure as code (IaC) tools like Terraform or CloudFormation.
- Solid understanding of monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
- Strong scripting skills in languages such as Python, Bash, or Go.
- Excellent problem-solving skills and ability to work under pressure.
- Strong communication and collaboration skills to work effectively in a team environment.

Preferred skills that would be beneficial:
- Experience with configuration management tools like Ansible, Puppet, or Chef.
- Familiarity with CI/CD pipelines and DevOps practices.
- Knowledge of security best practices in infrastructure management.
- Experience with database management and optimization.
- Understanding of networking concepts and protocols.

For qualifications, we are looking for candidates with:
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- Relevant certifications in SRE, cloud technologies, or infrastructure management are a plus.

If you are passionate about infrastructure management and possess a strong SRE background, we invite you to apply and be part of our dynamic team!