Posted Apr 20, 2026
As a highly experienced Senior DevOps Engineer, you will lead the installation, automation, and operational reliability of a modern open-source data and integration platform that supports business-critical data pipelines and integrations. You will work with technologies such as Apache Airflow, Apache NiFi, Apache Spark, Kafka, PostgreSQL, MQTT brokers, Docker, and Kubernetes in both private and public cloud environments.

Key Responsibilities:

Platform Installation, Configuration & Operations:
- Install, configure, upgrade, and maintain distributed open-source components including Apache Airflow, Apache NiFi, Apache Spark, Apache Kafka, PostgreSQL, and MQTT brokers.
- Ensure platform stability, scalability, high availability, and fault tolerance.
- Perform capacity planning, performance tuning, and lifecycle management of all components.

Containerization & Orchestration:
- Design, deploy, and operate containerized workloads using Docker.
- Build and manage production-grade Kubernetes clusters.
- Implement Kubernetes best practices for networking, storage, scaling, and security.
- Package and manage platform services using Helm or equivalent tooling.

Infrastructure as Code & Automation:
- Design and maintain Infrastructure as Code (IaC) using Terraform for cloud and on-prem environments.
- Build configuration management and automation workflows using Ansible.
- Enable repeatable, environment-agnostic deployments across development, staging, and production.
- Automate provisioning, configuration, upgrades, scaling, and recovery processes.

Cloud, Hybrid & Private Infrastructure:
- Deploy and operate workloads on public cloud platforms (AWS, Azure, GCP) and private/on-prem infrastructure.
- Design hybrid architectures with secure connectivity between environments.
- Optimize infrastructure design for resilience, performance, and cost efficiency.

Observability, Reliability & Incident Management:
- Design and implement comprehensive monitoring, logging, and alerting for infrastructure and applications.
- Define, measure, and maintain SLAs, SLIs, and SLOs for critical platform services.
- Own incident response, root cause analysis, and post-incident reviews.
- Proactively identify risks, bottlenecks, and failure modes before they impact users.

Security & Secrets Management:
- Implement infrastructure and platform security best practices across containers, Kubernetes, and networks.
- Manage secrets and credentials using tools such as Vault, Kubernetes Secrets, or cloud-native solutions.
- Own certificate lifecycle management, including rotation and renewal.
- Design and enforce network security controls, access policies, and zero-trust principles where applicable.

Backup, Disaster Recovery & Data Protection:
- Design and implement automated backup strategies for Kafka, PostgreSQL, and other stateful services.
- Own disaster recovery planning and testing, including restore validation.
- Support multi-cluster or cross-region strategies where required.
- Ensure data durability, integrity, and recoverability.

Cost & Resource Optimization:
- Implement infrastructure cost monitoring and visibility across environments.
- Right-size clusters, storage, and compute resources to balance performance and cost.
- Continuously optimize resource usage for cloud and hybrid deployments.

CI/CD & Release Engineering:
- Build and maintain CI/CD pipelines for platform and infrastructure components.
- Enable safe deployment strategies such as rolling, blue-green, or canary deployments.
- Support Git-based workflows and infrastructure promotion across environments.