We are a hybrid company based in the Bay Area with offices in both San Francisco and Menlo Park. For this requisition, we are open to remote candidates but will prioritize candidates who are local. We care about work-life balance and understand that there will be times when flexibility is needed.

## What you will do
Design, build, and improve the infrastructure that powers Sprinter’s patient care, clinician operations, internal tooling, and partner-facing systems
Improve reliability across distributed systems, cloud infrastructure, CI/CD, observability, and incident response
Raise the security baseline across cloud infrastructure, access controls, secrets management, identity, and operational workflows
Build and maintain infrastructure as code using Terraform and related tooling
Automate manual infrastructure and operational processes through scripting, tooling, and platform improvements
Partner with engineering teams to improve system architecture, deployment practices, monitoring, logging, and alerting
Troubleshoot complex issues across infrastructure, application, data, and operational boundaries
Help define reliability, security, and infrastructure standards that allow Sprinter to scale without creating brittle systems
Support incident response practices, postmortems, operational readiness, and continuous improvement across engineering
Make pragmatic tradeoffs between reliability, security, speed, and simplicity in a fast-moving startup environment

## What you have done
Spent 8+ years in site reliability engineering, platform engineering, infrastructure engineering, security engineering, or related technical roles
Led high-impact infrastructure, reliability, platform, or security projects end to end with minimal oversight
Built and operated production systems in cloud environments, ideally AWS and/or GCP
Worked deeply with infrastructure as code, ideally Terraform
Improved observability, monitoring, logging, alerting, and incident response practices across engineering teams
Automated infrastructure, deployment, or operational workflows using scripting languages such as Python, Bash, or TypeScript