At H1, we believe access to the best healthcare information is a basic human right. Our mission is to provide a platform that can optimally inform every doctor interaction globally. This promotes health equity and builds needed trust in healthcare systems. To accomplish this, our teams harness the power of data and AI technology to unlock groundbreaking medical insights and convert those insights into actions that result in optimal patient outcomes and accelerate an equitable and inclusive drug development lifecycle. Visit h1.co to learn more about us.
We are looking for an AI Scientist to join a dedicated team building and evolving a clinical data platform serving the clinical operations space. You will design and implement machine learning models that power insights across the clinical trial lifecycle, working with large-scale healthcare datasets spanning publications, clinical trials, medical claims, and provider data.
This is a backend, data-intensive role. You will work at the intersection of machine learning research and production data systems, building models that run at scale and integrate into real-world clinical workflows. You should be comfortable moving from research to implementation in a fast-paced, applied setting.
WHAT YOU'LL DO AT H1
Design, develop, and deploy machine learning models to extract insights from clinical and healthcare data — including NLP tasks across publications, trial records, and medical claims
Build and optimize ML pipelines on large-scale distributed systems using Apache Spark and cloud-native tools
Work with structured and unstructured data sources to develop predictive models, classification systems, and entity resolution approaches
Collaborate with data engineers to integrate ML models into production ETL pipelines and ensure scalability and reliability
Partner with MLOps teams to operationalize models — including monitoring, retraining, and versioning workflows
Evaluate and benchmark model performance and present findings to both technical and non-technical stakeholders
Stay current with advances in NLP, deep learning, and healthcare-specific AI applications to identify opportunities for innovation
ABOUT YOU
A rigorous, applied scientist who moves confidently from research to production — you build models that ship
Someone who understands the realities of working with messy, large-scale healthcare data and can design around its limitations
Collaborative and communicative — comfortable working with data engineers, MLOps practitioners, and product teams
Curious about the clinical space and quick to develop domain intuition around trial processes, provider data, and healthcare terminology
Confident owning technical decisions and presenting trade-offs clearly to both technical and business stakeholders
REQUIREMENTS
5+ years of hands-on experience in machine learning, data science, or applied AI roles
Strong proficiency in Python; working knowledge of SQL and R
Deep experience with ML frameworks such as TensorFlow and PyTorch
Proven experience building and running models on large-scale data pipelines using Apache Spark (PySpark)
Exposure to NLP techniques — including text classification, named entity recognition, information extraction, and transformer-based models
Familiarity with cloud platforms (Azure preferred) and tools such as Databricks, Delta Lake, and Kubernetes
Experience working closely with data engineering and MLOps teams in production environments
Background in healthcare, life sciences, pharma, or clinical research is highly preferred
Comfortable working independently in a remote setting with a distributed, cross-timezone team