Posted Apr 24, 2026
As a Senior Machine Learning Engineer, you will build and scale next-generation generative AI systems at the intersection of machine learning and backend infrastructure. You will work on generative video and multimodal AI use cases, contributing to scalable, low-latency systems used by millions of users globally.

Key Responsibilities:
- Design, train, fine-tune, and evaluate generative and multimodal models for tasks such as text-to-video, image-to-video, lip-sync, and character consistency.
- Build and manage end-to-end ML pipelines, including data ingestion, preprocessing, training, evaluation, and model versioning.
- Deploy and maintain scalable ML systems, including model serving, containerization, and GPU-optimized inference.
- Implement MLOps best practices such as experiment tracking, model monitoring, drift detection, and A/B testing.
- Optimize inference systems for low latency, high throughput, and cost-efficient GPU utilization.
- Develop batching and caching strategies to meet production SLAs.
- Collaborate with backend and platform teams to integrate ML services into distributed systems.
- Contribute to long-term AI strategy, including foundational model training and fine-tuning pipelines.

Qualifications Required:
- 4-10 years of experience in Machine Learning or Applied ML Engineering.
- Strong fundamentals in deep learning, Transformers, and generative model architectures.
- Hands-on experience with large-scale model training and fine-tuning (e.g., LoRA, full fine-tuning).
- Proven experience in deploying and scaling ML models in production environments.
- Strong understanding of MLOps practices and tools such as MLflow and Weights & Biases.
- Experience with model serving frameworks such as Triton, TorchServe, vLLM, or similar.
- Proficiency in Python and frameworks like PyTorch.
- Experience working with cloud platforms (AWS, GCP, or Azure), including GPU provisioning and autoscaling.
- Ability to work in fast-paced, ambiguous environments with cross-functional teams.

Preferred Qualifications:
- Experience with video generation, diffusion models, or multimodal architectures.
- Familiarity with LoRA/IC-LoRA techniques for character or identity consistency.
- Knowledge of inference optimization techniques such as quantization (FP8/INT8), batching, and GPU memory management.
- Experience with audio/video systems such as TTS, voice cloning, and lip-sync pipelines.
- Background in media, OTT, or large-scale content platforms.

In this role, you can expect competitive compensation, the opportunity to work on cutting-edge AI products at scale, a high-impact role with ownership across the ML lifecycle, a collaborative and fast-paced work environment, and continuous learning and growth opportunities.