Posted May 8, 2026
Role Overview:
You are a Senior Computer Vision Developer responsible for designing, building, and deploying vision-based AI solutions for real-world applications. The role calls for deep hands-on experience with image/video analytics, deep learning model development, optimization, and deployment. Working closely with AI Architects and data engineers, you will deliver high-performance, production-grade vision systems.

Key Responsibilities:

Model Development & Training:
- Implement and fine-tune state-of-the-art computer vision models for object detection, classification, segmentation, OCR, pose estimation, and video analytics.
- Apply transfer learning, self-supervised learning, and multimodal fusion to accelerate development.
- Experiment with generative vision models (GANs, Diffusion Models, ControlNet) for synthetic data augmentation and creative tasks.

Vision System Engineering:
- Develop and optimize vision pipelines from raw data preprocessing through training, inference, and deployment.
- Build real-time vision systems for video streams, edge devices, and cloud platforms.
- Implement OCR and document AI for text extraction, document classification, and layout understanding.

Optimization & Deployment:
- Optimize models for low-latency, high-throughput inference using ONNX, TensorRT, OpenVINO, and CoreML.
- Deploy models on cloud (AWS/GCP/Azure) and edge platforms (NVIDIA Jetson, Coral, iOS/Android).
- Benchmark models for accuracy vs. performance trade-offs across hardware accelerators.

Data & Experimentation:
- Work with large-scale datasets (structured/unstructured, multimodal).
- Implement data augmentation, annotation pipelines, and synthetic data generation.
- Conduct rigorous experimentation and maintain reproducible ML workflows.

Required Skills & Qualifications:
- Programming: Expert in Python; solid experience with C++ for performance-critical components.
- Deep Learning Frameworks: PyTorch, TensorFlow, Keras.
- Computer Vision Expertise:
  - Detection & Segmentation: YOLO (v5-v8), Faster/Mask R-CNN, RetinaNet, Detectron2, MMDetection, Segment Anything.
  - Vision Transformers: ViT, Swin, DeiT, ConvNeXt, BEiT.
  - OCR & Document AI: Tesseract, PaddleOCR, TrOCR, LayoutLM/Donut.
  - Video Understanding: SlowFast, TimeSformer, action recognition models.
  - 3D Vision: PointNet, PointNet++, NeRF, depth estimation.
  - Generative AI for Vision: Stable Diffusion, StyleGAN, DreamBooth, ControlNet.
- MLOps Tools: MLflow, Weights & Biases, DVC, Kubeflow.
- Optimization Tools: ONNX Runtime, TensorRT, OpenVINO, CoreML, quantization/pruning frameworks.
- Deployment: Docker, Kubernetes, Flask/FastAPI, Triton Inference Server.
- Data Tools: OpenCV, Albumentations, Label Studio, FiftyOne.