As an experienced AI/ML Engineer, you will be responsible for partnering with software, research, architecture, and product teams to align AI ecosystem strategies for Windows RTX platforms. Your role will involve collaborating with cross-functional and external teams to drive AI innovation across graphics, browsers, and edge devices. You will optimize AI model performance on current and next-generation GPU architectures through deep analysis and tuning, and perform end-to-end optimization of AI models, data pipelines, and inference runtimes. Additionally, you will implement compute and memory optimization techniques such as quantization, pruning, and distillation, and fine-tune and compress large AI models for efficient deployment on edge devices. Your responsibilities will also include enhancing inference pipelines using frameworks like ONNX Runtime, DirectML, PyTorch, and TensorRT, contributing to the development and optimization of GPU-accelerated applications and libraries, and working on performance profiling, debugging, and system-level optimizations for high-performance computing workloads. Collaboration with distributed teams to deliver scalable, production-ready AI solutions will be a key aspect of your role.

**Key Responsibilities:**
Partner with software, research, architecture, and product teams to align AI ecosystem strategies for Windows RTX platforms
Collaborate with cross-functional and external teams to drive AI innovation across graphics, browsers, and edge devices
Optimize AI model performance on current and next-generation GPU architectures through deep analysis and tuning
Perform end-to-end optimization of AI models, data pipelines, and inference runtimes
Implement compute and memory optimization techniques such as quantization, pruning, and distillation
Fine-tune and compress large AI models for efficient deployment on edge devices
Enhance inference pipelines using frameworks like ONNX Runtime, DirectML, PyTorch, and TensorRT
Contribute to development and optimization of GPU-accelerated applications and libraries
Work on performance profiling, debugging, and system-level optimizations for high-performance computing workloads
Collaborate with distributed teams to deliver scalable, production-ready AI solutions
**Qualifications Required:**
5+ years of experience in AI/ML systems, inference pipelines, or GPU optimization
Strong programming skills in C++ with a solid understanding of data structures and algorithms
Hands-on experience with ML/DL frameworks such as ONNX Runtime, PyTorch, TensorRT, or DirectML
Experience in optimizing AI models for performance, scalability, and memory efficiency
Exposure to GPU programming and high-performance computing (CUDA preferred)
Experience with model optimization techniques like quantization, pruning, and distillation
Familiarity with APIs such as DirectX, Vulkan, or similar is a plus
Strong analytical, debugging, and problem-solving skills
Ability to work in a fast-paced, collaborative, and cross-functional environment
This job requires a B.E. or B.Tech degree.