Posted May 20, 2026
Tenstorrent is leading the industry on cutting-edge AI technology, revolutionizing performance expectations, ease of use, and cost efficiency. With AI redefining the computing paradigm, solutions must evolve to unify innovations in software models, compilers, platforms, networking, and semiconductors. Our diverse team of technologists has developed a high-performance RISC-V CPU from scratch and shares a passion for AI and a deep desire to build the best AI platform possible. We value collaboration, curiosity, and a commitment to solving hard problems. We are growing our team and looking for contributors of all seniorities.
At Tenstorrent, we believe the future of computing must be open, which is why our interns don’t just watch from the sidelines; they help build the core of it. We provide a "code-to-career" pipeline where students collaborate with industry experts to solve high-stakes problems in RISC-V and AI hardware-software co-design. By joining us, you are taking an internship that helps democratize high-performance computing and make it accessible to everyone.
In this role, you will implement state-of-the-art ML models on Tenstorrent hardware using Python and C++, focusing on pushing both accuracy and inference speed. You will work hands-on with Tenstorrent’s open-source software stack (tt-metalium, tt-nn, tt-llk), taking models from framework to silicon and iterating on performance. You will own a well-defined engineering project under the guidance of a dedicated mentor, with direct impact on how real workloads run on our chips.
We are looking for a commitment of at least 3 months for this role, with the potential for extension to 6 months. This role is onsite, based in our Belgrade office.
- Enrolled in the final year of BSc or MSc studies in Computer Science, Computer Engineering, Software Engineering, Electronics, Math, or a related field.
- Solid coding skills in Python and C++, with a basic understanding of machine learning concepts and frameworks.
- You have a passion for programming, are eager to learn, and enjoy solving complex performance and optimization problems.
- You are collaborative, open to feedback, and excited to work closely with experienced engineers and a dedicated mentor.
What We Need
- Implement functional ML models on Tenstorrent hardware using Python and popular ML frameworks like PyTorch.
- Benchmark, analyze, and optimize the performance of the implemented model's inference using existing tools and coding in C++ and Python.
- Collaborate with experienced engineers to validate the accuracy of implemented models and iterate on improvements.
- Contribute to performance optimization efforts where success is measured by achieving both high accuracy and fast execution (inference) of ML models on Tenstorrent hardware.
What You Will Learn
- How to implement state-of-the-art ML models on Tenstorrent hardware using Python, C++, and popular ML frameworks like PyTorch.
- Techniques for benchmarking, analyzing, and optimizing the performance of ML model inference using existing tools and code in C++ and Python.
- How to use (and potentially debug and fix) Tenstorrent’s open-source software libraries, such as tt-metalium, tt-nn, and tt-llk.
- How to collaborate with experienced engineers, apply various problem-solving techniques, and drive a well-defined engineering project under the guidance of a dedicated mentor.
Hiring Timelines
This internship opportunity is available throughout our 3 terms with the following corresponding recruitment cycles: