Posted Feb 1, 2026
A shift is happening in AI that most people have not fully priced in. As models become more capable and agents take over more software work, inference becomes the critical bottleneck. The question stops being whether a model can do the work and becomes whether it can run fast enough to feel like thinking.
Kog was built for that shift. We co-design the execution engine and the model architecture together, specifically for AMD MI300X hardware. Our monokernel runs from first token to last without returning control to the CPU. Our Laneformer architecture is designed to overlap computation and communication by deferring all-reduce by one layer.
Today, Kog serves 2,500 tokens per second. Our next target is 5,000. Our MoE v3 already outperforms Llama 3.2-3B on CORE benchmarks and shows emergent reasoning capabilities where dense models of similar size score zero. We are a team of 11 people, including 10 engineers and 4 PhDs, building a different kind of inference company from first principles.
Why this role matters now
Inference is becoming the constraint that shapes product quality, model design, and company velocity at the same time. At Kog, research is not upstream from execution. Research defines what can run fast, what can scale under real hardware constraints, and what kinds of capabilities become reachable in production. This role sits at that junction. The work you do here will influence the next generation of Kog models, the structure of the training and evaluation loop, and the architectural decisions that determine whether performance gains are incremental or structural.
The problem
Most model research still assumes that architecture quality and execution quality can be optimized separately. That assumption leaves performance on the table and narrows the design space. Kog took a different route. We co-design the model architecture and the execution engine together. Laneformer is a direct expression of that approach. It is built to overlap computation and communication by deferring all-reduce by one layer, which changes what becomes possible inside the generation loop and what constraints every architectural decision must satisfy downstream. This creates a harder research problem and a more interesting one. Progress comes from designing architectures that are mathematically sound, trainable in practice, and structurally aligned with the machine they will run on.
You will own the model architecture roadmap at Kog. You will work on the boundary between research judgment, training reality, and execution constraints. You will decide how the next model generation is structured, trained, evaluated, and refined. This is a hands-on leadership role. You will design architectures, write and review training code, shape experiments, make calls on model direction, and lead a team toward work that compounds. You will be expected to move with equal rigor between theory, implementation, and measured outcomes.
What you will work on
Must-have
Strong signal
Top 0.1% for this role
The strongest candidates for this role have already developed original architectural judgment. They have designed model structures that improved performance because of a specific decision they made, and they can explain the mechanism clearly. They understand that model design is constrained by training dynamics, communication patterns, memory behavior, and the realities of the execution path. When they encounter an idea like Laneformer, they do not treat it as an isolated modeling trick. They immediately understand the downstream consequences for routing, layer structure, optimization, convergence, and generation speed. They know how to turn that understanding into a research program, a training plan, and a sequence of experiments that compounds. They bring both authorship and taste. They know when a result is fundamental, when it is local, and when a model change is worth the system cost it introduces.
What we offer