Research

Active research areas and the projects that explore them. Each interest is backed by hands-on implementation and experimentation.

LLM Architecture

Designing more efficient, capable, and interpretable language model architectures. Exploring alternatives to the standard transformer stack, including state-space models, linear attention, and mixture-of-experts systems.

Key Papers

Attention Is All You Need
Mamba: Linear-Time Sequence Modeling
Mixtral of Experts

Efficient Inference

Optimizing LLM inference through algorithmic improvements (FlashAttention, speculative decoding), system optimizations (continuous batching, KV cache management), and hardware-aware design.

Key Papers

FlashAttention-2
Speculative Decoding
vLLM: Easy, Fast, and Cheap LLM Serving

Model Compression

Reducing model size and inference cost through quantization, pruning, knowledge distillation, and architecture search. Focus on maintaining quality while achieving dramatic speedups.

Key Papers

GPTQ: Accurate Post-Training Quantization
AWQ: Activation-aware Weight Quantization
The Case for 4-bit Precision

Long Context Systems

Extending transformer context windows to millions of tokens through architectural innovations, memory systems, and position encoding advances. Applications in document analysis, code understanding, and multi-turn conversation.

Key Papers

Longformer
Ring Attention
YaRN: Efficient Context Window Extension

Agentic AI

Building autonomous systems that can plan, reason, use tools, and interact with environments. Focus on reliability, safety, and capability in multi-step reasoning tasks.

Key Papers

ReAct: Synergizing Reasoning and Acting
Reflexion: Self-Reflective Agents
Toolformer

Multimodal Models

Extending language models to understand and generate across vision, audio, and other modalities. Focus on efficient alignment and unified representation learning.

Key Papers

CLIP
LLaVA
Flamingo

Interpretability

Understanding the internal mechanisms of language models through mechanistic interpretability, feature visualization, and circuit tracing. Goal: making AI systems understandable and auditable.

Key Papers

A Mathematical Framework for Transformer Circuits
Sparse Autoencoders Find Highly Interpretable Features
Mechanistic Interpretability

Open Source AI

Contributing to open, reproducible, and accessible AI research. Building tools, datasets, and models that democratize access to state-of-the-art AI capabilities.

Key Papers

The Pile
OpenLM
OLMo

Research Philosophy

I believe the most impactful research in AI comes from deep, end-to-end understanding of systems. Rather than optimizing isolated metrics, I focus on the fundamental mechanisms that make language models work, and on where they break.

My approach combines rigorous engineering with scientific curiosity. Every project starts with a question, proceeds through systematic experimentation, and ends with documented insights that inform the next question.

I am particularly interested in making AI systems more efficient, interpretable, and safe. The ultimate goal is not just to build bigger models, but to build better systems that we can understand, trust, and deploy responsibly.