Writing: 2025
Diffusion LLMs are Faster at Writing Code
December 13, 2025
In this post, I run small experiments showing that diffusion language models generate code (and other structured text) at a faster rate. Increased structure tends to correlate with reduced entropy, which leads to higher-confidence token predictions, which …
Curserve: Minimizing Agentic Coding End-to-End Latency
November 9, 2025
For Cal Hacks 2025, a few friends and I built Curserve, a fast and scalable server-side engine for agentic coding, which ended up placing for one of the sponsor prizes. We didn’t go to Cal Hacks to try to win, but instead to have a good excuse to work on …
BERT is just a Single Text Diffusion Step
October 20, 2025
A while back, Google DeepMind unveiled Gemini Diffusion, an experimental language model that generates text using diffusion. Unlike traditional GPT-style models that generate one word at a time, Gemini Diffusion creates whole blocks of text by refining …
Local SGD and DiLoCo Research Musings
October 14, 2025
Here are some notes I wrote on this topic. I’ve switched my master’s thesis to a different topic, but I found many interesting research directions in this area.
Local SGD and DiLoCo Overview: It is October 15th, 2025. For my last year of my …
Running GPT-2 in WebGL with Classic GPU Programming
May 24, 2025
A few weeks back, I implemented GPT-2 using WebGL and shaders (GitHub repo), and the project made the front page of Hacker News. Here is a short write-up on what I learned about old-school general-purpose GPU programming over the course of this project!
Above is a …