A doorknob

Blog Posts

Text Diffusion Models are Faster at Writing Code

In this post, I run small experiments showing that diffusion language models generate code (and other structured text) at a faster rate. Increased structure tends to correlate with reduced entropy, which leads to higher-confidence token predictions, which …

Curserve: Minimizing Agentic Coding End-to-End Latency

For Cal Hacks 2025, a few friends and I built Curserve, a fast and scalable server-side engine for agentic coding, which ended up placing for one of the sponsor prizes. We didn’t go to Cal Hacks to try to win, but instead to have a good excuse to work on …

BERT is just a Single Text Diffusion Step

This article appeared on Hacker News. Link to the discussion here. Additionally, Andrej Karpathy wrote his thoughts about the post, linked here. A while back, Google DeepMind unveiled Gemini Diffusion, an experimental language model that generates text …

Local SGD and DiLoCo Research Musings

Here are some notes I wrote on this topic. I’ve switched my master’s thesis to a different topic, but there were many interesting research directions I found in this area. Local SGD and DiLoCo Overview: It is October 15th, 2025. For my last year of my …

Running GPT-2 in WebGL with Classic GPGPU Programming

This article appeared on Hacker News. Link to the discussion here. A few weeks back, I implemented GPT-2 using WebGL and shaders (GitHub repo), which made the front page of Hacker News. Here is a short write-up on what I learned about old-school …

Mathematical Statistics

My notes on Mark Maxwell’s course, Introduction to Mathematical Statistics, and his textbook, Probability & Statistics with Applications, Second Edition. Sampling Distributions and Estimation: Normally in a probability experiment, we don’t know the true …

Common Probability Distributions

An overview of common discrete and continuous distributions found in probability and statistics, from Mark Maxwell’s textbook, Probability & Statistics with Applications, Second Edition. Common Discrete Distributions: Discrete Uniform: A random variable …

How to Fix Hugo's iOS Code-Block Text-Size Rendering Issue

Lately, I’ve been coming across many blogs that have weird font-size rendering issues for code blocks on iOS. Basically, in a code snippet, the text size is sometimes much larger for some lines than for others. Below is a screenshot of the issue from a …

Intro to Autograd Engines: Karpathy's Micrograd in Go

For a while, I wanted to build a complete autograd engine. What is an autograd engine, you might ask? To find the answer, we first must know what a neural network is. Neural Network Crash Course: A neural network can just be seen as a black-box function. …

Where Rust Shines: Algebraic Types and Match Statements

Lexical Analysis and ASTs: Recently I was going through Thorsten Ball’s “Writing An Interpreter in Go”. In this book, you create a basic interpreted language and write a lexer, parser, evaluator, and REPL for it. A lexer takes in source code and turns it …