
Michael T. Built a GPT-5.2 Browser That Ran for a Week

AI

Deep dive into Michael T.'s week-long GPT-5.2 Rust browser experiment and what it teaches about AI coding limits.

Tags: LinkedIn content · viral posts · content strategy · AI coding · GPT-5.2 · Cursor editor · Rust programming · browser engine development · social media marketing

Michael T. recently shared something that caught my attention: "We built a browser with GPT-5.2 in Cursor. It ran uninterrupted for one week." He added that it’s "3M+ lines of code across thousands of files," with a from-scratch Rust rendering engine and even "a custom JS VM." And then the most honest line in the whole post: "It kind of works!"

That combination—audacious scope, concrete engineering detail, and a candid reality check—is exactly why this post landed. I want to expand on what Michael T. implied between the lines: building a browser is a brutal integration problem, and the fact that an AI-assisted effort can produce something that renders simple sites "quickly and largely correctly" is both impressive and clarifying about what today’s coding models are (and aren’t) doing.

"It still has issues and is of course very far from Webkit/Chromium parity, but we were astonished that simple websites render quickly and largely correctly." — Michael T.

Why "a browser" is a deceptively hard milestone

When someone says they built a browser, many readers picture tabs, an address bar, and maybe a DOM viewer. But Michael T. is pointing at the part that makes browsers legendary engineering achievements: the rendering engine.

A modern browser engine is an orchestra of subsystems that all need to be correct, fast, and secure at the same time. It’s not just "parse HTML and draw pixels." The happy path demo (loading a simple page) hides the complexity of edge cases, spec ambiguity, and performance constraints.

Michael T. listed the core pieces explicitly: HTML parsing, CSS cascade, layout, text shaping, paint, and JavaScript execution. Each of those words is its own multi-year discipline.

What Michael T.'s architecture list really tells us

Let’s unpack the ingredients Michael T. mentioned, because each one is a clue about why this is a serious engineering project and not a toy demo.

HTML parsing: where ambiguity begins

HTML is famously forgiving. Real pages contain malformed markup, weird nesting, missing tags, and legacy patterns. A browser doesn’t just "read" HTML; it applies error-correction rules so that broken pages still work. Getting this right is one reason new engines take so long.
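
To make that concrete, here is a toy Rust sketch of one such recovery rule, the implied end tag for <p>. It is my own simplified illustration of the spec's behavior, not code from Michael T.'s engine: a paragraph that is never explicitly closed still ends up as a well-formed node in the tree.

```rust
// A minimal sketch of HTML error recovery (illustrative only, not the
// project's code): per the spec, an open <p> is implicitly closed when
// another <p> starts, so "<p>one<p>two" recovers into two sibling paragraphs.

#[derive(Debug)]
struct Node {
    tag: String,
    text: String,
    children: Vec<Node>,
}

fn build_tree(tokens: &[&str]) -> Node {
    let mut root = Node { tag: "body".into(), text: String::new(), children: Vec::new() };
    let mut open_p: Option<Node> = None;

    for tok in tokens {
        match *tok {
            "<p>" => {
                // Implied end tag: a new <p> closes the one already open.
                if let Some(done) = open_p.take() {
                    root.children.push(done);
                }
                open_p = Some(Node { tag: "p".into(), text: String::new(), children: Vec::new() });
            }
            "</p>" => {
                // A stray </p> with nothing open is silently ignored.
                if let Some(done) = open_p.take() {
                    root.children.push(done);
                }
            }
            text => match open_p.as_mut() {
                Some(p) => p.text.push_str(text),
                None => root.text.push_str(text),
            },
        }
    }
    if let Some(done) = open_p.take() {
        root.children.push(done);
    }
    root
}

fn main() {
    // Malformed markup: the first <p> is never explicitly closed.
    let tree = build_tree(&["<p>", "one", "<p>", "two"]);
    assert_eq!(tree.children.len(), 2); // both paragraphs recovered as siblings
    println!("{tree:#?}");
}
```

A real tree builder has hundreds of rules like this, which is exactly why "parse HTML" is never just parsing.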

CSS cascade: the "why does this look wrong?" machine

CSS isn’t hard because the syntax is complex; it’s hard because the cascade and specificity rules interact with inheritance, default styles, computed values, and browser quirks. Two pages can have identical HTML but render wildly differently depending on CSS precedence and layout rules.

If Michael T.’s engine renders "largely correctly" on simple sites, that suggests the project is already nailing a meaningful subset: basic selectors, inheritance, box model, and enough layout to be visually coherent.
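
As a tiny illustration of why precedence trips people up, here is a simplified Rust sketch (mine, not the project's) of specificity comparison, the rule that a single id selector outranks any number of class selectors:

```rust
// A minimal sketch of the specificity half of the cascade (a simplification,
// not the project's code): selectors compare as ordered (id, class, type)
// tuples, so one id selector beats any pile of class selectors.

#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Specificity(u32, u32, u32); // (id selectors, class selectors, type selectors)

fn specificity(selector: &str) -> Specificity {
    let (mut ids, mut classes, mut types) = (0, 0, 0);
    for part in selector.split_whitespace() {
        if part.starts_with('#') {
            ids += 1;
        } else if part.starts_with('.') {
            classes += 1;
        } else {
            types += 1;
        }
    }
    Specificity(ids, classes, types)
}

fn main() {
    // "#nav" wins over ".menu .item .link", even though the latter looks "bigger".
    assert!(specificity("#nav") > specificity(".menu .item .link"));
    println!("{:?} vs {:?}", specificity("#nav"), specificity(".menu .item .link"));
}
```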

Layout: constraint solving at scale

Layout is where engine design turns into performance engineering. Flow layout, inline formatting, flexbox, grid, positioning, overflow, and fragmentation each add layers of algorithmic complexity.

Even if this engine supports only a slice—say block and inline flow—it still has to compute sizes, positions, reflow behavior, and paint order. That it does so "quickly" is a hint that the Rust core is structured with performance in mind.
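
To give a feel for that slice, here is a toy Rust sketch of block-flow layout under my own simplifying assumptions (every block fills its parent's content width and children stack vertically); a real engine adds margins, floats, inline formatting, and far more.

```rust
// A toy block-flow layout sketch (assumptions mine, not the project's design):
// blocks take the full content width and stack top to bottom.

#[derive(Debug, Default, Clone, Copy)]
struct Rect { x: f32, y: f32, width: f32, height: f32 }

#[derive(Debug)]
struct BlockBox {
    intrinsic_height: f32, // stand-in for "height of the content inside"
    children: Vec<BlockBox>,
    frame: Rect,           // filled in by layout
}

fn layout(node: &mut BlockBox, containing: Rect) {
    node.frame.x = containing.x;
    node.frame.y = containing.y;
    node.frame.width = containing.width; // blocks take the full content width

    let mut cursor_y = containing.y;
    for child in &mut node.children {
        let slot = Rect { x: containing.x, y: cursor_y, width: containing.width, height: 0.0 };
        layout(child, slot);
        cursor_y += child.frame.height; // stack children vertically
    }
    node.frame.height = (cursor_y - containing.y).max(node.intrinsic_height);
}

fn main() {
    let mut body = BlockBox {
        intrinsic_height: 0.0,
        children: vec![
            BlockBox { intrinsic_height: 20.0, children: vec![], frame: Rect::default() },
            BlockBox { intrinsic_height: 40.0, children: vec![], frame: Rect::default() },
        ],
        frame: Rect::default(),
    };
    layout(&mut body, Rect { x: 0.0, y: 0.0, width: 800.0, height: 600.0 });
    assert_eq!(body.frame.height, 60.0);
    println!("{:#?}", body.frame);
}
```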

Text shaping: the part most demos skip

Text shaping is a big tell. If you’ve never built a renderer, it’s easy to underestimate the complexity of turning Unicode strings into glyphs with correct kerning, ligatures, bidi text, line breaking, and font fallback.

Including text shaping implies a practical goal: render real content, not just boxes. It’s also a common place where correctness bugs show up, especially across languages and fonts.
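
As a taste of why this gets underestimated, here is a deliberately narrow Rust sketch of just one concern, font fallback, under my own assumption that the primary font only covers ASCII; real shaping also has to get kerning, ligatures, bidi, and line breaking right.

```rust
// A hand-wavy font-fallback sketch (assumptions mine): text is split into runs
// that a single font can actually cover, before any glyph work happens.

fn primary_font_covers(c: char) -> bool {
    c.is_ascii() // pretend the primary font only ships ASCII glyphs
}

/// Split text into (covered_by_primary_font, run) pairs.
fn fallback_runs(text: &str) -> Vec<(bool, String)> {
    let mut runs: Vec<(bool, String)> = Vec::new();
    for c in text.chars() {
        let covered = primary_font_covers(c);
        match runs.last_mut() {
            Some((prev, run)) if *prev == covered => run.push(c),
            _ => runs.push((covered, c.to_string())),
        }
    }
    runs
}

fn main() {
    // "Hi мир" needs two fonts: the primary one for "Hi ", a Cyrillic fallback for "мир".
    let runs = fallback_runs("Hi мир");
    assert_eq!(runs.len(), 2);
    for (covered, run) in &runs {
        println!("{} -> {run:?}", if *covered { "primary" } else { "fallback" });
    }
}
```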

Paint: turning layout into pixels

Painting involves stacking contexts, clipping, borders, backgrounds, transforms, and compositing. Many prototypes draw rectangles and call it a day; a functioning paint pipeline that handles enough CSS to resemble real pages is a major milestone.
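
A common way to structure this stage, and the shape I would guess at here (this is my sketch, not the project's actual design), is a display list: layout output flattened into an ordered list of primitive commands that a backend then rasterizes.

```rust
// A minimal display-list sketch (illustrative only): paint order matters,
// so backgrounds come first and content is stacked on top.

#[derive(Debug)]
enum DisplayItem {
    FillRect { x: f32, y: f32, w: f32, h: f32, color: [u8; 4] },
    Text { x: f32, y: f32, content: String },
}

fn paint_page() -> Vec<DisplayItem> {
    vec![
        DisplayItem::FillRect { x: 0.0, y: 0.0, w: 800.0, h: 600.0, color: [255, 255, 255, 255] },
        DisplayItem::FillRect { x: 0.0, y: 0.0, w: 800.0, h: 50.0, color: [230, 230, 230, 255] },
        DisplayItem::Text { x: 16.0, y: 32.0, content: "It kind of works!".into() },
    ]
}

fn main() {
    for item in paint_page() {
        println!("{item:?}"); // a real backend would rasterize each item here
    }
}
```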

A custom JS VM: ambitious and risky

The phrase "custom JS VM" is probably the most ambitious element in the whole list. JavaScript engines are notoriously complex because they require fast parsing, JIT/bytecode execution strategies, garbage collection, and compatibility with the web platform.

A simplified VM can still unlock a huge leap: basic interactivity, DOM manipulation, event handling, and site scripts that don’t immediately crash. But it also becomes a magnet for edge cases.
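
Stripped to its core, a VM is bytecode plus an execution loop. Here is a deliberately tiny Rust sketch of that core (mine, not theirs); a real engine layers parsing, objects, garbage collection, and the entire DOM surface on top of this loop.

```rust
// A toy stack-machine sketch: opcodes push and combine values on a stack.

#[derive(Debug, Clone, Copy)]
enum Op {
    Push(f64),
    Add,
    Mul,
}

fn run(program: &[Op]) -> Option<f64> {
    let mut stack: Vec<f64> = Vec::new();
    for op in program {
        match op {
            Op::Push(v) => stack.push(*v),
            Op::Add => {
                let (b, a) = (stack.pop()?, stack.pop()?);
                stack.push(a + b);
            }
            Op::Mul => {
                let (b, a) = (stack.pop()?, stack.pop()?);
                stack.push(a * b);
            }
        }
    }
    stack.pop()
}

fn main() {
    // Bytecode for the expression (1 + 2) * 3.
    let program = [Op::Push(1.0), Op::Push(2.0), Op::Add, Op::Push(3.0), Op::Mul];
    assert_eq!(run(&program), Some(9.0));
}
```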

The real headline: the system ran for a week

Michael T. didn’t just say "we got a page to load." He said it "ran uninterrupted for one week." That’s a reliability statement, not a demo statement.

To me, that implies two things:

  1. The project has enough internal discipline to avoid constant panics, memory corruption, runaway loops, or resource leaks.

  2. The integration work—the boring, unglamorous glue that connects parsing, layout, painting, and scripting—is stable enough to keep going under continuous use.

In other words: the most impressive part may not be the screenshots. It’s that the engine behaves like software, not a fragile experiment.
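
Here is a small Rust sketch of what the first point can look like in practice, under my own assumptions about the code: page-level failures are ordinary values that get logged and skipped, so one malformed page never takes the process down.

```rust
// A sketch of panic-free page handling (assumptions mine, not the real codebase):
// each load returns a Result, and errors are recorded rather than fatal.

#[derive(Debug)]
enum LoadError {
    Fetch(String),
    Parse(String),
}

fn load_page(url: &str) -> Result<String, LoadError> {
    // Stand-ins for fetch -> parse -> layout -> paint; any stage can fail.
    if url.contains("offline") {
        return Err(LoadError::Fetch(format!("could not reach {url}")));
    }
    if url.contains("broken") {
        return Err(LoadError::Parse(format!("unclosed tag in {url}")));
    }
    Ok(format!("rendered {url}"))
}

fn main() {
    let pages = ["https://example.com", "https://broken.example", "https://offline.example"];
    for url in pages {
        match load_page(url) {
            Ok(frame) => println!("OK   {frame}"),
            // Log and move on; the engine itself never goes down.
            Err(e) => eprintln!("FAIL {url}: {e:?}"),
        }
    }
}
```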

Where GPT-5.2 (and Cursor) likely helped—and where it didn’t

Michael T. framed this as "built with GPT-5.2 in Cursor." That wording matters. It suggests a co-development workflow: the model accelerates implementation, while humans guide architecture, debugging, and prioritization.

Here’s where AI assistance tends to shine in projects like this:

  • Boilerplate generation: plumbing code, data structures, repetitive traversal logic, serialization, and adapters.
  • Translating specs into first-pass implementations: "implement the CSS cascade" becomes a starting point rather than a blank page.
  • Fast iteration: refactors, moving code, renaming, and expanding test scaffolding.
  • Exploratory coding: trying an approach, learning what breaks, then reworking.

And here’s where I’d expect the humans to still be firmly in charge:

  • System architecture: deciding module boundaries, ownership models in Rust, and long-term maintainability.
  • Debugging correctness: subtle rendering issues are often caused by an off-by-one in layout, an incorrect computed style, or a missing default value. These aren’t solved by more code; they’re solved by better understanding.
  • Performance and profiling: "it works" is different from "it works within constraints." Real performance work is measurement-driven.
  • Spec compatibility: the web platform is a compatibility minefield. Matching behavior isn’t only about correctness; it’s about matching decades of de facto standards.

AI can write a lot of code. A browser requires deciding which code should exist at all.

What "kind of works" teaches builders (and teams)

I appreciate that Michael T. didn’t oversell the result. "Very far from Webkit/Chromium parity" is the right framing, because parity is a moving target maintained by enormous teams.

Still, there are practical takeaways for anyone building complex software with AI assistance:

1) Pick a ruthless scope, then ship the slice

"Simple websites render quickly and largely correctly" is a perfect slice definition. It implies a target set of HTML/CSS/JS features and a user-perceived outcome.

If you can define success as "render these 50 representative pages," you can progress without drowning in the entire spec.
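
Here is what that slice can look like as a harness, sketched in Rust with hypothetical names (render_to_text and the sample pages are placeholders, not the project's API): success is a fixed corpus of representative pages producing acceptable output, nothing more.

```rust
// A toy corpus harness (names and pipeline are hypothetical stand-ins).

fn render_to_text(html: &str) -> String {
    // Placeholder for the real pipeline: parse -> style -> layout -> paint.
    html.replace("<p>", "").replace("</p>", "\n")
}

fn main() {
    // In practice this would be dozens of saved copies of real pages.
    let corpus = [
        ("hello", "<p>hello</p>", "hello\n"),
        ("two-paragraphs", "<p>a</p><p>b</p>", "a\nb\n"),
    ];

    let mut failures = 0;
    for (name, input, expected) in corpus {
        let got = render_to_text(input);
        if got != expected {
            failures += 1;
            eprintln!("FAIL {name}: got {got:?}, expected {expected:?}");
        }
    }
    println!("{} of {} pages pass", corpus.len() - failures, corpus.len());
}
```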

2) Invest early in observability

A week-long run suggests logs, crash handling, and basic instrumentation exist. For complex systems, the ability to see what’s happening is as valuable as features.
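
A minimal Rust sketch of that idea, assuming nothing about the real codebase: panics from individual work items are caught and counted rather than allowed to end the process, which is the bare minimum for keeping a long run alive and explaining it afterwards.

```rust
// A sketch of crash containment plus counters (assumptions mine).

use std::panic::{self, AssertUnwindSafe};

fn layout_page(url: &str) -> usize {
    // Stand-in for real work; one page violates an internal invariant.
    if url.contains("bad") {
        panic!("layout invariant violated for {url}");
    }
    url.len()
}

fn main() {
    // Route panic output through our own (here: silent) hook; a real engine would log details.
    panic::set_hook(Box::new(|_info| {}));

    let pages = ["https://a.example", "https://bad.example", "https://b.example"];
    let (mut ok, mut crashed) = (0u32, 0u32);

    for url in pages {
        match panic::catch_unwind(AssertUnwindSafe(|| layout_page(url))) {
            Ok(height) => {
                ok += 1;
                println!("{url}: laid out ({height})");
            }
            Err(_) => {
                crashed += 1;
                eprintln!("{url}: panicked, recorded, continuing");
            }
        }
    }
    println!("summary: ok={ok} crashed={crashed}");
}
```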

3) Treat compatibility as a product, not a checkbox

Browsers are judged by how many sites work. That means building a feedback loop: which sites fail, why, and what minimal change unlocks the next chunk of the web.

4) Assume integration is the main job

The hardest part of large codebases is not writing modules; it’s making modules cooperate. AI can multiply output, but integration still demands deliberate interfaces, invariants, and tests.
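
One way to make that concrete, sketched in Rust under my own assumptions rather than the project's actual architecture: each stage is a trait with a narrow contract, and the integration layer depends only on the traits, so generated code on either side can change without breaking its neighbors.

```rust
// A sketch of integration via explicit stage interfaces (illustrative only).

struct Dom(String);        // stand-ins for the real intermediate types
struct StyledTree(String);
struct Frame(String);

trait Parser {
    fn parse(&self, html: &str) -> Dom;
}
trait Styler {
    fn style(&self, dom: &Dom) -> StyledTree;
}
trait Layouter {
    fn layout(&self, styled: &StyledTree) -> Frame;
}

// The integration layer depends only on the traits, never on the modules.
fn pipeline(p: &dyn Parser, s: &dyn Styler, l: &dyn Layouter, html: &str) -> Frame {
    l.layout(&s.style(&p.parse(html)))
}

struct NaiveParser;
impl Parser for NaiveParser {
    fn parse(&self, html: &str) -> Dom { Dom(html.trim().to_string()) }
}
struct NaiveStyler;
impl Styler for NaiveStyler {
    fn style(&self, dom: &Dom) -> StyledTree { StyledTree(format!("styled({})", dom.0)) }
}
struct NaiveLayouter;
impl Layouter for NaiveLayouter {
    fn layout(&self, styled: &StyledTree) -> Frame { Frame(format!("frame({})", styled.0)) }
}

fn main() {
    let frame = pipeline(&NaiveParser, &NaiveStyler, &NaiveLayouter, " <p>hi</p> ");
    println!("{}", frame.0);
}
```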

Why this matters beyond one impressive project

If Michael T. and team can assemble a multi-million-line engine with AI assistance and get real pages rendering, it hints at a shift: ambitious systems projects are becoming more accessible.

Not "push button, get Chromium"—we’re not there. But "small, focused teams can build credible prototypes of historically enormous software" is already meaningful. It changes what people attempt, how quickly they can iterate, and how many experimental engines we might see.

And that’s exciting not because it replaces battle-tested browsers, but because it expands the frontier of experimentation: new layout ideas, safer sandboxes, novel scripting models, specialized devices, or educational engines that make the web stack understandable.

The conversation I hope Michael T.'s post sparks

The best posts don’t just announce a result; they invite a deeper question. For me, the question is: what does software development look like when code generation is cheap but correctness, performance, and compatibility remain expensive?

Michael T. showed a compelling answer: you can generate a lot, integrate relentlessly, and still be honest about the gap to production-grade parity.

This blog post expands on a viral LinkedIn post by Michael T. (Building Cursor). View the original LinkedIn post →