🧑‍🚀 Luke Chadwick's Workflow for Digital AI Twins
A practical look at Luke Chadwick's note-as-data workflow for researching digital AI twins with structured extraction, Obsidian, and AI tools.
🧑‍🚀 Luke Chadwick recently shared something that caught my attention: "I went down a Digital AI Twins rabbit hole for a panel I joined in April 2025." He added that what began as "just do some background research" unexpectedly became "300 notes and 3 audio deep dives." That line will feel familiar to anyone who has tried to get truly fluent in a fast-moving topic.
Luke's post is not really about AI twins alone. It is about the difference between skimming and building durable understanding. The core move he describes is deceptively simple: treat notes like data.
Below is my expanded take on what Luke is pointing to, why it matters for messy domains like digital AI twins, and how to apply the same workflow to any research problem where hype and uncertainty blur together.
Digital AI twins are not one topic; they are a spectrum
Luke mentions that AI twins span a "weird range": human-like assistants for work, "companions" people bond with, digital resurrection projects, and case studies where someone turns a real person into a chatbot and discovers uncomfortable edge cases.
That range matters because it breaks most normal research habits.
If you are researching a single tool category (say, "sales automation"), you can get away with a linear outline and a few bookmarked links. But "digital AI twins" intersects:
- Identity and representation (Who is being modeled? With what consent?)
- Product design (What does the interface encourage people to believe?)
- Data provenance (What is the model trained on? What is synthesized?)
- Psychology (attachment, projection, grief, persuasion)
- Law (right of publicity, privacy, estates, jurisdiction)
- Security (impersonation, fraud, model leakage)
When a topic crosses that many domains, your notes become the product. And if your notes are sloppy, your thinking will be sloppy.
Luke's key insight: the only way to stay honest in an emerging space is to systematize how you capture and label information.
The real problem: speculation laundering
One phrase in Luke's post hit me hardest: he deliberately separates "established facts" from "this seems plausible but is still emerging." He notes that on a topic like AI twins, this helps you avoid "accidentally laundering speculation into certainty."
This is the failure mode I see constantly in AI writing:
- Someone reads a confident claim in a secondary source.
- They paraphrase it in their own notes.
- Later, they reuse the paraphrase as if it were verified.
- Over time, the claim becomes "common knowledge" without ever being grounded.
The fix is not just better memory. It is better metadata.
In practical terms, you need a note format that forces you to answer:
- What is the claim?
- Who said it?
- What is the evidence?
- How strong is the evidence?
- What would change my mind?
If your system does not ask those questions, your future self will not either.
Notes as data: the entity extraction approach
Luke says the thing that worked was treating notes like data and building a "structured entity extraction template" so each source becomes consistent, wiki-style notes: organizations, people, technologies, concepts, fiction, timeline events. Each note includes attribution.
That is a quiet but powerful shift. Instead of writing notes as a diary of what you read, you are building a small database that can answer questions.
What a structured template can look like
You can implement this in many ways, but the spirit is consistent. For each source, extract entities into repeatable fields.
For example:
- **Source note**
  - Citation, link, date accessed
  - Summary in 3 to 5 bullets
  - Key claims (each with an evidence level)
  - Open questions
- **Person note** (for founders, researchers, public examples)
  - Role
  - Claims attributed to them
  - Where they appear in your timeline
- **Organization note**
  - What they built or published
  - Business model signals
  - Policies (consent, data retention, safety)
- **Concept note**
  - Definition (your current best version)
  - Adjacent concepts
  - Risks and failure modes
  - What is established vs. emerging
- **Timeline event note**
  - Date
  - What happened
  - Why it matters
  - Links to sources and entities
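To make the "notes as data" idea concrete, here is a minimal sketch of two of those note types as Python dataclasses. This is my own illustration, not Luke's actual template: every field name here is an assumption about what such a schema might contain.

```python
from dataclasses import dataclass, field

# Illustrative schema only: field names are my own, not Luke's exact template.

@dataclass
class SourceNote:
    citation: str
    link: str
    date_accessed: str
    summary: list[str] = field(default_factory=list)      # 3 to 5 bullets
    key_claims: list[str] = field(default_factory=list)   # each tagged with an evidence level
    open_questions: list[str] = field(default_factory=list)

@dataclass
class TimelineEvent:
    date: str
    what_happened: str
    why_it_matters: str
    sources: list[str] = field(default_factory=list)      # links back to SourceNote titles

# Every processed article becomes one SourceNote plus linked entity notes.
note = SourceNote(
    citation="Example, A. (2025). Digital twins overview.",
    link="https://example.com/article",
    date_accessed="2025-04-01",
    summary=["Defines digital AI twins", "Surveys companion apps"],
)
```

The point of fixing the fields up front is that every source then answers the same questions, which is what makes the notes queryable later.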
Once you have 100+ notes, the payoff is compounding. You are no longer searching your memory. You are querying your knowledge base.
Why Obsidian-style linking works for this
Luke uses Obsidian to keep "300 notes navigable." That makes sense because research like this is graph-shaped.
A single case study about "turning a real person into a chatbot" might connect to:
- A specific product category (companion apps)
- An ethics cluster (consent, representation, grief)
- A technical cluster (voice cloning, RAG, fine-tuning)
- A legal cluster (right of publicity)
In a linear document, you pick one path and lose the rest. In a linked system, you keep all paths and let them reinforce each other.
My recommendation if you try this: start simple.
- Use consistent note titles (so search works)
- Add 3 to 6 links per note to related entities
- Add a short "status" line like: Established, Emerging, Speculative
That last one is your anti-hype safety belt.
The toolchain Luke describes (and what it is really doing)
Luke lists a stack that many people dabble with, but he uses it with a clear division of labor:
- Deep Research to process sources
- Agentic coding (Cursor, Claude Code, etc.) to build extraction workflows
- Obsidian to store and connect the notes
- NotebookLM for audio summaries
- Suno (with Claude for lyrics) to create songs as background listening to retain the "shape of a domain"
There are two deeper ideas here.
1) Automate the boring, not the thinking
Using AI to summarize is helpful, but the real leverage is using AI to enforce structure. If you can automate the conversion from "raw article" to "entities + claims + attribution," you reduce fatigue and increase consistency.
Consistency is what makes 300 notes usable.
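One lightweight way to enforce that consistency is to validate every AI-generated note against the schema before it is saved. The sketch below is a hypothetical check of my own devising; the required-field lists are illustrative, not part of Luke's workflow.

```python
# Hypothetical required fields per note type; adjust to your own schema.
REQUIRED_FIELDS = {
    "source": ["citation", "link", "date_accessed", "summary", "key_claims"],
    "person": ["role", "claims", "timeline_refs"],
    "concept": ["definition", "adjacent", "status"],
}

def validate_note(note_type: str, note: dict) -> list[str]:
    """Return the required fields a generated note is missing."""
    return [f for f in REQUIRED_FIELDS.get(note_type, []) if f not in note]

# An AI-extracted draft that forgot two fields:
draft = {"citation": "...", "link": "...", "summary": []}
missing = validate_note("source", draft)  # ["date_accessed", "key_claims"]
```

A failed check sends the source back through extraction instead of letting an incomplete note into the vault.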
2) Multimodal reinforcement helps you stay oriented
NotebookLM audio summaries and even the music experiment sound quirky, but they fit a real learning need: orientation. In a broad domain, you need repeated exposure to the same concepts from different angles so you can feel what matters.
The goal is not perfect recall. The goal is faster recognition of patterns: repeated concerns, recurring claims, and common edge cases.
Applying Luke's workflow to your own research (a simple playbook)
If you want to replicate the "become the expert" effect Luke describes, here is a lightweight version you can run in a weekend.
Step 1: Define your extraction schema
Pick 5 to 8 entity types you will capture every time (people, orgs, products, concepts, risks, timeline events). Write a template.
Step 2: Create an evidence ladder
Add a field to every claim:
- Verified (primary source, direct evidence)
- Supported (multiple credible sources)
- Plausible (reasonable inference, limited evidence)
- Speculative (interesting, but unproven)
This is how you stop speculation laundering.
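The four-rung ladder above can be encoded directly, so that claims sort by how well they are grounded. This is a sketch under my own naming; the `Evidence` enum and example claims are illustrative, not from Luke's post.

```python
from enum import Enum

# Labels mirror the four-rung evidence ladder; numeric values enable sorting.
class Evidence(Enum):
    VERIFIED = 4     # primary source, direct evidence
    SUPPORTED = 3    # multiple credible sources
    PLAUSIBLE = 2    # reasonable inference, limited evidence
    SPECULATIVE = 1  # interesting, but unproven

def strongest_first(claims):
    """Sort (claim, Evidence) pairs so the best-grounded claims surface first."""
    return sorted(claims, key=lambda c: c[1].value, reverse=True)

claims = [
    ("Most users cannot tell a twin from the person", Evidence.SPECULATIVE),
    ("Companion apps encourage attachment", Evidence.SUPPORTED),
]
# strongest_first(claims) puts the SUPPORTED claim before the SPECULATIVE one.
```

When you draft from your notes later, anything below SUPPORTED gets hedged language by default.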
Step 3: Process sources in batches
Do not switch tools constantly. Gather 10 sources, extract them with the same template, then link notes.
Step 4: Build 3 maps of the domain
In Obsidian terms, you can create "hub" notes:
- The timeline (what happened when)
- The landscape (orgs and product categories)
- The risk map (where the edge cases cluster)
Once those exist, every new note has a natural home.
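Hub notes like the timeline can even be generated rather than hand-curated. The sketch below assumes event notes have already been parsed into dicts with hypothetical `date`, `title`, and `why` keys, and emits Obsidian-style `[[wiki-links]]`:

```python
def timeline_hub(events):
    """Render a chronological Markdown hub note with wiki-links to event notes.

    Assumes each event dict has 'date' (sortable ISO string), 'title', and 'why'.
    """
    lines = ["# Timeline"]
    for e in sorted(events, key=lambda e: e["date"]):
        lines.append(f"- {e['date']}: [[{e['title']}]] ({e['why']})")
    return "\n".join(lines)

events = [
    {"date": "2025-04", "title": "Panel on AI twins", "why": "prompted the research"},
    {"date": "2024-11", "title": "Companion app case study", "why": "consent edge case"},
]
print(timeline_hub(events))
```

Regenerating the hub after each batch of sources keeps the map current without manual bookkeeping.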
Step 5: Publish something small
Luke mentions he wrote up the workflow as a PDF guide. That is not just generosity. It is a forcing function. When you explain a system to someone else, you discover where it is fuzzy.
Why this matters specifically for digital AI twins
Digital AI twins raise unusually human questions: bonding, identity, grief, consent, and manipulation. The danger is not only technical error. It is interpretive error.
If your notes blur fact and inference, you can easily:
- Overstate what systems can do today
- Miss the consent boundary when "modeling a person"
- Underweight the emotional impact of companion dynamics
- Ignore the incentives that push products toward deception
Luke's approach is a practical antidote: structure, attribution, and explicit uncertainty.
If you care about doing serious work in this space, you do not just need better prompts. You need better research hygiene.
This blog post expands on a viral LinkedIn post by 🧑‍🚀 Luke Chadwick.