
Networks of Thought: Finding Your Research Niche in the Age of LLMs

You can’t compete with infinite compute. But you can find adjacent spaces where depth matters more than scale.


The strategic problem

I can’t compete with OpenAI, Anthropic, or Google on LLM capabilities. They have compute, talent, and capital I’ll never approach.

So I asked a different question: not “how do we make LLMs smarter?” but “what do our conversations with LLMs reveal about how we think, and how can we make that structure useful?”

This turned out to be a far less saturated space with real intellectual problems and practical applications. It maps naturally to a research program that can sustain itself over years, spawn diverse projects, and produce something people actually use.

The lesson is general: when you can’t compete on the main axis, find the orthogonal space where your particular skills and constraints become advantages.


The cognitive MRI

I started by analyzing my own AI conversation logs: years of chats with ChatGPT, Claude, and other systems, and thousands of conversations spanning code, research, philosophy, health, and projects.

Linear text hides structure. But when you construct semantic similarity networks from these conversations and analyze their topology, something interesting emerges.

Method

  1. Embed conversations using standard language models
  2. Weight user inputs 2x more than AI responses (ablation studies showed 2:1 user:AI weighting maximizes modularity)
  3. Construct similarity graph with cosine similarity edge weights
  4. Apply threshold cutoff to keep meaningful connections (phase transition appears around theta ~ 0.9)
  5. Identify communities through network clustering

The result is a map of how your knowledge exploration actually structures itself.
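As a sketch of steps 1-5, assuming networkx and numpy: the embeddings here are synthetic stand-ins for a real sentence-embedding model, and all names and constants are illustrative, not the project's actual code.

```python
import numpy as np
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

rng = np.random.default_rng(0)

def conversation_vector(user_emb, ai_emb, user_weight=2.0):
    """Step 2: weight the user turn 2x over the AI turn, then renormalize."""
    v = user_weight * user_emb + ai_emb
    return v / np.linalg.norm(v)

# Step 1 stand-in: synthetic "conversations" drawn around two orthogonal
# topic centroids instead of real model embeddings.
centroids = np.eye(8)[:2]
convs = []
for i in range(10):
    c = centroids[i % 2]
    user = c + 0.05 * rng.normal(size=8)
    ai = c + 0.05 * rng.normal(size=8)
    convs.append(conversation_vector(user / np.linalg.norm(user),
                                     ai / np.linalg.norm(ai)))
X = np.vstack(convs)

# Steps 3-4: cosine-similarity graph, thresholded at theta.
theta = 0.9
sims = X @ X.T
G = nx.Graph()
G.add_nodes_from(range(len(convs)))
for i in range(len(convs)):
    for j in range(i + 1, len(convs)):
        if sims[i, j] >= theta:
            G.add_edge(i, j, weight=float(sims[i, j]))

# Step 5: community detection.
communities = greedy_modularity_communities(G, weight="weight")
print(len(communities))  # recovers the two planted topic clusters
```

On real conversation logs the embedding step would use an actual language model, and the communities would correspond to domains of exploration rather than planted clusters.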

Heterogeneous topology

Different kinds of conversations have fundamentally different network structures.

Programming and practical work form tree-like hierarchies: linear problem-solving paths, branching from problem to solution, few cross-domain connections, high average path length.

Research and conceptual work form small-world networks: hub-and-spoke structure, central concepts with many connections, bridge nodes linking distant domains, short paths between seemingly unrelated ideas.

Intermediate domains produce hybrid structures between these extremes.

This wasn’t predicted. The topology reveals the cognitive mode of each domain. Programming conversations are convergent (narrow toward a solution). Research conversations are divergent (explore connections).
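The contrast can be illustrated with canonical graph generators rather than real conversation data: a balanced tree stands in for the convergent regime, a Watts-Strogatz graph for the divergent small-world regime.

```python
import networkx as nx

# Canonical stand-ins for the two regimes (not actual conversation graphs).
tree = nx.balanced_tree(r=2, h=5)  # 63-node hierarchy: convergent work
small_world = nx.connected_watts_strogatz_graph(n=63, k=4, p=0.1, seed=42)

for name, g in [("tree", tree), ("small-world", small_world)]:
    print(name,
          "avg path:", round(nx.average_shortest_path_length(g), 2),
          "clustering:", round(nx.average_clustering(g), 2))
```

The tree has long average paths and zero clustering; the small-world graph has short paths and high clustering. That is the same signature separating programming conversations from research conversations above.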

Bridge nodes

A few key conversations act as bridge nodes that hold the entire knowledge graph together. These aren’t necessarily the “most important” conversations by any subjective measure. They’re the ones that connect otherwise separate communities.

Remove these bridges, and the knowledge graph fragments into isolated clusters. They represent the conceptual linchpins of how you integrate different domains.
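A toy demonstration of the effect: in a barbell graph (two dense communities joined through a single node), the joining node has the highest betweenness centrality, and removing it fragments the graph into two components.

```python
import networkx as nx

# Two 5-cliques joined through one middle node: a caricature of two
# knowledge communities held together by a single bridging conversation.
G = nx.barbell_graph(5, 1)

bc = nx.betweenness_centrality(G)
bridge = max(bc, key=bc.get)
print("bridge node:", bridge)

H = G.copy()
H.remove_node(bridge)
print("components before/after removal:",
      nx.number_connected_components(G),
      nx.number_connected_components(H))
```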

The ablation studies

We varied two parameters systematically:

  • User:AI weighting ratio: from 1:1 to 3:1, with 2:1 producing the highest modularity
  • Threshold cutoff theta: from 0.7 to 0.95, with a phase transition around 0.9

The phase transition is important. It suggests we’re hitting real structure, not parameter artifacts. There’s a principled way to extract community structure from conversational data.
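A minimal harness for the threshold sweep, run here on synthetic embeddings: the numbers illustrate the mechanics of the ablation, not the reported optima, and the weighting sweep would wrap the same loop around the embedding step.

```python
import numpy as np
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities, modularity

rng = np.random.default_rng(1)

# Two topic clusters whose centroids overlap (cosine ~ 0.71), so the
# choice of threshold actually matters.
c1 = np.zeros(8); c1[0] = 1.0
c2 = np.zeros(8); c2[0] = c2[1] = 1.0 / np.sqrt(2)
points = []
for i in range(10):
    v = (c1 if i % 2 == 0 else c2) + 0.02 * rng.normal(size=8)
    points.append(v / np.linalg.norm(v))
X = np.vstack(points)
sims = X @ X.T

def modularity_at(theta):
    """Build the thresholded similarity graph and score its best partition."""
    G = nx.Graph()
    G.add_nodes_from(range(len(points)))
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if sims[i, j] >= theta:
                G.add_edge(i, j, weight=float(sims[i, j]))
    comms = greedy_modularity_communities(G, weight="weight")
    return modularity(G, comms, weight="weight")

for theta in (0.5, 0.8, 0.95):
    print(theta, round(modularity_at(theta), 3))
```

Below the transition the cross-cluster edges survive and modularity stays low; above it only within-cluster edges remain and modularity jumps, which is the qualitative signature of a phase transition in theta.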


What current tools miss

Complex networks science has spent decades proving one insight: topology reveals what reductionist approaches miss. We’ve applied this to social networks, biological systems, infrastructure, citation graphs.

Now we have a new data source that’s exploding in volume: our conversations with AI systems. These are networks of thought, semantic connections, conceptual bridges, knowledge communities. And most tools ignore this structure entirely.

Typical RAG (retrieval-augmented generation):

  1. Convert query to embeddings
  2. Find nearest neighbors in vector space
  3. Return similar documents
  4. Done

This is nearest-neighbor search in a metric space. It doesn’t know which documents are strongly connected, where bridge nodes link distant communities, which documents act as hubs, or how knowledge clusters actually organize.

We’re throwing away the graph structure.
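Reduced to code, the four steps above are just a top-k cosine search; no graph structure enters the computation at any point. The function name and toy embeddings here are illustrative.

```python
import numpy as np

def vanilla_rag_retrieve(query_emb, doc_embs, k=3):
    """Return indices of the k most cosine-similar documents. That's all."""
    q = query_emb / np.linalg.norm(query_emb)
    D = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    sims = D @ q
    return np.argsort(sims)[::-1][:k].tolist()

# Four toy 2-d document embeddings; the query sits near the first two.
docs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.2, 0.8]])
print(vanilla_rag_retrieve(np.array([1.0, 0.05]), docs, k=2))  # [0, 1]
```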


What we should build instead

Tools that understand their own network structure.

Queryable: not just “find documents similar to this query” but “what bridges connect these two topics?” and “which documents are hubs in this domain?” and “show me the path between these distant ideas.”

Browseable: surface the actual network topology. Show natural clusters. Highlight hubs. Reveal bridges. Let navigation mirror how knowledge actually connects, not how you thought it should be organized when you filed it.

Conversable: an LLM that can reason about topology. “These three documents form a tight cluster because they explore X from different angles. This other document bridges to cluster Y through principle Z.” Not just summarizing content, but reasoning about the graph.
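A hypothetical sketch of what the first two query types could look like over a networkx graph whose nodes carry a topic/community label. All function names here are mine, not an existing API.

```python
import networkx as nx

def hubs(G, topic, top=3):
    """'Which documents are hubs in this domain?': highest-degree nodes in one topic."""
    nodes = [n for n, d in G.nodes(data=True) if d["topic"] == topic]
    return sorted(nodes, key=G.degree, reverse=True)[:top]

def bridges_between(G, topic_a, topic_b):
    """'What bridges connect these two topics?': edges spanning the two communities."""
    return [(u, v) for u, v in G.edges()
            if {G.nodes[u]["topic"], G.nodes[v]["topic"]} == {topic_a, topic_b}]

def path_between(G, a, b):
    """'Show me the path between these distant ideas.'"""
    return nx.shortest_path(G, a, b)

# Tiny example: a networks cluster, a health cluster, one bridging edge.
G = nx.Graph()
G.add_nodes_from([("n1", {"topic": "networks"}), ("n2", {"topic": "networks"}),
                  ("n3", {"topic": "networks"}),
                  ("h1", {"topic": "health"}), ("h2", {"topic": "health"})])
G.add_edges_from([("n1", "n2"), ("n1", "n3"), ("n2", "n3"),
                  ("h1", "h2"), ("n3", "h1")])

print(hubs(G, "networks", top=1))
print(bridges_between(G, "networks", "health"))
print(path_between(G, "n1", "h2"))
```

The conversable layer would then hand these structural answers to an LLM to narrate: the graph supplies the topology, the model supplies the interpretation.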


The infrastructure

I’m building a Python package that generalizes these insights into a domain-specific language for network-augmented retrieval.

The DSL handles:

  • Graph construction from arbitrary data sources (conversations, ebooks, bookmarks, documents)
  • Community detection with tunable parameters
  • Bridge identification and path analysis
  • Hub detection
  • A conversational interface layered over graph structure

Orchestration across multiple data sources happens via MCP.

Apply it to your AI conversation history, ebook collection, browser bookmarks, email archives, personal documents. Everything becomes part of one unified knowledge graph where the topology reveals structure you didn’t know existed.


For network scientists

For decades, network science has analyzed structure in static datasets: social graphs from 2015, protein interactions, transportation networks.

Now we have a new data source (LLM conversations generating networks of thought at scale, continuously), a new capability (LLMs can interpret what network structure means in ways pure algorithms cannot), and a new scope (network analysis across interconnected data sources simultaneously through unified interfaces).

This isn’t a departure from complex networks thinking. It’s the natural next step. We’re finally building systems that understand networks the way network science understands them.


One class, one publication

I took a single class on complex networks. It resulted in a peer-reviewed publication and a talk at Complex Networks 2025.

This isn’t about being brilliant. It’s about asking the right questions in a less-saturated space, bringing complementary skills (programming, statistics, mathematical thinking), working strategically rather than competing on compute, and building infrastructure that supports long-term programs.

The activation energy for productive research here is lower than people think. The bottleneck isn’t prerequisite knowledge. It’s intellectual curiosity and problem-solving ability.


Context

This work exists in a particular context. I have stage 4 cancer, uncertain time horizons, and recurring treatment cycles.

Cancer doesn’t change the intellectual work. It clarifies what’s worth doing.

Strategic positioning matters more when time is uncertain. Finding sustainable research directions matters more. Building infrastructure that others can continue matters more.

The cognitive MRI project documents my own knowledge exploration during compressed timelines. The networks map real intellectual work under real constraints. It’s fitting: network analysis of thought, while time runs.


For graduate students

If you’re looking for research directions, don’t compete on the main axis everyone else is competing on. Find the orthogonal space.

Look for problems that are:

  • Adjacent to hyped areas but less saturated
  • Solvable with your existing skills plus learnable tools
  • Generative of multiple follow-up questions
  • Practically useful beyond academic novelty
  • Sustainable over multi-year timelines

Personal knowledge graphs + complex networks + LLMs is one such space. There are others.

The key is strategic positioning: where can you actually contribute something novel without infinite compute or decades of specialization?


What’s next

The complex-net RAG package will:

  • Provide a DSL for graph-augmented retrieval
  • Work across diverse data types
  • Enable queryable/browseable/conversable interfaces
  • Orchestrate multiple sources through MCP
  • Support both personal and public data for reproducibility

The research program will explore methodological refinements (weighting schemes, threshold selection), applications to new domains, cognitive science questions (what do these structures reveal about thinking?), systems questions, and tool development.

Most importantly: it creates a training ground where students can enter the space quickly, contribute meaningfully, and build their own research directions.



Complex Networks 2025, New York, December.
