👉 Try the interactive tool now: Meaning Machine →
In the age of ChatGPT, Claude, and other AI language systems, we often interact with these tools as if they understand us. We type something in, something intelligent comes out, and we move on. But there's a profound gap between how these systems process language and how humans do—a gap that reveals much about both artificial intelligence and ourselves.
Today, I'm sharing Meaning Machine, a project that lifts the veil on how language models actually process our words.
What Happens When AI Reads Your Words?
When you type a sentence into a language model, four key transformations happen:
1. Tokenization: Breaking Language into Fragments
Language models don't process whole sentences. Instead, they slice your text into "tokens"—words, parts of words, or even individual characters. Each token gets mapped to a specific numeric ID from the model's vocabulary.
For example, the phrase "The young student didn't submit the final report on time" gets broken into tokens like "The", "young", "student", "didn", "'t", "submit"... each with its own ID number. This is the first abstraction away from human language.
For GPT models, common words might be single tokens, while rare words get split into multiple subword tokens. This affects how the model processes meaning—tokens are the fundamental units of "understanding."
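To make this concrete, here's a minimal sketch of tokenization using the Hugging Face `transformers` library with the GPT-2 tokenizer. The specific library and model are assumptions for illustration; the Meaning Machine may use a different tokenizer under the hood.

```python
# Minimal sketch: split a sentence into subword tokens and vocabulary IDs
# using the GPT-2 tokenizer (an illustrative choice, not necessarily the
# tokenizer the Meaning Machine uses).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

sentence = "The young student didn't submit the final report on time"
token_ids = tokenizer.encode(sentence)
tokens = tokenizer.convert_ids_to_tokens(token_ids)

for token, token_id in zip(tokens, token_ids):
    print(f"{token!r:>12} -> {token_id}")
# Subword pieces like "didn" and "'t" each get their own vocabulary ID.
```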
2. Part-of-Speech Tagging: Assigning Grammatical Roles
Next, the system identifies the grammatical role of each token. Is it a noun, verb, adjective? Is it the subject of the sentence or an object?
The system maps out what linguists call dependency structure—how words relate to each other in a sentence. It extracts subjects, verbs, and objects, creating a structured representation of who did what to whom.
In our example, tools like spaCy would identify "student" as a noun and the subject, "submit" as the main verb, and "report" as the direct object. These relationships form the skeleton of meaning.
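Here's roughly what that looks like in code with spaCy's small English model. This is a sketch for illustration; it assumes `en_core_web_sm` is installed and may differ from the tool's exact setup.

```python
# Sketch: part-of-speech tagging and a simple subject/verb/object extraction
# with spaCy (requires: python -m spacy download en_core_web_sm).
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The young student didn't submit the final report on time")

for token in doc:
    print(f"{token.text:>10}  pos={token.pos_:<6} dep={token.dep_}")

# Pull out a rough who-did-what-to-whom triple from the dependency labels.
subjects = [t.text for t in doc if t.dep_ == "nsubj"]
verbs = [t.text for t in doc if t.dep_ == "ROOT"]
objects = [t.text for t in doc if t.dep_ in ("dobj", "obj")]
print(subjects, verbs, objects)  # e.g. ['student'] ['submit'] ['report']
```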
3. Embedding: Converting Words to Vectors
Here's where things get fascinating. Each token gets transformed into a vector—a list of hundreds of numbers that capture its meaning and context. In the tool, we use BERT's 768-dimensional embeddings and visualize them in 2D through Principal Component Analysis (PCA).
Words with similar meanings, contexts, or functions cluster together in this mathematical space. "Dog" and "cat" would be closer to each other than either would be to "algorithm." This "distributional semantics" approach is the foundation of how language models simulate understanding.
The embedding space is where a model's "knowledge" lives—not as facts, but as geometric relationships between points in this high-dimensional space.
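As a rough sketch of this step, the snippet below pulls 768-dimensional token embeddings from `bert-base-uncased` via Hugging Face and projects them to 2D with scikit-learn's PCA. The model name and libraries are illustrative assumptions; the tool's exact implementation may differ.

```python
# Sketch: get per-token 768-dimensional BERT embeddings and reduce them to
# 2D with PCA for plotting (roughly mirroring the visualization described
# above; details of the Meaning Machine's implementation may differ).
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.decomposition import PCA

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

sentence = "The young student didn't submit the final report on time"
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# last_hidden_state has shape (1, num_tokens, 768): one vector per token.
embeddings = outputs.last_hidden_state[0].numpy()

# Project 768 dimensions down to 2 for a scatter plot.
coords_2d = PCA(n_components=2).fit_transform(embeddings)

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
for token, (x, y) in zip(tokens, coords_2d):
    print(f"{token:>10}  ({x:+.2f}, {y:+.2f})")
```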
4. Dependency Parsing: Mapping Relationships
Finally, the system constructs a tree of relationships between words. This visualization shows how modifiers, subjects, objects, and clauses connect to form the complete meaning of your sentence.
These trees reveal the hierarchical structure of language—which words modify which others, how clauses nest within each other, and how the overall meaning is constructed from individual components.
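One quick way to see this hierarchy yourself is to walk the tree spaCy builds and print each token alongside its head; the commented-out `displacy` call renders the same tree as a diagram. This is an illustrative sketch, not necessarily how the tool draws its visualization.

```python
# Sketch: walk the dependency tree spaCy builds and print each token with
# the word it attaches to, exposing the hierarchical structure.
import spacy
from spacy import displacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The young student didn't submit the final report on time")

for token in doc:
    print(f"{token.text:>10} --{token.dep_}--> {token.head.text}")

# Render the tree as an SVG diagram (starts a local server when run
# outside a notebook):
# displacy.serve(doc, style="dep")
```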
Why This Matters (Beyond the Technical Details)
These technical steps reveal something deeper: Language models don't understand language the way humans do. They simulate it—convincingly, but fundamentally differently.
When you or I say "dog", we might recall the feeling of fur, the sound of barking, even emotional responses. But when a model sees "dog", it sees a vector of numbers, shaped by how often "dog" appears near words like "bark," "tail," or "vet."
That's not wrong—it's statistical meaning. But it's also disembodied, ungrounded, and unaware.
So What?
- Language models don't have beliefs or goals—they just predict what's likely to come next
- Their understanding of "truth" is co-occurrence-based, not experiential
- The ambiguity humans process instinctively must be explicitly encoded
And yet: these systems now write our resumes, filter our content, and decide what's visible or valuable. The difference between performance and understanding is no longer philosophical trivia. It's infrastructure.
Try It Yourself & The Performance Age
Explore how these transformations work in real time using the Meaning Machine, an interactive visualization tool that reveals how AI parses and embeds your words.
This is part of a broader exploration I'm calling *The Performance Age*—investigating how truth, perception, and performance shift in the algorithmic era.
A System's Perspective
This tool shows how algorithmic systems convert rich, ambiguous language into structured data. Each transformation is a lossy process—we gain computational tractability but sacrifice nuance. As AI systems increasingly mediate our lives, we must ask:
What knowledge is amplified? What subtlety is erased?
How do these algorithmic lenses shape the reality we perceive—and the decisions we make?
Have thoughts about this project? Drop them in the comments below, or reach out directly.