What Is a Trie? The Data Structure That Shares Its Homework
Imagine you have a list of 100,000 names — every show in a streaming catalog, say — and a pile of 10 million documents. For each document, you need to find which names appear in it. Every name, every document.
The brute-force approach is simple: for each document, search for each name. That’s 100,000 searches per document, times 10 million documents. A trillion searches, and every one of them scans the full text. Even on fast hardware, that’s going to hurt.
There’s a better way. A data structure called a trie (rhymes with “try”) can scan a document for all 100,000 names in a single pass through the text, and it does this by exploiting a simple observation: many of those names start with the same letters.
The Prefix Problem
Look at this list of words:
cat, car, card, care, cart, cast
Notice anything? Every word starts with “ca.” Four of the six start with “car.” If you’re searching for all of these in a document, you’re doing a lot of redundant work — scanning for “ca” six separate times, scanning for “car” four separate times.
What if you could share that work? What if you could build a structure that says: “first look for c, then a — and from there, branch out to look for t, r, or s”?
That’s exactly what a trie does.
A Tree That Shares Prefixes
A trie (short for “retrieval,” coined by Edward Fredkin in 1960) is a tree where every string is stored as a path from the root. Each edge represents one character. When two strings share a prefix, they share a path.
The best way to understand it is to build one. Type words into the explorer below and watch the tree grow — pay attention to how shared prefixes merge into a single path:
Try clicking “TH- words” to load a set with dramatic prefix sharing. All six words share the path t → h → e, and the tree branches only where the words diverge. The filled nodes mark where a complete word ends.
How Sharing Saves Space
The stats at the bottom of the explorer show prefix savings — the percentage of characters saved by storing words in a trie versus storing them independently.
For the “TH- words” set (the, there, their, they, them, then), the savings are striking. Storing these six words independently requires 25 characters. The trie stores all six in just 10 nodes, because the shared prefix the is stored only once.
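If you want to check that number without the explorer, a few lines of Python will do it. This is a sketch of one way to compute the savings figure (the explorer may count slightly differently); it relies on the fact that each distinct prefix of a word becomes exactly one trie node:

```python
# Every distinct prefix of a word becomes exactly one node in the trie
# (the root is not counted here).
words = ["the", "there", "their", "they", "them", "then"]

prefixes = {word[:i] for word in words for i in range(1, len(word) + 1)}
chars = sum(len(word) for word in words)

print(len(prefixes), "trie nodes vs", chars, "characters")  # 10 vs 25
print(f"savings: {1 - len(prefixes) / chars:.0%}")          # savings: 60%
```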
This isn’t just a space optimization. The shared structure is what makes the trie fast. When you’re scanning text for matches, you walk a single path through the tree. Every character you read either advances you deeper into the trie or tells you “no match starts here.” You never waste time re-checking prefixes you’ve already seen.
The Algorithm: Building a Trie
Inserting a word into a trie is beautifully simple:
- Start at the root
- For each character in the word, follow the edge labeled with that character — if no such edge exists, create it
- When you reach the end of the word, mark the current node as a terminal (a word ends here)
Here’s a clean Python implementation. The entire data structure is about 30 lines:
```python
class TrieNode:
    def __init__(self):
        self.children = {}  # char → TrieNode
        self.is_terminal = False


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for char in word:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
        node.is_terminal = True

    def search(self, word):
        node = self.root
        for char in word:
            if char not in node.children:
                return False
            node = node.children[char]
        return node.is_terminal
```
That’s it. insert walks the tree character by character, creating nodes as needed. search walks the same way but returns False the moment it hits a missing edge.
Both operations take O(L) time, where L is the length of the word. No matter how many words are in the trie — a hundred or a hundred thousand — inserting or searching takes time proportional only to the length of the word itself.
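To see both operations in action, here is a quick usage sketch built on the Trie class above, using the word list from earlier:

```python
trie = Trie()
for word in ["cat", "car", "card", "care", "cart", "cast"]:
    trie.insert(word)

print(trie.search("car"))    # True: "car" was inserted as a complete word
print(trie.search("ca"))     # False: "ca" is a prefix, not a terminal node
print(trie.search("cargo"))  # False: the walk dies at the missing "g" edge
```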
The Killer Feature: Multi-Pattern Matching
Searching for a single word is nice, but the real power of a trie is searching for all of your patterns in a single pass through a document. Here’s how:
For each starting position in the text:
- Start at the root of the trie
- Read one character at a time, following edges
- Whenever you land on a terminal node, you’ve found a match
- Keep going — there might be a longer match (like finding both “car” and “card”)
- When you hit a dead end (no edge for the next character), move to the next starting position
```python
# A method of the Trie class: scan text and collect every stored pattern it contains.
def find_all_matches(self, text):
    matches = []
    for i in range(len(text)):                # try every starting position
        node = self.root
        j = i
        while j < len(text) and text[j] in node.children:
            node = node.children[text[j]]
            if node.is_terminal:              # a complete pattern ends here
                matches.append(text[i:j + 1])
            j += 1
    return matches
```
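And here is a quick sketch of it in action, assuming find_all_matches has been added as a method of the Trie class and using the same word list as before:

```python
trie = Trie()
for word in ["cat", "car", "card", "care", "cart", "cast"]:
    trie.insert(word)

print(trie.find_all_matches("my card is in the cart"))
# ['car', 'card', 'car', 'cart']
# "car" is reported inside both "card" and "cart", because the scan
# keeps going after each match instead of stopping at the first one.
```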
The time complexity is O(M × L), where M is the length of the text and L is the length of the longest word in the trie. Compare this to the brute-force approach of checking each pattern separately: O(N × M), where N is the number of patterns.
When N is 100,000 and L is 20 characters? That’s a 5,000× speedup.
| Approach | Time per document | With 100K patterns, 20-char max |
|---|---|---|
| Brute force (one search per pattern) | O(N × M) | Proportional to 100,000 × doc length |
| Trie scan (single pass) | O(M × L) | Proportional to 20 × doc length |
The trie replaces the number of patterns with the maximum pattern length — and pattern length is almost always much, much smaller.
Where Tries Show Up
Tries are everywhere once you know to look for them:
- Autocomplete: As you type, the app walks the trie to your current prefix, then enumerates all completions. This is why suggestions appear instantly even when the dictionary has millions of entries. (A sketch of this prefix walk follows this list.)
- Spell-checking: Walk the trie for the typed word. If you don’t land on a terminal node, the word isn’t in the dictionary. The nearby branches suggest corrections.
- IP routing: Routers use a variant called a Patricia trie to find the longest matching prefix for an IP address, determining where to forward each packet.
- Entity detection: In production systems, tries are used to scan documents for mentions of known entities — show titles, people, places — across millions of documents. This is the use case that motivated my own trie implementations, and we’ll explore it in depth in a later post.
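To make the autocomplete case concrete, here is a minimal sketch of prefix completion built on the Trie class from earlier. The completions method is my own addition for illustration, not part of the class above, but it is exactly the walk-then-enumerate described in the autocomplete bullet:

```python
    # A hypothetical addition to the Trie class: walk to the prefix,
    # then enumerate every word stored beneath that node.
    def completions(self, prefix):
        node = self.root
        for char in prefix:          # walk the trie to the end of the prefix
            if char not in node.children:
                return []            # no stored word starts with this prefix
            node = node.children[char]

        results = []

        def collect(n, suffix):
            if n.is_terminal:
                results.append(prefix + suffix)
            for char, child in n.children.items():
                collect(child, suffix + char)

        collect(node, "")
        return results
```

For the cat/car word list, completions("car") would return ['car', 'card', 'care', 'cart'] (in insertion order): three steps to walk the prefix, and then everything below that node is a valid suggestion.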
The Name
A quick note on pronunciation. Edward Fredkin coined the name in 1960 from “retrieval” and intended it to be pronounced “tree” — as in, it’s a tree for retrieval. But this was confusing (a trie is a tree, so calling it “tree” sounds redundant), and most people now pronounce it “try.” I’ll use “try” throughout this series, but you’ll encounter both in the wild.
What’s Next
We’ve seen the core data structure: a tree that shares prefixes, enabling multi-pattern search in a single pass. But there are subtleties we’ve glossed over — what happens with case sensitivity? How do you handle word boundaries so you don’t match “art” inside “cart”? What if two patterns share the same surface form?
The Series So Far
- What Is a Trie? — You are here
- Building an Interactive Trie Visualizer with D3 — How the animated visualization works
- Scanning Text with a Trie — Multi-pattern matching, word boundaries, overlap resolution
- Broadcasting a Trie in Spark — The distributed computation pattern
- Building a Trie-Powered Autocomplete with React — The React component
- Shrinking the Trie for the Wire — Can a custom format beat gzip? We measured.