Word Ladders and Equivalence Classes
GUEST POST
Today I have a short guest post from a friend, and work colleague, Nicholas Ormrod. Nicholas and I worked together during my tenure at Facebook, where he still works. I have fond memories of chatting with him, after hours, during many recruiting trips; he’s a very smart cookie.
If you want to be entertained and educated, grab a cup of tea, sit down, and watch one of his presentations. Might I recommend this one: Fantastic Algorithms and Where To Find Them.
Take it away Nicholas…
Change One Letter - Reachability
Recently on DataGenetics, we talked about Word Ladders. A Word Ladder is a sequence of words where each step in the sequence changes only one letter.
For example, you can create a Word Ladder from WARM to COLD like so:
WARM → WARD → CARD → CORD → COLD
Try it out yourself
Equivalence
We just constructed a Word Ladder going from WARM to COLD. We could say that WARM can "reach" COLD. This "reachability" has a few interesting properties:
- It is reversible. If WARM can reach COLD, then COLD can reach WARM - just follow the WARM-to-COLD ladder in reverse.
- It is chainable. If WARM can reach COLD and COLD can reach FOOD (it can: COLD → FOLD → FOOD), then WARM can reach FOOD. Just stick the two Word Ladders together!
- It is self-reachable. WARM can reach WARM… by making no changes. The ladder has only one rung, which is a bit weird, but that's ok.
These three properties form what mathematicians call an "Equivalence Relationship". Equivalence Relationships have a cool property: we can group the words into buckets ("Equivalence Classes") such that:
- If word A can reach word B, then A and B are in the same bucket.
- If word A cannot reach word B, then A and B are in different buckets.
How do these buckets help with Word Ladders? Well, if you want to challenge your friends to an impossible Word Ladder, all you have to do is get two words from separate buckets.
Advertisement:
Equivalence Classes of 4-Letter English Words
Are all 4-letter English words reachable from one another? The short answer is no.
This is a hard question because we haven't specified an important detail: what counts as an English word? This is a surprisingly hard question! Can EVIL reach EMIT? Yes – if you count the proper name EMIL as an English word.
Fortunately, even with the ambiguity of English, we can glean a few interesting facts:
- Around 99% of all words are in the same, huge bucket. Given any two 4-letter English words, chances are you can build a Word Ladder from one to the other.
- Approximately 1% of 4-letter words are unreachable. This is true across several definitions of "English": in a restrictive interpretation, EVIL is unreachable; in a broader interpretation, EVIL is reachable via EMIL, but other newly-added words such as OHMS are unreachable. I have found just one common English word that is unreachable from any other word: UPON.
- Apart from the massive bucket, most other buckets contain a single word. Some esoteric buckets contain two or three, such as: ONYX, a stone; ORYX, an antelope; and ONYM, a suffix.
Hard Words
Want to stump your friends at Word Ladders? Some common words are difficult to reach - they are connected, but only by esoteric and questionably-legitimate English words. A sample:
- EPIC, connected to EMIC, anthropological jargon.
- EVIL, connected to the proper name EMIL and the programming keyword EVAL.
- EXAM, connected to EDAM, a Dutch cheese.
- OKAY, connected to OKAS, an Egyptian unit of weight.
- ONCE, connected to ONCA, from the scientific name for jaguars.
- SYNC, connected to SYNE, Scottish for "ago".
- UGLY, connected to AGLY, a French river.
- UPON, no connections.
- URGE, connected to URDE, heraldric points.
- VOID, connected to OOID, small sedimentary grains.
Overall
Word Ladders is a fun game. Using 4-letter English words, it is almost always possible to find a solution, but a few words can't be connected.