Variant Generation
The variants and splitting modules generate sound-alike names proactively, which is useful for exploring the phonetic neighborhood of any word. Applications include typosquatting detection, brand-name screening, domain-squatting analysis, and search-term expansion.
Phonetic Variants
The variants.generate() function applies phonetic substitution rules to produce names that sound like the input but are spelled differently:
from phonemenal import variants
variants.generate("flask")
# → {"phlask", "flazk", "flasc", ...}
variants.generate("click")
# → {"clik", "klick", "klik", ...}
Substitution Rules
phonemenal applies 22 bidirectional substitution patterns:
| Pattern | Example |
|---|---|
| ph ↔ f | flask → phlask |
| ck ↔ k | click → clik |
| x ↔ ks | flux → fluks |
| s ↔ z | flask → flazk |
| i ↔ y | click → clyck |
| qu ↔ kw | quest → kwest |
| ch ↔ c | rich → ric |
| sh ↔ s | bash → bas |
| th ↔ t | math → mat |
| er ↔ or | server → servor |
| le ↔ el | bottle → bottel |
| ai ↔ ay | train → trayn |
| And more... | |
Each substitution is applied in both directions, and double letters are toggled (e.g., ll → l, l → ll).
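To make the bidirectional behavior concrete, here is a minimal standalone sketch. It is not phonemenal's implementation: the `RULES` list is a hand-picked subset of the table above, `single_substitutions` is a hypothetical helper name, and the sketch applies only one rule at one position per variant, with no double-letter toggling.

```python
# Illustrative subset of the rule table; assumed for this sketch only.
RULES = [("ph", "f"), ("ck", "k"), ("x", "ks"), ("s", "z")]


def single_substitutions(word: str) -> set[str]:
    """Return spellings reachable by applying one rule at one position."""
    results = set()
    for left, right in RULES:
        # Apply each rule in both directions (left -> right, right -> left).
        for src, dst in ((left, right), (right, left)):
            start = word.find(src)
            while start != -1:
                results.add(word[:start] + dst + word[start + len(src):])
                start = word.find(src, start + 1)
    # The input itself is not a variant.
    results.discard(word)
    return results


print(sorted(single_substitutions("flask")))
# → ['flasck', 'flazk', 'phlask']
```

A real generator would presumably chain several substitutions and filter implausible spellings; this sketch only shows why one rule yields candidates in both directions.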
Separator Permutations
By default, the generated variants include separator permutations (hyphen, underscore, and plain concatenation):
variants.generate("my-package")
# Includes: "my_package", "mypackage", "my-package" variations
# Disable separator permutations
variants.generate("my-package", include_separators=False)
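The permutation step can be pictured as splitting a name on its separators and rejoining the parts with each separator style. The sketch below is a standalone illustration under that assumption; `separator_variants` and `SEPARATORS` are hypothetical names, not part of phonemenal.

```python
import re

# Hyphen, underscore, and plain concatenation.
SEPARATORS = ("-", "_", "")


def separator_variants(name: str) -> set[str]:
    """Rejoin the parts of a name with each separator style."""
    # Split on existing hyphens or underscores, dropping empty parts.
    parts = [part for part in re.split(r"[-_]", name) if part]
    return {sep.join(parts) for sep in SEPARATORS}


print(sorted(separator_variants("my-package")))
# → ['my-package', 'my_package', 'mypackage']
```

This version uses one separator style per variant; mixing styles within a single name is left out to keep the example short.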
Morphological Variants
For typosquatting that exploits suffix confusion rather than pronunciation, use variants.generate_morphological():
from phonemenal import variants
variants.generate_morphological("packaging")
# → {"packaged", "packager", "packages", ...}
These are not phonetic substitutions. They swap one suffix for another (for example, -ing for -ed, -er, or -es), which is a common typosquatting vector.
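As a rough illustration of suffix swapping, the sketch below strips a recognized suffix and attaches each of the others. The `SUFFIXES` tuple covers only the four suffixes mentioned here and `suffix_swaps` is a hypothetical helper; the library's actual suffix list and stemming rules are not shown in this document and may differ.

```python
# Only the suffixes named in the text; assumed for this sketch.
SUFFIXES = ("ing", "ed", "er", "es")


def suffix_swaps(word: str) -> set[str]:
    """Swap a recognized suffix for each of the other suffixes."""
    for suffix in SUFFIXES:
        # Require a non-empty stem so a bare suffix is not treated as a word.
        if word.endswith(suffix) and len(word) > len(suffix):
            stem = word[: -len(suffix)]
            return {stem + other for other in SUFFIXES if other != suffix}
    # No recognized suffix: nothing to swap.
    return set()


print(sorted(suffix_swaps("packaging")))
# → ['packaged', 'packager', 'packages']
```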
Compound Word Splitting
Package names are often compound words without separators. The splitting module uses ML-based segmentation to break them apart:
from phonemenal import splitting
splitting.split("bluevoyage") # → ["blue", "voyage"]
splitting.split("fastapi") # → ["fast", "api"]
splitting.split("pytorch") # → ["py", "torch"]
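For intuition about what segmentation has to do, here is a deliberately simpler stand-in: a dictionary-driven dynamic program that prefers the fewest pieces. It is not the ML-based approach the splitting module uses, and both `VOCAB` and `split_compound` are hypothetical names with a toy vocabulary chosen to match the examples above.

```python
# Toy vocabulary for illustration; a real splitter would not be limited to this.
VOCAB = {"blue", "voyage", "fast", "api", "py", "torch"}


def split_compound(name: str, vocab: set[str] = VOCAB) -> list[str]:
    """Segment name into vocabulary words, preferring the fewest pieces."""
    # best[i] holds the shortest known segmentation of name[:i], or None.
    best = [None] * (len(name) + 1)
    best[0] = []
    for end in range(1, len(name) + 1):
        for start in range(end):
            prefix = best[start]
            piece = name[start:end]
            if prefix is not None and piece in vocab:
                candidate = prefix + [piece]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    # Fall back to the unsplit name when no full segmentation exists.
    return best[len(name)] or [name]


print(split_compound("bluevoyage"))
# → ['blue', 'voyage']
```

A fixed vocabulary fails on unseen words, which is one reason to use a learned segmenter instead.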
Homophone Permutations
Once a name is split, phonemenal finds homophones for each component and generates all recombinations:
splitting.component_homophones("bluevoyage")
# → {"blue": ["blew", "bleu"], "voyage": ["voyage"]}
splitting.homophone_permutations("bluevoyage")
# → ["bluevoyage", "blewvoyage", "bleuvoyage", ...]
This catches attacks like publishing blewvoyage to target users of bluevoyage.
# Cap the output for names with many homophones
splitting.homophone_permutations("bluevoyage", max_permutations=50)
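The recombination step is essentially a capped Cartesian product over the alternatives for each component. The sketch below assumes a hypothetical `HOMOPHONES` table seeded with the example above and a hypothetical `recombine` helper; it illustrates the idea and makes no claim about how phonemenal orders or caps its output.

```python
from itertools import islice, product

# Hypothetical homophone table, reusing the example shown above.
HOMOPHONES = {"blue": ["blew", "bleu"]}


def recombine(components: list[str], max_permutations: int = 50) -> list[str]:
    """Join every combination of per-component alternatives, up to a cap."""
    # Each slot offers the original component first, then its homophones.
    options = [[part] + HOMOPHONES.get(part, []) for part in components]
    # product() is lazy, so islice() stops before the full space is built.
    combos = islice(product(*options), max_permutations)
    return ["".join(combo) for combo in combos]


print(recombine(["blue", "voyage"]))
# → ['bluevoyage', 'blewvoyage', 'bleuvoyage']
```

The number of combinations is the product of the option counts per component, so a cap keeps names with many homophone-rich components from producing very large outputs.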
Combining Approaches
For thorough coverage, combine all generation strategies:
from phonemenal import variants, splitting
name = "bluevoyage"
# Phonetic variants of the whole name
phonetic = variants.generate(name)
# Morphological variants
morphological = variants.generate_morphological(name)
# Homophone permutations of components
permutations = set(splitting.homophone_permutations(name))
# Union of all candidates
all_candidates = phonetic | morphological | permutations
Or let the scanning module handle this automatically with scan_with_reverse():