Contributing
Development Setup
git clone https://github.com/brokensound77/phonemenal.git
cd phonemenal
uv sync --all-extras
make setup # download NLTK data
Running Tests
Code Style
phonemenal uses Ruff for linting and formatting:
- Target: Python 3.11
- Line length: 100
- Lint rules: E and W (pycodestyle), F (Pyflakes), I (isort)
Project Structure
phonemenal/
├── __init__.py # Public exports
├── core.py # CMU dict access, phoneme utilities
├── similarity.py # PPC-A, PLD, LCS scoring
├── homophones.py # Exact & near-homophone discovery
├── variants.py # Phonetic & morphological variants
├── splitting.py # Compound word splitting
├── fallback.py # Fast phonetic encoder (no CMU dict)
├── scanning.py # High-level collision pipeline
├── llm.py # LLM-powered analysis
├── cli.py # Click CLI
└── prompts/
    └── homophone_analysis.txt # LLM prompt template
tests/
├── test_cli.py
├── test_core.py
├── test_fallback.py
├── test_homophones.py
├── test_scanning.py
├── test_similarity.py
├── test_splitting.py
└── test_variants.py
Adding a New Similarity Algorithm
- Add the scoring function in similarity.py following the existing pattern (return 0.0–1.0, support raw=True)
- Add it to the composite() weights tuple if it should participate in composite scoring
- Add tests in tests/test_similarity.py
- Add a CLI option in cli.py under the similarity command's --algorithm choice
Documentation
Documentation is built with MkDocs + Material:
uv run mkdocs serve # local dev server at http://127.0.0.1:8000
uv run mkdocs build # build static site
API reference pages are auto-generated from docstrings by mkdocstrings. When adding or modifying public functions, make sure their docstrings follow the Google style.
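For reference, a Google-style docstring uses a one-line summary followed by indented Args, Returns, and (where relevant) Raises sections. The function below is a hypothetical example written only to show the layout and is not part of the phonemenal API.

```python
def shared_phoneme_count(phonemes_a: list[str], phonemes_b: list[str]) -> int:
    """Count the distinct phonemes that two pronunciations have in common.

    Args:
        phonemes_a: ARPAbet phonemes of the first word, e.g. ["K", "AE", "T"].
        phonemes_b: ARPAbet phonemes of the second word.

    Returns:
        The number of distinct phonemes that appear in both sequences.
    """
    return len(set(phonemes_a) & set(phonemes_b))
```

Keeping the section names exactly as shown (Args, Returns) matters, since docstring parsers key on those headings when rendering the reference pages.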