So I built latex-wc, a small Python CLI that:
- extracts tokens from LaTeX while ignoring common LaTeX “noise” (commands, comments, math, refs/cites, etc.)
- can take a single .tex file or a directory and recursively scan all *.tex files
- prints a combined report once (total words, unique words, top-N frequencies)
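
To make the idea concrete, here is a rough sketch of the kind of filtering and counting described above. It is not the actual latex-wc source: the regexes, the function names (`tokenize`, `tex_files`, `report`), and the choice of which commands to drop together with their arguments are all assumptions made for illustration, and the real heuristics may differ.

```python
import re
from collections import Counter
from pathlib import Path

# Hypothetical filters for illustration; not the patterns latex-wc itself uses.
COMMENT = re.compile(r"(?<!\\)%.*")  # unescaped % to end of line
MATH = re.compile(r"\$\$.*?\$\$|\$.*?\$|\\\[.*?\\\]|\\\(.*?\\\)", re.DOTALL)
# Commands whose argument is a key or environment name, not prose.
DROP_WITH_ARG = re.compile(
    r"\\(?:ref|eqref|cite\w*|label|begin|end)\*?(?:\[[^\]]*\])*\{[^}]*\}"
)
COMMAND = re.compile(r"\\[A-Za-z]+\*?")  # remaining command names; arguments are kept
WORD = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")


def tokenize(text):
    """Strip LaTeX noise, then return lowercased word tokens."""
    for pattern in (COMMENT, MATH, DROP_WITH_ARG, COMMAND):
        text = pattern.sub(" ", text)
    return [w.lower() for w in WORD.findall(text)]


def tex_files(path):
    """Return the file itself, or every *.tex file found recursively under a directory."""
    p = Path(path)
    return [p] if p.is_file() else sorted(p.rglob("*.tex"))


def report(path, top_n=10):
    """Return (total words, unique words, top-N frequencies) across all files."""
    counts = Counter()
    for f in tex_files(path):
        counts.update(tokenize(f.read_text(encoding="utf-8", errors="ignore")))
    return sum(counts.values()), len(counts), counts.most_common(top_n)
```

A regex pass like this is deliberately crude: it keeps the text inside `\textbf{...}` but would also keep arguments that are not prose (for example the file name in `\input{...}`), which is exactly the sort of edge case where filters end up too aggressive or not aggressive enough.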
The fastest way to try it is `uvx latex-wc [path]`, where `[path]` is a file or a directory. Feedback is welcome, especially on edge cases where you think the heuristic filters are too aggressive or not aggressive enough.