Tell text files apart by extension, without extra dependencies

When it’s useful

Pipelines, upload handlers, and developer tools often need a quick answer: “Does this path look like text?” Inspecting file content is slow, and a hand-maintained extension list drifts out of date. This library answers from a curated set of known text extensions, so you can branch early.

What you can do

Check an extension or a full path for a text type, with case-insensitive, dot-aware matching (including dotfiles such as .gitignore).
Rely on a large, immutable list: 300+ text extensions covering source code, markup, configs, data formats, and docs, aligned with the widely used text-extensions npm list that this project ports.
Look up extensions in average constant time via a frozenset, so membership checks stay cheap at scale.
Import the raw sets (TEXT_EXTENSIONS, TEXT_EXTENSIONS_LOWER) when you need custom rules on top of the defaults.
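
The matching rules listed above can be sketched in a few lines. This is only an illustration under stated assumptions, not the library's actual code: SAMPLE_TEXT_EXTENSIONS is a tiny stand-in for the real TEXT_EXTENSIONS_LOWER set (assumed here to hold lowercase entries without the leading dot), and looks_like_text is a hypothetical helper name.

```python
from pathlib import PurePath

# Stand-in for the library's much larger set. Assumption: entries are
# lowercase and stored without the leading dot.
SAMPLE_TEXT_EXTENSIONS = frozenset({"txt", "md", "py", "json", "gitignore"})


def looks_like_text(path_or_ext, known=SAMPLE_TEXT_EXTENSIONS):
    """Hypothetical helper: classify by extension only, never by content."""
    # Keep only the final path component, e.g. "src/app.py" -> "app.py".
    name = PurePath(path_or_ext).name
    # Split on the last dot. A dotfile such as ".gitignore" yields an empty
    # stem and "gitignore" as the extension; a bare "txt" has no dot at all.
    _stem, dot, ext = name.rpartition(".")
    candidate = ext if dot else name
    # Case-insensitive frozenset membership test.
    return candidate.lower() in known


# Custom rules on top of the defaults: a frozenset union builds a new set
# and leaves the original untouched.
custom_extensions = SAMPLE_TEXT_EXTENSIONS | {"log"}

print(looks_like_text("README.MD"))                      # True
print(looks_like_text("repo/.gitignore"))                # True
print(looks_like_text("photo.PNG"))                      # False
print(looks_like_text("server.log", custom_extensions))  # True
```

Because the base set is immutable, extending it always produces a separate set, so one caller's custom rules cannot leak into another's checks.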

Limits and fit

Classification is by extension and path shape, not by reading file bytes or detecting encodings. If you need to know whether a file is actually UTF-8 text, combine this with encoding checks or parsers. Requires Python 3.8+. Install from PyPI (text-extensions). API details, changelog, and tests live in the GitHub repository.
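
One way to pair the extension check with an encoding check is to decode a small sample of the file as UTF-8. The sketch below uses only the standard library; the function names and the 8192-byte sample size are illustrative assumptions, not part of this project.

```python
import codecs


def bytes_look_like_utf8(sample):
    """Hypothetical helper: True if the bytes decode cleanly as UTF-8."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")
    try:
        # final=False tolerates a multi-byte character cut off at the end
        # of the sample, which happens when only part of a file is read.
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True


def file_looks_like_utf8(path, sample_size=8192):
    """Hypothetical helper: check only the first sample_size bytes."""
    with open(path, "rb") as handle:
        return bytes_look_like_utf8(handle.read(sample_size))
```

A typical flow would run the cheap extension check first and call file_looks_like_utf8 only on paths that pass it. A sampled check can still miss invalid bytes later in the file, so treat it as a heuristic rather than full validation.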
