Text Extensions

Check if a file or extension is text with 300+ known text types


Tell text files apart by extension, without extra dependencies

When it’s useful

Pipelines, upload handlers, and developer tools often need a quick answer: “Does this path look like text?” Guessing from content is heavy; maintaining your own extension list drifts out of date. This library answers from a curated set of known text extensions so you can branch logic early.

What you can do

  • Check an extension or a full path for a text type, with case-insensitive, dot-aware matching (including dotfiles such as .gitignore).
  • Rely on a large, immutable list: 300+ text extensions covering source, markup, configs, data formats, and docs, aligned with the widely used text-extensions npm list that this project ports.
  • Look up in average constant time via a frozenset, so membership checks stay cheap at scale.
  • Import the raw sets (TEXT_EXTENSIONS, TEXT_EXTENSIONS_LOWER) when you need custom rules on top of the defaults.
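The matching rules above can be sketched in a few lines. This is an illustrative approximation, not the library's implementation or API: SAMPLE_TEXT_EXTENSIONS is a small hypothetical stand-in for the real 300+ entry list, and looks_like_text is a name invented for this example.

```python
from pathlib import PurePath

# Hypothetical stand-in for the library's much larger list (300+ entries).
# A frozenset is immutable and gives average constant-time membership checks.
SAMPLE_TEXT_EXTENSIONS = frozenset({"txt", "md", "py", "json", "yml", "gitignore"})


def looks_like_text(value: str) -> bool:
    """Return True if an extension or path ends in a known text extension.

    Hypothetical helper for illustration; the real package may expose
    different names and handle more edge cases.
    """
    # Drop any directory part so only the final path component is inspected.
    name = PurePath(value).name
    # Keep whatever follows the last dot. A bare "md" has no dot and is used
    # as is, ".MD" becomes "md", and the dotfile ".gitignore" becomes "gitignore".
    ext = name.rsplit(".", 1)[-1].lower()
    return ext in SAMPLE_TEXT_EXTENSIONS
```

Because a bare extension and an extensionless file name look the same to this sketch, a file literally named "py" would also match. That trade-off comes with accepting both extensions and full paths through one function.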

Limits and fit

Classification is by extension and path shape, not by reading file bytes or detecting encodings. If you need to know whether a file is actually UTF-8 text, combine this with encoding checks or parsers. The package requires Python 3.8+ and is installed from PyPI as text-extensions. API details, changelog, and tests live in the GitHub repository.
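One way to pair an extension check with an encoding check is sketched below. It is a minimal example under stated assumptions: SAMPLE_TEXT_EXTENSIONS is a hypothetical stand-in for the library's list, is_utf8_text_file is a name invented here, and the whole file is read into memory, which suits small files only.

```python
from pathlib import Path

# Hypothetical stand-in for the library's list; not the real data.
SAMPLE_TEXT_EXTENSIONS = frozenset({"txt", "md", "py", "json", "gitignore"})


def is_utf8_text_file(path: Path) -> bool:
    """Cheap extension check first, then confirm the bytes decode as UTF-8."""
    # Step 1: extension check, case-insensitive, also matching dotfiles.
    ext = path.name.rsplit(".", 1)[-1].lower()
    if ext not in SAMPLE_TEXT_EXTENSIONS:
        return False
    # Step 2: content check. Reading the full file avoids cutting a
    # multi-byte character in half, at the cost of memory on large files.
    try:
        path.read_bytes().decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

Running the extension test first means files with non-text extensions are rejected without any disk read; only plausible candidates pay for decoding.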
