# metachris/pdfx

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/metachris-pdfx).**

1,076 stars · 119 forks · Python · Apache-2.0 · archived

## Links

- GitHub: https://github.com/metachris/pdfx
- Homepage: http://www.metachris.com/pdfx
- awesome-repositories: https://awesome-repositories.com/repository/metachris-pdfx.md

## Description

Extracts text, metadata, and references (PDF, URL, DOI, arXiv) from a PDF file. Optionally downloads all referenced PDFs.
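
To make the reference types above concrete, here is a minimal sketch of how URL, PDF, DOI, and arXiv references might be picked out of text that has already been extracted from a PDF. It is an illustration only: the regular expressions and the `find_references` helper are assumptions made for this example, not pdfx's own patterns or API, and real-world reference matching needs to handle many more edge cases.

```python
import re

# Illustrative patterns only (assumptions for this sketch);
# these are not the expressions pdfx itself uses.
URL_RE = re.compile(r"https?://[^\s<>\"')\]]+", re.IGNORECASE)
DOI_RE = re.compile(r"\b(?:doi:\s*)?(10\.\d{4,9}/[^\s\"<>]+)", re.IGNORECASE)
ARXIV_RE = re.compile(r"\barxiv:\s*(\d{4}\.\d{4,5}(?:v\d+)?)", re.IGNORECASE)


def find_references(text):
    """Return a dict mapping reference type to a sorted list of unique matches.

    Hypothetical helper: `text` is plain text already extracted from a PDF.
    """
    # Trailing sentence punctuation is stripped from URL and DOI matches.
    urls = {m.rstrip(".,;") for m in URL_RE.findall(text)}
    dois = {m.rstrip(".,;") for m in DOI_RE.findall(text)}
    arxiv_ids = set(ARXIV_RE.findall(text))
    return {
        "url": sorted(urls),
        # A "pdf" reference here is simply a URL whose path ends in .pdf.
        "pdf": sorted(u for u in urls if u.lower().endswith(".pdf")),
        "doi": sorted(dois),
        "arxiv": sorted(arxiv_ids),
    }


if __name__ == "__main__":
    sample = (
        "See https://example.com/paper.pdf and doi:10.1000/xyz123, "
        "also arXiv:1706.03762v5."
    )
    for kind, items in find_references(sample).items():
        print(kind, items)
```

Grouping matches by type keeps the optional download step simple: under these assumptions, only the entries under `pdf` would need to be fetched.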

## Tags

### Part of an Awesome List

- [Document Processing](https://awesome-repositories.com/f/awesome-lists/productivity/document-processing.md) — Tool for extracting references and downloading PDFs.
