Convert PDF to text with pdftotext (command line)
pdftotext is a command-line utility that converts PDF files to plain text. Its options let you specify the page range to convert, keep the original physical layout of the text as closely as possible, set the line endings (unix, dos or mac), and work with password-protected PDF files.
pdftotext is part of the poppler, poppler-utils or poppler-tools package, depending on the Linux distribution you're using (poppler-utils on Debian, Ubuntu and Fedora; poppler-tools on openSUSE; poppler on Arch Linux). On Arch Linux, install it with:
sudo pacman -S poppler
Now that the package is installed, you can convert a PDF file to plain text while preserving its layout. I recommend the -layout option for keeping the original physical layout, but you can also try the conversion without it:
pdftotext -layout input.pdf output.txt
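The other options mentioned at the top (page range, line endings, password-protected files) could be wrapped in small shell functions like the ones below. This is only a sketch: the function names are made up for illustration, while -f, -l, -eol and -upw are pdftotext's own flags.

```shell
#!/bin/sh
# Hypothetical helper functions; only the pdftotext flags come from the tool.

# Convert only pages FIRST..LAST of a PDF, preserving the layout.
# usage: pdf_page_range FIRST LAST input.pdf output.txt
pdf_page_range() {
    pdftotext -f "$1" -l "$2" -layout "$3" "$4"
}

# Convert a PDF and write DOS (CRLF) line endings.
# usage: pdf_dos_eol input.pdf output.txt
pdf_dos_eol() {
    pdftotext -eol dos "$1" "$2"
}

# Convert a PDF that is protected by a user password.
# usage: pdf_with_password PASSWORD input.pdf output.txt
pdf_with_password() {
    pdftotext -upw "$1" "$2" "$3"
}
```

For example, after sourcing these functions, pdf_page_range 2 5 input.pdf output.txt should write only pages 2 through 5 to output.txt.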
via: How To Convert PDF To Text On Linux (GUI And Command Line) - Linux Uprising Blog