How to use the pdftotext.Error exception in pdftotext

To help you get started, we’ve selected a few pdftotext examples, based on popular ways it is used in public projects.


GitHub: jalan/pdftotext, tests/test_pdf.py
def test_read_corrupt_page(self):
    with self.assertRaises((pdftotext.Error, IndexError)):
        pdf = pdftotext.PDF(get_file("corrupt_page.pdf"))
        pdf[0]
GitHub: jalan/pdftotext, tests/test_pdf.py
def test_locked_with_both_passwords(self):
    with self.assertRaises(pdftotext.Error):
        pdftotext.PDF(get_file("both_passwords.pdf"))
GitHub: the-paperless-project/paperless, src/paperless_tesseract/parsers.py
def get_text_from_pdf(pdf_file):

    with open(pdf_file, "rb") as f:
        try:
            pdf = pdftotext.PDF(f)
        except pdftotext.Error:
            return ""

    return "\n".join(pdf)
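The examples above share one pattern: construct pdftotext.PDF inside a try block and treat pdftotext.Error as "this file could not be opened" (corrupt data, or a missing or wrong password). The sketch below pulls that pattern into one helper. The function name extract_text and its backend parameter are hypothetical and not part of pdftotext; backend exists only so the helper can be exercised without the library installed. Passing the password as the second argument to pdftotext.PDF follows the library's README.

```python
def extract_text(pdf_file, password="", backend=None):
    """Return the text of an open binary PDF file, or "" if it cannot be read.

    `backend` is a hypothetical injection point for testing. When it is
    None, the real pdftotext module is imported lazily.
    """
    if backend is None:
        import pdftotext as backend  # needs the pdftotext package and poppler

    try:
        pdf = backend.PDF(pdf_file, password)
    except backend.Error:
        # Corrupt document, or a locked document with a wrong/missing password.
        return ""

    # A PDF object behaves like a sequence of page strings.
    return "\n\n".join(pdf)
```

A typical call would be `with open("doc.pdf", "rb") as f: text = extract_text(f)`. Returning an empty string mirrors the paperless snippet; code that needs to tell "empty document" apart from "unreadable document" would probably re-raise or return None instead.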

pdftotext

Simple PDF text extraction

MIT
Latest version published 3 years ago
