Situation
PDF (Portable Document Format) is a file format that captures all the elements of a printed document as an electronic image that you can view, navigate, print, or forward to someone else. PDF files are typically created with Adobe Acrobat or similar tools.
Solution
Steps to follow
Suppose a PDF file contains the following table:
| User_ID | Name | Occupation |
| 1 | David | Product Manager |
| 2 | Leo | IT Administrator |
| 3 | John | Lawyer |
We want to read this table into a Python program.
Method 1: Using tabula-py
tabula-py is a simple Python wrapper around tabula-java, which can read tables in a PDF. Because it wraps a Java library, a Java runtime must be available on the machine. You can install tabula-py and tabulate with the following commands:
pip install tabula-py
pip install tabulate
The methods used in the example are:
read_pdf(): reads the tables from the PDF file at the given path
tabulate(): arranges the data in a table format
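To make the role of tabulate() more concrete, the sketch below shows roughly what "arranging data in a table format" amounts to, using only the standard library. The function `format_table` is a hypothetical, simplified stand-in written for illustration; it is not how the tabulate library is implemented, and its output layout is only loosely similar.

```python
def format_table(headers, rows):
    """Return headers and rows as an aligned plain-text table.

    Hypothetical helper for illustration only; the real tabulate()
    supports many more formats and options.
    """
    # Convert every cell to text so widths can be measured.
    cells = [[str(c) for c in headers]] + [[str(c) for c in row] for row in rows]
    # Each column is as wide as its longest cell.
    widths = [max(len(r[i]) for r in cells) for i in range(len(headers))]

    def line(row):
        return "  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()

    rule = "  ".join("-" * w for w in widths)
    return "\n".join([line(cells[0]), rule] + [line(r) for r in cells[1:]])


# The sample table from this article, typed in by hand.
headers = ["User_ID", "Name", "Occupation"]
rows = [
    [1, "David", "Product Manager"],
    [2, "Leo", "IT Administrator"],
    [3, "John", "Lawyer"],
]
print(format_table(headers, rows))
```

In practice tabulate() does this work for you, including for pandas DataFrames such as the ones read_pdf() produces.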
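from tabula import read_pdf
from tabulate import tabulate

# Read every table from the PDF file; recent tabula-py versions return a list of DataFrames
tables = read_pdf("abc.pdf", pages="all")  # path to the PDF file

# Print each extracted table as plain text, using the column names as headers
for df in tables:
    print(tabulate(df, headers="keys"))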
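Depending on the tabula-py version and the options passed, read_pdf() may hand back a single table or a list of tables. The sketch below wraps the call so the rest of a program can always loop over a list. The names `as_table_list` and `read_pdf_tables` are hypothetical helpers introduced here, and "abc.pdf" is the placeholder file name used in this article.

```python
def as_table_list(result):
    """Return `result` as a list, wrapping a single table if needed."""
    if isinstance(result, (list, tuple)):
        return list(result)
    return [result]


def read_pdf_tables(path, pages="all"):
    """Read the tables in a PDF and always return them as a list.

    Hypothetical wrapper for illustration; it assumes tabula-py and a
    Java runtime are installed.
    """
    # Imported here so this module still loads where tabula-py is missing.
    from tabula import read_pdf

    return as_table_list(read_pdf(path, pages=pages))


if __name__ == "__main__":
    for number, table in enumerate(read_pdf_tables("abc.pdf"), start=1):
        print(f"Table {number}")
        print(table)
```

Keeping the normalisation in one small function means the calling code does not need to care which shape read_pdf() returned.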