To extract text from a PDF, use the following code:
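import PyPDF2

# Uses the legacy camelCase PyPDF2 API (PyPDF2 versions before 3.0).
pdfFileObj = open('mypdf.pdf', 'rb')  # PDFs must be opened in binary mode
pdfReader = PyPDF2.PdfFileReader(pdfFileObj)
print(pdfReader.numPages)  # total number of pages
pageObj = pdfReader.getPage(0)  # first page (indices are zero-based)
a = pageObj.extractText()
print(a)
pdfFileObj.close()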
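
The calls above no longer work in PyPDF2 3.0 and later, so here is a rough sketch of the same steps with the newer snake_case names. It assumes the successor package pypdf is installed (`pip install pypdf`). The helper names `page_text` and `extract_pdf_text` are made up for illustration and are not part of any library.

```python
def page_text(reader, page_index=0):
    """Return the text of one page, or '' if the page has no extractable text."""
    pages = reader.pages  # list-like sequence of page objects
    if not 0 <= page_index < len(pages):
        raise IndexError(f"page {page_index} out of range (0..{len(pages) - 1})")
    # Treat a missing result as an empty string so callers always get a str.
    return pages[page_index].extract_text() or ""


def extract_pdf_text(path, page_index=0):
    """Open the PDF at `path` and return the text of the requested page."""
    # Assumption: pypdf is installed. Imported here so page_text() stays
    # usable with any reader-like object.
    from pypdf import PdfReader

    # The with-block closes the file even if extraction raises.
    with open(path, "rb") as pdf_file:
        reader = PdfReader(pdf_file)
        return page_text(reader, page_index)
```

Calling `print(extract_pdf_text('mypdf.pdf'))` should then print the first page's text. Scanned PDFs that contain only images will most likely give an empty string, since there is no text layer to read.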