@gouse95 If you can provide more information about what you're trying to do, and if you have code you can share even better!
With that said, I'm not sure kor is going to be the best tool to get the page number from a PDF.
If you're using PyPDF2 to read in your PDF, you can get the page count and an individual page this way:
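    import PyPDF2

    # Note: PyPDF2 3.x and pypdf rename these to PdfReader,
    # len(reader.pages), and reader.pages[0].

    # Open the PDF file
    pdf_file = PyPDF2.PdfFileReader('my_pdf.pdf')

    # Get the total number of pages
    total_pages = pdf_file.numPages

    # Get the first page (pages are 0-indexed)
    first_page = pdf_file.getPage(0)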
If your PDF has, say, a cover page, so that the printed page numbers don't start counting until after the first page of the file, then you might consider extracting the page number from the page text instead. Still, I think you'd be better off using a regular expression rather than kor, given that you'd otherwise be passing the entire content of the PDF just to get the page number.
    import re

    # Create a sample PDF page
    page = "This is the content of the PDF page\nHere's a new line.\n\npage: 10"

    # Define the regular expression
    regex = r'page:\s*(\d+)'

    # Match the regular expression against the string
    match = re.search(regex, page)

    # Extract the page number from the match
    if match:
        page_number = match.group(1)
        # Print the page number
        print(page_number)
    else:
        print('Page number not found.')
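For the cover-page case, here is a rough sketch of running that same kind of regex over every page, so you end up with a mapping from the file's page index to the printed page number. I'm assuming you already have one text string per page and that each page prints its number as "page: N"; the `printed_page_numbers` helper and the sample strings are made up for illustration, so adjust the pattern to whatever your PDF actually uses.

```python
import re

# Hypothetical pattern: assumes each page prints its number as "page: N".
PAGE_LABEL = re.compile(r'page:\s*(\d+)', re.IGNORECASE)


def printed_page_numbers(page_texts):
    """Map each 0-based page index to its printed page number, or None."""
    labels = {}
    for index, text in enumerate(page_texts):
        match = PAGE_LABEL.search(text)
        labels[index] = int(match.group(1)) if match else None
    return labels


# Sample strings standing in for text extracted from three pages.
pages = [
    "Cover page with no number",
    "Introduction\n\npage: 1",
    "Background\n\nPage: 2",
]
print(printed_page_numbers(pages))
```

Pages without a match (like a cover page) come back as `None`, which makes it easy to see where the printed numbering actually starts.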
Again, if you have some code you can share or more info about your use case, I might be able to give you better advice.