Python change pdf metadata

We often get questions from customers who would like to manipulate their PDF files using the Python programming language. Here is one such question from a customer: I'm looking for a very fast, lightweight Python library to read PDF metadata.

I don't need any write capabilities. It would be better if only the metadata information is loaded, not the entire file. A related question asks for the opposite: I want to add a metadata key-value pair to the metadata of a PDF file.
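For the read-only question, the following is a minimal standard-library sketch of reading just the document information dictionary without parsing the whole file. It rests on several assumptions that many real PDFs do not meet: a classic cross-reference table (not a cross-reference stream), 20-byte xref entries, "\n" line endings, no encryption, no incremental updates, and simple literal strings without escapes. The function name read_pdf_info and the window parameter are made up for this illustration; a maintained PDF library is the safer choice for real files.

```python
import re

_INFO_REF = re.compile(rb"/Info\s+(\d+)\s+\d+\s+R")
_STARTXREF = re.compile(rb"startxref\s+(\d+)")
_LITERAL = re.compile(rb"/(\w+)\s*\(([^()\\]*)\)")


def read_pdf_info(path, window=4096):
    """Return literal-string entries of the Info dictionary as a dict.

    Hypothetical helper: returns {} whenever the file does not match
    the simplifying assumptions described above.
    """
    with open(path, "rb") as f:
        # Read only the tail of the file, where the trailer lives.
        f.seek(0, 2)
        size = f.tell()
        f.seek(max(0, size - window))
        tail = f.read()

        info_refs = _INFO_REF.findall(tail)
        xref_positions = _STARTXREF.findall(tail)
        if not info_refs or not xref_positions:
            return {}
        info_num = int(info_refs[-1])

        # Jump to the xref table and look up the Info object's offset.
        f.seek(int(xref_positions[-1]))
        if f.readline().strip() != b"xref":
            return {}  # probably a cross-reference stream; not handled
        while True:
            header = f.readline().split()
            if len(header) != 2 or not all(h.isdigit() for h in header):
                return {}  # reached the trailer without finding the object
            first, count = int(header[0]), int(header[1])
            if first <= info_num < first + count:
                f.seek((info_num - first) * 20, 1)
                offset = int(f.read(20)[:10])
                break
            f.seek(count * 20, 1)  # skip this subsection

        # Read the Info object itself and cut it at its end marker.
        f.seek(offset)
        raw = f.read(window).split(b"endobj", 1)[0]

    return {k.decode("latin-1"): v.decode("latin-1")
            for k, v in _LITERAL.findall(raw)}
```

Only the last few kilobytes, the relevant xref entries, and the Info object are read, which is the point of the sketch; the price is that it silently returns an empty dict for anything outside its assumptions.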

One commenter, guettli, confirmed that the suggested approach worked: to add a key that is not a valid Python name, they used setattr(reader.Info, 'original-files', value). Weaknesses noted in the answers: one package is not maintained, and one approach does not preserve the PDF's outlines (bookmarks). Another user reported that with PyPDF2 the bookmarks became offset and the table of contents lost all of its links.

It was empty. The first page, in this case, is just an image, so it wouldn't have any text.
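The setattr call mentioned in the comments works because of plain Python behaviour rather than anything PDF-specific. The sketch below shows the mechanism on a types.SimpleNamespace, which is only a hypothetical stand-in for the reader.Info object; how a given PDF library maps attribute names to metadata keys is not shown here and should be checked in that library's documentation.

```python
from types import SimpleNamespace

# Hypothetical stand-in for a PDF library's metadata object.
info = SimpleNamespace()

# Keys that are valid Python identifiers can use normal attribute syntax.
info.Author = "Jane"

# `info.original-files = ...` would be read as `info.original - files = ...`
# and rejected as a SyntaxError, so the name is passed as a string instead.
setattr(info, "original-files", "a.pdf, b.pdf")

# The same applies when reading the value back.
print(getattr(info, "original-files"))
print(sorted(vars(info)))
```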

Interestingly, if you run the text-extraction example, you will find that it doesn't return any text. Instead, all I got was a series of line break characters. Unfortunately, PyPDF2 has pretty limited support for extracting text. Even when it is able to extract text, the text may not be in the order you expect, and the spacing may differ as well.
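As a rough sketch of the kind of extraction being described (not the original article's code), the snippet below pulls the text of one page and checks whether anything other than whitespace came back. It assumes PyPDF2 2.x or later, where PdfReader, reader.pages and page.extract_text() are available; the helper names extract_page_text and has_real_text are made up for this illustration.

```python
def extract_page_text(pdf_path, page_index=0):
    """Return whatever text PyPDF2 extracts from one page (may be empty)."""
    # Imported lazily so has_real_text() works without PyPDF2 installed.
    from PyPDF2 import PdfReader  # assumption: PyPDF2 2.x or later

    reader = PdfReader(pdf_path)
    return reader.pages[page_index].extract_text() or ""


def has_real_text(text):
    """True if the extracted text contains more than whitespace."""
    return bool(text.strip())
```

A page that is only a scanned image would typically fail the has_real_text check, matching the "series of line break characters" result described above.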

To get this example code to work, you will need to try running it against a different PDF. This is a W9 form for people who are self-employed or contract employees. It can be used in other situations too. Anyway, I downloaded it as w9. If you use that PDF instead of the sample one, it will happily extract some of the text from page 2. I won't reproduce the output here as it is kind of lengthy. You may find that the pdfminer package works better than PyPDF2 for extracting text.

The PyPDF2 package is quite useful. We were able to get some helpful information from PDFs using it. Give it a try and see what you think! See the original article here.


