Trouble opening file with Java to use with PDFBox

I am trying to use the PDFBox library to read the contents of PDF files, but I can't seem to open any of the files in the correct format for PDFBox to use. I'm using the following code to open each document:
javaaddpath('...\pdfParseDemo\pdfbox-2.0.0.jar')
javaaddpath('...\FontBox-0.1.0\FontBox-0.1.0\lib\FontBox-0.1.0.jar')
pdfname = '...\example.pdf';
import java.io.*;
pdfdoc = org.apache.pdfbox.pdmodel.PDDocument; %Define a PDDocument object placeholder
pdfdoc.load(FileInputStream(pdfname)); %Load the PDF file
However, this seems to return an empty object. When I try to query any of the file's properties or contents, it always returns an empty or zero value. I suspect the problem is with how I'm opening the file, because I know PDFBox has been successfully used natively with Java in many cases. Unfortunately the documentation for interfacing with Matlab is very sparse, so I'm not sure what I should be doing differently. Is there some kind of weirdness with how Matlab handles Java file input calls?

Accepted Answer

Elias Gule on 9 May 2016
Try wrapping your pdfname variable in a java.lang.String object. This sometimes works:
pdfname = java.lang.String('...\example.pdf');
  2 Comments
Michael Boeckel on 9 May 2016
Good suggestion, but no joy, sadly. I tried this with several different variations on the loading call, but none worked.
Michael Boeckel on 10 May 2016
Disregard! I tried your suggestion with the Java "File" constructor instead of the FileInputStream constructor, and with a bit of coaxing, that worked! Many thanks good sir!
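For readers landing here later, a minimal sketch of what the working call may look like. The thread does not say what the "coaxing" was, so this is an assumption rather than the exact fix. It relies on PDDocument.load being a static method in PDFBox 2.0 that returns the loaded document. If so, calling it on an empty PDDocument and discarding the result, as in the question, would leave pdfdoc empty. The variable names pdffile and numPages are illustrative, and the elided '...' paths are kept exactly as posted.

```matlab
% Assumes the javaaddpath calls from the question have already been run.
pdfname = java.lang.String('...\example.pdf');

% Build a java.io.File from the path, as described in the comment above.
pdffile = java.io.File(pdfname);

% PDDocument.load is static in PDFBox 2.0 and returns the loaded document,
% so assign its return value instead of calling load on an empty PDDocument.
pdfdoc = org.apache.pdfbox.pdmodel.PDDocument.load(pdffile);

% Quick check that the document is populated.
numPages = pdfdoc.getNumberOfPages();

% Release the file handle when finished.
pdfdoc.close();
```

If numPages comes back greater than zero for a PDF known to have pages, the document loaded correctly and its other properties should be populated as well.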
