Wednesday, December 29, 2010

Concatenating PDF files

Since the advent of LiveCode 4.5, developers have been able to 'print' stack content directly to PDF files. And if you need pinpoint control over what goes where, you can use Quartam PDF Library to generate PDF files from scripts. That's great if you are in full control of the content, but what if you need to work with existing PDF files? In the next few posts, we will examine how you can tap into the power of the Java-based iText library from LiveCode.
So let's start by downloading a copy of iText version 2.1.7 - do not use version 5.x as the API changed and the following example code won't work.

The first question is: how can we execute Java code from LiveCode? The simplest solution is the shell function: it allows you to execute DOS or Unix commands, as if you typed them in from the command line. Note that on Windows, using this function will show a DOS window, but you can control that by setting the hideConsoleWindows property before calling the shell function.
You can test it out by simply executing the following line from the message box:
  answer shell("java -version")

The second question is: what sort of Java code do we need to write? Well, I fired up a copy of Eclipse, started a new project, and created a new class 'ConcatPdfFiles' in the default package. Then I grabbed my paper copy of iText in action (first edition) and flipped to page 64 as this contains the examples for concatenating PDF files. A little bit of thinking, and I derived the following code:

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

import com.lowagie.text.Document;
import com.lowagie.text.DocumentException;
import com.lowagie.text.pdf.PdfCopy;
import com.lowagie.text.pdf.PdfReader;

public class ConcatPdfFiles {
  public static void main(String[] args) throws DocumentException, IOException {
    // First argument: output file; remaining arguments: input files
    final String outputFilePath = args[0];
    final OutputStream outputStream = new FileOutputStream(outputFilePath);
    final Document outputDocument = new Document();
    final PdfCopy outputCopy = new PdfCopy(outputDocument, outputStream);
    outputDocument.open();
    for (int i = 1; i < args.length; i++) {
      final PdfReader inputPdfReader = new PdfReader(args[i]);
      final int pageCount = inputPdfReader.getNumberOfPages();
      // iText page numbers are 1-based
      for (int pageIndex = 0; pageIndex < pageCount; pageIndex++) {
        outputCopy.addPage(outputCopy.getImportedPage(inputPdfReader, pageIndex + 1));
      }
    }
    // Closing the document also finishes writing the output file
    outputDocument.close();
  }
}

As you can see, the code is a bit lazy when it comes to exception handling: I just let the exceptions get thrown, and this will be the output of our shell call if something goes wrong. Note also that the first argument is the output file, followed by the input files that you want to concatenate into the output file.
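
The argument convention (output file first, input files after it) can be checked before iText is ever touched. Here is a minimal sketch of such a check. It is not part of the class above: the names ConcatArgs and parse are made up for illustration, and it assumes that a call without at least one input file should be treated as an error.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the original ConcatPdfFiles class.
final class ConcatArgs {
    final String outputFile;       // taken from args[0]
    final List<String> inputFiles; // taken from args[1] onwards

    private ConcatArgs(String outputFile, List<String> inputFiles) {
        this.outputFile = outputFile;
        this.inputFiles = inputFiles;
    }

    // Splits the raw arguments into the output path and the input paths.
    // Assumption: fewer than two arguments is a usage error.
    static ConcatArgs parse(String[] args) {
        if (args.length < 2) {
            throw new IllegalArgumentException(
                "usage: java ConcatPdfFiles <output-file> <input-file-1> [<input-file-2> ...]");
        }
        return new ConcatArgs(args[0], Arrays.asList(args).subList(1, args.length));
    }
}
```

With something like this in place, a missing argument would show up in the shell output as a readable usage message rather than an ArrayIndexOutOfBoundsException.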

More importantly, at this point in time, the code doesn't compile. The problem is, we haven't yet told Eclipse where that iText-2.1.7.jar library file is, so compilation fails. This is sometimes referred to as 'classpath hell' - you have to give Java a list of paths where it can find the necessary additional libraries, not just at compile time but also at runtime as we'll see later.
Because I like to keep everything together in my Java projects, I added a new 'lib' folder to my project, and copied the iText-2.1.7.jar file into it. At that point, you can use the contextual menu on the iText-2.1.7.jar file, and add it to the Build Path. Now the code I showed earlier compiles just fine, and we can proceed to the next stage.

The third question is: how do we put everything together in LiveCode? We'll begin by putting all the necessary parts into a single folder: the iText-2.1.7.jar library file, the ConcatPdfFiles.class compiled file and two example PDF files (demo1.pdf and demo2.pdf). Then we fire up LiveCode, create a new stack 'ConcatPdfFiles' and save it in the same folder as the other files, naming it 'ConcatPdfFiles.liveCode'. Now we can drop a button onto the stack and start scripting.

Now we need to determine the correct command to be executed by the shell function. It should look something like:
java -classpath <class-path> ConcatPdfFiles <output-file> <input-file-1> <input-file-2> ...

The java executable needs the correct classpath, and we need to pass in compatible file paths.

Let's start with the classpath. This is a list of places that java needs to look for its .class files - as separate files in folders, or stored together in a .jar file. And for extra fun, the separator character is a colon on Unix-based platforms, and a semicolon on Windows. You can have relative paths in this classpath, and '.' (period) is short for the current directory. So rather than building a long class path, we can circumvent the issue by setting the defaultFolder property to change the working directory before calling the shell function. Then our classpath can be as short as:
.:iText-2.1.7.jar
on Mac OS X/Linux and
.;iText-2.1.7.jar
on Windows.
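
As an aside, Java itself exposes this platform-specific separator as java.io.File.pathSeparator, so Java code that has to assemble a class path does not need to hard-code the colon or semicolon. A small sketch (ClassPathSketch and join are illustrative names, not part of the project):

```java
import java.io.File;

// Illustrative only: builds a class path string from its entries.
final class ClassPathSketch {
    // Joins the class path entries with the given separator.
    static String join(String separator, String... entries) {
        return String.join(separator, entries);
    }

    public static void main(String[] args) {
        // File.pathSeparator is ":" on Unix-based platforms and ";" on Windows.
        System.out.println(join(File.pathSeparator, ".", "iText-2.1.7.jar"));
    }
}
```

From LiveCode we cannot ask Java for this value before launching it, which is why the button script checks the platform instead.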

The next bit is compatible file paths. The good news: LiveCode uses a '/' (slash) as separator, regardless of the underlying platform, and Java is more than happy to accept '/' in a path, even when it's running on Windows. However, if there are spaces in the path, we need to protect them so the shell doesn't split the path into separate arguments: by putting quotes around the path on Windows, and by escaping the spaces with a backslash on Unix-based platforms.
And to determine the paths relative to the stack's location on your hard disk, we'll need a helper function that uses the effective filename property of our stack.
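
For readers more at home in Java than in LiveCode, the quoting rule just described can be written down as follows. This is only a sketch of the rule: ShellPathSketch and its isWindows parameter are invented for the example, and it assumes that spaces are the only characters in the path that need protecting.

```java
// Illustrative only: mirrors the quoting rule for paths passed to the shell.
final class ShellPathSketch {
    // Wraps the path in double quotes on Windows;
    // escapes each space with a backslash on Unix-based platforms.
    // Assumption: no other special characters occur in the path.
    static String shellPath(String path, boolean isWindows) {
        if (isWindows) {
            return "\"" + path + "\"";
        }
        return path.replace(" ", "\\ ");
    }
}
```

A path without spaces passes through unchanged on Unix-based platforms, and simply gains a pair of harmless quotes on Windows.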

So finally, we have a button script as follows:
on mouseUp
  --> determine the input and output files
  local tInputFiles, tOutputFile
  put ShellPath(AbsolutePathFromStack("demo1.pdf")) && \
      ShellPath(AbsolutePathFromStack("demo2.pdf")) \
      into tInputFiles
  put ShellPath(AbsolutePathFromStack("output.pdf")) \
      into tOutputFile
  --> determine the class path
  local tClassPath
  if the platform is "Win32" then
    put ".;iText-2.1.7.jar" into tClassPath
  else
    put ".:iText-2.1.7.jar" into tClassPath
  end if
  --> assemble the shell command
  local tShellCommand
  put "java -classpath" && tClassPath && \
      "ConcatPdfFiles" && \
      tOutputFile && tInputFiles \
      into tShellCommand
  --> execute the shell command
  local tHideConsoleWindows, tDefaultFolder, tShellResult
  put the hideConsoleWindows into tHideConsoleWindows
  set the hideConsoleWindows to true
  put the defaultFolder into tDefaultFolder
  set the defaultFolder to AbsolutePathFromStack()
  put shell(tShellCommand) into tShellResult
  set the defaultFolder to tDefaultFolder
  set the hideConsoleWindows to tHideConsoleWindows
  if tShellResult is not empty then
    answer error tShellResult
  end if
end mouseUp

function AbsolutePathFromStack pFileName
  local tAbsolutePath
  put the effective filename of this stack into tAbsolutePath
  set the itemDelimiter to slash
  if pFileName is not empty then
    -- replace the stack file name with the given file name
    put pFileName into item -1 of tAbsolutePath
  else
    -- no file name given: return the folder that contains the stack
    delete item -1 of tAbsolutePath
  end if
  return tAbsolutePath
end AbsolutePathFromStack

function ShellPath pPath
  if the platform is "Win32" then
    put quote & pPath & quote into pPath
  else
    replace space with backslash & space in pPath
  end if
  return pPath
end ShellPath

Click the button, and it happily concatenates the two PDF files (demo1.pdf and demo2.pdf) into a single PDF file (output.pdf) in the same folder as our stack. There we have it, our first use of iText from within LiveCode.


Unknown said...

Hi Jan,

I solved the problem. :)
Nice tutorial and nice way to expand LiveCode.