Wednesday, January 11, 2006


How to merge multiple PDF files in ASP.NET with pdftk

There are many PDF toolkits on the market that can merge PDF files. Most of them are expensive products like ActivePDF, but in the end I found several free alternatives that allow me to merge PDF files on the fly in ASP.NET.

The first tool I found was PDF Merge. It is a Bash command-line script that uses Ghostscript. I never looked into that solution in detail because I then found the great pdftk toolkit.

Pdftk is a standalone executable file that you call with different command-line arguments. It needs no installation or registration: just copy the file somewhere and call it.

A DLL would have been simpler, but calling pdftk from ASP.NET only takes a few lines of code. Because it is a command-line tool, System.Diagnostics.Process can be used to launch it with the arguments we need to merge PDFs. In this example the list of files and the various paths are hard coded to keep the source code short and easy to read. In a real application, the list of files to merge should be passed in the request and the paths should be stored in Web.Config. One last note: the runAndWait() function below waits forever for the toolkit to return. Use a configurable timeout if you plan to use it in production code.

Private Sub Page_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
    'Hard coded list of files to merge
    Dim aFiles As String() = {"c:\temp\1.pdf", "c:\temp\2.pdf", "c:\temp\3.pdf"}
    'Use a temporary output file
    Dim outputFileName As String = System.IO.Path.GetTempFileName()
    'Construct the command line: pdftk 1.pdf 2.pdf 3.pdf cat output 123.pdf
    Dim fileName As String
    Dim buildCommand As System.Text.StringBuilder = New System.Text.StringBuilder
    For Each fileName In aFiles
        buildCommand.AppendFormat(" ""{0}"" ", fileName)
    Next
    'dont_ask lets pdftk overwrite the existing (empty) temporary file without prompting
    buildCommand.AppendFormat(" cat output ""{0}"" dont_ask", outputFileName)
    'Run PDFTK and wait for it to complete, then send the output to the user
    If runAndWait("c:\tools\pdftk.exe", buildCommand.ToString()) Then
        Response.ContentType = "application/pdf"
        Response.BinaryWrite(getFileData(outputFileName))
    Else
        'TO DO: show error message
    End If
End Sub
Private Function runAndWait(ByVal command As String, ByVal commandLine As String) As Boolean
    Dim runProcess As System.Diagnostics.Process
    Try
        runProcess = New System.Diagnostics.Process
        With runProcess.StartInfo
            .FileName = command
            .Arguments = commandLine
            .WindowStyle = System.Diagnostics.ProcessWindowStyle.Normal
        End With
        runProcess.Start()
        'Wait until the process passes back an exit code
        runProcess.WaitForExit()
        'Free resources associated with this process
        runProcess.Close()
        runAndWait = True
    Catch ex As Exception
        runAndWait = False
    End Try
End Function
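Private Function getFileData(ByVal fileName As String) As Byte()
    Dim stream As System.IO.FileStream = System.IO.File.OpenRead(fileName)
    Try
        Dim data As Byte()
        'ReDim takes the upper bound, not the element count, so subtract one
        ReDim data(CInt(stream.Length) - 1)
        stream.Read(data, 0, data.Length)
        getFileData = data
    Finally
        'Release the file handle even if the read fails
        stream.Close()
    End Try
End Function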
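
As noted above, runAndWait() waits forever for pdftk to return. The following is a minimal sketch of a variant with a timeout, not a drop-in replacement. The module name PdftkRunner, the function name RunWithTimeout and the timeoutMs parameter are hypothetical names chosen for illustration. It also assumes that pdftk reports success with exit code 0, which the original code does not check, and that the code runs on .NET 2.0 or later (for the Using statement).

```vb
Imports System.Diagnostics

Public Module PdftkRunner
    'Hypothetical helper: runs a command line tool and gives up after timeoutMs milliseconds
    Public Function RunWithTimeout(ByVal command As String, ByVal commandLine As String, ByVal timeoutMs As Integer) As Boolean
        Try
            Using runProcess As New Process()
                With runProcess.StartInfo
                    .FileName = command
                    .Arguments = commandLine
                    'Start the executable directly and do not open a console window on the server
                    .UseShellExecute = False
                    .CreateNoWindow = True
                End With
                runProcess.Start()
                'WaitForExit(Integer) returns False if the process is still running after the timeout
                If Not runProcess.WaitForExit(timeoutMs) Then
                    runProcess.Kill()
                    Return False
                End If
                'Assumption: an exit code of 0 means the tool succeeded
                Return runProcess.ExitCode = 0
            End Using
        Catch ex As Exception
            'Covers a missing executable as well as a failure to kill the process
            Return False
        End Try
    End Function
End Module
```

In Page_Load the call would then look like RunWithTimeout("c:\tools\pdftk.exe", buildCommand.ToString(), 30000). The 30 second value is only a placeholder. In keeping with the advice about Web.Config, it could instead be read from an application setting, for example under a made-up key such as "pdftkTimeoutMs".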
