Mail merge in Java for Microsoft Word documents and conversion to PDF without iText – Part II
In my last post, Mail merge in java for Microsoft Word document – Part I, I explained how variables can be replaced to generate a merged document.
This article is an extension of the previous one and explains how the merged MS Word file can be converted to PDF.
XDocReport has a cool extension which converts to PDF, but it uses iText. As most of you are aware, iText is no longer available under a commercial-friendly license; it is available under the AGPL.
We found docx4j as an alternative that avoids iText. It is a cool API which relies on XSL-FO for conversion to PDF. Below is the extended code for producing the PDF. We added a few dependencies to pom.xml (excluding the transitive iText dependency from docx4j and xhtmlrenderer), added two methods to the Java class used in the last article, and it's done!
The updated pom.xml looks like this:
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>docxtopdf-without-itext</groupId>
  <artifactId>docxtopdf-without-itext</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <name>docxtopdf-without-itext</name>
  <description>Microsoft Word DOCX format to PDF conversion with variable replacement</description>
  <properties>
    <xdocreport.version>0.9.8</xdocreport.version>
  </properties>
  <dependencies>
    <dependency>
      <groupId>fr.opensagres.xdocreport</groupId>
      <artifactId>fr.opensagres.xdocreport.core</artifactId>
      <version>${xdocreport.version}</version>
    </dependency>
    <dependency>
      <groupId>fr.opensagres.xdocreport</groupId>
      <artifactId>fr.opensagres.xdocreport.document</artifactId>
      <version>${xdocreport.version}</version>
    </dependency>
    <dependency>
      <groupId>fr.opensagres.xdocreport</groupId>
      <artifactId>fr.opensagres.xdocreport.document.docx</artifactId>
      <version>${xdocreport.version}</version>
    </dependency>
    <dependency>
      <groupId>fr.opensagres.xdocreport</groupId>
      <artifactId>fr.opensagres.xdocreport.converter</artifactId>
      <version>${xdocreport.version}</version>
    </dependency>
    <dependency>
      <groupId>fr.opensagres.xdocreport</groupId>
      <artifactId>fr.opensagres.xdocreport.template.freemarker</artifactId>
      <version>${xdocreport.version}</version>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.8</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.docx4j</groupId>
      <artifactId>docx4j</artifactId>
      <version>2.8.1</version>
      <exclusions>
        <exclusion>
          <groupId>com.lowagie</groupId>
          <artifactId>itext</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.docx4j</groupId>
      <artifactId>xhtmlrenderer</artifactId>
      <version>1.0.0</version>
      <exclusions>
        <exclusion>
          <groupId>com.lowagie</groupId>
          <artifactId>itext</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.httpcomponents</groupId>
      <artifactId>httpclient</artifactId>
      <version>4.2.2</version>
    </dependency>
    <dependency>
      <groupId>batik</groupId>
      <artifactId>batik-util</artifactId>
      <version>1.6</version>
    </dependency>
    <dependency>
      <groupId>batik</groupId>
      <artifactId>batik-1.5-fop</artifactId>
      <version>0.20-5</version>
    </dependency>
  </dependencies>
  <build>
    <pluginManagement>
      <plugins>
        <plugin>
          <artifactId>maven-compiler-plugin</artifactId>
          <version>2.3.2</version>
          <configuration>
            <source>1.6</source>
            <target>1.6</target>
          </configuration>
        </plugin>
      </plugins>
    </pluginManagement>
  </build>
</project>
The updated Java file is shown below:
package com.sambhashanam.docx;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.util.Map;

import org.docx4j.convert.out.pdf.PdfConversion;
import org.docx4j.convert.out.pdf.viaXSLFO.PdfSettings;
import org.docx4j.openpackaging.exceptions.Docx4JException;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

import fr.opensagres.xdocreport.core.XDocReportException;
import fr.opensagres.xdocreport.document.IXDocReport;
import fr.opensagres.xdocreport.document.images.FileImageProvider;
import fr.opensagres.xdocreport.document.registry.XDocReportRegistry;
import fr.opensagres.xdocreport.template.IContext;
import fr.opensagres.xdocreport.template.TemplateEngineKind;
import fr.opensagres.xdocreport.template.formatter.FieldsMetadata;

/**
 * @author Dhananjay Jha
 */
public class DocxDocumentMergerAndConverter {

    /**
     * Takes a file path as input and returns a stream opened on it.
     * The caller is responsible for closing the returned stream.
     */
    public InputStream loadDocumentAsStream(String filePath) throws IOException {
        // File.toURL() is deprecated, so go through toURI() first.
        URL url = new File(filePath).toURI().toURL();
        return url.openStream();
    }

    /**
     * Loads the docx template as an XDocReport report.
     */
    public IXDocReport loadDocumentAsIDocxReport(InputStream documentTemplateAsStream,
            TemplateEngineKind freemarkerOrVelocityTemplateKind) throws IOException, XDocReportException {
        return XDocReportRegistry.getRegistry()
                .loadReport(documentTemplateAsStream, freemarkerOrVelocityTemplateKind);
    }

    /**
     * Creates an IContext from the report and puts the (non-image) variables into it.
     */
    public IContext replaceVariabalesInTemplateOtherThanImages(IXDocReport report,
            Map<String, Object> variablesToBeReplaced) throws XDocReportException {
        IContext context = report.createContext();
        for (Map.Entry<String, Object> variable : variablesToBeReplaced.entrySet()) {
            context.put(variable.getKey(), variable.getValue());
        }
        return context;
    }

    /**
     * Takes a map of image variable name to file path of the image to be inserted.
     * Registers each variable as an image field and adds a FileImageProvider to the context.
     */
    public void replaceImagesVariabalesInTemplate(IXDocReport report,
            Map<String, String> variablesToBeReplaced, IContext context) {
        FieldsMetadata metadata = new FieldsMetadata();
        for (Map.Entry<String, String> variable : variablesToBeReplaced.entrySet()) {
            metadata.addFieldAsImage(variable.getKey());
            context.put(variable.getKey(), new FileImageProvider(new File(variable.getValue()), true));
        }
        report.setFieldsMetadata(metadata);
    }

    /**
     * Processes the template with the given context and returns the merged docx as a byte array.
     */
    public byte[] generateMergedOutput(IXDocReport report, IContext context)
            throws XDocReportException, IOException {
        ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
        report.process(context, outputStream);
        return outputStream.toByteArray();
    }

    /**
     * Takes the template path and variables and returns the merged docx as byte[].
     */
    public byte[] mergeAndGenerateOutput(String templatePath, TemplateEngineKind templateEngineKind,
            Map<String, Object> nonImageVariableMap, Map<String, String> imageVariablesWithPathMap)
            throws IOException, XDocReportException {
        InputStream inputStream = loadDocumentAsStream(templatePath);
        try {
            IXDocReport xdocReport = loadDocumentAsIDocxReport(inputStream, templateEngineKind);
            IContext context = replaceVariabalesInTemplateOtherThanImages(xdocReport, nonImageVariableMap);
            replaceImagesVariabalesInTemplate(xdocReport, imageVariablesWithPathMap, context);
            return generateMergedOutput(xdocReport, context);
        } finally {
            // Close the template stream even if merging fails.
            inputStream.close();
        }
    }

    /**
     * Converts the bytes of a docx document to PDF using docx4j's XSL-FO based converter.
     *
     * @param docxBytes content of the (merged) docx document
     * @return the PDF as a byte array
     */
    public byte[] generatePDFOutputFromDocx(byte[] docxBytes)
            throws XDocReportException, IOException, Docx4JException {
        ByteArrayOutputStream pdfByteOutputStream = new ByteArrayOutputStream();
        WordprocessingMLPackage wordprocessingMLPackage =
                WordprocessingMLPackage.load(new ByteArrayInputStream(docxBytes));
        PdfSettings pdfSettings = new PdfSettings();
        PdfConversion docx4jViaXSLFOconverter =
                new org.docx4j.convert.out.pdf.viaXSLFO.Conversion(wordprocessingMLPackage);
        docx4jViaXSLFOconverter.output(pdfByteOutputStream, pdfSettings);
        return pdfByteOutputStream.toByteArray();
    }

    /**
     * Takes the template path and variables, merges them and returns the result as PDF byte[].
     */
    public byte[] mergeAndGeneratePDFOutput(String templatePath, TemplateEngineKind templateEngineKind,
            Map<String, Object> nonImageVariableMap, Map<String, String> imageVariablesWithPathMap)
            throws IOException, XDocReportException, Docx4JException {
        // Same merge as before, followed by the docx-to-PDF conversion.
        byte[] mergedOutput = mergeAndGenerateOutput(templatePath, templateEngineKind,
                nonImageVariableMap, imageVariablesWithPathMap);
        return generatePDFOutputFromDocx(mergedOutput);
    }
}
The updated test case is shown below:
package test.com.sambhashanam.docx;

import static org.junit.Assert.*;

import java.io.FileOutputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.docx4j.openpackaging.exceptions.Docx4JException;
import org.junit.Test;

import com.sambhashanam.docx.DocxDocumentMergerAndConverter;

import fr.opensagres.xdocreport.core.XDocReportException;
import fr.opensagres.xdocreport.template.TemplateEngineKind;

/**
 * @author Dhananjay Jha
 */
public class DocxDocumentMergerAndConverterTest {

    /**
     * Test method for
     * {@link com.sambhashanam.docx.DocxDocumentMergerAndConverter#mergeAndGeneratePDFOutput(java.lang.String, fr.opensagres.xdocreport.template.TemplateEngineKind, java.util.Map, java.util.Map)}.
     */
    @Test
    public void testMergeAndGenerateOutput() throws IOException, XDocReportException, Docx4JException {
        String templatePath = "D:\\junoprojects\\docxtopdf\\docx-template\\ThankYouNote_Template.docx";

        // Plain text variables used in the template.
        Map<String, Object> nonImageVariableMap = new HashMap<String, Object>();
        nonImageVariableMap.put("thank_you_date", "24-June-2013");
        nonImageVariableMap.put("name", "Rajani Jha");
        nonImageVariableMap.put("website", "www.sambhashanam.com");
        nonImageVariableMap.put("author_name", "Dhananjay Jha");

        // Image variables: variable name mapped to the image file path.
        Map<String, String> imageVariablesWithPathMap = new HashMap<String, String>();
        imageVariablesWithPathMap.put("header_image_logo",
                "D:\\junoprojects\\docxtopdf\\docx-template\\ScreenShot004.jpg");

        DocxDocumentMergerAndConverter docxDocumentMergerAndConverter = new DocxDocumentMergerAndConverter();
        byte[] mergedOutput = docxDocumentMergerAndConverter.mergeAndGeneratePDFOutput(templatePath,
                TemplateEngineKind.Freemarker, nonImageVariableMap, imageVariablesWithPathMap);
        assertNotNull(mergedOutput);

        // Write the PDF next to the template; close the stream even if the write fails.
        FileOutputStream os = new FileOutputStream(
                "D:\\junoprojects\\docxtopdf\\docx-template\\ThankYouNote" + System.nanoTime() + ".pdf");
        try {
            os.write(mergedOutput);
            os.flush();
        } finally {
            os.close();
        }
    }
}
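The test above only asserts that the returned byte array is not null. If you want a slightly stronger check, one option is to verify that the bytes start with the "%PDF-" marker that every PDF file begins with. The following is only a rough sketch: the class `PdfHeaderCheck` and its method `looksLikePdf` are hypothetical names introduced here for illustration and are not part of XDocReport or docx4j.

```java
// Hypothetical helper, not part of the original project or of any library used in it.
class PdfHeaderCheck {

    // A PDF file starts with the ASCII characters "%PDF-" followed by a version number.
    private static final byte[] PDF_MAGIC = { '%', 'P', 'D', 'F', '-' };

    // Returns true if the data begins with the PDF marker; this does not validate the rest of the file.
    static boolean looksLikePdf(byte[] data) {
        if (data == null || data.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (data[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] pdfLike = { '%', 'P', 'D', 'F', '-', '1', '.', '4' };
        byte[] zipLike = { 'P', 'K', 3, 4 }; // ZIP header, which is how a .docx file starts
        System.out.println(looksLikePdf(pdfLike)); // prints true
        System.out.println(looksLikePdf(zipLike)); // prints false
    }
}
```

In the test case, this could be used as `assertTrue(PdfHeaderCheck.looksLikePdf(mergedOutput));` right after the `assertNotNull` call, which would catch the case where the unconverted docx bytes are returned by mistake.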
I used this MS Word document, ThankYouNote_Template.docx, as the template, and the output was this PDF: ThankYouNote18601689325878.pdf
Here is a glimpse of the template and the final output.