HOW TO: Convert office documents to PDF using Open Office/LibreOffice in C#
Recently we needed to convert office documents such as DOC, DOCX, XLS, XLSX, PPT and PPTX to PDF. After googling for some time,
I could not find a Microsoft Office API or command that achieves this. I already had some experience with Open Office on my
Ubuntu netbook and knew that it exposes its API through UNO, so I started working on a demo that converts office
documents to PDF.
Open Office provides a CLI (Common Language Infrastructure) binding for UNO, its component model, so that the API can be used from the .NET Framework.
See the Open Office Developer's Guide for details.
Things that need to be installed
Open Office (or LibreOffice)
Open Office SDK
NOTE: The Open Office SDK is not strictly required if you can copy the required assemblies from the GAC to your application’s assembly folder. The required files are listed below:
cli_basetypes.dll
cli_cppuhelper.dll
cli_oootypes.dll
cli_ure.dll
cli_uretypes.dll
If you have installed the Open Office SDK, you will find these files under sdk\cli in your SDK installation folder.
Implementation
Add references to all the DLLs above to your project, then follow the methods described below.
You will need to import these namespaces.
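A plausible set of using directives for the methods described in this post, assuming the five cli_*.dll assemblies listed above are referenced by the project. The trailing comments are my own notes on what each namespace is used for in the sketches here, not an exhaustive list.

```csharp
// Standard .NET namespaces used by the helper methods.
using System;
using System.Diagnostics;             // Process, ProcessStartInfo (launching soffice.exe)
using System.IO;                      // Path.GetExtension

// UNO CLI binding namespaces, provided by the referenced cli_*.dll assemblies.
using uno.util;                       // Bootstrap
using unoidl.com.sun.star.beans;      // PropertyValue
using unoidl.com.sun.star.frame;      // XComponentLoader, XStorable
using unoidl.com.sun.star.lang;       // XComponent, XMultiServiceFactory
```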
This is the method you will actually call from your assembly or expose from your class library. It starts the Open Office executable,
initializes the UNO components and finally saves the document as PDF.
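A minimal sketch of such an entry point might look like the following. The class name OfficeToPdf is hypothetical, and the helpers it calls (StartOpenOffice, InitDocument, SaveDocument, PathConverter, ConvertExtensionToFilterType) stand for the methods described in the paragraphs that follow, so this fragment only compiles once those are present in the same partial class and the cli_*.dll assemblies are referenced.

```csharp
using System;
using System.IO;
using uno.util;                       // Bootstrap
using unoidl.com.sun.star.frame;      // XComponentLoader
using unoidl.com.sun.star.lang;       // XComponent, XMultiServiceFactory

public static partial class OfficeToPdf
{
    public static void ConvertToPdf(string inputFile, string outputFile)
    {
        // Reject file types for which no PDF export filter is known.
        string filterName = ConvertExtensionToFilterType(Path.GetExtension(inputFile));
        if (filterName == null)
            throw new NotSupportedException("Unknown file type for OpenOffice: " + inputFile);

        StartOpenOffice();

        // Bootstrap UNO and obtain the Desktop service, which can load documents.
        var context = Bootstrap.bootstrap();
        var factory = (XMultiServiceFactory)context.getServiceManager();
        var loader = (XComponentLoader)factory.createInstance("com.sun.star.frame.Desktop");

        XComponent document = null;
        try
        {
            document = InitDocument(loader, PathConverter(inputFile), "_blank");
            if (document == null)
                throw new InvalidOperationException("OpenOffice could not load " + inputFile);

            SaveDocument(document, filterName, PathConverter(outputFile));
        }
        finally
        {
            // Always release the document, even if the export fails.
            if (document != null)
                document.dispose();
        }
    }
}
```

With those pieces in place, a call would look like OfficeToPdf.ConvertToPdf(@"C:\docs\report.docx", @"C:\docs\report.pdf"); the paths are placeholders.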
This method starts an instance of soffice.exe, which your application then communicates with through the referenced CLI DLLs.
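One way to sketch this step with System.Diagnostics is shown below. It assumes soffice.exe can be found on the PATH, and the class name OfficeToPdf and the helper BuildStartInfo are hypothetical. The single-dash switches are the older OpenOffice spelling; LibreOffice documents the double-dash forms (--headless), so the arguments may need adjusting for your installation.

```csharp
using System;
using System.Diagnostics;

public static partial class OfficeToPdf
{
    // Builds the launch settings separately so they can be inspected
    // without actually starting a process.
    public static ProcessStartInfo BuildStartInfo(
        string executable = "soffice.exe",
        string arguments = "-headless -nofirststartwizard")
    {
        return new ProcessStartInfo
        {
            FileName = executable,
            Arguments = arguments,
            CreateNoWindow = true
        };
    }

    public static void StartOpenOffice()
    {
        // GetProcessesByName expects the process name without the ".exe" extension.
        if (Process.GetProcessesByName("soffice").Length > 0)
            return; // an instance is already running

        Process process = Process.Start(BuildStartInfo());
        if (process == null)
            throw new InvalidOperationException("OpenOffice failed to start.");
    }
}
```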
This method initializes the document instance and loads the source file.
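Loading can be sketched with XComponentLoader.loadComponentFromURL, passing the "Hidden" property so that no window is shown. The class name OfficeToPdf and the parameter names are my own; the file URL is expected to be in the form Open Office understands (for example file:///C:/docs/report.docx).

```csharp
using unoidl.com.sun.star.beans;      // PropertyValue
using unoidl.com.sun.star.frame;      // XComponentLoader
using unoidl.com.sun.star.lang;       // XComponent

public static partial class OfficeToPdf
{
    public static XComponent InitDocument(XComponentLoader loader, string fileUrl, string target)
    {
        // Open the document without showing a window.
        var openProps = new PropertyValue[1];
        openProps[0] = new PropertyValue { Name = "Hidden", Value = new uno.Any(true) };

        // target is a frame name such as "_blank"; 0 means no special search flags.
        return loader.loadComponentFromURL(fileUrl, target, 0, openProps);
    }
}
```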
This method saves the processed document to a destination file.
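A sketch of the export step, assuming the loaded document supports XStorable: the filter name selects the PDF exporter and "Overwrite" allows an existing destination file to be replaced. The class name OfficeToPdf and the parameter names are hypothetical.

```csharp
using unoidl.com.sun.star.beans;      // PropertyValue
using unoidl.com.sun.star.frame;      // XStorable
using unoidl.com.sun.star.lang;       // XComponent

public static partial class OfficeToPdf
{
    public static void SaveDocument(XComponent document, string filterName, string destinationUrl)
    {
        var storeProps = new PropertyValue[2];
        // Export filter, e.g. "writer_pdf_Export" for text documents.
        storeProps[0] = new PropertyValue { Name = "FilterName", Value = new uno.Any(filterName) };
        // Replace the destination file if it already exists.
        storeProps[1] = new PropertyValue { Name = "Overwrite", Value = new uno.Any(true) };

        // storeToURL writes a copy and leaves the loaded document's own location unchanged.
        ((XStorable)document).storeToURL(destinationUrl, storeProps);
    }
}
```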
This method converts a file path to the file URL format that the Open Office API expects.
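A deliberately simple sketch of that conversion: it swaps backslashes for forward slashes and adds the file:/// prefix. It assumes an absolute Windows-style path and does not escape spaces or other special characters, so treat it as a starting point. The class name OfficeToPdf is hypothetical.

```csharp
using System;

public static partial class OfficeToPdf
{
    public static string PathConverter(string file)
    {
        if (string.IsNullOrEmpty(file))
            throw new ArgumentException("Null or empty path passed to OpenOffice", "file");

        // "C:\docs\report.docx" becomes "file:///C:/docs/report.docx".
        return "file:///" + file.Replace('\\', '/');
    }
}
```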
This method returns the export filter name required for the conversion, based on the file extension.
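The mapping could be sketched as a switch over the extension. Only the six formats named at the top of this post are covered here; the filter names are the PDF export filters of Writer, Calc and Impress. Returning null for unknown extensions is my own convention, and the class name OfficeToPdf is hypothetical.

```csharp
public static partial class OfficeToPdf
{
    public static string ConvertExtensionToFilterType(string extension)
    {
        // Normalize so that ".DOCX" and ".docx" are treated alike.
        switch ((extension ?? string.Empty).ToLowerInvariant())
        {
            case ".doc":
            case ".docx":
                return "writer_pdf_Export";
            case ".xls":
            case ".xlsx":
                return "calc_pdf_Export";
            case ".ppt":
            case ".pptx":
                return "impress_pdf_Export";
            default:
                return null; // unsupported file type
        }
    }
}
```

Other formats that Open Office can open, such as ODT or RTF, can be added as extra case labels under the matching filter.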
Well, that’s it! Just invoke the ConvertToPdf method with the input and output file names as parameters.