Serverless LibreOffice PDF Converter

This is a full-featured LibreOffice build compiled to run in the AWS Lambda environment.

It is stripped of everything unnecessary, so it takes up only 109 MB of the 250 MB allowed for a function's zip artifact.

It converts almost any office document to PDF:

.doc, .docx, .ppt, .pptx, .xls, .xlsx, .numbers, .pages, .key, .csv, .txt, .odt, .ods, .odp, .html, .rtf, .xlt, .psd, .bmp, .png, .xml, .svg, .cdr, .eps, .psw, .dot, .tiff, and more
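
As a rough illustration of what a conversion call can look like, the sketch below builds a headless LibreOffice command line. The `--headless`, `--convert-to` and `--outdir` options are standard soffice flags; the binary path and the helper names `build_convert_command` and `expected_pdf_path` are hypothetical and not taken from this project.

```python
import os

# Hypothetical location of the unpacked LibreOffice binary (an assumption,
# not a path confirmed by this project).
SOFFICE_PATH = "/tmp/instdir/program/soffice"


def build_convert_command(input_path, out_dir="/tmp", soffice=SOFFICE_PATH):
    """Return the argv list for a headless PDF conversion of input_path."""
    return [
        soffice,
        "--headless",           # run without a GUI
        "--convert-to", "pdf",  # target format
        "--outdir", out_dir,    # where the PDF is written
        input_path,
    ]


def expected_pdf_path(input_path, out_dir="/tmp"):
    """LibreOffice keeps the base name and swaps the extension for .pdf."""
    stem = os.path.splitext(os.path.basename(input_path))[0]
    return os.path.join(out_dir, stem + ".pdf")
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; the result would then be expected at the path returned by `expected_pdf_path`.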

Pricing Example

Suppose you need to convert 1 million documents with an average size of 5 MB.

Storing them takes 5 terabytes of S3 storage per month. As the table shows, storage, not compute, is the primary driver of the price.

Resource          Amount                    Price per unit   Sum
S3 Storage        1,000,000 * 5MB           $0.000023        $115.00
Lambda Runtime    1,000,000 * 1.5GB * 1.2s  $0.00002501      $30.01
S3 PUT Requests   1,000,000                 $0.000005        $5.00
S3 GET Requests   1,000,000                 $0.0000004       $0.40
Lambda Requests   1,000,000                 $0.0000002       $0.20
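
The arithmetic behind the table can be reproduced with a short sketch. The unit interpretations are assumptions inferred from the figures: the S3 price is read as per MB-month, and the Lambda runtime price as per second of a 1.5 GB function. The names `LINE_ITEMS` and `monthly_costs` are illustrative only.

```python
from decimal import Decimal

DOCUMENTS = 1_000_000

# (resource, billed units per document, price per unit in USD)
# Unit meanings are assumptions inferred from the table, not stated in it.
LINE_ITEMS = [
    ("S3 Storage", Decimal("5"), Decimal("0.000023")),          # MB stored
    ("Lambda Runtime", Decimal("1.2"), Decimal("0.00002501")),  # seconds at 1.5 GB
    ("S3 PUT Requests", Decimal("1"), Decimal("0.000005")),
    ("S3 GET Requests", Decimal("1"), Decimal("0.0000004")),
    ("Lambda Requests", Decimal("1"), Decimal("0.0000002")),
]


def monthly_costs(documents=DOCUMENTS):
    """Return a dict of resource name to cost, rounded to whole cents."""
    cent = Decimal("0.01")
    return {
        name: (documents * units * price).quantize(cent)
        for name, units, price in LINE_ITEMS
    }


costs = monthly_costs()
total = sum(costs.values())

for name, cost in costs.items():
    print(f"{name:<16} ${cost}")
print(f"{'Total':<16} ${total}")
```

Under these assumptions the five rows add up to $150.61, of which $115.00 is storage.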

Open Improvements

Reduce Cold Start Time

Currently the Lambda function unpacks a 109 MB .tar.gz archive to the /tmp folder, which takes ~1-2 seconds on cold start.
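
A minimal sketch of the unpack-on-cold-start pattern, assuming the archive ships with the function and that a marker file is an acceptable way to detect a warm container. The function name `ensure_unpacked` and the `.unpacked` marker are hypothetical; this is not the project's actual code.

```python
import os
import tarfile


def ensure_unpacked(archive_path, target_dir):
    """Extract archive_path into target_dir once.

    Returns True if extraction ran (cold start) and False if the marker
    file shows the files are already in place (warm container).
    """
    marker = os.path.join(target_dir, ".unpacked")  # hypothetical marker name
    if os.path.exists(marker):
        return False
    os.makedirs(target_dir, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(target_dir)
    # Write the marker last so a failed extraction is retried next time.
    with open(marker, "w"):
        pass
    return True
```

Because /tmp survives between invocations of the same container, only the first call pays the unpack cost; later calls return immediately.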

It would be nice to create a single compressed executable to save the unpack time and increase portability. I tried the Ermine packager and it works, but unfortunately it is commercial software. Statifier, a similar open-source tool, produces broken binaries.

Maybe someone has another idea for how to create a single executable from a folder full of shared objects.

Further Size Reduction

I am not a Linux or C++ expert, so I have surely missed some easy "hacks" to reduce the size of the compiled LibreOffice.

Mostly I just excluded as much unrelated stuff as possible from compilation and stripped symbols from the shared objects.
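
One way to approach the symbol-stripping step is sketched below: list the shared objects under a build directory, largest first, and prepare a `strip` invocation for each. The exact flags used for this build are not stated here; `--strip-unneeded` (a GNU binutils option) is an assumption, and the helper names are hypothetical.

```python
import os


def find_shared_objects(root):
    """Return paths of .so / .so.N files under root, largest first."""
    found = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith(".so") or ".so." in name:
                found.append(os.path.join(dirpath, name))
    return sorted(found, key=os.path.getsize, reverse=True)


def strip_commands(paths):
    """Build one GNU strip command (as an argv list) per shared object."""
    # --strip-unneeded is an assumed choice, not the project's documented one.
    return [["strip", "--strip-unneeded", path] for path in paths]
```

Sorting by size puts the libraries that contribute most to the artifact at the top, which is also a convenient starting point when looking for components to exclude from compilation.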

Here are lists of the RPM packages and libraries available in the AWS Lambda environment, which can be helpful.