Serverless LibreOffice PDF Converter

This is a full-featured LibreOffice build compiled to run in the AWS Lambda environment.

It is stripped of everything not needed for conversion, so it takes up only 109 MB of the 250 MB allowed for a function's zip artifact.

It converts almost any office document to PDF:

.doc
.docx
.ppt
.pptx
.xls
.xlsx
.numbers
.pages
.key
.csv
.txt
.odt
.ods
.odp
.html
.rtf
.xlt
.psd
.bmp
.png
.xml
.svg
.cdr
.eps
.psw
.dot
.tiff
and more
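
As a rough illustration of what a conversion looks like, here is a minimal Python sketch that shells out to LibreOffice in headless mode. The `--headless`, `--convert-to` and `--outdir` options are standard LibreOffice command-line flags; the binary location `SOFFICE_PATH` and both function names are hypothetical and depend on where the archive is unpacked in your function.

```python
import subprocess
from pathlib import Path

# Hypothetical location of the unpacked binary; adjust to your own layout.
SOFFICE_PATH = "/tmp/instdir/program/soffice"


def build_convert_command(input_path, out_dir, soffice_path=SOFFICE_PATH):
    """Build the headless LibreOffice command line for a PDF conversion."""
    return [
        soffice_path,
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(input_path),
    ]


def convert_to_pdf(input_path, out_dir="/tmp", soffice_path=SOFFICE_PATH):
    """Run the conversion and return the expected path of the resulting PDF."""
    subprocess.run(
        build_convert_command(input_path, out_dir, soffice_path),
        check=True,  # raise CalledProcessError if LibreOffice exits non-zero
    )
    # LibreOffice names the output after the input file, with a .pdf suffix.
    return Path(out_dir) / (Path(input_path).stem + ".pdf")
```

Inside a Lambda handler, only `/tmp` is writable, so both the input file and the output directory would normally live there.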

Pricing Example

Say you need to convert 1 million documents with an average size of 5 MB.

That results in 5 terabytes of S3 storage per month. As the table shows, storage is the primary cost driver, not compute.

| Resource | Amount | Price per unit | Sum |
| --- | --- | --- | --- |
| S3 Storage | 1,000,000 * 5 MB | $0.000023 | $115.00 |
| Lambda Runtime | 1,000,000 * 1.5 GB * 1.2 s | $0.00002501 | $30.01 |
| S3 PUT Requests | 1,000,000 | $0.000005 | $5.00 |
| S3 GET Requests | 1,000,000 | $0.0000004 | $0.40 |
| Lambda Requests | 1,000,000 | $0.0000002 | $0.20 |
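
The arithmetic behind the table can be reproduced with a short Python sketch, which also makes it easy to plug in your own volumes. The unit prices are copied from the table; reading the Lambda price as "per second of a 1.5 GB function" is an assumption that matches the listed sum, and the function name `estimate_costs` is made up for this example.

```python
# Unit prices in USD, copied from the pricing table.
S3_STORAGE_PER_MB = 0.000023
LAMBDA_PER_SECOND = 0.00002501  # assumed: per second of a 1.5 GB function
S3_PUT_REQUEST = 0.000005
S3_GET_REQUEST = 0.0000004
LAMBDA_REQUEST = 0.0000002


def estimate_costs(documents, avg_size_mb, seconds_per_document):
    """Return the cost of each line item in USD, rounded to cents."""
    costs = {
        "S3 Storage": documents * avg_size_mb * S3_STORAGE_PER_MB,
        "Lambda Runtime": documents * seconds_per_document * LAMBDA_PER_SECOND,
        "S3 PUT Requests": documents * S3_PUT_REQUEST,
        "S3 GET Requests": documents * S3_GET_REQUEST,
        "Lambda Requests": documents * LAMBDA_REQUEST,
    }
    return {item: round(cost, 2) for item, cost in costs.items()}


costs = estimate_costs(documents=1_000_000, avg_size_mb=5, seconds_per_document=1.2)
for item, cost in costs.items():
    print(f"{item:<16} ${cost:>7.2f}")
print(f"{'Total':<16} ${sum(costs.values()):>7.2f}")
```

With these inputs, storage accounts for roughly three quarters of the total, while the Lambda runtime is about a fifth.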

Open Improvements

Reduce Cold Start Time

Currently the Lambda function unpacks a 109 MB .tar.gz archive to the /tmp folder, which takes ~1-2 seconds on a cold start.

It would be nice to create a single compressed executable to save the unpack time and increase portability. I tried the Ermine packager and it works! Unfortunately, it is commercial software. Statifier, a similar open-source tool, produces broken binaries.

Maybe someone has another idea for creating a single executable from a folder full of shared objects.
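
The unpack-on-cold-start step can be sketched in Python roughly as follows. This is an illustration of the pattern, not the project's actual code: the function name `ensure_unpacked`, the default target directory and the marker file are all assumptions. The marker file lets warm invocations, which reuse the same /tmp, skip the extraction.

```python
import os
import tarfile


def ensure_unpacked(archive_path, target_dir="/tmp/libreoffice"):
    """Unpack the archive once; return True only if extraction happened."""
    marker = os.path.join(target_dir, ".unpacked")  # hypothetical marker file
    if os.path.exists(marker):
        # Warm start: /tmp survived from a previous invocation.
        return False

    os.makedirs(target_dir, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(target_dir)

    # Write the marker last, so a failed extraction is retried next time.
    with open(marker, "w"):
        pass
    return True
```

Calling `ensure_unpacked` at the top of the handler keeps the 1-2 second cost confined to cold starts.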

Further Size Reduction

I am not a Linux or C++ expert, so I have surely missed some easy "hacks" for reducing the size of the compiled LibreOffice.

Mostly I just excluded as much unrelated functionality from compilation as possible and stripped symbols from the shared objects.

Here are lists of the RPM packages and libraries available in the AWS Lambda environment, which may be helpful.
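
For anyone who wants to experiment with the symbol-stripping step, here is a hedged Python sketch. It assumes GNU binutils `strip` is on the PATH and uses its `--strip-unneeded` option; the exact flags used for this build are not stated here, and the helper names are invented for the example.

```python
import subprocess
from pathlib import Path


def is_shared_object(path):
    """Match names such as libfoo.so and libfoo.so.1, but not notes.source."""
    name = path.name
    return name.endswith(".so") or ".so." in name


def find_shared_objects(root):
    """Return regular (non-symlink) shared objects below root, sorted."""
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and not path.is_symlink() and is_shared_object(path)
    )


def strip_shared_objects(root):
    """Strip symbols not needed at runtime from every shared object found."""
    for path in find_shared_objects(root):
        subprocess.run(["strip", "--strip-unneeded", str(path)], check=True)
```

Symlinks are skipped so that each library is processed only once, through its real file.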