Optimizing Puppeteer PDF generation

At Carriyo we currently generate 10k PDFs per day. We design our documents in HTML / CSS and use Puppeteer to convert them to PDF. Puppeteer uses the Chromium browser to generate the PDF, which, as you can guess, is very heavy on resources. So buckle up, as I take you through a whole bunch of optimizations and error-handling considerations to scale this beast up.

Autoscaling behavior of various cloud services

It is hard to find any information on how cloud services handle a sudden spike in the number of requests, which makes it hard to make an informed decision when picking a cloud service. Here I attempt to list the autoscaling behavior of some cloud service providers.

"Parallel" API calls with JS

A common question I get from devs new to JS is: How do I do “parallel” HTTP/API calls with JS?
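
As a minimal sketch of one common answer: start every call first, then wait for all of them together with `Promise.all`. The `fakeApiCall` helper below is a hypothetical stand-in for a real HTTP request (it only waits and resolves), so the example runs without a network.

```javascript
// Hypothetical stand-in for an HTTP call: resolves after `ms` milliseconds.
function fakeApiCall(name, ms) {
  return new Promise((resolve) => setTimeout(() => resolve(`${name} done`), ms));
}

// Start all the calls first, then wait for all of them together.
async function callInParallel() {
  const pending = [
    fakeApiCall("users", 50),
    fakeApiCall("orders", 50),
    fakeApiCall("invoices", 50),
  ];
  // Promise.all resolves with the results in the same order as the input array.
  return Promise.all(pending);
}
```

The calls overlap while they wait, so the total time is roughly that of the slowest call rather than the sum. Note that `Promise.all` rejects as soon as any one call fails; `Promise.allSettled` is an option when every outcome is needed.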

Webhooks and HMAC Verification

Say your external clients / partners want to be notified of certain events happening on your system. So you decide to push events/messages to their systems via HTTP (webhooks), and your client has an endpoint that accepts the requests. Here’s a security problem your client has to deal with: they have an endpoint open to the public internet which could be abused by bots / bad actors. How do you let their system know that a request they have received is indeed from your service and is not a bot / spoofed / spam request?
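
One way to sketch the idea with Node's built-in `crypto` module: the sender signs the raw request body with a secret shared with the client, and the receiver recomputes the signature and compares. The function names, the SHA-256 choice and the hex encoding are assumptions for illustration, not necessarily what the post uses.

```javascript
const crypto = require("node:crypto");

// Sender side: sign the raw request body with a secret shared with the client.
function signPayload(secret, rawBody) {
  return crypto.createHmac("sha256", secret).update(rawBody).digest("hex");
}

// Receiver side: recompute the signature and compare in constant time.
function isValidSignature(secret, rawBody, receivedSignature) {
  const expected = Buffer.from(signPayload(secret, rawBody), "hex");
  const received = Buffer.from(receivedSignature, "hex");
  // timingSafeEqual throws when lengths differ, so check the lengths first.
  return expected.length === received.length && crypto.timingSafeEqual(expected, received);
}
```

The sender would typically put the signature in a request header; the receiver must verify it against the exact bytes it received, before any JSON parsing or re-serialization.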

Shorter Unique IDs

If you have dealt with any form data and a database, you have had the need to assign unique IDs to the data you save. Many a time these IDs are internal, but sometimes IDs need to be used in other ways by your users, e.g., they need to paste it, email it, or speak it over the phone. This is where a nice, short, readable ID would be beneficial to the user.
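
A rough sketch of one possible approach, not necessarily the one the post takes: draw random characters from an alphabet that leaves out look-alike characters (0/O, 1/I). The alphabet, the default length and the `shortId` name are all assumptions.

```javascript
const crypto = require("node:crypto");

// 32 characters: digits 2-9 and uppercase letters without I and O (assumed alphabet).
const ALPHABET = "23456789ABCDEFGHJKLMNPQRSTUVWXYZ";

// Build an ID by picking `length` characters uniformly at random.
function shortId(length = 8) {
  let id = "";
  for (let i = 0; i < length; i++) {
    id += ALPHABET[crypto.randomInt(ALPHABET.length)];
  }
  return id;
}
```

Random IDs like this are not guaranteed to be unique, so a real system would still need a uniqueness check (for example, a unique index in the database) and a retry on collision.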

Concurrent Redis writes and correctness

Lately I’ve been trying to re-implement a cache library using Redis. The (simplified) requirement is to use the cached value for a given key if present; otherwise fetch a fresh value and cache it in Redis (the first time, or the first time after the cached value expires). Also assume that fetching a fresh value, even once, is a super expensive operation (it causes heavy load on the database).

Seems like a simple requirement, right? But one does not simply write distributed system code!
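
To see where the naive version goes wrong, here is a small sketch that uses an in-memory `Map` as a stand-in for Redis (an assumption, so it runs without a server). `expensiveFetch` and `naiveGetOrFetch` are hypothetical names.

```javascript
// In-memory stand-in for Redis (assumption); a real client would be async.
const fakeRedis = new Map();
let fetchCount = 0;

// Hypothetical expensive operation, e.g. a heavy database query.
async function expensiveFetch(key) {
  fetchCount += 1;
  await new Promise((resolve) => setTimeout(resolve, 20));
  return `value-for-${key}`;
}

// Naive cache-aside: check the cache, fetch on a miss, then store the result.
async function naiveGetOrFetch(key) {
  if (fakeRedis.has(key)) return fakeRedis.get(key);
  const fresh = await expensiveFetch(key); // a second caller can miss while this is pending
  fakeRedis.set(key, fresh);
  return fresh;
}
```

Calling `naiveGetOrFetch("a")` twice concurrently (for instance via `Promise.all`) makes both callers miss the cache and run the expensive fetch, which is exactly what the requirement is meant to avoid.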

Faster JS array includes

I am sure you’ve come across cases where you need to search for items multiple times in the same array. And you have written the code in an easy-to-read way, but not exactly the most performant way. Right?
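
A minimal sketch of the usual fix, assuming the lookups are exact-match checks: build a `Set` once and use `has` instead of calling `includes` on the array for every lookup. The sample data is made up.

```javascript
const allowedIds = [3, 8, 15, 42];
const incoming = [42, 7, 3, 99];

// Easy to read, but every includes() call scans the array from the start.
const slow = incoming.filter((id) => allowedIds.includes(id));

// Build the Set once; each has() lookup then takes constant time on average.
const allowedSet = new Set(allowedIds);
const fast = incoming.filter((id) => allowedSet.has(id));
```

Both versions produce the same result; the `Set` version pays a one-time cost to build the set, so it only wins when the same array is searched many times or is large.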

Background Removal with OpenCV (Take 2)

Since I last wrote my post on background removal in 2016, I’ve searched for alternative ways to get better results. Here I will dive into my new approach.