PDF Generator using Terraform, AWS, Lambda and API Gateway

Posted Jul 5, 2021 Updated Nov 14, 2022

By Gareth Hughes 3 min read

The application I’m currently working on has a lot of old documents created using a crystal reports editor and references DLLs in a folder in source control instead of using Nuget. As we’re trying to upgrade this solution to .NET 5 we need to move away from local DLLs. As part of this we decided to redo how our PDFs are being generated.

The approach: create a micro service that accepts a HTML payload and returns a PDF file. The stack we decided to use for this;

AWS
Terraform
Lambda
API Gateway
S3

The intent was to use Puppeteer and Chrome to save the PDF from the browser directly, while this worked fine locally it obviously wouldn’t work in a Lambda because it wouldn’t have Chrome available.

The solution was to use a package called chrome-aws-lambda from here: https://github.com/alixaxel/chrome-aws-lambda. Using this we could use the following code to generate a PDF file.

The Lambda

  
import * as Sentry from "@sentry/serverless";
import * as SentryTracing from "@sentry/tracing";
import { CaptureConsole } from "@sentry/integrations";
import { APIGatewayProxyEvent, APIGatewayProxyResult } from "aws-lambda";
import { generate } from "./services/pdfService";
import { save } from './services/fileService'
import logger from "./utils/logger";

export const handler = Sentry.AWSLambda.wrapHandler(
  async (event: APIGatewayProxyEvent): Promise<APIGatewayProxyResult> => {
    const document = event.body;

    if (!document) {
      return {
        statusCode: 400,
        body: "Bad request, missing document",
      };
    }

    try {
      const pdf = await generate(document);
      const result = await save(pdf);

      return {
        statusCode: 200,
        headers: {
          "Content-Type": "application/json"
        },
        body: JSON.stringify(result)
      };
    } catch (e) {
      logger.error(e, "Error processing file");

      return {
        statusCode: 500,
        body: "Error creating PDF",
      };
    }
  }
);

The PDF Service

  
import chromium from 'chrome-aws-lambda';

const generate = async (body: string): Promise<any> => {
	const browser = await chromium.puppeteer.launch({
		args: chromium.args,
		defaultViewport: chromium.defaultViewport,
		executablePath: await chromium.executablePath,
		headless: chromium.headless,
		ignoreHTTPSErrors: true
	});

	const page = await browser.newPage();
	await page.setContent(body);

  return page.pdf({
    preferCSSPageSize: true
  });
}

export { generate };

Terraform

The chrome-aws-lambda package was deployed to AWS as a layer for our lambda to reference.

  
resource "aws_lambda_layer_version" "lambda_layer" {
  filename   = "chrome_aws_lambda-10.zip"
  layer_name = "chrome-aws-lambda-${var.environment}"

  compatible_runtimes = ["nodejs12.x"]
}

Connect this up to the lambda

  
resource "aws_lambda_function" "lambda" {
  filename      = var.lambda_filename
  function_name = "${var.service_name}-${var.environment}"
  role          = aws_iam_role.lambda.arn
  handler       = "index.handler"
  runtime       = "nodejs12.x"
  memory_size   = 1024
  publish       = true
  timeout       = 800
  layers        = [aws_lambda_layer_version.lambda_layer.arn]
  environment {
    variables = {
      SENTRY_DSN      = var.sentry_dsn
      ENVIRONMENT     = var.environment
      PACKAGE_VERSION = var.lambda_version
      SERVICE_NAME    = var.service_name
      BUCKET_NAME     = var.bucket_name
    }
  }
}

The PDF can be returned directly via the lambda however there is a 6mb lambda payload limit to consider so the lambda takes the PDF file and saves it to S3 which is configured to be removed after 1 day. A signed download URL is returned with a 5 minute expiry.

This can then sit behind API Gateway and serve up beautiful PDFs like this:

Development

This post is licensed under CC BY 4.0 by the author.

The Lambda

The PDF Service

Terraform

Trending Tags