cloud Apr 19, 2018

Applying Minification and Uglification to AWS Lambda Functions

AWS Lambda — or Functions as a Service (FaaS) — has definitely been a game changer in the IaaS ecosystem. It has allowed backend developers to pay for what they use with competitive prices and efficiency. My team at Capital One uses a few Lambda functions to quickly bootstrap functionalities that require both elasticity and availability. Though AWS Lambda is already highly optimized for speed, I’m an efficiency freak, so whenever I’m working on something I always ask myself, “How can we make this even more efficient?” So, over the previous few months, I’ve been experimenting with optimizing Lambda functions to see if I’m using them optimally.

When optimizing AWS Lambda functions, one thing to keep in mind is the ephemeral nature of function invocation. Take a traditional Express Node.js application — once the application is started, it will remain running on the machine until shut off. Due to the nature of the applications and technologies we work with, back-end engineers often overlook a common procedure that front-end engineers are more familiar with using: minification and uglification.

Could this be an answer to optimizing the efficiency of our Lambda functions?

Shallow Dive into AWS Lambda — Cold Start vs Warm Start

If you have a rudimentary understanding of AWS Lambda, you probably picture something like what the diagram shows:

Note: I realize that AWS is not using Docker Hub and has a highly customized registry but you get the point.

Here we have a cold start and a warm start. We can think of the cold start as an EC2 instance that needs to download a Docker image from Docker Hub. We can think of the warm start as when the container is already downloaded on an EC2 instance. Note that this is only somewhat correct: what you won’t know from this diagram is that the “container” is not re-initialized.
 

 To test this theory, let’s run a simple Node application:

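var outer = Date.now(); // set once, when the module is first loaded
exports.handler = function(event, context, callback) {
  var inner = Date.now(); // set on every invocation
  console.log('inner: ' + inner);
  console.log('outer: ' + outer);
  callback(null); // signal that the invocation has finished
};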

This is very basic: it records two timestamps at run time and prints them out.
Theoretically, the two values should be equal (or near equal).

Now, let’s run a Docker version of this application by running:

docker run lilnate22/test-lambda

The first time you run this command, it takes some time to execute since you need to download the image from a registry (cold start). We should also see an output where the two variables are equal or near equal.

The next time you run docker run lilnate22/test-lambda it will be near instant (warm start) because it will already be downloaded on your machine. This is pretty basic stuff. But what happens when we do the same thing in AWS?
 
Let’s try it! Copy the above code and create a new blank Node.js 6.10 Lambda function. When you run the newly created function, it should take slightly longer (since AWS has to “download”). You will see in the first initialization that your inner and outer timings will be near equal. But what happens when we do it again?

Wait a minute... why are they different?

The hint lies in the fact that AWS only executes code inside handler functions during warm state runs, i.e. anything inside of exports.handler = function(). Code outside the handler runs only once, during the cold start. So, what does that mean? Suppose we have a Lambda function that gets us the daily lottery numbers:

https://gist.github.com/nfons/c54449e587cd6d970bb4acff24e44f4a

If we write our Lambda function like we wrote our Docker app, each Lambda execution would “fetch” the lottery numbers.

Instead, we should optimize it so that we check whether a dependency is already set and only execute the expensive fetch if it’s empty.

https://gist.github.com/nfons/91bca1483d912d6f1a27ad0f37554d4c

In this way, subsequent calls won’t have to repeat the expensive work.
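As a minimal sketch of that pattern (not the code from the gists above): fetchLotteryNumbers is a hypothetical stand-in for the expensive call, and the numbers it returns are placeholders. The cached value is declared outside the handler, so it should survive for as long as the container stays warm.

```javascript
// Hypothetical stand-in for an expensive operation (e.g. an HTTP call).
// It counts its own calls so we can see how often it really runs.
var fetchCount = 0;
function fetchLotteryNumbers() {
  fetchCount += 1;
  return [4, 8, 15, 16, 23, 42]; // placeholder values, not real data
}

// Declared outside the handler: initialized once per container (cold start)
// and kept in memory across warm invocations.
var cachedNumbers = null;

function handler(event, context, callback) {
  // Only pay for the expensive fetch when the cache is empty.
  if (cachedNumbers === null) {
    cachedNumbers = fetchLotteryNumbers();
  }
  callback(null, cachedNumbers);
}

exports.handler = handler;
```

Invoking the handler twice in the same container triggers only one fetch; the second call is served from cachedNumbers.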

Minify + Uglify = Speed + Cost.

Minification / uglification is a common component of client-side JavaScript development. Front-end engineers are taught the importance of minifying the CSS/HTML that’s delivered to browsers to help improve speed. But when it comes to back-end developers (service developers), minification is often overlooked.

Take this example: if you were to search for “Nodejs minification” on Google, most front-end developers would be confused and ask “why?” That makes sense when you think about traditional Node.js service runtime environments. In these environments you run, say, an Express.js REST app once, and the app stays up taking in requests. Unlike client-side code, which is quasi “ephemeral”, there is really no need to send code over the wire to browsers.

Minification

Minification is a common frontend process of taking code and removing all unnecessary characters without changing code logic.

Suppose we have a code block like this:

[code]
var longvariable = "myname"
var longArray = [1,2,3,4,5,6]
//some comment about functionality perhaps?
for(var index = 0; index < 6; index++){
  console.log(longArray[index]);
}
[/code]

Minifying it would remove all the whitespace, tabs and comments to output a code block such as:

[code]
var index,longvariable="myname",longArray=[1,2,3,4,5,6];
for(index=0;6>index;index++)console.log(longArray[index]);
[/code]

Just in our example, we have taken six lines of code and minified them to two. If we combine this with Webpack (a bundler), which takes multiple dependent files and combines them into one file, we can reduce the total number of lines in our app significantly.

Uglification

Uglification is one extra process that makes code even leaner. Uglification takes all the variables and obfuscates/simplifies them, so our minified example:

var index,longvariable="myname",longArray=[1,2,3,4,5,6];
for(index=0;6>index;index++)console.log(longArray[index]);

will look something like this:

for(var o="myname",a=[1,2,3,4,5,6],e=0;e<6;e++)console.log(a[e]);

Why Minify + Uglify??

A few reasons come to mind as to why you might want to minify and uglify. From my time working as a frontend developer, I knew one of the big benefits of minification/uglification is that it ensures the client browser has the smallest possible files to download. Applying that same principle to Lambda, with its ephemeral code functionality, I was hoping those same benefits could be applied to backend work.

But another thing to keep in mind is that AWS Lambda has a limit of 75GB for all Lambda functions in a region, with a package limit of 50MB each. So, even if our performance gains are not that drastic, perhaps we can eliminate the headache of hitting this 75GB limit.

I have created a sample application to illustrate this. Feel free to clone it here:
 https://github.com/lilnate22/Lambda-Tester/

To run:

npm run build
zip -r package-raw.zip .
zip -r package-ugly.zip bundle.js

If you look at the statistics of the Zip files:

Raw: 13.8MB vs Minified: 1MB. The raw package is more than 13x the size of the minified one!

Example

But how about a real-world example?

At Capital One, we deal with a good amount of data from various vendors and partners. During a quarterly hackathon, we created a Lambda function that would automatically parse this data, and either strip or mask it depending on the applicable data security rules.

We ran this application 100 times in both the warm state and the cold state. Though AWS does not tell users how long the warm state lasts, we made a safe assumption, based on how Lambda works in containers, that if we ran the application again immediately after an execution completed, it would be in the warm state. For the cold state, if we create 100 Lambda functions with an integer appended to the name, we can execute each of those once to calculate the cold state averages.

Data points for Minified and raw entries (in ms)
For a warm state execution, we see about a 40% reduction in execution time. Note: we did see a higher memory footprint though.

I expected cold state executions to be even faster, but to my surprise we only saw about a 30% reduction between minified and raw functions. The values for the cold state raw function were, however, much more volatile and sporadic (high: 1667, low: 903) than those of the minified one (high: 969, low: 737).
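A reduction figure like the ones above can be computed from the average durations of each run. Here is a small sketch of that arithmetic; summarize and percentReduction are hypothetical helper names, and the sample arrays are placeholder values rather than our measurements.

```javascript
// Summarize a list of execution times (in ms).
function summarize(durationsMs) {
  var total = durationsMs.reduce(function (sum, d) { return sum + d; }, 0);
  return {
    mean: total / durationsMs.length,
    high: Math.max.apply(null, durationsMs),
    low: Math.min.apply(null, durationsMs)
  };
}

// Percentage by which the minified mean is lower than the raw mean.
function percentReduction(rawMean, minifiedMean) {
  return ((rawMean - minifiedMean) * 100) / rawMean;
}

// Placeholder samples, NOT the measurements from this article.
var raw = summarize([180, 200, 220]);
var minified = summarize([110, 120, 130]);

console.log(raw);      // { mean: 200, high: 220, low: 180 }
console.log(minified); // { mean: 120, high: 130, low: 110 }
console.log(percentReduction(raw.mean, minified.mean) + '%'); // 40%
```

Tracking the high and low alongside the mean is what makes the volatility of a run visible.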

Results

Now that we’ve shown the speed increase, what does that mean for our hackathon project? We did see an uptick in memory used (as reported by AWS). But to the best of my research, AWS did not charge by MB used (only by the maximum memory allocated).
 
 From a cost perspective, if we were to extrapolate the values, we’d see a total savings for our hackathon project of ~20%.
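To see how a speed-up translates into cost, here is a rough sketch of Lambda's duration billing. It assumes billing by allocated memory and by duration rounded up to the nearest 100 ms, which is how Lambda billed at the time of writing. The function names and the price argument are hypothetical, so plug in the current rate for your region.

```javascript
// Round a duration up to the next billing increment (100 ms assumed).
function billedMs(durationMs, incrementMs) {
  incrementMs = incrementMs || 100;
  return Math.ceil(durationMs / incrementMs) * incrementMs;
}

// Duration cost of one invocation.
// memoryMb is the ALLOCATED memory, not the memory actually used.
// pricePerGbSecond is supplied by the caller (no rate is assumed here).
function invocationCost(durationMs, memoryMb, pricePerGbSecond) {
  var seconds = billedMs(durationMs) / 1000;
  var gigabytes = memoryMb / 1024;
  return seconds * gigabytes * pricePerGbSecond;
}

console.log(billedMs(250)); // 300
console.log(billedMs(150)); // 200
```

Because of the rounding, a given drop in execution time does not necessarily produce the same drop in the bill: going from 250 ms to 150 ms is 40% faster, but the billed time only falls from 300 ms to 200 ms, about 33% less.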

Final Thoughts

AWS Lambda has been a game changer for backend developers. It is already highly optimized for speed, but by borrowing the concepts of minification and uglification from frontend development we can push our functions even further.

As an efficiency freak, I’ve played with this tactic in hackathon projects and personal experiments and find it quite promising. Could this be an answer to further optimizing the efficiency of Lambda functions? Perhaps. The results of using minification and uglification in this way may vary from application to application and project to project. Consider weighing your options carefully when optimizing your own AWS Lambda functions.

DISCLOSURE STATEMENT: These opinions are those of the author. Unless noted otherwise in this post, Capital One is not affiliated with, nor is it endorsed by, any of the companies mentioned. All trademarks and other intellectual property used or displayed are the ownership of their respective owners. This article is © 2018 Capital One.