When I was first learning programming, I was implementing a Markov chain and explained to my mentor that it was taking me more than a week, because I was implementing input sanitization on my interface. "Don't do that," said my mentor. "Your tool has zero users; soon it will have one user: you. Don't fix problems you don't have yet." I'd already learned a lot about how to solve computational problems with code, but this was my first exposure to programming standards. Not a simple 'how do I do it?' but 'what is the best way?' Programming best practices exist to ensure that code is reliable, maintainable, and readable. Usually, your team can agree on best practices, follow them as you write code, and reject any code changes that break those standards. This article is about the times when you have to break best practices for better performance.
Earlier I said that best practices serve multiple purposes for your team. But really all three goals of reliability, maintainability, and readability come down to one central concern: how easy is your code to understand? You might think easy-to-read code is a courtesy to your teammates, but I’ll tell you from hard experience: the most likely person to be completely flummoxed by your unreadable code, a few weeks or months in the future, is you.
In the movie 'Looper,' a desperate Joe tries to make his past self understand the things he'll have to learn in the next 30 years. This is me trying to read the code I wrote in 2022.
Let's talk about the worst advice someone can give you about coding best practices: 'it runs faster like this.'
Not all best practices are created equal
In general, it never makes sense to write hard-to-read code for the purpose of making it perform better. Among other things, when writing for an interpreted runtime like Node, it's hard to know exactly how your code will perform when run. I mention this to clarify that not all best practices are created equal: anyone telling you with absolute certainty that formatting your loop statements differently will run faster in Node likely formed their opinions on a much older version of Node's JavaScript engine, and those beliefs rarely survive benchmarking.
There are some common best practices in Node, e.g. around string handling, where it's easy to benchmark and see real performance improvements. But someone telling you not to declare variables inside a loop "for memory efficiency" is just repeating out-of-date tips they've memorized.
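If you're skeptical of a performance claim, measure it. Here's a minimal benchmark sketch (my own illustration, not Checkly's code) using Node's built-in perf_hooks to compare two common string-building approaches; the exact numbers vary by Node version and hardware, which is the whole point:

// bench-strings.js — a quick, unscientific benchmark sketch.
// Run with: node bench-strings.js
const { performance } = require('node:perf_hooks')

const ITERATIONS = 1_000_000

// Approach 1: repeated string concatenation
let start = performance.now()
let concatenated = ''
for (let i = 0; i < ITERATIONS; i++) {
  concatenated += 'x'
}
console.log(`+= concatenation: ${(performance.now() - start).toFixed(1)} ms`)

// Approach 2: collect parts, join once at the end
start = performance.now()
const parts = []
for (let i = 0; i < ITERATIONS; i++) {
  parts.push('x')
}
const joined = parts.join('')
console.log(`array join:       ${(performance.now() - start).toFixed(1)} ms`)

On recent versions of V8, the "obvious" winner isn't always what older advice predicts, so trust your measurements over folklore.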
Okay, let’s talk about a best practice that everyone agrees on:
Why you should always, ALWAYS require code at the top of your file.
If you require any outside code within the file you're writing, you should require it at the top of the file. Lines like const process = require('node:process') belong at the very top; you don't want to drop them in the middle of your file.
Wait, why are we bringing in all our requirements at the top of a file again? A short list:
- Clarity and Organization: It provides a clear and immediate overview of the external dependencies and modules the file relies on.
- Avoiding Circular Dependencies: By importing at the top, you can more easily identify and manage potential circular dependencies, which can lead to confusing and hard-to-debug errors.
- Consistency: Keeping imports at the top ensures a consistent structure across files and projects. This consistency makes the codebase easier to navigate and maintain, especially in larger projects or when working in teams.
- Namespace Management: Having all imports in one place makes it easier to see which names are brought into the namespace, reducing the risk of name collisions and making it simpler to manage imports.
- Error Handling: If an import fails, the error occurs immediately when the module is first loaded, rather than potentially failing partway through execution (see the sketch after this list).
- Removing requirements when no longer needed: If all your requirements are in one place, it's easier to find the one you want to pull out, e.g. when someone picks a fight with npm.
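That error-handling point is worth making concrete. In this sketch (the modules here are stand-ins, not Checkly's dependencies), a missing top-level dependency fails fast at startup, while a lazy require hides the same failure until the function is first called:

// Top-of-file require: if 'left-pad' isn't installed, the process
// crashes immediately at startup, where the problem is obvious.
const leftPad = require('left-pad')

function formatReport(text) {
  // Lazy require: if 'report-engine' (a hypothetical module) isn't
  // installed, this only throws the first time formatReport() runs,
  // possibly days after deploy, deep inside a request handler.
  const engine = require('report-engine')
  return engine.render(leftPad(text, 8))
}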
So why oh why did we improve performance of our code significantly by breaking this rule?
Catching you up on our optimization journey
If you took a look at my last article about optimizing pod start times, you know this story, so I'll make it quick:
- We moved every Checkly check run to an ephemeral pod
- Pod startup times started to really matter, adding to compute billing linearly with usage
- By standardizing the versions of the AWS SDK we were using, we drastically improved startup time, since we no longer had to fetch many sub-versions of the same SDK
This change significantly improved startup times and reduced our compute spend. Awesome! But there was one last edit that also saved a ton of compute time, and unlike the standardization of SDK versions, it wasn't a best practice.
The ‘second weird trick’ that improved startup times.
Here's the code that saved a bit more startup time:
const process = require('node:process')
const config = {}
// 175 more lines of code, and then the important part
if (!r2.credentials) {
  const { fromIni } = require('@aws-sdk/credential-providers')
  r2.credentials = fromIni({ profile: 'r2_dev' })
}
Okay, so any intern can tell you why this is bad. We're bringing in the AWS SDK credential providers deep in the middle of our file. This breaks the rule mentioned above and introduces almost all the problems raised above: it's surprising to find the code here, which hurts readability. It's inconsistent with the rest of the project. It makes namespace management tougher and creates the possibility that someone will collide with our credentials naming. It makes errors harder to debug. And it'll be harder to find and remove this dependency if we, say, decide to move clouds at a later date.
Okay so why do it? The answer is clear if we include more code context:
// Localstack
if (process.env.AWS_ENV === 'local' || process.env.IS_OFFLINE) {
  const localStackHostname = process.env.LOCALSTACK_HOSTNAME || 'http://127.0.0.1'
  const getLocalStackHostnameWithPort = (port) => `${localStackHostname}:${port}`
  aws.endpoints = {
    SQS: getLocalStackHostnameWithPort(9324),
  }
  const queuePrefix = `${getLocalStackHostnameWithPort(9324)}/queue`
  aws.resultsQueueUrl = `${queuePrefix}/results-dev.fifo`
  aws.testsQueueUrl = `${queuePrefix}/tests-dev.fifo`
  aws.resultsEphemeralQueueUrl = `${queuePrefix}/results-ephemeral-dev.fifo`
  process.env.AWS_REGION = aws.region
  if (!r2.credentials) {
    const { fromIni } = require('@aws-sdk/credential-providers')
    r2.credentials = fromIni({ profile: 'r2_dev' })
  }
}
So this require only runs when a developer is executing the code locally. Previously, to start our pod, we had to import the AWS SDK credential providers every single time, for every single check run.
By only loading this requirement in the (literally) one-in-a-million case that the code is being run locally, we shave that little bit of startup time, and infrastructure billing, off every check run.
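If you want to see what a given require() actually costs at startup, you can time it directly. This is a rough sketch (numbers will vary with your Node version, disk cache, and the module's dependency tree), not Checkly's production tooling:

// measure-require.js — time how long a single require() takes.
// Run with: node measure-require.js
const { performance } = require('node:perf_hooks')

const start = performance.now()
require('@aws-sdk/credential-providers')
const elapsed = performance.now() - start

console.log(`require('@aws-sdk/credential-providers') took ${elapsed.toFixed(1)} ms`)

Multiply that one-time cost by every pod start, and the savings from deferring the load add up quickly.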
Standards matter all the time, except when they don’t
All coding standards are guidelines. And we need to fully understand the reasons for those guidelines so we know when to break them.
Like any rule, best practices have exceptions, and breaking them can lead to tangible benefits. Our journey optimizing pod start times is a testament to this.
Standards in coding are not arbitrary; they are established from accumulated experience to prevent common pitfalls and enhance collaboration. In specific contexts, adhering strictly to these guidelines may not yield the best outcome. Our case, where we deferred loading a module until absolutely necessary, demonstrates a situation where performance gains outweighed the typical benefits of following best practices.
The key takeaway is that understanding the rationale behind coding standards is vital. This understanding allows developers to make informed decisions about when it's appropriate to deviate from these norms. By thoughtfully assessing the trade-offs and context, we can strike a balance between maintaining clean, standardized code and achieving performance efficiencies.
Monitor your code running in the real world, not just with unit tests
At Checkly, we cover your synthetic monitoring. That term isn't commonly used by everyone in development or operations, but it becomes a lot clearer when I say: "it's end-to-end testing run on a cadence." Synthetic monitoring is about finding problems faster by simulating what real users do with your service. With the automation framework Playwright, Checkly is the best tool to find problems before your users do.
And that’s the real lesson of this optimization journey: performance is about more than unit tests or anything else you can simulate pre-release. You need to observe your code running in the real world to really understand its performance.
Checkly is hiring!
Do these problems seem interesting to you? Would you like to work with a team that’s helping thousands of developers run millions of end-to-end tests every day? We’re hiring! Check out our careers page to join the team.