Everyone needs observability. If you have a service that users rely on, you need to know if it’s working for your users. Heartbeat monitoring (also called synthetic monitoring) should be a way to truly know how well our site is performing for users. With scheduled checks from geographically distributed locations, it’s the final word in ‘are we up or down?’ Even better, sophisticated synthetics built with tools like Playwright give us real-world measurements of how every part of our site performs for real users.
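To make the ‘are we up or down?’ question concrete, here is a minimal sketch of how results from several check locations might be rolled up into a single verdict. The `LocationResult` type, the `summarizeHeartbeat` function, the location names, and the three-state verdict are all assumptions made for illustration; this is not how Checkly or New Relic actually aggregates results.

```typescript
// Hypothetical shape of one scheduled check result from one location.
type LocationResult = {
  location: string;   // illustrative region label, e.g. "sa-east"
  ok: boolean;        // did the check pass from this location?
  responseMs: number; // measured response time in milliseconds
};

type Verdict = "up" | "degraded" | "down";

// Roll per-location results into one verdict:
// every location passing means "up", every location failing means "down",
// and anything in between means "degraded" (likely a regional problem).
function summarizeHeartbeat(results: LocationResult[]): Verdict {
  if (results.length === 0) {
    throw new Error("at least one location result is required");
  }
  const failures = results.filter((r) => !r.ok).length;
  if (failures === 0) return "up";
  if (failures === results.length) return "down";
  return "degraded";
}

// Illustrative data only: one region failing while the others pass.
const sample: LocationResult[] = [
  { location: "us-east", ok: true, responseMs: 180 },
  { location: "eu-west", ok: true, responseMs: 210 },
  { location: "sa-east", ok: false, responseMs: 5000 },
];

console.log(summarizeHeartbeat(sample)); // "degraded"
```

A regional outage, like the LATAM incident described later in this article, would show up here as "degraded" rather than "down", which is exactly the kind of signal that single-location checks miss.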
Large APM SaaS companies like New Relic have long offered a version of these heartbeat monitors, but this article describes how the pricing and features of New Relic’s Synthetics make it a poor solution for your team.
Expertise drives test coverage
No matter your tools, it’s your team’s understanding of your system that drives observability. I’ll take an example from my own career. During one of my first on-call shifts on a new team, I got woken up at 2 AM by an outage affecting all of LATAM. I checked our logs, and things looked normal at first glance, but when I looked into our APM dashboard, I found a very concerning pattern: at about 10,000 logins per hour (roughly 170 per minute), we were seeing pretty normal traffic, but the calls to the userLogin method were much, much higher. There were over 2,000 invocations of userLogin per minute, as if every single login was calling the method more than 10 times. I didn’t normally spend time in our APM dashboard, but this looked very wrong. I checked our history for the last few weeks, and userLogin was consistently getting called far more often than seemed right. I spent 20 minutes hunting down a possible cause, until the issue escalated and a more senior engineer got online. “Oh, that’s normal,” she explained. “That method gets called in a loop for every organization the user is in, so it’s always way higher than total logins.”
Our observability tooling was working fine, but without the direct experience of what constituted ‘normal’ for our system I wasn’t able to help during a crisis. With New Relic, it becomes standard that only a chosen few within your organization have access to your observability data. That means more incidents where no one understands what their observability tool is telling them, as they wait for the few ‘experts’ to analyze the incident.
New Relic cuts your team off from their observability data
With per-user pricing and limited sophistication in testing, New Relic’s synthetics are a barrier to improving observability across your whole team. New Relic has per-seat licensing for most of its core features, including APM and Synthetics. That means every additional person who’s able to see your heartbeat monitors, let alone edit them, will cost an extra $49 a month. This is on top of the actual infrastructure costs of running Synthetics checks, which can be more than double the cost of running the same checks from Checkly.
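To see how per-seat licensing adds up, here is a rough sketch of the arithmetic. It uses only the $49 per seat per month figure quoted above; the `annualSeatCost` helper and the team size of 25 are hypothetical, and the infrastructure cost of actually running checks is deliberately left out.

```typescript
// Per-seat price quoted in this article, in US dollars per month.
const SEAT_PRICE_PER_MONTH_USD = 49;

// Hypothetical helper: yearly licensing cost for a given number of seats.
// Excludes the cost of running the synthetic checks themselves.
function annualSeatCost(
  seats: number,
  pricePerSeatPerMonth: number = SEAT_PRICE_PER_MONTH_USD,
): number {
  if (!Number.isInteger(seats) || seats < 0) {
    throw new RangeError("seats must be a non-negative integer");
  }
  return seats * pricePerSeatPerMonth * 12;
}

console.log(annualSeatCost(1));  // 588
console.log(annualSeatCost(25)); // 14700
```

At that rate a single seat comes to just under $600 a year, and giving a hypothetical 25-person group read access would cost $14,700 a year before a single check has run.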
It’s also worth noting how hard it is to get a clear answer from New Relic about what synthetic monitoring will cost: if you look at their synthetics pricing page, I guarantee that you will not be able to answer ‘What will 12k browser sessions per month cost me?’
Checkly offers clear, up-front pricing for synthetics monitoring. Even better, Checkly has no per-seat licensing so your synthetics data can be shared far and wide within your organization.
How does this affect observability in your organization? First, as in the example above, where human knowledge was part of what made our system observable, anyone who has never seen what normal synthetics results look like won’t be able to easily interpret those dashboards when something goes wrong. This will add to your Mean Time to Resolution (MTTR) during outages. But the real problem is more pernicious than that: limiting who can see your monitoring data encourages over-specialization.
Uptime is everyone’s problem
I won’t quote, again, the universal statistics about how terribly expensive downtime is for an organization. Essentially, a technology company with significant outages can’t really grow or succeed. BlackBerry proved that in 2011.
So we can agree that uptime is something that every engineer at a company should care about, but it goes further than that, since every other part of your team relies on consistent uptime. Even very brief hiccups, poorly timed, can affect sales, marketing, and professional services. I know of more than one team that has used synthetic monitoring to create a status page for professional services teams, with that uptime information forwarded to clients.
This is a real problem with New Relic’s per-seat pricing! While it may be an ongoing conversation whether a particular engineer, not focused on front-end, should have access to all New Relic dashboards, it’s very unlikely that anyone is willing to pay $600 a year so that professional services, sales architects, and marketing people will get that access.
Checkly dashboards are often adopted first by DevOps or SRE teams, but synthetics information quickly wants to spread from there. Check out this quote from Joe Wright, Lead Automation Test Analyst at Drivvn:
“Bringing complex monitoring information together into clear, easily-understood dashboards that we can share throughout the company regardless of what role people had was a huge plus.”
When uptime becomes an obscure topic in your organization, it increases the siloing of information, and hurts trust and collaboration between customer-facing and product groups. So along with failing to share our uptime information, we’re also starting to break our agile methodologies. With Checkly, synthetics data is never siloed, and our dashboards can be shared with the right people in your organization.
If not everyone is writing tests, your observability just fell off a cliff
While the primary problem with per-user pricing at New Relic is the siloing of performance data to just engineering teams, a further problem appears at the margin: it’s very tempting to let fewer and fewer engineers have access to New Relic.
At first, this makes sense: only SREs and Operations people really need to write tests and monitor the site, right? Front-end teams can do acceptance testing before deploy, and QA can run end-to-end tests, so they don’t need to work on synthetics monitoring, right? The problem, on top of the general issue with siloing, is that we reduce the expertise that goes into our automated checks of our site. The site, in the end, becomes a black box, because the people responsible for testing a site or API are not the same people who engineered it.
Another significant issue is the inability of an SRE team to share results from the New Relic dashboard. When engineers are forced to attach screenshots to a bug tracker instead of sharing the dashboard itself, this siloing can hurt your developer velocity.
Checkly monitoring is the last word in uptime
Synthetics monitoring is essential for ensuring the reliability and performance of your applications, regardless of your team's size or project scale. It involves simulating user actions, testing behavior, and validating inputs to understand your application's health comprehensively.
While New Relic’s pricing for site checks may seem attractive, its per-seat pricing means that the knowledge of how your site really works for users will be limited to only a small team. Synthetics monitoring is not optional; it’s vital for optimal application performance. To enhance observability and provide a better user experience, consider cost-effective solutions and robust testing strategies.
Join the Checkly Slack today to discuss how to proactively address issues and improve user experiences.