Bun Is Fast, Until Latency Matters for Next.js Workloads
Benchmarks comparing Node.js, Deno and Bun under enterprise load

Executive Summary

As the JavaScript runtime ecosystem expands beyond Node.js, developers now have multiple options for running Next.js in production. These, of course, include more established runtimes like Node.js, newer alternatives such as Bun and Deno, and multi-threaded solutions like Platformatic Watt, which is an application server we built on top of Node.js. This report presents benchmark results comparing these four approaches on AWS EKS under identical conditions.
While evaluating these options and the benchmarks that follow, keep in mind what matters most for your context and use case: latency, consistency, or ease of adoption. There are no one-size-fits-all solutions in software.

All runtimes completed the benchmarks without any errors. You can find the complete methodology we followed below.
Benchmark Methodology
We benchmarked Next.js 15.5 on AWS EKS across four JavaScript runtimes, each allocated six CPU cores. The results should interest any engineer building or maintaining performance-sensitive server-side JavaScript applications.
Three test runs were conducted, rotating the test order, at 1,000 requests per second for 120 seconds each, to reflect the practical demands these runtimes might face under heavy traffic (think of a flash sale in e-commerce).
Infrastructure
All benchmarks ran on AWS EKS (Elastic Kubernetes Service) with the following infrastructure:
EKS Cluster: 4 nodes running m5.2xlarge instances (8 vCPUs, 32GB RAM each)
Region: us-west-2
Load Testing Instance: c7gn.large (2 vCPUs, 4GB RAM, network-optimized)
Load Testing Tool: Grafana k6
Two critical but often overlooked aspects of effective benchmarking are 1) providing clean, reproducible conditions for each test run, and 2) providing a reliable setup others can use to replicate your experiment. This lets researchers and developers verify the results by reproducing them.
To this end, we used shell scripts and the AWS CLI to create on-demand, ephemeral environments for each testing round:

Software Versions
The benchmarks used the following software versions:

All software versions were specified in the Dockerfile to ensure reproducible benchmarks.
Resource Allocation
Each runtime received identical total CPU resources (6 cores) with the following distribution:
Node.js, Bun, and Deno each run as single-threaded processes, so each was distributed across six single-CPU pods. Watt, our multi-threaded application server built on Node.js, was configured with two workers per pod across three two-CPU pods.
In terms of AWS infrastructure costs, these six cores on an m5.2xlarge instance translate to roughly $0.096 per hour. Knowing this cost helps you evaluate how latency improvements affect your budget: a runtime that handles the same load (measured in requests per second) with fewer instances translates directly into savings.
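To make the budget point concrete, here is a rough back-of-the-envelope sketch based on the ~$0.096/hour figure above (actual EC2 pricing varies by region and purchasing model):

```javascript
// Back-of-the-envelope: cost per million requests at a sustained
// 1,000 req/s on ~$0.096/hour worth of m5.2xlarge cores.
const costPerHour = 0.096;      // six cores on an m5.2xlarge (approx.)
const requestsPerSecond = 1000;

const secondsPerMillion = 1_000_000 / requestsPerSecond; // 1,000 seconds
const hoursPerMillion = secondsPerMillion / 3600;
const costPerMillionRequests = costPerHour * hoursPerMillion;

console.log(`~$${costPerMillionRequests.toFixed(4)} per million requests`);
```

If a runtime needed twice the cores to sustain the same load, this figure would roughly double.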
Load Test Configuration
Each runtime was tested with the following k6 configuration:
import http from 'k6/http';
import { check } from 'k6';

export const options = {
  scenarios: {
    constant_arrival_rate: {
      executor: 'constant-arrival-rate',
      duration: '120s',
      rate: 1000,
      timeUnit: '1s',
      preAllocatedVUs: 1000,
      maxVUs: 20000,
    },
  },
};

export default function () {
  const res = http.get(__ENV.TARGET, {
    timeout: '5s',
  });

  check(res, {
    'status is 200': (r) => r.status === 200,
    'response has body': (r) => r.body && r.body.length > 0,
  });
}
This configuration maintained a constant arrival rate of 1,000 requests per second for 120 seconds, resulting in approximately 120,000 requests per test.
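As a side note on how the VU numbers in that configuration were sized: with a constant-arrival-rate executor, the number of concurrent VUs k6 needs is roughly the arrival rate multiplied by the response latency (Little's law). A quick sketch, using latencies in the range we measured:

```javascript
// Little's law: concurrent in-flight requests ≈ arrival rate × latency.
// This is what sizes preAllocatedVUs (and the maxVUs headroom).
const ratePerSecond = 1000;

const vusNeeded = (latencySeconds) => ratePerSecond * latencySeconds;

console.log(vusNeeded(0.02));  // ~20 VUs at a ~20ms average latency
console.log(vusNeeded(0.974)); // ~974 VUs near a ~974ms p99
```

The generous maxVUs ceiling of 20,000 gives k6 room to keep hitting the target arrival rate even if a runtime's latency spikes well beyond its p99.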
Test Protocol
Because our benchmark harness runs on live cloud services, there is some inherent variability in the data we collected. To ensure a fair comparison and boost confidence in our results, we ran the tests multiple times, rotating the order of the services being tested, and warmed up each environment as part of every run.
To start, all four endpoints received a 60-second warm-up, ramping from 10 up to 500 requests per second, to ensure the AWS Network Load Balancer (NLB) was properly scaled. Each runtime also received a 20-second pre-test warm-up to stabilize the environment before its respective test.
Test execution spanned 120 seconds at a constant arrival rate of 1,000 requests per second, providing robust data for analysis. A cooldown period of 480 seconds was implemented between each test to allow the system to return to baseline conditions, further ensuring that subsequent tests commenced without residual impact from prior runs.
Finally, the benchmark was executed in three complete runs with different execution orders, so that any positional bias would show up and each runtime's performance could be assessed fairly.
Test Orders

Runtime Configurations
Node.js: Standard Next.js standalone server
next start
Bun: Next.js with Bun runtime (requires --bun flag to override shebang)
bun run --bun next start
Without the --bun flag, Bun respects the shebang (#!/usr/bin/env node) in the Next.js binary and executes it with Node.js instead. The --bun flag overrides this behavior to use the Bun runtime.
Deno: Next.js via npm compatibility layer
deno run -A npm:next start
Deno runs Next.js via its npm compatibility layer (npm:next), which allows running npm packages in the Deno runtime.
Watt: Platformatic Watt with 2 workers per pod
wattpm start # with WORKERS=2
Watt uses SO_REUSEPORT to distribute connections across multiple Node.js worker threads at the kernel level, eliminating the IPC overhead present in traditional cluster-based approaches. Each worker operates with its own event loop while sharing the same listening socket.
Results

Success Rate
All runtimes achieved a 100% success rate, with zero failed requests across all three runs. Each test processed approximately 120,000 requests at the target rate of 1,000 requests per second.
Observations
Latency Distribution
The runtimes fell into distinct performance tiers based on average latency:
Tier 1 (~11-14ms): Deno and Watt
Tier 2 (~20ms): Node.js
Tier 3 (~246ms): Bun
Consistency Across Runs
Deno demonstrated the most consistent performance across different test positions, with a standard deviation of ±1.19ms. Watt exhibited similar consistency at ±1.03ms, offering low operational risk and high reliability. Node.js displayed moderate variance at ±2.42ms, a level of unpredictability decision-makers should weigh when evaluating stability. Bun's absolute variance was higher at ±4.72ms, but relative to its ~246ms average latency this is a small run-to-run spread: Bun's issue in these tests is the level of its latency, not its variability. Framing these metrics as predictability risk can help managers assess the stability and reliability of deploying a specific runtime.
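One way to make the "relative to its average" point concrete is the coefficient of variation (standard deviation divided by mean). A quick sketch using the figures above; note that the Deno and Watt means are illustrative values picked from the reported ~11-14ms tier, while Node.js and Bun use the reported averages:

```javascript
// Coefficient of variation (CV = stddev / mean) puts run-to-run variance
// on a relative scale. Deno and Watt means are illustrative values from
// the ~11-14ms tier; Node.js and Bun use the reported averages.
const runtimes = [
  { name: 'Deno',    mean: 12,  stddev: 1.19 },
  { name: 'Watt',    mean: 13,  stddev: 1.03 },
  { name: 'Node.js', mean: 20,  stddev: 2.42 },
  { name: 'Bun',     mean: 246, stddev: 4.72 },
];

for (const r of runtimes) {
  const cv = (100 * r.stddev) / r.mean;
  console.log(`${r.name}: CV ~ ${cv.toFixed(1)}%`);
}
```

By this relative measure Bun is actually the steadiest of the four; what sets it apart is how high its latency sits, not how much it wobbles.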
Test Order Impact
Rotating the test order across three runs helped identify whether position affected the results. All runtimes performed consistently regardless of where they fell in the testing order, with the notable exception of Node.js, which performed best when tested last (see "Run 3", above).
Tail Latency (p99)
The p99 latency provides insight into the worst-case user experience:
Deno: 101.27ms average p99
Watt: 114.78ms average p99
Node.js: 173.84ms average p99
Bun: 974ms average p99
Throughput
All runtimes successfully handled the target load of 1,000 requests per second with negligible dropped requests. The slight variations in reported requests per second, ranging from 997.94 to 999.96, are within normal measurement variance.
These results also raise questions for future experiments: for example, which memory-intensive workloads might flip these rankings?
Reproducing these benchmarks
The complete benchmark infrastructure is available at: https://github.com/platformatic/runtimes-benchmarks.
To run the benchmarks:
AWS_PROFILE=<profile-name> ./benchmark.sh
The script creates an ephemeral EKS cluster, deploys all four runtime configurations, executes the load tests, and automatically tears down the infrastructure. Easy as that!
Let us know how this works for you (and perhaps more importantly, if anything doesn’t work for you or if you see results that surprise you…).
Conclusions
The benchmarks showed three distinct performance tiers: Deno and Watt had the lowest average latencies, at approximately 11 to 14 milliseconds; Node.js averaged 20 milliseconds; and Bun exhibited significantly higher latency at approximately 246 milliseconds. (We expect Bun's showing here will surprise many; it surprised us as well.)
All configurations successfully handled the target throughput of 1,000 requests per second, achieving a 100% success rate. These results reflect performance characteristics under the specified test conditions and may vary depending on application workload, infrastructure configuration, and runtime versions. Teams prioritizing sub-15ms latency may shortlist Deno and Watt, with Watt being the natural choice for those who want to stay within the Node.js ecosystem.
What Next?
As we reflect on these results, we’re considering what future direction we’d like to take with our next round of experiments.
Part of our aim in our open source practice is not just to build products, but to build community, and we’d like to hear from you all: what frameworks and scenarios are most relevant to your work today that you think we should investigate next?
Don't be shy: drop us a comment here or on LinkedIn (DMs always open!) about what you'd like to see.






