
We cut Node.js' Memory in half

(so you don't have to)


V8, the C++ engine under the proverbial hood of JavaScript, includes a feature many Node.js developers aren’t familiar with. This feature, pointer compression, is a method for using smaller memory references (pointers) in the JavaScript heap, reducing each pointer from 64 bits to 32 bits. The net is that you wind up using about 50% less memory for the same app, without changing any code. Pretty great, right?

Well, almost. Node.js does not enable pointer compression by default, for two historical reasons.

First, there was the '4 GB cage' limitation, which meant that enabling pointer compression required the entire Node.js process to share a single 4 GB memory space between the main thread and all the worker threads. This was a significant issue. Cloudflare and Igalia partnered to solve it so that the cage could be per-isolate (an individual instance of the V8 engine).

Next, some worried that compressing and decompressing pointers on each heap access would introduce performance overhead. Cloudflare, Igalia, and the Node.js project collaborated to determine exactly what kind of overhead existed and assess whether it would impact real-world applications.

To test this, we created node-caged, a Node.js 25 Docker image with pointer compression turned on, and ran production-level benchmarks on AWS EKS.

In short, we achieved 50% memory savings with only a 2-4% increase in average latency across real-world workloads and reduced P99 latency by 7%. For most teams, this trade-off is an easy choice.

How Pointer Compression Works

Every JavaScript object is stored on V8’s heap. Inside, objects point to each other using 64-bit memory addresses on a 64-bit system. For example, an object like { name: "Alice", age: 30 } has several internal pointers: one to its hidden class (shape), one to where its properties are stored, and one to the string “Alice” on the heap.

As you might imagine, all these pointers can add up in a typical Node.js app, taking up a lot of valuable heap space. On a 64-bit system, each pointer uses 8 bytes, even though most V8 heaps are much smaller than the huge address space they could use.
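You can get a rough feel for this cost by measuring heap growth while allocating many small objects (a quick sketch; the exact bytes per object vary by Node.js version and build):

```javascript
// Rough sketch: estimate the heap cost of a small object by allocating
// a million of them and measuring heap growth. On a pointer-compressed
// build, expect roughly half the per-object cost of a standard build.
const before = process.memoryUsage().heapUsed;

const items = [];
for (let i = 0; i < 1_000_000; i++) {
  items.push({ name: 'Alice', age: 30 });
}

const after = process.memoryUsage().heapUsed;
const perObject = (after - before) / items.length;
console.log(`~${Math.round(perObject)} bytes per object (including the array slot)`);
```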

Pointer compression takes advantage of this. Instead of saving full 64-bit memory addresses, V8 stores 32-bit offsets (relative distances from a fixed starting point, called the base address). When reading from the heap (the section of memory where objects are stored), it rebuilds the full pointer by adding the base and the offset. When writing, it compresses the pointer by subtracting the base from the full address.
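Conceptually, the read and write paths look like this (an illustrative sketch in JavaScript using BigInt arithmetic, with a hypothetical base address; V8 does this in C++ with machine integers):

```javascript
// Illustrative sketch of pointer compression: store a 32-bit offset from a
// fixed base address instead of a full 64-bit pointer. The base here is
// hypothetical; V8 picks the cage's base when the isolate is created.
const base = 0x7f00_0000_0000n;

// Write path: subtract the base so the result fits in 32 bits.
function compress(fullAddress) {
  return Number(fullAddress - base);
}

// Read path: add the base back to rebuild the full 64-bit pointer.
function decompress(offset) {
  return base + BigInt(offset);
}

const address = 0x7f00_0001_2345n; // a hypothetical heap address inside the 4GB cage
const offset = compress(address);  // 0x12345: stored in 4 bytes instead of 8
console.log(decompress(offset) === address); // true: round-trips losslessly
```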

The trade-off is simple:

  • Memory: Each pointer goes from 8 bytes to 4 bytes. For structures with many pointers—such as objects, arrays, closures, Maps, and Sets—this can reduce memory consumption by around 50%

  • CPU: Each heap access now needs one extra addition (for reads) or subtraction (for writes). Each of these operations costs roughly as much as a Level 1 cache hit. Millions of them happen every second, but they are so cheap that the aggregate impact is minimal.

  • Heap limit: 32-bit offsets can only reach 4GB of memory per V8 isolate (a separate instance of the JavaScript engine with its own memory and execution state). For most Node.js services, which usually use less than 1GB, this isn’t a problem.

Chrome has used pointer compression since 2020, but Node.js hasn't. Previously, using this feature required setting a flag (--experimental-enable-pointer-compression) at compile time, which made it feel like an 'expert-only' option. node-caged changes that: enabling pointer compression is now a one-line Docker image swap, which puts the feature within reach of a much broader audience.

What Changed: IsolateGroups

Pointer compression has been part of V8 for years. Node.js didn’t use it before, not because of CPU overhead, but because of the memory cage limitation.

Originally, V8’s pointer compression made every isolate in a process share a single “pointer cage”—a 4GB block of memory for all compressed pointers. This meant the main thread and all worker threads had to fit into the same 4GB. In Chrome, where each tab runs in its own process, this worked fine. But for Node.js, where workers share a process, it was a big problem.

In November 2024, James Snell (Cloudflare, Node.js TSC) started the effort to fix this. Cloudflare sponsored Igalia engineers Andy Wingo and Dmitry Bezhetskov to introduce a new V8 feature, IsolateGroups, which gives each isolate group its own compression cage. (You can read more about this feature and work at https://dbezhetskov.dev/multi-sandboxes/.)

The key change is that multiple IsolateGroups can now exist within a single process, each with its own 4GB cage, eliminating the process-wide memory constraint. Thanks to this work, enabling pointer compression in Node.js changed from one cage shared by every isolate in the process to one cage per IsolateGroup.

In V8, the C++ change is simple. Use v8::Isolate::New(group, ...) instead of v8::Isolate::New(...). Now, each worker thread gets its own 4GB heap. The only limit is the system’s available memory.

Snell’s Node.js integration landed in October 2025: 62 lines across 8 files. This represents less than one commit's worth of changes across most modules, underscoring the update's maintainability. The code was reviewed and approved by Joyee Cheung [Igalia], Michael Zasso [Zakodium], Stephen Belanger [Platformatic], and me [Platformatic]. Cheung also fixed the pointer compression build itself, which had been broken since Node.js 22. I tested with real-world Next.js SSR applications and confirmed a ~50% reduction in heap usage before approving.

This feature still requires a compile-time flag and isn’t in official Node.js builds yet. That’s why we made node-caged.

The Experiment

Two of our four configurations use Platformatic Watt, our open-source Node.js application server. Watt runs multiple Node.js applications as worker threads (separate execution threads) within a single process, using the Linux kernel's 'SO_REUSEPORT' (a system feature that allows multiple processes to listen on the same network port) to distribute connections directly to workers. No master process, no IPC (Inter-Process Communication) coordination. In previous benchmarks, this eliminated the ~30% performance tax imposed by PM2 and the 'cluster' module through IPC-based load balancing.

We set up a Next.js e-commerce app—a trading card marketplace with 10,000 cards, 100,000 listings, server-side rendering, search, and simulated database delays—on a Kubernetes cluster. We tested four setups, all using the same hardware and app code:

Infrastructure: We used AWS EKS with m5.2xlarge nodes (8 vCPUs, 32GB RAM), 6 replicas for plain Node and 3 replicas for Watt (each with 2 workers, for a total of 6 processes). Both images used the same Debian bookworm-slim base and Node.js 25, so the only difference was the use of pointer compression.

Workload: We used k6 with a ramping-arrival-rate executor, running 400 requests per second for 120 seconds after a 60-second ramp-up. The traffic was mixed as follows:

  • 20% homepage (SSR with featured cards, recent listings)

  • 25% search (full-text search with pagination)

  • 20% card detail (individual product page SSR)

  • 15% game category pages

  • 10% games listing

  • 5% sellers listing

  • 5% set detail pages

Each request follows the server-side rendering path. It loads JSON data from disk, applies query filters, renders React components to HTML, and sends the response. We added a simulated 1-5ms database delay to mimic real data access.
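The load profile above can be expressed as a k6 scenario roughly like this (a sketch reconstructed from the description; preAllocatedVUs is an assumed value, and the weighted endpoint mix would live in the script's default function):

```javascript
// k6 scenario sketch matching the described workload:
// 60s ramp-up to 400 req/s, then 120s sustained at 400 req/s.
export const options = {
  scenarios: {
    mixed_traffic: {
      executor: 'ramping-arrival-rate',
      startRate: 0,
      timeUnit: '1s',
      preAllocatedVUs: 200, // assumed; size to keep up with 400 req/s
      stages: [
        { target: 400, duration: '60s' },  // ramp-up
        { target: 400, duration: '120s' }, // sustained load
      ],
    },
  },
};
```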

The Results

Plain Node.js: Standard vs Pointer Compression

The average overhead was 2.5%. That translates to approximately 1 ms additional latency on our 40 ms median latency. This is a minor trade-off for cutting memory use in half. But if you look at p99 and max latency, they’re actually lower with pointer compression. A smaller heap means the garbage collector has less work to do, so there are fewer and shorter GC pauses. In these cases, pointer compression doesn’t just keep up—it performs better.

Platformatic Watt (2 workers): Standard vs Pointer Compression

A similar outcome appears here. Average overhead is slightly higher (4.2%), the median remains unchanged, and maximum latency drops by 20% due to reduced garbage collection pressure.

The Full Picture: Watt + Pointer Compression vs Baseline

This is the comparison that matters for production decisions. What do you get if you adopt both Watt and pointer compression?

On average, it’s 15% faster with no code changes, the kind of gain teams usually chase by rewriting hot paths in a lower-level language such as C++. It also improves p99 latency by 43% and halves memory usage, essentially for free.

Why the Hello-World Benchmarks Were Misleading

Initial tests of pointer compression on a basic Next.js starter app showed a 56% overhead. This outcome was unexpected.

But a simple hello-world SSR page mostly does V8 internal work: compiling templates, diffing the virtual DOM, and joining strings. There’s no I/O, no data loading, and no real app logic. Every operation goes through pointer decompression.

Real applications are different. A typical request spends most of its time on:

  1. I/O wait: database queries, cache lookups, API calls to downstream services

  2. Data marshaling: JSON parsing, response body construction

  3. Framework overhead: routing, middleware chains, header processing

  4. OS/network: TCP handling, TLS, kernel scheduling

The V8 heap access that triggers pointer decompression is only one component of the total request time. As the ratio of “real work” to “pure V8 pointer chasing” increases, the overhead of pointer compression shrinks proportionally.

Our e-commerce app includes simulated database delays of 1-5ms, JSON parsing of datasets with 10,000+ records, search filtering, pagination, and full SSR rendering with React. In that context, the pointer decompression overhead rounds to noise.

The takeaway: always benchmark with realistic workloads, as microbenchmarks can give you the wrong idea. To validate these findings yourself, try pointer compression on your heaviest endpoint and share your results.

The Technical Details: Why GC Gets Better

The improved tail latencies deserve a deeper explanation. V8’s garbage collector (Orinoco) performs several types of collection:

  • Minor GC (Scavenge): Copies live objects from the young generation. Time is proportional to the number of live objects and their size.

  • Major GC (Mark-Sweep-Compact): Marks all reachable objects, sweeps dead ones, and optionally compacts. Time depends on the total heap size and the level of fragmentation.

With pointer compression, every object is smaller. This has domino effects:

  1. Objects fit in fewer cache lines. A compressed object that fits in a single 64-byte cache line instead of two means the GC’s marking phase generates half as many cache misses while traversing the object graph.

  2. The young generation fills more slowly. Smaller objects mean more allocations before a minor GC is triggered. Fewer minor GCs per unit of work.

  3. Major GC has less to scan. A 1GB heap with compressed pointers contains the same logical data as a 2GB heap without. The GC scans half the bytes to process the same application state.

  4. Compaction moves fewer bytes. When the GC compacts the heap to reduce fragmentation, smaller objects mean less data to copy.

The end result is that GC pauses are both shorter and less frequent. This corresponds to what we saw in the p99 and max latency numbers. When a long-tail request lines up with a GC pause, the pause is now shorter.

What This Means for Your Business

Cut Your Kubernetes Bill

If you run Node.js on Kubernetes with 2GB memory limits per pod, pointer compression lets you cut that to 1GB. You get the same app and performance, but can run twice as many pods per node or use half as many nodes. Work out what halving pod memory would do to your own cluster bill.

A 6-node m5.2xlarge EKS cluster (at $0.384 per hour per node) costs about $16,600 a year. Dropping to 3 nodes saves $8,300 a year. In a real production fleet with 50 or more nodes, the savings can reach $80,000 to $100,000 a year, all without changing your code.

For platform teams running hundreds of Node.js microservices, these savings add up. Each service has a baseline memory load from the V8 heap, framework, and modules. Pointer compression reduces the baseline across all services simultaneously.

Double Your Tenant Density

Multi-tenant SaaS platforms, where each tenant runs in an isolated Node.js process, hit memory as the binding constraint for density. If each tenant’s worker uses 512 MB, pointer compression reduces it to ~256 MB. That’s 2x tenants per host.

At scale, this changes your costs. If each tenant costs $5 per month for infrastructure and you have 10,000 tenants, cutting memory in half saves $25,000 a month, or $300,000 a year.

Unlock Edge Deployment

Edge runtimes like Lambda@Edge, Cloudflare Workers, and Deno Deploy have strict memory limits, typically 128MB to 512MB per isolate. Cloudflare sponsored the IsolateGroups work in V8 because their Workers runtime needed pointer compression to support more isolates. Pointer compression can be the difference between your app running at the edge or needing to go back to the origin server.

That matters for revenue. Every 100ms of latency measurably reduces conversion rates. An e-commerce site moving SSR to the edge shaves 50-200ms off TTFB, depending on user location. For a $50M/year business, that latency improvement can translate to hundreds of thousands in incremental annual revenue.

Handle More Concurrent Connections

For WebSocket-based applications (chat, collaboration, live dashboards, gaming), each persistent connection holds state in memory. A server handling 50,000 connections at ~10KB heap per connection uses 500MB. With pointer compression, that drops to ~250MB, allowing the same server to handle 100,000 connections, or halving your WebSocket server fleet.

Compatibility Constraints

There is one strict limit: 32-bit compressed pointers can only address 4GB, so each V8 IsolateGroup’s pointer cage is capped at 4GB. With IsolateGroups, this limit applies to each isolate, not the whole process. Your main thread gets 4GB, each worker thread gets 4GB, and the total is only limited by your system’s memory.

For most Node.js services, 4GB per isolate is irrelevant. The vast majority of production processes run well under 1GB of heap. If your service genuinely requires more than 4GB of heap per isolate (e.g., large ML model inference, massive in-memory caches, or heavy ETL pipelines), pointer compression is not an option. Note that only the V8 JavaScript heap lives inside the cage; native add-on allocations and ArrayBuffer backing stores do not count against the 4GB limit.

There is one more compatibility constraint: native addons built against the legacy NAN (Native Abstractions for Node.js) package won't work with pointer compression enabled. NAN exposes V8 internals directly, and pointer compression changes the internal representation of V8 objects. When you recompile, the ABI is different. Addons built on [Node-API](https://nodejs.org/api/n-api.html) (formerly N-API) are unaffected because Node-API abstracts away V8's pointer layout entirely. The most popular native packages have already migrated: sharp, bcrypt, canvas, sqlite3, leveldown, bufferutil, and utf-8-validate all use Node-API today. The main holdout is nodegit, which still depends on NAN. If you're unsure, check your dependency tree with npm ls nan. If nothing shows up, you're good.

For everyone else—which is most Node.js deployments—there’s nothing to lose.

Try It

It’s a drop-in replacement. You don’t need to change any code.

# Before
FROM node:25-bookworm-slim

# After
FROM platformatic/node-caged:25-slim

The platformatic/node-caged image is built from the Node.js v25.x branch with --experimental-enable-pointer-compression. It’s the same Node.js, same APIs, and everything else—just with smaller heaps.
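To confirm at runtime that the image you pulled actually has pointer compression compiled in, the configure-time flag is mirrored in process.config (a quick check; the variable name follows Node's build configuration):

```javascript
// Print whether this Node.js binary was built with pointer compression.
// process.config.variables mirrors the configure-time build settings.
const pc = process.config.variables.v8_enable_pointer_compression;
console.log(pc === 1 ? 'pointer compression: enabled' : 'pointer compression: disabled');
```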

Available tags: latest, slim, 25, 25-slim.

Start by testing in staging. Watch your memory usage go down. Make sure your p99 latency stays within your SLO. Then deploy it.

As always, we want to hear from you! Share your results and experience by dropping us a note at hello@platformatic.dev or by engaging on social media if you’d like to chat about anything you’re building.


Benchmarks were run on AWS EKS (m5.2xlarge nodes, us-west-2) using k6 with ramping-arrival-rate at 400 req/s sustained. The application is a Next.js 16 e-commerce marketplace with server-side rendering and a JSON-based data layer. Full benchmark infrastructure and results are available in the node-caged repository. The upstream V8 IsolateGroups feature was implemented by Igalia, sponsored by Cloudflare. Node.js integration by James Snell, with build fixes by Joyee Cheung. See the tracking issue for the full history.