Engineering · 10 min read

Virtual Threads After JEP 491: The Bottleneck Moved

JEP 491 removed the `synchronized` pinning problem that kept virtual threads out of production. The interesting question now isn't whether to enable them — it's which bottleneck shows up next. A field guide for Spring Boot / Kotlin services running on JDK 24+.

For three years, the honest answer to "should we turn on virtual threads?" was "it depends, and probably not yet." The caveat that killed most rollouts was pinning: any virtual thread that entered a `synchronized` block got stuck on its carrier, and since half of the JDBC drivers, logging frameworks, and connection pools in the ecosystem still use `synchronized` somewhere deep, a modest load test could turn your carrier pool into a traffic jam.

JDK 24 shipped JEP 491 and quietly removed that caveat. `synchronized` no longer pins the carrier. The headline objection is gone. Spring Boot 4 is comfortable making `spring.threads.virtual.enabled=true` the sensible default.

Which means the interesting engineering problem moved. It's not "will my driver pin?" anymore. It's "what's the next thing to break once I can spawn a million threads cheaply?" This post walks through where virtual threads actually buy you scalability in a Spring Boot 4 / Kotlin service, where they silently don't, and what to measure before you trust the flag.

The Problem

The pre-JEP-491 mental model was simple: virtual threads are great, except when they pin, and they pin in all the libraries you care about. So we treated them as a curiosity. Services that needed concurrency went reactive — WebFlux, Kotlin coroutines on Dispatchers.IO, Project Reactor pipelines — and we paid the color-of-your-function tax in exchange for not blocking platform threads.

Post-JEP-491, you can delete a lot of that reactive scaffolding and go back to writing straight-line blocking code. Tempting. The trap is assuming the bottleneck was ever the JVM.

It wasn't. The JVM was the most visible ceiling because pinning was easy to spot in a flame graph. Under it sat three other ceilings that nobody was hitting because the JVM hit its first:

  1. Connection pool saturation. A HikariCP pool with 20 connections is still 20 connections. Virtual threads let 10,000 requests queue for those 20 connections instead of crashing your executor — which looks like scalability until you look at p99 latency.
  2. ThreadLocal-heavy libraries. MDC implementations that cache per-thread, ORM session caches, some tracing agents — they all assume threads are expensive and long-lived. Virtual threads are neither.
  3. Blocking native calls. JNI, some crypto paths, some filesystem operations still pin the carrier. Rare, but high-variance: one unexpected pin per request is enough to flatten throughput.
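
Ceiling number one is easy to reproduce without a database. A minimal sketch (plain JDK, no Spring; the `Semaphore` and the numbers are made-up stand-ins for a 20-connection HikariCP pool and a 10ms query):

```kotlin
import java.util.concurrent.Executors
import java.util.concurrent.Semaphore
import java.util.concurrent.atomic.AtomicLong

// Model a fixed-size connection pool as a Semaphore and measure how long
// the unluckiest of `tasks` virtual threads queues for a permit.
fun worstQueueWaitMs(permits: Int, tasks: Int, queryMs: Long): Long {
    val pool = Semaphore(permits)
    val worst = AtomicLong(0)
    Executors.newVirtualThreadPerTaskExecutor().use { exec ->
        (1..tasks).map {
            exec.submit {
                val start = System.nanoTime()
                pool.acquire()                       // queue for a "connection"
                val waited = (System.nanoTime() - start) / 1_000_000
                worst.updateAndGet { maxOf(it, waited) }
                try { Thread.sleep(queryMs) } finally { pool.release() }
            }
        }.forEach { it.get() }                       // every request completes
    }
    return worst.get()
}

fun main() {
    // 2,000 requests against a 20-permit "pool" at 10ms per "query":
    // nothing crashes, but the tail queues for roughly a second.
    println("worst queue wait: ${worstQueueWaitMs(20, 2_000, 10)} ms")
}
```

Throughput is capped at permits divided by latency no matter how many virtual threads you start; the queue just moves somewhere you can see it in p99.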

Turning on virtual threads without knowing which of these is closest is how teams end up reporting that "virtual threads didn't help" or, worse, that they made things worse.

The Approach

The mental model I use now has two layers.

Layer one: virtual threads are a programming-model feature, not a performance feature. They let synchronous, imperative code scale to the number of in-flight requests your downstream dependencies can actually handle. They do not create capacity. If your database can serve 200 queries/sec, virtual threads give you a nicer way to queue for that capacity — nothing more.

Layer two: every service has a next bottleneck. Before turning virtual threads on, you should be able to name it. If you can't, the honest move is to build a load test that finds it, not to flip the flag and hope.

The practical workflow:

  • Decide what concurrency you actually need (requests in flight, not RPS).
  • Identify the downstream constraints — pool sizes, rate limits, upstream capacity.
  • Flip `spring.threads.virtual.enabled=true` in a staging environment.
  • Run a sustained load test with JFR recording.
  • Look for `jdk.VirtualThreadPinned`, pool exhaustion events, and p99 latency drift.
  • Decide whether the remaining ceiling is worth fixing, or whether the current behavior is already acceptable.

The goal is to arrive at a service where the bottleneck is explicit, documented, and owned — not one that happens to run fast today.

Technical Deep Dive

What JEP 491 actually changed

Before JDK 24, entering a `synchronized` block from a virtual thread pinned that thread to its carrier: it could not unmount until the block exited. If the code inside the block blocked on I/O, the carrier was gone. You could run out of carriers (default: one per CPU core) while having millions of idle virtual threads.

JEP 491 reworked the monitor implementation so that a virtual thread blocked inside `synchronized` now unmounts from its carrier the same way it would inside a `ReentrantLock`. The carrier pool stays free. The `jdk.VirtualThreadPinned` JFR event still fires in the handful of cases that genuinely pin (native frames, a few class-initialization edges), but the common case of `synchronized { blockingIO() }` no longer does.
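
A quick way to see the change, as a sketch against plain JDK (no Spring): give each virtual thread its own lock and let it sleep inside `synchronized`. The elapsed time tells you whether the sleepers pinned their carriers.

```kotlin
import java.util.concurrent.Executors

// Each virtual thread sleeps while holding its own monitor. Before JEP 491
// every sleeper pinned a carrier, so with N cores only N could sleep at
// once; on JDK 24+ they unmount and all overlap.
fun elapsedMs(threads: Int, sleepMs: Long): Long {
    val locks = List(threads) { Any() }
    val start = System.nanoTime()
    Executors.newVirtualThreadPerTaskExecutor().use { exec ->
        locks.map { lock ->
            exec.submit {
                synchronized(lock) {      // blocking while holding a monitor
                    Thread.sleep(sleepMs)
                }
            }
        }.forEach { it.get() }
    }
    return (System.nanoTime() - start) / 1_000_000
}

fun main() {
    // JDK 24+: roughly 100ms, because all 200 sleepers overlap.
    // JDK 21-23: roughly 100ms x (200 / cores), because each sleeper
    // pins its carrier for the whole sleep.
    println("200 sleepers took ${elapsedMs(200, 100)} ms")
}
```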

This is the part most teams were waiting for. It's the reason you can now turn the flag on without a manual audit of every transitive dependency.

Enabling virtual threads in Spring Boot 4

One line:

```properties
spring.threads.virtual.enabled=true
```

That switches the Tomcat request executor, the `@Async` executor, and the scheduled-task executor to virtual threads. Your `@RestController` handlers now run on virtual threads by default. Blocking calls inside them are fine — that's the whole point.

What it doesn't do: reconfigure your connection pools, your HTTP client thread pools, or any ExecutorService you constructed yourself. Those are still platform-thread pools. If you want them on virtual threads, build them explicitly:

```kotlin
// An ExecutorService that starts a fresh virtual thread per submitted task
val executor = Executors.newVirtualThreadPerTaskExecutor()
```
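
For executors you constructed yourself, the swap is usually mechanical: change the factory, keep the call sites. A minimal sketch (the doubled-integer task is a made-up stand-in for a blocking call):

```kotlin
import java.util.concurrent.Executors

// Before: Executors.newFixedThreadPool(32), sized to protect the JVM.
// After: one cheap virtual thread per task; back-pressure moves to whatever
// the tasks actually contend on (pools, rate limits), not the executor.
fun doubledSum(n: Int): Int =
    Executors.newVirtualThreadPerTaskExecutor().use { exec ->
        (1..n).map { i ->
            exec.submit<Int> {
                Thread.sleep(10)          // stand-in for blocking I/O
                i * 2
            }
        }.sumOf { it.get() }
    }

fun main() {
    // 1,000 tasks, each "blocking" for 10ms, finish in roughly 10ms of
    // wall-clock time because every task gets its own virtual thread.
    println(doubledSum(1_000))            // prints 1001000
}
```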

The measurement that matters

JFR is the source of truth. Start a recording:

```bash
jcmd <pid> JFR.start name=vt duration=120s \
    settings=profile filename=vt.jfr
```

Then filter for the events that matter:

```bash
jfr print --events jdk.VirtualThreadPinned,jdk.VirtualThreadSubmitFailed vt.jfr
```

`jdk.VirtualThreadPinned` with a non-trivial duration tells you where the remaining pinning happens. `jdk.VirtualThreadSubmitFailed` fires when a virtual thread's task couldn't even be submitted to the scheduler — rare, but when it shows up, the carrier pool is in genuine trouble.

If both are quiet and your throughput still isn't scaling, the bottleneck isn't in the threading model. It's downstream.

A concrete Kotlin example

A thin controller hitting a JDBC repository with a 100ms query:

```kotlin
@RestController
class OrdersController(private val repo: OrderRepository) {
    @GetMapping("/orders/{id}")
    fun get(@PathVariable id: Long): OrderDto =
        repo.findById(id).toDto()
}

@Repository
class OrderRepository(private val jdbc: JdbcTemplate) {
    fun findById(id: Long): Order =
        jdbc.queryForObject(
            // pg_sleep(0.1) stands in for a 100ms query
            "SELECT pg_sleep(0.1), id, total FROM orders WHERE id = ?",
            orderRowMapper, id
        )!!
}
```

With platform threads, Tomcat's default 200-thread executor, and a connection pool large enough not to be the constraint, this tops out around 2,000 req/s: 200 threads × 10 queries/s per thread. Adding threads hurts — context switching and GC pressure go up faster than throughput.

Flip `spring.threads.virtual.enabled=true` and shrink HikariCP to a typical 20 connections, and throughput drops to roughly 200 req/s — 20 connections × 10 queries/s. Worse. Everyone queues cleanly on the pool, but the pool is the ceiling.

Raise HikariCP to 100 connections and you get 1,000 req/s, cleanly. Raise it to 400 and you find out whether your Postgres instance enjoys 400 concurrent connections. (Spoiler: usually no.)

The lesson: virtual threads didn't make the service faster. They made the actual ceiling visible. On platform threads, the JVM was absorbing load by refusing to accept it. On virtual threads, the load gets to the database, and the database tells you the truth.

Sizing pools for virtual threads

The old HikariCP advice — "cores × 2, plus a bit" — was written when threads were expensive and you were protecting the JVM. On virtual threads, you're protecting the database. The math shifts:

  • Measure the active concurrency your DB can actually sustain under realistic queries. This is rarely the number marketing materials suggest.
  • Set the pool to that number. Not higher. A bigger pool doesn't make the database faster; it just lets more queries pile up before they time out.
  • Use `connectionTimeout` as a back-pressure signal, not an error to suppress. If requests are timing out waiting for a connection, the answer is usually to shed load, not to grow the pool.

Little's Law still applies: concurrency = throughput × latency. If you want 1,000 req/s at 100ms each, you need 100 in-flight connections, period. Virtual threads don't change that. They change what the queue looks like above it.
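
The arithmetic from the example above, as a sanity check you can run (the numbers are the illustrative ones from this post, not measurements; integer millisecond latencies keep the math exact):

```kotlin
// Little's Law: in-flight concurrency = throughput x latency.
fun requiredConnections(targetRps: Int, latencyMs: Int): Int =
    targetRps * latencyMs / 1000

fun capacityRps(connections: Int, latencyMs: Int): Int =
    connections * 1000 / latencyMs

fun main() {
    // 1,000 req/s at 100ms per query needs 100 in-flight connections...
    println(requiredConnections(1_000, 100))  // prints 100
    // ...and a 20-connection pool at 100ms caps you at 200 req/s,
    // no matter how many virtual threads queue above it.
    println(capacityRps(20, 100))             // prints 200
}
```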

The Kotlin coroutine overlap

Kotlin coroutines on Dispatchers.IO were the pragmatic answer to "I need concurrency and I don't want reactive." They still work. What's worth thinking about is whether stacking them on virtual threads makes sense.

Dispatchers.IO backed by a platform-thread pool plus virtual threads at the request level: fine. The coroutine dispatcher hands blocking work to a pool, the pool is now running on virtual threads, and you get the scaling.

Dispatchers.IO replaced with a dispatcher backed by a virtual-thread executor: also fine, but now you have two scheduling layers — the coroutine continuation scheduler and the virtual-thread scheduler — and debugging stack traces gets interesting. I'd reach for this only if I had a specific reason, like wanting ThreadLocal behavior across coroutine suspensions, and I'd document why.

Pitfalls & Edge Cases

The synchronized audit trap. Teams sometimes spend a week auditing every synchronized block in their dependency tree before enabling virtual threads. On JDK 24+ that work is mostly wasted. ReentrantLock migrations are still reasonable for code you own, for other reasons (interruptibility, fairness), but they're no longer a prerequisite.

ThreadLocal leaks look different. A ThreadLocal that accumulated values over the lifetime of a platform thread used to leak slowly. On virtual threads, the thread (and every ThreadLocal value attached to it) is created and discarded with each request — which sounds better, until you realize that any ThreadLocal used as a cache has its hit rate collapse. Look for libraries (some tracing agents, some older Hibernate versions) that assumed threads were long-lived. Consider ScopedValue for new code.
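
The cache collapse is easy to reproduce in isolation (a sketch; the `ThreadLocal` here is a made-up stand-in for whatever a library caches per thread): the same 1,000 tasks initialize the cache once on a reused platform thread and 1,000 times when each task gets a fresh virtual thread.

```kotlin
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors
import java.util.concurrent.atomic.AtomicInteger

val inits = AtomicInteger(0)

// A per-thread cache, the shape many libraries use for MDC, sessions, parsers.
val cache: ThreadLocal<String> = ThreadLocal.withInitial {
    inits.incrementAndGet()               // count the "expensive" initializations
    "per-thread state"
}

fun initsFor(tasks: Int, newExecutor: () -> ExecutorService): Int {
    inits.set(0)
    newExecutor().use { exec ->
        (1..tasks).map { exec.submit { cache.get() } }.forEach { it.get() }
    }
    return inits.get()
}

fun main() {
    // One platform thread reused across 1,000 tasks: the cache fills once.
    println(initsFor(1_000) { Executors.newSingleThreadExecutor() })          // prints 1
    // A fresh virtual thread per task: 1,000 fills, a 0% hit rate.
    println(initsFor(1_000) { Executors.newVirtualThreadPerTaskExecutor() })  // prints 1000
}
```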

Debuggers and profilers lag. Tools that assumed thread count correlates with load will give you nonsense. A healthy service might show 50,000 virtual threads during a spike. That's not a leak. Most APM vendors have caught up, but sampling profilers configured with tight thread limits still drop events.

Pinning didn't disappear entirely. JNI frames and class initializers can still pin a carrier, and the `jdk.VirtualThreadPinned` event will tell you when they do. Don't assume JEP 491 means "no pinning ever" — it means no pinning from monitors: both `synchronized` blocks and `Object.wait` now release the carrier.

CPU-bound work doesn't benefit. Virtual threads help when you have many requests waiting on I/O. A request doing heavy computation on one core doesn't care about threading models. If your service is CPU-bound, focus on the algorithm or the number of cores, not the flag.

Structured concurrency is still preview. It's the natural companion to virtual threads and it's genuinely nice. But at the time of writing it's still a preview feature. Plan for API churn if you adopt it now.

Practical Takeaways

  • JEP 491 removes the main reason teams delayed virtual threads. On JDK 24+, spring.threads.virtual.enabled=true is a reasonable default for Spring Boot 4 services.
  • Virtual threads don't create capacity. They expose your next bottleneck — usually a connection pool, sometimes a ThreadLocal assumption, occasionally a native call.
  • Use JFR (jdk.VirtualThreadPinned, jdk.VirtualThreadSubmitFailed) as the source of truth, not blog posts or benchmarks.
  • Resize connection pools based on what your database can actually sustain, not the JVM's old rules of thumb. Little's Law still governs.
  • Don't replace Dispatchers.IO with a virtual-thread-backed dispatcher without a specific reason. Two scheduling layers is debugging overhead you don't need.
  • CPU-bound services are unaffected. Don't promise performance wins you can't deliver.
  • Before flipping the flag in production, run a sustained load test in staging with JFR recording. Find the next ceiling there, not at 3am.

Conclusion

Virtual threads are not a performance feature. They're a programming-model feature that happens to unlock scalability once you remove the ceiling they replaced. JEP 491 finally made the ceiling low enough that the other ceilings become the interesting ones — and those are in your pools, your libraries, and your downstream services, not in the JVM.

Turn it on. Measure what happens. If throughput goes up, you had headroom downstream. If latency gets worse, you just found where your real bottleneck lives — and that's information you wanted anyway.

Use virtual threads when: your service is I/O-bound, your downstream dependencies have more capacity than your current thread model exposes, and you're writing synchronous code that's readable and testable.

Skip them when: you're CPU-bound, your downstream is already the ceiling and you have no way to grow it, or you're on a JDK earlier than 24 and the pinning audit would dominate the effort.

The takeaway isn't that virtual threads are good or bad. It's that "should we enable virtual threads?" has become a measurement question instead of a library-compatibility one. That's a much better place to be.

Written by Tiarê Balbi
