Challenges & Solutions for Migrating Java Apps to Reactive

What I learned by reimplementing a Java Spring Boot application using Reactive WebFlux


My team at Capital One has periodic Innovation Sprints, which are two-week periods where we can work on something outside our normal requirements, such as a new technology or tool. There has been some recent buzz around the Reactive Streams API as a compelling strategy for non-blocking IO in web applications, and I was interested in taking a look into it. So a fellow coworker, Kristian Langholm, and I decided to investigate the Reactive paradigm by migrating one of our team’s Spring Boot applications to use WebFlux, which is one implementation of Reactive for Spring applications. For our test subject, we decided on a small API which processes requests containing information about customer transactions and returns decisions on whether the transaction should be allowed or blocked.

While both Kristian and I had done some research into WebFlux, reading the docs and several articles explaining how it is used (I found this one and this one particularly helpful), we still faced several challenges throughout the process of reimplementing our application. Some of these were more abstract in nature (How do you write code “the Reactive way?” Is Reactive even the right choice for this application?) and some were more technical and specific (How do you handle errors in a non-blocking way? What is the difference between map, flatMap, and flatMapMany?). By the end of the process, we had come up with answers to many of these questions and thought it might be helpful to share our findings with others interested in the process of implementing an application using Reactive.

NOTE - in this article, I will use the broad term “Reactive” to refer to the Reactive Streams API going forward, but you can also mentally replace it with your Reactive framework of choice.

Challenge 0: How to think about reactive vs. imperative programming

One of the first challenges of working with Reactive is the mental shift in how to think about the code and how it works. The way I like to think of it is “one worker per request” versus “one worker per subtask”: an imperative thread handles an entire request, while reactive threads each handle only a small portion of a request.

Let’s consider an analogy: suppose you own a custom furniture factory and employ a workforce to help assemble the furniture. When a request comes in, you dispatch one of the workers, give them the request, and they perform all the tasks to build the furniture to the specification provided - collecting the wood, cutting the pieces, attaching them together, painting and staining, and finally packaging the finished piece for delivery. This is roughly how imperative web applications work. One worker thread is dispatched for each request, and it handles every part of the request. If there are any portions of the process which cause some delay (e.g. waiting for the paint to dry), the worker waits for this delay to end before proceeding.

In the reactive paradigm, things are quite different. Suppose you have a pool of workers that wait in the break room to be assigned tasks; this collection of workers is analogous to the Executor Thread Pool in Reactive (more details on the Executor Thread Pool and the Reactive Thread Model can be found in this helpful article). At each stage in the construction of the furniture, any available worker can be requested to complete the task. Once the worker is done with that part of the task, they can go back to the break room and wait for another task. Likewise, if a portion of the task requires waiting (e.g. for the paint to dry), the worker can also go back to the break room. Then once the paint is dry, someone else (or perhaps the same worker) will be dispatched for the next step. This particular quality of the reactive furniture factory is analogous to non-blocking IO. Rather than having a worker waste time by waiting for the paint to dry, the worker is free to go work on something else. The workers in this factory spend far less time waiting and more time on actual tasks, and similarly, in the reactive paradigm, threads spend less time in an idle state.
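The break-room idea can be sketched with plain JDK classes. In this hypothetical example (all names are made up for illustration), `CompletableFuture.delayedExecutor` plays the role of waiting for the paint to dry: no thread sits blocked during the delay, and a pool thread only picks up the next step once it is ready.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class BreakRoomDemo {
    // Hypothetical "paint drying" step: completes later without holding a worker thread.
    static CompletableFuture<String> waitForPaintToDry(String piece) {
        // delayedExecutor schedules the continuation after the delay;
        // no thread sits blocked in the meantime.
        return CompletableFuture.supplyAsync(
                () -> piece + " (paint dry)",
                CompletableFuture.delayedExecutor(100, TimeUnit.MILLISECONDS));
    }

    public static void main(String[] args) {
        CompletableFuture<String> result =
                waitForPaintToDry("chair")
                        .thenApply(piece -> piece + ", packaged"); // next step runs when ready

        // The calling thread is free to do other work while the "paint dries".
        System.out.println("Main thread is free to do other work here");
        System.out.println(result.join());
    }
}
```

Reactor applies the same principle, but for entire request-handling pipelines rather than individual futures.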

woodworker using sliding mitre saw

Source Wikipedia (https://commons.wikimedia.org/w/index.php?title=File:Woodworker_using_sliding_mitre_saw.jpg&oldid=522132852). This file is licensed under the Creative Commons Attribution 2.0 Generic license (https://en.wikipedia.org/wiki/Creative_Commons).

Challenge 1: Should we even be using reactive?

So why would you want to use the reactive paradigm in the first place? The main advantages of Reactive Java are the more efficient usage of threads:

  • Fewer total threads in use as the number of simultaneous requests grows.
  • Less time spent switching between threads.

Between these two advantages, the former is the more significant in most cases. The overhead of switching between threads is small on modern systems (on the order of microseconds). Unless you have APIs with very tight SLAs, this is probably not a huge concern.

The advantage of using fewer threads, however, can be significant. Threads use memory for context, and using fewer threads should directly correspond to more efficient use of memory, which can be a concern when you are in a memory-constrained environment. For this reason, in some cases, switching to Reactive can significantly reduce your memory consumption.
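As a rough back-of-envelope illustration (assuming roughly 1 MB of stack reserved per platform thread, a common HotSpot default; your JVM's `-Xss` setting may differ, and the thread counts below are made up):

```java
public class ThreadMemoryEstimate {
    // Assumed stack reservation per platform thread; check your JVM's actual -Xss value.
    static final long STACK_BYTES = 1L * 1024 * 1024;

    static long estimateStackMb(long threads) {
        return threads * STACK_BYTES / (1024 * 1024);
    }

    public static void main(String[] args) {
        // Thread-per-request model: one thread per concurrent request.
        System.out.println("Imperative, 500 concurrent requests: ~"
                + estimateStackMb(500) + " MB of stacks");
        // Reactive event-loop model: a small, fixed pool (often ~CPU core count).
        System.out.println("Reactive, 8 event-loop threads:      ~"
                + estimateStackMb(8) + " MB of stacks");
    }
}
```

Stack reservations are only part of a thread's memory footprint, but the scaling difference is the point: one model grows with load, the other stays flat.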

Comparison of strong and poor candidates for Reactive

Strong candidates for Reactive:

  • Few library dependencies, or all libraries have reactive versions that can be migrated to.
  • New projects (no migration needed!) or smaller, microservice-style projects.
  • Lots of downstream calls to database APIs with high response times.
  • Using a non-blocking database like MongoDB, Cassandra, or DynamoDB.
  • Application serves many (at least dozens of) requests simultaneously.
  • Application has high memory utilization.

Poor candidates for Reactive:

  • Custom library dependencies that are non-reactive.
  • Large, monolithic projects.
  • No calls to downstream services, or downstream services have very low response times.
  • Using a blocking database.
  • Application generally only serves one request at a time.
  • Application has low memory utilization, or is not constrained by memory utilization.

To summarize, switching to Reactive will probably not improve your speed performance, but it may benefit your memory performance if you are handling many requests simultaneously with asynchronous calls.

Challenge 2: What are the similarities/differences between Java Streaming API, Java Futures, and Reactive?

If you are familiar with the Java Streaming API or Java Futures, then you already know concepts analogous to those in the world of Reactive programming. We will begin by looking at how Java Streaming and Java Futures resemble Reactive, and then discuss one particularly confusing case which highlights some differences.

Java Streaming API

The Java Streaming API allows you to apply functions to a stream of elements. Here is one simple example:

    List<String> uppercaseStringList = Stream.of("alpha", "bravo", "charlie")
            .map(String::toUpperCase)
            .collect(Collectors.toList());
  

In this example, we create a stream of the Strings “alpha”, “bravo”, and “charlie”, apply the uppercase function to each of them, and collect them in a list. Reactive also has a concept of Streams, and they are quite similar: a collection of elements that you can apply functions to. The key difference is that Java Streaming API streams are for blocking/synchronous elements, and Reactive Java streams are for non-blocking/asynchronous elements.

Java Futures

Java Futures can extend the similarities between imperative streams and Reactive streams a bit further. Consider this example:

    List<Response> results = Stream.of(
            CompletableFuture.supplyAsync(() -> downstreamCall()),
            CompletableFuture.supplyAsync(() -> downstreamCall2()))
        .map(future -> {
            try {
                return future.get(); // blocks until this future's value is available
            } catch (Exception e) {
                e.printStackTrace();
            }
            return null;
        })
        .collect(Collectors.toList());
  

Suppose we have multiple downstream calls (such as calls to a database or API) and we want those calls to happen simultaneously. In this example, we create a stream of those Futures, which will be filled with the response from the downstream call when it is available. Then we “unwrap” those values in the future.get() call and collect those to a list of the actual objects we want.

It is often helpful to think of Reactive streams as a Java Streaming API Stream containing Futures. In other programming languages, these are known as “promises.” Essentially, you have a stream of items that are not yet present, and once they are present, you can execute some logic on them.

Where the analogy with a Java Stream of Futures ends...

Now that we have pieced together how Reactive Streams are similar to Java Streaming API Streams of Future objects, let’s see where they differ. First, some good news. It is actually much easier to work with Reactive Streams and perform operations on them than it would be to work with Java Streaming API streams of Futures. We have lots of convenient functions that allow us to “unwrap” the Futures in our Reactive streams, in addition to sending deferred errors or even empty future values. So basically, there is more functionality for working with asynchronous data for Reactive Streams, and the interface is simpler.

But there is a bit of bad news. Java Streaming API Streams and Reactive Streams have some similarly named functions… but they do different things. The next section tries to clarify the most confusing example of this, regarding the map, flatMap, and flatMapMany functions.

Confusion around map, flatMap, and flatMapMany

When I first encountered the WebFlux flatMap operator, I had some expectations of what it might do. It turns out, these expectations were incorrect.

I have used flatMap before in the context of Java Streams. In that context, flatMap takes a stream of elements and returns a new stream with zero or more output elements per source element. So, for example, if you have a stream of integers, and you want to return a stream of the original integers and their values multiplied by two, you could write something like this:

    List<Integer> result = Stream.of(1, 2, 3)
            .flatMap(i -> Stream.of(i, i * 2))
            .collect(Collectors.toList());
  

Notice that in the flatMap() function we return a new stream containing the element and the element multiplied by two; after collecting, the output list contains the values [1, 2, 2, 4, 3, 6]. flatMap takes care of combining all the streams created into one new stream.

So, I expected WebFlux flatMap to take a “stream” of “promises” (a Mono) and return a stream of promise streams (a Flux). But that’s not what it does. flatMap actually returns a stream with each synchronous element transformed into an asynchronous element. In the example above using Java Streams, the operation applied to each element is just multiplying it by two, which is synchronous. But suppose you had a list of elements and you wanted to apply a blocking operation to each of them. For example, you have an employee ID and you want to call a database to get the employee name for each ID. Then you might write something like this:

    Flux.just(123453, 123454, 123455)
      .flatMap(employeeId -> getEmployeeName(employeeId)); // assume getEmployeeName returns a Mono or Flux
  

In this case, instead of returning multiple output elements for each input element, we return an asynchronous output element for each input element. In this way, we have converted our inputs to asynchronous outputs. With this example in mind, let’s dive a bit deeper into the differences between Map, flatMap, and flatMapMany in both Java Streams and Reactive, and then we will revisit this example in the context of all the options available between Java Streams and Reactive Streams.

Comparing map, flatMap, and flatMapMany in Java Streams, Mono, and Flux

In the breakdown below, SE stands for synchronous element (as opposed to a Mono or Flux, which contain asynchronous elements).

Java Streams:

  • map: Accepts a function SE -> SE which takes a synchronous element and returns a single synchronous element. Returns a Java Stream.
  • flatMap: Accepts a function SE -> Stream<SE> which takes a synchronous element and returns zero or more synchronous elements. Returns a Java Stream.
  • flatMapMany: (Doesn’t exist)

Reactor Mono:

  • map: Accepts a function SE -> SE which takes a synchronous element and returns a single synchronous element. Returns a Mono.
  • flatMap: Accepts a function SE -> Mono which takes a synchronous element and returns a single asynchronous element. Returns a Mono.
  • flatMapMany: Accepts a function SE -> Publisher which takes a synchronous element and returns a Publisher (essentially a new reactive stream). Returns a Flux which is the combination of all the streams generated by the function.

Reactor Flux:

  • map: Accepts a function SE -> SE which takes a synchronous element and returns a single synchronous element. Returns a Flux.
  • flatMap: Accepts a function SE -> Publisher which takes a synchronous element and returns a Publisher (essentially a new reactive stream). Returns a Flux which is the combination of all the streams generated by the function.
  • flatMapMany: (Doesn’t exist)

Note that the difference between map and flatMap in the Java Streams API is not the same as the difference between map and flatMap in the Reactive Streams API. This is what I originally found confusing.

Map vs. Flat vs. Many - the meaning depends on the context

Java Streams:

  • “Map”: We are going to apply this function to a “raw,” unwrapped, synchronous value.
  • “Flat”: We are going to apply a function which returns zero or more output elements per input element.
  • “Many”: (Doesn’t exist)

Reactive Java:

  • “Map”: We are going to apply this function to a “raw,” unwrapped, synchronous value.
  • “Flat”: We are going to apply a function which returns an asynchronous publisher (either a Mono or a Flux).
  • “Many”: We are going to return zero or more output elements per input element.

The key thing to understand is that the “flat” in “flatMap” for Java Streams means that you are “flattening” the many streams created by the function into one. The “flat” in “flatMap” for Reactive Streams, however, refers to flattening nested asynchronous types: because the function returns a Mono or Flux, flatMap collapses what would otherwise be a Mono<Mono<T>> into a plain Mono<T>. The reactive “flat” is really about converting synchronous elements to asynchronous ones rather than about producing multiple elements per input, so the shared name is, in my humble opinion, confusing.

Let’s use an example to help fully illustrate which Map command to use in different situations.

  • Lowercasing the employee name: For each employee, we return the employee name in lowercase. This has no downstream/blocking calls. In this case, assuming we have a list of employee names, we should use the Java Streams Map function to convert the list of employee names to lowercase values. No Reactive needed in this case!
  • Getting the employee name from their Employee ID (Database Call): For each employee, we look up the employee in a database and get their corresponding employee name. In this case, if we had one employee, we would use Reactive Streams Mono flatMap because we want to convert our Employee IDs to an asynchronous Mono with the employee name. If we had multiple Employee IDs in a Flux, we would also use flatMap, and the output would be a Flux as well.
  • Getting the employee’s teammate’s names by Employee ID (Database Call): For each employee, we look up the employee’s teammates in a database. If we had a single Employee ID in a Mono, we would use flatMapMany because we want to get a Flux (multiple “promises” of teammate names). If we had multiple Employee IDs in a Flux, we would use flatMap to get the “promises” of teammate names.
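If it helps to have a JDK reference point, the same map-versus-flatMap distinction exists in CompletableFuture: thenApply corresponds to map (a synchronous function applied to the eventual value) and thenCompose corresponds to flatMap (the function itself returns another asynchronous value). A small sketch, with `getEmployeeName` as a made-up stand-in for a real database call:

```java
import java.util.concurrent.CompletableFuture;

public class ComposeDemo {
    // Stand-in for an asynchronous database lookup.
    static CompletableFuture<String> getEmployeeName(int employeeId) {
        return CompletableFuture.supplyAsync(() -> "employee-" + employeeId);
    }

    public static void main(String[] args) {
        CompletableFuture<Integer> idFuture = CompletableFuture.completedFuture(123453);

        // "map": synchronous transformation of the eventual value.
        CompletableFuture<Integer> doubled = idFuture.thenApply(id -> id * 2);

        // "flatMap": the function itself returns an async value; thenCompose
        // flattens CompletableFuture<CompletableFuture<String>> into
        // CompletableFuture<String>, just as Mono.flatMap avoids Mono<Mono<T>>.
        CompletableFuture<String> name = idFuture.thenCompose(ComposeDemo::getEmployeeName);

        System.out.println(doubled.join()); // 246906
        System.out.println(name.join());    // employee-123453
    }
}
```

If you used thenApply with an async-returning function, you would get a nested future, which is exactly the mistake flatMap exists to prevent in Reactor.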

Challenge 3: What is the best way to rewrite my app using Reactive?

If you have an existing imperative application, rewriting everything in reactive can seem daunting. The good news is, there are a few strategies you can use so that you can incrementally add reactive behavior to your application. When migrating our application, we found the best way was to work from the back to the front - starting with calls to downstream APIs/databases and working towards the response to the client. This method made it easier for us to define the initial Monos and Fluxes, and then keep refactoring the calling function to accept the Mono/Flux response.

Refactoring from the back to the front also allows you to incrementally convert portions of the code to use Reactive. For example, if you have five downstream calls to databases or APIs, you can do them one at a time. If you leave the rest of downstream calls as imperative, the code will still compile and run fine, you just won’t be fully reactive.
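One common intermediate step can be sketched with plain CompletableFuture (the names here are hypothetical): wrap a still-blocking call in an asynchronous type, so that callers can already be written against an async signature before the call itself is made reactive.

```java
import java.util.concurrent.CompletableFuture;

public class IncrementalMigration {
    // Hypothetical legacy blocking call that has not been migrated yet.
    static String legacyLookup(int id) {
        return "record-" + id; // imagine a blocking JDBC call here
    }

    // Intermediate step: expose an async signature now, migrate the body later.
    static CompletableFuture<String> lookupAsync(int id) {
        return CompletableFuture.supplyAsync(() -> legacyLookup(id));
    }

    public static void main(String[] args) {
        System.out.println(lookupAsync(42).join()); // record-42
    }
}
```

In Reactor, the analogous wrapper is `Mono.fromCallable(() -> legacyLookup(id)).subscribeOn(Schedulers.boundedElastic())`, which offloads the blocking work to a scheduler intended for blocking tasks so it does not stall the event-loop threads.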

Once you reach the point where you are ready to deploy the application, you will want to confirm everything is working as expected. Obviously, fixing the unit and component tests is part of this process. However, you should also consider your plan for safely deploying to your production environment. I would recommend using either a canary deployment or some other incremental deployment strategy to avoid having broad impact in the case of an error. A full discussion of the different deployment options is out of scope here, but production deployment is a critical step in the process, and with the level of changes required to rewrite an application to use Reactive, I would highly recommend putting some thought into how you plan to do it.

Challenge 4: How does error handling work in Reactive?

In a conventional, imperative application, you handle errors using try/catch blocks, like so:

    try {
        Response response = downstreamApi.execute(request);
    } catch (TimeoutException e) {
        log.info("Timeout occurred");
    }

However, when working with Reactive, you are working with asynchronous data, or the “promise” of future data. Errors are no exception (no pun intended) and these errors can also be processed asynchronously.

Consider this corresponding implementation using Reactive:

    public Mono<FaaSMultiQueryResponse> downstreamApiCall(Request request) {
        return webClient
                .post()
                .uri(this.downstreamURI)
                .bodyValue(request)
                .header("Client-Correlation-Id", "my correlation id")
                .retrieve()
                .bodyToMono(FaaSMultiQueryResponse.class);
    }

    public Mono processResponse(Request request) {
        return downstreamApiCall(request).map(response -> {
            // process response
        });
    }

If the call to downstreamApiCall fails or times out, the exception is not thrown at the call site. Instead, the error propagates through the pipeline as a signal: the map step in processResponse is skipped, and the error travels downstream until something handles it (or until it reaches the subscriber).

Error handling

To handle the errors in Reactive, you need to use one of the onError* calls:

  • onErrorResume: Accepts a function which returns a new Reactive stream (Mono or Flux) on an error. You could make a call to another fallback server and return that reactive stream instead.
  • onErrorMap: Accepts a function which transforms the original exception into a different exception. This is useful for translating low-level errors into your own exception types.
  • onErrorReturn: Accepts a synchronous element that is returned in case an exception occurs. This is the “default” element in case of errors.

To give one example:

    public Mono processResponse(Request request) {
        return downstreamApiCall(request)
                .map(response -> {
                    // process response
                })
                .onErrorResume(e -> {
                    log.error("Call to Downstream API failed: {}", e.toString());
                    return Mono.empty();
                });
    }
  

In this case we simply log the error and return a Mono.empty(), which is basically skipped over when the upstream caller is unwrapping the items in the mono stream.

Null handling

The previous example is also a solution for a common case my team ran into when rewriting our application. We had several cases where we would call a downstream API, and in case of failure, we would log the exception and return null. Then, in the calling function, we would have an if statement which filtered out these null responses. The code snippet above, which uses Mono.empty() instead of null, allows you to simplify the code by removing these if statements checking for null values.
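The “empty instead of null” pattern has a stdlib analogue in Optional, which may make the behavior easier to picture (the names here are made up for illustration):

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

public class EmptySkipDemo {
    // Stand-in for a lookup that may fail: returns empty instead of null.
    static Optional<String> lookup(int id) {
        return (id % 2 == 0) ? Optional.of("record-" + id) : Optional.empty();
    }

    // Empty results simply disappear when flattened -- no null checks needed,
    // much like Mono.empty() being skipped by downstream operators.
    static List<String> findAll(List<Integer> ids) {
        return ids.stream()
                .map(EmptySkipDemo::lookup)
                .flatMap(Optional::stream) // Java 9+: empty Optionals vanish here
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(findAll(List.of(1, 2, 3, 4))); // [record-2, record-4]
    }
}
```

In both cases, “no value” flows through the pipeline as a first-class state rather than as a null that every caller must remember to check.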

Takeaways and impacts

In the end, we were able to successfully migrate our Spring application to use WebFlux and wanted to see what benefits that provided us. To evaluate the performance, we applied some performance tests simulating dozens of concurrent users to the original imperative version and the new Reactive versions. We then measured the thread count and response time.

We found that response time was essentially the same between both versions, but thread count was significantly lower for the Reactive version. Lower thread count in the Reactive version was in line with our expectations because of the way Reactive efficiently reuses threads in a non-blocking way. However, the actual benefits you may observe in terms of memory, runtime, or any other metric will be highly dependent on your application, its load, and the number and type of downstream calls it is making.

graph showing active thread count over total requests in 4 minutes

This figure shows a comparison of the number of active threads between the Reactive (Rx) and Non-Reactive (NonRx) implementation of the API as we increase the number of concurrent requests. We found that the Reactive implementation uses far fewer threads, with the caveat that the thread pool size may be providing a hard upper limit in these cases.

Before implementing a new application in Reactive or migrating an existing one, I would definitely recommend taking a critical look at whether your application is a good fit for the Reactive paradigm. The table in the Challenge 1 section above can help with this. If you are migrating an existing app, I would also recommend having a plan for how to safely migrate your traffic to the Reactive version once it is ready, as this can be a risky process if done incorrectly.

With these considerations in mind, I think Reactive programming has significant promise for many applications, but as with all tools, it’s all about learning to use it effectively, and ensuring you are using the right tool for the task at hand.


Ashton Webster, Software Engineer, Retail Bank and Technology

Ashton Webster is a Software Engineer on the Mischief Managed team at Capital One. He graduated from University of Maryland with Bachelor’s and Master’s degrees in Computer Science. Since then, he has worked at Capital One for over three years on a variety of teams, with a current focus on fraud detection and prevention. Outside of work, he enjoys reading and playing chess.


DISCLOSURE STATEMENT: © 2021 Capital One. Opinions are those of the individual author. Unless noted otherwise in this post, Capital One is not affiliated with, nor endorsed by, any of the companies mentioned. All trademarks and other intellectual property used or displayed are property of their respective owners.
