AmeriSave Moved Its Microservices to the Cloud with Traefik’s Dynamic Reverse Proxy
When AmeriSave Mortgage Corporation decided to make the shift to microservices, the financial services firm was taking the first step in modernizing a legacy technology stack that had been built over the previous decade. The entire project — migrating from on-prem to cloud native — would take longer.
Back in 2002, when company founder and CEO Patrick Markert started AmeriSave, only general guidelines for determining rates were available online. “At that time, finance was very old-school, with lots of paper and face-to-face visits,” said Shakeel Osmani, the company’s principal lead software engineer.
But Markert had a technology background, and AmeriSave became a pioneer in making customized rates available online. “That DNA of technology being the driver of our business has remained with us,” said Osmani.
Since then, AmeriSave has automated the creation and processing of loan applications, giving it lower overall operating costs. With six major loan centers in 49 states and over 5,000 employees, the company’s continued rapid growth demanded an efficient, flexible technology stack.
Steps to the Cloud
With many containerized environments on-prem, company management initially didn’t want to migrate to a cloud native architecture. “The financial industry was one of the verticals hesitant to adopt the cloud because the term ‘public’ associated with it prompted security concerns,” said Maciej Miechowicz, AmeriSave’s senior vice president of enterprise architecture.
Most of the engineers on his team came from companies that had already adopted microservices, so that’s where they started. First, they ported legacy applications into microservices deployed on-prem in Docker Swarm environments, while continuing to use the legacy reverse proxy solution NGINX for routing.
“We then started seeing some of the limitations of the more distributed Docker platform, mostly the way that networking operated, and also some of the bottlenecks in that environment due to increased internal network traffic,” said Miechowicz.
The team wanted to move to an enterprise-grade cloud environment for more flexibility and reliability, so the next step was migrating microservices to Microsoft’s Azure Cloud platform. Azure’s Red Hat OpenShift, already available in the Azure Cloud environment, offered high performance and predictable cost.
The many interdependencies among AmeriSave’s hundreds of microservices required the ability to switch traffic easily and quickly between Docker Swarm and OpenShift environments, so the team wanted to use the same URL for both on-prem and in the cloud. Without that ability, extensive downtime would be required to update configurations of each microservice when its dependency microservice was being migrated. With over 100 services, that migration task would cause severe business interruptions.
First, the team tried out Azure Traffic Manager, an Azure-native, DNS-based traffic load balancer. But because it’s not automated, managing all those configurations through Azure natively would require a huge overhead of 300 to 500 lines of code for each service, said Miechowicz.
One of the lead engineers had used Traefik, a dynamic reverse proxy, at his prior company and liked it, so the team began discussions with Traefik Labs about its enterprise-grade Traefik Enterprise for cloud native networking.
Cloud and Microservices Adoption Simplified
Traefik was founded to deliver a reverse proxy for microservices that can automatically reconfigure itself on the fly, without the need to go offline.
The open source Traefik Proxy handles all of the microservices applications networking in a company’s infrastructure, said Traefik Labs founder and CEO Emile Vauge. This includes all incoming traffic management: routing, load balancing, and security.
Traefik Enterprise is built on top of that. “Its additional features include high availability and scalability, and advanced security, as well as advanced options for routing traffic to applications,” he said. “It also integrates API gateway features, and connects to legacy environments.”
Vauge began work on Traefik as an open source side project while he was developing a Mesosphere-based microservices platform. “I wanted to automate 2,000 microservices on it,” he said. “But there wasn’t much in microservices available at that time, especially for edge routing.”
He founded Traefik Labs in 2016 and the software is now one of the top 10 downloaded packages on GitHub: it’s been downloaded more than 3 billion times.
“The whole cloud native movement is driven by open source, and we think everything should be open source-based,” he said. “We build everything with simplicity in mind: we want to simplify cloud and microservices adoption for all enterprises. We want to automate all the complexity of the networking stack.”
Multilayered Routing Eliminates Downtime
Working together, Traefik’s team and Miechowicz’s team brainstormed the idea of dynamic path-based routing of the same URL, between on-prem Docker Swarm and cloud-based OpenShift. This means a service doesn’t need to be updated while its dependency microservice is being migrated.
Any migration-related problem can be quickly fixed in Traefik Enterprise by redirecting routing from OpenShift back to on-prem Docker Swarm, correcting the issue, and redirecting back to OpenShift. Also, there’s no need to update configurations of any other services.
This is made possible by the way that Traefik Enterprise’s multilayered routing works. “Layer 1 of Traffic Enterprise dynamically collects path-based and host-based routing configured in Layer 2,” said Miechowicz. “In our case, we had two Layer 2 sources: on-prem Docker Swarm and cloud-based OpenShift. Layer 1 then directs the traffic to the source that matches the host/path criteria and has a higher priority defined. Rollback from OpenShift to Docker Swarm simply consists of lowering the priority on the OpenShift route. We did a proof-of-concept and it worked perfectly and fast.”
This contrasts with how NGINX works. “You may configure it to route to a hundred services, but if one service does not come up, NGINX will fail to start and cause routing outage of all the services,” said Osmani. But Traefik Enterprise will detect a service that’s failing and stop routing to it, while other services continue to work normally. Then, once the affected service comes back up, Traefik Enterprise automatically establishes routing again.
NGINX also doesn’t have Traefik’s other capabilities, like routing on the same URL, and it’s only suited for a smaller number of services, Osmani said. Both Azure Traffic Manager and Traefik must be maintained and managed, but that’s a lot easier to do with Traefik.
No More Service Interruptions
Osmani said adopting Traefik Enterprise was one of the best decisions the team has made in the past year because it’s removed many pain points.
“When we were on-prem, we were responsible for managing everything — we’ve often gotten up at midnight to fix something that someone broke,” he said. “But with Traefik you can only take down the service you’re affecting at that moment.”
From the business standpoint, the main thing that’s better is the migration, said Osmani. “Because we are a living, breathing system, customers are directly affected. In the online mortgage lending business, if a service is down people will just move on to the next mortgage lender’s site. Now we don’t experience service interruptions. There’s no other way we could have easily accomplished this.”
“For developers in our organization, the result works like magic,” said Miechowicz. “We just add a few labels and Traefik Enterprise routes to our services. As our developers move services to the cloud, none of them have seen a solution as streamlined and automated like this before.”