The Case for Furnace, a Data Fabric that Can Evolve with Your Business
Furnace aims to become your data fabric, the platform around which your data-driven business is built. That’s a bold statement and needs context, so in this article I’ll attempt to explain.
Looking around today’s data landscape, it’s easy to get lost or overwhelmed by the sheer number of options: platforms vs. frameworks, cloud vs. on-premises, SaaS vs. PaaS, open source vs. commercial, data lake vs. data warehouse. I could go on, but you get the idea.
There’s a lot at play here, and there are many ways to approach any solution technically. However, I think it’s fair to say that choosing the wrong technology stack could set you up for a world of pain.
TLDR: Furnace is an open source platform that helps developers and businesses build highly effective data-driven applications. It addresses many of the challenges that businesses face by providing a simple, yet effective framework to build on top of serverless and cloud native infrastructure, driving down total cost of ownership.
Infrastructure: The Elephant in the Room
Depending on the size of an organization, it can be a challenge simply to have the personnel available to design, build and maintain the necessary infrastructure. Are we talking “Big Data”? Perhaps not yet, but if you’re a startup you may want to architect your systems so that you collect data from day one and have that capability further down the line, and of course you have to do that cost-effectively. Moreover, in the startup world, time to market, financial runway and focus on the core value proposition are a matter of life and death, so burning time and money on DevOps resources and AWS infrastructure (or worse, on-premises) is potentially suicidal. In short, reaching product/market fit as quickly and as cheaply as possible is essential.
Of course, over in the enterprise, things are very different. It’s generally accepted that 80% of time and resources are spent on managing infrastructure, leaving only 20% for value creation. This hurts the bottom line and inhibits the ability of the business to adapt to a constantly changing landscape. To put more meat on the bones, enterprises usually have sizeable teams dedicated to building and maintaining these systems. The promise of cloud was supposed to shift the balance here, but poorly architected systems, perhaps a simple “lift and shift” from an on-premises environment, bring the same issues they had before, or worse.
Serverless technology can be used to address this imbalance, reducing infrastructure costs and management overhead while increasing scalability.
The Shape of Your Data
Data comes in all shapes and sizes. Chances are the data your organization collects or produces arrives in many forms: APIs, relational databases, continuous streams and batches delivered on specific schedules. It is common to deal with each of these cases separately and on different platforms, which results in siloed data. To my previous point, that means more infrastructure and greater operational costs.
It is therefore desirable (if not critical) that an organization’s data infrastructure can deal with all the ways in which data is presented to it, in a single and succinct platform, bringing the organization closer to the value that data contains.
Not all data is equal. Some data requires action as soon as it is received, because it may enable predictive or preventative capabilities that are critical for certain businesses. At the other end of the scale, historical data may be reviewed from a more “business intelligence” point of view. Mike Gualtieri of Forrester described this in his report on “Perishable Insights,” which lays out a timeline of how data loses value over time, from preventive/predictive through actionable and reactive to historical, as the time frame moves from real-time to seconds, hours and days.
Platforms that understand these characteristics give businesses a huge advantage. On top of the benefits of running a converged platform, data with different levels of perishability can be treated differently, with less perishable data perhaps stored in a much more cost-effective data lake (object storage rather than a real-time solution such as Elasticsearch or Redshift).
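To make the perishability idea concrete, here is a minimal Python sketch that maps the age of a piece of data to one of the four stages named above and suggests where it might live. The thresholds and the destination names are my own assumptions for illustration; the report only gives the rough time scales, and none of this is Furnace's actual behavior.

```python
from datetime import timedelta

# Hypothetical cut-offs: the article names only the time scales
# (real-time, seconds, hours, days), not exact boundaries.
TIERS = [
    (timedelta(seconds=1), "preventive/predictive", "stream processor"),
    (timedelta(minutes=1), "actionable", "real-time store"),
    (timedelta(days=1), "reactive", "data warehouse"),
]
# Anything older than the last cut-off is treated as historical.
FALLBACK = ("historical", "object storage (data lake)")


def classify(age: timedelta) -> tuple[str, str]:
    """Return (stage, suggested destination) for data of the given age."""
    for limit, stage, destination in TIERS:
        if age < limit:
            return stage, destination
    return FALLBACK


if __name__ == "__main__":
    for age in (timedelta(milliseconds=200), timedelta(hours=2), timedelta(days=3)):
        print(age, classify(age))
```

The point of the sketch is simply that one rule set can send fresh data to fast, expensive storage and older data to cheap object storage, rather than running a separate platform for each.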
We’ve all read about the panacea of Cloud and the notion that the more of it we consume, the fewer problems we face. The reality is somewhat different. Today, the technical workforce is massively fragmented with a mix of skills, choices and preferences leading to huge inefficiency. Say you’ve gone “all-in” on AWS but one of your suppliers requires you to consume some services in Azure, or your boss wants to use the latest ML technology in Google Cloud, what do you do? Lean on the existing engineering team to up-skill? Go through the process of outsourcing? There is no easy answer. Cloud architects who are fluent in multiple clouds are few and far between and you’ll have to take out a mortgage to hire them.
Furnace provides a single, succinct framework for building data-driven applications on multiple clouds. We provide “just-enough abstraction” to allow you to focus on your core value proposition, taking care of all the heavy lifting (security policies, plumbing, etc.) while still allowing you to go deep into specific cloud native infrastructure when needed.
Some say Furnace abstracts the cloud and provides an agnostic layer on top. It does in some ways, but we feel the real power is in the ability to build highly efficient and highly effective data-driven applications, across multiple clouds, within a single framework. One tool to learn, all the power of three public clouds, and counting…
Ask yourself, just how effective are you, or your team, when it comes to utilizing the value that sits within your data?
A recent study shows that most enterprises never again query 95% of the data they push into their data lake. Why is that? Is it because they lack the tools and capabilities, or perhaps because they are spending most of their time fighting against the technology choices they’ve made?
Here’s another question: how quickly does your business respond to a changing landscape? A new source of data? A new intelligence source?
If you’re thinking it’s all a simple matter of hiring a new “data scientist” you’re probably mistaken. These guys are so scarce (if they’re good) they get to choose their gigs, and if you don’t have the data available at their fingertips or expect them to craft ETL (extract, transform, load) pipelines, they’ll soon be out of the door.
Furnace is based on the concept of “programmable pipes,” making it highly flexible and highly effective at building dynamic data flows that closely align to the business need.
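As a rough illustration of what a “programmable pipe” could look like in general terms, the Python sketch below chains small, independent stages into one data flow. This is not Furnace's DSL or API; `pipe`, `drop_incomplete` and `tag_source` are hypothetical names invented for the example.

```python
from functools import reduce
from typing import Callable, Iterable

# A stage takes a stream of events and returns a (possibly changed) stream.
Stage = Callable[[Iterable[dict]], Iterable[dict]]


def pipe(*stages: Stage) -> Stage:
    """Compose stages left to right into a single flow."""
    def run(events: Iterable[dict]) -> Iterable[dict]:
        return reduce(lambda data, stage: stage(data), stages, events)
    return run


def drop_incomplete(events: Iterable[dict]) -> Iterable[dict]:
    # Hypothetical rule: keep only events that carry a "user" field.
    return (e for e in events if "user" in e)


def tag_source(events: Iterable[dict]) -> Iterable[dict]:
    # Hypothetical enrichment: record where the event came from.
    return ({**e, "source": "web"} for e in events)


flow = pipe(drop_incomplete, tag_source)

if __name__ == "__main__":
    print(list(flow([{"user": "a"}, {"x": 1}])))
```

Because each stage knows nothing about its neighbors, adding, removing or reordering a stage is a one-line change to the `pipe(...)` call, which is the kind of flexibility the business need tends to demand.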
TIP: Don’t have expensive data engineers building ETL pipelines using last-generation technologies. There are now good SaaS/Cloud platforms available to do this… oh hey, Furnace!
If you’re consuming data in your organization today, I’m willing to bet you’re currently paying for idle time. That is, infrastructure sat waiting to run that nightly batch processing job or some beefy servers running at 5% utilization “just in case” you see a traffic spike.
How about provisioning a new application? How long does it take from conception to deployment into production? I’m willing to guess the whole process takes weeks and includes such things as infrastructure resource planning and provisioning ahead of time.
Furnace cuts through this by making use of Serverless technology, that is:
- Infrastructure costs are zero when it is not being used.
- You pay only for what you use.
- Applications scale up and down automatically with no intervention.
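A back-of-the-envelope comparison shows why paying for idle time matters. The Python sketch below uses the 5% utilization figure mentioned earlier; the hourly rate is a made-up number, and it assumes (unrealistically) that the pay-per-use unit price equals the always-on price, so treat the result as an upper bound on the saving rather than a quote.

```python
HOURS_PER_MONTH = 730  # average hours in a month


def always_on_cost(hourly_rate: float, hours: float = HOURS_PER_MONTH) -> float:
    """Cost of a server that is billed whether or not it is busy."""
    return hourly_rate * hours


def pay_per_use_cost(price_per_second: float, busy_seconds: float) -> float:
    """Cost when only the seconds spent doing work are billed."""
    return price_per_second * busy_seconds


hourly_rate = 0.10   # hypothetical price, in arbitrary currency units
utilization = 0.05   # the "5% utilization" server from the text

busy_seconds = HOURS_PER_MONTH * 3600 * utilization
# Simplifying assumption: same unit price for both billing models.
price_per_second = hourly_rate / 3600

server = always_on_cost(hourly_rate)
serverless = pay_per_use_cost(price_per_second, busy_seconds)

if __name__ == "__main__":
    print(f"always on: {server:.2f}, pay per use: {serverless:.2f}")
```

With these hypothetical numbers the always-on server costs about 73 units a month and the pay-per-use equivalent about 3.65. Real serverless pricing usually carries a higher unit price, so the gap narrows as utilization rises.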
When it comes to processing data, today’s solutions are often rigid and brittle. Changing a pipeline or adding a data source often requires significant changes to the underlying infrastructure, recompiling code, running quality tests and so on. Tight coupling between components also has an impact, so something that may seem like a “simple change” can end up bringing down the whole infrastructure.
Furnace allows you to build dynamic, loosely coupled application architectures that are able to adapt and evolve with the business. There are many aspects of the platform that make this possible, such as its ability to intelligently synchronize Cloud configuration (desired state into actual state) and immutable environments that can be spun up as required for testing before eventually being promoted to production.
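The “desired state into actual state” idea can be sketched in a few lines of Python. This is a generic illustration of state reconciliation, not Furnace's implementation; the function name `plan_changes` and the resource names are hypothetical.

```python
def plan_changes(desired: dict[str, dict], actual: dict[str, dict]) -> dict[str, list[str]]:
    """Compare declared resources with deployed ones and list what must change."""
    # Declared but not yet deployed.
    create = sorted(desired.keys() - actual.keys())
    # Deployed but no longer declared.
    delete = sorted(actual.keys() - desired.keys())
    # Present on both sides but configured differently.
    update = sorted(
        name for name in desired.keys() & actual.keys()
        if desired[name] != actual[name]
    )
    return {"create": create, "update": update, "delete": delete}


if __name__ == "__main__":
    # Hypothetical resource names and settings, for illustration only.
    desired = {"ingest-stream": {"shards": 2}, "archive-bucket": {"versioned": True}}
    actual = {"ingest-stream": {"shards": 1}, "old-queue": {}}
    print(plan_changes(desired, actual))
```

A synchronizer built this way only ever applies the difference, which is what makes it safe to run repeatedly and to spin up identical environments for testing.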
Further, because Furnace uses a simple, declarative DSL (Domain Specific Language), applications are simpler, easier to understand and more secure.
In this article, I’ve tried to outline some of the challenges that data-driven businesses face today and how Furnace looks to provide solutions. You can read about where Furnace came from here. And if you want to download Furnace, go here.
Feature image by kepinator from Pixabay.