FinOps Architecture Part I: Data

The FinOps vendor landscape contains a wealth of options for the best way to move a container or a workload from one instance type to another based on a secret combination of reasons. They use fancy AI/ML algorithms to reach decisions on how and where to schedule your tasks so you don’t have to.
And, if that’s the only thing your business needs to fully optimize your financial efficiency, you are doing better than most companies, and for that, salud! However, for most organizations out there, this low-hanging fruit isn’t going to yield enough juice to quench your business’s thirst for cost efficiency.
A drum that I will never stop beating is that a majority of your problems can be solved on the whiteboard. This is true for performance, resilience, security and cost efficiency.
Now, a lot of startup founders and VCs will tell you that it’s “premature optimization,” but what they’re really saying is, “You can sell your technical debt on exit.” If you keep running these workloads, you’re going to be the one in charge of paying off that technical debt. In the case of cost concerns, it may be the real debt you’re paying off too.
When you are in the design, redesign or architectural review stages for your workloads, it’s hard to predict what your costs will be. However, there are some principles that you can apply in your design that will put you in a good position to see the most financial benefit from your architecture. I break these into four categories: data, compute, network and observability. There is another area of concern that I’ll call operations, which addresses ongoing concerns around architectural decisions.
Today, we will talk about data. Our workloads run around data — creating it, crunching it, transforming it, storing it and moving it. All of these come at some kind of cost. The more data we use and the faster we need it, the more it is going to cost.
Let’s look at some basic principles around storage:
- Generate as little data as possible.
  - Not everything needs to be logged; not all logs need to be retained.
- Store as little data as possible.
  - Not everything needs to be saved either, especially in databases.
  - If it doesn’t bring you joy, discard it. If you must keep it, compress it.
- Read as little data as possible.
  - Make smart queries.
  - Egress the smallest amount you can.
- Store data as cold as possible.
  - You don’t need backups in block storage.
  - Colder = cheaper.
- Delete old data.
  - Don’t hoard data.
- Move as little data as you can.
- Move data as short a distance as you can.
  - Stay within your network as long as you can, and that will keep your transfer costs down.
  - Stay within your region if you can.
- Move data using the least expensive path that still meets your needs.
  - When you move it, like current in a circuit, you need to figure out the path of least financial resistance.
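To make "store data as cold as possible" and "delete old data" concrete, here is a minimal sketch of a tiering rule. The tier names, the 30- and 180-day thresholds and the per-GB prices are all hypothetical placeholders, not any provider’s real numbers, and the sketch ignores the retrieval fees and minimum storage durations that colder tiers often carry.

```python
from typing import List, Optional, Tuple

# Hypothetical prices in USD per GB-month, for illustration only.
TIER_PRICE_PER_GB_MONTH = {"hot": 0.023, "cool": 0.010, "archive": 0.001}


def choose_tier(days_since_last_access: int, retention_days: int) -> Optional[str]:
    """Pick the coldest tier that fits; None means past retention, so delete it."""
    if days_since_last_access > retention_days:
        return None
    if days_since_last_access <= 30:   # assumed threshold for "hot"
        return "hot"
    if days_since_last_access <= 180:  # assumed threshold for "cool"
        return "cool"
    return "archive"


def monthly_storage_cost(datasets: List[Tuple[float, int]], retention_days: int) -> float:
    """Sum the monthly cost of (size_gb, days_since_last_access) pairs."""
    total = 0.0
    for size_gb, idle_days in datasets:
        tier = choose_tier(idle_days, retention_days)
        if tier is not None:  # data past retention is deleted and costs nothing
            total += size_gb * TIER_PRICE_PER_GB_MONTH[tier]
    return total
```

With these made-up prices, calling `monthly_storage_cost([(100, 10), (1000, 90), (5000, 400)], 365)` comes to roughly $12 a month, while keeping all 6,100 GB in the hot tier would be roughly $140. The exact figures don’t matter; the gap is what is worth seeing on the whiteboard.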
Using these principles during your design process will lead you to ask questions about your architecture.
Some questions we want to ask when working this out on the whiteboard could be:
- Where are we reading data from and how quickly do we need it?
- What types of data will this need (object, flat file, database, etc.)?
- How much data is this going to return and where does it go?
- How long do we need to keep data for this workload and how often will we use it?
- Will we be aggregating this data somewhere else?
The answer to each of these questions will raise follow-up questions, and you should work through them as far as you can while designing your architecture. But as you turn those answers into a design, it is imperative that the design does not prevent you from making changes when conditions change. Why is this important? Because as you scale, the decisions you make in your architecture design will be put to the test.
As your DevOps processes progress and you iterate on your application, your needs will change. Some of your choices may no longer fit your current circumstances. Being architecturally locked into a storage type, a storage vendor or whether to use a managed service may not be financially beneficial to you in the long run, so having the option to change this is crucial.
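One way to keep that option open is to have application code depend on a small storage interface rather than on a specific backend. The sketch below is only an illustration: `BlobStore`, `InMemoryStore`, `CompressedStore` and `save_report` are hypothetical names, and a real backend (object storage, a data center filer, a managed service) would sit behind the same two methods.

```python
import zlib
from typing import Dict, Protocol


class BlobStore(Protocol):
    """Hypothetical minimal interface the application codes against."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...


class InMemoryStore:
    """Stand-in backend; a real one would wrap a vendor SDK or a filesystem."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


class CompressedStore:
    """Wraps any BlobStore and compresses data on the way in."""

    def __init__(self, inner: BlobStore) -> None:
        self._inner = inner

    def put(self, key: str, data: bytes) -> None:
        self._inner.put(key, zlib.compress(data))

    def get(self, key: str) -> bytes:
        return zlib.decompress(self._inner.get(key))


def save_report(store: BlobStore, name: str, body: bytes) -> None:
    """Application code only knows about the interface, not the backend."""
    store.put(f"reports/{name}", body)
```

Swapping vendors or storage types then means writing a new class with `put` and `get`, not rewriting every caller. The `CompressedStore` wrapper also shows how a principle like "if you must keep it, compress it" can be applied in one place.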
As you scale and move into multiple regions or into multi- or hybrid cloud environments, your architectural considerations for how and where you store and move data will change. The basic principles for cost-optimized storage don’t change, but how they’re implemented will, because you will have more tools at your disposal and more cost variables to consider. For example, you may find less expensive object storage in your public cloud but less expensive and higher performance block storage in your data center. These two conditions must be balanced against the cost of data egress from your cloud provider.
In cases like this, you may be best served by using your data center’s block storage and internet egress cost advantages to do most of your data generation and transit to and from the internet. Then, since data ingress from the internet is free, use the public cloud for backup, artifact storage, elastic scaling options and analytics on the artifacts that are in object storage. By designing an architecture that uses your various environments’ strong points in both performance and cost, and finding the most cost-efficient way of having those environments interact, you are maximizing your savings potential before deploying the first asset.
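This kind of comparison can be roughed out before anything is deployed. The sketch below totals monthly transfer cost for two candidate designs. Every rate and volume in it is a hypothetical placeholder (only the free ingress matches the scenario above), and the location names are just labels.

```python
from typing import Dict, List, Tuple

Link = Tuple[str, str]          # (source, destination)
Flow = Tuple[str, str, float]   # (source, destination, GB per month)

# Hypothetical USD-per-GB rates; substitute your own contracts and price sheets.
RATE_PER_GB: Dict[Link, float] = {
    ("datacenter", "internet"): 0.01,
    ("datacenter", "cloud"): 0.01,   # leaves the data center; cloud ingress is free
    ("cloud", "internet"): 0.09,
    ("cloud", "datacenter"): 0.09,   # cloud egress back to the data center
    ("internet", "cloud"): 0.0,
    ("internet", "datacenter"): 0.0,
}

# Two candidate designs expressed as monthly data flows (assumed volumes).
DESIGNS: Dict[str, List[Flow]] = {
    "serve_from_datacenter": [
        ("datacenter", "internet", 10_000),  # user-facing traffic
        ("datacenter", "cloud", 2_000),      # backups and artifacts
    ],
    "serve_from_cloud": [
        ("cloud", "internet", 10_000),       # user-facing traffic
    ],
}


def transfer_cost(flows: List[Flow]) -> float:
    """Total monthly transfer cost for a list of flows."""
    return sum(RATE_PER_GB[(src, dst)] * gb for src, dst, gb in flows)


def cheapest_design(designs: Dict[str, List[Flow]]) -> str:
    """Name of the design with the lowest monthly transfer cost."""
    return min(designs, key=lambda name: transfer_cost(designs[name]))
```

With these placeholder numbers, `cheapest_design(DESIGNS)` picks `serve_from_datacenter`, but the result depends entirely on the rates and volumes you plug in. Storage and compute costs are left out and would need to be added for a full picture.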
Ultimately, it’s safe to say that where and how you create and keep your data will be the primary driver of your other architectural decisions. As such, these decisions are not to be made lightly; they should be made with as much data as possible and reviewed regularly. Keeping these principles and questions in mind will help you with that process and help you maximize your cost savings as you address the other pillars of your architecture.
For reference, see the previous article by Tim Banks: “Painting Yourself into Corners: Don’t Do FinOps Wrong”