Cloud orchestration cost optimization
The move to the cloud promised to save money, but that's not what happened. Changing how you think about cloud computing architecture might.
The move to the cloud promised to save users money and give them insights into their usage and costs.
However, the opposite has happened. A 2025 report from AAG stated that around 82% of respondents found cloud spending challenging. A 2024 report from CloudZero states that more than 20% of respondents had no clear idea of their cloud costs, with billing reports for large users sometimes consisting of thousands of rows of hard-to-read usage data.
This is compounded by many companies and teams using more than one cloud provider as part of a multi-cloud or hybrid strategy, to provide redundancy in case of outages or other issues. Instead of taking advantage of the flexibility of cloud-native computing, many engineers still build as if they were running fixed on-premises servers, over-provisioning instances far beyond the capacity they need. All of this means users overspend on idle and duplicated services and incur ingress and egress charges between providers.
Financial costs aside, complex cloud orchestration also increases energy consumption and latency as multiple services pass bits and bytes back and forth across continents.
Most crucially, cloud applications write and access vast quantities of data, and that data is rarely stored in the same place where it is accessed and processed, causing yet more latency and cost.
It’s hard to get accurate cost comparisons between cloud providers, but here is a rough comparison of on-demand rates for comparable general-purpose instances:
AWS t4g.xlarge: $0.1344 per hour
Azure B4ms: $0.1660 per hour
Google Cloud Platform e2-standard-4: $0.1509 per hour
For spot instances, using comparable instance types on Azure and GCP:
AWS t4g.xlarge: $0.044 per hour, a 67% saving
Azure A4 v2: $0.0348 per hour, an 85% saving
Google Cloud Platform e2-standard-4: $0.0602 per hour, a 60% saving
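Published rates change often and vary by region, so it’s worth pulling current numbers yourself before comparing. As a minimal sketch, the AWS CLI can list recent spot prices for a given instance type; the region and output filter below are illustrative choices, not a recommendation:

```bash
# Query recent Spot prices for t4g.xlarge Linux instances in one region.
# The region and the query expression are illustrative; adjust for the
# instance types and regions you actually want to compare.
aws ec2 describe-spot-price-history \
  --region us-east-1 \
  --instance-types t4g.xlarge \
  --product-descriptions "Linux/UNIX" \
  --query 'SpotPriceHistory[0:5].[AvailabilityZone,SpotPrice,Timestamp]' \
  --output table
```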
Common Cloud Cost Optimization Techniques
To combat rising expenses, organizations turn to a few common techniques, including:
Rightsizing resources. Continuously monitoring resource utilization and performance metrics ensures you use appropriately sized instances and services. By matching infrastructure to workload demands, organizations eliminate the waste associated with over-provisioning and pay only for the capacity they need.
Leveraging pricing models and purchase options. Cloud providers offer pricing models beyond standard on-demand rates. For example, AWS offers Reserved Instances (RIs) and Savings Plans that give discounts in exchange for committing to a certain usage level. Spot Instances offer access to spare cloud capacity at reduced rates for fault-tolerant workloads that can handle interruptions.
Implementing auto-scaling to automatically adjust the number of compute resources allocated to an application based on demand. Automating shutdown schedules for non-production environments during off-hours also prevents paying for unused resources, as sketched below.
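As a minimal sketch of the shutdown-schedule idea, assuming an EC2 Auto Scaling group backing a development environment (the group name, sizes, and cron schedule below are placeholders), scheduled actions can scale it to zero overnight and back up each morning:

```bash
# Scale the dev environment to zero every weekday evening (times are UTC).
aws autoscaling put-scheduled-update-group-action \
  --auto-scaling-group-name dev-asg \
  --scheduled-action-name nightly-shutdown \
  --recurrence "0 19 * * 1-5" \
  --min-size 0 --max-size 0 --desired-capacity 0

# Bring it back before the workday starts.
aws autoscaling put-scheduled-update-group-action \
  --auto-scaling-group-name dev-asg \
  --scheduled-action-name morning-startup \
  --recurrence "0 7 * * 1-5" \
  --min-size 1 --max-size 4 --desired-capacity 2
```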
Limitations of Traditional Optimization
While conventional optimization techniques are valuable, they have limitations, especially in data-intensive environments:
Increased operational complexity. Managing rightsizing, reservations, and spot instances across multi-cloud or hybrid setups is complex and time-consuming.
Challenges with dynamic workloads. Accurately forecasting usage for Reserved Instances or Savings Plans is difficult for applications with variable or unpredictable demand patterns. This can lead to over-committing, which leaves the cost issue unsolved.
Inability to address data transfer costs. Most methods focus on optimizing compute and storage costs but fail to tackle a fundamental driver of expense: the cost and performance impact of moving large volumes of data between storage locations and compute services, often across different regions or cloud providers.
The Solution: Bringing distributed compute and data together
Cloud cost optimization is challenging for businesses seeking to leverage the cloud’s benefits without breaking the budget. The right solution, however, may be to shift how you think about computing.
Bacalhau is an open-source solution that enables users to run compute and processing jobs where data is generated and stored. Instead of running computation in one location, requesting data from a second, and sending results to a third, Bacalhau lets you run the whole process in one place.
With WASM and Docker support, you can run jobs written in a range of programming languages. Bacalhau also supports GPUs and edge devices, and it works alongside the cloud services you already use for compute and storage.
This brings the crucial flexibility you need over location, security, and device support, without the expense. And because Bacalhau runs on the infrastructure you already have, it reduces the amount of compute sitting idle, waiting for something to do.
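To make that concrete, submitting a containerized job from the CLI looks roughly like the following; exact flags and subcommands vary between Bacalhau versions, so treat this as a sketch rather than copy-paste:

```bash
# Submit a containerized job to the Bacalhau network.
bacalhau docker run ubuntu:latest -- echo "Hello from the node where the data lives"

# List recent jobs to check status (subcommand names differ slightly by version).
bacalhau job list
```

The job runs as a container on a node in the network, and only the output you request comes back.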
One-line install with many possibilities
Bacalhau is a one-line install, configured with a YAML file that describes what to run, where, and how. For smaller jobs, you can run everything on a single Bacalhau instance. For more complex jobs, you can create a distributed network of orchestrator and compute nodes to distribute and process job submissions. This is also useful when larger data sets are sharded across different locations: each Bacalhau compute node runs over the data it can access and coordinates with the orchestrator to report overall status and results.
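As an illustration, the install plus a minimal declarative job spec might look something like this. The field names follow the job spec format as I understand it and may differ between Bacalhau releases, and the image, command, and data path are placeholders, so check the documentation for your version:

```bash
# One-line install (check the docs for the current install command).
curl -sL https://get.bacalhau.org/install.sh | bash
```

```yaml
# job.yaml: a minimal batch job that counts lines in a file on the compute node.
# Field names are indicative; how jobs access node-local or remote data
# (input sources, allow-listed paths) is version-specific, so see the docs.
Name: count-local-lines
Type: batch
Count: 1
Tasks:
  - Name: main
    Engine:
      Type: docker
      Params:
        Image: ubuntu:latest
        Entrypoint:
          - /bin/bash
        Parameters:
          - -c
          - "wc -l /data/events.log"
```

Submitting the spec with bacalhau job run job.yaml (or the equivalent command in your version) schedules the task onto a compute node that can see the data, and only the results come back to the orchestrator.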
This distributed approach is also useful for global datasets that must be processed differently for regulatory or security reasons. Bacalhau processes data where it is stored, returning aggregated and anonymized results and reducing concerns about security and privacy.
Reduce costs, not flexibility
The cloud promised reduced costs and complexity, but as the countless reports on, and tools for managing, spiraling costs and complexity show, it hasn’t delivered.
For data processing, Bacalhau offers a simple and flexible way to reduce cloud orchestration costs. Try the open-source version today, or speak to the team to find out more about the hosted version.
What's Next?
To start using Bacalhau, install it and give it a shot.
If you don’t have a node network available and would still like to try Bacalhau, you can use Expanso Cloud. You can also set up your own cluster (with setup guides for AWS, GCP, Azure, and more 🙂).
Get Involved!
We welcome your involvement in Bacalhau. There are many ways to contribute, and we’d love to hear from you. Reach out at any of the following locations:
Commercial Support
While Bacalhau is open-source software, the Bacalhau binaries go through the security, verification, and signing build process lovingly crafted by Expanso. Read more about the difference between open-source Bacalhau and commercially supported Bacalhau in the FAQ. If you want to use the pre-built binaries and receive commercial support, contact us or get your license on Expanso Cloud!