Bacalhau 1.0 Release: Featuring Private Clusters, Octostore, and Federated Learning
Today marks the launch of Bacalhau 1.0, the general availability (GA) release of the open-source distributed computing platform. The project’s mission is to revolutionize the way organizations and developers harness the power of collaborative computing, and the GA release marks an important milestone toward that goal. Since launching our beta release in November, the project has seen more than 3,000 commits from more than 30 contributors and a release every two weeks. Additionally, customers like the New Atlantis Foundation, the City of Las Vegas, and the University of Maryland are executing hundreds of thousands of jobs every month on the public network. To read more about Bacalhau, and try it out for yourself, go to
Background
Distributed computing has long been recognized as a powerful approach for tackling large-scale, complex problems by harnessing the collective power of devices everywhere. However, developers face significant challenges in adopting it, including inefficient resource allocation, communication bottlenecks, and high barriers to entry for non-expert users.
But the time to address these issues is now. IDC predicts that by 2025 we will have generated more than 175 zettabytes of data, 50 times more data than we do today. Yet the critical insights needed to make better decisions remain hidden behind distributed devices and storage.
(Re-)Introducing the Bacalhau Project
Bacalhau was created to address these challenges head-on through a platform designed from the ground up for the distributed world. Built by core members of the Kubernetes, Kubeflow, and Amazon Kinesis communities, along with employees from Google, AWS, and Microsoft, Bacalhau provides a familiar, high-scale, and efficient way to build and use globally deployed applications and data. Further, because Bacalhau is open source and licensed under Apache 2.0/MIT, the community is built to foster collaboration and innovation, allowing developers from around the world to contribute their expertise and continually improve the platform.
General Availability Release of Bacalhau
The GA release of Bacalhau includes the following features:
Running Docker & WASM jobs, with GPU support
Multi-architecture support - Intel, Apple Silicon (M1/M2), ARMv6 & ARMv7, AMD64
Support for 1000+ nodes
Running 10k+ jobs simultaneously
100 TB processing across many files
Concurrency and confidence for parallel and verifiable job execution
Log streaming for Docker and WASM jobs
DAG execution through Project Amplify
Job selection hooks (against binaries, HTTP endpoints, etc.)
OpenTelemetry tracing
Swappable verification, execution and publisher systems
Scheduling against node labels
Great examples for getting started including:
Running Python, Pandas, R, Rust, TensorFlow, PyTorch natively (or any custom container)
Reading simultaneously across many nodes from multiple S3 Buckets
And lots more!
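To make one item from the list above more concrete, here is a minimal Python sketch of what scheduling against node labels can look like in principle. It is an illustration only and not Bacalhau's implementation or API: the Node class, the matches and eligible_nodes helpers, and the label names are all hypothetical, and the matching rule (a node qualifies when it carries every key/value pair the job asks for) is an assumed, simplified selector.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical compute node described by key/value labels."""
    name: str
    labels: dict[str, str] = field(default_factory=dict)


def matches(node: Node, selector: dict[str, str]) -> bool:
    """Return True when the node carries every label the selector requires."""
    return all(node.labels.get(key) == value for key, value in selector.items())


def eligible_nodes(nodes: list[Node], selector: dict[str, str]) -> list[Node]:
    """Keep only the nodes whose labels satisfy the selector, preserving order."""
    return [node for node in nodes if matches(node, selector)]


# Hypothetical cluster: label names and values are made up for illustration.
nodes = [
    Node("node-a", {"region": "eu", "gpu": "true"}),
    Node("node-b", {"region": "us", "gpu": "true"}),
    Node("node-c", {"region": "eu"}),
]

# A job that should only run on GPU nodes in the "eu" region.
selected = eligible_nodes(nodes, {"region": "eu", "gpu": "true"})
print([node.name for node in selected])
```

An empty selector matches every node in this sketch, so a job with no label requirements would be eligible to run anywhere.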
Long-Term Mission
Our long-term goal is to transform how developers interact with the full breadth of computing and data resources available to them. Some of the features we have on the horizon include:
A fully distributed computation platform that can run on any device, anywhere
A declarative pipeline that can both run the data processing and also record the lineage of the data
A highly resilient system that can schedule across latency boundaries and deliver the reliability a global deployment needs, even over spotty network connectivity
Secure and verifiable results that can be used to confirm the integrity and reproducibility of the results forever
But you tell us! We'd love to hear about new directions we may need to include.
How to Get Involved
We're looking for help in several areas. If you're interested in helping out, please reach out to us at any of the following locations:
Thanks for reading, and onward!
Your humble Bacalhau team