Project Bacalhau

Bacalhau 1.0 Release: Featuring Private Clusters, Octostore, and Federated Learning

Today marks the launch of Bacalhau 1.0, the general availability (GA) release of the open source distributed compute platform.

David Aronchick
May 9, 2023
Today marks the launch of Bacalhau 1.0, the general availability (GA) release of the open-source distributed computing platform. The project’s mission is to revolutionize the way organizations and developers harness the power of collaborative computing, and the GA release marks an important milestone toward that goal. Since launching our beta release in November, the project has seen more than 3,000 commits from more than 30 contributors and a release every two weeks. Additionally, customers like the New Atlantis Foundation, the City of Las Vegas, and the University of Maryland are executing hundreds of thousands of jobs every month on the public network. To read more about Bacalhau, and try it out for yourself, go to

Try it out

Background

Distributed computing has long been recognized as a powerful approach for tackling large-scale, complex problems by harnessing the collective power of devices everywhere. However, developers face significant challenges in adopting it, including inefficient resource allocation, communication bottlenecks, and high barriers to entry for non-expert users.

But the time to address these issues is now. IDC predicts that by 2025 we will have generated more than 175 zettabytes of data, 50 times more than we do today. Yet the insights needed to make better decisions remain hidden behind distributed devices and storage.

(Re-)Introducing the Bacalhau Project

Bacalhau was created to address these challenges head-on through a platform designed from the ground up for the distributed world. Built by core members of the Kubernetes, Kubeflow, and Amazon Kinesis communities, together with employees from Google, AWS, and Microsoft, Bacalhau provides a familiar, high-scale, and efficient way to build and use globally deployed applications and data. Further, because Bacalhau is open source and Apache2/MIT licensed, the community is built to foster collaboration and innovation, allowing developers from around the world to contribute their expertise and continually improve the platform.

General Availability Release of Bacalhau

The GA release of Bacalhau includes the following features:

  • Running Docker & WASM jobs, with GPU support

  • Multi-architecture support - Intel, Apple Silicon (M1/M2), ARMv6 & ARMv7, AMD64

  • Support for 1000+ nodes

  • Running 10k+ jobs simultaneously

  • 100 TB processing across many files

  • Simplified private cluster setup

  • Reading and writing from any S3-compatible data store

  • Concurrency and confidence for parallel and verifiable job execution

  • Log streaming for Docker and WASM jobs

  • DAG execution through Project Amplify 

  • Job selection hooks (against binaries, HTTP endpoints, etc.)

  • Throttled allow-list networking

  • Python SDK

  • Airflow executors

  • OpenTelemetry tracing

  • Swappable verification, execution and publisher systems

  • Scheduling against node labels

  • Great examples for getting started including:

    • Running Python, Pandas, R, Rust, TensorFlow, PyTorch natively (or any custom container)

    • Running Jupyter Notebooks

    • Converting a CSV to Avro or Parquet

    • Reading simultaneously across many nodes from multiple S3 Buckets

    • Querying data using DuckDB

    • Processing Oceanographic Data

    • Converting Video Files

    • Running the Dolly 2.0 model with Hugging Face

    • Using YOLOv5 for Object Detection

    • Inferring using Stable Diffusion on a GPU

    • Performing OCR

    • Doing Speech Recognition

    • Running an OpenMM Molecular Model

    • Executing a Genomics Model

    • And lots more!
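
To give a feel for the first item on the list, here is a minimal sketch of submitting a Docker job from Python by shelling out to the Bacalhau CLI. It assumes the `bacalhau` binary is installed and on your PATH, and it mirrors the simple `bacalhau docker run <image> <command>` form. The helper names `build_docker_run` and `submit` are ours, purely for illustration; they are not part of the Python SDK, and you should check the docs for the options your version supports.

```python
import shlex
import shutil
import subprocess


def build_docker_run(image, command):
    """Build the argv for `bacalhau docker run <image> <command...>`.

    Hypothetical helper for illustration; not part of Bacalhau itself.
    """
    return ["bacalhau", "docker", "run", image, *command]


def submit(argv):
    """Run the CLI and return whatever it prints to stdout.

    Returns None when the `bacalhau` binary is not installed, so the
    sketch degrades gracefully on machines without the CLI.
    """
    if shutil.which(argv[0]) is None:
        return None
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


# Build the command for a simple "Hello World" job and show it.
argv = build_docker_run("ubuntu", ["echo", "Hello", "World"])
print(shlex.join(argv))
```

Calling `submit(argv)` would hand the job to whichever network your local CLI is configured for. Building the argument list separately from running it keeps the command easy to inspect before anything is sent.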

Long Term Mission

Our long-term goal is to transform how developers interact with the breadth of computing and data resources available to them. Some of the features we have on the horizon include:

  • A fully distributed computation platform that can run on any device, anywhere

  • A declarative pipeline that can both run the data processing and also record the lineage of the data

  • A highly resilient system that can schedule across latency boundaries and deliver the reliability a global deployment needs, even over spotty network connectivity

  • Secure, verifiable results whose integrity and reproducibility can be confirmed indefinitely

But you tell us! We'd love to hear about new directions we may need to include.

How to Get Involved

We're looking for help in several areas. If you're interested in helping out, please reach out to us at any of the following locations:

  • Our Website

  • Our Google Group

  • Our Slack

  • Our Repo

  • Our Docs

  • Our Place To Complain about Missing Features/File an Issue (and in Our Slack)

⭐️ Star on GitHub

Thanks for reading, and onward!

Your humble Bacalhau team
