Bacalhau 1.2 makes connecting compute nodes into a cluster even easier by adding streamlined node bootstrapping.
What is Node Bootstrapping?
Bootstrapping refers to the process of starting up and initializing a system. It involves setting up the necessary components, establishing connections between different nodes or machines, and preparing the system to run.
This process allows the nodes in a cluster to discover and identify other nodes, to configure themselves, and to establish communication channels with those nodes. As the shape and size of the cluster change, node bootstrapping provides a mechanism for managing node membership that requires little or no human intervention. It also provides recovery tools for building solutions that are resilient to node failure.
Bootstrapping in Bacalhau
In Bacalhau, the bootstrapping process helps compute nodes discover and connect to the requester nodes that supply them with work. Without this process, the requester node would need to be told manually about every intended compute node in the cluster. In cross-region, cross-cloud clusters, this would become an extremely tedious task.
Current versions of Bacalhau take advantage of LibP2P, a modular and extensible framework for building peer-to-peer systems. LibP2P is a protocol-agnostic tool that provides secure, resilient and fault-tolerant network connectivity between nodes. It works asynchronously to identify other nodes in the network, but it must first be connected to one or more known nodes.
To initiate the peer-to-peer network, the compute nodes must be told where to find requester nodes. In previous versions, this required knowing the cryptographic hash of the requester node’s public key, something that wasn’t available until the requester node was already running. Given the requester node’s IP address and Peer ID (which contains a hash of the node’s public key), bootstrapping requires running the compute node with the following multiaddress:
bacalhau serve \
--peer /ip4/127.0.0.1/tcp/1235/p2p/QmdZQ7ZbhnvWY1J12XYKGHApJ6aufKyLNvf8jZBrBaAVL
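To make the structure of that multiaddress concrete, here is a minimal sketch that pulls it apart using plain POSIX shell parameter expansion. It is only an illustration of how the address is laid out; the variable names are ours, and this is not how Bacalhau or LibP2P parse addresses internally.

```shell
# The multiaddress from the example above.
addr="/ip4/127.0.0.1/tcp/1235/p2p/QmdZQ7ZbhnvWY1J12XYKGHApJ6aufKyLNvf8jZBrBaAVL"

# Drop the leading "/ip4/" so the string starts at the IP address.
rest="${addr#/ip4/}"
# Everything up to the next "/" is the IP address.
ip="${rest%%/*}"          # 127.0.0.1

# Drop everything up to and including "/tcp/" to reach the port.
rest="${rest#*/tcp/}"
port="${rest%%/*}"        # 1235

# The Peer ID is the final path component.
peer_id="${addr##*/}"

echo "requester IP: ${ip}"
echo "swarm port:   ${port}"
echo "peer ID:      ${peer_id}"
```

The IP address and port are things you can choose or discover ahead of time. The final component, the Peer ID, is the part that only exists once the requester node is up.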
When building a cluster this way, the Peer ID is only available once the requester node has been started. This means that our infrastructure-as-code tools are unable to provide this detail to the compute nodes during setup, making the process more complicated than it should be. One thing the infrastructure tools can pass to the compute nodes is the IP address they have just assigned to the requester node, and this is what we use to streamline the process.
Trying Streamlined Bootstrapping Today
With Bacalhau 1.2, we investigated whether the compute node could just ask the requester node for its Peer ID, and realized it can if it knows where the requester node’s API is hosted. With the latest version, a compute node that knows the requester node’s IP address can bootstrap with the following address, which points at the requester node’s API, available over HTTP:
bacalhau serve \
--peer /ip4/127.0.0.1/tcp/1234/http
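As a rough sketch of how this fits into an infrastructure-as-code setup, a provisioning script only needs the requester node's IP address to assemble the `--peer` value. The `build_peer_addr` helper, the `REQUESTER_IP` variable and the `10.0.0.5` placeholder below are hypothetical names chosen for illustration, and the default port of 1234 is simply the one used in the example above.

```shell
# Hypothetical helper: build the HTTP bootstrap address from the requester's IP.
# The default port (1234) is taken from the example above; pass a second
# argument if your requester node serves its API on a different port.
build_peer_addr() {
    requester_ip="$1"
    api_port="${2:-1234}"
    printf '/ip4/%s/tcp/%s/http\n' "$requester_ip" "$api_port"
}

# REQUESTER_IP would normally be injected by the provisioning tool;
# 10.0.0.5 is only a placeholder.
REQUESTER_IP="${REQUESTER_IP:-10.0.0.5}"
PEER_ADDR="$(build_peer_addr "$REQUESTER_IP")"

# Print the command rather than running it, so the sketch is safe to try.
echo "bacalhau serve --peer ${PEER_ADDR}"
```

Because nothing here depends on the requester node's key, the same script can be rendered for every compute node as soon as the requester's IP address is known.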
Whilst the change looks small, it significantly improves the process of building a cluster. Now, nodes can bootstrap simply by knowing the IP address of the requester node, eliminating the need to know its public key ahead of time.
Conclusion
The updates in Bacalhau 1.2 have simplified how compute nodes bootstrap with the cluster's requester nodes. This enhancement makes installation and configuration more straightforward, resulting in a less complex infrastructure-as-code setup. You can learn more about the Bacalhau architecture, and about networking in particular, in the official documentation.
5 Days of Bacalhau 1.2 Blog Series
If you’re interested in exploring these features more in depth, check back tomorrow for our 5 Days of Bacalhau.
Day 1 - Job Templates
Day 2 - Streamlined Node Bootstrap
Day 5 - Instrumenting WASM: Enhanced Telemetry with Dylibso Observe SDK
How to Get Involved
We're looking for help in several areas, and there are several ways to contribute. If you're interested, please reach out to us at any of the following locations.
Commercial Support
While Bacalhau is open-source software, the Bacalhau binaries go through the security, verification, and signing build process lovingly crafted by Expanso. You can read more about the difference between open source Bacalhau and commercially supported Bacalhau in our FAQ. If you would like to use our pre-built binaries and receive commercial support, please contact us!