We're very pleased to announce the release of Bacalhau 1.3.1. Following on from the huge 1.3.0 release in March this year, this release is a smaller one, but it's packed with some great features that we think are going to further cement Bacalhau at the heart of the distributed compute ecosystem.
Building on the momentum of shipping 1.3.0, we've been working not just on this release but also on 1.3.2 and 1.4.0, which we will be releasing over the next few weeks.
So, without further delay, let's dive into what's new in Bacalhau 1.3.1!
Introducing The Heartbeat
Understanding the state of a distributed network is a tricky proposition in the best of circumstances, and it's one of the things that Bacalhau is built from the ground up to be really good at handling. But we can always do better!
To that end, we've added a "heartbeat" protocol to this and future releases of Bacalhau to enhance the communication and coordination between Requester and Compute nodes.
With the introduction of this feature, a couple of things have changed for Requester and Compute nodes:
1. Requester nodes now host a heartbeat server
2. Compute nodes now include a heartbeat client
Requester and Compute nodes now remain in frequent contact with one another, with each Compute node sending a heartbeat message to the Requester node roughly every 15 seconds. From these signals, the Requester can keep track of all of the available nodes in the network and make more intelligent decisions about which jobs to schedule where, and when.
If a Compute node doesn't send a heartbeat message to the Requester node within a 30-second window, the Requester node will consider that Compute node to be disconnected and will not schedule any new jobs for it until it receives another heartbeat.
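The timing rules above can be sketched as a small bookkeeping model. This is an illustrative sketch in Python, not Bacalhau's implementation: the HeartbeatTracker class and its method names are hypothetical, and only the roughly 15-second interval and the 30-second window come from the description above.

```python
from dataclasses import dataclass, field

HEARTBEAT_INTERVAL_SECONDS = 15.0  # approximate send interval described above
DISCONNECT_AFTER_SECONDS = 30.0    # silence longer than this marks a node disconnected


@dataclass
class HeartbeatTracker:
    """Hypothetical Requester-side bookkeeping (not Bacalhau's real code)."""

    last_seen: dict[str, float] = field(default_factory=dict)

    def record(self, node_id: str, now: float) -> None:
        """Store the time at which a heartbeat arrived from a Compute node."""
        self.last_seen[node_id] = now

    def state(self, node_id: str, now: float) -> str:
        """A node that never sent a heartbeat, or went quiet, is DISCONNECTED."""
        seen = self.last_seen.get(node_id)
        if seen is None or now - seen > DISCONNECT_AFTER_SECONDS:
            return "DISCONNECTED"
        return "CONNECTED"

    def schedulable(self, now: float) -> list[str]:
        """Only nodes currently CONNECTED are candidates for new jobs."""
        return sorted(n for n in self.last_seen if self.state(n, now) == "CONNECTED")


tracker = HeartbeatTracker()
tracker.record("node-a", now=0.0)
tracker.record("node-b", now=0.0)
tracker.record("node-a", now=15.0)    # node-a keeps sending; node-b goes quiet
print(tracker.schedulable(now=40.0))  # ['node-a']
```

In this model, a node that has never sent a heartbeat is reported as DISCONNECTED, which is the same effect described in the migration notes below for Compute nodes that do not yet have a heartbeat client.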
This is the first time we've implemented a heartbeat feature in Bacalhau, so in order to utilize the full benefit, we recommend that all operators of Bacalhau networks upgrade both their Requester and Compute nodes to 1.3.1 at the earliest opportunity. To that end we have a short migration guide to help get you on your way.
Migration Recommendations
Upgrading Requester Nodes first, then Compute Nodes
When upgrading your Bacalhau network to 1.3.1, we strongly recommend updating your Requester nodes first, followed by upgrading each of your Compute nodes.
Once you've upgraded your Requester nodes, running bacalhau node list will show all of your Compute nodes in a DISCONNECTED state. This is expected at this point in the migration: the Requester node now uses its heartbeat server to determine each Compute node's connection state, whereas Compute nodes running <= 1.3.0 do not have the heartbeat client needed to signal their connectedness.
As you upgrade each of your Compute nodes to 1.3.1, bacalhau node list will show each upgraded Compute node as CONNECTED.
Depending on your setup, your Compute node's approval state may be listed as PENDING, APPROVED, or REJECTED. For any node that is not in the APPROVED state, you can run bacalhau node approve <NODE_ID>, which will enable the node to accept and execute jobs.
Upgrading Compute Nodes first, then Requester Nodes
If your setup requires that you upgrade your Compute nodes before you upgrade your Requester nodes, the Compute nodes will attempt to send heartbeat messages to your Requester node to signal their availability.
Your Requester nodes running Bacalhau <= 1.3.0 will have no awareness of the heartbeats coming from the Compute nodes, so they will disregard the heartbeat messages and continue to schedule jobs in the same manner as they have to date.
Your Compute nodes, however, will begin to log a warning stating heartbeat failed sending sequence <number>.
Migration Conclusion
To conclude, we strongly recommend that if you're going to update any part of your network to Bacalhau 1.3.1, you do so by upgrading your Requester nodes first, followed by your Compute nodes as soon as possible. While upgrading your Compute nodes first may seem like the path of least resistance, the benefits of having a functioning heartbeat in the system vastly outweigh those of a piece-by-piece migration of existing Bacalhau networks.
Authentication in the Web UI
Another addition in 1.3.1 is authentication flows in the Web UI. All requests made through the Web UI are now monitored, and if an invalid or unauthorized request is made, the authentication flow opens to give the user the opportunity to authenticate their session before continuing.
This update won't affect any systems with the default authorization policy, as no authentication is required at that point, but if a non-anonymous authentication policy or a username/password has been configured for your system, the new authentication flow will appear in the Web UI.
Deprecation of IPFS Flags
As part of our continuing transition to NATS from libp2p as our primary network protocol, we're deprecating some of the now legacy IPFS flags that have been present in Bacalhau to date.
The following flags have been deprecated in this release. They will still function, but will be removed in future versions.
--ipfs-swarm-addrs
--ipfs-serve-path
--ipfs-profile
--ipfs-swarm-listen-addresses
--ipfs-gateway-listen-addresses
--ipfs-api-listen-addresses
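If you launch Bacalhau from scripts or service definitions, a quick check like the following can flag invocations that still rely on these options. It is a minimal sketch: the find_deprecated_flags helper is hypothetical, the example command line is made up for illustration with placeholder values, and only the flag names are taken from the list above.

```python
# Flag names copied from the deprecation list above.
DEPRECATED_IPFS_FLAGS = {
    "--ipfs-swarm-addrs",
    "--ipfs-serve-path",
    "--ipfs-profile",
    "--ipfs-swarm-listen-addresses",
    "--ipfs-gateway-listen-addresses",
    "--ipfs-api-listen-addresses",
}


def find_deprecated_flags(args: list[str]) -> list[str]:
    """Return the deprecated flags used in args, in order of first appearance.

    Handles both the "--flag value" and "--flag=value" spellings.
    """
    found: list[str] = []
    for arg in args:
        name = arg.split("=", 1)[0]
        if name in DEPRECATED_IPFS_FLAGS and name not in found:
            found.append(name)
    return found


# Illustrative invocation only; the values are placeholders.
command = ["bacalhau", "serve", "--ipfs-swarm-addrs=<addr>", "--ipfs-profile", "<profile>"]
for flag in find_deprecated_flags(command):
    print(f"warning: {flag} is deprecated and will be removed in a future version")
```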
Optimized Job Queuing
To date, Compute nodes have rejected jobs when they have no capacity to take them on. From 1.3.1, Compute nodes will accept and queue jobs if they have no immediate capacity to execute them, provided there are no additional circumstances that would typically warrant a rejection. This is part of our continuing work on improving resource delegation and management, with further updates to come in the near future.
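The change in behaviour can be illustrated with a toy model. This sketch assumes a Compute node with a fixed number of job slots and a simple first-in, first-out queue; the ComputeNode class, the acceptable parameter, and the returned status strings are all hypothetical and do not reflect Bacalhau's actual scheduler or queue ordering.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ComputeNode:
    """Toy model of a Compute node with a fixed number of job slots."""

    capacity: int
    running: list[str] = field(default_factory=list)
    queued: deque = field(default_factory=deque)

    def submit(self, job_id: str, acceptable: bool = True) -> str:
        """Accept a job, queueing it when every slot is busy.

        acceptable=False stands in for any other reason a node would
        still reject a job; lack of capacity alone no longer rejects.
        """
        if not acceptable:
            return "rejected"
        if len(self.running) < self.capacity:
            self.running.append(job_id)
            return "running"
        self.queued.append(job_id)
        return "queued"

    def finish(self, job_id: str) -> None:
        """Free a slot and start the oldest queued job, if there is one."""
        self.running.remove(job_id)
        if self.queued:
            self.running.append(self.queued.popleft())


node = ComputeNode(capacity=1)
print(node.submit("job-1"))  # running
print(node.submit("job-2"))  # queued (would have been rejected before 1.3.1)
node.finish("job-1")
print(node.running)          # ['job-2']
```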
Summary
We’re so pleased to get this release out into the hands of the community, and we’re working hard to get 1.3.2 and 1.4.0 out in the coming weeks. We can’t wait to share what we’ve been up to, and we can’t wait to see what you’re going to build!
With the inclusion of the heartbeat protocol, Web UI authentication, and updates to accelerate our transition to NATS, we’re incredibly excited about the future of distributed compute with Bacalhau.
How to Get Involved
We're looking for help in various areas. If you're interested in helping, there are several ways to contribute. Please reach out to us at any of the following locations.
Commercial Support
While Bacalhau is open-source software, the Bacalhau binaries go through the security, verification, and signing build process lovingly crafted by Expanso. You can read more about the difference between open-source Bacalhau and commercially supported Bacalhau in our FAQ. If you would like to use our pre-built binaries and receive commercial support, please contact us!