Bacalhau Project Report – April 24, 2023
Insulated jobs! Log streaming for WASM jobs, S3 storage provider & publisher, big upgrades to Amplify, and a Waterlily mainnet release!
This week we released Bacalhau v0.3.28, featuring an important selection of improvements and fixes! Node operators and users will need to upgrade their Bacalhau software to make use of this release – if you’re running a public Bacalhau node, please upgrade now to continue to be a part of the network.
Use AWS S3 for data input and output 🪣🤩
We’ve just landed support for using data in Amazon S3 buckets as input to and output from Bacalhau jobs! All the data that you store in S3 can now be accessed easily from Bacalhau, provided the compute node is appropriately configured. And once your job is complete, your results can also be written back to S3. Wow!
What’s more, our support works with S3-compatible providers such as MinIO – so if you want to run S3-like data stores locally, you can now make this data available to Bacalhau jobs too!
Amplify persistence, web UI, image-resize & video-resize 🌠🔁
Our data automation platform Amplify has received some major upgrades! Amplify data is now persisted so that the current state of workflows survives restarts, making Amplify more resilient to interruptions. We now also have a web UI allowing operators to inspect Amplify jobs, demoed above by the legendary Dr Phil.
We’ve also been working hard on new pipelines for Amplify and are proud to release two new pipelines for rescaling images and videos. Now, whenever Amplify detects new image or video data, it will automatically rescale the media into a variety of resolutions and formats – making it easier to work with that data next time. Awesome!
Compute over private data with insulated jobs 💿🔐
We released our first insulated job features this week! These allow owners of private data to moderate access to their data and only allow certain jobs to run against it. This unlocks running Bacalhau networks between organisations, in federated environments and even as an easy way to access huge datasets made available as a public good.
Moderators can use a simple dashboard to see which jobs want to make use of their data, and approve or reject them as appropriate. Above, I demo the act of submitting a job against a moderated cluster and moderating the result. Check it out!
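For readers who like to see the idea in code, here is a minimal Python sketch of the approve-or-reject flow described above. It is purely illustrative: the names ModerationQueue, JobRequest, Verdict and may_run are hypothetical and are not part of the Bacalhau API. It only models the rule that a job may run against moderated data once a moderator has approved it.

```python
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class JobRequest:
    # Hypothetical record of a job asking to use a moderated dataset.
    job_id: str
    dataset: str
    verdict: Verdict = Verdict.PENDING


@dataclass
class ModerationQueue:
    # Hypothetical in-memory stand-in for the moderator's dashboard.
    requests: dict = field(default_factory=dict)

    def submit(self, job_id: str, dataset: str) -> JobRequest:
        """Register a job; it starts out pending and cannot run yet."""
        request = JobRequest(job_id, dataset)
        self.requests[job_id] = request
        return request

    def pending(self) -> list:
        """Jobs still waiting for a moderator's decision."""
        return [r for r in self.requests.values() if r.verdict is Verdict.PENDING]

    def moderate(self, job_id: str, approve: bool) -> JobRequest:
        """Record the moderator's decision; each job is decided only once."""
        request = self.requests[job_id]
        if request.verdict is not Verdict.PENDING:
            raise ValueError(f"job {job_id} has already been moderated")
        request.verdict = Verdict.APPROVED if approve else Verdict.REJECTED
        return request

    def may_run(self, job_id: str) -> bool:
        """Only approved jobs are allowed to touch the private data."""
        return self.requests[job_id].verdict is Verdict.APPROVED
```

In this sketch a submitted job stays blocked until moderate() is called with approve=True, and a rejected job never becomes runnable. How the real feature stores and enforces decisions may well differ.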
Log streaming for WASM jobs 🪵👩💻
In our previous project update we announced that you can now stream logs from Docker jobs whilst the job is still running – a great way to see how your long-running job is progressing.
We’ve expanded this support to now include WebAssembly jobs too! So no matter what kind of job you’re running, you can use the bacalhau logs command to see the current output from your job.
Waterlily deployed to mainnet (with automatic artist training pipeline) 🪷🏭⚡️
Our ethical AI art generator Waterlily.ai is now deployed to FVM mainnet, and with it comes a way for artists to submit their artwork to be a part of the Waterlily program.
Artists can now upload a portfolio of their existing work and Waterlily will automatically train an image generator model using their art (using Bacalhau of course!). Once the generator is trained, it is available to all Waterlily users and the artist can start earning revenue from any image generated from their model! It’s a super simple system that is bringing some much needed control back to the generative art space 😤
Questions/comments? Let us know!
Thanks for reading!
Your Humble Bacalhau Team