Project Bacalhau

Bacalhau Project Report - Jan 13, 2023

blog.bacalhau.org

Networking and a Python SDK.

Luke Marsden
Jan 13, 2023

We are back from our holidays and raring to go — check out what we’ve got done in the few days since everyone has been back.

Networking dawns 🌏🌅

For many moons folks have been asking for the ability to access the internet from inside a Bacalhau job. Giving jobs full unfettered internet access has two problems:

  1. Malicious users might do nasty things, like using the network to DDoS websites they don’t like

  2. It affects reproducibility, a key goal of the project. Even downloading the same URL twice might give you different results, so if a job is allowed to access the network, we can’t guarantee that running the same deterministic function on the same inputs twice will yield the same results

We’ve addressed the first problem first by restricting network traffic to only HTTP(S). Other protocols can wait - you can do a lot with HTTP(S)!

Then we made it harder to DDoS websites by capping the bitrate (max 10 Mbit/sec) and by restricting which domains you can access using an allowlist. The allowlist is specified in two places: on the server side, so Bacalhau server operators can choose which jobs they want to support, and in the job spec, so users can state which domains they want access to. The network then matches users requesting certain domains with servers that are willing to allow requests to them.
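To make the bitrate cap concrete, here is a minimal token-bucket sketch in Python. It illustrates one common way to cap throughput and is not a description of how Bacalhau's proxy actually enforces the limit. The class name `TokenBucket`, the burst size, and the injectable clock are all assumptions made for this example.

```python
import time


class TokenBucket:
    """Caps throughput at rate_bits_per_sec, allowing bursts up to burst_bits."""

    def __init__(self, rate_bits_per_sec, burst_bits, clock=time.monotonic):
        self.rate = rate_bits_per_sec
        self.capacity = burst_bits
        self.tokens = burst_bits  # start with a full bucket
        self.clock = clock        # injectable so the bucket can be tested without sleeping
        self.last = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last call, up to the capacity.
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)

    def try_send(self, n_bytes):
        """Return True and spend tokens if n_bytes may be sent now, else False."""
        self._refill()
        needed = n_bytes * 8  # tokens are counted in bits
        if needed <= self.tokens:
            self.tokens -= needed
            return True
        return False


# 10 Mbit/sec, as in the post; the 1 Mbit burst allowance is made up for this sketch.
TEN_MBIT = 10_000_000
bucket = TokenBucket(TEN_MBIT, burst_bits=1_000_000)
```

A proxy built this way would call `try_send` before forwarding each chunk, and delay or drop the chunk when it returns `False`.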

And so with that, you can now – securely, and without risking DDoS – build Bacalhau inside Bacalhau! For example, this JobSpec:

Says to Bacalhau: hey, I want to run on a node that is OK with accessing these domains over HTTP(S), and on one of those nodes, I want to run `go install github.com/filecoin-project/bacalhau@v0.3.16` - and presto - it will match with a node with a permissive policy for those domains and go run the job!
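The matching step described above can be sketched as a simple subset check: a node is eligible only if its allowlist covers every domain the job asks for. This is a hedged illustration, not Bacalhau's actual scheduler code. The function names, the exact-match comparison (no wildcards), and the domains and node names below are all hypothetical.

```python
def node_accepts(job_domains, node_allowlist):
    """A node can take a job only if it allows every domain the job requests."""
    requested = {d.lower() for d in job_domains}
    allowed = {d.lower() for d in node_allowlist}
    return requested <= allowed  # subset test


def eligible_nodes(job_domains, nodes):
    """Return the sorted names of nodes (name -> allowlist) willing to serve this job."""
    return sorted(
        name for name, allowlist in nodes.items()
        if node_accepts(job_domains, allowlist)
    )


# Hypothetical domains and node names, for illustration only.
job = ["proxy.golang.org", "sum.golang.org"]
nodes = {
    "node-a": ["proxy.golang.org", "sum.golang.org", "github.com"],
    "node-b": ["example.com"],
    "node-c": [],  # this operator allows no networking at all
}
print(eligible_nodes(job, nodes))
```

In this sketch a job that requests no domains matches every node, since the empty set is a subset of any allowlist.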

Dealing with the reproducibility issue is a bit thornier, and we haven’t fully tackled it yet. But because our HTTP(S) networking goes via our own proxy, we’re able to add request/response logging to that proxy, so that we can either (a) record the requests and responses and check whether they match on a subsequent request, or (b) replay a job that needed networking using the recorded requests and responses, without actually having to go out onto the network a second time.
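As a rough sketch of the two options, (a) verify and (b) replay, here is a tiny in-memory recorder in Python. It is only an illustration of the idea under simplifying assumptions: the class `RecordingProxy`, its method names, and the `fetch` callable are hypothetical, responses are keyed by method and URL only, and nothing here reflects how the Bacalhau proxy is actually built.

```python
import hashlib


class RecordingProxy:
    """Records responses by (method, url) and can verify or replay them later."""

    def __init__(self, fetch):
        self.fetch = fetch  # callable (method, url) -> bytes that does the real request
        self.log = {}       # (method, url) -> recorded response body

    def record(self, method, url):
        """Perform the request and remember the response."""
        body = self.fetch(method, url)
        self.log[(method, url)] = body
        return body

    def verify(self, method, url):
        """(a) Re-fetch and report whether the response matches the recording."""
        recorded = self.log[(method, url)]
        fresh = self.fetch(method, url)
        return hashlib.sha256(fresh).digest() == hashlib.sha256(recorded).digest()

    def replay(self, method, url):
        """(b) Serve the recorded response without touching the network."""
        return self.log[(method, url)]
```

Comparing digests rather than raw bodies means a real log would only need to store a hash for option (a), while option (b) needs the full response body.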

BTW, this opens the use case for using Bacalhau as a runner in your CI system, which we’re pretty psyched about (and we know someone else who is too).

Python SDK 🐍🚀

We’ve now completed the work on our Python SDK:

  • A low-level Python bacalhau-apiclient lib that includes the API endpoints and model classes + auto-generation in CI

  • A high-level Python bacalhau-sdk lib (with utils like “sign request”) + tests in CI

The latter is super cool for users because it means they will soon just be able to run `pip install bacalhau-sdk` and then interact with Bacalhau with a nice, Pythonic interface. The SDK will take care of signing API requests with your Bacalhau key. More details and examples coming soon!

What’s next? ⏩

  • Finish hardening our IPFS integration to make it more resilient as we push more and more data through the network

  • Airflow integration to start using our production-ready Python SDK

  • Stable diffusion as a service paid for in Filecoin

  • Bacalhau dashboard web interface

  • Efficiency gains in job scheduling, moving from O(n^2) to O(n)

Questions/comments? Let us know!

  • Our Website

  • Our Google Group

  • Our Slack

  • Our Repo

  • Our Docs

  • Our Build Instructions

  • Our Place To Complain about Missing Features/File an Issue (and in Our Slack)

Thanks for reading!

Your Humble Bacalhau Team
