Bacalhau has long supported the execution of WebAssembly modules and Docker containers, but the work involved in preparing Docker containers can sometimes be slow, laborious, and repetitive when all you want to do is run some Python code.
The release of Bacalhau v1.3.0 adds an experimental new feature, the “exec” command, which allows for more flexibility in defining custom job types with Bacalhau. Because the feature is experimental, it lets us explore how users prefer to submit code and find the best approach. Please give this feature a try, and send us your feedback.
Simplifying Job Submissions
When shipping Python code to Bacalhau, the process normally involves:
Finding a Docker image to use as the base
Writing a Dockerfile to copy the code and install the dependencies
Pushing the Docker image to a remote registry
Using the Docker image in a Bacalhau command
Even after these steps, it is common to find that a Docker image built on one machine type is not supported on the Bacalhau cluster being used. For example, an image built on an M1 Mac for its native architecture will not run on an x64 Linux machine. This can be time-consuming and ultimately frustrating when you just want to run some Python code or invoke a specific command.
The new exec command aims to allow users to include local code in their job submission, even if it has dependencies, and submit the code directly without worrying about Docker images or Docker registries.
Private Cluster Requirements
Using the feature requires some configuration in your server setup. Specifically, your requester (or hybrid) node needs the following two flags:
--requester-job-translation-enabled --job-selection-accept-networked
These two flags will allow custom job types to be accepted and allow them network access for fetching any dependencies at runtime. If you’re using the demo network, the good news is that these settings are already applied so you can try the feature out there.
Getting Started With the “exec python” Command
In Bacalhau v1.3.0, we are shipping two custom job types: Python and DuckDB. It’s possible that the interface might change in response to feedback, so checking the online docs is a good idea. Both commands work in similar ways: you can optionally point at some code and provide command-line arguments to the program (with any that clash with Bacalhau’s own flags appearing after a double-dash, --).
The simplest possible command prints the ‘Zen of Python’, and you can do that with:
$ bacalhau exec python -- -c "import this"
This will provide you with a job ID, which you can use with bacalhau describe to see the output.
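If you submit jobs from a script rather than by hand, the command shape described above can be assembled programmatically. The following is a minimal sketch, not part of Bacalhau: build_exec_command is a hypothetical helper that relies only on the CLI layout shown in this post (the optional --code flag, the job type, then the program's arguments, with a -- separator when an argument looks like a flag).

```python
import shlex
from typing import List, Optional, Sequence


def build_exec_command(job_type: str,
                       program_args: Sequence[str],
                       code_path: Optional[str] = None) -> List[str]:
    """Assemble an argv list for `bacalhau exec` (hypothetical helper)."""
    argv = ["bacalhau", "exec"]
    if code_path is not None:
        argv += ["--code", code_path]
    argv.append(job_type)
    # Arguments that start with "-" could be mistaken for bacalhau's own
    # flags, so place them after a double-dash separator.
    if any(arg.startswith("-") for arg in program_args):
        argv.append("--")
    argv += list(program_args)
    return argv


if __name__ == "__main__":
    cmd = build_exec_command("python", ["-c", "import this"])
    # Show the command as a shell-quoted string.
    print(shlex.join(cmd))
    # To actually submit it, something like
    # subprocess.run(cmd, check=True) could be used.
```

Building the command as a list rather than a single string avoids shell quoting problems when an argument contains spaces.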
If you have a single Python file you wish to execute, you can send it with your command as shown below. As an added extra, any pip install command found in the module docstring will be executed before the script is run, allowing you to install dependencies.
$ bacalhau exec --code fib.py python fib.py 10
If the file fib.py is in the current directory with the following contents, then the output will be available either through the job describe command or with bacalhau logs <jobid>.
"""
Displays fibonacci of a provided natural number, n.
pip install colorama
"""
import sys

from colorama import Fore


def fibonacci(n):
    # The first two Fibonacci numbers are their own index.
    if n in {0, 1}:
        return n
    prev, result = 0, 1
    for _ in range(2, n + 1):
        # Slide the window forward: the old result becomes prev.
        prev, result = result, prev + result
    return result


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python fib.py <number>")
        sys.exit(1)
    try:
        n = int(sys.argv[1])
    except ValueError:
        print(f"{Fore.RED}Error: Invalid number{Fore.RESET}")
        sys.exit(1)
    print(fibonacci(n))
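To make the docstring convention concrete, here is a minimal sketch of how pip install lines could be picked out of a module docstring. It is an illustration only and not necessarily how Bacalhau does it: pip_commands_from_source is a hypothetical name, and the rule that a command must sit on its own line starting with "pip install " is an assumption.

```python
import ast
from typing import List


def pip_commands_from_source(source: str) -> List[str]:
    """Return the `pip install ...` lines found in a module docstring."""
    # ast.get_docstring returns None when the module has no docstring.
    docstring = ast.get_docstring(ast.parse(source)) or ""
    return [
        line.strip()
        for line in docstring.splitlines()
        if line.strip().startswith("pip install ")
    ]


# A shortened copy of the start of fib.py, used as sample input.
EXAMPLE = '''"""
Displays fibonacci of a provided natural number, n.
pip install colorama
"""
import sys
'''

if __name__ == "__main__":
    print(pip_commands_from_source(EXAMPLE))
```

Parsing with ast rather than importing the file means the script is never executed while its dependencies are still missing.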
If you have a more complex script to run, Bacalhau will also allow you to send a directory of code with the submission. Specify the directory path with --code, and Bacalhau will automatically compress the code and attach it to the job. There is a 10MB limit, however, so it is always best to deliver any inputs with the --inputs flag rather than in the code directory.
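If you want a rough idea of whether a directory will fit before submitting it, a check like the following could help. This is a sketch under stated assumptions: the archive format Bacalhau uses is not described here, so gzip-compressed tar is a guess, "10MB" is read as 10 * 1024 * 1024 bytes, and pack_code_dir is a hypothetical helper rather than a Bacalhau API.

```python
import io
import tarfile
from pathlib import Path

# Assumed interpretation of the "10MB" limit mentioned above.
MAX_CODE_BYTES = 10 * 1024 * 1024


def pack_code_dir(code_dir: str, limit: int = MAX_CODE_BYTES) -> bytes:
    """Gzip-tar a directory in memory and reject archives over `limit` bytes."""
    root = Path(code_dir)
    if not root.is_dir():
        raise NotADirectoryError(code_dir)
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        # arcname="." stores paths relative to the directory itself.
        archive.add(str(root), arcname=".")
    data = buffer.getvalue()
    if len(data) > limit:
        raise ValueError(
            f"compressed code is {len(data)} bytes, over the {limit}-byte limit"
        )
    return data
```

A result close to the limit is a good hint that data files have crept into the code directory and belong in job inputs instead.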
When sending a directory of code, dependencies are located from a requirements.txt file, a pyproject.toml, or a setup.py. In addition, if the pyproject.toml specifies that it uses Poetry, the system will install the script with Poetry.
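The lookup described above might be sketched as follows. The order in which the files are checked, the exact install commands, and the test for Poetry (a [tool.poetry] table in pyproject.toml) are all assumptions made for illustration; install_plan and uses_poetry are hypothetical names, not Bacalhau functions.

```python
from pathlib import Path
from typing import List, Optional

try:
    import tomllib  # in the standard library from Python 3.11
except ImportError:  # older interpreters fall back to a plain text check
    tomllib = None


def uses_poetry(pyproject: Path) -> bool:
    """Guess whether a pyproject.toml is managed by Poetry."""
    text = pyproject.read_text(encoding="utf-8")
    if tomllib is None:
        return "[tool.poetry" in text
    try:
        data = tomllib.loads(text)
    except tomllib.TOMLDecodeError:
        return False
    return "poetry" in data.get("tool", {})


def install_plan(code_dir: str) -> Optional[List[str]]:
    """Pick an install command for a code directory (assumed precedence)."""
    root = Path(code_dir)
    if (root / "requirements.txt").is_file():
        return ["pip", "install", "-r", "requirements.txt"]
    pyproject = root / "pyproject.toml"
    if pyproject.is_file():
        if uses_poetry(pyproject):
            return ["poetry", "install"]
        return ["pip", "install", "."]
    if (root / "setup.py").is_file():
        return ["pip", "install", "."]
    return None  # no dependency metadata found
```

Running a check like this locally before submitting can catch a missing or misnamed dependency file early.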
Implementing Custom Job Types
In implementing custom job types, we began separating the execution environment from the job type, allowing greater flexibility in where code is executed without requiring the user to understand that environment. The Python job type described above is translated by the Bacalhau server from a Python job into a Docker job that uses a pre-built container. Future releases might run the same code in a micro-VM or, following recompilation, as WebAssembly.
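Conceptually, the translation step maps a description of what to run onto a description of where to run it. The sketch below illustrates only that idea: the PythonJob and DockerJob classes, their fields, and the placeholder image name are all hypothetical and do not reflect Bacalhau's real job specification or its published image.

```python
from dataclasses import dataclass, field
from typing import List

# Placeholder image name for illustration, not the image Bacalhau ships.
PLACEHOLDER_IMAGE = "example.invalid/python-runner:3.11"


@dataclass
class PythonJob:
    """What the user asked for: arguments to hand to Python."""
    arguments: List[str]


@dataclass
class DockerJob:
    """What actually runs: a container image plus the command inside it."""
    image: str
    entrypoint: List[str]
    parameters: List[str] = field(default_factory=list)


def translate(job: PythonJob, image: str = PLACEHOLDER_IMAGE) -> DockerJob:
    """Map a Python job onto a Docker job that uses a pre-built image."""
    return DockerJob(
        image=image,
        entrypoint=["python"],
        parameters=list(job.arguments),
    )


if __name__ == "__main__":
    print(translate(PythonJob(arguments=["fib.py", "10"])))
```

Because the user only ever describes the Python side, the target could later be swapped for a different execution environment without changing how jobs are submitted.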
In the initial release, we provide a single Docker container with Python 3.11, which you can find in the Bacalhau GitHub repository at https://github.com/bacalhau-project/bacalhau/tree/main/docker/custom-job-images/python. In the future, we may provide different versions of Python (accessible via the --version flag), or containers with default sets of dependencies installed for specific purposes.
Giving Feedback
As ever, we are extremely interested in feedback on Bacalhau features: what works, what doesn’t work, and what could be better. If you have ideas or suggestions for improvements to the interface for bacalhau exec, or have a preference for future custom job types, please come to the Bacalhau Slack and let us know. Alternatively, if you prefer, you can record issues in the Bacalhau repository and we’re more than happy to have a conversation there.
Conclusion
The introduction of custom job types will enable more users to submit jobs to Bacalhau without requiring a strong knowledge of the underlying execution platform, such as Docker. As we refine both the user interface and the choice of custom jobs based on user feedback, we hope that future job types can further simplify the use of Bacalhau from the command line.
If you’re interested in learning more about distributed computing and how it can benefit your work, there are several ways to connect with us. Visit our website, sign up to our bi-weekly office hour, join our Slack or send us a message.
How to Get Involved
We're looking for help in various areas. If you're interested in helping, there are several ways to contribute. Please reach out to us at any of the following locations.
Commercial Support
While Bacalhau is open-source software, the Bacalhau binaries go through the security, verification, and signing build process lovingly crafted by Expanso. You can read more about the difference between open-source Bacalhau and commercially supported Bacalhau in our FAQ. If you would like to use our pre-built binaries and receive commercial support, please contact us!