Introduction
In the ever-evolving world of distributed computing, efficiency and flexibility are paramount. That’s exactly what we focus on with Bacalhau, our open-source framework for distributed compute orchestration. We’re constantly pushing for innovations that make your life easier.
In this blog post, we're excited to share our latest development: Job Templates. This isn't just another feature; it’s a major leap forward in simplifying and streamlining your computing tasks.
Overview
Bacalhau's Job Templates let you inject variables into your job specifications at runtime. Instead of editing the spec by hand for each run, you put placeholders in it and supply the actual values when you submit the job. Whether it's DuckDB queries, S3 buckets, prefixes, or time ranges, the same spec can now cover all of these variations.
Templating Implementation
Bacalhau's templating feature is built upon the powerful Go text/template package. This library provides an array of features for manipulating and formatting text using template definitions and input variables. For detailed information about the Go text/template library and its syntax, the official documentation is a great resource.
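To give a feel for what text/template does with a placeholder, here is a minimal sketch in plain Go. It is not Bacalhau's own code: the render helper is a hypothetical name used only for illustration, and we assume the variables arrive as a simple map of strings.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render is a hypothetical helper (not part of Bacalhau): it parses text as a
// Go template and executes it against the given variables.
func render(text string, vars map[string]string) (string, error) {
	tmpl, err := template.New("job").Parse(text)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, vars); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	// {{.greeting}} and {{.name}} look up the matching keys in the map.
	out, err := render("echo {{.greeting}} {{.name}}", map[string]string{
		"greeting": "Hello",
		"name":     "World",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // echo Hello World
}
```

Each {{.key}} action is replaced by the value stored under that key, and everything outside the double braces is copied through unchanged.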
Usage Examples
Let's dive into some usage examples to see how Job Templates work:
Sample Job Spec (job.yaml)
Name: docker job
Type: batch
Count: 1
Tasks:
  - Name: main
    Engine:
      Type: docker
      Params:
        Image: ubuntu:latest
        Entrypoint:
          - /bin/bash
        Parameters:
          - -c
          - echo {{.greeting}} {{.name}}
Running with Templating:
bacalhau job run job.yaml --template-vars "greeting=Hello,name=World"
Defining the Flag Multiple Times:
bacalhau job run job.yaml --template-vars "greeting=Hello" --template-vars "name=World"
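The two forms are meant to produce the same set of variables. As a rough sketch of that equivalence, the snippet below merges flag values into one map. The parseTemplateVars helper is hypothetical and is not how Bacalhau parses the flag; in particular it assumes that values never contain commas and that a later pair overrides an earlier one with the same key.

```go
package main

import (
	"fmt"
	"strings"
)

// parseTemplateVars is a hypothetical helper: it merges repeated flag values,
// each a comma-separated list of key=value pairs, into a single map.
// Assumptions: values contain no commas; later pairs override earlier ones.
func parseTemplateVars(flags []string) (map[string]string, error) {
	vars := make(map[string]string)
	for _, flag := range flags {
		for _, pair := range strings.Split(flag, ",") {
			key, value, ok := strings.Cut(pair, "=")
			if !ok || key == "" {
				return nil, fmt.Errorf("expected key=value, got %q", pair)
			}
			vars[key] = value
		}
	}
	return vars, nil
}

func main() {
	single, err := parseTemplateVars([]string{"greeting=Hello,name=World"})
	if err != nil {
		panic(err)
	}
	repeated, err := parseTemplateVars([]string{"greeting=Hello", "name=World"})
	if err != nil {
		panic(err)
	}
	fmt.Println(single["greeting"], single["name"])     // Hello World
	fmt.Println(repeated["greeting"], repeated["name"]) // Hello World
}
```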
Disabling Templating:
bacalhau job run job.yaml --no-template
Using Environment Variables:
You can also use environment variables for templating. For example:
export greeting=Hello
export name=World
bacalhau job run job.yaml --template-envs "*"
Passing A Subset of Environment Variables:
bacalhau job run job.yaml --template-envs "greeting|name"
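To illustrate what selecting a subset of environment variables could look like, here is a small sketch. The selectEnv helper is hypothetical, and the matching rules are our assumptions for the example rather than a description of Bacalhau's implementation: "*" is treated as "everything", and any other pattern is treated as a regular expression that must match the whole variable name.

```go
package main

import (
	"fmt"
	"os"
	"regexp"
	"strings"
)

// selectEnv is a hypothetical helper: it returns the entries of environ
// (in "KEY=value" form) whose names match pattern.
// Assumptions: "*" selects every variable; any other pattern is a regular
// expression anchored to the whole variable name.
func selectEnv(environ []string, pattern string) (map[string]string, error) {
	if pattern == "*" {
		// A bare "*" is not a valid Go regular expression, so map it to ".*".
		pattern = ".*"
	}
	re, err := regexp.Compile("^(?:" + pattern + ")$")
	if err != nil {
		return nil, err
	}
	vars := make(map[string]string)
	for _, entry := range environ {
		name, value, ok := strings.Cut(entry, "=")
		if ok && re.MatchString(name) {
			vars[name] = value
		}
	}
	return vars, nil
}

func main() {
	os.Setenv("greeting", "Hello")
	os.Setenv("name", "World")

	vars, err := selectEnv(os.Environ(), "greeting|name")
	if err != nil {
		panic(err)
	}
	fmt.Println(vars["greeting"], vars["name"]) // Hello World
}
```

Passing only the variables you need keeps unrelated values from your shell out of the rendered job spec.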
Dry Run to Preview Templated Spec:
To preview the final templated job spec without submitting the job, use the --dry-run flag:
bacalhau job run job.yaml --template-vars "greeting=Hello,name=World" --dry-run
This command reveals the processed job specification, illustrating how the placeholders were substituted with the provided values.
More Examples
Let's explore a couple more examples to illustrate the versatility of Job Templates:
Query Live Logs
Name: Live logs processing
Type: ops
Tasks:
  - Name: main
    Engine:
      Type: docker
      Params:
        Image: expanso/nginx-access-log-processor:1.0.0
        Parameters:
          - --query
          - {{.query}}
          - --start-time
          - {{or (index . "start-time") ""}}
          - --end-time
          - {{or (index . "end-time") ""}}
    InputSources:
      - Target: /logs
        Source:
          Type: localDirectory
          Params:
            SourcePath: /data/log-orchestration/logs
This is an ops job that runs on all nodes matching the job selection criteria. It accepts a DuckDB query variable and two optional variables, start-time and end-time, that define the query’s time range.
To run this job, you can use the following command:
bacalhau job run job.yaml \
  -V "query=SELECT status FROM logs WHERE status LIKE '5__'" \
  -V "start-time=-5m"
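The optional start-time and end-time placeholders in this spec rely on two text/template built-ins. A key containing a hyphen cannot be written as {{.start-time}}, so the spec looks it up with index instead, and or falls back to its last argument when the lookup comes back empty. The sketch below shows both in plain Go; the render helper and the bracketed output format are only for illustration and assume the variables are held in a map of strings.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// timeRange uses the same expressions as the job spec; the brackets only make
// empty values visible in the output.
const timeRange = `start=[{{or (index . "start-time") ""}}] end=[{{or (index . "end-time") ""}}]`

// render is a hypothetical helper (not part of Bacalhau): it parses text as a
// Go template and executes it against the given variables.
func render(text string, vars map[string]string) (string, error) {
	tmpl, err := template.New("job").Parse(text)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, vars); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	// Only start-time is provided, as in the command above.
	out, err := render(timeRange, map[string]string{"start-time": "-5m"})
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // start=[-5m] end=[]

	// The dotted form does not parse, because "-" is not allowed in a field name.
	_, err = render("{{.start-time}}", map[string]string{"start-time": "-5m"})
	fmt.Println(err != nil) // true
}
```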
Query S3 Logs
Name: S3 logs processing
Type: batch
Count: 1
Tasks:
  - Name: main
    Engine:
      Type: docker
      Params:
        Image: expanso/nginx-access-log-processor:1.0.0
        Parameters:
          - --query
          - {{.query}}
    InputSources:
      - Target: /logs
        Source:
          Type: s3
          Params:
            Bucket: {{.AccessLogBucket}}
            Key: {{.AccessLogPrefix}}
            Filter: {{or (index . "AccessLogPattern") ".*"}}
            Region: {{.AWSRegion}}
This is a batch job that runs on a single node. It accepts the DuckDB query variable and four other variables that define the S3 bucket, the prefix, the pattern for the logs, and the AWS region.
To run this job, you can use the following command:
bacalhau job run job.yaml \
  -V "AccessLogBucket=my-bucket" \
  -V "AWSRegion=us-east-1" \
  -V "AccessLogPrefix=2023-11-19-*" \
  -V "AccessLogPattern=^[10-12].*"
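The S3 source in this spec mixes required placeholders, such as {{.AWSRegion}}, with an optional one that defaults to ".*". If you want to experiment with that distinction outside Bacalhau, text/template's missingkey=error option is one way to do it. The sketch below is plain Go with a hypothetical renderStrict helper; it does not describe how Bacalhau itself treats a variable you forget to pass.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// s3Params reuses the placeholders from the S3 source in the job spec.
const s3Params = `Bucket: {{.AccessLogBucket}}
Key: {{.AccessLogPrefix}}
Filter: {{or (index . "AccessLogPattern") ".*"}}
Region: {{.AWSRegion}}`

// renderStrict is a hypothetical helper: with missingkey=error, execution
// fails when a {{.Name}} placeholder has no entry in vars. Lookups made
// through index are unaffected, so optional values can still use a default.
func renderStrict(text string, vars map[string]string) (string, error) {
	tmpl, err := template.New("job").Option("missingkey=error").Parse(text)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, vars); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	vars := map[string]string{
		"AccessLogBucket": "my-bucket",
		"AccessLogPrefix": "2023-11-19-*",
		"AWSRegion":       "us-east-1",
	}

	// AccessLogPattern is not set, so Filter falls back to ".*".
	out, err := renderStrict(s3Params, vars)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)

	// Dropping a required variable is reported as an error.
	delete(vars, "AWSRegion")
	_, err = renderStrict(s3Params, vars)
	fmt.Println(err != nil) // true
}
```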
Conclusion
Innovation in distributed computing is not just about keeping pace with change, but about driving it. Bacalhau's Job Templates are a key step in this direction. They offer you a tool that transforms complexity into simplicity, making your computing tasks not just easier, but smarter.
With this feature, you’re not just running jobs. You're unlocking a more efficient and flexible way to manage distributed computing tasks. It's about spending less time configuring and more time creating value. That's the future we're building at Expanso – a future where technology amplifies your impact.
Start using Job Templates today. Experience the shift from manual effort to automated efficiency. For more details, our official documentation is your guide to mastering this powerful tool.
5 Days of Bacalhau 1.2 Blog Series
If you’re interested in exploring these features in more depth, check back tomorrow for the next post in our 5 Days of Bacalhau series.
Day 1 - Job Templates
Day 2 - Streamlined Node Bootstrap
Day 5 - Instrumenting WASM: Enhanced Telemetry with Dylibso Observe SDK
How to Get Involved
We're looking for help in several areas. If you're interested in helping, there are several ways to contribute. Please reach out to us at any of the following locations.
Commercial Support
While Bacalhau is open-source software, the Bacalhau binaries go through the security, verification, and signing build process lovingly crafted by Expanso. You can read more about the difference between open source Bacalhau and commercially supported Bacalhau in our FAQ. If you would like to use our pre-built binaries and receive commercial support, please contact us!