We’re always asking ourselves, “How can we make Bacalhau better?” We like that question because the answers shift from one feature to the next depending on how people are using Bacalhau. What people build with Bacalhau today may be completely different from what they build with it a year from now!
It’s this evolution of uses that makes us pay close attention to our community on GitHub and Slack, during our Community Hours, and when we meet you in the real world - pretty much any situation where anybody wants to talk to us about what they’re building with Bacalhau.
Anecdotal data can be useful, but it's not always balanced. Sometimes, a small group of users can be really good at grabbing our attention about how they think Bacalhau should work. But that doesn’t always reflect what most users actually want!
So, to understand what people really want from Bacalhau, we’re introducing some telemetry and analytics to help drive our decisions about what we build next.
What does this mean for you?
Not a great deal - very little has changed. Integrating OTel into Bacalhau has been on our Roadmap for some time now. While that’s not quite what we’re introducing in 1.5.0, we think the new telemetry system we’re adding is a solid first step in that direction. We can’t wait to get wider OTel integrations out to people in the near future.
In the meantime, Bacalhau nodes running 1.5.0 will send limited, anonymised telemetry back to Expanso so that we can better understand how Bacalhau is used, and where we should focus our efforts next.
If you want to opt out, we’ve built that option in for you too. You can find instructions below on how to stop any kind of Bacalhau node from sending telemetry to us.
How does this benefit Bacalhau?
Collecting telemetry is super-helpful for Bacalhau in a number of ways:
Identifying popular features
Knowing which Bacalhau features people are using can help us prioritize their further development. It can also help us understand whether new features we introduce are being utilized in the way we expect, and if not, quickly adjust course.
Find pain points
If something goes wrong when people use Bacalhau - such as commands not running as expected or unanticipated results - telemetry helps us spot those patterns early. That gives us a chance to adjust and make Bacalhau as simple and efficient as possible.
Optimize performance and resource allocation
With this new data, we aim to find ways to improve Bacalhau’s performance and resource management. Our goal is to fully utilize the host system’s capabilities, enabling Bacalhau to achieve more with the same compute resources.
Improving overall system stability and reliability
Things don’t always go as planned, and when they don’t, this telemetry helps us spot issues in Bacalhau more quickly. The faster we find problems, the faster we can fix them, ensuring Bacalhau remains stable - no matter the circumstances!
What kind of data are you collecting?
With 1.5.0, we’ll be collecting the following pieces of information. This list shouldn’t be considered exhaustive and will likely change in the future, but it should provide a good idea of what we’re interested in with this initial release.
It’s important to note that we do not believe in, and have no interest in, collecting personally identifiable information (PII). We’ve focused on building a list of attributes that we think will give us the best insights into how Bacalhau is being used, without revealing who’s using it or why.
- InstanceID: A unique, anonymous identifier for a node in your network
- InstallationID: A unique, anonymous identifier for the installation of nodes in your network
- Job ID
- Job Name
- Job Namespace (hashed value)
- Job Type:
  - batch
  - ops
  - system
  - daemon
- Job Count
- Job State:
  - canceled
  - failed
  - successful
- Job State Message: Message indicating why a job is in a given state
- Job Version
- Job Revision
- Job Create Time
- Job Modify Time
- Task Name (hashed value)
- Task Engine Type:
  - docker
  - wasm
- Task Publisher Type:
  - Local
  - s3
  - ipfs
- Task Input Type:
  - ipfs
  - url
  - s3
  - inline
  - localDirectory
- Task Environment Variable Count: The number of environment variable fields set
- Task Metadata Count: The number of metadata fields set
- Task CPU: The amount of CPU allocated to a task
- Task Memory: The amount of memory allocated to a task
- Task Disk: The amount of disk space allocated to a task
- Task GPU: The number of GPU(s) allocated to a task
- Task Network Type:
  - full
  - http
- Task Timeout Executions: The execution timeout of a task
- Task Timeout Queue: The queue timeout of a task
- Task Timeout Total: The total timeout of a task
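To make the "hashed value" and "count" entries in the list above a little more concrete, here is a rough illustration of how a record could keep names and environment variables on the node while still reporting something useful. This is only a sketch under stated assumptions: this post doesn't specify which hash function Bacalhau uses or what the payload looks like, so SHA-256, the `JobEvent` class, `build_event`, and every field name below are hypothetical and not Bacalhau's actual implementation.

```python
import hashlib
from dataclasses import asdict, dataclass


def hash_value(value: str) -> str:
    """Return the SHA-256 hex digest of a string (assumed hashing scheme)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class JobEvent:
    """Hypothetical telemetry record; field names are illustrative only."""
    installation_id: str
    instance_id: str
    job_type: str
    job_state: str
    namespace_hash: str
    task_name_hash: str
    task_engine_type: str
    env_var_count: int


def build_event(installation_id, instance_id, job_type, job_state,
                namespace, task_name, task_engine_type, env_vars) -> JobEvent:
    """Build a record that carries hashes and counts, never the raw values."""
    return JobEvent(
        installation_id=installation_id,
        instance_id=instance_id,
        job_type=job_type,
        job_state=job_state,
        namespace_hash=hash_value(namespace),   # raw namespace is not stored
        task_name_hash=hash_value(task_name),   # raw task name is not stored
        task_engine_type=task_engine_type,
        env_var_count=len(env_vars),            # only the count, not the contents
    )


# Example with made-up values.
event = build_event(
    installation_id="installation-0000",
    instance_id="instance-0000",
    job_type="batch",
    job_state="failed",
    namespace="team-a",
    task_name="resize",
    task_engine_type="docker",
    env_vars={"API_KEY": "secret", "MODE": "fast"},
)
print(asdict(event))
```

In this sketch the printed record contains two 64-character digests and the number `2` for the environment variables, so two jobs in the same namespace can still be grouped together without the namespace itself being readable.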
How can I opt out?
We recognize that not everybody will want to share this information with us. If you’d prefer your Bacalhau installations to not send telemetry, that’s absolutely fine! We’ve included a way to opt out of all data collection that you can use today.
If you’d like to opt out, you can set the config for each of your Bacalhau nodes with:
bacalhau config set DisableAnalytics true
Will you share the information you collect with us?
This is just the first iteration of the advanced metrics and telemetry that we want to embed in Bacalhau for our users. Currently, we haven’t built out the ability for users to hook into Bacalhau’s telemetry collector and retrieve the same metrics that we’ll be receiving from 1.5.0.
We’re actively working on this and look forward to introducing a full, extensible OTel integration into Bacalhau. This will give us and our users deeper insight into what’s happening within their Bacalhau networks.
If that’s something of interest to you, check out our roadmap. If you have questions, drop us a message in the Bacalhau Slack, or join us in our office hours. We build Bacalhau for its users, and we love to hear your thoughts, feedback and ideas at every opportunity.