<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Bacalhau]]></title><description><![CDATA[Compute Over Data ]]></description><link>https://blog.bacalhau.org</link><image><url>https://substackcdn.com/image/fetch/$s_!7xgs!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc58548d1-2bd9-4076-a994-4b86d7e5f9ad_580x580.png</url><title>Bacalhau</title><link>https://blog.bacalhau.org</link></image><generator>Substack</generator><lastBuildDate>Sat, 04 Apr 2026 12:33:00 GMT</lastBuildDate><atom:link href="https://blog.bacalhau.org/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Expanso]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[bacalhau@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[bacalhau@substack.com]]></itunes:email><itunes:name><![CDATA[David Aronchick]]></itunes:name></itunes:owner><itunes:author><![CDATA[David Aronchick]]></itunes:author><googleplay:owner><![CDATA[bacalhau@substack.com]]></googleplay:owner><googleplay:email><![CDATA[bacalhau@substack.com]]></googleplay:email><googleplay:author><![CDATA[David Aronchick]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Bacalhau v1.8.0 - Day 4: Seamless Result Storage with Managed Publishers with Expanso Cloud]]></title><description><![CDATA[Your workflow just got simpler. 
Announcing secure, managed, and automatic result storage in Expanso Cloud.]]></description><link>https://blog.bacalhau.org/p/bacalhau-v180-day-4-seamless-result-storage-with-managed-publishers-in-expanso-cloud</link><guid isPermaLink="false">https://blog.bacalhau.org/p/bacalhau-v180-day-4-seamless-result-storage-with-managed-publishers-in-expanso-cloud</guid><dc:creator><![CDATA[Federico Trotta]]></dc:creator><pubDate>Thu, 26 Jun 2025 15:30:59 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/947c9fb0-4cd9-4196-aad8-15ef5e6c3568_3611x2580.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This is part of the 5-days of Bacalhau 1.8 series! Make sure to go back to the start to catch all of them!</p><ul><li><p><strong><a href="https://open.substack.com/pub/bacalhau/p/announcing-bacalhau-v180?r=1ep1nf&amp;utm_campaign=post&amp;utm_medium=web&amp;showWelcomeOnShare=true">Day 1:</a></strong><a href="https://open.substack.com/pub/bacalhau/p/announcing-bacalhau-v180?r=1ep1nf&amp;utm_campaign=post&amp;utm_medium=web&amp;showWelcomeOnShare=true"> Announcing Bacalhau v1.8.0: Intelligent Edge Computing Meets Enterprise Integration</a></p></li><li><p><strong><a href="https://open.substack.com/pub/bacalhau/p/bacalhau-v180-day-2-rerun-update-version-bacalhau-jobs?r=1ep1nf&amp;utm_campaign=post&amp;utm_medium=web&amp;showWelcomeOnShare=true">Day 2: </a></strong><a href="https://open.substack.com/pub/bacalhau/p/bacalhau-v180-day-2-rerun-update-version-bacalhau-jobs?r=1ep1nf&amp;utm_campaign=post&amp;utm_medium=web&amp;showWelcomeOnShare=true">Rerun, Update, and Version Your Bacalhau Jobs</a></p></li><li><p><strong><a href="https://blog.bacalhau.org/p/bacalhau-v180-day-3-how-bacalhau-boosted-daemon-jobs">Day 3</a></strong><a href="https://blog.bacalhau.org/p/bacalhau-v180-day-3-how-bacalhau-boosted-daemon-jobs">: How Bacalhau Boosts Daemon Job Reliability</a></p></li><li><p><strong>Day 4</strong>: Seamless Result Storage with 
Managed Publishers with Expanso Cloud (this post)</p></li></ul><div><hr></div><p>Let&#8217;s be honest. When you&#8217;re focused on a complex data problem, the last thing you want to think about is infrastructure. Yet, for the longest time, a simple question has added friction to almost every distributed computing job: &#8220;Where do the results go?&#8221;</p><p>This question immediately triggers a cascade of others:</p><ul><li><p>&#8220;How do I get credentials onto the compute nodes securely?&#8221;</p></li><li><p>&#8220;Am I accidentally exposing secrets?&#8221;</p></li><li><p>&#8220;How do I even find the output once the job is done?&#8221;</p></li></ul><p>This dance of configuring storage, managing credentials, and tracking outputs is a tedious, error-prone distraction from the work that actually matters.</p><p>At Expanso, we believe you should focus on your code, not your cloud storage configuration. That&#8217;s why we&#8217;re thrilled to introduce a fundamental improvement to the Expanso Cloud experience: <strong>default managed publishers.</strong></p><h2><strong>The Old Way: A Trail of Configuration and Credentials</strong></h2><p>Until now, running jobs on Expanso Cloud often meant dealing with:</p><ul><li><p><strong>Manual setup:</strong> Explicitly configuring S3 buckets or other storage solutions for every job or pipeline.</p></li><li><p><strong>Credential juggling:</strong> Distributing AWS keys or other storage credentials to a fleet of compute nodes and clients, which is risky.</p></li><li><p><strong>Operational overhead:</strong> Managing the lifecycle, cost, and access policies of yet another piece of infrastructure.</p></li><li><p><strong>Result scavenger hunts:</strong> Without a standardized approach to storage, users were often left digging through buckets to find their outputs.</p></li></ul><p>This meant that even simple jobs required additional configuration and infrastructure planning, 
creating friction for users who just wanted to run their workloads and get results.</p><h2>The New Way: Expanso Cloud Managed Publishers</h2><p>With our latest update, Expanso Cloud now provides <strong>managed publishers by default.</strong> When you submit a job without explicitly specifying a publisher, Expanso Cloud handles the result storage automatically. No publisher configuration needed. It just works.</p><p>For example, consider the following simple data processing job. Notice what&#8217;s missing?</p><pre><code><code># Your job specification - no publisher configuration needed!
Name: data-processing
Type: batch
Tasks:
  - Name: data-processing
    Engine:
      Type: docker
      Params:
        Image: python:3.9
        Parameters:
          - python
          - -c
          - |
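            import csv
            # Your data processing logic
            rows = [[value] for value in [1, 2, 3, 4, 5]]
            # The stock python:3.9 image does not ship pandas, so write the CSV with the standard library
            with open('/outputs/results.csv', 'w', newline='') as f:
                writer = csv.writer(f)
                writer.writerow(['output'])
                writer.writerows(rows)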
    ResultPaths:
      - Name: outputs
        Path: /outputs
# No Publisher section needed - Expanso Cloud handles it automatically!
</code></code></pre><p>When you submit this job to Expanso Cloud, here's what happens behind the scenes:</p><ol><li><p><strong>Automatic publisher assignment</strong>: Expanso Cloud automatically assigns its managed publisher to your job.</p></li><li><p><strong>Secure upload</strong>: Compute nodes receive secure, time-limited upload URLs from the orchestrator, eliminating credential management concerns.</p></li><li><p><strong>Organized storage</strong>: Results are automatically organized and stored in Expanso Cloud's managed storage infrastructure.</p></li><li><p><strong>Seamless retrieval</strong>: You can access results through the Expanso Cloud UI or via the standard <em>bacalhau job get</em> command.</p></li></ol><p>This architecture means our developers have eliminated the most common security pitfall: distributing long-term credentials to compute nodes. It&#8217;s security by design, not as an afterthought:</p><ul><li><p><strong>Zero credential distribution</strong>: Compute nodes never receive long-term storage credentials.</p></li><li><p><strong>Time-limited access</strong>: Upload URLs have configurable expiration times.</p></li><li><p><strong>Least privilege</strong>: Each compute node only receives upload permissions for its specific job results.</p></li><li><p><strong>Audit trails</strong>: All storage operations are logged and traceable.</p></li></ul><h2>Getting Your Results: UI or CLI, Your Choice</h2><p>Accessing your outputs is just as frictionless.</p><p>Access your job results directly through the Expanso Cloud dashboard:</p><ol><li><p>Navigate to your completed job</p></li><li><p>View and download complete result archives</p></li><li><p>Share results with team members using secure, time-limited links</p></li></ol><p>Results are retained for 30 days, giving you ample time to download and process your outputs.</p><p>You can also use the <em>bacalhau job get</em> command:</p><pre><code><code># Submit your job
bacalhau job run my-job.yaml

# Retrieve results once the job completes, referencing it by name or ID
bacalhau job get my-job
</code></code></pre><p>The results are downloaded to your local machine just as with any other Bacalhau job. The difference is that Expanso Cloud's managed infrastructure now handles all the storage complexity for you.</p><h2>Don't Worry, Your Custom Setups Still Work</h2><p>If you have an existing workflow with your own custom storage, nothing changes. The managed publisher only activates when no explicit publisher is specified, ensuring <strong>100% backward compatibility</strong>. You have the freedom to choose, but the default is now effortless:</p><pre><code><code># Custom storage publisher still works
Tasks:
  - Name: main
    Publisher:
      Type: s3
      Params:
        Bucket: my-custom-bucket
        Key: my-results
    # ... rest of job spec
</code></code></pre><h2><strong>A Better Workflow for Everyone</strong></h2><p>Why did we do all this? Because this workflow makes everyone&#8217;s job easier:</p><ul><li><p><strong>For data scientists:</strong> Focus on your analysis. Run your Python scripts, R analyses, or Jupyter notebooks and get results back automatically.</p></li><li><p><strong>For dev teams:</strong> Prototype and test new pipelines without the friction of provisioning and securing storage for every experiment.</p></li><li><p><strong>For production workloads:</strong> Leverage enterprise-grade, secure, and durable storage without having to manage it yourself.</p></li></ul><h2>Conclusion</h2><p>Expanso Cloud's managed publishers eliminate the complexity of result storage while maintaining the security and reliability your production workloads demand. Whether you're running simple data processing jobs or complex ML pipelines, your results are now just a click away.</p><p>Ready to experience frictionless distributed computing? <a href="https://cloud.expanso.io/">Sign up for Expanso Cloud</a> and see how managed publishers can simplify your workflow today.</p><div><hr></div><h2><strong>What's Next?</strong></h2><p>Ready to try everything new in this release? <a href="https://github.com/bacalhau-project/bacalhau/releases">Upgrade to Bacalhau 1.8</a> and see the difference in your distributed workloads.</p><h3><strong>Get Involved!</strong></h3><p>We welcome your involvement in Bacalhau. 
There are many <a href="https://docs.bacalhau.org/community/ways-to-contribute/">ways to contribute</a>, and we&#8217;d love to hear from you. Reach out at any of the following locations:</p><ul><li><p><a href="https://www.expanso.io/">Expanso&#8217;s Website</a></p></li><li><p><a href="http://bacalhau.org/">Bacalhau&#8217;s Website</a></p></li><li><p><a href="https://bsky.app/profile/bacalhau.org">Bacalhau&#8217;s Bluesky</a></p></li><li><p><a href="http://twitter.com/bacalhauproject">Bacalhau&#8217;s Twitter</a></p></li><li><p><a href="https://twitter.com/ExpansoIO">Expanso&#8217;s Twitter</a></p></li><li><p><a href="https://www.tiktok.com/@expanso.io?_t=ZN-8uypYqUuKTW&amp;_r=1">TikTok</a></p></li><li><p><a href="https://www.youtube.com/@ExpansoIO">Youtube</a></p></li><li><p><a href="https://bit.ly/bacalhau-project-slack">Slack</a></p></li><li><p><a href="https://www.linkedin.com/company/expanso-io">LinkedIn</a></p></li><li><p><a href="https://expanso-inc.breezy.hr/">Careers Page</a></p></li></ul><h3><strong>Commercial Support</strong></h3><p>While Bacalhau is <a href="https://en.wikipedia.org/wiki/Open-source_software">open-source software</a>, the Bacalhau binaries go through the security, verification, and signing build process lovingly crafted by <a href="https://www.expanso.io/">Expanso</a>. <a href="https://www.expanso.io/faq/">Read more about the difference between open-source Bacalhau and commercially supported Bacalhau in the FAQ</a>. 
If you want to use the pre-built binaries and receive commercial support, <a href="https://www.expanso.io/contact/">contact us</a> or <a href="https://cloud.expanso.io/login">get your license</a> on Expanso Cloud!</p>]]></content:encoded></item><item><title><![CDATA[Bacalhau v1.8.0 - Day 3: How Bacalhau Boosts Daemon Job Reliability]]></title><description><![CDATA[A deep dive into the new orchestration logic that makes daemon jobs smarter, faster, and more resilient to infrastructure changes]]></description><link>https://blog.bacalhau.org/p/bacalhau-v180-day-3-how-bacalhau-boosted-daemon-jobs</link><guid isPermaLink="false">https://blog.bacalhau.org/p/bacalhau-v180-day-3-how-bacalhau-boosted-daemon-jobs</guid><dc:creator><![CDATA[Federico Trotta]]></dc:creator><pubDate>Wed, 25 Jun 2025 15:31:11 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/5b14b6c7-1ca7-47c3-8a84-a195295666ab_3283x2345.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This is part of the 5-days of Bacalhau 1.8 series! 
Make sure to go back to the start to catch all of them!</p><ul><li><p><strong><a href="https://open.substack.com/pub/bacalhau/p/announcing-bacalhau-v180?r=1ep1nf&amp;utm_campaign=post&amp;utm_medium=web&amp;showWelcomeOnShare=true">Day 1:</a></strong><a href="https://open.substack.com/pub/bacalhau/p/announcing-bacalhau-v180?r=1ep1nf&amp;utm_campaign=post&amp;utm_medium=web&amp;showWelcomeOnShare=true"> Announcing Bacalhau v1.8.0: Intelligent Edge Computing Meets Enterprise Integration</a></p></li><li><p><strong><a href="https://open.substack.com/pub/bacalhau/p/bacalhau-v180-day-2-rerun-update-version-bacalhau-jobs?r=1ep1nf&amp;utm_campaign=post&amp;utm_medium=web&amp;showWelcomeOnShare=true">Day 2: </a></strong><a href="https://open.substack.com/pub/bacalhau/p/bacalhau-v180-day-2-rerun-update-version-bacalhau-jobs?r=1ep1nf&amp;utm_campaign=post&amp;utm_medium=web&amp;showWelcomeOnShare=true">Rerun, Update, and Version Your Bacalhau Jobs</a></p></li><li><p><strong>Day 3</strong>: How Bacalhau Boosts Daemon Job Reliability (this post)</p></li></ul><div><hr></div><p>In distributed computing, change is the only constant. Compute nodes join, they leave, they scale up and down. For long-running services, this dynamism can be a source of constant anxiety. Are your monitoring agents running everywhere they should be? Is your log processor catching every new service that spins up?</p><p><a href="https://bacalhau.org/docs/specifications/job/type#daemon-jobs">Daemon jobs in Bacalhau</a> were built for this reality. They are designed to run continuously across all qualifying nodes in your cluster. Think of them as your most dependable employees, tirelessly working in the background. 
With the latest Bacalhau release, we&#8217;ve given these workhorses a significant upgrade, strengthening their ability to adapt to your ever-evolving infrastructure.</p><p>So, let's dive into what makes daemon jobs so critical and how we've made them even more resilient.</p><h2><strong>Why Daemon Jobs Are Your Cluster's Backbone</strong></h2><p>Unlike a typical <a href="https://bacalhau.org/docs/specifications/job/type#batch-jobs">batch job</a> that runs once and then exits, a daemon job is a persistent process. It&#8217;s designed to maintain a constant presence, automatically deploying to new nodes as they come online and so adapting to infrastructure changes. This makes daemon jobs perfect for foundational tasks that need to be everywhere at once.</p><p>Here are the scenarios where they excel:</p><ul><li><p><strong>Edge data processing</strong>: Imagine continuously processing video feeds from hundreds of cameras or sensor data from a fleet of IoT devices. Daemon jobs ensure your processing logic is always running right where the data is generated.</p></li><li><p><strong>Real-time log aggregation</strong>: As new microservices are deployed, daemon jobs automatically place a log forwarder alongside them, ensuring no message is lost.</p></li><li><p><strong>Streaming data pipelines</strong>: They can maintain persistent connections to streaming services such as Kafka or Kinesis, ensuring your data pipeline is never broken.</p></li><li><p><strong>Cluster-wide monitoring</strong>: They are ideal for running health checks, collecting metrics, and triggering alerts across your entire fleet of machines.</p></li></ul><h2><strong>Smarter, Faster, Stronger: The New Daemon Job Orchestration</strong></h2><p>So, what did we change? We rebuilt the core daemon job orchestration logic to be more intelligent and responsive. The Bacalhau orchestrator now has a much deeper awareness of the cluster's state. When a new compute node joins, the system doesn't just see a new machine. It instantly recognizes an opportunity to deploy any daemon job whose constraints the node matches.</p><p>This enhanced awareness translates into concrete improvements:</p><ul><li><p><strong>Faster node detection</strong>: Our developers have drastically reduced the latency between a node joining your cluster and a daemon job being scheduled on it.</p></li><li><p><strong>Rock-solid constraint matching</strong>: We've hardened the process of verifying a node's capabilities against a job's requirements, eliminating deployment mismatches.</p></li><li><p><strong>Confirmed deployments</strong>: The system now has better end-to-end tracking to confirm that a job has been successfully deployed, adding another layer of reliability.</p></li></ul><p>Essentially, at Expanso we&#8217;ve tightened the feedback loop between your infrastructure and the job scheduler. The result? Your daemon jobs achieve their intended cluster-wide coverage with much stronger guarantees.</p><p>Let's see what this looks like. Imagine you want to run a log processor on every node that hosts a web service. Your job might look like this:</p><pre><code><code># A daemon job that reliably deploys to new web service nodes
Name: Enhanced-Log-Processor
Type: daemon
Constraints:
  - Key: service
    Operator: ==
    Values:
      - WebService

Tasks:
  - Name: LogProcessor
    Engine:
      Type: docker
      Params:
        Image: your-org/log-processor:latest
        Parameters:
          - --continuous-mode
          - --output-format=json
</code></code></pre><p>Previously, deploying this job to a new node that just came online worked, but now it's faster and more robust. When a new node with the <em>service=WebService</em> label joins the cluster, Bacalhau ensures your <em>Enhanced-Log-Processor</em> is deployed there almost immediately. No manual steps, no anxious waiting.</p><h2><strong>What This Means for Your Real-World Workloads</strong></h2><p>These aren't just abstract improvements; they have a direct impact on production environments:</p><ul><li><p><strong>Dynamic scaling, simplified</strong>: As you add more compute nodes to handle a traffic spike, your daemon jobs seamlessly expand with you. No need to manually deploy or configure anything again.</p></li><li><p><strong>Resilient infrastructure</strong>: During cluster maintenance or upgrades, the system works harder to maintain full coverage for your daemon workloads. This ensures service continuity.</p></li><li><p><strong>Reliable edge computing</strong>: For edge networks where nodes can be fleeting&#8212;frequently joining and leaving&#8212;this improved reliability is a game-changer that grants consistent service delivery.</p></li></ul><h2><strong>Upgrade with Confidence</strong></h2><p>The best part? These enhancements are fully backward compatible. You don't need to change a single line in your existing job specifications. Just <a href="https://github.com/bacalhau-project/bacalhau/releases">upgrade to the latest version of Bacalhau</a>, and your daemon jobs will automatically benefit from the more resilient orchestration.</p><h2>Conclusion</h2><p>This improvement represents part of our ongoing commitment to making Bacalhau the most reliable platform for distributed computing workloads. 
The enhanced daemon job orchestration lays the groundwork for future features like more sophisticated deployment strategies and advanced cluster management capabilities.</p><p>Whether you're processing data at the edge, aggregating logs across a distributed system, or maintaining persistent services across your compute fleet, Bacalhau 1.8's enhanced daemon job capabilities provide the reliability and responsiveness your production workloads demand.</p><div><hr></div><h2><strong>What's Next?</strong></h2><p>Ready to experience the enhanced daemon job capabilities? <a href="https://github.com/bacalhau-project/bacalhau/releases">Upgrade to Bacalhau 1.8</a> and see the difference in your distributed workloads.</p>]]></content:encoded></item><item><title><![CDATA[Bacalhau v1.8.0 - Day 2: Rerun, Update, and Version Your Bacalhau Jobs]]></title><description><![CDATA[How we're making job management less of a chore and more of a superpower]]></description><link>https://blog.bacalhau.org/p/bacalhau-v180-day-2-rerun-update-version-bacalhau-jobs</link><guid isPermaLink="false">https://blog.bacalhau.org/p/bacalhau-v180-day-2-rerun-update-version-bacalhau-jobs</guid><dc:creator><![CDATA[Federico Trotta]]></dc:creator><pubDate>Tue, 24 Jun 2025 15:31:50 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/ac7ac5c7-94aa-4576-a8e6-6b94c985d31b_2048x1463.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This is part of the 5-days of Bacalhau 1.8 series! Make sure to go back to the start to catch all of them!</p><ul><li><p><a href="https://open.substack.com/pub/bacalhau/p/announcing-bacalhau-v180?r=1ep1nf&amp;utm_campaign=post&amp;utm_medium=web&amp;showWelcomeOnShare=true">Day 1: Announcing Bacalhau v1.8.0: Intelligent Edge Computing Meets Enterprise Integration</a></p></li><li><p>Day 2: Rerun, Update, and Version Your Bacalhau Jobs (this post)</p></li></ul><div><hr></div><p>Remember <em>j-f47ac10b-58cc-4372-a567-0e02b2c3d479</em>? Probably not. And why should you? Until now, managing jobs has meant juggling cryptic, auto-generated IDs. You run a job, copy the ID, paste it to check the status, paste it again to get the logs, and hope you don't mix it up with the other dozen UUIDs in your terminal history.</p><p>We knew there had to be a better way.</p><p>That's why we're thrilled to introduce a fundamental overhaul to job management in Bacalhau. 
We're moving beyond throwaway IDs to a durable, name-based system that lets you rerun, update, and version your jobs with intuitive commands. This isn't just a quality-of-life update: it's a new paradigm for managing the entire lifecycle of your computational work.</p><p>Let&#8217;s see what Bacalhau 1.8.0 brings to job management!</p><h2>A Fundamental Shift: From Cryptic IDs to Meaningful Names</h2><p>Let's be honest: the old way of working with jobs was functional, but could be frustrating. Every <em>bacalhau job run</em> created a completely new job with a new ID. This led to a trail of disconnected job runs that were hard to track and even harder to manage.</p><ul><li><p><strong>The pain of UUIDs</strong>: Workflows required constant copying and pasting of non-human-readable IDs. It was tedious and error-prone.</p></li><li><p><strong>No inherent history</strong>: If you ran the same analysis five times, you would get five unrelated jobs. Tracking the evolution of your work was a manual, out-of-band process.</p></li><li><p><strong>A disconnected experience</strong>: Describing, logging, and fetching results all felt like one-off operations on ephemeral objects, rather than interactions with a persistent, evolving task.</p></li></ul><p>With this update, we're introducing a more intuitive, name-based approach. You now define a name for your job directly in the job spec file. Bacalhau uses this name to anchor the job's identity, treating subsequent runs as new versions of the <em>same</em> job. 
This simple change transforms the entire workflow.</p><p>Here is how it was until now:</p><pre><code><code># Before: A trail of generated IDs
bacalhau job run my-analysis.yaml  # --&gt; j-f47ac10b...
bacalhau job describe j-f47ac10b...    # Must use the generated ID
</code></code></pre><p>And this is how we transformed it:</p><pre><code><code>
# After: Use your job's name everywhere
bacalhau job run my-analysis.yaml
bacalhau job describe monthly-sales-report-generator  # So much better!
</code></code></pre><p>Best of all? This is <strong>fully backward compatible</strong>! All your old scripts and workflows using job IDs will continue to work exactly as they did before. You can adopt the new name-based system at your own pace.</p><h2>What's New: Your Job Management Superpowers</h2><p>This update is built on three powerful new capabilities:</p><ol><li><p><strong>Job rerunning</strong>: A dedicated <em>rerun</em> command to restart completed or failed jobs, creating a new version while preserving history.</p></li><li><p><strong>Enhanced job versioning</strong>: Automatic version tracking is now built in. All core commands (<em>describe</em>, <em>logs</em>, <em>get</em>) are now version-aware.</p></li><li><p><strong>Advanced dry-run with diff preview</strong>: See exactly what will change before you update a job, like a <em>git diff</em> for your compute.</p></li></ol><h2>The <code>rerun</code> Command: Stop Re-Submitting, Start Rerunning</h2><p>Ever had a job fail due to a transient network blip? Or needed to reprocess a dataset after a minor code tweak? Previously, you'd have to go back and submit the job all over again, creating a new, disconnected job ID.</p><p>The new <em>rerun</em> command makes this a first-class operation:</p><pre><code><code># Rerun the latest version of a job by name
bacalhau job rerun monthly-sales-report-generator

# Need to rerun an older, specific version? No problem.
bacalhau job rerun monthly-sales-report-generator --version 2
</code></code></pre><p>When you <em>rerun</em> a job, Bacalhau intelligently creates a new version under the same name. This preserves the original job's history and artifacts while letting you iterate. It also includes safety checks to prevent you from rerunning jobs that are already pending or in-flight.</p><h2>Enhanced Job Versioning: Manage Your Job's Evolution</h2><p>Your workflows aren't static. You tweak parameters, update Docker images, and refine your analysis over time. Bacalhau now embraces this evolution with built-in versioning.</p><p>Every time you submit a job with an existing name, a new version is created. You can then interact with any version of the job using a simple <em>--version</em> flag.</p><pre><code><code># View details of a specific job version
bacalhau job describe my-ml-training --version 3

# Get execution details from the very first run
bacalhau job executions my-data-pipeline --version 1

# View logs from an earlier version to debug a change
bacalhau job logs my-analysis --version 2
</code></code></pre><p>If you don't specify a version, Bacalhau defaults to the latest one. This makes your day-to-day work seamless while giving you the power to dive into the history whenever you need to.</p><h2>Advanced Dry-Run: The &#8220;<code>git diff</code>&#8221; For Your Compute Jobs</h2><p>Updating a critical production job can be nerve-wracking. Did you format the YAML correctly? Are the container arguments right?</p><p>Our enhanced <em>--dry-run</em> functionality gives you confidence by showing you a server-side "diff" of your proposed changes.</p><pre><code><code># Preview changes against the latest version before applying them
bacalhau job run --dry-run updated-job.yaml
</code></code></pre><p>Because the job name is in your YAML file, Bacalhau finds the latest existing version and generates a precise, color-coded diff showing you exactly what will be added, removed, or changed. This isn't a simple text comparison; it uses the same validation logic as a real job submission, so you can be sure that what you see is what you'll get. No more "pray and run."</p><h2>Your New Workflow, Transformed</h2><p>This update fundamentally changes the job management lifecycle from a series of disconnected, one-shot commands into a fluid, continuous workflow.</p><p><strong>Before: Basic, Disconnected Operations</strong></p><pre><code><code># Basic job operations
bacalhau job describe &lt;job-id&gt;
bacalhau job run --dry-run job.yaml  # Basic client-side validation only
</code></code></pre><p><strong>After: Rich, Version-Aware Lifecycle Management</strong></p><pre><code><code># Use names OR IDs, and interact with any version
bacalhau job describe my-job --version 2
bacalhau job rerun my-job --version 1
bacalhau job run --dry-run job.yaml    # Server-side diff preview!
bacalhau job executions my-job --version 3
</code></code></pre><h2>Conclusion</h2><p>These new capabilities are available now. Just update to the latest Bacalhau client to get started. We believe this new, name-based approach will make managing your computational workflows more intuitive, powerful, and reliable.</p><div><hr></div><h2><strong>What's Next?</strong></h2><p>Ready to try name-based job management, versioning, and rerunning? <a href="https://github.com/bacalhau-project/bacalhau/releases">Upgrade to Bacalhau 1.8</a> and see the difference in your distributed workloads.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.bacalhau.org/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.bacalhau.org/subscribe?"><span>Subscribe now</span></a></p><h3><strong>Get Involved!</strong></h3><p>We welcome your involvement in Bacalhau. There are many <a href="https://docs.bacalhau.org/community/ways-to-contribute/">ways to contribute</a>, and we&#8217;d love to hear from you. 
Reach out at any of the following locations:</p><ul><li><p><a href="https://www.expanso.io/">Expanso&#8217;s Website</a></p></li><li><p><a href="http://bacalhau.org/">Bacalhau&#8217;s Website</a></p></li><li><p><a href="https://bsky.app/profile/bacalhau.org">Bacalhau&#8217;s Bluesky</a></p></li><li><p><a href="http://twitter.com/bacalhauproject">Bacalhau&#8217;s Twitter</a></p></li><li><p><a href="https://twitter.com/ExpansoIO">Expanso&#8217;s Twitter</a></p></li><li><p><a href="https://www.tiktok.com/@expanso.io?_t=ZN-8uypYqUuKTW&amp;_r=1">TikTok</a></p></li><li><p><a href="https://www.youtube.com/@ExpansoIO">Youtube</a></p></li><li><p><a href="https://bit.ly/bacalhau-project-slack">Slack</a></p></li><li><p><a href="https://www.linkedin.com/company/expanso-io">LinkedIn</a></p></li><li><p><a href="https://expanso-inc.breezy.hr/">Careers Page</a></p></li></ul><h3><strong>Commercial Support</strong></h3><p>While Bacalhau is <a href="https://en.wikipedia.org/wiki/Open-source_software">open-source software</a>, the Bacalhau binaries go through the security, verification, and signing build process lovingly crafted by <a href="https://www.expanso.io/">Expanso</a>. <a href="https://www.expanso.io/faq/">Read more about the difference between open-source Bacalhau and commercially supported Bacalhau in the FAQ</a>. 
If you want to use the pre-built binaries and receive commercial support, <a href="https://www.expanso.io/contact/">contact us</a> or <a href="https://cloud.expanso.io/login">get your license</a> on Expanso Cloud!</p>]]></content:encoded></item><item><title><![CDATA[Announcing Bacalhau v1.8.0: Intelligent Edge Computing Meets Enterprise Integration]]></title><description><![CDATA[Discover how Bacalhau v1.8.0 transforms distributed computing with a native Splunk integration, name-based job management, and enhanced daemon orchestration.]]></description><link>https://blog.bacalhau.org/p/announcing-bacalhau-v180</link><guid isPermaLink="false">https://blog.bacalhau.org/p/announcing-bacalhau-v180</guid><dc:creator><![CDATA[Federico Trotta]]></dc:creator><pubDate>Mon, 23 Jun 2025 15:31:03 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/75ea5bda-985f-43cf-a78d-4e9d1b290886_2048x1464.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We're excited to announce Bacalhau v1.8.0: a groundbreaking release that transforms how you can approach distributed computing at the edge. This new release introduces:</p><ul><li><p>Intelligent cost optimization through an advanced Splunk integration.</p></li><li><p>Enhanced daemon job orchestration for dynamic infrastructure.</p></li><li><p>Enterprise-grade improvements&#8212;docs, security, and more.</p></li><li><p>A completely reimagined job lifecycle management system.</p></li></ul><p>In a nutshell, Bacalhau v1.8.0 delivers the integrations and capabilities that you need to reduce costs, improve operational efficiency, and unlock the full potential of your distributed infrastructure.</p><p>Let&#8217;s dive into it to learn more!</p><h2>Slash Your Splunk Bill: Intelligent Logging at the Edge</h2><p>Let's talk about the elephant in the room: logging costs. 
For too long, you've faced a terrible choice: ship all your raw data to Splunk and watch your budget evaporate, or sample your data and risk missing the one critical event you need.</p><p>So, what if there were an alternative? Bacalhau&#8217;s new Splunk integration offers a third, much smarter path. Instead of moving mountains of data, Bacalhau lets you ship the computation to the data source. This fundamentally changes the cost equation.</p><p>With v1.8.0, Bacalhau's compute power can be integrated directly into Splunk so that you can:</p><ul><li><p><strong>Process data at the source:</strong> Deploy smart logging agents that filter, aggregate, and analyze data on the edge, sending only high-signal results to Splunk.</p></li><li><p><strong>Fork your data stream:</strong> Send critical, processed alerts to Splunk for real-time monitoring while simultaneously archiving the full, compressed raw logs to cheap object storage.</p></li><li><p><strong>Query live data without ingestion:</strong> Run queries directly on remote nodes to investigate issues in real-time, without moving the data or paying ingestion fees.</p></li><li><p><strong>Replay history on demand:</strong> Need to analyze old logs? No problem. Use Bacalhau's compute power to pull data from your archive, process it, and analyze it in Splunk.</p></li></ul><p>This isn't a minor tweak. It's a paradigm shift that can <strong>cut logging costs by 60-80%</strong> while giving you <em>more</em> analytical power. The best part? It all feels native. 
You use the Splunk dashboards and search queries you already know, while Bacalhau&#8217;s distributed compute engine works transparently in the background.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.bacalhau.org/p/announcing-bacalhau-v180?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.bacalhau.org/p/announcing-bacalhau-v180?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><h2><strong>Enhanced Daemon Jobs</strong></h2><p>Bacalhau v1.8.0 strengthens daemon job reliability with enhanced orchestration. The goal is to ensure comprehensive coverage across dynamic infrastructure.</p><p>Our developers have beefed up the entire system for the reliability that production environments demand:</p><ul><li><p>Node discovery is faster.</p></li><li><p>Constraint checking is more robust.</p></li><li><p>Deployment tracking gives you a clear view of your job's footprint. When a new node spins up in your dynamic infrastructure, you can be confident your daemon job will deploy to it automatically and reliably.</p></li></ul><p>These enhancements are particularly valuable for the Splunk integration, where logging agents must reliably deploy to new infrastructure automatically.</p><h3><strong>Enterprise-Ready by Default</strong></h3><p>Rounding out the release are features designed to make Bacalhau seamless in an enterprise environment:</p><ul><li><p><strong>Managed result storage:</strong> With <a href="https://www.expanso.io/expanso-cloud">Expanso Cloud</a>, you no longer need to configure your own publisher or manage storage credentials. Job results are automatically and securely stored in our managed infrastructure. 
Just run your job and fetch your results.</p></li><li><p><strong>Documentation overhaul:</strong> We&#8217;ve rebuilt our <a href="https://bacalhau.org/docs">docs</a> from the ground up with task-based guides, architectural deep dives, and real-world examples for things like distributed log processing and edge ML.</p></li><li><p><strong>Hardened foundation:</strong> We've continued to strengthen security, authentication (with SSO), and reliability to ensure Bacalhau is ready for your most demanding production workloads.</p></li></ul><h2>Transformative Job Lifecycle Management</h2><p>Bacalhau v1.8.0 introduces a fundamental transformation in how you manage computational workflows. We've moved beyond cryptic, auto-generated job IDs to a powerful name-based system that supports versioning, updates, and intelligent rerunning.</p><h3><strong>Goodbye </strong><em>j-f47ac10b-58cc-4372-a567-0e02b2c3d479</em></h3><p>If you've ever felt the pain of managing workflows using nothing but UUIDs, this update is for you. We've completely reimagined job lifecycle management to be intuitive, version-controlled, and built for humans.</p><p>Jobs are now identified by meaningful names you define in your spec:</p><pre><code><code># Before: Cryptic UUID management
bacalhau job describe j-f47ac10b-58cc-4372-a567-0e02b2c3d479

# After: Intuitive name-based operations
bacalhau job describe monthly-sales-report-generator
</code></code></pre><h3>Advanced Versioning and Dry-Run</h3><p>Every job submission with an existing name creates a new version. The enhanced <em>--dry-run</em> functionality provides server-side diff previews, showing exactly what will change before applying updates:</p><pre><code><code># Preview changes with detailed diff
bacalhau job run --dry-run updated-pipeline.yaml

# Work with specific versions
bacalhau job logs data-analysis --version 2
bacalhau job describe ml-model-training --version 1
</code></code></pre><h3>Intelligent Job Rerunning</h3><p>The new <em>rerun</em> command eliminates the need to resubmit jobs. A rerun creates a new version under the same job name while preserving history:</p><pre><code><code># Rerun the latest version
bacalhau job rerun data-processing-pipeline

# Rerun a specific version
bacalhau job rerun ml-training-job --version 3

</code></code></pre><p>Also, this versioning system is fully backward compatible: existing workflows using job IDs continue to work while new capabilities are available for adoption at your own pace.</p><div><hr></div><h2>What&#8217;s next?</h2><p><a href="https://github.com/bacalhau-project/bacalhau/releases">Bacalhau v1.8.0 is available now</a> with full backward compatibility. Existing workflows continue to operate unchanged while new capabilities are available for immediate adoption:</p><ul><li><p><strong>For Splunk Users</strong>: The Expanso Splunk application is available through Splunkbase and can be deployed in minutes to start realizing immediate cost savings.</p></li><li><p><strong>For Existing Users</strong>: Update to v1.8.0 to access enhanced daemon job orchestration, managed result storage, and the new job lifecycle management capabilities.</p></li><li><p><strong>For New Users</strong>: Our completely rebuilt documentation provides clear pathways for getting started with distributed computing, whether for cost optimization, edge processing, or advanced analytics.</p></li></ul><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.bacalhau.org/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.bacalhau.org/subscribe?"><span>Subscribe now</span></a></p><p></p><h3><strong>Join Us on the Journey: 5 Days of Bacalhau</strong></h3><p>Stay tuned for our "5 Days of Bacalhau" series, in which we'll go deeper into these exciting new features:</p><ul><li><p><strong>Day 1:</strong> Announcing Bacalhau 1.8.0&#8212;Intelligent Edge Computing Meets Enterprise Integration (this post)</p></li><li><p><strong>Day 2:</strong> Rerun, Update, and Version Your Bacalhau Jobs</p></li><li><p><strong>Day 3:</strong> How Bacalhau Boosts Daemon Job Reliability</p></li><li><p><strong>Day 4:</strong> Seamless 
Result Storage with Managed Publishers</p></li><li><p><strong>Day 5:</strong> Distributed Logs Management with Bacalhau and Splunk</p></li></ul><h3><strong>Get Involved!</strong></h3><p>We welcome your involvement in Bacalhau. There are many <a href="https://docs.bacalhau.org/community/ways-to-contribute/">ways to contribute</a>, and we&#8217;d love to hear from you. Reach out at any of the following locations:</p><ul><li><p><a href="https://www.expanso.io/">Expanso&#8217;s Website</a></p></li><li><p><a href="http://bacalhau.org/">Bacalhau&#8217;s Website</a></p></li><li><p><a href="https://bsky.app/profile/bacalhau.org">Bacalhau&#8217;s Bluesky</a></p></li><li><p><a href="http://twitter.com/bacalhauproject">Bacalhau&#8217;s Twitter</a></p></li><li><p><a href="https://twitter.com/ExpansoIO">Expanso&#8217;s Twitter</a></p></li><li><p><a href="https://www.tiktok.com/@expanso.io?_t=ZN-8uypYqUuKTW&amp;_r=1">TikTok</a></p></li><li><p><a href="https://www.youtube.com/@ExpansoIO">Youtube</a></p></li><li><p><a href="https://bit.ly/bacalhau-project-slack">Slack</a></p></li><li><p><a href="https://www.linkedin.com/company/expanso-io">LinkedIn</a></p></li><li><p><a href="https://expanso-inc.breezy.hr/">Careers Page</a></p></li></ul><h3><strong>Commercial Support</strong></h3><p>While Bacalhau is <a href="https://en.wikipedia.org/wiki/Open-source_software">open-source software</a>, the Bacalhau binaries go through the security, verification, and signing build process lovingly crafted by <a href="https://www.expanso.io/">Expanso</a>. <a href="https://www.expanso.io/faq/">Read more about the difference between open-source Bacalhau and commercially supported Bacalhau in the FAQ</a>. 
If you want to use the pre-built binaries and receive commercial support, <a href="https://www.expanso.io/contact/">contact us</a> or <a href="https://cloud.expanso.io/login">get your license</a> on Expanso Cloud!</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.bacalhau.org/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Bacalhau! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[The Elephant in the Room: Why Is Your Data Bill So High?]]></title><description><![CDATA[Beyond 'More': It's Time for Smarter Data Infrastructure That Saves You Money, Not Costs You More.]]></description><link>https://blog.bacalhau.org/p/the-elephant-in-the-room-why-is-your</link><guid isPermaLink="false">https://blog.bacalhau.org/p/the-elephant-in-the-room-why-is-your</guid><dc:creator><![CDATA[Federico Trotta]]></dc:creator><pubDate>Tue, 10 Jun 2025 16:34:24 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/496f1c1f-4afe-4541-ba57-95dfb12d55d2_719x514.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Technology has been on a relentless march toward "more." More data, more powerful AI, more features, more connectivity. You attend conferences, read the industry reports, and the promise is always the same: a bigger, more feature-rich future. 
But this constant pursuit of "more" comes with a hidden, and rapidly growing, cost.</p><p>For most enterprises, data infrastructure has likely become a complex, bloated, and expensive beast. Your ingestion pipelines may be duplicative, and your analytics workloads over-provisioned. The sheer noise in the system drives costs sky-high, leaving you with little budget for actual innovation. It's the elephant in the room: as your capabilities grow, so does the waste.</p><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.bacalhau.org/p/the-elephant-in-the-room-why-is-your?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.bacalhau.org/p/the-elephant-in-the-room-why-is-your?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p></p><h2>A Different Message: You Should Be Paying Less</h2><p>It's time to shift the conversation from "more" to "smarter." At <a href="https://www.expanso.io/">Expanso</a>, we believe it's time to stop accepting data waste as the cost of doing business.</p><p>Expanso's platform was born from a simple, radical idea: you should be able to slash your data infrastructure costs by 40&#8211;80% without ripping out your existing tools and without compromising on performance or security. The platform achieves this by fundamentally simplifying your ingestion pipelines, making intelligent use of your compute resources, and working with the data storage solutions you already have. You get the impact of immediate savings without the pain of replatforming or complex data migrations.</p><p>As our CEO, David Aronchick, often says, &#8220;<em><strong>No one wants to compromise on security or visibility. But everyone wants to stop paying for data waste. 
We built Expanso to give enterprises instant cost control&#8212;so they can reinvest in what matters</strong></em>.&#8221;</p><p>The future shouldn't be about paying more; it should be about doing more with what you have.</p><p></p><p><strong>Read more in our press release on <a href="https://www.businesswire.com/news/home/20250610651390/en/Expanso-Launches-Cost-Optimization-Platform-to-Cut-Enterprise-Data-AI-Costs-by-up-to-80">BusinessWire</a>.</strong></p><p></p><h2>How About Something Practical?</h2><p>Do you want to learn more about how Expanso, through the <a href="https://bacalhau.org/">Bacalhau project</a>, can help you save money, while providing computational power to your data? Read the following articles from our blog and website:</p><ul><li><p><a href="https://blog.bacalhau.org/p/cross-border-data-processing-with">Cross-Border Data Processing With Privacy Compliance Through Expanso</a></p></li><li><p><a href="https://blog.bacalhau.org/p/high-scale-data-processing-cosmosdb-bacalhau">High-Scale Data Processing: Over Thousands of Devices With Azure Cosmos DB and Expanso</a></p></li><li><p><a href="https://www.expanso.io/why-80-of-your-data-should-never-hit-the-cloud">Why 80% of Your Data Should Never Hit the Cloud</a></p></li><li><p><a href="https://blog.bacalhau.org/p/why-cloud-centric-architectures-are">Why Cloud-Centric Architectures Are Breaking Under Data Scale</a></p></li></ul><div><hr></div><h2><strong>What's Next?</strong></h2><p>To start using Bacalhau, <a href="https://docs.bacalhau.org/getting-started/installation">install Bacalhau</a> and give it a shot.</p><p>If you don&#8217;t have a node network available and would still like to try Bacalhau, you can use <a href="https://cloud.expanso.io/login">Expanso Cloud</a>. 
<a href="https://docs.bacalhau.org/getting-started/network-setup">You can also set up a cluster on your own</a> (with setup guides for <a href="https://docs.bacalhau.org/references/operations/readme/setting-up-a-cluster-on-amazon-web-services-aws-with-terraform">AWS</a>, <a href="https://docs.bacalhau.org/references/operations/readme/setting-up-a-cluster-on-google-cloud-platform-gcp-with-terraform">GCP</a>, <a href="https://docs.bacalhau.org/references/operations/readme/setting-up-a-cluster-on-azure-with-terraform">Azure</a>, and more &#128578;).</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.bacalhau.org/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.bacalhau.org/subscribe?"><span>Subscribe now</span></a></p><h3><strong>Get Involved!</strong></h3><p>We welcome your involvement in Bacalhau. There are many <a href="https://docs.bacalhau.org/community/ways-to-contribute/">ways to contribute</a>, and we&#8217;d love to hear from you. 
Reach out at any of the following locations:</p><ul><li><p><a href="https://www.expanso.io/">Expanso&#8217;s Website</a></p></li><li><p><a href="http://bacalhau.org/">Bacalhau&#8217;s Website</a></p></li><li><p><a href="https://bsky.app/profile/bacalhau.org">Bacalhau&#8217;s Bluesky</a></p></li><li><p><a href="http://twitter.com/bacalhauproject">Bacalhau&#8217;s Twitter</a></p></li><li><p><a href="https://twitter.com/ExpansoIO">Expanso&#8217;s Twitter</a></p></li><li><p><a href="https://www.tiktok.com/@expanso.io?_t=ZN-8uypYqUuKTW&amp;_r=1">TikTok</a></p></li><li><p><a href="https://www.youtube.com/@ExpansoIO">Youtube</a></p></li><li><p><a href="https://bit.ly/bacalhau-project-slack">Slack</a></p></li><li><p><a href="https://www.linkedin.com/company/expanso-io">LinkedIn</a></p></li><li><p><a href="https://expanso-inc.breezy.hr/">Careers Page</a></p></li></ul><h3><strong>Commercial Support</strong></h3><p>While Bacalhau is <a href="https://en.wikipedia.org/wiki/Open-source_software">open-source software</a>, the Bacalhau binaries go through the security, verification, and signing build process lovingly crafted by <a href="https://www.expanso.io/">Expanso</a>. <a href="https://www.expanso.io/faq/">Read more about the difference between open-source Bacalhau and commercially supported Bacalhau in the FAQ</a>. If you want to use the pre-built binaries and receive commercial support, <a href="https://www.expanso.io/contact/">contact us</a> or <a href="https://cloud.expanso.io/login">get your license</a> on Expanso Cloud!</p>]]></content:encoded></item><item><title><![CDATA[Why Cloud-Centric Architectures Are Breaking Under Data Scale]]></title><description><![CDATA[Your cloud bills are too high and data is slow? 
Here&#8217;s a better way than just using the cloud!]]></description><link>https://blog.bacalhau.org/p/why-cloud-centric-architectures-are</link><guid isPermaLink="false">https://blog.bacalhau.org/p/why-cloud-centric-architectures-are</guid><dc:creator><![CDATA[Federico Trotta]]></dc:creator><pubDate>Fri, 30 May 2025 15:30:28 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/f7bfbb56-b57c-46f0-91ea-65cabee6d2f3_2048x1463.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If high costs, slow data, and compliance headaches sound like your problems, your current cloud setup might be the cause. So, let's explore a more efficient way to manage your data.</p><p>The cloud changed how IT works, offering great ways to grow and be flexible. But for those who are dealing with huge amounts of distributed data, the cloud isn't always the best answer. What often happens? Costs go up, data moves slowly, and following the compliance rules becomes a big problem.</p><p>This post explains why the old way of using the cloud for data isn't working well anymore. We'll also show you a newer, better solution: Bacalhau and a computing paradigm called Compute-Over-Data.</p><h3>The Cloud Isn't Always Cheap: Costs Go Up, Benefits Go Down</h3><p>When you first look at it, the cloud seems cheaper. You pay for what you use instead of buying your own hardware. But when you have data in many different places, that changes. Moving all your data to the cloud and working on it there can get very expensive. Here&#8217;s why cloud costs can get too high:</p><ul><li><p><strong>Fees for moving data:</strong> You pay to move data into the cloud because you need its powerful computers. If your data is in many places, these fees add up quickly.</p></li><li><p><strong>Costly data storage:</strong> Cloud companies have different ways to store data. 
If you need to get to your data fast for quick checks, you often have to use expensive "hot" storage.</p></li><li><p><strong>High costs for central computers:</strong> Working on huge amounts of data in one main place needs a lot of computer power. Companies often buy more than they need just to be safe, which costs more. You can try to fix this by paying only for what you use, but this is more like a quick fix than a real solution.</p></li></ul><p>These high costs make companies ask <a href="https://www.expanso.io/why-80-of-your-data-should-never-hit-the-cloud">if sending all their data to one place is still a good idea</a>.</p><h3>Slow Data: Why Your Information Can't Keep Up</h3><p>Slow data is a problem for many companies. When you send all your data to a centralized cloud service, it takes time for information to go from where it starts, to the cloud, and back. This delay can make information old and useless by the time it&#8217;s ready.</p><p>As you get more and more data, this problem gets bigger. Trying to push tons of data through pipes to one main spot causes traffic jams. This can take days for very large amounts of data. This is a big problem for:</p><ul><li><p><strong>Edge computing:</strong> This is when you need to make decisions right away, where the data is generated. Sending data far away to a cloud and waiting for instructions is too slow for things that need to happen fast.</p></li><li><p><strong>Quick data checks:</strong> Slow data makes it hard to get fast answers. People who study data often need information quickly to make good choices. If they wait too long for data, they might miss chances to act on time.</p></li><li><p><strong>Real-time warnings:</strong> Delays can make you miss big problems in some cases. For example, systems that find fraud must spot bad actions right away. 
Hospitals need to warn staff immediately about big changes in a patient's health.</p></li></ul><p>When data has to travel far because it's spread out, it slows down. Adding more computers or faster internet to cloud services doesn't fix the problem.</p><h3>Rule Troubles: Following Data Laws in a Central Cloud</h3><p>Following data rules is a big and tricky job for businesses. Laws like Europe's GDPR, California's CCPA, and healthcare's HIPAA are very strict. If you don't follow them, you can get big fines. <a href="https://blog.bacalhau.org/p/cross-border-data-processing-with">Using a centralized cloud can make it harder</a> to follow these rules because:</p><ul><li><p><strong>Data location:</strong> Many laws say that companies must keep people's data in their own country. Using a central cloud could mean you store data in the wrong country. This can break laws and lead to big problems and fines.</p></li><li><p><strong>Moving data across borders:</strong> Sending data to computers in other countries is not simple. It needs careful legal and tech planning because the country it goes to must protect data just as well. This means more paperwork, costs, and risks, especially with private data like medical records.</p></li><li><p><strong>Showing you follow rules:</strong> It's hard to prove you're following the rules when all your data is in one big place. Showing officials exactly where certain data is, who has used it, and that everything you do follows the rules can be a very big job.</p></li><li><p><strong>Dealing with many rules:</strong> Companies that work worldwide have to follow international, national, and local data rules. Trying to use all these different, and sometimes clashing, rules with data in one main place can be hard. 
Because of this, companies might make things too strict and costly, or they might miss some rules to try and follow others.</p></li><li><p><strong>Information leaks:</strong> Even if your main data is safe, systems in one place often gather logs, passwords, and details about computer jobs. If this extra information (called metadata) leaks, it can show private details about your data, what you're computing, or your computer systems. This can give attackers useful information for possible attacks.</p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.bacalhau.org/p/why-cloud-centric-architectures-are?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.bacalhau.org/p/why-cloud-centric-architectures-are?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p></p><h3>Why Your Current Infrastructure Tools Struggle with Spread-Out Data</h3><p>Even common infrastructure tools have trouble with today's spread-out data. As we talked about in our "<a href="https://www.expanso.io/kubernetes-vs-nomad-vs-bacalhau">Kubernetes vs Nomad vs Bacalhau</a>" article, these tools are made to manage apps that are close to their data. Usually, this means in a single data center or cloud area with a good, fast network.</p><p>Also, how they are built still expects data to be easy to get to and brought to the apps. So, trying to make these tools work well with data that's very spread out often means using complicated fixes. 
This can include:</p><ul><li><p>Setting up complex data pipelines to move parts of the data.</p></li><li><p>Trying to link groups of computers (clusters) across far distances.</p></li><li><p>Building your own special ways to get data ready and keep it in sync.</p></li></ul><p>This way of doing things adds more complications, makes more work to keep things running, and can reduce the real benefits these tools are supposed to give.</p><h3>The "Send Everything" Habit Costs a Lot</h3><p>A big part of the problem is a common IT habit: &#8220;We send everything.&#8221; Many systems, data setups, and processes are made to collect all data from everywhere and send it all to one main place, such as a data lake or warehouse, before anyone even looks at it or figures out if it's actually useful.</p><p>This habit means you pay a lot to move, store, and work on "data noise." This "data noise" uses up good resources and causes many costs:</p><ul><li><p>Costs to move data out of where it resides.</p></li><li><p>Costs for the main system to take in data.</p></li><li><p>Storage costs.</p></li><li><p>Costs to search and study the data.</p></li></ul><p>This leads to high bills for internet use, storage space, and computer time, all for a lot of data you don't really need.</p><p>This inefficient habit persists because most of the tools you have are built for it. You put everything in one place because you need the computer power to study the data.</p><p>Luckily, now we have a good, modern answer to this problem!</p><h3>A Smarter Way: Compute-Over-Data with Bacalhau</h3><p>The problems with working on data in one central place show that you need a new way. Instead of fighting against where your data is, you should work with it. 
This means:</p><ul><li><p>Stop moving lots of data to computers.</p></li><li><p>Start using a Compute-Over-Data (CoD) plan.</p></li></ul><h3>The New Idea: Send Computer Power to Your Data, Not Data to Your Computers</h3><p>Compute-Over-Data changes how you work with data. You don't send large amounts of raw data to one place just because you don&#8217;t have enough hardware resources where the data lives. Instead, Compute-Over-Data moves computer tasks to where your data already is.</p><p>Your data might be on edge devices (like sensors), local company computers, or spread across many data centers. The CoD way of computing moves much less data and solves the main problems with central models, like:</p><ul><li><p>Cost.</p></li><li><p>Speed.</p></li><li><p>Rule-following.</p></li></ul><p>When you work on data near where it starts, you can filter, combine, and change it locally. Then, you send to the cloud only the data that's useful for other tasks, and only if you need to!</p><h3>What is Bacalhau? Your Tool for a World of Spread-Out Data</h3><p>Bacalhau is an open-source tool for distributed computing. It's built for the Compute-Over-Data idea. Here are some of its main features:</p><ul><li><p><strong>Simple to use:</strong> Bacalhau works as a client (user tool), an orchestrator (manager), and a compute node (worker computer) all in one small package. This makes it easy to set up and use in many different places. 
It also lets you do <a href="https://blog.bacalhau.org/p/your-fast-track-to-bacalhau-local">fast tests on your own computer</a> that act like a whole cluster deployment, but without all the work of managing infrastructure.</p></li><li><p><strong>Flexible design:</strong> It supports different ways to run code, like <a href="https://bacalhau.org/docs/engines/docker">Docker containers</a> and <a href="https://bacalhau.org/docs/references/developers/workload-onboarding/wasm">WebAssembly (WASM)</a>. It also works with different storage systems, like <a href="https://bacalhau.org/docs/sources/s3">S3</a> and <a href="https://bacalhau.org/docs/sources/ipfs">IPFS</a>, and can get data from web URLs.</p></li><li><p><strong>Handles different job types:</strong> It supports different <a href="https://bacalhau.org/docs/specifications/job/type">kinds of jobs</a>: "batch" jobs for tasks you do once, "ops" jobs for specific tasks on certain computers, "daemon" jobs for tasks that keep running in the background, and "service" jobs for programs that run for a long time.</p></li><li><p><strong>Works reliably:</strong> Bacalhau is made for systems that are spread out. It can handle times when network connections are not steady. This is important for <a href="https://bacalhau.org/use-cases/edge-computing">edge computing</a> and data that is far apart.</p></li></ul><h3>How Bacalhau Solves Data Problems</h3><p>By using Compute-Over-Data, Bacalhau fixes the headaches of working with data in one central place:</p><ul><li><p><strong>Cost savings:</strong> Working on data locally with Bacalhau cuts fees for moving data. By preparing and filtering data where it starts, you can stop paying to move and store "data noise." This saves a lot on network, storage, and central computer costs.</p></li><li><p><strong>Faster data:</strong> Running jobs close to where data is made means less network travel and faster results. 
You get answers quicker, can make decisions in real time, and have faster apps.</p></li><li><p><strong>Better data security and following rules:</strong> Bacalhau lets you work on private data within its safe and legal limits. This means less risk of data getting exposed when moving it. It also makes it easier to <a href="https://blog.bacalhau.org/p/cross-border-data-processing-with">follow data rules like GDPR</a> and data location laws.</p></li></ul><h3>Real Examples: Where Compute-Over-Data With Bacalhau Makes a Difference</h3><p>Bacalhau's Compute-Over-Data way helps in many areas, such as:</p><ul><li><p><strong>Working with large amounts of <a href="https://bacalhau.org/use-cases/log-processing">logs</a>:</strong> Study, filter, and combine logs on the computers or edge devices where they are made. This can cut the amount of data sent to central logging systems by <a href="https://blog.bacalhau.org/p/save-25m-yoy-by-managing-logs-the">over 90%</a>, saving on network, intake, and storage costs.</p></li><li><p><strong>Spread-out <a href="https://bacalhau.org/use-cases/distributed-data-warehousing">data warehouses</a>:</strong> Run search queries directly on data stored in different local databases or cloud storage. This speeds up queries and helps follow data location rules.</p></li><li><p><strong><a href="https://bacalhau.org/use-cases/distributed-machine-learning">Machine learning at the edge (edge ML)</a></strong>: Train ML models or run them directly on edge devices. 
This allows for quick predictions, uses less internet by working on raw sensor data locally, and improves privacy by keeping private data at the edge.</p></li><li><p><strong>Managing spread-out devices:</strong> Safely run commands, update software, and collect information from many <a href="https://bacalhau.org/use-cases/fleet-management">spread-out devices</a> or computers without needing direct access to each one or constant network connections.</p></li></ul><h3>Conclusion</h3><p>Using one central cloud for everything causes problems when you have large amounts of data distributed over different locations. These problems include higher costs, slow data movement, and complex rule-following.</p><p>There's a different way, and it's called Compute-Over-Data (CoD). Bacalhau is a system designed for this CoD idea that sends computer tasks to where the data is, instead of moving large amounts of data.</p><p>Using Bacalhau reduces data movement, which lowers transfer and storage costs. It also makes it easier to follow data rules by letting you work on data within certain areas or security limits.</p><p>For companies working with distributed data, Bacalhau gives a way to fix the limits of old central ways of working with data.</p><div><hr></div><h3>What's Next?</h3><p>To start using Bacalhau, <a href="https://docs.bacalhau.org/getting-started/installation">install Bacalhau</a> and give it a try.</p><p>If you don&#8217;t have a group of computers (a network) ready and would still like to try Bacalhau, we suggest using <a href="https://www.expanso.io/expanso-cloud">Expanso Cloud</a>. 
You can also set up your own group of computers (we have setup guides for <a href="https://docs.bacalhau.org/getting-started/setting-up-your-own-cluster/aws">AWS</a>, <a href="https://docs.bacalhau.org/getting-started/setting-up-your-own-cluster/gcp">GCP</a>, <a href="https://docs.bacalhau.org/getting-started/setting-up-your-own-cluster/azure">Azure</a>, and <a href="https://docs.bacalhau.org/getting-started/setting-up-your-own-cluster">more</a> &#128578;).</p><h3>Get Involved!</h3><p>We'd like you to be part of Bacalhau. There are many ways to help, and we&#8217;d love to hear from you. You can find us here:</p><ul><li><p><a href="https://www.expanso.io/">Expanso&#8217;s Website</a></p></li><li><p><a href="https://www.bacalhau.org/">Bacalhau&#8217;s Website</a></p></li><li><p><a href="https://bsky.app/profile/bacalhau.org">Bacalhau&#8217;s Bluesky</a></p></li><li><p><a href="https://twitter.com/BacalhauProject">Bacalhau&#8217;s Twitter</a></p></li><li><p><a href="https://twitter.com/expansoinc">Expanso&#8217;s Twitter</a></p></li><li><p><a href="https://www.tiktok.com/@bacalhauproject">TikTok</a></p></li><li><p><a href="https://www.youtube.com/@bacalhauproject">Youtube</a></p></li><li><p><a href="https://bacalhau.org/slack/">Slack</a></p></li><li><p><a href="https://www.linkedin.com/company/expansoinc/">LinkedIn</a></p></li><li><p><a href="https://www.expanso.io/careers">Careers Page</a></p></li></ul><h3>Commercial Support</h3><p>Bacalhau is open-source software, but the official Bacalhau program files are made by Expanso with a careful security, checking, and signing process. You can read more about the difference between open-source Bacalhau and Bacalhau with commercial support in our <a href="https://www.expanso.io/faq">FAQ</a>. 
If you want to use our pre-built program files and get commercial support, please <a href="https://www.expanso.io/contact-us">contact us</a> or get your license on <a href="https://www.expanso.io/expanso-cloud">Expanso Cloud</a>!</p>]]></content:encoded></item><item><title><![CDATA[Cross-Border Data Processing With Privacy Compliance Through Expanso]]></title><description><![CDATA[Using Bacalhau to handle complex data pipelines that cross borders while preserving privacy]]></description><link>https://blog.bacalhau.org/p/cross-border-data-processing-with</link><guid isPermaLink="false">https://blog.bacalhau.org/p/cross-border-data-processing-with</guid><dc:creator><![CDATA[Chris Chinchilla]]></dc:creator><pubDate>Thu, 22 May 2025 15:46:19 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ITxH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e5fdabb-a2f1-4098-9234-f6d089ec2a6f_2048x1463.heic" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ITxH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e5fdabb-a2f1-4098-9234-f6d089ec2a6f_2048x1463.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ITxH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e5fdabb-a2f1-4098-9234-f6d089ec2a6f_2048x1463.heic 424w, https://substackcdn.com/image/fetch/$s_!ITxH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e5fdabb-a2f1-4098-9234-f6d089ec2a6f_2048x1463.heic 848w, 
https://substackcdn.com/image/fetch/$s_!ITxH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e5fdabb-a2f1-4098-9234-f6d089ec2a6f_2048x1463.heic 1272w, https://substackcdn.com/image/fetch/$s_!ITxH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e5fdabb-a2f1-4098-9234-f6d089ec2a6f_2048x1463.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ITxH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e5fdabb-a2f1-4098-9234-f6d089ec2a6f_2048x1463.heic" width="1456" height="1040" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5e5fdabb-a2f1-4098-9234-f6d089ec2a6f_2048x1463.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1040,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:59406,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.bacalhau.org/i/164140662?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e5fdabb-a2f1-4098-9234-f6d089ec2a6f_2048x1463.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ITxH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e5fdabb-a2f1-4098-9234-f6d089ec2a6f_2048x1463.heic 424w, 
https://substackcdn.com/image/fetch/$s_!ITxH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e5fdabb-a2f1-4098-9234-f6d089ec2a6f_2048x1463.heic 848w, https://substackcdn.com/image/fetch/$s_!ITxH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e5fdabb-a2f1-4098-9234-f6d089ec2a6f_2048x1463.heic 1272w, https://substackcdn.com/image/fetch/$s_!ITxH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e5fdabb-a2f1-4098-9234-f6d089ec2a6f_2048x1463.heic 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Many organizations work with clients and infrastructure around the world and face significant challenges ensuring they follow privacy regulations as their application data flows across borders.</p><p>Data privacy regulations such as the General Data Protection Regulation (GDPR) in Europe and the California Consumer Privacy Act (CCPA) in California impose requirements on how applications can store, process, and transfer personal and sensitive information across borders.</p><p>The core challenge lies in maintaining data sovereignty and conforming to these rules while enabling cross-border analytics. For instance, when collecting personal data in the European Union, regulations often require that this data remain within EU borders. However, organizations still need to perform analytics on this data in other regions, creating a complex compliance challenge that requires careful architectural considerations.</p><p>Organizations often have to process large volumes of data efficiently. 
They need a distributed processing approach that maintains data locality and sovereignty while providing reliable job orchestration and monitoring that can scale in parallel based on demand.</p><p>This post looks at how you can use Bacalhau to handle distributed cross-border processing and anonymize data with <a href="https://microsoft.github.io/presidio/">Microsoft Presidio</a> to help meet some of these requirements.</p><p><a href="https://www.bacalhau.org/">Bacalhau</a> is an open-source distributed platform that enables you to run compute jobs where data is generated and stored.</p><h2>A practical guide</h2><p>This tutorial creates a synthetic dataset in the EU, anonymizes it with <a href="https://microsoft.github.io/presidio/">Microsoft Presidio</a>, which analyzes, extracts, and anonymizes sensitive data, and only then migrates the result to the USA.</p><h2>Prerequisites</h2><p>Before starting, ensure your system meets the following minimum requirements:</p><ul><li><p>20 GB of free disk space</p></li><li><p>4 CPU cores</p></li><li><p><a href="https://docs.docker.com/engine/install/">Docker</a> and <a href="https://docs.docker.com/compose/install/">Docker Compose</a> installed</p></li></ul><h2>Set up multi-region deployment environment</h2><p><a href="https://github.com/bacalhau-project/bacalhau-network-setups/blob/main/docker-compose/multi-region/docker-compose.yml">This Docker Compose file</a> sets up a multi-region Bacalhau deployment with one orchestrator node that receives and schedules jobs plus, in each region (here, the US and the EU), three compute nodes that execute the jobs and one MinIO storage node. The deployment separates the regions <a 
href="https://docs.bacalhau.org/references/guides/jobs/using-labels-and-constraints">using labels and constraints</a>.</p><p>The storage nodes use <a href="https://min.io/">MinIO</a>, an S3-compatible object storage server, to store the data.</p><pre><code>git clone https://github.com/bacalhau-project/bacalhau-network-setups
cd bacalhau-network-setups/docker-compose/multi-region
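# Optional sanity check (an addition to the original steps, not part of the
# upstream instructions): list the services this Compose file defines
# before starting them
docker compose config --services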
docker compose up -d</code></pre><p>The Bacalhau solution implements a multi-regional data processing architecture that strictly adheres to data sovereignty requirements while enabling efficient cross-border analytics. The architecture consists of three main components:&nbsp;</p><ul><li><p>Regional compute resources in each geographical region.</p></li><li><p>Distributed storage system spread across each region that can be queried individually or as a whole</p></li><li><p>An orchestration layer that coordinates jobs and requests across the system.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UO7F!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7f27e2-08a9-4573-89fe-35becdc6f1fa_1369x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UO7F!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7f27e2-08a9-4573-89fe-35becdc6f1fa_1369x1600.png 424w, https://substackcdn.com/image/fetch/$s_!UO7F!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7f27e2-08a9-4573-89fe-35becdc6f1fa_1369x1600.png 848w, https://substackcdn.com/image/fetch/$s_!UO7F!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7f27e2-08a9-4573-89fe-35becdc6f1fa_1369x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!UO7F!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7f27e2-08a9-4573-89fe-35becdc6f1fa_1369x1600.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!UO7F!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7f27e2-08a9-4573-89fe-35becdc6f1fa_1369x1600.png" width="1369" height="1600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1a7f27e2-08a9-4573-89fe-35becdc6f1fa_1369x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1369,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;bacalhau-data-anonynization.png&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="bacalhau-data-anonynization.png" title="bacalhau-data-anonynization.png" srcset="https://substackcdn.com/image/fetch/$s_!UO7F!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7f27e2-08a9-4573-89fe-35becdc6f1fa_1369x1600.png 424w, https://substackcdn.com/image/fetch/$s_!UO7F!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7f27e2-08a9-4573-89fe-35becdc6f1fa_1369x1600.png 848w, https://substackcdn.com/image/fetch/$s_!UO7F!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7f27e2-08a9-4573-89fe-35becdc6f1fa_1369x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!UO7F!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7f27e2-08a9-4573-89fe-35becdc6f1fa_1369x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div 
class="pencraft pc-display-flex pc-gap-8 pc-reset"></div></div></div></a></figure></div><h2>Install the Bacalhau CLI</h2><p>To interact with the newly created Bacalhau deployment, <a href="https://docs.bacalhau.org/getting-started/installation">install the Bacalhau CLI</a>:</p><pre><code>curl -sL 'https://get.bacalhau.org/install.sh' | bash</code></pre><p>To verify that you are targeting the right Bacalhau deployment, run the command below.</p><pre><code>bacalhau node list</code></pre><p>You should see a list of 7 nodes: 1 orchestrator and 6 compute nodes. Something like the below:</p><pre><code> ID            TYPE       APPROVAL  STATUS     LABELS                                                              CPU   MEMORY   DISK     GPU

 compute-eu-1  Compute    APPROVED  CONNECTED  Architecture=arm64, Operating-System=linux, region=eu              8.0   6.6 GB   861 GB   0

 compute-eu-2  Compute    APPROVED  CONNECTED  Architecture=arm64, Operating-System=linux, region=eu              8.0   6.6 GB   861 GB   0

 compute-eu-3  Compute    APPROVED  CONNECTED  Architecture=arm64, Operating-System=linux, region=eu              8.0   6.6 GB   861 GB   0

 compute-us-1  Compute    APPROVED  CONNECTED  Architecture=arm64, Operating-System=linux, region=us              8.0   6.6 GB   861 GB   0

 compute-us-2  Compute    APPROVED  CONNECTED  Architecture=arm64, Operating-System=linux, region=us              8.0   6.6 GB   861 GB   0

 compute-us-3  Compute    APPROVED  CONNECTED  Architecture=arm64, Operating-System=linux, region=us              8.0   6.6 GB   861 GB   0

 orchestrator  Requester  APPROVED  CONNECTED  Architecture=arm64, Operating-System=linux, region=global, type=orchestrator</code></pre><h2>Clone example job repository</h2><p>In a separate directory, clone the examples repository and navigate to the data anonymization example folder.</p><pre><code>git clone https://github.com/bacalhau-project/examples.git
cd examples/data-engineering/data-anonymization-with-microsoft-presidio/</code></pre><p>You can find the job specifications used for the rest of this post in the <em>jobs</em> folder, and more details on the possible specification options in the <a href="https://docs.bacalhau.org/getting-started/cli/submitting-jobs#reusable-jobs-yaml-specification">Bacalhau documentation</a>.</p><h2>Generate fake sensitive data</h2><p>The <a href="https://github.com/bacalhau-project/examples/blob/main/data-engineering/data-anonymization-with-microsoft-presidio/jobs/data-generator.yaml">data-generator.yaml</a> job runs a bash script in a Docker container that generates 30 files. Each file simulates a memo full of personal data such as names, phone numbers, and addresses. The job generates the data on compute nodes in the EU region and then pushes those files to an EU-based MinIO bucket.</p><p>Submit the job to the compute nodes in the Bacalhau cluster labeled <code>region=eu</code> using the command below:</p><pre><code>bacalhau job run -V Region=eu jobs/data-generator.yaml</code></pre><h2>Anonymize the data</h2><p>The <a href="https://github.com/bacalhau-project/examples/blob/main/data-engineering/data-anonymization-with-microsoft-presidio/jobs/anonymize-job.yaml">anonymize-job.yaml</a> job runs a Python script in a Docker container that uses <a href="https://github.com/microsoft/presidio">Microsoft Presidio</a> to analyze, extract, and anonymize sensitive data in the files stored in the EU-based MinIO bucket.</p><p>Presidio is an open-source toolkit that uses NLP models to identify and anonymize sensitive information in structured and unstructured data formats. 
It can process different content types, including:</p><ul><li><p>Unstructured text documents and communications.</p></li><li><p>Emails and business correspondence.</p></li><li><p>Internal memos and reports.</p></li><li><p>Images containing sensitive information.</p></li><li><p>Business documents and forms.</p></li></ul><p>Presidio&#8217;s strength lies in recognizing multiple types of Personally Identifiable Information (PII). It can identify and sanitize sensitive elements such as names, addresses, identification numbers, and other personal information while maintaining the document&#8217;s structure and meaning. This capability is well suited to preparing data for cross-border transfers while maintaining compliance with data protection regulations.</p><p>The job outputs the anonymized files to a US-based MinIO bucket.</p><p>Submit the job to the compute nodes in the Bacalhau cluster labeled <code>region=eu</code> using the command below:</p><pre><code>bacalhau job run -V Region=eu jobs/anonymize-job.yaml</code></pre><p>This job takes a while to process. You can check its status with the <code>bacalhau job executions &lt;job-id&gt;</code> command.</p><p>Presidio anonymizes the data by replacing sensitive information with generic placeholders. For example, it replaces names with <code>&lt;PERSON&gt;</code>, IBAN codes with <code>&lt;IBAN_CODE&gt;</code>, and dates with <code>&lt;DATE_TIME&gt;</code>.</p><h2>Bacalhau input and output configuration</h2><p>The <a href="https://docs.bacalhau.org/cli-api/specifications/job/input-source">InputSources</a> and <a href="https://docs.bacalhau.org/cli-api/specifications/job/result-path">ResultPaths</a> sections of the <a href="https://github.com/bacalhau-project/examples/blob/main/data-engineering/data-anonymization-with-microsoft-presidio/jobs/anonymize-job.yaml">anonymize-job.yaml</a> specification are the key components that enable anonymized cross-border data processing.</p><pre><code>InputSources:
  - Target: /inputs
    Source:
      Type: s3
      Params:
        Bucket: my-bucket
        Key: "confidential-memos/"
        Endpoint: "http://storage-local:9000"
        Region: "eu-central-1"
        Filter: ".*txt$"</code></pre><p>This job uses MinIO to simulate an S3 bucket in the EU region.</p><p>The input location matches the <a href="https://docs.bacalhau.org/cli-api/specifications/publishers">Publisher</a> section of the <a href="https://github.com/bacalhau-project/examples/blob/main/data-engineering/data-anonymization-with-microsoft-presidio/jobs/data-generator.yaml">data-generator.yaml</a> job, which defines where that job writes its output: in this case, a MinIO S3-compatible bucket in an EU region.</p><pre><code>Publisher:
  Type: s3
  Params:
    Bucket: my-bucket
    Key: "confidential-memos/{nodeID}/"
    Endpoint: "http://storage-local:9000"
    Region: "eu-central-1"
    Encoding: plain</code></pre><p>After anonymizing the data, the job writes the output to a different MinIO bucket in a US region, using a different MinIO endpoint.</p><pre><code>ResultPaths:
  - Name: anonymized-memos
    Path: /anonymized-output
Publisher:
  Type: "s3"
  Params:
    Bucket: "my-bucket"
    Key: "anonymized-memos/{date}/{time}/memos-{executionID}"
    Endpoint: "http://storage-us:9000"
    Region: "us-east-1"</code></pre><p>The configuration uses a dynamic <a href="https://docs.bacalhau.org/cli-api/specifications/publishers/s3">Key</a> structure using <code>{date}</code>, <code>{time}</code>, and <code>{executionID}</code> to create a well-organized storage hierarchy that makes it easier to track different processing runs.</p><h2>Cleanup</h2><p>When you&#8217;ve finished with the example, you can clean up the environment with the following commands:</p><pre><code># Stop the stack
docker compose down -v

# Optional: remove any other unused volumes (applies to the whole Docker host)
docker volume prune</code></pre><h2>Summary</h2><p>This post showed how you can use Bacalhau to maintain clear boundaries between sensitive and anonymized data by taking the following steps:</p><ul><li><p>Accessing input data only from a source in an EU region</p></li><li><p>Processing it with Presidio within the same region as the sensitive data</p></li><li><p>Publishing only anonymized results to a US region</p></li><li><p>Leaving US-based compute nodes free to run analytics on the sanitized data</p></li></ul><p>This process and setup help you meet data sovereignty requirements while enabling efficient cross-region data processing and analytics.</p>]]></content:encoded></item><item><title><![CDATA[Your Fast Track to Bacalhau: Local Development via Docker-in-Docker]]></title><description><![CDATA[Docker-in-Docker With Bacalhau For Fast Tests And Low Barrier Entry]]></description><link>https://blog.bacalhau.org/p/your-fast-track-to-bacalhau-local</link><guid isPermaLink="false">https://blog.bacalhau.org/p/your-fast-track-to-bacalhau-local</guid><dc:creator><![CDATA[Federico Trotta]]></dc:creator><pubDate>Thu, 15 May 2025 14:47:29 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/541d705c-56d1-4d7e-9c4c-d837fec48083_2048x1463.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><a href="https://www.bacalhau.org/">Bacalhau</a> is an open-source platform designed for &#8220;Compute-Over-Data&#8221;, allowing you to run jobs where your data resides. This avoids the <a href="https://www.expanso.io/cloud-orchestration-cost-optimization">costly</a> process of moving massive amounts of data to other <a href="https://www.expanso.io/kubernetes-vs-nomad-vs-bacalhau">centralized locations</a>. 
While Bacalhau is easy to use, getting started and experimenting with a full deployment can involve:</p><ul><li><p>Setting up cloud resources.</p></li><li><p>Configuring networks.</p></li><li><p>Managing multiple components.</p></li></ul><p>This process may feel complicated if you just want to try new things on the fly.</p><p>This article explains how that setup works and how engineers at Expanso have made it more straightforward for situations where speed is key.</p><p>Let&#8217;s dive into it!</p><h2>The Bacalhau Basics</h2><p><a href="https://docs.bacalhau.org/">Bacalhau</a> is a framework that manages distributed computing networks that can operate across diverse infrastructures. It prioritizes keeping data in place and bringing computational power to the data. It does so with the <a href="https://docs.bacalhau.org/overview/key-concepts#node-types">following nodes</a>:</p><ul><li><p><strong>Orchestrator nodes</strong>: An orchestrator node is a component responsible for coordinating the execution of jobs across the network. Its primary role is to handle job scheduling, deciding which compute nodes are best suited to run a particular task, based on specified constraints.</p></li><li><p><strong>Compute nodes</strong>: A compute node is a worker machine that executes the actual computational tasks assigned by an orchestrator. These nodes run the jobs, process the data, and generate results. Like orchestrator nodes, compute nodes are also instances of the single Bacalhau binary, but configured to operate in compute mode.</p></li><li><p><strong>Hybrid nodes:</strong> These nodes serve both roles at once. They are often used for local development or small setups.</p></li></ul><p>So, Bacalhau consists of an orchestrator that schedules jobs and multiple compute nodes that execute those jobs. 
These jobs run on the compute nodes, either in <a href="https://docs.bacalhau.org/cli-api/specifications/engines/docker">Docker</a> containers or as <a href="https://docs.bacalhau.org/cli-api/specifications/engines/wasm">WASM</a> modules. Setting this up traditionally can involve:</p><ol><li><p>Provisioning Virtual Machines (VMs) or instances for the orchestrator and compute nodes.</p></li><li><p>Setting up S3-compatible storage for data input/output.</p></li><li><p>Installing Docker on all compute nodes.</p></li><li><p>Configuring networking and credentials for all components to communicate.</p></li></ol><p>Let&#8217;s be honest: these are barriers you want to eliminate when you are experimenting rapidly. At Expanso, we know this, and our engineers have got you covered!</p><h2>The Local Solution: Docker Compose To The Rescue</h2><p>The engineers started from a simple question: &#8220;How do we eliminate these barriers?&#8221;</p><p>Bacalhau runs containerized applications, but a single container only lets you work at the level of one node. The engineers wanted to mimic the whole traditional setup so that it could be replicated on a single machine without:</p><ul><li><p>Provisioning VMs.</p></li><li><p>Getting credentials and setting up S3 storage.</p></li><li><p>Installing Docker on multiple compute nodes.</p></li></ul><p>The solution is straightforward: they used Docker Compose, a tool for defining and running multi-container applications. Engineers at Expanso developed a Docker Compose file that defines all the services and their connections. This way, with <em>docker compose up</em>, you bring the entire Bacalhau cluster to life on your machine in a few seconds.</p><h2>The Inception Moment: Docker-in-Docker (DinD) Comes Into The Game</h2><p>Here&#8217;s where it gets interesting. In the Docker Compose setup, the compute nodes are themselves already running as Docker containers. 
So, here&#8217;s the challenge: how does a Docker container (the compute node) launch <em>another</em> Docker container (the actual job)?</p><p>This is where <strong>Docker-in-Docker (DinD)</strong> comes in. Docker provides special <code>docker:dind</code> images. A container started from a DinD image runs its own independent Docker daemon.</p><p>By basing your Bacalhau compute node containers on a DinD image, you allow them to pull other Docker images and run containers within their own isolated environment.</p><p>The workflow looks like this:</p><ol><li><p>You run <em>docker compose up</em>.</p></li><li><p>Docker Compose starts the orchestrator, <a href="https://docs.bacalhau.org/common-workflows/mounting-input-data#s3-compatible-storage">MinIO</a> (an open-source, <a href="https://min.io/">S3-compatible object storage server</a>), and the compute node containers (which are based on DinD images).</p></li><li><p>You submit a job to the local Bacalhau orchestrator.</p></li><li><p>The orchestrator assigns the job to one of the compute node containers.</p></li><li><p>The compute node container pulls the required Docker image for the job and runs the job inside a new container, nested within itself.</p></li></ol><p>That&#8217;s it!</p><p>This mimics how a distributed Bacalhau deployment operates, but it is contained within the local Docker environment on your machine, without:</p><ul><li><p>Cloud accounts.</p></li><li><p>Infrastructure management.</p></li><li><p>Overhead.</p></li><li><p>Fear of messing things up. You can destroy your containers and recreate them at any time.</p></li></ul><h2>Why This Matters: The Benefits</h2><p>This DinD approach offers you advantages like:</p><ul><li><p><strong>Low entry barrier:</strong> No need for cloud accounts or complex infrastructure setup. 
Just Docker Desktop (or Docker Engine) and the Compose YAML file.</p></li><li><p><strong>Realistic simulation:</strong> Faithfully replicates the multi-component architecture and the container-based job execution of a real Bacalhau deployment.</p></li><li><p><strong>Safe sandbox:</strong> You can experiment, break things, and test configurations. If something goes wrong, you can simply run <em>docker compose down</em> and <em>docker compose up</em> to start a fresh instance.</p></li><li><p><strong>Rapid iteration:</strong> You can quickly test changes to Bacalhau configurations, job specifications, or even Bacalhau itself, if you&#8217;re contributing to <a href="https://github.com/bacalhau-project/bacalhau">the project</a>.</p></li><li><p><strong>Offline capability:</strong> You can develop and test without an internet connection (once the initial Docker images are pulled).</p></li></ul><p>Typical situations where the DinD approach with Bacalhau comes in handy are when:</p><ul><li><p>You need to create a fast Proof Of Concept (POC) for a customer.</p></li><li><p>You have to try or add a new feature on the fly.</p></li><li><p>You just want to give Bacalhau a quick try.</p></li></ul><h2>How To Implement It</h2><p>After discussing the theory, let&#8217;s see how to get started with DinD and Bacalhau.</p><h3>Prerequisites</h3><p>To replicate the following steps, your system must meet the following prerequisites:</p><ul><li><p><strong>Docker Engine</strong>: You need 
to have <a href="https://docs.docker.com/engine/install/">Docker installed</a> and running on your system.</p></li><li><p><strong>Docker Compose:</strong> You also need <a href="https://docs.docker.com/compose/">Docker Compose</a>:</p><ul><li><p><strong>Docker Desktop:</strong> If you are using Docker Desktop, Docker Compose is typically included as part of the installation. To verify you have it installed, type <em>docker compose version</em>.</p></li><li><p><strong>Server/Manual install:</strong> If you installed Docker Engine manually, you might need to <a href="https://docs.docker.com/compose/install/">install Docker Compose</a> separately.</p></li></ul></li></ul><h3>Step 1: Clone The Repository</h3><p>Clone the repository:</p><pre><code><code>git clone https://github.com/bacalhau-project/bacalhau-network-setups</code></code></pre><p>Note that the documentation for using it is in the <em><a href="https://github.com/bacalhau-project/bacalhau-network-setups/tree/main/docker-compose">docker-compose/</a></em> folder.</p><p>This is the structure of the cloned repository:</p><pre><code><code>bacalhau-network-setups/
   &#9500;&#9472;&#9472;docker-compose/
   &#9474;&#9;&#9500;&#9472;&#9472;expanso-cloud/
   &#9474;&#9;&#9500;&#9472;&#9472;multi-region/
   &#9474;&#9;&#9500;&#9472;&#9472;single-region/
   &#9474;&#9;
   &#9492;&#9472;&#9472;README.MD</code></code></pre><h3>Step 2: Choose Your Setup</h3><p>You can choose from different setups:</p><ul><li><p>Single region.</p></li><li><p>Multi region</p></li><li><p>Expanso Cloud</p></li></ul><p>Suppose you want a single region. From the main folder, you have to move to the <em>single-region/</em> folder:</p><pre><code><code>cd docker-compose/single-region</code></code></pre><p>Very well. You are now ready to launch Docker Compose!</p><h3>Step 3: Launch Docker Compose</h3><p>From the <em>single-region/</em> folder, launch Docker Compose:</p><pre><code><code>docker compose up -d</code></code></pre><p>After a few seconds, the process will be completed:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MVtS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b76a6c-9a03-4f4d-90f8-b365406b2284_480x797.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MVtS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b76a6c-9a03-4f4d-90f8-b365406b2284_480x797.png 424w, https://substackcdn.com/image/fetch/$s_!MVtS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b76a6c-9a03-4f4d-90f8-b365406b2284_480x797.png 848w, https://substackcdn.com/image/fetch/$s_!MVtS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b76a6c-9a03-4f4d-90f8-b365406b2284_480x797.png 1272w, 
https://substackcdn.com/image/fetch/$s_!MVtS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b76a6c-9a03-4f4d-90f8-b365406b2284_480x797.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MVtS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b76a6c-9a03-4f4d-90f8-b365406b2284_480x797.png" width="480" height="797" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/85b76a6c-9a03-4f4d-90f8-b365406b2284_480x797.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:797,&quot;width&quot;:480,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:58547,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bacalhau.org/i/163636836?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b76a6c-9a03-4f4d-90f8-b365406b2284_480x797.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MVtS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b76a6c-9a03-4f4d-90f8-b365406b2284_480x797.png 424w, https://substackcdn.com/image/fetch/$s_!MVtS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b76a6c-9a03-4f4d-90f8-b365406b2284_480x797.png 848w, https://substackcdn.com/image/fetch/$s_!MVtS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b76a6c-9a03-4f4d-90f8-b365406b2284_480x797.png 1272w, 
https://substackcdn.com/image/fetch/$s_!MVtS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b76a6c-9a03-4f4d-90f8-b365406b2284_480x797.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Good. 
You now have a working single-region Bacalhau instance.</p><h3>Step 4: Run Your Jobs</h3><p>Connect to the client container:</p><pre><code><code>docker compose exec client sh</code></code></pre><p>List the jobs:</p><pre><code><code>bacalhau job list</code></code></pre><p>Since no jobs have run yet, the expected result is just the header row:</p><pre><code><code>CREATED  ID  JOB  TYPE  STATE</code></code></pre><p>Now you can run your jobs:</p><pre><code><code>bacalhau job run ...</code></code></pre><p>Good! You have launched your jobs with a Bacalhau instance running entirely on your machine.</p><h3>Step 5: Conclude And Cleanup</h3><p>When all the jobs are done, exit the client container and clean everything up with:</p><pre><code><code>docker compose down -v --remove-orphans</code></code></pre><p>This will:</p><ul><li><p>Stop all containers.</p></li><li><p>Remove all containers.</p></li><li><p>Remove all volumes.</p></li><li><p>Remove any orphaned containers.</p></li><li><p>Remove all networks.</p></li></ul><p>Here is the expected result:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nuK2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefd257da-967d-486c-9771-4bf83e8a51fa_477x225.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nuK2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefd257da-967d-486c-9771-4bf83e8a51fa_477x225.png 424w, https://substackcdn.com/image/fetch/$s_!nuK2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefd257da-967d-486c-9771-4bf83e8a51fa_477x225.png 848w, 
https://substackcdn.com/image/fetch/$s_!nuK2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefd257da-967d-486c-9771-4bf83e8a51fa_477x225.png 1272w, https://substackcdn.com/image/fetch/$s_!nuK2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefd257da-967d-486c-9771-4bf83e8a51fa_477x225.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nuK2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefd257da-967d-486c-9771-4bf83e8a51fa_477x225.png" width="477" height="225" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/efd257da-967d-486c-9771-4bf83e8a51fa_477x225.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:225,&quot;width&quot;:477,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:19037,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bacalhau.org/i/163636836?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefd257da-967d-486c-9771-4bf83e8a51fa_477x225.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nuK2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefd257da-967d-486c-9771-4bf83e8a51fa_477x225.png 424w, https://substackcdn.com/image/fetch/$s_!nuK2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefd257da-967d-486c-9771-4bf83e8a51fa_477x225.png 848w, 
https://substackcdn.com/image/fetch/$s_!nuK2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefd257da-967d-486c-9771-4bf83e8a51fa_477x225.png 1272w, https://substackcdn.com/image/fetch/$s_!nuK2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fefd257da-967d-486c-9771-4bf83e8a51fa_477x225.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Your environment is now clean, and you can bring up a fresh instance whenever you need one!</p><h2>Conclusion</h2><p>By leveraging the Docker-in-Docker technique, you can create self-contained Bacalhau environments locally. This approach:</p><ul><li><p>Lowers the barrier to entry, providing an efficient environment for you to learn, develop, and test Bacalhau.</p></li><li><p>Allows you to test new features on the fly, without worrying about infrastructure management. This saves time and overhead.</p></li></ul><p>If you want to know more about the architecture and the setup, read <a href="https://github.com/bacalhau-project/bacalhau/tree/main/docker-compose-deployment">this README file</a>.</p><div><hr></div><h2><strong>What&#8217;s Next?</strong></h2><p>To start using Bacalhau, <a href="https://docs.bacalhau.org/getting-started/installation">install Bacalhau</a> and give it a shot.</p><p>If you don&#8217;t have a node network available and would still like to try Bacalhau, you can use <a href="https://cloud.expanso.io/login">Expanso Cloud</a>. 
<a href="https://docs.bacalhau.org/getting-started/network-setup">You can also set up a cluster on your own</a> (with setup guides for <a href="https://docs.bacalhau.org/references/operations/readme/setting-up-a-cluster-on-amazon-web-services-aws-with-terraform">AWS</a>, <a href="https://docs.bacalhau.org/references/operations/readme/setting-up-a-cluster-on-google-cloud-platform-gcp-with-terraform">GCP</a>, <a href="https://docs.bacalhau.org/references/operations/readme/setting-up-a-cluster-on-azure-with-terraform">Azure</a>, and more &#128578;).</p><h3><strong>Get Involved!</strong></h3><p>We welcome your involvement in Bacalhau. There are many <a href="https://docs.bacalhau.org/community/ways-to-contribute/">ways to contribute</a>, and we&#8217;d love to hear from you. Reach out at any of the following locations:</p><ul><li><p><a href="https://www.expanso.io/">Expanso&#8217;s Website</a></p></li><li><p><a href="http://bacalhau.org/">Bacalhau&#8217;s Website</a></p></li><li><p><a href="https://bsky.app/profile/bacalhau.org">Bacalhau&#8217;s Bluesky</a></p></li><li><p><a href="http://twitter.com/bacalhauproject">Bacalhau&#8217;s Twitter</a></p></li><li><p><a href="https://twitter.com/ExpansoIO">Expanso&#8217;s Twitter</a></p></li><li><p><a href="https://www.tiktok.com/@expanso.io?_t=ZN-8uypYqUuKTW&amp;_r=1">TikTok</a></p></li><li><p><a href="https://www.youtube.com/@ExpansoIO">Youtube</a></p></li><li><p><a href="https://bit.ly/bacalhau-project-slack">Slack</a></p></li><li><p><a href="https://www.linkedin.com/company/expanso-io">LinkedIn</a></p></li><li><p><a href="https://expanso-inc.breezy.hr/">Careers Page</a></p></li></ul><h3><strong>Commercial Support</strong></h3><p>While Bacalhau is <a href="https://en.wikipedia.org/wiki/Open-source_software">open-source software</a>, the Bacalhau binaries go through the security, verification, and signing build process lovingly crafted by <a href="https://www.expanso.io/">Expanso</a>. 
<a href="https://www.expanso.io/faq/">Read more about the difference between open-source Bacalhau and commercially supported Bacalhau in the FAQ</a>. If you want to use the pre-built binaries and receive commercial support, <a href="https://www.expanso.io/contact/">contact us</a> or <a href="https://cloud.expanso.io/login">get your license</a> on Expanso Cloud!</p>]]></content:encoded></item><item><title><![CDATA[Getting Started with Machine Learning on Bacalhau]]></title><description><![CDATA[Distributed Machine Learning needn't be complex with the help of Bacalhau]]></description><link>https://blog.bacalhau.org/p/getting-started-with-machine-learning</link><guid isPermaLink="false">https://blog.bacalhau.org/p/getting-started-with-machine-learning</guid><dc:creator><![CDATA[Chris Chinchilla]]></dc:creator><pubDate>Thu, 08 May 2025 15:39:31 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!5o3f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97ef2b9a-50f6-42f3-a608-687b10193c92_2626x1876.heic" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5o3f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97ef2b9a-50f6-42f3-a608-687b10193c92_2626x1876.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5o3f!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97ef2b9a-50f6-42f3-a608-687b10193c92_2626x1876.heic 424w, 
https://substackcdn.com/image/fetch/$s_!5o3f!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97ef2b9a-50f6-42f3-a608-687b10193c92_2626x1876.heic 848w, https://substackcdn.com/image/fetch/$s_!5o3f!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97ef2b9a-50f6-42f3-a608-687b10193c92_2626x1876.heic 1272w, https://substackcdn.com/image/fetch/$s_!5o3f!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97ef2b9a-50f6-42f3-a608-687b10193c92_2626x1876.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5o3f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97ef2b9a-50f6-42f3-a608-687b10193c92_2626x1876.heic" width="1456" height="1040" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/97ef2b9a-50f6-42f3-a608-687b10193c92_2626x1876.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1040,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:73725,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.bacalhau.org/i/163114717?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97ef2b9a-50f6-42f3-a608-687b10193c92_2626x1876.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!5o3f!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97ef2b9a-50f6-42f3-a608-687b10193c92_2626x1876.heic 424w, https://substackcdn.com/image/fetch/$s_!5o3f!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97ef2b9a-50f6-42f3-a608-687b10193c92_2626x1876.heic 848w, https://substackcdn.com/image/fetch/$s_!5o3f!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97ef2b9a-50f6-42f3-a608-687b10193c92_2626x1876.heic 1272w, https://substackcdn.com/image/fetch/$s_!5o3f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97ef2b9a-50f6-42f3-a608-687b10193c92_2626x1876.heic 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Machine Learning requires vast amounts of resources, and distributing these resources across multiple devices and regions helps with cost, speed, and data sovereignty. <a href="https://www.bacalhau.org/">Bacalhau</a> is an open-source distributed orchestration framework designed to bring compute resources to the data where and when you want.</p><p>Instead of moving large datasets around networks, Bacalhau makes it easy to execute jobs close to the data&#8217;s location, reducing latency and resource overhead.</p><h2><strong>How does Bacalhau work?</strong></h2><p>Bacalhau is a single self-contained binary that you can run on bare metal, in containers, or as WebAssembly. <a href="https://docs.bacalhau.org/getting-started/network-setup">Bacalhau can function as a client, an orchestrator, or a compute node</a>, or as all three at once. 
Bacalhau integrates with S3, local file systems, and other sources via HTTP endpoints, letting you pull data from wherever it lives.</p><p>You can install Bacalhau on any UNIX-like operating system with a one-line command:</p><pre><code>curl -sL https://get.bacalhau.org/install.sh | bash</code></pre><p>Or, for more flexibility, <a href="https://docs.bacalhau.org/getting-started/installation#tab-docker">you can also install with Docker</a>, depending on whether you want to run jobs in containers or not.</p><p>To start an <strong>orchestrator node</strong> that schedules and manages jobs, run:</p><pre><code>bacalhau serve --orchestrator</code></pre><p>To start a <strong>compute node</strong> that executes workloads, run:</p><pre><code>bacalhau serve --compute</code></pre><p>A node can take on both roles if you specify both flags.</p><p>Bacalhau&#8217;s architecture enables you to create compute networks that bridge traditional infrastructure boundaries. When you submit a job, Bacalhau determines which compute nodes are best positioned to process the data based on locality, availability, and defined constraints, without requiring manual data movement or constant connectivity.</p><h2><strong>Distributed machine learning with Bacalhau</strong></h2><p>This design allows for simple, flexible, and extensible execution, which is well-suited to distributing machine learning workloads.</p><p>For example, say you have a product recommendation model to train, and for regional and regulatory reasons, you want to train versions of it in the USA, Europe, and China.</p><p>To do this, use <a href="https://docs.bacalhau.org/references/guides/jobs/using-labels-and-constraints">labels</a>, which are key-value pairs that describe a node&#8217;s characteristics, capabilities, and properties. You can define these labels in a YAML file or as you start a node.</p><p>For example, to start a new orchestrator node that runs in the US, first create a config file:</p><pre><code># config.yaml
labels:
  region: us</code></pre><p>Then, pass it to the node:</p><pre><code>bacalhau serve --orchestrator --config config.yaml</code></pre><p>Or set the label inline as you start the node:</p><pre><code>bacalhau serve --orchestrator -c Labels="region=us"</code></pre><p><a href="https://docs.bacalhau.org/cli-api/specifications/engines">Bacalhau nodes run jobs either in Docker containers or as WASM payloads</a>. The rest of this post uses Docker.</p><p>To submit a job to a node that matches that label, use the <code>--constraints</code> argument:</p><pre><code>bacalhau docker run --constraints "region=us" data-processor</code></pre><p>Or, more conveniently, you can declare the job in a job definition file, such as <em>ml-job-us.yaml</em>:</p><pre><code>Type: batch
Count: 1
Constraints:
  - Key: region
    Operator: =
    Values:
    - us
Tasks:
  - Name: "data-processor"
    # rest of job definition</code></pre><p>You can use Bacalhau to submit jobs to multiple machines in each region and distribute them among multiple servers based on the anticipated load for each region, for example around Singles Day in China, Christmas in Europe, or Black Friday in the USA.</p><h2><strong>Retrieving processed data</strong></h2><p>Despite this global and regional distribution, you can aggregate the results and use federated learning or analysis across subsets of the data. For example, you can get an impression of trends at regional or global levels while still choosing whether to run the learning in a specific region.</p><p>For basic job retrieval based on a region, you first <a href="https://docs.bacalhau.org/getting-started/cli/listing-jobs">find the details of jobs that match a label</a>. For example:</p><pre><code>bacalhau job list --labels "region=us"</code></pre><p>Then, <a href="https://docs.bacalhau.org/getting-started/cli/downloading-results">download the results</a> of a particular job:</p><pre><code>bacalhau job get &lt;jobID&gt; --output-dir /destination/path</code></pre><p>However, doing this manually for every job isn&#8217;t particularly productive. Instead, you can expand the job definition file shown earlier to define what happens to the results when the job completes.</p><pre><code><code>Type: batch
Count: 1
Constraints:
  - Key: region
    Operator: =
    Values:
    - us
Tasks:
  - Name: "data-processor"
    # rest of job definition
    # &#8230;
    Publisher:
      Type: s3
      Params:
        Bucket: us-results
        Key: us-results-folder
    ResultPaths:
      - Name: us-results
        Path: /outputs</code></code></pre><p>This YAML configuration introduces a couple of other possibilities for job results. Instead of downloading the job results to a user&#8217;s computer, the job saves them to an <em>outputs</em> directory on the node where it runs, and Bacalhau then publishes those results to an S3 bucket. You can also fetch the results manually from the <em>outputs</em> directory with the <code>bacalhau job get</code> command.</p><h3><strong>Supplying data to process</strong></h3><p>Machine learning needs data to process, which, again, you often want to keep separated for practical or regulatory reasons.</p><p>With Bacalhau, you can <a href="https://docs.bacalhau.org/common-workflows/mounting-input-data">mount input data</a> from the local file system, an S3 bucket, IPFS, or an HTTP endpoint.</p><p>You can do this with the <code>--input</code> command-line argument:</p><pre><code>bacalhau docker run --input &lt;URI&gt;&lt;SOURCE&gt;:&lt;TARGET&gt; data-processor</code></pre><p>The argument consists of the URI of the storage location and the path where the data is mounted in the destination container.</p><p>Or, add the input details to the job definition file:</p><pre><code>Type: batch
Count: 1
Constraints:
  - Key: region
    Operator: =
    Values:
    - us
Tasks:
  - Name: "data-processor"
    # rest of job definition
    # &#8230;
    InputSources:
      - Alias: input
        Target: outputs
        Source:
          Type: /us-sales-data
          Params:
            key: value
    Publisher:
      Type: s3
      Params:
        Bucket: us-results
        Key: us-results-folder
    ResultPaths:
      - Name: us-results
        Path: /outputs</code></pre><h2><strong>Summary</strong></h2><p>This post covered getting started with machine learning using Bacalhau, including some of the basic concepts for processing data in and out of Bacalhau securely and privately. To find out more, we recommend <a href="https://docs.bacalhau.org/getting-started/installation">the more detailed installation guide</a>, <a href="https://docs.bacalhau.org/references/setting-up/running-node/quick-start-docker">onboarding nodes to your network</a>, and <a href="https://docs.bacalhau.org/getting-started/cli/submitting-jobs">using jobs</a>.</p><h2><strong>What's Next?</strong></h2><p>To start using Bacalhau, <a href="https://docs.bacalhau.org/getting-started/installation">install Bacalhau</a> and give it a shot.</p><p>If you don&#8217;t have a node network available and would still like to try Bacalhau, you can use <a href="https://cloud.expanso.io/login">Expanso Cloud</a>. <a href="https://docs.bacalhau.org/getting-started/network-setup">You can also set up a cluster on your own</a> (with setup guides for <a href="https://docs.bacalhau.org/references/operations/readme/setting-up-a-cluster-on-amazon-web-services-aws-with-terraform">AWS</a>, <a href="https://docs.bacalhau.org/references/operations/readme/setting-up-a-cluster-on-google-cloud-platform-gcp-with-terraform">GCP</a>, <a href="https://docs.bacalhau.org/references/operations/readme/setting-up-a-cluster-on-azure-with-terraform">Azure</a>, and more &#128578;).</p><div><hr></div><h3><strong>Get Involved!</strong></h3><p>We welcome your involvement in Bacalhau. There are many <a href="https://docs.bacalhau.org/community/ways-to-contribute/">ways to contribute</a>, and we&#8217;d love to hear from you. 
Reach out at any of the following locations:</p><ul><li><p><a href="https://www.expanso.io/">Expanso&#8217;s Website</a></p></li><li><p><a href="http://bacalhau.org/">Bacalhau&#8217;s Website</a></p></li><li><p><a href="https://bsky.app/profile/bacalhau.org">Bacalhau&#8217;s Bluesky</a></p></li><li><p><a href="http://twitter.com/bacalhauproject">Bacalhau&#8217;s Twitter</a></p></li><li><p><a href="https://twitter.com/ExpansoIO">Expanso&#8217;s Twitter</a></p></li><li><p><a href="https://www.tiktok.com/@expanso.io?_t=ZN-8uypYqUuKTW&amp;_r=1">TikTok</a></p></li><li><p><a href="https://www.youtube.com/@ExpansoIO">Youtube</a></p></li><li><p><a href="https://bit.ly/bacalhau-project-slack">Slack</a></p></li><li><p><a href="https://www.linkedin.com/company/expanso-io">LinkedIn</a></p></li><li><p><a href="https://expanso-inc.breezy.hr/">Careers Page</a></p></li></ul><h3><strong>Commercial Support</strong></h3><p>While Bacalhau is <a href="https://en.wikipedia.org/wiki/Open-source_software">open-source software</a>, the Bacalhau binaries go through the security, verification, and signing build process lovingly crafted by <a href="https://www.expanso.io/">Expanso</a>. <a href="https://www.expanso.io/faq/">Read more about the difference between open-source Bacalhau and commercially supported Bacalhau in the FAQ</a>. 
If you want to use the pre-built binaries and receive commercial support, <a href="https://www.expanso.io/contact/">contact us</a> or <a href="https://cloud.expanso.io/login">get your license</a> on Expanso Cloud!</p>]]></content:encoded></item><item><title><![CDATA[Stop Paying for Data Noise: Optimize Your Pipeline with Compute-Over-Data]]></title><description><![CDATA[Why a great part of your data should never hit the cloud.]]></description><link>https://blog.bacalhau.org/p/stop-paying-for-data-noise-optimize</link><guid isPermaLink="false">https://blog.bacalhau.org/p/stop-paying-for-data-noise-optimize</guid><dc:creator><![CDATA[Federico Trotta]]></dc:creator><pubDate>Thu, 01 May 2025 15:30:44 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/1ac2516e-55f3-4e2a-8785-61035c705430_2626x1876.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Are your cloud bills making you wince? Are you drowning in terabytes of logs? You're likely caught in a common, yet costly, data pipeline trap!</p><p>Many organizations default to a "ship everything" model. They collect data everywhere&#8211;servers, devices, applications&#8211;and funnel it all to a central location before analyzing it.</p><p>This seems logical, but it hides massive inefficiencies. 
Let&#8217;s discuss them.</p><h2>The Cascade of Costs</h2><p>This traditional approach triggers a cascade of expenses for every byte generated, especially the noisy, low-value ones:</p><ul><li><p><strong>Egress costs:</strong> Paying to move data out of its source environment.</p></li><li><p><strong>Ingestion costs:</strong> Fees charged by your central platform just to receive the data.</p></li><li><p><strong>Storage costs:</strong> Ongoing charges, often in expensive "hot" tiers for data that might never be accessed again.</p></li><li><p><strong>Query costs:</strong> Even analyzing the data costs compute time and resources.</p></li></ul><p><a href="https://www.futuremarketinsights.com/reports/dark-analytics-market">Industry analyses</a> suggest that a staggering amount of this centrally stored data&#8211;potentially around 20%&#8211;is <em>never</em> used after storage. Yet, organizations pay the full price for this journey.</p><p>Beyond cash, this model adds operational drag:</p><ul><li><p>High transfer times.</p></li><li><p>Engineering complexity.</p></li><li><p>Pipeline management overhead.</p></li><li><p>Increased security risks from moving sensitive data unnecessarily.</p></li></ul><h2>Why Does Raw Data Overwhelm Your Systems?</h2><p>Sure, hot storage in the cloud offers fast access, but it's pricey. Even cheaper object storage isn't efficient if the data sitting there is redundant or useless without preprocessing.</p><p>The catch? 
You often need significant compute power to preprocess or analyze data <em>before</em> deciding if it's worth keeping or shipping&#8211;compute power that might not be available where the data originates.</p><h2>The Real Reason We Ship Everything</h2><p>So why stick with this inefficient pattern? Primarily because most existing tools are built for it. Log shippers, message queues, observability platforms, and data warehouses assume data <em>must</em> be centralized first. Coupled with the lack of easy-to-deploy compute at the edge, teams feel forced to ship everything, accepting the cost and complexity.</p><h2>A Smarter Way: Compute-Over-Data</h2><p>There's a better approach: flip the model and <strong>bring the compute to the data</strong>. Instead of shipping raw, noisy data first, process it <em>at the source</em>.</p><p>Imagine running logic directly where data is born:</p><ul><li><p>Filter verbose logs down to just critical errors <em>before</em> they leave the server.</p></li><li><p>Aggregate raw metrics on an IoT gateway <em>before</em> sending summaries.</p></li><li><p>Enrich events with local context instantly.</p></li><li><p>Compress data intelligently based on its type.</p></li><li><p>Decide <em>what</em> data is valuable enough to move <em>before</em> incurring network and storage costs.</p></li></ul><p>The Compute-Over-Data approach leads to tangible benefits:</p><ul><li><p><strong>Massive cost reduction:</strong> Stop paying egress, ingestion, and storage fees for useless data.</p></li><li><p><strong>Improved signal visibility:</strong> Filter out the noise early to spot important events faster.</p></li><li><p><strong>Enhanced security &amp; compliance:</strong> Keep sensitive data local; only move aggregated or necessary subsets.</p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.bacalhau.org/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" 
data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.bacalhau.org/subscribe?"><span>Subscribe now</span></a></p><h2>How To Do So? Meet Bacalhau!</h2><p>This isn't just theory. <a href="https://www.bacalhau.org/">Bacalhau</a> is an open-source framework designed specifically for Compute-Over-Data. It acts as an orchestration layer, letting you run compute jobs (packaged as Docker containers or WASM modules) directly where your data resides&#8211;be it data center servers, edge devices, or even workstations with GPUs.</p><p>Instead of pulling data <em>to</em> compute, Bacalhau sends compute <em>to</em> data. It helps you:</p><ul><li><p><strong>Process data at the edge:</strong> Execute filtering, aggregation, or analysis before data hits expensive network hops or ingestion endpoints.</p></li><li><p><strong>Slash data volumes:</strong> Send only the valuable results, drastically cutting costs for downstream systems.</p></li><li><p><strong>Handle diverse workloads:</strong> Supports various job types (batch, long-running services, ops, daemon jobs).</p></li><li><p><strong>Operate reliably:</strong> Designed for distributed environments, handling intermittent connectivity gracefully (crucial for edge).</p></li></ul><h2>The Bottom Line</h2><p>The "ship everything" era is proving unsustainable. Compute-Over-Data, powered by tools like Bacalhau, offers a more intelligent, secure, and cost-effective future for handling distributed data. Stop paying for noise and start optimizing your pipelines for value.</p><div><hr></div><h2><strong>What's Next?</strong></h2><p>To start using Bacalhau, <a href="https://docs.bacalhau.org/getting-started/installation">install Bacalhau</a> and give it a shot.</p><p>If you don&#8217;t have a node network available and would still like to try Bacalhau, you can use <a href="https://cloud.expanso.io/login">Expanso Cloud</a>. 
<a href="https://docs.bacalhau.org/getting-started/network-setup">You can also set up a cluster on your own</a> (with setup guides for <a href="https://docs.bacalhau.org/references/operations/readme/setting-up-a-cluster-on-amazon-web-services-aws-with-terraform">AWS</a>, <a href="https://docs.bacalhau.org/references/operations/readme/setting-up-a-cluster-on-google-cloud-platform-gcp-with-terraform">GCP</a>, <a href="https://docs.bacalhau.org/references/operations/readme/setting-up-a-cluster-on-azure-with-terraform">Azure</a>, and more &#128578;).</p><h3><strong>Get Involved!</strong></h3><p>We welcome your involvement in Bacalhau. There are many <a href="https://docs.bacalhau.org/community/ways-to-contribute/">ways to contribute</a>, and we&#8217;d love to hear from you. Reach out at any of the following locations:</p><ul><li><p><a href="https://www.expanso.io/">Expanso&#8217;s Website</a></p></li><li><p><a href="http://bacalhau.org/">Bacalhau&#8217;s Website</a></p></li><li><p><a href="https://bsky.app/profile/bacalhau.org">Bacalhau&#8217;s Bluesky</a></p></li><li><p><a href="http://twitter.com/bacalhauproject">Bacalhau&#8217;s Twitter</a></p></li><li><p><a href="https://twitter.com/ExpansoIO">Expanso&#8217;s Twitter</a></p></li><li><p><a href="https://www.tiktok.com/@expanso.io?_t=ZN-8uypYqUuKTW&amp;_r=1">TikTok</a></p></li><li><p><a href="https://www.youtube.com/@ExpansoIO">Youtube</a></p></li><li><p><a href="https://bit.ly/bacalhau-project-slack">Slack</a></p></li><li><p><a href="https://www.linkedin.com/company/expanso-io">LinkedIn</a></p></li><li><p><a href="https://expanso-inc.breezy.hr/">Careers Page</a></p></li></ul><h3><strong>Commercial Support</strong></h3><p>While Bacalhau is <a href="https://en.wikipedia.org/wiki/Open-source_software">open-source software</a>, the Bacalhau binaries go through the security, verification, and signing build process lovingly crafted by <a href="https://www.expanso.io/">Expanso</a>. 
<a href="https://www.expanso.io/faq/">Read more about the difference between open-source Bacalhau and commercially supported Bacalhau in the FAQ</a>. If you want to use the pre-built binaries and receive commercial support, <a href="https://www.expanso.io/contact/">contact us</a> or <a href="https://cloud.expanso.io/login">get your license</a> on Expanso Cloud!</p>]]></content:encoded></item><item><title><![CDATA[High-Scale Data Processing: Over Thousands of Devices With Azure Cosmos DB and Expanso]]></title><description><![CDATA[If you're dealing with data generated across thousands of devices or locations, you know the pain.]]></description><link>https://blog.bacalhau.org/p/high-scale-data-processing-cosmosdb-bacalhau</link><guid isPermaLink="false">https://blog.bacalhau.org/p/high-scale-data-processing-cosmosdb-bacalhau</guid><dc:creator><![CDATA[Federico Trotta]]></dc:creator><pubDate>Tue, 29 Apr 2025 16:00:51 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/798883a9-61a3-4c7c-892b-91778567c2fa_3283x2345.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you're dealing with data generated across thousands of devices or locations, you know the pain. Pipelining every raw byte back to a central data center for processing is slow, expensive, and fraught with regulatory hurdles.</p><p>Powerful databases exist to store this distributed data, but how do you process it before it gets there?</p><p>At Expanso, we recently explored this critical challenge in an Azure Cosmos DB TV episode featuring Mark Brown from Microsoft and our CEO, David Aronchick. The episode dives deep into a modern approach: <strong>Compute Over Data</strong>. 
Instead of moving mountains of raw data, why not process it right where it's created?</p><p>Let&#8217;s break down the main points!</p><h2>The Distributed Data Dilemma</h2><p>The world is generating data at an exponential rate, much of it unstructured and originating outside traditional data centers. Centralizing everything runs into several roadblocks:</p><ul><li><p><strong>Network bottlenecks:</strong> WANs aren't keeping up, and the speed of light imposes hard latency limits. Moving gigabytes or terabytes takes time and costs money.</p></li><li><p><strong>Data quality and context loss:</strong> Raw data from edge devices is often poorly structured. Moving it immediately means losing context like location, local timestamps, device specifics, and more.</p></li><li><p><strong>Regulatory compliance:</strong> GDPR, CCPA, and industry-specific rules restrict moving raw data, especially PII or sensitive operational details, across borders.</p></li><li><p><strong>Delayed insights:</strong> Waiting for data to traverse networks and complex central ETL pipelines before analysis means delays in taking action, sometimes measured in minutes, hours, or even weeks.</p></li></ul><h2>Bacalhau: Bringing Compute to the Data</h2><p>This is the gap Expanso, powered by the open-source Bacalhau project, is built to fill. 
Instead of moving data to compute, Bacalhau runs your processing jobs directly where the data resides.</p><p>This allows you to perform pre-processing steps locally like:</p><ol><li><p><strong>Schematization:</strong> Transform raw data streams into well-structured formats suitable for use in Cosmos DB.</p></li><li><p><strong>Enrichment:</strong> Add metadata right at the source, preserving context.</p></li><li><p><strong>Sanitization:</strong> Filter or modify sensitive information <em>before</em> it leaves the local environment, aiding compliance.</p></li><li><p><strong>Aggregation:</strong> Reduce data volume by calculating summaries over time windows locally, sending only the essential information. This cuts network traffic and central processing costs.</p></li></ol><h2>The Synergy: Smarter Processing, Global Storage</h2><p>The combination is powerful. Bacalhau acts as the edge/distributed processing layer, preparing and refining data. Optimized data, then, flows into the nearest Cosmos DB regional replica.</p><p>Key benefits of this approach include:</p><ul><li><p><strong>Cost savings:</strong> Reduced data transfer, lower central compute/storage needs.</p></li><li><p><strong>Faster insights:</strong> Analyze data quicker by processing it closer to the source and landing analysis-ready data in Cosmos DB.</p></li><li><p><strong>Enhanced security and compliance:</strong> Minimize raw data movement and sanitize sensitive information locally.</p></li><li><p><strong>Increased resilience:</strong> Better handling of intermittent network connectivity at the edge.</p></li><li><p><strong>Simplified operations:</strong> Declaratively manage distributed jobs without building complex custom orchestration.</p></li></ul><h2>The Takeaway</h2><p>If you're building applications that span multiple locations, deal with IoT/edge devices, or face data gravity challenges, the traditional "move-then-process" model is holding you back. 
The "Compute Over Data" approach, enabled by Bacalhau working in concert with globally distributed databases like Azure Cosmos DB, offers a more efficient, cost-effective, and compliant path forward.</p><p><strong>Want the full story and see the live demo?</strong> <strong>Watch the complete Azure Cosmos DB TV episode now! &#128071;</strong></p><div id="youtube2-F2VMVjSxIHY" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;F2VMVjSxIHY&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/F2VMVjSxIHY?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.bacalhau.org/p/high-scale-data-processing-cosmosdb-bacalhau?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.bacalhau.org/p/high-scale-data-processing-cosmosdb-bacalhau?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p></p><h2><strong>What's Next?</strong></h2><p>To start using Bacalhau, <a href="https://docs.bacalhau.org/getting-started/installation">install Bacalhau</a> and give it a shot.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.bacalhau.org/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.bacalhau.org/subscribe?"><span>Subscribe now</span></a></p><p>If you don&#8217;t have a node network available and would still like to try 
Bacalhau, you can use <a href="https://cloud.expanso.io/login">Expanso Cloud</a>. <a href="https://docs.bacalhau.org/getting-started/network-setup">You can also set up a cluster on your own</a> (with setup guides for <a href="https://docs.bacalhau.org/references/operations/readme/setting-up-a-cluster-on-amazon-web-services-aws-with-terraform">AWS</a>, <a href="https://docs.bacalhau.org/references/operations/readme/setting-up-a-cluster-on-google-cloud-platform-gcp-with-terraform">GCP</a>, <a href="https://docs.bacalhau.org/references/operations/readme/setting-up-a-cluster-on-azure-with-terraform">Azure</a>, and more &#128578;).</p><h3><strong>Get Involved!</strong></h3><p>We welcome your involvement in Bacalhau. There are many <a href="https://docs.bacalhau.org/community/ways-to-contribute/">ways to contribute</a>, and we&#8217;d love to hear from you. Reach out at any of the following locations:</p><ul><li><p><a href="https://www.expanso.io/">Expanso&#8217;s Website</a></p></li><li><p><a href="http://bacalhau.org/">Bacalhau&#8217;s Website</a></p></li><li><p><a href="https://bsky.app/profile/bacalhau.org">Bacalhau&#8217;s Bluesky</a></p></li><li><p><a href="http://twitter.com/bacalhauproject">Bacalhau&#8217;s Twitter</a></p></li><li><p><a href="https://twitter.com/ExpansoIO">Expanso&#8217;s Twitter</a></p></li><li><p><a href="https://www.tiktok.com/@expanso.io?_t=ZN-8uypYqUuKTW&amp;_r=1">TikTok</a></p></li><li><p><a href="https://www.youtube.com/@ExpansoIO">Youtube</a></p></li><li><p><a href="https://bit.ly/bacalhau-project-slack">Slack</a></p></li><li><p><a href="https://www.linkedin.com/company/expanso-io">LinkedIn</a></p></li><li><p><a href="https://expanso-inc.breezy.hr/">Careers Page</a></p></li></ul><h3><strong>Commercial Support</strong></h3><p>While Bacalhau is <a href="https://en.wikipedia.org/wiki/Open-source_software">open-source software</a>, the Bacalhau binaries go through the security, verification, and signing build process lovingly crafted by <a 
href="https://www.expanso.io/">Expanso</a>. <a href="https://www.expanso.io/faq/">Read more about the difference between open-source Bacalhau and commercially supported Bacalhau in the FAQ</a>. If you want to use the pre-built binaries and receive commercial support, <a href="https://www.expanso.io/contact/">contact us</a> or <a href="https://cloud.expanso.io/login">get your license</a> on Expanso Cloud!</p>]]></content:encoded></item><item><title><![CDATA[Cloud orchestration cost optimization]]></title><description><![CDATA[The move to the cloud promised to save money, but that's not what happened. Changing how you think about cloud computing architecture might.]]></description><link>https://blog.bacalhau.org/p/cloud-orchestration-cost-optimization</link><guid isPermaLink="false">https://blog.bacalhau.org/p/cloud-orchestration-cost-optimization</guid><dc:creator><![CDATA[Chris Chinchilla]]></dc:creator><pubDate>Fri, 25 Apr 2025 15:30:31 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Ofvn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7f29f87-4dca-4ee0-a7a1-f674fbbcd143_2626x1876.heic" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ofvn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7f29f87-4dca-4ee0-a7a1-f674fbbcd143_2626x1876.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ofvn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7f29f87-4dca-4ee0-a7a1-f674fbbcd143_2626x1876.heic 424w, 
https://substackcdn.com/image/fetch/$s_!Ofvn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7f29f87-4dca-4ee0-a7a1-f674fbbcd143_2626x1876.heic 848w, https://substackcdn.com/image/fetch/$s_!Ofvn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7f29f87-4dca-4ee0-a7a1-f674fbbcd143_2626x1876.heic 1272w, https://substackcdn.com/image/fetch/$s_!Ofvn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7f29f87-4dca-4ee0-a7a1-f674fbbcd143_2626x1876.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ofvn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7f29f87-4dca-4ee0-a7a1-f674fbbcd143_2626x1876.heic" width="1456" height="1040" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b7f29f87-4dca-4ee0-a7a1-f674fbbcd143_2626x1876.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1040,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:156615,&quot;alt&quot;:&quot;An image with the text Cloud orchestration cost optimization&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.bacalhau.org/i/161955544?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7f29f87-4dca-4ee0-a7a1-f674fbbcd143_2626x1876.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="An image with the text Cloud orchestration cost optimization" title="An image with the text Cloud orchestration cost optimization" 
srcset="https://substackcdn.com/image/fetch/$s_!Ofvn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7f29f87-4dca-4ee0-a7a1-f674fbbcd143_2626x1876.heic 424w, https://substackcdn.com/image/fetch/$s_!Ofvn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7f29f87-4dca-4ee0-a7a1-f674fbbcd143_2626x1876.heic 848w, https://substackcdn.com/image/fetch/$s_!Ofvn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7f29f87-4dca-4ee0-a7a1-f674fbbcd143_2626x1876.heic 1272w, https://substackcdn.com/image/fetch/$s_!Ofvn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7f29f87-4dca-4ee0-a7a1-f674fbbcd143_2626x1876.heic 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The move to the cloud promised to save users money and give them insights into their usage and costs.</p><p>However, the opposite happened. <a href="https://aag-it.com/the-latest-cloud-computing-statistics/">A 2025 report from AAG</a> states that around 82% of respondents found cloud spending challenging. <a href="https://www.cloudzero.com/blog/cloud-computing-statistics/">A CloudZero report from 2024</a> states that more than 20% of respondents had no clear idea of their cloud costs, with reports for large users sometimes consisting of thousands of rows of hard-to-read usage data.</p><p>This is compounded by many companies and teams using more than one cloud provider for a hybrid cloud strategy to provide redundancy in case of outages or issues. 
Instead of taking advantage of the flexibility of cloud-native computing, many engineers still build as if they are using fixed on-premises servers, over-provisioning instances far beyond the capacity they need. All these factors mean that users overspend on idle and duplicated services and incur ingress and egress costs between providers.</p><p>Financial costs aside, complex cloud orchestration also increases energy use and latency as multiple services pass bits and bytes back and forth across continents.</p><p>Most crucially, cloud applications write and access vast quantities of data, which is rarely stored in the same place where it is accessed and processed, causing yet more latency and cost.</p><p>It&#8217;s hard to get accurate cost comparisons between cloud providers, <a href="https://cast.ai/blog/cloud-pricing-comparison/">but a rough comparison</a> is the following for on-demand rates:</p><ul><li><p>AWS t4g.xlarge, $0.1344 per hour</p></li><li><p>Azure B4ms, $0.1660 per hour</p></li><li><p>Google Cloud Platform e2-standard-4, $0.1509 per hour</p></li></ul><p>For spot instances, using comparable services on Azure and GCP:</p><ul><li><p>AWS t4g.xlarge, $0.044 per hour, a 67% saving</p></li><li><p>Azure A4 v2, $0.0348 per hour, an 85% saving</p></li><li><p>Google Cloud Platform e2-standard-4, $0.0602 per hour, a 60% saving</p></li></ul><h2>Common Cloud Cost Optimization Techniques</h2><p>To combat rising expenses, organizations commonly try techniques including:</p><ul><li><p><strong>Rightsizing resources </strong>by continuously monitoring resource utilization and performance metrics to ensure you use appropriately sized instances and services. Organizations can eliminate the waste associated with over-provisioning by matching infrastructure to workload demands and ensuring they pay only for the capacity they need.</p></li><li><p><strong>Leveraging pricing models and purchase options. 
</strong>Cloud providers offer pricing models beyond standard on-demand rates. For example, <a href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-reserved-instances.html">AWS offers Reserved Instances (RIs)</a> and <a href="https://aws.amazon.com/savingsplans/">Savings Plans</a> that give discounts in exchange for committing to a certain usage level. Spot Instances offer access to spare cloud capacity at reduced rates for fault-tolerant workloads that can handle interruptions.</p></li><li><p><strong>Implementing auto-scaling</strong> to automatically adjust the number of compute resources allocated to an application based on demand. Automating shutdown schedules for non-production environments during off-hours also prevents paying for unutilized resources.</p></li></ul><h2>Limitations of Traditional Optimization</h2><p>While conventional optimization techniques are valuable, they often have limitations, especially in data-intensive environments, such as:</p><ul><li><p><strong>Increased operational complexity. </strong>Managing rightsizing, reservations, and spot instances is complex and time-consuming in multi-cloud or hybrid cloud setups.&nbsp;</p></li><li><p><strong>Challenges with dynamic workloads. </strong>Accurately forecasting usage for Reserved Instances or Savings Plans is difficult for applications with variable or unpredictable demand patterns. This can lead to over-committing and not solving the cost issue.</p></li><li><p><strong>Inability to address data transfer costs. 
</strong>Most methods focus on optimizing compute and storage resource costs but fail to tackle the fundamental issue driving significant expense: the cost and performance impact of moving large volumes of data between storage locations and compute services, often across different regions or cloud providers.</p></li></ul><h2>The Solution: Bringing distributed compute and data together</h2><p>Cloud cost optimization is challenging for businesses seeking to leverage cloud benefits without breaking the budget. However, the right solution to cloud cost optimization may be shifting your thinking about computing.</p><p><a href="https://github.com/bacalhau-project/bacalhau">Bacalhau is an open-source solution</a> that enables users to run compute and processing jobs where they generate and store data. Instead of running a computation in one location that requests data from a second, processes it, and sends the results to a third, Bacalhau lets you run the whole process in one place.</p><p>With <a href="https://docs.bacalhau.org/cli-api/specifications/engines/wasm">WASM</a> and <a href="https://docs.bacalhau.org/cli-api/specifications/engines/docker">Docker</a> support, you can run jobs written in different programming languages. Bacalhau also supports GPUs and edge devices, and you can keep using the cloud services you already rely on for compute or storage.</p><p>This brings the crucial flexibility you need, such as location, security, and device support, without the expense.</p><p>Because jobs run on the infrastructure you already use, fewer compute resources sit idle waiting for work.</p><h3>One-line install with many possibilities</h3><p><a href="https://docs.bacalhau.org/getting-started/installation">Bacalhau is a one-line install</a> that is configurable with a YAML file to determine what to run, where, and how. For smaller jobs, you can run everything on one Bacalhau instance. 
For more complex jobs, you can create a distributed network of orchestrator and compute nodes for distributing or processing job submissions. This is also useful if larger data sets are sharded across different locations. Each Bacalhau compute node can run over the data it can access and coordinate with the orchestrator to give the overall status and results.</p><p>This approach is also useful for global datasets that require processing differently for regulatory or security reasons. Bacalhau can process data in the same place as the data, returning aggregated and anonymized results and reducing concerns about security and privacy.</p><h2>Reduce costs, not flexibility</h2><p>The cloud promised reduced costs and complexity, but as countless reports and solutions for managing spiraling costs and complexity show, this hasn&#8217;t been successful.</p><p>For data processing, Bacalhau offers a simple and flexible solution to reduce cloud orchestration costs. <a href="https://docs.bacalhau.org/getting-started/quick-start">Try the open-source version today</a>, and if you want to know more about the hosted version, <a href="https://www.expanso.io/expanso-cloud">speak to the team to find out more</a>.</p><h2><strong>What's Next?</strong></h2><p>To start using Bacalhau,<a href="https://docs.bacalhau.org/getting-started/installation"> install Bacalhau</a> and give it a shot.</p><p>If you don&#8217;t have a node network available and would still like to try Bacalhau, you can use<a href="https://cloud.expanso.io/login"> Expanso Cloud</a>. 
<a href="https://docs.bacalhau.org/getting-started/network-setup">You can also set up your own cluster</a> (with setup guides for <a href="https://docs.bacalhau.org/references/operations/readme/setting-up-a-cluster-on-amazon-web-services-aws-with-terraform">AWS</a>, <a href="https://docs.bacalhau.org/references/operations/readme/setting-up-a-cluster-on-google-cloud-platform-gcp-with-terraform">GCP</a>, <a href="https://docs.bacalhau.org/references/operations/readme/setting-up-a-cluster-on-azure-with-terraform">Azure</a>, and more &#128578;).</p><p><strong>Get Involved!</strong></p><p>We welcome your involvement in Bacalhau. There are many<a href="https://docs.bacalhau.org/community/ways-to-contribute/"> ways to contribute</a>, and we&#8217;d love to hear from you. Reach out at any of the following locations:</p><ul><li><p><a href="https://www.expanso.io/">Expanso&#8217;s Website</a></p></li><li><p><a href="http://bacalhau.org/">Bacalhau&#8217;s Website</a></p></li><li><p><a href="https://bsky.app/profile/bacalhau.org">Bacalhau&#8217;s Bluesky</a></p></li><li><p><a href="http://twitter.com/bacalhauproject">Bacalhau&#8217;s Twitter</a></p></li><li><p><a href="https://twitter.com/ExpansoIO">Expanso&#8217;s Twitter</a></p></li><li><p><a href="https://www.tiktok.com/@expanso.io?_t=ZN-8uypYqUuKTW&amp;_r=1">TikTok</a></p></li><li><p><a href="https://www.youtube.com/@ExpansoIO">Youtube</a></p></li><li><p><a href="https://bit.ly/bacalhau-project-slack">Slack</a></p></li><li><p><a href="https://www.linkedin.com/company/expanso-io">LinkedIn</a></p></li><li><p><a href="https://expanso-inc.breezy.hr/">Careers Page</a></p></li></ul><p><strong>Commercial Support</strong></p><p>While Bacalhau is<a href="https://en.wikipedia.org/wiki/Open-source_software"> open-source software</a>, the Bacalhau binaries go through the security, verification, and signing build process lovingly crafted by<a href="https://www.expanso.io/"> Expanso</a>. 
<a href="https://www.expanso.io/faq/">Read more about the difference between open-source Bacalhau and commercially supported Bacalhau in the FAQ</a>. If you want to use the pre-built binaries and receive commercial support, <a href="https://www.expanso.io/contact/">contact us</a> or<a href="https://cloud.expanso.io/login"> get your license</a> on Expanso Cloud!</p><p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.bacalhau.org/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Bacalhau! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[K8s vs Nomad vs Bacalhau: Choosing Your Compute Orchestrator Wisely]]></title><description><![CDATA[Why Bacalhau can be the missing piece for your infrastructure.]]></description><link>https://blog.bacalhau.org/p/k8s-vs-nomad-vs-bacalhau-choosing</link><guid isPermaLink="false">https://blog.bacalhau.org/p/k8s-vs-nomad-vs-bacalhau-choosing</guid><dc:creator><![CDATA[Federico Trotta]]></dc:creator><pubDate>Fri, 18 Apr 2025 07:33:05 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/92b389f3-462f-4e61-aed9-72218265695b_2954x2111.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Are you wrestling with massive datasets scattered everywhere? Well, then you know that moving petabytes of data to a central cluster is slow, costly, and risky. 
While Kubernetes and Nomad are powerful orchestrators, they assume you'll bring data to the compute.</p><p>But what if that's your bottleneck? Bacalhau offers a different path: <strong>bring compute to the data.</strong></p><h2><strong>Comparing Kubernetes, Nomad, and Bacalhau: Key Differences</strong></h2><p>Let's break down the key differences between these three.</p><h3><strong>What They Run</strong></h3><ul><li><p><strong>Kubernetes:</strong> The king of containers (Docker and more), with a huge ecosystem around them. WASM support is growing, but containers are its core strength.</p></li><li><p><strong>Nomad:</strong> More flexible than K8s. Natively handles containerized and non-containerized apps. Great for mixed legacy and modern workloads.</p></li><li><p><strong>Bacalhau:</strong> Natively runs Docker &amp; WASM. Its pluggable architecture allows custom binaries too. Built to run compute <em>where data lives</em>, including leveraging specific hardware like GPUs.</p></li></ul><h3><strong>Architecture &amp; Footprint</strong></h3><ul><li><p><strong>Kubernetes:</strong> Powerful but complex control plane. Best suited for relatively homogeneous clusters, usually in one data center/cloud.</p></li><li><p><strong>Nomad:</strong> Lightweight (single binary). Simpler setup than K8s, good for managing diverse architectures within a cluster.</p></li><li><p><strong>Bacalhau:</strong> Designed for distributed environments. Lightweight agent runs anywhere, focusing on executing jobs across varied nodes, not managing the infrastructure itself.</p></li></ul><h3><strong>How They Handle Jobs</strong></h3><ul><li><p><strong>Kubernetes:</strong> Manages long-running services and batch jobs (<code>Jobs</code>/<code>CronJobs</code>). Often relies on frameworks like Spark or Flink for complex distributed data processing.</p></li><li><p><strong>Nomad:</strong> Supports service, batch, and system jobs. 
Handles various task types efficiently.</p></li><li><p><strong>Bacalhau:</strong> Optimized for "embarrassingly parallel" tasks common in data processing. Offers <strong>batch, service, ops, and daemon</strong> job types, giving flexibility for different distributed compute needs.</p></li></ul><h3><strong>Dealing With Queues &amp; Disconnection</strong></h3><ul><li><p><strong>Kubernetes:</strong> Supports scheduling and queuing. Not natively designed for environments with frequent disconnections (like the edge).</p></li><li><p><strong>Nomad:</strong> Handles job queues well. Clients can tolerate temporary network disconnects.</p></li><li><p><strong>Bacalhau:</strong> Built for disconnected or intermittently connected environments (like the edge). Features explicit job queuing and can wait for nodes to reconnect.</p></li></ul><h3><strong>Security Approach</strong></h3><ul><li><p><strong>Kubernetes &amp; Nomad:</strong> Robust security features (RBAC, ACLs). However, the centralized model (moving data or needing broad access) increases risks like metadata leakage.</p></li><li><p><strong>Bacalhau:</strong> Security advantage via <strong>"Compute Over Data"</strong>. Minimizing data movement inherently reduces exposure. Less centralized metadata lowers breach impact.</p></li></ul><h2><strong>When to Choose Which</strong></h2><p>So, how do you choose between them for your application? 
Let&#8217;s break down the big ideas:</p><ul><li><p><strong>Kubernetes:</strong> Best for complex, containerized microservices needing rich networking and auto-scaling, primarily in a central location.</p></li><li><p><strong>Nomad:</strong> Ideal for simpler orchestration of mixed workloads in one place, or when K8s feels too heavy.</p></li><li><p><strong>Bacalhau:</strong> Your go-to if:</p><ul><li><p>Processing <strong>large, distributed datasets</strong> is the main goal.</p></li><li><p><strong>Data gravity</strong> (cost/time of moving data) is high.</p></li><li><p><strong>Data sovereignty/privacy</strong> rules prevent moving data.</p></li><li><p>You need robust <strong>edge computing</strong> capabilities.</p></li></ul></li></ul><h2><strong>The Bottom Line</strong></h2><p>No solution suits all needs. You need to choose the orchestrator that fits your particular case.</p><p>K8s and Nomad excel at managing applications where compute and data are centralized. Bacalhau, on the other hand, shines when your data is spread out, tackling the challenges of data gravity and distribution head-on by bringing the compute to where your data already lives.</p><p>Want a deeper guide to choosing the right orchestrator? Read the <a href="https://www.expanso.io/kubernetes-vs-nomad-vs-bacalhau">complete article</a> on our website!</p><h2><strong>What's Next?</strong></h2><p>To start using Bacalhau, <a href="https://docs.bacalhau.org/getting-started/installation">install Bacalhau</a> and give it a shot.</p><p>However, if you don&#8217;t have a network and you would still like to try it out, we recommend using <a href="https://cloud.expanso.io/login">Expanso Cloud</a>. 
Also, if you would like to set up a cluster on your own, <a href="https://docs.bacalhau.org/getting-started/network-setup">you can do that too</a> (we have setup guides for AWS, GCP, Azure, and many more &#128578;).</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.bacalhau.org/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.bacalhau.org/subscribe?"><span>Subscribe now</span></a></p><p></p><p><strong>Get Involved!</strong></p><p>We welcome your involvement in Bacalhau. There are many <a href="https://docs.bacalhau.org/community/ways-to-contribute/">ways to contribute</a>, and we&#8217;d love to hear from you. Please reach out to us at any of the following locations:</p><ul><li><p><a href="https://www.expanso.io/">Expanso&#8217;s Website</a></p></li><li><p><a href="http://bacalhau.org/">Bacalhau&#8217;s Website</a></p></li><li><p><a href="https://bsky.app/profile/bacalhau.org">Bacalhau&#8217;s Bluesky</a></p></li><li><p><a href="http://twitter.com/bacalhauproject">Bacalhau&#8217;s Twitter</a></p></li><li><p><a href="https://twitter.com/ExpansoIO">Expanso&#8217;s Twitter</a></p></li><li><p><a href="https://www.tiktok.com/@expanso.io?_t=ZN-8uypYqUuKTW&amp;_r=1">TikTok</a></p></li><li><p><a href="https://www.youtube.com/@ExpansoIO">Youtube</a></p></li><li><p><a href="https://bit.ly/bacalhau-project-slack">Slack</a></p></li><li><p><a href="https://www.linkedin.com/company/expanso-io">LinkedIn</a></p></li><li><p><a href="https://expanso-inc.breezy.hr/">Careers Page</a></p></li></ul><p><strong>Commercial Support</strong></p><p>While Bacalhau is <a href="https://en.wikipedia.org/wiki/Open-source_software">open-source software</a>, the Bacalhau binaries go through the security, verification, and signing build process lovingly crafted by <a href="https://www.expanso.io/">Expanso</a>. 
You can read more about the difference between open-source Bacalhau and commercially supported Bacalhau <a href="https://www.expanso.io/faq/">in our FAQ</a>. If you would like to use our pre-built binaries and receive commercial support, please <a href="https://www.expanso.io/contact/">contact us</a> or <a href="https://cloud.expanso.io/login">get your license</a> on Expanso Cloud!</p>]]></content:encoded></item><item><title><![CDATA[Expanso Wins 2025 Data Breakthrough Award ]]></title><description><![CDATA[for Open Source Data Platform of the Year]]></description><link>https://blog.bacalhau.org/p/expanso-wins-2025-data-breakthrough</link><guid isPermaLink="false">https://blog.bacalhau.org/p/expanso-wins-2025-data-breakthrough</guid><dc:creator><![CDATA[Laura Hohmann]]></dc:creator><pubDate>Thu, 03 Apr 2025 16:45:47 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/47ccf457-012c-4753-b3d3-25a68d717ecc_2626x1876.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>Bacalhau Named Top Open Source Data Platform for the Second Year in a Row</strong></p><p>For the second consecutive year, Expanso has been recognized in the Data Breakthrough Awards, this time winning <strong>&#8220;Open Source Data Platform of the Year&#8221;</strong> for Bacalhau. 
Last year, Bacalhau took home the award for <strong>&#8220;Data Processing Solution of the Year&#8221;</strong>&#8212;and now, its role in redefining how enterprises and developers orchestrate workloads at scale has earned it another top spot.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!L74M!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c686a02-53a5-4038-a834-5adcbdf45200_1388x1272.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!L74M!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c686a02-53a5-4038-a834-5adcbdf45200_1388x1272.png 424w, https://substackcdn.com/image/fetch/$s_!L74M!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c686a02-53a5-4038-a834-5adcbdf45200_1388x1272.png 848w, https://substackcdn.com/image/fetch/$s_!L74M!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c686a02-53a5-4038-a834-5adcbdf45200_1388x1272.png 1272w, https://substackcdn.com/image/fetch/$s_!L74M!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c686a02-53a5-4038-a834-5adcbdf45200_1388x1272.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!L74M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c686a02-53a5-4038-a834-5adcbdf45200_1388x1272.png" width="380" height="348.24207492795387" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0c686a02-53a5-4038-a834-5adcbdf45200_1388x1272.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1272,&quot;width&quot;:1388,&quot;resizeWidth&quot;:380,&quot;bytes&quot;:180732,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.bacalhau.org/i/160420253?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c686a02-53a5-4038-a834-5adcbdf45200_1388x1272.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!L74M!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c686a02-53a5-4038-a834-5adcbdf45200_1388x1272.png 424w, https://substackcdn.com/image/fetch/$s_!L74M!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c686a02-53a5-4038-a834-5adcbdf45200_1388x1272.png 848w, https://substackcdn.com/image/fetch/$s_!L74M!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c686a02-53a5-4038-a834-5adcbdf45200_1388x1272.png 1272w, https://substackcdn.com/image/fetch/$s_!L74M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c686a02-53a5-4038-a834-5adcbdf45200_1388x1272.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p> The <a href="https://databreakthroughawards.com/2024-winners/">Data Breakthrough Awards</a> saw over 3,000 nominations globally, with winners including industry leaders like <a href="https://www.cloudera.com/">Cloudera</a>, <a href="https://www.alteryx.com/">Alteryx</a>, <a href="https://www.fivetran.com/">Fivetran</a>, <a href="https://www.redhat.com/en">Red Hat</a>, and <a href="https://www.mbusa.com/en/home">Mercedes-Benz</a>. This recognition underscores Bacalhau&#8217;s growing role in modern distributed computing.</p><p></p><blockquote><p>"Expanso&#8217;s Bacalhau is a true breakthrough in the world of open-source data platforms, redefining how organizations process and manage data at scale,&#8221; said Steve Johansson, Managing Director, Data Breakthrough. 
&#8220;By bringing computation directly to the data&#8212;whether in the cloud, at the edge, or on-premises&#8212;Bacalhau is driving unprecedented efficiency, security, and real-time insights. We are thrilled to recognize Expanso with our &#8216;Open Source Data Platform of the Year&#8217; award in the 2025 Data Breakthrough Awards, honoring their innovation and leadership in the evolving data landscape."</p></blockquote><p></p><p><strong>Why It Matters</strong></p><p>Data-intensive organizations face a fundamental challenge: running workloads across clouds, regions, on-prem environments, and remote edge locations&#8212;without introducing unnecessary complexity or infrastructure sprawl. <a href="https://www.expanso.io/">Bacalhau</a> solves this.</p><p>With Bacalhau, enterprises can submit thousands of jobs, and the platform ensures they&#8217;re executed efficiently, wherever needed. It handles workload scheduling across dynamic environments, resolves failures and consensus when networks go down, and seamlessly integrates into existing data architectures.</p><p>This award is a testament to the developers, contributors, and customers who have helped shape Bacalhau into a powerful enterprise solution while remaining true to its open-source roots.</p><p><strong>Looking Ahead</strong></p><p>This is just the beginning. Bacalhau is evolving with new capabilities that further simplify workload orchestration, increase efficiency, and integrate deeper into enterprise ecosystems. 
Stay tuned&#8212;there&#8217;s more to come.</p><p><strong><a href="https://www.globenewswire.com/news-release/2025/04/03/3055289/0/en/Standout-Data-Technology-Innovators-Honored-in-6th-Annual-Data-Breakthrough-Awards-Program.html">Read the official Data Breakthrough announcement here &#8594;</a></strong></p><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.bacalhau.org/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.bacalhau.org/subscribe?"><span>Subscribe now</span></a></p><h3>Get Involved!</h3><p>We welcome your involvement in Bacalhau. There are many <a href="https://docs.bacalhau.org/community/ways-to-contribute/">ways to contribute</a>, and we&#8217;d love to hear from you. Please reach out to us at any of the following locations.</p><ul><li><p><a href="https://www.expanso.io/">Expanso&#8217;s Website</a></p></li><li><p><a href="http://bacalhau.org/">Bacalhau&#8217;s Website</a></p></li><li><p><a href="https://bsky.app/profile/bacalhau.org">Bacalhau&#8217;s Bluesky</a></p></li><li><p><a href="http://twitter.com/bacalhauproject">Bacalhau&#8217;s Twitter</a></p></li><li><p><a href="https://twitter.com/ExpansoIO">Expanso&#8217;s Twitter</a></p></li><li><p><a href="https://www.tiktok.com/@expanso.io?_t=ZN-8uypYqUuKTW&amp;_r=1">TikTok</a></p></li><li><p><a href="https://www.youtube.com/@ExpansoIO">Youtube</a></p></li><li><p><a href="https://bit.ly/bacalhau-project-slack">Slack</a></p></li><li><p><a href="https://www.linkedin.com/company/expanso-io">LinkedIn</a></p></li><li><p><a href="https://expanso-inc.breezy.hr/">Careers Page</a></p></li></ul><h3>Commercial Support</h3><p>While Bacalhau is <a href="https://en.wikipedia.org/wiki/Open-source_software">open-source software</a>, the Bacalhau binaries go through the security, verification, and signing build process lovingly crafted by <a 
href="https://www.expanso.io/">Expanso</a>. You can read more about the difference between open-source Bacalhau and commercially supported Bacalhau <a href="https://www.expanso.io/faq/">in our FAQ</a>. If you would like to use our pre-built binaries and receive commercial support, please <a href="https://www.expanso.io/contact/">contact us</a> or <a href="https://cloud.expanso.io/login">get your license</a> on Expanso Cloud!</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cgzM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cgzM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 424w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 848w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1272w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cgzM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png" 
width="74" height="74" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/da36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:240,&quot;width&quot;:240,&quot;resizeWidth&quot;:74,&quot;bytes&quot;:6393,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!cgzM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 424w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 848w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1272w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://link.cod.dev/bacalhau-repo&quot;,&quot;text&quot;:&quot;&#11088;&#65039; GitHub&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" 
href="https://link.cod.dev/bacalhau-repo"><span>&#11088;&#65039; GitHub</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PTVY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PTVY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 424w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 848w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PTVY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png" width="72" height="72" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:72,&quot;bytes&quot;:87775,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!PTVY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 424w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 848w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://link.cod.dev/bacalhau-slack&quot;,&quot;text&quot;:&quot;Slack&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" 
href="https://link.cod.dev/bacalhau-slack"><span>Slack</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://www.expanso.io" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6NiL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 424w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 848w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1272w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6NiL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic" width="118" height="72.61538461538461" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/db6feb48-90c9-40be-8bf6-7a449ec5476c.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:896,&quot;width&quot;:1456,&quot;resizeWidth&quot;:118,&quot;bytes&quot;:68480,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://www.expanso.io&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!6NiL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 424w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 848w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1272w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.expanso.io&quot;,&quot;text&quot;:&quot;Website&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.expanso.io"><span>Website</span></a></p><div 
class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!z2xN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!z2xN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 424w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 848w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1272w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!z2xN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic" width="128" height="94.50549450549451" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ec2518b9-3542-4718-976c-c9e51a38b480.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1075,&quot;width&quot;:1456,&quot;resizeWidth&quot;:128,&quot;bytes&quot;:50073,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!z2xN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 424w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 848w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1272w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.bacalhau.org&quot;,&quot;text&quot;:&quot;Website&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" 
href="https://www.bacalhau.org"><span>Website</span></a></p>]]></content:encoded></item><item><title><![CDATA[Bacalhau v1.7.0 - Day 5: Distributed Data Warehouse with Bacalhau and DuckDB]]></title><description><![CDATA[(3:11) Build a distributed data warehouse with Bacalhau and DuckDB to run SQL queries on regional data without moving it.]]></description><link>https://blog.bacalhau.org/p/distributed-data-warehouse-with-bacalhau</link><guid isPermaLink="false">https://blog.bacalhau.org/p/distributed-data-warehouse-with-bacalhau</guid><dc:creator><![CDATA[Chris Chinchilla]]></dc:creator><pubDate>Fri, 28 Mar 2025 19:29:06 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/3a8524e5-34c8-491b-97c8-7b6e89f25f46_3939x2814.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>This is part of the 5-days of Bacalhau 1.7 series! Make sure to go back to the start to catch all of them!</em></p><ul><li><p><strong><a href="https://blog.bacalhau.org/p/announcing-bacalhau-17-empowering?r=2xwcw0">Day 1: Announcing Bacalhau 1.7.0: Empowering Enterprises with Enhanced Scalability, Job Management, and Support</a></strong></p></li><li><p><strong><a href="https://blog.bacalhau.org/p/bacalhau-v170-day-2-scaling-your?r=2xwcw0">Day 2: Scaling Your Compute Jobs with Bacalhau Partitioned Jobs</a></strong></p></li><li><p><strong><a href="https://blog.bacalhau.org/p/bacalhau-v170-day-3-streamlining">Day 3: Streamlining Security: Simplifying Bacalhau's Authentication Model</a></strong></p></li><li><p><strong><a href="https://blog.bacalhau.org/p/bacalhau-v170-day-4-using-aws-s3">Day 4: Using AWS S3 Partitioning With Bacalhau</a></strong></p></li></ul><h1>Distributed Data Warehouse with Bacalhau and DuckDB</h1><p>With many applications that rely on data warehouses, you need to keep data sources in different locations. This could be due to privacy or regulatory reasons or because you want to keep processing close to the source. 
However, there are still times when you want to run analysis on and across these data sources from one location without moving the data.</p><p>This post uses Bacalhau to orchestrate the distributed processing and DuckDB to provide the SQL storage and querying capacity for some mock sales data based in the EU and the US.</p><h2>Prerequisites</h2><p>To follow this tutorial, you need:</p><ul><li><p><a href="https://docs.bacalhau.org/getting-started/installation">Bacalhau CLI</a></p></li><li><p><a href="https://docs.docker.com/compose/">Docker and Docker Compose</a></p></li><li><p>The example multi-region setup</p><ul><li><p><code>git clone https://github.com/bacalhau-project/examples.git</code></p></li></ul></li></ul><h2>Architecture</h2><p>The example Docker Compose file and Bacalhau job definitions in <a href="https://github.com/bacalhau-project/examples/tree/main/distributed-datawarehouse/duckdb">the example repository</a> simulate the following architecture:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6Wzt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7fc39c9-980d-418a-b455-46f554096eb8_1600x590.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6Wzt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7fc39c9-980d-418a-b455-46f554096eb8_1600x590.png 424w, https://substackcdn.com/image/fetch/$s_!6Wzt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7fc39c9-980d-418a-b455-46f554096eb8_1600x590.png 848w, 
https://substackcdn.com/image/fetch/$s_!6Wzt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7fc39c9-980d-418a-b455-46f554096eb8_1600x590.png 1272w, https://substackcdn.com/image/fetch/$s_!6Wzt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7fc39c9-980d-418a-b455-46f554096eb8_1600x590.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6Wzt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7fc39c9-980d-418a-b455-46f554096eb8_1600x590.png" width="1456" height="537" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f7fc39c9-980d-418a-b455-46f554096eb8_1600x590.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:537,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6Wzt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7fc39c9-980d-418a-b455-46f554096eb8_1600x590.png 424w, https://substackcdn.com/image/fetch/$s_!6Wzt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7fc39c9-980d-418a-b455-46f554096eb8_1600x590.png 848w, 
https://substackcdn.com/image/fetch/$s_!6Wzt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7fc39c9-980d-418a-b455-46f554096eb8_1600x590.png 1272w, https://substackcdn.com/image/fetch/$s_!6Wzt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7fc39c9-980d-418a-b455-46f554096eb8_1600x590.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><ul><li><p><strong>Bacalhau Orchestrator</strong>: Central control plane for job distribution</p></li><li><p><strong>Compute 
Nodes</strong>: Distributed across regions, running close to the data</p></li><li><p><strong>Regional Storage</strong>: Regional data stores, using MinIO in this setup</p></li><li><p><strong>DuckDB</strong>: SQL query engine running on each compute node. Bacalhau provides a custom image that adds several <a href="https://duckdb.org/docs/stable/clients/python/function.html">user-defined functions</a> to help partition large data sets across nodes, using the following methods:</p><ul><li><p><code>partition_by_hash</code>: Even distribution of files across partitions</p></li><li><p><code>partition_by_regex</code>: Pattern-based partitioning</p></li><li><p><code>partition_by_date</code>: Time-based partitioning</p></li></ul></li></ul><p>You can find more details on how these functions work in <a href="https://github.com/bacalhau-project/examples/tree/main/utility_containers/duckdb">the custom image documentation</a>.</p><p>You can see each component&#8217;s setup in <a href="https://github.com/bacalhau-project/examples/blob/main/distributed-datawarehouse/duckdb/network/docker-compose.yml">the Docker Compose file</a>. 
Create the architecture by running the following command from the <code>distributed-datawarehouse/duckdb/network</code> directory of the cloned repository:</p><p><code>docker compose up -d</code></p><p>The Docker Compose file uses several Bacalhau configuration files, which you can see <a href="https://github.com/bacalhau-project/examples/tree/main/distributed-datawarehouse/duckdb/network/config">in the configuration folder</a>. These files label the compute nodes as US or EU nodes.</p><p>They also configure the orchestrator nodes to write data to the regional MinIO buckets.</p><h2>Generate sample data</h2><p>With the simulated infrastructure in place, you can now create sample data using <a href="https://github.com/bacalhau-project/examples/blob/main/distributed-datawarehouse/duckdb/jobs/data-generator.yaml">the data generator job</a>. Each run writes 3000 records in JSON format to the MinIO bucket for the selected region.</p><p><code># Move to the jobs directory<br> cd ../jobs</code></p><p>Generate the US data:</p><p><code>bacalhau job run -V Region=us -V Events=3000 \<br> -V StartDate=2024-01-01 -V EndDate=2024-12-31 \<br> -V RotateInterval=month data-generator.yaml</code></p><p>Generate the EU data:</p><p><code>bacalhau job run -V Region=eu -V Events=3000 \<br> -V StartDate=2024-01-01 -V EndDate=2024-12-31 \<br> -V RotateInterval=month data-generator.yaml</code></p><h2>Accessing data for analysis</h2><p>Bacalhau supports two ways to access the regional data:</p><h3>Bacalhau input sources</h3><p><code>InputSources:<br> - Type: s3<br> Source:<br> Bucket: local-bucket<br> Key: "data/*"</code></p><p>This method provides more control and preprocessing options and <a href="https://docs.bacalhau.org/common-workflows/mounting-input-data">supports other source types in addition to S3</a>.</p><h3>Direct DuckDB access</h3><p><code>SET VARIABLE files = (<br> SELECT LIST(file)<br> FROM partition_by_hash('s3://local-bucket/**/*.jsonl')<br> );<br> SELECT * FROM read_json_auto(getvariable('files'));</code></p><p>This method is a simpler and more familiar option for SQL-only 
jobs. The job definitions also use SQL queries to process data from an input source.</p><h2>Run analysis</h2><p>With data in place, you can submit analysis tasks as <a href="https://docs.bacalhau.org/getting-started/cli/submitting-jobs">Bacalhau jobs</a>. After running each job, use <code>bacalhau job describe &lt;job_id&gt;</code> to see the results, passing the job ID from the output of the <code>bacalhau job run</code> command. All the examples use US data. To see results from the EU region, change <code>Region</code> to <code>eu</code>.</p><h3>Monthly trend analysis</h3><p><code>bacalhau job run -V Region=us monthly-trends.yaml</code></p><p><a href="https://github.com/bacalhau-project/examples/blob/main/distributed-datawarehouse/duckdb/jobs/monthly-trends.yaml">Job definition</a>.</p><p>Sample output:</p><p><code>month | total_txns | revenue | unique_customers | avg_txn_value<br> ------------|------------|----------|------------------|---------------<br> 2024-03-01 | 3,421 | 178,932 | 1,245 | 52.30<br> 2024-02-01 | 3,156 | 165,789 | 1,189 | 52.53<br> 2024-01-01 | 2,987 | 152,456 | 1,023 | 51.04</code></p><h3>Operational monitoring</h3><ol><li><p><a href="https://github.com/bacalhau-project/examples/blob/main/distributed-datawarehouse/duckdb/jobs/hourly-operations.yaml">Hourly Operations</a></p></li></ol><ul><li><p>Tracks operational health metrics</p></li><li><p>Monitors transaction success rates</p></li><li><p>Shows hourly patterns</p></li></ul><blockquote><p><code>bacalhau job run -V Region=us hourly-operations.yaml</code></p></blockquote><ol start="2"><li><p><a href="https://github.com/bacalhau-project/examples/blob/main/distributed-datawarehouse/duckdb/jobs/anomaly-detection.yaml">Anomaly Detection</a></p></li></ol><ul><li><p>Identifies unusual patterns</p></li><li><p>Uses statistical analysis</p></li><li><p>Flags significant deviations</p></li></ul><blockquote><p><code>bacalhau job run -V Region=us 
anomaly-detection.yaml</code></p></blockquote><h3>Business analytics</h3><ol><li><p><a href="https://github.com/bacalhau-project/examples/blob/main/distributed-datawarehouse/duckdb/jobs/product-performance.yaml">Product Performance</a></p></li></ol><ul><li><p>Analyzes category performance</p></li><li><p>Tracks market share</p></li><li><p>Shows sales patterns</p></li></ul><blockquote><p><code>bacalhau job run -V Region=us product-performance.yaml</code></p></blockquote><ol start="2"><li><p><a href="https://github.com/bacalhau-project/examples/blob/main/distributed-datawarehouse/duckdb/jobs/monthly-trends.yaml">Monthly Trends</a></p></li></ol><ul><li><p>Long-term trend analysis</p></li><li><p>Monthly aggregations</p></li><li><p>Key business metrics</p></li></ul><blockquote><p><code>bacalhau job run -V Region=us monthly-trends.yaml</code></p></blockquote><h3>Customer analysis</h3><ol><li><p>Customer Segmentation (Two-Phase)</p></li></ol><ul><li><p><strong>Phase 1</strong>: Compute local metrics</p></li><li><p><strong>Phase 2</strong>: Combine and segment</p></li></ul><blockquote><p><code># Run Phase 1<br> bacalhau job run -V Region=us customer-segments-phase1.yaml<br><br> # Note the job ID, then run Phase 2<br> bacalhau job run -V Region=us -V JobID=&lt;phase1-job-id&gt; customer-segments-phase2.yaml</code></p></blockquote><h2>Summary</h2><p>This post combined the distributed computing power of Bacalhau with the flexible SQL capabilities of DuckDB to create a distributed data warehouse spread across regions. 
The example Bacalhau jobs cover a range of analysis tasks, from operational monitoring to customer segmentation, all while keeping the data in its original location and using SQL to query data stored in S3-compatible buckets.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.bacalhau.org/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.bacalhau.org/subscribe?"><span>Subscribe now</span></a></p><h3>Get Involved!</h3><p>We welcome your involvement in Bacalhau. There are many <a href="https://docs.bacalhau.org/community/ways-to-contribute/">ways to contribute</a>, and we&#8217;d love to hear from you. Please reach out to us at any of the following locations.</p><ul><li><p><a href="https://www.expanso.io/">Expanso&#8217;s Website</a></p></li><li><p><a href="http://bacalhau.org/">Bacalhau&#8217;s Website</a></p></li><li><p><a href="https://bsky.app/profile/bacalhau.org">Bacalhau&#8217;s Bluesky</a></p></li><li><p><a href="http://twitter.com/bacalhauproject">Bacalhau&#8217;s Twitter</a></p></li><li><p><a href="https://twitter.com/ExpansoIO">Expanso&#8217;s Twitter</a></p></li><li><p><a href="https://www.tiktok.com/@expanso.io?_t=ZN-8uypYqUuKTW&amp;_r=1">TikTok</a></p></li><li><p><a href="https://www.youtube.com/@ExpansoIO">YouTube</a></p></li><li><p><a href="https://bit.ly/bacalhau-project-slack">Slack</a></p></li><li><p><a href="https://www.linkedin.com/company/expanso-io">LinkedIn</a></p></li><li><p><a href="https://expanso-inc.breezy.hr/">Careers Page</a></p></li></ul><h3>Commercial Support</h3><p>While Bacalhau is <a href="https://en.wikipedia.org/wiki/Open-source_software">open-source software</a>, the Bacalhau binaries go through the security, verification, and signing build process lovingly crafted by <a href="https://www.expanso.io/">Expanso</a>. 
You can read more about the difference between open-source Bacalhau and commercially supported Bacalhau <a href="https://www.expanso.io/faq/">in our FAQ</a>. If you would like to use our pre-built binaries and receive commercial support, please <a href="https://www.expanso.io/contact/">contact us</a> or <a href="https://cloud.expanso.io/login">get your license</a> on Expanso Cloud!</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cgzM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cgzM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 424w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 848w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1272w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cgzM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png" width="74" height="74" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/da36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:240,&quot;width&quot;:240,&quot;resizeWidth&quot;:74,&quot;bytes&quot;:6393,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!cgzM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 424w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 848w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1272w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://link.cod.dev/bacalhau-repo&quot;,&quot;text&quot;:&quot;&#11088;&#65039; GitHub&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" 
href="https://link.cod.dev/bacalhau-repo"><span>&#11088;&#65039; GitHub</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PTVY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PTVY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 424w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 848w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PTVY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png" width="72" height="72" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:72,&quot;bytes&quot;:87775,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!PTVY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 424w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 848w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://link.cod.dev/bacalhau-slack&quot;,&quot;text&quot;:&quot;Slack&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" 
href="https://link.cod.dev/bacalhau-slack"><span>Slack</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://www.expanso.io" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6NiL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 424w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 848w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1272w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6NiL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic" width="118" height="72.61538461538461" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/db6feb48-90c9-40be-8bf6-7a449ec5476c.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:896,&quot;width&quot;:1456,&quot;resizeWidth&quot;:118,&quot;bytes&quot;:68480,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://www.expanso.io&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!6NiL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 424w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 848w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1272w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.expanso.io&quot;,&quot;text&quot;:&quot;Website&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.expanso.io"><span>Website</span></a></p><div 
class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!z2xN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!z2xN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 424w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 848w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1272w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!z2xN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic" width="128" height="94.50549450549451" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ec2518b9-3542-4718-976c-c9e51a38b480.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1075,&quot;width&quot;:1456,&quot;resizeWidth&quot;:128,&quot;bytes&quot;:50073,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!z2xN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 424w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 848w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1272w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.bacalhau.org&quot;,&quot;text&quot;:&quot;Website&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" 
href="https://www.bacalhau.org"><span>Website</span></a></p>]]></content:encoded></item><item><title><![CDATA[Bacalhau v1.7.0 - Day 4: Using AWS S3 Partitioning With Bacalhau ]]></title><description><![CDATA[(7:00) Bacalhau 1.7.1 simplifies S3 data processing with automated partitioning and built-in failure handling.]]></description><link>https://blog.bacalhau.org/p/bacalhau-v170-day-4-using-aws-s3</link><guid isPermaLink="false">https://blog.bacalhau.org/p/bacalhau-v170-day-4-using-aws-s3</guid><dc:creator><![CDATA[Federico Trotta]]></dc:creator><pubDate>Thu, 27 Mar 2025 16:03:02 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/554f4207-6277-48a9-b8cf-21d5c1980f59_3283x2345.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>This is part of the 5-days of Bacalhau 1.7 series! Make sure to go back to the start to catch all of them!</em></p><ul><li><p><strong><a href="https://blog.bacalhau.org/p/announcing-bacalhau-17-empowering?r=2xwcw0">Day 1: Announcing Bacalhau 1.7.0: Empowering Enterprises with Enhanced Scalability, Job Management, and Support</a></strong></p></li><li><p><strong><a href="https://blog.bacalhau.org/p/bacalhau-v170-day-2-scaling-your?r=2xwcw0">Day 2: Scaling Your Compute Jobs with Bacalhau Partitioned Jobs</a></strong></p></li><li><p><strong><a href="https://blog.bacalhau.org/p/bacalhau-v170-day-3-streamlining">Day 3: Streamlining Security: Simplifying Bacalhau's Authentication Model</a></strong></p></li></ul><p>Processing large datasets from S3 can be a challenge, particularly when the data outgrows the compute and disk throughput of a single machine. The good news is that we made it a lot easier with Bacalhau!</p><p>While we introduced <a href="https://blog.bacalhau.org/p/bacalhau-v170-day-2-scaling-your">generic partitioning</a> in Bacalhau v1.7, our new S3 partitioning feature handles data distribution automatically across multiple executions. 
It comes with built-in failure handling and independent retries of failed partitions, all tailored to S3.</p><p>Let's dive into how this changes the game for distributed data processing.</p><h2><strong>Why Partition at All?</strong></h2><p>Before this feature, processing large S3 datasets was challenging. You had to create multiple jobs or write custom code to split the work. Without that effort, a single machine would slow down or crash once it ran out of compute or disk throughput.</p><p>To solve this challenge, you had to figure out which part of your job was running on which machine. Then, you had to tell each part its position in the overall task. This was hard to manage. For example, if slice #3 of 8 failed, how would you know? How would you know which data slice #7 was supposed to handle? More generally: how would you see the big picture of the entire job?</p><h2><strong>The Power of Automated Partitioning</strong></h2><p>Bacalhau 1.7.1 orchestrates everything for you. You just need to choose your partitioning strategy, and each task automatically gets its assigned subset of S3 objects. Your code stays clean and focused on its main job. If a partition fails, Bacalhau automatically retries only that partition and keeps the results from the successful ones.</p><p>For example, suppose you run a job and get the following result:</p><p><code>Job with 5 partitions:<br>Partition 0: &#10003; Completed<br>Partition 1: &#10003; Completed<br>Partition 2: &#10003; Completed<br>Partition 3: &#10007; Failed -&gt; Scheduled for retry<br>Partition 4: &#10003; Completed</code></p><p>This means that Partition 3 has failed. 
However, Bacalhau will automatically retry the failed partition while preserving the results of the other four partitions, which completed successfully.</p><h2><strong>Partitioning Strategies for Every Need</strong></h2><p>Let&#8217;s now walk through the available partitioning strategies and the scenarios each one suits, using data stored in S3.</p><h3><strong>No Partitioning: When Sharing Is Good</strong></h3><p>There are cases where every execution needs access to all the data, so partitioning is not needed. Typical scenarios are:</p><ul><li><p>Loading shared reference data</p></li><li><p>Processing configuration files</p></li><li><p>Running analysis that needs the complete dataset</p></li></ul><p>In these cases, you can give every execution the whole dataset like so:</p><p><code>name: shared-reference-data</code></p><p><code>count: 3</code></p><p><code>...</code></p><p><code>tasks:</code></p><p><code>- inputSources:</code></p><p><code>- target: /data</code></p><p><code>source:</code></p><p><code>type: s3</code></p><p><code>params:</code></p><p><code>bucket: config-bucket</code></p><p><code>key: reference-data/</code></p><p><code># No partition config - all executions see all files</code></p><p>In this case, there is no <code>partition</code> block, so the dataset is not split and every execution sees all the files.</p><p>Also, <code>type: s3</code> under the <code>source</code> field specifies the type of data source used for this task. 
So the input data comes from an S3-compatible storage system.</p><h3><strong>Object-Based Distribution: When Balance Matters</strong></h3><p>If you need to process many files without any specific grouping, object partitioning provides an even distribution of the load.</p><p>This solution is ideal for:</p><ul><li><p>Processing large volumes of user uploads</p></li><li><p>Handling randomly named files</p></li><li><p>Large-scale data transformation tasks</p></li></ul><p>Here is how Bacalhau handles this for you:</p><p><code>name: process-uploads</code></p><p><code>count: 5</code></p><p><code>...</code></p><p><code>tasks:</code></p><p><code>- inputSources:</code></p><p><code>- target: /uploads</code></p><p><code>source:</code></p><p><code>type: s3</code></p><p><code>params:</code></p><p><code>bucket: data-bucket</code></p><p><code>key: user-uploads/</code></p><p><code>partition:</code></p><p><code>type: object</code></p><p>In this case, <code>count: 5</code> runs five executions, and the <code>partition</code> block spreads the objects evenly across them.</p><h3><strong>Processing by Date: Time-Series Analysis</strong></h3><p>Time-series analysis is both a joy and a burden for every data professional, though sometimes more of a burden than a joy!</p><p>With Bacalhau, you can use partitioning to process each day's data in parallel. 
This is a perfect fit for:</p><ul><li><p>Daily analytics processing</p></li><li><p>Log aggregation and analysis</p></li><li><p>Time-series computations</p></li></ul><p>Here is how you can do so:</p><p><code>name: daily-log-analysis</code></p><p><code>count: 7 # Process a week's worth of logs in parallel</code></p><p><code>...</code></p><p><code>tasks:</code></p><p><code>- inputSources:</code></p><p><code>- target: /logs</code></p><p><code>source:</code></p><p><code>type: s3</code></p><p><code>params:</code></p><p><code>bucket: app-logs</code></p><p><code>key: "logs/*"</code></p><p><code>partition:</code></p><p><code>type: date</code></p><p><code>dateFormat: "2006-01-02"</code></p><p>In this case, <code>count: 7</code> runs seven executions in parallel, enough to cover a week&#8217;s worth of daily logs. The <code>dateFormat</code> value follows Go&#8217;s reference-date layout, so <code>"2006-01-02"</code> means year-month-day.</p><h3><strong>Processing by Region: Geographic Analysis</strong></h3><p>Geographic analysis is another scenario that often involves a lot of data. One solution is to distribute the processing by region with partitioning. This enables scenarios like:</p><ul><li><p>Regional sales analysis</p></li><li><p>Geographic data processing</p></li><li><p>Territory-specific reporting</p></li></ul><p>Here is how you can manage this in Bacalhau:</p><p><code>name: regional-analysis</code></p><p><code>count: 3 # One execution per region</code></p><p><code>...</code></p><p><code>tasks:</code></p><p><code>- inputSources:</code></p><p><code>- target: /sales</code></p><p><code>source:</code></p><p><code>type: s3</code></p><p><code>params:</code></p><p><code>bucket: global-sales</code></p><p><code>key: "regions/*"</code></p><p><code>partition:</code></p><p><code>type: regex</code></p><p><code>pattern: "([^/]+)/.*"</code></p><p>For example, if you have data in <code>regions/NA/</code>, <code>regions/EU/</code>, <code>regions/APAC/</code>, etc., each execution will process one region's worth of data. 
The <code>pattern: "([^/]+)/.*"</code> is a regular expression that works as follows:</p><ul><li><p><code>([^/]+)</code>: This part matches and captures one or more characters that are not a forward slash (/). This is the first capturing group.</p></li><li><p><code>/.*</code>: This matches a forward slash (/) followed by zero or more of any character (.*).</p></li></ul><p>As a result, if the S3 key is <code>regions/europe/sales.csv</code>, the pattern is matched against the part of the key after the <code>regions/</code> prefix, and the regex captures <code>europe</code>.</p><h3><strong>Processing by Customer Segment</strong></h3><p>Another typical use of partitioning is customer segmentation. Common analysis scenarios here are:</p><ul><li><p>Customer cohort analysis</p></li><li><p>Segment-specific processing</p></li><li><p>Category-based computations</p></li></ul><p>You can handle your analysis with Bacalhau partitioning as follows:</p><p><code>name: segment-analytics</code></p><p><code>count: 4</code></p><p><code>...</code></p><p><code>tasks:</code></p><p><code>- inputSources:</code></p><p><code>- target: /segments</code></p><p><code>source:</code></p><p><code>type: s3</code></p><p><code>params:</code></p><p><code>bucket: customer-data</code></p><p><code>key: segments/*</code></p><p><code>partition:</code></p><p><code>type: substring</code></p><p><code>startIndex: 0</code></p><p><code>endIndex: 3</code></p><h3><strong>Combining Partitioned and Shared Inputs</strong></h3><p>In certain cases, you may need Bacalhau jobs to process partitioned data while also sharing reference data that every execution needs to access. 
Common scenarios are:</p><ul><li><p>Processing daily logs with shared lookup tables</p></li><li><p>Analyzing data using common reference files</p></li><li><p>Running calculations that need both partitioned data and shared configuration</p></li></ul><p>As an example, consider this job that combines static reference data with daily logs partitioned by date:</p><p><code>name: daily-analysis</code></p><p><code>count: 7 # Process a week of data</code></p><p><code>...</code></p><p><code>tasks:</code></p><p><code>- inputSources:</code></p><p><code>- target: /config</code></p><p><code>source:</code></p><p><code>type: s3</code></p><p><code>params:</code></p><p><code>bucket: config-bucket</code></p><p><code>key: reference/*</code></p><p><code># No partitioning - all executions see all reference data</code></p><p><code>- target: /daily-logs</code></p><p><code>source:</code></p><p><code>type: s3</code></p><p><code>params:</code></p><p><code>bucket: app-logs</code></p><p><code>key: logs/*</code></p><p><code>partition:</code></p><p><code>type: date</code></p><p><code>dateFormat: "2006-01-02"</code></p><p>This job partitions only the <code>/daily-logs</code> input, splitting it by date across the seven executions set by <code>count: 7</code>.</p><p>The reference data (<code>/config</code>), by contrast, has no <code>partition</code> block, so every execution sees all of it.</p><h2><strong>Why This Changes Your Large Dataset Processing</strong></h2><p>As you have seen, this feature is simple yet powerful. You no longer need to write partition-aware code: just clean, focused processing logic with automatic data assignment. We&#8217;ve already tested this at over 1,000 partitions, with automated load balancing and no code changes needed. 
But <a href="https://bacalhauproject.slack.com/join/shared_invite/zt-1sihp4vxf-TjkbXz6JRQpg2AhetPzYYQ#/shared-invite/email">tell us</a> if you&#8217;d like us to go even further!</p><h2><strong>Getting Started With S3 Partitioning</strong></h2><p>If you&#8217;d like to try these examples on your own, dive right in! <a href="https://docs.bacalhau.org/getting-started/installation">Install Bacalhau</a> and give it a shot.</p><p>By the way, if you don&#8217;t have a Bacalhau network yet and would still like to try it out, we recommend using <a href="https://cloud.expanso.io/login">Expanso Cloud</a>. Also, if you'd like to set up a cluster on your own, <a href="https://docs.bacalhau.org/getting-started/network-setup">you can do that too</a> (we have setup guides for AWS, GCP, Azure, and many more &#128578;).</p><h2><strong>What's Next?</strong></h2><p>Start processing your S3 data today:</p><ol><li><p>Identify your natural data groupings (dates, regions, categories)</p></li><li><p>Choose the matching partition strategy</p></li><li><p>Let Bacalhau handle the distribution</p></li></ol><p>Ready to simplify your distributed data processing? Check out our <a href="https://docs.bacalhau.org/common-workflows/s3-partitioning">documentation</a> for more examples and detailed guides.</p><p>Join our <a href="https://slack.bacalhau.org/">community</a> to share your data processing stories and learn from others!</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.bacalhau.org/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.bacalhau.org/subscribe?"><span>Subscribe now</span></a></p><h3>Get Involved!</h3><p>We welcome your involvement in Bacalhau. There are many <a href="https://docs.bacalhau.org/community/ways-to-contribute/">ways to contribute</a>, and we&#8217;d love to hear from you. 
Please reach out to us at any of the following locations.</p><ul><li><p><a href="https://www.expanso.io/">Expanso&#8217;s Website</a></p></li><li><p><a href="http://bacalhau.org/">Bacalhau&#8217;s Website</a></p></li><li><p><a href="https://bsky.app/profile/bacalhau.org">Bacalhau&#8217;s Bluesky</a></p></li><li><p><a href="http://twitter.com/bacalhauproject">Bacalhau&#8217;s Twitter</a></p></li><li><p><a href="https://twitter.com/ExpansoIO">Expanso&#8217;s Twitter</a></p></li><li><p><a href="https://www.tiktok.com/@expanso.io?_t=ZN-8uypYqUuKTW&amp;_r=1">TikTok</a></p></li><li><p><a href="https://www.youtube.com/@ExpansoIO">YouTube</a></p></li><li><p><a href="https://bit.ly/bacalhau-project-slack">Slack</a></p></li><li><p><a href="https://www.linkedin.com/company/expanso-io">LinkedIn</a></p></li><li><p><a href="https://expanso-inc.breezy.hr/">Careers Page</a></p></li></ul><h3>Commercial Support</h3><p>While Bacalhau is <a href="https://en.wikipedia.org/wiki/Open-source_software">open-source software</a>, the Bacalhau binaries go through the security, verification, and signing build process lovingly crafted by <a href="https://www.expanso.io/">Expanso</a>. You can read more about the difference between open-source Bacalhau and commercially supported Bacalhau <a href="https://www.expanso.io/faq/">in our FAQ</a>. 
If you would like to use our pre-built binaries and receive commercial support, please <a href="https://www.expanso.io/contact/">contact us</a> or <a href="https://cloud.expanso.io/login">get your license</a> on Expanso Cloud!</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cgzM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cgzM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 424w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 848w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1272w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cgzM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png" width="74" height="74" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/da36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:240,&quot;width&quot;:240,&quot;resizeWidth&quot;:74,&quot;bytes&quot;:6393,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!cgzM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 424w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 848w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1272w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://link.cod.dev/bacalhau-repo&quot;,&quot;text&quot;:&quot;&#11088;&#65039; GitHub&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" 
href="https://link.cod.dev/bacalhau-repo"><span>&#11088;&#65039; GitHub</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PTVY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PTVY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 424w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 848w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PTVY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png" width="72" height="72" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:72,&quot;bytes&quot;:87775,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!PTVY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 424w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 848w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://link.cod.dev/bacalhau-slack&quot;,&quot;text&quot;:&quot;Slack&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" 
href="https://link.cod.dev/bacalhau-slack"><span>Slack</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://www.expanso.io" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6NiL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 424w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 848w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1272w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6NiL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic" width="118" height="72.61538461538461" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/db6feb48-90c9-40be-8bf6-7a449ec5476c.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:896,&quot;width&quot;:1456,&quot;resizeWidth&quot;:118,&quot;bytes&quot;:68480,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://www.expanso.io&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!6NiL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 424w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 848w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1272w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.expanso.io&quot;,&quot;text&quot;:&quot;Website&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.expanso.io"><span>Website</span></a></p><div 
class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!z2xN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!z2xN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 424w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 848w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1272w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!z2xN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic" width="128" height="94.50549450549451" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ec2518b9-3542-4718-976c-c9e51a38b480.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1075,&quot;width&quot;:1456,&quot;resizeWidth&quot;:128,&quot;bytes&quot;:50073,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!z2xN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 424w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 848w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1272w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.bacalhau.org&quot;,&quot;text&quot;:&quot;Website&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" 
href="https://www.bacalhau.org"><span>Website</span></a></p>]]></content:encoded></item><item><title><![CDATA[Bacalhau v1.7.0 - Day 3: Streamlining Security: Simplifying Bacalhau's Authentication Model]]></title><description><![CDATA[(12:30)This post covers Bacalhau 1.7&#8217;s new, simplified auth options: Basic Auth, API tokens, and OAuth 2.0 SSO.]]></description><link>https://blog.bacalhau.org/p/bacalhau-v170-day-3-streamlining</link><guid isPermaLink="false">https://blog.bacalhau.org/p/bacalhau-v170-day-3-streamlining</guid><dc:creator><![CDATA[Walid Baruni]]></dc:creator><pubDate>Wed, 26 Mar 2025 16:01:32 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/5df01fc1-6039-4209-bb74-26d87e19856c_3939x2814.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>This is part of the 5-days of Bacalhau 1.7 series! Make sure to go back to the start to catch all of them!</em></p><ul><li><p><strong><a href="https://blog.bacalhau.org/p/announcing-bacalhau-17-empowering?r=2xwcw0">Announcing Bacalhau 1.7.0: Empowering Enterprises with Enhanced Scalability, Job Management, and Support</a></strong></p></li><li><p><strong><a href="https://blog.bacalhau.org/p/bacalhau-v170-day-2-scaling-your?r=2xwcw0">Bacalhau v1.7.0 - Day 2: Scaling Your Compute Jobs with Bacalhau Partitioned Jobs</a></strong></p></li></ul><p>In the ever-evolving landscape of distributed computing, robust authentication and authorization mechanisms are essential for maintaining security while enabling seamless collaboration. The recent release of Bacalhau 1.7 introduces a significant overhaul to its authentication and authorization systems, offering more flexibility, improved security, and better integration with enterprise environments. 
Let's explore these exciting new features and understand how they enhance the Bacalhau ecosystem.</p><h2><strong>The Authentication Evolution</strong></h2><p>Prior to version 1.7, Bacalhau relied solely on Open Policy Agent for both authentication and authorization. While powerful, this approach placed the burden of generating complex policies on users and operators, creating a steep learning curve for newcomers. Additionally, the lack of OAuth 2.0 support limited integration possibilities with modern authentication systems, and the process of making authenticated API calls involved cumbersome back-and-forth communication.</p><p>With Bacalhau 1.7, these limitations have been addressed through the introduction of three distinct authentication paths, each designed to cater to different use cases and environments.</p><h2><strong>Authentication Options in Bacalhau 1.7</strong></h2><h3><strong>1. Basic Authentication</strong></h3><p>The simplest approach leverages the time-tested HTTP Basic Authentication protocol, allowing users to access Bacalhau APIs using traditional username and password credentials. These credentials can be defined in the Node Configuration file, offering two options for password storage:</p><ul><li><p>Plain text passwords for simplicity and ease of setup</p></li><li><p>bcrypt-hashed passwords for enhanced security</p></li></ul><p>For CLI usage, users simply need to set the environment variables <code>BACALHAU_API_USERNAME</code> and <code>BACALHAU_API_PASSWORD</code>. 
For direct API calls, the standard Basic Authorization header with base64-encoded credentials can be used.</p><p>Below is a sample orchestrator config file that defines three users who can authenticate through Basic Auth.</p><pre><code>Orchestrator:
  Enabled: true
API:
  Port: 1234
  Auth:
    Users:
      # User with plain text password
      - Alias: Admin User
        Username: admin
        Password: secureAdminPassword
        Capabilities:
          - Actions: ["*"]
      # User with limited permissions and plain text password
      - Alias: Read Only User
        Username: reader
        Password: readerPassword
        Capabilities:
          - Actions: ["read:*"]
      # User with bcrypt hashed password
      - Alias: Job Manager
        Username: jobmanager
        # This is a bcrypt password hash for the password "MySecretPassword"
        Password: "$2a$10$3ZvxUe5OudgRIQQheomjMO/Ufx1Bb04SH/y0PXnR19oDRXNGps3r2"
        Capabilities:
          - Actions: ["read:job", "write:job", "read:node"]</code></pre><p>In this configuration:</p><ol><li><p>We have three users with different permission levels:</p><ul><li><p>An admin user with full access to all capabilities</p></li><li><p>A read-only user who can view but not modify any resources</p></li><li><p>A job manager who can view nodes and has full control over jobs</p></li></ul></li><li><p>The first two users have plain text passwords, while the third uses a <strong>bcrypt</strong>-hashed password for added security.</p></li></ol><p>To help users and operators generate secure password hashes, a CLI command was added that produces a <strong>bcrypt</strong> hash of a password of your 
choosing:</p><p><code>bacalhau auth hash-password</code></p><p>To use this configuration with the Bacalhau CLI, set the environment variables for the user you want to act as. Only one username and password pair is active at a time, so export just the pair you need:</p><p><code># For admin access</code></p><p><code>export BACALHAU_API_USERNAME=admin</code></p><p><code>export BACALHAU_API_PASSWORD=secureAdminPassword</code></p><p><code># For read-only access</code></p><p><code>export BACALHAU_API_USERNAME=reader</code></p><p><code>export BACALHAU_API_PASSWORD=readerPassword</code></p><p><code># For job management</code></p><p><code>export BACALHAU_API_USERNAME=jobmanager</code></p><p><code>export BACALHAU_API_PASSWORD=MySecretPassword</code></p><p>For direct API calls using curl, base64-encode the <code>username:password</code> pair and pass it in the <code>Authorization</code> header:</p><p><code># For admin (base64 of "admin:secureAdminPassword")</code></p><p><code>curl -X GET -H "Authorization: Basic YWRtaW46c2VjdXJlQWRtaW5QYXNzd29yZA==" "http://orchestrator:1234/api/v1/orchestrator/nodes"</code></p><p><code># For reader (base64 of "reader:readerPassword")</code></p><p><code>curl -X GET -H "Authorization: Basic cmVhZGVyOnJlYWRlclBhc3N3b3Jk" "http://orchestrator:1234/api/v1/orchestrator/nodes"</code></p><p><code># For Job Manager (base64 of "jobmanager:MySecretPassword")</code></p><p><code>curl -X GET -H "Authorization: Basic am9ibWFuYWdlcjpNeVNlY3JldFBhc3N3b3Jk" "http://orchestrator:1234/api/v1/orchestrator/nodes"</code></p><h3><strong>2. API Tokens</strong></h3><p>For applications and scenarios where password-based authentication isn't ideal, Bacalhau 1.7 introduces API token support. Instead of username and password pairs, users can generate and use API keys as bearer tokens in authorization headers.</p><p>Configuration is straightforward &#8211; API keys are defined in the orchestrator config under user profiles. To use them with the Bacalhau CLI, users set the <code>BACALHAU_API_KEY</code> environment variable. 
For direct API access, the token is included in the Authorization header using the Bearer scheme.</p><p>Note that API keys are opaque tokens: the key itself encodes no information, and its permissions come from the <code>Capabilities</code> configured alongside it.</p><p>Here's a sample configuration for API tokens in Bacalhau:</p><pre><code>Orchestrator:
  Enabled: true
API:
  Port: 1234
  Auth:
    Users:
      # Administrator API token with full access
      - Alias: Admin API Token
        APIKey: 8F42A91D7C6E4B3DA5E9F8C12B76D3A4
        Capabilities:
          - Actions: ["*"]
      # Read-only API token
      - Alias: Monitoring Token
        APIKey: C5D8E3F1A7B94026895C1D4E3F2A0B78
        Capabilities:
          - Actions: ["read:*"]
      # Job management API token
      - Alias: CI/CD Pipeline Token
        APIKey: 2E8D7F5B3A9C41608D2E6B7F4A5C3D9E
        Capabilities:
          - Actions: ["read:job", "write:job", "read:node"]
      # Agent management API token
      - Alias: Agent Management Token
        APIKey: 1A3B5C7D9E0F2G4H6I8J0K2L4M6N8P0
        Capabilities:
          - Actions: ["read:agent", "write:agent"]</code></pre><p>In this configuration:</p><ol><li><p>We have four API tokens with different permission levels:</p><ul><li><p>An administrator token with full access to all capabilities</p></li><li><p>A monitoring token with read-only access to all resources</p></li><li><p>A CI/CD pipeline token that can view nodes and has full control over jobs</p></li><li><p>An agent management token that has full control over agents</p></li></ul></li><li><p>Each token has a unique, randomly generated API key. You should generate strong, unique keys for your production environment using a secure random generator. 
Note that API keys do not support bcrypt hashing.</p></li></ol><p>To use one of these API tokens with the Bacalhau CLI, set the following environment variable:</p><p><code>export BACALHAU_API_KEY=8F42A91D7C6E4B3DA5E9F8C12B76D3A4</code></p><p>For direct API calls using curl, you would use the Bearer token authentication scheme:</p><p><code>curl -X GET -H "Authorization: Bearer 8F42A91D7C6E4B3DA5E9F8C12B76D3A4" "http://orchestrator:1234/api/v1/orchestrator/nodes"</code></p><h3><strong>3. Single Sign-On via OAuth 2.0</strong></h3><p>Perhaps the most significant addition is the support for OAuth 2.0 using the Device Code Flow. This enables Bacalhau to integrate seamlessly with enterprise identity providers such as Okta, Auth0, Azure Active Directory, and Google SSO.</p><p>This approach eliminates the need to define users directly in Bacalhau's configuration, instead delegating user management to the identity provider &#8211; a considerable advantage in corporate environments with existing identity infrastructure.</p><p>The configuration process involves specifying OAuth 2.0 endpoints, client IDs, and desired scopes. When users need to authenticate, they run <code>bacalhau auth sso login</code>, which presents a device code and URL. 
After completing authentication in their browser, they receive a JWT that is automatically used for subsequent API calls; the token exchange happens in the background and requires no extra action from the user.</p><p>Here's a sample configuration for OAuth 2.0 SSO in Bacalhau:</p><pre><code>Orchestrator:
  Enabled: true
API:
  Port: 1234
  Auth:
    Oauth2:
      # Identity provider details
      ProviderId: "okta"
      ProviderName: "Okta SSO"
      # OAuth 2.0 endpoints
      DeviceAuthorizationEndpoint: "https://your-domain.okta.com/oauth2/v1/device/authorize"
      TokenEndpoint: "https://your-domain.okta.com/oauth2/v1/token"
      Issuer: "https://your-domain.okta.com"
      JWKSUri: "https://your-domain.okta.com/.well-known/jwks.json"
      # Client details
      DeviceClientId: "0ab2c3d4e5f6g7h8i9j0"
      PollingInterval: 5
      # Application settings
      Audience: "https://bacalhau.your-company.com/api"
      Scopes:
        - "openid"
        - "profile"
        - "email"</code></pre><p>For this to work properly, you need to:</p><ol><li><p>Register an OAuth 2.0 application in your identity provider (Okta, Auth0, Azure AD, etc.)</p></li><li><p>Configure it to support the device code flow, after confirming that the provider supports the OAuth 2.0 Device Code Flow at all.</p></li><li><p>Set up appropriate roles or groups in your identity provider to map to Bacalhau permissions</p></li></ol><p>The permission mapping would happen in your identity provider. 
For example, in Okta you might create:</p><ul><li><p>A "Bacalhau Admins" group with permissions: <code>["*"]</code></p></li><li><p>A "Bacalhau Readers" group with permissions: <code>["read:*"]</code></p></li><li><p>A "Bacalhau Job Managers" group with permissions: <code>["read:job", "write:job", "read:node"]</code></p></li></ul><p>These permissions should be included in the JWT under the custom claim <code>permissions</code> when users authenticate.</p><p>To authenticate using this setup, users would run:</p><p><code># Login</code></p><p><code>bacalhau auth sso login</code></p><p><code># Logout</code></p><p><code>bacalhau auth sso logout</code></p><p>The CLI would display something like:</p><p><code>To login, please:</code></p><p><code>1. Open this URL in your browser: https://your-domain.okta.com/activate</code></p><p><code>2. Enter this code: ABCD-EFGH</code></p><p><code>Or, open this URL in your browser: https://your-domain.okta.com/activate?user_code=ABCD-EFGH</code></p><p><code>Waiting for authentication with Okta SSO... (press Ctrl+C to cancel)</code></p><p>After completing authentication through their browser, the user receives a JWT that's automatically used for subsequent API calls. The token can be inspected with:</p><p><code># Inspect the JWT obtained when logging in using SSO</code></p><p><code>bacalhau auth sso token</code></p><h2><strong>Authentication Priority in Bacalhau 1.7</strong></h2><p>When configuring Bacalhau authentication, it's important to understand the precedence rules that determine which authentication method takes effect.</p><p>Environment variables take highest precedence in the authentication hierarchy, overriding any other configured methods. 
This means that if you have set <code>BACALHAU_API_USERNAME</code> and <code>BACALHAU_API_PASSWORD</code> for Basic Auth, or <code>BACALHAU_API_KEY</code> for API token authentication, these will be used regardless of any SSO tokens that may be stored locally from previous <code>bacalhau auth sso login</code> sessions.</p><p>This design provides flexibility for users who need to temporarily switch between different authentication contexts without modifying configuration files.</p><p>For example, a developer could have an SSO session for regular work but quickly switch to using an API key for testing by simply setting the appropriate environment variable. When the environment variable is unset, Bacalhau will fall back to the next available authentication method, typically returning to the previously established SSO session if available.</p><p>A command was added to inspect the current authentication status of the CLI:</p><p><code># Inspect current authentication status</code></p><p><code>bacalhau auth info</code></p><h2><strong>Granular Authorization in Bacalhau 1.7</strong></h2><p>Bacalhau 1.7 introduces a sophisticated authorization system built on a resource and capability model that brings fine-grained access control to the platform. 
This system divides API actions into specific combinations of resources and capabilities, enabling administrators to implement the principle of least privilege across their Bacalhau deployments.</p><h3><strong>Resource and Capability Framework</strong></h3><p>The permission structure is organized around two key dimensions:</p><ul><li><p><strong>Resources</strong>: The objects being accessed or modified (Nodes, Jobs, and Agents)</p></li><li><p><strong>Capabilities</strong>: The types of operations being performed (Read and Write)</p></li></ul><p>This creates a permission taxonomy following the pattern <code>action:resource</code> (for example, <code>read:job</code>), where the action is one of the capabilities above. Permissions can be assigned individually or using wildcards for broader access grants.</p><h3><strong>Core Permission Set</strong></h3><p>The complete set of permissions available in Bacalhau includes:</p><ol><li><p><code>"*"</code> - The master permission granting full access to all capabilities across all resources</p></li><li><p><code>"read:*"</code> - Provides read-only access across all resource types</p></li><li><p><code>"write:*"</code> - Grants write access to all resource types</p></li><li><p><code>"read:node"</code> - Allows viewing node information</p></li><li><p><code>"write:node"</code> - Permits write actions on nodes</p></li><li><p><code>"read:job"</code> - Enables querying job status, details, logs, and similar information</p></li><li><p><code>"write:job"</code> - Allows submitting, canceling, and managing job execution</p></li><li><p><code>"read:agent"</code> - Provides access to agent information via <code>bacalhau agent</code> commands</p></li><li><p><code>"write:agent"</code> - Any write actions on the agent. 
Currently no write actions are supported</p></li></ol><h3><strong>Creating Role-Based Access Patterns</strong></h3><p>These permissions can be combined to create practical access patterns for different user roles and service accounts:</p><ul><li><p><strong>Administrator</strong>: <code>["*"]</code> - Full access to all system functions</p></li><li><p><strong>Read-only Analyst</strong>: <code>["read:*"]</code> - Can view but not modify any resources</p></li><li><p><strong>Job Manager</strong>: <code>["read:job", "write:job", "read:node"]</code> - Complete control over jobs with visibility into nodes</p></li><li><p><strong>Monitoring Service</strong>: <code>["read:node", "read:job"]</code> - View-only access for system monitoring</p></li><li><p><strong>CI/CD Pipeline</strong>: <code>["write:job", "read:job"]</code> - Can submit and monitor jobs but can't access node details</p></li></ul><h2><strong>Benefits for Different User Profiles</strong></h2><p>These authentication enhancements offer distinct advantages for different types of Bacalhau users:</p><ul><li><p><strong>Individual developers</strong> benefit from the simplicity of Basic Auth for quick setup and experimentation</p></li><li><p><strong>DevOps teams</strong> can leverage API tokens for automation, CI/CD pipelines, and service-to-service communication</p></li><li><p><strong>Enterprise environments</strong> gain seamless integration with existing identity infrastructure through OAuth 2.0</p></li><li><p><strong>Security teams</strong> appreciate the granular permission model that enforces the principle of least privilege</p></li></ul><h2><strong>Practical Implementation</strong></h2><p>Implementing these new authentication mechanisms is straightforward. 
For Basic Auth and API Tokens, you simply update your orchestrator configuration to include user definitions with appropriate capabilities.</p><p>For OAuth 2.0, you configure the connection to your identity provider and ensure appropriate permissions are assigned to users.</p><h2><strong>Backward Compatibility with Previous Authentication Methods</strong></h2><p>Bacalhau 1.7 maintains backward compatibility with the previous authentication mechanism based on Open Policy Agent, ensuring a smooth transition path for existing deployments.</p><p>Users can continue to use their established OPA policies without immediate migration to the new authentication paths. However, it's important to note that while backward compatibility is preserved, mixing the old and new authentication methods within the same deployment is not supported.</p><p>Organizations must choose either to continue using the Open Policy Agent approach exclusively or to migrate fully to the new authentication system with Basic Auth, API Tokens, or OAuth 2.0.</p><p>This clean separation prevents potential security inconsistencies and configuration conflicts that could arise from overlapping authentication mechanisms.</p><p>For organizations planning to migrate, the Bacalhau team recommends first setting up the new authentication in a test environment, validating access patterns and permissions, and then performing a complete cutover rather than attempting a gradual or partial migration. This approach ensures security integrity throughout the transition while still providing flexibility in timing the upgrade to the enhanced authentication capabilities.</p><h2><strong>Looking Forward</strong></h2><p>Bacalhau Auth 2.0 represents a significant step forward in the platform's security and integration capabilities. 
By providing multiple authentication paths and a resource-based authorization model, Bacalhau has become more accessible while simultaneously enhancing its security posture.</p><p>Looking ahead, the Bacalhau development team is committed to further expanding these security capabilities. A key focus for upcoming releases will be the implementation of even more granular namespace-based authorization and authentication. This enhancement will allow organizations to create logical boundaries within their Bacalhau deployments, enabling multi-tenant environments where different teams or projects can operate independently with their own security contexts and resource limitations.</p><p>We understand that changes to authentication can be challenging, so we're here to help with your migration. Join our <a href="https://bit.ly/bacalhau-project-slack">community discussions</a> for support.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.bacalhau.org/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.bacalhau.org/subscribe?"><span>Subscribe now</span></a></p><h3>Get Involved!</h3><p>We welcome your involvement in Bacalhau. There are many <a href="https://docs.bacalhau.org/community/ways-to-contribute/">ways to contribute</a>, and we&#8217;d love to hear from you. 
Please reach out to us at any of the following locations.</p><ul><li><p><a href="https://www.expanso.io/">Expanso&#8217;s Website</a></p></li><li><p><a href="http://bacalhau.org/">Bacalhau&#8217;s Website</a></p></li><li><p><a href="https://bsky.app/profile/bacalhau.org">Bacalhau&#8217;s Bluesky</a></p></li><li><p><a href="http://twitter.com/bacalhauproject">Bacalhau&#8217;s Twitter</a></p></li><li><p><a href="https://twitter.com/ExpansoIO">Expanso&#8217;s Twitter</a></p></li><li><p><a href="https://www.tiktok.com/@expanso.io?_t=ZN-8uypYqUuKTW&amp;_r=1">TikTok</a></p></li><li><p><a href="https://www.youtube.com/@ExpansoIO">YouTube</a></p></li><li><p><a href="https://bit.ly/bacalhau-project-slack">Slack</a></p></li><li><p><a href="https://www.linkedin.com/company/expanso-io">LinkedIn</a></p></li><li><p><a href="https://expanso-inc.breezy.hr/">Careers Page</a></p></li></ul><h3>Commercial Support</h3><p>While Bacalhau is <a href="https://en.wikipedia.org/wiki/Open-source_software">open-source software</a>, the Bacalhau binaries go through the security, verification, and signing build process lovingly crafted by <a href="https://www.expanso.io/">Expanso</a>. You can read more about the difference between open-source Bacalhau and commercially supported Bacalhau <a href="https://www.expanso.io/faq/">in our FAQ</a>. 
If you would like to use our pre-built binaries and receive commercial support, please <a href="https://www.expanso.io/contact/">contact us</a> or <a href="https://cloud.expanso.io/login">get your license</a> on Expanso Cloud!</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cgzM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cgzM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 424w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 848w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1272w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cgzM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png" width="74" height="74" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/da36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:240,&quot;width&quot;:240,&quot;resizeWidth&quot;:74,&quot;bytes&quot;:6393,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!cgzM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 424w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 848w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1272w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://link.cod.dev/bacalhau-repo&quot;,&quot;text&quot;:&quot;&#11088;&#65039; GitHub&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" 
href="https://link.cod.dev/bacalhau-repo"><span>&#11088;&#65039; GitHub</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PTVY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PTVY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 424w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 848w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PTVY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png" width="72" height="72" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:72,&quot;bytes&quot;:87775,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!PTVY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 424w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 848w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://link.cod.dev/bacalhau-slack&quot;,&quot;text&quot;:&quot;Slack&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" 
href="https://link.cod.dev/bacalhau-slack"><span>Slack</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://www.expanso.io" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6NiL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 424w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 848w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1272w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6NiL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic" width="118" height="72.61538461538461" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/db6feb48-90c9-40be-8bf6-7a449ec5476c.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:896,&quot;width&quot;:1456,&quot;resizeWidth&quot;:118,&quot;bytes&quot;:68480,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://www.expanso.io&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!6NiL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 424w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 848w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1272w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.expanso.io&quot;,&quot;text&quot;:&quot;Website&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.expanso.io"><span>Website</span></a></p><div 
class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!z2xN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!z2xN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 424w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 848w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1272w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!z2xN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic" width="128" height="94.50549450549451" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ec2518b9-3542-4718-976c-c9e51a38b480.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1075,&quot;width&quot;:1456,&quot;resizeWidth&quot;:128,&quot;bytes&quot;:50073,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!z2xN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 424w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 848w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1272w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.bacalhau.org&quot;,&quot;text&quot;:&quot;Website&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" 
href="https://www.bacalhau.org"><span>Website</span></a></p>]]></content:encoded></item><item><title><![CDATA[Bacalhau v1.7.0 - Day 2: Scaling Your Compute Jobs with Bacalhau Partitioned Jobs]]></title><description><![CDATA[This post is part of the 5-days of Bacalhau 1.7 series.]]></description><link>https://blog.bacalhau.org/p/bacalhau-v170-day-2-scaling-your</link><guid isPermaLink="false">https://blog.bacalhau.org/p/bacalhau-v170-day-2-scaling-your</guid><dc:creator><![CDATA[David Aronchick]]></dc:creator><pubDate>Tue, 25 Mar 2025 17:01:40 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/0b7ec583-cb02-4702-8d4f-cbb00baba8c6_3939x2814.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>This post is part of the 5-days of Bacalhau 1.7.0 series. Be sure to go back and <a href="https://blog.bacalhau.org/p/announcing-bacalhau-17-empowering">catch any that you missed!</a></em></p><p>We read it every day: big data is everywhere and growing constantly. Companies are sitting on goldmines of data&#8211;both their own data and data they&#8217;ve retrieved from elsewhere. However, the amount of data is often too large for a single instance. In the mid-2010s, the IT community adopted parallel machines to accelerate workloads by increasing throughput. It was able to do this by leveraging innovations such as <a href="http://docker.io">Docker</a> and <a href="http://kubernetes.io">Kubernetes</a>, but the approach was <em>typically</em> limited to a single data center and compute-focused. 
To this day, the community needs more flexibility.</p><p>The most pressing challenges include:</p><ul><li><p>How do you split your data so that each node can work efficiently?</p></li><li><p>How do you tell each node exactly which part of the data it is responsible for?</p></li><li><p>What happens if one of the computing nodes fails?</p></li><li><p>What happens when part of your processing fails?</p></li><li><p>How do you ensure consistency and reliability?</p></li></ul><p>Fear no more! The new Partitioned Jobs feature in Bacalhau v1.7.0 addresses these challenges.</p><h2>What are Partitioned Jobs?</h2><p>When processing large datasets, splitting the work across multiple nodes can significantly improve performance and resource utilization. Bacalhau's new Partitioned Jobs feature makes this process straightforward by:</p><ul><li><p>Distributing work across a compute pool</p></li><li><p>Managing partition assignments and tracking</p></li><li><p>Handling failures at a partition level</p></li><li><p>Providing execution context to each partition</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jjKp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96841079-304a-4dff-a803-e2943e20b506_1600x631.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jjKp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96841079-304a-4dff-a803-e2943e20b506_1600x631.png 424w, https://substackcdn.com/image/fetch/$s_!jjKp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96841079-304a-4dff-a803-e2943e20b506_1600x631.png 848w, 
https://substackcdn.com/image/fetch/$s_!jjKp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96841079-304a-4dff-a803-e2943e20b506_1600x631.png 1272w, https://substackcdn.com/image/fetch/$s_!jjKp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96841079-304a-4dff-a803-e2943e20b506_1600x631.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jjKp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96841079-304a-4dff-a803-e2943e20b506_1600x631.png" width="1456" height="574" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/96841079-304a-4dff-a803-e2943e20b506_1600x631.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:574,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jjKp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96841079-304a-4dff-a803-e2943e20b506_1600x631.png 424w, https://substackcdn.com/image/fetch/$s_!jjKp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96841079-304a-4dff-a803-e2943e20b506_1600x631.png 848w, 
https://substackcdn.com/image/fetch/$s_!jjKp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96841079-304a-4dff-a803-e2943e20b506_1600x631.png 1272w, https://substackcdn.com/image/fetch/$s_!jjKp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96841079-304a-4dff-a803-e2943e20b506_1600x631.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h2>Core Features</h2><p>Let's look at the core features of Partitioned Jobs.</p><h3>1. 
Partition Management</h3><p>Bacalhau v1.7.0 handles the two key aspects of partition management: distribution and independent execution.</p><p><strong>Distribution</strong></p><p>When you request N partitions, Bacalhau creates them with indices 0 to N-1 and assigns them to available compute nodes that match the data and other specified constraints.</p><p>Bacalhau keeps partition assignments consistent throughout the job lifecycle and ensures that each partition runs to completion.</p><p><strong>Independent Execution</strong></p><p>Each partition runs independently, can be processed on a different node, and manages its own lifecycle and error handling.</p><h3>2. Error Handling and Recovery</h3><p>A key strength of the Partitioned Jobs feature is its approach to error handling: partition-level isolation contains failures within individual partitions, so the system can continue processing while the failed partitions recover.</p><p>For example, let&#8217;s say you have a job subdivided into five partitions. Running it in Bacalhau produces output like the following:</p><p><code>Job with 5 partitions:</code></p><p><code>Partition 0: &#10003; Completed</code></p><p><code>Partition 1: &#10003; Completed</code></p><p><code>Partition 2: &#10003; Completed</code></p><p><code>Partition 3: &#10007; Failed -&gt; Scheduled for retry</code></p><p><code>Partition 4: &#10003; Completed</code></p><p>In this case, Partition 3 failed, but the other four partitions were unaffected and the system schedules a retry automatically. A few minutes later, the job status is as follows:</p><p><code>Job with 5 partitions:</code></p><p><code>Partition 0: &#10003; Completed</code></p><p><code>Partition 1: &#10003; Completed</code></p><p><code>Partition 2: &#10003; Completed</code></p><p><code>Partition 3: &#10003; Completed</code></p><p><code>Partition 4: &#10003; Completed</code></p><p>Bacalhau automatically recovers and takes care of rescheduling, even over widely distributed systems. Neat!</p><h3>3. 
Execution Context</h3><p>Each partition's execution learns which part of the parallel work it is responsible for through environment variables. Every execution has access to the following:</p><p><code>BACALHAU_PARTITION_INDEX # Current partition (0 to N-1)<br>BACALHAU_PARTITION_COUNT # Total number of partitions</code></p><p><code># Additional context variables<br>BACALHAU_JOB_ID # Unique job identifier<br>BACALHAU_JOB_TYPE # Job type (Batch/Service)<br>BACALHAU_EXECUTION_ID # Unique execution identifier</code></p><p>Thus, each execution can decide for itself which data it needs from the pool and whether any special execution criteria apply. For instance, partition 2 of 5 might process only the input files whose position in a sorted listing leaves a remainder of 2 when divided by 5. There is no need to check in with a central catalog or with other partitions, which makes parallel execution faster and more reliable.</p><h2>Using Partitioning in Your Jobs</h2><p>To use partitioning, specify the number of partitions when running your job. Here is how you can specify three partitions:</p><p><code>bacalhau docker run --count 3 ubuntu -- sh -c 'echo Partition=$BACALHAU_PARTITION_INDEX'</code></p><p>In the command above, the <code>--count</code> flag sets the number of partitions, and setting it is what triggers partitioning.</p><p>Here is how to do the same via YAML:</p><p><code># partition.yaml<br>Name: Partitioned Job<br>Type: batch<br>Count: 3<br>Tasks:<br>  - Name: main<br>    Engine:<br>      Type: docker<br>      Params:<br>        Image: ubuntu<br>        Parameters:<br>          - sh<br>          - -c<br>          - echo Partition=$BACALHAU_PARTITION_INDEX</code></p><p>To run the YAML file, execute the following:</p><p><code>bacalhau job run partition.yaml</code></p><p>That&#8217;s it! 
Bacalhau automatically takes care of scheduling, partition assignment, and retries.</p><h2>Technical Benefits</h2><p>Bacalhau's partitioning feature offers significant technical advantages for processing large datasets and compute-intensive tasks:</p><ul><li><p><strong>Enhanced performance and scalability</strong>: By distributing work across multiple nodes, partitioning enables horizontal scaling and parallel processing. The job is no longer bound by the I/O and memory bandwidth of a single machine, which improves utilization and reduces processing time.</p></li><li><p><strong>Increased reliability and resilience</strong>: Partitioning provides granular failure recovery by isolating errors in individual partitions. If one partition fails, the system continues processing the other partitions without interruption. Bacalhau's built-in retry mechanism automatically reschedules failed partitions, and results from completed partitions are preserved, so nothing is reprocessed unnecessarily.</p></li><li><p><strong>No code rewrites</strong>: Your data processing jobs do not require a new SDK. If you're already using WASM or containers (or just about any other execution environment), we support it!</p></li></ul><h2>Getting Started with Partitioned Jobs</h2><p>If you&#8217;d like to try this example on your own, dive right in! <a href="https://docs.bacalhau.org/getting-started/installation">Install Bacalhau</a> and give it a shot.</p><p>By the way, if you don&#8217;t have a network and you would still like to try it out, we recommend using <a href="https://cloud.expanso.io/login">Expanso Cloud</a>. 
Also, if you'd like to set up a cluster on your own, <a href="https://docs.bacalhau.org/getting-started/network-setup">you can do that too</a> (we have setup guides for AWS, GCP, Azure, and many more &#128578;).</p><h2>What's Next?</h2><p>Start processing your data today:</p><ol><li><p>Identify your natural data groupings (dates, regions, categories)</p></li><li><p>Choose a partition strategy that matches those groupings</p></li><li><p>Let Bacalhau handle the distribution</p></li></ol><p>Ready to simplify your distributed data processing? Check out our <a href="https://docs.bacalhau.org/common-workflows/partitioning">documentation</a> for more examples and detailed guides.</p><p>Join our community to share your data processing stories and learn from others!</p><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.bacalhau.org/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.bacalhau.org/subscribe?"><span>Subscribe now</span></a></p><p></p><h3>Get Involved!</h3><p>We welcome your involvement in Bacalhau. There are many <a href="https://docs.bacalhau.org/community/ways-to-contribute/">ways to contribute</a>, and we&#8217;d love to hear from you. 
Please reach out to us through any of the following channels.</p><ul><li><p><a href="https://www.expanso.io/">Expanso&#8217;s Website</a></p></li><li><p><a href="http://bacalhau.org/">Bacalhau&#8217;s Website</a></p></li><li><p><a href="https://bsky.app/profile/bacalhau.org">Bacalhau&#8217;s Bluesky</a></p></li><li><p><a href="http://twitter.com/bacalhauproject">Bacalhau&#8217;s Twitter</a></p></li><li><p><a href="https://twitter.com/ExpansoIO">Expanso&#8217;s Twitter</a></p></li><li><p><a href="https://www.tiktok.com/@expanso.io?_t=ZN-8uypYqUuKTW&amp;_r=1">TikTok</a></p></li><li><p><a href="https://www.youtube.com/@ExpansoIO">YouTube</a></p></li><li><p><a href="https://bit.ly/bacalhau-project-slack">Slack</a></p></li><li><p><a href="https://www.linkedin.com/company/expanso-io">LinkedIn</a></p></li><li><p><a href="https://expanso-inc.breezy.hr/">Careers Page</a></p></li></ul><h3>Commercial Support</h3><p>While Bacalhau is <a href="https://en.wikipedia.org/wiki/Open-source_software">open-source software</a>, the Bacalhau binaries go through the security, verification, and signing build process lovingly crafted by <a href="https://www.expanso.io/">Expanso</a>. You can read more about the difference between open-source Bacalhau and commercially supported Bacalhau <a href="https://www.expanso.io/faq/">in our FAQ</a>. 
If you would like to use our pre-built binaries and receive commercial support, please <a href="https://www.expanso.io/contact/">contact us</a> or <a href="https://cloud.expanso.io/login">get your license</a> on Expanso Cloud!</p><p></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cgzM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cgzM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 424w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 848w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1272w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cgzM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png" width="74" height="74" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/da36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:240,&quot;width&quot;:240,&quot;resizeWidth&quot;:74,&quot;bytes&quot;:6393,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!cgzM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 424w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 848w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1272w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://link.cod.dev/bacalhau-repo&quot;,&quot;text&quot;:&quot;&#11088;&#65039; GitHub&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" 
href="https://link.cod.dev/bacalhau-repo"><span>&#11088;&#65039; GitHub</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PTVY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PTVY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 424w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 848w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PTVY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png" width="72" height="72" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:72,&quot;bytes&quot;:87775,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!PTVY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 424w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 848w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://link.cod.dev/bacalhau-slack&quot;,&quot;text&quot;:&quot;Slack&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" 
href="https://link.cod.dev/bacalhau-slack"><span>Slack</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://www.expanso.io" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6NiL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 424w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 848w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1272w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6NiL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic" width="118" height="72.61538461538461" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/db6feb48-90c9-40be-8bf6-7a449ec5476c.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:896,&quot;width&quot;:1456,&quot;resizeWidth&quot;:118,&quot;bytes&quot;:68480,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://www.expanso.io&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!6NiL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 424w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 848w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1272w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.expanso.io&quot;,&quot;text&quot;:&quot;Website&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.expanso.io"><span>Website</span></a></p><div 
class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!z2xN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!z2xN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 424w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 848w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1272w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!z2xN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic" width="128" height="94.50549450549451" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ec2518b9-3542-4718-976c-c9e51a38b480.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1075,&quot;width&quot;:1456,&quot;resizeWidth&quot;:128,&quot;bytes&quot;:50073,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!z2xN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 424w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 848w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1272w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.bacalhau.org&quot;,&quot;text&quot;:&quot;Website&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" 
href="https://www.bacalhau.org"><span>Website</span></a></p><p></p><p></p><p></p>]]></content:encoded></item><item><title><![CDATA[Announcing Bacalhau 1.7.0: Empowering Enterprises with Enhanced Scalability, Job Management, and Support]]></title><description><![CDATA[(5:35) Bacalhau v1.7.0 makes distributed computing easier with new licensing, partitioned jobs, and simplified authentication.]]></description><link>https://blog.bacalhau.org/p/announcing-bacalhau-17-empowering</link><guid isPermaLink="false">https://blog.bacalhau.org/p/announcing-bacalhau-17-empowering</guid><dc:creator><![CDATA[Sean M. Tracey]]></dc:creator><pubDate>Mon, 24 Mar 2025 17:10:17 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/5cf0bdce-0a6b-4cb4-8abd-18e2d0b95280_3939x2814.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>This post is part one of the 5-days of Bacalhau 1.7.0 series. Be sure to go back and catch any that you missed!</em></p><p>We&#8217;re thrilled to announce the release of Bacalhau v1.7.0, taking one more step in our continuing mission to bring robust, scalable distributed computing to developers and enterprises alike. We&#8217;ve packed this release with an array of features designed to streamline operations and enhance security, as well as introducing comprehensive support to enable teams and organizations to confidently deploy Bacalhau at scale.</p><p>With v1.7.0, we&#8217;re rolling out the following:</p><ul><li><p>Self-service licensing configuration</p></li><li><p>Enhanced job execution with partitioned workloads</p></li><li><p>Streamlined job management and templates in Expanso Cloud</p></li><li><p>Simplified authentication via username/password, API key, and single sign-on (SSO)</p></li></ul><h2>Self-service Licensing Configuration</h2><p>Bacalhau v1.7.0 introduces enterprise support options, offering flexible licensing models to meet the needs of any organization. 
Whether you prefer the fully managed orchestration of Expanso Cloud or on-premises support, we've got you covered:</p><ul><li><p><strong>Self-Service Licensing:</strong> Easily purchase support licenses through the Expanso Cloud portal, tailored to your node count.</p></li><li><p><strong>Tiered Support:</strong> Choose the support level that matches your organization's needs, ranging from startups to large enterprises.</p></li><li><p><strong>Simplified Deployment:</strong> Download your license from Expanso Cloud and deploy it to your orchestrator&#8212;no need for complex node-level licensing.</p></li></ul><h2>Enhanced Job Execution</h2><p>Process terabytes of data efficiently with our new partitioned execution system. Each node automatically understands its specific workload portion, enabling parallel processing that scales linearly with your infrastructure&#8212;all without writing complex coordination code:</p><ul><li><p><strong>Deterministic Partitioning:</strong> Distribute work intelligently across compute nodes using environment variables that indicate each execution's position.</p></li><li><p><strong>Efficient Data Processing:</strong> Process large datasets with coordinated distribution, ensuring each node understands its specific workload portion.</p></li><li><p><strong>S3 Native Partitioning: </strong>Amazon Web Services recommends <a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimizing-performance.html">native partitioning</a>, and we natively integrate with it, making it much easier to use!</p></li></ul><h2>Simplified Setup and Networking</h2><p>One thing that we really want to do with Bacalhau is remove (as much as possible) the challenges of setting up a Bacalhau network. We have made some big steps towards that here:</p><ul><li><p><strong>Bacalhau Docker-in-Docker:</strong> The Bacalhau and Expanso teams now support <a href="https://github.com/bacalhau-project/bacalhau/pkgs/container/bacalhau">a containerized version</a>. 
This makes it particularly easy to deploy with Docker tools (including <a href="https://docs.docker.com/compose/">Docker Compose</a>) and <a href="https://podman.io/">Podman</a>!</p></li><li><p><strong>Smarter Default Networking:</strong> One of the biggest challenges we have seen is people setting up and managing their networking. We have chosen some smarter defaults that should make it simpler to use (or lock down) exactly what you need. This includes native WASM networking as well!</p></li><li><p><strong>Host Environment Variables: </strong>While many jobs are treated as ephemeral, we do want to be able to pass through information about the underlying node so that jobs can act smartly. We have added that natively as well!</p></li></ul><h2>Streamlined Job Management and Templates in Expanso Cloud</h2><p><a href="https://cloud.expanso.io/">Expanso Cloud</a> now offers enhanced job submission capabilities through a web interface. The enhancements include:</p><ul><li><p><strong>Job Templates:</strong> Choose from predefined templates for common use cases like log processing, data analysis with DuckDB, and Apache Iceberg operations.</p></li><li><p><strong>Custom Workflows:</strong> Create and share custom job templates, simplifying complex distributed computing tasks.</p></li><li><p><strong>Customizable Placeholders:</strong> Reuse templates with different values to streamline common tasks.</p></li></ul><p>The enhanced Expanso Cloud platform dramatically reduces the learning curve for distributed computing. 
Data scientists and developers can now launch complex distributed workflows in minutes instead of days, using pre-configured templates for common use cases.</p><h2>Simplified Authentication and Single Sign-On</h2><p>Bacalhau v1.7.0 features a completely redesigned authentication system:</p><ul><li><p><strong>Standardized Approach:</strong> Adopts familiar, widely used authentication standards.</p></li><li><p><strong>Enterprise Choice: </strong>Whether using username/password, API keys, or single sign-on (SSO), Bacalhau supports the way <em>you</em> want to authenticate.</p></li><li><p><strong>Fine-Grained Permissions:</strong> Bacalhau supports an entire array of nouns and verbs that you can use to define granular access permissions.</p></li></ul><p>Our redesigned authentication system delivers enterprise-grade security without the complexity. SSO, API keys, and fine-grained permissions make it easy to integrate Bacalhau into your existing security infrastructure while maintaining strict access controls.</p><h2>Join Us on the Journey: 5 Days of Bacalhau</h2><p>Stay tuned for our "5 Days of Bacalhau" series, in which we'll delve deeper into these exciting new features:</p><ul><li><p><strong>Day 1:</strong> Announcing Bacalhau 1.7.0 (this post)</p></li><li><p><strong>Day 2:</strong> <a href="https://blog.bacalhau.org/p/bacalhau-v170-day-2-scaling-your?r=2xwcw0">Scaling Your Compute Jobs with Bacalhau Partitioned Jobs</a></p></li><li><p><strong>Day 3:</strong> <a href="https://blog.bacalhau.org/p/bacalhau-v170-day-3-streamlining?r=2xwcw0">Streamlining Security: Simplifying Bacalhau's Authentication Model</a></p></li><li><p><strong>Day 4:</strong> <a href="https://blog.bacalhau.org/p/bacalhau-v170-day-4-using-aws-s3?r=2xwcw0">Using AWS S3 Partitioning With Bacalhau</a></p></li><li><p><strong>Day 5:</strong> <a href="https://blog.bacalhau.org/p/distributed-data-warehouse-with-bacalhau?r=2xwcw0">Distributed Data Warehouse with Bacalhau and 
DuckDB</a></p></li></ul><p>Bacalhau is rooted in community-driven advancements. If you have ideas or features that you&#8217;d like to see&#8212;get involved! Check out Bacalhau v1.7.0 and dive into our community:</p><ul><li><p><strong><a href="https://github.com/bacalhau-project/bacalhau?utm_source=cli&amp;utm_medium=installer&amp;utm_term=repo">GitHub:</a></strong> Contribute to the project.</p></li><li><p><strong><a href="https://bacalhauproject.slack.com/">Slack:</a></strong> Join our community.</p></li><li><p><strong><a href="https://www.bacalhau.org/">Website:</a></strong> Learn more about Bacalhau.</p></li><li><p>Follow our social media.</p></li></ul><p>We're excited to learn how you leverage the power of Bacalhau v1.7.0 to transform your distributed computing workflows.</p><p><strong>Onwards together!</strong></p><p><strong>The Bacalhau Team</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.bacalhau.org/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.bacalhau.org/subscribe?"><span>Subscribe now</span></a></p><h2>Get Involved!</h2><p>Have a unique use case? We&#8217;d love to hear about it! Share your projects and ideas, and let&#8217;s build the future of distributed systems together.</p><p>There are many <a href="https://docs.bacalhau.org/community/community/ways-to-contribute">ways to contribute</a> and get in touch, and we&#8217;d love to hear from you! 
Please reach out to us at any of the following locations.</p><ul><li><p><a href="https://www.expanso.io/">Expanso&#8217;s Website</a></p></li><li><p><a href="http://bacalhau.org/">Bacalhau&#8217;s Website</a></p></li><li><p><a href="https://bsky.app/profile/bacalhau.org">Bacalhau&#8217;s Bluesky</a></p></li><li><p><a href="http://twitter.com/bacalhauproject">Bacalhau&#8217;s Twitter</a></p></li><li><p><a href="https://twitter.com/ExpansoIO">Expanso&#8217;s Twitter</a></p></li><li><p><a href="https://bit.ly/bacalhau-project-slack">Slack</a></p></li><li><p><a href="https://www.linkedin.com/company/expanso-io">LinkedIn</a></p></li><li><p><a href="https://expanso-inc.breezy.hr/">Careers Page</a></p></li></ul><h3>Commercial Support</h3><p>While Bacalhau is <a href="https://en.wikipedia.org/wiki/Open-source_software">open-source software</a>, the Bacalhau binaries go through the security, verification, and signing build process lovingly crafted by <a href="https://www.expanso.io/">Expanso</a>. You can read more about the difference between open-source Bacalhau and commercially supported Bacalhau <a href="https://www.expanso.io/faq/">in our FAQ</a>. 
If you would like to use our pre-built binaries and receive commercial support, please <a href="https://www.expanso.io/contact/">contact us</a>!</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cgzM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cgzM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 424w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 848w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1272w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cgzM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png" width="74" height="74" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/da36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:240,&quot;width&quot;:240,&quot;resizeWidth&quot;:74,&quot;bytes&quot;:6393,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!cgzM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 424w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 848w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1272w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://link.cod.dev/bacalhau-repo&quot;,&quot;text&quot;:&quot;&#11088;&#65039; GitHub&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" 
href="https://link.cod.dev/bacalhau-repo"><span>&#11088;&#65039; GitHub</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PTVY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PTVY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 424w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 848w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PTVY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png" width="72" height="72" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:72,&quot;bytes&quot;:87775,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!PTVY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 424w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 848w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://link.cod.dev/bacalhau-slack&quot;,&quot;text&quot;:&quot;Slack&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" 
href="https://link.cod.dev/bacalhau-slack"><span>Slack</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://www.expanso.io" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6NiL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 424w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 848w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1272w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6NiL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic" width="118" height="72.61538461538461" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/db6feb48-90c9-40be-8bf6-7a449ec5476c.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:896,&quot;width&quot;:1456,&quot;resizeWidth&quot;:118,&quot;bytes&quot;:68480,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://www.expanso.io&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!6NiL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 424w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 848w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1272w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.expanso.io&quot;,&quot;text&quot;:&quot;Website&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.expanso.io"><span>Website</span></a></p><div 
class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!z2xN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!z2xN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 424w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 848w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1272w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!z2xN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic" width="128" height="94.50549450549451" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ec2518b9-3542-4718-976c-c9e51a38b480.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1075,&quot;width&quot;:1456,&quot;resizeWidth&quot;:128,&quot;bytes&quot;:50073,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!z2xN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 424w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 848w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1272w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.bacalhau.org&quot;,&quot;text&quot;:&quot;Website&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" 
href="https://www.bacalhau.org"><span>Website</span></a></p>]]></content:encoded></item><item><title><![CDATA[The Modern Data Stack: A Scalable Future with Distributed Computing]]></title><description><![CDATA[(4:03) Navigating the Challenges of the Modern Data Stack]]></description><link>https://blog.bacalhau.org/p/the-modern-data-stack-a-scalable</link><guid isPermaLink="false">https://blog.bacalhau.org/p/the-modern-data-stack-a-scalable</guid><dc:creator><![CDATA[Mandy Moore]]></dc:creator><pubDate>Thu, 20 Mar 2025 17:01:49 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/2595c3a1-e845-4442-b413-29a054bf82d5_2626x1876.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Data is being generated at an incredible and growing rate, with projections estimating <a href="https://www.statista.com/statistics/871513/worldwide-data-created/">394 zettabytes</a> of global data generation by 2028. Companies across industries, whether in finance, cloud technology, energy, or healthcare, depend on modern data platforms to process, analyze, and extract insights from this ever-growing volume of information. However, as that volume surges, traditional data stack tools are increasingly strained, facing challenges in scalability, efficiency, and cost management.</p><p>This is where modern solutions like distributed computing and Compute Over Data (CoD) can change the game. By shifting computation closer to the data source, these approaches eliminate inefficiencies inherent in traditional pipelines. 
Bacalhau exemplifies this shift, providing a powerful tool that brings computation to where the data is&#8212;whether on edge devices, in the cloud, or on-premises&#8212;optimizing performance, reducing latency, and ensuring cost-efficient data processing.</p><h2><strong>What is the Modern Data Stack?</strong></h2><p>The modern data stack refers to the tools and technologies used for collecting, storing, processing, and analyzing data at scale. A robust modern data platform consists of the following layers:</p><ul><li><p><strong>Data Sources</strong> &#8211; Applications, IoT devices, log generators, and business systems generating data.</p></li><li><p><strong>ETL/ELT Pipelines</strong> &#8211; Transformation tools that structure and clean raw data before storage.</p></li><li><p><strong>Storage Solutions</strong> &#8211; Cloud data warehouses, data lakes, and databases that store structured and unstructured data.</p></li><li><p><strong>Transformation &amp; Analytics</strong> &#8211; Modern data tools for processing data into meaningful insights.</p></li><li><p><strong>Visualization &amp; Business Intelligence</strong> &#8211; Business intelligence tools that turn data into actionable insights.</p></li></ul><h2><strong>The Shortcomings of the Traditional Data Stack</strong></h2><p>Despite its success, the traditional data stack presents major challenges:</p><ul><li><p><strong>Fragmentation</strong> &#8211; Stitching together multiple modern tools increases complexity, often requiring businesses to invest heavily in system integration.</p></li><li><p><strong>High Costs</strong> &#8211; Cloud storage and query processing costs escalate as data volumes grow.</p></li><li><p><strong>Latency &amp; Bottlenecks</strong> &#8211; Real-time analytics and parallel processing are hindered by centralized architectures.</p></li><li><p><strong>Siloed Data</strong> &#8211; Legacy stacks prevent seamless data access across departments.</p></li><li><p><strong>Security 
Concerns</strong> &#8211; Centralized cloud-based tools can create security vulnerabilities.</p></li></ul><h2><strong>How Bacalhau Reinvents the Modern Data Stack</strong></h2><p>While many popular solutions rely on moving data through costly and time-consuming pipelines, Bacalhau addresses these challenges with Compute Over Data: it runs processing workloads directly where the data is located.</p><h3><strong>Key Benefits of Bacalhau</strong></h3><h4><strong>1. Faster, More Efficient Data Processing</strong></h4><p>Bacalhau reduces data-transfer latency by executing workloads at the data source, enabling near real-time insights.</p><h4><strong>2. Reduced Infrastructure and Cloud Costs</strong></h4><p>Bacalhau leverages existing compute resources instead of requiring expensive centralized cloud services, which can substantially reduce operational costs.</p><h4><strong>3. Enhanced Security &amp; Compliance</strong></h4><p>By processing data in place, Bacalhau reduces exposure to cyber threats and helps organizations comply with regulations like GDPR and HIPAA.</p><h4><strong>4. Seamless Integration Across Environments</strong></h4><p>Bacalhau runs across cloud data warehouses, on-premises solutions, and edge computing environments.</p><h2><strong>Transforming Business Intelligence with Compute Over Data</strong></h2><p>Business intelligence tools traditionally rely on pre-aggregated datasets and scheduled reports. 
With Bacalhau, organizations can unlock new capabilities by running compute directly where data resides, enabling:</p><ul><li><p>Real-time insights, by running analytics where the data is produced.</p></li><li><p>More informed decision-making through efficient processing for predictive analytics.</p></li><li><p>AI-powered insights by accelerating access to distributed data.</p></li><li><p>Improved data visualization through integration with leading visualization tools.</p></li></ul><h2><strong>Edge Computing: Bringing Processing Closer to the Data</strong></h2><p>Traditional cloud-based data warehouses process data centrally, leading to considerable delays and increased costs. Edge computing with Bacalhau addresses these issues by processing data at the source.</p><h3><strong>Key Advantages of Edge Computing with Bacalhau</strong></h3><ul><li><p><strong>Minimized latency</strong> &#8211; Faster insights by reducing data transfer.</p></li><li><p><strong>Lower bandwidth costs</strong> &#8211; Process only relevant data before sending it to the cloud.</p></li><li><p><strong>Enhanced security</strong> &#8211; Sensitive data stays closer to its origin.</p></li><li><p><strong>Faster decision-making</strong> &#8211; Supports AI-driven self-service analytics tools.</p></li></ul><h2><strong>Why Bacalhau is the Future of the Modern Data Stack</strong></h2><p>The modern data stack must evolve beyond cloud-based tools to meet the growing needs of business intelligence, AI, and real-time analytics. 
Bacalhau offers a scalable, cost-efficient, and flexible alternative to legacy systems.</p><p>By integrating Bacalhau, you gain:</p><ul><li><p>A unified platform for managing workloads across cloud, edge, and on-premises environments.</p></li><li><p>Faster access to near real-time insights through efficient data processing.</p></li><li><p>Lower IT costs with reduced cloud dependency.</p></li><li><p>Optimized security &amp; compliance for business users in the finance, healthcare, and energy industries.</p></li></ul><h2><strong>Get Started with Bacalhau Today</strong></h2><p>Ready to revolutionize your IT infrastructure? Download Bacalhau now or contact our team to explore how we can help optimize your modern data platform.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.bacalhau.org/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.bacalhau.org/subscribe?"><span>Subscribe now</span></a></p><h2>Get Involved!</h2><p>Expanso&#8217;s tools and templates make it easy to get started. Check out our <a href="https://github.com/expanso">public GitHub repository</a> for examples and guides, and join the conversation on social media with <strong>#ExpansoInAction</strong>.</p><p>Have a unique use case? We&#8217;d love to hear about it! Share your projects and ideas, and let&#8217;s build the future of distributed systems together.</p><p>There are many <a href="https://docs.bacalhau.org/community/ways-to-contribute/">ways to contribute</a> and get in touch, and we&#8217;d love to hear from you! 
Please reach out to us at any of the following locations.</p><ul><li><p><a href="https://www.expanso.io/">Website Expanso</a></p></li><li><p><a href="http://bacalhau.org/">Website Bacalhau</a></p></li><li><p><a href="https://bsky.app/profile/bacalhau.org">Bluesky Bacalhau</a></p></li><li><p><a href="http://twitter.com/bacalhauproject">Twitter Bacalhau</a></p></li><li><p><a href="https://twitter.com/ExpansoIO">Twitter Expanso</a></p></li><li><p><a href="https://bit.ly/bacalhau-project-slack">Slack</a></p></li><li><p><a href="https://www.linkedin.com/company/expanso-io">LinkedIn</a></p></li><li><p><a href="https://expanso-inc.breezy.hr/">Careers Page</a></p></li></ul><h3>Commercial Support</h3><p>While Bacalhau is <a href="https://en.wikipedia.org/wiki/Open-source_software">open-source software</a>, the Bacalhau binaries go through the security, verification, and signing build process lovingly crafted by <a href="https://www.expanso.io/">Expanso</a>. You can read more about the difference between open-source Bacalhau and commercially supported Bacalhau <a href="https://www.expanso.io/faq/">in our FAQ</a>. 
If you would like to use our pre-built binaries and receive commercial support, please <a href="https://www.expanso.io/contact/">contact us</a>!</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cgzM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cgzM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 424w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 848w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1272w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cgzM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png" width="74" height="74" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/da36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:240,&quot;width&quot;:240,&quot;resizeWidth&quot;:74,&quot;bytes&quot;:6393,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!cgzM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 424w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 848w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1272w, https://substackcdn.com/image/fetch/$s_!cgzM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda36d1cd-94e9-4500-bcf7-a370e04c0e31_240x240.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://link.cod.dev/bacalhau-repo&quot;,&quot;text&quot;:&quot;&#11088;&#65039; GitHub&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" 
href="https://link.cod.dev/bacalhau-repo"><span>&#11088;&#65039; GitHub</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PTVY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PTVY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 424w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 848w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PTVY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png" width="72" height="72" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:72,&quot;bytes&quot;:87775,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!PTVY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 424w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 848w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!PTVY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fd675d-f298-4b91-bb32-47211eee4a2f_2048x2048.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://link.cod.dev/bacalhau-slack&quot;,&quot;text&quot;:&quot;Slack&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" 
href="https://link.cod.dev/bacalhau-slack"><span>Slack</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://www.expanso.io" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6NiL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 424w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 848w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1272w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6NiL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic" width="118" height="72.61538461538461" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/db6feb48-90c9-40be-8bf6-7a449ec5476c.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:896,&quot;width&quot;:1456,&quot;resizeWidth&quot;:118,&quot;bytes&quot;:68480,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://www.expanso.io&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!6NiL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 424w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 848w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1272w, https://substackcdn.com/image/fetch/$s_!6NiL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb6feb48-90c9-40be-8bf6-7a449ec5476c.heic 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.expanso.io&quot;,&quot;text&quot;:&quot;Website&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.expanso.io"><span>Website</span></a></p><div 
class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!z2xN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!z2xN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 424w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 848w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1272w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!z2xN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic" width="128" height="94.50549450549451" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ec2518b9-3542-4718-976c-c9e51a38b480.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1075,&quot;width&quot;:1456,&quot;resizeWidth&quot;:128,&quot;bytes&quot;:50073,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!z2xN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 424w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 848w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1272w, https://substackcdn.com/image/fetch/$s_!z2xN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2518b9-3542-4718-976c-c9e51a38b480.heic 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.bacalhau.org&quot;,&quot;text&quot;:&quot;Website&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" 
href="https://www.bacalhau.org"><span>Website</span></a></p>]]></content:encoded></item></channel></rss>