Introduction
Maintaining security within a distributed environment can feel like a losing battle. Every device, or node, in the system is a potential target for attacks, and communications between nodes are at risk of interception. Point at any part of the network and you will find some form of vulnerability. Given the ever-changing nature of cyber threats, the question isn't if a security breach will happen, but rather when it will occur.
What We’ll Cover in this Post
The current problems in fleet security.
How Bacalhau can help with those problems.
How to set up osquery on Bacalhau to scan for trojans.
How you can strengthen your fleet security with Bacalhau.
Challenges in Fleet Security
Visibility is Vital: Businesses need the ability to monitor their network continuously for any signs of suspicious activity or potential threats, enabling quick action to mitigate damage from attacks. While preventative measures like firewalls, antivirus software, and access controls can reduce the risk of attack, no amount of safeguarding can fully protect your infrastructure from compromise. In fact, according to IBM, 95% of security breaches result from human error. So even if you’re protected against the most coordinated and sophisticated cyber attacks, there's always a risk that an employee might accidentally or intentionally introduce vulnerabilities into the system.
Breaches Cost Money and Reputation: Data breaches are on the rise, and according to IBM’s latest security report, the global average cost of a data breach in 2023 was $4.45 million USD. 82% of these breaches occurred in hybrid cloud or fleet environments, and only a third were detected by the companies’ internal security teams. Businesses risk significant financial and reputational costs without better security investments.
Current Solutions
There are many ways to secure your fleet’s data, but some of the most effective defensive measures are set out below:
Staff Training - Phishing scams and compromised credentials are a leading cause of security breaches. Employee policies, access controls and regular training can be a good first line of defense, but there is a trade-off between complexity and usability. If security policies and protocols are too complicated, employees might try to bypass them out of frustration or complacency.
DevSecOps Approach - Software, apps, and infrastructure should embed security into every stage of their development cycle, adhering to the 'secure by design, secure by default' principles. However, this approach can be challenging for businesses using legacy systems or third-party integrations, where updating or fully controlling the security posture can be complex.
Artificial Intelligence - AI can significantly enhance the speed, accuracy, and efficiency of cybersecurity through improved threat detection, pattern recognition, and predictive threat forecasting. However, businesses should be aware that integrating AI into their security systems comes with its own set of challenges. These include managing false positives and negatives, ensuring continuous data input and learning to remain effective, navigating the complexity and costs of implementation, and keeping pace with the evolving landscape of AI-driven cyber threats.
Attack Surface Management - Businesses need to consistently evaluate their attack surface, pinpointing vulnerabilities in their digital infrastructure. This is particularly crucial in fleet environments, which require close examination of all network endpoints, servers, applications, and data storage.
Advantages of Using Bacalhau
Full Visibility: All of the above measures will reduce the risk of a data breach, but businesses also need a way to check that they’re working. Bacalhau offers real-time visibility across distributed systems, regardless of size or complexity. This allows engineers to monitor the entire infrastructure and respond promptly to breaches. As security becomes increasingly automated and AI-driven, it is essential to continuously track how effective these measures are. Bacalhau makes this easy by connecting all of your devices into one unified fleet, making continuous oversight manageable.
Secure by Design: Stationary data is safer. The moment data leaves secure storage and travels across networks, it is at its most vulnerable. Security administrators have no control over data in motion (DiM), making it a prime target for theft. Encryption and stringent data-sharing protocols certainly make things harder for hackers, but won’t pose much of a problem for a malicious insider.
This is why Bacalhau takes a different approach to the way we move data. Instead of moving the data, we orchestrate jobs to run where the data already exists. Our platform thereby provides a layer of security by virtue of its design – as the only moving parts are job requests and outputs. Fleet administrators can control the flow of information by choosing what the outputs are allowed to contain, ensuring that sensitive data remains secure and exposure is minimized.
Bacalhau and Osquery: Locate Threats Fast
We will now showcase how Bacalhau brings simplicity, visibility and security to a distributed fleet. In this example, we’ll track down a trojan in real time, identify the infected hosts, and quarantine them. Procedures like this can be automated, but malware often moves much faster than other types of threats, and can therefore require manual intervention. We will show you just how quick and easy this is with Bacalhau.
Osquery is a monitoring and analytics tool designed to query low-level operating system information using an SQL-like language. When used with Bacalhau, it allows you to ask fleet devices about their activities and returns their answers in a readable table. You don’t have to rummage through terabytes of accumulated logs; you can simply fire off an SQL query and find the exact information you’re looking for.
Step 1 - Prerequisites
You will need to install the Bacalhau agent on all the devices in your fleet, and you can do this by following the instructions on setting up a private cluster. The following hardware is required:
Compute Node(s): These nodes run the code on each device
Recommended Requirements:
Number of Instances: 1 per fleet device
Disk: 32GB
CPU: 1-4vCPU
Memory: 4-8GB
Control Plane Node: This node is responsible for allocating the work to the appropriate nodes.
Recommended Requirements:
Number of Instances: 1
Disk: 32-500GB
CPU: 1-8vCPU
Memory: 4-16GB
You will need to install the control plane first as it will generate environment variables that are required in setting up the compute nodes.
Step 2 - Setup
To set this up on each of our nodes, we need to install Osquery, and start the daemon as a service:
wget https://pkg.osquery.io/deb/osquery_5.8.2-1.linux_amd64.deb
sudo dpkg -i osquery_5.8.2-1.linux_amd64.deb
sudo systemctl daemon-reload
sudo systemctl start osqueryd
Then add --allow-listed-local-paths=/var/osquery/osquery.em to your bacalhau serve command line, to allow osquery jobs run by Bacalhau to access the socket provided by the osquery daemon.
With that done, Bacalhau can be asked to run Osquery across the entire cluster using the --target=all flag to bacalhau docker run, using the official Osquery Docker image. If only a subset of the nodes is of interest, we can also use the -s selector flag to restrict the query to those nodes:
bacalhau --api-host $API_HOST docker run --target=all \
-i src=file:///var/osquery/osquery.em,dst=/var/osquery/osquery.em \
osquery/osquery -- osqueryi --connect=/var/osquery/osquery.em \
"SELECT * FROM processes"
…where $API_HOST refers to your Bacalhau API host’s address.
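If you want to submit the same job from a script rather than typing it by hand, the command line above can be assembled programmatically. The following is a minimal sketch, not part of Bacalhau itself: `build_osquery_job` is a hypothetical helper name, and the host name in the usage example is a placeholder. It uses only the flags already shown in this post.

```python
import shlex

# Socket path used throughout this post.
OSQUERY_SOCKET = "/var/osquery/osquery.em"


def build_osquery_job(api_host, sql, selector=None):
    """Return the argv list for the `bacalhau docker run` call shown above.

    `selector` is optional and is passed through to the -s flag unchanged.
    """
    argv = ["bacalhau", "--api-host", api_host, "docker", "run", "--target=all"]
    if selector:
        # Restrict the job to a subset of nodes.
        argv += ["-s", selector]
    argv += [
        "-i", f"src=file://{OSQUERY_SOCKET},dst={OSQUERY_SOCKET}",
        "osquery/osquery",
        "--",
        "osqueryi", f"--connect={OSQUERY_SOCKET}",
        sql,
    ]
    return argv


if __name__ == "__main__":
    # "bacalhau.example.internal" is a placeholder API host.
    cmd = build_osquery_job("bacalhau.example.internal", "SELECT * FROM processes")
    # Print a copy-pasteable command; pass `cmd` to subprocess.run to submit it.
    print(shlex.join(cmd))
```

Building the command as a list avoids shell-quoting problems when the SQL contains quotes or spans several lines.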
The query sends its results to standard output, and Bacalhau gathers them up and sends them to your chosen publisher (by default, IPFS). There are various options, but as we expand this example, we will use Amazon S3. If we configure the S3 publisher to create a file for each node in a chosen location, we can use the Python script in the fleet-management folder in the Bacalhau examples GitHub repo. This script automates running osquery through Bacalhau, gathering results from each node, and compiling them into a single JSON file, making it easy to query your cluster.
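To give a feel for the compiling step, here is a minimal sketch of merging per-node output into one list. It is not the script from the examples repo: `merge_node_results` is a hypothetical name, and it assumes each node's output is a JSON array of rows (which osqueryi can emit with its `--json` flag). The host names in the sample are placeholders.

```python
import json


def merge_node_results(raw_by_node):
    """Combine per-node osquery JSON output into one list of rows.

    raw_by_node maps a node ID to the JSON text that node produced.
    Each returned row is tagged with the node it came from.
    """
    merged = []
    for node_id, raw in sorted(raw_by_node.items()):
        try:
            rows = json.loads(raw)
        except json.JSONDecodeError:
            # Output that is not valid JSON is skipped in this sketch;
            # a real script would probably report the failing node.
            continue
        if not isinstance(rows, list):
            continue
        for row in rows:
            merged.append({"nodeID": node_id, **row})
    return merged


if __name__ == "__main__":
    sample = {
        "QmPaJC7b": '[{"hostname": "vm-a.example", "path": "/tmp/x"}]',
        "QmbkfnnF": "[]",
    }
    print(json.dumps(merge_node_results(sample), indent=2))
```

Tagging every row with its node ID keeps the merged file useful even when several nodes report the same hostname.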
Step 3 - Checking Policy Compliance
First, let’s verify adherence to our security policy, which states that all SSH keys must be encrypted and use elliptic curves:
poetry run query-cluster -p \
"SELECT system_info.hostname, uid, username, path,
encrypted, key_type
FROM system_info, users CROSS JOIN user_ssh_keys USING (uid)
WHERE encrypted=0 OR key_type<>'ec'"
This will return any keys that don’t meet the policy. It provides details like the hostname, UID, username, and the key file path, along with the encryption and key type, so you can pinpoint the compliance issues. Thankfully, it looks like everyone's following the security team's guidelines – we've got an empty JSON array, meaning no non-compliant keys:
[]
Taking a look through the osquery schema, it’s quickly apparent that a vast number of such checks can be easily implemented.
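One way to organize a growing set of checks is as a table of named queries, each written so that an empty result means the policy is met. The sketch below is illustrative only: `check_policies` and `run_query` are hypothetical names, `run_query` stands in for whatever submits SQL to the cluster and returns the merged rows, and the second policy is an example added here, not one from this post.

```python
# Each policy is a query that should return no rows when the fleet complies.
POLICIES = {
    # The SSH key policy from this post.
    "ssh-keys-encrypted-ec": (
        "SELECT system_info.hostname, uid, username, path, encrypted, key_type "
        "FROM system_info, users CROSS JOIN user_ssh_keys USING (uid) "
        "WHERE encrypted=0 OR key_type<>'ec'"
    ),
    # Illustrative extra check: no account other than root has UID 0.
    "only-root-has-uid-0": (
        "SELECT system_info.hostname, username FROM system_info, users "
        "WHERE uid=0 AND username<>'root'"
    ),
}


def check_policies(run_query, policies=POLICIES):
    """Run each policy query; return {policy name: violating rows}."""
    violations = {}
    for name, sql in policies.items():
        rows = run_query(sql)
        if rows:
            violations[name] = rows
    return violations


if __name__ == "__main__":
    # Stub standing in for a real cluster query: every policy passes.
    print(check_policies(lambda sql: []))
```

Keeping the queries in data rather than code means adding a new check is a one-line change.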
Step 4 - Looking for a Trojan
Where Bacalhau stands out from traditional security policy management and monitoring tools is in how easily it handles ad-hoc queries. When a new threat, like a worm, emerges, security administrators need to quickly find out if any servers are compromised. Let’s imagine that yet another WordPress vulnerability has been found and is being exploited by a worm that installs itself in wp-content/themes and acts as a trojan horse. As soon as we obtain the hash of the worm file from the news, we can query our cluster to check for its presence:
poetry run query-cluster -p \
"SELECT system_info.hostname, path, sha256
FROM system_info, hash
WHERE
path='/var/www/htdocs/wp-content/themes/goldenrod/functions.php' AND
sha256='ef59adcbcf15a19cd74…'"
And we find some matches!
[
{
"nodeID": "QmPaJC7b",
"hostname": "warehouse-example-us-central1-a-vm.us-central1-a.c.ddw-usecase.internal",
"path": "/var/www/htdocs/wp-content/themes/goldenrod/functions.php",
"sha256": "ef59adcbcf15a19cd742cfc3f3d45da135684eab56e042726a4fd4e4b77e94a6"
},
{
"nodeID": "QmbkfnnF",
"hostname": "warehouse-example-us-central1-a-vm.us-central1-a.c.ddw-usecase.internal",
"path": "/var/www/htdocs/wp-content/themes/goldenrod/functions.php",
"sha256": "ef59adcbcf15a19cd742cfc3f3d45da135684eab56e042726a4fd4e4b77e94a6"
},
…
With that in hand, we can quickly take down the infected nodes for forensic analysis. Meanwhile, we can deploy mitigations and await the release of the updated WordPress security patch.
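Before quarantining anything, it helps to reduce the raw matches to a list of distinct nodes. The following sketch assumes rows shaped like the JSON above; `infected_nodes` is a hypothetical helper name. It groups by node ID as well as hostname because, as in the sample output, two nodes can report the same hostname.

```python
from collections import defaultdict


def infected_nodes(matches):
    """Group matching file paths by (nodeID, hostname)."""
    by_node = defaultdict(set)
    for row in matches:
        by_node[(row["nodeID"], row["hostname"])].add(row["path"])
    # Sort for stable, readable output.
    return {node: sorted(paths) for node, paths in sorted(by_node.items())}


if __name__ == "__main__":
    host = "warehouse-example-us-central1-a-vm.us-central1-a.c.ddw-usecase.internal"
    path = "/var/www/htdocs/wp-content/themes/goldenrod/functions.php"
    matches = [
        {"nodeID": "QmPaJC7b", "hostname": host, "path": path},
        {"nodeID": "QmbkfnnF", "hostname": host, "path": path},
    ]
    for (node_id, hostname), paths in infected_nodes(matches).items():
        print(node_id, hostname, paths)
```

The resulting node IDs can then be handed to whatever tooling you use to isolate machines.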
Conclusion
This is just one example of how you can use Bacalhau for fleet security and why it’s a powerful weapon to add to your cybersecurity arsenal. Distributed systems require distributed security solutions, and Bacalhau provides a platform with security baked into its core. There are countless security suites and software out there, but many of them are exceedingly complicated to set up, lack full visibility and use black-box proprietary tools. Bacalhau, on the other hand, is simple to install, provides real-time visibility and is open-source, meaning you know exactly what you’re getting.
With Bacalhau, setting up a robust distributed fleet management system is pretty much a walk in the park. However, we're just scratching the surface here. The framework's adaptability and resilience make it a must-have tool for any enterprise aiming to keep their distributed fleet in check.
By decentralizing the fleet management and enabling on-device processing, Bacalhau not only vastly reduces operational costs, but also ensures real-time data processing and compliance with archival needs. Its seamless integration across various cloud platforms, including AWS, Azure, and Google Cloud, reflects our commitment to versatile, cross-platform solutions.
How to Get Involved
We're looking for help in several areas. If you're interested in helping, there are several ways to contribute. Please reach out to us at any of the following locations.
Commercial Support
While Bacalhau is open-source software, the Bacalhau binaries go through the security, verification, and signing build process lovingly crafted by Expanso. You can read more about the difference between open source Bacalhau and commercially supported Bacalhau in our FAQ. If you would like to use our pre-built binaries and receive commercial support, please contact us!