The Expanso crew has been out in the world again! This time, we attended the Big Data LDN conference in the beautiful borough of Kensington, London. We met loads of fascinating people and picked up valuable insights. We thought we’d take some time on the blog today to share what we got up to, the key things we learned, and what you can expect if you ever decide to check it out for yourself!
What is Big Data LDN all About?
Big Data LDN is, as you'd expect, a major event in London focused entirely on data. This free annual gathering brings together professionals from the data science, analytics, and MLOps communities, providing a space for experts to share best practices, network, and build relationships.
Since Bacalhau is all about Computing Over Data, it was a natural fit for Expanso to set up a booth and share everything we know about distributed data processing.
Expanso’s CEO, David Aronchick, has a knack for distilling the challenges people often face when working with data into three key categories:
There’s never going to be any less data than there is now. You need to build systems that can handle data as it continues to grow.
You’ll never beat the speed of light. You can increase bandwidth in a system, but that only goes so far in addressing latency. To truly scale up processing, you need to bring your compute closer to the data. This minimizes delays and allows for faster, more efficient data processing.
Society is never going to be less interested in protecting people’s data.
Privacy laws like GDPR and CCPA shape how we handle consumer data. They make security a necessity, especially when data crosses borders. These laws won’t get less stringent over time. So, we must secure the data and stay compliant without cutting corners or weakening the rules.
And if we learned anything from our conversations at Big Data LDN, it’s that so many people are looking for innovative, robust, and scalable solutions to help them process an ever-increasing volume of data - without breaking the bank or bending the rules.
Who Was There?
Big Data LDN is, as the name suggests, a BIG event. There were thousands of attendees and hundreds of exhibitors and speakers covering every aspect of data generation, storage, and analysis.
Over the course of two days, we spoke with countless people looking to solve the problems they’re having processing data at scale.
In amongst all those conversations, we also had the opportunity to attend some of the talks and workshops led by some of the brightest stars in the data science space. Some of those we enjoyed the most were:
The Headline Keynote Panel:
How often do you get to see three Gold Medal Olympians chatting about how they use data to achieve success at the highest levels of sport?
Meetup: Data Engineer Things (DET) London
It’s always fantastic when local communities are given the space to share knowledge and build relationships, and seeing the DET community come out in force to do just that was nothing short of enthralling.
Women in Data Lounge
We loved seeing the Big Data London and Women in Data groups come together and hearing how WiD plays a crucial role in advocating for more accurate representation wherever data is being collected and analyzed.
Hallo Healthcare + Actian
A stand-out talk on how Hallo has leveraged cloud analytics not only to streamline their huge data flows but also to significantly enhance patient service delivery.
Thoughtworks
We had the great fortune of finding some time to speak with Danilo Sato and Amy Raygada of Thoughtworks. Their insights and ideas have already proven to be invaluable. Seek them out!
We weren’t alone at Big Data LDN—we had the great fortune of catching up with Guy Fighel, Head of the Data Program at Hetz Ventures (the VC fund that supports Expanso), as well as our good friends at Upriver, who are also a Hetz company!
The Joys of Being Big-Headed
We noticed something interesting in one of our meetings recently: Everyone at Expanso has a slightly larger head than the average person in their local region. Also, our CEO, David, has the largest head of all of us! Is it a coincidence that he’s our leader? Do we subconsciously hire only people who have heads that are slightly larger than average? We have so many questions!
And how do you answer questions? With data, of course! And where better to get data than at a data science conference?
That's right! For everyone we spoke to, we asked if they’d be willing to let us measure their heads and compare the sizes with others at Big Data LDN. We even had some branded measuring tapes made just for the occasion - and yes, we built a dashboard to track the results!
To our surprise, people were not just willing, but curious! Here are some of the key insights we learned:
The largest head we measured was 61.00 cm in circumference, whereas the smallest head was 54.0 cm.
The average head size of someone attending Big Data London was 57.05 cm, which is slightly larger than the UK average of 56.20 cm.
The average head size of an Expanso employee is 57.20 cm.
What has this taught us? Absolutely nothing, it was just a bit of fun to have with people - however, it did give us a good opportunity to show off Bacalhau!
The Distributed Dashboard
We knew that we’d be collecting some data at Big Data LDN, so we thought, instead of storing all that information in one place, what if we were to distribute the data across multiple locations with different databases and query them all simultaneously to show off the real power of Bacalhau?
And that’s what we built!
We installed Bacalhau on several systems worldwide, setting up a MongoDB, Valkey, or Postgres database at each location. This enabled Bacalhau jobs to store and retrieve information across various regions.
Whenever we measured someone’s head, we would add it to the system using a dashboard that we had built. It would:
Accept the measurement
Generate a Bacalhau Batch job with the value
Send it off to the network for storage using the Bacalhau API.
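The three steps above can be sketched in Python. This is a minimal illustration, not the dashboard's actual code: the field names follow the general shape of a Bacalhau job specification as we understand it and should be checked against the current Bacalhau documentation, while the container image, the `HEAD_CM` environment variable, and the job and task names are all hypothetical. The sketch stops at building the job description and does not call the API.

```python
import json


def build_measurement_job(circumference_cm: float) -> dict:
    """Build a batch job description that carries one head measurement.

    The layout (Name/Type/Count/Tasks/Engine) is an assumption about the
    Bacalhau job spec; verify it against the official docs before use.
    """
    if circumference_cm <= 0:
        raise ValueError("circumference must be positive")
    return {
        "Name": "store-head-measurement",  # hypothetical job name
        "Type": "batch",
        "Count": 1,
        "Tasks": [
            {
                "Name": "store",  # hypothetical task name
                "Engine": {
                    "Type": "docker",
                    "Params": {
                        # Hypothetical image that writes the value to
                        # whichever datastore is local to the node.
                        "Image": "example/store-measurement:latest",
                        "EnvironmentVariables": [
                            f"HEAD_CM={circumference_cm:.2f}"
                        ],
                    },
                },
            }
        ],
    }


job = build_measurement_job(57.05)
print(json.dumps(job, indent=2))
```

The sketch sets no placement constraints on the job, so where it runs is left entirely to the scheduler.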
When the job was dispatched, we didn’t specify where the data should be stored or which datastore to use. Instead, we let Bacalhau’s scheduler handle that for us. As a result, the measurements were distributed evenly across our Bacalhau instances, ensuring balanced storage across all locations.
To retrieve all the measurements stored across our network, the dashboard created an Ops job that executed simultaneously across all nodes. This job queried the local datastore of each Bacalhau node it ran on, collecting all the results. Once the Ops job had completed on every node, the dashboard amalgamated the results, ranked them from the largest to the smallest measurement, and displayed them on screen for easy viewing.
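The amalgamation step is easy to picture as a merge followed by a sort. The sketch below assumes each node returns a plain list of measurements in centimetres; the node names and most of the values are made up for illustration (only 61.0 and 54.0 echo the extremes mentioned earlier), and `merge_and_rank` is a hypothetical helper rather than anything from the real dashboard.

```python
from typing import Iterable


def merge_and_rank(
    per_node_results: dict[str, Iterable[float]],
) -> list[tuple[float, str]]:
    """Flatten per-node results and rank them from largest to smallest.

    Each returned pair is (measurement_cm, node_name), so the dashboard
    could still show where a value was stored.
    """
    merged = [
        (value, node)
        for node, values in per_node_results.items()
        for value in values
    ]
    return sorted(merged, key=lambda pair: pair[0], reverse=True)


# Illustrative results, one entry per node that ran the query.
results = {
    "node-mongodb": [57.1, 61.0],
    "node-valkey": [54.0],
    "node-postgres": [58.3, 56.2],
}

ranked = merge_and_rank(results)
for value, node in ranked:
    print(f"{value:.1f} cm  ({node})")
```

Keeping the node name alongside each value costs nothing and makes it obvious on screen that the ranking was assembled from several independent datastores.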
The Expanso SideBar
As is rapidly becoming tradition for Expanso, we love creating opportunities to connect on a deeper level than what’s possible on the conference floor - so we hosted another of our SideBar events!
And what better way to unwind after two busy days of conferencing than a couple of pints at a classic London pub? Clearly, we aren’t the only people who hold that sentiment to be true, as we were joined by 40 people from across the technology and data science community to chat, learn, and raise a pint to a few days well spent.
If you’d like to spend time getting to know more about Expanso and Bacalhau, keep an eye out for a SideBar event near you - there’ll be plenty more to come soon.
Where Will We be Next?
Over the next few weeks, we’ll be out in the world at a bunch of other events. If you’re at these events and fancy having a chat, or have any burning questions about Bacalhau, let us know!
TechCrunch Disrupt 2024
28th - 30th October 2024, San Francisco, USA
Details
We’ve been chosen to exhibit at TechCrunch Disrupt 2024 as part of Startup Battlefield 200, the world’s preeminent startup competition. We’re one of the 200 startups selected from a review of thousands of applicants to pitch in front of investors and TechCrunch editors.
Simultaneously, we’ll be exhibiting at the Open Data Science Conference just across the city from TechCrunch Disrupt.
ODSC West 2024
29th - 31st October 2024, San Francisco, USA
Details
We’re working with some of our trusted partners to bring along some great AI/ML and Robotics demos. So, if you see a robotic dog wandering around with a Bacalhau logo on its back, you’ll know we’re close by!
We can’t wait to show off what Bacalhau can do!
Get Involved!
We welcome your involvement in Bacalhau. There are many ways to contribute, and we’d love to hear from you. Please reach out to us at any of the following locations.
Commercial Support
While Bacalhau is open-source software, the Bacalhau binaries go through the security, verification, and signing build process lovingly crafted by Expanso. You can read more about the difference between open-source Bacalhau and commercially supported Bacalhau in our FAQ. If you would like to use our pre-built binaries and receive commercial support, please contact us!