Ask any business leader in any sector what the biggest opportunity is for their department or organization and they’ll tell you: better use of data. Whether that data is being used to empower decision makers on the frontlines or guide the most strategic decisions facing the business, the potential to improve products and services, boost customer satisfaction, and increase productivity is undeniable, as research from Harvard Business Review demonstrates.
The promise of data continues to grow, as machine learning and data science chart new possibilities for businesses, while increasing the data fluency of business users. However, for most organizations, these advances fail to truly change how the business makes decisions and operates. The gap between those with data science skills and those responsible for the business has been too wide - until now.
The Rise of the Lakehouse
But before we get there, let’s back up. How did we get to today, where organizations are drowning in their data? The last decade has seen an explosion in how much data we collect and store.
But it’s not just the volume of data that’s multiplying. It’s the type of data itself. While the structured data that organizations have grappled with for years continues to grow, these organizations have a new opportunity at their fingertips in the form of unstructured data. Historically, these different kinds of data have required different approaches to everything from storage to architecture, giving rise to both data warehouses and data lakes.
At the same time, the benefits of cloud computing have exploded. Companies are moving away from traditional on-premises environments to tap into the agility, scale, and flexibility the cloud offers - and bringing their data warehouses and data lakes with them. In the last six months, the pandemic has accelerated this shift tenfold as companies look for new ways to navigate unprecedented times.
However, even in the cloud, the problems with disparate data lakes and data warehouses persist. While data warehouses are great for analytics and reporting, they cannot handle a wide variety of file types and have limited support for streaming data. Data lakes, on the other hand, are great for data science and machine learning, but they perform poorly, lack proper BI and analytics support, and are complex to set up.
One solution to this? A new paradigm that brings reliability, quality, and performance to data lakes: the lakehouse.
The lakehouse gives businesses a single, unified experience for all their data use cases - data science, machine learning, BI and streaming analytics. Bringing together the best of data lakes and data warehouses creates a more complete and powerful data management architecture for a variety of applications, and it is especially valuable for data scientists looking to build machine learning and AI applications.
Bringing Data Science to the Masses with Search
Even as companies have recognized the potential of data science, and invested heavily to tap it, many organizations have failed to realize the full value. This comes down to two reasons. First, the models data professionals spend hours and countless resources building and training never make their way into production. Second, and perhaps more importantly, the results from these models are not used by the business to make better decisions. There are simply too many knowledge workers, and too few data professionals, to ever cross this chasm.
That’s why I’m so excited to announce the new partnership between Databricks and ThoughtSpot to bring search to the lakehouse for the first time, starting with ThoughtSpot Cloud for Delta Lake, leveraging Databricks’ new Databricks SQL. With our new partnership, our joint customers will not only be able to reap the benefits of the data lakehouse, but also democratize them by putting the power of unified data and analytics in the hands of their business users through search.
ThoughtSpot automatically analyzes the data inside of Delta Lake, the structured transactional layer that enables the lakehouse for Databricks, creating a unique search experience on top of that data. Users of all skill levels can then search Delta Lake using simple, natural language, which ThoughtSpot instantly translates to SQL queries to provide answers in seconds. All of this happens securely with a high level of governance, since ThoughtSpot is querying Delta Lake, the system of record, directly.
For example, a supply chain professional may want to know the impact different shutdown schedules may have on inventory costs. A data scientist can use both structured and unstructured data in Databricks’ Delta Lake to build and train a machine learning model. Then, using ThoughtSpot, the supply chain professional can search for the impact on various products or SKUs depending on different scenarios, leveraging insights and predictions from the model and underlying data, and get an answer in seconds.
Inevitably, the first answer will lead to a follow-up question. With ThoughtSpot, the supply chain professional can simply type a new search and have the answer at their fingertips instead of asking a data professional to build a new dashboard or report. This instantaneous interaction brings the work of the data scientist directly to the business with no restrictions or delays, helping business users make more informed decisions that drive more value.
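To make the idea of turning a search into SQL a little more concrete, here is a deliberately simplified sketch in Python. It is not ThoughtSpot's implementation, which this post does not describe; it only illustrates the general pattern of mapping search keywords to columns and generating an aggregate query. The keyword maps, column names, and the `inventory` table name are all hypothetical.

```python
from typing import Dict, List

# Hypothetical schema metadata: search keywords mapped to column names.
# A real system would derive this from the tables it has analyzed.
MEASURES: Dict[str, str] = {"cost": "inventory_cost", "units": "units_on_hand"}
DIMENSIONS: Dict[str, str] = {"sku": "sku", "product": "product_name", "plant": "plant_id"}


def search_to_sql(search: str, table: str) -> str:
    """Translate a keyword search such as 'cost by sku' into an aggregate SQL query."""
    tokens = search.lower().split()
    # Keep only the tokens that match a known measure or dimension; ignore the rest.
    measures: List[str] = [MEASURES[t] for t in tokens if t in MEASURES]
    dimensions: List[str] = [DIMENSIONS[t] for t in tokens if t in DIMENSIONS]
    if not measures:
        raise ValueError("search must mention at least one known measure")
    select_parts = dimensions + [f"SUM({m}) AS total_{m}" for m in measures]
    sql = f"SELECT {', '.join(select_parts)} FROM {table}"
    if dimensions:
        sql += f" GROUP BY {', '.join(dimensions)}"
    return sql


# First question, then a follow-up, each expressed as a new search.
print(search_to_sql("cost by sku", "inventory"))
print(search_to_sql("cost by plant", "inventory"))
```

In this toy version, a follow-up question is just another call with different keywords, which mirrors the point above: the business user changes the search, not the dashboard.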
Organizations must continue innovating in both how they leverage all the data at their fingertips and how they empower their businesses to act on this data if they want to thrive in the future. By bringing together the robust power of the data lakehouse with search, we’re helping organizations make that future possible today.
To learn more, reach out for a trial and see how you can leverage ThoughtSpot and Databricks today.