Efficient build processes are central to developer productivity and code quality. At ThoughtSpot, Gradle has served us well, but the growing complexity of our projects demanded a more systematic way to understand and optimize our builds.
This requirement prompted us to explore Build Analytics: using data from our build processes to gain actionable insights. By analyzing aspects such as compilation duration, resource utilization, and dependency management, we set out to streamline our development workflow and boost efficiency.
To accomplish this, we developed a custom Build Analytics solution on our own ThoughtSpot Analytics platform, which let us apply its analytics capabilities to our build ecosystem.
This article presents the challenges associated with Build Analytics and the measures we adopted to enhance the efficiency of build processes at ThoughtSpot. It aims to explain how we transformed our development practices with a data-centric approach and offers recommendations to help your teams address similar challenges in your software development lifecycle.
Enhancing build processes with ThoughtSpot Analytics
While our previous analytics solution provided build scans and basic insights, its limitations kept us from analyzing our build processes in depth. Two primary challenges stood out:
linking code owners to specific build issues
tracking how often a job failed at a particular Git commit
These limitations impacted our development workflow, making it harder to pinpoint and resolve build-related problems quickly.
As projects scaled, we realized the need for a tailored approach to build analytics and a solution that could provide granular insights, correlate build issues with specific code changes, and offer a comprehensive view of our build ecosystem. This realization led us to explore alternatives and develop a custom analytics pipeline integrated with the ThoughtSpot application development process.
Our solution: custom Build Analytics
To address the challenges and limitations in our build management process, we developed a multi-step Build Analytics solution using ThoughtSpot as the central platform. This pipeline is designed to capture detailed data, process it efficiently, and provide actionable insights through ThoughtSpot’s powerful analytics features.
Here's how we implemented this solution:
Step 1: Modifying Gradle configuration
We began by enhancing our build.gradle file to enable comprehensive logging. We introduced custom tasks that capture detailed information for each build, such as timestamps, environment variables, compilation statistics, and relevant metadata. Standardizing the log output kept the data consistent across build configurations and environments.
Step 2: Processing logs
Next, we developed a Python-based log parser to transform the raw build logs into a structured JSON format. The parser uses regular expressions and other parsing techniques to extract critical data, such as build duration, failure points, and related code changes. The structured JSON output captures all relevant information in a form that is easy to analyze and query, and the parser handles edge cases and variations in log format.
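As a rough illustration of this step, the sketch below parses log lines into JSON-ready records. The log line layout, the field names, and the function name parse_build_log are assumptions made for this example; the article does not show the actual log format or parser.

```python
import json
import re

# Hypothetical standardized log line; the real format is not shown in this article.
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\S+)\s+BUILD\s+"
    r"job=(?P<job>\S+)\s+"
    r"commit=(?P<commit>[0-9a-f]{7,40})\s+"
    r"status=(?P<status>SUCCESS|FAILED)\s+"
    r"duration_ms=(?P<duration_ms>\d+)$"
)


def parse_build_log(lines):
    """Return (records, skipped): structured dicts plus lines that did not match."""
    records, skipped = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        match = LINE_PATTERN.match(line)
        if match is None:
            # Keep unmatched lines so new log formats are noticed, not silently lost.
            skipped.append(line)
            continue
        record = match.groupdict()
        record["duration_ms"] = int(record["duration_ms"])
        records.append(record)
    return records, skipped


sample_log = [
    "2024-05-01T10:15:30Z BUILD job=unit-tests commit=abc1234 status=FAILED duration_ms=5321",
    "Downloading dependencies...",
    "2024-05-01T10:20:02Z BUILD job=lint commit=abc1234 status=SUCCESS duration_ms=874",
]

records, skipped = parse_build_log(sample_log)
print(json.dumps(records, indent=2))
```

Returning the unmatched lines alongside the parsed records is one way to handle unexpected formats: they can be counted or logged instead of being dropped.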
Step 3: Implementing a data pipeline
To automate data collection and processing, we set up a Jenkins job that runs hourly. This job aggregates the parsed JSON data, performs the necessary transformations, and pushes the results to a Cloud Data Warehouse (CDW). With ThoughtSpot’s data integration capabilities, this process was seamless and allowed us to connect to large volumes of build data in real time.
The pipeline also includes error handling and retry mechanisms to protect data integrity and completeness.
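A retry mechanism of this kind could be sketched as follows, using exponential backoff. The function push_with_retry, its parameters, and the fake loader are hypothetical; the article does not describe the actual warehouse loader or retry policy.

```python
import time


def push_with_retry(push_batch, batch, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call push_batch(batch), retrying with exponential backoff on failure.

    push_batch is a hypothetical callable standing in for the real warehouse
    loader, which this article does not show.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return push_batch(batch)
        except Exception:
            # A real pipeline would likely catch only transient error types.
            if attempt == max_attempts:
                raise  # Give up so the failure is visible to the scheduler.
            sleep(base_delay * 2 ** (attempt - 1))


# Usage with a fake loader that fails twice before succeeding.
calls = {"count": 0}


def flaky_loader(batch):
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("warehouse unavailable")
    return len(batch)


delays = []  # Record the requested delays instead of actually sleeping.
loaded = push_with_retry(flaky_loader, [{"job": "lint"}], sleep=delays.append)
print(loaded, delays)
```

Re-raising after the final attempt lets the scheduled job fail loudly rather than silently dropping a batch, which is one way to keep gaps in the data visible.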
Step 4: Setting up ThoughtSpot Analytics
Finally, we used ThoughtSpot to create dynamic Liveboards, offering real-time insights into our build processes. ThoughtSpot's search-driven analytics allowed us to explore the data dynamically, identifying patterns, trends, and anomalies that were previously hidden. Even non-technical team members could access and analyze build data using ThoughtSpot’s intuitive natural language querying capabilities.
The following illustrations show how the insights from ThoughtSpot Liveboards can be used to analyze build environments:
Figure 1. Top failing test classes and methods with owner details
Figure 2. Top flaky and failing tests and duration trends per team
Figure 3. Number of tests picked up with selective test selection policy
With ThoughtSpot’s SpotIQ and anomaly detection features, we could automatically surface insights, detect performance regressions, and identify areas of improvement.
Figure 4. Anomaly Detection and ThoughtSpot SpotIQ Analysis
Results and benefits
This streamlined Build Analytics solution with the custom data pipeline and ThoughtSpot features provided deeper visibility into our build processes. The solution enabled proactive issue resolution, data-driven decision-making, and continuous improvement across our development lifecycle.
Data-driven decision-making
ThoughtSpot's advanced analytics capabilities support data-driven decisions. The insights gained from analyzing build data help us identify areas that require optimization, prioritize tasks, and allocate resources effectively, which in turn improves code quality and overall development efficiency.
Improved visibility
We gained unprecedented visibility into our build processes, enabling us to identify and address issues more quickly. ThoughtSpot's ability to correlate build failures with specific Git commits and code owners has dramatically reduced troubleshooting time.
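The aggregation behind this kind of correlation can be sketched in a few lines. The record fields, the commit-to-owner mapping, and the team names below are illustrative assumptions; in our setup the equivalent analysis happens in ThoughtSpot on warehouse data, not in a script like this.

```python
from collections import Counter

# Hypothetical build records and ownership map; field names are illustrative.
builds = [
    {"job": "unit-tests", "commit": "abc1234", "status": "FAILED"},
    {"job": "unit-tests", "commit": "abc1234", "status": "FAILED"},
    {"job": "unit-tests", "commit": "def5678", "status": "SUCCESS"},
    {"job": "lint", "commit": "abc1234", "status": "FAILED"},
]
commit_owners = {"abc1234": "team-search", "def5678": "team-infra"}


def failures_by_job_and_commit(builds, commit_owners):
    """Count failures per (job, commit) and attach the owner of each commit."""
    counts = Counter(
        (b["job"], b["commit"]) for b in builds if b["status"] == "FAILED"
    )
    return [
        {
            "job": job,
            "commit": commit,
            "failures": failures,
            "owner": commit_owners.get(commit, "unknown"),
        }
        for (job, commit), failures in counts.most_common()
    ]


report = failures_by_job_and_commit(builds, commit_owners)
for row in report:
    print(row)
```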
Proactive issue resolution
ThoughtSpot's anomaly detection features, both predefined and AI-driven, allow us to set alerts for regressions, test duration increases, and spikes in flaky tests. This proactive approach leads to quicker resolution of build problems and a smoother development process.
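To make the idea of a test duration alert concrete, here is a deliberately simple threshold check. It is not how SpotIQ or ThoughtSpot's anomaly detection works; the function name, the 1.5x threshold, and the test names are assumptions chosen for illustration.

```python
from statistics import mean


def duration_regressions(history, threshold=1.5):
    """Flag tests whose latest duration exceeds threshold x the mean of earlier runs.

    history maps a test name to its durations in seconds, oldest first.
    """
    flagged = {}
    for test, durations in history.items():
        if len(durations) < 2:
            continue  # Not enough runs to form a baseline.
        baseline = mean(durations[:-1])
        latest = durations[-1]
        if baseline > 0 and latest > threshold * baseline:
            flagged[test] = round(latest / baseline, 2)
    return flagged


# Hypothetical duration history for two tests.
history = {
    "SearchTest.testRanking": [10.0, 12.0, 11.0, 22.0],
    "AuthTest.testLogin": [5.0, 5.0, 5.5],
}

flagged = duration_regressions(history)
print(flagged)
```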
Continuous improvement
ThoughtSpot's insights empowered us to continuously improve build processes by identifying bottlenecks, performance trends, and areas of inefficiency. We implemented an iterative approach with targeted optimizations to enhance code quality, reduce build time, and minimize failures.
Future enhancements
While our Build Analytics capabilities have improved, we are adding new features that will further enhance the value of the Build Analytics solution and provide deeper insights into development processes.
Implementing a cleanup strategy
We are developing a robust cleanup strategy to ensure our database contains only essential data. This will involve implementing data retention policies, archiving historical data, and optimizing storage usage. By maintaining a lean and efficient database, we aim to improve query performance and reduce storage costs while ensuring all critical build information remains accessible.
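A retention policy of this kind could start from a sketch like the one below, which separates records to keep from records to archive. The 90-day window, the finished_at field, and the function name are assumptions; the actual policy is still being designed.

```python
from datetime import datetime, timedelta, timezone


def split_by_retention(records, now, retention_days=90):
    """Partition records into (keep, archive) based on an assumed retention window."""
    cutoff = now - timedelta(days=retention_days)
    keep, archive = [], []
    for record in records:
        finished = datetime.fromisoformat(record["finished_at"])
        (keep if finished >= cutoff else archive).append(record)
    return keep, archive


# Hypothetical build records with ISO 8601 timestamps.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
records = [
    {"job": "unit-tests", "finished_at": "2024-05-20T10:00:00+00:00"},
    {"job": "lint", "finished_at": "2024-01-15T08:30:00+00:00"},
]

keep, archive = split_by_retention(records, now)
print(len(keep), len(archive))
```

Passing the current time in as a parameter keeps the function deterministic, which makes the policy easy to test before it is applied to real data.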
Expanding data sources
Currently, we leverage ThoughtSpot Liveboards to provide a comprehensive view of various aspects of the development process, including task tracking via Jira, product usage via Pendo, and other key metrics. These insights are already an integral part of our development strategy, providing visibility and enabling informed decision-making.
We plan to expand our data sources further to include code quality metrics and test coverage data. This will allow us to correlate build performance with code quality indicators and offer a holistic view of the entire software development lifecycle. ThoughtSpot's integration capabilities should make this expansion straightforward, so that our team can continue to drive improvements based on comprehensive, data-driven insights.
Conclusion
The transition from a third-party analytics tool to an in-house solution built on ThoughtSpot has transformed our approach to Build Analytics. By harnessing data-driven insights, we enhanced code quality and overall development efficiency. Moreover, ThoughtSpot's advanced analytics capabilities empowered us to identify gaps, optimize build processes, and proactively resolve issues.
This journey underscores the importance of using data analytics to drive continuous improvement in your software development lifecycle. By adopting a data-driven approach and using the capabilities of ThoughtSpot, development teams can unlock valuable insights and achieve better code quality, faster issue resolution, and more efficient processes.
With tools such as ThoughtSpot Analytics, you can put your build data to work to streamline workflows and continuously improve your development and build processes.