Who is the Company

A global supplier of semiconductor equipment

The Challenge

The company ran multiple Databricks platforms for data science work, each serving different business groups or units. The owner of one platform wanted to track how resources such as Databricks Notebooks and clusters were allocated among teams, and to share the results with stakeholders through a dashboard. The REST APIs provided by Databricks did not expose all of the required information on their own. The metrics also had to be grouped by business unit and reported for daily, weekly, monthly, and quarterly periods. In short, the company needed to go beyond the Databricks APIs and build a dashboard that met the following requirements:

  • User monitoring information:
    • Count of active users: The number of users who created, updated, or ran Notebooks during a target interval.
    • Number of unique user logins: The count of users who logged in, identified by their unique email IDs.
  • Notebook tracking information:
    • Number of Notebooks created or updated
    • Number of Notebooks per user
    • Number of active Notebooks
    • List of unique Notebook names
  • Infrastructure metrics:
    • Metrics on time usage for each Databricks cluster
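To make the grouping requirement concrete, the sketch below shows one way such metrics could be rolled up by business unit and reporting period. It is illustrative only: the record fields (`day`, `email`, `unit`, `action`, `notebook`), the sample values, and the `period_key` and `summarize` helpers are assumptions made for this example and do not reflect the company's data or any Databricks schema.

```python
from collections import defaultdict
from datetime import date

# Hypothetical activity records; field names and values are illustrative only.
EVENTS = [
    {"day": date(2023, 1, 2), "email": "ana@example.com", "unit": "R&D", "action": "create", "notebook": "etl_daily"},
    {"day": date(2023, 1, 3), "email": "ana@example.com", "unit": "R&D", "action": "run", "notebook": "etl_daily"},
    {"day": date(2023, 1, 10), "email": "bo@example.com", "unit": "R&D", "action": "update", "notebook": "model_fit"},
    {"day": date(2023, 2, 1), "email": "cy@example.com", "unit": "Ops", "action": "run", "notebook": "capacity_report"},
    {"day": date(2023, 2, 2), "email": "cy@example.com", "unit": "Ops", "action": "login", "notebook": None},
]

# Actions that make a user (and a Notebook) count as "active".
ACTIVE_ACTIONS = {"create", "update", "run"}


def period_key(day, granularity):
    """Map a date to a label for the requested reporting period."""
    if granularity == "daily":
        return day.isoformat()
    if granularity == "weekly":
        iso = day.isocalendar()  # (ISO year, ISO week, ISO weekday)
        return f"{iso[0]}-W{iso[1]:02d}"
    if granularity == "monthly":
        return f"{day.year}-{day.month:02d}"
    if granularity == "quarterly":
        return f"{day.year}-Q{(day.month - 1) // 3 + 1}"
    raise ValueError(f"unknown granularity: {granularity}")


def summarize(events, granularity):
    """Count active users, unique logins, and active Notebooks per (unit, period)."""
    stats = defaultdict(lambda: {"active_users": set(), "logins": set(), "notebooks": set()})
    for event in events:
        bucket = stats[(event["unit"], period_key(event["day"], granularity))]
        if event["action"] == "login":
            bucket["logins"].add(event["email"])
        elif event["action"] in ACTIVE_ACTIONS:
            bucket["active_users"].add(event["email"])
            bucket["notebooks"].add(event["notebook"])
    return {
        key: {
            "active_users": len(bucket["active_users"]),
            "unique_logins": len(bucket["logins"]),
            "notebook_names": sorted(bucket["notebooks"]),
        }
        for key, bucket in stats.items()
    }
```

Calling `summarize(EVENTS, "monthly")` groups the sample records into one entry per business unit and month; switching the second argument to `"weekly"` or `"quarterly"` reuses the same records for the other reporting periods.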

The Solution

Our IA team studied the problem and recommended the implementation of Overwatch, a tool developed by Databricks, for analyzing log data from Databricks workspaces.

Databricks Overwatch is a monitoring and alerting solution that provides insight into the performance, cost, and usage of Databricks workspaces and clusters. It offers granular details such as pipeline performance, cost, and data ingress and egress. By capturing workspace activity in structured datasets, it supports the company's data-driven decision-making.

The team collaborated closely with company stakeholders to ensure Overwatch's successful deployment and integration across all relevant platforms. Additionally, our engineers designed the solution to be extensible, making it suitable for use with multiple workspaces.

The new system logs user activity through the Databricks integration with Azure Event Hubs, then uses Overwatch to extract the logged data on clusters, notebooks, account logins, and jobs. The extracted data is stored as Delta tables that feed the dashboard and support further analysis.
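As an illustration of how extracted log data can become a cluster time-usage metric, the sketch below totals running hours per cluster from state-change events. The event layout, the cluster IDs, the state names, and the `uptime_hours` helper are assumptions made for this example; they are not the structure of the tables Overwatch produces.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical cluster state-change events: (cluster_id, timestamp, state).
STATE_EVENTS = [
    ("cluster-a", datetime(2023, 1, 2, 8, 0), "RUNNING"),
    ("cluster-a", datetime(2023, 1, 2, 12, 30), "TERMINATED"),
    ("cluster-b", datetime(2023, 1, 2, 9, 0), "RUNNING"),
    ("cluster-a", datetime(2023, 1, 3, 10, 0), "RUNNING"),
    ("cluster-b", datetime(2023, 1, 2, 10, 0), "TERMINATED"),
    ("cluster-a", datetime(2023, 1, 3, 11, 0), "TERMINATED"),
]


def uptime_hours(events):
    """Sum the hours between each RUNNING event and the next TERMINATED event."""
    started = {}                 # cluster_id -> start time of the open session
    totals = defaultdict(float)  # cluster_id -> accumulated running hours
    for cluster_id, timestamp, state in sorted(events, key=lambda e: e[1]):
        if state == "RUNNING":
            # Keep the earliest start if RUNNING is reported more than once.
            started.setdefault(cluster_id, timestamp)
        elif state == "TERMINATED" and cluster_id in started:
            elapsed = timestamp - started.pop(cluster_id)
            totals[cluster_id] += elapsed.total_seconds() / 3600
    return dict(totals)
```

In this simplified version, a cluster that is still running when the data ends contributes nothing to the total; a fuller implementation would need to decide how to close such open sessions.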

Here are a few highlights of the expanded system:

  • Effective user activity monitoring: The company can now get an accurate count of active users and the number of unique logins.
  • Comprehensive cloud usage metrics: The new system gives the company real-time information on its Databricks component usage. The insights drawn from usage metrics allow the company to allocate its resources more efficiently, saving time and money.
  • Fully automated extensible solution: GSPANN engineers designed the implementation to be fully automated, extensible, and reusable. The same solution can easily be extended to multiple workspaces, saving the company significant future development costs.

Business Impact

  • Improved cost management: The integration of this solution significantly improved the mapping of business units and teams to the consumption of various Databricks services at a more granular level. Additionally, it provided valuable information on unnecessary resource utilization and costs, resulting in tangible cost savings for the company.
  • Increased efficiency and productivity: Dashboard analysis helps the company manage its resources efficiently. Insights from the new dashboard can also guide the company toward better-suited cloud service plans, further improving efficiency and productivity.

Technologies Used

Databricks: A unified data analytics platform running on Microsoft Azure that combines big data processing, machine learning, and collaborative workspaces
Databricks Notebooks: Web-based collaborative Databricks environments that enable data scientists and engineers to write, execute, and share code, visualizations, and insights using multiple programming languages
Databricks Overwatch: A monitoring and optimization solution for Databricks deployments providing historical and real-time analysis of data sources
Azure Event Hubs: A highly scalable, fully managed, real-time data ingestion service that can stream and process millions of events per second from any source
