Jethro - We Make Real-Time Business Intelligence Work on Hadoop

Blog, BI, Big Data

How Tata CDN Made Interactive BI Work

By Mark Kremer on February 16, 2018


What makes it a big deal? Big Data is more than just much more data. It also means more co-tenant applications, more concurrent users, tighter data loading windows, and more performance engineering, all of it expected at significantly lower cost and with better performance.

The challenges of sitting in front of a screen and interactively analyzing enormous datasets are daunting:

  • High Performance: users will not wait for more than 3 seconds for query results
  • Complex handling: modern BI tools enable users to issue complex queries that challenge the underlying database and the computing platform on which it runs
  • Adaptable: interactive BI workloads are unpredictable and tend to vary and peak with the count of concurrent users and the changing complexity of their queries.
  • Strong visibility: users expect to explore any part of the overall data at any time. No subset extractions please.

These challenges are more evident when the users of the analytical application are customers rather than corporate employees. Customer-facing applications are subject to higher variation in user concurrency and query complexity. Because they are often tied to corporate revenue, these applications have to deliver a high level of service. The monitoring application for Tata’s CDN services is a case in point.

CDN Service monitoring application

In 2006 Tata Communications, Ltd. launched the world’s first Content Delivery Network (CDN) for on-demand HD live video streaming. Today, Tata Communications’ CDN provides a wide spectrum of services – including event live streaming, 24/7 online broadcasting, website acceleration, and large-scale software downloads.

About 2 billion service requests are captured daily, loaded at 15-minute intervals into the CDN’s customer-facing analytics data lake. Tata offers customers a Business Operational Intelligence (BOI) dashboard to track the performance KPIs of their content distribution service. The BOI dashboard tracks traffic volume, response time patterns, browser type distributions and more, as measured across time of day, geography, content type and other dimensions. Tata customers rely on BOI for making business-critical pricing, cost, and service level decisions.
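For a sense of scale, the figures above can be turned into a rough per-load volume. The sketch below is plain arithmetic and assumes requests are spread evenly over the day, which real CDN traffic is not; the variable names are illustrative only.

```python
# Back-of-the-envelope ingest load from the figures quoted above.
# Assumption: requests are spread evenly over the day (real traffic peaks and dips).
REQUESTS_PER_DAY = 2_000_000_000   # "about 2 billion" per the article
INTERVAL_MINUTES = 15              # load interval per the article

intervals_per_day = 24 * 60 // INTERVAL_MINUTES
requests_per_interval = REQUESTS_PER_DAY / intervals_per_day
requests_per_second = REQUESTS_PER_DAY / (24 * 60 * 60)

print(f"{intervals_per_day} loads per day")
print(f"~{requests_per_interval / 1e6:.1f} million rows per load")
print(f"~{requests_per_second:,.0f} requests per second on average")
```

Under that even-spread assumption, this works out to 96 loads per day of roughly 20.8 million rows each.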

BOI is an Interactive BI SaaS application. Users can interactively drill down into built-in reports and create ad hoc panes over custom queries. The application was required to support peaks of thousands of concurrent customers.

The Implementation

After piloting the application on a reduced data set, BI developers noticed the following traits:

  • BOI clients perused relatively small subsets of the data captured by CDN services
  • Specific customer data, though small, were spread across the entire data set
  • Most BOI queries triggered a nearly full scan of the entire data set
  • Users found the latency for granular and aggregate queries unacceptable
  • Performance degraded significantly as the number of concurrent users increased

Attempts at repartitioning the data by (time, customer) did not improve performance by much, but they did create a multitude of unmanageable small partitions, which did not sit well with co-tenant applications.
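A rough count shows why partitioning by (time, customer) multiplies into so many small partitions. The sketch below is an illustration, not Tata's actual layout: the customer count is a hypothetical placeholder, the function name is invented for this example, and data is assumed to be spread evenly across customers and intervals.

```python
def partition_stats(rows_per_day, intervals_per_day, customers, days):
    """Return (partition count, average rows per partition) when every
    (time interval, customer) pair gets its own partition."""
    partitions = intervals_per_day * customers * days
    avg_rows = rows_per_day * days / partitions
    return partitions, avg_rows

# 2 billion rows/day and 15-minute intervals come from the article;
# 1,000 customers is a hypothetical figure chosen only for illustration.
partitions, avg_rows = partition_stats(
    rows_per_day=2_000_000_000,
    intervals_per_day=96,
    customers=1_000,
    days=365,
)
print(f"{partitions:,} partitions, ~{avg_rows:,.0f} rows each")
```

With these illustrative inputs, a year of data yields about 35 million partitions averaging roughly 21 thousand rows each, which is the kind of small-partition sprawl described above.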

After considering several alternatives, Tata selected Jethro Data as their Interactive BI solution for the following reasons:

  • HDP and Jethro integration is seamless, making Interactive BI a quick win
  • Jethro performance was fast across all types of queries (filters, aggregations, joins)
  • Jethro scales to larger data sets without affecting performance
  • Jethro enabled Tata to deliver their Service Monitoring Application to thousands of users without undertaking costly performance engineering projects
  • Jethro indexes were very effective in filtering out specific customer datasets
  • Users were able to enjoy fast segmentation analysis leveraging Jethro Auto-Cubes
  • Jethro was able to ingest vast amounts of data at short intervals
  • Jethro did not require any changes to underlying data
  • Jethro does not require manual cube or index definitions
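To illustrate why an index helps when each customer's rows are scattered across the whole data set, here is a toy comparison of a full scan with an index lookup. It is a generic sketch of the idea only: the article does not describe Jethro's index format, and the customer names, data, and function names below are invented for the example.

```python
from collections import defaultdict

# Toy request log of (customer_id, bytes_served). Rows for each customer are
# interleaved, mirroring the observation that customer data is spread out.
rows = [
    ("acme", 120), ("globex", 300), ("acme", 80),
    ("initech", 50), ("globex", 70), ("acme", 200),
]

def full_scan_total(rows, customer):
    """Touch every row and keep only those of the requested customer."""
    return sum(nbytes for cust, nbytes in rows if cust == customer)

def build_index(rows):
    """Map each customer to the positions of its rows."""
    index = defaultdict(list)
    for pos, (cust, _) in enumerate(rows):
        index[cust].append(pos)
    return index

def indexed_total(rows, index, customer):
    """Read only the rows the index points at."""
    return sum(rows[pos][1] for pos in index.get(customer, []))

index = build_index(rows)
total = indexed_total(rows, index, "acme")
print(total, "bytes for acme, read from", len(index["acme"]), "of", len(rows), "rows")
```

Both functions return the same answer; the difference is that the indexed version reads only the matching rows, which matters when one customer's data is a small fraction of billions of rows.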

After a short POC, Tata’s team deployed Jethro’s solution to their production environment, offering their customers interactive insight into more than a year’s worth of detailed CDN service data.

The Business Impact

BOI is one of the pillars of the value proposition for Tata CDN services, giving customers near real-time insight into their CDN service KPIs. Higher customer satisfaction opened new opportunities for revenue growth through accelerated usage and new customer acquisition. Furthermore, with Jethro’s scalable and cost-effective solution, Tata is able to employ a flexible cost model that has a significant effect on profit margin, without compromising the quality of its product and service delivery.