Who is the Client

A well-known American manufacturer of skincare products with $1.5B in annual sales.

The Challenge

Over the years, the company developed a complex IT infrastructure to communicate sales information with company management and its partners. The infrastructure includes SAP Commerce Cloud (formerly known as Hybris), Gigya, Apache Solr search engine, and Salesforce Marketing Cloud, among other technologies.

The company’s business relies on its sales partners, who are organized in a multi-level marketing arrangement. The company stored most of its sales data in a Hybris database, including customer and order data, product information, and coupons.

As business volume grew, it took unacceptably long to get accurate real-time information. The company identified several bottlenecks that stemmed primarily from its reliance on REST API request-based components.

Our Information Analytics team was called in to support the company’s decision to move to a cloud-based streaming architecture based upon Apache Kafka.

In short, the company was looking to:

  • Modernize their complex on-premises infrastructure: The company wanted to transition as much data infrastructure as possible from on-premises to the cloud.
  • Minimize reliance upon a REST API-based architecture: The company's legacy REST API-based architecture was part of their problem. They needed help in transitioning to a modern streaming architecture based on Kafka.
  • Make accurate sales forecasts: To make accurate sales forecasts and to best assist its partners, the company needed real-time access to data.

The Solution

Our IA team developed a solution that sends data from the Hybris database to various third parties for reporting and further processing. To achieve this, we developed a series of Kafka producers, written in Java, that read data from Hybris and publish it to Kafka topics. We then developed a set of Java-based Kafka consumers that poll the topics and send the data to various end systems.
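The publish-and-poll flow described above can be illustrated with a minimal in-memory sketch. It is not the project's code and does not use the Kafka client library: `TopicSketch`, `publish`, and `poll` are hypothetical names, and the "topic" is modeled as a single append-only log. The point it shows is that producers only append, while each consumer tracks its own offset and pulls records at its own pace.

```java
import java.util.ArrayList;
import java.util.List;

/** In-memory stand-in for a single-partition topic: an append-only log addressed by offset. */
final class TopicSketch {
    private final List<String> log = new ArrayList<>();

    /** Producer side: append a record and return the offset it was written at. */
    synchronized long publish(String record) {
        log.add(record);
        return log.size() - 1;
    }

    /** Consumer side: return up to maxRecords records starting at fromOffset. */
    synchronized List<String> poll(long fromOffset, int maxRecords) {
        int start = (int) Math.min(Math.max(fromOffset, 0), log.size());
        int end = Math.min(start + maxRecords, log.size());
        return new ArrayList<>(log.subList(start, end));
    }

    public static void main(String[] args) {
        TopicSketch orders = new TopicSketch();
        // Producer: records read from the source system are appended to the topic.
        orders.publish("order-1001");
        orders.publish("order-1002");
        orders.publish("order-1003");

        // Consumer: polls in small batches and advances its own offset.
        long offset = 0;
        List<String> batch;
        while (!(batch = orders.poll(offset, 2)).isEmpty()) {
            for (String record : batch) {
                System.out.println("forwarding " + record + " to end system");
            }
            offset += batch.size();
        }
    }
}
```

Because the consumer owns its offset, a slow or restarted end system does not block the producer, which is the main difference from a request-based REST integration.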

Our IA team designed over 40 Kafka topics on Confluent Kafka brokers to manage various aspects of the business - customer enrollments, customer updates, orders, products, event registration, and many customer KPIs.

Additionally, we developed a matching set of Kafka consumers, also written in Java, to read the associated topic streams. The 40+ Kafka consumers were deployed on a Kubernetes engine in GCP. Our team configured multiple pods for each consumer, providing scalability and high availability.
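Running several pods per consumer helps because Kafka divides a topic's partitions among the members of a consumer group and redistributes them when a member leaves. The sketch below models that idea with a simple round-robin assignment. It is an illustration only: in a real deployment the assignment is performed by Kafka itself, and the pod names and partition count here are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class PartitionAssignmentSketch {

    /** Spread partitions 0..partitionCount-1 across the given pods in round-robin order. */
    static Map<String, List<Integer>> assign(List<String> pods, int partitionCount) {
        Map<String, List<Integer>> assignment = new TreeMap<>();
        for (String pod : pods) {
            assignment.put(pod, new ArrayList<>());
        }
        if (pods.isEmpty()) {
            return assignment; // nobody to assign to
        }
        for (int partition = 0; partition < partitionCount; partition++) {
            String owner = pods.get(partition % pods.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Three replicas share six partitions.
        System.out.println(assign(
                List.of("orders-consumer-0", "orders-consumer-1", "orders-consumer-2"), 6));
        // One replica is lost: the survivors take over its partitions.
        System.out.println(assign(
                List.of("orders-consumer-0", "orders-consumer-2"), 6));
    }
}
```

Adding replicas spreads the partitions more thinly (scalability), and losing one only shifts its partitions to the remaining replicas (availability).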

Failed data is stored in a catch-all Kafka topic called the Dead Letter Queue (DLQ) to avoid data loss. End-system consumers read the DLQ in addition to their designated topics. Where necessary, the consumers route messages that fail for technical reasons to the DLQ so that the data can be automatically resynced to consumer applications and other endpoints. DLQ messages that pertain to business failures are sent back to the e-commerce applications for reprocessing.
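The split between technical and business failures can be sketched as follows. This is a simplified in-memory model under stated assumptions, not the deployed implementation: the DLQ is a plain list rather than a Kafka topic, and `BusinessRuleException`, `DeadLetter`, `deliver`, and `reprocess` are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class DlqRoutingSketch {
    enum FailureType { TECHNICAL, BUSINESS }

    /** Hypothetical marker for failures caused by the data itself (e.g. an invalid coupon). */
    static final class BusinessRuleException extends RuntimeException {
        BusinessRuleException(String message) { super(message); }
    }

    /** A failed record plus the metadata needed to decide how to reprocess it. */
    static final class DeadLetter {
        final String payload;
        final FailureType type;
        final String reason;

        DeadLetter(String payload, FailureType type, String reason) {
            this.payload = payload;
            this.type = type;
            this.reason = reason;
        }
    }

    private final List<DeadLetter> dlq = new ArrayList<>();

    /** Deliver one record; on failure, park it in the DLQ instead of dropping it. */
    void deliver(String payload, Consumer<String> endSystem) {
        try {
            endSystem.accept(payload);
        } catch (BusinessRuleException e) {
            dlq.add(new DeadLetter(payload, FailureType.BUSINESS, e.getMessage()));
        } catch (RuntimeException e) {
            dlq.add(new DeadLetter(payload, FailureType.TECHNICAL, e.getMessage()));
        }
    }

    /** Drain the DLQ: retry technical failures, return business failures to the caller. */
    List<DeadLetter> reprocess(Consumer<String> endSystem) {
        List<DeadLetter> pending = new ArrayList<>(dlq);
        dlq.clear();
        List<DeadLetter> returnedToSource = new ArrayList<>();
        for (DeadLetter letter : pending) {
            if (letter.type == FailureType.TECHNICAL) {
                deliver(letter.payload, endSystem); // lands in the DLQ again if it still fails
            } else {
                returnedToSource.add(letter); // to be fixed by the source application
            }
        }
        return returnedToSource;
    }

    int dlqSize() { return dlq.size(); }

    public static void main(String[] args) {
        DlqRoutingSketch router = new DlqRoutingSketch();
        boolean[] endSystemUp = {false};
        Consumer<String> endSystem = payload -> {
            if (payload.contains("invalid")) throw new BusinessRuleException("rejected: " + payload);
            if (!endSystemUp[0]) throw new IllegalStateException("end system unavailable");
        };

        router.deliver("order-1", endSystem);              // technical failure
        router.deliver("order-invalid-coupon", endSystem); // business failure
        endSystemUp[0] = true;                             // outage is over

        List<DeadLetter> returned = router.reprocess(endSystem);
        System.out.println("returned to source: " + returned.size()
                + ", still in DLQ: " + router.dlqSize());
    }
}
```

Tagging each dead letter with a failure type is what lets technical failures be retried automatically while business failures go back to the application that can correct the data.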

Our team used Kafka analytics monitoring APIs and Prometheus to pull KPIs such as consumer lag, active connections, and traffic volume into a handler application to ensure consistent performance. We also set performance thresholds and configured alerts. Our team used Splunk to search logs when debugging.
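Consumer lag, one of the KPIs mentioned above, is the gap between the latest offset written to a partition and the offset the consumer has committed. The sketch below shows the arithmetic and a threshold check. It is a minimal illustration: the offset maps would normally come from the monitoring APIs, the numbers and the threshold of 1,000 are made up, and treating a partition with no committed offset as offset 0 is a simplifying assumption.

```java
import java.util.Map;

final class ConsumerLagSketch {

    /** Sum over partitions of (log end offset - committed offset), never negative. */
    static long totalLag(Map<Integer, Long> endOffsets, Map<Integer, Long> committedOffsets) {
        long lag = 0;
        for (Map.Entry<Integer, Long> entry : endOffsets.entrySet()) {
            // Assumption: a partition with no committed offset counts from offset 0.
            long committed = committedOffsets.getOrDefault(entry.getKey(), 0L);
            lag += Math.max(0, entry.getValue() - committed);
        }
        return lag;
    }

    /** True when the lag exceeds the alerting threshold. */
    static boolean breachesThreshold(long lag, long threshold) {
        return lag > threshold;
    }

    public static void main(String[] args) {
        Map<Integer, Long> endOffsets = Map.of(0, 1500L, 1, 1200L);
        Map<Integer, Long> committed = Map.of(0, 1450L, 1, 1200L);

        long lag = totalLag(endOffsets, committed);
        System.out.println("total lag: " + lag
                + ", alert: " + breachesThreshold(lag, 1000L));
    }
}
```

A lag that keeps growing suggests the consumers cannot keep up with the producers, which is the kind of condition a threshold alert is meant to surface early.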

Key takeaways from the solution include:

  • Shift to a cloud-based streaming infrastructure: Our team assisted the company in moving away from a REST API-based on-premises system to a modern streaming cloud-based infrastructure.
  • No data loss: The new system can handle a massive amount of company data with no loss. The solution also captures any failed data in the catch-all Kafka DLQ topic and reprocesses it.
  • Zero downtime: The new system features multiple redundant cloud-based virtual machines that ensure zero downtime.

Business Impact

  • Real-time access to data allows rapid response: The company and its partners now have real-time access to data, allowing them to make fast and well-informed business decisions - essential in today’s fast-changing world.
  • High availability increases partner satisfaction: The solution features multiple redundancies ensuring 100% uptime. With its performance threshold alerts, the new system’s monitoring allows the company to address potential issues proactively, before they affect system performance.
  • Rapidly shift market focus without impacting operations: Kafka makes it easier to add topics and consumers without restarting the entire system.
  • Cost-effective future expansion: The cloud-based solution can dynamically upscale its bandwidth, processing power, and storage space, which allows the business to grow without additional development or hardware investment.

Technologies Used

Confluent Kafka: Provides a managed Kafka-based streaming event system for building high-performance data pipelines
Prometheus: Open-source monitoring toolkit that provides support for alerts
SAP Commerce Cloud (Hybris): E-commerce platform that allows you to control all aspects of the customer purchase experience
GCP BigQuery: Scalable cloud-based data warehouse that provides analytics tools and support for large-scale SQL queries
GCP Firestore: High-performance scalable NoSQL database
SAP Gigya: Provides support for customer identity management, privacy and consent
Kubernetes: Highly scalable open-source platform designed to manage and automate multi-container deployments
Oracle Java: Widely used general-purpose programming language, used here for the Kafka producers and consumers

Related Capabilities

Utilize Actionable Insights from Multiple Data Hubs to Gain More Customers and Boost Sales

Unlock the power of data insights buried deep within your diverse systems across the organization. We empower businesses to effectively collect, beautifully visualize, critically analyze, and intelligently interpret data to support organizational goals. Our team ensures good returns on the big data technology investment with effective use of the latest data and analytics tools.
