Who is the Company

A large retail beverage chain with over 20,000 stores worldwide.

The Challenge

The company was experiencing severe data duplication issues causing the current management system to take an excessive amount of processing time per region. In the worst case, processing product data took up to 24 hours for one region.

The company could not provide information to help their brands determine on-demand product availability. They wanted to measure product availability across all its stores.

Another challenge was that the current inventory management system, a Python application, occasionally failed during data processing. This meant the application had to be rerun, making the process highly inefficient. Since the process was hosted on a public cloud, the company suffered substantial monthly cost overruns.

A significant impediment to the company’s progress was that it had little to no documentation, no sample data to analyze, and needed to resolve the inventory issues within three and a half months.

In short, the company wanted to:

  • Automate on-demand product availability: The solution required predictive analytics to enable the brands to measure how often the product was in the right place at the right time. Based on customer demand, the company wished to calculate product availability per store by hour and day.
  • Reduce costs and overall processing time:The client needed to replace their slow Python-based application with a hybrid Python-Spark integration using PySpark. This solution would vastly reduce processing time and lower cloud-associated CPU utilization costs.

The Solution

Our Information Analytics team examined the company’s business requirements and split the project into two layers. In the first layer, we prepared the data from incoming jobs, stored it in an intermediate table, and repeated the same for subsequent jobs. The second layer uses this data to conduct predictive analysis and provides forecasted product availability requirements. Product information is continuously replenished through process automation.

Our team used big data analytics tools to process retail store data for the last three years. We employed advanced predictive analytics techniques to determine availability. Our solution integrated Azure Cloud Services to conduct fast-paced data analyses using data on sales, inventory, and the shelf life of the products. The solution provides accurate predictions of product availability at any given store location, any time of the day.

Here are some key aspects of our solution:

  • Highly scalable: Our team overhauled inventory management operations to support 20,000 stores and scalable for many more.
  • Eliminated data duplication and reduced processing time: By storing single data points in intermediate tables, our team eliminated data duplication and reduced the time taken to complete the process. Our solution produces results 97% faster than the company’s old system.
  • Recovered potentially lost sales: By applying advanced analytics insights, the company was able to convert potentially lost sales into revenue.
  • Data recovery: The company now has the option to restore data deleted from the system in error within 30 days.

Business Impact

  • Reduced cloud computing costs: Our implementation introduced efficiency into the system and helped the client save 40 - 45% on its cloud computing costs.
  • Saves store managers’ time:Faster system response time reduces the store manager’s time gathering product statistics by 40%.
  • Reduces retail waste: The old system had a tremendous amount of data duplication resulting in products with expired ‘best-used-by‘ dates. The new system reduces retail waste by 7%
  • Improved top line: Our solution allows the company to measure how often the product was in the right place at the right time. More products are now visible and available upon demand resulting in an improved top line.

Technologies Used

Python: A popular scripted programming language
PySpark: A Python module that provides an interface to Apache Spark
Python Modules: NumPy, pandas, SciPy
Microsoft Azure Cloud Services: Cloud-based platform that provides support for highly scalable, highly available containerized applications

Related Capabilities

Utilize Actionable Insights from Multiple Data Hubs to Gain More Customers and Boost Sales

Unlock the power of the data insights buried deep within your diverse systems across the organization. We empower businesses to effectively collect, beautifully visualize, critically analyze, and intelligently interpret data to support organizational goals. Our team ensures good returns on the big data technology investments with the effective use of the latest data and analytics tools.

Do you have a similar project in mind?

Enter your email address to start the conversation