Who is the Client

A US-based Fortune 500 department store chain with more than 1,000 stores across the country. It has been bringing stylish clothing to the entire family for decades.

The Challenge

The client has a vast e-commerce portal that showcases more than 350K products. Since the portal promotes fast fashion, product refresh frequency plays a major role in conversions, and the portal's success depends on the relevance and personalization of the search results for the input keywords.

The client was using an Apache Solr-based catalog search platform to provide search results to the end user. However, the platform was not optimized: new products added to the product Master Data Management (MDM) system took 4.5 hours to appear in the search results.

Throughput was also low, at around 25 transactions per second (TPS) per virtual machine. In addition, the client was controlling search behavior through an Oracle Endeca-based external configuration system, which was not performing as expected. The client wanted to improve the overall quality and performance of its search result pages.

The Solution

Due to the lack of documentation and inadequate knowledge transfer from the client’s previous technology partner, the team had to dig into the code to reverse-engineer the customizations and components. We enhanced the Solr search experience, added new features, and completely rearchitected the search behavior control tool.

The rearchitected search behavior control tool enables the client to redirect keyword searches to specific pages and alter the sequence of the searched items. We also developed a web module where the client can preview the search result pages. This allows the product team to experiment with various search configuration parameters and to set up beta and A/B tests. Some upgrades were as follows:

  • Redesigned the feed-based, single-threaded indexing process as a multi-threaded process using MongoDB, Kafka, and Spark Solr.
  • Upgraded the existing system to support a new feature in which catalog search shows products that can be bought online.
  • Modified the catalog search response to show results based on a customer’s browsing history. Improved the default sorting behavior based on sales and pricing data. Added the ‘bestseller’ tag to products on the search page.
  • Improved typeahead to show suggestions that lead to relevant results, and incorporated spell check.
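
The move from single-threaded to multi-threaded indexing described in the first upgrade can be illustrated with a minimal sketch. This is not the client's pipeline: it uses only the Python standard library, an in-memory dictionary stands in for the search index, and the names `chunk`, `index_batch`, and `index_parallel` are hypothetical. It shows only the general idea of splitting a product feed into batches and indexing those batches concurrently.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock

# Hypothetical in-memory stand-in for a search index. A real pipeline
# would send each batch to the search cluster instead.
index = {}
index_lock = Lock()


def chunk(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def index_batch(batch):
    """Write one batch of product documents and return how many were written."""
    # The lock only guards the shared dictionary. In a real pipeline the
    # per-batch work would be network I/O, which is where threads help.
    with index_lock:
        for doc in batch:
            index[doc["id"]] = doc
    return len(batch)


def index_parallel(docs, batch_size=100, workers=4):
    """Index batches concurrently instead of one document at a time."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(index_batch, chunk(docs, batch_size)))


# Assumed sample feed of 1,000 product documents.
products = [{"id": n, "name": f"product-{n}"} for n in range(1000)]
indexed = index_parallel(products)
```

The batch size and worker count here are arbitrary. In practice they would be tuned against the capacity of the indexing target.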

Business Impact

  • The indexing time improved by 30%.
  • The time required to reflect products added to the product MDM decreased from 4.5 to 3 hours.
  • The TPS per virtual machine increased from 25 to 35, which improved the performance of the search results.
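
The before-and-after figures above can also be read as relative changes. The short sketch below does that arithmetic using only the numbers quoted in this case study. The helper name `percent_change` is illustrative, and the stated 30% indexing-time improvement is a separate figure that is not derived here.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, as a percentage."""
    return (after - before) / before * 100


# Time for products added to the MDM to appear in search (hours).
delay_change = percent_change(4.5, 3.0)

# Throughput per virtual machine (TPS).
tps_change = percent_change(25, 35)

print(f"MDM-to-search delay: {delay_change:+.1f}%")
print(f"TPS per virtual machine: {tps_change:+.1f}%")
```

Going from 4.5 to 3 hours is roughly a 33% reduction in delay, and going from 25 to 35 TPS is a 40% increase in throughput per virtual machine.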

Technologies Used

Apache Solr. An open-source enterprise-search platform, written in Java, from the Apache Lucene project
Redis. An open-source, Berkeley Software Distribution (BSD)-licensed in-memory data structure store
Google Cloud Platform (GCP). A suite of cloud computing services that runs on the same infrastructure that Google uses internally for its end-user products
Jenkins. An open-source automation server that helps in building, testing, deploying, and facilitating CI/CD
Apache Kafka. A unified, high-throughput, low-latency platform for handling real-time data feeds
Apache Spark. An open-source distributed cluster-computing framework that processes tasks on large datasets
MongoDB. A cross-platform document-oriented NoSQL database program.
Tonomi. Enables fast, safe, and secure application management
Languages. Java and Python

Related Capabilities

Optimize Business Operations by Eliminating Inefficiencies and Redundancies with High-Quality Apps

Develop advanced applications mapped to your strategic goals by utilizing modern architectures, such as microservices, to seamlessly leverage cloud capabilities. We can help in migrating your applications to a modernized technology platform while keeping your costs under control.
