How eBay Uses Big Data and Machine Learning to Drive Business Value


Digital transformation, while not new, has changed tremendously with the advent of new technologies for big data analytics and machine learning. The key to most companies’ digital transformation efforts is to harness insights from various types of data at the right time. Fortunately, organizations now have access to a wide range of solutions to accomplish this goal.

How are leaders in the space approaching the problem today? I recently spoke with Seshu Adunuthula, Senior Director of Analytics Infrastructure at eBay, about this question. eBay was always a digital business, but even IT leaders of companies that were born as digital businesses are embracing the latest digital technologies to enhance their existing processes and build new experiences. According to Adunuthula, “Data is eBay’s most important asset.” eBay is managing approximately 1 billion live listings and 164 million active buyers daily. It also receives 10 million new listings via mobile every week. Clearly, the company has large volumes of data, but the key to its future success will be how fast it can turn data into a personalized experience that drives sales.

Designing and updating a technical strategy

The first challenge eBay wrestled with was finding a platform, aside from its traditional data warehouse, that was capable of storing an enormous amount of data that varied by type. Adunuthula stated that the type of data, the structure of the data and the required speed of analysis meant the company had to evolve from a traditional data warehouse structure to what it calls data lakes. For example, the company needs to keep roughly nine quarters of historical trend data to provide insights on items such as year-over-year growth. It also needs to analyze data in real time to assist shoppers throughout the selling cycle.
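To make the nine-quarter figure concrete, here is a minimal Python sketch of a year-over-year calculation. The article does not explain why nine quarters is the chosen window; one plausible reading, offered here only as an assumption, is that nine quarters yield five year-over-year comparisons (the four quarters of the latest year plus the current one). The function name and the sales figures are hypothetical and are not eBay data.

```python
def year_over_year_growth(quarterly_values):
    """Return the fractional YoY growth for every quarter that has a
    value exactly four quarters earlier in the list (oldest first)."""
    growth = []
    for i in range(4, len(quarterly_values)):
        previous = quarterly_values[i - 4]  # same quarter, one year earlier
        current = quarterly_values[i]
        growth.append((current - previous) / previous)
    return growth


# Nine hypothetical quarterly sales figures, oldest first (illustrative only).
sales = [100.0, 110.0, 120.0, 130.0, 110.0, 121.0, 138.0, 143.0, 121.0]
yoy = year_over_year_growth(sales)

for offset, rate in enumerate(yoy, start=5):
    print(f"Quarter {offset}: {rate:.1%} year-over-year")
```

With fewer than five quarters of history the function returns an empty list, which is why a year-over-year view needs more than one year of retained data.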

The ability to support data at the scale of an internet company was a key consideration in the selection of technologies and partners. The company chose to work with Hortonworks’ Hadoop product because it offered an open source platform that was highly scalable and the vendor was willing to work with eBay to design product enhancements. With a foundation of Hadoop and Hortonworks, the other two components of eBay’s data platform strategy are what it calls streams and services.

A big technical challenge for eBay and every data-intensive business is to deploy a system that can rapidly analyze and act on data as it arrives in the organization’s systems (called streaming data). There are many rapidly evolving methods to support streaming data analysis. eBay is currently working with several tools including Apache Spark, Storm, Kafka, and Hortonworks HDF. The data services layer of its strategy provides functions that enable the company to access and query data. It allows the company’s data analysts to search information tags that have been associated with the data (called metadata) and makes the data consumable by as many people as possible with the right level of security and permissions (called data governance). eBay is also using an interactive query engine on Hadoop called Presto. The company has been at the forefront of using big data solutions and actively contributes its knowledge back to the open source community.
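The idea of searching metadata tags while respecting permissions can be illustrated with a small Python sketch. This is a toy in-memory catalog written for this article, not eBay’s services layer or the API of any product named above; the `Dataset` class, the `search_catalog` function, and every dataset name, tag, and role are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    tags: frozenset           # metadata: descriptive labels attached to the data
    allowed_roles: frozenset  # governance: roles permitted to read the data


def search_catalog(catalog, tag, role):
    """Return the names of datasets that carry `tag` and that `role` may read."""
    return [
        d.name
        for d in catalog
        if tag in d.tags and role in d.allowed_roles
    ]


# Hypothetical catalog entries (illustrative only).
catalog = [
    Dataset("listings_daily", frozenset({"listings", "mobile"}),
            frozenset({"analyst", "engineer"})),
    Dataset("buyer_profiles", frozenset({"buyers", "pii"}),
            frozenset({"engineer"})),
    Dataset("mobile_sessions", frozenset({"mobile", "buyers"}),
            frozenset({"analyst"})),
]

analyst_mobile = search_catalog(catalog, "mobile", "analyst")
engineer_buyers = search_catalog(catalog, "buyers", "engineer")
print(analyst_mobile)
print(engineer_buyers)
```

The point of the sketch is that the tag lookup and the permission check happen in the same query, so a search never surfaces a dataset the requesting role is not allowed to read.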

eBay’s current big data strategy represents only a few of the combinations and options available to companies that need to process large volumes of data in dissimilar formats, some of which must be analyzed in real time and some of which can be stored for later analysis. Of course, the selection of big data solutions depends on what you are trying to accomplish as a business.

This article was originally published on and can be viewed in full here