Exploratory analytics for data scientists in the Ad Tech world

Summary

Understanding ad performance in real time, controlling the velocity of ad placement across networks, targeting ads more effectively, and avoiding over-serving or under-serving ads in a target market are all significant challenges for the ad tech industry. The industry employs an army of data scientists who pore over historical and near-real-time data to improve the overall effectiveness of the ad platform. As more people go online and become targets for online advertising, the ability to target users effectively and improve click-through rates is of prime importance to ad tech companies.

Challenge

  • Ingest large volumes of data in real time
  • Write-behind to Hadoop for archival storage and batch analysis
  • Stream-based querying in SQL
  • Real-time aggregation queries (classic slice-and-dice queries) in SnappyData

Solution

A prominent player in the ad tech industry has deployed a SnappyData cluster that ingests data from real-time sources and makes it available to data scientists, who use SQL queries, among other tools, to analyze the data and build models for future use.
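To make the kind of slice-and-dice aggregation mentioned above concrete, here is a minimal sketch of a click-through-rate query grouped by network and geography. It runs against an in-memory SQLite database purely for illustration; it does not use SnappyData, and the table name `ad_events`, its columns, and the sample rows are all hypothetical rather than taken from the deployment described here.

```python
import sqlite3

# Hypothetical ad-event rows: (network, geo, impressions, clicks).
ROWS = [
    ("net_a", "US", 1000, 20),
    ("net_a", "US", 1000, 30),
    ("net_a", "UK", 500, 5),
    ("net_b", "US", 2000, 10),
]


def ctr_by_network_and_geo(rows):
    """Aggregate impressions and clicks per (network, geo) and compute CTR."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ad_events ("
        "network TEXT, geo TEXT, impressions INTEGER, clicks INTEGER)"
    )
    conn.executemany("INSERT INTO ad_events VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT network,
               geo,
               SUM(impressions) AS impressions,
               SUM(clicks) AS clicks,
               1.0 * SUM(clicks) / SUM(impressions) AS ctr
        FROM ad_events
        GROUP BY network, geo
        ORDER BY network, geo
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in ctr_by_network_and_geo(ROWS):
        print(row)
```

The same GROUP BY shape applies whichever dimensions are sliced on (campaign, device, time bucket); only the grouping columns change.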

Snappy Capabilities in Use

  • Optimized data ingestion
  • In-memory column table querying
  • Stream processing
  • Write-behind to HDFS
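As a rough illustration of the write-behind idea in the list above, the sketch below acknowledges writes from memory immediately and archives them in batches on a background thread. It is a generic pattern sketch, not SnappyData's implementation: the class name `WriteBehindStore`, the `archive_batch` callback (standing in for whatever writes to HDFS), and the `batch_size` parameter are all assumptions made for this example.

```python
import queue
import threading


class WriteBehindStore:
    """In-memory key-value store that archives writes asynchronously in batches."""

    def __init__(self, archive_batch, batch_size=100):
        self._data = {}
        self._pending = queue.Queue()
        self._archive_batch = archive_batch  # hypothetical sink, e.g. an HDFS writer
        self._batch_size = batch_size
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def put(self, key, value):
        self._data[key] = value          # visible to readers immediately
        self._pending.put((key, value))  # archived later, off the write path

    def get(self, key):
        return self._data.get(key)

    def close(self):
        """Flush any remaining writes and stop the background worker."""
        self._pending.put(None)  # sentinel tells the worker to finish
        self._worker.join()

    def _drain(self):
        batch = []
        while True:
            item = self._pending.get()
            if item is None:
                break
            batch.append(item)
            if len(batch) >= self._batch_size:
                self._archive_batch(batch)
                batch = []
        if batch:
            self._archive_batch(batch)  # flush the final partial batch


if __name__ == "__main__":
    archived = []
    store = WriteBehindStore(archived.extend, batch_size=2)
    for key, value in [("a", 1), ("b", 2), ("c", 3)]:
        store.put(key, value)
    store.close()
    print(store.get("b"), archived)
```

The trade-off in this pattern is that ingestion latency stays low because the archival write is off the critical path, at the cost of a window in which recent writes exist only in memory.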

The Spark Database

SnappyData is Spark 2.x compatible and open source.