Over the past decade, NoSQL stores have become prevalent for processing high volume transactional-class (OLTP) workloads. As the need for real-time or operational analytics in these applications has arrived, many of these vendors have tried to bolt on analytics feature-sets. However, the complexity of processing OLAP workloads (fast!) is not easy to solve. As such, in the past, we've shown that simple benchmarks on widely used NoSQL stores - like Cassandra - have huge performance gaps, especially those that attempt to use a Spark connector for analytics.
Recently, a user asked us how we compare to Aerospike for running both transactional-class workloads (OLTP) and analytics-class workloads (OLAP). We knew Aerospike was a NoSQL store architected for high-volume OLTP applications, but we had not heard of it being used for OLAP. The latter led us to create a simple benchmark which is shown below.
There are some prima facie differences between SnappyData and Aerospike: Aerospike is a key-value store, written in C, targeted primarily at high write/transactional-type workloads. SnappyData is a SQL database deeply integrated with Spark, running on the JVM, primarily targeted at both transactional and analytical workloads. For this analysis, however, we are interested in performance, not features (though they may be mentioned in passing). Also, Aerospike has been analyzed in depth by third parties before.
For this benchmark, we go back to our tried and true FAA On Time Arrival data for all domestic flights. Luckily, Aerospike has released some code showcasing queries on this data before, so we reused that code to run the benchmark. We will measure latencies for simple analytics-class queries as well as point queries. Finally, like previous benchmarks, we have to distinguish between SnappyData column tables and SnappyData row tables.
Column tables: Store data in columnar format and are highly optimized for analytics-class queries (OLAP).
Row tables: Store data in a relational format and are highly optimized for transactional-class queries (OLTP).
Results when using SnappyData Column Tables:
- - For the analytics (OLAP) query, SnappyData is 200x faster than Aerospike
- - For the point lookup query, Aerospike is 124x faster than SnappyData
- - The memory footprint for Aerospike is about 4x higher
Results when using SnappyData Row Tables:
- - For the analytics (OLAP) query, SnappyData is 16x faster than Aerospike
- - For the point lookup query, Aerospike is 2x faster than SnappyData
- - SnappyData memory usage approximately 15% lower
- - Aerospike disk usage nearly 3.5x more than Snappy
In short, SnappyData dominates analytic query performance, even when using row tables. Further, Aerospike seems to have no clean support for aggregations. Users have to write Lua code as a user defined function, a far cry from SQL. For point lookups, Aerospike wins; however, on SnappyData’s row tables the margin is very small (the difference between 0.03ms and 0.05ms). So one can get better performance on analytics queries on SnappyData and almost the same performance on point lookup queries. You can download SnappyData here
- - Single machine with 32 cores and 128GB RAM
- - Approximately 30 million reocrds of On Time Arrival Data
- - Roughly 4GB of CSV formatted data.
- - 1 Aerospike server, build version 220.127.116.11
- - 50GB memory size
- - Airline Schema
- - Primary Key = ID
- - Data server, Lead & Locator on single machine, build 1.0.1
- - Heap size 10GB, off-heap 40GB
- - Airline schema
- - Partitioning Key = ID
Analytical Query Performance
The following query was used to measure OLAP performance on SnappyData and Aerospike:
This query is producing four columns, Airline Code, No. of Flights, No. of Late Flights, and Percentage of Late Flights. In short, it’s asking which airlines have the largest number of late flights and the highest percentage of late flights?
As noted above, we used the provided Aerospike code to execute the query, while for SnappyData, we just executed through the snappy-shell.
SnappyData Column Table response time: 130ms
SnappyData Row Table response time: 1795ms
Aerospike response time: 29816ms
Point Lookup Query Performance
For the point query test, we added a unique ID to all rows of the airline data for primary key lookups. We contrived it this way to better reflect point querying in the real world for the sake of the benchmark. The following queries were used to measure point lookup performance
SnappyData point lookup query:
Aerospike point lookup query:
SnappyData Row Table response time: 0.05ms
SnappyData Column Table response time: 3.74ms
Aerospike response time: 0.03ms
Aerospike is roughly 2x faster than SnappyData row tables and approximately 124x faster than SnappyData column tables for point lookup queries. This makes sense: Aerospike is optimized for very fast point lookup queries, but in the case of row tables the difference is about two hundredths of a millisecond or 20 microseconds. This difference is negligible for many classes of applications.
In conclusion, SnappyData is the clear leader for OLAP, analytics-class queries while Aerospike edges SnappyData slightly out of point lookup query performance. As a hybrid system, SnappyData must be competitive in both OLAP and OLTP style querying and we are pleased to see it can compete with some of the best OLTP-optimized systems like Aerospike. You can check out SnappyData on our download page and get help from our engineers on our Slack chat.
At the outset we discussed how many NoSQL OLTP-based systems tend to "bolt-on" analytics features sets. It's worth noting that the aforementioned Aerospike github repo for real-time analytics claims that the solution to fixing slow Big Data analytics is to "Use an Aerospike Aggregation on data streaming from the output of a query. Aggregations are the union of a query on a secondary index and one or more StreamUDFs." This is surprising advice given what we know from the past decade of key-value, OLTP stores; loading your Aerospike-managed data into even a vanilla Spark deployment and running your analytics queries would be far faster than the mentioned scenario. With SnappyData you can skip that and perform your OLTP operations and OLAP operations with predictable performance in the same system.
SnappyData create table DDLs:
For the point lookup queries, we performed 10000 warm up iterations and 10000 runs for calculating an average response time. We also tried different keys for each iteration and also different sets of keys for warmups vs timed runs to bypass and client caching. We did this for both systems. The Code used is provided below: