The SnappyData Blog

  • Running Spark SQL CERN queries 5x faster on SnappyData

    Sudhir Menon,

    In a recent blog post, Luca Canali from CERN measured the performance improvement between Spark 1.6 and Spark 2.0 using a Spark SQL join with two conditions. CERN found a 7x performance improvement from 1.6 to 2.0. We ran the same query on equivalent hardware on SnappyData and found a 5x performance improvement from Spark 2.0 to SnappyData. Learn more inside.

  • Joining a billion rows 20x faster than Apache Spark

    Sumedh Wale,

    One of Databricks’ best-known blog posts describes joining a billion rows in a second on a laptop. Since this is a fairly easy benchmark to replicate, we thought, why not try it on SnappyData and see what happens? We found that for joining two columns with a billion rows, SnappyData is nearly 20x faster.

  • SnappyData 0.7 now available: Up to 20x faster than Spark SQL and many more enhancements

    Neeraj Kumar,

    In this release, we are excited to demonstrate performance up to 20x faster than Apache Spark 2.0, depending on the Spark SQL workload in question. Scan-dependent workloads perform much better on SnappyData (the changes are discussed in this blog). We have improved the developer experience through one-click cloud services, better documentation, a new UI that extends the Spark console, a dedicated documentation section with ready-made code snippets that illustrate different aspects of the product, and many Synopses Data Engine improvements.

  • SnappyData as The Data Store for Spark

    Rishitesh Mishra,

    SnappyData turns Spark into a datastore that supports real-time Spark applications. It supports a high volume of writes, point updates, and point queries. SnappyData can store data in the same executor JVMs as Spark (Unified Mode) or out of process (Split Mode). The focus of this document is on Unified Mode.

  • The Spark Database

    Sudhir Menon,

    SnappyData users can use Spark as a database as well as a data processing platform. Learn more about Spark-as-a-database in this blog.

  • Spark 2.0, Structured Streaming and SnappyData

    Pierce Lamb,

    How does SnappyData fit into Spark 2.0, Structured Streaming, and the Spark ecosystem? Find out more within.