The SnappyData Blog

  • How Mutable DataFrames improve join performance in Spark SQL

    Sudhir Menon,

    In this blog we showcase a credit card fraud detection example in which the vanilla Spark approach to joining a streaming DataFrame with a static DataFrame limits performance. We demonstrate how using Mutable DataFrames inside SnappyData improves it. Code examples are provided.

  • Joining a billion rows 20x faster than Apache Spark

    Sumedh Wale,

    One of Databricks’ best-known blog posts describes joining a billion rows in a second on a laptop. Since this is a fairly easy benchmark to replicate, we thought, why not try it on SnappyData and see what happens? We found that for joining two columns with a billion rows, SnappyData is nearly 20x faster.

  • SnappyData 0.7 now available: Up to 20x faster than Spark SQL and many more enhancements

    Neeraj Kumar,

    In this release, we are excited to demonstrate speedups of up to 20x over Apache Spark 2.0, depending on the Spark SQL workload in question. Scan-dependent workloads perform much better on SnappyData (the changes are discussed in this blog). We have improved the developer experience through one-click cloud services, better documentation, a new UI that extends the Spark console, and a dedicated documentation section of ready-made code snippets that illustrate different aspects of the product. The release also includes many Synopses Data Engine improvements.

  • iSight Cloud - Lightning fast visualizations on large data sets at a fraction of the cost and complexity

    Jags Ramnarayan,

    How do you move a data lake into a compute cloud, so analytical workloads can run efficiently? How do you reduce the cost of running expensive analytics queries over growing data volumes? Learn more inside.

  • SnappyData as The Data Store for Spark

    Rishitesh Mishra,

    SnappyData turns Spark into a datastore that supports real-time Spark applications. It supports a high volume of writes, point updates, and point queries. SnappyData can store data in the same executor JVMs as Spark (Unified Mode) or out of process (Split Mode). The focus of this document is on Unified Mode.

  • The Spark Database

    Sudhir Menon,

    SnappyData users can use Spark as a database in addition to a data processing platform. Learn more about Spark-as-a-database in this blog.