SnappyData versions

Download old packages

Download SnappyData 1.0.2 Release

    • Introduced an API in the snappy session catalog to get the primary key of row tables or the key columns of column tables as a DataFrame. (SNAP-2459)
    • Introduced an API in the snappy session catalog to get the table type as a String. (SNAP-2477)
    • Added support for view definitions of arbitrary size. View creation used to fail when the view text size went beyond 32k.
      Added support for displaying VIEWTEXT for views in SYS.HIVETABLES.
      For example, SELECT viewtext FROM sys.hivetables WHERE tablename = 'view_name' returns the text with which the view was created.
    • Added the Row Level Security feature. Admins can define multiple security policies on tables for different users or LDAP groups.
      Refer to Row Level Security.
    • Auto refresh of the UI page. The SnappyData UI page now updates automatically and frequently, so the user does not have to refresh or reload it. Refer to SnappyData Pulse.
    • Richer user interface. Added graphs for memory, CPU consumption, etc. for the last 15 minutes, so the user can see how the cluster health has been over that period instead of just the current state.
    • Total CPU core count capacity of the cluster is now displayed on the UI.
      Refer to SnappyData Pulse.
    • Bucket counts of tables are now also displayed on the user interface.
    • Support for deployment of packages and jars as a DDL command.
    • Added support for reading Maven dependencies using the --packages option in the job server scripts.
    • Changed the procedure sys.repair_catalog to execute on the server (earlier it was run on the lead by sending a message to it). This is useful for repairing the catalog even when the lead is down.
      Refer to Catalog Repair.
    • Added support for the PreparedStatement.getMetaData() JDBC API. This is on an experimental basis.
    • Added support for executing some DDL commands, namely CREATE/DROP DISKSTORE, GRANT, REVOKE, and CALL procedures, from the snappy session as well.
    • Table names are now quoted in all store DDL/DML/query strings to allow special characters and keywords in table names.
    • Spark applications with the same name cannot be submitted to SnappyData. This has been done so that an individual app can be killed by its name when required.
    • Based on the system property snappydata.RESTRICT_TABLE_CREATION, users can be disallowed from creating tables in their own schema. In some cases it may be required to control the use of cluster resources, in which case table creation is done only by authorized owners of the schema.
    • A schema can also be owned by an LDAP group, not necessarily by a single user.
    • Support for deploying SnappyData on Kubernetes using Helm charts. This feature is currently experimental.
      Refer to Kubernetes.
    • Disk Store Validate tool enhancement. Validation of a disk store can now find all the inconsistencies at once.
    • The BINARY data type is the same as the BLOB data type.
    • Fixed a concurrent query performance issue by resolving the incorrect output partition choice. Due to a numBucket check, all partition-pruned queries were converted to hash partitioning with one partition, which caused an exchange node to be introduced. (SNAP-2421)
    • Fixed the SnappyData UI becoming unresponsive on LowMemoryException. (SNAP-2071)
    • Cleaned up tokenization handling, with related fixes. The main change is the addition of the following two separate classes for tokenization:
      • ParamLiteral
      • TokenLiteral
    • Fixed incorrect server status shown on the UI. Sometimes, due to a race condition, two entries were shown on the UI for the same member. (SNAP-2433)
    • Fixed missing SQL tab on SnappyData UI in local mode. (SNAP-2470)
    • Fixed a few issues related to wrong results for row tables due to plan caching. (SNAP-2463: incorrect push-down of OR and AND clause filter combinations in push-down queries; SNAP-2351: re-evaluation of filters was not happening due to plan caching; SNAP-2451; SNAP-2457)
    • Skip batch, if the stats row is missing while scanning column values from disk. This was already handled for in-memory batches and the same has been added for on-disk batches. (SNAP-2364)
    • Fixed the UI so that unauthorized users cannot see any tab. (ENT-21)
    • Fixes in the SnappyData parser to create inlined tables (SNAP-2302) and to make ‘()’ optional in some functions such as ‘current_date()’ and ‘current_timestamp()’. (SNAP-2303)
    • The current schema name is now also considered part of the caching key for plan caching, so that the same query on the same table but from different schemas does not clash. (SNAP-2438)
    • Fixed a COLUMN table mysteriously being shown as a ROW table on the dashboard after a LowMemoryException (LME) in the data server. (SNAP-2382)
    • Fixed the off-heap size for partitioned regions shown on the UI. (SNAP-2186)
    • Fixed a failure where a query on a view did not fall back to the Spark plan when code generation failed. (SNAP-2363)
    • Fixed an invalid decompress call on the stats row, which used to fail at run time while scanning column tables. (SNAP-2348)
    • Fixed negative bucket size with eviction. (GITHUB-982)
    • Fixed the issue of LowMemoryException being thrown incorrectly even when a lot of memory was left. (SNAP-2356)
    • Handled an int overflow case in memory accounting that caused ExecutionMemoryPool to release more memory than it had and throw an AssertionError. (SNAP-2312)
    • Fixed the pooled connection not being returned to the pool after an authorization check failure, which led to an unusable cluster. (SNAP-2255)
    • Fixed different results of nearly identical queries due to join order. This was due to EXCHANGE hash ordering being different from table partitioning. It happened in the specific case where the query join order is different from the partitioning of one of the tables while the other table being joined is partitioned differently. (SNAP-2225)
    • Corrected row count updated/inserted in a column table via putInto. (SNAP-2220)
    • Fixed the OOM issue due to hive queries. This was a memory leak that made the system very slow after some time, even when idle. (SNAP-2248)
    • Fixed the issue of incomplete plan and query string info in UI due to plan caching changes.
    • Corrected the logic of existence join.
    • Sensitive information such as user passwords and LDAP passwords, passed as properties to the cluster, is now masked on the UI.
    • Fixed schemas with boolean columns sometimes returning incorrect null values. (SNAP-2436)
    • Fixed the scenario where a break in the colocation chain of buckets due to a crash led to disk store metadata going bad, causing restart failure.
    • Fixed wrong entry count on restart when a region got closed on a server due to DiskAccessException, which gave the impression of data loss. The region is no longer closed in case of LME; this has been done by not letting non-IOExceptions get wrapped in DiskAccessException. (SNAP-2375)
    • Fix to avoid a hang or delay in stop when stop is issued and the component has gone into a reconnect cycle. (SNAP-2380)
    • Improved handling of new servers joining. Avoids ConflictingPersistentDataException when a new server starts before any of the old servers start. (SNAP-2236)
    • ODBC driver bug fix. Added EmbedDatabaseMetaData.getTableSchemas.
    • Changed the order in which backup is taken. The internal DD disk store is backed up first, followed by the rest of the disk stores. This helps stream apps that want to store the offset of a replayable source in SnappyData: they can create the offset table backed by the internal DD store instead of the default or a custom disk store.
    Artifact Name                               Description
    snappydata-1.0.2-bin.tar.gz                 Full product binary (includes Hadoop 2.7)
    snappydata-1.0.2-without-hadoop-bin.tar.gz  Product without the Hadoop dependency JARs
    snappydata-client-1.6.2.jar                 Client (JDBC) JAR
    snappydata-zeppelin_2.11-                   The Zeppelin interpreter JAR for SnappyData, compatible with Apache Zeppelin 0.7.3
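
The plan-caching fix listed for this release (SNAP-2438) makes the current schema name part of the caching key. The following is a minimal Python sketch of that idea only. The names `PlanCache`, `get_or_plan` and `toy_planner`, and the use of plain strings as "plans", are hypothetical and chosen for illustration; this is not SnappyData's implementation.

```python
from typing import Callable, Dict, Tuple


class PlanCache:
    """Toy plan cache keyed by (current schema, query text). Hypothetical."""

    def __init__(self) -> None:
        self._plans: Dict[Tuple[str, str], str] = {}

    def get_or_plan(self, schema: str, sql: str,
                    planner: Callable[[str, str], str]) -> str:
        # Including the schema in the key keeps the same query text,
        # issued from two different schemas, from sharing one cached plan.
        key = (schema.upper(), sql)
        if key not in self._plans:
            self._plans[key] = planner(schema, sql)
        return self._plans[key]


def toy_planner(schema: str, sql: str) -> str:
    # Stand-in for real planning: pretend the unqualified table name
    # is resolved against the current schema.
    return f"scan {schema.upper()}.T1 for [{sql}]"


cache = PlanCache()
plan_a = cache.get_or_plan("app", "SELECT * FROM t1", toy_planner)
plan_b = cache.get_or_plan("sales", "SELECT * FROM t1", toy_planner)
```

If the key were the query text alone, the second call would return the plan built for the first schema, which is the kind of clash the fix describes.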

Download SnappyData 1.0.1 Release

    • putInto and deleteFrom bulk operations support for column tables (SNAP-2092, SNAP-2093, SNAP-2094):
      • ability to specify "key columns" in the table DDL to use for putInto and deleteFrom APIs
      • "PUT INTO" SQL or putInto API extension to overwrite existing rows and insert non-existing ones
      • "DELETE FROM" SQL or deleteFrom API extension to delete a set of matching rows
      • UPDATE SQL now supports using expressions with column references of another table in RHS of SET
    • Improvements in cluster restart with off-line, failed nodes or with corrupt meta-data (SNAP-2096)
      • new admin command "unblock" to allow the initialization of a table even if it is waiting for offline members
      • retain data unlike revoke and initialize with the latest online working copy (SNAP-2143)
      • parallel recovery of data regions to break any cyclic dependencies between the nodes, and allow reporting on all off-line nodes that may have more recent copy of data
      • many bug-fixes related to startup issues due to meta-data inconsistencies:
        incorrect data conflicts (SNAP-2097, SNAP-2098), metadata corruption (SNAP-2140)
    • Compression of column batches in disk storage and over the network (SNAP-1743)
      • support for LZ4, SNAPPY compression codecs in disk storage and transport of column table data
      • new SOURCEPATH and COMPRESSION columns in SYS.HIVETABLES virtual table
    • Support for temporary, global temporary and persistent VIEWs (SNAP-2072)
    • No jar dependencies in snappydata cluster for external datasources of smart connector (SNAP-2072)
    • External tables display in dashboard and snappy command-line (SNAP-2086)
    • Auto-configuration of SPARK_PUBLIC_DNS, hostname-for-clients etc in AWS environment (SNAP-2116)
    • GRANT/REVOKE SQL support in SnappySession.sql(); earlier this was only allowed from JDBC/ODBC (SNAP-2042)
    • LATERAL VIEW support in SnappySession.sql() (SNAP-1283)
    • FETCH FIRST syntax as an alternative to LIMIT to support some SQL tools that use the former
    • Addition of IndexStats for local row table index lookups and range scans
    • SYS.DISKSTOREIDS virtual table to show disk-store IDs being used in the cluster by all members (SNAP-2113)
    • Major performance improvements in smart connector mode (SNAP-2101, SNAP-2084)
      • minimized buffer copying, key lookups in column table rather than full scan for filters, reduce round-trips
      • allow using SnappyUnifiedMemoryManager with smart connector (SNAP-2084)
    • New memory and disk iterator to minimize faultins and serialize disk reads (SNAP-2102):
      • reduce faultins and cross-iterator serial disk reads per diskstore to minimize random reads from disk
      • new remote iterator that substantially reduces the memory overhead and caches only current batch
    • Startup performance improvements to cut down on locator/server/lead start and restart times (SNAP-338)
    • Improve performance of reads of variable length data for some queries (SNAP-2118)
    • Use colocated joins with VIEWs when possible (SNAP-2204)
    • Separate disk store for delta buffer regions to substantially improve column table compaction (SNAP-2121)
    • Projection push-down to scan layer for non-deterministic expressions like spark_partition_id() (SNAP-2036)
    • code-generation cache is larger by default and configurable (SNAP-2120)
    • Only overflow-to-disk is now allowed as the eviction action for tables (SNAP-1501):
      • overflow-to-disk is the only valid eviction action and cannot be explicitly specified
      • the OVERFLOW=false property can be used to disable eviction (OVERFLOW is true by default)
    • Memory accounting fixes:
      • incorrect initial memory accounting causing insert failure even with memory available (SNAP-2084)
      • zero usage shown in UI on restart (SNAP-2180)
    • Disable embedded Zeppelin interpreter in a secure cluster which can bypass security (SNAP-2191)
    • Fix import of JSON data (SNAP-2087)
    • fix selects missing results or failing during node failures (SNAP-889, SNAP-1547)
    • fixes and improvements to server and lead status in both the launcher status and SYS.MEMBERS table
      (SNAP-1960, SNAP-2060, SNAP-1645)
    • fix updates on complex types (SNAP-2141)
    • column table scan fixes related to null value reads (SNAP-2088)
    • disable tokenization for external tables, flags to disable it and plan caching (SNAP-2114, SNAP-2124)
    • fix deadlock in transactional operations with GII (SNAP-1950)
    • couple of fixes in UPDATE SQL: unexpected rollover (SNAP-2192), show as update count (SNAP-2156)
    • fixes ported from Apache Geode (GEODE-2109, GEODE-2240)
    • fixes to all failures in snappy-spark test suite which includes both product and test changes
    • more comprehensive python API testing (SNAP-2044)
    Artifact Name                               Description
    snappydata-1.0.1-bin.tar.gz                 Full product binary (includes Hadoop 2.7)
    snappydata-1.0.1-without-hadoop-bin.tar.gz  Product without the Hadoop dependency JARs
    snappydata-client-1.6.1.jar                 Client (JDBC) JAR
    snappydata-zeppelin-0.7.3.jar               The Zeppelin interpreter JAR for SnappyData, compatible with Apache Zeppelin 0.7.3
    snappydata-ec2-0.8.1.tar.gz                 Script to launch a SnappyData cluster on AWS EC2 instances
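
The putInto and deleteFrom items in this release describe key-column semantics: overwrite existing rows and insert non-existing ones, or delete the set of matching rows. The sketch below models those semantics on plain Python lists of dictionaries so the behavior is easy to see. The helper names `put_into` and `delete_from` are hypothetical stand-ins written for this illustration; they are not the SnappyData API and say nothing about how the product implements the operations.

```python
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, object]


def _key(row: Row, key_columns: Sequence[str]) -> Tuple[object, ...]:
    # The identity of a row is the tuple of its key-column values.
    return tuple(row[c] for c in key_columns)


def put_into(table: List[Row], rows: List[Row],
             key_columns: Sequence[str]) -> List[Row]:
    """Overwrite rows whose key columns match; insert the rest."""
    merged = {_key(r, key_columns): r for r in table}
    for r in rows:
        merged[_key(r, key_columns)] = r
    return list(merged.values())


def delete_from(table: List[Row], rows: List[Row],
                key_columns: Sequence[str]) -> List[Row]:
    """Delete every row whose key columns match one of the given rows."""
    doomed = {_key(r, key_columns) for r in rows}
    return [r for r in table if _key(r, key_columns) not in doomed]


table = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]
after_put = put_into(table, [{"id": 2, "v": "B"}, {"id": 3, "v": "c"}], ["id"])
after_delete = delete_from(after_put, [{"id": 1}], ["id"])
```

Here the row with `id` 2 is overwritten, the row with `id` 3 is inserted, and the later delete removes only the row with `id` 1.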

Download SnappyData 1.0.0 Release

    • Fully compatible with Apache Spark 2.1.1
    • Mutability support for column store (SNAP-1389):
      • UPDATE and DELETE operations are now supported on column tables.
    • ALTER TABLE support for row table (SNAP-1326).
    • Security Support (available in enterprise edition): This release introduces cluster security with authentication and authorization based on the LDAP mechanism. It will be extended to other mechanisms in the future (SNAP-1656, SNAP-1813).
    • DEB and RPM installers (distProduct target in source build).
    • Support for setting scheduler pools using the set command.
    • Multi-node cluster now boots up quickly as background start of server processes is enabled by default.
    • Pulse Console: SnappyData Pulse has been enhanced to be more useful to both developers and operations personnel (SNAP-1890, SNAP-1792). Improvements include:
      • Ability to sort the member list based on member type.
      • Added a new UI view named SnappyData Member Details Page which includes, among other things, the latest logs.
      • Added members' heap and off-heap memory usage details along with their storage and execution splits.
    • Users can specify streaming batch interval when submitting a stream job via conf/ (SNAP-1948).
    • Row tables now support LONG, SHORT, TINYINT and BYTE datatypes (SNAP-1722).
    • The history file for snappy shell has been renamed from .gfxd.history to .snappy.history. You may copy your existing ~/.gfxd.history to ~/.snappy.history to be able to access your historical snappy shell commands.
    • Performance enhancements with dictionary decoder when dictionary is large. (SNAP-1877)
      • Using a consistent sort for pushed down predicates so that different sessions do not end up creating different generated code.
      • Reduced the size of generated code.
    • Indexed cursors in decoders to improve heavily filtered queries (SNAP-1936)
    • Performance improvements in Smart Connector mode, especially with queries on tables with wide schema (SNAP-1363, SNAP-1699)
    • Several other performance improvements.
    • Fixed data inconsistency issues when a new node joins the cluster while write operations are in progress (SNAP-1756).
    • The product now internally retries on redundant copies of partitions in the event of a node failure (SNAP-1377, SNAP-902).
    • Fixed the wrong status of locators on restarts. After a cluster restart, locators used to be shown in waiting state even when their actual status had changed to running (SNAP-1893).
    • Fixed the SnappyData Pulse freezing when loading data sets (SNAP-1426).
    • More accurate accounting of execution and storage memory (SNAP-1688, SNAP-1798).
    • Corrected case-sensitivity handling for query API calls (SNAP-1714).
    Artifact Name                               Description
    snappydata-1.0.0-bin.tar.gz                 Full product binary (includes Hadoop 2.7)
    snappydata-1.0.0-without-hadoop-bin.tar.gz  Product without the Hadoop dependency JARs
    snappydata-client-1.6.0.jar                 Client (JDBC) JAR
    snappydata-zeppelin-0.7.2.jar               The Zeppelin interpreter JAR for SnappyData, compatible with Apache Zeppelin 0.7.2
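
The 1.0.0 notes mention that the snappy shell history file was renamed and that an existing ~/.gfxd.history may be copied to ~/.snappy.history. A minimal sketch of that copy step in Python follows. The function name `migrate_shell_history` is hypothetical, and the choice to leave an already existing ~/.snappy.history untouched is an assumption of this sketch, not something the release notes specify.

```python
import shutil
from pathlib import Path


def migrate_shell_history(home: Path) -> bool:
    """Copy .gfxd.history to .snappy.history under the given home directory.

    The copy happens only if the old file exists and the new one does not
    (an assumption of this sketch, to avoid overwriting newer history).
    Returns True when a copy was made.
    """
    old = home / ".gfxd.history"
    new = home / ".snappy.history"
    if old.is_file() and not new.exists():
        shutil.copy2(old, new)  # copy2 also preserves timestamps
        return True
    return False
```

Calling `migrate_shell_history(Path.home())` applies it to the current user's home directory; the original ~/.gfxd.history is left in place.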

See a benchmark

SnappyData, MemSQL-Spark & Cassandra-Spark: A Benchmark