In early 2018, we quietly launched a new feature on our website called CloudBuilder. You can find it by clicking “cloud” in our top navigation, where it presents a form for easily launching a SnappyData cluster on AWS. The form uses Amazon CloudFormation to automatically create the desired cluster in the user's AWS account. Below we will discuss CloudBuilder’s features.

Why use CloudBuilder when I could use AWS Marketplace?

Well, you can use AWS Marketplace to launch SnappyData. CloudBuilder, however, makes a number of crucial decisions easier. First, for the person new to SnappyData who wants to get up and running as fast as possible, CloudBuilder offers a community option which spins up the community version of SnappyData on a single node for free. The user still pays the AWS cost of running a single EC2 node, but pays nothing for the use of SnappyData. In this case, you make three clicks and have SnappyData community edition running inside your AWS account immediately. Before we continue, I’ll note that we have a screencast of using CloudBuilder to spin up a SnappyData enterprise cluster on EC2, which I’ll embed below:

When you select the enterprise option of CloudBuilder, the first choice you’re given is whether you want SnappyData’s locator and lead nodes to be highly available. These nodes manage things like cluster membership coordination and hosting the Spark driver. You can learn more about them here. We wanted to make sure nothing inside SnappyData constituted a single point of failure, which is an often-mentioned issue with the Spark driver in vanilla Spark deployments. This step gives you that option.

Secondly, we ask the user how much memory and disk they want in their deployment. We often find that our users have a rough idea of the memory they need for a cluster before they know exactly which EC2 instance they want. As such, we let them select memory first; that selection then updates the set of potential EC2 cluster options in the next step. So, as one moves the memory slider in step 3, the recommended EC2 clusters update in step 4.

In step 4, CloudBuilder builds a list of potential EC2 clusters based on the choices you have made so far. These clusters are sorted by estimated price, lowest first. The hope is that our form will find the cheapest deployment that matches what you need. We’ve also limited the list to a subset of clusters that are a good fit for SnappyData. Each cluster option shows the estimated hourly price for the entire cluster plus SnappyData, the number of nodes, and the total RAM and cores. The number of nodes is editable in case you want a set number. Finally, two further options are presented in this step: whether or not you want a highly available deployment, and whether or not your workloads have high query volumes. Both checkboxes change the recommended clusters to support the selected use case.
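To make the idea in step 4 concrete, here is a minimal Python sketch of how a memory requirement could be turned into a price-sorted list of cluster options. This is not CloudBuilder's actual code: the instance names, RAM sizes, core counts, and prices are made up for illustration, and treating the high-availability checkbox as "at least two nodes" is purely an assumption.

```python
from dataclasses import dataclass
from math import ceil


@dataclass(frozen=True)
class InstanceType:
    name: str
    ram_gb: int
    cores: int
    hourly_cents: int  # made-up price, in cents to avoid float rounding


# Hypothetical catalog; these are NOT real EC2 instance types or prices.
CATALOG = [
    InstanceType("small-example", ram_gb=16, cores=4, hourly_cents=20),
    InstanceType("medium-example", ram_gb=64, cores=8, hourly_cents=60),
    InstanceType("large-example", ram_gb=128, cores=16, hourly_cents=150),
]


@dataclass(frozen=True)
class ClusterOption:
    instance: InstanceType
    nodes: int

    @property
    def total_ram_gb(self) -> int:
        return self.instance.ram_gb * self.nodes

    @property
    def total_cores(self) -> int:
        return self.instance.cores * self.nodes

    @property
    def hourly_cents(self) -> int:
        return self.instance.hourly_cents * self.nodes


def recommend(required_ram_gb: int, highly_available: bool = False):
    """Return one option per instance type, cheapest first."""
    # Assumption: high availability simply means at least two nodes.
    min_nodes = 2 if highly_available else 1
    options = []
    for inst in CATALOG:
        # Smallest node count whose combined RAM covers the requirement.
        nodes = max(min_nodes, ceil(required_ram_gb / inst.ram_gb))
        options.append(ClusterOption(inst, nodes))
    return sorted(options, key=lambda o: o.hourly_cents)


if __name__ == "__main__":
    for opt in recommend(100):
        print(opt.instance.name, opt.nodes, opt.total_ram_gb,
              opt.total_cores, opt.hourly_cents / 100)
```

With this toy catalog, asking for 100 GB ranks two medium nodes ahead of seven small nodes or one large node, which mirrors how moving the memory slider reshuffles the recommendations.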

Finally, agree to our terms of service, click generate, select which region you want to deploy in, and voila, you are done. You’ll have to click through a couple of AWS CloudFormation pages and then your cluster creation will begin. Check out the screencast to see this process. Once cluster creation completes, click on the “outputs” tab and you’ll see an Apache Zeppelin URL. Click this link and you’ll be taken to a front-end notebook tool running against your newly deployed cluster. There you will find a number of notebooks where you can learn how SnappyData works and see running code examples.
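For readers curious how a web form can hand off to CloudFormation, one plausible mechanism is a CloudFormation console "quick-create" link, which opens the stack creation page with a template and parameters pre-filled. The sketch below builds such a link in Python. Whether CloudBuilder works this way is an assumption on our part for illustration; the template URL, stack name, and parameter names (`NodeCount`, `InstanceType`) are hypothetical.

```python
from urllib.parse import quote, urlencode


def launch_url(region, stack_name, template_url, params=None):
    """Build a CloudFormation console quick-create link."""
    query = {"templateURL": template_url, "stackName": stack_name}
    # Template parameters are passed as param_<ParameterName>=<value>.
    for name, value in (params or {}).items():
        query[f"param_{name}"] = value
    return (
        "https://console.aws.amazon.com/cloudformation/home"
        f"?region={quote(region)}"
        f"#/stacks/create/review?{urlencode(query)}"
    )


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(launch_url(
        region="us-east-1",
        stack_name="snappydata-demo",
        template_url="https://example.com/template.json",
        params={"NodeCount": 3, "InstanceType": "medium-example"},
    ))
```

Opening a link like this lands the user on the CloudFormation review page in the chosen region, which matches the "click through a couple of pages" experience described above.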

And that's it! We hope you’ll check out CloudBuilder and, if you have questions, join our Community Slack Chat.
