Distributed Systems Engineer

Pune, India

Job Description:

Are you looking to put your computer science skills to use? Are you looking to work for one of the hottest startups in Silicon Valley? Are you looking to define the next-generation data management platform based on Apache Spark? Are you excited by the idea of being a Spark committer?

If you answered yes to all of the questions above, we definitely want to talk to you. We are looking to add engineers with experience in building large-scale distributed systems to our product development team in Pune. As a distributed systems engineer (if you are good), you will get to define key elements of our real-time analytics platform, including:

  • Distributed in-memory data management
  • OLTP and OLAP querying in a single platform
  • Approximate Query Processing over large data sets
  • Online machine learning algorithms applied to streaming data sets
  • Streaming and continuous querying


What we are looking for:

  • You should be a pro in Java (at least 4-5 years) and have some exposure to functional programming in Scala
  • If you have built SQL engines, you will find this place interesting
  • You should care about performance, and by that, we mean performance optimizations in a JVM
  • You should be willing to professionally argue for your designs
  • You should be self-motivated and driven to succeed
  • If you are an open source committer on any project, especially an Apache project, you will fit right in
  • Experience working with Spark is a BIG plus
  • If you have solved big complex problems, we want to talk to you
  • If you are a math geek with a background in statistics or mathematics, and you know what linear regression is, this just might be the place for you
  • Exposure to stream processing frameworks such as Storm or Samza is a plus

Open source contributors: Send us your GitHub ID!


SnappyData is a new real-time analytics platform that combines probabilistic data structures, approximate query processing, and in-memory distributed data management to deliver powerful analytic querying and alerting capabilities on Apache Spark at a fraction of the cost of traditional big data analytics platforms.

SnappyData fuses the Spark computational engine with a highly available, multi-tenant in-memory database to execute OLAP and OLTP queries on streaming data. Further, SnappyData can store data in a variety of synopsis data structures to provide extremely fast responses with fewer resources. Finally, applications can either submit Spark programs or connect using JDBC/ODBC to run interactive or continuous SQL queries.


  • Distributed Systems
  • Scala
  • Apache Spark
  • Spark SQL
  • Spark Streaming
  • Java
  • YARN/Mesos

What's in it for you:

  • Cutting edge work that is ultra meaningful
  • Colleagues who are the best of the best
  • Meaningful startup equity
  • Competitive base salary
  • Full benefits
  • Casual, fun office

Company Overview:

SnappyData is a Silicon Valley-funded startup founded by engineers who pioneered the distributed in-memory data business. It is advised by some of the legends of the computing industry who have been instrumental in creating multiple disruptions that have defined computing over the past 40 years. The engineering team that powers SnappyData built GemFire, one of the industry-leading in-memory data grids, which is used worldwide in mission-critical applications ranging from finance to retail.

To apply:

Click this link and select the job you're interested in
