Splice Machine Installation Guide

The topics in this Guide only apply to our On-Premise Database product.

You can download our free trial and learn more about Splice Machine products here.

This Installation Guide walks you through installing Splice Machine on your cluster, or on your computer if you're using the standalone version.

The fastest way to get started with Splice Machine is to set up our sandbox on the Amazon Web Services (AWS) platform on EC2 instances using cloud.splicemachine.com.

See the Installing Splice Machine on Amazon Web Services topic in this chapter for step-by-step instructions for setting up the Splice Machine sandbox.

If you want to download and install Splice Machine on your cluster or standalone computer, please read the remainder of this page, which includes these sections:

  • The Cluster Node Requirements section below details the hardware and ecosystem requirements for installing Splice Machine on a cluster or on a standalone computer.
  • The Configure Linux for Splice Machine section specifies the Linux software that Splice Machine requires.
  • The Install Splice Machine section links to the platform-specific installation and upgrade pages for each version of Splice Machine.

Installing the Splice Machine Sandbox on AWS

The fastest way to get started with Splice Machine is to set up our sandbox on the Amazon Web Services (AWS) platform on EC2 instances using cloud.splicemachine.com.

See the Installing Splice Machine on Amazon Web Services topic in this chapter for step-by-step instructions.
   

Cluster Node Requirements

The following table summarizes the minimum requirements for the nodes in your cluster:

Component Requirements
Cores: Splice Machine recommends that each node in your cluster have 8-12 hyper-threaded cores (16-32 hyper-threads) for optimum throughput and concurrency.
Memory: We recommend that each machine in your cluster have at least 64 GB of available memory.
Disk Space:

Your root drive needs to have at least 100 GB of free space.

Splice Machine recommends separate data drives on each cluster node to maintain a separation between the operating system and your database data. You need capacity for a minimum of three times the size of the data you intend to load; the typical recommended configuration is 2 TB or more of attached storage per node.

Your data disks should be set up with a single partition and formatted with an ext4 file system.

Hadoop Ecosystem: The table in the Hadoop Ecosystem Requirements section below summarizes the specific Hadoop component versions that we support in each of our product releases.
Software Tools and System Settings: The Linux Configuration topic in the section of our Installation Guide that pertains to your installation summarizes the software tools and system settings required for your cluster machines.
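
As a rough illustration of the table above, the following sketch compares a Linux node against the recommended thread count, memory, and root free space, and computes the three-times data capacity rule. The function names `check` and `required_data_gb` are hypothetical helpers, not part of Splice Machine, and the thresholds are simply the low end of the figures in the table. Note that `MemTotal` reports slightly less than the installed RAM, so treat a near miss as a prompt to look closer rather than a failure.

```shell
# Thresholds taken from the Cluster Node Requirements table (low end of each range).
MIN_THREADS=16          # 8 hyper-threaded cores = 16 hyper-threads
MIN_MEM_GB=64
MIN_ROOT_FREE_GB=100

# check LABEL ACTUAL MINIMUM: print OK or WARN; return non-zero on WARN.
check() {
  if [ "$2" -ge "$3" ]; then
    echo "OK   $1: $2 (minimum $3)"
  else
    echo "WARN $1: $2 (minimum $3)"
    return 1
  fi
}

# required_data_gb DATA_GB: data-drive capacity at three times the data size.
required_data_gb() {
  echo $(( $1 * 3 ))
}

# Only probe the machine when the Linux interfaces we rely on are present.
if [ -r /proc/meminfo ] && command -v nproc >/dev/null 2>&1; then
  threads=$(nproc)
  mem_gb=$(awk '/^MemTotal:/ { printf "%d", $2 / 1024 / 1024 }' /proc/meminfo)
  root_free_gb=$(df -Pk / | awk 'NR == 2 { printf "%d", $4 / 1024 / 1024 }')
  check "hyper-threads" "$threads" "$MIN_THREADS" || true
  check "memory (GB)" "$mem_gb" "$MIN_MEM_GB" || true
  check "root free space (GB)" "$root_free_gb" "$MIN_ROOT_FREE_GB" || true
fi

echo "Capacity for a 500 GB load: $(required_data_gb 500) GB"   # prints 1500 GB
```

Running the script on each node (for example over SSH) gives a quick first pass before you start the installation.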

Amazon Web Services (AWS) Requirements

If you're running on AWS, your cluster must meet these minimum requirements:

Component Requirements
Minimum Cluster Size:

The minimum cluster size on AWS is 5 nodes:

  • 1 master node
  • 4 worker nodes
Minimum Node Size: The minimum recommended size of each node is m4.4xlarge.
Disk Space:

Minimum recommended storage space:

  • 100 GB EBS root drive
  • 4 EBS data drives per node

Note that the required number of data drives per node depends on your use case.

Hadoop Ecosystem Requirements

The following table summarizes the required Hadoop ecosystem components for your platform:

Splice Machine   Hadoop platform       Linux           Hadoop   HBase   ZooKeeper
Release 2.6                            CentOS/RHEL 6   2.6.0    1.0.0   3.4.5
                                       CentOS/RHEL 6   2.7.1    1.1.2   3.4.5
                                       CentOS/RHEL 6   2.7.0    1.1.1   3.4.5
Release 2.5      CDH 5.8.3 and 5.8.0   CentOS/RHEL 6   2.6.0    1.0.0   3.4.5
                 HDP 2.5.5             CentOS/RHEL 6   2.7.1    1.1.2   3.4.5
                 MapR 5.2.0            CentOS/RHEL 6   2.7.0    1.1.1   3.4.5
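
One way to compare an installed component against the versions listed above is a small prefix match, since vendor builds usually append a suffix to the base version. The sketch below is only an illustration: `version_matches` is a hypothetical helper, `REQUIRED_HADOOP` is set to the Hadoop version from the CDH row, and it assumes the first line of `hadoop version` has the form `Hadoop <version>`.

```shell
# Hadoop version from the CDH row of the table; change it for your platform.
REQUIRED_HADOOP=2.6.0

# version_matches INSTALLED REQUIRED: succeed when INSTALLED equals REQUIRED
# or is REQUIRED followed by a vendor suffix such as "-cdh5.8.3".
version_matches() {
  case "$1" in
    "$2" | "$2"-* | "$2".*) return 0 ;;
    *) return 1 ;;
  esac
}

# Only query Hadoop when the command is actually installed on this node.
if command -v hadoop >/dev/null 2>&1; then
  installed=$(hadoop version 2>/dev/null | awk 'NR == 1 { print $2 }')
  if version_matches "$installed" "$REQUIRED_HADOOP"; then
    echo "OK   Hadoop $installed matches $REQUIRED_HADOOP"
  else
    echo "WARN Hadoop $installed does not match $REQUIRED_HADOOP"
  fi
fi
```

The same helper can be reused for the HBase and ZooKeeper columns once you have extracted their version strings.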

Java JDK Requirements

Splice Machine supports the following versions of the Java JDK:

  • Oracle JDK 1.8, update 60 or higher

    We recommend that you do not use JDK 1.8.0_40.

Splice Machine does not test our releases with OpenJDK, so we recommend against using it.
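
The JDK requirement can be checked from the version string that `java -version` prints. The following is a minimal sketch, assuming the usual `1.8.0_<update>` format for JDK 8 version strings; `jdk_supported` is a hypothetical helper name, not a Splice Machine tool.

```shell
# jdk_supported VERSION: succeed for JDK 1.8 at update 60 or higher.
jdk_supported() {
  case "$1" in
    1.8.0_*) ;;
    *) return 1 ;;
  esac
  update=${1#1.8.0_}            # strip the "1.8.0_" prefix
  update=${update%%[!0-9]*}     # drop any suffix such as "-b27"
  [ -n "$update" ] && [ "$update" -ge 60 ]
}

# Inspect the java on PATH, if there is one.
if command -v java >/dev/null 2>&1; then
  banner=$(java -version 2>&1 | head -n 1)
  version=$(printf '%s\n' "$banner" | awk -F '"' '{ print $2 }')
  case "$banner" in
    openjdk* | OpenJDK*) echo "WARN OpenJDK detected; Oracle JDK is recommended" ;;
  esac
  if jdk_supported "$version"; then
    echo "OK   JDK $version"
  else
    echo "WARN JDK $version is not 1.8 update 60 or higher"
  fi
fi
```

Because your platform management software may reinstall the JDK, it can be worth rerunning a check like this after that software has been installed.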

Standalone Version Prerequisites

You can use the standalone version of Splice Machine on Mac OS X and Linux computers that meet these basic requirements:

Component Requirements
Operating System:

Mac OS X, version 10.8 or later.

CentOS 6.4 or equivalent.

CPU: Splice Machine recommends two or more multi-core CPUs.
Memory: At least 16 GB RAM, of which at least 10 GB is available.
Disk Space: At least 100 GB of disk space available for Splice Machine software, plus as much space as will be required for your data; for example, if you have a 1 TB dataset, you need at least 1 TB of available data space.
Software:

You must have a JDK installed on your computer.

Configure Linux for Splice Machine

The following table summarizes Linux configuration requirements for running Splice Machine on your cluster:

Configuration Step Description
Configure SSH access: Configure the user account that you're using for cluster administration for password-free access, to simplify installation.
Configure swappiness:
echo 'vm.swappiness = 0' >> /etc/sysctl.conf
If you are using Ubuntu:
rm /bin/sh ; ln -sf /bin/bash /bin/sh
If you're using CentOS or RHEL:
sed -i '/requiretty/ s/^/#/' /etc/sudoers
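
Note that appending to /etc/sysctl.conf adds a new line every time the command is run, and the setting is only read at boot unless you reload it with `sysctl -p`. A possible idempotent variant is sketched below; `set_swappiness_zero` is a hypothetical helper that takes the file path as a parameter so it can be tried on a scratch file first. It assumes GNU sed, as the commands above already do.

```shell
# set_swappiness_zero FILE: make FILE set vm.swappiness to 0 without
# appending a duplicate entry when one already exists.
set_swappiness_zero() {
  conf=$1
  if grep -q '^[[:space:]]*vm\.swappiness[[:space:]]*=' "$conf" 2>/dev/null; then
    # Rewrite the existing entry in place.
    sed -i 's/^[[:space:]]*vm\.swappiness[[:space:]]*=.*/vm.swappiness = 0/' "$conf"
  else
    echo 'vm.swappiness = 0' >> "$conf"
  fi
}

# Demonstration on a scratch file; calling the helper twice changes nothing more.
demo=$(mktemp)
echo 'vm.swappiness = 60' > "$demo"
set_swappiness_zero "$demo"
set_swappiness_zero "$demo"
cat "$demo"                     # prints: vm.swappiness = 0
rm -f "$demo"

# On a real node, as root:
#   set_swappiness_zero /etc/sysctl.conf
#   sysctl -p                     # load the file now instead of at next boot
#   cat /proc/sys/vm/swappiness   # should print 0
```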
Required software:

Verify that the following set of software (or packages) is available on each node in your cluster:

  • curl
  • Oracle JDK 1.8, update 60 or higher. We recommend against using JDK 1.8.0_40 or OpenJDK.

    Your platform management software may re-install JDK during its own installation process.
  • nscd
  • ntp
  • openssh, openssh-clients, and openssh-server
  • patch
  • rlwrap
  • wget
Additional required software on CentOS or RHEL:

If you're running on CentOS or RHEL, you also need to have this software available on each node:

  • ftp
  • EPEL repository
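
A quick way to verify the software lists above on a node is to look for each program on the PATH. The sketch below is an assumption-laden shortcut: `missing_commands` is a hypothetical helper, and the mapping from package to program name (for example, `ntpd` for the ntp package, `ssh` and `sshd` for the openssh packages) reflects common packaging rather than anything this guide specifies. Run it as root so that programs in /usr/sbin are found. It does not check the EPEL repository.

```shell
# missing_commands NAME...: print each NAME that is not found on PATH.
missing_commands() {
  for name in "$@"; do
    command -v "$name" >/dev/null 2>&1 || echo "$name"
  done
}

# Programs assumed to correspond to the required packages; add "ftp" on
# CentOS or RHEL.
missing=$(missing_commands curl java nscd ntpd ssh sshd patch rlwrap wget)
if [ -n "$missing" ]; then
  echo "Not found on PATH:" $missing
else
  echo "All required commands were found."
fi
```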
Services that must be started:

You need to make sure that the following services are enabled and started:

  • nscd
  • ntpd (ntp package)
  • sshd (openssh-server package)
Time zone setting: Make sure all nodes in your cluster are set to the same time zone.
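
The service and time zone requirements can be spot-checked across nodes once password-free SSH is in place. The following is a sketch only: `all_same` and `report_node_state` are hypothetical helpers, the node names in the usage line are placeholders, and it assumes the `service` command found on CentOS/RHEL 6. Comparing `date +%Z` output is an approximation of "same time zone", not a guarantee.

```shell
# all_same: read one value per line on stdin; succeed only if all lines match.
all_same() {
  [ "$(sort -u | wc -l | tr -d ' ')" -le 1 ]
}

# report_node_state NODE...: compare time zones and check the required
# services on every listed node over SSH.
report_node_state() {
  zones=$(for node in "$@"; do ssh "$node" date +%Z; done)
  if printf '%s\n' "$zones" | all_same; then
    echo "OK   all nodes report the same time zone"
  else
    echo "WARN time zones differ:"
    printf '%s\n' "$zones" | sort | uniq -c
  fi
  for node in "$@"; do
    for svc in nscd ntpd sshd; do
      ssh "$node" service "$svc" status >/dev/null 2>&1 ||
        echo "WARN $svc is not running on $node"
    done
  done
}

# Local demonstration of the comparison helper.
printf 'UTC\nUTC\n' | all_same && echo "same"    # prints: same

# Usage with placeholder host names:
#   report_node_state node1 node2 node3
```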

Install Splice Machine

If you've decided to try our sandbox, see the Installing Splice Machine on Amazon Web Services topic in this chapter for step-by-step instructions for setting up the Splice Machine sandbox.     

To install Splice Machine on your cluster or standalone computer, click the link below for your platform; each page walks you through downloading, installing, and configuring a specific version of Splice Machine:

Hadoop Platform Installation Guide
Cloudera-managed cluster: Installing and Configuring Splice Machine for Cloudera Manager
Hortonworks-managed cluster: Installing and Configuring Splice Machine for Hortonworks HDP
MapR-managed cluster: Installing and Configuring Splice Machine for MapR
Standalone version of Splice Machine: Installing the Standalone Version of Splice Machine
   

For access to the source code for the Community Edition of Splice Machine, visit our open source GitHub repository.