Installing and Configuring Splice Machine for Cloudera Manager

This topic describes installing and configuring Splice Machine on a Cloudera-managed cluster. Follow these steps:

  1. Verify Prerequisites
  2. Install the Splice Machine Parcel
  3. Stop Hadoop Services
  4. Make Cluster Modifications for Splice Machine
  5. Configure Hadoop Services
  6. Make any needed Optional Configuration Modifications
  7. Deploy the Client Configuration
  8. Restart the Cluster
  9. Verify your Splice Machine Installation

Verify Prerequisites

Before starting your Splice Machine installation, please make sure that your cluster contains the prerequisite software components:

  • A cluster running Cloudera Data Hub (CDH) with Cloudera Manager (CM)
  • HBase installed
  • HDFS installed
  • YARN installed
  • ZooKeeper installed

The specific versions of these components that you need depend on your operating environment, and are called out in detail in the Requirements topic of our Getting Started Guide.

Install the Splice Machine Parcel

Follow these steps to install CDH, Hadoop, Hadoop services, and Splice Machine on your cluster:

  1. Copy your parcel URL to the clipboard for use in the next step.

    Which Splice Machine parcel URL you need depends on which Splice Machine version you're installing and which version of CDH you are using. Here are the URLs for Splice Machine Release 2.5:

    CDH 5.12.0 parcels:

      • EL6: https://s3.amazonaws.com/splice-releases/2.5.0.1802/cluster/parcel/cdh5.12.0/SPLICEMACHINE-2.5.0.1802.cdh5.12.0.p0.540-el6.parcel
      • EL7: https://s3.amazonaws.com/splice-releases/2.5.0.1802/cluster/parcel/cdh5.12.0/SPLICEMACHINE-2.5.0.1802.cdh5.12.0.p0.540-el7.parcel
      • Precise: https://s3.amazonaws.com/splice-releases/2.5.0.1802/cluster/parcel/cdh5.12.0/SPLICEMACHINE-2.5.0.1802.cdh5.12.0.p0.540-precise.parcel
      • SLES11: https://s3.amazonaws.com/splice-releases/2.5.0.1802/cluster/parcel/cdh5.12.0/SPLICEMACHINE-2.5.0.1802.cdh5.12.0.p0.540-sles11.parcel
      • Trusty: https://s3.amazonaws.com/splice-releases/2.5.0.1802/cluster/parcel/cdh5.12.0/SPLICEMACHINE-2.5.0.1802.cdh5.12.0.p0.540-trusty.parcel
      • Wheezy: https://s3.amazonaws.com/splice-releases/2.5.0.1802/cluster/parcel/cdh5.12.0/SPLICEMACHINE-2.5.0.1802.cdh5.12.0.p0.540-wheezy.parcel

    CDH 5.8.3 parcels:

      • EL6: https://s3.amazonaws.com/splice-releases/2.5.0.1802/cluster/parcel/cdh5.8.3/SPLICEMACHINE-2.5.0.1802.cdh5.8.3.p0.540-el6.parcel
      • EL7: https://s3.amazonaws.com/splice-releases/2.5.0.1802/cluster/parcel/cdh5.8.3/SPLICEMACHINE-2.5.0.1802.cdh5.8.3.p0.540-el7.parcel
      • Precise: https://s3.amazonaws.com/splice-releases/2.5.0.1802/cluster/parcel/cdh5.8.3/SPLICEMACHINE-2.5.0.1802.cdh5.8.3.p0.540-precise.parcel
      • SLES11: https://s3.amazonaws.com/splice-releases/2.5.0.1802/cluster/parcel/cdh5.8.3/SPLICEMACHINE-2.5.0.1802.cdh5.8.3.p0.540-sles11.parcel
      • Trusty: https://s3.amazonaws.com/splice-releases/2.5.0.1802/cluster/parcel/cdh5.8.3/SPLICEMACHINE-2.5.0.1802.cdh5.8.3.p0.540-trusty.parcel
      • Wheezy: https://s3.amazonaws.com/splice-releases/2.5.0.1802/cluster/parcel/cdh5.8.3/SPLICEMACHINE-2.5.0.1802.cdh5.8.3.p0.540-wheezy.parcel

    To be sure that you have the latest URL, please check the Splice Machine Community site or contact your Splice Machine representative.
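    The parcel URLs above follow a predictable pattern, so you can assemble the one you need from your Splice Machine version, CDH version, and OS tag. A minimal sketch (the values below come from the CDH 5.12.0 / EL7 row above; substitute your own, and verify the final URL against the table or the Community site):

    ```shell
    # Assemble a Splice Machine parcel URL from its components.
    # These values match the CDH 5.12.0 / EL7 entry above.
    SPLICE_VERSION="2.5.0.1802"
    CDH_VERSION="cdh5.12.0"
    OS_TAG="el7"   # one of: el6, el7, precise, sles11, trusty, wheezy

    PARCEL_URL="https://s3.amazonaws.com/splice-releases/${SPLICE_VERSION}/cluster/parcel/${CDH_VERSION}/SPLICEMACHINE-${SPLICE_VERSION}.${CDH_VERSION}.p0.540-${OS_TAG}.parcel"
    echo "$PARCEL_URL"
    ```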

  2. Add the parcel repository

    1. Make sure the Use Parcels (Recommended) option and the Matched release option are both selected.

    2. Click the Continue button to land on the More Options screen.

    3. Click the + button for the Remote Parcel Repository URLs field. Paste your Splice Machine repository URL into this field.

  3. Use Cloudera Manager to install the parcel.

  4. Verify that the parcel has been distributed and activated.

    The Splice Machine parcel is identified as SPLICEMACHINE in the Cloudera Manager user interface. Make sure that this parcel has been downloaded, distributed, and activated on your cluster.

  5. Restart and redeploy any client changes when Cloudera Manager prompts you.


Stop Hadoop Services

As a first step, we stop cluster services to allow our installer to make changes that require the cluster to be temporarily inactive.

  1. Select your cluster in Cloudera Manager

    Click the drop-down arrow next to the name of the cluster on which you are installing Splice Machine.

  2. Stop the cluster

    Click the Stop button.

Make Cluster Modifications for Splice Machine

Splice Machine requires a few modifications at the file system level to work properly on a CDH cluster:

  1. Install updated Java Servlet library:

    You need to install an updated javax.servlet-api library so that Splice Machine can use Spark 2.0.x functionality in YARN.

  2. Remove Spark 1.6.x libraries

    By default, Splice Machine uses Spark 2.0. To avoid Spark version mismatches, we strongly recommend that you remove the Spark 1.6.x libraries from /opt/cloudera/parcels/CDH/jars/; however, if you need to retain Spark 1.6 for other applications, please contact our install team for help with your configuration.

  3. Run our script as the root user on each node in your cluster to add symbolic links to the Splice Machine uber JAR and YARN proxy JAR in the YARN directories.

    Issue this command on each node in your cluster:

    sudo /opt/cloudera/parcels/SPLICEMACHINE/scripts/install-splice-symlinks.sh
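    If you manage many nodes, you can drive the script from a single host over ssh. A sketch, assuming passwordless ssh as a sudo-capable user; the node list is hypothetical, and the `echo` makes this a dry run:

    ```shell
    # Hypothetical node list -- replace with your cluster's hostnames.
    NODES="node1.example.com node2.example.com node3.example.com"

    for node in $NODES; do
      CMD="ssh $node sudo /opt/cloudera/parcels/SPLICEMACHINE/scripts/install-splice-symlinks.sh"
      # Dry run: prints the command for each node. Replace echo with
      # eval "$CMD" once you've confirmed the output looks right.
      echo "$CMD"
    done
    ```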

Configure Hadoop Services

Now it’s time to make a few modifications in the Hadoop services configurations:

Configure and Restart the Management Service

  1. Select the Configuration tab in CM:

    (Screenshot: configuring the Cloudera Manager ports)

  2. Change the value of the Alerts: Listen Port to 10110.

  3. Save changes and restart the Management Service.

Configure ZooKeeper

To edit the ZooKeeper configuration, click ZooKeeper in the Cloudera Manager (CM) home screen, then click the Configuration tab and follow these steps:

  1. Select the Service-Wide category.

    Make the following changes:

    Maximum Client Connections = 0
    Maximum Session Timeout = 120000

    Click the Save Changes button.
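    For reference, these two Cloudera Manager settings map to the following entries in the zoo.cfg that CM generates (shown here only so you can confirm the change took effect; make the change through CM, not by hand-editing the file):

    ```
    maxClientCnxns=0
    maxSessionTimeout=120000
    ```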

Configure HDFS

To edit the HDFS configuration, click HDFS in the Cloudera Manager home screen, then click the Configuration tab and make these changes:

  1. Verify that the HDFS data directories for your cluster are set up to use your data disks.

  2. Change the values of these settings

    Setting New Value
    Handler Count 20
    Maximum Number of Transfer Threads 8192
    NameNode Handler Count 64
    NameNode Service Handler Count 60
    Replication Factor 2 or 3 *
    Java Heap Size of DataNode in Bytes 2 GB
  3. Click the Save Changes button.

Configure YARN

To edit the YARN configuration, click YARN in the Cloudera Manager home screen, then click the Configuration tab and make these changes:

  1. Verify that the following directories are set up to use your data disks.

    NodeManager Local Directories
    NameNode Data Directories
    HDFS Checkpoint Directories
  2. Change the values of these settings

    Setting New Value
    Heartbeat Interval 100 ms
    MR Application Classpath
    $HADOOP_MAPRED_HOME/*
    $HADOOP_MAPRED_HOME/lib/*
    $MR2_CLASSPATH/opt/cloudera/parcels/SPLICEMACHINE/lib/*
    YARN Application Classpath
    $HADOOP_CLIENT_CONF_DIR
    $HADOOP_CONF_DIR
    $HADOOP_COMMON_HOME/*
    $HADOOP_COMMON_HOME/lib/*
    $HADOOP_HDFS_HOME/*
    $HADOOP_HDFS_HOME/lib/*
    $HADOOP_YARN_HOME/*
    $HADOOP_YARN_HOME/lib/*
    $HADOOP_MAPRED_HOME/*
    $HADOOP_MAPRED_HOME/lib/*
    $MR2_CLASSPATH
    /opt/cloudera/parcels/CDH/lib/hbase/*
    /opt/cloudera/parcels/CDH/lib/hbase/lib/*
    /opt/cloudera/parcels/SPLICEMACHINE/lib/*
    Localized Dir Deletion Delay 86400
    JobHistory Server Max Log Size 1 GB
    NodeManager Max Log Size 1 GB
    ResourceManager Max Log Size 1 GB
    Container Memory 30 GB (based on node specs)
    Container Memory Maximum 30 GB (based on node specs)
    Container Virtual CPU Cores 19 (based on node specs)
    Container Virtual CPU Cores Maximum 19 (based on node specs)
  3. Add property values

    You need to add the same two property values to each of the four YARN advanced configuration settings.

    Add these properties:

    XML Property Name XML Property Value
    yarn.nodemanager.aux-services.spark_shuffle.class org.apache.spark.network.yarn.YarnShuffleService
    yarn.nodemanager.aux-services mapreduce_shuffle,spark_shuffle

    To each of these YARN settings:

    • Yarn Service Advanced Configuration Snippet (Safety Valve) for yarn-site.xml

    • Yarn Client Advanced Configuration Snippet (Safety Valve) for yarn-site.xml

    • NodeManager Advanced Configuration Snippet (Safety Valve) for yarn-site.xml

    • ResourceManager Advanced Configuration Snippet (Safety Valve) for yarn-site.xml

  4. Click the Save Changes button.
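    Pasted into each of the four Safety Valve fields above, the two properties take this yarn-site.xml form:

    ```xml
    <property>
      <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
      <value>org.apache.spark.network.yarn.YarnShuffleService</value>
    </property>
    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle,spark_shuffle</value>
    </property>
    ```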

Configure HBase

To edit the HBase configuration, click HBase in the Cloudera Manager home screen, then click the Configuration tab and make these changes:

  1. Change the values of these settings

    Setting New Value
    HBase Client Scanner Caching 100
    Graceful Shutdown Timeout 30 seconds
    HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml The property list for the Safety Valve snippet is shown below, in Step 2
    SplitLog Manager Timeout 5 minutes
    Maximum HBase Client Retries 40
    RPC Timeout 20 minutes (or 1200000 milliseconds)
    HBase Client Pause 90
    ZooKeeper Session Timeout 120000
    HBase Master Web UI Port 16010
    HBase Master Port 16000
    Java Configuration Options for HBase Master The HBase Master Java configuration options list is shown below, in Step 3
    HBase Coprocessor Master Classes

    com.splicemachine.hbase.SpliceMasterObserver

    Java Heap Size of HBase Master in Bytes 5 GB
    HStore Compaction Threshold 5
    HBase RegionServer Web UI port 16030
    HStore Blocking Store Files 20
    Java Configuration Options for HBase RegionServer The HBase RegionServer Java configuration options list is shown below, in Step 4
    HBase Memstore Block Multiplier 4
    Maximum Number of HStoreFiles Compaction 7
    HBase RegionServer Lease Period 20 minutes (or 1200000 milliseconds)
    HFile Block Cache Size 0.25
    Java Heap Size of HBase RegionServer in Bytes 24 GB
    HBase RegionServer Handler Count 200
    HBase RegionServer Meta-Handler Count 200
    HBase Coprocessor Region Classes com.splicemachine.hbase.MemstoreAwareObserver
    com.splicemachine.derby.hbase.SpliceIndexObserver
    com.splicemachine.derby.hbase.SpliceIndexEndpoint
    com.splicemachine.hbase.RegionSizeEndpoint
    com.splicemachine.si.data.hbase.coprocessor.TxnLifecycleEndpoint
    com.splicemachine.si.data.hbase.coprocessor.SIObserver
    com.splicemachine.hbase.BackupEndpointObserver
    Maximum number of Write-Ahead Log (WAL) files 48
    RegionServer Small Compactions Thread Count 4
    HBase RegionServer Port 16020
    Per-RegionServer Number of WAL Pipelines 16
  2. Set the value of HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml:

    <property><name>dfs.client.read.shortcircuit.buffer.size</name><value>131072</value></property>
    <property><name>hbase.balancer.period</name><value>60000</value></property>
    <property><name>hbase.client.ipc.pool.size</name><value>10</value></property>
    <property><name>hbase.client.max.perregion.tasks</name><value>100</value></property>
    <property><name>hbase.coprocessor.regionserver.classes</name><value>com.splicemachine.hbase.RegionServerLifecycleObserver</value></property><property><name>hbase.hstore.defaultengine.compactionpolicy.class</name><value>com.splicemachine.compactions.SpliceDefaultCompactionPolicy</value></property>
    <property><name>hbase.hstore.defaultengine.compactor.class</name><value>com.splicemachine.compactions.SpliceDefaultCompactor</value></property>
    <property><name>hbase.htable.threads.max</name><value>96</value></property>
    <property><name>hbase.ipc.warn.response.size</name><value>-1</value></property>
    <property><name>hbase.ipc.warn.response.time</name><value>-1</value></property>
    <property><name>hbase.master.loadbalance.bytable</name><value>true</value></property>
    <property><name>hbase.mvcc.impl</name><value>org.apache.hadoop.hbase.regionserver.SIMultiVersionConsistencyControl</value></property>
    <property><name>hbase.regions.slop</name><value>0.01</value></property>
    <property><name>hbase.regionserver.global.memstore.size.lower.limit</name><value>0.9</value></property>
    <property><name>hbase.regionserver.global.memstore.size</name><value>0.25</value></property>
    <property><name>hbase.regionserver.maxlogs</name><value>48</value></property>
    <property><name>hbase.regionserver.wal.enablecompression</name><value>true</value></property>
    <property><name>hbase.rowlock.wait.duration</name><value>0</value></property>
    <property><name>hbase.status.multicast.port</name><value>16100</value></property>
    <property><name>hbase.wal.disruptor.batch</name><value>true</value></property>
    <property><name>hbase.wal.provider</name><value>multiwal</value></property>
    <property><name>hbase.wal.regiongrouping.numgroups</name><value>16</value></property>
    <property><name>hbase.zookeeper.property.tickTime</name><value>6000</value></property>
    <property><name>hfile.block.bloom.cacheonwrite</name><value>true</value></property>
    <property><name>io.storefile.bloom.error.rate</name><value>0.005</value></property>
    <property><name>splice.client.numConnections</name><value>1</value></property>
    <property><name>splice.client.write.maxDependentWrites</name><value>60000</value></property>
    <property><name>splice.client.write.maxIndependentWrites</name><value>60000</value></property>
    <property><name>splice.compression</name><value>snappy</value></property>
    <property><name>splice.marshal.kryoPoolSize</name><value>1100</value></property>
    <property><name>splice.olap_server.clientWaitTime</name><value>900000</value></property>
    <property><name>splice.ring.bufferSize</name><value>131072</value></property>
    <property><name>splice.splitBlockSize</name><value>67108864</value></property>
    <property><name>splice.timestamp_server.clientWaitTime</name><value>120000</value></property>
    <property><name>splice.txn.activeTxns.cacheSize</name><value>10240</value></property>
    <property><name>splice.txn.completedTxns.concurrency</name><value>128</value></property>
    <property><name>splice.txn.concurrencyLevel</name><value>4096</value></property>
    <property><name>hbase.hstore.compaction.max.size</name><value>260046848</value></property>
    <property><name>hbase.hstore.compaction.min.size</name><value>16777216</value></property>
    <property><name>hbase.hstore.compaction.min</name><value>5</value></property>
    <property><name>hbase.regionserver.thread.compaction.large</name><value>1</value></property>
    <property><name>splice.authentication.native.algorithm</name><value>SHA-512</value></property>
    <property><name>splice.authentication</name><value>NATIVE</value></property>
    
  3. Set the value of Java Configuration Options for HBase Master

    If you’re using version 2.2 or later of the Spark Shuffle service, set the Java Configuration Options for HBase Master to:

    -XX:MaxPermSize=512M -XX:+HeapDumpOnOutOfMemoryError -XX:MaxDirectMemorySize=2g -XX:+AlwaysPreTouch -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=10101 -Dsplice.spark.enabled=true -Dsplice.spark.app.name=SpliceMachine -Dsplice.spark.master=yarn-client -Dsplice.spark.logConf=true -Dsplice.spark.yarn.maxAppAttempts=1 -Dsplice.spark.driver.maxResultSize=1g -Dsplice.spark.driver.cores=2 -Dsplice.spark.yarn.am.memory=1g -Dsplice.spark.dynamicAllocation.enabled=true -Dsplice.spark.dynamicAllocation.executorIdleTimeout=120 -Dsplice.spark.dynamicAllocation.cachedExecutorIdleTimeout=120 -Dsplice.spark.dynamicAllocation.minExecutors=0 -Dsplice.spark.dynamicAllocation.maxExecutors=12 -Dsplice.spark.io.compression.lz4.blockSize=32k -Dsplice.spark.kryo.referenceTracking=false -Dsplice.spark.kryo.registrator=com.splicemachine.derby.impl.SpliceSparkKryoRegistrator -Dsplice.spark.kryoserializer.buffer.max=512m -Dsplice.spark.kryoserializer.buffer=4m -Dsplice.spark.locality.wait=100 -Dsplice.spark.memory.fraction=0.5 -Dsplice.spark.scheduler.mode=FAIR -Dsplice.spark.serializer=org.apache.spark.serializer.KryoSerializer -Dsplice.spark.shuffle.compress=false -Dsplice.spark.shuffle.file.buffer=128k -Dsplice.spark.shuffle.service.enabled=true -Dsplice.spark.reducer.maxReqSizeShuffleToMem=134217728 -Dsplice.spark.yarn.am.extraLibraryPath=/opt/cloudera/parcels/CDH/lib/hadoop/lib/native -Dsplice.spark.yarn.am.waitTime=10s -Dsplice.spark.yarn.executor.memoryOverhead=2048 -Dsplice.spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/etc/spark/conf/log4j.properties -Dsplice.spark.driver.extraLibraryPath=/opt/cloudera/parcels/CDH/lib/hadoop/lib/native -Dsplice.spark.driver.extraClassPath=/opt/cloudera/parcels/CDH/lib/hbase/conf:/opt/cloudera/parcels/CDH/jars/htrace-core-3.1.0-incubating.jar 
-Dsplice.spark.executor.extraLibraryPath=/opt/cloudera/parcels/CDH/lib/hadoop/lib/native -Dsplice.spark.executor.extraClassPath=/opt/cloudera/parcels/CDH/lib/hbase/conf:/opt/cloudera/parcels/CDH/jars/htrace-core-3.1.0-incubating.jar -Dsplice.spark.ui.retainedJobs=100 -Dsplice.spark.ui.retainedStages=100 -Dsplice.spark.worker.ui.retainedExecutors=100 -Dsplice.spark.worker.ui.retainedDrivers=100 -Dsplice.spark.streaming.ui.retainedBatches=100 -Dsplice.spark.executor.cores=4 -Dsplice.spark.executor.memory=8g -Dspark.compaction.reserved.slots=4 -Dsplice.spark.eventLog.enabled=true -Dsplice.spark.eventLog.dir=hdfs:///user/splice/history -Dsplice.spark.local.dir=/tmp -Dsplice.spark.yarn.jars=/opt/cloudera/parcels/SPLICEMACHINE/lib/*
    

    If you’re using a version of the Spark Shuffle service earlier than 2.2, set the Java Configuration Options for HBase Master to this instead:

    -XX:MaxPermSize=512M -XX:+HeapDumpOnOutOfMemoryError -XX:MaxDirectMemorySize=2g -XX:+AlwaysPreTouch -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=10101 -Dsplice.spark.enabled=true -Dsplice.spark.app.name=SpliceMachine -Dsplice.spark.master=yarn-client -Dsplice.spark.logConf=true -Dsplice.spark.yarn.maxAppAttempts=1 -Dsplice.spark.driver.maxResultSize=1g -Dsplice.spark.driver.cores=2 -Dsplice.spark.yarn.am.memory=1g -Dsplice.spark.dynamicAllocation.enabled=true -Dsplice.spark.dynamicAllocation.executorIdleTimeout=120 -Dsplice.spark.dynamicAllocation.cachedExecutorIdleTimeout=120 -Dsplice.spark.dynamicAllocation.minExecutors=0 -Dsplice.spark.dynamicAllocation.maxExecutors=12 -Dsplice.spark.io.compression.lz4.blockSize=32k -Dsplice.spark.kryo.referenceTracking=false -Dsplice.spark.kryo.registrator=com.splicemachine.derby.impl.SpliceSparkKryoRegistrator -Dsplice.spark.kryoserializer.buffer.max=512m -Dsplice.spark.kryoserializer.buffer=4m -Dsplice.spark.locality.wait=100 -Dsplice.spark.memory.fraction=0.5 -Dsplice.spark.scheduler.mode=FAIR -Dsplice.spark.serializer=org.apache.spark.serializer.KryoSerializer -Dsplice.spark.shuffle.compress=false -Dsplice.spark.shuffle.file.buffer=128k -Dsplice.spark.shuffle.service.enabled=true  -Dsplice.spark.yarn.am.extraLibraryPath=/opt/cloudera/parcels/CDH/lib/hadoop/lib/native -Dsplice.spark.yarn.am.waitTime=10s -Dsplice.spark.yarn.executor.memoryOverhead=2048 -Dsplice.spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/etc/spark/conf/log4j.properties -Dsplice.spark.driver.extraLibraryPath=/opt/cloudera/parcels/CDH/lib/hadoop/lib/native -Dsplice.spark.driver.extraClassPath=/opt/cloudera/parcels/CDH/lib/hbase/conf:/opt/cloudera/parcels/CDH/jars/htrace-core-3.1.0-incubating.jar 
-Dsplice.spark.executor.extraLibraryPath=/opt/cloudera/parcels/CDH/lib/hadoop/lib/native -Dsplice.spark.executor.extraClassPath=/opt/cloudera/parcels/CDH/lib/hbase/conf:/opt/cloudera/parcels/CDH/jars/htrace-core-3.1.0-incubating.jar -Dsplice.spark.ui.retainedJobs=100 -Dsplice.spark.ui.retainedStages=100 -Dsplice.spark.worker.ui.retainedExecutors=100 -Dsplice.spark.worker.ui.retainedDrivers=100 -Dsplice.spark.streaming.ui.retainedBatches=100 -Dsplice.spark.executor.cores=4 -Dsplice.spark.executor.memory=8g -Dspark.compaction.reserved.slots=4 -Dsplice.spark.eventLog.enabled=true -Dsplice.spark.eventLog.dir=hdfs:///user/splice/history -Dsplice.spark.local.dir=/tmp -Dsplice.spark.yarn.jars=/opt/cloudera/parcels/SPLICEMACHINE/lib/*
    
  4. Set the value of Java Configuration Options for HBase RegionServer:

    -XX:+HeapDumpOnOutOfMemoryError -XX:MaxDirectMemorySize=2g -XX:MaxPermSize=512M -XX:+AlwaysPreTouch -XX:+UseG1GC -XX:MaxNewSize=4g -XX:InitiatingHeapOccupancyPercent=60 -XX:ParallelGCThreads=24 -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=5000 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=10102
    
  5. Click the Save Changes button.
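A malformed entry in the Safety Valve field can keep HBase from starting, so it is worth checking that what you paste is well-formed XML before saving. A small sketch that wraps the snippet in a root element and parses it with Python's standard library (shown with one property from the list above; substitute your full snippet):

```shell
# Wrap the safety-valve snippet in a root element and parse it;
# a malformed snippet makes the parse (and this pipeline) fail.
SNIPPET='<property><name>splice.authentication</name><value>NATIVE</value></property>'
RESULT=$(printf '<configuration>%s</configuration>' "$SNIPPET" \
  | python3 -c 'import sys, xml.dom.minidom; xml.dom.minidom.parseString(sys.stdin.read()); print("well-formed")')
echo "$RESULT"
```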

Optional Configuration Modifications

There are a few configuration modifications you might want to make:

Modify the Authentication Mechanism

Splice Machine installs with Native authentication configured; native authentication uses the sys.sysusers table in the splice database for configuring user names and passwords.

You can disable authentication, or change the authentication mechanism that Splice Machine uses to LDAP, by following the instructions in Configuring Splice Machine Authentication.

You can use Cloudera’s Kerberos Wizard to enable Kerberos mode on a CDH5.8.x cluster. If you’re enabling Kerberos, you need to add this option to your HBase Master Java Configuration Options:

-Dsplice.spark.hadoop.fs.hdfs.impl.disable.cache=true

Modify the Log Location

Splice Machine logs all SQL statements by default, storing the log entries in your region server's logs, as described in our Using Logging topic. You can modify where Splice Machine stores logs by adding the following snippet to the RegionServer Logging Advanced Configuration Snippet (Safety Valve) section of your HBase configuration:

log4j.appender.spliceDerby=org.apache.log4j.FileAppender
log4j.appender.spliceDerby.File=${hbase.log.dir}/splice-derby.log
log4j.appender.spliceDerby.layout=org.apache.log4j.EnhancedPatternLayout
log4j.appender.spliceDerby.layout.ConversionPattern=%d{EEE MMM d HH:mm:ss,SSS} Thread[%t] %m%n

log4j.appender.spliceStatement=org.apache.log4j.FileAppender
log4j.appender.spliceStatement.File=${hbase.log.dir}/splice-statement.log
log4j.appender.spliceStatement.layout=org.apache.log4j.EnhancedPatternLayout
log4j.appender.spliceStatement.layout.ConversionPattern=%d{EEE MMM d HH:mm:ss,SSS} Thread[%t] %m%n

log4j.logger.splice-derby=INFO, spliceDerby
log4j.additivity.splice-derby=false

# Uncomment to log statements to a different file:
#log4j.logger.splice-derby.statement=INFO, spliceStatement
# Uncomment to not replicate statements to the spliceDerby file:
#log4j.additivity.splice-derby.statement=false

Deploy the Client Configuration

Now that you’ve updated your configuration information, you need to deploy it throughout your cluster. You should see a small notification in the upper right corner of your screen that looks like this:

(Screenshot: the notification prompting you to redeploy the client configuration)

To deploy your configuration:

  1. Click the notification.
  2. Click the Deploy Client Configuration button.
  3. When the deployment completes, click the Finish button.

Restart the Cluster

With the client configuration deployed, restart the cluster services in the following order from the Cloudera Manager home screen:

  1. Restart ZooKeeper

    Select Start from the Actions menu in the upper right corner of the ZooKeeper Configuration tab to restart ZooKeeper.

  2. Restart HDFS

    Click the HDFS Actions drop-down arrow associated with (to the right of) HDFS in the cluster summary section of the Cloudera Manager home screen, and then click Start to restart HDFS.

    Use your terminal window to create these directories (if they are not already available in HDFS):

    sudo -iu hdfs hadoop fs -mkdir -p hdfs:///user/hbase hdfs:///user/splice/history
    sudo -iu hdfs hadoop fs -chown -R hbase:hbase hdfs:///user/hbase hdfs:///user/splice
    sudo -iu hdfs hadoop fs -chmod 1777 hdfs:///user/splice hdfs:///user/splice/history
    
  3. Restart YARN

    Click the YARN Actions drop-down arrow associated with (to the right of) YARN in the cluster summary section of the Cloudera Manager home screen, and then click Start to restart YARN.

  4. Restart HBase

    Click the HBASE Actions drop-down arrow associated with (to the right of) HBASE in the cluster summary section of the Cloudera Manager home screen, and then click Start to restart HBase.

Verify your Splice Machine Installation

Now you can start using the Splice Machine command line interpreter, known as the splice prompt (or simply splice>), by launching the sqlshell.sh script on any node in your cluster that is running an HBase region server.

The command line interpreter defaults to connecting on port 1527 on localhost, with username splice, and password admin. You can override these defaults when starting the interpreter, as described in the Command Line (splice>) Reference topic in our Developer’s Guide.

Here are a few sample commands you can run to verify that your Splice Machine installation is working:

Operation                          Command
Display tables                     splice> show tables;
Create a table                     splice> create table test (i int);
Add data to the table              splice> insert into test values 1,2,3,4,5;
Query data in the table            splice> select * from test;
Drop the table                     splice> drop table test;
List available commands            splice> help;
Exit the command line interpreter  splice> exit;

Make sure you end each command with a semicolon (;), followed by the Enter or Return key.

See the Command Line (splice>) Reference section of our Developer’s Guide for information about our commands and command syntax.
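The verification statements above can also be collected into a script file and replayed from the interpreter (the file path below is arbitrary; the run command is covered in the Command Line Reference):

```shell
# Write the smoke-test statements from the table above to a script file.
cat > /tmp/splice-smoke-test.sql <<'EOF'
create table test (i int);
insert into test values 1,2,3,4,5;
select * from test;
drop table test;
EOF

# At the splice> prompt you would then execute:
#   run '/tmp/splice-smoke-test.sql';
wc -l < /tmp/splice-smoke-test.sql
```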
