Splice Machine Troubleshooting and Best Practices


This topic provides troubleshooting guidance for issues that you may encounter with your Splice Machine database:

Restarting Splice Machine After HMaster Failure

If you run Splice Machine without redundant HMasters, and you lose your HMaster, follow these steps to restart Splice Machine:

  1. Restart the HMaster node
  2. Restart every HRegion Server node
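
As a sketch, on a cluster where you manage HBase daemons directly, the restart might look like the following (the paths and daemon names here are assumptions for a plain HBase install; if you use a management console such as Cloudera Manager or Ambari, restart the services there instead):

```shell
# On the HMaster node (assumed direct HBase install; adjust $HBASE_HOME for your layout)
$HBASE_HOME/bin/hbase-daemon.sh stop master
$HBASE_HOME/bin/hbase-daemon.sh start master

# Then, on each HRegionServer node
$HBASE_HOME/bin/hbase-daemon.sh stop regionserver
$HBASE_HOME/bin/hbase-daemon.sh start regionserver
```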

Increasing Parallelism for Spark Shuffles

You can adjust the minimum parallelism for Spark shuffles by adjusting the value of the splice.olap.shuffle.partitions configuration option.

This option is similar to the spark.sql.shuffle.partitions option, which configures the number of partitions used when shuffling data for joins or aggregations; however, the spark.sql.shuffle.partitions setting can yield fewer partitions than is optimal for certain operations.

Specifically, increasing the number of shuffle partitions with the splice.olap.shuffle.partitions option is useful when performing operations on small tables that generate large intermediate datasets; more, smaller partitions allow those operations to run with better parallelism.

The default value of splice.olap.shuffle.partitions is 200.
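
For example, to raise the minimum shuffle parallelism from the default of 200, you might set the option as a Java system property in your HBase configuration, in the same style as the other splice.* options shown below (the value 400 here is illustrative, not a recommendation; tune it for your workload):

```
-Dsplice.olap.shuffle.partitions=400
```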

Force Compaction to Run on Local Region Server

Splice Machine attempts to run database compaction jobs on a Spark executor that is co-located with the serving Region Server. If it cannot find a local executor after a period of time, Splice Machine uses whatever Spark executor it can get. To force the use of a local executor, you can adjust the splice.spark.dynamicAllocation.minExecutors configuration option.

To do so:

  • Set the value of splice.spark.dynamicAllocation.minExecutors to the number of Region Servers in your cluster
  • Set the value of splice.spark.dynamicAllocation.maxExecutors to twice that number

Adjust these settings in the Java Config Options section of your HBase Master configuration.

The default option settings are:

-Dsplice.spark.dynamicAllocation.minExecutors=0
-Dsplice.spark.dynamicAllocation.maxExecutors=12

For a cluster with 20 Region Servers, you would set these to:

-Dsplice.spark.dynamicAllocation.minExecutors=20
-Dsplice.spark.dynamicAllocation.maxExecutors=40