SYSCS_UTIL.SYSCS_BACKUP_DATABASE

The SYSCS_UTIL.SYSCS_BACKUP_DATABASE system procedure performs an immediate full or incremental backup of your database to a specified backup directory.

ENTERPRISE ONLY: This feature is available only for the Splice Machine Enterprise version of our On-Premise Database product; contact Splice Machine Sales for information.

Splice Machine supports both full and incremental backups: 

  • A full backup backs up all of the files/blocks that constitute your database.
  • An incremental backup stores only the database files/blocks that have changed since the previous backup.

The first time that you run an incremental backup, a full backup is performed. Subsequent incremental backups copy only the information that has changed since the previous backup.

For more information, see the Backing Up and Restoring topic.

Syntax

SYSCS_UTIL.SYSCS_BACKUP_DATABASE( VARCHAR backupDir,
                                  VARCHAR(30) backupType );

backupDir

Specifies the path to the directory in which you want the backup stored. This can be a local directory if you’re using the standalone version of Splice Machine, or a directory in your cluster’s file system (HDFS or MapR-FS).

You must have permissions set properly to use cloud storage as a backup destination. See Backing Up to Cloud Storage for information about setting backup permissions properties.

Relative paths are resolved based on the current user directory. To avoid confusion, we strongly recommend that you use an absolute path when specifying the backup destination.
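
The recommendation to use absolute paths can be enforced on the client before the procedure is called. The following is a minimal sketch only: the BackupPaths class and its looksAbsolute method are hypothetical names, not part of Splice Machine, and the check is a heuristic that treats any value with a URI scheme (such as hdfs or s3) or a leading slash as absolute.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical client-side helper; not part of the Splice Machine API.
final class BackupPaths {

    // Returns true if backupDir starts with '/' or carries a URI scheme
    // (for example hdfs:///home/backup or s3://backup1234).
    // Assumption: this is a heuristic, not the server's own path resolution.
    static boolean looksAbsolute(String backupDir) {
        if (backupDir == null || backupDir.isEmpty()) {
            return false;
        }
        if (backupDir.startsWith("/")) {
            return true;
        }
        try {
            return new URI(backupDir).getScheme() != null;
        } catch (URISyntaxException e) {
            // Values that are not valid URIs are treated as not absolute.
            return false;
        }
    }
}
```

With this check, a value such as ./dbBackups is flagged as relative, so the caller can reject it or resolve it explicitly before starting the backup.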

backupType

Specifies the type of backup that you want performed. This must be one of the following values: full or incremental; any other value produces an error and the backup is not run.

Note that if you specify 'incremental', Splice Machine checks the SYS.SYSBACKUP table to determine whether a backup of the system already exists. If no backup exists, Splice Machine performs a full backup, and subsequent backups are incremental.
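
The backupType rules described above can be modeled in a few lines. This is an illustrative sketch, not Splice Machine's implementation: the BackupTypes class and both method names are hypothetical, the priorBackupExists flag stands in for the SYS.SYSBACKUP lookup, and the sketch accepts only the lowercase values shown in this topic because the source does not say whether other casings are allowed.

```java
// Hypothetical model of the documented backupType behavior.
final class BackupTypes {

    // Only the two documented values are accepted (assumed case-sensitive).
    static boolean isSupported(String backupType) {
        return "full".equals(backupType) || "incremental".equals(backupType);
    }

    // Returns the kind of backup that would actually run.
    // priorBackupExists stands in for the SYS.SYSBACKUP check.
    static String effectiveType(String requested, boolean priorBackupExists) {
        if (!isSupported(requested)) {
            // Mirrors "any other value produces an error and the backup is not run".
            throw new IllegalArgumentException("Unsupported backupType: " + requested);
        }
        if ("incremental".equals(requested) && !priorBackupExists) {
            return "full"; // the first incremental backup is performed as a full backup
        }
        return requested;
    }
}
```

Validating the value on the client before calling the procedure can surface a mistyped backup type earlier than the server-side error would.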

Results

This procedure does not return a result.

Backup Resource Allocation

By default, Splice Machine backups run as Spark jobs that submit tasks to copy HFiles. In the past, Splice Machine backups used the Apache Hadoop distcp tool to copy the HFiles; distcp copies with MapReduce, which can require significant resources, and those requirements can limit file copying parallelism and reduce backup throughput. Using Spark executors to copy the HFiles significantly increases backup performance.

You can revert to using distcp, which uses a MapReduce job that can run into resource issues. For more information, see the Understanding and Troubleshooting Backups topic.

Usage Notes

Please review these important notes about usage of this system procedure:

  • HBase Configuration Options for Incremental Backup
  • Temporary Tables and Backups

HBase Configuration Options for Incremental Backup

If you’re performing incremental backups, you must add the following setting to your hbase-site.xml configuration file:

hbase.master.hfilecleaner.plugins = com.splicemachine.hbase.SpliceHFileCleaner,
org.apache.hadoop.hbase.master.cleaner.TimeToLiveHFileCleaner

Temporary Tables and Backups

There’s a subtle issue with performing a backup when you’re using a temporary table in your session: although the temporary table is (correctly) not backed up, the temporary table’s entry in the system tables will be backed up. When the backup is restored, the table entries will be restored, but the temporary table will be missing.

There’s a simple workaround:

  1. Exit your current session, which will automatically delete the temporary table and its system table entries.
  2. Start a new session (reconnect to your database).
  3. Start your backup job.
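
From a JDBC client, the three workaround steps above might look like the following sketch. The FreshSessionBackup class, the ConnectionFactory interface, and the method name are all hypothetical; how you open a connection (JDBC URL, credentials) depends on your installation and is left to the factory you supply.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical helper that follows the documented workaround.
final class FreshSessionBackup {

    // Supplies a new connection; the URL and credentials are up to the caller.
    interface ConnectionFactory {
        Connection open() throws SQLException;
    }

    static void backupFromFreshSession(Connection current,
                                       ConnectionFactory factory,
                                       String backupDir,
                                       String backupType) throws SQLException {
        // Step 1: exit the current session, which drops its temporary tables.
        if (current != null) {
            current.close();
        }
        // Step 2: start a new session. Step 3: start the backup job.
        try (Connection fresh = factory.open();
             CallableStatement cs = fresh.prepareCall(
                 "CALL SYSCS_UTIL.SYSCS_BACKUP_DATABASE(?,?)")) {
            cs.setString(1, backupDir);
            cs.setString(2, backupType);
            cs.execute();
        }
    }
}
```

Because the backup is issued from a connection that never created a temporary table, no temporary table entries from that session are present in the system tables when the backup runs.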

Execute Privileges

If authentication and SQL authorization are both enabled, only the database owner has execute privileges on this procedure by default. The database owner can grant access to other users.
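
As an illustration of granting access, the sketch below builds a GRANT statement for this procedure. It assumes the Derby-style GRANT EXECUTE ON PROCEDURE syntax applies to your installation; confirm this against the GRANT statement documentation. The BackupGrants class, its method, and the user name JDOE are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper; assumes Derby-style GRANT EXECUTE ON PROCEDURE syntax.
final class BackupGrants {

    // Accept only simple, unquoted identifiers so the name can be
    // concatenated into the statement safely.
    private static final Pattern SIMPLE_IDENTIFIER =
        Pattern.compile("[A-Za-z][A-Za-z0-9_]*");

    static String grantExecuteSql(String userName) {
        if (userName == null || !SIMPLE_IDENTIFIER.matcher(userName).matches()) {
            throw new IllegalArgumentException("Not a simple identifier: " + userName);
        }
        return "GRANT EXECUTE ON PROCEDURE SYSCS_UTIL.SYSCS_BACKUP_DATABASE TO " + userName;
    }
}
```

The database owner could then run the generated statement through a java.sql.Statement, for example stmt.execute(BackupGrants.grantExecuteSql("JDOE")).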

JDBC example

The following example performs an immediate full backup to a subdirectory of the hdfs:///home/backup directory:

// conn is an open java.sql.Connection to your database
CallableStatement cs = conn.prepareCall(
    "CALL SYSCS_UTIL.SYSCS_BACKUP_DATABASE(?,?)");
cs.setString(1, "hdfs:///home/backup");  // backupDir
cs.setString(2, "full");                 // backupType
cs.execute();
cs.close();

SQL Example

Backing up a database may take several minutes, depending on the size of your database and how much of it you’re backing up.

The following example runs an immediate incremental backup to the hdfs:///home/backup directory:

splice> CALL SYSCS_UTIL.SYSCS_BACKUP_DATABASE( 'hdfs:///home/backup', 'incremental' );
Statement executed.

The following example runs the same incremental backup, but stores it in an AWS S3 bucket:

splice> CALL SYSCS_UTIL.SYSCS_BACKUP_DATABASE( 's3://backup1234', 'incremental' );
Statement executed.

And this example does a full backup to a relative directory (relative to your splicemachine directory) on a standalone version of Splice Machine:

splice> CALL SYSCS_UTIL.SYSCS_BACKUP_DATABASE( './dbBackups', 'full' );
Statement executed.

See Also