The SYSCS_UTIL.SYSCS_BACKUP_DATABASE system procedure performs an immediate full or incremental backup of your database to a specified directory.
ENTERPRISE ONLY: This feature is available only for the Splice Machine Enterprise version of our On-Premise Database product; contact Splice Machine Sales for information.
Splice Machine supports both full and incremental backups:
- A full backup backs up all of the files/blocks that constitute your database.
- An incremental backup only stores database files/blocks that have changed since a previous backup.
The first time that you run an incremental backup, a full backup is performed. Subsequent runs of the backup will only copy information that has changed since the previous backup.
For more information, see the Backing Up and Restoring topic.
SYSCS_UTIL.SYSCS_BACKUP_DATABASE( VARCHAR backupDir, VARCHAR(30) backupType );
backupDir
Specifies the path to the directory in which you want the backup stored. This can be a local directory if you’re using the standalone version of Splice Machine, or a directory in your cluster’s file system (HDFS or MapR-FS).
You must have permissions set properly to use cloud storage as a backup destination. See Backing Up to Cloud Storage for information about setting backup permissions properties.
Relative paths are resolved based on the current user directory. To avoid confusion, we strongly recommend that you use an absolute path when specifying the backup destination.
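The recommendation above can be enforced on the client before the procedure is called. The following is a minimal sketch, not part of Splice Machine: the class BackupPaths and the method isAbsoluteDestination are hypothetical names, and the check assumes that a destination is unambiguous when it either starts with '/' or carries a URI scheme such as hdfs:// or s3://.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical helper (not part of Splice Machine) for checking a backup destination. */
final class BackupPaths {
    private BackupPaths() {}

    /**
     * Returns true when the destination is either a path starting with '/'
     * or a URI with a scheme (for example hdfs:// or s3://).
     * Anything else is treated as a relative path.
     */
    static boolean isAbsoluteDestination(String backupDir) {
        if (backupDir == null || backupDir.isEmpty()) {
            return false;
        }
        if (backupDir.startsWith("/")) {
            return true;
        }
        try {
            return new URI(backupDir).getScheme() != null;
        } catch (URISyntaxException e) {
            // Not parseable as a URI, so treat it as a relative path.
            return false;
        }
    }
}
```

A caller might refuse to start a backup, or log a warning, when this check returns false for the requested destination.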
backupType
Specifies the type of backup that you want performed. This must be one of the following values: 'full' or 'incremental'; any other value produces an error and the backup is not run.
Note that if you specify 'incremental', Splice Machine checks the SYS.SYSBACKUP table to determine whether a backup already exists for the system; if not, Splice Machine performs a full backup, and subsequent backups are incremental.
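Because an unrecognized backup type produces an error on the server, a client can reject bad values before making the call. The sketch below is illustrative only: BackupCaller, isValidBackupType, and backup are hypothetical names, and it assumes the accepted values are exactly the lowercase strings used in this topic's examples.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;

/** Hypothetical wrapper (not part of Splice Machine) around the backup procedure. */
final class BackupCaller {
    private BackupCaller() {}

    /** Assumption: only the exact lowercase values "full" and "incremental" are accepted. */
    static boolean isValidBackupType(String backupType) {
        return "full".equals(backupType) || "incremental".equals(backupType);
    }

    /** Validates the backup type, then calls the system procedure on an open connection. */
    static void backup(Connection conn, String backupDir, String backupType) throws SQLException {
        if (!isValidBackupType(backupType)) {
            throw new IllegalArgumentException("Unsupported backup type: " + backupType);
        }
        // try-with-resources closes the statement even if the call fails.
        try (CallableStatement cs = conn.prepareCall("CALL SYSCS_UTIL.SYSCS_BACKUP_DATABASE(?,?)")) {
            cs.setString(1, backupDir);
            cs.setString(2, backupType);
            cs.execute();
        }
    }
}
```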
This procedure does not return a result.
Backup Resource Allocation
Splice Machine backups run as Spark jobs, submitting tasks to copy HFiles. In the past, Splice Machine backups used the Apache Hadoop distcp tool to copy the HFiles; distcp uses MapReduce to copy, which can require significant resources. These requirements can limit file copying parallelism and reduce backup throughput. By default, Splice Machine backups now use a Spark executor to copy the HFiles, which significantly increases backup performance.
You can revert to using distcp, but its MapReduce job can run into resource issues. For more information, see the Understanding and Troubleshooting Backups topic.
Please review these important notes about usage of this system procedure:
- HBase Configuration Options for Incremental Backup
- Temporary Tables and Backups
HBase Configuration Options for Incremental Backup
If you’re performing incremental backups, you must add the following options to your
hbase-site.xml configuration file:
hbase.master.hfilecleaner.plugins = com.splicemachine.hbase.SpliceHFileCleaner, org.apache.hadoop.hbase.master.cleaner.TimeToLiveHFileCleaner
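One way to confirm that this setting is present is to inspect the contents of hbase-site.xml before scheduling incremental backups. The following sketch is not a Splice Machine or HBase utility: HFileCleanerCheck and hasRequiredPlugins are hypothetical names, and the code assumes the standard Hadoop configuration layout of property elements with name and value children.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical check (not part of Splice Machine) for the HFile cleaner plugins. */
final class HFileCleanerCheck {
    static final String PROPERTY = "hbase.master.hfilecleaner.plugins";
    static final List<String> REQUIRED = Arrays.asList(
            "com.splicemachine.hbase.SpliceHFileCleaner",
            "org.apache.hadoop.hbase.master.cleaner.TimeToLiveHFileCleaner");

    private HFileCleanerCheck() {}

    /** Returns true when the XML text defines the property with both required classes. */
    static boolean hasRequiredPlugins(String hbaseSiteXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(
                    new ByteArrayInputStream(hbaseSiteXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse configuration XML", e);
        }
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element prop = (Element) props.item(i);
            NodeList names = prop.getElementsByTagName("name");
            NodeList values = prop.getElementsByTagName("value");
            if (names.getLength() == 0 || values.getLength() == 0) {
                continue;
            }
            if (!PROPERTY.equals(names.item(0).getTextContent().trim())) {
                continue;
            }
            // The value is a comma-separated list of class names.
            String[] configured = values.item(0).getTextContent().trim().split("\\s*,\\s*");
            return Arrays.asList(configured).containsAll(REQUIRED);
        }
        return false; // property not defined at all
    }
}
```

The check reads the file contents as a string, so a caller would first load hbase-site.xml from wherever the cluster keeps it.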
Temporary Tables and Backups
There’s a subtle issue with performing a backup when you’re using a temporary table in your session: although the temporary table is (correctly) not backed up, the temporary table’s entry in the system tables will be backed up. When the backup is restored, the table entries will be restored, but the temporary table will be missing.
There’s a simple workaround:
- Exit your current session, which will automatically delete the temporary table and its system table entries.
- Start a new session (reconnect to your database).
- Start your backup job.
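The three steps above can be sketched in JDBC as follows. This is an illustration under stated assumptions: FreshSessionBackup and run are hypothetical names, and the JDBC URL is supplied by the caller because the connection details depend on your installation.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Hypothetical helper (not part of Splice Machine) that backs up from a new session. */
final class FreshSessionBackup {
    static final String BACKUP_CALL = "CALL SYSCS_UTIL.SYSCS_BACKUP_DATABASE(?,?)";

    private FreshSessionBackup() {}

    static void run(Connection current, String jdbcUrl, String backupDir, String backupType)
            throws SQLException {
        // Step 1: end the session that used the temporary table.
        if (current != null) {
            current.close();
        }
        // Step 2: reconnect, then step 3: start the backup in the new session.
        try (Connection fresh = DriverManager.getConnection(jdbcUrl);
             CallableStatement cs = fresh.prepareCall(BACKUP_CALL)) {
            cs.setString(1, backupDir);
            cs.setString(2, backupType);
            cs.execute();
        }
    }
}
```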
If authentication and SQL authorization are both enabled, only the database owner has execute privileges on this procedure by default. The database owner can grant access to other users.
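As an illustration of granting access, the sketch below builds and runs a GRANT statement from a connection opened by the database owner. It assumes that the Derby-style GRANT EXECUTE ON PROCEDURE syntax applies to this procedure; BackupGrants, grantSql, and grant are hypothetical names, and the user name is restricted to simple identifiers because it is concatenated into the statement text.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical helper (not part of Splice Machine) for granting backup privileges. */
final class BackupGrants {
    private BackupGrants() {}

    /** Builds the GRANT statement; rejects anything that is not a simple identifier. */
    static String grantSql(String userName) {
        if (userName == null || !userName.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("Unexpected user name: " + userName);
        }
        return "GRANT EXECUTE ON PROCEDURE SYSCS_UTIL.SYSCS_BACKUP_DATABASE TO " + userName;
    }

    /** Runs the GRANT on a connection that belongs to the database owner. */
    static void grant(Connection ownerConn, String userName) throws SQLException {
        try (Statement st = ownerConn.createStatement()) {
            st.execute(grantSql(userName));
        }
    }
}
```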
The following JDBC example performs an immediate full backup to hdfs:///home/backup:
CallableStatement cs = conn.prepareCall("CALL SYSCS_UTIL.SYSCS_BACKUP_DATABASE(?,?)");
cs.setString(1, "hdfs:///home/backup");
cs.setString(2, "full");
cs.execute();
cs.close();
Backing up a database may take several minutes, depending on the size of your database and how much of it you’re backing up.
The following example runs an immediate incremental backup to hdfs:///home/backup:
splice> CALL SYSCS_UTIL.SYSCS_BACKUP_DATABASE( 'hdfs:///home/backup', 'incremental' );
Statement executed.
The following example runs the same backup and stores it on AWS:
splice> CALL SYSCS_UTIL.SYSCS_BACKUP_DATABASE( 's3://backup1234', 'incremental' );
Statement executed.
And this example does a full backup to a relative directory (relative to the splicemachine directory) on a standalone version of Splice Machine:
splice> CALL SYSCS_UTIL.SYSCS_BACKUP_DATABASE( './dbBackups', 'full' );
Statement executed.