SYSCS_UTIL.SYSCS_BACKUP_SCHEMA

The SYSCS_UTIL.SYSCS_BACKUP_SCHEMA system procedure performs an immediate full backup of the tables and indexes belonging to a schema in your database to a specified backup directory.

ENTERPRISE ONLY: This feature is available only for the Splice Machine Enterprise version of our On-Premise Database product; contact Splice Machine Sales for information.

Syntax

SYSCS_UTIL.SYSCS_BACKUP_SCHEMA( VARCHAR schemaName,
                                VARCHAR directory,
                                VARCHAR type );

schemaName

The name of the schema you want to back up.

directory

Specifies the path to the directory in which you want the backup stored. This can be a local directory if you’re using the standalone version of Splice Machine, or a directory in your cluster’s file system (HDFS or MapR-FS).

You must have permissions set properly to use cloud storage as a backup destination. See Backing Up to Cloud Storage for information about setting backup permissions properties.

Relative paths are resolved based on the current user directory. To avoid confusion, we strongly recommend that you use an absolute path when specifying the backup destination.

type

Specifies the type of table backup that you want performed. Currently, the only valid value is full.

Results

This procedure does not return a result.

Backup Resource Allocation

Splice Machine backups run as Spark jobs, submitting tasks to copy HFiles. In the past, Splice Machine backups used the Apache Hadoop distcp tool to copy the HFile; distcp uses MapReduce to copy, which can require significant resources. These requirements can limit file copying parallelism and reduce backup throughput. Splice Machine backups now can run (and do so by default) using a Spark executor to copy the HFiles, which significantly increases backup performance.

You can revert to using distcp, which uses a MapReduce job that can run into resource issues. For more information, see the Understanding and Troubleshooting Backups topic.

HBase Configuration Options for Incremental Backup

If you’re performing incremental backups, you must add the following options to your hbase-site.xml configuration file:

hbase.master.hfilecleaner.plugins = com.splicemachine.hbase.SpliceHFileCleaner,
org.apache.hadoop.hbase.master.cleaner.TimeToLiveHFileCleaner

Execute Privileges

If authentication and SQL authorization are both enabled, only the database owner has execute privileges on this function by default. The database owner can grant access to other users.

JDBC example

The following example performs an immediate full backup of the TPCH1 schema to a subdirectory of the /backup directory:

CallableStatement cs = conn.prepareCall
  ("CALL SYSCS_UTIL.SYSCS_BACKUP_SCHEMA(?,?,?)");
  cs.setString(1, 'TPCH1');
  cs.setString(2, '/backup');
  cs.setString(3, 'full');
  cs.execute();
  cs.close();

SQL Example: Backup, and Restore a Schema

This example shows you how to back up a schema, then restore it, in these steps:

Backing Up the Schema

This command line performs a full backup of the TPCH1 schema to the /backup directory on HDFS:

splice> CALL SYSCS_UTIL.SYSCS_BACKUP_SCHEMA('TPCH1', '/backup', 'full');
Success
----------------------
FULL backup to /backup

1 row selected

Examining the Backup

After the backup completes, you can examine the sys.sysbackup table to find the ID of your new backup:

splice> SELECT * FROM sys.sysbackup;
BACKUP_ID      |BEGIN_TIMESTAMP          |END_TIMESTAMP            |STATUS     |SCOPE     |INCR&|INCREMENTAL_PARENT_&|BACKUP_ITEM
-----------------------------------------------------------------------------------------------------------------------------------
125953         |2018-10-26 00:12:33.896  |2018-10-26 00:42:53.546  |SUCCESS    |SCHEMA    |false|-1                  |3

You can use the ID of your backup job to examine the sys.sysbackupitems and verify that the base table and two indexes have been backed up:

splice> SELECT * FROM sys.sysbackupitems WHERE backup_Id=125953 ;
BACKUP_ID   |ITEM             |BEGIN_TIMESTAMP           |END_TIMESTAMP
-----------------------------------------------------------------------------------------
125953      |splice:292000    |2018-10-26 00:12:40.512   |2018-10-26 00:32:14.856
125953      |splice:292033    |2018-10-26 00:12:40.513   |2018-10-26 00:42:48.573
125953      |splice:292017    |2018-10-26 00:12:40.512   |2018-10-26 00:41:25.683

3 rows selected

Restoring the Backup

You can restore the table to another table on the same cluster, or on a different cluster. You can optionally specify that you want the backup validated before it is restored; the validation process checks for inconsistencies and missing files.

This command first validates the backed-up schema, and then restores it to a different (pre-existing) schema named NEWTPCH1:

splice> CALL SYSCS_UTIL.SYSCS_RESTORE_SCHEMA('NEWTPCH1', 'TPCH1', '/backup', 125953, true);
Statement executed.

See the reference page for the SYSCS_UTIL.SYSCS_RESTORE_SCHEMA system procedure for more information about restoring a backed-up schema.

See Also