Analyze Command

The analyze command collects statistics for a specific table, or for an entire schema.

Once statistics have been collected for a schema or table, they are automatically used by the query optimizer.

Syntax

ANALYZE TABLE [schemaName '.'] table-Name
              [ESTIMATE STATISTICS SAMPLE samplepercent PERCENT];
ANALYZE SCHEMA schema-Name;

table-Name

The name of the table you want to analyze, which can optionally be qualified by its schema name. If you don’t specify a schemaName, the current schema is assumed.

You must have insert permission for the table to be able to run this command.

schema-Name

The name of the schema you want to analyze.

You must have insert permission for all tables in the schema to be able to run this command.

samplepercent

A value between 0 and 100 that specifies the sampling percentage to use when generating statistics for this table.

If you include this clause, statistics are generated by sampling the specified sampling percentage of the table. This can significantly reduce the overhead associated with generating statistics.

If you do not include this clause, statistics are generated based on the full table.

.

Analyze Table

The ANALYZE TABLE command collects statistics for a specific table in the current schema. It also collects statistics for the index associated with the table in the schema. For example, if you have the following in your database:

  • a table named myTable
  • myTable has two indices: myTableIndex1 and myTableIndex2

Then ANALYZE TABLE will collect statistics for myTable, myTableIndex1, and myTableIndex2.

The ANALYZE TABLE command displays the following information for each partition of the table:

Value Description
schemaName The name of the schema.
tableName The name of the table.
partition The Splice Machine partition. We merge the statistics for all table partitions, so the partition will show as -All- when you specify one of the non-merged type values for the statsType parameter.
rowsCollected The total number of rows collected for the table.
partitionSize The combined size of the table's partitions.
statsType

The type of statistics, which is one of these values:

0 Full table (not sampled) statistics that reflect the unmerged partition values.
1 Sampled statistics that reflect the unmerged partition values.
2 Full table (not sampled) statistics that reflect the table values after all partitions have been merged.
3 Sampled statistics that reflect the table values after all partitions have been merged.
sampleFraction

The sampling percentage, expressed as 0.0 to 1.0,

  • If statsType=0 or statsType=1 (full statistics), this value is not used, and is shown as 0.
  • If statsType=2 or statsType=3, this value is the percentage or rows to be sampled. A value of 0 means no rows, and a value of 1 means all rows (same as full statistics).

Analyze Schema

The ANALYZE SCHEMA command collects statistics for every table in the schema. It also collects statistics for the index associated with every table in the schema. For example, if you have the following situation:

  • a schema named mySchema
  • mySchema contains two tables: myTable1 and myTable2
  • myTable1 has two indices: myTable1Index1 and myTable1Index2

Then the ANALYZE SCHEMA command will collect statistics for myTable1, myTable2, myTable1Index1, and myTable1Index2.

The ANALYZE SCHEMA command displays the same information as shown for ANALYZE TABLE, for each table in the in the schema.

This command operates like the SYSCS_UTIL.COLLECT_SCHEMA_STATISTICS built-in system procedure.

Examples

splice> analyze table test.t2;
schemaName |tableName |partition |rowsCollec&|partitionSize |partitionCount  |statsType  |sampleFraction
-----------------------------------------------------------------------------------------------------------------
TEST       |T2        |-All-     |39226      |235356        |1               |2          |0

1 rows selected
splice>splice> analyze table test.t2 estimate statistics sample 50 percent;
schemaName |tableName |partition |rowsCollec&|partitionSize |partitionCount  |statsType  |sampleFraction
-----------------------------------------------------------------------------------------------------------------
TEST       |T2        |-All-     |19613      |235356        |1               |3          |0.5

1 rows selected
splice>splice> analyze schema test;
schemaName |tableName |partition |rowsCollec&|partitionSize |partitionCount  |statsType  |sampleFraction
-----------------------------------------------------------------------------------------------------------------
TEST       |T2        |-All-     |39226      |235356        |1               |2          |0
TEST       |T5        |-All-     |39226      |235356        |1               |2          |0
2 rows selected
splice>