Configuring an S3 Bucket for Splice Machine Access

Splice Machine can access S3 buckets, making it easy for you to store and manage your data on AWS. To do so, you need to configure AWS Identity and Access Management (IAM) to allow that access. This topic walks you through the required steps.

You must have administrative access to AWS to configure your S3 buckets for Splice Machine.

This topic contains these two sections:

  • Configuring S3 Bucket Access
  • Accessing S3 Buckets

Configuring S3 Bucket Access

You can follow these steps to configure access to your S3 bucket(s) for Splice Machine; when you’re done, you will have:

  • created an IAM policy for an S3 bucket
  • created an IAM user
  • generated access credentials for that user
  • attached the security policy to that user

  1. Log in to the AWS Management Console

    You must have administrative access to configure S3 bucket access.

  2. Select Services at the top of the dashboard

  3. Access the IAM (Identity and Access Management) service:

    Select IAM in the Security, Identity & Compliance section.

  4. Create a new policy:

    1. Select Policies from the IAM screen, then select Create Policy.

    2. Select Create Your Own Policy to enter your own policy.

    3. In the Review Policy section, which should be pre-selected, specify a name for this policy (this topic uses splice_access).

    4. Paste the following JSON policy into the Policy Document field, then replace the <bucket_name> and <prefix> placeholders with your bucket name and folder path:

      {
          "Version": "2012-10-17",
          "Statement": [
              {
                  "Effect": "Allow",
                  "Action": [
                    "s3:PutObject",
                    "s3:GetObject",
                    "s3:GetObjectVersion",
                    "s3:DeleteObject",
                    "s3:DeleteObjectVersion"
                  ],
                  "Resource": "arn:aws:s3:::<bucket_name>/<prefix>/*"
              },
              {
                  "Effect": "Allow",
                  "Action": "s3:ListBucket",
                  "Resource": "arn:aws:s3:::<bucket_name>",
                  "Condition": {
                      "StringLike": {
                          "s3:prefix": [
                              "<prefix>/*"
                          ]
                      }
                  }
              },
              {
                  "Effect": "Allow",
                  "Action": "s3:GetAccelerateConfiguration",
                  "Resource": "arn:aws:s3:::<bucket_name>"
              }
          ]
      }
      
    5. Click Validate Policy to verify that your policy settings are valid.

    6. Click Create Policy to create and save the policy.

  5. Add Splice Machine as a user:

    After you create the policy:

    1. Select Users from the left-hand navigation pane.

    2. Click Add User.

    3. Enter a User name (we’ve used SpliceMachine) and select Programmatic access as the access type.

    4. Click Attach existing policies directly.

    5. Select the policy you just created and click Next.

    6. Review your settings, then click Create User.

  6. Save your access credentials

    Record your access key ID and secret access key now; the secret access key cannot be recovered later.

    Splice Machine strongly recommends that you click the Download .csv button and save your credentials in a file for future reference. Once you close this screen, you’ll be unable to display your secret access key.
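
If you set up access for several buckets, filling in the policy placeholders by hand is easy to get wrong. The following Python sketch builds the policy document from step 4 for a given bucket name and prefix. It is an illustration only, not part of Splice Machine or AWS tooling; the function name build_splice_policy and the sample values my-bucket and data/import are hypothetical.

```python
import json


def build_splice_policy(bucket_name, prefix):
    """Return the step 4 policy as a dict for one bucket and folder prefix.

    Hypothetical helper: it only substitutes <bucket_name> and <prefix>
    in the policy shown earlier in this topic.
    """
    prefix = prefix.strip("/")  # tolerate "data/import/" or "/data/import"
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level permissions, limited to the prefix
                "Effect": "Allow",
                "Action": [
                    "s3:PutObject",
                    "s3:GetObject",
                    "s3:GetObjectVersion",
                    "s3:DeleteObject",
                    "s3:DeleteObjectVersion",
                ],
                "Resource": f"{bucket_arn}/{prefix}/*",
            },
            {
                # Listing is a bucket-level action, restricted by prefix
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": bucket_arn,
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                "Effect": "Allow",
                "Action": "s3:GetAccelerateConfiguration",
                "Resource": bucket_arn,
            },
        ],
    }


if __name__ == "__main__":
    # Sample values are placeholders, not real resources
    print(json.dumps(build_splice_policy("my-bucket", "data/import"), indent=4))
```

The printed JSON can be pasted into the Policy Document field in place of the hand-edited template.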

Accessing S3 Buckets

Once you’ve established your access keys, you can include them inline in the s3a URL that you pass to Splice Machine; for example:

call SYSCS_UTIL.IMPORT_DATA ('TPCH', 'REGION', null, 's3a://(access key):(secret key)@splice-benchmark-data/flat/TPCH/100/region', '|', null, null, null, null, -1, 's3a://(access key):(secret key)@splice-benchmark-data/flat/TPCH/100/importLog', true, null);

Alternatively, you can specify the keys once in the core-site.xml file on your cluster, and then simply specify the s3a URL; for example:

call SYSCS_UTIL.IMPORT_DATA ('TPCH', 'REGION', null, 's3a://splice-benchmark-data/flat/TPCH/100/region', '|', null, null, null, null, 0, '/BAD', true, null);
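
The two example calls differ mainly in whether the s3a URL carries credentials. As a rough sketch of that difference, the Python helpers below assemble either URL form and the surrounding call text. The function and parameter names (s3a_url, import_data_call, max_bad_records, bad_record_dir) are hypothetical labels chosen for this illustration; the positional layout is simply copied from the examples above, and no escaping of special characters in keys is attempted.

```python
def s3a_url(bucket, key_path, access_key=None, secret_key=None):
    """Build an s3a URL; embed credentials only when both keys are given."""
    if access_key and secret_key:
        return f"s3a://{access_key}:{secret_key}@{bucket}/{key_path}"
    return f"s3a://{bucket}/{key_path}"


def import_data_call(schema, table, source_url, bad_record_dir,
                     max_bad_records=0, delimiter="|"):
    """Format a call shaped like the IMPORT_DATA examples in this topic.

    Assumption: the two arguments after the run of nulls are the
    bad-record limit and the bad-record location, as the examples suggest.
    """
    return (
        f"call SYSCS_UTIL.IMPORT_DATA ('{schema}', '{table}', null, "
        f"'{source_url}', '{delimiter}', null, null, null, null, "
        f"{max_bad_records}, '{bad_record_dir}', true, null);"
    )


if __name__ == "__main__":
    # No inline keys: relies on credentials configured on the cluster
    url = s3a_url("splice-benchmark-data", "flat/TPCH/100/region")
    print(import_data_call("TPCH", "REGION", url, "/BAD"))
```

Keeping the keys out of the URL means they do not end up in SQL scripts or query logs.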

To add your access and secret access keys to the core-site.xml file, define the fs.s3a.access.key and fs.s3a.secret.key properties in that file:

<property>
   <name>fs.s3a.access.key</name>
   <value>access key</value>
</property>
<property>
   <name>fs.s3a.secret.key</name>
   <value>secret key</value>
</property>
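
Secret access keys can contain characters that need escaping in XML. The following Python sketch generates the two property elements with the standard library so that escaping is handled for you. The function name s3a_credential_properties and the sample key values are hypothetical; you would still paste the output inside the configuration element of your own core-site.xml.

```python
import xml.etree.ElementTree as ET


def s3a_credential_properties(access_key, secret_key):
    """Return the two <property> elements as XML text, one per line."""
    rendered = []
    for name, value in (
        ("fs.s3a.access.key", access_key),
        ("fs.s3a.secret.key", secret_key),
    ):
        prop = ET.Element("property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
        # ElementTree escapes characters such as & and < in the value
        rendered.append(ET.tostring(prop, encoding="unicode"))
    return "\n".join(rendered)


if __name__ == "__main__":
    # Placeholder values only; never commit real keys to source control
    print(s3a_credential_properties("EXAMPLE_ACCESS_KEY", "example&secret"))
```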