This article covers:

  1. How to set up a connection to an AWS Athena database
  2. How to add datasets
  3. How to do connection overrides on an AWS Athena connection
  4. The supported AWS Athena data types and the respective Luzmo data type they are mapped to

1. How to setup an AWS Athena connection

To create a new AWS Athena connection, navigate to the Connections page, select New Connection, and select the 'AWS Athena' connector. You'll be asked to provide values for the following properties:

  • S3 results bucket: this is the S3 bucket where you want results of your queries to be stored. This bucket should have a format like: s3://<bucket name>. You can access it by inspecting the details of your work group:
  • IAM Key: the access key id of an IAM user (with programmatic access) that is authorized to query AWS Athena (has query access to Athena, read access to Glue and read access to any S3 buckets or connected databases).
  • IAM Secret: the secret access key of the IAM user.
  • Region: the region of your AWS Athena cluster.
  • Workgroup: the workgroup of your AWS Athena cluster.

Notes:

  1. AWS Athena only supports lowercase table and column names and only lowercase special characters (even though you can create these tables in AWS Glue). More information here: https://docs.aws.amazon.com/athena/latest/ug/tables-databases-columns-names.html

  2. To ensure that e.g. only the Luzmo application can access your AWS Athena instance, it is highly recommended to whitelist Luzmo's range of static IP addresses. You should whitelist Luzmo's range of IP addresses as described here.

  3. The connector requires the IAM user to have read-level authorization to access AWS Athena, AWS Glue and any of the sources (eg. S3 buckets or databases) queried. The user must also have read & write access to the S3 results bucket used. An example IAM Policy that gives this access is:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "athena:ListEngineVersions",
                    "athena:ListDataCatalogs",
                    "glue:GetTables",
                    "glue:GetPartitions",
                    "athena:ListWorkGroups",
                    "glue:GetDatabases",
                    "glue:GetTable"
                ],
                "Resource": "*"
            },
            {
                "Effect": "Allow",
                "Action": [
                    "athena:GetTableMetadata",
                    "athena:StartQueryExecution",
                    "athena:GetQueryResultsStream",
                    "athena:GetQueryResults",
                    "s3:ListBucket",
                    "athena:GetDatabase",
                    "athena:GetDataCatalog",
                    "athena:GetNamedQuery",
                    "athena:GetPreparedStatement",
                    "athena:ListTagsForResource",
                    "athena:ListQueryExecutions",
                    "athena:ListNamedQueries",
                    "athena:GetWorkGroup",
                    "athena:ListDatabases",
                    "athena:StopQueryExecution",
                    "athena:GetQueryExecution",
                    "athena:BatchGetNamedQuery",
                    "athena:ListTableMetadata",
                    "athena:ListPreparedStatements",
                    "athena:BatchGetQueryExecution"
                ],
                "Resource": [
                    "arn:aws:s3:::source-bucket",
                    "arn:aws:athena:*:457806912465:datacatalog/*",
                    "arn:aws:athena:*:457806912465:workgroup/primary"
                ]
            },
            {
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": "arn:aws:s3:::source-bucket/*"
            },
            {
                "Effect": "Allow",
                "Action": [
                    "s3:PutObject",
                    "s3:GetObject",
                    "s3:ListBucketMultipartUploads",
                    "s3:AbortMultipartUpload",
                    "s3:ListBucket",
                    "s3:GetBucketLocation",
                    "s3:ListMultipartUploadParts"
                ],
                "Resource": [
                    "arn:aws:s3:::query-results-bucket/*",
                    "arn:aws:s3:::query-results-bucket"
                ]
            }
        ]
    }
    

2. How to add datasets

Once you have connected your AWS Athena instance, you can add datasets as explained here. You can select one or multiple datasets as accessible in your Athena instance, and optionally link them in Luzmo to ensure they can be used together in a dashboard. Also make sure to check out this article on Preparing your data for analytics.

To find out how to add datasets using our API, please refer to this example in our developer documentation.

3. AWS Athena Connection Overrides

When generating an Authorization token to grant a user acces to your embedded dashboards it is possible to override the data source properties in the authorization request to dynamically use different properties. Find out more about connection overrides.

The fields available for overriding an AWS Athena connection are as follows:

  • Connection ID: The ID of the Connection to be overridden. Retrieve the ID to specify as detailed here.
  • host:The new S3 results bucket to connect to. This bucket should have a format like: s3://<bucket name>. You can access it by inspecting the details of your work group.
  • key: The new Access key ID that will be used to connect to your AWS Athena cluster.
  • token: The new Secret Access key to connect to your AWS Athena cluster.
  • region: The new region of your AWS Athena cluster.
  • workgroup: The new workgroup of your AWS Athena cluster.
  • datasets: List of dataset-level overrides. Useful if you want to override only a single dataset in your dashboard or if you have a separate table per client.
    • schema: The new database to query.
    • table: The new AWS Athena table to query.

Example code demonstrating how to override an AWS Athena connection:

 import Luzmo from '@luzmo/nodejs-sdk';
const client = new Luzmo({
api_key: '<your Luzmo API key>',
api_token: '<your Luzmo API token>',
host: 'https://api.luzmo.com:443'
});


const response = await client.create('authorization',
{
  type: "embed",
  username: "< A unique and immutable identifier for your user >",
  name: "< user name >",
  email: "< user email >",
  suborganization: "< a suborganization name >",
  access: {
    collections: [
          {
        id: "<collection_id>",
        inheritRights: "use"
          }
      ]
  },
  account_overrides: {
      "<account_id>": {
          "properties": {
              "host": "<new_s3_results_bucket_endpoint>",
              "key": "<new_IAM_access_id>",
              "token": "<new_IAM_access_key>",
              "region": "<new_Athena_region>",
              "workgroup": "<new_Athena_workgroup>"
          },
          "datasets": {
              "<dataset_id>": {
                  "schema": "<new_Athena_database>",
                  "table": "<new_Athena_table_or_view>"
              }
          }
      }
  }
);

4. Supported Data Types

Data Type AWS Athena Translated Data Type in Luzmo
tinyint numeric
int numeric
smallint numeric
bigint numeric
double numeric
float numeric
decimal numeric
integer numeric
real numeric
boolean hierarchy
string hierarchy
char hierarchy
varchar hierarchy
date datetime
timestamp datetime

Need more information?

Do you still have questions? Let us know how we can help.
Send us feedback!