Create a Presto Cluster

To create an Ahana-managed Presto cluster in the Ahana Compute Plane of your AWS account:

In the Ahana SaaS Console, select Clusters, then select Create Cluster.

The Create a Cluster page is displayed.

Enter Cluster Details

tip

Select the Help Text icon for additional help.

General Settings

Enter a Cluster Name

Enter a Name for the cluster. The Cluster Name:

  • must be unique across your Ahana Compute Plane
  • must begin and end with a letter or number
  • may contain letters, numbers, spaces, and hyphens
  • must be no longer than 63 characters
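
The format rules above can be checked before a name is submitted. The following is a minimal sketch, not Ahana's own validation: the function name is hypothetical, "letters" and "numbers" are assumed to mean ASCII characters, and uniqueness across the Ahana Compute Plane is not checked because it cannot be verified locally.

```python
import re

# Assumption: "letters" and "numbers" mean ASCII characters. The source does
# not say whether non-ASCII letters are accepted.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9 -]*[A-Za-z0-9])?")
MAX_NAME_LENGTH = 63


def is_valid_cluster_name(name: str) -> bool:
    """Check the documented format rules (hypothetical helper).

    Uniqueness across the Ahana Compute Plane is NOT checked here.
    """
    if not 1 <= len(name) <= MAX_NAME_LENGTH:
        return False
    # Must begin and end with a letter or number; spaces and hyphens are
    # allowed only in between.
    return _NAME_PATTERN.fullmatch(name) is not None
```

For example, `is_valid_cluster_name("sales reports-2")` passes the format check, while `is_valid_cluster_name("-telemetry")` does not because the name begins with a hyphen.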

Ahana recommends entering a descriptive name to help identify the cluster.

The cluster name is used as part of the cluster endpoints. For example, a cluster name telemetry would be used to form the Presto endpoint https://telemetry.tenant.cp.ahana.cloud and the JDBC endpoint jdbc:presto://telemetry.tenant.cp.ahana.cloud:443. For more information about how the endpoints are defined, see Endpoints.
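
As an illustration of the example above, the two endpoints can be derived from the cluster name as follows. This is a sketch under stated assumptions: the function name is hypothetical, the domain tenant.cp.ahana.cloud is taken from the example and will differ for a real tenant, and the sketch accepts only names made of lowercase letters, numbers, and hyphens, because this page does not describe how names with spaces or uppercase letters are mapped (see Endpoints).

```python
import re


def build_endpoints(cluster_name: str, domain: str = "tenant.cp.ahana.cloud") -> dict:
    """Build the Presto and JDBC endpoints for a cluster (hypothetical helper).

    Assumption: the default domain is the placeholder from the documentation
    example, not a real tenant domain.
    """
    # Only handle names that can be used directly as a host label. How other
    # valid cluster names are translated is not covered by this page.
    if not re.fullmatch(r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?", cluster_name):
        raise ValueError(f"name not handled by this sketch: {cluster_name!r}")
    host = f"{cluster_name}.{domain}"
    return {
        "presto": f"https://{host}",
        "jdbc": f"jdbc:presto://{host}:443",
    }
```

Calling `build_endpoints("telemetry")` reproduces the two endpoints shown in the example.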

Select the Workload Profile

Concurrent queries are the number of queries executing at the same time in a cluster. Ahana has identified workloads based on the number of concurrent queries and curated a set of tuned session properties for each workload profile.

Select the Workload Profile based on the number of concurrent queries expected to run on the cluster.

  • Low Concurrency is useful for clusters that run a limited number of queries or a few large, complex queries. Low concurrency also supports bigger and heavier ETL jobs.

  • High Concurrency is better for running multiple queries at the same time, such as dashboard and reporting queries or A/B testing analytics.

note

The workload profile of an existing cluster can be changed without restarting the cluster. The change applies only to queries that begin after the change is made.

Cluster Settings

Select the Node Instance Types

Select the AWS EC2 instance type to be provisioned for the Coordinator Instance Type. Because Presto has only one coordinator node, it is important to have an instance that can support the workload. The recommended Coordinator Instance Type is r5.4xlarge.

Select the AWS EC2 instance type to be provisioned for the Worker Node Instance Type. The recommended Worker Node Instance Type is r5.2xlarge.

tip

For more information about the R5 instance class, see Amazon EC2 R5 Instances.

The M5, R5, and C5 instance types provide the best price performance from the underlying Intel Cascade Lake Process Technology.

To learn more about Intel-optimized instances, visit the AWS and Intel Partner Page.

Configure Cluster Scaling

There are two types of scaling strategy available:

  • Static: A Static scaling strategy means that the number of worker nodes is constant while the cluster is being used. See Configure Static Scaling.
  • Scale Out only (CPU): A Scale Out only (CPU) scaling strategy means that the number of worker nodes begins at a minimum and increases to a maximum based on the worker nodes' average CPU utilization. See Configure Scale Out only (CPU) Scaling.

For more information, see Presto Cluster Autoscaling.

note

The choice of Static or Scale Out only (CPU) scaling strategy, Scale to a single worker node when idle, and the Query Termination Grace Period cannot be modified after the Presto cluster is created.

Configure Static Scaling

In Scaling Strategy, select Static.

Enter the Default Worker Node Count for the number of worker nodes in the Presto cluster. Choose a number between 1 and 100.

Optionally, select Scale to a single worker node when idle to scale the cluster to a single worker node when the cluster is idle for a user-specified amount of time.

If Scale to a single worker node when idle is enabled, the cluster idle time limit can be set in Time window before scaling to a single worker node. The default value is 30 minutes.

Configure Scale Out only (CPU) Scaling

In Scaling Strategy, select Scale Out only (CPU).

Enter the:

  • Minimum Worker Node Count
  • Maximum Worker Node Count
  • Scale Out Step Size

The Presto cluster starts with the number of worker nodes set in Minimum Worker Node Count. If the average CPU utilization of the worker nodes remains above 75% for a period of 15 minutes, new worker nodes are added in increments of the Scale Out Step Size, up to the Maximum Worker Node Count. See When does autoscaling occur?
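
The scale-out rule can be sketched as a small function. This is an illustrative model, not Ahana's autoscaler: the function name is hypothetical, one CPU sample per minute is assumed, and "above 75% for a period of 15 minutes" is interpreted as every one of the last 15 samples exceeding 75%.

```python
from typing import Sequence

CPU_THRESHOLD_PERCENT = 75.0  # from the documentation
WINDOW_MINUTES = 15           # from the documentation


def next_worker_count(current: int, maximum: int, step: int,
                      cpu_samples: Sequence[float]) -> int:
    """Return the worker count after evaluating one scale-out decision.

    cpu_samples holds the workers' average CPU utilization in percent.
    Assumption: one sample per minute, most recent sample last.
    """
    window = list(cpu_samples)[-WINDOW_MINUTES:]
    sustained = (
        len(window) == WINDOW_MINUTES
        and all(sample > CPU_THRESHOLD_PERCENT for sample in window)
    )
    if not sustained:
        return current
    # Add one step of workers, but never exceed the configured maximum.
    return min(current + step, maximum)
```

With a minimum of 2 workers, a step size of 3, and a maximum of 10, fifteen consecutive samples at 80% give `next_worker_count(2, 10, 3, [80.0] * 15) == 5`. A cluster already at 9 workers is capped at 10.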

Optionally, set the Time window before scaling to minimum worker node count. The default value is 30 minutes.

Enter the Query Termination Grace Period

Optionally, set the Query Termination Grace Period value.

When the number of Presto workers in a cluster is reduced, worker nodes are shut down gracefully so that running queries do not fail because of the scale in. The Query Termination Grace Period is the maximum time window allowed for existing query tasks to complete on Presto workers before those workers are forcefully terminated. The default is 10 minutes, and the allowed range is 1 to 120 minutes.
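
The limits and the effect of the grace period can be modeled as follows. This is a simplified sketch: both function names are hypothetical, the 1 to 120 minute range is assumed to be inclusive, and a worker's outstanding work is reduced to a single "minutes remaining" figure.

```python
DEFAULT_GRACE_MINUTES = 10  # documented default
MIN_GRACE_MINUTES = 1       # documented lower bound (assumed inclusive)
MAX_GRACE_MINUTES = 120     # documented upper bound (assumed inclusive)


def validate_grace_period(minutes: int = DEFAULT_GRACE_MINUTES) -> int:
    """Return the grace period if it is within the documented range."""
    if not MIN_GRACE_MINUTES <= minutes <= MAX_GRACE_MINUTES:
        raise ValueError(
            f"grace period must be between {MIN_GRACE_MINUTES} and "
            f"{MAX_GRACE_MINUTES} minutes, got {minutes}"
        )
    return minutes


def tasks_finish_in_time(remaining_task_minutes: float,
                         grace_minutes: int = DEFAULT_GRACE_MINUTES) -> bool:
    """True if a worker's running tasks complete before forced termination."""
    return remaining_task_minutes <= validate_grace_period(grace_minutes)
```

Under this model, a query task that needs 15 more minutes on a worker being removed would be cut off with the default setting but would complete with a 20-minute grace period.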

Data Lake Settings

Configure the Hive Metastore

Select Attach an Ahana Hive Metastore to provision a Hive Metastore named ahana_hive that is pre-configured and attached to the Presto cluster.

Select the AWS EC2 instance type to be provisioned for the Hive Metastore Instance Type. The recommended instance type is m5.xlarge.

note

The provisioned Ahana Hive Metastore is pre-integrated with an S3 bucket. You can find information about the S3 bucket at Ahana-managed Hive Metastore and Amazon S3 storage. Use the ahana_hive name in endpoints to connect to the Hive Metastore.

Enable cluster query log

The Presto query log is stored in an S3 bucket by default. Optionally, select Enable cluster query log to attach the query log to the Ahana-managed Hive Metastore. Selecting this option creates an external table and view in the attached Hive Metastore for easy access to the query log.

note

Enable cluster query log is available only if Attach an Ahana Hive Metastore is selected.

Configure Data Lake Caching

note

If the selected Worker Node Instance Type is a d-type instance (for example, c5d.xlarge), both Enable Data IO Cache and Enable Intermediate Result Set Cache are automatically enabled, and both caches use the instance storage instead of AWS EBS SSD volumes.

Select Enable Data IO Cache to configure a local AWS EBS SSD drive for each worker node. The volume size of the configured AWS EBS SSD is three times the memory size of the selected Worker Node Instance Type for the Presto cluster.

Select Enable Intermediate Result Set Cache to cache partially computed result sets on the worker node's local AWS EBS SSD. This avoids repeating the same computation across multiple queries, which improves query performance and decreases CPU usage. The volume size of the AWS EBS SSD for the intermediate result set cache is two times the memory size of the selected Worker Node Instance Type for the Presto cluster.
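
The sizing rules for the two caches reduce to simple arithmetic on the worker's memory. The sketch below is illustrative only: the function and key names are hypothetical, and the memory size is passed in by the caller rather than looked up from AWS. It applies to EBS-backed caches; for d-type instances the caches use instance storage, so these sizes do not apply.

```python
def cache_volume_sizes_gib(worker_memory_gib: int,
                           data_io_cache: bool = True,
                           intermediate_result_set_cache: bool = True) -> dict:
    """Per-worker EBS volume sizes in GiB for each enabled cache.

    Data IO cache: 3 x worker memory.
    Intermediate result set cache: 2 x worker memory.
    A disabled cache is reported as 0.
    """
    return {
        "data_io_cache_gib": 3 * worker_memory_gib if data_io_cache else 0,
        "intermediate_result_set_cache_gib": (
            2 * worker_memory_gib if intermediate_result_set_cache else 0
        ),
    }
```

For a worker instance type with 64 GiB of memory, this gives a 192 GiB volume for the data IO cache and a 128 GiB volume for the intermediate result set cache on each worker node.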

note

Enable Intermediate Result Set Cache is only beneficial for workloads with aggregation queries.

Identity Provider

If an Identity Provider is configured, the Identity Provider pane is displayed and Enable authentication through OIDC is selected.

note

To enable identity providers in Ahana, contact Ahana Support.

Presto Users

Each Presto cluster must have at least one Presto user. Select the Selected checkbox for a Presto user to add that user to the cluster.

Select Create Presto User to create a new Presto user. After you create a Presto user, you can add it to your cluster.

You can also add or remove Presto users after the cluster is created.

Presto user authentication is done over HTTPS to secure connections from clients such as the Presto CLI, JDBC drivers, and Apache Superset.

info

If you are creating a cluster using an Ahana Compute Plane version below 3.0, the Presto Users pane is not present because Presto clusters created with Ahana Compute Plane versions below 3.0 support only one Presto user per cluster. See Create Single Presto User Cluster Credentials, then return to this page to finish creating the Presto cluster.

Create the Cluster

Select Create Cluster to create the cluster.