Disk Provisioning on AWS
If you are running on Kubernetes, see Portworx on Kubernetes on AWS.
This guide explains how Portworx dynamic disk provisioning works on AWS and what it requires. It is typically useful when an autoscaling group (ASG) manages your AWS instances.
AWS Requirements
Granting Portworx the needed AWS permissions
Portworx creates and attaches EBS volumes, so it needs the corresponding AWS permissions. Below is a sample policy describing these permissions:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "<stmt-id>",
            "Effect": "Allow",
            "Action": [
                "ec2:AttachVolume",
                "ec2:ModifyVolume",
                "ec2:DetachVolume",
                "ec2:CreateTags",
                "ec2:CreateVolume",
                "ec2:DeleteTags",
                "ec2:DeleteVolume",
                "ec2:DescribeTags",
                "ec2:DescribeVolumeAttribute",
                "ec2:DescribeVolumesModifications",
                "ec2:DescribeVolumeStatus",
                "ec2:DescribeVolumes",
                "ec2:DescribeInstances",
                "autoscaling:DescribeAutoScalingGroups"
            ],
            "Resource": [
                "*"
            ]
        }
    ]
}
You can provide these permissions to Portworx in one of the following ways:
- Instance Privileges: Grant the above permissions to all the instances in the autoscaling cluster by applying the corresponding IAM role. More information about IAM roles and policies can be found here.
- Environment Variables: Create a user with the above policy and provide the security credentials (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY) to Portworx.
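Before attaching a policy to a role or user, it can help to confirm that it lists every action from the sample above. The following is a minimal sketch, not a Portworx or AWS tool: the names REQUIRED_ACTIONS and missing_actions are hypothetical, and the check compares action names literally, so it does not expand wildcards such as "ec2:*" and does not evaluate Deny statements or Resource restrictions.

```python
# Actions copied from the sample policy in this guide.
REQUIRED_ACTIONS = {
    "ec2:AttachVolume",
    "ec2:ModifyVolume",
    "ec2:DetachVolume",
    "ec2:CreateTags",
    "ec2:CreateVolume",
    "ec2:DeleteTags",
    "ec2:DeleteVolume",
    "ec2:DescribeTags",
    "ec2:DescribeVolumeAttribute",
    "ec2:DescribeVolumesModifications",
    "ec2:DescribeVolumeStatus",
    "ec2:DescribeVolumes",
    "ec2:DescribeInstances",
    "autoscaling:DescribeAutoScalingGroups",
}


def missing_actions(policy: dict) -> set:
    """Return the required actions that no Allow statement lists by name.

    Hypothetical helper: literal name comparison only, no wildcard
    expansion and no evaluation of Deny statements or resources.
    """
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # IAM allows a single statement object
        statements = [statements]

    granted = set()
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):  # IAM allows a single action string
            actions = [actions]
        granted.update(actions)
    return REQUIRED_ACTIONS - granted
```

Loading the policy file with json.load and passing the result to missing_actions returns an empty set when every action in the sample is present.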
EBS volume template
An EBS volume template defines the EBS volume properties that Portworx will use as a reference. These templates are given to Portworx during installation.
Use a template specification
You can specify a template spec that Portworx uses to create new EBS volumes.
The spec has the following format:
"type=<EBS volume type>,size=<size of EBS volume>,iops=<IOPS value>,enc=<true/false>,kms=<CMK>,tags=<key:value;key:value>,throughput=<throughput of the disk>"
type: The following types are supported:
- gp2
- gp3
- io1 (For io1 volumes specifying the iops value is mandatory.)
size: This is the size of the EBS volume in GB
iops: This is the number of I/O operations per second (IOPS) required from the EBS volume.
enc: This needs to be set to true if EBS volumes need to be encrypted. Default: false
kms: This is the AWS KMS key used to encrypt the EBS volume, that is, the <key> portion of arn:aws:kms:us-east-1:<account-id>:key/<key>
tags: This adds custom labels to EBS volumes created on EKS drives. The key-value pairs are added as labels to the newly created volumes.
throughput: This specifies the throughput of the disk. It is valid only for the gp3 disk type. The ratio of the throughput value to the accompanying iops parameter must not exceed 0.25. For example, if the value of iops is 4000, the value of throughput must not exceed 1000.
See EBS details for more information on the above parameters.
Examples
"type=gp2,size=200"
"type=gp2,size=100","type=io1,size=200,iops=1000"
"type=gp2,size=100,enc=true,kms=AKXXXXXXXX123","type=io1,size=200,iops=1000,enc=true,kms=AKXXXXXXXXX123"
"type=gp2,size=100,tags=key:value;key:value"
"type=gp3,size=199,iops=4000,throughput=1000"
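The rules above can be checked before a spec is handed to Portworx. The following is a minimal sketch and not the parser Portworx itself uses: parse_spec and validate_spec are hypothetical names, and the sketch assumes that every spec carries a type field and that iops and throughput are whole numbers. It enforces only the constraints stated in this guide: a supported type, a mandatory iops value for io1, throughput allowed only on gp3, and a throughput-to-iops ratio of at most 0.25.

```python
SUPPORTED_TYPES = {"gp2", "gp3", "io1"}


def parse_spec(spec: str) -> dict:
    """Split a 'key=value,key=value' template spec into a dict of strings."""
    fields = {}
    for part in spec.split(","):
        key, sep, value = part.partition("=")
        if not sep or not key.strip() or not value.strip():
            raise ValueError(f"malformed field: {part!r}")
        fields[key.strip()] = value.strip()
    return fields


def validate_spec(spec: str) -> dict:
    """Parse a spec and check the constraints described in this guide."""
    fields = parse_spec(spec)
    vol_type = fields.get("type")  # assumption: type is always given
    if vol_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported volume type: {vol_type!r}")
    if vol_type == "io1" and "iops" not in fields:
        raise ValueError("io1 volumes require an iops value")
    if "throughput" in fields:
        if vol_type != "gp3":
            raise ValueError("throughput is valid only for gp3")
        if "iops" in fields:
            # throughput / iops <= 0.25, written with integers only
            if int(fields["throughput"]) * 4 > int(fields["iops"]):
                raise ValueError("throughput-to-iops ratio exceeds 0.25")
    return fields
```

For instance, validate_spec("type=gp3,size=199,iops=4000,throughput=1000") returns the parsed fields, while raising throughput to 1001 fails the ratio check.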
Limiting storage nodes
Portworx allows you to create a heterogeneous cluster where some of the nodes are storage nodes and the rest are storageless.
You can specify the number of storage nodes in your cluster by setting the max_storage_nodes_per_zone input argument.
This instructs Portworx to limit the number of storage nodes in each zone to the value of the max_storage_nodes_per_zone argument. The total number of storage nodes in your cluster will be:
Total Storage Nodes = (Num of Zones) * max_storage_nodes_per_zone
When planning capacity for your autoscaling cluster, make sure the minimum size of your cluster is equal to the total number of storage nodes in Portworx. This ensures that when you scale up your cluster, only storageless nodes are added. When you scale down, the cluster shrinks only to its minimum size, so all Portworx storage nodes stay online and available.
You can also omit the max_storage_nodes_per_zone argument. In that case, nodes added when you scale up the cluster will also be storage nodes, but scaling down removes storage nodes, which can cause Portworx to lose quorum.
Examples:
"-s", "type=gp2,size=200", "-max_storage_nodes_per_zone", "1"
For a cluster of 6 nodes spanning 3 zones (us-east-1a, us-east-1b, us-east-1c), in the above example Portworx will have 3 storage nodes (one in each zone) and 3 storageless nodes. Portworx will create a total of 3 disks of size 200 each and attach one disk to each storage node.
"-s", "type=gp2,size=200", "-s", "type=io1,size=100,iops=1000", "-max_storage_nodes_per_zone", "2"
For a cluster of 9 nodes spanning 2 zones (us-east-1a,us-east-1b), in the above example Portworx will have 4 storage nodes and 5 storageless nodes. Portworx will create a total of 8 disks (4 of size 200 and 4 of size 100). Portworx will attach a set of 2 disks (one of size 200 and one of size 100) to each of the 4 storage nodes.
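The arithmetic in these two examples can be reproduced with a small calculation. The following is a minimal sketch for capacity planning only; plan_storage is a hypothetical helper, not part of Portworx, and it assumes that every zone contains at least max_storage_nodes_per_zone nodes and that each storage node receives one disk per "-s" spec.

```python
def plan_storage(total_nodes: int, num_zones: int,
                 max_storage_nodes_per_zone: int, num_specs: int) -> dict:
    """Estimate the node and disk counts for a cluster.

    Assumes every zone has at least max_storage_nodes_per_zone nodes and
    that each storage node gets one disk per template spec.
    """
    # Total Storage Nodes = (Num of Zones) * max_storage_nodes_per_zone
    storage_nodes = num_zones * max_storage_nodes_per_zone
    if storage_nodes > total_nodes:
        raise ValueError("cluster is smaller than the requested storage nodes")
    return {
        "storage_nodes": storage_nodes,
        "storageless_nodes": total_nodes - storage_nodes,
        "disks": storage_nodes * num_specs,
    }


# First example: 6 nodes, 3 zones, 1 storage node per zone, 1 spec.
first = plan_storage(6, 3, 1, 1)
# Second example: 9 nodes, 2 zones, 2 storage nodes per zone, 2 specs.
second = plan_storage(9, 2, 2, 2)
```

The storage_nodes value is also the minimum ASG size recommended above, since keeping the cluster at or above that size keeps every storage node online.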
EC2 Instance types
A Portworx cluster can be deployed with a heterogeneous makeup of EC2 instance types. Some of your nodes can be used for converged compute and storage, some for compute only and some for storage only.
Follow this guide to select your appropriate instance type. Once you create an AMI template for an instance type, you will create multiple instances from that AMI. Make sure your AMIs are available in each region that you want to run the Portworx cluster in.
Since Portworx is a replicated block device, you can also use instance local store volumes for maximum performance. However, you must have Portworx replication turned on.
Multi Zone Availability
Since Portworx is a replicated storage solution, Portworx by Pure Storage recommends using multiple availability zones when creating your EC2-based cluster. For more information on the geographical availability of your instances, see here.