This example uses the parquet format to create Parquet files in s3://bucket_name/path/to/files, with each table placed in its own directory.
The top-level spec section is described in the Destination Spec Reference.
It is also possible to use `{{YEAR}}`, `{{MONTH}}`, `{{DAY}}` and `{{HOUR}}` in the path to create a directory structure based on the current time. For example:
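A sketch of what such a spec could look like (the bucket, region and exact field names are illustrative; consult the Destination Spec Reference for the authoritative options):

```yaml
# Sketch of an S3 destination spec using time-based path placeholders.
kind: destination
spec:
  name: s3
  path: cloudquery/s3
  spec:
    bucket: bucket_name
    region: us-east-1
    path: "path/to/files/{{TABLE}}/{{YEAR}}/{{MONTH}}/{{DAY}}/{{HOUR}}"
    format: parquet
```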
Other supported formats are `json` and `csv`.
The plugin needs to be authenticated with your account(s) in order to sync information from your cloud setup.
The plugin requires only the `PutObject` permission (it will never make any changes to your cloud setup), so, following the principle of least privilege, it's recommended to grant it only that permission.
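A minimal IAM policy sketch granting only `PutObject` on the example bucket path (the resource ARN is a placeholder — adjust it to your own bucket and prefix):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::bucket_name/path/to/files/*"
    }
  ]
}
```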
There are multiple ways to authenticate with AWS, and the plugin respects the AWS credential provider chain. This means that CloudQuery will follow the following priorities when attempting to authenticate:
- The `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` and `AWS_SESSION_TOKEN` environment variables.
- The `credentials` and `config` files in `~/.aws` (the `credentials` file takes priority).
- You can also use `aws sso` to authenticate cloudquery; you can read more about it here.
- IAM roles for AWS compute resources (including EC2 instances, Fargate and ECS containers).
You can read more about AWS authentication here and here.
Environment Variables
CloudQuery can use the credentials from the `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` and `AWS_SESSION_TOKEN` environment variables (`AWS_SESSION_TOKEN` can be optional for some accounts). For information on obtaining credentials, see the AWS guide.
To export the environment variables (on Linux/macOS; similar on Windows):
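For example (the key values below are the placeholder credentials from AWS's documentation, not real secrets):

```shell
# Replace these example values with your own credentials.
export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
# AWS_SESSION_TOKEN is optional for some accounts.
export AWS_SESSION_TOKEN=your-session-token
```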
Shared Configuration Files
The plugin can use credentials from your `credentials` and `config` files in the `.aws` directory in your home folder. The contents of these files are practically interchangeable, but CloudQuery will prioritize credentials in the `credentials` file.
For information about obtaining credentials, see the AWS guide. Here are example contents for a `credentials` file:
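A minimal sketch (the key values are the example credentials from AWS's documentation):

```ini
[default]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
```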
You can also specify credentials for a different profile, and instruct CloudQuery to use the credentials from this profile instead of the default one.
For example:
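A sketch with a named profile (`myprofile` is an example name; the key values are the AWS documentation's placeholders):

```ini
[myprofile]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
```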
Then, export the `AWS_PROFILE` environment variable (on Linux/macOS; similar on Windows):
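For example (`myprofile` is a placeholder — use the profile name defined in your credentials file):

```shell
export AWS_PROFILE=myprofile
```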
IAM Roles for AWS Compute Resources
The plugin can use IAM roles for AWS compute resources (including EC2 instances, Fargate and ECS containers).
If you configured your AWS compute resources with IAM, the plugin will use these roles automatically.
For more information on configuring IAM, see the AWS docs here and here.
User Credentials with MFA
To leverage IAM user credentials with MFA, the STS get-session-token command may be used with the IAM user's long-term security credentials (access key and secret access key). For more information, see here.
Then export the temporary credentials to your environment variables.
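A sketch of the flow (the MFA device ARN and token code below are placeholders for your own values):

```shell
# Request temporary credentials using your MFA device.
aws sts get-session-token \
  --serial-number arn:aws:iam::123456789012:mfa/your-user \
  --token-code 123456

# Then export the AccessKeyId, SecretAccessKey and SessionToken values
# from the response as AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and
# AWS_SESSION_TOKEN, as shown in the Environment Variables section.
```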
Using a Custom S3 Endpoint
If you are using a custom S3 endpoint, you can specify it using the `endpoint` spec option. If you're using authentication, the `region` option in the spec determines the signing region used.
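For example, a spec pointing at a local MinIO-style endpoint might look like this (the endpoint URL and exact field names are illustrative; check the Destination Spec Reference):

```yaml
kind: destination
spec:
  name: s3
  path: cloudquery/s3
  spec:
    bucket: bucket_name
    region: us-east-1   # used as the signing region
    path: "path/to/files"
    format: parquet
    endpoint: "http://localhost:9000"
```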
Similar to how `kubectl` works, `cloudquery` depends on a Kubernetes configuration file to connect to a Kubernetes cluster and sync its information.
By default, `cloudquery` uses the default Kubernetes configuration file (`~/.kube/config`).
You can also specify a different configuration by setting the `KUBECONFIG` environment variable before running `cloudquery sync`.
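For example (the kubeconfig path is a placeholder for your own file):

```shell
# Point cloudquery at a non-default kubeconfig.
export KUBECONFIG="$HOME/.kube/other-cluster-config"
# Then run the sync as usual, e.g.:
# cloudquery sync source.yml destination.yml
```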
Kubernetes Service Account
If `cloudquery` is running in a pod of the Kubernetes cluster, the pod's Kubernetes service account can be used for direct authentication. This requires a cluster role with `get` and `list` privileges.
The below command creates a new cluster role with `get` and `list` privileges.
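For example, applying a manifest like this (the role name `cloudquery-sync` is an example):

```shell
# Creates a cluster role with get and list access to all resources.
kubectl apply -f - <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cloudquery-sync
rules:
  - apiGroups: ["*"]
    resources: ["*"]
    verbs: ["get", "list"]
EOF
```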
Next, the cluster role and service account will need to be linked via a cluster role binding. The following creates a cluster role binding for the role created above and the service account for the `cloudquery` pod.
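For example (`cloudquery-sync` and `default:cloudquery` are placeholder names — use your cluster role and the `namespace:name` of the pod's service account):

```shell
kubectl create clusterrolebinding cloudquery-sync \
  --clusterrole=cloudquery-sync \
  --serviceaccount=default:cloudquery
```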