To connect your workspace to an AWS S3 bucket, you will need your bucket name and the credentials of your AWS account (AWSAccessKeyId and AWSSecretKey).
If you do not yet have these credentials, go to the IAM Console. After making sure that you are on the right IAM user, click 'Manage access keys', then create a new access key with the 'Create New Access Key' button. You can either copy and paste your AWSAccessKeyId and AWSSecretKey, or download them to a file.
After creating your credentials, open your workspace and create a new integration in the Integration tab on the left-hand side. The environment variables should be named exactly as shown in the picture below:
AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, for your AWSAccessKeyId and AWSSecretKey, respectively. The integration name itself is less important. If you want to learn more about environment variables and how you can store them securely in DataCamp Workspace, check out this article.
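Because a misspelled variable name is an easy mistake to make, a quick check from inside the workspace can help. The following is a minimal sketch, assuming the integration exposes its values as ordinary environment variables under the names AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY; the helper `missing_aws_variables` is a hypothetical name introduced here for illustration.

```python
import os

# Variable names the integration is expected to expose (taken from this article).
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_variables(environ=os.environ):
    """Return the required variable names that are absent or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]


missing = missing_aws_variables()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("Both AWS credential variables are set.")
```

The check only reports whether the variables exist; it never prints their values, so it is safe to run in a shared workspace.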
Once you have set up these environment variables (with the correct names AWS_ACCESS_KEY_ID and
AWS_SECRET_ACCESS_KEY) and connected the integration to your workspace, you are ready to access files in your S3 bucket from your workspace. You can do this using the following piece of code:
# Import the AWS SDK for Python: boto3
import boto3

# List all files in the provided bucket
bucket_name = "<bucket_name>"
s3 = boto3.resource('s3')
my_bucket = s3.Bucket(bucket_name)
for file in my_bucket.objects.all():
    print(file.key)
You can find many more useful functions in the Boto3 documentation. This Python SDK is already installed in your workspace, so you only have to import it.
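As one illustration of what else Boto3 offers, the sketch below narrows the listing to a key prefix and downloads a single object. It relies on two Boto3 calls, `objects.filter(Prefix=...)` and `Bucket.download_file(Key, Filename)`. The helper names `list_keys` and `download_first_match` are hypothetical, as are whatever prefix and local path you pass in.

```python
def list_keys(bucket, prefix=""):
    """Return object keys under `prefix`, skipping 'folder' placeholder keys ending in '/'."""
    return [
        obj.key
        for obj in bucket.objects.filter(Prefix=prefix)
        if not obj.key.endswith("/")
    ]


def download_first_match(bucket_name, prefix, local_path):
    """Download the first object under `prefix` to `local_path`; return its key, or None."""
    import boto3  # imported here so list_keys can be reused without AWS access

    bucket = boto3.resource("s3").Bucket(bucket_name)
    keys = list_keys(bucket, prefix)
    if not keys:
        return None
    bucket.download_file(keys[0], local_path)
    return keys[0]
```

For example, `download_first_match("<bucket_name>", "<prefix>", "local_copy.csv")` would save the first matching object next to your notebook, assuming the credentials allow reading that object.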
We also provide you with a sample database about online ticket sales for events such as sporting events, shows, and concerts (source). If you want to connect to the S3 bucket that hosts these files, there are only two things you will need to do.
Next, connect the following set of credentials to your workspace. These will give you access to the sample database.
AWS_ACCESS_KEY_ID = AKIAUMJDGTMHW447X73R
AWS_SECRET_ACCESS_KEY = +gVs7e/brh8VI/+PVFvqX/CWcY5q/+ZIZXOP1jHP
AWS_BUCKET_NAME = datacamp-workspacedemo-workspacedemos3-prod
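With these values connected, the bucket name does not need to be hard-coded. The following is a minimal sketch, assuming the integration exposes AWS_BUCKET_NAME as an environment variable alongside the two credential variables; `sample_bucket_name`, `preview_keys`, and `main` are hypothetical helper names, and the limit of ten keys is an arbitrary choice.

```python
import os
from itertools import islice


def sample_bucket_name(environ=os.environ):
    """Read the bucket name from AWS_BUCKET_NAME, failing loudly if it is absent."""
    name = environ.get("AWS_BUCKET_NAME")
    if not name:
        raise KeyError("AWS_BUCKET_NAME is not set; connect the integration first.")
    return name


def preview_keys(bucket, limit=10):
    """Return at most `limit` object keys from a boto3 Bucket resource."""
    return [obj.key for obj in islice(bucket.objects.all(), limit)]


def main():
    import boto3  # already installed in the workspace, per this article

    bucket = boto3.resource("s3").Bucket(sample_bucket_name())
    for key in preview_keys(bucket):
        print(key)
```

Calling `main()` in a workspace cell should print the first few object keys in the sample bucket, which is a quick way to confirm that the integration is connected.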