S3 Bucket

To connect your workspace to an AWS S3 bucket, you will need your bucket name and the credentials of your AWS account (AWSAccessKeyId and AWSSecretKey).
If you do not yet have these credentials, go to the IAM Console. After making sure that you have selected the right IAM user, click on 'Manage access keys'. Then create your new access key via the 'Create New Access Key' button. You can either copy and paste your AWSAccessKeyId and AWSSecretKey, or download them to a file.
After creating your credentials, open your workspace and make a new integration in the Integration tab on the left-hand side. The environment variables must be named exactly as shown in the picture below: AWS_ACCESS_KEY_ID for the AWSAccessKeyId and AWS_SECRET_ACCESS_KEY for the AWSSecretKey. The integration name itself is less important. If you want to learn more about environment variables and how you can store them securely in DataCamp Workspace, check out this article.
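As a quick sanity check, the short sketch below reports whether the two variables are visible to Python. It assumes the integration exposes them as ordinary environment variables. The helper name `missing_credentials` is made up for this illustration and is not part of boto3 or DataCamp Workspace.

```python
import os

# Variable names the integration is expected to expose, as given above.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_credentials(env=None):
    """Return the required variable names that are unset or empty in env."""
    # Hypothetical helper for illustration; defaults to the real environment.
    if env is None:
        env = os.environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_credentials()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("Both AWS credential variables are set.")
```

If a name is reported as missing, check the spelling of the variable in the integration and confirm that the integration is connected to the workspace.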
Once you have set up this set of environment variables (with the correct names AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY) and connected it to your workspace, you are ready to access files in your S3 bucket from your workspace. You can do this using the following piece of code:
# Import the AWS SDK for Python: boto3
import boto3
# List all files in the provided bucket
bucket_name = "<bucket_name>"
s3 = boto3.resource('s3')
my_bucket = s3.Bucket(bucket_name)
for file in my_bucket.objects.all():
    # Print the key (the path within the bucket) of each object
    print(file.key)
You can find many more useful functions in the Boto3 documentation. This Python SDK is already installed in your workspace, so you only have to import it.
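Building on the listing above, the following is a minimal sketch of narrowing the results to one file type. The helper names `keys_with_suffix` and `list_csv_files` are hypothetical and the `.csv` suffix is only an example. The sketch assumes your credentials are available in the environment as described earlier.

```python
def keys_with_suffix(bucket, suffix):
    """Return the keys of all objects in bucket whose key ends with suffix."""
    # Works with any object that offers bucket.objects.all(), such as a
    # boto3 Bucket resource.
    return [obj.key for obj in bucket.objects.all() if obj.key.endswith(suffix)]


def list_csv_files(bucket_name):
    """List the CSV keys in a real bucket; needs valid AWS credentials."""
    import boto3  # imported here so the helper above runs without AWS access

    bucket = boto3.resource("s3").Bucket(bucket_name)
    return keys_with_suffix(bucket, ".csv")
```

Calling `list_csv_files("<bucket_name>")` would return the matching keys. A key found this way can then be saved locally with the bucket's `download_file(key, local_path)` method.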

Sample database: online ticket sales

We also provide you with a sample database about online ticket sales for events such as sporting events, shows, and concerts (source). If you want to connect to the S3 bucket that hosts these files, there are only two things you will need to do.
First, open a new workspace using this template in Python or this template in R.
Next, connect the following set of credentials to your workspace. These will give you access to the sample database.
  • AWS_BUCKET_NAME = datacamp-workspacedemo-workspacedemos3-prod