This article covers all the necessary steps to access and manage files on Google Drive, Google's cloud storage solution, from inside Workspace.
Before you can run Python code to programmatically access data in Google Drive, you need to complete the following steps, each of which is covered in detail below:
- Enable the Google Drive API.
- Create a Google service account for programmatic access.
- Share the files you want to access with the service account.
- Store the service account credentials in Workspace.
- Make sure you’re signed in with your Google account.
- Create a new project by clicking in the dropdown on the navbar.
- Search for the “Google Drive API” and enable it. This can take up to 10 seconds.
Create a new Google Cloud project and enable the Google Drive API
- In the “APIs and services” navbar on the left, go to the “Credentials” tab.
A Google service account is a special kind of account that can be used by programs to access Google resources like your Drive. You will use this service account to connect DataCamp Workspace to Google Drive.
You only have to set up this Google service account once for every Google account that you want to access Google resources with; you can skip this step the next time.
Create a Google service account
Follow the steps below to create the service account and generate the necessary credentials:
- Click on “+ CREATE CREDENTIALS” and select “Service Account”
- In the first step (service account details), provide a name for the service account, e.g., “google-operator” and click on “Create and continue”
- In the second step, select the “Owner” role and click “Continue”
- In the third step, don’t change anything and click “Done”
- Once back on the Credentials page, click on the service account you just created.
- Go to the Keys tab, click “Add Key > Create new key”
- Choose “JSON”, then click “Create.” The JSON file with your service account credentials will automatically download to your computer.
You now have a service account and a JSON credentials file! Head over to your Downloads folder or wherever the JSON file was downloaded, open it up, and have a look. It should look something like this:
"private_key": "-----BEGIN PRIVATE KEY-----\nM<some-very-private-stuff>\n",
"client_email": "<service-account-name>@<project-id>.iam.gserviceaccount.com",
Your service account can only access Google Drive files that have been shared with it, so you need to go through the files in your Google Drive folder and share them with the email address of the service account (the "client_email" value in the JSON file). If you just want to read the files, "Viewer" access is enough.
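If you would rather not open the key file by hand to find that address, a few lines of Python can print it without displaying the private key. This is a minimal sketch: the function name `summarize_key_file` and the file path are placeholders chosen for illustration, so point the path at wherever your browser saved the file.

```python
import json
from pathlib import Path

# Fields that are safe to display; the private key is deliberately left out.
SAFE_FIELDS = ("type", "project_id", "client_email")


def summarize_key_file(path):
    """Return the non-secret fields of a service account JSON key file."""
    info = json.loads(Path(path).read_text(encoding="utf-8"))
    return {field: info.get(field) for field in SAFE_FIELDS}


if __name__ == "__main__":
    # Hypothetical path: adjust it to the actual name of your downloaded key.
    key_path = Path.home() / "Downloads" / "service-account-key.json"
    if key_path.exists():
        print(summarize_key_file(key_path))
    else:
        print(f"No key file found at {key_path}")
```

The `client_email` value in the output is the address to enter in the Google Drive sharing dialog.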
Share a Google Drive file with a service account
Click this link to create a workspace in your own account that contains example Python code to connect to Google Drive, list all the files the service account has access to, and download an example CSV file.
In your new workspace, click on "Environment", and click on "+" next to "Environment variables":
- Set “Name” to GOOGLE_JSON.
- Set “Value” to the full contents of the service account JSON file that was downloaded. You can do this by opening the JSON file, selecting all, copying it to your clipboard, and then pasting it in the Value field.
- Set the “Environment Variable Set Name” to “Google Service Account” (this can be anything, really)
Set up GOOGLE_JSON environment variable
After filling in all fields, click “Create,” “Next,” and finally, “Connect.” Your workspace session will restart, and GOOGLE_JSON will now be available as an environment variable in your workspace. You can verify this by creating a Python cell with the following code and running it:
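A minimal check along these lines confirms that the variable is set and contains valid JSON, without printing the private key. The helper name `check_google_json` is only illustrative.

```python
import json
import os


def check_google_json(env=os.environ):
    """Return the service account email stored in GOOGLE_JSON."""
    raw = env.get("GOOGLE_JSON")
    if raw is None:
        raise RuntimeError(
            "GOOGLE_JSON is not set; connect the environment variable first."
        )
    # json.loads raises ValueError if the pasted value is not valid JSON.
    info = json.loads(raw)
    return info["client_email"]


if __name__ == "__main__":
    try:
        print("Credentials found for:", check_google_json())
    except RuntimeError as err:
        print(err)
```

If the cell prints the service account email, the credentials are available to the rest of the workspace.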
If you want to reuse the same service account credentials in another workspace, you don’t need to set up the environment variable again: you can connect the existing environment variable to your other workspaces as well.
Use the Python code snippets in the workspace that you created before (with this link) to install the necessary packages, list all the files in your Google Drive that your service account has access to, and download an example CSV file, all from Python!
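For orientation, code for these tasks typically follows the pattern below. This is a sketch rather than the workspace's actual code: it assumes the `google-api-python-client` and `google-auth` packages are installed and that `GOOGLE_JSON` holds the key file contents, and the function names are illustrative.

```python
import io
import json
import os

# Read-only scope: enough for listing and downloading files.
SCOPES = ["https://www.googleapis.com/auth/drive.readonly"]


def build_drive_service(raw_json):
    """Create a Drive v3 client from the service account JSON string."""
    # Imported here so the module still loads before the packages are
    # installed (pip install google-api-python-client google-auth).
    from google.oauth2 import service_account
    from googleapiclient.discovery import build

    creds = service_account.Credentials.from_service_account_info(
        json.loads(raw_json), scopes=SCOPES
    )
    return build("drive", "v3", credentials=creds)


def list_files(service):
    """Return every file the service account can see, following pagination."""
    files, token = [], None
    while True:
        page = service.files().list(
            pageSize=100,
            fields="nextPageToken, files(id, name, mimeType)",
            pageToken=token,
        ).execute()
        files.extend(page.get("files", []))
        token = page.get("nextPageToken")
        if token is None:
            return files


def download_file(service, file_id):
    """Download a regular (non-Google-Docs) file such as a CSV as bytes."""
    from googleapiclient.http import MediaIoBaseDownload

    buffer = io.BytesIO()
    request = service.files().get_media(fileId=file_id)
    downloader = MediaIoBaseDownload(buffer, request)
    done = False
    while not done:
        _, done = downloader.next_chunk()
    return buffer.getvalue()


if __name__ == "__main__" and "GOOGLE_JSON" in os.environ:
    drive = build_drive_service(os.environ["GOOGLE_JSON"])
    for f in list_files(drive):
        print(f["name"], f["id"], f["mimeType"])
```

The bytes returned by `download_file` can be wrapped in `io.BytesIO` and passed to a CSV reader. Only files that were shared with the service account will appear in the listing.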