SQL scenarios

The different types of data sources to query and the different modes of querying enable powerful workflows that are useful in different combinations. Learn more about different ways of using the SQL cell in SQL scenarios.

Query a database with SQL and return a data frame, continue in Python

Create a SQL cell to query a database with a SQL cell in DataFrame mode. Next, create a Python cell and write Python code to analyze the query, e.g. to create a visualization.

This approach works if you prefer Python over SQL and the SQL query you're executing doesn't return a huge result (there is a query limit of 100 MB for free users and 500 MB for Premium users).

Query a database with SQL and return a data frame, continue in SQL

Create a SQL cell to query a database with a SQL cell in DataFrame mode. Next, create another SQL cell, select "DataFrames and CSVs" as a source and write SQL code to query the data frame created by the first SQL cell.

This approach works if you prefer SQL over Python and the first SQL query you're executing doesn't return a huge result (there is a query limit of 100 MB for free users and 500 MB for paying users). The second SQL query will execute right inside DataLab using DuckDB. The SQL syntax is similar to PostgreSQL; check the DuckDB docs for a full overview.

Query a CSV file with SQL

Make sure the CSV file you want to query is in your workbook file system by uploading it through the file browser. Next, create a SQL cell, select "DataFrames and CSVs" in the first dropdown and write SQL code to query the CSV file. The second SQL query will execute right inside DataLab using DuckDB. The SQL syntax is similar to PostgreSQL; check the DuckDB docs for a full overview.

Use this approach if you prefer SQL over Python to analyze tabular data but have the data in a CSV file.

Query a database with SQL and reference it in another SQL cell

You typically do this if you want to more easily debug a complex SQL query that consists of one or more common table expressions.

Rather than writing this:

You can write this:

This approach works if your first query would return a huge result. Only a preview of that query result will be shown. When you execute the second query, the 'CTE-based query' is recompiled and executed on the database side rather than locally inside your workbook. You can review the compiled SQL that's being executed by by clicking on the "View Compiled SQL" icon:

Last updated