Links
Comment on page

SQL scenarios

The different types of data sources to query and the different modes of querying enable powerful workflows that are useful in different combinations. Learn more about different ways of using the SQL cell in SQL scenarios.

Query a database with SQL and return a data frame, continue in Python

Create a SQL cell to query a database with a SQL cell in DataFrame mode. Next, create a Python cell and write Python code to analyze the query, e.g. to create a visualization.
Query a database with SQL, continue in Python
This approach works if you prefer Python over SQL and the SQL query you're executing doesn't return a huge result (there is a query limit of 100 MB for free users and 500 MB for Premium users).

Query a database with SQL and return a data frame, continue in SQL

Create a SQL cell to query a database with a SQL cell in DataFrame mode. Next, create another SQL cell, select "DataFrames and CSVs" as a source and write SQL code to query the data frame created by the first SQL cell.
Query a database with SQL, continue analyzing the dataframe with SQL
This approach works if you prefer SQL over Python and the first SQL query you're executing doesn't return a huge result (there is a query limit of 100 MB for free users and 500 MB for paying users). The second SQL query will execute right inside Workspace using DuckDB. The SQL syntax is similar to PostgreSQL; check the DuckDB docs for a full overview.

Query a CSV file with SQL

Make sure the CSV file you want to query is in your workspace file system by uploading it through the file browser. Next, create a SQL cell, select "DataFrames and CSVs" in the first dropdown and write SQL code to query the CSV file. The second SQL query will execute right inside Workspace using DuckDB. The SQL syntax is similar to PostgreSQL; check the DuckDB docs for a full overview.
Query a CSV file with SQL
Use this approach if you prefer SQL over Python to analyze tabular data but have the data in a CSV file.

Query a database with SQL and reference it in another SQL cell

You typically do this if you want to more easily debug a complex SQL query that consists of one or more common table expressions.
Rather than writing this:
SQL Query with a common table expression
You can write this:
CTE turned into a SQL cell in "Query mode" that is referenced in a second SQL cell.
This approach works if your first query would return a huge result. Only a preview of that query result will be shown. When you execute the second query, the 'CTE-based query' is recompiled and executed on the database side rather than locally inside your workspace. You can review the compiled SQL that's being executed by by clicking on the "View Compiled SQL" icon:
You can review the compiled SQL code that gets executed on the database side.