Addressing slow code

DataLab runs on performant servers in the cloud. If you are on DataLab Starter, i.e. Workspace for free, your workbook gets 0.5 vCPUs and 4 GB of RAM. If you are on DataLab Premium, your workbook gets 2 vCPUs and 16 GB of RAM. For more information on the differences between the plans, check Pricing.

If the work you're doing is resource intensive, your code may be slow to execute. This article lists the most common causes of slow code, with suggestions on how to address each one.

Plots are slow to generate

  • Try to reduce the amount of data you are plotting, for example, by aggregating over a certain dimension or just taking a sample of the data.

  • Some plot types are notoriously resource intensive to generate, for example, swarm plots. Consider another plot type.
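
As a rough illustration of the first suggestion, the sketch below reduces a large dataset before plotting, first by sampling and then by aggregating over a category. The data, the `rows` variable, and the sample size are made up for the example. It uses only the standard library so it runs anywhere; with a pandas DataFrame, `DataFrame.sample` and `DataFrame.groupby` serve the same purpose.

```python
import random
from collections import defaultdict

random.seed(0)  # fixed seed so the sample is reproducible

# Hypothetical data: 100,000 (category, value) pairs standing in for a large dataset.
categories = ["a", "b", "c", "d"]
rows = [(random.choice(categories), random.gauss(50, 10)) for _ in range(100_000)]

# Option 1: plot a random sample instead of every row.
sample = random.sample(rows, k=2_000)

# Option 2: aggregate over a dimension and plot one point per category.
totals = defaultdict(float)
counts = defaultdict(int)
for category, value in rows:
    totals[category] += value
    counts[category] += 1
mean_by_category = {c: totals[c] / counts[c] for c in sorted(totals)}

print(len(rows), "rows reduced to", len(sample), "sampled rows")
print(len(rows), "rows reduced to", len(mean_by_category), "aggregated points")
```

Either `sample` or `mean_by_category` can then be passed to your plotting library in place of the full dataset.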

My cell keeps running forever

  • Your cell might contain an infinite loop. Check for while loops whose condition never becomes false.
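
One way to guard against this while debugging is to cap the number of iterations, so the cell fails with an error instead of running forever. The sketch below is a minimal example; the loop body, `threshold`, and the `max_iterations` value are assumptions chosen for illustration.

```python
# Hypothetical example: halve a value until it drops below a threshold.
value = 1024.0
threshold = 1.0
max_iterations = 1_000  # safety cap (an assumption); pick a value that suits your task

iterations = 0
while value >= threshold:
    if iterations >= max_iterations:
        # Stop with a clear error rather than letting the cell run indefinitely.
        raise RuntimeError(f"Loop did not finish within {max_iterations} iterations")
    value /= 2
    iterations += 1

print(iterations, value)
```

If the error is raised, the loop condition is probably never becoming false, for example because the variable it tests is not updated inside the loop.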

Database queries are slow

  • The database you are querying might be slow to respond. In that case, consider restarting the database or making your database server more powerful.

  • You may be querying a lot of rows, resulting in a large data transfer to your notebook. In that case, you can try a couple of things:

    • Make your query more specific so you fetch only the data you need.

    • Limit the number of rows returned by your query until you're certain the rows contain the data you need, and only then perform a query without a limit.

    • If you want to aggregate the result of the query, consider doing the aggregation in SQL rather than in Python, so that much of the computation happens on the database and fewer rows are transferred.
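
The three suggestions above can be sketched as follows. The example uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for your own database; the `orders` table and its columns are hypothetical.

```python
import sqlite3

# In-memory SQLite database with a hypothetical `orders` table,
# standing in for a real database connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, country TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (country, amount) VALUES (?, ?)",
    [("BE", 10.0), ("BE", 20.0), ("US", 5.0), ("US", 15.0), ("US", 25.0), ("NL", 40.0)],
)

# 1. Be specific: select only the columns and rows you need.
be_amounts = conn.execute(
    "SELECT amount FROM orders WHERE country = ? ORDER BY id", ("BE",)
).fetchall()

# 2. Preview with LIMIT before running the query without a limit.
preview = conn.execute(
    "SELECT id, country, amount FROM orders ORDER BY id LIMIT 3"
).fetchall()

# 3. Aggregate in SQL so only one row per group is transferred.
totals = conn.execute(
    "SELECT country, SUM(amount) FROM orders GROUP BY country ORDER BY country"
).fetchall()

conn.close()

print(be_amounts)
print(preview)
print(totals)
```

In the aggregated query, the database returns one row per country instead of every order, which is what keeps the transfer to the notebook small.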

Machine learning models are slow to train

  • Some machine learning models benefit from GPUs when training. DataLab does not provide GPU machines yet. As an alternative, you can train the model on your own computer and then upload the trained weights to your workspace.

  • If you're on DataLab Starter and your workload can be parallelized, consider upgrading to DataLab Premium to use more vCPUs for the training.
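
To illustrate the train-elsewhere approach from the first point, the sketch below fits a toy model, saves its weights to a file, and loads them again to make a prediction. Everything here is an assumption for illustration: the model is a simple least-squares line, and the file name `model_weights.json` is hypothetical. Real machine learning frameworks provide their own functions for saving and loading weights, which you should use instead.

```python
import json
import os
import tempfile

# --- On your own computer: train the model and save its weights ---
# Toy stand-in for training: least-squares fit of y = slope * x + intercept.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
intercept = mean_y - slope * mean_x

# Hypothetical file name; a temp directory is used so the sketch runs anywhere.
weights_path = os.path.join(tempfile.gettempdir(), "model_weights.json")
with open(weights_path, "w") as f:
    json.dump({"slope": slope, "intercept": intercept}, f)

# --- In your workbook, after uploading the file: load the weights and predict ---
with open(weights_path) as f:
    weights = json.load(f)


def predict(x):
    """Apply the loaded weights without retraining."""
    return weights["slope"] * x + weights["intercept"]


print(predict(5.0))
```

Only the prediction step runs in the workbook, so the expensive training never uses the workbook's CPU or memory.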
