Never Have an Unmaintainable Jupyter Notebook Again!


Data visualisation is a fundamental part of Data Science. The talk will start with a practical demonstration (using pandas, scikit-learn, and matplotlib) of how relying on summary statistics and predictions alone can leave you blind to the true nature of your datasets. I will make the point that visualisations are crucial in every step of the Data Science process and therefore that Jupyter Notebooks definitely do belong in Data Science. We will then look at how maintainability is a real challenge for Jupyter Notebooks, especially when trying to keep them under version control with git. Although there exists a plethora of code quality tools for Python scripts (flake8, black, mypy, etc.), most of them don't work on Jupyter Notebooks. To this end I will present nbQA, which allows any standard Python code quality tool to be run on a Jupyter Notebook. Finally, I will demonstrate how to use it within a workflow which lets practitioners keep the interactivity of their Jupyter Notebooks without having to sacrifice their maintainability.

26 min
02 Jul, 2021

Check out more articles and videos

We constantly think of articles and videos that might spark Git people interest / skill us up or help building a stellar career

Workshops on related topic