V.good VS.code

Sam Hardy
7 min read · Sep 6, 2022

What and Why

I started a new job recently, which is always a good excuse to see how other people work, kick some bad habits and re-assess some fundamentals. Specifically, I find myself pursuing machine learning through the lens of “classical” software development. This came as a bit of a shock to me, as I had previously:

  • Skipped the two weeks of debugging taught in software 101 at university (to complete assignments, of course). In turn, “debugging” to me has thus far meant abusing print statements and seeing what bubbles up.
  • Been living that jupyter notebook good life. Notebooks are a serious crutch of mine, as they are for the broader community, since they are damn useful for refining prototypical work. Notebooks rule for things like EDA, ETL pipelining and modelling: work that is difficult to scope within the bounds of traditional software development and requires iteration. At least, this has been the distinction in my mind until now.

In this sense, “software development” for me has kind of looked like this:

A virtuous cycle, and definitely not clunky and terrible. Diagram built with whimsical.

Which is probably fine in the sense that it optimises for the exploration of libraries, preprocessing and models instead of production code, but this dualist approach does have setbacks. The main ones I’ve found are:

  • Context switching. Between the notebook itself (browser), the notebook server (terminal) and, when library development (see below) is concerned, an IDE. Though as an FYI, a lean terminal/browser config definitely has its place (remote access, vis-à-vis consulting from a client’s environment).
  • Read/write. Using gross-as-hell notebook magics. There have been times when I have lost a lot of work because I accidentally reloaded old source code from a file over the top of my notebook changes. Additionally, two copies of the same code maintained across a notebook and python source are almost certainly going to fall out of sync with each other.

So given these problems, the contours of a solution might be a way of consolidating the notebook/source experiences into a single place (an IDE). So I set about trying to tie the best of both worlds into VSCode.

Here are some of my favourite VS Code extensions/snippets/misc that either consolidate the notebook experience or might be useful for other engineers who are keen to formalise their tools, but aren’t quite ready to give up the notebooks.

Python

Pylance. An alternative, and fairly new, language server for VSCode python development. Something I didn’t realise was that a lot of VSCode extensibility is based upon a client/server design, including things which “feel” native, like the Pylance server that powers IntelliSense. Not much to do here, as it’s now the default language server installed alongside the Python extension.

Sourcery. Refactoring suggestions that can be automatically applied and are guaranteed to never break your code. I recall this “polite insistence” is accomplished by manipulating the underlying abstract syntax tree into a simpler, more optimised form. You can learn more about it on a TalkPython podcast the creators did a while ago. Anyway, I usually find Sourcery good for cutting through messy tangles of if/elif blocks.

Boom click boom, go away exceptions.
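For a flavour of the kind of tangle I mean, here’s a hand-written before/after sketch (my own toy example, not actual Sourcery output) of flattening an if/elif chain into a dispatch table:

```python
# Before: a tangle of if/elif branches
def activation_before(name, x):
    if name == "relu":
        return max(0.0, x)
    elif name == "identity":
        return x
    elif name == "negate":
        return -x
    else:
        raise ValueError(name)


# After: the kind of flattening a refactoring tool might suggest,
# dispatching through a dict instead of branching
def activation_after(name, x):
    table = {
        "relu": lambda v: max(0.0, v),
        "identity": lambda v: v,
        "negate": lambda v: -v,
    }
    if name not in table:
        raise ValueError(name)
    return table[name](x)
```

Both versions behave identically; the second just stops growing an elif arm every time a new case appears.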

Python Test Explorer for Visual Studio Code. That is the name, and yes, it is specific. Provides a very slick visual interface for all tests within your workspace, allowing you to drill down and debug very specific test instances. I believe this is standard functionality within PyCharm and other professional IDEs, so please skip ahead if this is old news.

To all the notebook kids out there, this is one extension that will get you into an IDE and writing tests. Some background: in the past, I usually had the bare minimum number of tests written to accompany any .py source I was developing, and one of the main reasons for this was I found it time-consuming to run individual tests within the pytest node tree. I usually ran individual tests like this:

# via keyword expressions
pytest tests_directory/bar.py -k 'big_large_verbose_test_name_that_sucks_to_type'
# via node ids; better.. but I still have to type? what about params?
pytest tests_directory/bar.py::big_large_verbose_test_name_that_sucks_to_type
# a bracketed node id can even target one parameter set, but quoting/typing it is its own chore
pytest "tests_directory/bar.py::big_large_verbose_test_name_that_sucks_to_type[param0]"

Which was a huge pain to run via the terminal for one test instance, and re-running a single failing parameter means hand-typing its bracketed node ID. So not ideal, especially when trying to parametrise slow and expensive tests involving data and models. However, this extension, combined with vanilla breakpoints, has made writing and interrogating individual tests and their parameters a breeze. Some tests will always be slow, but having the ability to re-test with parameter-level specificity is.. nice.

Soooo nice having the errors inline within the breaking test, instead of within a separate console. Not pictured (gif-ed?), right-click into the test tree to run individual tests.

Breakpoints. Not an extension, but included to round off the previous discussion. Some more background: I used to use print statements in notebooks to inspect intermediate variables, and I believe a lot of people still do, since a print statement is a one-word, one-line command to get some output from your program. Easy, but not ideal because:

  • Each variable needs its own print statement, which is tedious to in-line within your programs
  • A print statement does not trace through the program; it is static at the point where it is called, so its output immediately goes stale

Breakpoints solve this by allowing you to inspect the state of multiple variables as you step through each line of code, without having to explicitly and manually insert a print statement. The penny dropped for me when I realised that breakpoints could be overlaid across code with complex invocations, like a parametrised pytest suite, where you can:

  • Find a failing test or a failing parameter combination for a single test
  • Place a suitable breakpoint just before the suspect line
  • Inspect and manipulate intermediate values, including dataframes!
  • Tweak the originating code
  • And re-run ad infinitum until your test passes

In sum, use breakpoints alongside a test explorer to massively shorten the test feedback loop and make testing a JOY.
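To make that loop concrete, here’s a minimal parametrised suite (the function and test names are invented for illustration). In VS Code you’d drop a breakpoint on the assert line, then run just the failing parameter from the test tree:

```python
import pytest


def scale(values, factor):
    """Toy function under test."""
    return [v * factor for v in values]


@pytest.mark.parametrize(
    "values, factor, expected",
    [
        ([1, 2], 2, [2, 4]),  # appears in the tree as test_scale[values0-2-expected0]
        ([3, 4], 0, [0, 0]),
    ],
)
def test_scale(values, factor, expected):
    # Breakpoint here: inspect `values` and `factor` for only the
    # failing combination, tweak `scale`, and re-run that one node.
    assert scale(values, factor) == expected
```

Each parameter tuple becomes its own node in the explorer, which is exactly what makes parameter-level re-runs possible.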

Notebooks

Jupyter, Jupyter Keymap and Jupyter Notebook Renderers. I’m listing the main jupyter extensions mainly for posterity. These extensions allow you to read, write and execute notebooks in VSCode. But importantly, they allow you to do so within the same context as multiple terminals and .py source. It’s missing a few things from the native browser experience, like cell-based search and replace. The “clear cell output” function is also buggy.

runStartupCommands. Once again, not an extension per se, but a config worth discussing because of its importance in facilitating a hybrid notebook/source workflow. I have a couple of miscellaneous commands (see below config) that automatically reload adjacent .py module source changes within the current notebook that is invoking them. Useful for when you’re trying to formalise/dev a module and still have dependent code in a notebook.

Effect of auto-reloading all utilised modules via runStartUpCommands. Changes within the dependent source (the value of x) are automatically reloaded into a live notebook, allowing me to preserve my initial, screwy workflow, but inside VSCode instead! Yay?
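Under the hood, %autoreload is doing automatically, before each cell, roughly what you’d otherwise do by hand with importlib.reload. A sketch of that manual equivalent, using a throwaway module for illustration:

```python
import importlib
import pathlib
import sys
import tempfile

# don't cache bytecode, so the reload always recompiles from source
sys.dont_write_bytecode = True

# stand-in for an adjacent .py module you're editing in VS Code
workdir = tempfile.mkdtemp()
pathlib.Path(workdir, "mymodule.py").write_text("x = 1\n")
sys.path.insert(0, workdir)

import mymodule

print(mymodule.x)  # 1

# edit the source on disk, as the editor would...
pathlib.Path(workdir, "mymodule.py").write_text("x = 2\n")

# ...and reload; "%autoreload 2" performs this kind of refresh
# for all imported modules before every cell execution
importlib.reload(mymodule)
print(mymodule.x)  # 2
```

This is exactly the lost-work hazard mentioned earlier in reverse: here the disk is the source of truth and the in-memory module follows it.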

Misc.

JSON Tools. Providing options to minify and pretty print JSON. Useful. That’s all I have to say.
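For anyone wondering what those two options actually do, the difference is purely whitespace; the stdlib json module shows it:

```python
import json

payload = {"model": "xgboost", "params": {"max_depth": 6}}

# minified: custom separators strip the default whitespace, good for transport
print(json.dumps(payload, separators=(",", ":")))
# {"model":"xgboost","params":{"max_depth":6}}

# pretty-printed: indentation for human eyes
print(json.dumps(payload, indent=2))
```

The extension just applies these transformations in-place on the open editor buffer.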

vscode-pdf. Sometimes I need to look at PDFs, and this helps prevent a context switch. I note that VSCode already has nice image support out of the box, and this is just complementary to that.

Path Intellisense. I didn’t realise how much I relied on the Jupyter notebook’s Jedi for path completion, but it was one of the first extensions I was reaching for after committing to VSCode. Path Intellisense largely hits the mark, with a few minor quibbles WRT tab-induced auto-completion (ie. it doesn’t do this). Happy linting!

Also, if you’re keen, you can just paste in my config here and install the above extensions:

{
    "explorer.confirmDelete": false,
    // add your own conda/venv/base path/whatever
    "python.defaultInterpreterPath": "{interpreter_path}",
    "editor.formatOnSave": true,
    "trailing-spaces.trimOnSave": true,
    // IMO black's default 88 chars is a bit too wrappy
    "python.formatting.blackArgs": [
        "--line-length",
        "120"
    ],
    "python.testing.pytestEnabled": true,
    "python.testing.autoTestDiscoverOnSaveEnabled": true,
    "editor.codeActionsOnSave": {
        "source.organizeImports": true
    },
    // head over to https://sourcery.ai/ for a free token!
    "sourcery.token": "<sourcery_token>",
    "jupyter.askForKernelRestart": false,
    "explorer.confirmDragAndDrop": false,
    "path-intellisense.autoSlashAfterDirectory": true,
    "path-intellisense.autoTriggerNextSuggestion": true,
    "security.workspace.trust.untrustedFiles": "open",
    // prevent global search snagging on venv libraries
    "search.exclude": {
        "**/venv": true,
        "**/*.code-search": true
    },
    "extensions.ignoreRecommendations": true,
    // re-load adjacent .py source within notebooks
    "jupyter.runStartupCommands": [
        "%load_ext autoreload",
        "%autoreload 2"
    ],
    "window.zoomLevel": -2
}

As an aside, I used Gyazo to record and link screen recordings as gifs. Gyazo provides an option to directly embed the video in a variety of ways, and Notion just deals with it and plays the video? Very cool, I don’t understand it. Some web devs I used to work with shuttled proof of application bugs around the organisation using the same approach.

Banner art developed with stable diffusion.
