Samuel Bird

'Obvious' advice for running and debugging VMC experiments

As someone primarily motivated by physical questions but heavily utilising computational techniques, it feels like each research project has two parallel interacting tracks of inquiry. One is focused on the physical questions and steers the project at large, while the other is focused on engineering the computational technique.

Side comment: I experience the latter as a series of puzzles that need to be solved, one obstacle at a time, and eventually, after enough obstacles have been overcome, it leads to meaningful results.

Probably my biggest overhead/time-sink comes from days (weeks… months…) spent barking up the wrong tree when debugging Variational Monte Carlo (VMC, my current choice of computational tool) experiments - fixing instabilities, understanding anomalies and so on. This can slow down progress significantly, lead to frustration and negatively affect morale.

More often than not, progress is made by taking a break, zooming out, breaking the problem down, and seeking advice. Nonetheless, this can take time, and it can be surprisingly hard to extricate oneself from a zoomed-in state of mind (the break helps with that!).

The key to these obstacles often appears “obvious” in hindsight. This is clearly hindsight bias - experience is largely the accumulation and internalisation of many such pieces of “obvious” advice. Nonetheless, I think it valuable to reflect on past obstacles and extract lessons from them - what worked, what didn’t, what I should look out for in the future to prevent repeated mistakes. For that reason, I keep track of pieces of “obvious” advice that I encounter during my research (often with the crucial input of a collaborator). This is very much a work in progress, and is meant to be updated if and when new ideas arise. More than anything, this is for me to come back to when I find myself facing a persistent obstacle and run through like a checklist.

As a small disclaimer, my research involving VMC almost exclusively focuses on fermionic systems, both on a lattice and in the continuum.

Without further ado:

“Obvious” advice for running VMC experiments

It’s generally better to assume issues are caused at a higher level and work down

For example, it’s generally better to assume issues are caused by the variational ansatz or the Hamiltonian implementation than by the sampler or the optimiser.

I should note that this may be especially effective for me because of a personal bias towards assuming the sampler/optimiser is at fault (this comes from a past project where I spent ages trying to extend the ansatz when the real issue was that the optimiser needed biasing).

I try to remember an order of priority that goes something like this, corresponding to a “hierarchy of abstraction” in VMC:

It is often straightforward to come up with sanity checks that can at least rule out a failure in the ansatz or Hamiltonian as the root of an anomalous observation. For example, if your optimisation shows unstable behaviour, then you can simplify the ansatz as much as possible or get rid of interactions and see if it persists. Of course, this doesn’t rule out the possibility that the more complex ansatz combines with the sampler to create a co-dependent anomaly, but I find that I normally learn a lot more from this angle than if I start at a lower level.

Understand each part of your ansatz individually and altogether

If I don’t have a clear understanding of my entire ansatz then, when hunting down the cause of an error, I have a mental block around the ansatz being the cause. Since I don’t have a clear picture of how it works, I can’t picture how it might fail.

How do I make sure I clearly understand how my ansatz works? Write it out explicitly, ideally in a research log or project document where it can be found again later.

Check and double check smoothness of model/ansatz, be careful with PBC

Discontinuities in the ansatz, e.g. arising under PBC from a minimum image convention for relative distances, can easily cause instabilities. Spuriously high gradients and massive drops in energy are likely caused by such a discontinuity in the model.
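To make the PBC point concrete, here is a minimal 1D sketch, assuming a box of length `box`. It compares the absolute minimum-image distance, which has a kink at half the box length, with one common smooth periodic surrogate based on a sine. The function names and the choice of surrogate are illustrative assumptions, not a prescription for any particular ansatz.

```python
import math

def minimum_image(d, box):
    """Signed 1D displacement wrapped into [-box/2, box/2)."""
    return d - box * math.floor(d / box + 0.5)

def kinked_distance(d, box):
    """|minimum image|: continuous, but with a kink in its slope at d = box/2."""
    return abs(minimum_image(d, box))

def smooth_distance(d, box):
    """Periodic surrogate: smooth at d = box/2 and close to |d| for small d."""
    return (box / math.pi) * abs(math.sin(math.pi * d / box))

def slope_jump(f, x, box, h=1e-6):
    """Right minus left finite-difference slope of f at x."""
    left = (f(x, box) - f(x - h, box)) / h
    right = (f(x + h, box) - f(x, box)) / h
    return right - left

box = 10.0
edge = box / 2  # where the minimum image convention switches periodic image
kink = slope_jump(kinked_distance, edge, box)
smooth = slope_jump(smooth_distance, edge, box)
print(f"slope jump at box/2: minimum image {kink:.3f}, sine surrogate {smooth:.1e}")
```

The slope of the minimum-image distance flips sign at `box/2`, so anything built on it inherits a discontinuous gradient there, while the sine-based distance passes through smoothly. A finite-difference probe like `slope_jump` is a cheap way to scan the inputs of an ansatz for this kind of problem.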

If optimisation is really slow, at least understand what’s causing it

If I move from one Hamiltonian to another and the convergence time (as opposed to compute time) increases significantly, I should ask what changed to cause that. Likewise for a change in ansatz. More generally, if the convergence time is really long then it’s likely a good tradeoff to spend a week understanding the cause and debugging any redundancies or bottlenecks.

Exhaust short-feedback-loop calculations first

If you have some quick calculations you can use to gather intuition (e.g. RPA, Hartree-Fock, perturbation theory), then try and squeeze out any intuition that you can before moving to a slow VMC calculation with correspondingly slow feedback loops. This is especially important if the VMC takes days or weeks to run.

Ockham’s razor of Ansatz and Hamiltonian complexity

Generally keep models and Hamiltonians as simple as possible, at least during development. When planning a pivot to a new ansatz or Hamiltonian, be really careful about introducing unnecessary complexity and make sure that you understand the potential consequences.

Examples:

With a new Hamiltonian implementation, check any exactly known cases first

If you have a wholly new implementation of a Hamiltonian or modify it somehow (e.g. adding complex Peierls hoppings), check and re-check that any exactly known results (e.g. non-interacting energies) are reproduced by the VMC. Then log these checks somewhere that you can find them again. Trust me, it’s easy to forget whether you conducted such tests or not, then just assume that you must have (because you would never be stupid enough not to… right?) and not notice the mistake for a long time.
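As one example of an exactly known case, here is a minimal sketch of a reference value for spinless free fermions on a 1D ring. It assumes a nearest-neighbour hopping term of the form -t (c†_{j+1} c_j + h.c.) and a total Peierls phase `flux` spread evenly over the bonds; both function names are hypothetical helpers, and the convention should be adapted to match your own Hamiltonian.

```python
import math

def free_fermion_ground_energy(n_sites, n_particles, t=1.0, flux=0.0):
    """Exact ground-state energy of spinless free fermions on a ring.

    Single-particle levels: e_k = -2 t cos((2 pi k + flux) / n_sites),
    where `flux` is the total Peierls phase picked up around the ring.
    """
    levels = sorted(
        -2.0 * t * math.cos((2.0 * math.pi * k + flux) / n_sites)
        for k in range(n_sites)
    )
    # Fill the lowest n_particles levels.
    return sum(levels[:n_particles])

def agrees(vmc_energy, vmc_error, exact, n_sigma=3.0):
    """True if the VMC estimate lies within n_sigma error bars of the exact value."""
    return abs(vmc_energy - exact) <= n_sigma * vmc_error

exact = free_fermion_ground_energy(n_sites=6, n_particles=3)
print("reference energy:", exact)
```

The VMC energy and its statistical error would then be passed to `agrees` together with `exact`, and the outcome written into the research log alongside the commit or configuration that produced it.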

Make sure your coded-up models have the correct particle symmetry

This really falls under the advice to clearly understand each part of your ansatz, and can be checked using those exactly known cases. If you accidentally take absolute values, or aren’t careful with square roots (say, in a Pfaffian), then you might mistakenly kill the antisymmetry of a fermionic ansatz.
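A direct test of the exchange symmetry is also cheap. The sketch below is a toy illustration under stated assumptions: two spinless fermions in 1D with plane-wave orbitals, and a deliberately broken variant that takes an absolute value. The helper `is_antisymmetric` is hypothetical; for a real ansatz the same idea applies to swapping any pair of identical particles at a handful of random configurations.

```python
import cmath
import math

def slater_2x2(x1, x2, box=1.0):
    """Two spinless fermions in plane-wave orbitals k = 0 and k = 2 pi / box."""
    k = 2.0 * math.pi / box
    # Determinant of [[1, exp(i k x1)], [1, exp(i k x2)]].
    return cmath.exp(1j * k * x2) - cmath.exp(1j * k * x1)

def broken(x1, x2, box=1.0):
    """Same ansatz with an accidental absolute value: symmetric under exchange."""
    return abs(slater_2x2(x1, x2, box))

def is_antisymmetric(psi, x1, x2, tol=1e-10):
    """Check psi(x1, x2) == -psi(x2, x1) at one configuration with psi != 0."""
    a, b = psi(x1, x2), psi(x2, x1)
    return abs(a) > tol and abs(a + b) <= tol * max(abs(a), abs(b))

print("determinant:", is_antisymmetric(slater_2x2, 0.13, 0.71))
print("with abs():  ", is_antisymmetric(broken, 0.13, 0.71))
```

The check requires a non-zero amplitude, since a wavefunction that vanishes at the test point would pass trivially.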

Be sure flat is flat, have a policy against premature claims of convergence

Think your VMC has converged? Take the running mean of your optimisation (energy-step) curve over a bunch of steps, ignore the error bars, and plot a horizontal line at the final energy. Do they match? The human mind is a wonderful thing, but sometimes it confuses desire with reality.
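A numerical stand-in for this eyeball test might look like the sketch below. It assumes the energies are stored as a plain list with one entry per optimisation step; the `window` and `tol` parameters and the function names are illustrative choices that need tuning for each problem, not a definitive convergence criterion.

```python
def running_mean(values, window):
    """Trailing mean over `window` consecutive optimisation steps."""
    out = []
    total = 0.0
    for i, v in enumerate(values):
        total += v
        if i >= window:
            total -= values[i - window]  # drop the step that left the window
        if i >= window - 1:
            out.append(total / window)
    return out

def looks_flat(energies, window, tol):
    """Compare the means of the last two non-overlapping windows of the curve."""
    if len(energies) < 2 * window:
        raise ValueError("need at least 2 * window steps")
    smooth = running_mean(energies, window)
    # smooth[-1] covers the final window, smooth[-1 - window] the one before it.
    return abs(smooth[-1] - smooth[-1 - window]) <= tol

# Synthetic example: a curve that is still slowly drifting downwards.
drifting = [-0.001 * step for step in range(400)]
print("drifting curve flat?", looks_flat(drifting, window=50, tol=1e-3))
```

A curve that fails this check is still drifting on the scale of `tol`, however flat it looks on a plot dominated by the early, steep part of the optimisation.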

Check the dtypes of parameters and wavefunction/Hamiltonian output are correct

More than once I’ve spent hours debugging a Hamiltonian or ansatz only to discover that I’m accidentally setting the output type to be float when it needs to be complex. It can be a good idea to give an LLM your code and ask it to check the dtypes in case you aren’t sure.

Avoid unnecessary biasing of initialisation

When working with a new model or a new ansatz, first test random initialisation of the parameters and samples to avoid biasing into suboptimal local minima, even if you intend to bias it eventually (perhaps with HF orbitals or preconfigured samples).

As an example, I was once working with a Hamiltonian that optimised abysmally given random initialisations of the model parameters. At this point, I switched to initialising it with orbitals obtained from self-consistent Hartree-Fock calculations, which performed much better! At some point we pivoted to a different Hamiltonian (structurally different, not just a tuned parameter). It was a mistake to assume that the biased initialisation was useful here. In fact, for this Hamiltonian the biased initialisation quickly got stuck in a poor local minimum, while the random initialisation performed brilliantly. Of course, it took me longer than I would have liked to realise this.

If a totally random initialisation cannot reach the energies obtained with a set of biased initial orbitals, ask why and try to understand it. If you have a good reason then it’s fine to bias, but without one it’s dangerous because you won’t be aware of when it breaks down.