Introduction

The ground state of a classical Ising spin Hamiltonian HI(σ1, … , σN), where σk are binary variables, can be found after QA by mapping σk to the z-projection Pauli operators \({\sigma }_{z}^{k}\) of quantum spins-1/2 (qubits). The Hamiltonian for QA is generally defined as1,2,3,4

$$H(t)=f(t){H}_{I}+r(t){H}_{M},$$
(1)

where f(t) is monotonically increasing with time from zero to a finite value and r(t) is monotonically decreasing from a finite value to zero; HM is the initial “mixing" Hamiltonian whose ground state is easy to prepare, and

$${H}_{I}=\mathop{\sum}\limits_{k}{a}_{k}{\sigma }_{z}^{k}+\mathop{\sum}\limits_{k\ne s}{a}_{ks}{\sigma }_{z}^{k}{\sigma }_{z}^{s}+\mathop{\sum}\limits_{k\ne s\ne r}{a}_{ksr}{\sigma }_{z}^{k}{\sigma }_{z}^{s}{\sigma }_{z}^{r}+\ldots .$$
(2)

The number of different terms in (2) can be exponentially large as HI can have arbitrary k-local terms that couple k spins directly with different coefficients a{k}.

Allowing only binary couplings in (2), this already includes NP-complete problems5,6,7,8, which means that many important QA problems that are usually formulated with a different from (2) target Hamiltonian, can be mapped to the model (1) with only a polynomial overhead. The integer number factorization and the Grover algorithm can be also formulated as QA problems with some HI9,10.

Today, accessible hardware for a large number, over 100, qubits uses only heuristic approaches to QA11, for which the operator HM and the annealing schedule, f(t) and r(t), in (1) are not specifically tuned to the choice of HI. The QA protocol is chosen then mainly for the simplicity of implementing it in practice. Still, HM must not commute with HI, and have a large gap between the lowest eigenvalue and the rest of its spectrum. According to the adiabatic theorem, if the time-dependent parameters change sufficiently slowly, the system remains in the instantaneous ground state and thus transfers to the ground state of HI as t → . Measuring the qubit polarizations \({\sigma }_{z}^{k}\), k = 1, …, N, we then obtain the desired configuration of Ising spins that minimize HI.

In real heuristic QA experiments, time is restricted by the coherence time of qubits, so the adiabatic regime is practically never achievable. Given the widths ΔEI of the energy band of HI, it is possible to perform a pseudo-adiabatic evolution with T 1/ΔEI, where T is the achievable QA time. However, the gap between nearest levels of HI is generally δ ΔEI/2N, i.e., exponentially smaller than ΔEI, during the QA. The ground state of the full Hamiltonian H(t) with a complex HI then usually passes through avoided crossings with exponentially small gaps to other levels. Hence, the practical situation corresponds to the nonadiabatic regime.

Thus, the experimentally accessible QA computing is inspired by a phenomenological assumption that there are computational problems whose partial solutions, i.e., the low Ising spin energy states can be obtained during the nonadiabatic QA process faster than during classical computations. If this assumption is correct, the quantum coherent evolution can be used in combination with incoherent classical annealing for a longer time.

Whether this is true or not is hard to verify either numerically or analytically because we deal with driven and nonadiabatic many-body dynamics. We still do not have definite answers on how quickly the useful information is gained during nonadiabatic QA computations, and whether there can be quantum algorithms that outperform classical computations during the time that is accessible in practice.

Results

Solvable model

To address these problems, first, let us show that the original model (1) can be rewritten in the form of a scattering problem that depends on a single time-dependent parameter g(t). In the Schrödinger equation,

$$i\frac{d}{dt}\psi (t)=H(t)\psi (t),$$
(3)

we switch to a new time variable

$$s(t)=\int\nolimits_{0}^{t}d\tau \,f(\tau ).$$

Here, f(τ) is positive, so s(t) is a single-valued function, which is growing monotonically with t. Moreover, since both f(t) and r(t) are changing monotonically with t, they are single-valued functions of s: f(s) ≡ f(t(s)) and r(s) ≡ r(t(s)). Using that

$$\frac{d}{dt}=f(t(s))\frac{d}{ds}$$

in (3), we find that (3) is equivalent to

$$i\frac{d}{ds}\psi (s)=\left({H}_{I}+\frac{r(s)}{f(s)}{H}_{M}\right)\psi (s).$$
(4)

Since f(s) → 0 as s → 0, the initial conditions become

$$g(s)\equiv \frac{r(s)}{f(s)}\to \infty ,\quad {{{{{{{\rm{as}}}}}}}}\quad s\to 0,$$

and since r(s) decays to zero as s → , so does the redefined coupling g(s). Thus, the QA problem in (1) is equivalent to a model with the Hamiltonian

$$H(t)={H}_{I}({\sigma }_{z}^{1},\ldots ,{\sigma }_{z}^{N})+g(t){H}_{M}(\{{{{{{{{\boldsymbol{\sigma }}}}}}}}\}),$$
(5)

where g(t) is decaying from an infinite value to zero.

Next, if the goal is to study the accuracy of computations, one needs the probabilities of nonadiabatic excitations that are produced during QA starting from the ground state. Here, we point to the fact that there is a fully solvable model that provides all excitation probabilities for evolution (5) with an arbitrary HI. This model has g(t) and HM, which satisfy the basic requirements for a QA protocol. Namely,

$$g(t)=-\frac{g}{t},\quad g \, > \, 0,$$
(6)

and HM is the projection operator onto the state with all spins pointing along x axis:

$${H}_{M}=\left|{\psi }_{0}\right\rangle \left\langle {\psi }_{0}\right|,\quad \left|{\psi }_{0}\right\rangle \equiv \left|\to \cdots \to \right\rangle .$$
(7)

This HM has been considered for QA problems previously in relation to the adiabatic Grover algorithm10. In Methods, we show that the model remains solvable even when the state \(\left|{\psi }_{0}\right\rangle\) is chosen arbitrarily. This means that the model generally depends on 2N different complex parameters that encode this state. However, having no information about HI, a wise choice of HM would be to consider \(|{\psi }_{0}\rangle\) that does not discriminate among possible eigenstates of HI, which is achieved by initially polarizing all spins along the x axis.

As t → 0+, the state \(|{\psi }_{0}\rangle\) is the ground state of H with energy

$${E}_{0}\approx -\frac{g}{t}.$$

Since all the other eigenvalues of HM are zero, E0 is also the leading order energy gap to the rest of the spectrum of H as t → 0.

Let \(\left|n\right\rangle\) be the state of an arbitrary configuration of all the spins with definite projections along the z axis. For this state,

$$\left| \left\langle n| {\psi }_{0}\right\rangle \right | ^{2}=\frac{1}{{{{{{{{\mathcal{N}}}}}}}}},\quad \forall n,$$
(8)

where

$${{{{{{{\mathcal{N}}}}}}}}={2}^{N}$$
(9)

is the dimension of Hilbert space of N spins-1/2’s. Thus, the matrix form of HM in the computational basis has identical exponentially small but nonzero entries. Let us also introduce the Ising energies

$${\varepsilon }_{n}\equiv \left\langle n\right|{H}_{I}\left|n\right\rangle ,\quad 0\,\le\, n \, < \, {{{{{{{\mathcal{N}}}}}}}},$$
(10)

where we reserve n = 0 for the ground state of HI, and assume that the state indices are chosen so that

$${\varepsilon }_{0} \, < \, {\varepsilon }_{1} < \ldots {\varepsilon }_{{{{{{{{\mathcal{N}}}}}}}}-1}.$$

We postpone the case of HI with eigenvalue degeneracy to a later section. We will call n in εn the number of excitations, because this index tells how many basis states have smaller Ising energy than the given state.

Let \({a}_{0}(t),\ldots ,{a}_{{{{{{{{\mathcal{N}}}}}}}}-1}(t)\) be the amplitudes of the basis states in the Schrödinger equation solution:

$$\left|\psi (t)\right\rangle =\mathop{\sum }\limits_{n=0}^{{{{{{{{\mathcal{N}}}}}}}}-1}{a}_{n}(t)\left|n\right\rangle .$$

For our QA protocol, the Schrödinger equation is given by

$$i{\dot{a}}_{n}={\varepsilon }_{n}{a}_{n}-\frac{g}{{{{{{{{\mathcal{N}}}}}}}}}v,\quad v=\frac{1}{t}\mathop{\sum }\limits_{k=0}^{{{{{{{{\mathcal{N}}}}}}}}-1}{a}_{k},\quad n=0,\ldots ,{{{{{{{\mathcal{N}}}}}}}}-1.$$
(11)

The solvability of equations (11) follows from the fact that, after the Laplace transform, the \({{{{{{{\mathcal{N}}}}}}}}\) coupled equations reduce to a single first-order ordinary differential equation in the Laplace transform of v, which can always be solved analytically (see Methods). This model is a special case of a model that was solved by one of us12. Algebraic properties of this model were also mentioned in refs. 13,14, but the relation of its solution to the QA problem has not been discussed before.

The analytical solution gives a simple formula for the probabilities of excitation numbers at the end of the evolution. If as t → 0+ the system is in the ground state, \(|{\psi }_{0}\rangle\), the probability to produce n excitations as t →  is given by

$${P}_{n}=\frac{{p}^{n}(1-p)}{1-{p}^{{{{{{{{\mathcal{N}}}}}}}}}},\quad p\equiv {e}^{-\frac{2\pi g}{{{{{{{{\mathcal{N}}}}}}}}}}.$$
(12)

Note that the final state probabilities do not depend on the particular expressions for the eigenstates \(\left|n\right\rangle\), and in this sense tell nothing about the ground state of HI. However, equation (12) gives complete information about the performance of the given QA protocol. For example, the probability to obtain the ground state is given by

$${P}_{0}=\frac{1-p}{1-{p}^{{{{{{{{\mathcal{N}}}}}}}}}},$$
(13)

and the average number of excitations is

$$\left\langle n\right\rangle \equiv \mathop{\sum }\limits_{n=1}^{{{{{{{{\mathcal{N}}}}}}}}-1}n{P}_{n}=\frac{p}{1-p}+\left(1-\frac{1}{1-{p}^{{{{{{{{\mathcal{N}}}}}}}}}}\right){{{{{{{\mathcal{N}}}}}}}}.$$
(14)

These expressions simplify for a large number of interacting qubits N 1, for which \({{{{{{{\mathcal{N}}}}}}}}\) is exponentially large, and we can disregard \({p}^{{{{{{{{\mathcal{N}}}}}}}}}\) in comparison to p. For g 1 we find \({p}^{{{{{{{{\mathcal{N}}}}}}}}}\ll 1\), and Pn follows the geometric distribution, with

$${P}_{0}=1-p,\quad \left\langle n\right\rangle =\frac{p}{1-p}.$$
(15)

To provide an intuition about the properties of the distribution (12), we also note that if the energy dispersion of HI were linear, i.e., if εn = nδ, then the distribution (12) would be the Gibbs distribution

$${P}_{n}=\frac{{e}^{-{\varepsilon }_{n}/({k}_{B}T)}}{Z},\quad {\varepsilon }_{n}=n\delta ,$$

where 1/Z is a normalization factor and

$${k}_{B}T={{{{{{{\mathcal{N}}}}}}}}\delta /(2\pi g) \sim {{\Delta }}{E}_{I}/g.$$
(16)

As the dimensionless parameter g is growing, the effective temperature (16) of the final excitation distribution is decreasing.

Characteristic annealing times

The currently studied QA systems use a slowly changing transverse magnetic field with

$${H}_{M}^{0}\equiv -\mathop{\sum }\limits_{k=1}^{N}{\sigma }_{x}^{k},$$
(17)

where \({\sigma }_{x}^{k}\) are Pauli x-operators acting in space of individual spins. In later sections, we will argue that the model with schedule g(t) in (6) and HM from (7) is, for a certain large subclass of HI, optimal. Therefore, its solution can be used to learn about the entire strategy of using nonadiabatic QA for finding low-energy states. To show this, we must first introduce a method to compare the performance of different QA protocols with g(t)  1/tα and different HM, but the same HI and the computation time T.

There is an additional time scale that characterizes the speed of QA. The operator g(t)HM has a bounded spectrum. Due to the exponentially large Hilbert space, this spectrum must have a high-density region at some distance ΔEM from the ground state of g(t)HM. The Ising part HI also has a characteristic energy scale ΔEI, that is, the bandwidth of its spectrum (Fig. 1, left panel). Since HI and g(t)HM do not commute, the resonant nonadiabatic transitions between the ground level of g(t)HM and the dense region of its spectrum become most probable near the time τa, when the operators HI and g(τa)HM become comparable (Fig. 1, left and middle panels), i.e.,

$${{\Delta }}{E}_{M}({\tau }_{a})={{\Delta }}{E}_{I}.$$
(18)

For example, for our solvable model (see Methods)

$${\tau }_{a}=g{\tau }_{I},$$

where

$${\tau }_{I}=1/{{\Delta }}{E}_{I}$$

is the characteristic time of dephasing that can be induced by the Ising part HI. We will call τa the annealing time, in contrast to the total evolution time T that we will call computation time.

Fig. 1: Characteristic annealing time.
figure 1

Left: histogram of the spectrum of the Hamiltonian HI(2) with Gaussian distribution of all coefficients a{k}. Middle: time evolution of the spectrum of the transverse field Hamiltonian \({H}_{M}^{0}\)(17) with quench schedule g(t) = g/t, where g = 4. (Thicker curves correspond to the higher density of states). The vertical line marks the annealing time τa(18), for which the gap between the ground state and the highest density point in the spectrum equals the bandwidth ΔEI (full width at half maximum) of the spectrum of HI. Right: time evolution of the (normalized to 1) excitations \(\bar{n}\equiv \langle n\rangle /(({{{{{{{\mathcal{N}}}}}}}}-1)/2)\) for different QA protocols at g = 4 of N = 12 spins. These protocols are tuned to have the same annealing time τa as summarized in Table 1. The colored vertical lines mark the corresponding times at which the excitations reach the halfway into their saturation, which verifies almost the same effective annealing rate for the protocols with the same τa.

Any QA protocol must pass through the moment (18). Hence, τa can always be defined consistently. We will say that two different protocols with power-law decays of g(t) and the same HI and T, have the same speed of QA if they also have the same τa. The practically interesting values of τa are restricted to the range

$${\tau }_{I} \, < \, {\tau }_{a} \, < \, T.$$
(19)

The first inequality in (19) follows from the fact that the case of τa < τI corresponds to a strongly nonadiabatic regime, for which the gap in the spectrum of g(t)HM closes faster than the characteristic interaction rates of HI. We will say that one of the compared protocols is better if it produces fewer excitations, 〈n〉, when \(T/{\tau }_{a}={{{{{{{\rm{const}}}}}}}}\gg 1\) and the same characteristic times, τI and τa, are set for the different protocols.

If a protocol is optimal, i.e., outperforms all other protocols at some imposed conditions on the QA schedule and for a certain class of HI, it must remain optimal after time-rescaling, t → λt, in the Schrödinger equation, because the latter merely means the change of time-counting procedure. It has been recently proved15 that if such a protocol exists, it must correspond to a power-law decay of the coupling: g(t) ta. We will use this result because it strongly restricts the class of the schedules that should be tested in order to prove the optimality. Here we also note that the solvable protocol has g(t)  1/t, which means that it may be optimal for some classes of HI, which we will identify later.

Computational convergence rates

The analytical solution says that the probability to find the ground state configuration is growing linearly with τa, however, starting from an exponentially small value. Thus, if we assume that g = τa/τI 1, then

$${P}_{0}=1-p\approx \frac{2\pi g}{{{{{{{{\mathcal{N}}}}}}}}}=\frac{2\pi }{{{{{{{{\mathcal{N}}}}}}}}}\frac{{\tau }_{a}}{{\tau }_{I}}.$$

Hence, in order to make P0 1, we need the QA time

$${\tau }_{a} \sim {\tau }_{I}{2}^{N}.$$
(20)

The theory of simulated QA has previously produced various bounds on the rate of change of the coupling16,17,18. The simulated QA is a Monte-Carlo algorithm, which performance dependence on N and T can be different from the performance of the physical QA but both algorithms are interesting to compare. According to ref. 17, to guarantee the convergence of the simulated QA for binary couplings in the Ising Hamiltonian, as t → , to O(1) ground state probability, the field should change as

$$g(t)\,\ge \,{(\xi t)}^{-1/(2N-1)},$$
(21)

where ξ is exponentially small for large N. Our solution agrees with this estimate. It shows the convergence of QA computing to the ground state in the adiabatic limit, during a finite non-polynomial in N annealing time (20). However, for a fair comparison, the result in ref. 17 must be extended to the limit of maximal complexity of (2). At least the fact that the number of terms in HI can be exponentially large adds an extra-large overhead on the Monte-Carlo algorithms, such as the simulated QA, because the time to calculate just one eigenvalue becomes, itself, exponentially large. In the worst-case then, the calculation time should grow as \(\sim {{{{{{{{\mathcal{N}}}}}}}}}^{2}\). In contrast, programming such a complex HI for QA means setting \(O({{{{{{{\mathcal{N}}}}}}}})\) different couplings only once. This takes only \(O({{{{{{{\mathcal{N}}}}}}}})\) amount of time and therefore this preparation step for QA can change only the exponential prefactor but not the exponential scaling in (20).

The result (20) also shows that the generally exponentially hard computational problem requires exponentially large calculation time for a precise solution. Hence, computational difficulties reemerge in some form in different computational approaches. For specific problems, this annealing time can be generally obtained by the gap analysis and fine-tuning of the protocol for a specific HI. For example, if the minimal gap over the ground state scales as \(\sim 1/\sqrt{{{{{{{{\mathcal{N}}}}}}}}}\), this imposes the same constraint for the annealing time \({\tau }_{a} \sim {{{{{{{\mathcal{N}}}}}}}}\). However, we stress that the gap analysis for complex HI can be very challenging, and a proper choice of the annealing protocol, g(t) and HM, requires individual tuning19,20. In contrast, our analytic solution applies to all HI with a fixed simple form of the annealing protocol.

The time estimate (20) can be compared to the one for a classical search algorithm that would identify the ground state of the diagonal matrix HI. If the entries of HI are random, there is no other way but to compare all eigenvalues, which requires \({{{{{{{\mathcal{N}}}}}}}}\) computational steps. Using this analogy, equation (20) suggests that τI can be considered as an analog of the single computation time step and τa is the analog of the full computation time in the classical search algorithms.

Scaling for the average excitation number

The modern attempts to develop QA hardware are largely based on a heuristic assumption that at moderate QA rates we can obtain a considerable reduction in computational error rate even when the true ground state cannot be found. The needed intuition for this regime can be gained from physics using the similarity of the complex Ising Hamiltonians with spin-glass systems that correspond to randomly chosen couplings between spins21. The glass phase appears at low temperatures and corresponds to logarithmically slow relaxation of standard measurable characteristics22. Indeed, classical annealing simulations of spin glasses generally show a logarithmic residual energy dependence on time T of the temperature decay from a finite value to zero16,23:

$${\varepsilon }_{{{{{{{{\rm{res}}}}}}}}}^{{{{{{{{\rm{cl}}}}}}}}} \sim 1/{\log }^{\beta }T,$$
(22)

where β = O(1) is a constant. The transition to the glass phase is also expected for QA but the scaling of the residual energy with QA time is not clear. On one hand, quantum tunneling is more efficient than thermal fluctuations when overcoming spikes of a potential barrier. On the other hand, such barrier spikes can be bypassed in the multidimensional phase space of many qubits, whereas stochastic fluctuations are more efficient for transiting over shallow but broad potential barriers. Moreover, disordered quantum systems show purely quantum effects, such as many-body localization, that resist the propagation of information inside a system. An example of this behavior is found in gamma-magnets24—the models of arbitrarily many interacting spins that resist flipping even a single spin in response to arbitrarily strong and fast magnetic fields. Thus, there are arguments both in favor and against QA in comparison with classical annealing performance.

Early numerical studies found that QA leads to an inverse power of the logarithmic decay (22) as well, where T is the time of the QA protocol, but with a larger power β, and hence outperforms classical annealing25,26. However, later studies27 claimed that this behavior might be a numerical artifact caused by time discretization, and the improvement of QA reduces only to a small finite offset in the time-continuum limit. If the system passes into a glassy phase, there are analytical arguments showing that QA has no advantage over classical annealing at all28. In any case, if slow energy relaxation (22) describes QA of spin glasses in the pseudo-adiabatic regime generally, the heuristic QA method looks impractical for computations, apart from niche applications that avoid the spin-glass behavior.

Returning to our solvable model, QA superiority in the nonadiabatic regime would correspond to a fast suppression of the average number of excitations, for \({{{{{{{\mathcal{N}}}}}}}}\gg 1\), which is given by

$$\langle n\rangle =\frac{p}{1-p}\approx \frac{1}{{P}_{0}}={{{{{{{\mathcal{N}}}}}}}}\frac{{\tau }_{I}}{2\pi {\tau }_{a}}.$$
(23)

As expected, 〈n〉 decreases with the growing annealing time τa but nonexponentially and starting from an exponentially large initial value.

Let us now discuss the fact that, formally, the computation time T in the solvable model is infinite but in practice, it has to be finite. Let us set T to be proportional to τa. The same scaling then would be found for the dependence of 〈n〉 on T if the deviation of the QA result at finite T from the exact solution is suppressed by a small parameter τa/T. Numerically, we always found that 〈n〉 saturates for T > τa close to the T →  value, up to corrections of some order of τa/T (Fig. 1, right panel).

The following analytical arguments show that, indeed, a sudden termination of the protocol at finite Tτa produces a negligible difference from our analytical prediction. Using the Landau–Zener formula, the nonadiabatic transitions may not be suppressed during t > T for the states within the energy difference δε2dΔEM/dt ≤ g/T2. For spin glasses with a smooth density of states, the introduced deviations from 〈n〉 are suppressed, at least, by a factor O(τI/T), which has the same dependence on T as the 〈n〉 dependence on τa but the factor 1/T is much smaller. For example, if we set τa/T 0.01, then the deviations from the analytical prediction for 〈n〉 should not exceed  1%. Thus, we find the scaling

$$\langle n\rangle \sim 1/T,$$
(24)

assuming that \({\tau }_{a}/T={{{{{{{\rm{const}}}}}}}}\ll 1\).

Equation (24) is the main result of our article. We showed analytically that QA with the solvable protocol does not lead to a logarithmically slow relaxation for arbitrarily complex HI. In fact, the exact solution does not show any sharp changes in the relaxation curve, which are expected for the transition to a glass phase.

We now analyze the behavior of the residual energy

$${\varepsilon }_{res}={\langle {H}_{I}\rangle }_{t\to \infty }-{\varepsilon }_{0}.$$
(25)

For spin glasses with random HI, the middle of the density of states is smooth and broad, and can be well described with a constant density, i.e., En = δn, where \(\delta ={{\Delta }}{E}_{I}/{{{{{{{\mathcal{N}}}}}}}}\) is the characteristic distance between nearest energy levels. In this case, for a broad range of annealing times, 〈n〉 and the average energy after QA are linearly related: εres 〈nδ. Then, equation (23) means a surprising fact that the energy relaxation as a function of the annealing time follows a power law:

$${\varepsilon }_{res} \sim {{\Delta }}{E}_{I}({\tau }_{I}/{\tau }_{a}),$$
(26)

rather than a logarithmic relaxation with growing τa, which is found in the classical annealing of spin glasses.

In interacting-spin systems, the density of state typically follows a Gaussian form29, whose tail near the ground state can be distorted, e.g., to an exponential shape. Hence, for truly slow QA, deviations from (26) are expected because the residual energy becomes sensitive to the exact form of the density of states near the ground level. However, any power-law energy dispersion near the ground level, εnnα, leads to a power law \({\varepsilon }_{res} \sim 1/{\tau }_{a}^{\alpha }\) rather than logarithmic residual energy dependence on 1/τa after averaging over the distribution (12). This allows us to analyze the residual energy scaling with various forms of low-energy spectral density. In Methods, we show that the power-law relaxation for the residual energy is typically expected, including for the Gaussian and exponential spectral densities.

Numerically, we could not find a spectrum that would produce a clearly logarithmic residual energy relaxation for the solvable excitation distribution. We attribute this to the fact that the inverse power law for the average excitation (24) is a sufficiently strong constraint to lead to a power-law relaxation for a broad type of energy spectra. We leave the question open: whether this behavior is a consequence of the non-local nature of the mixing Hamiltonian (7).

Below, we discuss other properties of the solvable protocol, which should be of interest for the heuristic QA hardware developments.

Degenerate ground state

The exponentially large QA time is needed for the solvable protocol to obtain the ground state only if this state is nondegenerate. We consider now the case with the ground state degeneracy: ε1 = … = εM−1 = ε0. Summing the first M equations in (11), we then find that the superposition

$$\left|+\right\rangle =\frac{1}{\sqrt{M}}\mathop{\sum }\limits_{m=0}^{M-1}\left|m\right\rangle$$

is coupled to any \(\left|n\right\rangle\), where n ≥ M, with a larger coupling \(g\sqrt{M}\). All other orthogonal superpositions of the Ising ground states then decouple and have zero probability to be at the end of the evolution.

The solvable model in Appendix B of ref. 12 (see also Methods) is applied even when all \({{{{{{{\mathcal{N}}}}}}}}\) states are coupled to each other with different independent \({{{{{{{\mathcal{N}}}}}}}}\) parameters. Thus, the modification of the effective coupling to state \(\left|+\right\rangle\) is still described by the exact solution in ref. 12, which leads to the probability of the final state \(\left|+\right\rangle\):

$${P}_{+}=(1-{p}^{M})/(1-{p}^{{{{{{{{\mathcal{N}}}}}}}}}),\quad p={e}^{-\frac{2\pi g}{{{{{{{{\mathcal{N}}}}}}}}}},$$
(27)

whereas the probabilities of the energy excitations do not change. This gives us an estimate for the time to prepare the state \(\left|+\right\rangle\) with probability P+ 1:

$${\tau }_{a} \sim \frac{{{{{{{{\mathcal{N}}}}}}}}{\tau }_{I}}{M}.$$

If M is large, e.g.,

$$M \sim {{{{{{{\mathcal{N}}}}}}}}/{\log }_{e}^{a}{{{{{{{\mathcal{N}}}}}}}},$$
(28)

this leads to an exponential speedup for extracting non-local information that can be obtained from measurements on the prepared superposition \(\left|+\right\rangle\).

For example, suppose that all excitation energies of HI are random positive and ε0 = … = εM−1 = 0 appear periodically, so that, when sorted in the known standard computational basis, they correspond to the eigenstates \(|{x}_{0}+rT\rangle\), where x0 and T are integers, such that \({x}_{0} \, < \, T \sim {\log }_{e}^{a}{{{{{{{\mathcal{N}}}}}}}}\); r = 0, 1, 2, …, and \({{{{{{{\mathcal{N}}}}}}}}/T\) is also an integer. This corresponds to \(M \sim {{{{{{{\mathcal{N}}}}}}}}/{\log }_{e}^{a}{{{{{{{\mathcal{N}}}}}}}}\), so during the QA time of an order

$${\tau }_{a} \sim {\tau }_{I}\;{\log }_{e}^{a}{{{{{{{\mathcal{N}}}}}}}}$$

the solvable protocol prepares a state of the qubits as a symmetric superposition:

$$\left|+\right\rangle =\frac{1}{\sqrt{M}}\mathop{\sum}\limits_{r}\left|{x}_{0}+rT\right\rangle .$$
(29)

The Quantum Fourier Transform then can be used to change this state into a superposition of the states \(\left|k\right\rangle\), where k is the integer multiple of \({{{{{{{\mathcal{N}}}}}}}}/T\). Finding only two different k, one can then find their greatest common divisor by classical means, and thus determine the period T faster than by classical means.

The possibility to solve the period finding problem on a quantum computer is an essential ingredient in many quantum algorithms, such as Shor’s factorization algorithm. An important step in such algorithms is to find a symmetric superposition of equal energy eigenstates of a quantum function that has a high degeneracy of eigenstates in the entire phase space. Such a function can be usually encoded in the target Hamiltonian HI and thus one of its eigenstates can be found using QA. However, it is clear from our solution why such algorithms are hard to implement with other heuristic protocols, such as with the transverse field (17). This field couples different Ising ground states with the higher Ising energy states differently. Hence, even if we assume that the ground state can be prepared quickly, it will appear generally in a nonsymmetric superposition

$$\left|{\psi }_{G}\right\rangle =\mathop{\sum}\limits_{r}{C}_{r}\left|{x}_{0}+rT\right\rangle ,$$

where the coefficients Cr have not only different absolute values but also different phases which depend on all parameters of HI. Hence, further manipulations, such as making the Quantum Fourier Transform, may not provide the desired effect on this state, which is needed to complete the algorithm.

Effectiveness of the solvable protocol in the limit of maximal complexity of H I

The annealing protocol in our solvable model is unbiased in the sense that the amplitudes an(t)(11) do not depend on the specific structure of the basis states. This is not the case for the protocol with a transverse field30, which couples directly only to the basis states whose net spin polarization differs by ±1. Our protocol is also unbiased in the sense that degenerate ground state configurations as a symmetric superposition couple to the other states equally, which results in equal probabilities to find such ground states of HI.

Moreover, the statistical learning theory31 says that direct approaches, which avoid the gain of irrelevant information, should be favorable for learning algorithms. This is partly addressed by our finding that the final state probabilities obtained by solving equation (11) are independent of the precise values of εk, i.e., the transition probability to any state \(\left|n\right\rangle\) depends only on how many other states have smaller Ising energies. For example, the probability to find the ground state does not depend on the choice of HI at all. This independence of the scattering probabilities of certain basic parameters is shared by all integrable models with time-dependent Hamiltonians13 but is not expected otherwise. Hence, it must be unique for g(t)  1/t annealing protocol because other g(t) is not among the known solvable models with arbitrary HI. This property means that our solvable protocol does not produce irrelevant information about specific values of εk, as needed because only the ordering of these eigenvalues matters for finding good approximations to the ground energy.

Such properties altogether are unique among the possible QA protocols, which suggests that the solvable protocol, for some types of problems, could be favorable. Owing to the universality of the analytical solution, if true, this should be true for the most complex form of HI. Thus, let HI be the sum of all possible terms in (2) with independent random coefficients a{k}. Such a high-complexity limit reduces the problem of identifying the minimal value from an unsorted array of independent random energies εn that are sampled from some distribution. For instance, for Gaussian random coupling coefficients, εn forms a Gaussian distribution as well (Fig. 1, left panel). Such a construction of HI does not favor any particular ground state spin configuration and even any systematic correlations between the excited states. Hence, it is expected that the low-energy states are estimated faster with a maximally unbiased QA protocol, which is our solvable protocol.

To test this hypothesis, we employ the result in15 that allows us only to compare the performance of the solvable protocol with a family of the protocols with a power-law decay of the coupling, g(t)  1/tα, and identical for each protocol fully random HI, as well as τa and T/τa. First, we note that the protocols with α < 1 produce definitely worse than 〈n〉  1/τa scaling for the excitations if we set \(T/{\tau }_{a}={{{{{{{\rm{const}}}}}}}}\). This follows from the fact that even in the adiabatic approximation the term HM/tα mixes any Ising eigenstate with other states within the window of energy ε 1/tα. Hence, sudden termination of such protocols at a finite time T cannot resolve the states within the energy window that scales as 1/Tα, which decays slower than 1/T.

For α ≥ 1, we resort to the numerical investigation. Figure 2 compares numerically calculated final 〈n〉 for different protocols at N = 12 and the Hamiltonian (2) with randomly chosen all possible couplings. For large g, which we define for all protocols as g ≡ τa/τI, the excitation number decays as a power law. For any g and N, our analytically solvable model (Protocol 1) always outperforms the other protocols, although all of them show scaling similar to 1/g for large g. In numerous other tests (not shown), we found that all non-power-law schedules, e.g., with g(t) decaying exponentially, had a much worse performance for the same values of τI, τa, and T, in agreement with15. Figure 3 also shows the data that we used to extrapolate the results to larger N. For such interpolations, we always found that the solvable protocol produced smaller residual energy for the fully random Hamiltonian HI. Hence, as far as we could test numerically and extrapolate our results, the solvable protocol was, indeed, optimal for our comparison criteria and the most complex form of HI.

Fig. 2: Scaling of the excitation number.
figure 2

Numerically found final normalized excitation number \(\bar{n}\equiv \langle n\rangle /(({{{{{{{\mathcal{N}}}}}}}}-1)/2)\) at various g = τa/τI for N = 12 spins for three different protocols listed in Table 1. The Hamiltonian HI takes the form (2) with the coupling coefficients independently drawn from the standard normal distribution. The main figure and inset show the adiabatic (large g) and nonadiabatic (small g) regimes in log–log and semi-log scales, respectively. The solvable protocol (red points) always outperforms the other protocols for the same g.

Fig. 3: Asymptotic excitation numbers.
figure 3

Numerically found final normalized excitation numbers \(\bar{n}\equiv \langle n\rangle /(({{{{{{{\mathcal{N}}}}}}}}-1)/2)\) for various numbers of spins at g = 2 and Ising Hamiltonian (2) with the coupling coefficients independently drawn from the standard normal distribution. Blue dots correspond to the transverse field protocol, and red dots are numerical confirmation of the analytical prediction for the solvable model, which is marked by a dashed black curve. Error bars show the standard deviations of the data points obtained from averaging over 25 realizations of the random Hamiltonian. A Blue dotted curve is the best fit to \(\mathop{\sum }\nolimits_{k = 0}^{4}{c}_{k}/{N}^{k}\), which predicts that the exact solution has better performance for any N.

An alternative argument for the optimality of the solvable protocol for fully random HIs follows from the estimate (23), which says that the performance of this protocol is actually the same as in the classical Monte-Carlo search. Indeed, a random search for the lowest eigenvalue has probability \({n}_{\max }/{{{{{{{\mathcal{N}}}}}}}}\) per step to pick up an eigenvalue from the first \({n}_{\max }\) excitations. Hence it takes time \(\tau \sim {{{{{{{\mathcal{N}}}}}}}}{\tau }_{{{{{{{{\rm{step}}}}}}}}}/{n}_{\max }\) to find an eigenvalue with \(0\le n\le {n}_{\max }\), where τstep is the time of one eigenvalue of HI computation and its comparison to a previously found lowest value. This is precisely the estimate of equation (23), in which we identify τa with τ, τI with τstep and 〈n〉 with \({n}_{\max }\). Since our QA protocol has the same convergence rate as the classical Monte-Carlo search of the completely unsorted array, any improvement over its performance on HI with all random entries, either for the full or the partial search, would mean the quantum supremacy that does not rely on hints such as the oracle in the Grover algorithm, which is believed to be impossible.

Thus, our protocol gives an explicit example of heuristic QA computations leading to the same performance as for one of the known classical algorithms. This includes all possible HI with nondegenerate spectra, and all possible time restrictions. As our QA protocol, the unbiased random search Monte-Carlo is the preferable choice for searching through a completely random array but then by classical means. This raises a question of whether many other heuristic approaches, such as using the practically most accessible QA protocols without correlating them with the desired task, or post-processing the final state as in the case of the ground state degeneracy, have also the same performance for all possible tasks as certain classical algorithms.

Avoiding the bound

The limit of fully random HI represents the largest class of all possible computational problems (5). Classical optimization algorithms usually trade between good and bad performance in different applications, which is known as the “no-free-lunch” property. Although similar results are not known for QA, it is expected that the effectiveness of the solvable protocol for the big class of the most complex problems generally means that there are protocols that outperform it on simpler problems with more structured HI. Below, let us show several examples in support of this hypothesis.

A well-known example of a problem with a structured HI is the one that is solvable by the Grover algorithm. It prepares the ground state of an operator HI that has all but one zero eigenvalues, whereas the ground state energy is −1. Let ηk = ± 1, where the sign depends on whether this ground state has the k-th spin, respectively, up or down. Then, HI for Grover’s problem can be written as

$${H}_{I}^{G}=-\mathop{\prod }\limits_{k=1}^{N}\frac{(1+{\eta }_{k}{\sigma }_{z}^{k})}{2}.$$
(30)

In comparison with the most complex version of (2), this Hamiltonian is much simpler. It depends only on N sign parameters, and it has considerable symmetry: changes in these parameters do not affect the spectrum of \({H}_{I}^{G}\). It is, indeed, known that the ground state of \({H}_{I}^{G}\) can be found by adiabatic QA during the time that scales only as \({{{{{{{{\mathcal{N}}}}}}}}}^{1/2}\)10. Achieving this adiabatically requires a very fine-tuned choice of the schedule g(t). However, if our solvable protocol is not optimal for the structured problems there must be protocols that achieve better estimates for the ground state for Grover’s problem also beyond the adiabatic regime, and such protocols may not need to be very complex.

Let us show that this expectation is true. Consider the QA Hamiltonian

$$H(t)=\varepsilon {H}_{I}^{G}-g(t){H}_{M},$$
(31)

where HM is given by (7). Due to the degeneracy of eigenvalues of \({H}_{I}^{G}\), the evolution equation (11) reduces to two coupled differential equations for the amplitude a0 of the ground state and the normalized sum of the other amplitudes:

$${a}_{+}\equiv \frac{1}{\sqrt{{{{{{{{\mathcal{N}}}}}}}}-1}}\mathop{\sum }\limits_{k=1}^{{{{{{{{\mathcal{N}}}}}}}}-1}{a}_{k}.$$

Namely,

$$i{\dot{a}}_{0} =-\left(\frac{g(t)}{{{{{{{{\mathcal{N}}}}}}}}}+\varepsilon \right){a}_{0}-\frac{\sqrt{{{{{{{{\mathcal{N}}}}}}}}-1}g(t)}{{{{{{{{\mathcal{N}}}}}}}}}{a}_{+},\\ i{\dot{a}}_{+} =-\frac{({{{{{{{\mathcal{N}}}}}}}}-1)g(t)}{{{{{{{{\mathcal{N}}}}}}}}}{a}_{+}-\frac{\sqrt{{{{{{{{\mathcal{N}}}}}}}}-1}g(t)}{{{{{{{{\mathcal{N}}}}}}}}}{a}_{0}.$$
(32)

The initial conditions, as t → 0+, correspond to \({a}_{0}=1/\sqrt{{{{{{{{\mathcal{N}}}}}}}}}\approx 0\) and, hence, \({a}_{+}\approx 1\). The protocol that makes P0 ≡ a02 1 is obtained by immediately setting the schedule to a constant value

$$g(t)=\varepsilon {{{{{{{\mathcal{N}}}}}}}}/({{{{{{{\mathcal{N}}}}}}}}-2)\approx \varepsilon ,$$
(33)

and then letting the system evolve under such conditions during time

$$T=\frac{\pi \sqrt{{{{{{{{\mathcal{N}}}}}}}}}}{2\varepsilon }.$$
(34)

One can verify that this makes P0 \(\approx\) 1 by noting that equations (32) with condition (33) are equivalent to the evolution equations for a spin 1/2 in a transverse magnetic field, which rotates this spin. Condition (33) is needed to remove the component of this field that points along the spin axis. Time T corresponds to a rotation angle that switches between orthogonal states of this spin.

Unlike the time of the solvable protocol with g(t) = − g/t, which scales as \(T \sim {{{{{{{\mathcal{N}}}}}}}}\), the time in (34) scales as \(\sim \sqrt{{{{{{{{\mathcal{N}}}}}}}}}\), which is expected for Grover’s computational problem.

This efficient protocol to solve Grover’s problem is fine-tuned for \({H}_{I}^{G}\) and cannot show good performance on other tasks. Identifying such algorithms for heuristic computations requires additional optimization steps, e.g., using the methods of machine learning32, which would correlate the annealing protocol to a given structured HI. Such methods, however, become inefficient in the limit of maximal complexity with fully random HI because of the emergence of the barren plateau33.

Another example corresponds to the systems with small connectivity between qubits in HI. It is expected then that a QA protocol that emphasizes interactions without many direct spin flips can achieve a better performance, such as the protocol induced by the decaying transverse field.

To test this, we performed simulations for HI with limited connectivity ranges, i.e., a range-k Hamiltonian is of the form (2) but only contains terms with at most k simultaneously coupled spins. This allows the control of the problem complexity by tuning the connectivity range. Our numerical simulations (Fig. 4) show that, for finite size systems of up to 12 spins and the transverse field (17), the final excitation numbers always scale as a power law of g, i.e., \(\langle n\rangle \sim 1/{g}^{\alpha } \sim 1/{\tau }_{a}^{\alpha }\), and α increases with the decrease of the connectivity range of HI. At k = 2, which is known as the Sherrington–Kirkpatrick model34, α reaches the value 2.

Fig. 4: Annealing for systems with the finite connectivity range.
figure 4

The scaling rate, α, was obtained by fitting numerically obtained excitation numbers with \(\langle n\rangle \sim 1/{\tau }_{a}^{\alpha }\), for various interaction ranges of the Hamiltonian HI and the transverse field protocol (Protocol 2 in Table 1). The inset shows a comparison of the typical power-law decay for such fits of the data for protocols 1 and 2 with N = 12 spins under a range-2 (only binary spin-spin couplings) Hamiltonian.

Figure 4 demonstrates the convergence of the performance to the universality domain of the solvable protocol with increasing complexity. In agreement with our expectations, as far as we could see numerically, the protocol with the decaying transverse field produced better performance on the structured problems than the solvable protocol, in agreement with the “no-free-lunch” property.

Let us now return to the question of whether QA computations in the nonadiabatic regime can provide a better performance, in terms of scaling with the number of qubits, than the adiabatic quantum computations for the same problem. Our solvable protocol, as well as the nonadiabatic Grover protocol, do not show this feature, as their performances scale equally with the adiabatic QA. Generally, this may not be true.

Here, we note that there is one more solvable model of QA that can be used to explore the scaling of τa(N) for a specific simple HI: Consider

$${H}_{I}^{\varepsilon }=\mathop{\sum }\limits_{k=1}^{N}{\varepsilon }_{k}{\sigma }_{z}^{k},$$
(35)

that is subject to a non-local constraint \(\mathop{\sum }\nolimits_{k = 1}^{N}{\sigma }_{z}^{k}=0\). Let us assume that εk are of the order ε. The ground state of \({H}_{I}^{\varepsilon }\) has N/2 spins pointing up. They correspond to the smaller half of εk values. The other N/2 spins point down. Here, HI is parametrized by only N numbers εk. Naturally, a wise algorithm should not look through all 2N eigenvalues of HI but rather learn those parameters.

Due to the constraint, the ground state of (35) has zero total qubit polarization. To find this state, one can use the protocol with HM that also has the ground state with zero initial total spin35:

$$H(t)=\mathop{\sum }\limits_{k=1}^{N}{\varepsilon }_{k}{\sigma }_{z}^{k}-\frac{g}{t}\mathop{\sum }\limits_{i\ne j=1}^{N}{\hat{\sigma }}_{i}^{+}{\sigma }_{j}^{-}.$$
(36)

As t → 0, the ground state energy of H(t) is separated from the dense region of g(t)HM near-zero energy by \({{\Delta }}{E}_{M} \sim \frac{gN(N-1)}{2t}\), and the \({H}_{I}^{\varepsilon }\) bandwidth scales linearly with N: ΔEIεN. The exact solution of this model was found in ref. 35. It says that the ground state is determined if g \(\approx\) 1.

Using our definition of the annealing time, we can now compare the performance of such QA computations with the performance of classical algorithms for the same problem. We find for the model (36) that g \(\approx\) 1 corresponds to

$${\tau }_{a} \sim N/\varepsilon .$$

The same solution in ref. 35 also shows that if we need only a partial search by allowing a fraction α 1 of mistakes, i.e., allowing αN spins pointing in a wrong direction, then it is sufficient to choose g 1/(Nα), i.e., the computation time reduces by a factor  1/(Nα), so in our notation

$${\tau }_{a} \sim O(1/\alpha ).$$

Classically, finding the smaller half of N/2 of εk values takes N steps. The partial QA solution thus has a better N-scaling than both the best available classical algorithm and the complete solution in the adiabatic limit. This example supports the speculations that a hybrid approach that involves a moderately fast QA step combined with a subsequent classical relaxation may improve the search for the true ground state.

Estimates for the physical time of computation

The tests of QA hardware36,37,38,39,40,41 on specific problems gave contradictory results. There are claims for superior performance of QA in some instances42, but achieving scalable quantum supremacy11 using QA is still far from conclusive.

Let us estimate the performance of our solvable protocol at the current level of technology. The coupling energy of a single qubit to the rest of the quantum processor is physically restricted to some value \({\epsilon }_{\max }\). For example, for a superconducting qubit, a coupling larger than the superconducting gap may produce unwanted excitations outside the qubit phase space. The bandwidth for HI is then restricted by \({{\Delta }}{E}_{I} \, < \, {\epsilon }_{\max }N\). Hence, τI for N qubits is restricted by

$${\tau }_{I} \, > \, 1/({\epsilon }_{\max }N).$$

If we assume \({\epsilon }_{\max }=10\)GHz as the upper bound for a superconducting qubit, then to find the ground state of only 20 qubits, from (19), we need at least time τa 0.1 μs, which is the typical upper bound on coherence time of such qubits. The required computation time τa is growing exponentially with extra qubits, so chances to solve an optimization problem for >20 qubits with the modern level of quantum technology are quickly vanishing.

One practical advantage of the solvable protocol that may justify the efforts to implement it in hardware may follow the complexity to retrieve the HI eigenvalues. Namely, when the sorting problem is encoded in the Hamiltonian of spin projection operators the direct classical algorithm requires the additional computation of eigenvalues of HI at each step, which can be exponentially long on its own for the most complex HI, but is not required during QA. To exploit this resource, one should create a small processor, with only 25 high-quality qubits, but with HI that depends on 225 different coupling parameters.

Discussion

Finding the ground state of an arbitrary Ising spin Hamiltonian is generally an exponentially hard computational problem. Even harder, it seems, is to study dynamics with a time-dependent quantum Hamiltonian that implements quantum annealing computation in the nonadiabatic regime. Nevertheless, we showed that a fully solvable model for the most general case of Ising spin interactions exists.

In other branches of physics, integrable many-body models have been very influential—often not for a particular experimental application but for the opportunity to understand the behavior of complex matter in the regimes unreachable to numerical simulations. Similarly, our exact solution produces an insight into both spin-glass physics and quantum computing from an original perspective. Thus, we used it to set new limits on the computation precision and proved the better relaxation scaling of the residual energy for quantum over classical annealing computations.

Numerically, we found considerable evidence to our conjecture that in the limit of the maximal complexity of the computational problem our solvable QA protocol outperforms other protocols for arbitrary QA rate at identical conditions for the time of computation. Given also the “no-free-lunch" property of algorithms, this leads to a new conjecture that more structured computational problems can be solved by certain QA protocols faster than in our solvable model. We provided the arguments in support of this conjecture too. Hence, our analytical solution can serve as a reference for the performance that can be achievable in the nonadiabatic regime for arbitrary HI.

A currently discussed technical question, besides improving quantum coherence, is how to redesign the inter-qubit connections and the annealing protocol in order to improve heuristic QA41. It is often stated that the performance can improve if one-to-many qubit couplings are implemented in the Ising Hamiltonian, and if the annealing protocol has a simpler spectrum in order to make it less biased and thus reduce the effects of resonances that are specific to HM. Our results show that such approaches may not lead to a boost in performance. In fact, the solvability of our model follows from a high symmetry that makes the solvable protocol maximally unbiased. We showed that this provides the advantage, over other protocols, only for the tasks with the maximal complexity but not for more structured Ising spin Hamiltonians. Hence, by adding one-to-many qubit connections and preparing less biased QA protocols, we may only bring the complexity of the QA computations closer to the domain of our model’s superiority.

Our findings suggest that the quantum annealing superiority, for a specific problem, over all classical algorithms should be searched either in small size processors but with combinatorially complex interactions in HI or among relatively simple-structured HI, with a polynomial number of parameters but a transverse part g(t)HM that is tailor-made for this specific computational task. It is thus important to understand how the QA performance depends on the correlations between HI and HM, and on the prepared correlations in the initial state for quantum annealing.

Methods

Solution for QA model with arbitrary target Hamiltonian

The annealing problem is sometimes formulated so that the target Hamiltonian, H0, is different from the Ising Hamiltonian:

$$H(t)={H}_{0}+g(t){H}_{M}.$$
(37)

Let us show that our protocol with g(t) = g/t and HM given by (7) is still solvable in the sense that we can write the probabilities of the final eigenstates of H0 in terms of the parameters of H0.

Suppose that U is the unitary operator that diagonalizes H0, i.e.,

$${H}_{I}=U{H}_{0}{U}^{{{{\dagger}}} }$$

is a diagonal matrix. The latter means that it can be written in the Ising form (2), and we can define the basis states \(\left|n\right\rangle\) where n is the index of the excitation, as in the main text. Let us define the state

$$\left|\psi ^{\prime} \right\rangle =U\left|{\psi }_{0}\right\rangle ,\quad \left|{\psi }_{0}\right\rangle \equiv \left|\to ,\to ,\ldots ,\to \right\rangle .$$

In the basis \(|n\rangle\), the entire Hamiltonian has the form

$$H(t)={H}_{I}-\frac{g}{t}\left|\psi ^{\prime} \right\rangle \left\langle \psi ^{\prime} \right|.$$
(38)

It is now almost the same as in the problem considered in the main text but the state \(|\psi ^{\prime} \rangle\) is dependent on the matrix U. Hence, the matrix elements of the mixing part are given by

$${(g{H}_{M})}_{nm}={g}_{n}{g}_{m}^{* },$$

where

$${g}_{n}=\sqrt{g}\langle n|U\left|{\psi }_{0}\right\rangle .$$
(39)

Thus, unlike the model in the main text, the mixing Hamiltonian gHM depends on \({{{{{{{\mathcal{N}}}}}}}}\) generally different parameters that depend on the eigenstates of H0 via the matrix elements of U.

Nevertheless, the most general form of the model that was solved in Appendix B of ref. 12 includes this particular case. Thus, if we define the probabilities

$${p}_{n}\equiv {e}^{-2\pi | {g}_{n}{| }^{2}}$$

then equation (12) for the excitation probabilities (see also equation (B13) in ref. 12) is extended to

$${P}_{n}=\frac{(1-{p}_{n})\mathop{\prod}\limits_{k < n}{p}_{k}}{1-\mathop{\prod }\limits_{n=0}^{{{{{{{{\mathcal{N}}}}}}}}}{p}_{n}}.$$

Returning to the original problem (5) in the main text, it follows from (39) that knowledge of a unitary transformation UHIU, such that its action increases the overlap of the ground state with the state \(|{\psi }_{0}\rangle\), can be used to increase the probability to find the ground state.

Solution of the model

Following steps from Appendix B in ref. 12, we perform Laplace transformation

$${a}_{n}(t)={\int}_{{{{{{{{\mathcal{A}}}}}}}}}{e}^{-st}{b}_{n}(s)\,ds,$$
(40)

where \({{{{{{{\mathcal{A}}}}}}}}\) is a contour in the complex plane such that the integrand vanishes when \({{{{{{{\mathcal{A}}}}}}}}\) originates and escapes to infinity (Fig. 5). Substituting (40) into (11), we find a first-order differential equation with a simple solution for bn(s), which we substitute to (40) to find

$${a}_{n}(t)=c{\int}_{{{{{{{{\mathcal{A}}}}}}}}}\frac{{e}^{-st}}{-s+{\varepsilon }_{n}}\mathop{\prod }\limits_{n=0}^{{{{{{{{\mathcal{N}}}}}}}}-1}{(-s+{\varepsilon }_{n})}^{ig/{{{{{{{\mathcal{N}}}}}}}}}\,ds,$$
(41)

where c is a normalization constant that is fixed by the initial conditions. Following12, as t →  this integral is evaluated using the saddle point method and suitable deformation of \({{{{{{{\mathcal{A}}}}}}}}\) into the paths that go around the branch cuts in Fig. 5. This results in the analytical expression for an(t → ) in terms of the Gamma function of the parameters. The excitation probability is then obtained from Pn = an(t → )2, and using the properties of the Gamma function.

Fig. 5: Contour \({{{{{{{\mathcal{A}}}}}}}}\) defined in (40).
figure 5

This contour integral can be computed using the integral over the contours enclosing the branching cuts.

Setting parameters of protocols to compare their performance

First, we note that H with HM in (7) and \({H}_{M}^{0}\) in (17) have the same ground states both as t → 0+ and t → . For both of them, the maximum density of states is at zero energy. Hence, for HM, ΔEt = g(t), and for \({H}_{M}^{0}\), \({{\Delta }}{E}_{t}^{0}=Ng(t)\), where N is the number of spins. If, for the analytically solvable protocol with HM, we choose the time-dependent form g(t) = g/t and fix the quench parameter g, then the annealing time is given by g/τa = ΔEI, or

$${\tau }_{a}=g{\tau }_{I},$$
(42)

where τI = 1/ΔEI. This also gives the meaning to the parameter g, that is, the ratio of the annealing time and the characteristic time of the dephasing by HI. For the transverse field protocol (7) with \({H}_{M}^{0}\), the same annealing time τa in [(42)] is achieved if we set

$${g}_{0}(t)=\frac{g}{Nt}.$$
(43)

Similar arguments for g0(t)  1/t2 lead to g(t) = − g/(at2), where a = ΔEI/g, as listed in Table 1.

Table 1 Quantum annealing protocols with different interaction HM and schedule g(t), but of the same annealing time.

Scaling of the residual energy

In the main text, we have shown that for a uniform spectral density, \(\rho (E)={{{{{{{\rm{constant}}}}}}}}\), the residual energy (25) scaling is a power-law in the annealing time τa (or in the parameter g).

The power-law scaling of the residual energy can be generalized to any power-law dependence of εn, by readily evaluating the average over the probability distribution (12). Namely, for εnnα, and in the limit of \({{{{{{{\mathcal{N}}}}}}}}\gg 1\), it can be shown that εres 1/gα.

For a generic spectral density ρ(ε), the energy level index n can be written as a function of the energy,

$$n(\varepsilon )=\int\nolimits_{{\varepsilon }_{0}}^{\varepsilon }\rho (x)dx.$$
(44)

Without loss of generality, let us assume that the ground state has zero energy, ε0 = 0. The residual energy can be evaluated as

$${\varepsilon }_{res}\equiv \mathop{\sum}\limits_{n}{P}_{n}{\varepsilon }_{n}=\int d\varepsilon {P}_{n(\varepsilon )}\varepsilon \rho (\varepsilon ),$$

where Pn is the probability distribution (12). The behavior of the residual energy is determined by the shape of the spectral density. However, we argue that its power-law scaling is generally expected.

Consider the Gaussian and exponential spectral densities. Both of the spectra are restricted to the energy range [0, 2] and are centered at ε = 1. The Gaussian spectral density we simulated is

$$\rho (\varepsilon )=A{e}^{-10{(\varepsilon -1)}^{2}},\quad 0 \, < \, \varepsilon \, < \, 2,$$
(45)

and the exponential spectrum is

$$\rho (\varepsilon )=\left\{\begin{array}{l}B({e}^{10\varepsilon }-1),\quad 0 \, < \, \varepsilon \, < \, 1,\\ B({e}^{10(2-\varepsilon )}-1),\quad 1 \, < \, \varepsilon \, < \, 2,\end{array}\right.$$
(46)

where A and B are normalization factors. At large annealing time, when the system approaches the ground state, we can expand the spectral density to the leading order of the energy. Note that the exponential spectrum modeled above vanishes at the ground state, hence, with (44), n(ε) ε2. Using the result for power-law εn aforementioned, we expect a scaling of the residual energy \({\varepsilon }_{res} \sim 1/\sqrt{g}\). The Gaussian spectrum studied above has a finite value at the ground state cutoff. This constant value can dominate the sub-leading terms when the total number of states is sufficiently large. In this case, we get εnn and consequently εres 1/g. These two types of scaling behavior are verified in Fig. 6.

Fig. 6: Residual energy of the Gaussian and exponential model.
figure 6

Simulation of the residual energy for Gaussian (triangle) and exponential (square) spectral density, defined in equations (45, 46). Red and black solid lines are \(\sim 1/\sqrt{g}\) and ~1/g, respectively. The total number of states is fixed at \({{{{{{{\mathcal{N}}}}}}}}=1{0}^{9}\).

We now consider an analytically solvable model case. Suppose the spectral density near the ground state is given by an exponential function

$$\rho (\varepsilon )=a{e}^{\varepsilon },$$
(47)

with a finite but small density at the ground state, ρ(0) = a. The number of states below energy ε is

$$n(\varepsilon )={{{{{{{\mathcal{N}}}}}}}}a\int\nolimits_{0}^{\varepsilon }dx\,{e}^{x}={{{{{{{\mathcal{N}}}}}}}}a({e}^{\varepsilon }-1),$$
(48)

where \({{{{{{{\mathcal{N}}}}}}}}\) is the total number of states. Therefore,

$${\varepsilon }_{n}=\log (n+{{{{{{{\mathcal{N}}}}}}}}a)-\log ({{{{{{{\mathcal{N}}}}}}}}a).$$
(49)

The average of this energy over the distribution Pn(12) can be computed exactly. We note that the exponential density of the state is only valid at small energies since it diverges when ε becomes large. Hence, to get a physically sensible result representing the correct low-energy behavior, the total number of states \({{{{{{{\mathcal{N}}}}}}}}\) must be sufficiently large, so the decay of Pn at large n compensates for the nonphysical growth of the exponential spectral density. With this, we get

$${\varepsilon }_{res} =\mathop{\sum }\limits_{0}^{\infty }{\varepsilon }_{n}{P}_{n}\\ =-\frac{1-p}{1-{p}^{{{{{{{{\mathcal{N}}}}}}}}}}{{{\Phi }}}^{(0,1,0)}(p,0,{{{{{{{\mathcal{N}}}}}}}}a)-\log ({{{{{{{\mathcal{N}}}}}}}}a),$$
(50)

where \(p={e}^{-2\pi g/{{{{{{{\mathcal{N}}}}}}}}}\approx 1-2\pi g/{{{{{{{\mathcal{N}}}}}}}}\), Φ(x, y, z) is the Lerch transcendent function, and f (0, 1, 0) is the derivative of f with respect to its second argument. This function at large \({{{{{{{\mathcal{N}}}}}}}}\) simplifies to a power-law scaling  1/g (see Fig. 7).

Fig. 7: Residual energy for the exactly solvable exponential spectral density.
figure 7

Blue solid line: residual energy in equation (50) at various scales. The parameters are fixed at a = 1 and \({{{{{{{\mathcal{N}}}}}}}}=1{0}^{9}\). To guide the eye, ~1/g is plotted in red dotted lines.

As the last example, consider a model of many non-interacting spin 1/2’s. Each spin has eigenenergies ±1. The number of states for a fixed number of spin excitations is given by the binomial distribution. This allows us to compute the energy levels exactly. This model has a global spectral density well described by a Gaussian function. Figure 8 shows a finite size simulation of the residual energy, which scales as a power-law 1/gα, with the exponent fitted to α \(\approx\) 0.34. Note that α = 1/3 is expected for a Gaussian spectrum, whose density of states vanishes at the ground state, because it expands to the leading order as ε2, which results in εnn1/3. Our simulation fits into this picture very well. The residual energy eventually switches to an exponential decay near the truly adiabatic regime (as shown in the inset). This is expected because the energy relaxation is dominated then by only a few states that all decay with time exponentially.

Fig. 8: Residual energy of the independent spin model.
figure 8

Simulation of the residual energy for the independent spin model, plotted in log–log scale. The red solid line is fitted to 1/g0.34. The total number of spins is 30, which corresponds to \({{{{{{{\mathcal{N}}}}}}}}\approx 1{0}^{9}\). Inset is the same plot in semi-log scale, which shows an exponential fit of the adiabatic tail (red solid line).