Introduction

Quantum metrology concerns the task of estimating a parameter, or several parameters, characterizing the Hamiltonian of a quantum system. This task is performed by preparing a suitable initial state of the system, allowing it to evolve for a specified time, performing a suitable measurement, and inferring the value of the parameter(s) from the measurement outcome. Quantum metrology is of great importance in science and technology, with wide applications including frequency spectroscopy, magnetometry, accelerometry, gravimetry, gravitational wave detection, and other high-precision measurements1,2,3,4,5,6,7,8,9.

Quantum mechanics places a fundamental limit on measurement precision, called the Heisenberg limit (HL), which constrains how the precision of parameter estimation improves as the total probing time t increases. According to HL, the scaling of precision with t can be no better than 1/t; equivalently, precision scales no better than 1/N with the total number of probes N used in an experiment. For a noiseless system, HL scaling is attainable in principle by, for example, preparing an entangled “cat” state of N probes10,11,12. In practice, though, in most cases environmental decoherence imposes a more severe limitation on precision; instead of HL, precision scales like \(1{\mathrm{/}}\sqrt N \), called the standard quantum limit (SQL), which can be achieved by using N independent probes13,14,15,16,17,18. The quest for measurement schemes surpassing the SQL has inspired a variety of clever strategies, such as squeezing the vacuum1, optimizing the probing time19, monitoring the environment20,21, and exploiting non-Markovian effects22,23,24.

Quantum error correction (QEC) is a particularly powerful tool for enhancing the precision of quantum metrology25,26,27,28,29,30. Quantum error correction is a method for reducing noise in quantum channels and quantum processors31,32,33. In principle, it enables a noisy quantum computer to simulate faithfully an ideal quantum computer, with reasonable overhead cost, if the noise is not too strong or too strongly correlated. But the potential value of QEC in quantum metrology has not yet been fully fleshed out, even as a matter of principle. A serious obstacle for applications of QEC to sensing is that it may in some cases be exceedingly hard to distinguish the signal arising from the Hamiltonian evolution of the probe system from the effects of the noise acting on the probe. Nevertheless, it has been shown that QEC can be invoked to achieve HL scaling under suitable conditions25,26,27,28, and experiments demonstrating the efficacy of QEC in a room-temperature hybrid spin register have recently been conducted34.

As is the case for quantum computing, we should expect positive (or negative) statements about improving metrology via QEC to be premised on suitable assumptions about the properties of the noise and the capabilities of our quantum hardware. But what assumptions are appropriate, and what can be inferred from these assumptions? In this paper, we assume that the probes used for parameter estimation are subject to noise described by a Markovian master equation35,36, where the strength and structure of this noise is beyond the experimentalist’s control. However, aside from the probe system, the experimentalist also has noiseless ancilla qubits at her disposal, and the ability to apply noiseless quantum gates that act jointly on the ancilla and probe; she can also perform perfect ancilla measurements, and reset the ancillas after measurement. Furthermore, we assume that a quantum gate or measurement can be executed in an arbitrarily short time (though the Markovian description of the probe’s noise is assumed to be applicable no matter how fast the processing).

Previous studies have shown that whether HL scaling can be achieved by using QEC to protect a noisy probe depends on the algebraic structure of the noise. For example, if the probe is a qubit (two-dimensional quantum system), then HL scaling is possible when detecting a σ z signal in the presence of bit-flip (σ x ) errors25,26,27,28, but not for dephasing (σ z ) noise acting on the probe, even if arbitrary quantum controls and feedback are allowed16. (Here σx,y,z denote the Pauli matrices.) For this example, we say that σ x noise is “perpendicular” to the σ z signal, while σ z noise is “parallel” to the signal. In some previous work on improving metrology using QEC, perpendicular noise has been assumed25,26, but this assumption is not necessary—for a qubit probe, HL scaling is achievable for any noise channel with just one Hermitian jump operator L, except in the case where the signal Hamiltonian H commutes with L37.

In this paper, we extend these results to any finite-dimensional probe, finding the necessary and sufficient condition on the noise for achievability of HL scaling. This condition is formulated as an algebraic relation between the signal Hamiltonian whose coefficient is to be estimated and the Lindblad operators {L k } that appear in the master equation describing the evolution of the probe. We prove that (1) if the signal Hamiltonian can be expressed as a linear combination of the identity operator I, the Lindblad operators L k , their Hermitian conjugates \(L_k^{\mathrm{\dagger }}\) and the products \(L_k^{\mathrm{\dagger }}L_j\) for all k, j, then SQL scaling cannot be surpassed. (2) Otherwise HL scaling is achievable by using a QEC code such that the effective “logical” evolution of the probe is noiseless and unitary. Notably, under the assumptions considered here, either SQL scaling cannot be surpassed or HL scaling is achievable via quantum coding; in contrast, intermediate scaling is possible in some other metrology scenarios19. For the case where our sufficient condition is satisfied, we explicitly construct a QEC code that achieves HL scaling. Furthermore, we show that searching for the QEC code that achieves optimal precision can be formulated as a semidefinite program (SDP) that can be efficiently solved numerically, and can be solved analytically in some special cases. Our sufficient condition cannot be satisfied if the noise channel is full rank, and is therefore not applicable for generic noise. However, for noise which is \(\epsilon \)-close to meeting our criterion, using the QEC code ensures that HL scaling can be maintained approximately for a time \(O(1{\mathrm{/}}\epsilon )\), before crossing over to asymptotic SQL scaling.

Results

Sequential scheme for quantum metrology

We assume that the probes used for parameter estimation are subject to noise described by a Markovian master equation. In addition to the probe system, the experimentalist also has noiseless ancilla qubits at her disposal. She can apply fast, noiseless quantum gates that act jointly on the ancilla and probe; she can also perform perfect ancilla measurements, and reset the ancillas after measurement.

We endow the experimentalist with these powerful tools because we wish to address, as a matter of principle, how effectively QEC can overcome the deficiencies of the noisy probe system. Our scenario may be of practical interest as well, in hybrid quantum systems where ancillas are available, which have a much longer coherence time than the probe. For example, sensing of a magnetic field with a probe electron spin can be enhanced by using a quantum code, which takes advantage of the long coherence time of a nearby (ancilla) nuclear spin in diamond34. In cases where noise acting on the ancilla is weak but not completely negligible, we may be able to use QEC to enhance the coherence time of the ancilla, thus providing better justification for our idealized setting in which the ancilla is effectively noiseless. Our assumption that quantum processing is much faster than characteristic decoherence rates is necessary for QEC to succeed in quantum computing as well as in quantum metrology, and recent experimental progress indicates that this assumption is applicable in at least some realistic settings. For example, in superconducting devices, QEC has reached the break-even point where the lifetime of an encoded qubit exceeds the natural lifetime of the constituents of the system;38 one- and two-qubit logical operations have also been demonstrated39,40. Moreover, if sensing could be performed using a probe encoded within a noiseless subspace or subsystem41, then active error correction would not be needed to protect the probe, making the QEC scheme more feasible using near-term technology.

In accord with our assumptions, we adopt the sequential scheme for quantum metrology37,42,43 (Fig. 1a). In this scheme, a single noisy probe senses the unknown parameter for many rounds, where each round lasts for a short time interval dt, and the total number of rounds is t/dt, where t is the total sensing time. In between rounds, an arbitrary (noiseless) quantum operation can be applied instantaneously, which acts jointly on the probe and the noiseless ancillas. The rapid operations between rounds empower us to perform QEC, suppressing the damaging effects of the noise on the probe. Note that this sequential scheme can simulate a parallel scheme (Fig. 1b), in which N probes simultaneously sense the parameter for time t/N37,42.

Fig. 1 Metrology schemes and qubit probe. a The sequential scheme. One probe sequentially senses the parameter for time t, with quantum controls applied every dt. b The parallel scheme. N probes sense the parameter for time t/N in parallel. The parallel scheme can be simulated by the sequential scheme. c The relation between the signal Hamiltonian, the noise, and the QEC code on the Bloch sphere for a qubit probe.

Necessary and sufficient condition for HL

We denote the d-dimensional Hilbert space of our probe by \({\cal H}_P\), and we assume the state ρ p of the probe evolves according to a time-homogeneous Lindblad master equation of the form (with ħ = 1)31,35,36,

$$\frac{{d\rho _p}}{{dt}} = - i\left[ {H,\rho _p} \right] + \mathop {\sum}\limits_{k = 1}^r {\kern 1pt} \left( {L_k\rho _pL_k^{\mathrm{\dagger }} - \frac{1}{2}\left\{ {L_k^{\mathrm{\dagger }}L_k,\rho _p} \right\}} \right),$$
(1)

where H is the probe’s Hamiltonian, {L k } are the Lindblad jump operators, and r is the “rank” of the noise channel acting on the probe (the smallest number of Lindblad operators needed to describe the channel). The Hamiltonian H depends on a parameter ω, and our goal is to estimate ω. For simplicity, we will assume that H = ωG is a linear function of ω, but our arguments actually apply more generally. If H(ω) is not a linear function of ω, the coding scheme we describe below can be repeated many times if necessary, using our latest estimate of ω after each round to adjust the scheme used in the next round. By including in the protocol an inverse Hamiltonian evolution step \({\mathrm{exp}}\left( {iH\left( {\hat \omega } \right)dt} \right)\) applied to the probe, where \(\hat \omega \) is the estimated value of ω, we can justify the linear approximation when \(\hat \omega \) is sufficiently accurate. The asymptotic scaling of precision with the total probing time is not affected by the preliminary adaptive rounds44.

We denote by \({\cal H}_A\) the d-dimensional Hilbert space of a noiseless ancilla system, whose evolution is determined solely by our fast and accurate quantum controls. Over the small time interval dt, during which no controls are applied, the ancilla evolves trivially, and the joint state ρ of probe and ancilla evolves according to the quantum channel:

$$\begin{array}{rcl} {\cal E}_{dt}(\rho ) & = & \rho - i\omega \left[ {G,\rho } \right]dt \\ & & + \mathop {\sum}\limits_{k = 1}^r \left( {L_k\rho L_k^{\mathrm{\dagger }} - \frac{1}{2}\left\{ {L_k^{\mathrm{\dagger }}L_k,\rho } \right\}} \right)dt + O\left( {dt^2} \right), \end{array}$$
(2)

where G, L k are shorthand for \(G \otimes I\), \(L_k \otimes I\), respectively. We assume that this time interval dt is sufficiently small that corrections higher order in dt can be neglected. In between rounds of sensing, each lasting for time dt, control operations acting on ρ are applied instantaneously.

Our conclusions about HL and SQL scaling of parameter estimation make use of an algebraic condition on the master equation that we will refer to often, so it is convenient to give it a name. We call it the Hamiltonian-not-in-Lindblad span (HNLS) condition, or simply HNLS. We denote by \({\cal S}\) the linear span of the operators I, \(L_k\), \(L_k^{\mathrm{\dagger }}\), \(L_k^{\mathrm{\dagger }}L_j\) (for all k and j ranging from 1 to r), and say that the Hamiltonian H obeys the HNLS condition if H is not contained in \({\cal S}\). Now we can state our main conclusion about parameter estimation using fast and accurate quantum controls as Theorem 1.

Theorem 1: Consider a finite-dimensional probe with Hamiltonian H = ωG, subject to Markovian noise described by a Lindblad master equation with jump operators {L k }. Then ω can be estimated with HL (Heisenberg-limited) precision if and only if G and {L k } obey the HNLS (Hamiltonian-not-in-Lindblad-span) condition.

Theorem 1 applies if the ancilla is noiseless, and also for an ancilla subject to Markovian noise obeying suitable conditions, as we discuss in the Methods.
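The HNLS condition can be checked numerically for a given G and set of jump operators. The following is a minimal sketch, not a definitive implementation: it assumes small dense matrices, uses the inner product tr(AB) on Hermitian matrices, and the helper names (`hermitian_generators`, `perpendicular_part`, `hnls_holds`) and the tolerance are illustrative choices that do not appear in the paper.

```python
import numpy as np

def dag(A):
    return A.conj().T

def hermitian_generators(lindblad_ops):
    """Hermitian operators whose real span is the Hermitian part of S."""
    d = lindblad_ops[0].shape[0]
    ops = [np.eye(d, dtype=complex)]
    for L in lindblad_ops:
        ops += [L + dag(L), 1j * (L - dag(L))]
    for Lk in lindblad_ops:
        for Lj in lindblad_ops:
            M = dag(Lk) @ Lj
            ops += [M + dag(M), 1j * (M - dag(M))]
    return ops

def perpendicular_part(G, lindblad_ops):
    """Part of Hermitian G orthogonal to S under the inner product tr(AB)."""
    A = np.array([op.reshape(-1) for op in hermitian_generators(lindblad_ops)]).T
    # Stacking real and imaginary parts turns tr(AB) into a real dot product,
    # so a real least-squares fit gives the orthogonal projection onto S.
    A_real = np.vstack([A.real, A.imag])
    g = G.astype(complex).reshape(-1)
    g_real = np.concatenate([g.real, g.imag])
    coeffs, *_ = np.linalg.lstsq(A_real, g_real, rcond=None)
    return G - (A @ coeffs).reshape(G.shape)

def hnls_holds(G, lindblad_ops, tol=1e-9):
    """True if G is NOT in the Lindblad span (tol is an assumed threshold)."""
    return bool(np.linalg.norm(perpendicular_part(G, lindblad_ops)) > tol)

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
sm = np.array([[0, 1], [0, 0]], dtype=complex)  # a non-Hermitian jump operator

G = sz / 2
bit_flip = hnls_holds(G, [sx])   # sigma_x noise, sigma_z signal
dephasing = hnls_holds(G, [sz])  # noise parallel to the signal
damping = hnls_holds(G, [sm])    # non-Hermitian qubit jump operator
print(bit_flip, dephasing, damping)
```

For the qubit cases discussed in the text, the check reports that HNLS holds for bit-flip noise and fails for dephasing noise and for a non-Hermitian jump operator.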

Qubit probe

To illustrate how Theorem 1 works, let’s look at the case where the probe is a qubit, which has been discussed in detail in ref. 37. Suppose one of the Lindblad operators is \(L_1 \propto {\bf{n}} \cdot \sigma\), where \({\bf{n}} = {\bf{n}}_{\bf{r}} + i{\bf{n}}_{\bf{i}}\) is a normalized complex 3-vector and \({\bf{n}}_{\bf{r}}\), \({\bf{n}}_{\bf{i}}\) are its real and imaginary parts, so that \(L_1^{\mathrm{\dagger }}L_1 \propto \left( {{\bf{n}}^ \ast \cdot \sigma } \right)\,\left( {{\bf{n}} \cdot \sigma } \right) = I + 2\left( {{\bf{n}}_{\bf{i}} \times {\bf{n}}_{\bf{r}}} \right) \cdot \sigma\). If \({\bf{n}}_{\bf{r}}\) and \({\bf{n}}_{\bf{i}}\) are not parallel vectors, then \({\bf{n}}_{\bf{r}}\), \({\bf{n}}_{\bf{i}}\), and \({\bf{n}}_{\bf{i}} \times {\bf{n}}_{\bf{r}}\) are linearly independent, which means that I, \(L_1\), \(L_1^{\mathrm{\dagger }}\), and \(L_1^{\mathrm{\dagger }}L_1\) span the four-dimensional space of linear operators acting on the qubit. Hence HNLS cannot be satisfied by any qubit Hamiltonian, and therefore parameter estimation with HL scaling is not possible according to Theorem 1. We conclude that for HL scaling to be achievable, \({\bf{n}}_{\bf{r}}\) and \({\bf{n}}_{\bf{i}}\) must be parallel, which means that (after multiplying \(L_1\) by a phase factor if necessary) we can choose \(L_1\) to be Hermitian37. Moreover, if \(L_1\) and \(L_2\) are two linearly independent Hermitian traceless Lindblad operators, then {I, \(L_1\), \(L_2\), \(L_1L_2\)} span the space of qubit linear operators and HL scaling cannot be achieved. In fact, for a qubit probe, HNLS can be satisfied only if there is a single Hermitian (not necessarily traceless) Lindblad operator L, and the Hamiltonian does not commute with L.
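The product formula quoted above follows from the standard Pauli identity, which also holds for complex vectors a and b when the dot product is taken without complex conjugation. A short worked derivation, using only the normalization of n stated in the text:

$$\left( {{\bf{a}} \cdot \sigma } \right)\left( {{\bf{b}} \cdot \sigma } \right) = \left( {{\bf{a}} \cdot {\bf{b}}} \right)I + i\left( {{\bf{a}} \times {\bf{b}}} \right) \cdot \sigma ,$$

$${\bf{n}}^ \ast \cdot {\bf{n}} = \left| {{\bf{n}}_{\bf{r}}} \right|^2 + \left| {{\bf{n}}_{\bf{i}}} \right|^2 = 1,\quad {\bf{n}}^ \ast \times {\bf{n}} = \left( {{\bf{n}}_{\bf{r}} - i{\bf{n}}_{\bf{i}}} \right) \times \left( {{\bf{n}}_{\bf{r}} + i{\bf{n}}_{\bf{i}}} \right) = 2i\,{\bf{n}}_{\bf{r}} \times {\bf{n}}_{\bf{i}},$$

$$\left( {{\bf{n}}^ \ast \cdot \sigma } \right)\left( {{\bf{n}} \cdot \sigma } \right) = I - 2\left( {{\bf{n}}_{\bf{r}} \times {\bf{n}}_{\bf{i}}} \right) \cdot \sigma = I + 2\left( {{\bf{n}}_{\bf{i}} \times {\bf{n}}_{\bf{r}}} \right) \cdot \sigma .$$

The cross term vanishes exactly when the real and imaginary parts of n are parallel, which is the case singled out in the text.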

We will describe below how to achieve HL scaling for any master equation that satisfies HNLS, by constructing a two-dimensional QEC code that protects the probe from the Markovian noise. To see how the code works for a qubit probe, suppose \(G = \frac{1}{2}{\bf{m}} \cdot \sigma\) and \(L \propto {\bf{n}} \cdot \sigma\), where m and n are unit vectors in \({\Bbb R}^3\) (Fig. 1c). Then the basis vectors for the QEC code may be chosen to be:

$$\left| {C_0} \right\rangle = \left| {{\bf{m}}_ \bot , + } \right\rangle _P \otimes \left| 0 \right\rangle _A,\quad \left| {C_1} \right\rangle = \left| {{\bf{m}}_ \bot , - } \right\rangle _P \otimes \left| 1 \right\rangle _A;$$
(3)

here \(\left| 0 \right\rangle _A\), \(\left| 1 \right\rangle _A\) are basis states for the ancilla qubit, and \(\left| {{\bf{m}}_ \bot , \pm } \right\rangle _P\) are the eigenstates with eigenvalues ±1 of \({\bf{m}}_ \bot \cdot \sigma\), where \({\bf{m}}_ \bot \) is the (normalized) component of m perpendicular to n. In particular, if \({\bf{m}} \bot {\bf{n}}\) (perpendicular noise), then \(\left| {C_0} \right\rangle = \left| {{\bf{m}}, + } \right\rangle _P \otimes \left| 0 \right\rangle _A\) and \(\left| {C_1} \right\rangle = \left| {{\bf{m}}, - } \right\rangle _P \otimes \left| 1 \right\rangle _A\), the coding scheme previously discussed in refs. 25,26,27,28.

In the case of perpendicular noise, we estimate ω by tracking the evolution in the code space of a state initially prepared as (in a streamlined notation) \(\left| {\psi \left( 0 \right)} \right\rangle = \left( {\left| { + ,0} \right\rangle + \left| { - ,1} \right\rangle } \right){\mathrm{/}}\sqrt 2 \); neglecting the noise, this state evolves in time t to

$$\left| {\psi \left( t \right)} \right\rangle = \frac{1}{{\sqrt 2 }}\left( {e^{ - i\omega t{\mathrm{/}}2}\left| { + ,0} \right\rangle + e^{i\omega t{\mathrm{/}}2}\left| { - ,1} \right\rangle } \right).$$
(4)

If a jump then occurs at time t, the state is transformed to

$$\left| {\psi \prime \left( t \right)} \right\rangle = \frac{1}{{\sqrt 2 }}\left( {e^{ - i\omega t{\mathrm{/}}2}\left| { - ,0} \right\rangle + e^{i\omega t{\mathrm{/}}2}\left| { + ,1} \right\rangle } \right).$$
(5)

Jumps are detected by performing a two-outcome measurement that projects onto either the span of {|+, 0〉, |−, 1〉} (the code space) or the span of {|−, 0〉, |+, 1〉} (orthogonal to the code space), and when detected they are immediately corrected by flipping the probe. Because errors are immediately corrected, the error-corrected evolution matches perfectly the ideal evolution (without noise), for which HL scaling is possible.
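The detect-and-correct cycle for perpendicular noise can be traced in a few lines. This is an illustrative sketch under stated assumptions: the signal is taken along z (G = σz/2) and the jump operator along x, so that |±〉 are the σz eigenstates; the values of ω and t and the function names are arbitrary choices, not taken from the paper.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
ket0 = np.array([1, 0], dtype=complex)  # |+> of sigma_z, also ancilla |0>
ket1 = np.array([0, 1], dtype=complex)  # |-> of sigma_z, also ancilla |1>

# Code basis |+,0> and |-,1> (probe tensor ancilla) and the code projector.
C0 = np.kron(ket0, ket0)
C1 = np.kron(ket1, ket1)
P_code = np.outer(C0, C0.conj()) + np.outer(C1, C1.conj())

def evolve(state, omega, t):
    """Apply exp(-i*omega*t*sigma_z/2) to the probe; the ancilla is idle."""
    U_probe = np.diag(np.exp(-1j * omega * t / 2 * np.array([1, -1])))
    return np.kron(U_probe, I2) @ state

def detect_and_correct(state):
    """Two-outcome syndrome check; flip the probe if outside the code space."""
    p_code = np.real(np.vdot(state, P_code @ state))
    if p_code < 0.5:  # in this idealized example p_code is either 0 or 1
        state = np.kron(sx, I2) @ state
    return state, p_code

omega, t = 0.7, 3.0                      # assumed illustrative values
psi0 = (C0 + C1) / np.sqrt(2)
ideal = evolve(psi0, omega, t)           # Eq. (4)
jumped = np.kron(sx, I2) @ ideal         # Eq. (5): a sigma_x jump at time t
recovered, p_code = detect_and_correct(jumped)
fidelity = abs(np.vdot(ideal, recovered)) ** 2
print(p_code, fidelity)
```

Because the jump moves the state entirely out of the code space without disturbing the relative phase, the recovery restores the ideal state, phase included.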

When the noise is not perpendicular to the signal, not just the jumps but also the Hamiltonian evolution can rotate the joint state of probe and ancilla away from the code space. However, after evolution for the short time interval dt, the overlap with the code space remains large, so that the projection onto the code space succeeds with probability \(1 - O(dt^2)\). Neglecting \(O(dt^2)\) corrections, then, the joint probe-ancilla state rotates noiselessly in the code space, at a rate determined by the component of the Hamiltonian evolution along the code space. As long as this component is nonzero, HL scaling can be achieved.
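The quadratic scaling of the leakage can be checked numerically. The sketch below isolates the Hamiltonian contribution only (jumps are ignored) and assumes, purely for illustration, noise along x and a signal direction m tilted by an angle θ from z, so that the component of m perpendicular to the noise lies along z. The values of θ and ω are arbitrary.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)

theta, omega = 0.6, 1.3  # assumed illustrative values
m_sigma = np.cos(theta) * sz + np.sin(theta) * sx  # m . sigma, with G = m.sigma/2

# Code built from the eigenstates of m_perp . sigma = sigma_z.
C0 = np.kron(ket0, ket0)
C1 = np.kron(ket1, ket1)
P_code = np.outer(C0, C0.conj()) + np.outer(C1, C1.conj())
psi0 = (C0 + C1) / np.sqrt(2)

def leakage(dt):
    """Probability of leaving the code space after Hamiltonian evolution for dt."""
    # exp(-i*omega*dt*G) in closed form, valid because (m.sigma)^2 = I.
    a = omega * dt / 2
    U = np.cos(a) * I2 - 1j * np.sin(a) * m_sigma
    psi = np.kron(U, I2) @ psi0
    outside = psi - P_code @ psi
    return np.real(np.vdot(outside, outside))

# Halving dt should reduce the leakage by roughly a factor of four.
ratios = [leakage(dt) / leakage(dt / 2) for dt in (1e-2, 1e-3)]
print(ratios)
```

A ratio close to four under halving of dt is the signature of a leakage probability proportional to dt², consistent with the statement in the text.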

We will see that this reasoning can be extended to any finite-dimensional probe satisfying HNLS, including quantum many-body systems and (appropriately truncated) bosonic channels. Here we briefly mention a few other cases where HNLS applies, and therefore HL scaling is achievable. (1) For a many-qubit system, suppose that each Lindblad jump operator L k is supported on no more than t qubits (hence each \(L_k^{\mathrm{\dagger }}L_j\) is supported on no more than 2t qubits), and the Hamiltonian contains at least one term acting on at least 2t + 1 qubits. Then HNLS holds. (2) Consider a d-dimensional system (a qudit), and define generalized Pauli operators

$$X = \mathop {\sum}\limits_{k = 0}^{d - 1} {\kern 1pt} \left| {k + 1} \right\rangle \left\langle k \right|,\quad Z = \mathop {\sum}\limits_{k = 0}^{d - 1} {\kern 1pt} e^{2\pi ik{\mathrm{/}}d}\left| k \right\rangle \left\langle k \right|,$$
(6)

(where addition is modulo d). Suppose that the Hamiltonian H(Z) is a non-constant function of Z and that there is a single Lindblad jump operator L(X) which is a function of X. Then HNLS holds. HNLS may also apply for a multi-qubit sensor with qubits at distinct spatial positions, where the signal and noise are parallel for each individual qubit, but the signal and noise depend on position in different ways45.
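The qudit example can be verified directly for a small dimension. In this sketch the choices d = 3, H(Z) = Z + Z† and L(X) = X are assumptions made for illustration, and `in_lindblad_span` is a hypothetical helper that tests span membership by comparing matrix ranks.

```python
import numpy as np

def generalized_paulis(d):
    """Shift and clock operators of Eq. (6)."""
    X = np.roll(np.eye(d, dtype=complex), 1, axis=0)  # X|k> = |k+1 mod d>
    Z = np.diag(np.exp(2j * np.pi * np.arange(d) / d))
    return X, Z

def in_lindblad_span(G, lindblad_ops):
    """True if G is a linear combination of I, L_k, L_k^dagger, L_k^dagger L_j."""
    d = G.shape[0]
    ops = [np.eye(d, dtype=complex)]
    for L in lindblad_ops:
        ops += [L, L.conj().T]
    ops += [Lk.conj().T @ Lj for Lk in lindblad_ops for Lj in lindblad_ops]
    S = np.array([op.reshape(-1) for op in ops])
    extended = np.vstack([S, G.reshape(1, -1)])
    # G is in the span exactly when appending it does not raise the rank.
    return bool(np.linalg.matrix_rank(extended) == np.linalg.matrix_rank(S))

X, Z = generalized_paulis(3)
H_of_Z = Z + Z.conj().T          # a non-constant Hermitian function of Z
signal_in_span = in_lindblad_span(H_of_Z, [X])
noise_in_span = in_lindblad_span(X + X.conj().T, [X])
print(signal_in_span, noise_in_span)
```

Here the span contains only I, X and X†, since X†X = I, so a diagonal non-constant H(Z) lies outside it and HNLS holds.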

We must explain how, when HNLS holds, a quantum code can be constructed that achieves HL scaling. But first we will discuss why HL is impossible when HNLS fails.

Non-achievability of HL when HNLS fails

The necessary condition for HL scaling can be derived from the quantum Cramér–Rao bound46,47,48

$$\delta \hat \omega \ge 1{\mathrm{/}}\sqrt {R \cdot {\cal F}\left( {\rho _\omega (t)} \right)} ;$$
(7)

here \(\hat \omega \) denotes any unbiased estimator for the parameter ω, and \(\delta \hat \omega \) is that estimator’s standard deviation. \({\cal F}\left( {\rho _\omega (t)} \right)\) is the quantum Fisher information (QFI) of the state ρ ω (t); this state is obtained by preparing an initial state ρin of the probe, and then evolving this state for total time t, where the evolution is governed by the ω-dependent probe Hamiltonian H(ω), the Markovian noise acting on the probe, and our fast quantum controls. For a scheme in which the measurement protocol is repeated many times in succession, R denotes the number of such repetitions. Here we show that \({\cal F}\left( {\rho _\omega (t)} \right)\) is at most asymptotically linear in t when the Hamiltonian H(ω) is contained in the linear span (denoted \({\cal S}\)) of I, L k , \(L_k^{\mathrm{\dagger }}\), and \(L_k^{\mathrm{\dagger }}L_j\), which means that SQL scaling cannot be surpassed in this case.

Though it is challenging to compute the maximum attainable QFI for arbitrary quantum channels, useful upper bounds on QFI can be derived, which provide lower bounds on the precision of quantum metrology15,16,17,18,37,42,49. The quantum channel describing the joint evolution of probe and ancilla has a Kraus operator representation

$${\cal E}_{dt}\left( \rho \right) = \mathop {\sum}\limits_k {\kern 1pt} K_k\rho K_k^\dagger ,$$
(8)

and in terms of these Kraus operators we define

$$\alpha _{dt} = \mathop {\sum}\limits_k {\dot K_k^\dagger \dot K_k} = {\dot{\bf K}}^\dagger {\dot{\bf K}},$$
(9)
$$\beta _{dt} = i\mathop {\sum}\limits_k {\dot K_k^{\mathrm{\dagger }}K_k} = i{\dot{\mathbf K}}^\dagger {\bf{K}},$$
(10)

where we express the Kraus operators in vector notation \({\bf{K}}: = \left( {K_0,K_1, \ldots } \right)^T\), and the over-dot means the derivative with respect to ω. If ρin is the initial joint state of probe and ancilla at time 0, and ρ(t) is the corresponding state at time t, then the upper bound on the QFI

$${\cal F}\left( {\rho (t)} \right) \le 4{\textstyle{t \over {dt}}}\left\| {\alpha _{dt}} \right\| + 4\left( {{\textstyle{t \over {dt}}}} \right)^2\left\| {\beta _{dt}} \right\|\left( {\left\| {\beta _{dt}} \right\| + 2\sqrt {\left\| {\alpha _{dt}} \right\|} } \right)$$
(11)

(\(\left\| \cdot \right\|\) denotes the operator norm) derived by the “channel extension method” holds for any choice of ρin even when fast and accurate quantum controls are applied during the evolution37. This upper bound on the QFI provides a lower bound on the precision \(\delta \hat \omega \) via Eq. (7).

Kraus representations are not unique: for any matrix u (which may depend on ω) satisfying u†u = I, K′ = uK represents the same channel as K. Hence, we can tighten the upper bound on the QFI by minimizing the RHS of Eq. (11) over all such valid Kraus representations. We see that

$${\dot{\bf K}}\prime = u\left( {{\dot{\bf K}} - ih{\bf{K}}} \right),\quad {\dot{\bf K}}\prime ^{\mathrm{\dagger }} = \left( {{\dot{\bf K}} - ih{\bf{K}}} \right)^{\mathrm{\dagger }}u^{\mathrm{\dagger }}$$
(12)

where \(h = iu^{\mathrm{\dagger }}\dot u\). Therefore, to find α dt and β dt providing the tightest upper bound on the QFI, it suffices to replace \({\dot{\bf K}}\) by \({\dot{\bf K}} - ih{\bf{K}}\) and to optimize over the Hermitian matrix h.
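The gauge freedom in the Kraus representation is easy to confirm numerically. The sketch below assumes a single-qubit dephasing channel and a randomly drawn unitary u; both are illustrative choices, as are the function names.

```python
import numpy as np

rng = np.random.default_rng(0)

def apply_channel(kraus, rho):
    """Evaluate sum_k K_k rho K_k^dagger."""
    return sum(K @ rho @ K.conj().T for K in kraus)

def mix_kraus(kraus, u):
    """Form K'_i = sum_j u_ij K_j, i.e. K' = u K in vector notation."""
    return [sum(u[i, j] * kraus[j] for j in range(len(kraus)))
            for i in range(u.shape[0])]

I2 = np.eye(2, dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

p = 0.2  # assumed dephasing probability
kraus = [np.sqrt(1 - p) * I2, np.sqrt(p) * sz]

# A random 2x2 unitary from the QR decomposition of a complex Gaussian matrix.
u, _ = np.linalg.qr(rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2)))
kraus_mixed = mix_kraus(kraus, u)

rho = np.array([[0.6, 0.3 - 0.1j], [0.3 + 0.1j, 0.4]])
out_original = apply_channel(kraus, rho)
out_mixed = apply_channel(kraus_mixed, rho)
print(np.allclose(out_original, out_mixed))
```

The channel is unchanged, while the individual Kraus operators and hence the quantities in Eqs. (9) and (10) generally differ between representations, which is what makes the minimization over h worthwhile.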

To evaluate the bound for asymptotically large t, we expand α dt , β dt , h in powers of \(\sqrt {dt} \):

$$\alpha _{dt} = \alpha ^{(0)} + \alpha ^{(1)}\sqrt {dt} + \alpha ^{(2)}dt + O\left( {dt^{3/2}} \right),$$
(13)
$$\beta _{dt} = \beta ^{(0)} + \beta ^{(1)}\sqrt {dt} + \beta ^{(2)}dt + \beta ^{(3)}dt^{3/2} + O\left( {dt^2} \right),$$
(14)
$$h = h^{(0)} + h^{(1)}\sqrt {dt} + h^{(2)}dt + h^{(3)}dt^{3/2} + O\left( {dt^2} \right).$$
(15)

We show in the Methods that the first two terms in \(\alpha _{dt}\) and the first four terms in \(\beta _{dt}\) can all be set to 0 by choosing a suitable h, assuming that HNLS is violated. We therefore have \(\alpha _{dt} = O(dt)\) and \(\beta _{dt} = O(dt^2)\), so that the second term in the RHS of Eq. (11) vanishes as dt → 0:

$${\cal F}\left( {\rho (t)} \right) \le 4\left\| {\alpha ^{(2)}} \right\|t,$$
(16)

proving that SQL scaling cannot be surpassed when HNLS is violated (the necessary condition in Theorem 1). We require the probe to be finite dimensional in the statement of Theorem 1 because otherwise the norm of α dt or β dt could be infinite. The theorem can be applied to the case of a probe with an infinite-dimensional Hilbert space if the state of the probe is confined to a finite-dimensional subspace even for asymptotically large t.

QEC code for HL scaling when HNLS holds

To prove the sufficient condition for HL scaling, we show that a QEC code achieving HL scaling can be explicitly constructed if H(ω) is not in the linear span \({\cal S}\). Our discussion of the qubit probe indicates how a QEC code can be used to achieve HL scaling for estimating the parameter ω. The code allows us to correct quantum jumps whenever they occur, and in addition the noiseless error-corrected evolution in the code space depends nontrivially on ω. Similar considerations apply to higher-dimensional probes. Let Π C denote the projection onto the code space. Jumps are correctable if the code satisfies the error correction conditions31,32,33, namely:

$$\left[ 1 \right]\,{\mathrm{{\Pi}}}_CL_k{\mathrm{{\Pi}}}_C = \lambda _k{\mathrm{{\Pi}}}_C,\;\forall k,$$
(17)
$$\left[ 2 \right]\,{\mathrm{{\Pi}}}_CL_k^{\mathrm{\dagger }}L_j{\mathrm{{\Pi}}}_C = \mu _{kj}{\mathrm{{\Pi}}}_C,\;\forall k,j,$$
(18)

for some complex numbers λ k and μ kj . The error-corrected joint state of probe and ancilla evolves according to the unitary channel (asymptotically)

$$\frac{{d\rho }}{{dt}} = - i\left[ {H_{{\mathrm{eff}}},\rho } \right]$$
(19)

where Heff = Π C HΠ C  = ωGeff. There is a code state for which the evolution depends nontrivially on ω provided that

$$\left[ 3 \right]\,{\mathrm{{\Pi}}}_CG{\mathrm{{\Pi}}}_C \, \ne \,{\mathrm{constant}} \,{\mathrm{{\Pi}}}_C.$$
(20)

For this noiseless evolution with effective Hamiltonian ωGeff, the QFI of the encoded state at time t is

$${\cal F}\left( {\rho (t)} \right) = 4t^2\left[ {{\mathrm{tr}}\left( {\rho _{{\mathrm{in}}}G_{{\mathrm{eff}}}^2} \right) - \left( {{\mathrm{tr}}\left( {\rho _{{\mathrm{in}}}G_{{\mathrm{eff}}}} \right)} \right)^2} \right],$$
(21)

where ρin is the initial state at time t = 0. The QFI is maximized by choosing the initial pure state

$$\left| {\psi _{{\mathrm{in}}}} \right\rangle = \frac{1}{{\sqrt 2 }}\left( {\left| {\lambda _{{\mathrm{min}}}} \right\rangle + \left| {\lambda _{{\mathrm{max}}}} \right\rangle } \right),$$
(22)

where \(\left| {\lambda _{{\mathrm{min}}}} \right\rangle \), \(\left| {\lambda _{{\mathrm{max}}}} \right\rangle \) are the eigenstates of Geff with the minimal and maximal eigenvalues; with this choice the QFI is

$${\cal F}\left( {\rho (t)} \right) = t^2\left( {\lambda _{\max } - \lambda _{{\mathrm{min}}}} \right)^2.$$
(23)

By measuring in the appropriate basis at time t, we can estimate ω with a precision that saturates the Cramér–Rao bound in the asymptotic limit of a large number of measurements, hence realizing HL scaling.
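Equations (21)–(23) can be checked for a small example. This sketch assumes a pure initial state, for which the QFI under noiseless evolution equals 4t² times the variance of the generator; the 3×3 diagonal generator standing in for Geff, the time t, and the function names are hypothetical.

```python
import numpy as np

def qfi_unitary(psi, G, t):
    """QFI of exp(-i*omega*t*G)|psi> with respect to omega, pure-state case."""
    mean = np.real(np.vdot(psi, G @ psi))
    mean_sq = np.real(np.vdot(psi, G @ G @ psi))
    return 4 * t**2 * (mean_sq - mean**2)

def optimal_probe(G):
    """Equal superposition of the extremal eigenvectors of G, as in Eq. (22)."""
    evals, evecs = np.linalg.eigh(G)  # eigenvalues in ascending order
    psi = (evecs[:, 0] + evecs[:, -1]) / np.sqrt(2)
    return psi, evals[-1] - evals[0]

G_eff = np.diag([0.3, -1.2, 2.0]).astype(complex)  # assumed example generator
t = 5.0
psi_opt, gap = optimal_probe(G_eff)
qfi_opt = qfi_unitary(psi_opt, G_eff, t)

# For comparison: a superposition that misses the largest eigenvalue.
psi_other = np.array([1, 1, 0], dtype=complex) / np.sqrt(2)
qfi_other = qfi_unitary(psi_other, G_eff, t)
print(qfi_opt, (t * gap) ** 2, qfi_other)
```

The optimal state reproduces Eq. (23), and any state that does not use both extremal eigenvectors gives a smaller QFI with the same t² scaling.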

To prove the sufficient condition in Theorem 1, we will now show that a code with properties (1)–(3) can be constructed whenever HNLS is satisfied. (For further justification of these conditions see the Methods.) In this code construction we make use of a noiseless ancilla system, but as we discuss in the Methods, the construction can be extended to the case where the ancilla system is subject to Markovian noise obeying suitable conditions.

To see how the code is constructed, note that the d-dimensional Hermitian matrices form a real Hilbert space where the inner product of two matrices A and B is defined to be tr(AB). Let \({\cal S}\) denote the subspace of Hermitian matrices spanned by I, \(L_k + L_k^{\mathrm{\dagger }}\), \(i\left( {L_k - L_k^{\mathrm{\dagger }}} \right)\), \(L_k^{\mathrm{\dagger }}L_j + L_j^{\mathrm{\dagger }}L_k\), and \(i\left( {L_k^{\mathrm{\dagger }}L_j - L_j^{\mathrm{\dagger }}L_k} \right)\) for all k, j. Then G has a unique decomposition into \(G = G_\parallel + G_ \bot \), where \(G_\parallel \in {\cal S}\) and \(G_ \bot \bot {\cal S}\).

If HNLS holds, then \(G_ \bot \) is nonzero. It must also be traceless, in order to be orthogonal to I, which is contained in \({\cal S}\). Therefore, using the spectral decomposition, we can write \(G_ \bot = \frac{1}{2}\left( {{\mathrm{tr}}\left| {G_ \bot } \right|} \right)\,(\rho _0 - \rho _1)\), where ρ0 and ρ1 are trace-one positive matrices with orthogonal support and \(\left| {G_ \bot } \right|: = \sqrt {G_ \bot ^2} \). Our QEC code is chosen to be the two-dimensional subspace of \({\cal H}_P \otimes {\cal H}_A\) spanned by |C0〉 and |C1〉, which are normalized purifications of ρ0 and ρ1 respectively, with orthogonal support in \({\cal H}_A\). (If the probe is d-dimensional, a d-dimensional ancilla can purify its state.) Because the code basis states have orthogonal support on \({\cal H}_A\), it follows that, for any O acting on \({\cal H}_P\),

$$\left\langle {C_0} \right|O \otimes I\left| {C_1} \right\rangle = 0 = \left\langle {C_1} \right|O \otimes I\left| {C_0} \right\rangle ,$$
(24)

and furthermore

$$\begin{array}{rcl} {\mathrm{tr}}\left( {\left( {\left| {C_0} \right\rangle \left\langle {C_0} \right| - \left| {C_1} \right\rangle \left\langle {C_1} \right|} \right)\left( {O \otimes I} \right)} \right) & = & {\mathrm{tr}}\left( {\left( {\rho _0 - \rho _1} \right)O} \right) \\ & = & \frac{{2{\kern 1pt} {\mathrm{tr}}\left( {G_ \bot O} \right)}}{{{\mathrm{tr}}\left| {G_ \bot } \right|}}. \end{array}$$
(25)

In particular, for any O in the span \({\cal S}\) we have \({\mathrm{tr}}\left( {G_ \bot O} \right) = 0\), and therefore

$$\left\langle {C_0} \right|\left( {O \otimes I} \right)\left| {C_0} \right\rangle = \left\langle {C_1} \right|\left( {O \otimes I} \right)\left| {C_1} \right\rangle .$$
(26)

Code properties (1)–(3) now follow from Eqs. (24) and (26). For this two-dimensional code, the projector onto the code space is

$${\mathrm{{\Pi}}}_C = \left| {C_0} \right\rangle \left\langle {C_0} \right| + \left| {C_1} \right\rangle \left\langle {C_1} \right|,$$
(27)

and therefore

$${\mathrm{{\Pi}}}_C\left( {O \otimes I} \right){\mathrm{{\Pi}}}_C = \left\langle {C_0} \right|\left( {O \otimes I} \right)\left| {C_0} \right\rangle {\mathrm{{\Pi}}}_C$$
(28)

for \(O \in {\cal S}\), which implies properties (1) and (2) because \(L_k\) and \(L_k^{\mathrm{\dagger }}L_j\) are in \({\cal S}\). Property (3) is also satisfied by the code, because \(\left\langle {C_0\left| G \right|C_0} \right\rangle - \left\langle {C_1\left| G \right|C_1} \right\rangle = 2{\kern 1pt} {\mathrm{tr}}\left( {G_ \bot ^2} \right){\mathrm{/tr}}\left| {G_ \bot } \right| > 0\), which means that the two diagonal elements of \({\mathrm{\Pi }}_CG{\mathrm{\Pi }}_C\) in the code basis are not equal. Thus, we have demonstrated the existence of a code with properties (1)–(3).
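The construction above can be followed step by step in code: project G onto the orthogonal complement of the span, split the result into positive and negative parts, and purify each part with orthogonal ancilla states. The sketch below is a minimal illustration under assumptions: dense matrices, a d-dimensional ancilla, and a qubit test case with noise along x and a tilted signal. The names `perpendicular_part`, `build_code` and `code_elements` are hypothetical.

```python
import numpy as np

def dag(A):
    return A.conj().T

def perpendicular_part(G, lindblad_ops):
    """Component of G orthogonal to the Hermitian span S under tr(AB)."""
    d = G.shape[0]
    ops = [np.eye(d, dtype=complex)]
    for L in lindblad_ops:
        ops += [L + dag(L), 1j * (L - dag(L))]
    for Lk in lindblad_ops:
        for Lj in lindblad_ops:
            M = dag(Lk) @ Lj
            ops += [M + dag(M), 1j * (M - dag(M))]
    A = np.array([op.reshape(-1) for op in ops]).T
    g = G.astype(complex).reshape(-1)
    coeffs, *_ = np.linalg.lstsq(np.vstack([A.real, A.imag]),
                                 np.concatenate([g.real, g.imag]), rcond=None)
    return G - (A @ coeffs).reshape(d, d)

def build_code(G, lindblad_ops, tol=1e-10):
    """Return code states |C0>, |C1> on probe (x) ancilla, and G_perp."""
    d = G.shape[0]
    G_perp = perpendicular_part(G, lindblad_ops)
    if np.linalg.norm(G_perp) < tol:
        raise ValueError("HNLS fails: G lies in the Lindblad span")
    G_perp = (G_perp + dag(G_perp)) / 2        # remove rounding asymmetry
    evals, evecs = np.linalg.eigh(G_perp)
    half_trace_norm = evals[evals > 0].sum()   # equals tr|G_perp| / 2
    C0 = np.zeros(d * d, dtype=complex)
    C1 = np.zeros(d * d, dtype=complex)
    for i in range(d):
        anc = np.zeros(d, dtype=complex)
        anc[i] = 1                             # distinct ancilla state per eigenvector
        term = np.sqrt(abs(evals[i]) / half_trace_norm) * np.kron(evecs[:, i], anc)
        if evals[i] > tol:
            C0 += term                         # purifies rho_0 (positive part)
        elif evals[i] < -tol:
            C1 += term                         # purifies rho_1 (negative part)
    return C0, C1, G_perp

def code_elements(C0, C1, op):
    """(<C0|O|C0> - <C1|O|C1>, <C0|O|C1>) for O acting on the probe only."""
    big = np.kron(op, np.eye(op.shape[0]))
    return np.vdot(C0, big @ C0) - np.vdot(C1, big @ C1), np.vdot(C0, big @ C1)

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
theta = 0.4                                    # assumed tilt of the signal
G = 0.5 * (np.cos(theta) * sz + np.sin(theta) * sx)
C0, C1, G_perp = build_code(G, [sx])
diag_L, off_L = code_elements(C0, C1, sx)      # property (1)
diag_LL, off_LL = code_elements(C0, C1, sx @ sx)  # property (2)
gap, off_G = code_elements(C0, C1, G)          # property (3)
print(abs(diag_L), abs(off_L), gap.real)
```

In this example the jump operator has equal diagonal and vanishing off-diagonal elements in the code basis, while G does not, and the difference of its diagonal elements matches 2 tr(G⊥²)/tr|G⊥|.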

Code optimization

When HNLS is satisfied, we can use our QEC code, along with fast and accurate quantum control, to achieve noiseless evolution of the error-corrected probe, governed by the effective Hamiltonian Heff = Π C HΠ C  = ωGeff where Π C is the orthogonal projection onto the code space. Because the optimal initial state Eq. (22) is a superposition of just two eigenstates of Geff, a two-dimensional QEC code suffices for achieving the best possible precision. For a code with basis states {|C0〉, |C1〉}, the effective Hamiltonian is

$$G_{{\mathrm{eff}}} = \left| {C_0} \right\rangle \left\langle {C_0} \right|G_ \bot \left| {C_0} \right\rangle \left\langle {C_0} \right| + \left| {C_1} \right\rangle \left\langle {C_1} \right|G_ \bot \left| {C_1} \right\rangle \left\langle {C_1} \right|;$$
(29)

here we have ignored the contribution due to \(G_\parallel \), which is an irrelevant additive constant if the code satisfies conditions (1) and (2). We have seen how to construct a code for which

$$\lambda _{{\mathrm{max}}} - \lambda _{{\mathrm{min}}} = 2\frac{{{\mathrm{tr}}\left( {G_ \bot ^2} \right)}}{{{\mathrm{tr}}\left| {G_ \bot } \right|}}.$$
(30)

It is possible, though, that a larger value of this difference of eigenvalues could be achieved using a different code, improving the precision by a constant factor (independent of the time t).

To search for a better code, with basis states {|C0〉, |C1〉}, define

$$\tilde \rho _0 = {\mathrm{tr}}_A\left( {\left| {C_0} \right\rangle \left\langle {C_0} \right|} \right),\quad \tilde \rho _1 = {\mathrm{tr}}_A\left( {\left| {C_1} \right\rangle \left\langle {C_1} \right|} \right),$$
(31)

and consider

$$\tilde G = \tilde \rho _0 - \tilde \rho _1.$$
(32)

Conditions (1)–(2) on the code imply

$${\mathrm{tr}}\left( {\tilde GO} \right) = 0,\;\forall O \in {\cal S},$$
(33)

and we want to maximize

$$\lambda _{{\mathrm{max}}} - \lambda _{{\mathrm{min}}} = {\mathrm{tr}}\left( {G_{{\mathrm{eff}}}\tilde G} \right) = {\mathrm{tr}}\left( {G_ \bot \tilde G} \right),$$
(34)

over matrices \(\tilde G\) of the form Eq. (32) subject to Eq. (33). Note that \(\tilde G\) is the difference of two normalized density operators, and therefore satisfies \({\mathrm{tr}}\left| {\tilde G} \right| \le 2\). In fact, though, if \(\tilde G\) obeys the constraint Eq. (33), then the constraint is still satisfied if we rescale \(\tilde G\) by a real constant greater than one, which increases \({\mathrm{tr}}\left( {G_ \bot \tilde G} \right)\); hence the maximum of \({\mathrm{tr}}\left( {G_ \bot \tilde G} \right)\) is achieved for \({\mathrm{tr}}\left| {\tilde G} \right| = 2\), which means that \(\tilde \rho _0\) and \(\tilde \rho _1\) have orthogonal support.

Now recall that \(G_ \bot = \frac{1}{2}\left( {{\mathrm{tr}}\left| {G_ \bot } \right|} \right)\,\left( {\rho _0 - \rho _1} \right)\) is also (up to normalization) a difference of density operators with orthogonal support, and obeys the constraint Eq. (33). The quantity to be maximized is proportional to

$${\mathrm{tr}}\left[ {\left( {\rho _0 - \rho _1} \right)\,\left( {\tilde \rho _0 - \tilde \rho _1} \right)} \right] = {\mathrm{tr}}\left( {\rho _0\tilde \rho _0 + \rho _1\tilde \rho _1 - \rho _0\tilde \rho _1 - \rho _1\tilde \rho _0} \right).$$
(35)

If ρ0 and ρ1 are both rank 1, then the maximum is achieved by choosing \(\tilde \rho _0 = \rho _0\) and \(\tilde \rho _1 = \rho _1\). Conditions (1)–(2) are satisfied by choosing |C0〉 and |C1〉 to be purifications of ρ0 and ρ1 with orthogonal support on \({\cal H}_A\). Thus, we have recovered the code we constructed previously. If ρ0 or ρ1 is higher rank, though, then a different code achieves a higher maximum, and hence better precision for parameter estimation.
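The higher-rank case can be made concrete with a small numerical sketch. It assumes a hypothetical five-level probe in which G = diag(0, 1, 4, 9, 16) and the only diagonal operators in \({\cal S}\) are I and N = diag(0, 1, 2, 3, 4); these matrices and all function names are illustrative choices, not taken from the text. Because every operator involved is diagonal, each is stored as a list of exact fractions. The sketch evaluates Eq. (30) for the code built from \(G_ \bot \), then exhibits one \(\tilde G\) that obeys Eq. (33) and gives a larger value of Eq. (34).

```python
from fractions import Fraction

def dot(u, v):
    """Hilbert-Schmidt inner product tr(UV) for diagonal operators."""
    return sum(a * b for a, b in zip(u, v))

def perp_component(g, span):
    """Remove from g its projection onto span (exact Gram-Schmidt)."""
    ortho = []
    for v in span:
        w = [Fraction(x) for x in v]
        for e in ortho:
            c = dot(w, e) / dot(e, e)
            w = [wi - c * ei for wi, ei in zip(w, e)]
        if any(w):                      # skip linearly dependent vectors
            ortho.append(w)
    r = [Fraction(x) for x in g]
    for e in ortho:
        c = dot(r, e) / dot(e, e)
        r = [ri - c * ei for ri, ei in zip(r, e)]
    return r

# Hypothetical example data (assumptions for illustration only).
G = [n * n for n in range(5)]
identity = [1] * 5
N = list(range(5))

g_perp = perp_component(G, [identity, N])

# Eq. (30): eigenvalue gap of the code constructed from G_perp.
gap_eq30 = 2 * dot(g_perp, g_perp) / sum(abs(x) for x in g_perp)

# A different difference of density operators with orthogonal support:
# rho0 = (|0><0| + |4><4|)/2 and rho1 = |2><2|.
g_tilde = [Fraction(1, 2), 0, -1, 0, Fraction(1, 2)]
constraints_ok = dot(g_tilde, identity) == 0 and dot(g_tilde, N) == 0
gap_alt = dot(g_perp, g_tilde)          # Eq. (34)
```

In this example the positive and negative parts of `g_perp` have rank 2 and rank 3, so neither \(\rho _0\) nor \(\rho _1\) is rank 1, and the alternative \(\tilde G\) gives a gap of 4 compared with 7/2 from Eq. (30).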

Geometrical picture

There is an alternative description of the code optimization, with a pleasing geometrical interpretation. As discussed in the Methods, the optimization can be formulated as a semidefinite program (SDP) with a feasible dual program. By solving the dual program we find that, for the optimal QEC code, the QFI is

$${\cal F}\left( {\rho (t)} \right) = 4t^2\mathop {{{\mathrm{min}}}}\limits_{\tilde G_\parallel \in {\cal S}} {\kern 1pt} \left\| {G_ \bot - \tilde G_\parallel } \right\|^2,$$
(36)

where \(\left\| \cdot \right\|\) denotes the operator norm. In this sense, the QFI is determined by the minimal distance between \(G_ \bot \) and \({\cal S}\) (Fig. 2b).

Fig. 2

Schematic illustration of HNLS and code optimization. a \(G_ \bot \) is the projection of G onto the orthogonal complement of \({\cal S}\) in the Hilbert space of Hermitian matrices equipped with the Hilbert-Schmidt norm \(\sqrt {{\mathrm{tr}}(O \cdot O)}\). \(G_ \bot \ne 0\) if and only if \(G \notin {\cal S}\), which is the HNLS condition. b \(\tilde G^\diamondsuit \) is the analogous residual when the linear space of Hermitian matrices is instead equipped with the operator norm \(\left\| O \right\| = {\mathrm{max}}_{\left| \psi \right\rangle }{\kern 1pt} \left| {\left\langle \psi \right|O\left| \psi \right\rangle } \right|\): it is G minus its closest element of \({\cal S}\), as in Eq. (37). In general, the optimal QEC code can be constructed from \(\tilde G^\diamondsuit \), and \(\tilde G^\diamondsuit \) is not necessarily equal to \(G_ \bot \).

We can recover the solution to the primal problem from the solution to the dual problem. We denote by \(\tilde G_\parallel ^\diamondsuit \) the choice of \(\tilde G_\parallel \in {\cal S}\) that minimizes Eq. (36), and we define

$$\tilde G^\diamondsuit : = G_ \bot - \tilde G_\parallel ^\diamondsuit .$$
(37)

Then \(\tilde G^ \ast \) that maximizes Eq. (34) has the form

$$\tilde G^ \ast = \tilde \rho _0^\diamondsuit - \tilde \rho _1^\diamondsuit ,$$
(38)

where \(\tilde \rho _{\mathrm{0}}^\diamondsuit \) is a density operator supported on the eigenspace of \(\tilde G^\diamondsuit \) with the maximal eigenvalue, and \(\tilde \rho _{\mathrm{1}}^\diamondsuit \) is a density operator supported on the eigenspace of \(\tilde G^\diamondsuit \) with the minimal eigenvalue. The minimization in Eq. (36) ensures that \(\tilde G^ \ast \) of this form can be chosen to obey the constraint Eq. (33).

In the noiseless case \(\left( {{\cal S} = {\mathrm{span}}\{ I\} } \right)\), the minimum in Eq. (36) occurs when the maximum and minimum eigenvalues of \(G_ \bot - \tilde G_\parallel \) have the same absolute value, and then the operator norm is half the difference of the maximum and minimum eigenvalues of \(G_ \bot \). Hence, we recover the result Eq. (23). When noise is introduced, \({\cal S}\) swells and the minimal distance shrinks, lowering the QFI and reducing the precision of parameter estimation. If HNLS fails, then the minimum distance is zero, and no QEC code can achieve HL scaling, in accord with Theorem 1.

Kerr effect with photon loss

To illustrate how the optimization procedure works, consider a bosonic mode with the nonlinear (Kerr effect50) Hamiltonian

$$H(\omega ) = \omega \left( {a^{\mathrm{\dagger }}a} \right)^2,$$
(39)

where our objective is to estimate ω. In this case, the probe is infinite dimensional, but suppose we assume that the occupation number \(n = a^{\mathrm{\dagger }}a\) is bounded: \(n \le \bar n\), where \(\bar n\) is even. The noise source is photon loss, with Lindblad jump operator \(L \propto a\). Can we find a QEC code that protects the probe against loss and achieves HL scaling for estimation of ω?

To solve the dual program, we find real parameters α, β, γ, δ, which minimize the operator norm of

$$\tilde n^2: = n^2 + \alpha n + \beta a + \gamma a^{\mathrm{\dagger }} + \delta ,$$
(40)

where \(n \le \bar n\). Since \(a\) and \(a^{\mathrm{\dagger }}\) are off-diagonal in the occupation number basis, we should set β and γ to zero for the purpose of minimizing the difference between the largest and smallest eigenvalue of \(\tilde n^2\). After choosing α such that \(\tilde n^2\) is minimized at \(n = \bar n{\mathrm{/}}2\), and choosing δ so that the maximum and minimum eigenvalues of \(\tilde n^2\) are equal in absolute value and opposite in sign, we have

$$\left( {\tilde n^2} \right)^\diamondsuit = \left( {n - \frac{1}{2}\bar n} \right)^2 - \frac{1}{8}\bar n^2,$$
(41)

which has operator norm \(\left\| {\left( {\tilde n^2} \right)^\diamondsuit } \right\| = \bar n^2{\mathrm{/}}8\); hence the optimal QFI after evolution time t is \({\cal F}\left( {\rho (t)} \right) = t^2\bar n^4{\mathrm{/}}16\), according to Eq. (36). For comparison, the minimal operator norm is \(\bar n^2{\mathrm{/}}2\) for a noiseless bosonic mode with \(n \le \bar n\). We see that loss reduces the precision of our estimate of ω, but only by a factor of 4 if we use the optimal QEC code. HL scaling can still be maintained. The scaling \(\delta \hat \omega \sim 1{\mathrm{/}}\bar n^2\) of the optimal precision arises from the nonlinear boson-boson interactions in the Hamiltonian Eq. (39)51.
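The numbers quoted above can be reproduced with a short exact-arithmetic sketch. It assumes, as in the text, that β = γ = 0, so that only the diagonal operator \(n^2 + \alpha n + \delta \) on \(n = 0, \ldots ,\bar n\) matters; the grid scan over α and the function names are illustrative choices rather than part of the derivation.

```python
from fractions import Fraction

def half_spread(alpha, nbar):
    """Operator norm of n^2 + alpha*n + delta on n = 0..nbar for the best shift delta.

    The best delta centres the spectrum, so the norm is half the eigenvalue spread.
    """
    values = [Fraction(n * n) + alpha * n for n in range(nbar + 1)]
    return (max(values) - min(values)) / 2

def kerr_dual_norm(nbar, steps_per_unit=4):
    """Scan alpha over a grid in [-2*nbar, 0]; return (smallest norm, best alpha)."""
    grid = [Fraction(k, steps_per_unit)
            for k in range(-2 * nbar * steps_per_unit, 1)]
    return min((half_spread(a, nbar), a) for a in grid)

nbar = 4
norm, alpha = kerr_dual_norm(nbar)

# Eq. (36): QFI divided by t^2 for the optimal code.
qfi_per_t2 = 4 * norm ** 2

# Noiseless case: S = span{I}, so only the shift delta is available (alpha = 0).
noiseless_norm = half_spread(Fraction(0), nbar)
```

For \(\bar n = 4\) the scan returns α = −4 and norm 2, matching \(\bar n^2{\mathrm{/}}8\), and the noiseless norm is 8, which is larger by the factor of 4 stated above.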

To find the code states, we note that the eigenstate of \(\left( {\tilde n^2} \right)^\diamondsuit \) with the lowest eigenvalue \( - \bar n^2{\mathrm{/}}8\) is \(\left| {n = \bar n{\mathrm{/}}2} \right\rangle \), while the largest eigenvalue \( + \bar n^2{\mathrm{/}}8\) has the two degenerate eigenstates |n = 0〉 and \(\left| {n = \bar n} \right\rangle \). The code condition (2) requires that both code vectors have the same expectation value of \(L^{\mathrm{\dagger }}L \propto n\), and we therefore may choose

$$\left| {C_0} \right\rangle = \left| {\bar n{\mathrm{/}}2} \right\rangle _P \otimes \left| 0 \right\rangle _A,\,\left| {C_1} \right\rangle = \frac{1}{{\sqrt 2 }}\left( {\left| 0 \right\rangle _P + \left| {\bar n} \right\rangle _P} \right) \otimes \left| 1 \right\rangle _A$$
(42)

as the code achieving optimal precision. For \(\bar n \ge 4\), the ancilla may be discarded, and we can use the simpler code

$$\left| {C_0} \right\rangle = \left| {\bar n{\mathrm{/}}2} \right\rangle _P,\,\left| {C_1} \right\rangle = \frac{1}{{\sqrt 2 }}\left( {\left| 0 \right\rangle _P + \left| {\bar n} \right\rangle _P} \right),$$
(43)

which is easier to realize experimentally. Eqs. (17) and (18) are still satisfied without the ancilla, because the states \(\left\{ {\left| {C_0} \right\rangle ,\left| {C_1} \right\rangle ,a\left| {C_0} \right\rangle ,a\left| {C_1} \right\rangle } \right\}\) are all mutually orthogonal. This encoding Eq. (43) belongs to the family of “binomial quantum codes” which, as discussed in ref. 52, can protect against loss of bosonic excitations.
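The orthogonality claim for the ancilla-free code can be checked directly. The following is a minimal sketch in which a state is a dictionary mapping photon number to a real amplitude; this representation and the helper names are assumptions made for illustration.

```python
import math

def lower(state):
    """Apply the annihilation operator a to a Fock superposition {n: amplitude}."""
    return {n - 1: math.sqrt(n) * amp for n, amp in state.items() if n > 0}

def inner(u, v):
    """Inner product <u|v> for real amplitudes."""
    return sum(amp * v.get(n, 0.0) for n, amp in u.items())

def binomial_code(nbar):
    """Code states of Eq. (43): |nbar/2> and (|0> + |nbar>)/sqrt(2)."""
    c0 = {nbar // 2: 1.0}
    c1 = {0: 1 / math.sqrt(2), nbar: 1 / math.sqrt(2)}
    return c0, c1

def satisfies_loss_conditions(nbar, tol=1e-12):
    """True if {C0, C1, aC0, aC1} are mutually orthogonal and <n> agrees on both states."""
    c0, c1 = binomial_code(nbar)
    a0, a1 = lower(c0), lower(c1)
    vectors = [c0, c1, a0, a1]
    orthogonal = all(abs(inner(vectors[i], vectors[j])) < tol
                     for i in range(4) for j in range(i + 1, 4))
    equal_mean_n = abs(inner(a0, a0) - inner(a1, a1)) < tol   # <a^dag a> on each state
    return orthogonal and equal_mean_n

def signal_gap(nbar):
    """Difference of <n^2> between the two code states (condition (3))."""
    c0, c1 = binomial_code(nbar)
    mean_n2 = lambda s: sum(amp * amp * n * n for n, amp in s.items())
    return mean_n2(c1) - mean_n2(c0)
```

With these definitions the check passes for \(\bar n = 4\) and \(\bar n = 6\) and fails for \(\bar n = 2\), where \(a\left| {C_0} \right\rangle \) overlaps \(\left| {C_1} \right\rangle \); the signal gap equals \(\bar n^2{\mathrm{/}}4\), twice the operator norm found above.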

An experimental realization of this coding scheme can be achieved using tools from circuit quantum electrodynamics, by coupling a single transmon qubit to two microwave waveguide resonators. For example, when \(\bar n\) is a multiple of 4, \(\left| {C_0} \right\rangle \) and \(\left| {C_1} \right\rangle \) both have even photon parity while \(a\left| {C_0} \right\rangle \) and \(a\left| {C_1} \right\rangle \) both have odd parity. Then QEC can be carried out by the following procedure: (1) a quantum non-demolition parity measurement is performed to check whether photon loss has occurred38,53. (2) If photon loss is detected, the initial logical encoding is restored using optimal control pulses38,39. (3) If there is no photon loss, the quantum state is projected onto the code space \({\mathrm{span}}\left\{ {\left| {C_0} \right\rangle ,\left| {C_1} \right\rangle } \right\}\)54. The probability of an uncorrectable logical error becomes arbitrarily small if the QEC procedure is sufficiently fast compared to the photon loss rate. Meanwhile, the Kerr signal accumulates coherently in the relative phase of |C0〉 and |C1〉, so that HL scaling can be attained for arbitrarily fast quantum control. For integer values of \(\bar n\) that are not a multiple of 4, coding schemes can still be constructed that protect against photon loss, as described in ref. 52.

Approximate error correction

Generic Markovian noise is full rank, which means that the span \({\cal S}\) contains every Hermitian operator acting on the probe Hilbert space \({\cal H}_P\); hence the HNLS criterion of Theorem 1 is violated for any probe Hamiltonian H(ω), and asymptotic SQL scaling cannot be surpassed. Therefore, for any Markovian noise model that meets the HNLS criterion, the HL scaling achieved by our QEC code is not robust against generic small perturbations of the noise model.

We should therefore emphasize that a substantial improvement in precision can be achieved using a QEC code even in cases where HNLS is violated. Consider in particular a Markovian master equation with Lindblad operators divided into two sets \(\{ L_k\} \) (L-type noise) and \(\{ J_m\} \) (J-type noise), where the J-type noise is parametrically weak, with noise strength

$$\epsilon : = \left\| {\mathop {\sum}\limits_m {\kern 1pt} J_m^{\mathrm{\dagger }}J_m} \right\|$$
(44)

(\(\left\| \cdot \right\|\) denotes the operator norm). If we use the optimal code that protects against L-type noise, then the joint logical state of probe and ancilla evolves according to a modified master equation, with Hamiltonian \(H_{\mathrm{eff}} = {\mathrm{{\Pi}}}_CH{\mathrm{{\Pi}}}_C\), and effective Lindblad operators \(J_{m,\,j}\) acting within the code space, where

$$\left\| {\mathop {\sum}\limits_{m,\,j} {\kern 1pt} J_{m,\,j}^{\mathrm{\dagger }}\, J_{m,\,j}} \right\| \le \epsilon .$$
(45)

(See the Methods for further discussion.)

This means that the state of the error-corrected probe deviates by a distance \(O\left( {\epsilon t} \right)\) (in the L1 norm) from the (effectively noiseless) evolution in the absence of J-type noise. Therefore, using this code, the QFI of the error-corrected probe increases quadratically in time (and the precision \(\delta \hat \omega \) scales like 1/t) up until an evolution time \(t \propto 1{\mathrm{/}}\epsilon \), before crossing over to asymptotic SQL scaling.
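The crossover can be illustrated with a toy model. The sketch below assumes, purely for illustration, that the residual J-type noise acts on the logical qubit as pure dephasing at rate ε, so that the coherence between the two code states decays as \(e^{ - \epsilon t}\) and the QFI of a single run of duration t is \((\Delta t)^2e^{ - 2\epsilon t}\), where Δ is the eigenvalue gap of \(G_{\mathrm{eff}}\). This noise model, the formula, and the function names are assumptions and are not derived in the text.

```python
import math

def single_run_qfi(t, gap, eps):
    """QFI of one run of duration t for a dephased logical qubit (assumed toy model)."""
    return (gap * t) ** 2 * math.exp(-2 * eps * t)

def total_qfi(total_time, t, gap, eps):
    """QFI accumulated from total_time / t independent runs of duration t."""
    return (total_time / t) * single_run_qfi(t, gap, eps)

def best_run_time(gap, eps, total_time=1.0, points=20000):
    """Grid search for the run duration that maximizes the accumulated QFI."""
    t_max = 5.0 / eps
    grid = [t_max * (k + 1) / points for k in range(points)]
    return max(grid, key=lambda t: total_qfi(total_time, t, gap, eps))
```

In this toy model the single-run QFI is close to \((\Delta t)^2\) while \(t \ll 1{\mathrm{/}}\epsilon \), and the best run duration is \(1{\mathrm{/}}(2\epsilon )\); beyond that point the accumulated QFI grows only linearly in the total time, with a prefactor proportional to \(1{\mathrm{/}}\epsilon \).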

Discussion

Noise limits the precision of quantum sensing. Quantum error correction can suppress the damaging effects of noise, thereby improving the fidelity of quantum information processing and quantum communication, but whether QEC improves the efficacy of quantum sensing depends on the structure of the noise and the signal Hamiltonian. Unless suitable conditions are met, the QEC code that tames the noise might obscure the signal as well, nullifying the advantages of QEC.

Our study of quantum sensing using a noisy probe has focused on whether the precision δ of parameter estimation scales asymptotically with the total sensing time t as δ \(\propto\) 1/t (HL) or \(\delta \propto 1{\mathrm{/}}\sqrt t \) (SQL). We have investigated this question in an idealized setting, where the experimentalist has access to noiseless (or correctable) ancillas and can apply quantum controls that are arbitrarily fast and accurate, and we have also assumed that the noise acting on the probe is Markovian. Under these assumptions, we have found the general criterion for HL scaling to be achievable, the HNLS criterion. If HNLS is satisfied, a QEC code can be constructed that achieves HL scaling, and if HNLS is violated, then SQL scaling cannot be surpassed.

In the case where HNLS is satisfied, we have seen that the QEC code achieving the optimal precision can be chosen to be two-dimensional, and we have described an algorithm for constructing this optimal code. The precision attained by this code has a geometrical interpretation in terms of the minimal distance (in the operator norm) of the signal Hamiltonian from the “Lindblad span” \({\cal S}\), the subspace spanned by \(I\), \(L_k\), \(L_k^{\mathrm{\dagger }}\), and \(L_k^{\mathrm{\dagger }}L_j\), where \(\{ L_k\} \) is the set of Lindblad jump operators appearing in the probe’s Markovian master equation.

Many questions merit further investigation. We have focused on the dichotomy of HL vs. SQL scaling, but it is also worthwhile to characterize constant factor improvements in precision that can be achieved using QEC in cases where HNLS is violated55. We should clarify the applications of QEC to sensing when quantum controls have realistic accuracy and speed. Finally, it is interesting to consider probes subject to non-Markovian noise. In that case, tools such as dynamical decoupling56,57,58,59 can mitigate noise, but just as for QEC, we need to balance desirable suppression of the noise against undesirable suppression of the signal in order to formulate the most effective sensing strategy.

Note added: During the preparation of this manuscript, we became aware of related work by Demkowicz-Dobrzański et al.60, which provided a similar proof of the necessary condition in Theorem 1 and an equivalent description of the QEC conditions Eqs. (17), (18), and (20). We and the authors of ref. 60 obtained this result independently. Both our paper and ref. 60 generalize results obtained earlier in ref. 37.

Methods

Linear scaling of the QFI

Here we prove that the QFI scales linearly with the evolution time t in the case where the HNLS condition is violated. We follow the proof in ref. 37, which applies when the probe is a qubit, and generalize their proof to the case where the probe is d-dimensional.

We approximate the quantum channel

$$\begin{array}{*{20}{l}} {{\cal E}_{dt}\left( \rho \right)} \hfill & \hskip-8pt = \hfill &\hskip-7pt {\rho - i\omega \left[ {G,\rho } \right]dt} \hfill \\ {} \hfill & {} \hfill & { + \mathop {\sum}\limits_{k = 1}^r {\kern 1pt} \left( {L_k\rho L_k^{\mathrm{\dagger }} - \frac{1}{2}\left\{ {L_k^{\mathrm{\dagger }}L_k,\rho } \right\}} \right)dt + O\left( {dt^2} \right)} \hfill \end{array}$$
(46)

by the following one:

$$\tilde {\cal E}_{dt}\left( \rho \right) = \mathop {\sum}\limits_{k = 0}^r {\kern 1pt} K_k\rho K_k^{\mathrm{\dagger }},$$
(47)

where \(K_0 = I + \left( { - i\omega G - \frac{1}{2}{\kern 1pt} \mathop {\sum}\nolimits_{k = 1}^r {\kern 1pt} L_k^{\mathrm{\dagger }}L_k} \right)dt\) and \(K_k = L_k\sqrt {dt} \) for k ≥ 1. The approximation is valid because the distance between \({\cal E}_{dt}\) and \(\tilde {\cal E}_{dt}\) is \(O(dt^2)\) and the sensing time is divided into \(\frac{t}{{dt}}\) segments, so the error \(O\left( {\frac{t}{{dt}} \cdot dt^2} \right) = O(tdt)\) introduced by this approximation in calculating the QFI vanishes as \(dt \to 0\). Next, we calculate the operators \(\alpha _{dt} = \left( {{\dot{\bf K}} - ih{\bf{K}}} \right)^{\mathrm{\dagger }}\left( {{\dot{\bf K}} - ih{\bf{K}}} \right)\) and \(\beta _{dt} = i\left( {{\dot{\bf K}} - ih{\bf{K}}} \right)^{\mathrm{\dagger }}{\bf{K}}\) for the channel \(\tilde {\cal E}_{dt}(\rho )\), and expand these operators as a power series in \(\sqrt {dt} \):

$$\alpha _{dt} = \alpha ^{(0)} + \alpha ^{(1)}\sqrt {dt} + \alpha ^{(2)}dt + O\left( {dt^{3{\mathrm{/}}2}} \right),$$
(48)
$$\beta _{dt} = \beta ^{(0)} + \beta ^{(1)}\sqrt {dt} + \beta ^{(2)}dt + \beta ^{(3)}dt^{3/2} + O\left( {dt^2} \right).$$
(49)

We will now search for a Hermitian matrix h that sets low-order terms in each power series to 0.

Expanding h as \(h = h^{(0)} + h^{(1)}\sqrt {dt} + h^{(2)}dt + h^{(3)}dt^{3/2} + O\left( {dt^2} \right)\) in \(\sqrt {dt} \), and using the notation \(\left( {K_0,K_1, \ldots ,K_r} \right)^T = {\bf{K}} = {\bf{K}}^{(0)} + {\bf{K}}^{(1)}dt^{1/2} + {\bf{K}}^{(2)}dt\), we find

$$\begin{array}{*{20}{l}} {\alpha ^{(0)} = {\bf{K}}^{(0){\mathrm{\dagger }}}h^{(0)}h^{(0)}{\bf{K}}^{(0)} = \mathop {\sum}\limits_{k = 0}^r {\kern 1pt} \left| {h_{0k}^{(0)}} \right|^2I = 0} \hfill \\ {\quad \Rightarrow h_{0k}^{(0)} = 0,\;0 \le k \le r.} \hfill \end{array}$$
(50)

Therefore \(h^{(0)}{\bf{K}}^{(0)} = 0\) and \(\alpha ^{(1)} = \beta ^{(0)} = 0\) are automatically satisfied. Then,

$$\beta ^{(1)} = - {\bf{K}}^{(0){\mathrm{\dagger }}}h^{(1)}{\bf{K}}^{(0)} = - h_{00}^{(1)}I = 0 \Rightarrow h_{00}^{(1)} = 0.$$
(51)

and

$$\begin{array}{*{20}{l}} {\beta ^{(2)}} \hfill & \hskip-8pt = \hfill &\hskip-7pt {i\,{\dot{\bf K}}^{(2){\mathrm{\dagger }}}{\bf{K}}^{(0)} - {\bf{K}}^{(1){\mathrm{\dagger }}}h^{(0)}{\bf{K}}^{(1)}} \hfill \\ {} \hfill & {} \hfill & { - {\bf{K}}^{(0){\mathrm{\dagger }}}h^{(1)}{\bf{K}}^{(1)} - {\bf{K}}^{(1){\mathrm{\dagger }}}h^{(1)}{\bf{K}}^{(0)} - {\bf{K}}^{(0){\mathrm{\dagger }}}h^{(2)}{\bf{K}}^{(0)}} \hfill \\ {} \hfill & \hskip-8pt = \hfill &\hskip-7pt {G - \mathop {\sum}\limits_{k,j = 1}^r {\kern 1pt} h_{jk}^{(0)}L_k^{\mathrm{\dagger }}L_j - \mathop {\sum}\limits_{k = 1}^r {\kern 1pt} \left( {h_{0k}^{(1)}L_k + h_{k0}^{(1)}L_k^{\mathrm{\dagger }}} \right) - h_{00}^{(2)}I,} \hfill \end{array}$$
(52)

which can be set to 0 if and only if G is a linear combination of \(I,L_k,L_k^{\mathrm{\dagger }}\) and \(L_k^{\mathrm{\dagger }}L_j\) (1 ≤ k, j ≤ r).

In addition,

$$\begin{array}{*{20}{l}} {\beta ^{(3)}} \hfill & \hskip-8pt = \hfill &\hskip-7pt { - {\bf{K}}^{(1){\mathrm{\dagger }}}h^{(1)}{\bf{K}}^{(1)} - {\bf{K}}^{(0){\mathrm{\dagger }}}h^{(2)}{\bf{K}}^{(1)}} \hfill \\ {} \hfill & {} \hfill & { - {\bf{K}}^{(1){\mathrm{\dagger }}}h^{(2)}{\bf{K}}^{(0)} - {\bf{K}}^{(0){\mathrm{\dagger }}}h^{(3)}{\bf{K}}^{(0)}} \hfill \\ {} \hfill & \hskip-8pt = \hfill &\hskip-7pt { - \mathop {\sum}\limits_{k,j = 1}^r {\kern 1pt} h_{jk}^{(1)}L_k^{\mathrm{\dagger }}L_j - \mathop {\sum}\limits_{k = 1}^r {\kern 1pt} \left( {h_{0k}^{(2)}L_k + h_{k0}^{(2)}L_k^{\mathrm{\dagger }}} \right) - h_{00}^{(3)}I = 0} \hfill \end{array}$$
(53)

can be satisfied by setting the above parameters (which do not appear in the expressions for \(\alpha ^{(0,1)}\) and \(\beta ^{(0,1,2)}\)) all to 0 (other terms in \(\beta ^{(3)}\) are 0 because of the constraints on \(h^{(0)}\) and \(h^{(1)}\) in Eqs. (50) and (51)). Therefore, when G is a linear combination of \(I,L_k,L_k^{\mathrm{\dagger }}\) and \(L_k^{\mathrm{\dagger }}L_j\), there exists an h such that \(\alpha _{dt} = O(dt)\) and \(\beta _{dt} = O(dt^2)\) for the quantum channel \(\tilde {\cal E}_{dt}\); therefore the QFI obeys

$$\begin{array}{*{20}{l}} {{\cal F}\left( {\rho (t)} \right)} \hfill & \hskip-8pt \le \hfill &\hskip-7pt {4\frac{t}{{dt}}\left\| {\alpha _{dt}} \right\| + 4\left( {\frac{t}{{dt}}} \right)^2\left\| {\beta _{dt}} \right\|\,\left( {\left\| {\beta _{dt}} \right\| + 2\sqrt {\left\| {\alpha _{dt}} \right\|} } \right)} \hfill \\ {} \hfill & \hskip-8pt = \hfill &\hskip-7pt {4\left\| {\alpha ^{(2)}} \right\|t + O\left( {\sqrt {dt} } \right),} \hfill \end{array}$$
(54)

in which \(\alpha ^{(2)} = \left( {h^{(1)}{\bf{K}}^{(0)} + h^{(0)}{\bf{K}}^{(1)}} \right)^{\mathrm{\dagger }}\left( {h^{(1)}{\bf{K}}^{(0)} + h^{(0)}{\bf{K}}^{(1)}} \right)\) under the constraint \(\beta ^{(2)} = 0\).
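One ingredient of this proof, the Kraus form Eq. (47), can be sanity-checked numerically. The sketch below builds \(K_0\) and \(K_k\) as defined after Eq. (47) and measures how far \(\mathop {\sum}\nolimits_k K_k^{\mathrm{\dagger }}K_k\) is from the identity, which should be \(O(dt^2)\). It tests only this completeness property, not the full channel distance. The qubit example (G = σ_z with an amplitude-damping jump operator of rate 0.5) and the helper names are assumptions chosen for illustration.

```python
def matmul(a, b):
    """Matrix product of nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def add(a, b, scale=1):
    """Entrywise a + scale * b."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def kraus_operators(G, Ls, omega, dt):
    """K_0 = I + (-i*omega*G - (1/2) sum L^dag L) dt and K_k = L_k sqrt(dt)."""
    d = len(G)
    identity = [[complex(i == j) for j in range(d)] for i in range(d)]
    drift = [[-1j * omega * G[i][j] for j in range(d)] for i in range(d)]
    for L in Ls:
        drift = add(drift, matmul(dagger(L), L), scale=-0.5)
    K0 = add(identity, drift, scale=dt)
    return [K0] + [[[x * dt ** 0.5 for x in row] for row in L] for L in Ls]

def completeness_error(G, Ls, omega, dt):
    """Largest entry (in modulus) of sum_k K_k^dag K_k - I."""
    d = len(G)
    total = [[0j] * d for _ in range(d)]
    for K in kraus_operators(G, Ls, omega, dt):
        total = add(total, matmul(dagger(K), K))
    return max(abs(total[i][j] - (i == j)) for i in range(d) for j in range(d))

# Hypothetical example: G = sigma_z, one jump operator sqrt(0.5) * sigma_minus.
G = [[1, 0], [0, -1]]
L = [[0, 0.5 ** 0.5], [0, 0]]
err_coarse = completeness_error(G, [L], omega=1.0, dt=1e-2)
err_fine = completeness_error(G, [L], omega=1.0, dt=1e-3)
```

Reducing dt by a factor of 10 reduces the error by a factor of 100 in this example, as expected for an \(O(dt^2)\) term.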

The QEC condition

Here we consider the quantum channel Eq. (2), which describes the joint evolution of a noisy quantum probe and noiseless ancilla over time interval dt. Suppose that a QEC code obeys the conditions (1) and (2) in Eqs. (17) and (18), where \({\mathrm{{\Pi}}}_C\) is the orthogonal projector onto the code space. We will construct a recovery operation such that the error-corrected time evolution is unitary to linear order in dt, governed by the effective Hamiltonian \(H_{\mathrm{eff}} = \omega {\mathrm{{\Pi}}}_CG{\mathrm{{\Pi}}}_C\).

For a density operator \(\rho = {\mathrm{{\Pi}}}_C\rho {\mathrm{{\Pi}}}_C\) in the code space, conditions (1) and (2) imply

$$\begin{array}{*{20}{l}} {{\mathrm{{\Pi}}}_C{\cal E}_{dt}(\rho ){\mathrm{{\Pi}}}_C} \hfill & \hskip-8pt = \hfill &\hskip-7pt {\rho - i\omega \left[ {{\mathrm{{\Pi}}}_CG{\mathrm{{\Pi}}}_C,\rho } \right]dt} \hfill \\ {} \hfill & {} \hfill & { + \mathop {\sum}\limits_{k = 1}^r {\kern 1pt} \left( {\left| {\lambda _k} \right|^2 - \mu _{kk}} \right)\rho dt + O\left( {dt^2} \right),} \hfill \end{array}$$
(55)
$${\mathrm{{\Pi}}}_E{\cal E}_{dt}(\rho ){\mathrm{{\Pi}}}_E = \mathop {\sum}\limits_{k = 1}^r {\left( {L_k - \lambda _k} \right)\rho \left( {L_k^{\mathrm{\dagger }} - \lambda _k^ \ast } \right)dt} + O\left( {dt^2} \right),$$
(56)

where \({\mathrm{{\Pi}}}_E = I - {\mathrm{{\Pi}}}_C\). When acting on a state in the code space, \({\mathrm{{\Pi}}}_E{\cal E}_{dt}( \cdot ){\mathrm{{\Pi}}}_E\) is an operation with Kraus operators

$$K_k = \left( {I - {\Pi}_C} \right)L_k{\Pi}_C\sqrt {dt} ,$$
(57)

which obey the normalization condition

$$\begin{array}{*{20}{l}} {\mathop {\sum}\limits_{k = 1}^r K_k^\dagger K_k} \hfill & { = \mathop {\sum}\limits_{k = 1}^r {\Pi}_CL_k^\dagger \left( {I - {\Pi}_C} \right)L_k{\Pi}_Cdt} \hfill \\ {} \hfill & { = \mathop {\sum}\limits_{k = 1}^r \left( {\mu _{kk} - \left| {\lambda _k} \right|^2} \right)dt,} \hfill \end{array}$$
(58)

where we have used conditions (1) and (2). Therefore, if ρ is in the code space, then a recovery channel \({\cal R}_E( \cdot )\) such that

$${\cal R}_E\left( {{\Pi}_E{\cal E}_{dt}\left( \rho \right){\Pi}_E} \right) = - \mathop {\sum}\limits_{k = 1}^r {\left( {\left| {\lambda _k} \right|^2 - \mu _{kk}} \right)\rho dt} + O\left( {dt^2} \right)$$
(59)

can be constructed, provided that the operators \(\left\{ {L_k - \lambda _k} \right\}_{k = 1}^r\) satisfy the standard QEC conditions31,32,33. Indeed, these conditions are satisfied because \({\mathrm{{\Pi}}}_C\left( {L_k^{\mathrm{\dagger }} - \lambda _k^ \ast } \right)\,\left( {L_j - \lambda _j} \right){\mathrm{{\Pi}}}_C = \left( {\mu _{kj} - \lambda _k^ \ast \lambda _j} \right){\mathrm{{\Pi}}}_C\), for all k, j. Therefore, the quantum channel

$${\cal R}\left( \sigma \right) = {\Pi}_C\sigma {\Pi}_C + {\cal R}_E\left( {{\Pi}_E\sigma {\Pi}_E} \right)$$
(60)

completely reverses the effects of the noise. The channel describing time evolution for time dt followed by an instantaneous recovery step is

$${\cal R}\left( {{\cal E}_{dt}\left( \rho \right)} \right) = \rho - i\omega \left[ {{\Pi}_CG{\Pi}_C,\rho } \right]dt + O\left( {dt^2} \right),$$
(61)

a noiseless unitary channel with effective Hamiltonian \(\omega {\mathrm{{\Pi}}}_CG{\mathrm{{\Pi}}}_C\) if \(O(dt^2)\) corrections are neglected.

The dependence of the Hamiltonian on ω can be detected, for a suitable initial code state \(\rho _{\mathrm{in}}\), if and only if \({\mathrm{{\Pi}}}_CG{\mathrm{{\Pi}}}_C\) has at least two distinct eigenvalues. Thus, for nontrivial error-corrected sensing we require condition (3): \({\mathrm{{\Pi}}}_CG{\mathrm{{\Pi}}}_C \ne {\mathrm{constant}} \times {\mathrm{{\Pi}}}_C\).
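Conditions (1)–(3) are straightforward to test for a candidate code. The sketch below does so for a hypothetical example that is not worked out in this section: a qubit probe with G = σ_z, a single jump operator proportional to σ_x, a one-qubit noiseless ancilla, and code states \(\left| 0 \right\rangle _P\left| 0 \right\rangle _A\) and \(\left| 1 \right\rangle _P\left| 1 \right\rangle _A\). The function names and the choice of example are assumptions.

```python
def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def matmul(a, b):
    """Matrix product of nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def compressed(op, code):
    """Matrix of <C_i|op|C_j>, i.e. Pi_C op Pi_C written in the code basis."""
    def braket(u, v):
        op_v = [sum(m * x for m, x in zip(row, v)) for row in op]
        return sum(complex(a).conjugate() * b for a, b in zip(u, op_v))
    return [[braket(ci, cj) for cj in code] for ci in code]

def proportional_to_identity(m, tol=1e-12):
    """True if the 2x2 matrix m is a multiple of the identity."""
    return (abs(m[0][1]) < tol and abs(m[1][0]) < tol
            and abs(m[0][0] - m[1][1]) < tol)

def check_code(G, L, code):
    """Truth values of conditions (1), (2), (3) for one probe jump operator L."""
    I2 = [[1, 0], [0, 1]]
    L_dag = [[complex(L[j][i]).conjugate() for j in range(2)] for i in range(2)]
    cond1 = proportional_to_identity(compressed(kron(L, I2), code))
    cond2 = proportional_to_identity(compressed(kron(matmul(L_dag, L), I2), code))
    cond3 = not proportional_to_identity(compressed(kron(G, I2), code))
    return cond1, cond2, cond3

X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
# Basis ordering |p a> -> index 2*p + a, so |0>|0> is index 0 and |1>|1> is index 3.
code = [[1, 0, 0, 0], [0, 0, 0, 1]]

transverse_noise = check_code(Z, X, code)   # noise perpendicular to the signal
parallel_noise = check_code(Z, Z, code)     # noise parallel to the signal
```

For σ_x noise all three conditions hold. For σ_z noise condition (1) fails for this code, consistent with the fact that G then lies in \({\cal S}\), so conditions (1) and (3) contradict each other for every code.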

Error-correctable noisy ancillas

In the main text, we assumed that a noiseless ancilla system is available for the purpose of constructing the QEC code. Here, we relax that assumption. We suppose instead that the ancilla is subject to Markovian noise, which is uncorrelated with noise acting on the probe. Hence, the joint evolution of probe and ancilla during the infinitesimal time interval dt is described by the quantum channel

$$\begin{array}{*{20}{l}} {{\cal E}_{dt}\left( \rho \right) = \rho - i\omega \left[ {G \otimes I,\rho } \right]dt} \hfill \\ { + \mathop {\sum}\limits_{k = 1}^r \left( {\left( {L_k \otimes I} \right)\rho \left( {L_k^\dagger \otimes I} \right) - \frac{1}{2}\left\{ {L_k^\dagger L_k \otimes I,\rho } \right\}} \right)dt} \hfill \\ { + \mathop {\sum}\limits_{k{\prime} = 1}^{r{\prime}} \left( {\left( {I \otimes L{\prime}_{\!\!k}} \right)\rho \left( {I \otimes L{\prime}_{\!\!k}^\dagger } \right) - \frac{1}{2}\left\{ {I \otimes L{\prime}_{\!\!k}^\dagger L{\prime}_{\!\!k},\rho } \right\}} \right)dt + O\left( {dt^2} \right),} \hfill \end{array}$$
(62)

where \(\{ L_k\} \) are Lindblad jump operators acting on the probe, and \(\left\{ {L{\prime}_{\!\!k}} \right\}\) are Lindblad jump operators acting on the ancilla.

In this case, we may be able to protect the probe using a coding scheme \(\bar C\) with two layers: an “inner code” C′ and an “outer code” C. Assuming as before that arbitrarily fast and accurate quantum processing can be performed, and that the Markovian noise acting on the ancilla obeys a suitable condition, an effectively noiseless encoded ancilla can be constructed using the inner code. Then, the QEC scheme that achieves HL scaling can be constructed using the same method as in the main text, but with the encoded ancilla now playing the role of the noiseless ancilla used in our previous construction.

Errors on the ancilla can be corrected if the projector \({\Pi}_{C{\prime}}\) onto the inner code C′ satisfies the conditions

$$\left[ {1{\prime}} \right]\,{\Pi}_{C{\prime}}L{\prime}_{\!\!k}{\Pi}_{C{\prime}} = \lambda {\prime}_{\!\!k}{\Pi}_{C{\prime}},\;\forall k,$$
(63)
$$\left[ {2{\prime}} \right]\,{\Pi}_{C{\prime}}L{\prime}_{\!\!j}^\dagger L{\prime}_{\!\!k}{\Pi}_{C{\prime}} = \mu {\prime}_{\!\!jk}{\Pi}_{C{\prime}},\;\forall k,j.$$
(64)

Eqs. (63) and (64) resemble Eqs. (17) and (18), except that the inner code C′ is supported only on the ancilla system \({\cal H}_A\), while the code C in Eqs. (17) and (18) is supported on the joint system \({\cal H}_P \otimes {\cal H}_A\) of probe and ancilla. To search for a suitable inner code C′, we may use standard QEC methods; namely we seek an encoding of the logical ancilla with sufficient redundancy for Eqs. (63) and (64) to be satisfied.

Given a code C that satisfies Eqs. (17), (18), and (20) for the case of a noiseless ancilla, and a code C′ supported on a noisy ancilla that satisfies Eqs. (63) and (64), we construct the code \(\bar C\) that achieves HL scaling for a noisy ancilla system by “concatenating” the inner code C′ and the outer code C. That is, if the basis states for the code C are {|C0〉, |C1〉}, where

$$\left| C_i \right\rangle = \mathop {\sum}\limits_{j,k = 1}^d {C_i^{\left( {jk} \right)}\left| j\right\rangle _P} \otimes \left| k\right\rangle _A,$$
(65)

then the corresponding basis states for the code \(\bar C\) are \(\left| {\bar C_0} \right\rangle ,\left| {\bar C_1} \right\rangle \), where

$$\left| \bar C_i \right\rangle = \mathop {\sum}\limits_{j,k = 1}^d {C_i^{\left( {jk} \right)}\left| j\right\rangle _P} \otimes \left| {C_k\prime } \right\rangle _A,$$
(66)

and \(\left| {C{\prime}_k} \right\rangle \) denotes the basis state of C′ which encodes |k〉. Using our fast quantum controls, the code C′ protects the ancilla against the Markovian noise, and the code \(\bar C\) then protects the probe, so that HL scaling is achievable.
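The substitution in Eq. (66) can be sketched in a few lines. States are stored as dictionaries mapping basis labels to amplitudes. The outer code \(\left| 0 \right\rangle _P\left| 0 \right\rangle _A\), \(\left| 1 \right\rangle _P\left| 1 \right\rangle _A\) and the three-qubit repetition code used as the inner code are hypothetical choices made for illustration, as are the function and variable names.

```python
def concatenate(outer_state, inner_code):
    """Replace each ancilla label k of the outer state by the inner code word for k.

    outer_state: {(j, k): amplitude} for |j>_P |k>_A, as in Eq. (65).
    inner_code:  {k: {ancilla_basis_label: amplitude}} giving |C'_k>.
    Returns {(j, ancilla_basis_label): amplitude}, as in Eq. (66).
    """
    encoded = {}
    for (j, k), amp in outer_state.items():
        for label, inner_amp in inner_code[k].items():
            key = (j, label)
            encoded[key] = encoded.get(key, 0) + amp * inner_amp
    return encoded

# Hypothetical outer code on probe + logical ancilla.
outer = [{(0, 0): 1.0}, {(1, 1): 1.0}]
# Hypothetical inner code: three-qubit repetition code on the physical ancilla.
inner = {0: {"000": 1.0}, 1: {"111": 1.0}}

bar_code = [concatenate(state, inner) for state in outer]

# The map is linear, so superpositions of outer code states are encoded termwise.
plus = {(0, 0): 0.5 ** 0.5, (1, 1): 0.5 ** 0.5}
bar_plus = concatenate(plus, inner)
```

Because the coefficients \(C_i^{\left( {jk} \right)}\) are carried over unchanged, the concatenated states inherit the probe-side structure of the outer code while the ancilla is stored in the protected inner code.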

In fact, the code that achieves HL scaling need not have this concatenated structure; any code that corrects both the noise acting on the probe and the noise acting on the ancilla will do. For Markovian noise acting independently on probe and ancilla as in Eq. (62), the conditions Eqs. (17) and (18) on the QEC code should be generalized to

$${\Pi}_{\bar C}\left( {O \otimes O{\prime}} \right){\Pi}_{\bar C} \propto {\Pi}_{\bar C},\quad \forall O \in \,{\cal S}\,{\mathrm{and}}\,O{\prime} \in {\cal S}\prime ;$$
(67)

here \({\cal S} = {\mathrm{span}}\left\{ {I,L_k,L_k^{\mathrm{\dagger }},L_j^{\mathrm{\dagger }}L_k,\forall k,j} \right\}\), \({\cal S}\prime = {\mathrm{span}}\left\{ {I,L{\prime}_k,L{\prime}_k^{\mathrm{\dagger }},L{\prime}_j^{\mathrm{\dagger }}L{\prime}_k,\forall k,j} \right\}\), and \({\mathrm{{\Pi}}}_{\bar C}\) is the projector onto the code \(\bar C\) supported on \({\cal H}_P \otimes {\cal H}_A\). The condition Eq. (20) remains the same as before, but now applied to the code \(\bar C\): \({\mathrm{{\Pi}}}_{\bar C}(G \otimes I){\mathrm{{\Pi}}}_{\bar C} \ne {\mathrm{constant}}\,{\mathrm{{\Pi}}}_{\bar C}\). When these conditions are satisfied, the noise acting on probe and ancilla is correctable; rapidly applying QEC makes the evolution of the probe effectively unitary (and nontrivial), to linear order in dt.

Robustness of the QEC scheme

We consider the following quantum channel, where the “J-type noise,” with Lindblad operators \(\{ J_m\} _{m = 1}^{r_2}\), is regarded as a small perturbation:

$$\begin{array}{*{20}{l}} {{\cal E}_{dt}\left( \rho \right) = \rho - i\omega \left[ {G,\rho } \right]dt + \mathop {\sum}\limits_{k = 1}^{r_1} {\left( {L_k\rho L_k^\dagger - \frac{1}{2}\left\{ {L_k^\dagger L_k,\rho } \right\}} \right)dt} } \hfill \\ { + \mathop {\sum}\limits_{m = 1}^{r_2} \left( {J_m\rho J_m^\dagger - \frac{1}{2}\left\{ {J_m^\dagger J_m,\rho } \right\}} \right)dt + O\left( {dt^2} \right).} \hfill \end{array}$$
(68)

We assume that the “L-type noise,” with Lindblad operators \(\left\{ {L_k} \right\}_{k = 1}^{r_1}\), obeys the QEC conditions (1) and (2), and that \({\cal R}\) is the recovery operation that corrects this noise. By applying this recovery step after the action of \({\cal E}_{dt}\) on a state ρ in the code space, we obtain a modified channel with residual J-type noise.

Suppose that \({\cal R}\) has the Kraus operator decomposition \({\cal R}\left( \sigma \right) = \mathop {\sum}\nolimits_{j = 1}^s \,R_j\sigma R_j^{\mathrm{\dagger }}\), where \(\mathop {\sum}\nolimits_{j = 1}^s {\kern 1pt} R_j^{\mathrm{\dagger }}R_j = I\). We also assume that \(R_j = {\mathrm{{\Pi}}}_CR_j\), because the recovery procedure has been constructed such that the state after recovery is always in the code space. Then

$$\begin{array}{ccccc}\\ & {\cal R}\left( {{\cal E}_{dt}\left( \rho \right)} \right) = \rho - i\omega \left[ {{\Pi}_CG{\Pi}_C,\rho } \right]dt \hfill \\ \\ & + \mathop {\sum}\limits_{m = 1}^{r_2} \mathop {\sum}\limits_{j = 1}^s \left( {J_{m,j}^{(C)}\rho J_{m,j}^{(C)\dagger } - \frac{1}{2}\left\{ {J_{m,j}^{(C)\dagger }J_{m,j}^{(C)},\rho } \right\}} \right)dt + O\left( {dt^2} \right),\\ \end{array}$$
(69)

where \(\left\{ {J_{m,j}^{(C)} = {\mathrm{{\Pi}}}_CR_jJ_m{\mathrm{{\Pi}}}_C} \right\}\) are the effective Lindblad operators acting on code states.

The trace (L1) distance31 between the unitarily evolving state Eq. (61) and the state subjected to the residual noise Eq. (69) is bounded above by

$$\begin{array}{*{20}{l}} {\frac{1}{2}\mathop {{\max}}\nolimits_\rho {\mathrm{tr}}\left| {\mathop {\sum}\nolimits_{m,j} J_{m,j}^{(C)}\rho J_{m,j}^{(C)\dagger }} \right|dt} \hfill \\ { + \frac{1}{4}\mathop {{\max}}\nolimits_\rho {\mathrm{tr}}\left| {\mathop {\sum}\nolimits_{m,j} J_{m,j}^{(C)\dagger }J_{m,j}^{(C)}\rho + \rho \mathop {\sum}\nolimits_{m,j} J_{m,j}^{(C)\dagger }J_{m,j}^{(C)}} \right|dt} \hfill \\ { \le \left\| {\mathop {\sum}\nolimits_{m = 1}^{r_2} \mathop {\sum}\nolimits_{j = 1}^s J_{m,j}^{(C)\dagger }J_{m,j}^{(C)}} \right\|dt = \left\| {{\Pi}_C\left( {\mathop {\sum}\nolimits_{m = 1}^{r_2} J_m^\dagger J_m} \right){\Pi}_C} \right\|dt} \hfill \\ { \le \left\| {\mathop {\sum}\nolimits_{m = 1}^{r_2} J_m^\dagger J_m} \right\|dt} \hfill \end{array}$$
(70)

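to first order in dt, where \(\left\| \cdot \right\|\) denotes the operator norm. The equality in the third line follows from \(R_j = {\Pi}_CR_j\) and \(\mathop {\sum}\nolimits_j R_j^\dagger R_j = I\), which together give \(\mathop {\sum}\nolimits_j J_{m,j}^{(C)\dagger }J_{m,j}^{(C)} = {\Pi}_CJ_m^\dagger J_m{\Pi}_C\) for each m. If the noise strength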

$$\epsilon : = \left\| {\mathop {\sum}\limits_{m = 1}^{r_2} J_m^\dagger J_m} \right\|$$
(71)

of the Lindblad operators \(\left\{ {J_m} \right\}_{m = 1}^{r_2}\) is low, the evolution is approximately unitary when \(t \ll 1{\mathrm{/}}\epsilon \). In this sense, the QEC scheme is robust against small J-type noise.
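As a minimal numerical sketch of Eq. (71), assume a single qubit subject to hypothetical J-type noise: amplitude damping at rate γ and dephasing at rate γ/2. Neither the operators nor the rates come from the text; they are chosen only for illustration. The snippet evaluates \(\epsilon \) and the time scale \(1{\mathrm{/}}\epsilon \), and checks the last inequality of Eq. (70) with an illustrative rank-one projector standing in for Π_C.

```python
import numpy as np

def noise_strength(J_ops):
    """Operator norm of sum_m J_m^dagger J_m, as in Eq. (71)."""
    total = sum(J.conj().T @ J for J in J_ops)
    return np.linalg.norm(total, 2)  # ord=2: largest singular value

# Hypothetical J-type noise on one qubit (illustrative values only).
gamma = 1e-3
sigma_minus = np.array([[0, 1], [0, 0]], dtype=complex)  # |0><1|
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)
J_ops = [np.sqrt(gamma) * sigma_minus, np.sqrt(gamma / 2) * sigma_z]

eps = noise_strength(J_ops)
t_coherent = 1.0 / eps  # evolution stays approximately unitary for t << 1/eps

# Illustrative stand-in for the code projector: projector onto |+>.
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
Pi_C = np.outer(plus, plus.conj())
total = sum(J.conj().T @ J for J in J_ops)
eps_code = np.linalg.norm(Pi_C @ total @ Pi_C, 2)

print(f"eps = {eps:.3e}, 1/eps = {t_coherent:.1f}, projected norm = {eps_code:.3e}")
```

Here the projected norm cannot exceed \(\epsilon \), because conjugating a positive operator by a projector never increases its operator norm.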

Code optimization as a semidefinite program

Optimization of the QEC code can be formulated as the following optimization problem:

$$\begin{array}{*{20}{l}} {{\mathrm{maximize}}\quad {\mathrm{tr}}\left( {\tilde GG_ \bot } \right)} \hfill \\ {{\mathrm{subject}}\,{\mathrm{to}}\,{\mathrm{tr}}\left( {\left| {\tilde G} \right|} \right) \le 2\,{\mathrm{and}}\,{\mathrm{tr}}\left( {\tilde G} \right) = {\mathrm{tr}}\left( {\tilde GL_k} \right)} \hfill \\ { = {\mathrm{tr}}\left( {\tilde GL_k^\dagger L_j} \right) = 0,\quad \forall j,k.} \hfill \end{array}$$
(72)

This optimization problem is convex (because tr|·| is convex) and satisfies Slater’s condition, so strong duality holds and it can be solved through its Lagrange dual problem61. The Lagrangian \(L\left( {\tilde G,\lambda ,\nu } \right)\) is defined for λ ≥ 0 and \(\nu _k \in {\Bbb R}\):

$$L\left( {\tilde G,\lambda ,\nu } \right) = {\mathrm{tr}}\left( {\tilde GG_ \bot } \right) - \lambda \left( {{\mathrm{tr}}\left( {\left| {\tilde G} \right|} \right) - 2} \right) + \mathop {\sum}\limits_k {\nu _k} {\mathrm{tr}}\left( {E_k\tilde G} \right),$$
(73)

where \(\left\{ {E_k} \right\}\) is any basis of \({\cal S}\). The optimal value is obtained by taking the minimum of the dual

$$\begin{array}{*{20}{l}} {g\left( {\lambda ,\nu } \right)} \hfill & { = \mathop {{\max}}\nolimits_{\tilde G} L\left( {\tilde G,\lambda ,\nu } \right)} \hfill \\ {} \hfill & { = \mathop {{\max}}\nolimits_{\tilde G} {\mathrm{tr}}\left( {\left( {G_ \bot + \mathop {\sum}\limits_k {\nu _kE_k} } \right)\tilde G - \lambda \left| {\tilde G} \right|} \right) + 2\lambda } \hfill \\ {} \hfill & { = \left\{ {\begin{array}{*{20}{l}} {2\lambda } \hfill & {\lambda \ge \left\| {G_ \bot + \mathop {\sum}\limits_k \nu _kE_k} \right\|} \hfill \\ \infty \hfill & {\lambda < \left\| {G_ \bot + \mathop {\sum}\limits_k \nu _kE_k} \right\|} \hfill \end{array}} \right.} \hfill \end{array}$$
(74)

over λ and \(\left\{ {\nu _k} \right\}\), where \(\left\| \cdot \right\| = {\mathrm{max}}_{\left| \psi \right\rangle }{\kern 1pt} \left| {\left\langle \psi \right| \cdot \left| \psi \right\rangle } \right|\) is the operator norm. Hence the optimal value of the primal problem is

$$\mathop {{\min}}\limits_{\lambda ,\nu } \,g\left( {\lambda ,\nu } \right) = 2\mathop {{\min}}\limits_{\nu _k} \,\left\| {G_ \bot + \mathop {\sum}\limits_k \nu _kE_k} \right\| = 2\mathop {{\min}}\limits_{\tilde G_\parallel \in {\cal S}} \left\| {G_ \bot - \tilde G_\parallel } \right\|.$$
(75)

The optimization problem Eq. (75) is equivalent to the following SDP:61

$$\begin{array}{*{20}{l}} {{\mathrm{minimize}}\,s} \hfill \\ {{\mathrm{subject}}\,{\mathrm{to}}\,\left( {\begin{array}{*{20}{l}} {sI} & {G_ \bot + \mathop {\sum}\limits_k \nu _kE_k} \hfill \\ {G_ \bot + \mathop {\sum}\limits_k \nu _kE_k} \hfill & {sI} \end{array}} \right) \succeq 0} \hfill \end{array}$$
(76)

for variables \(\nu _k \in {\Bbb R}\) and \(s \ge 0\). Here “\(\succeq 0\)” denotes that a matrix is positive semidefinite. SDPs can be solved using the Matlab-based package CVX62.
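The equivalence between Eq. (75) and the SDP (76) rests on the fact that the block matrix in Eq. (76) is positive semidefinite exactly when s is at least the operator norm of \(G_ \bot + \mathop {\sum}\nolimits_k \nu _kE_k\). The following NumPy sketch checks this numerically for a hypothetical Hermitian matrix `A` standing in for that operator. The matrix and the helper name `block_min_eigenvalue` are illustrative assumptions, and the snippet does not solve the SDP itself.

```python
import numpy as np

def block_min_eigenvalue(s, A):
    """Smallest eigenvalue of the block matrix [[s I, A], [A^dagger, s I]]."""
    d = A.shape[0]
    I = np.eye(d)
    M = np.block([[s * I, A], [A.conj().T, s * I]])
    return np.linalg.eigvalsh(M).min()

# Hypothetical Hermitian stand-in for G_perp + sum_k nu_k E_k.
A = np.array([[1.0, 2.0],
              [2.0, -1.0]])
op_norm = np.linalg.norm(A, 2)  # largest singular value of A

at_threshold = block_min_eigenvalue(op_norm, A)    # ~0: boundary of the PSD cone
above = block_min_eigenvalue(op_norm + 0.1, A)     # > 0: constraint satisfied
below = block_min_eigenvalue(op_norm - 0.1, A)     # < 0: constraint violated

print(op_norm, at_threshold, above, below)
```

For Hermitian `A` the smallest eigenvalue of the block matrix equals s minus the operator norm of `A`, so minimizing s subject to the constraint returns the operator norm.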

Once we have the solution to the dual problem, we can use it to find the solution to the primal problem. We denote by \(\lambda ^\diamondsuit \) and \(\nu ^\diamondsuit \) the values of λ and ν where g(λ,ν) attains its minimum, and define

$$\tilde G^\diamondsuit = G_ \bot + \mathop {\sum}\limits_k \nu _k^\diamondsuit E_k.$$
(77)

The minimum \(g\left( {\lambda ^\diamondsuit ,\nu ^\diamondsuit } \right)\) matches the value of the Lagrangian \(L\left( {\tilde G,\lambda ^\diamondsuit ,\nu ^\diamondsuit } \right)\) at \(\tilde G = \tilde G^ \ast \), where \(\tilde G^ \ast \) is the value of \(\tilde G\) that maximizes \({\mathrm{tr}}\left( {\tilde GG_ \bot } \right)\) subject to the constraints. This means that

$${\mathrm{tr}}\left( {\tilde G^ \ast \tilde G^\diamondsuit } \right) = 2\left\| {\tilde G^\diamondsuit } \right\|.$$
(78)

Since we require \({\mathrm{tr}}\left( {\tilde G^ \ast } \right) = 0\) and \({\mathrm{tr}}\left| {\tilde G^ \ast } \right| = 2\), and because minimizing g(λ,ν) enforces that the maximal and minimal eigenvalues of \(\tilde G^\diamondsuit \) have the same absolute value and opposite sign, we conclude that

$$\tilde G^ \ast = \tilde \rho _0^\diamondsuit - \tilde \rho _1^\diamondsuit ,$$
(79)

where \(\tilde \rho _{\mathrm{0}}^\diamondsuit \) is a density operator supported on the eigenspace of \(\tilde G^\diamondsuit \) with the maximal eigenvalue, and \(\tilde \rho _{\mathrm{1}}^\diamondsuit \) is a density operator supported on the eigenspace of \(\tilde G^\diamondsuit \) with the minimal eigenvalue. A \(\tilde G^ \ast \) of this form which satisfies the constraints of the primal problem is guaranteed to exist.
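As a minimal sketch of Eqs. (75) and (77)–(79), assume the simplest case in which \({\cal S}\) is spanned by the identity alone (no L-type noise), so that the only dual variable is a uniform shift ν of the spectrum. This special case and the matrix `G_perp` below are illustrative assumptions, not the general procedure. The minimizing shift then centers the spectrum, and \(\tilde G^ \ast \) is built from the two extreme eigenvectors.

```python
import numpy as np

# Hypothetical traceless Hermitian G_perp, chosen for illustration.
G_perp = np.array([[2.0, 0.0, 0.0],
                   [0.0, -0.5, 0.0],
                   [0.0, 0.0, -1.5]])

evals, evecs = np.linalg.eigh(G_perp)       # eigenvalues in ascending order
nu_opt = -(evals[-1] + evals[0]) / 2        # shift that balances the extreme eigenvalues
G_dual = G_perp + nu_opt * np.eye(3)        # Eq. (77) with the single basis element I
dual_value = 2 * np.linalg.norm(G_dual, 2)  # Eq. (75)

v_max, v_min = evecs[:, -1], evecs[:, 0]
rho0 = np.outer(v_max, v_max.conj())        # supported on the maximal eigenspace
rho1 = np.outer(v_min, v_min.conj())        # supported on the minimal eigenspace
G_star = rho0 - rho1                        # Eq. (79)

primal_value = np.trace(G_star @ G_perp).real
trace_norm = np.abs(np.linalg.eigvalsh(G_star)).sum()

print(nu_opt, dual_value, primal_value, trace_norm)
```

In this example the primal and dual values coincide, and \(\tilde G^ \ast \) is traceless with trace norm 2. With a nontrivial noise space \({\cal S}\), \(\nu ^\diamondsuit \) would instead come from the SDP (76), and the remaining constraints may single out particular density operators within degenerate eigenspaces.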

Data availability

Data sharing not applicable to this article as no data sets were generated or analyzed during the current study.