Universal quantum computation with ideal Clifford gates and noisy ancillas
Abstract
We consider a model of quantum computation in which the set of elementary operations is limited to Clifford unitaries, the creation of the state , and qubit measurement in the computational basis. In addition, we allow the creation of a onequbit ancilla in a mixed state , which should be regarded as a parameter of the model. Our goal is to determine for which universal quantum computation (UQC) can be efficiently simulated. To answer this question, we construct purification protocols that consume several copies of and produce a single output qubit with higher polarization. The protocols allow one to increase the polarization only along certain “magic” directions. If the polarization of along a magic direction exceeds a threshold value (about 65%), the purification asymptotically yields a pure state, which we call a magic state. We show that the Clifford group operations combined with magic states preparation are sufficient for UQC. The connection of our results with the GottesmanKnill theorem is discussed.
pacs:
03.67.Lx, 03.67.PpI Introduction and summary
The theory of faulttolerant quantum computation defines an important number called the error threshold. If the physical error rate is less than the threshold value , it is possible to stabilize computation by transforming the quantum circuit into a faulttolerant form where errors can be detected and eliminated. However, if the error rate is above the threshold, then errors begin to accumulate, which results in rapid decoherence and renders the output of the computation useless. The actual value of depends on the error correction scheme and the error model. Unfortunately, this number seems to be rather small for all known schemes. Estimates vary from (see Knill et al. (1998)) to (see Zalka (1996); Steane (1997); Dennis et al. (2002)), which is hardly achievable with the present technology.
In principle, one can envision a situation in which qubits do not decohere, and a subset of the elementary gates is realized exactly due to special properties of the physical system. This scenario could be realized experimentally using spin, electron, or other manybody systems with topologically ordered ground states. Excitations in twodimensional topologically ordered systems are anyons — quasiparticles with unusual statistics described by nontrivial representations of the braid group. If we have sufficient control of anyons i.e., are able to move them around each other, fuse them, and distinguish between different particle types, then we can realize some set of unitary operators and measurements exactly. This set may or may not be computationally universal. While the universality can be achieved with sufficiently nontrivial types of anyons Mochon (2004); Kitaev (2003); Freedman et al. (2000, 2002), more realistic systems offer only decoherence protection and an incomplete set of topological gates. (See Moore and Read (1991); Nayak and Wilczek (1996) about nonAbelian anyons in quantum Hall systems and Doucot and Vidal (2001); Feigel’man and Ioffe (2002) about topological orders in Josephson junction arrays.) Nevertheless, universal computation is possible if we introduce some additional operations (e.g., measurements by AharonovBohm interference Preskill (1997) or some gates that are not related to topology at all). Of course, these nontopological operations cannot be implemented exactly and thus are prone to errors.
In this situation, the threshold error rate may become significantly larger than the values given above because we need to correct only errors of certain special type and we introduce a smaller amount of error in the correction stage. The main purpose of the present paper is to illustrate this statement by a particular computational model.
The model is built upon the Clifford group — the group of unitary operators that map the group of Pauli operators to itself under conjugation. The set of elementary operations is divided into two parts: . Operations from are assumed to be perfect. We list these operations below:

Prepare a qubit in the state ;

Apply unitary operators from the Clifford group;

Measure an eigenvalue of a Pauli operator (, , or ) on any qubit.
Here we mean nondestructive projective measurement. We also assume that no errors occur between the operations.
It is wellknown that these operations are not sufficient for universal quantum computation (unless a quantum computer can be efficiently simulated on a classical computer). More specifically, the GottesmanKnill theorem states that by operations from one can only obtain quantum states of a very special form called stabilizer states. Such a state can be specified as an intersection of eigenspaces of pairwise commuting Pauli operators, which are referred to as stabilizers. Using the stabilizer formalism, one can easily simulate the evolution of the state and the statistics of measurements on a classical probabilistic computer (see Gottesman (1997) or a textbook Nielsen and Chuang (2000) for more details).
The set describes faulty operations. In our model, it consists of just one operation:

Prepare an ancillary qubit in a mixed state .
The state should be regarded as a parameter of the model. From the physical point of view, is mixed due to imperfections of the preparation procedure (entanglement of the ancilla with the environment, thermal fluctuations, etc.). An essential requirement is that by preparing qubits we obtain the state i.e., all ancillary qubits are independent. The independence assumption is similar to the uncorrelated errors model in the standard faulttolerant computation theory.
Our motivation for including all Clifford group gates into relies mostly on the recent progress in the faulttolerant implementation of such gates. For instance, using a concatenated stabilizer code with good error correcting properties to encode each qubit and applying gates transversally (so that errors do not propogate inside code blocks) one can implement Clifford gates with an arbitrary high precision, see Gottesman (1998). However these nearly perfect gates act on encoded qubits. To establish a correspondence with our model, one needs to prepare an encoded ancilla in the state . It can be done using the schemes for faulttolerant encoding of an arbitrary known onequbit state described by Knill in Knill (2004a). In the more recent paper Knill (2004b) Knill constructed a novel scheme of faulttolerant quantum computation which combines (i) the teleported computing and error correction technique by Gottesman and Chuang Gottesman and Chuang (1999); (ii) the method of purification of CSS states by Dür and Briegel Dur and Briegel (2003); and (iii) the magic states distillation algorithms described in the present paper. As was argued in Knill (2004b), this scheme is likely to yield much higher value for the threshold (may be up to ).
Unfortunately, ideal implementation of the Clifford group can not be currently achieved in any realistic physical system with a topological order. What universality classes of anyons allow one to implement all Clifford group gates (but do not allow one to simulate UQC) is an interesting open problem.
To fully utilize the potential of our model, we allow adaptive computation. It means that a description of an operation to be performed at the step may be a function of all measurement outcomes at the steps . (For even greater generality, the dependence may be probabilistic. This assumption does not actually strengthen the model since tossing a fair coin can be simulated using .) At this point, we need to be careful because the proper choice of operations should not only be defined mathematically — it should be computed by some efficient algorithm. In all protocols described below, the algorithms will actually be very simple. (Let us point out that dropping the computational complexity restriction still leaves a nontrivial problem: can we prepare an arbitrary multiqubit pure state with any given fidelity using only operations from the basis ?)
The main question that we address in this paper is as follows: For which density matrices can one efficiently simulate universal quantum computation by adaptive computation in the basis ?
It will be convenient to use the Bloch sphere representation of onequbit states:
The vector will be referred to as the polarization vector of . Let us first consider the subset of states satisfying
This inequality says that the vector lies inside the octahedron with vertices , , , see Fig. 1. The six vertices of represent the six eigenstates of the Pauli operators , , and . We can prepare these states by operations from only. Since is a convex linear combination (probabilistic mixture) of these states, we can prepare by operations from and by tossing a coin with suitable weights. Thus we can rephrase the GottesmanKnill theorem in the following way.
Theorem 1.
Suppose the polarization vector of the state belongs to the convex hull of , , . Then any adaptive computation in the basis can be efficiently simulated on a classical probabilistic computer.
This observation leads naturally to the following question: is it true that UQC can be efficiently simulated whenever lies in the exterior of the octahedron ? In an attempt to provide at least a partial answer, we prove the universality for a large set of states. Specifically, we construct two particular schemes of UQC simulation based on a method which we call magic states distillation. Let us start by defining the magic states.
Definition 1.
Consider pure states such that
and
The images of and under the action of onequbit Clifford operators are called magic states of type and type, respectively.
(This notation is chosen since and are eigenvectors of certain Clifford group operators: the Hadamard gate and the operator usually denoted , see Eq. (7).) Denote the onequbit Clifford group by . Overall, there are eight magic states of type, (up to a phase) and twelve states of type, , see Fig. 1. Clearly, the polarization vectors of magic states are in onetoone correspondence with rotational symmetry axes of the octahedron (type states correspond to rotations and type states correspond to rotations). The role of magic states in our construction is twofold. First, adaptive computation in the basis together with the preparation of magic states (of either type) allows one to simulate UQC. (See Sec. III.) Second, by adaptive computation in the basis one can “purify” imperfect magic states. It is a rather surprising coincidence that one and the same state can comprise both of these properties, and that is the reason why we call them magic states.
More exactly, a magic state distillation procedure yields one copy of a magic state (with any desired fidelity) from several copies of the state , provided that the initial fidelity between and the magic state to be distilled is large enough. In the course of distillation, we use only operations from the set . By constructing two particular distillation schemes, for type and type magic states, respectively, we prove the following theorems.
Theorem 2.
Let be the maximum fidelity between and a type magic state i.e.,
Adaptive computation in the basis allows one to simulate universal quantum computation whenever
Theorem 3.
Let be the maximum fidelity between and an type magic state,
Adaptive computation in the basis allows one to simulate universal quantum computation whenever
The quantities and have the meaning of threshold fidelity since our distillation schemes increase the polarization of , converging to a magic state as long as the inequalities or are fulfilled. If they are not fulfilled, the process converges to the maximally mixed state. The conditions stated in the theorems can also be understood in terms of the polarization vector . Indeed, let us associate a “magic direction” with each of the magic states. Then Theorems 2, 3 say that the distillation is possible if there is a direction such that the projection of the vector onto that direction exceeds the threshold value of , or if the projection on some of the directions is greater than .
Let us remark that, although the proposed distillation schemes are probably not optimal, the threshold fidelities and can not be improved significantly. Indeed, it is easy to check that the octahedron corresponding to probabilistic mixtures of stabilizer states can be defined as
where
It means that is a lower bound on the threshold fidelity for any protocol distilling type magic states. Thus any potential improvement to Theorem 2 may only decrease from down to . From a practical perspective, the difference between these two numbers is not important.
On the other hand, such an improvement would be of great theoretical interest. Indeed, if Theorem 2 with replaced by is true, it would imply that the GottesmanKnill theorem provides necessary and sufficient conditions for the classical simulation, and that a transition from classical to universal quantum behavior occurs at the boundary of the octahedron . This kind of transition has been discussed in context of a general error model Aharonov (1996). Our model is simpler, which gives hope for sharper results.
By the same argument, one can show that the quantity
is a lower bound on the threshold fidelity for any protocol distilling type magic states.
A similar approach to UQC simulation was suggested in the work Dennis (2001), where Clifford group operations were used to distill the entangled threequbit state , which is necessary for the realization of the Toffoli gate.
The rest of the paper is organized as follows. Section II contains some wellknown facts about the Clifford group and stabilizer formalism, which will be used throughout the paper. In Sec. III we prove that magic states together with operations from are sufficient for UQC. In Sec. IV ideal magic are substituted by faulty ones and the error rate that our simulation algorithm can tolerate is estimated. In Sec. V we describe a distillation protocol for type magic states. This protocol is based on the wellknown fivequbit quantum code. In Sec. VI a distillation protocol for type magic states is constructed. It is based on a certain CSS stabilizer code that encodes one qubit into 15 and admits a nontrivial automorphism Knill et al. (1996). Specifically, the bitwise application of a certain nonClifford unitary operator preserves the code subspace and effects the same operator on the encoded qubit. We conclude with a brief summary and a discussion of open problems.
Ii The Clifford group, stabilizers, and syndrome measurements
Let denote the qubit Clifford group. Recall that it is a finite subgroup of generated by the Hadamard gate (applied to any qubit), the phaseshift gate (applied to any qubit), and the controllednot gate (which may be applied to any pair qubits).
(1) 
The Pauli operators , , belong to , for instance, and . The Pauli group is generated by the Pauli operators acting on qubits. It is known Calderbank et al. (1997) that the Clifford group augmented by scalar unitary operators coincides with the normalizer of in the unitary group . Hermitian elements of the Pauli group are of particular importance for quantum error correction theory; they are referred to as stabilizers. These are operators of the form
where . Let us denote by the set of all qubit stabilizers:
For any two stabilizers we have and . It is known that for any set of pairwise commuting stabilizers there exists a unitary operator such that
where denotes the operator applied to the th qubit, e.g., .
These properties of the Clifford group allow us to introduce a very useful computational procedure which can be realized by operations from . Specifically, we can perform a joint nondestructive eigenvalue measurement for any set of pairwise commuting stabilizers . The outcome of such a measurement is a sequence of eigenvalues , , which is usually called a syndrome. For any given outcome, the quantum state is acted upon by the projector
Now, let us consider a computation that begins with an arbitrary state and consists of operations from . It is clear that we can defer all Clifford operations until the very end if we replace the Pauli measurements by general syndrome measurements. Thus the most general transformation that can be realized by is an adaptive syndrome measurement, meaning that the choice of the stabilizer to be measured next depends on the previously measured values of . In general, this dependence may involve coin tossing. Without loss of generality one can assume that commutes with all previously measured stabilizers (for all possible values of and coin tossing outcomes). Adaptive syndrome measurement has been used in the work Ambainis and Gottesman (2003) to distill entangled states of a bipartite system by local operations.
Iii Universal quantum computation with magic states
In this section, we show that operations from are sufficient for universal quantum computation if a supply of ideal magic states is also available. First, consider a onequbit state
(2) 
and suppose that is not a multiple of . We now describe a procedure that implements the phase shift gate
by consuming several copies of and using only operations from .
Let be the unknown initial state which should be acted on by . Prepare the state and measure the stabilizer . Note that both outcomes of this measurement appear with probability . If the outcome is ‘’, we are left with the state
In the case of ‘’ outcome, the resulting state is
Let us XOR the first qubit into the second qubit (i.e., apply the operator ). The above two states are mapped to
Now the second qubit can be discarded, and we are left with the state , depending upon the measured eigenvalue. Thus the net effect of this circuit is the application of a unitary operator that is chosen randomly between or (and we know which of the two possibilities has occurred).
Applying the circuit repeatedly, we effect the transformations for some integers which obey the random walk statistics. It is well known that such a random walk visits each integer with the probability one. It means that sooner or later we will get and thus realize the desired operator . The probability that we will need more than steps to succeed can be estimated as for some constant . Note also that if is a rational multiple of , we actually have a random walk on a cyclic group . In this case, the probability that we will need more than steps decreases exponentially with .
The magic state can be explicitly written in the standard basis as
(3) 
Note that . So if we are able to prepare the state , we can realize the operator . It does not belong to the Clifford group. Moreover, the subgroup of generated by and is dense in , see ^{1}^{1}1Recall that the action of the Clifford group on the set of operators , , coincides with the action of rotational symmetry group of a cube on the set of unit vectors , , , respectively.. Thus the operators from and together with constitute a universal basis for quantum computation.
The magic state can be explicitly written in the standard basis:
(4) 
Let us prepare an initial state and measure the stabilizer . The outcome ‘’ appears with probability . If the outcome is ‘’, we discard the reduced state and try again, using a fresh pair of magic states. (On average, we need three copies of the state to get the outcome ‘’.) The reduced state corresponding to the outcome ‘’ is
Let us XOR the first qubit into the second and discard the second qubit. We arrive at the state
Next apply the Hadamard gate :
We can use this state as described above to realize the operator . It is easy to check that Clifford operators together with constitute a universal set of unitary gates.
Thus we have proved that the sets of operations and are sufficient for universal quantum computation.
Iv Error analysis
To establish a connection between the simulation algorithms described in Sec. III and the universality theorems stated in the introduction we have to substitute ideal magic states by faulty ones. Before doing that let us discuss the ideal case in more details. Suppose that a quantum circuit to be simulated uses a gate basis in which the only nonClifford gate is the phase shift or . One can apply the algorithm of Sec. III to simulate each nonClifford gate independently. To avoid fluctuations in the number of magic states consumed at each round, let us set a limit of magic states per round, where is a parameter to be chosen later. As was pointed out in Sec. III, the probability for some particular simulation round to “run out of budget” scales as for some constant . If at least one simulation round runs out of budget, we declare a failure and the whole simulation must be aborted. Denote the total number of nonClifford gates in the circuit by . The probability for the whole simulation to be aborted can be estimated as
provided that . We will assume
so the abort probability can be neglected.
Each time the algorithm requests an ideal magic state, it actually receives a slightly nonideal one. Such nearly perfect magic states must be prepared using the distillation methods described in Sec. V,VI. Let us estimate an affordable error rate for distilled magic states. Since there are nonClifford gates in the circuit, one can tolerate an error rate of the order in implementation of these gates ^{2}^{2}2This faulttolerance does not require any redundancy in the implementation of the circuit (e.g. the use of concatenated codes). It is achived automatically because in the worst case the error probability accumulates linearly in the number of gates. In our model only nonClifford gates are faulty.. Each nonClifford gate requires magic states. Thus the whole simulation is reliable enough if one chooses
(5) 
What are the resources needed to distill one copy of a magic state with the error rate ? To be more specific, let us talk about type states. It will be shown in Sec. VI that the number of raw (undistilled) ancillas needed to distill one copy of the magic state with an error rate not exceeding scales as
see Eq. (39). Taking from Eq. (5), one gets
Since the whole simulation requires copies of the distilled state, we need
raw ancillas overall.
Summarizing, the simulation theorems stated in the introduction follow from the following results (the last one will be proved later):

The circuits described in Sec. III allow one to simulate UQC with the sets of operations and ;

These circuits work reliably enough if the states and are slightly noisy, provided that the error rate does not exceed ;

A magic state having an error rate can be prepared from copies of the raw ancillary state using the distillation schemes provided that or . The distillation requires resources that are polynomial in .
V Distillation of type magic states
Suppose we are given copies of a state , and our goal is to distill one copy of the magic state . The polarization vector of can be brought into the positive octant of the Bloch space by a Clifford group operator, so we can assume that
In this case, the fidelity between and is the largest one among all type magic states i.e.,
A related quantity,
will be called the initial error probability. By definition, .
The output of the distillation algorithm will be some onequbit mixed state . To quantify the proximity between and , let us define a final error probability:
It will be certain function of and . The asymptotic behavior of this function for reveals the existence of a threshold error probability,
such that for the function converges to zero. We will see that for small ,
(6) 
On the other hand, if , the output state converges to the maximally mixed state i.e., .
Before coming to a detailed description of the distillation algorithm, let us outline the basic ideas involved in its construction. The algorithm recursively iterates an elementary distillation subroutine that transforms five copies of an imperfect magic state into one copy having a smaller error probability. This elementary subroutine involves a syndrome measurement for certain commuting stabilizers . If the measured syndrome is nontrivial ( for some ), the distillation attempt fails and the reduced state is discarded. If the measured syndrome is trivial ( for all ), the distillation attempt is successful. Applying a decoding transformation (a certain Clifford operator) to the reduced state, we transform it to a singlequbit state. This qubit is the output of the subroutine.
Our construction is similar to concatenated codes used in many faulttolerant quantum computation techniques, but it differs from them in two respects. First, we do not need to correct errors — it suffices only to detect them. Once an error has been detected, we simply discard the reduced state, since it does not contain any valuable information. This allows us to achieve higher threshold error probability. Second, we do not use quantum codes in the way for which they were originally designed: in our scheme, the syndrome is measured on a product state.
The state is an eigenstate for the unitary operator
(7) 
Note that acts on the Pauli operators as follows ^{3}^{3}3The operator denoted by in the paper Gottesman (1998) does not coincide with our . They are related by the substitution though.:
(8) 
We will denote its eigenstates by and , so that
Note that and are type magic states.
Let us apply a dephasing transformation,
(9) 
to each copy of the state . The transformation can be realized by applying one of the operators , , chosen with probability each. Since
we have
(10) 
We will assume that the dephasing transformation is applied at the very first step of the distillation, so has the form (10). Thus the initial state for the elementary distillation subroutine is
(11) 
where is a binary string, is the number of 1’s in , and
The stabilizers to be measured on the state correspond to the famous 5qubit code, see Bennett et al. (1996); Laflamme et al. (1996). They are defined as follows:
(12) 
This code has a cyclic symmetry, which becomes explicit if we introduce an auxiliary stabilizer, . Let be the twodimensional code subspace specified by the conditions , , and be the orthogonal projector onto :
(13) 
It was pointed out in the work Gottesman (1998) that the operators
and
(14) 
commute with , thus preserving the code subspace. Moreover, , , obey the same algebraic relations as onequbit Pauli operators, e.g., . Let us choose a basis in such that , , and become logical Pauli operators , , and , respectively. How does the operator act in this basis? From Eq. (8) we immediately get
Therefore coincides with the logical operator up to an overall phase factor. This factor is fixed by the condition that the logical has eigenvalues .
Let us find the eigenvectors of that belong to . Consider two particular states from , namely
In Appendix A we show that
(15) 
so that the states and are normalized. Taking into account that and that
(16) 
we get
Analogously, one can check that
It follows that is exactly the logical operator , including the overall phase, and and are the logical states and (up to some phase factors, which are not important for us). Therefore we have
(17) 
Now we are in a position to describe the syndrome measurement performed on the state . The unnormalized reduced state corresponding to the trivial syndrome is as follows:
(18) 
see Eq. (11). The probability for the trivial syndrome to be observed is
Note that the state is an eigenvector of for any . But we know that the restriction of on has eigenvalues . At the same time, Eq. (16) implies that
whenever or . This eigenvalue equation is not a contradiction only if
This equality can be interpreted as an error correction property. Indeed, the initial state is a mixture of the desired state and unwanted states with . We can interpret the number of ‘’ components in as a number of errors. Once the trivial syndrome has been measured, we can be sure that either no errors or at least two errors have occurred. Such error correction, however, is not directly related to the minimal distance of the code.
It follows from Eq. (16) that for one has , so that must be proportional to one of the states , . Our observations can be summarized as follows:
(19) 
Here the coefficients , depend upon in some way. The output state (18) can now be written as
To exclude the unknown coefficients and , we can use the identity
Substituting Eq. (19) into this identity, we get
So the final expression for the output state is as follows:
(21)  
Accordingly, the probability to observe the trivial syndrome is
(22) 
A decoding transformation for the 5qubit code is a unitary operator such that
In other words, maps the stabilizers , to . The logical operators , , are mapped to the Pauli operators , , acting on the first qubit. From Eq. (17) we infer that
(maybe up to some phase). The decoding should be followed by an additional operator , which swaps the states and (note that for small the state is close to , while our goal is to distill ). After that we get a normalized output state
where
(23) 
The plot of the function is shown on Fig. 2.
It indicates that the equation has only one nontrivial solution, . The exact value is
If , we can recursively iterate the elementary distillation subroutine to produce as good an approximation to the state as we wish. On the other hand, if , the distillation subroutine increases the error probability and iterations converge to the maximally mixed state. Thus is a threshold error probability for our scheme. The corresponding threshold polarization is . For a sufficiently small , one can use the approximation .
The probability to measure the trivial syndrome decreases monotonically from for to for , see Fig. 2. In the asymptotic regime where is small, we can use the approximation .
Now the construction of the whole distillation scheme is straightforward. We start from copies of the state . Let us split these states into groups containing five states each and apply the elementary distillation subroutine described above to each group independently. In some of these groups the distillation attempt fails, and the outputs of such groups must be discarded. The average number of “successful” groups is obviously if is small. Neglecting the fluctuations of this quantity, we can say that our scheme provides a constant yield of output states that are characterized by the error probability . Therefore we can obtain states with , states with , and so on. We have created a hierarchy of states with states on the first level and four or fewer states on the last level. Let be the number of levels in this hierarchy and the error probability characterizing the states on the last level. Up to small fluctuations, the numbers , , and are related by the following obvious equations:
(24) 
Their solution yields Eq. (6).
Vi Distillation of type magic states
A distillation scheme for type magic states also works by recursive iteration of a certain elementary distillation subroutine based on a syndrome measurement for a suitable stabilizer code. Let us start with introducing some relevant coding theory constructions, which reveal an unusual symmetry of this code and explain why it is particularly useful for type magic states distillation.
Let be the dimensional binary linear space and be a onequbit operator such that . With any binary vector we associate the qubit operator
Let denote the standard binary inner product. If is a linear subspace, we denote by the set of vectors which are orthogonal to . The Hamming weight of a binary vector is denoted by . Finally, designates the bitwise product of and i.e., .
A systematic way of constructing stabilizer codes was suggested by Calderbank, Shor, and Steane, see Calderbank and Shor (1996); Steane (1996). Codes that can be described in this way will be referred to as standard CSS codes. In addition, we consider their images under an arbitrary unitary transformation applied to every qubit. Such “rotated” codes will be called CSS codes.
Definition 2.
Consider a pair of onequbit Hermitian operators such that
and a pair of binary vector spaces , such that
A quantum code is a decomposition