Chapter 10

\[ \newcommand{\BB}{\mathbb{B}} \newcommand{\CC}{\mathbb{C}} \newcommand{\HH}{\mathbb{H}} \newcommand{\KK}{\mathbb{K}} \newcommand{\NN}{\mathbb{N}} \newcommand{\QQ}{\mathbb{Q}} \newcommand{\RR}{\mathbb{R}} \newcommand{\ZZ}{\mathbb{Z}} \newcommand{\ii}{\mathrm{i}} \newcommand{\jj}{\mathrm{j}} \newcommand{\kk}{\mathrm{k}} \newcommand{\dd}{\mathrm{d}} \newcommand{\FT}{\mathcal{F}} % Fourier Transform \newcommand{\tto}{\twoheadrightarrow} \newcommand{\inv}{^{-1}} \newcommand{\RF}{\mathrm{RF}} \newcommand{\sys}{\mathrm{sys}} \newcommand{\diag}{\mathrm{diag}} \newcommand{\cx}{\mathrm{CX}} \newcommand{\cy}{\mathrm{CY}} \newcommand{\cz}{\mathrm{CZ}} \newcommand{\cat}{\ket{\mathrm{cat}}} \newcommand{\catp}[1]{\ket{\mathrm{cat}_{#1}}} \newcommand{\calE}{\mathcal{E}} \newcommand{\calF}{\mathcal{F}} \newcommand{\calH}{\mathcal{H}} \newcommand{\calR}{\mathcal{R}} \newcommand{\abs}[1]{\left|#1\right|} \newcommand{\norm}[1]{\|#1\|} \newcommand{\sprod}[2]{\langle#1|#2\rangle} % deprecated, use braket instead. \newcommand{\braket}[2]{\langle#1|#2\rangle} % scalar product \newcommand{\ptrace}[2]{\mathrm{tr}_{#1}\left(#2\right)} \newcommand{\trace}[1]{\mathrm{tr}\left(#1\right)} \newcommand{\rank}[1]{\mathrm{rank}\left(#1\right)} \newcommand{\floor}[1]{\lfloor#1\rfloor} \newcommand{\ceil}[1]{\lceil#1\rceil} \newcommand{\bra}[1]{\langle#1|} \newcommand{\ket}[1]{|#1\rangle} \newcommand{\proj}[1]{\ket{#1}\bra{#1}} \newcommand{\mean}[1]{\langle#1\rangle} \newcommand{\wt}[1]{\mathrm{wt}\left(#1\right)} \newcommand{\prob}[1]{\mathrm{Prob}\left[#1\right]} \newcommand{\orac}{\mathrm{Orac}} \newcommand{\?}{} % sometimes I need just a separator other than whitespace \]


Setup

Imports

from utils_sage import ket, kron, trace, Id, X, Y, Z, Id2, CCX
from chapter_8_sage import make_operation

Python:

from itertools import product

from qiskit.quantum_info import Pauli, Clifford
from qiskit.circuit import QuantumCircuit, QuantumRegister, Gate, ControlledGate
from qiskit.circuit.library import CXGate

Utility code

g = SR.var("g", domain="positive")
a, b = SR.var("a b", domain="complex")
assume(g <= 1)

The Pauli gates on two and three qubits:

def make_multi(gate, num_qubits):
    """Given a single qubit gate G, make all G_j for j=1..num_qubits."""
    gs = [Id]*num_qubits
    return tuple([kron(*(gs[:i]+[gate]+gs[i+1:])) for i in range(num_qubits)])

II = Id2
ZI, IZ = make_multi(Z, 2)
XI, IX = make_multi(X, 2)
YI, IY = make_multi(Y, 2)

ZII, IZI, IIZ = make_multi(Z, 3)
XII, IXI, IIX = make_multi(X, 3)
YII, IYI, IIY = make_multi(Y, 3)

The following functions return the string representations of the generators of various codes we encounter in the book:

def generators_5qubit_code_repr():
    return [
        "XZZXI",
        "IXZZX",
        "XIXZZ",
        "ZXIXZ",
    ]

def generators_steane_code_repr():
    return [
        "IIIXXXX",
        "IXXIIXX",
        "XIXIXIX",
        "IIIZZZZ",
        "IZZIIZZ",
        "ZIZIZIZ",
    ]

def generators_steane_code_standard_repr():
    return [
        "XIIIXXX",
        "IXIXIXX",
        "IIXXXXI",
        "ZIZZIIZ",
        "IZZIZIZ",
        "ZZZIIZI",
    ]

def generators_shor_code_repr():
    return [
        "ZZIIIIIII",
        "IZZIIIIII",
        "IIIZZIIII",
        "IIIIZZIII",
        "IIIIIIZZI",
        "IIIIIIIZZ",
        "XXXXXXIII",
        "IIIXXXXXX",
    ]

The following code computes the check matrix from the string representation:

def generators_to_check_matrix(generators: list[str]):
    """Taking a list of string representations of the generators for some code this
    returns the corresponding check matrix."""
    def xbit(pauli: str):
        if pauli == "X" or pauli == "Y":
            return 1
        else:
            return 0

    def zbit(pauli: str):
        if pauli == "Z" or pauli == "Y":
            return 1
        else:
            return 0

    check_matrix = []

    for g in generators:
        gj = [xbit(p) for p in g]
        gj += [zbit(p) for p in g]

        check_matrix.append(gj)

    return matrix(GF(2), check_matrix)
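
For example, the check matrix of the five qubit code defined above (assuming the definitions from this setup section):

G5 = generators_to_check_matrix(generators_5qubit_code_repr())
G5.dimensions()  # (4, 10): four generators, 2n = 10 columns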

The following utility function helps us perform elementary row and column operations on check matrices (see exercise 10.57 for how to use it).

def transform_check_matrix(check_matrix: matrix, trafos: list):
    """Apply a list of row/column transformations to a check matrix."""
    cm = check_matrix
    _, nn = cm.dimensions()
    assert nn % 2 == 0, "Expected even column dimension"
    n = nn / 2

    for t in trafos:
        if t[0] == "swap rows":
            i, j = t[1:]
            assert i < n and j < n, f"Expected i, j < {n}"
            cm = cm.with_swapped_rows(i, j)
        elif t[0] == "swap qubits":
            i, j = t[1:]
            assert i < n and j < n, f"Expected i, j < {n}"
            cm = cm.with_swapped_columns(i, j)
            cm = cm.with_swapped_columns(n+i, n+j)
        elif t[0] == "add row":
            source, target = t[1:]
            cm = cm.with_added_multiple_of_row(target, source, 1)
        else:
            raise Exception(f"Unknown check matrix operation '{t[0]}'")

    return cm
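
As a small illustration (a made-up sequence of operations, not the one from exercise 10.57):

cm = generators_to_check_matrix(generators_5qubit_code_repr())
transform_check_matrix(cm, [
    ("swap rows", 0, 1),
    ("swap qubits", 0, 2),
    ("add row", 0, 3),
])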

The following function can be used to construct the syndrome measurement circuits in qiskit, as shown e.g. in Figure 10.16:

def make_syndrome_circuit(generators: list[Pauli], with_barrier=False):
    assert len(generators) != 0, "Need at least one Pauli operator"

    l = len(generators)
    n = generators[0].num_qubits
    k = n-l

    assert all([g.num_qubits == n for g in generators]), \
        "Generators have to be of same size"
    assert k >= 1, "Too many generators"

    qr_aux = QuantumRegister(l, name="|0>")
    qr_phys = QuantumRegister(n, name="q")

    qc = QuantumCircuit(qr_aux, qr_phys)

    def barrier():
        if with_barrier:
            qc.barrier()

    for i in range(l):
        qc.h(i)

    barrier()

    for i, g in enumerate(generators):
        G = g.to_instruction().control()
        qc.append(G, [i] + list(range(l, l+n)))
        barrier()

    for i in range(l):
        qc.h(i)

    return qc

For convenience the generators of some codes in the qiskit framework:

def generators_5qubit_code():
    return [
        Pauli("XZZXI"),
        Pauli("IXZZX"),
        Pauli("XIXZZ"),
        Pauli("ZXIXZ"),
    ]

def generators_shor_code():
    return [
        Pauli("ZZIIIIIII"),
        Pauli("IZZIIIIII"),
        Pauli("IIIZZIIII"),
        Pauli("IIIIZZIII"),
        Pauli("IIIIIIZZI"),
        Pauli("IIIIIIIZZ"),
        Pauli("XXXXXXIII"),
        Pauli("IIIXXXXXX"),
    ]

def generators_steane_code_standard():
    # standard form!
    return [
        Pauli("XIIIXXX"),
        Pauli("IXIXIXX"),
        Pauli("IIXXXXI"),
        Pauli("ZIZZIIZ"),
        Pauli("IZZIZIZ"),
        Pauli("ZZZIIZI"),
    ]
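
For example, the syndrome measurement circuit for the five qubit code (c.f. Figure 10.16) can be built and printed like this (assuming the imports and the make_syndrome_circuit function from above):

qc = make_syndrome_circuit(generators_5qubit_code(), with_barrier=True)
print(qc)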

Notation

Below we use \(\BB=\mathrm{GF}(2)=\ZZ_2\) to denote the field of two elements. This models a bit.

Stabilizer codes

The purpose of this section is to complement chapter 10.5. What I find missing from the book is a concrete statement of how exactly stabilizer codes are characterized. I attempt to provide one in the fundamental theorem on stabilizer codes below.

Let \(G_n\) be the Pauli group. As in the book let \(V_S=\{v|\forall\?g\in\?S:gv=v\}\) for a subgroup \(S\) of \(G_n\). Note that any element of \(G_n\) can be represented as

\[ g = \alpha \bigotimes_{j=1}^n N_j , \]

where \(\alpha\in\langle\ii\rangle=\{1,-1,\ii,-\ii\}\) and \(N_j\in\{I,X,Y,Z\}\). An easy to prove but important observation is that

Lemma
For all \(g,h\in\?G_n\) we have \(gh=\pm\?hg\). That is, if two elements of the Pauli group do not commute then \(gh=-hg\).
Proof
This follows directly from the commutation relations of the Pauli matrices: \(MN=-NM\) for \(M,N\in\{X,Y,Z\}\) if \(M\neq\?N\), together with the representation of \(g\in\?G_n\) given above. QED.

Define a group homomorphism \(r:G_n\to\BB^{2n}\) by the following table

\(N_j\) \(r(g)_j\) \(r(g)_{n+j}\)
\(I\) \(0\) \(0\)
\(X\) \(1\) \(0\)
\(Y\) \(1\) \(1\)
\(Z\) \(0\) \(1\)

By group homomorphism we mean that \(r(gg')=r(g)+r(g')\) and \(r(g\inv)=-r(g)\) (note that \(-r(g)=r(g)\) as we have \(\BB\) as the base field).

Let \(g_1,\ldots,g_l\in\?G_n\) and let \(S=\langle\?g_1,\ldots,g_l\rangle\). Early on in chapter 10.5 we have seen that

Lemma
If \(V_S\) is non-trivial then \(-I\notin\?S\) and all the \(g_j\) commute.
Proof
See chapter 10.5.1 for the simple proof.
Remark
In Proposition 10.5 in the book the reverse direction is proved and moreover it is shown that \(\dim(V_S)=2^k\) if \(l=n-k\).

In the following we want to better understand the condition from the Lemma. Therefore let us define the corresponding check matrix \(G\in\BB^{l\times2n}\) by

\[ G_{ij} = r(g_i)_j \]

Theorem

Let \(\Lambda=X\otimes\?I_n\) and let

\[ \beta = r(g)\Lambda r(h)^T \; (\in \BB) . \]

Then \(gh=(-1)^\beta\?hg\) for all \(g,h\in\?G_n\). In particular \(g\) and \(h\) commute iff \(\beta=0\).

Proof
See exercise 10.33.

As a corollary we get a nice way to check whether \(S\) is abelian or not:

Theorem
The \(g_1,\ldots,g_l\) commute iff \(G\Lambda\?G^T=0\in\BB^{l\times\?l}\).
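
As a quick illustration in Sage (using the helpers from the setup section), the generators of the five qubit code pass this commutativity test:

G5 = generators_to_check_matrix(generators_5qubit_code_repr())
n = G5.ncols() // 2
Lam = block_matrix(GF(2), [[0, identity_matrix(GF(2), n)],
                           [identity_matrix(GF(2), n), 0]])
(G5 * Lam * G5.T).is_zero()  # True: all generators commute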

It remains to find a nice way to check whether \(-I\notin\?S\) and to characterize independent generators. This is the purpose of the following series of theorems.

Theorem

The following assertions are equivalent

  1. \(-I\notin\?S\),
  2. \(r\) restricted to \(S\) is injective.
Proof
(1) → (2)
A group homomorphism is injective iff its kernel is trivial, so it suffices to show that \(r(g)=0\) with \(g\in\?S\) implies \(g=I\). Clearly \(r(g)=0\) iff \(N_j=I\) for all \(j\), hence \(g=\pm\?I\) or \(g=\pm\ii\?I\). By (1) (together with exercise 10.30) only \(g=I\) is possible.
(2) → (1)
This follows from \(r(-I)=0=r(I)\). Hence \(r\) cannot be injective on \(S\) whenever \(-I\in\?S\).

QED.

Theorem

Assume that \(-I\notin\?S\). The following assertions are equivalent:

  1. The \(g_1,\ldots,g_l\) are independent.
  2. \(\rank{G}=l\).

The implication (2) → (1) holds even without assuming \(-I\notin\?S\).

Proof
(1) → (2)
This follows from the fact that an injective group homomorphism maps independent sets to independent sets. By a previous theorem \(r\) is injective. Hence the rows of \(G\) are independent as elements of the group \(\BB^{2n}\). But this means they are independent as elements of the vector space \(\BB^{2n}\).
(2) → (1)
From (2) we deduce that the \(r(g_j)\) are independent. But this implies that the \(g_j\) must be independent. This holds even without the assumption that \(r\) is injective.

QED.

Theorem

Assume that \(\rank{G}=l\). The following assertions are equivalent

  1. \(-I\notin\?S\),
  2. \(\alpha_j=\pm1\) for all \(j\),

where \(\alpha_j\) denotes the phase in the representation \(g_j=\alpha_j\bigotimes_{i=1}^nN_i\) introduced above.

Proof
(1) → (2)
Since \(g_j^2=\alpha_j^2I=\pm\?I\) we see that \(-I\notin\?S\) implies \(\alpha_j\neq\pm\ii\) for all \(j\).
(2) → (1)

Assume to the contrary that \(-I\in\?S\). There is a non-zero \(x\in\ZZ^l\) such that \(-I=\prod_jg_j^{x_j}\) (we may assume that \(x_j\in\{0,1,2,3\}\)). Hence

\[ 0 = r(-I) = \sum_{j=1}^l x_j r(g_j) = x G . \]

Since \(\rank{G}=l\) (as a matrix over \(\BB\)) we conclude that all \(x_j\) are even, i.e. \(x_j\in\{0,2\}\), and not all of them are zero. But \(g_j^2=\alpha_j^2I\), and hence \(-I=\prod_jg_j^{x_j}\) forces \(\alpha_j=\pm\ii\) for at least one \(j\), contradicting (2).

QED.

We are in a position to state and prove the main theorem of this section:

Fundamental theorem on stabilizer codes

The non-trivial stabilizer codes are given by all check matrices \(G\in\BB^{l\times2n}\) (for any \(n\) and \(l\leq\?n-1\)), corresponding to a set of generators \(g_1,\ldots,g_l\in\?G_n\), which satisfy

  1. \(\alpha_j=\pm1\) (c.f. representation of \(g_j\)),
  2. \(\rank{G}=l\) ,
  3. \(G\Lambda\?G^T=0\in\BB^{l\times\?l}\) .

In that case the \(g_j\) commute, are independent (in the group theoretical sense) and the dimension of the code is \(2^{n-l}\).

Moreover, two stabilizer codes \(C\) and \(C'\) which have the same check matrix \(G\) (and hence only differ in the signs \(\alpha_j\)) are equivalent in the sense that there exists a Pauli operator \(g\in\?G_n\) such that \(C'=gC\).

Remark
Because of the statement about equivalent stabilizer codes one might set \(\alpha_j=1\) for simplicity. Under this convention the check matrix describes the stabilizer code completely.
Proof

Let \(V_S\) be a non-trivial stabilizer code. Let \(g_1,\ldots,g_l\) be an independent set of generators. By proposition 10.5 (or the Lemma above) we have \(-I\notin\?S\) and all \(g_j\) commute. By independence of \(g_j\) and \(-I\notin\?S\) we deduce that \(\rank{G}=l\). From this and again using \(-I\notin\?S\) we deduce that \(\alpha_j=\pm1\) for all \(j\). Finally, the fact that the generators commute is equivalent to \(G\Lambda\?G^T=0\). This shows one part of the claim.

Let us now assume the conditions 1 to 3. From (1) and (2) we deduce that \(-I\notin\?S\). (3) is equivalent to all \(g_j\) commuting. By proposition 10.5 this implies that \(V_S\) is non-trivial.

The claim about the dimension of the code is contained in proposition 10.5.

Finally let us show that two stabilizer codes which differ only in \(\alpha\) are equivalent in the sense stated. For concreteness let \(C\) be the code with all \(\alpha_j=1\) and let \(C'\) be another code with the same check matrix and signs \(\alpha_j=\chi_j=\pm1\). The claim actually follows from the proof of proposition 10.5. In fact, if \(P^{(0,\ldots,0)}_S\) is the projector of \(C\) then clearly \(P^{\chi}_S\) is the projector of \(C'\). The proof showed that there exists a \(g_{\chi}\in\?G_n\) such that \(g_{\chi}P^{(0,\ldots,0)}_Sg_{\chi}^\dagger=P^{\chi}_S\). QED.
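
Conditions 2 and 3 of the theorem are easy to check mechanically. Here is a small Sage helper (my own sketch, assuming the setup section; the \(\alpha_j\) are taken to be \(+1\) since the string representations carry no signs):

def is_stabilizer_check_matrix(G):
    """Check conditions 2 and 3 of the fundamental theorem."""
    l, nn = G.dimensions()
    n = nn // 2
    Lam = block_matrix(GF(2), [[0, identity_matrix(GF(2), n)],
                               [identity_matrix(GF(2), n), 0]])
    return G.rank() == l and (G * Lam * G.T).is_zero()

all(is_stabilizer_check_matrix(generators_to_check_matrix(gens()))
    for gens in [generators_5qubit_code_repr,
                 generators_steane_code_repr,
                 generators_shor_code_repr])  # True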

CSS codes in the stabilizer formalism

There is a subtle error on page 456, where the book explains how the parity check matrices of a CSS code map to the generators in the stabilizer formalism.

They say that the \(g_j\) containing the \(X\) correspond to the check matrix of \(C_1\) and that the \(g_j\) containing the \(Z\) correspond to the check matrix of \(C_2^\top\). Actually it is the other way round. This doesn't affect the example (Steane code) because the parity check matrices are identical for \(C_1\) and \(C_2^\top\). But it might be confusing. Hence I take the opportunity to lay out the translation for the general case.

Recall that an \([n,k_1-k_2]\) CSS code is constructed from two linear codes \(C_1\) (an \([n,k_1]\) code) and \(C_2\) (an \([n,k_2]\) code) with \(C_2\subset\?C_1\). The code words are given by

\[ \ket{x+C_2} = \frac{1}{\sqrt{\abs{C_2}}} \sum_{y\in C_2} \ket{x + y} \]

where \(x+C_2\) is from the quotient space \(C_1/C_2\). Let \(A\in\BB^{(n-k_1)\times\?n}\) be the parity check matrix of \(C_1\) and let \(B\in\BB^{k_2\times\?n}\) be the parity check matrix of \(C_2^\top\). Let

\begin{align*} g_i &= \bigotimes_{j=1}^n X^{B_{ij}}, \text{ for } i=1..k_2, \\ g_{k_2+i} &= \bigotimes_{j=1}^n Z^{A_{ij}}, \text{ for } i=1..(n-k_1). \\ \end{align*}

Theorem
These \(n-(k_1-k_2)\) Pauli operators generate the stabilizer of the CSS code of \(C_1\) over \(C_2\).
Proof
We provide a proof in the solution of exercise 10.32.

The check matrix of the CSS code is given by the following block matrix

\[ \begin{bmatrix} B & 0 \\ 0 & A \end{bmatrix} . \]
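
For the Steane code both blocks equal the parity check matrix of the [7,4] Hamming code, and the block structure can be checked directly in Sage (assuming the setup section):

A = matrix(GF(2), [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
])
B = A  # for the Steane code the two parity check matrices coincide
css = block_matrix(GF(2), [[B, 0], [0, A]], subdivide=False)
css == generators_to_check_matrix(generators_steane_code_repr())  # True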

Exercises

Exercise 10.1

Verify that the encoding circuit in Figure 10.2 works as claimed.


|psi>: ──■────■──
       ┌─┴─┐  │
|0>_0: ┤ X ├──┼──
       └───┘┌─┴─┐
|0>_1: ─────┤ X ├
            └───┘

Solution

This is really just a matter of observing that the circuit acts as follows on the relevant basis states:

\begin{align*} \ket{000} &\mapsto \ket{000} , \\ \ket{100} &\mapsto \ket{111} . \\ \end{align*}

The rest follows by linearity.

Exercise 10.2

The action of the bit flip channel can be described by the quantum operation \(\calE(\rho)=(1-p)\rho+pX\rho\?X\). Show that this may be given an alternate operator-sum representation, as \(\calE(\rho)=(1-2p)\rho+2pP_+\rho\?P_++2pP_-\rho\?P_-\) where \(P_+\) and \(P_-\) are projectors onto the \(+1\) and \(-1\) eigenstates of \(X\), \((\ket{0}+\ket{1})/\sqrt{2}\) and \((\ket{0}-\ket{1})/\sqrt{2}\) respectively. This latter representation can be understood as a model in which the qubit is left alone with probability \(1-2p\), and is ‘measured’ by the environment in the \(\ket{+}\), \(\ket{-}\) basis with probability \(2p\).

Proof

Note that \(X=P_+-P_-\) and \(I=P_++P_-\). Hence

\[ X\rho X = P_+\rho P_+ + P_-\rho P_- - (P_+\rho P_- + P_-\rho P_+) = 2(P_+\rho P_+ + P_-\rho P_-) - I\rho I . \]

Hence

\[ \calE(\rho) = (1 - 2p)\rho + 2p (P_+\rho P_+ + P_-\rho P_-) . \]

QED.
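
As a cross-check, the identity can also be verified symbolically in Sage (a fresh variable \(p\) and generic matrix entries are introduced just for this check):

p = SR.var("p")
r00, r01, r10, r11 = SR.var("r00 r01 r10 r11")
rho = matrix(SR, [[r00, r01], [r10, r11]])
Xm = matrix(SR, [[0, 1], [1, 0]])
Pp = matrix(SR, [[1, 1], [1, 1]]) / 2    # projector onto |+>
Pm = matrix(SR, [[1, -1], [-1, 1]]) / 2  # projector onto |->
lhs = (1 - p)*rho + p*Xm*rho*Xm
rhs = (1 - 2*p)*rho + 2*p*(Pp*rho*Pp + Pm*rho*Pm)
(lhs - rhs).simplify_full().is_zero()  # True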

Exercise 10.3

Show by explicit calculation that measuring \(Z_1Z_2\) followed by \(Z_2Z_3\) is equivalent, up to labeling of the measurement outcomes, to measuring the four projectors defined by (10.5)–(10.8), in the sense that both procedures result in the same measurement statistics and post-measurement states.

Proof

Let us define the following projectors

\begin{align*} P_{ijk} &= \proj{i}\otimes\proj{j}\otimes\proj{k} , \\ P_{ijx} &= \proj{i}\otimes\proj{j}\otimes I , \\ P_{xij} &= I\otimes \proj{i}\otimes\proj{j} . \end{align*}

That is, the \(x\) has a special meaning and just stands for the place where the identity acts. Define \(P_{ixx}\) etc analogously. Note that e.g. \(P_{ixx}P_{xjx}=P_{ijx}\). Since \(Z_1=P_{0xx}-P_{1xx}\), \(Z_2=P_{x0x}-P_{x1x}\), \(Z_3=P_{xx0}-P_{xx1}\) we have the following spectral decomposition of the observables we are interested in:

\begin{align*} Z_1Z_2 &= P_{00x} + P_{11x} - P_{01x} - P_{10x} , \\ Z_2Z_3 &= P_{x00} + P_{x11} - P_{x01} - P_{x10} . \end{align*}

Let \(P_j\) for \(j=0..3\) be the projectors from (10.5) to (10.8). If we measure for example \(+1\) for \(Z_1Z_2\) and \(+1\) for \(Z_2Z_3\) this corresponds to the projection (note: the order of the projections is not important due to commutativity)

\[ (+1, +1):\; (P_{00x} + P_{11x})(P_{x00} + P_{x11}) = P_{000} + P_{111} = P_0 . \]

In the same way the other three measurements can be mapped to the \(P_j\) too:

\begin{align*} &(+1, -1):\; (P_{00x} + P_{11x})(P_{x01} + P_{x10}) = P_{001} + P_{110} = P_3 , \\ &(-1, +1):\; (P_{01x} + P_{10x})(P_{x00} + P_{x11}) = P_{100} + P_{011} = P_1 , \\ &(-1, -1):\; (P_{01x} + P_{10x})(P_{x01} + P_{x10}) = P_{101} + P_{010} = P_2 . \end{align*}

Hence the projectors are the same, just re-labeled. QED.

Exercise 10.4

Consider the three qubit bit flip code. Suppose we had performed the error syndrome measurement by measuring the eight orthogonal projectors corresponding to projections onto the eight computational basis states.

  1. Write out the projectors corresponding to this measurement, and explain how the measurement result can be used to diagnose the error syndrome: either no bits flipped or bit number \(j\) flipped, where \(j\) is in the range one to three.
  2. Show that the recovery procedure works only for computational basis states.
  3. What is the minimum fidelity for the error-correction procedure?

Solution for part 1

The projectors are just \(P_j=\proj{j}\) for \(j=0..7\) (note that the binary representations for the numbers from \(0\) to \(7\) are \(000\), \(001\), \(010\), \(011\), \(100\), \(101\), \(110\), \(111\)). A corresponding observable distinguishing each projector would be \(A=\sum_jj\proj{j}\).

Recall that the encoded state is \(\ket{\psi}=a\ket{000}+b\ket{111}\). Let \(k=0\), \(k=1\), \(k=2\), \(k=3\) denote no error, error in qubit 1, 2, 3, respectively. Let \(M_k\) be the set of possible measurement outcomes for the given value of \(k\). We have

\begin{align*} M_0 &= \{0, 7\} , \\ M_1 &= \{1, 6\} , \\ M_2 &= \{2, 5\} , \\ M_3 &= \{4, 3\} . \end{align*}

Moreover, measuring \(0,1,2,4\) happens with probability \(\abs{a}^2\) and measuring \(7,6,5,3\) happens with probability \(\abs{b}^2\) (conditioned on a fixed value for \(j\)). Since the four sets are disjoint we can infer the error (the value of \(k\)) from the measurement (e.g. measuring \(1\) or \(6\) implies that the error was \(k=1\) (with certainty if only one of the three errors "is allowed" to happen)).

Solution for part 2

If we measure \(j\) and infer the error \(k\) according to the sets \(M_k\) we would correct it by applying \(X_k\) to the state. In other words, the following quantum operation \(\calR\) performs the syndrome measurement followed by the recovery:

\begin{align*} E_0 &= P_0, \\ E_1 &= X_1P_1, \\ E_2 &= X_2P_2, \\ E_3 &= X_3P_3, \\ E_4 &= X_3P_4, \\ E_5 &= X_2P_5, \\ E_6 &= X_1P_6, \\ E_7 &= P_7 . \end{align*}

Note that \(\calR\) is indeed trace-preserving (\(\sum_jE_j^\dagger\?E_j=I\)). Recall \(\ket{\psi}=a\ket{000}+b\ket{111}\). Then (in the no error case)

\[ \calR(\ket{\psi}) = \abs{a}^2 \proj{000} + \abs{b}^2 \proj{111} . \]

The same formula holds for \(\ket{\psi}\) replaced by \(X_k\ket{\psi}\) (\(k=1..3\)). Hence, for the following noise channel

\[ \calE(\rho) = (1-p)\rho + \sum_{k=1}^3 p_k X_k \rho X_k \]

where \(\sum_kp_k=p\) (this implies trace-preservation, but we could more generally assume \(\ldots\leq\?p\), meaning that with certain probability "something else" happens), which stands for having at most one bit-flip (with probability \(p_k\) at position \(k\)), we have

\[ \calR\circ\calE(\ket{\psi}) = \abs{a}^2 \proj{000} + \abs{b}^2 \proj{111} . \]

(If \(\calE\) were not trace-preserving the RHS would have a factor \(1-p+\sum_jp_j\).) Hence the bit-flip error indeed gets corrected, but unwanted side effects happen too. The final state after recovery is either \(\ket{000}\) (probability \(\abs{a}^2\)) or \(\ket{111}\) (probability \(\abs{b}^2\)). That is, the state collapses to a basis state. This is only correct if \(\ket{\psi}\) already was a basis state (\(a=0\) or \(b=0\)), which is the claim of part 2.

Solution for part 3

Let us consider two cases:

  1. We do not get to know the results of the syndrome measurement.
  2. We get to know the results of the syndrome measurement.

In case 1 the whole error correction procedure is given by \(\calR\) from the solution of part 2. Let \(\rho=\calR\circ\calE(\ket{\psi})\). Then

\[ F(\ket{\psi},\rho) = \sqrt{\bra{\psi}\rho\ket{\psi}} = \sqrt{\abs{a}^4 + \abs{b}^4} . \]

Since \(\abs{a}^2+\abs{b}^2=1\) the minimum occurs at \(\abs{a}^2=\abs{b}^2=1/2\). Hence \(F\geq1/\sqrt{2}\) in that case.

In case 2 the post-correction state \(\rho\) is either \(\ket{000}\) (probability \(\abs{a}^2\)) or \(\ket{111}\) (probability \(\abs{b}^2\)). Hence

\[ F(\ket{\psi},\rho) \in \left\{\sqrt{\braket{\psi}{i_L}\braket{i_L}{\psi}} \; \middle| \; i\in\{0,1\}\right\} = \{\abs{a}, \abs{b}\} , \]

where the first alternative occurs with probability \(\abs{a}^2\) and the second one with \(\abs{b}^2\).

In both cases the minimal fidelity does not depend on the specifics of the noise model and in particular is always bad if \(\abs{a}\) and \(\abs{b}\) are bounded away from \(0\) (or equivalently, from \(1\)).

Remark

The minimum fidelity of case 1 is the weighted quadratic mean (root mean square) of the two possible fidelities of case 2, with weights \(\abs{a}^2\) and \(\abs{b}^2\). See e.g. the Wikipedia article on the generalized mean for more information. To make this more apparent consider \(\sigma=p\proj{0}+(1-p)\proj{1}\) (and \(\ket{\psi}=a\ket{0}+b\ket{1}\)):

\[ F(\ket{\psi},\sigma) = \sqrt{\bra{\psi}\sigma\ket{\psi}} = \sqrt{p\abs{a}^2+(1-p)\abs{b}^2} . \]

Here \(p\) and \(1-p\) are the weights.

Exercise 10.5

Show that the syndrome measurement for detecting phase flip errors in the Shor code corresponds to measuring the observables \(A=X_1X_2X_3X_4X_5X_6\) and \(B=X_4X_5X_6X_7X_8X_9\).

Proof

Let us denote

\begin{align*} \ket{\pi} &= \frac{1}{\sqrt{2}} (\ket{000} + \ket{111}) , \\ \ket{\mu} &= \frac{1}{\sqrt{2}} (\ket{000} - \ket{111}) . \end{align*}

With this notation the Shor code is given by

\begin{align*} \ket{0_L} &= \ket{\pi\pi\pi} , \\ \ket{1_L} &= \ket{\mu\mu\mu} , \\ \end{align*}

Hence we have the following encoding:

\[ \ket{\psi} = \ket{\psi_0} = a\ket{0_L} + b\ket{1_L} = a\ket{\pi\pi\pi} + b\ket{\mu\mu\mu} . \]

In this notation a phase flip error is represented naturally. For example a phase flip in one of the three qubits in the first block of \(\ket{\psi}\) is given by

\[ \ket{\psi_1} = a\ket{\mu\pi\pi} + b\ket{\pi\mu\mu} . \]

It doesn't matter in which of the three qubits, it always leads to the same state. The other two errors (phase flip in the second or third block) are given by

\begin{align*} \ket{\psi_2} &= a\ket{\pi\mu\pi} + b\ket{\mu\pi\mu} , \\ \ket{\psi_3} &= a\ket{\pi\pi\mu} + b\ket{\mu\mu\pi} . \end{align*}

Now observe that

\begin{align*} X_1X_2X_3 \ket{\pi} &= + \ket{\pi} , \\ X_1X_2X_3 \ket{\mu} &= - \ket{\mu} . \end{align*}

Hence e.g. measuring a state \(\ket{ijk}\) (\(i,j,k\in\{\pi,\mu\}\)) with respect to \(A\) detects if \(i=j\) (measurement result: \(+1\)) or \(i\neq\?j\) (measurement result: \(-1\)). Because of their special structure this translates directly to the states \(\ket{\psi_k}\). For the four errors (the first one is no error) we get the following measurement results (each with certainty):

\(k\) (\(\ket{\psi_k}\)) \(A\) \(B\)
0 +1 +1
1 -1 +1
2 -1 -1
3 +1 -1

Hence we can infer the syndrome \(k\) (which error occurred) from the measurement, as desired. QED.

Exercise 10.6

Show that recovery from a phase flip on any of the first three qubits may be accomplished by applying the operator \(Z_1Z_2Z_3\).

Proof

Let us reuse the notation from the solution of exercise 10.5. Clearly

\[ Z_1Z_2Z_3 \ket{\pi} = \ket{\mu} \text{ and } Z_1Z_2Z_3 \ket{\mu} = \ket{\pi} . \]

An error in the first three qubits means that the corresponding state is \(\ket{\psi_1}\). By the above we have:

\[ Z_1Z_2Z_3 \ket{\psi_1} = \ket{\psi} . \]

QED.

Exercise 10.7

Consider the three qubit bit flip code of Section 10.1.1, with corresponding projector \(P=\proj{000}+\proj{111}\). The noise process this code protects against has operation elements

\[ \left\{ \sqrt{(1-p)^3}I, \sqrt{p(1-p)^2}X_1, \sqrt{p(1-p)^2}X_2, \sqrt{p(1-p)^2}X_3 \right\} , \]

where \(p\) is the probability that a bit flips. Note that this quantum operation \(\calE\) is not trace-preserving, since we have omitted operation elements corresponding to bit flips on two and three qubits. Verify the quantum error-correction conditions for this code and noise process.

Solution

Clearly \(PX_iX_jP=\delta_{ij}P\). Hence

\begin{align*} P E_0^\dagger E_0 P &= (1-p)^3 P , \\ P E_i^\dagger E_i P &= p(1-p)^2 P \text{ for } i=1..3 , \\ P E_i^\dagger E_j P &= 0 \text{ for } i\neq j . \end{align*}

Hence the conditions are satisfied and there exists an error correction procedure \(\calR\) for this noise.

Bonus: Calculating the recovery procedure

By the previous calculation and Theorem 10.1 there exists a trace-preserving quantum operation \(\calR\) such that

\[ \calR \circ \calE (P\rho P) = [(1-p)^3 + 3p(1-p)^2] \, P \rho P . \]

The concrete form of the proportionality factor follows from (10.25) as it equals \(\sum_kd_{kk}\). Since this is a trace this formula would even be true if \((d_{ij})\) wasn't already diagonal.

To compute the Kraus matrices \(R_k\) of \(\calR\) we need the \(E_k\) such that the \((d_{ij})\) are diagonal. This is already the case

\[ d = \diag((1-p)^3, p(1-p)^2, p(1-p)^2, p(1-p)^2) . \]

From the proof of Theorem 10.1 we know that

\[ R_k = U_k^\dagger P_k \]

where \(P_k=U_kPU_k^\dagger\) and \(E_kP=\sqrt{d_{kk}}U_kP\) is a polar decomposition. The \(P_k\) are the projectors onto \(E_kC\) (\(C\) being the code space onto which \(P\) projects). The diagonality of \((d_{ij})\) implies that \(P_iP_j=\delta_{ij}P_i\) (meaning that the error subspaces are orthogonal to each other).

Clearly we can set \(U_0=I\) (it is only uniquely defined on the image of \(P\)). For \(k=1\) we have

\[ E_1 P = \sqrt{p(1-p)^2} X_1 P . \]

This is already a polar decomposition (the Pauli operators are unitary). Hence we can just set \(U_1=X_1\) and analogously \(U_k=X_k\) for \(k=1..3\). Hence

\begin{align*} P_0 &= \proj{000} + \proj{111} , \\ P_1 &= \proj{100} + \proj{011} , \\ P_2 &= \proj{010} + \proj{101} , \\ P_3 &= \proj{001} + \proj{110} . \end{align*}

Note that these are exactly the syndrome measurement projectors already introduced in (10.5)–(10.8). The Kraus matrices of \(\calR\) are

\[ R_k = U_k^\dagger P_k = \begin{cases} IP = P & \text{for } k = 0 , \\ X_k X_kPX_k = PX_k & \text{for } k \in \{1,2,3\} . \end{cases} \]

Exercise 10.8

Verify that the three qubit phase flip code \(\ket{0_L}=\ket{+++}\), \(\ket{1_L}=\ket{---}\) satisfies the quantum error-correction conditions for the set of error operators \(\{I,Z_1,Z_2,Z_3\}\).

Solution 1

Let us reduce this to the bit flip code. To convert to the bit flip code we only have to conjugate the projector of the code and the Kraus operators of the errors by \(H^{\otimes3}\) (Hadamard).

\[ \proj{000} + \proj{111} = H^{\otimes3} \left( \proj{+++} + \proj{---} \right) H^{\otimes3} \]

and

\[ X_k = H^{\otimes3} Z_k H^{\otimes3} . \]

Let \(P\) be the projector of the phase flip code and let \(E_i\) be the mentioned errors it corrects. Let \(Q\) and \(F_j\) be the same thing for the bit flip.

\[ P E_i^\dagger E_j P = H^{\otimes3} Q F_i^\dagger F_j Q H^{\otimes3} = H^{\otimes3} \delta_{ij} Q H^{\otimes3} = \delta_{ij} P . \]

In the second equality we used what we already know about the bit flip code.

Solution 2

We can also verify this explicitly. Let \(P=P_{+++}+P_{---}\) be the projector onto the code, where we use the notation \(P_{ijk}=\proj{ijk}\). Clearly \(PE_i^\dagger\?E_iP=PIP=P\). Moreover

\[ PZ_1P = (P_{+++} + P_{---})\left(\ket{-++}\bra{+++} + \ket{+--}\bra{---}\right) = 0 \]

and

\[ PZ_1Z_2P = (P_{+++} + P_{---})\left(\ket{--+}\bra{+++} + \ket{++-}\bra{---}\right) = 0 . \]

Similarly \(PIZ_jP=PZ_iZ_jP=0\) for \(i\neq\?j\). Hence \(PE_i^\dagger\?E_jP=\delta_{ij}P\).

Exercise 10.9

Again, consider the three qubit phase flip code. Let \(P_i\) and \(Q_i\) be the projectors onto the \(\ket{0}\) and \(\ket{1}\) states, respectively, of the \(i\)th qubit. Prove that the three qubit phase flip code protects against the error set \(\{I,P_1,Q_1,P_2,Q_2,P_3,Q_3\}\).

Solution

Recall that the phase flip code corrects the errors \(I\), \(Z_1\), \(Z_2\), and \(Z_3\). By theorem 10.2 linear combinations of these errors are corrected too. Note that we have

\begin{align*} P_k &= \frac{1}{2} (I + Z_k) , \\ Q_k &= \frac{1}{2} (I - Z_k) . \end{align*}

Hence those errors are corrected too.

Exercise 10.10

Explicitly verify the quantum error-correction conditions for the Shor code, for the error set containing \(I\) and the error operators \(X_j,Y_j,Z_j\) for \(j=1\) through \(9\).

Solution

Let us denote

\begin{align*} \ket{\pi} &= \frac{1}{\sqrt{2}} (\ket{000} + \ket{111}) , \\ \ket{\mu} &= \frac{1}{\sqrt{2}} (\ket{000} - \ket{111}) . \end{align*}

The Shor code is given by the projector

\[ P = \proj{\pi\pi\pi} + \proj{\mu\mu\mu} . \]

It is not hard to see that the errors transform the code space into a set of mutually orthogonal subspaces, with one exception: the effect of \(Z_j\) is the same whenever \(j\) stays within one "block" (there are three blocks, \(\{1,2,3\}\), \(\{4,5,6\}\), and \(\{7,8,9\}\)). Hence:

\[ PE_i^\dagger E_j P = \begin{cases} P & \text{if } i=j \text{ or } E_i,E_j \text{ both Z on same block} , \\ 0 & \text{else} . \end{cases} \]

Exercise 10.11

Construct operation elements for a single qubit quantum operation \(\calE\) that upon input of any state \(\rho\) replaces it with the completely randomized state \(I/2\). It is amazing that even such noise models as this may be corrected by codes such as the Shor code!

Solution

We already know that the quantum operation

\[ \calE(\rho) = \frac{1}{2} I \]

is the depolarizing channel with \(p=1\). By (8.102) this is equal to

\[ \calE(\rho) = \frac{1}{4} (I\rho I + X\rho X + Y\rho Y + Z\rho Z) . \]

Hence the relevant operation elements are

\[ \left\{\frac{1}{2}I, \frac{1}{2}X, \frac{1}{2}Y, \frac{1}{2}Z \right\} . \]

Exercise 10.12

Show that the fidelity between the state \(\ket{0}\) and \(\calE(\proj{0})\) is \(\sqrt{1-2p/3}\) and use this to argue that the minimum fidelity for the depolarizing channel is \(\sqrt{1-2p/3}\).

Solution 1

Let us follow the hint of the book.

\[ F(\ket{0}, \calE(\proj{0})) = \sqrt{(1-p) + \frac{p}{3}(\bra{0}X\ket{0}^2 + \bra{0}Y\ket{0}^2 + \bra{0}Z\ket{0}^2)} = \sqrt{(1-p) + \frac{p}{3}} = \sqrt{1-\frac{2p}{3}} . \]

Let \(\ket{\psi}\) be an arbitrary state and \(U\) be a unitary operator such that \(\ket{\psi}=U\ket{0}\). Then

\[ F(\ket{\psi}, \calE(\proj{\psi})) = F(U\ket{0}, U\calE(\proj{0})U^\dagger) = F(\ket{0}, \calE(\proj{0})) = \sqrt{1-\frac{2p}{3}} . \]

In the first equality we used that the depolarizing channel is unitarily covariant, \(\calE(U\rho\?U^\dagger)=U\calE(\rho)U^\dagger\) (this is easiest to see from the representation in Solution 2 below). In the second equality we used that the fidelity is invariant under unitary transformations.
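
The formula can also be cross-checked in Sage with the make_operation helper from the setup section (used in the same way as in exercise 10.13 below; the variable \(p\) and the explicit Pauli matrices are introduced just for this check):

p = SR.var("p", domain="positive")
Im = identity_matrix(SR, 2)
Xm = matrix(SR, [[0, 1], [1, 0]])
Ym = matrix(SR, [[0, -I], [I, 0]])
Zm = matrix(SR, [[1, 0], [0, -1]])
Dep = make_operation([sqrt(1 - p)*Im, sqrt(p/3)*Xm, sqrt(p/3)*Ym, sqrt(p/3)*Zm])
rho0 = matrix(SR, [[1, 0], [0, 0]])  # |0><0|
sqrt(Dep(rho0)[0, 0]).simplify_full()  # sqrt(1 - 2p/3)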

Solution 2

We could also prove the claim by using an alternative formula for the depolarizing channel:

\[ \calE(\rho) = \frac{2p}{3} I + \left(1 - \frac{4p}{3}\right) \rho . \]

(compare equations (8.100) and (8.102).) Hence

\[ F(\ket{\psi}, \calE(\proj{\psi})) = \sqrt{\frac{2p}{3} + \left(1 - \frac{4p}{3}\right)} = \sqrt{1-\frac{2p}{3}} . \]

Solution 3

Yet another approach first calculates

\[ F(\ket{\psi}, \calE(\proj{\psi})) = \sqrt{(1-p) + \frac{p}{3}(\bra{\psi}X\ket{\psi}^2 + \bra{\psi}Y\ket{\psi}^2 + \bra{\psi}Z\ket{\psi}^2)} . \]

Recall the representation \(\proj{\psi}=2\inv(I+\vec{r}\cdot\vec{\sigma})\) of a pure qubit state on the Bloch sphere (\(\abs{\vec{r}}=1\)). One can show that

\[ \bra{\psi}Z\ket{\psi} = r_z \]

and analogous formulas for \(r_x\) and \(r_y\). Hence

\[ F(\ket{\psi}, \calE(\proj{\psi})) = \sqrt{(1-p) + \frac{p}{3}\abs{\vec{r}}^2} = \sqrt{1-\frac{2p}{3}} . \]

Remark

To see e.g.

\[ \bra{\psi}Z\ket{\psi} = r_z \]

observe that

\[ \bra{\psi}Z\ket{\psi} = \trace{\proj{\psi}Z} = r_z . \]

In the last equality we used that the Pauli matrices (including \(I\)) are an orthogonal set with respect to the Hilbert-Schmidt scalar product (with norm \(\sqrt{2}\) for each base vector). The other two formulas for \(r_x\) and \(r_y\) follow along the same lines.

If you want to see the last equality directly, observe that (using \(S=\sqrt{Z}\))

\[ \trace{\proj{\psi}Z} = \trace{S\proj{\psi}S} \]

and use \(SIS=Z\), \(SXS=\ii\?X\), \(SYS=\ii\?Y\), \(SZS=I\) (only the last term has a non-zero trace).

Exercise 10.13

Show that the minimum fidelity \(F(\ket{\psi},\calE(\proj{\psi}))\) when \(\calE\) is the amplitude damping channel with parameter \(\gamma\), is \(\sqrt{1-\gamma}\).

Proof

Let us use Sage to get an algebraic expression for \(F^2=\bra{\psi}\calE(\proj{\psi})\ket{\psi}\). First define the amplitude damping channel:

E0 = matrix.diagonal([1, sqrt(1-g)])
E1 = matrix([[0, sqrt(g)], [0, 0]])
AD = make_operation([E0, E1])

Now we can use this to compute \(F\):

psi = a*ket('0') + b*ket('1')
psi_mat = matrix(psi).T
rho = AD(psi_mat*psi_mat.H).simplify_full()
(psi.conjugate() * rho * psi).simplify_full()
2*a*b*sqrt(-g + 1)*conjugate(a)*conjugate(b) + a^2*conjugate(a)^2 + b^2*conjugate(b)^2 + (a*b*conjugate(a)*conjugate(b) - b^2*conjugate(b)^2)*g

This looks a bit ugly so let us rewrite this:

\[ F^2 = \bra{\psi}\rho\ket{\psi} = \abs{a}^4 + \abs{b}^4 + \gamma(\abs{a}^2 \abs{b}^2 - \abs{b}^4) + 2\sqrt{1-\gamma}\, \abs{a}^2\abs{b}^2 . \]

Substituting \(x=\abs{a}^2\) and \(y=\abs{b}^2\) this can be written as:

\[ F = \sqrt{f(x,y)} = \sqrt{x^2 + y^2 + \gamma (xy - y^2) + 2\sqrt{1-\gamma} \, xy} , \]

under the constraint \(x+y=1\) and \(x,y\geq0\). Thus we have to solve a constrained minimization problem. There are two possibilities. The first one is that the minimum occurs at the boundaries of the feasible region. Therefore let us calculate

\begin{align*} f(1,0) &= 1 , \\ f(0,1) &= 1 - \gamma . \end{align*}

The second possibility is that the minimum occurs in the interior of the feasible region. In that case we might try to find it by the method of Lagrange multipliers:

\begin{align*} \partial_x f(x, y) &= 2x + (\gamma + 2\sqrt{1-\gamma}) y = \lambda \cdot 1 , \\ \partial_y f(x, y) &= 2(1-\gamma)y + (\gamma + 2\sqrt{1-\gamma}) x = \lambda \cdot 1 . \end{align*}

This looks a bit complicated so let us change the strategy a bit by defining a parameterization of the feasible region

\[ g(s) = f(s,1-s) . \]

Abbreviating \(\alpha=\gamma+2\sqrt{1-\gamma}\) we see that

\begin{align*} g'(s) &= \partial_x f(s,1-s) - \partial_y f(s,1-s) \\ &= (\alpha + 2\gamma - 2) - 2(\gamma + \alpha - 2) s \\ &=: c_0 - c_1 s . \end{align*}

Observe that \(\alpha+\gamma\geq2\). Hence \(c_0,c_1\geq0\) which implies that there is a \(s_0\geq0\) such that \(g\) is increasing for \(s\leq\?s_0\) and decreasing afterwards (note also that \(s_0\) could be larger than \(1\)). Hence a local extremum can only be a local maximum and the minimum is indeed obtained at the boundary:

\[ f(0, 1) = g(0) = 1 - \gamma . \]

Hence \(F_{\min}=\sqrt{1-\gamma}\) . QED.

Exercise 10.14

Write an expression for a generator matrix encoding \(k\) bits using \(r\) repetitions for each bit. This is an \([rk,k]\) linear code, and should have an \(rk\times\?k\) generator matrix.

Solution

Let \(\mathbb{1}_r\in\?\BB^r\) be the vector containing just ones. A generator for this code is given by the tensor product

\[ G = I_k \otimes \mathbb{1}_r . \]

This naturally reproduces the special cases \(r=3\), \(k\in\{1,2\}\) from the book.
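
A quick Sage check against the \([6,2]\) generator matrix (10.54) used in exercise 10.17 below:

r, k = 3, 2
ones = matrix(GF(2), r, 1, [1]*r)
G = identity_matrix(GF(2), k).tensor_product(ones, subdivide=False)
G  # equals the matrix from (10.54)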

Exercise 10.15

Show that adding one column of \(G\) to another results in a generator matrix generating the same code.

Proof

Let \((e_j)\) be the standard basis vectors in \(\BB^k\). The code generated by \(G\) is just the image \(C=G\BB^k\) of \(G\), which in turn is given by all linear combinations of the \((Ge_j)\). Let \(G'\) be the generator matrix obtained by adding column \(m\) to column \(l\neq\?m\).

We have to show that \(C=G'\BB^k\). For this in turn it suffices to show that \((G'e_j)\) spans \(C\). First of all observe that

\[ Ge_j = G'e_j \text{ for } j \neq l . \]

Hence it suffices to show that \(\{Ge_m,Ge_l\}\) spans the same space as \(\{G'e_m,G'e_l\}\). Observe that

\begin{align*} G' e_l &= Ge_m + Ge_l , \\ G e_l &= G'e_m + G'e_l . \end{align*}

This shows the claim. QED.

Exercise 10.16

Show that adding one row of the parity check matrix to another does not change the code. Using Gaussian elimination and swapping of bits it is therefore possible to assume that the parity check matrix has the standard form \([A|I_{n-k}]\), where \(A\) is an \((n-k)\times\?k\) matrix.

Proof

Recall that a code \(C\), in terms of a parity check matrix \(H\), is defined as

\[ y \in C \Leftrightarrow \forall i: \sum_j H_{ij} y_j = 0 . \]

Let \(H'\) be a matrix obtained from \(H\) by adding row \(k\) to row \(l\). Let \(C'\) be the code defined by \(H'\). We have to show \(C'=C\). Let \(y\in\?C\). Then we have

\[ \forall i\neq l: \sum_j H'_{ij} y_j = 0 . \]

because for those \(i\) we have \(H'_{ij}=H_{ij}\). Moreover

\[ \sum_j H'_{lj} y_j = \sum_j H_{kj} y_j + \sum_j H_{lj} y_j = 0 , \]

by definition of \(H'\). Hence \(y\in\?C'\). Since \(y\in\?C\) was arbitrary this shows \(C\subseteq\?C'\). By symmetry we also have \(C'\subseteq\?C\): in fact, \(H\) can be obtained from \(H'\) by adding row \(k\) to row \(l\) (over \(\BB\) adding the same row twice cancels), so the same reasoning applies to the reverse inclusion. QED.

Exercise 10.17

Find a parity check matrix for the \([6,2]\) repetition code defined by the generator matrix in (10.54).

\[ \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \\ 0 & 1 \\ \end{bmatrix} \]

Solution

Let \(G_1\) and \(H_1\) be the generator and parity check matrix for \(k=1\) bits (see equations (10.53) and (10.58)). We have

\[ G_2 = I_2 \otimes G_1 . \]

(C.f. the solution of exercise 10.14.) This suggests to define

\[ H_2 = I_2 \otimes H_1 = \begin{bmatrix} 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 \?\end{bmatrix} . \]

This clearly does the job. It makes sense to compare this to exercise 10.18.
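
A quick Sage check that \(H_2G_2=0\) (with \(G_2\) as in exercise 10.14):

H1 = matrix(GF(2), [[1, 1, 0], [0, 1, 1]])
G1 = matrix(GF(2), 3, 1, [1, 1, 1])
H2 = identity_matrix(GF(2), 2).tensor_product(H1, subdivide=False)
G2 = identity_matrix(GF(2), 2).tensor_product(G1, subdivide=False)
(H2 * G2).is_zero()  # True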

Exercise 10.18

Show that the parity check matrix \(H\) and generator matrix \(G\) for the same linear code satisfy \(HG=0\).

Proof

Recall that a code \(C\) is just the image of the generator \(C=G\BB^k\). On the other hand it can be defined as the kernel of the parity check matrix: \(C=\{y|Hy=0\}\). Since every code word has the form \(y=Gx\) we see that \(HGx=0\) for every \(x\). Hence the claim follows. QED.

Exercise 10.19

Suppose an \([n,k]\) linear code \(C\) has a parity check matrix of the form \(H=[A|I_{n-k}]\), for some \((n-k)\times\?k\) matrix \(A\). Show that the corresponding generator matrix is

\[ G = \left[ \begin{array}{c} I_k \\ \hline -A \end{array} \right] \]

(Note that \(-A=A\) since we are working modulo 2; however, this equation also holds for linear codes over more general fields than \(\ZZ_2\).)

Solution

Clearly \(G\) has rank \(k\). Moreover \(H\) has full rank \(n-k\) (because of the identity block), so its kernel has dimension \(k\) as well. We only have to verify that \(HG=0\) (c.f. exercise 10.18). But this is easy:

\[ HG = A I_k - I_{n-k} A = A - A = 0 . \]
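
A small randomized Sage check of this identity (the matrix \(A\) is drawn at random, so this is just an illustration, not a proof):

n, k = 7, 4
A = random_matrix(GF(2), n - k, k)
H = block_matrix(GF(2), [[A, identity_matrix(GF(2), n - k)]], subdivide=False)
G = block_matrix(GF(2), [[identity_matrix(GF(2), k)], [-A]], subdivide=False)
(H * G).is_zero() and G.rank() == k  # True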

Exercise 10.20

Let \(H\) be a parity check matrix such that any \(d-1\) columns are linearly independent, but there exists a set of \(d\) linearly dependent columns. Show that the code defined by \(H\) has distance \(d\).

Proof

Let us divide the proof into two parts. In part one we assume that \(H\) has \(d\) linearly dependent columns. In part two we assume that every \(d-1\) columns are linearly independent.

Part 1

Assume that \(H\) has \(d\) linearly dependent columns \(c_1,\ldots,c_d\). We want to show that there exists a \(y\in\?C\) (i.e. \(y\in\BB^n\) such that \(Hy=0\)) with \(\wt{y}\leq\?d\).

Linear dependence means that there exist coefficients \(\alpha_i\in\BB\) such that

\[ 0 = \sum_i \alpha_i c_i . \]

Without loss of generality we may assume that all \(\alpha_i\) are equal to \(1\): columns with coefficient \(0\) can simply be dropped, leaving a set of \(d'\leq\?d\) linearly dependent columns.

Let \(J\) be the set of indices of these columns in \(H\). Let \(y\in\BB^n\) be such that \(y_i=1\) if \(i\in\?J\) and \(y_i=0\) otherwise. By construction we have \(\wt{y}\leq\?d\) and \(Hy=0\). This shows the claim of part 1.

Part 2

Assume that every \(d-1\) columns are linearly independent. We have to show that \(Hy=0\) implies \(\wt{y}\geq\?d\).

The assumption implies that for every non-zero \(y\) with \(\wt{y}\leq\?d-1\) we have \(Hy\neq0\) (since \(Hy\) is a sum of at most \(d-1\) columns of \(H\)). But this is equivalent to the claim.

Taking parts 1 and 2 together we see that \(d(C)=d\) is equivalent to: Any \(d-1\) columns of \(H\) are linearly independent but there exist \(d\) columns which are linearly dependent. QED.

Exercise 10.21 (Singleton bound)

Show that an \([n,k,d]\) code must satisfy \(n-k\geq\?d-1\).

Proof

This follows directly from exercise 10.20. In fact, the parity check matrix \(H\in\BB^{(n-k)\times\?n}\) of the code must have full rank \(n-k\) in order for the code space to be \(k\)​-dimensional (otherwise it would be bigger and there could be no bijection between the code words and the elements of \(\BB^k\)).

On the other hand, from \(d(C)=d\), using exercise 10.20, we see that the rank of \(H\) must be at least \(d-1\). Hence \(n-k\geq\?d-1\). QED.

Exercise 10.22

Show that all Hamming codes have distance \(3\), and thus can correct an error on a single bit. The Hamming codes are therefore \([2^r-1,2^r-r-1,3]\) codes.

Proof

The parity check matrix of the Hamming code has columns \(h_j\), where \(h_j\) is the binary representation of \(j\), for \(j\in\{1,\ldots,2^r-1\}\). Hence

\[ h_1 + h_2 = h_3 , \]

implying that the first three columns are linearly dependent. On the other hand, each column is non-zero (since \(j\geq1\)) and any two distinct columns are linearly independent, since

\[ h_i + h_j = 0 \]

implies \(i=j\). Now the claim follows from exercise 10.20. QED.
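
This can also be cross-checked with Sage's built-in coding theory module, e.g. for \(r=3\):

C = codes.HammingCode(GF(2), 3)
(C.length(), C.dimension(), C.minimum_distance())  # (7, 4, 3)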

Exercise 10.23

Prove the Gilbert–Varshamov bound: Let \(\varepsilon\?>0\) and \(0\?<\delta\?<\?1/2\). There exists an \([n,k,d]\) code such that

\[ R \geq 1 - H(\delta) - \varepsilon , \]

where \(R=k/n\) is the rate of the code, \(\delta\leq(d-1)/n\), and \(H(x)=-x\log_2(x)-(1-x)\log_2(1-x)\). In other words, for a given relative distance \(\delta\?<\?1/2\) there is always a corresponding code whose transmission rate is not too bad.

Remark
  • For \(\varepsilon=0\) this would be equivalent to the original exercise. The proof suggests that it is necessary to have \(\varepsilon\?>\?0\). Moreover the Wikipedia version of the theorem also has it. Hence I believe this is just an error in the exercise.
  • Another reason why I reformulated the statement is the following: In the original statement I originally imagined \(t\) to be fixed. But then if you only consider large \(n\) this isn't really useful since a code with very many bits has also a large "attack surface" for errors. So it is more natural to consider a relative quantity instead.
  • The authors write "The proof of the Gilbert–Varshamov bound is quite simple …". I do not share this assessment. Maybe my solution is more complicated than it should be. But in my opinion most exercises in part III of the book (so far) are much easier.

Proof

In this proof we use the so-called probabilistic method, which seems to be the standard way to prove this bound (see e.g. Wikipedia for the general case of linear codes over finite fields).

Let \(n\geq\?k\) be parameters to be fixed later. The main idea is to consider the uniform probability distribution over \(\BB^{n\times\?k}\) and to draw a random generator \(G\) from it (let us denote the generated code by \(C\)). We then show that there is a non-zero probability that the drawn generator generates a code with the desired properties.

Note one subtle thing here: the generator must be injective in order to generate a \(k\)​-dimensional code. Therefore consider the following lemma.

Lemma
For all \(n\geq\?k\) let \(P(n,k)\) be the probability that a uniformly at random drawn matrix \(G\in\BB^{n\times\?k}\) has rank \(k\). There exists a constant \(P_\infty\?>\?0\) (which does not depend on \(n\) or \(k\)) such that \(P(n,k)\geq\?P_\infty\).
Proof

Clearly \(P(n,k)\geq\?P(k,k)\). So we only have to consider the case \(n=k\). For \(l\leq\?k\) let \(P_l\) be the probability that a random matrix from \(\BB^{l\times\?k}\) has rank \(l\). Clearly

\[ P_1 = (2^k - 1) / 2^k = 1 - 2^{-k} \]

because the zero vector is the only row vector which leads to a single-row matrix of rank zero. Observe that

\[ P_{l+1} = P_l \cdot (2^k - 2^l) /2^k . \]

This is because in order to have \(l+1\) independent rows you first need \(l\) independent rows and the last row has to be outside the \(l\)​-dimensional subspace (with \(2^l\) elements) of the first \(l\) rows. Hence

\[ P(k,k) = P_k = \prod_{j=1}^{k} (1 - 2^{-j}) \geq \prod_{j=1}^{\infty} (1 - 2^{-j}) =: P_\infty > 0 . \]

QED.

Our goal is to prove that for sufficiently large \(n\) (considering also non-injective \(G\))

\[ \prob{d(C)\geq d} > 1 - P_\infty \]

because then

\[ \prob{d(C)\geq d \text{ and } \rank{G}=k} \geq \prob{d(C)\geq d} - \prob{\rank{G} < k} \geq \prob{d(C)\geq d} - (1 - P_\infty) > 0 , \]

which proves the claim. Let us estimate the probability that the distance of the code is not at least \(d\). In order to show the above bound it suffices to show that the following can be made arbitrarily small (for large \(n\)):

\[ \prob{d(C)\leq d-1} \leq \sum_{x\in\BB^k\backslash\{0\}} \prob{\wt{Gx} \leq d-1} \leq \frac{V_d}{2^{n-k}} . \]

The first inequality follows from the fact that the event \(d(C)\leq\?d-1\) is equivalent to the event that at least one of the non-zero code words \(Gx\) has weight at most \(d-1\) (together with the sub-additivity of probabilities). For the bound on the right first note that for each \(x\neq0\) the code word \(Gx\) is uniformly distributed on \(\BB^n\) (each column of \(G\) is uniformly distributed and independent of the others, and so is every non-empty sum of columns). Hence \(\prob{\wt{Gx}\leq\?d-1}=V_d/2^n\), and the sum has fewer than \(2^k\) terms, where

\[ V_d = \sum_{i=0}^{d-1} \binom{n}{i} \]

is the number of potential code words with weight at most \(d-1\).

Lemma
\(V_d\leq2^{H(\delta)n}\) if \(\delta\leq1/2\), where \(\delta=(d-1)/n\).
Remark
A reverse inequality is also true: \(V_d\geq2^{H(\delta)n+o(n)}\). This can be seen from Stirling's formula. Actually I knew this reverse inequality already since I solved exercise 6.14; see e.g. here to get an idea for the proof. This is the reason I found it natural to connect \(V_d\) to the entropy. I was a bit surprised to find out that the \(V_d\leq\ldots\) inequality has such a clean form (no Landau notation).
Proof

The claim follows from this:

\[ \sum_{i=0}^{d-1} \binom{n}{i} 2^{-H(\delta)n} = \sum_{i=0}^{d-1} \binom{n}{i} \delta^{d-1} (1-\delta)^{n-d+1} \leq \sum_{i=0}^{d-1} \binom{n}{i} \delta^i (1-\delta)^{n-i} \leq (\delta + (1 - \delta))^n = 1 . \]

For the first equality we used \(2^{-H(\delta)n}=\delta^{\delta n}(1-\delta)^{(1-\delta)n}\) together with \(\delta\?n=d-1\). For the inequality in the middle it is important to have \(\delta\leq1/2\), since then \(\delta/(1-\delta)\leq1\). The last bound is just the binomial theorem. QED.

Using the lemma in the estimate for the probability that \(d(C)\leq\?d-1\) we get

\[ P := \prob{d(C)\leq d-1} \leq \left[\frac{2^{H(\delta)+O(n\inv)}}{2^{1-R}}\right]^n = \left[\frac{2^{R+o(1)}}{2^{1-H(\delta)}}\right]^n . \]

Here we choose \(d\) minimal so that \(\delta\leq(d-1)/n\) holds. Hence \(\delta=(d-1+O(1))/n\) which explains the \(O(n\inv)\) in the inequalities above.

Let us suppose we can choose \(R\) such that

\begin{align*} R &\geq 1 - H(\delta) - \varepsilon , \\ R + o(1) &\leq 1 - H(\delta) - \varepsilon/2 , \\ \end{align*}

Then we would have \(P\leq2^{-\varepsilon\?n/2}\), which is arbitrarily small if \(n\) is large enough, proving the desired bound. But since \(R=k/n\) this is indeed possible if we choose \(n\) large enough and \(k\) appropriately. QED.

Exercise 10.24

Show that a code with generator matrix \(G\) is weakly self-dual if and only if \(G^TG=0\).

Proof

Weakly self-dual means \(C\subseteq\?C^{\bot}\). This holds if and only if every code word \(Gx\) passes all parity checks of \(C^\bot\), i.e. iff

\[ H^{\bot} G = 0 , \]

where \(G\) is the generator matrix of \(C\) and \(H^\bot\) a parity check matrix of \(C^\bot\). But a parity check matrix of \(C^\bot\) is \(H^\bot=G^T\) (this is essentially the definition of the dual code). QED.

Exercise 10.25

Let \(C\) be a linear code. Show that if \(x\in\?C^\bot\) then \(\sum_{y\in\?C}(-1)^{x\cdot\?y}=\abs{C}\), while if \(x\notin\?C^\bot\) then \(\sum_{y\in\?C}(-1)^{x\cdot\?y}=0\).

Proof

By definition of dual codes we have \(x\cdot\?y=0\) if \(x\in\?C^\bot\) and \(y\in\?C\). Hence, in that case

\[ \sum_{y\in\?C}(-1)^{x\cdot y} = \sum_{y\in\?C} 1 = \abs{C} \]

Assume now that \(x\notin\?C^\bot\). Then there exists a \(\tilde{y}\in\?C\) such that \(x\cdot\?\tilde{y}=1\). Let us decompose \(C=C'\oplus\?\tilde{y}\BB\). Hence

\[ \sum_{y\in C}(-1)^{x\cdot y} = \sum_{y'\in C'} \sum_{\alpha=0}^1 (-1)^{y'\cdot x + \alpha\tilde{y}\cdot x} = \sum_{y'\in C'} 0 = 0 . \]

QED.
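
A tiny numerical illustration with the \([3,1]\) repetition code \(C=\{000,111\}\), whose dual consists of the even weight words:

C = [vector(GF(2), v) for v in [(0, 0, 0), (1, 1, 1)]]
def char_sum(x):
    x = vector(GF(2), x)
    return sum((-1)^int(x.dot_product(y)) for y in C)

[char_sum(v) for v in [(0, 0, 0), (1, 1, 0), (1, 0, 0), (0, 1, 1)]]  # [2, 2, 0, 2]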

Exercise 10.26

Suppose \(H\) is a parity check matrix. Explain how to compute the transformation \(\ket{x}\ket{0}\mapsto\ket{x}\ket{Hx}\) using a circuit composed entirely of CNOT gates.

Solution

Let us explain it on the specific example of the \([3,1]\) repetition code:

\[ H = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix} . \]

A corresponding circuit would be


a_0: ──■─────────────────
       │
a_1: ──┼────■────■───────
       │    │    │
a_2: ──┼────┼────┼────■──
     ┌─┴─┐┌─┴─┐  │    │
b_0: ┤ X ├┤ X ├──┼────┼──
     └───┘└───┘┌─┴─┐┌─┴─┐
b_1: ──────────┤ X ├┤ X ├
               └───┘└───┘

The idea is that for each \((i,j)\) with \(H_{ij}=1\) we have a CNOT with target \(b_i\) and control \(a_j\).
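
The construction is easy to automate. Here is a small sketch in qiskit (the function name and interface are mine, not from the book):

from qiskit.circuit import QuantumCircuit, QuantumRegister

def parity_check_circuit(H):
    """Build the |x>|0> -> |x>|Hx> circuit from a parity check matrix H,
    given as a list of 0/1 rows, using only CNOT gates."""
    a = QuantumRegister(len(H[0]), name="a")  # input register |x>
    b = QuantumRegister(len(H), name="b")     # parity register, ends up in |Hx>
    qc = QuantumCircuit(a, b)
    for i, row in enumerate(H):
        for j, entry in enumerate(row):
            if entry == 1:
                qc.cx(a[j], b[i])  # add x_j to the i-th parity bit (mod 2)
    return qc

print(parity_check_circuit([[1, 1, 0], [0, 1, 1]]))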

Exercise 10.27

Show that the codes defined by

\[ \ket{x+C_2} = \frac{1}{\sqrt{\abs{C_2}}} \sum_{y\in\?C_2} (-1)^{u\cdot y} \ket{x+y+v} \]

and parameterized by \(u\) and \(v\) are equivalent to \(\mathrm{CSS}(C_1,C_2)\) in the sense that they have the same error-correcting properties. These codes, which we’ll refer to as \(\mathrm{CSS}_{u,v}(C_1,C_2)\), will be useful later in our study of quantum key distribution, in Section 12.6.5.

Sketch of a solution

By using \(X\), \(Z\), and Hadamard gates one can unitarily transform the code into the following form

\[ \ket{x+C_2} = \frac{1}{\sqrt{\abs{C_2}}} \sum_{y\in\?C_2} (-1)^{u\cdot x} \ket{x+y} = (-1)^{u\cdot x} \frac{1}{\sqrt{\abs{C_2}}} \sum_{y\in\?C_2} \ket{x+y} . \]

Basically we introduce errors \(e_1=v\) and \(e_2=u\) on purpose to accomplish that. Now it is not hard to verify that the factor \((-1)^{u\cdot\?x}\) does not interfere with the analysis from the book.

Exercise 10.28

Verify that the transpose of the matrix in (10.77) is the generator of the \([7,4,3]\) Hamming code.

Solution

The claim follows essentially from the following calculation:

# Standard parity check matrix for the [7,4] Hamming code:
H = matrix(GF(2), [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
])

# We want to check that this is indeed a generator:
G = matrix(GF(2), [
    [1, 0, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 1],
]).T

assert H*G == matrix.zero(3, 4)
"PASSED"
'PASSED'

One important thing to note is that \(G\) has full rank (obviously). Hence \(HG=0\) is sufficient to prove \(G\) to be a generator of the \([7,4]\) Hamming code.

Exercise 10.29

Show that an arbitrary linear combination of any two elements of \(V_S\) is also in \(V_S\). Therefore, \(V_S\) is a subspace of the \(n\) qubit state space. Show that \(V_S\) is the intersection of the subspaces fixed by each operator in \(S\) (that is, the eigenvalue one eigenspaces of elements of \(S\)).

Solution

We have

\begin{align*} V_S &= \{v | \forall g\in S: gv = v\} \\ &= \bigcap_{g\in S} \{v | gv = v\} \\ &= \bigcap_{g\in S} \ker(g-I) . \end{align*}

Since the kernel of a linear operator is a linear subspace and since the intersection of linear subspaces is again a linear subspace, the claim follows.

Exercise 10.30

Show that \(-I\notin\?S\) implies \(\pm\ii\?I\notin\?S\).

Proof

Recall the elementary logical rule that \(\neg\?a\Rightarrow\neg\?b\) is equivalent to \(a\Leftarrow\?b\). Hence, we have to prove that \(\pm\ii\?I\in\?S\) implies \(-I\in\?S\). But this is obvious: if \(\pm\ii\?I\in\?S\) then also \((\pm\ii\?I)^2=-I\in\?S\). QED.

Exercise 10.31

Suppose \(S\) is a subgroup of \(G_n\) generated by elements \(g_1,\ldots,g_l\). Show that all the elements of \(S\) commute if and only if \(g_i\) and \(g_j\) commute for each pair \(i\), \(j\).

Remark
The claim remains true if we replace \(G_n\) by any group. This follows from the proof below.

Proof

One direction of the implication is obvious. Therefore let us just prove the other direction.

Assume that all \(g_i\) and \(g_j\) commute. One can show that every \(g\in\?S\) can be represented (not necessarily uniquely) as

\[ g = \prod_{j=1}^l g_j^{x_j} \]

where \(x_j\in\ZZ\), and that inversion and products are represented as follows

\begin{align*} g\inv &= \prod_{j=1}^l g_j^{-x_j} \\ g\cdot h &= \prod_{j=1}^l g_j^{x_j+y_j} \end{align*}

where \(h=\prod_{j=1}^lg_j^{y_j}\). More precisely: if we already know that \(g\) and \(h\) have such a representation then from commutativity the representations for \(g\inv\) and \(gh\) easily follow. Now observe that the set \(S'\) of \(g\) which can be represented like this must be contained in \(S\) (because the \((g_j)\) generate \(S\)). But we have just seen that making products and inverses doesn't take us outside of \(S'\). Hence \(S'=S\) as claimed.

But now it is obvious that \(S\) is abelian because the standard representations of \(gh\) and \(hg\) are the same. QED.

Remark
Above we used the fact that if \(g_i\) and \(g_j\) commute then also \(g_i^x\) and \(g_j^y\) commute for any \(x,y\in\ZZ\). This is clear for non-negative \(x\), \(y\). The general case follows from the fact that \(g_j\) and \(g_i\inv\) commute. In fact, let \(a=g_j\inv\?g_i\) and \(b=g_ig_j\inv\). Since \(g_i\) and \(g_j\) commute we have \(g_ja=g_jb\). But this implies \(a=b\).

Exercise 10.32

Verify that the generators in Figure 10.6 stabilize the codewords for the Steane code, as described in Section 10.4.2.

Solution

Note that there is a subtle error in the paragraph right before the exercise (see the section on CSS codes in the stabilizer formalism above). Moreover I think it makes sense to prove the more general theorem stated in that section. So let us do this.

An element of the CSS code looks like this

\[ \ket{x+C_2} = \frac{1}{\sqrt{\abs{C_2}}} \sum_{y\in C_2} \ket{x + y} , \]

where \(x\in\?C_1\). Recall that \(C_2\subseteq\?C_1\). Hence all the \(x+y\) above also belong to \(C_1\). Hence \(A(x+y)=0\). Hence for all \(i=1..(n-k_1)\) we have

\begin{align*} g_{k_2+i}\ket{x+C_2} &= \frac{1}{\sqrt{\abs{C_2}}} \sum_{y\in C_2} \bigotimes_{j=1}^n Z^{A_{ij}} \ket{x + y} \\ &= \frac{1}{\sqrt{\abs{C_2}}} \sum_{y\in C_2} \underbrace{(-1)^{A_i\cdot(x+y)}}_{= 1} \ket{x + y} \\ &= \ket{x+C_2} . \end{align*}

(I write \(A_i\) for the \(i\)-th row of \(A\).) Hence the \(g_{k_2+i}\) stabilize the code. To prove the same statement for \(g_i\) with \(i=1..k_2\) we switch to the X-basis by applying the Hadamard transform to the state. Note that this transform changes the \(X\) in the definition of the \(g_i\) to \(Z\). As in section 10.4.2 we see that

\[ H^{\otimes n} \ket{x+C_2} = \sqrt{\frac{\abs{C_2}}{2^n}} \sum_{z\in C_2^\top} (-1)^{x\cdot z} \ket{z} . \]

Hence we see in a similar way as in the first part that

\begin{align*} H^{\otimes n} g_{i}\ket{x+C_2} &= H^{\otimes n} g_{i} H^{\otimes n}H^{\otimes n} \ket{x+C_2} \\ &= \sqrt{\frac{\abs{C_2}}{2^n}} \sum_{z\in C_2^\top} (-1)^{x\cdot z} \bigotimes_{j=1}^n Z^{B_{ij}} \ket{z} \\ &= \sqrt{\frac{\abs{C_2}}{2^n}} \sum_{z\in C_2^\top} (-1)^{x\cdot z} \underbrace{(-1)^{B_i\cdot z}}_{= 1} \ket{z} \\ &= H^{\otimes n} \ket{x+C_2} . \end{align*}

By dividing away the Hadamard transform we see that the \(g_i\) for \(i=1..k_2\) stabilize the CSS code too. This already shows the claim of the exercise.

In order to finish the proof of the theorem we need to show that the given generators do not stabilize an even larger space than the desired CSS code. But this is basically a dimensionality argument. By design the CSS code has dimension \(2^{k}\) where \(k=k_1-k_2\). On the other hand a stabilizer code with \(n-k\) independent and commuting generators such that \(-I\notin S\) has the same dimension (Proposition 10.5). Commutativity is clear from the form of the generators, and independence follows from the fact that the parity check matrices have full rank. Finally, since all generators commute and carry no phases, every element of \(S\) can be written as \(X^uZ^v\) with a plus sign; no such operator equals \(-I\), hence \(-I\notin S\). QED.

Exercise 10.33

Show that \(g\) and \(g'\) commute if and only if \(r(g)\Lambda\?r(g')^T=0\). (In the check matrix representation, arithmetic is done modulo two.)

Solution

Let \(\beta=r(g)\Lambda\?r(g')^T\) (which is an element of \(\BB\)). We will show that

\[ gg' = (-1)^\beta g'g , \]

which implies the claim of the exercise. Let us write

\begin{align*} g &= \alpha \bigotimes_{j=1}^n N_j , \\ g' &= \alpha' \bigotimes_{j=1}^n N'_j , \end{align*}

where \(N_j,N'_j\in\{I,X,Y,Z\}\), and \(\alpha,\alpha'\in\{1,-1,\ii,-\ii\}\). Without loss of generality we may assume \(\alpha=\alpha'=1\). Moreover let us split \(r=r_x\oplus\?r_z\) where \(r_x\) is responsible for the first \(n\) bits and \(r_z\) for the last \(n\) bits. We have

\[ \beta = r_x(g) \cdot r_z(g') + r_z(g) \cdot r_x(g') = \sum_{j=1}^n \underbrace{r_x(N_j) \cdot r_z(N'_j) + r_z(N_j) \cdot r_x(N'_j)}_{=:\beta_j} . \]

It is not hard to verify that \(N_jN'_j=(-1)^{\beta_j}N'_jN_j\). Hence \(gg'=(-1)^{\sum_j\beta_j}g'g\). QED.
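 
For the skeptical reader, here is a quick numerical sanity check of the single-qubit relation \(N_jN'_j=(-1)^{\beta_j}N'_jN_j\) (a small sketch using qiskit and numpy; the dictionary xz below is just the usual \((x,z)\) encoding of the one-qubit Paulis):

from itertools import product

import numpy as np
from qiskit.quantum_info import Pauli

# (x, z) encoding of the single qubit Paulis: I=(0,0), X=(1,0), Y=(1,1), Z=(0,1).
xz = {"I": (0, 0), "X": (1, 0), "Y": (1, 1), "Z": (0, 1)}

for n1, n2 in product("IXYZ", repeat=2):
    beta = (xz[n1][0] * xz[n2][1] + xz[n1][1] * xz[n2][0]) % 2
    N1, N2 = Pauli(n1).to_matrix(), Pauli(n2).to_matrix()
    assert np.allclose(N1 @ N2, (-1) ** beta * N2 @ N1)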

Exercise 10.34

Let \(S=\langle\?g_1,\ldots,g_l\rangle\). Show that \(-I\) is not an element of \(S\) if and only if \(g_j^2=I\) for all \(j\), and \(g_j\neq-I\) for all \(j\).

Remark
The implication \(-I\notin\?S\Rightarrow\ldots\) is just exercise 10.35. The other direction is actually wrong.

Counterexamples

The claim of the exercise is wrong. To see this consider the following counterexample. Let

\[ g_1 = X, \; g_2 = Z . \]

Clearly \(g_j^2=I\) for all \(j\), and \(g_j\neq-I\) for all \(j\). On the other hand \(XZXZ=-ZZ=-I\) and hence \(-I\in\?S\). One might wonder if the claim becomes true if one assumes that the \(g_j\) commute (and are independent). But this is also not true as the following counterexample shows:

\[ g_1 = Z, \; g_2 = -Z . \]
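 
Both counterexamples are easy to check by hand, but we can also let Sage do it (a small check using the single-qubit matrices X, Z, Id from the utility imports above):

assert X * Z * X * Z == -Id   # hence -I is an element of <X, Z>
assert Z * (-Z) == -Id        # hence -I is an element of <Z, -Z>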

Exercise 10.35

Let \(S\) be a subgroup of \(G_n\) such that \(-I\) is not an element of \(S\). Show that \(g^2=I\) for all \(g\in\?S\), and thus \(g^\dagger=g\).

Remark
The original exercise contained the additional implication that \(g\neq-I\) for all \(g\in S\), but this is a slightly too trivial conclusion if you already assume that \(-I\notin S\) - in my opinion.

Proof

Recall that any \(g\in\?G_n\) has the form

\begin{align*} g &= \alpha \bigotimes_{j=1}^n N_j , \end{align*}

where \(N_j\in\{I,X,Y,Z\}\), and \(\alpha\in\{1,-1,\ii,-\ii\}\). Hence \(g^2=\alpha^2I=\pm\?I\). By \(-I\notin\?S\) we must have \(g^2=I\) as desired. QED.

Exercise 10.36

Let \(U\) be the CNOT gate. Explicitly verify that \(UX_1U^\dagger=X_1X_2\), \(UX_2U^\dagger=X_2\), \(UZ_1U^\dagger=Z_1\), and \(UZ_2U^\dagger=Z_1Z_2\). These and other useful conjugation relations for the Hadamard, phase, and Pauli gates are summarized in Figure 10.7.

Solution

This is a simple but slightly tedious exercise. Hence let us do a more fun version of it: let qiskit solve it! To this end let us define the following auxiliary function:

def print_ex1036(pauli_strings: list[str]):
    """Given a list of Pauli strings (e.g. ["XI", "IX"]) print how CX acts on them."""
    qc = QuantumCircuit(2)
    qc.cx(1, 0)  # swapping control and target compensates for different qubit order
    U = Clifford(qc)

    print("Applying CX acts as")
    for pstring in pauli_strings:
        p0 = Pauli(pstring)
        p1 = p0.evolve(U, frame="s")
        print(f"{p0} -> {p1}")

It prints the solution to this exercise. We also included YI and IY for later reference.

print_ex1036(["XI", "IX", "ZI", "IZ", "YI", "IY"])
Applying CX acts as
XI -> XX
IX -> IX
ZI -> ZI
IZ -> ZZ
YI -> YX
IY -> ZY

In case you wonder whether p0.evolve swallows the phase: it does not, as can be seen from here:

qc = QuantumCircuit(1)
qc.x(0)
U = Clifford(qc)
Pauli("Y").evolve(U)
-Y

As you can see it shows the minus sign and we can correctly infer \(XYX=-Y\).

Exercise 10.37

What is \(UY_1U^\dagger\) (for \(U\) being CNOT)?

Solution

Knowing the transformation for \(Z_1\) and \(X_1\) suffices to calculate this (also note that \(U^\dagger=U\)):

\[ UY_1U = -\ii UZ_1X_1U = -\ii UZ_1UUX_1U = -\ii Z_1X_1X_2 = Y_1X_2 . \]

Note also that this coincides with the output in the solution of exercise 10.36.

Exercise 10.38

Suppose \(U\) and \(V\) are unitary operators on two qubits which transform \(Z_1\), \(Z_2\), \(X_1\), and \(X_2\) by conjugation in the same way. Show this implies that \(U=V\).

Remark
Clearly, equality has to be understood in the sense "equal up to global phase". For example \(\ii\?I\) and \(I\) act the same way under conjugation.

Proof

Let us generalize the exercise by considering \(n\) qubits and assuming that \(U\) and \(V\) act in the same way on \(Z_j\) and \(X_j\) for all \(j=1..n\). Since these Paulis already generate the whole Pauli group \(G_n\) we have

\[ \forall g\in G_n: \; UgU^\dagger = VgV^\dagger . \]

Let us note that we can easily deduce that this even holds for all \(g\in\CC^{2^n\times\?2^n}\). The reason is that the \(4^n\) possible Pauli operators (without phase) form a basis for this matrix space (they are orthogonal with respect to the Hilbert-Schmidt inner product). But we do not need this. Instead consider \(J=V^\dagger\?U\) and observe that

\[ \forall g\in G_n: \; JgJ^\dagger = g = IgI . \]

We have to show that \(J=e^{\ii\varphi}I\) for some \(\varphi\). Note that a stabilizer for the one dimensional space \(\CC\ket{k}\) generated by a standard basis element \(\ket{k}\) is given by \(\langle(-1)^{k_j}Z_j\;|\;j=1..n\rangle\) (where \(k_1\ldots\?k_n\) is the binary expansion of \(k\)). Hence \(J\) has to map \(\CC\ket{k}\) to itself (as \(I\) does). In other words

\[ J = \sum_{k\in\BB^n} e^{\ii\varphi_k} \proj{k} . \]

For a binary vector \(v\in\BB^n\) let \(X^v=\bigotimes_{j=1}^nX^{v_j}\). Observe that (here \(v+k\) is performed in \(\BB^n\))

\begin{align*} J X^v J^\dagger &= \left( \sum_k e^{\ii\varphi_k} \proj{k} \right) \left( \sum_k e^{-\ii\varphi_k} \ket{v+k}\bra{k} \right) \\ &= \left( \sum_k e^{\ii\varphi_k} \proj{k} \right) \left( \sum_k e^{-\ii\varphi_{v+k}} \ket{k}\bra{v+k} \right) \\ &= \sum_k e^{\ii(\varphi_k-\varphi_{v+k})} \ket{k}\bra{v+k} . \end{align*}

Comparing this with \(IX^vI\) we see that \(\varphi_k-\varphi_{v+k}=0\) for all \(k\) and \(v\). In other words, all the \(\varphi_k\) are the same. QED.

Exercise 10.39

Verify (10.91):

\[ SXS^\dagger = Y; \quad SZS^\dagger = Z \]

Proof

The equality \(SZS^\dagger=Z\) follows from the fact that \(S=\sqrt{Z}\) is a function of \(Z\) (in the sense of the functional calculus for hermitian operators) and hence commutes with \(Z\).

The equality \(SXS^\dagger=Y\) most easily follows from the Bloch sphere interpretation. In fact, \(S=e^{\ii\pi/4}R_z(\pi/2)\) is (up to global phase) a rotation by 90° around the z-axis. This rotation maps the x-axis to the y-axis.

QED.
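 
As a cross-check, qiskit confirms both relations (a small sketch; SGate and Clifford are standard qiskit classes, and frame="s" conjugates as \(SPS^\dagger\), just like in exercise 10.36):

from qiskit.circuit.library import SGate
from qiskit.quantum_info import Clifford, Pauli

S = Clifford(SGate())
assert Pauli("X").evolve(S, frame="s") == Pauli("Y")
assert Pauli("Z").evolve(S, frame="s") == Pauli("Z")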

Exercise 10.40

Provide an inductive proof of Theorem 10.6 as follows.

  1. Prove that the Hadamard and phase gates can be used to perform any normalizer operation on a single qubit.
  2. Suppose \(U\) is an \(n+1\) qubit gate in \(N(G_{n+1})\) such that \(UZ_1U^\dagger=X_1\otimes\?g\) and \(UX_1U^\dagger=Z_1\otimes\?g'\) for some \(g,g'\in\?G_n\). Define \(U'\) on \(n\) qubits by \(U'\ket{\psi}=\sqrt{2}\bra{0}U(\ket{0}\otimes\ket{\psi})\). Use the inductive hypothesis to show that the construction for \(U\) in Figure 10.9 may be implemented using \(O(n^2)\) Hadamard, phase and CNOT gates.
  3. Show that any gate \(U\in\?N(G_{n+1})\) may be implemented using \(O(n^2)\) Hadamard, phase and CNOT gates.
                 ┌───┐
 n+1: ────────■──┤ H ├──■───
      ┌────┐┌─┴─┐└───┘┌─┴──┐
1..n: ┤ U' ├┤ g ├─────┤ g' ├
      └────┘└───┘     └────┘

Proof of (1)

A single qubit normalizer gate is a gate \(U\) which satisfies

\[ UZU^\dagger , UXU^\dagger \in \{\pm X, \pm Y, \pm Z\} . \]

We can restrict our attention to the action of \(U\) on \(Z\) and \(X\) because the action on the other Pauli gates are either trivial (for \(I\)) or can be inferred like in this example: \(UYU^\dagger=-\ii\?UZU^\dagger\?UXU^\dagger\) (because \(Y=-\ii\?ZX\) or more generally \(\langle\?Z,X\rangle=G_1\)). In case you wonder why I do not allow e.g. \(UZU^\dagger=\pm\?I\) or \(UZU^\dagger=\pm\ii\?Z\): This is not possible because conjugation preserves eigenvalues.

A convenient way to go on is to switch into the Bloch sphere picture. In that sense \(U\) can be identified (neglecting global phase) with a rotation (in three dimensional real space) which maps \(\hat{z}\) and \(\hat{x}\) to two (orthogonal) axes (including their negatives). In other words \(\hat{z}\) and \(\hat{x}\) are mapped to \(\{\pm\hat{x},\pm\hat{y},\pm\hat{z}\}\) where every combination is possible if orthogonality is satisfied (e.g. mapping both \(\hat{z}\) and \(\hat{x}\) to \(\hat{z}\) is not possible). In the following we identify the Pauli matrices \(X,Y,Z\) with their corresponding axes \(\hat{x},\hat{y},\hat{z}\).

Note that the Bloch sphere interpretation shows that \(U\) is already uniquely defined by its action on \(Z\) and \(X\) (for a rotation the action on the third axis is fixed by orthogonality and orientation-preservation). Compare this to the solution of exercise 10.38 which also shows this.

Let us show that rotations by (multiples of) 90° around any coordinate axis can be used to compose \(U\). Let \(\calR\) denote the group of all such rotations. Clearly there is a \(V\in\calR\) which satisfies

\[ V Z V^\dagger = U Z U^\dagger . \]

In case we already have \(VXV^\dagger=UXU^\dagger\) we are done (\(V=U\), hence \(U\in\calR\)). Otherwise we know that \(UXU^\dagger\) and \(VXV^\dagger\) are both orthogonal (in the Bloch sphere) to \(VZV^\dagger\). Hence there is a \(W\in\calR\) with rotation axis \(VZV^\dagger\) such that \(WVXV^\dagger\?W^\dagger=UXU^\dagger\). Since \(W\) leaves \(VZV^\dagger\) invariant, \(WV\) acts on \(Z\) and \(X\) exactly as \(U\) does, hence \(WV=U\) and thus \(U\in\calR\).

It remains to show that \(\calR=\langle\?H,S\rangle\). Note that \(S=\sqrt{Z}=e^{\ii\pi/4}R_z(\pi/2)\) is (up to global phase) a rotation by 90° around \(Z\). Hence \(S\in\calR\). Moreover \(H=(X+Z)/\sqrt{2}\) is a rotation by 180° around the axis \((X+Z)/\sqrt{2}\). We know that \(HZH=X\) and \(HXH=Z\). Hence \(H\) is a normalizer and thus \(H\in\calR\). But let us give an explicit formula in terms of elements of \(\calR\) (you can use the procedure used to show \(U\in\calR\) to find it)

\[ H = R_x(\pi) R_y(\pi/2) \in \calR . \]

Hence \(\langle\?H,S\rangle\subseteq\calR\). For the other direction note that the 90° rotations around the three axes \(Z\), \(X\), and \(Y\) are:

\[ \sqrt{Z} = S, \quad \sqrt{X} = HSH, \quad \sqrt{Y} = S\sqrt{X}S^\dagger = SHSHS^3 . \]

To see all this also note that \(Y=SXS^\dagger\) and \(S^\dagger=S^3\). Hence \(\calR\subseteq\langle\?H,S\rangle\).
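 
The decompositions above are easy to check numerically, e.g. with qiskit operators (a small sketch; Operator.equiv compares up to a global phase, which is all we care about here):

import numpy as np
from qiskit.circuit import QuantumCircuit
from qiskit.quantum_info import Operator

# H = R_x(pi) R_y(pi/2) up to a global phase (rightmost factor acts first):
h_decomp = QuantumCircuit(1)
h_decomp.ry(np.pi / 2, 0)
h_decomp.rx(np.pi, 0)
assert Operator(h_decomp).equiv(Operator.from_label("H"))

# sqrt(Y) = S H S H S^3: its square is Y (again up to a global phase):
sqrt_y = QuantumCircuit(1)
for _ in range(3):  # S^3 acts first
    sqrt_y.s(0)
sqrt_y.h(0)
sqrt_y.s(0)
sqrt_y.h(0)
sqrt_y.s(0)
op = Operator(sqrt_y)
assert op.compose(op).equiv(Operator.from_label("Y"))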

Proof of (2)

Let us first calculate a matrix representation of the circuit given in the exercise:

\begin{align*} & (\proj{0}\otimes I + \proj{1}\otimes g) \cdot H\otimes I \cdot (\proj{0}\otimes I + \proj{1}\otimes g') \cdot I\otimes U' \\ =\; & (\proj{0}\otimes I + \proj{1}\otimes g) \cdot (\ket{+}\bra{0}\otimes I + \ket{-}\bra{1}\otimes g') \cdot I\otimes U' \\ =\; & \frac{1}{\sqrt{2}} (\proj{0}\otimes I + \ket{0}\bra{1}\otimes g' + \ket{1}\bra{0}\otimes g - \proj{1}\otimes gg') I \otimes U' . \end{align*}

Hence the circuit implements the following unitary operator

\[ \tilde{U} = \frac{1}{\sqrt{2}} \begin{bmatrix} U' & g'U' \\ gU' & -gg'U' \end{bmatrix} . \]

We have to show that \(U=\tilde{U}\). Let us note that \(g\) and \(g'\) commute because \(Z_1\) and \(X_1\) already anti-commute.

Now let us investigate \(U\). Let us write it in block form where each block \(U_{ij}\) represents \(n\) qubits:

\[ U = \begin{bmatrix} U_{00} & U_{01} \\ U_{10} & U_{11} \end{bmatrix} . \]

Note that this already implies \(U_{00}=U'/\sqrt{2}\) (by definition of \(U'\)). Unitarity of \(U\) (\(UU^\dagger=I\)) implies for \(i,j\in\{0,1\}\):

\[ \sum_{k=0}^1 U_{ik}U_{jk}^\dagger = \delta_{ij} I_n . \]

Consider

\[ X_1\otimes\?g = UZ_1U^\dagger = \sum_{ij}\sum_k (-1)^k \ket{i}\bra{j} \otimes U_{ik} U_{jk}^\dagger . \]

It is not hard to see that this and the unitarity relation imply

\[ \sqrt{2}U_{ij} \text{ is unitary}, \]

and

\[ U_{0k}U_{1k}^\dagger = \frac{1}{2} (-1)^k g = U_{1k}U_{0k}^\dagger . \]

In particular \(U'=\sqrt{2}U_{00}\) is unitary. Moreover, those findings further imply

\begin{align*} U_{10} &= gU_{00} = \frac{1}{\sqrt{2}} g U' , \\ U_{11} &= -g U_{01} . \end{align*}

Next consider

\[ Z_1\otimes g' = UX_1U^\dagger = \sum_{ij}\sum_k \ket{i}\bra{j} \otimes U_{ik}U_{j\overline{k}}^\dagger , \]

where \(\overline{k}=k+1\) (in \(\BB\)). From this we deduce

\begin{align*} U_{10}U_{01}^\dagger + U_{11}U_{00}^\dagger &= 0 , \\ U_{10}U_{11}^\dagger + U_{11}U_{10}^\dagger &= -g' . \end{align*}

The first of these two relations can be used with the formulas for \(U_{10}\) and \(U_{11}\) to deduce \(U_{00}U_{11}^\dagger=gU_{11}U_{00}^\dagger\?g\). This can be used in the second relation and again with said formulas (to replace \(U_{10}\)) to obtain

\[ U_{11} = -gg' U_{00} = \frac{-1}{\sqrt{2}} gg' U' . \]

From this we also deduce \(U_{01}=-gU_{11}=g'U'/\sqrt{2}\). Hence we proved \(U=\tilde{U}\). It remains to count the number of gates to implement the circuit given for \(\tilde{U}=U\). By the induction hypothesis we need \(O(n^2)\) gates (CNOT, \(S\), \(H\)) to implement \(U'\) (it is clearly a normalizer because \(U\) is one and we will show that the rest in the circuit is also a normalizer). The controlled Pauli gates \(C(g)\) and \(C(g')\) can be implemented by at most \(n\) gates of the form \(C(N)\) where \(N\in\{\pm\,X,\pm\,Y,\pm\,Z\}\) (each of which are normalizer gates and can be implemented by at most five of our base gates).

To summarize: If \(s_n=O(n^2)\) is the maximal number of gates (\(S\), \(H\), CNOT) needed to implement an \(n\)​-qubit normalizer gate, then the circuit \(U\) can be implemented in at most

\[ s_n + 10n + 1 = O(n^2) \]

many gates.

Proof of (3)

Let \(U\) be any \(n+1\)​-qubit normalizer gate. Then we have

\begin{align*} U Z_1 U^\dagger &= M \otimes g , \\ U X_1 U^\dagger &= N \otimes g' , \end{align*}

where we assume that \(M,N\in\{\pm\,I,\pm\,X,\pm\,Y,\pm\,Z\}\) and \(g\) and \(g'\) are a product of such operators (even without sign to make them unique). There are two cases:

Case 1
\(M,N\in\{\pm\,X,\pm\,Y,\pm\,Z\}\) are both "proper" Pauli operators (none of them is plus/minus the identity).
Case 2
\(M=\pm\,I\) or \(N=\pm\,I\).

Consider case 1. By the proof of part (1) we can apply \(O(1)\) many single qubit operations to implement a circuit \(V\) which does \(VMV^\dagger=X\) and \(VNV^\dagger=Z\). Hence \(VU\) satisfies the assumptions of part (2) and this case can be reduced to part (2).

Consider case 2. Recall that swap gates can be implemented by three CNOT gates. Try to find a qubit \(j\) so that both \(M\otimes\?g\) and \(N\otimes\?g'\) have a proper Pauli at position \(j\). If this is possible we can just swap qubits \(1\) and \(j\) to reduce to case 1. Otherwise we can at least assume that e.g. \(M\neq\pm\?I\) by a swap argument. Hence we have the following situation

\begin{align*} U Z_1 U^\dagger &= M \otimes I \otimes \ldots , \\ U X_1 U^\dagger &= I \otimes K \otimes \ldots , \end{align*}

where again by a swap argument \(K\neq\pm\?I\) can be assumed. Without loss of generality we can even assume that \(M=X\) and \(K=Z\):

\begin{align*} U Z_1 U^\dagger &= X \otimes I \otimes \ldots , \\ U X_1 U^\dagger &= I \otimes Z \otimes \ldots . \end{align*}

In fact, this follows from part (1) by applying single qubit operations to the first and second qubit. Note that by applying a single CNOT gate (with control at \(1\) and target at \(2\)) the RHS can be replaced by

\begin{align*} & X \otimes X \otimes \ldots , \\ & Z \otimes Z \otimes \ldots . \end{align*}

which leaves us in the situation of case 1. QED.

Exercise 10.41

Verify Equations (10.92) through (10.95):

Solution

The equations

\[ TZT^\dagger = Z; \quad TXT^\dagger = \frac{X+Y}{\sqrt{2}} \]

can be seen in exactly the same way as the corresponding equations in exercise 10.39. The first one follows most easily from commutativity, using \(T=\sqrt[4]{Z}\). The second one follows from the fact that \(T\) is a 45° rotation around the z-axis, which maps the x-axis to \((\hat{x}+\hat{y})/\sqrt{2}\).
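 
Numerically this is quickly confirmed (a small sketch with qiskit and numpy; Operator.from_label just returns the standard gate matrices):

import numpy as np
from qiskit.quantum_info import Operator

T = Operator.from_label("T").data
X_, Y_, Z_ = (Operator.from_label(p).data for p in "XYZ")

assert np.allclose(T @ Z_ @ T.conj().T, Z_)
assert np.allclose(T @ X_ @ T.conj().T, (X_ + Y_) / np.sqrt(2))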

The relations \(UNU^\dagger=N\) for \(N=Z_1,Z_2,X_3\) follow from commutativity: the Toffoli gate is a sum of tensor products of the eigen-projectors of \(Z_1\) and \(Z_2\) with powers of \(X_3\), because a controlled gate like \(C(V)\) is just \(\proj{0}\otimes\?I+\proj{1}\otimes\?V\).
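 
These three commutation relations can also be verified directly in Sage, analogous to the checks below (a small sketch using the CCX matrix and the three-qubit Paulis from the utility code; recall that CCX is its own inverse):

U = CCX
for N in (ZII, IZI, IIX):
    assert U * N * U == N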

For the remaining three formulas let us use sage to verify them:

U = CCX
assert U == U.H, "CCX is symmetric!"

assert U * XII * U == kron(X, II + ZI + IX - kron(Z,X)) / 2
assert U * IXI * U == (IXI + IXI*ZII + IXI*IIX - kron(Z,X,X)) / 2
assert U * IIZ * U == kron(II + ZI + IZ - kron(Z,Z), Z) / 2
"PASSED"
'PASSED'

To find these formulas one could use that the Pauli matrices (without phase) form an orthogonal basis of \(\CC^{2^n\times2^n}\) with respect to the Hilbert-Schmidt scalar product. Also note that the norm of each Pauli operator is \(2^{n/2}\). Hence to compute e.g. the coefficients of \(UX_1U^\dagger\) one could do the following:

A = U * XII * U
assert A == A.H, "Nice to have, to avoid typing A.H all the time."
# For example: coefficients of XII and IXI of A are:
assert trace(A * XII) / 8 == 1/2
assert trace(A * IXI) / 8 == 0
# ... and so on.
"PASSED"
'PASSED'

Exercise 10.42

Use the stabilizer formalism to verify that the circuit of Figure 1.13 on page 27 teleports qubits, as claimed. Note that the stabilizer formalism restricts the class of states being teleported, so in some sense this is not a complete description of teleportation, nevertheless it does allow an understanding of the dynamics of teleportation.

Remark
Unfortunately I do not see how this exercise improves my understanding of teleportation. It seems to be a very complicated way to verify that the circuit actually teleports qubits. On the other hand it is a nice exercise to understand the (certainly very useful) stabilizer formalism.

Solution

The circuit can be divided into three major steps:

  • Unitary evolution up until the measurements,
  • Measuring the first two qubits in the Z-basis,
  • Performing conditioned operations on the third qubit.

I split up the solution according to the three stages. Additionally I add a section on a specific example to get some feeling for how things work out.

Unitary evolution

Let \(\ket{\psi}\) be the single qubit state to be teleported. The whole system is in the state \(\ket{\psi,0,0}\) initially (the circuit displayed in Figure 1.12 produces \(\ket{\psi,\beta_{00}}\), which feeds into the circuit in Figure 1.13).

Let \(N\in\{\pm\?X,\pm\?Y,\pm\?Z\}\) be the stabilizer of \(\ket{\psi}\). Of course this restricts \(\ket{\psi}\) to be an eigenstate of one of the Paulis (six possibilities up to global phase), but this is just the restriction of the stabilizer formalism already mentioned in the statement of the exercise. The stabilizer of the whole system is \(\langle\?N_1,Z_2,Z_3\rangle\) (where \(N_1=N\otimes\?I\otimes\?I\)).

Let us abbreviate \(V=\cx_{12}\) and calculate the evolution by the stabilizer formalism:

\begin{align*} & \langle N_1, Z_2, Z_3 \rangle & (\text{initial state } \ket{\psi,0,0}) \\ H_2 :\; & \langle N_1, X_2, Z_3 \rangle \\ \cx_{23} :\; & \langle N_1, X_2X_3, Z_2Z_3 \rangle & (\ket{\psi,\beta_{00}}) \\ \cx_{12} :\; & \langle VN_1V, X_2X_3, Z_1Z_2Z_3 \rangle \\ H_1 :\; & \langle H_1VN_1VH_1, X_2X_3, X_1Z_2Z_3 \rangle & (\text{right before measurement}). \end{align*}

In the following let us denote by \(g_1\), \(g_2\), \(g_3\) the three generators from the last step:

\[ g_1 = HVN_1VH, \quad g_2 = X_2X_3, \quad g_3 = X_1Z_2Z_3 . \]
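 
Before continuing, here is a quick qiskit check of this conjugation step (a sketch reusing the qubit-ordering trick from exercise 10.36: text qubits 1, 2, 3 are mapped to qiskit qubits 2, 1, 0 so that the Pauli labels read from left to right):

from qiskit.circuit import QuantumCircuit
from qiskit.quantum_info import Clifford, Pauli

qc = QuantumCircuit(3)
qc.h(1)      # H_2
qc.cx(1, 0)  # CX_{23}
qc.cx(2, 1)  # CX_{12}, i.e. V
qc.h(2)      # H_1
U = Clifford(qc)

# Example N = X (the state |+>): g_1 = H_1 V X_1 V H_1 = Z_1 X_2
assert Pauli("XII").evolve(U, frame="s") == Pauli("ZXI")
# The other two generators:
assert Pauli("IZI").evolve(U, frame="s") == Pauli("IXX")  # g_2 = X_2 X_3
assert Pauli("IIZ").evolve(U, frame="s") == Pauli("XZZ")  # g_3 = X_1 Z_2 Z_3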

Intermezzo: A specific example

Let us look at the specific example \(N=+X\), i.e. \(\ket{\psi}=\ket{+}\) (up to global phase). Here we have

\[ g_1 = H_1V N_1 VH_1 = Z_1X_2 . \]

We have to measure \(Z_1\) and \(Z_2\). Let's do \(Z_1\) first. Clearly it anti-commutes with \(g_3\) and commutes with the others. This is precisely one of the two standard situations described in section 10.5.3, which outlines how measurements are performed in the stabilizer formalism. If we measure \((-1)^{m_1}\) for \(m_1\in\BB\) the post-measurement state is

\[ S' = \langle Z_1X_2, X_2X_3, (-1)^{m_1}Z_1 \rangle . \]

Next we measure \(Z_2\). Note that \(Z_2\) anti-commutes with two of the generators from the preceding equation. We can fix this by multiplying the first generator by the second one:

\[ S' = \langle Z_1X_3, X_2X_3, (-1)^{m_1}Z_1 \rangle . \]

Now \(Z_2\) only anti-commutes with the second generator. If we measure \((-1)^{m_2}\) for \(m_2\in\BB\) the post-measurement state is

\[ S'' = \langle Z_1X_3, (-1)^{m_2}Z_2, (-1)^{m_1}Z_1 \rangle = \langle (-1)^{m_1} X_3, (-1)^{m_2}Z_2, (-1)^{m_1}Z_1 \rangle . \]

For the second equality we multiplied the first generator by the third one to get rid of the \(Z_1\). Let us note two things here: the second and the third generator do not affect the third qubit and just say that the first two qubits are in the state \(\ket{m_1,m_2}\). On the other hand \((-1)^{m_1}X_3\) only refers to the third qubit and says that it is in the state \(H\ket{m_1}\in\{\ket{+},\ket{-}\}\). For teleportation to be successful we thus have to get rid of the sign bit \(m_1\). This is what the conditioned operations accomplish.

\begin{align*} & S'' = \langle (-1)^{m_1}X_3, (-1)^{m_2}Z_2, (-1)^{m_1}Z_1 \rangle \\ X_3^{m_2} :\; & \langle (-1)^{m_1} X_3, (-1)^{m_2}Z_2, (-1)^{m_1}Z_1 \rangle \\ Z_3^{m_1} :\; & \langle X_3, (-1)^{m_2}Z_2, (-1)^{m_1}Z_1 \rangle . \end{align*}

Note that the last state is indeed \(\ket{m_1,m_2,\psi}\) as desired. Thus we have verified the teleportation property for the particular state stabilized by \(N=+X\).

Measuring \(Z_1\) and \(Z_2\)

More generally, any single qubit state which is possible within the stabilizer formalism is stabilized by some \(N=(-1)^s\ii^{ab}X^aZ^b\in\{\pm\?X,\pm\?Y,\pm\?Z\}\) where \(s,a,b\in\BB\). Hence the state right before measurement is

\begin{align*} S &= \langle H_1V (-1)^s \ii^{ab} X_1^aZ_1^b VH_1, X_2X_3, X_1Z_2Z_3 \rangle \\ &= \langle H_1 (-1)^s \ii^{ab} X_1^aX_2^aZ_1^b H_1, X_2X_3, X_1Z_2Z_3 \rangle \\ &= \langle (-1)^s \ii^{ab} Z_1^aX_2^aX_1^b, X_2X_3, X_1Z_2Z_3 \rangle \\ &= \langle (-1)^{s+ab} \ii^{ab} X_1^bX_2^aZ_1^a, X_2X_3, X_1Z_2Z_3 \rangle . \end{align*}

In the last equality I brought the product in the standard form so that all \(X\) precede all \(Z\). Doing so I had to move \(X_1^b\) past \(Z_1^a\), which introduces an additional sign factor \(-1\) in case \(a=b=1\).

Let us measure \(Z_1\) now. Note that it anti-commutes with the first generator iff \(b=1\), and always anti-commutes with the third generator. To fix this multiply the first generator by the third one raised to the power \(b\) to obtain (note that this cancels the \((-1)^{ab}\) if we rearrange the Paulis in the "X first" order):

\begin{align*} S = \langle (-1)^{s} \ii^{ab} X_2^aZ_1^aZ_2^bZ_3^b, X_2X_3, X_1Z_2Z_3 \rangle . \end{align*}

Now \(Z_1\) only anti-commutes with the third generator. After measuring \((-1)^{m_1}\) for \(m_1\in\BB\) the post-measurement state is

\begin{align*} S' = \langle (-1)^{s} \ii^{ab} X_2^aZ_1^aZ_2^bZ_3^b, X_2X_3, (-1)^{m_1}Z_1 \rangle . \end{align*}

Now let us measure \(Z_2\). Observe that \(Z_2\) anti-commutes with the first generator iff \(a=1\) and always with the second generator. To fix this let us multiply the first generator by the second raised to the power \(a\):

\begin{align*} S' = \langle (-1)^{s} \ii^{ab} X_3^aZ_1^aZ_2^bZ_3^b, X_2X_3, (-1)^{m_1}Z_1 \rangle . \end{align*}

Now \(Z_2\) only anti-commutes with the second generator. Hence, after measuring \((-1)^{m_2}\) for \(m_2\in\BB\) the post-measurement state is

\begin{align*} S'' &= \langle (-1)^{s} \ii^{ab} X_3^aZ_1^aZ_2^bZ_3^b, (-1)^{m_2}Z_2, (-1)^{m_1}Z_1 \rangle \\ &= \langle (-1)^{s+m_1a+m_2b} \ii^{ab} X_3^aZ_3^b, (-1)^{m_2}Z_2, (-1)^{m_1}Z_1 \rangle . \end{align*}

In the second equality I multiplied the first generator by the third generator raised to the power \(a\) and by the second generator raised to the power \(b\).

Apply \(Z_3^{m_1}X_3^{m_2}\)

Let us finally apply the conditioned operations:

\begin{align*} & S'' = \langle (-1)^{s+m_1a+m_2b} \ii^{ab} X_3^aZ_3^b, (-1)^{m_2}Z_2, (-1)^{m_1}Z_1 \rangle \\ X_3^{m_2} :\; & \langle (-1)^{s+m_1a} \ii^{ab} X_3^aZ_3^b, (-1)^{m_2}Z_2, (-1)^{m_1}Z_1 \rangle \\ Z_3^{m_1} :\; & \langle (-1)^{s} \ii^{ab} X_3^aZ_3^b, (-1)^{m_2}Z_2, (-1)^{m_1}Z_1 \rangle . \end{align*}

The final result is the stabilizer of the state \(\ket{m_1,m_2,\psi}\) as desired.

Exercise 10.43

Show that \(S\subseteq\?N(S)\) for any subgroup \(S\) of \(G_n\).

Remark
One can replace \(G_n\) by any group \(G\) and the statement is still true.

Proof

Recall that the normalizer of a subgroup \(S\) of a group \(G\) is

\[ N(S) = \left\{ g\in G\; | \; \forall s\in S: gsg\inv \in S \right\} . \]

By the group property, if \(g\in\?S\) then also \(gsg\inv\in\?S\). Hence \(S\subseteq\?N(S)\). QED.

Exercise 10.44

Show that \(N(S)=Z(S)\) for any subgroup \(S\) of \(G_n\) not containing \(-I\).

Proof

Obviously we have \(Z(S)\subseteq\?N(S)\) (for any subgroup \(S\) of any group). Hence we only have to show the other inclusion and this is where we need the fact that we are in the Pauli group and \(-I\notin\?S\).

Let \(g\in\?N(S)\).

Recall that for \(g,h\in\?G_n\) we have \(gh=\pm\?hg\), that is, two elements of the Pauli group either commute or anti-commute. In particular

\[ \forall s \in S: gsg^\dagger = \pm s . \]

We have to show that always the \(+\) sign holds. Assume to the contrary that there is an \(s\in\?S\) such that \(gsg^\dagger=-s\). Using \(g\in\?N(S)\) we deduce \(-s\in\?S\). Note that \(s^2=I\) by exercise 10.35. Hence \(S\owns\?s\cdot(-s)=-I\) - contradiction! QED.

Exercise 10.45 (Correcting located errors)

Suppose \(C(S)\) is an \([n,k,d]\) stabilizer code. Suppose \(k\) qubits are encoded in \(n\) qubits using this code, which is then subjected to noise. Fortunately, however, we are told that only \(d-1\) of the qubits are affected by the noise, and moreover, we are told precisely which \(d-1\) qubits have been affected. Show that it is possible to correct the effects of such located errors.

Proof

Let us first consider the case that we know that the errors only happen in the first \(d-1\) qubits. The set of all such errors is the linear span of the following set of base errors:

\[ \calE = \left\{\bigotimes_{j=1}^{d-1} N_j \otimes I^{\otimes(n-d+1)} \; \middle| \; N_j \in \{I,X,Y,Z\} \right\} . \]

To see this recall that the Pauli operators are an orthogonal basis with respect to the Hilbert-Schmidt inner product. By theorem 10.2 a recovery procedure \(\calR\) which corrects all errors in \(\calE\) also corrects all errors in the linear span and hence all errors affecting the first \(d-1\) qubits.

Clearly the weight of \(E^\dagger\?F\) is at most \(d-1\) for every \(E,F\in\calE\). Since \(C(S)\) has distance \(d\) this means that \(E^\dagger\?F\notin\?N(S)\backslash\?S\). By theorem 10.8 there exists a recovery procedure \(\calR\) which corrects all errors in \(\calE\).

More generally consider errors at arbitrary positions. Let \(\varepsilon\in\BB^n\) encode at which positions an error occurred (\(\varepsilon_j=1\) means that an error might have happened at position \(j\) and \(\varepsilon_j=0\) means that definitely no error occurred at position \(j\)). We consider only the \(\binom{n}{d-1}\) many \(\varepsilon\in\BB^n\) which have weight \(d-1\) (they could have smaller weight but this doesn't change anything in the arguments). The argument above obviously generalizes to: For every error for which we know the corresponding error locations \(\varepsilon\) (with weight \(d-1\)) there is a recovery procedure \(\calR_{\varepsilon}\) (only depending on \(\varepsilon\)) which corrects this error.

To summarize, the general error correction procedure is:

  • Determine the location of the errors \(\varepsilon\in\BB^n\).
  • Apply \(\calR_{\varepsilon}\) to correct a possible error.

QED.

Exercise 10.46

Show that the stabilizer for the three qubit phase flip code is generated by \(X_1X_2\) and \(X_2X_3\).

Proof

We already know that \(Z_1Z_2\) and \(Z_2Z_3\) are generators of the stabilizer of the bit flip code. Since "everything" about the phase flip code is just the Hadamard transform of the bit flip code, the claim follows.

Alternatively you could first check that the code words

\[ a\ket{+++} + b\ket{---} \]

of the phase flip code are stabilized by \(X_1X_2\) and \(X_2X_3\) (which is easy to see). From the beginning of section 10.5.1 (p. 455) we know that this implies that \(S=\langle\?X_1X_2,X_2X_3\rangle\) is commutative (which is clear anyway) and does not contain \(-I\). Hence, using that \(X_1X_2\) and \(X_2X_3\) are independent, Proposition 10.5 implies that the stabilized code \(V_S\) is two dimensional. We have already seen that \(V_S\) contains the phase flip code, so it must be identical with it. QED.
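 
By linearity it suffices to check the two basis states \(\ket{+++}\) and \(\ket{---}\), which is quickly done numerically (a small qiskit sketch; for these weight-two X operators the qubit ordering convention does not matter):

import numpy as np
from qiskit.quantum_info import Pauli, Statevector

for label in ("+++", "---"):
    psi = Statevector.from_label(label)
    for stab in (Pauli("XXI"), Pauli("IXX")):
        assert np.allclose(psi.evolve(stab).data, psi.data)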

Exercise 10.47

Verify that the generators of Figure 10.11 generate the two codewords of Equation (10.13) (Shor code).

Name Operator
g1 ZZIIIIIII
g2 IZZIIIIII
g3 IIIZZIIII
g4 IIIIZZIII
g5 IIIIIIZZI
g6 IIIIIIIZZ
g7 XXXXXXIII
g8 IIIXXXXXX
Logical Z XXXXXXXXX
Logical X ZZZZZZZZZ

Solution

The Shor code encodes the two logical basis states as

\[ \ket{k_L} = \bigotimes_{j=1}^3 \frac{1}{\sqrt{2}} (\ket{000} + (-1)^k \ket{111}), \quad (k\in\BB) . \]

It is sufficient to check that the generators \(g_i\) stabilize the two base code words. In fact, by the argument from the beginning of section 10.5.1 (p. 455) all \(g_i\) commute (which is easy to see directly) and \(-I\notin\?S\). Since the \(g_i\) are clearly independent, a dimensionality argument based on Proposition 10.5 then shows that \(S\) only stabilizes the Shor code and not a larger code space.

Now let us check that the generators indeed stabilize the base code words. Observe that the qubits of \(\ket{k_L}\) can be split into three identical blocks with three qubits each. The generators \(g_i\) for \(i=1..6\) leave each single block invariant, hence they stabilize \(\ket{k_L}\). Moreover \(g_7\) inverts the sign (the \((-1)^k\) term) of the first two blocks, which in total has no effect either. Hence \(g_7\) stabilizes \(\ket{k_L}\). A similar argument applies to \(g_8\). QED.
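 
The stabilization property can also be checked numerically (a small sketch using qiskit and the generator strings from the utility section; note that the set of generator labels is invariant under reversing the qubit order, so qiskit's reversed label convention does not matter here):

import numpy as np
from qiskit.quantum_info import Pauli, Statevector

def shor_codeword(k: int) -> Statevector:
    """|k_L> = ((|000> + (-1)^k |111>) / sqrt(2))^{tensor 3}."""
    block = np.zeros(8)
    block[0b000] = 1 / np.sqrt(2)
    block[0b111] = (-1) ** k / np.sqrt(2)
    return Statevector(np.kron(np.kron(block, block), block))

for k in (0, 1):
    psi = shor_codeword(k)
    for g in generators_shor_code_repr():
        assert np.allclose(psi.evolve(Pauli(g)).data, psi.data)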

Exercise 10.48

Show that the operations \(\bar{Z}=X_1X_2X_3X_4X_5X_6X_7X_8X_9\) and \(\bar{X}=Z_1Z_2Z_3Z_4Z_5Z_6Z_7Z_8Z_9\) act as logical \(Z\) and \(X\) operations on a Shor-code encoded qubit. That is, show that this \(\bar{Z}\) is independent of and commutes with the generators of the Shor code, and that \(\bar{X}\) is independent of and commutes with the generators of the Shor code, and anti-commutes with \(\bar{Z}\).

Solution

That \(\bar{Z}\) and \(\bar{X}\) commute with the \(g_i\) should be clear from inspection. That they anti-commute with each other is also clear.

Let \(\ket{k_L}\) for \(k\in\BB\) be the two base code words of the Shor code for the two standard base states of the logical qubit.

In my opinion the easiest way to see that \(\bar{Z}\) is independent of the generators is to observe that it stabilizes \(\ket{0_L}\) but not \(\ket{1_L}\) (eigenvalue \(-1\)). (If \(\bar{Z}\) were dependent it would be an element of the generated subgroup \(S\) and hence would have to stabilize both code words.) The justification for calling it "Z" comes from \(\bar{Z}\ket{k_L}=(-1)^k\ket{k_L}\). A similar argument applies to \(\bar{X}\), which swaps \(\ket{0_L}\) with \(\ket{1_L}\) (which justifies calling it "X"). QED.

Remark
Another possible definition of the Z and X operator would be \(\bar{Z}=X_1X_2X_3\) and \(\bar{X}=Z_1Z_4Z_7\). Those two operators behave the same way on the code space as the original operators.
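 
Reusing the shor_codeword helper (and imports) from the numerical check in exercise 10.47, the action of \(\bar{Z}\), \(\bar{X}\) and the alternative operators from the remark can be confirmed as follows (again the identical-blocks structure makes qiskit's label order irrelevant):

logical_Z = ["XXXXXXXXX", "XXXIIIIII"]  # \bar{Z} and the alternative X_1 X_2 X_3
logical_X = ["ZZZZZZZZZ", "ZIIZIIZII"]  # \bar{X} and the alternative Z_1 Z_4 Z_7

zero_L, one_L = shor_codeword(0), shor_codeword(1)

for label in logical_Z:
    assert np.allclose(zero_L.evolve(Pauli(label)).data, zero_L.data)
    assert np.allclose(one_L.evolve(Pauli(label)).data, -one_L.data)

for label in logical_X:
    assert np.allclose(zero_L.evolve(Pauli(label)).data, one_L.data)
    assert np.allclose(one_L.evolve(Pauli(label)).data, zero_L.data)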

Exercise 10.49

Use Theorem 10.8 to verify that the five qubit code can protect against an arbitrary single qubit error.

Name Operator
g1 XZZXI
g2 IXZZX
g3 XIXZZ
g4 ZXIXZ
Logical Z ZZZZZ
Logical X XXXXX

Solution

A basis for the set \(\calE\) of single qubit errors is given by all Pauli operators on five qubits with weight (at most) one (e.g. XIIII, IIZII, etc.).

In principle we have to check that \(E^\dagger\?F\notin\?N(S)\backslash\?S\) for every combination of \(E,F\in\calE\). These are a lot of checks to perform. We can simplify the task by using the following lemma:

Lemma

For \(E,F\in\calE\) we have

  1. \(E^\dagger\?F\in\?S\) is equivalent to \(E\cong\?F\), that is, they are identical if restricted to the stabilized subspace \(V_S\).
  2. \(E^\dagger\?F\in\?N(S)\) is equivalent to \(E\) and \(F\) having the same syndromes.
Proof of (1)
By assumption we have \(s:=E^\dagger\?F\in\?S\). In other words \(F=Es\). This implies \(F\ket{\psi}=E\ket{\psi}\) for each \(\ket{\psi}\in\?V_S\). QED.
Proof of (2)

By exercise 10.44 (\(Z(S)=N(S)\) in our situation) we have

\[ \forall g\in S: \; E^\dagger F g = g E^\dagger F . \]

Note that this is the same as

\[ \forall g\in S: \; F g F^\dagger = E g E^\dagger . \]

Let \(\beta(g,E)\) be the syndrome of \(E\) when the observable \(g\in\?S\) is measured. Recall that this means \(EgE^\dagger=(-1)^{\beta(g,E)}g\). Using what we mentioned at the beginning of the proof this yields the claim. QED.

We will show that the fifteen "proper" basis errors (all basis errors except \(I^{\otimes5}\)) have pairwise different non-zero syndromes. Using the lemma we see that this is sufficient to prove that the code corrects all single qubit errors. It would be OK if some errors had the same syndrome as long as they act identically on the code space (\(E^\dagger\?F\in\?S\)), but this does not happen for the five qubit code.

In order to show this, let us use qiskit:

def compute_syndromes(generators: list[Pauli]):
    """Given the generators of a [n,k=1] code this computes the syndromes for
    each possible single qubit (Pauli) error."""
    n = generators[0].num_qubits
    paulis = ["X", "Y", "Z"]

    errors = [Pauli("I"*i + p + "I"*(n-i-1))
              for (i, p) in product(range(n), paulis)]

    syndromes = dict()

    for err in errors:
        syndrome = ""
        for i, gi in enumerate(generators, 1):
            if gi.anticommutes(err):
                syndrome += "1"
            else:
                syndrome += "0"

        syndromes[err.to_label()] = syndrome

    return syndromes

This function can now be used to compute the syndromes of the five qubit code:

g1 = Pauli("XZZXI")
g2 = Pauli("IXZZX")
g3 = Pauli("XIXZZ")
g4 = Pauli("ZXIXZ")
generators = [g1, g2, g3, g4]

syndromes = compute_syndromes(generators)
assert len(syndromes) == 15, "For each of 5 positions and 3 Paulis one syndrome."
assert len(set(syndromes.values())) == 15, "No duplicate syndromes!"


for error, syndrome in sorted(syndromes.items(), key=(lambda x: x[1])):
    print(f"{syndrome}: {error}")
0001: XIIII
0010: IIZII
0011: IIIIX
0100: IIIIZ
0101: IZIII
0110: IIIXI
0111: IIIIY
1000: IXIII
1001: IIIZI
1010: ZIIII
1011: YIIII
1100: IIXII
1101: IYIII
1110: IIYII
1111: IIIYI

Exercise 10.50

Show that the five qubit code saturates the quantum Hamming bound, that is, it satisfies the inequality of (10.51) with equality.

Solution

The quantum Hamming bound is

\[ \sum_{j=0}^t \binom{n}{j} 3^j \leq 2^{n-k} . \]

For the five qubit code we have \(n=5\), \(k=1\), and \(t=1\). Plugging this in yields:

\[ 1 + \binom{5}{1} 3^1 \leq 2^{5-1} , \]

which is clearly satisfied with equality.

Recall that the quantum Hamming bound only applies to non-degenerate codes. The five qubit code is actually non-degenerate (we have seen in the solution of exercise 10.49 that the syndromes of all base errors are different). For \(k=t=1\) the Hamming bound says:

\[ 3n + 1 \leq 2^{n-1} , \]

which has \(n=5\) as the smallest solution. Hence this exercise shows that the Hamming bound is actually tight for \(k=t=1\) in the sense that there really is a non-degenerate code with just \(n=5\) qubits.
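 
The arithmetic can be double checked in one line of Python (standard library only):

from math import comb

n, k, t = 5, 1, 1
assert sum(comb(n, j) * 3**j for j in range(t + 1)) == 2**(n - k)  # 1 + 15 == 16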

Exercise 10.51

Verify that the check matrix defined in (10.106) corresponds to the stabilizer of the CSS code \(\mathrm{CSS}(C_1,C_2)\), and use Theorem 10.8 to show that arbitrary errors on up to \(t\) qubits may be corrected by this code.

Proof

In the course of solving exercise 10.32 we have already seen that the check matrix looks as described. It remains to show that if \(C_1\) and \(C_2^\top\) correct \(t\) (classical) errors then the CSS code corrects \(t\) quantum errors.

Let \(H_1\) be the parity check matrix of \(C_1\) and let \(H_2\) be the parity check matrix of \(C_2^\top\).

Let \(E,F\) be two phase-less Pauli errors (products of Pauli matrices, without an additional factor \(\pm1,\pm\ii\)) with weight less than or equal to \(d\), where \(d\leq\?t\). According to theorem 10.8 we have to show that \(E^\dagger\?F\notin\?N(S)\backslash\?S\). Without loss of generality we may assume that \(E\neq\?F\). We will even show that \(E^\dagger\?F\notin\?N(S)\), in other words, the syndromes of \(E\) and \(F\) are different (implying that a CSS code is non-degenerate, see the lemma in my solution of exercise 10.49).

Let us abbreviate \(N=\bigotimes_{j=1}^nN_j=E^\dagger\?F\) and define \((x,z)\in\BB^{2n}\) to be its encoding such that \((x_j,z_j)\) tells whether \(N_j\) is \(I\), \(X\), \(Y\), or \(Z\), in the same way as the check matrix \(G\) encodes this information. Recall that \(Z(S)=N(S)\) (exercise 10.44). Hence \(N\notin\?N(S)\) is the same as \(Ng_j=-g_jN\) for at least one \(j\). On the other hand the latter is equivalent to \(G_j\Lambda(x,z)^T=1\) (for some \(j\)), where \(G_j\) is just the \(j\)​th row of \(G\) which encodes \(g_j\). This in turn is equivalent to: \(H_1x\neq0\) or \(H_2z\neq0\).

On the other hand, since \(N\) has weight at most \(2d\) the weights of \(x\) and \(z\) are at most \(2d\) too. Moreover at least one of \(x\) and \(z\) has to be non-zero (because we assumed \(E\neq\?F\), or more precisely \(E^\dagger\?F\) not being a scalar multiple of the identity). By assumption \(C_1\) and \(C_2^\top\) both correct \(t\) errors and \(2d\?<\?2t+1\), hence \(H_1x\neq0\) or \(H_2z\neq0\) as desired. QED.

Exercise 10.52

Verify by direct operation on the codewords that the operators of (10.107) act appropriately, as logical \(Z\) and \(X\).

Solution

A characteristic of the code word \(\ket{0_L}\) is that each of its basis kets contains an even number of ones. This implies \(\bar{Z}\ket{0_L}=\ket{0_L}\). Similarly \(\ket{1_L}\) has only basis kets with an odd number of ones, hence \(\bar{Z}\ket{1_L}=-\ket{1_L}\).

Moreover \(\ket{1_L}\) can be constructed from \(\ket{0_L}\) by just replacing each \(1\) by a \(0\) and vice versa. But this is just what \(\bar{X}\) does. Hence \(\bar{X}\ket{0_L}=\ket{1_L}\) and \(\bar{X}\ket{1_L}=\ket{0_L}\).

Exercise 10.53

Prove that the encoded \(Z\) operators are independent of one another.

Remark
  • I think the wording is sub-optimal. Independence is a property of a subset of a group. The above formulation could be interpreted as: every set of two \(Z\) operators is independent. Why does this matter? To see the difference consider \(\{Z_1Z_2,Z_1Z_3,Z_2Z_3\}\). The elements are pairwise independent, but \(Z_1Z_3=(Z_1Z_2)(Z_2Z_3)\), hence they are not independent. The same phenomenon happens in other contexts where the term independent is used, e.g.:
    • Independence of vectors in a vector space.
    • Independence of random variables in the sense of probability theory.
  • In the light of the above remark I think it even makes sense to prove that the set \(M=\{g_1,\ldots,g_{n-k},\bar{Z}_1,\ldots,\bar{Z}_k\}\) is independent. This is a more satisfying exercise.

Proof

As suggested by the remark let us prove that

\[ M=\{g_1,\ldots,g_{n-k},\bar{Z}_1,\ldots,\bar{Z}_k\} \]

is independent. By assumptions (made in the main text) the elements of \(M\) commute and \(-I\) is not part of the generated group \(S_Z=\langle\?M\rangle\). Hence, the homomorphism (of groups) \(r:S_Z\to\BB^{2n}\) is injective. This implies that independence is an invariant property, that is:

\[ M \text{ independent} \quad \Longleftrightarrow \quad r(M) := \{r(g) \;|\; g\in M\} \text{ independent.} \]

Moreover independence in \(\BB^{2n}\) as a group is the same as independence in \(\BB^{2n}\) as a vector space over \(\BB\). Hence, in order to prove independence we can use linear algebra methods. Let \(G\) be the check matrix (where the \(j\)-th row is \(r(g_j)\)) in standard form and let

\[ G_z = \begin{bmatrix} 0 & 0 & 0 & A_2^T & 0 & I \end{bmatrix} , \]

be the "check matrix" for the Z operators as given by the book (which actually defines the Z operators by the isomorphism \(r\)). Note that the \(I\in\BB^{k\times\?k}\) in the sixth block is responsible for \(G_z\) having (full) rank \(k\). In the same way the \(I\in\BB^{r\times\?r}\) (first block-column) and \(I\in\BB^{(n-k-r)\times\?(n-k-r)}\) (fifth block-column) are responsible for \(G\) having (full) rank \(n-k\). Hence

\[ \?\begin{bmatrix} G \\ G_z \end{bmatrix} \]

has full row rank \((n-k)+k=n\) because the three mentioned identities all live in different block-columns. This shows the independence of \(r(M)\) and hence of \(M\). QED.

Exercise 10.54

Prove that with the check matrix for the encoded \(X\) operators defined as above, the encoded \(X\) operators are independent of one another and of the generators, commute with the generators of the stabilizer, with each other, and \(\bar{X}_j\) commutes with all the \(\bar{Z}_k\) except \(\bar{Z}_j\), with which it anti-commutes.

Remark
Similar remarks as in exercise 10.53 apply here too.

Proof

Recall that within the linear algebra framework we have

\[ \bar{X} = \left[ \begin{array}{ccc|ccc} 0 & E^T & I & C^T & 0 & 0 \end{array} \right] , \]

where each line corresponds to a \(\bar{X}_j\) according to the homomorphism \(r\). From

\[ G \Lambda \bar{X}^T = \begin{bmatrix} IC + CI \\ IE + EI \end{bmatrix} = 0 \in \BB^{(n-k)\times k} \]

we deduce that all \(g_i\) commute with all \(\bar{X}_j\). That all \(\bar{X}_j\) commute with each other is obvious from the structure of their matrix representation.

Independence of the set of generators and X operators (taken together as a single set) follows from the linear independence of their linear algebra representations. The latter can be seen by looking at where the identity matrices sit, because those are responsible for making \(G\) and \(\bar{X}\) full-row-rank matrices: for \(G\) we have identities in block-columns 1 and 5, for \(\bar{X}\) we have one in block-column 3.

Finally we have

\[ \bar{Z} \Lambda \bar{X}^T = I\cdot I = I \in \BB^{k\times k} , \]

which precisely means that \(\bar{Z}_i\) and \(\bar{X}_j\) commute iff \(i\neq\?j\). QED.

Exercise 10.55

Find the \(\bar{X}\) operator for the standard form of the Steane code.

Solution

This is just plugging stuff in. In the linear algebra framework we have:

\[ \bar{X} = \left[ \begin{array}{ccccccc|ccccccc} 0 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{array} \right] \in \BB^{1\times14} . \]

This corresponds to a single logical X operator which equals \(X_4X_5X_7\). As a bonus: in the same way we get \(Z_1Z_2Z_7\) as the logical Z operator.
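 
As a quick consistency check with qiskit (following the convention of compute_syndromes above that string position \(j\), read from the left, is qubit \(j\), and using the generator strings from the utility section): the two logical operators commute with all generators of the standard form Steane code and anti-commute with each other.

from qiskit.quantum_info import Pauli

logical_X = Pauli("IIIXXIX")  # X_4 X_5 X_7
logical_Z = Pauli("ZZIIIIZ")  # Z_1 Z_2 Z_7

for g in generators_steane_code_standard_repr():
    assert not logical_X.anticommutes(Pauli(g))
    assert not logical_Z.anticommutes(Pauli(g))

assert logical_X.anticommutes(logical_Z)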

Exercise 10.56

Show that replacing an encoded \(X\) or \(Z\) operator by \(g\) times that operator, where \(g\) is an element of the stabilizer, does not change the action of the operator on the code.

Proof

Let \(N\) be one of the logical X or Z operators. Let \(g\in\?S\) and let \(\ket{\psi}\in\?C\) be an element of the code. The stabilizer elements leave the code invariant, hence \(g\ket{\psi}=\ket{\psi}\). On the other hand \(g\) commutes with \(N\) by construction of the logical Pauli operators. Hence

\[ gN \ket{\psi} = Ng \ket{\psi} = N \ket{\psi} , \]

which is precisely the claim. QED.

Exercise 10.57

Give the check matrices for the five and nine qubit codes in standard form.

Solution

Let us use the code prepared in one of the introductory sections to this site. First consider the 5 qubit code.

cm = generators_to_check_matrix(generators_5qubit_code_repr())
transform_check_matrix(cm, [
    # n = 5, k = 1
    ["add row", 0, 2],
    ["add row", 1, 3],
    ["add row", 3, 2],
    ["add row", 3, 0],
    # Now we know r = 4, Hence the middle part is trivial!
])
[1 0 0 0 1 1 1 0 1 1]
[0 1 0 0 1 0 0 1 1 0]
[0 0 1 0 1 1 1 0 0 0]
[0 0 0 1 1 1 0 1 1 1]

Hence the 5 qubit code is an example where the place for the matrix \(A_1\) completely vanishes (\(n-k-r=0\)). Next consider Shor's nine qubit code:

cm = generators_to_check_matrix(generators_shor_code_repr())
transform_check_matrix(cm, [
    # n = 9, k = 1, already see that r = 2
    ["swap rows", 0, 6],
    ["swap rows", 1, 7],
    ["swap qubits", 1, 3],
    ["add row", 1, 0],
    # X part is in standard form now. But by accident the Z part too!
])
[1 0 1 1 0 0 1 1 1 0 0 0 0 0 0 0 0 0]
[0 1 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1]
[0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0]

Exercise 10.58

Verify that the circuits in Figures 10.13–10.15 work as described, and check the claimed circuit equivalences.

Solution

The action of the circuit in Figure 10.13 is

\begin{align*} \ket{0,\psi} &\stackrel{H_1}{\mapsto} \frac{1}{\sqrt{2}} \left(\ket{0} + \ket{1}\right) \ket{\psi} \\ &\stackrel{CM}{\mapsto} \frac{1}{\sqrt{2}} \left(\ket{0}\ket{\psi} + \ket{1}M\ket{\psi} \right) \\ &\stackrel{H_1}{\mapsto} \ket{0} Q_+\ket{\psi} + \ket{1} Q_-\ket{\psi} , \end{align*}

where \(Q_{\pm}=(I\pm\?M)/2\) are the projectors onto the eigenspaces of \(M\). This shows that the circuit works as described (it measures \(M\)).

The LHS in Figures 10.14 and 10.15 is just a special case (\(M=X\) or \(M=Z\)) of Figure 10.13. The RHS in each of Figures 10.14 and 10.15 follows from two simple properties:

\[ \cz = H_2 \cx H_2 \]

on the one hand, and the fact that for \(\cz\) control and target play the same role (they can be swapped without changing the action of the gate) on the other.
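 
Both properties are easily confirmed with qiskit operators (a small sketch; in the circuits below qubit 0 plays the role of the control and qubit 1 that of the target):

from qiskit.circuit import QuantumCircuit
from qiskit.quantum_info import Operator

# CZ equals H on the target, then CX, then H on the target again:
hxh = QuantumCircuit(2)
hxh.h(1)
hxh.cx(0, 1)
hxh.h(1)

cz = QuantumCircuit(2)
cz.cz(0, 1)
assert Operator(hxh) == Operator(cz)

# ... and CZ is symmetric in control and target:
cz_swapped = QuantumCircuit(2)
cz_swapped.cz(1, 0)
assert Operator(cz) == Operator(cz_swapped)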

Exercise 10.59

Show that by using the identities of Figures 10.14 and 10.15, the syndrome circuit of Figure 10.16 can be replaced with the circuit of Figure 10.17

Solution

There is not much more to say except that it is very easy to see - you just have to do it. One thing worth remarking is that the transformation behaves like a transposition of matrices.

Exercise 10.60

Construct a syndrome measuring circuit analogous to that in Figure 10.16, but for the nine and five qubit codes.

Solution

Using the utility code we can easily construct the circuits using qiskit.

generators = generators_5qubit_code()
qc = make_syndrome_circuit(generators, with_barrier=False)
qc.draw()
       ┌───┐                       ┌───┐
|0>_0: ┤ H ├────────■──────────────┤ H ├─────────────────────────────────────────────
       ├───┤        │              └───┘            ┌───┐
|0>_1: ┤ H ├────────┼────────────────■──────────────┤ H ├────────────────────────────
       ├───┤        │                │              └───┘            ┌───┐
|0>_2: ┤ H ├────────┼────────────────┼────────────────■──────────────┤ H ├───────────
       ├───┤        │                │                │              └───┘      ┌───┐
|0>_3: ┤ H ├────────┼────────────────┼────────────────┼────────────────■────────┤ H ├
       └───┘┌───────┴───────┐┌───────┴───────┐┌───────┴───────┐┌───────┴───────┐└───┘
  q_0: ─────┤0              ├┤0              ├┤0              ├┤0              ├─────
            │               ││               ││               ││               │
  q_1: ─────┤1              ├┤1              ├┤1              ├┤1              ├─────
            │               ││               ││               ││               │
  q_2: ─────┤2 Pauli(XZZXI) ├┤2 Pauli(IXZZX) ├┤2 Pauli(XIXZZ) ├┤2 Pauli(ZXIXZ) ├─────
            │               ││               ││               ││               │
  q_3: ─────┤3              ├┤3              ├┤3              ├┤3              ├─────
            │               ││               ││               ││               │
  q_4: ─────┤4              ├┤4              ├┤4              ├┤4              ├─────
            └───────────────┘└───────────────┘└───────────────┘└───────────────┘

Alternatively one can decompose the controlled Pauli gates. Unfortunately this also decomposes the controlled-\(Z\) gates via \(Z=HXH\):

generators = generators_5qubit_code()
qc = make_syndrome_circuit(generators, with_barrier=True)
qc.decompose(["cpauli"]).draw()
       ┌───┐ ░                                ░                                ░                               »
|0>_0: ┤ H ├─░───■────■─────────■────■────────░────────────────────────────────░───────────────────────────────»
       ├───┤ ░   │    │         │    │        ░                                ░                               »
|0>_1: ┤ H ├─░───┼────┼─────────┼────┼────────░───■────■─────────■────■────────░───────────────────────────────»
       ├───┤ ░   │    │         │    │        ░   │    │         │    │        ░                               »
|0>_2: ┤ H ├─░───┼────┼─────────┼────┼────────░───┼────┼─────────┼────┼────────░────────■─────────■────■───────»
       ├───┤ ░   │    │         │    │        ░   │    │         │    │        ░        │         │    │       »
|0>_3: ┤ H ├─░───┼────┼─────────┼────┼────────░───┼────┼─────────┼────┼────────░────────┼─────────┼────┼───────»
       └───┘ ░   │    │         │    │        ░ ┌─┴─┐  │         │    │        ░ ┌───┐┌─┴─┐┌───┐  │    │       »
  q_0: ──────░───┼────┼─────────┼────┼────────░─┤ X ├──┼─────────┼────┼────────░─┤ H ├┤ X ├┤ H ├──┼────┼───────»
             ░ ┌─┴─┐  │         │    │        ░ ├───┤┌─┴─┐┌───┐  │    │        ░ ├───┤└───┘└───┘┌─┴─┐  │  ┌───┐»
  q_1: ──────░─┤ X ├──┼─────────┼────┼────────░─┤ H ├┤ X ├┤ H ├──┼────┼────────░─┤ H ├──────────┤ X ├──┼──┤ H ├»
             ░ ├───┤┌─┴─┐┌───┐  │    │        ░ ├───┤└───┘└───┘┌─┴─┐  │  ┌───┐ ░ └───┘          └───┘┌─┴─┐└───┘»
  q_2: ──────░─┤ H ├┤ X ├┤ H ├──┼────┼────────░─┤ H ├──────────┤ X ├──┼──┤ H ├─░─────────────────────┤ X ├─────»
             ░ ├───┤└───┘└───┘┌─┴─┐  │  ┌───┐ ░ └───┘          └───┘┌─┴─┐└───┘ ░                     └───┘     »
  q_3: ──────░─┤ H ├──────────┤ X ├──┼──┤ H ├─░─────────────────────┤ X ├──────░───────────────────────────────»
             ░ └───┘          └───┘┌─┴─┐└───┘ ░                     └───┘      ░                               »
  q_4: ──────░─────────────────────┤ X ├──────░────────────────────────────────░───────────────────────────────»
             ░                     └───┘      ░                                ░                               »
«             ░                                     ░ ┌───┐
«|0>_0: ──────░─────────────────────────────────────░─┤ H ├
«             ░                                     ░ ├───┤
«|0>_1: ──────░─────────────────────────────────────░─┤ H ├
«             ░                                     ░ ├───┤
«|0>_2: ──■───░─────────────────────────────────────░─┤ H ├
«         │   ░                                     ░ ├───┤
«|0>_3: ──┼───░────────■────■─────────■────■────────░─┤ H ├
«         │   ░ ┌───┐┌─┴─┐  │  ┌───┐  │    │        ░ └───┘
«  q_0: ──┼───░─┤ H ├┤ X ├──┼──┤ H ├──┼────┼────────░──────
«         │   ░ └───┘└───┘┌─┴─┐└───┘  │    │        ░
«  q_1: ──┼───░───────────┤ X ├───────┼────┼────────░──────
«         │   ░           └───┘       │    │        ░
«  q_2: ──┼───░───────────────────────┼────┼────────░──────
«         │   ░                     ┌─┴─┐  │        ░
«  q_3: ──┼───░─────────────────────┤ X ├──┼────────░──────
«       ┌─┴─┐ ░ ┌───┐               └───┘┌─┴─┐┌───┐ ░
«  q_4: ┤ X ├─░─┤ H ├────────────────────┤ X ├┤ H ├─░──────
«       └───┘ ░ └───┘                    └───┘└───┘ ░

For the Shor code I only print the compact representation:

generators = generators_shor_code()
qc = make_syndrome_circuit(generators, with_barrier=False)
qc.draw()
       ┌───┐                             ┌───┐                                                  »
|0>_0: ┤ H ├──────────■──────────────────┤ H ├──────────────────────────────────────────────────»
       ├───┤          │                  └───┘                ┌───┐                             »
|0>_1: ┤ H ├──────────┼────────────────────■──────────────────┤ H ├─────────────────────────────»
       ├───┤          │                    │                  └───┘                ┌───┐        »
|0>_2: ┤ H ├──────────┼────────────────────┼────────────────────■──────────────────┤ H ├────────»
       ├───┤          │                    │                    │                  └───┘        »
|0>_3: ┤ H ├──────────┼────────────────────┼────────────────────┼────────────────────■──────────»
       ├───┤          │                    │                    │                    │          »
|0>_4: ┤ H ├──────────┼────────────────────┼────────────────────┼────────────────────┼──────────»
       ├───┤          │                    │                    │                    │          »
|0>_5: ┤ H ├──────────┼────────────────────┼────────────────────┼────────────────────┼──────────»
       ├───┤          │                    │                    │                    │          »
|0>_6: ┤ H ├──────────┼────────────────────┼────────────────────┼────────────────────┼──────────»
       ├───┤          │                    │                    │                    │          »
|0>_7: ┤ H ├──────────┼────────────────────┼────────────────────┼────────────────────┼──────────»
       └───┘┌─────────┴─────────┐┌─────────┴─────────┐┌─────────┴─────────┐┌─────────┴─────────┐»
  q_0: ─────┤0                  ├┤0                  ├┤0                  ├┤0                  ├»
            │                   ││                   ││                   ││                   │»
  q_1: ─────┤1                  ├┤1                  ├┤1                  ├┤1                  ├»
            │                   ││                   ││                   ││                   │»
  q_2: ─────┤2                  ├┤2                  ├┤2                  ├┤2                  ├»
            │                   ││                   ││                   ││                   │»
  q_3: ─────┤3                  ├┤3                  ├┤3                  ├┤3                  ├»
            │                   ││                   ││                   ││                   │»
  q_4: ─────┤4 Pauli(ZZIIIIIII) ├┤4 Pauli(IZZIIIIII) ├┤4 Pauli(IIIZZIIII) ├┤4 Pauli(IIIIZZIII) ├»
            │                   ││                   ││                   ││                   │»
  q_5: ─────┤5                  ├┤5                  ├┤5                  ├┤5                  ├»
            │                   ││                   ││                   ││                   │»
  q_6: ─────┤6                  ├┤6                  ├┤6                  ├┤6                  ├»
            │                   ││                   ││                   ││                   │»
  q_7: ─────┤7                  ├┤7                  ├┤7                  ├┤7                  ├»
            │                   ││                   ││                   ││                   │»
  q_8: ─────┤8                  ├┤8                  ├┤8                  ├┤8                  ├»
            └───────────────────┘└───────────────────┘└───────────────────┘└───────────────────┘»
«
«|0>_0: ─────────────────────────────────────────────────────────────────────────────────────────
«
«|0>_1: ─────────────────────────────────────────────────────────────────────────────────────────
«
«|0>_2: ─────────────────────────────────────────────────────────────────────────────────────────
«               ┌───┐
«|0>_3: ────────┤ H ├────────────────────────────────────────────────────────────────────────────
«               └───┘                ┌───┐
«|0>_4: ──────────■──────────────────┤ H ├───────────────────────────────────────────────────────
«                 │                  └───┘                ┌───┐
«|0>_5: ──────────┼────────────────────■──────────────────┤ H ├──────────────────────────────────
«                 │                    │                  └───┘                ┌───┐
«|0>_6: ──────────┼────────────────────┼────────────────────■──────────────────┤ H ├─────────────
«                 │                    │                    │                  └───┘        ┌───┐
«|0>_7: ──────────┼────────────────────┼────────────────────┼────────────────────■──────────┤ H ├
«       ┌─────────┴─────────┐┌─────────┴─────────┐┌─────────┴─────────┐┌─────────┴─────────┐└───┘
«  q_0: ┤0                  ├┤0                  ├┤0                  ├┤0                  ├─────
«       │                   ││                   ││                   ││                   │
«  q_1: ┤1                  ├┤1                  ├┤1                  ├┤1                  ├─────
«       │                   ││                   ││                   ││                   │
«  q_2: ┤2                  ├┤2                  ├┤2                  ├┤2                  ├─────
«       │                   ││                   ││                   ││                   │
«  q_3: ┤3                  ├┤3                  ├┤3                  ├┤3                  ├─────
«       │                   ││                   ││                   ││                   │
«  q_4: ┤4 Pauli(IIIIIIZZI) ├┤4 Pauli(IIIIIIIZZ) ├┤4 Pauli(XXXXXXIII) ├┤4 Pauli(IIIXXXXXX) ├─────
«       │                   ││                   ││                   ││                   │
«  q_5: ┤5                  ├┤5                  ├┤5                  ├┤5                  ├─────
«       │                   ││                   ││                   ││                   │
«  q_6: ┤6                  ├┤6                  ├┤6                  ├┤6                  ├─────
«       │                   ││                   ││                   ││                   │
«  q_7: ┤7                  ├┤7                  ├┤7                  ├┤7                  ├─────
«       │                   ││                   ││                   ││                   │
«  q_8: ┤8                  ├┤8                  ├┤8                  ├┤8                  ├─────
«       └───────────────────┘└───────────────────┘└───────────────────┘└───────────────────┘

Exercise 10.61

Describe explicit recovery operations \(E_j^\dagger\) corresponding to the different possible error syndromes that may be measured using the circuit in Figure 10.16.

Solution

This can easily be done using the code we have produced for previous exercises.

generators = generators_steane_code_standard()
steane_syndromes = compute_syndromes(generators)

assert len(steane_syndromes) == 3*7, "Steane code is non-degenerate (1)"
assert len(set(steane_syndromes.values())) == 3*7, "Steane code is non-degenerate (2)"

def shorten(pauli_str: str):
    """Shorten a weight-one Pauli string like 'IIZIIII' to 'Z_3' (1-based index)."""
    ret = None

    if (i := pauli_str.find('Z')) != -1:
        ret = f"Z_{i+1}"
    elif (i := pauli_str.find('X')) != -1:
        ret = f"X_{i+1}"
    elif (i := pauli_str.find('Y')) != -1:
        ret = f"Y_{i+1}"

    assert ret is not None, "Invalid input"
    # Replacing the single non-identity factor must leave the all-identity string,
    # i.e. the input must have weight one.
    s = pauli_str[:i] + "I" + pauli_str[i+1:]
    assert s == "I" * len(pauli_str), "Invalid input"

    return ret

# This prints the syndromes and their error E (note: E=E.H)
for error, syndrome in sorted(steane_syndromes.items(), key=(lambda x: x[1])):
    print(f"{syndrome}: {error}, {shorten(error)}")
000001: IIIIIXI, X_6
000010: IIIIXII, X_5
000011: IXIIIII, X_2
000100: IIIXIII, X_4
000101: XIIIIII, X_1
000110: IIIIIIX, X_7
000111: IIXIIII, X_3
001000: IIZIIII, Z_3
001111: IIYIIII, Y_3
010000: IZIIIII, Z_2
010011: IYIIIII, Y_2
011000: IIIZIII, Z_4
011100: IIIYIII, Y_4
100000: ZIIIIII, Z_1
100101: YIIIIII, Y_1
101000: IIIIZII, Z_5
101010: IIIIYII, Y_5
110000: IIIIIIZ, Z_7
110110: IIIIIIY, Y_7
111000: IIIIIZI, Z_6
111001: IIIIIYI, Y_6

Exercise 10.62

Show by explicit construction of generators for the stabilizer that concatenating an \([n_1,1]\) stabilizer code with an \([n_2,1]\) stabilizer code gives an \([n_1n_2,1]\) stabilizer code.

Solution

Let us assume that we first encode the logical qubit with \(C_1\) and then, on top of this, encode each of the resulting \(n_1\) qubits with \(C_2\). Clearly this leads to \(n_1n_2\) physical qubits which we use to encode a single logical qubit.

What are the generators of the concatenated code \(C_2\circ\?C_1\)? To this end let \((g^{(1)}_i)_{i\in1..n_1-1}\) and \((g^{(2)}_i)_{i\in1..n_2-1}\) be the generators of \(C_1\) and \(C_2\), respectively. The stabilizer of \(C_2\circ\?C_1\) has two responsibilities

  1. Stabilize each of the \(n_1\) copies of \(C_2\) (one for each qubit of \(C_1\)).
  2. On top of that, enforce the generators of \(C_1\), acting on the logical qubits of the \(C_2\) blocks.

For the first responsibility we use the following \(n_1(n_2-1)\) generators:

\[ \tilde{g}_{l,k} = I \otimes \ldots \otimes g_k^{(2)} \otimes \ldots \otimes I , \]

for \(k=1..n_2-1\) and \(l=1..n_1\), where the factor \(g_k^{(2)}\) sits at block \(l\) of the tensor product (each factor acts on a whole block of \(n_2\) physical qubits). For the second set of stabilizer generators let us first write

\[ g_k^{(1)} = \bigotimes_{j=1}^{n_1} N_{k,j} , \]

where \(N_{k,j}\in\{I,X,Y,Z\}\). Moreover let \(\bar{N}_{k,j}\) be the "lifting" of \(N_{k,j}\) to \(C_2\), that is, the corresponding logical operator acting on the \(j\)-th \(C_2\) block. Then the second set of stabilizer generators (containing \(n_1-1\) elements) is

\[ \bar{g}_k = \bigotimes_{j=1}^{n_1} \bar{N}_{k,j} , \]

where \(k=1..n_1-1\). Note that by construction the lifted operators \(\bar{N}_{k,j}\) commute with all generators from the first set (logical operators commute with the stabilizer of \(C_2\)). Overall we have found the \(n_1(n_2-1)+(n_1-1)=n_1n_2-1\) generators of \(C_2\circ\?C_1\)!
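As a small illustration, the generator strings of a concatenated code can be built directly from the string representations used in the setup section. The following sketch is my own addition (the helper concatenate_generators and the logical-operator table are ad hoc, not from the book); it ignores the phase of the logical \(Y\) since only the Pauli content of the strings is tracked.

def concatenate_generators(gens_inner: list[str], gens_outer: list[str],
                           logical_inner: dict[str, str]) -> list[str]:
    """Generators of the concatenated code: outer code C_1 (gens_outer), each of
    whose qubits is encoded with the inner code C_2 (gens_inner); phases ignored."""
    n_outer = len(gens_outer[0])       # n_1: number of C_2 blocks
    n_inner = len(gens_inner[0])       # n_2: physical qubits per block
    identity_block = "I" * n_inner

    # First set: the generators of C_2 on each of the n_1 blocks.
    block_gens = [
        identity_block * l + g + identity_block * (n_outer - l - 1)
        for l in range(n_outer)
        for g in gens_inner
    ]

    # Second set: the generators of C_1 with every Pauli factor replaced
    # by the corresponding logical operator of C_2.
    lifted_gens = [
        "".join(identity_block if p == "I" else logical_inner[p] for p in g)
        for g in gens_outer
    ]
    return block_gens + lifted_gens

# Example: the Steane code concatenated with itself (49 qubits, 48 generators).
steane = generators_steane_code_repr()
logical_steane = {"X": "X"*7, "Z": "Z"*7, "Y": "Y"*7}  # logical Y up to phase
gens = concatenate_generators(steane, steane, logical_steane)
assert len(gens) == 7*6 + 6 == 48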

Exercise 10.63

Suppose \(U\) is any unitary operation mapping the Steane code into itself, and such that \(U\bar{Z}U^\dagger=\bar{X}\) and \(U\bar{X}U^\dagger=\bar{Z}\). Prove that up to a global phase the action of \(U\) on the encoded states \(\ket{0_L}\) and \(\ket{1_L}\) is \(\ket{0_L}\mapsto(\ket{0_L}+\ket{1_L})/\sqrt{2}\) and \(\ket{1_L}\mapsto(\ket{0_L}-\ket{1_L})/\sqrt{2}\).

Proof

Let \(\ket{\pm_L}=(\ket{0_L}\pm\ket{1_L})/\sqrt{2}\) be the eigenstates of \(\bar{X}\). From \(\bar{X}\ket{+_L}=\ket{+_L}\) and \(\bar{X}=U\bar{Z}U^\dagger\) we obtain

\[ U^\dagger \ket{+_L} = \bar{Z} U^\dagger \ket{+_L} . \]

Hence \(U^\dagger\ket{+_L}\) is a \(+1\) eigenstate of \(\bar{Z}\). Hence

\[ U^\dagger\ket{+_L} = e^{\ii\varphi_{0}} \ket{0_L} . \]

In the same way (using \(\bar{X}\ket{-_L}=-\ket{-_L}\) and again \(\bar{X}=U\bar{Z}U^\dagger\)) we obtain

\[ U^\dagger\ket{-_L} = e^{\ii\varphi_{1}} \ket{1_L} . \]

It remains to show that \(\varphi_0=\varphi_1=:\varphi\). Note that it is not possible to show that \(\varphi=0\) (simply because it is wrong in general).

From \(\bar{Z}\ket{+_L}=\ket{-_L}\) and \(U\bar{X}U^\dagger=\bar{Z}\) we deduce

\[ \ket{-_L} = U\bar{X}U^\dagger \ket{+_L} = U\bar{X} e^{\ii\varphi_0} \ket{0_L} = U e^{\ii\varphi_0} \ket{1_L} = e^{\ii(\varphi_0-\varphi_1)} \ket{-_L} . \]

Hence \(\varphi_0=\varphi_1\) as desired. QED.

Exercise 10.64 (Back propagation of errors)

It is clear that an \(X\) error on the control of a CNOT gate propagates to the target qubit. In addition, it turns out that a \(Z\) error on the target propagates back to the control! Show this using the stabilizer formalism, and also directly using quantum circuit identities. You may find Exercise 4.20 on page 179 useful.

Solution

As mentioned we already know the following propagation rule \(\cx_{12}X_1=X_1X_2\cx_{12}\). Moreover, from exercise 4.20 we know that

\[ \cx_{21} = H_1H_2 \cx_{12} H_1H_2 . \]

Hence

\begin{align*} \cx_{12} Z_2 &= H_1H_2 \cx_{21} H_1H_2 Z_2 \\ &= H_1H_2 \cx_{21} X_2 H_1H_2 \\ &= H_1H_2 X_1X_2 \cx_{21} H_1H_2 \\ &= Z_1Z_2 H_1H_2 \cx_{21} H_1H_2 \\ &= Z_1Z_2 \cx_{12} . \end{align*}

Hence a \(Z\) error at the target of \(\cx\) propagates into two \(Z\) errors, one at the control and one at the target.
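This is also quickly confirmed with the Qiskit classes imported in the setup section. Keep in mind that Qiskit writes Pauli strings little-endian (qubit 0 is the rightmost character); since the CNOT is self-inverse, the evolution frame used by Pauli.evolve does not matter here.

qc = QuantumCircuit(2)
qc.cx(0, 1)                      # control: qubit 0, target: qubit 1
cnot = Clifford(qc)

print(Pauli("IX").evolve(cnot))  # X on the control spreads to the target: XX
print(Pauli("ZI").evolve(cnot))  # Z on the target propagates back to the control: ZZ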

Exercise 10.65

An unknown qubit in the state \(\ket{\psi}\) can be swapped with a second qubit which is prepared in the state \(\ket{0}\) using only two CNOT gates using the circuit

            ┌───┐
  |0>: ──■──┤ X ├ |psi>
       ┌─┴─┐└─┬─┘
|psi>: ┤ X ├──■── |0>
       └───┘

Show that the two circuits below, which use only a single CNOT gate, with measurement and a classically controlled single qubit operation, also accomplish the same task:

       ┌───┐         ┌───┐
  |0>: ┤ X ├─────────┤ Z ├ |psi>
       └─┬─┘┌───┐┌─┐ └─╥─┘
|psi>: ──■──┤ H ├┤M├───║──
            └───┘└╥┘   ║
  c: 1/═══════════╩════╩══
                  0
       ┌───┐         ┌───┐
  |0>: ┤ H ├──■──────┤ X ├ |psi>
       └───┘┌─┴─┐┌─┐ └─╥─┘
|psi>: ─────┤ X ├┤M├───║──
            └───┘└╥┘   ║
  c: 1/═══════════╩════╩═
                  0

Solution

Let \(\ket{\psi}=a\ket{0}+b\ket{1}\). Let us track the action of the first circuit:

\begin{align*} \ket{0}\ket{\psi} &\stackrel{\cx_{12}}{\mapsto} a\ket{00} + b\ket{11} \\ &\stackrel{H_2}{\mapsto} \frac{1}{\sqrt{2}} \left(a\ket{00} + a\ket{01} + b\ket{10} - b\ket{11} \right) \\ &= \frac{1}{\sqrt{2}} \left[ (a\ket{0} + b\ket{1})\ket{0} + (a\ket{0} - b\ket{1})\ket{1} \right] \end{align*}

Hence each of the outcomes \(0\) and \(1\) occurs with probability \(1/2\) (which is not important but nice to know), and after measuring \(0\) the state on the first wire is already \(\ket{\psi}\). If \(1\) is measured, only a \(Z\) is needed to transform the state into \(\ket{\psi}\).

For the second circuit we proceed analogously:

\begin{align*} \ket{0}\ket{\psi} &\stackrel{H_1}{\mapsto} \frac{1}{\sqrt{2}} (a\ket{00} + b\ket{01} + a\ket{10} + b\ket{11}) \\ &\stackrel{\cx_{12}}{\mapsto} \frac{1}{\sqrt{2}} (a\ket{00} + b\ket{01} + a\ket{11} + b\ket{10}) \\ &= \frac{1}{\sqrt{2}} \left[ (a\ket{0} + b\ket{1}) \ket{0} + (a\ket{1} + b\ket{0}) \ket{1} \right] \end{align*}

After measuring \(0\) the state on the first wire is already \(\ket{\psi}\). If \(1\) is measured, only an \(X\) is needed to transform the state into \(\ket{\psi}\).
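As a sanity check (a quick sketch of mine, not part of the exercise), the first circuit can be verified with Qiskit's Statevector. By the principle of deferred measurement the classically controlled \(Z\) may be replaced by a \(\cz\) placed before the measurement, so the circuit must send the \(\ket{0}\) wire to \(\ket{\psi}\) and the \(\ket{\psi}\) wire to \(\ket{+}\).

import numpy as np
from qiskit.quantum_info import Statevector, random_statevector

psi = random_statevector(2, seed=7)

qc = QuantumCircuit(2)     # qubit 0: the |0> wire, qubit 1: the |psi> wire
qc.cx(1, 0)                # CNOT with control on the |psi> wire
qc.h(1)
qc.cz(1, 0)                # deferred version of the classically controlled Z

state_in = psi.tensor(Statevector.from_label("0"))      # |psi> on qubit 1, |0> on qubit 0
state_out = state_in.evolve(qc)
expected = Statevector.from_label("+").tensor(psi)      # |+> on qubit 1, |psi> on qubit 0

assert np.allclose(state_out.data, expected.data)       # |psi> now sits on the former |0> wire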

Exercise 10.66 (Fault-tolerant \(\pi/8\) gate construction)

One way to implement a π/8 gate is to first swap the qubit state \(\ket{\psi}\) you wish to transform with some known state \(\ket{0}\), then to apply a π/8 gate to the resulting qubit. Here is a quantum circuit which does that:

       ┌───┐         ┌───┐┌───┐
  |0>: ┤ H ├──■──────┤ X ├┤ T ├ T|psi>
       └───┘┌─┴─┐┌─┐ └─╥─┘└───┘
|psi>: ─────┤ X ├┤M├───║───────
            └───┘└╥┘   ║
  c: 1/═══════════╩════╩═══════
                  0

Doing this does not seem particularly useful, but actually it leads to something which is! Show that by using the relations \(TX=\exp(−\ii\pi/4)SXT\) and \(TU=UT\) (\(U\) is the CNOT gate and \(T\) acts on the control qubit) we may obtain the circuit from Figure 10.25.

Solution

That \(T\) commutes with \(U\) follows from the fact that \(T\equiv\?R_z(\pi/4)\) is a function of the \(Z\) operator, whose eigenstates are exactly the states on which the control of the CNOT gate (\(U\)) decides whether to act. Here with \(\equiv\) I mean equality up to a global phase. That \(TX=\exp(−\ii\pi/4)SXT\) follows essentially from the Bloch sphere interpretation:

\[ XTX \equiv X R_z(\pi/4) X = R_z(-\pi/4) \equiv S^\dagger T \equiv XSXT . \]

The correct phase is not really important here but could also be recovered because there are transparent rules to translate between \(T,S,Z\) on the one hand and \(R_z(\theta)\) including phase on the other hand. For example:

\[ T = \sqrt[4]{Z} = e^{i\pi/8} R_z(\pi/4) . \]

Having this commutation relation at hand it is easy to commute \(T\) from the end to the desired place near the beginning of the circuit.
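Both relations are easy to confirm numerically. Here is a minimal, self-contained check in plain numpy (not using the Sage helpers); the convention is that the control is the first factor of the Kronecker product.

import numpy as np

I2 = np.eye(2)
X_ = np.array([[0, 1], [1, 0]])
S_ = np.diag([1, 1j])
T_ = np.diag([1, np.exp(1j*np.pi/4)])
U = np.kron(np.diag([1, 0]), I2) + np.kron(np.diag([0, 1]), X_)  # CNOT

# TX = exp(-i pi/4) S X T
assert np.allclose(T_ @ X_, np.exp(-1j*np.pi/4) * S_ @ X_ @ T_)
# T on the control commutes with the CNOT
assert np.allclose(np.kron(T_, I2) @ U, U @ np.kron(T_, I2))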

Exercise 10.67

Show that the following circuit identities hold:

(1)
     ┌───┐
q_0: ┤ X ├──■──
     └───┘  │
q_1: ───────■──
          ┌─┴─┐
q_2: ─────┤ X ├
          └───┘

equals

          ┌───┐
q_0: ──■──┤ X ├
       │  └───┘
q_1: ──■────■──
     ┌─┴─┐┌─┴─┐
q_2: ┤ X ├┤ X ├
     └───┘└───┘
(2)

q_0: ───────■──
            │
q_1: ───────■──
     ┌───┐┌─┴─┐
q_2: ┤ Z ├┤ X ├
     └───┘└───┘

equals


q_0: ──■────■──
       │    │
q_1: ──■────■──
     ┌─┴─┐┌───┐
q_2: ┤ X ├┤ Z ├
     └───┘└───┘

Solution

There are several simple ways to see these identities. The most straightforward way is to check that the circuits act the same on the eight basis states \(\ket{ijk}\) (\(i,j,k\in\{0,1\}\)). This is essentially what we do in the following, but we take some simple shortcuts which save us from tabulating eight possibilities for each identity (instead we only have to consider two carefully chosen cases).

Consider identity (1). Clearly both circuits act the same way on the first two qubits. For the first circuit we see that it only acts non-trivially (with an \(X\)) on the third qubit if \(i=0\), \(j=1\). It is not hard to see that the second circuit does the same.

Consider identity (2). It is not hard to see that both circuits map \(\ket{11k}\) to \((-1)^k\ket{11\bar{k}}\), where \(\bar{k}=k+1\mod2\). On the other hand, if at least one of \(i\), \(j\) is not \(1\), then both circuits act as \(\ket{ijk}\mapsto(-1)^{k}\ket{ijk}\).
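The identities can also be checked numerically. The following is a self-contained numpy sketch (my own addition); the Kronecker order is \(q_0\otimes q_1\otimes q_2\) and operators compose right to left, so the rightmost factor corresponds to the leftmost gate of a circuit.

import numpy as np

I2 = np.eye(2)
X_ = np.array([[0, 1], [1, 0]])
Z_ = np.diag([1, -1])
P0, P1 = np.diag([1, 0]), np.diag([0, 1])

CCX_ = np.kron(np.eye(4) - np.kron(P1, P1), I2) + np.kron(np.kron(P1, P1), X_)
CX12 = np.kron(I2, np.kron(P0, I2) + np.kron(P1, X_))   # CNOT q_1 -> q_2
CZ01 = np.kron(np.eye(4) - 2*np.kron(P1, P1), I2)       # CZ on q_0, q_1
X0 = np.kron(X_, np.eye(4))
Z2 = np.kron(np.eye(4), Z_)

# (1): X_0 then Toffoli  ==  Toffoli then {X_0, CX_{1->2}}
assert np.allclose(CCX_ @ X0, X0 @ CX12 @ CCX_)
# (2): Z_2 then Toffoli  ==  Toffoli then {CZ_{0,1}, Z_2}
assert np.allclose(CCX_ @ Z2, Z2 @ CZ01 @ CCX_)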

Exercise 10.68 (Fault-tolerant Toffoli gate construction)

A procedure similar to the above sequence of exercises for the π/8 gate gives a fault-tolerant Toffoli gate

  1. First, swap the three qubit state \(\ket{xyz}\) you wish to transform with some known state \(\ket{000}\), then apply a Toffoli gate to the resulting qubits. Show that the following circuit accomplishes this task:

    The book contains a circuit here.
    
  2. Using the commutation rules from Exercise 10.67, show that moving the final Toffoli gate all the way back to the left side gives the circuit

    The book contains a circuit here.
    
  3. Assuming the ancilla preparation shown in the leftmost dotted box can be done fault-tolerantly, show that this circuit can be used to give a fault-tolerant implementation of the Toffoli gate using the Steane code.
Remark
This arxiv paper might be an interesting read on the topic of this exercise.

Solution

Part (1) is really just two copies of the second circuit and one copy of the third circuit from exercise 10.65.

For part (2) there isn't much to say. The exercise statement already explained that this follows from the identities from exercise 10.67.

For part (3) observe that the non-ancilla part of the circuit only contains Clifford gates (some of which are classically controlled). In the main text we have already seen how to implement those in the Steane code. So, assuming the ancilla can also be prepared so that it is already Steane-encoded, the whole circuit can be implemented in a fault-tolerant way.

Exercise 10.69

Show that a single failure anywhere in the ancilla preparation and verification can lead to at most one \(X\) or \(Y\) error in the ancilla output.

Proof

The following argument is phrased for the 7-qubit Steane code, but it is generic: it applies to any \([n,1]\) stabilizer code, so it also covers e.g. the 5-qubit code. Note, however, that strictly speaking several assumptions go into the construction of the schematic circuit from Figure 10.28. For example, we assume that the controlled-\(M\) operation can be implemented in a transversal way. For this exercise we are only concerned with the prepare-and-verify part of the circuit, which allows us to be more general. You can always set \(n=7\) to restrict everything to the Steane code and keep things simple.

Let us distinguish two cases:

  1. A single error in the preparation stage (and no error in verification stage).
  2. A single error in the verification stage (and no error in preparation stage).

Consider case 1 (error in preparation stage). Let us briefly show that multiple \(X\) errors are possible directly after the preparation stage even if only one of its components fails. To see this assume that the first CNOT between the first and second ancilla produces a single \(X\) error at its control qubit (ancilla 1). By the propagation rules for CNOT this error gets distributed (by the following CNOT gates) to almost all ancilla qubits, leading to the following total error after preparation:

\[X\otimes I \otimes X^{\otimes n-2}\]

For that reason we assume that the error after the preparation stage is in the most general form according to the error model from the main text

\[ E = \bigotimes_{j=1}^n N_j, \text{ where } N_j \in \{I,X,Y,Z\} . \]

Recall that this error model says that a component fails with a certain probability \(p\) and produces only Pauli errors at its outputs. If no error had occurred, the output of the preparation stage would be the cat state:

\[ \cat = \frac{1}{\sqrt{2}} (\ket{0^n} + \ket{1^n}) . \]

With error the output is \(E\cat\). Now we show that the verification stage detects if any \(X\) or \(Y\) errors are present (but it does not see \(Z\) errors). In fact, the verification stage can be implemented as a series of \(n-1\) pairs of CNOT gates

\[ G_j = \cx_{j,x} \cx_{j+1,x}, \quad (j=1..n-1), \]

(\(x\) being an additional ancilla outside the "main" ancillas initialized to \(\ket{0}\) for each \(G_j\)) which are followed by a measurement (each). It is not hard to see that the verification step implements the measurement of all \(g_j=Z_jZ_{j+1}\) for \(j=1..n-1\). These operators can be seen as the stabilizer of the two dimensional subspace spanned by

\[ \ket{0^n}, \ket{1^n} , \]

which contains \(\cat\). Thus, applying the verification to \(E\cat\), it either detects an error (which saves us) or it says "OK" and the final state is one of the two possibilities

\[ \catp{\pm} = \frac{1}{\sqrt{2}} (\ket{0^n} \pm \ket{1^n}) . \]

Note that \(\catp{+}=\cat\) is the cat state and contains effectively no errors, but it could be the result of an even number of \(Z\) errors (which do not affect the cat state). In case of an odd number of \(Z\) errors we get \(\catp{-}\). This would be a "proper" error. It does not propagate into the data qubits (because \(Z\) commutes with the control of any controlled gate), but it would be decoded into \(\ket{10^{n-1}}\), which would be measured as \(1\) at the first ancilla. Hence this error would flip the measurement result of the overall procedure (whose goal is to measure an operator \(M\) whose eigenvalues are \(\pm1\)). This concludes case 1, as we have seen that no \(X\) or \(Y\) errors can hide from the verification procedure.
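As a quick numerical illustration of case 1 (a plain numpy sketch for \(n=4\), not needed for the proof): the propagated error \(X\otimes I\otimes X^{\otimes n-2}\) from the example above is caught already by the first verification measurement \(Z_1Z_2\).

import numpy as np

I2 = np.eye(2)
X_ = np.array([[0, 1], [1, 0]])
Z_ = np.diag([1, -1])

def kron_all(*ops):
    out = np.array([[1.0]])
    for op in ops:
        out = np.kron(out, op)
    return out

n = 4
cat = np.zeros(2**n)
cat[0] = cat[-1] = 1/np.sqrt(2)               # (|0000> + |1111>)/sqrt(2)

E = kron_all(X_, I2, X_, X_)                  # X ⊗ I ⊗ X^{n-2}
Z1Z2 = kron_all(Z_, Z_, I2, I2)

assert np.allclose(Z1Z2 @ (E @ cat), -(E @ cat))   # eigenvalue -1: error detected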

Consider case 2 (verification fails). In that case, by assumption, the preparation stage correctly produces \(\cat\). Any error (\(X,Y,Z\)) originating from a control of the \(G_j\) (see above) can only propagate into the additional ancillas and hence not into the rest of the \(n\) "main" ancillas. Moreover, a \(Z\) error in the additional ancillas can propagate into the main ancillas, but an \(X\) error cannot. Similarly, a \(Y=\ii\?XZ\) error in the additional ancillas can only propagate a \(Z\) error into the main ancillas.

Hence, in this case, an \(X\) or \(Y\) error can only be produced if it originates from a part of a component (a control of one of the CNOT gates) which is attached to a main ancilla, and it cannot propagate further into the other main ancillas. QED.

Exercise 10.70

Show that \(Z\) errors in the ancilla do not propagate to affect the encoded data, but result in an incorrect measurement result being observed.

Solution

In this solution we do not need to restrict to the Steane code. Instead we consider a general stabilizer code on \(n\) qubits (Steane has \(n=7\)) for which the transversal construction for the controlled \(M\) operator works as outlined and such that \(n\) is odd. See also the first paragraph of the solution of exercise 10.69.

Let us briefly recall that \(Z\) errors commute with the control of any controlled gate \(C(U)\). In fact, we have

\[ C(U) = P_0\otimes I + P_1\otimes U . \]

Note that the projections \(P_j=\proj{j}\) onto the standard basis actually commute with \(Z\) because the standard basis is the eigenbasis of \(Z\) (by definition). For that reason we have

\[ C(U) \cdot Z\otimes I = (P_0 Z) \otimes I + (P_1 Z) \otimes U = (Z P_0) \otimes I + (Z P_1) \otimes U = Z\otimes I \cdot C(U) . \]

This implies that \(Z\) errors in the ancillas cannot propagate into the data qubits because ancillas and data are only connected via CNOT gates (with controls in the ancillas). This shows the first part of the claim.
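A quick numerical check of this commutation relation (my own sketch, using a random unitary from qiskit.quantum_info; the control is the first factor of the Kronecker product):

import numpy as np
from qiskit.quantum_info import random_unitary

I2 = np.eye(2)
Z_ = np.diag([1, -1])
P0, P1 = np.diag([1, 0]), np.diag([0, 1])

U = random_unitary(2, seed=1).data
CU = np.kron(P0, I2) + np.kron(P1, U)         # controlled-U, control = first factor

assert np.allclose(CU @ np.kron(Z_, I2), np.kron(Z_, I2) @ CU)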

Now let us check how \(Z\) errors affect the measurement. For that purpose let us define

\[ \catp{\pm} = \frac{1}{\sqrt{2}} ( \ket{0^n} \pm \ket{1^n} ) \]

and first consider the working of the operator measurement in case no error happens. Let \(\ket{\psi}\) be the encoded data and decompose it into

\[ \ket{\psi} = a\ket{\psi_+} + b\ket{\psi_-} , \]

where \(\ket{\psi_{\pm}}\) are the encoded eigenstates of \(M\) (with eigenvalues \(\pm1\)). Right before applying the controlled \(M\) the state is \(\cat\ket{\psi}\). After applying the (encoded) controlled \(M\) operator we get (after reordering terms)

\begin{align*} & a \catp{+} \ket{\psi_+} + b\catp{-} \ket{\psi_-} . \end{align*}

At this point we need that the transversal construction of the encoded controlled \(M\) gate looks like the one in Figure 10.28 (see also Figure 10.24) and that the code uses an odd number \(n\) of qubits (so that \((-1)^n=-1\)). Decoding transforms it into

\begin{align*} & a \ket{00^{n-1}} \ket{\psi_+} + b\ket{10^{n-1}} \ket{\psi_-} . \end{align*}

Hence, measuring \(0\) (in the first ancilla) yields \(\ket{\psi_+}\) in the data qubits, and measuring \(1\) yields \(\ket{\psi_-}\) in the data qubits. This is as desired.

What happens if we get \(r\geq1\) \(Z\) errors after the prepare-verify stages? If \(r\) is even this does not affect the cat state and the procedure works as expected. If \(r\) is odd the cat state \(\cat=\catp{\?+}\) is replaced by \(\catp{-}\). In essence, in the above derivation \(\catp{+}\) and \(\catp{-}\) swap their roles. Hence, right before measurement we get the state

\begin{align*} & a \ket{10^{n-1}} \ket{\psi_+} + b\ket{00^{n-1}} \ket{\psi_-} . \end{align*}

This implies that we infer the wrong eigenvalue in this case (with 100% probability). QED.

Exercise 10.71

Verify that when \(M=e^{-\ii\pi/4}SX\) the procedure we have described gives a fault-tolerant method for measuring \(M\).

Remark
If I have not made a mistake in the proof below, the \(\pi/8\) gates at the ancillas are not needed.

Proof

Let us consider this exercise for the 7-qubit Steane code. Let us denote the ancilla qubits by the labels \(a_i\) and the data qubits by \(d_i\) (\(i=1..7\)). We show that the encoded controlled \(M\) operator is given by a product of controlled \(V=ZSX\) gates with controls and targets as in Figure 10.24 (where the first block holds the ancillas and the second block the data):

\[ \prod_{i=1}^{7} C(ZSX)_{a_i,d_i} \]

First of all observe that

\begin{align*} M Z M^\dagger &= SX Z XS^\dagger = -Z , \\ M X M^\dagger &= SX X XS^\dagger = Y . \end{align*}

This is the behaviour we want to have on the encoded qubits (logical qubits). On the other hand we have (on the physical qubits)

\begin{align*} V Z V^\dagger &= ZSX Z XS^\dagger Z = -Z , \\ V X V^\dagger &= ZSX X XS^\dagger Z = -Y . \end{align*}

This is exactly what we want for the Steane code! In fact, the logical Pauli operators are \(\bar{Z}=Z^{\otimes7}\), \(\bar{X}=X^{\otimes7}\), and \(\bar{Y}=-Y^{\otimes7}\). The formula for \(\bar{Y}\) follows from \(Y=\ii\?XZ\). So the minus sign in front of \(Y\) is exactly what we want. QED.
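The four single-qubit conjugation relations used above are quickly confirmed numerically (a self-contained numpy sketch):

import numpy as np

X_ = np.array([[0, 1], [1, 0]])
Y_ = np.array([[0, -1j], [1j, 0]])
Z_ = np.diag([1, -1])
S_ = np.diag([1, 1j])

M = np.exp(-1j*np.pi/4) * S_ @ X_
V = Z_ @ S_ @ X_

assert np.allclose(M @ Z_ @ M.conj().T, -Z_)
assert np.allclose(M @ X_ @ M.conj().T, Y_)
assert np.allclose(V @ Z_ @ V.conj().T, -Z_)
assert np.allclose(V @ X_ @ V.conj().T, -Y_)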

WIP Exercise 10.72 (Fault-tolerant Toffoli ancilla state construction)

Show how to fault-tolerantly prepare the state created by the circuit in the dotted box of Exercise 10.68, that is,

\[ \ket{\Theta} = \frac{1}{2}( \ket{000} + \ket{100} + \ket{010} + \ket{111} ) \]

You may find it helpful to first give the stabilizer generators for this state.

Remark
Since I couldn't solve the exercise in the way suggested I searched for a solution elsewhere. In section 6.2 of (Shor, Peter W., 1996) (link) I found a nice and transparent construction which does not take the approach suggested here (but it is still based on cat states). I didn't read the whole paper but it seems to have had a big influence on the contents of chapter 10 (it is also cited by the book).

Attempted solution

By construction we have

\[ \ket{\Theta} = \mathrm{CCX} H_1 H_2 \ket{000} . \]

So in order to find generators of the stabilizer of \(\ket{\Theta}\) we start with the generators

\[ \langle Z_1, Z_2, Z_3 \rangle \]

of \(\ket{000}\) and apply the stabilizer formalism. Applying the two Hadamard gates according to the stabilizer formalism (every generator gets conjugated by the gates: \(g\mapsto\?UgU^\dagger\)) yields

\[ \langle X_1, X_2, Z_3 \rangle . \]

The interesting thing happens when we apply the Toffoli gate (a non-Clifford gate). In that case we leave the Pauli group and get:

\[ \langle X_1\cx_{23}, X_2\cx_{13}, Z_3\cz_{12} \rangle . \]

To see this use exercise 10.67. All three generators are three-qubit gates, but they still have eigenvalues \(\pm1\) (as they are produced by conjugation).
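As a sanity check (my own numpy sketch, with Kronecker order qubit \(1\otimes2\otimes3\)), one can verify that these three operators indeed fix \(\ket{\Theta}\):

import numpy as np

I2 = np.eye(2)
X_ = np.array([[0, 1], [1, 0]])
Z_ = np.diag([1, -1])
P0, P1 = np.diag([1, 0]), np.diag([0, 1])

theta = np.zeros(8)
theta[[0b000, 0b100, 0b010, 0b111]] = 1/2     # (|000>+|100>+|010>+|111>)/2

X1 = np.kron(X_, np.eye(4))
X2 = np.kron(I2, np.kron(X_, I2))
Z3 = np.kron(np.eye(4), Z_)
CX23 = np.kron(I2, np.kron(P0, I2) + np.kron(P1, X_))        # control 2, target 3
CX13 = np.kron(P0, np.eye(4)) + np.kron(P1, np.kron(I2, X_)) # control 1, target 3
CZ12 = np.kron(np.eye(4) - 2*np.kron(P1, P1), I2)            # CZ on qubits 1, 2

for g in (X1 @ CX23, X2 @ CX13, Z3 @ CZ12):
    assert np.allclose(g @ theta, theta)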

Now I would like to use the construction from Figure 10.28. However, there are several problems I cannot solve at the moment. For example, the construction requires putting yet another layer of control on the generators. Since the generators already contain controlled gates, this requires doubly controlled gates (a priori at least). This does not look like a promising approach to me. So I leave this solution open-ended …

Exercise 10.73 (Fault-tolerant encoded state construction)

Show that the Steane code encoded \(\ket{0}\) state can be constructed fault-tolerantly in the following manner.

  1. Begin with the circuit of Figure 10.16, and replace the measurement of each generator, as shown in Figure 10.30, with each ancilla qubit becoming a cat state \((\ket{0^7}+\ket{1^7})/\sqrt{2}\), and the operations rearranged to have their controls on different qubits, so that errors do not propagate within the code block.
  2. Add a stage to fault-tolerantly measure \(Z\).
  3. Calculate the error probability of this circuit, and of the circuit when the generator measurements are repeated three times and majority voting is done.
  4. Enumerate the operations which should be performed conditioned on the measurement results and show that they can be done fault-tolerantly.

Solution

The construction which is described above is just a variation of the schematic given in Figure 10.28. Each of the six measurements (one for each generator) has its own block of seven ancilla qubits (but the same data block). Measuring them in a row implies that we get a state \(\ket{\psi}\) (in the data block) which is a simultaneous eigenstate of the six generators. Repeating this procedure three times and taking majority votes ensures that the probability of an incorrect measurement result is \(O(p^2)\) (NOTE: this is not explicitly mentioned in the book, but each repetition reuses the data-block output from the previous iteration. This works because errors do not propagate within the data block and hence at most one data qubit is corrupted in case a single component fails).

As already mentioned, after all this the state \(\ket{\psi}\) is a simultaneous eigenstate of the six generators. Let us denote the measurement result by a bit vector \(b\in\BB^6\). The state \(\ket{\psi}\) is \(\ket{0_L}\) iff \(b=0^6\). Otherwise we can correct the state by treating \(b\) as a syndrome and applying operators according to the table produced in the solution of exercise 10.61.
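For the last step, here is a minimal sketch of the correction lookup (assuming the steane_syndromes table from exercise 10.61 and that the six generators are measured in the same order as used there; the helper name correction_for is my own):

def correction_for(b: str, syndromes: dict[str, str]) -> str:
    """Return the Pauli string to apply for the measurement result b."""
    if b == "0" * len(b):
        return "I" * 7                          # already |0_L>, nothing to do
    syndrome_to_error = {s: e for e, s in syndromes.items()}
    return syndrome_to_error[b]                 # each E is self-inverse, so E itself is the recovery

correction_for("001000", steane_syndromes)      # -> 'IIZIIII', i.e. apply Z_3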

Exercise 10.74

Construct a quantum circuit to fault-tolerantly generate the encoded \(\ket{0}\) state for the five qubit code (Section 10.5.6).

Sketch of a solution

Conceptually one does exactly the same as for the Steane code. Just have a look at exercise 10.73. I see no benefit in repeating it here.