Chapter 3

\[ \newcommand{\BB}{\mathbb{B}} \newcommand{\CC}{\mathbb{C}} \newcommand{\HH}{\mathbb{H}} \newcommand{\KK}{\mathbb{K}} \newcommand{\NN}{\mathbb{N}} \newcommand{\QQ}{\mathbb{Q}} \newcommand{\RR}{\mathbb{R}} \newcommand{\ZZ}{\mathbb{Z}} \newcommand{\ii}{\mathrm{i}} \newcommand{\jj}{\mathrm{j}} \newcommand{\kk}{\mathrm{k}} \newcommand{\dd}{\mathrm{d}} \newcommand{\FT}{\mathcal{F}} % Fourier Transform \newcommand{\tto}{\twoheadrightarrow} \newcommand{\inv}{^{-1}} \newcommand{\RF}{\mathrm{RF}} \newcommand{\sys}{\mathrm{sys}} \newcommand{\diag}{\mathrm{diag}} \newcommand{\cx}{\mathrm{CX}} \newcommand{\cy}{\mathrm{CY}} \newcommand{\cz}{\mathrm{CZ}} \newcommand{\cat}{\ket{\mathrm{cat}}} \newcommand{\catp}[1]{\ket{\mathrm{cat}_{#1}}} \newcommand{\calE}{\mathcal{E}} \newcommand{\calF}{\mathcal{F}} \newcommand{\calH}{\mathcal{H}} \newcommand{\calR}{\mathcal{R}} \newcommand{\abs}[1]{\left|#1\right|} \newcommand{\norm}[1]{\|#1\|} \newcommand{\sprod}[2]{\langle#1|#2\rangle} % deprecated, use braket instead. \newcommand{\braket}[2]{\langle#1|#2\rangle} % scalar product \newcommand{\ptrace}[2]{\mathrm{tr}_{#1}\left(#2\right)} \newcommand{\trace}[1]{\mathrm{tr}\left(#1\right)} \newcommand{\rank}[1]{\mathrm{rank}\left(#1\right)} \newcommand{\floor}[1]{\lfloor#1\rfloor} \newcommand{\ceil}[1]{\lceil#1\rceil} \newcommand{\bra}[1]{\langle#1|} \newcommand{\ket}[1]{|#1\rangle} \newcommand{\proj}[1]{\ket{#1}\bra{#1}} \newcommand{\mean}[1]{\langle#1\rangle} \newcommand{\wt}[1]{\mathrm{wt}\left(#1\right)} \newcommand{\prob}[1]{\mathrm{Prob}\left[#1\right]} \newcommand{\orac}{\mathrm{Orac}} \newcommand{\?}{} % sometimes I need just a separator other than whitespace \]

Setup

Python Libraries

from functools import reduce
from typing import Literal, NewType, Optional
from collections.abc import Callable

from utils import TermColor

Turing Machines

Introduction

In this section we define a class TuringMachine which allows us to execute a Turing program like this one:

# This program computes the function f(n)=1.
# Example: input="▶1001", output="▶1".
(S,▶,A,▶,+) # S is the start state
(A,0,A,■,+)
(A,1,A,■,+)
(A,■,B,■,-)
(B,■,B,■,-)
(B,▶,C,▶,+)
(C,■,H,1,0) # H is the halt state

Below we initialize a Turing machine my_tm from the above source code:

my_tm = TuringMachine.fromSource(source, start_state="S", halt_state="H")
my_tm.LOG_LEVEL = 3
my_tm.COLORED_LOG = False  # set to false since listing below does not support colors

result = my_tm.run("▶0")
print(f"\nresult = '{result}'")

Setting LOG_LEVEL=3 makes the Turing machine print some information for each step of the execution. The result can be seen in the following listing (with COLORED_LOG=True it would be even more informative). Note that in theory the tape is infinite. We realize this by dynamically allocating new blank cells whenever we want to access a cell that is not yet allocated (we roughly double the tape size in that case).


Executing step 1
tape = '▶0'
state='S', symbol='▶', head_position=0
source line = 2: (S,▶,A,▶,+) # S is the start state

Executing step 2
tape = '▶0'
state='A', symbol='0', head_position=1
source line = 3: (A,0,A,■,+)

Executing step 3
tape = '▶■■■■'
state='A', symbol='■', head_position=2
source line = 5: (A,■,B,■,-)

Executing step 4
tape = '▶■■■■'
state='B', symbol='■', head_position=1
source line = 6: (B,■,B,■,-)

Executing step 5
tape = '▶■■■■'
state='B', symbol='▶', head_position=0
source line = 7: (B,▶,C,▶,+)

Executing step 6
tape = '▶■■■■'
state='C', symbol='■', head_position=1
source line = 8: (C,■,H,1,0) # H is the halt state

Execution took 6 steps.

result = '▶1'

Implementation

First we introduce some useful types. We represent states by strings since this allows for readable, or at least mnemonic, state names, which would not be possible with integers. The SourceMap is responsible for connecting each parsed line of the Turing program with its original line in the source string/file. It is the reason why, in the listing above, the TuringMachine instance can refer to its source (see the lines starting with source line =).

Char = NewType("Char", str)  # better than nothing, represents a letter from the alphabet
State = NewType("State", str)
Move = Literal[-1, 0, +1]
TuringProgLine = tuple[State, Char, State, Char, Move]  # (s,c,s',c',m)
TuringProgram = list[TuringProgLine]

SourceMap = list[tuple[int, str]]
TuringParser = Callable[[str], tuple[TuringProgram, SourceMap | None]]

Below follows the implementation of the TuringMachine. The logging functionality bloats the implementation a bit, but if you ignore it the implementation is quite simple and straightforward to understand (IMHO).

Note that the __init__ method takes the program as an already parsed list of tuples (TuringProgram). The canonical way to create a TuringMachine is the fromSource factory method. Among other things, you can optionally provide a custom parser. This can be useful to introduce syntactic sugar. For example, the code snippet

(a,0,a,0,+)
(a,1,a,1,+)
(a,2,a,2,+)
(a,3,a,3,+)
(a,4,a,4,+)
(a,5,a,5,+)

might seem somewhat verbose. The following would probably be nicer syntax:

(a,0|1|2|3|4|5,a,$,+)  # "$" meaning: take the original
# or even (if this is the intended meaning)
(a,*,a,$,+)  # "*" meaning: match all

Another reason to use a custom parser is to circumvent some restrictions of the default parser. For example, the default parser treats the symbols #, ,, (, ) in a special way. In particular, those cannot be used as characters on the tape.
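
To illustrate, here is a minimal sketch of such a sugar-expanding parser (hypothetical and not part of the implementation below). It expands the | alternatives and the $ shorthand, and delegates each expanded line to the default line parser parse_turing_program_line (shown further below):

def parse_with_alternatives(source: str) -> tuple[TuringProgram, SourceMap]:
    """Sketch of a custom parser which expands `(a,0|1,a,$,+)` into one program
    line per alternative. A `$` in the write position means: write the matched
    symbol. Comment handling etc. is kept minimal."""
    program: TuringProgram = []
    source_map: SourceMap = []
    for line_no, line in enumerate(source.split("\n")):
        code = line.split("#", 1)[0].strip()
        if not code:
            continue  # skip empty lines and line comments
        s0, c0, s1, c1, m = code.lstrip("(").rstrip(")").split(",")
        for symbol in c0.split("|"):
            write = symbol if c1 == "$" else c1
            # The expanded line is well-formed, so the parser never returns None:
            program.append(parse_turing_program_line(f"({s0},{symbol},{s1},{write},{m})"))
            source_map.append((line_no, line))
    return program, source_map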

class TuringMachine:
    """Given a turing program produces the corresponding turing machine."""
    BLANK: Char = "■"  # Special symbol for blank cells. Used when the tape grows dynamically.
    MAX_STEPS = 1000000  # Abort after running this many steps
    LOG_LEVEL = 0  # Set greater than 0 for a chatty execution
    COLORED_LOG = True  # Turn off if the environment does not support colors
    SOURCE_LINE_OFFSET = 0  # SourceMap assumes that the source file starts with line 0. Adjust this here.

    def __init__(self, program: TuringProgram,
                 start_state: State = "START", halt_state: State = "HALT",
                 source_map: Optional[SourceMap] = None):
        self._start_state = start_state
        self._halt_state = halt_state
        self._program = program
        self._source_map = source_map

        assert source_map is None or len(source_map) == len(program), "Invalid source_map"

        self._state: Optional[State] = None
        self._head_position: Optional[int] = None
        self._tape: list[Char] = []  # gets its input during execution
        self._step_count: Optional[int] = None

        # Only relevant for logging:
        self._index_width = len(str(len(self._program)-1))

    @classmethod
    def fromSource(cls, source: str,
                   start_state: State = "START", halt_state: State = "HALT",
                   parser: TuringParser = parse_turing_program) -> "TuringMachine":
        """Create a turing machine from a source string. You can provide a custom parser."""
        program, source_map = parser(source)
        return TuringMachine(program,
                             start_state=start_state, halt_state=halt_state,
                             source_map=source_map)

    def run(self, tape_input: str) -> str:
        """Run the TM and return what is on the tape after it halts."""
        self._state = self._start_state
        self._head_position = 0
        self._tape = list(tape_input)
        self._step_count = 0

        while self._state != self._halt_state:
            if self._step_count > self.MAX_STEPS:
                raise Exception("Turing machine takes too long. Aborting.")
            self._step_count += 1
            self._log(2, f"\nExecuting step {self._step_count}")
            self._run_one_step()

        self._log(1, f"\nExecution took {self._step_count} steps.")

        return self._tape_content.strip(self.BLANK)

    @property
    def _line_no_width(self):  # only relevant for logging
        return None if self._source_map is None \
            else len(str(self._source_map[-1][0] + self.SOURCE_LINE_OFFSET - 1))

    def _log(self, level, *args, **kwargs):
        if self.LOG_LEVEL >= level:
            print(*args, **kwargs)

    def _run_one_step(self) -> None:
        self._log(3, f"tape = '{self._colored_tape_content}'")
        current_line = self._current_program_line

        if current_line is None:
            RED, ENDC = (TermColor.RED, TermColor.ENDC) if self.COLORED_LOG else ("", "")
            self._log(2, f"{RED}No matching program line - halting{ENDC}")
            self._state = self._halt_state
            return

        index, current_line = current_line
        self._log(4, f"program line = {index:{self._index_width}}: {current_line}")

        if self._source_map is not None:
            line_no, source_line = self._source_map[index]
            line_no += self.SOURCE_LINE_OFFSET
            self._log(3, f"source line = {line_no:{self._line_no_width}}: {source_line}")

        s1, c1, m = current_line[2:]

        self._state = s1
        self._head_symbol = c1
        # avoid negative positions:
        self._head_position = max(self._head_position + m, 0)

    @property
    def _tape_content(self) -> str:
        return "".join(self._tape)

    @property
    def _colored_tape_content(self) -> str:
        self._enlarge_tape_if_necessary()
        text = self._tape_content
        pos = self._head_position
        GREEN, ENDC = (TermColor.GREEN, TermColor.ENDC) if self.COLORED_LOG else ("", "")
        return text[:pos] + GREEN + text[pos] + ENDC + text[pos+1:]

    @property
    def _current_program_line(self) -> tuple[int, TuringProgLine] | None:
        s0, c0 = self._state, self._head_symbol
        self._log(2, f"state='{s0}', symbol='{c0}', head_position={self._head_position}")
        for i, line in enumerate(self._program):
            s, c = line[:2]
            if s0 == s and c0 == c:
                return i, line

        return None

    @property
    def _head_symbol(self) -> Char:
        self._enlarge_tape_if_necessary()
        return self._tape[self._head_position]

    @_head_symbol.setter
    def _head_symbol(self, new_symbol: Char) -> None:
        self._enlarge_tape_if_necessary()  # just being paranoid
        self._tape[self._head_position] = new_symbol

    def _enlarge_tape_if_necessary(self) -> None:
        """Call this to ensure that our finite tape can actually be accessed at head position."""
        pos = self._head_position
        while pos >= len(self._tape):  # only one loop usually suffices
            some_blanks = [self.BLANK] * (1 + len(self._tape))
            self._tape += some_blanks  # double tape size

Finally, the default parser. (Note that in an actual Python module it would have to be defined before the class TuringMachine, since it is used there as a default argument.)

def parse_turing_program_line(line: str) -> TuringProgLine | None:
    """Parse a single line of the form `(s,c,s',c',m)` (up to comments).

    Empty or comment lines are ignored by returning `None`."""
    # Remove comments and leading/trailing whitespace
    line = line.split('#', 1)[0].strip()
    if len(line) == 0:
        return None  # line comment or empty line

    line = line.lstrip("(").rstrip(")")
    s0, c0, s1, c1, m = line.split(",")
    m = +1 if m == "+" else m  # for convenience ...
    m = -1 if m == "-" else m  # ... allow shortcuts for +1, -1
    m = int(m)
    assert len(c0) == 1, f"Expected a single character, got '{c0}'."
    assert len(c1) == 1, f"Expected a single character, got '{c1}'."
    assert m in [-1, 0, +1], f"Forbidden head movement: {m}."

    return s0, c0, s1, c1, m


def parse_turing_program(source: str) -> tuple[TuringProgram, SourceMap]:
    """Parses the source of a turing program line by line."""
    program: TuringProgram = []
    source_map: SourceMap = []

    lines = source.split("\n")
    for line_no, line in enumerate(lines):
        try:
            parsed_line = parse_turing_program_line(line)
        except Exception:
            # Probably good enough for such a simple language:
            raise Exception(f"Could not parse line {line_no}: '{line}'")

        if parsed_line is not None:
            program.append(parsed_line)
            source_map.append((line_no, line))

    return program, source_map
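
As a quick sanity check of the default parser (assuming the variable source still holds the program from the introduction):

program, source_map = parse_turing_program(source)
assert program[0] == ("S", "▶", "A", "▶", 1)
assert source_map[0] == (2, "(S,▶,A,▶,+) # S is the start state")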

Exercises

TODO Exercise 3.1 (Non-computable processes in Nature)

How might we recognize that a process in Nature computes a function not computable by a Turing machine?

Exercise 3.2 (Turing numbers)

Show that single-tape Turing machines can each be given a number from the list \(1,2,3,\ldots\) in such a way that the number uniquely specifies the corresponding machine. We call this number the Turing number of the corresponding Turing machine. (Hint: Every positive integer has a unique prime factorization \(p_1^{a_1}p_2^{a_2}\cdots p_k^{a_k}\), where \(p_i\) are distinct prime numbers, and \(a_1,a_2,\ldots,a_k\) are non-negative integers.)

Proof

For simplicity we assume that the state space and the alphabet are fixed for all Turing machines we consider (one could easily relax this constraint without complicating the encoding). Hence we need to encode the start state, the halt state and the program into a single number; this is sufficient to describe the Turing machine. We follow the proposal to use prime numbers for that. To this end let us enumerate the prime numbers starting with \(2\): \(p_1=2\), \(p_2=3\), …, \(p_5=11\), … .

This procedure is certainly not the most efficient one, but it serves the purpose of showing that all Turing machines (satisfying a specific definition/architecture) can be effectively enumerated. That is, there is an algorithm (which could in principle run on a Turing machine) which assigns a number to each Turing machine. Moreover, this assignment is invertible and the inverse mapping (numbers to Turing machines) can be computed by an algorithm too.

Let us identify the states \(S\) and the elements of the alphabet \(\Gamma\) with integers:

\[ S = \{0, 1, 2, \ldots\} ; \quad \Gamma = \{0, 1, 2, \ldots\} . \]

Moreover, we encode the possible movements by \(\{0,1,2\}\), meaning left, stay, right in that order. We encode the start state \(s\in\NN\) and the halt state \(h\in\NN\) by

\[ p_1^s \text{ and } p_2^h . \]

The \(i\)-th program line (counting from \(i=0\))

\[ (q_i,c_i,q'_i,c'_i,m_i) \in \NN^4\times\{0,1,2\} \]

can be encoded by

\[ l_i = p_{5i+3}^{q_i} \, p_{5i+4}^{c_i} \, p_{5i+5}^{q'_i} \, p_{5i+6}^{c'_i} \, p_{5i+7}^{m_i} . \]

The whole program is then encoded by the product \(\prod_i l_i\) and the Turing machine itself by \(p_1^s\,p_2^h\,\prod_i l_i\). (To make the decoding fully unambiguous one can shift every exponent by one; then each used prime actually appears in the factorization and the number of program lines is recoverable.) Since the prime factorization of integers is unique, this encoding is invertible and hence the Turing machine can be recovered from this (potentially large) number. QED.
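
Just for fun, here is a small Python sketch of this encoding (not needed for the proof). It uses the exponent shift mentioned above, so that every used prime actually appears in the factorization; primes is a naive helper and the program lines are assumed to be already translated to integers:

from math import prod

def primes(count: int) -> list[int]:
    """Return the first `count` primes by naive trial division (fine for a demo)."""
    found: list[int] = []
    candidate = 2
    while len(found) < count:
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found

def turing_number(s: int, h: int, program: list[tuple[int, int, int, int, int]]) -> int:
    """Encode start state, halt state and program lines as prime exponents."""
    exponents = [s, h] + [x for line in program for x in line]
    return prod(p ** (e + 1) for p, e in zip(primes(len(exponents)), exponents))

# Distinct machines get distinct numbers:
assert turing_number(0, 1, [(0, 0, 1, 0, 2)]) != turing_number(0, 1, [(0, 0, 1, 0, 1)])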

Exercise 3.3 (Turing machine to reverse a bit string)

Describe a Turing machine which takes a binary number \(x\) as input, and outputs the bits of \(x\) in reverse order. (Hint: In this exercise and the next it may help to use a multi-tape Turing machine and/or symbols other than ▶, 0, 1, and the blank.)

Solution

We use a single tape together with an additional symbol "□", which we call the white blank in the following.

Assume that the input tape starts with the left-end marker "▶" followed by the number \(x\) in binary. All other cells should be blank. As an example, for \(x=13\) the input tape should look like this:

┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬
│ ▶ │ 1 │ 1 │ 0 │ 1 │   │   │   │   │   │ ...
└───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴

Note that the empty tape is not allowed - there should always be at least one digit. The Turing machine should halt with only the reversed bitstring at the beginning of the tape. In the example above:

┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬
│ ▶ │ 1 │ 0 │ 1 │ 1 │   │   │   │   │   │ ...
└───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴

The idea of the algorithm is as follows (the white blanks simplify the implementation):

1. Reverse the string, but shifted:
   ▶1101■■■■■ → ▶□□□□1011■
2. Shift the reversed string to the correct position:
   ▶□□□□1011■ → ▶1011□□□□■
3. Remove the trailing white blanks:
   ▶1011□□□□■ → ▶1011■■■■■

Step by step, the first part does the following. What we call a step here actually comprises several Turing machine steps (since the head can only move by one position at a time). Note already that the last displayed cell, which always contains a blank, is important as an end-marker for the second part of the algorithm.

┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬
│ ▶ │ 1 │ 1 │ 0 │ 1 │   │   │   │   │   │ ...
└───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴
┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬
│ ▶ │ 1 │ 1 │ 0 │ □ │ 1 │   │   │   │   │ ...
└───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴
┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬
│ ▶ │ 1 │ 1 │ □ │ □ │ 1 │ 0 │   │   │   │ ...
└───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴
┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬
│ ▶ │ 1 │ □ │ □ │ □ │ 1 │ 0 │ 1 │   │   │ ...
└───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴
┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬
│ ▶ │ □ │ □ │ □ │ □ │ 1 │ 0 │ 1 │ 1 │   │ ...
└───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴

The second part of the algorithm does this:

┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬
│ ▶ │ □ │ □ │ □ │ □ │ 1 │ 0 │ 1 │ 1 │   │ ...
└───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴
┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬
│ ▶ │ 1 │ □ │ □ │ □ │ □ │ 0 │ 1 │ 1 │   │ ...
└───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴
┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬
│ ▶ │ 1 │ 0 │ □ │ □ │ □ │ □ │ 1 │ 1 │   │ ...
└───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴
┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬
│ ▶ │ 1 │ 0 │ 1 │ □ │ □ │ □ │ □ │ 1 │   │ ...
└───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴
┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬
│ ▶ │ 1 │ 0 │ 1 │ 1 │ □ │ □ │ □ │ □ │   │ ...
└───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴

Finally the third part acts like this:

┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬
│ ▶ │ 1 │ 0 │ 1 │ 1 │ □ │ □ │ □ │ □ │   │ ...
└───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴
┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬
│ ▶ │ 1 │ 0 │ 1 │ 1 │ □ │ □ │ □ │   │   │ ...
└───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴
┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬
│ ▶ │ 1 │ 0 │ 1 │ 1 │ □ │ □ │   │   │   │ ...
└───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴
┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬
│ ▶ │ 1 │ 0 │ 1 │ 1 │ □ │   │   │   │   │ ...
└───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴
┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬
│ ▶ │ 1 │ 0 │ 1 │ 1 │   │   │   │   │   │ ...
└───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴

In the following we present a Turing program which does exactly this. Don't worry, most of the program consists of comments - explaining how it works and at the same time giving a kind of inline proof that the program is correct in the sense that it "does what we want".

To best understand the inline proof, read the following explanations first:

  • The program consists of a set of source blocks - a kind of subroutine of the program.
  • Each block is a sequence of program lines without intermediate empty lines (line comments are OK).
  • Each block has one or more entry lines (marked by a comment # entry). If a block is entered we require that this happens at one of these lines - everything else is considered an error (which does not happen if the program as a whole is correct).
  • Each block has one or more exit lines (marked by a comment # exit). Each block must guarantee that upon entering another block (or halting) we do so via one of these lines.
  • Each block has pre-conditions. These are statements that we require to hold immediately before the first line of the block is executed (each time a block is "called"). If we enter a block but the pre-conditions do not hold we call this undefined behavior. We require that our program never runs into undefined behavior.
  • Each block has post-conditions. These are statements that we require to hold immediately after we "return" from the block. The block must ensure that these statements hold.
  • Each block has zero or more successor blocks. Upon exiting the block it is guaranteed that we enter one of these successor blocks. This information helps navigating the code, since there are usually only very few successor blocks for each block.
# Turing program to reverse a string of bits (0s and 1s):
#   - input  : "▶a...z■..." where a...z is a string of 0s and 1s
#     followed by at least length-of(a..z)+1 blanks (this is important).
#   - output : "▶z...a■..." where z...a is the reversed string.
# The head starts and finishes at "▶".

# Note: in the following we assume that the input string is non-empty to simplify
# the analysis of the *generic* case. But it is easy to check that this corner case
# also works. This is more or less an accident: it relies on the fact that (by our
# definition) a Turing machine halts if no matching program line is found.

#
##
### Part 1: Map ▶a...z■...■ to ▶□...□z...a
##
#

# In part 1 the tape always contains "▶" followed by a bitstring A, followed
# by a series W of *white* blanks (special markers), followed by a bitstring B.
# Initially B and W are empty and A is the input string. At the end A is empty,
# W is at its place, and B contains the reversed string.

# B1: At the beginning we just step into A
# Pre-Conditions:
#   - tape content is "▶" followed by A, followed by blanks
#   - A is non-empty, W and B are empty
#   - Head is at "▶"
# Post-Conditions:
#   - Head is at the beginning of A
#   - tape content is unchanged (by the action of this block)
# Successor Blocks: B2.
(START,▶,a,▶,+) # entry and exit

# B2: Move the head to the end of A
# Pre-Conditions:
#   - Head is inside A,
#   - A is followed by at least one blank.
# Post-Conditions:
#   - Head is at last bit of A (points at 0 or 1)
#   - tape content is unchanged
# Successor Blocks: B3
(a,0,a,0,+) # entry
(a,1,a,1,+) # entry
(a,■,b,■,-) # exit

# B3: Special case if B is still empty: Move the last bit of A one step to the right.
# Pre-Conditions:
#   - Head is at last bit of A
#   - B is empty, A is followed by blanks
# Post-Conditions:
#   - Last bit of A replaced by *white* blank and moved one step to right;
#   - B consists of one bit;
#   - Head is inside W
# Successor Blocks: B4
(b,0,b0,□,+) # entry
(b,1,b1,□,+) # entry
(b0,■,c,0,-) # exit
(b1,■,c,1,-) # exit

# B4: Traverse to the left over a series of white blanks
# Pre-Conditions:
#   - Head is inside W
#   - B is non-empty (but A might be empty)
# Post-Conditions:
#   - Head is at the last bit of A or at "▶" if A is empty
# Successor Blocks: B5, B7
(c,□,c,□,-) # entry and exit

# B5: Move the right-most bit of A to the end of B.
# Pre-Conditions:
#   - A and B are non-empty
#   - Head is at the last bit of A
# Post-Conditions:
#   - Head is inside B
# Successor Blocks: B6
(c,0,c0,□,+) # entry
(c,1,c1,□,+) # entry
# Move over the white blanks and then past the end of B:
(c0,□,c0,□,+)
(c0,0,c0,0,+)
(c0,1,c0,1,+)
(c1,□,c1,□,+)
(c1,0,c1,0,+)
(c1,1,c1,1,+)
# Append the remembered bit to the end of B
(c0,■,d,0,-) # exit
(c1,■,d,1,-) # exit

# B6: Go from B (back) to the white blanks
# Pre-Conditions:
#   - Head is inside B
# Post-Condition:
#   - Head is inside W
# Successor Blocks: B4
(d,0,d,0,-) # entry
(d,1,d,1,-) # entry
(d,□,c,□,-) # exit

#
##
### Part 2: Map ▶□...□z...a to ▶z...a□...□
##
#

# In part 2 the tape always contains "▶" followed by a bitstring C, followed
# by a series W of *white* blanks, followed by a bitstring B.
# Initially C is empty and B is the reversed input string. At the end B is empty,
# W is at its place, and C contains the left-shifted version of B.

# B7: Entrypoint for part 2 of the algorithm. Move into W.
# Pre-Conditions:
#   - Every bit of A was replaced by white blanks - now W
#   - B is reversed version of the initial value of A (and non-empty)
#   - Head is at ▶
# Post-Conditions:
#   - Head is at first white blank - in W
#   - tape content is unchanged
# Successor Blocks: B8
(c,▶,e,▶,+) # entry and exit

# B8: Traverse to the right over a series of white blanks
# Pre-Conditions:
#   - Head is inside W
# Post-Conditions:
#   - Head is at the end of B or at the first blank (after B)
# Successor Blocks: B9, B10
(e,□,e,□,+) # entry and exit

# B9: Erase the first bit of B and append it to C
# Pre-Conditions:
#   - Head is at first bit of B (which is non-empty)
# Post-Conditions:
#   - Bit was replaced by white blank and appended to end of C
#     or put directly after "▶" if C is empty
#   - Head is past the end of C, points to a white blank (inside W)
# Successor Blocks: B8
(e,0,f0,□,-) # entry
(e,1,f1,□,-) # entry
(f0,□,f0,□,-)
(f1,□,f1,□,-)
# Reaching end of C or ▶ if C is empty
(f0,▶,g0,▶,+)
(f0,0,g0,0,+)
(f0,1,g0,1,+)
(f1,▶,g1,▶,+)
(f1,0,g1,0,+)
(f1,1,g1,1,+)
# Append the remembered bit
(g0,□,e,0,+) # exit
(g1,□,e,1,+) # exit

#
##
### Part 3: Map ▶z...a□...□ to ▶z...a
##
#

# This just removes the trailing *white* blanks.

# B10: Remove the trailing white blanks
# Pre-Conditions:
#   - The tape contains the desired result (C) but followed by some white blanks
#   - Head is at the last of the white blanks
# Post-Conditions:
#   - The white blanks are all removed (replaced by blanks)
#   - Head is at "▶"
#   - Turing Machine is in halting state
# Successor Blocks: None (halts upon exit)
(e,■,h,■,-) # entry
(h,□,h,■,-)
(h,0,h,0,-)
(h,1,h,1,-)
(h,▶,HALT,▶,0) # exit

Let us leave the theory behind and plug this program into our Python implementation of a (universal) TuringMachine. The variable source_tp_ex_3_3 contains the contents of the above listing.

tm = TuringMachine.fromSource(source_tp_ex_3_3)

assert tm.run("▶011") == "▶110", "test exercise-3.3-1"
assert tm.run("▶101011") == "▶110101", "test exercise-3.3-2"
# If something else is on the tape, then we need at least n+1 blanks
# (n=length-of-string) after the string. Trying the following with less
# blanks would fail:
assert tm.run("▶001■■■■xyz") == "▶100■■■■xyz", "test exercise-3.3-3"
assert tm.run("▶1") == "▶1", "test exercise-3.3-4"
assert tm.run("▶") == "▶", "test exercise-3.3-4"  # corner case!

"PASSED"
PASSED

These basic tests provide an alternative source of trust in the program. You are invited to try out tm yourself. If configured appropriately it can also print each step it performs.

Exercise 3.4 (Turing machine to add modulo 2)

Describe a Turing machine to add two binary numbers \(x\) and \(y\) modulo 2. The numbers are input on the Turing machine tape in binary, in the form \(x\), followed by a single blank, followed by \(y\). If one number is not as long as the other then you may assume that it has been padded with leading 0s to make the two numbers the same length.

Solution

The idea of the program can be seen in the following example, adding x=00 and y=11.

1. Prepare the input:
   ▶00■11■■ → ▶00□11□■
2. Update x in place:
  • ▶00□11□■ → ▶■0□11□■ → ▶■0□□1□■ → ▶10□□1□■
  • ▶10□□1□■ → ▶1■□□1□■ → ▶1■□□□□■ → ▶11□□□□■
3. Clean up and halt:
   ▶11□□□□■ → ▶11■■■■■

In the first stage we replace the blank cell in the gap between \(x\) and \(y\), as well as the cell after \(y\), by a white blank. This simplifies the Turing program. In the first step of the second stage we memorize the first bit of \(x\) (a 0 here) and go right until we find the first bit of \(y\). We add it to our memorized bit of \(x\) (modulo 2), replace it by a white blank and transport the result back to the marked location (the blank). Then we repeat the procedure with the next bit - and so on. Finally we have to replace the white blanks by ordinary blanks and move the head back to the beginning.

# Turing machine to add modulo 2
#   - input  : "▶xxx■yyy■■..." where xxx and yyy are strings of 0s and 1s of equal length
#     followed by at least 2 blanks. x and y can be empty.
#   - output : "▶zzz■..." where zzz is the bitwise sum of xxx and yyy, followed by at
#     least n+2 blanks, where n is the length of x (and also of y).
# The head starts and finishes at "▶".

# (A)
# Prepare the input by marking the gap between x and y and the cell after y with white
# blanks (□)
(START,▶,prepare-gap,▶,+)
(prepare-gap,0,prepare-gap,0,+)
(prepare-gap,1,prepare-gap,1,+)
(prepare-gap,■,prepare-end,□,+)  # we assume only one gap cell!
(prepare-end,0,prepare-end,0,+)
(prepare-end,1,prepare-end,1,+)
(prepare-end,■,search-start,□,-)  # note: after that, head cell contains no ■
# Successors: (B)

# (B)
# Search the start marker ▶
(search-start,0,search-start,0,-)
(search-start,1,search-start,1,-)
(search-start,□,search-start,□,-)
(search-start,▶,begin-update-next-bit,▶,+)
# Successors: (C), (D)

# (C)
# Memorize next bit of x by a state, and its position by marking it with a
# blank cell.
(begin-update-next-bit,0,memorize-0,■,+)
(begin-update-next-bit,1,memorize-1,■,+)
# Successors: (E), (F)

# (D)
# Final steps:
#   - replace all the □ (we wrote ourselves) by ■
#   - move the head back to ▶,
#   - and halt!
(begin-update-next-bit,□,cleanup,■,+)
(cleanup,□,cleanup,■,+)
(cleanup,■,finalize,■,-)  # here we need that the input is followed by 2 blanks
(finalize,■,finalize,■,-)
(finalize,0,finalize,0,-)
(finalize,1,finalize,1,-)
(finalize,▶,HALT,▶,0)
# Successors: None (halts)

# (E)
# Traverse the rest of x (which might be empty)
(memorize-0,0,memorize-0,0,+)
(memorize-0,1,memorize-0,1,+)
(memorize-1,0,memorize-1,0,+)
(memorize-1,1,memorize-1,1,+)
# Successors: (F)

# (F)
# Traverse the gap between x and y (y is not empty here since initially it had the
# same length as x)
(memorize-0,□,memorize-0y,□,+)
(memorize-1,□,memorize-1y,□,+)
(memorize-0y,□,memorize-0y,□,+)
(memorize-1y,□,memorize-1y,□,+)
# Successors: (G)

# (G)
# Memorize the bitwise sum of the two corresponding bits
(memorize-0y,0,memorize-sum-0,□,-)
(memorize-0y,1,memorize-sum-1,□,-)
(memorize-1y,0,memorize-sum-1,□,-)
(memorize-1y,1,memorize-sum-0,□,-)
# Successors: (H)

# (H)
# Now we know the sum - transport it leftward
(memorize-sum-0,0,memorize-sum-0,0,-)
(memorize-sum-0,1,memorize-sum-0,1,-)
(memorize-sum-0,□,memorize-sum-0,□,-)
(memorize-sum-1,0,memorize-sum-1,0,-)
(memorize-sum-1,1,memorize-sum-1,1,-)
(memorize-sum-1,□,memorize-sum-1,□,-)
# Successors: (I)

# (I)
# Write the result to the marked position in x
(memorize-sum-0,■,begin-update-next-bit,0,+)
(memorize-sum-1,■,begin-update-next-bit,1,+)
# Successors: (C), (D)

We conclude with a small demo:

tm = TuringMachine.fromSource(source_tp_ex_3_4)

assert tm.run("▶0■1") == "▶1"
assert tm.run("▶010■111") == "▶101"
assert tm.run("▶0111001■1101011") == "▶1010010"
"PASSED"
PASSED

Exercise 3.5 (Halting problem with no inputs)

Show that given a Turing machine \(M\) there is no algorithm to determine whether \(M\) halts when the input to the machine is a blank tape.

Proof

This problem can easily be reduced to the version of the halting problem from Box 3.2. In fact, for any Turing machine \(M\) (expressed in terms of its program) we consider the Turing machine \(M'\) which first writes the Turing number of \(M\) to the tape and then proceeds as \(M\).

Note that the mapping \(M\mapsto M'\) can be effectively computed (by an algorithm). We do not actually need this fact here - it is just nice to know.

Clearly \(M'\) halts on the empty tape iff \(M\) halts on its own Turing number. Hence, this variant of the halting problem cannot be computed effectively either. QED.

Exercise 3.6 (Probabilistic halting problem)

Suppose we number the probabilistic Turing machines using a scheme similar to that found in Exercise 3.2 and define the probabilistic halting function \(h_p(x)\) to be \(1\) if machine \(x\) halts on input of \(x\) with probability at least \(1/2\) and \(0\) if machine \(x\) halts on input of \(x\) with probability less than \(1/2\). Show that there is no probabilistic Turing machine which can output \(h_p(x)\) with probability of correctness strictly greater than \(1/2\) for all \(x\).

Proof

The reasoning from Box 3.2 applies here too.

Assume that there is a probabilistic Turing machine HALT which, for all inputs \(x\), halts and returns \(h_p(x)\) with probability \(p_x>1/2\). This implies that with probability \(1-p_x<1/2\) it returns a wrong result or does not halt.

Define TURING in the same way as in Box 3.2 and let \(t\) be its Turing number. If \(h_p(t)=1\) then HALT(t) returns \(1\) with probability \(p_t>1/2\). But the latter means that TURING(t) loops forever with probability \(p_t>1/2\), which implies \(h_p(t)=0\) - a contradiction. Analogously, \(h_p(t)=0\) also leads to a contradiction. QED.

Exercise 3.7 (Halting oracle)

Suppose a black box is made available to us which takes a non-negative integer \(x\) as input, and then outputs the value of \(h(x)\), where \(h\) is the halting function defined in Box 3.2. This type of black box is sometimes known as an oracle for the halting problem. Suppose we have a regular Turing machine which is augmented by the power to call the oracle. One way of accomplishing this is to use a two-tape Turing machine, and add an extra program instruction to the Turing machine which results in the oracle being called, and the value of \(h(x)\) being printed on the second tape, where \(x\) is the current contents of the second tape. It is clear that this model for computation is more powerful than the conventional Turing machine model, since it can be used to compute the halting function. Is the halting problem for this model of computation undecidable? That is, can a Turing machine aided by an oracle for the halting problem decide whether a program for the Turing machine with oracle will halt on a particular input?

Solution

This follows directly from the argument in Box 3.2. In fact, the argument does not impose any restrictions on the model of computation except that certain basic operations are possible. So the answer is no: there is no such algorithm in this enhanced model of computation. More generally, any enhancement of the "usual" model of computation (in terms of added oracles which compute certain previously uncomputable functions) still suffers from the lack of a halting algorithm (implementable inside the model itself).

On the other hand, weak models of computation may allow the existence of a halting algorithm. Particularly simple is the case of a programming language which has branching (if ... then ... else ... end) as its only control structure (in particular, no loops). In that case every algorithm halts and the halting algorithm can be implemented in a trivial way.

Exercise 3.8 (Universality of NAND)

Show that the NAND gate can be used to simulate the AND, XOR, and NOT gates, provided wires, ancilla bits and FANOUT are available.

Proof

Let us denote the NAND operation by \(\barwedge\). Recall that \(x\barwedge y=\neg(x\land y)\). In the following we give one way to implement each of the gates (the constant \(1\) is an ancilla bit; FANOUT is used wherever an input occurs twice):

  • NOT: \(\neg x = \neg(x \land 1) = x \barwedge 1\).
  • AND: \(x \land y = \neg(x \barwedge y) = (x \barwedge y) \barwedge 1\).
  • OR: \(x \lor y = \neg x \barwedge \neg y = (x \barwedge 1) \barwedge (y \barwedge 1)\).
  • XOR: with \(a = x \barwedge y\) we have \(x \oplus y = (x \barwedge a) \barwedge (y \barwedge a)\), since \(x \barwedge a = \neg(x \land \neg y)\) and \(y \barwedge a = \neg(y \land \neg x)\).

QED.
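
These identities are easy to machine-check. Here is a quick brute-force verification over all inputs (a sketch; bits are represented as the integers 0 and 1):

def nand(x: int, y: int) -> int:
    return 1 - (x & y)

for x in (0, 1):
    assert nand(x, 1) == 1 - x                        # NOT
    for y in (0, 1):
        assert nand(nand(x, y), 1) == x & y           # AND
        assert nand(nand(x, 1), nand(y, 1)) == x | y  # OR
        a = nand(x, y)                                # FANOUT: a is used twice below
        assert nand(nand(x, a), nand(y, a)) == x ^ y  # XOR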

SKIP Exercises 3.9-3.14

I skip these exercises since they are too simple and too many.

Exercise 3.15

Suppose an \(n\) element list is sorted by applying some sequence of compare-and-swap operations to the list. There are \(n!\) possible initial orderings of the list. Show that after \(k\) of the compare-and-swap operations have been applied, at most \(2^k\) of the possible initial orderings will have been sorted into the correct order. Conclude that \(\Omega(n\log(n))\) compare-and-swap operations are required to sort all possible initial orderings into the correct order.

Proof

First, let us note that starting from an ordered sequence of numbers, like \(1,2,\ldots,n\), and applying at most \(k\) compare-and-swap operations, we can generate at most \(2^k\) different sequences. The first part of the exercise follows easily from this statement, since sorting is just this "shuffling procedure" run in reverse.

In fact, we can see this by mathematical induction:

case \(k=0\)
Clearly we generate exactly one sequence in that case.
induction step \(k+1\leftarrow k\)
Assume the statement holds for \(k\) and consider \(k+1\). We know that after at most \(k\) steps we get at most \(2^k\) sequences. If we optionally add another step we can at most double that number!

This proves the first part. Therefore, to be able to sort all \(n!\) permutations we need at least a number \(k_n\) of steps which necessarily satisfies

\[ 2^{k_n} \geq n! . \]

By a very weak version of Stirling's formula we have

\[ \log(n!) = \Omega(n\log(n)) . \]

By taking the logarithm of the first inequality and plugging in the second inequality we get \(k_n=\Omega(n\log(n))\). QED.

Just for completeness, here is a short proof of our very weak Stirling-type inequality:

\[ \log(n!) = \sum_{k=1}^n \log(k) \geq \sum_{k > n/e} \log(k) \geq (n-\lfloor ne^{-1} \rfloor)(\log(n) - 1) = \Omega(n\log(n)) . \]

Note that \(\log(n!)\leq n\log(n)\) is even easier to see. Hence we actually have proven that \(\log(n!)=\Theta(n\log(n))\).

Exercise 3.16 (Hard-to-compute functions exist)

Show that there exist Boolean functions on \(n\) inputs which require \(\Omega(2^n/n)\) logic gates to compute.

Remark
The exercise statement deviates from the book, see errata. The original statement was wrong: there is no Boolean function which requires \(\Omega(2^n/\log(n))\) gates. We also prove this statement below.

Proof

To be specific we assume that the circuits consist of AND, OR, and NOT gates. Moreover, we allow arbitrarily many FANOUT operations.

The proof relies on the first two of the following lemmas. To avoid irrelevant corner cases we assume \(k\geq n\).

  1. The number of Boolean functions on \(n\) inputs is \(2^{2^n}\).
  2. The number of Boolean functions implementable with \(k\) gates and arbitrarily many FANOUT operations is upper bounded by \((6k)^{2k}\).
  3. Bonus: A circuit with \(k\) gates and arbitrarily many FANOUTs can be replaced by a circuit using at most \(O(k^2)\) FANOUTs which computes the same Boolean function.

We do not really need the third statement, but it is nice to know that arbitrarily many FANOUT operations are not actually required.

Proof of 1

The number of \(n\) bit numbers is \(2^n\) (a factor of two for each bit). For each of the \(2^n\) inputs we have two choices (\(0\) and \(1\)) to assign a value. This makes for \(2^{(2^n)}\) possibilities in total. QED.

Proof of 2

Consider an arbitrary circuit. Let us define the spec of the circuit as follows: for each of the \(k\) gates specify its type as AND, OR, or NOT. This gives \(3^k\) possibilities. For each of the at most \(2k\) inputs to all the gates specify to which other gate or original input it is connected; if the input comes via one or more FANOUT operations, trace it back to the gate or input where it originates. With our choice of gates (and \(k\geq n\)) this gives at most \((k+n)^{2k}\leq(2k)^{2k}\) possibilities (at most \(2k\) inputs - at most two per gate - and \(k+n\) positions to connect to).

Note that each of our gates has only one output, which we use here to simplify things, but this is not really crucial. If there were gates with more than one output we would also specify which output is connected to which input of the next gate.

In total there are at most \(3^k(2k)^{2k}\leq(6k)^{2k}\) possible specs. Note that two circuits with the same spec are guaranteed to compute the same function. QED.

Proof of 3 (Bonus)

The purpose of the FANOUT operation is to copy either some of the inputs or some of the intermediate outputs of the gates. In that way different gates can share the same inputs (or one gate can receive two copies of the same bit).

There are only \(O(k)\) bits to copy (all \(n\) inputs plus the outputs of the \(k\) gates). Moreover, there are only \(O(k)\) destinations to copy to (all inputs of all the \(k\) gates). Hence the number of actually required FANOUT operations is at most \(O(k\cdot k)=O(k^2)\). QED.

Conclusion

Assume that \(l=l(n)\) is the minimal number of gates with which every Boolean function on \(n\) bits can be implemented. By 1 and 2 we necessarily have

\[ (6l)^{2l} \geq 2^{2^n} . \]

The careful reader might notice that we assumed \(k\geq n\) in Lemma 2. This was just used to simplify some constants, and the inequality itself shows that \(n\) gates are not sufficient to generate all Boolean functions, so there is no problem. From the last inequality we deduce \(2l\log_2(6l)\geq2^n\). Hence:

\[ l \geq \frac{2^n}{2\log_2(6l)} \geq \frac{2^n}{2(n+\log_2(6C))} = \Omega(2^n/n) , \]

where we used that \(l\leq C\cdot2^n\) for some constant \(C\) (this is Lemma 1 of the next part, proven below), so that \(\log_2(6l)\leq n+\log_2(6C)\).

QED.

Proof of the reverse inequality

It can be shown that the bound is sharp up to a constant factor. That is, all functions can be implemented by \(O(2^n/n)\) gates. This proves that the uncorrected exercise was indeed wrong. Let us prove this now.

  1. Every Boolean function on \(n\) inputs can be implemented with \(O(2^n)\) gates.
  2. There is a circuit with \(O(2^{2^n})\) gates which implements the function \(f:\{0,1\}^n\to\{0,1\}^{2^{2^n}}\) that outputs the results of all \(2^{2^n}\) Boolean functions on \(n\) inputs at the same time.

The first lemma is a weaker version of what we want to prove. We do not apply it directly; it merely demonstrates a special case of the slightly more involved construction later on.

Proof of 1

The proof is based on the identity:

\[ f(x_1,\ldots,x_{n+1}) = \underbrace{f(x_1,\ldots,x_n,0)}_{=:f_0(\ldots)} x_{n+1}' + \underbrace{f(x_1,\ldots,x_n,1)}_{=:f_1(\ldots)} x_{n+1} \]

Here we use \(+\) for "or" and multiplication for "and"; the prime denotes negation. Note that this is exactly the idea behind Figure 3.8 from the book (the XOR there could just as well be an OR).

Apart from the circuits implementing the two functions \(f_0\) and \(f_1\) we only need at most five additional gates (one NOT, two ANDs and one OR) to implement \(f\). Hence, denoting by \(c_n\) the minimal number of gates sufficient to implement every Boolean function on \(n\) inputs:

\[ c_{n+1} \leq 2c_n + 5 . \]

By mathematical induction we get \(c_n=O(2^n)\).
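
For the induction one can, for instance, add \(5\) on both sides to obtain a purely geometric recursion:

\[ c_{n+1} + 5 \leq 2(c_n + 5) \implies c_n + 5 \leq 2^{n-1}(c_1 + 5) = O(2^n) . \]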

Proof of 2

The idea of the proof is essentially based on the same formula as for Lemma 1. Let \(d_n\) be the minimal number of gates needed for such a circuit. Any Boolean function \(f\) in \(n+1\) variables can be implemented in the following way:

\[ f(x_1,\ldots,x_n,x_{n+1}) = f_0(x_1,\ldots,x_n)x_{n+1}' + f_1(x_1,\ldots,x_n)x_{n+1} , \]

where each of the two functions \(f_0\) and \(f_1\) can be chosen from \(2^{2^n}\) possibilities. From this we see

\[ d_{n+1} \leq d_n + 3\cdot 2^{2^{n+1}} + 1 . \]

The circuit for \(n+1\) basically consists of the circuit for \(n\), where each pair of the \(2^{2^n}\) old outputs (there are \(2^{2^n}\cdot2^{2^n}=2^{2^{n+1}}\) such pairs) is recombined into one of the new \(2^{2^{n+1}}\) outputs. Each pair uses only two AND gates and one OR gate. In addition, a single NOT gate (for \(x_{n+1}'\)) is needed, whose output can be copied around by FANOUT as often as we need it.

From the inequality it is easy to see by induction that \(d_n=O(2^{2^n})\).

Conclusion

The idea is as follows. The construction from Lemma 1 is already close to what we want. The circuit from Lemma 2, despite using a lot of gates, has a very good average gate usage: only constantly many gates are used per Boolean function on average! We combine the two approaches to push down the upper bound obtained in Lemma 1.

Let \(m\leq n\) and write \(x=(y,z)\) where \(x=(x_1,\ldots,x_n)\) and \(y\) and \(z\) contain the first \(n-m\) and last \(m\) bits. Any Boolean function can be written as:

\[ f(x) = \sum_{c\in\{0,1\}^{n-m}} y^c g_c(z) . \]

Here \(y^c\) is the product of the bits of \(y\), where bit \(y_i\) is negated iff \(c_i=0\) (so that \(y^c=1\) exactly when \(y=c\)).

By Lemma 2 there is a circuit of size \(O(2^{2^m})\) which implements all possibly required \(g_c\) at once. How many additional gates do we need to fully implement \(f\)? It is helpful to define \(k=n-m\) and do induction over \(k\), keeping \(m\) fixed. Let us call the required number of additional gates \(a_k\) and write the formula for \(f\) in the following form:

\[ f(x) = \sum_{c_1\in\{0,1\}} y_1^{c_1} \left( \sum_{(c_2,\ldots,c_k)} \prod_{i=2}^k y_i^{c_i} \cdot g_c(z) \right) = y_1' f_0(x_2,\ldots,x_n) + y_1 f_1(x_2,\ldots,x_n) . \]

Note that \(f_0\) and \(f_1\) are functions like \(f\) but with "their" \(k\) being one less. Hence

\[ a_{k+1} \leq 4 + 2a_k . \]

By induction we get \(a_k=O(2^k)=O(2^{n-m})\). So in total there are at most this many gates needed to implement \(f\):

\[ O(2^{2^m} + 2^{n-m}) . \]

Note that \(m\) is a free parameter in that formula, so we minimize the expression over it. For simplicity let us replace \(2^m\) by a continuous variable \(a\). We then have to minimize \(2^a+2^n/a\). But that is pretty simple (if we are not interested in constant global factors): let us just set \(a=n/2\) and we get

\[ O(2^{n/2} + 2^n/n) = O(2^n/n) . \]

QED.

Exercise 3.17

Prove that a polynomial-time algorithm for finding the factors of a number \(m\) exists if and only if the factoring decision problem is in P.

Proof

It is clear that a polynomial-time algorithm for finding the factors of a number suffices to solve the decision problem in polynomial time. Thus we only have to prove one direction of the equivalence.

Let \((x,t)\mapsto\mathrm{factp}(x,t)\) be a polynomial time (in the number of bits of \(x\)) algorithm to solve the decision problem. We have to prove that in this case a polynomial-time algorithm for finding the factors exists.

We first give an algorithm to compute the smallest non-trivial divisor of \(x\). The function takes factp as a callback. It solves a slightly more general problem in order to have a cleaner statement of what the function does. See the docstring for details:

def find_largest_if_not(x: int, factp: Callable[[int, int], bool]) -> int:
    """Find the largest integer 1<=d<=x such that factp(x, l) is false.
    Return 1 if none exists.

    Preconditions:
      - x >= 1
      - factp(x, l) is defined for l=1,...,x
      - If factp(x, l) is true for some l then so is factp(x, l+1)
      - Either factp(x, l) is true of all l>sqrt(x) or it is always false.

    In case of the factor problem factp(x, l) is true if x has a non-trivial divisor strictly
    smaller than l.

    Note: The third precondition is stricter than needed but it is satisfied for the
    factor problem."""
    if x == 1:  # corner case
        return 1

    # Find the smallest power of two for which factp is true (if it exists)
    k = 1
    while not factp(x, 2**k):
        k = k + 1
        if 2**k > x:
            # Here x is prime if factp decides the factoring problem. In the
            # general case we rely on the last precondition!
            return x

    # Here x is not prime if factp decides the Factor problem.

    # d *will* be the largest number for which factp is false (for the factoring
    # problem: the smallest non-trivial divisor). Right now only its most
    # significant bit is correct. The following loop builds up d bit by bit.
    d = 2**(k-1)

    for i in range(k-2, -1, -1):
        d1 = d + 2**i
        if not factp(x, d1):
            d = d1

    return d

Let \(n=O(\log(x))\) be the number of bits of \(x\). Let \(f(n)=\Omega(n)\) be a function such that factp runs in \(O(f(n))\) steps. Note that all basic arithmetic operations within the above function are (theoretically) \(O(n)\) since the numbers involved are not larger than \(2x\) and we only add or multiply by the very peculiar number two, which works nicely with binary numbers (we could easily avoid the powers of two but I kept them for readability).

This algorithm calls factp \(O(n)\) times and hence runs in \(O(nf(n))\) steps. Clearly the number \(x\) has at most \(\log_2(x)=O(n)\) prime factors. Hence it suffices to call this procedure \(O(n)\) times to compute all factors in \(O(n^2f(n))\) steps. QED.
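
The outer loop just described might look as follows (a sketch, assuming a genuine polynomial-time implementation of factp; recall that for the factoring problem find_largest_if_not returns the smallest non-trivial divisor of x, or x itself if x is prime):

def factorize(x: int, factp: Callable[[int, int], bool]) -> list[int]:
    """Compute the prime factorization of x by repeatedly splitting off the
    smallest divisor; calls find_largest_if_not O(log x) times."""
    factors: list[int] = []
    while x > 1:
        d = find_largest_if_not(x, factp)  # smallest prime factor of x
        factors.append(d)
        x //= d
    return factors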

Let us do a tiny non-systematic experiment - just having some fun with the above algorithm. To generate the primes I used bigprimes.org, which claims to generate true primes with high probability. On the other hand, note that we do not really use the fact that these numbers are prime (due to our fake implementation of factp):

# We assume that the first prime in each tuple is the smallest
prime_tuples = [
  (34129,
   63907),
  (7081163707,
   8680787539),
  (6388797743,
   6725672819,
   6388797743),
  (53935348593314365769,
   90916509983980640933),
  (312242331938588645593029053519,
   75189371683335785392960817804971969),
  (85885984977628802697002710823722671231249625512539,
   173202242859416728004553100028078327188834835916135376079867),
  (16454682855571279030686449830860047,
   26432071998531033907383391746862621,
   68404929161253247874640503692598161,
   77570673742199569337801762624138099,
   80645260483118874221181403055241113),
]

for primes in prime_tuples:
    x = reduce((lambda a, b: a*b), primes, 1)
    count_calls_to_factp = 0

    def factp_fake(x, l):
        global count_calls_to_factp
        count_calls_to_factp += 1
        return l > primes[0]

    assert find_largest_if_not(x, factp_fake) == primes[0]
    print(f"Called factp {count_calls_to_factp} times for "\
    f"{x.bit_length()} bit number with {len(primes)} prime factors.")
Called factp 31 times for 32 bit number with 2 prime factors.
Called factp 65 times for 66 bit number with 2 prime factors.
Called factp 65 times for 98 bit number with 3 prime factors.
Called factp 131 times for 132 bit number with 2 prime factors.
Called factp 195 times for 214 bit number with 2 prime factors.
Called factp 331 times for 363 bit number with 2 prime factors.
Called factp 227 times for 576 bit number with 5 prime factors.

As expected, the number of calls to factp is roughly the number of bits of x. The worst case is when the number is a product of nearly equal prime numbers (of course, x being prime is even worse but this case is excluded above). If this is not the case the required number of calls can be much smaller.

Exercise 3.18

Prove that if coNP is unequal to NP then P is unequal to NP.

Proof

So we assume that coNP is unequal to NP. There are two possible cases (both can be true at the same time of course):

  1. There is a language \(L\) in NP which is not in coNP.
  2. There is a language \(L\) in coNP which is not in NP.

Furthermore, note that P is contained in the intersection of NP and coNP. Hence in both cases \(L\) cannot be in P. Thus in case 1 we immediately get the claim. We are left with case 2.

But case 2 is not hard either. Just consider the language

\[ L^c = \{s\in\Sigma^*; s\notin L\} . \]

Any Turing machine solving either of the two languages - with or without witness - can be reused for the other language: just switch the "interpretation" of the two output states (yes/no). In case a witness is used, the witness can be reused too.

Two things follow from this: If there were a Turing machine solving either language in polynomial time, it would also solve the other language in polynomial time; because we are in case 2, neither of these languages is in P. On the other hand, \(L^c\) is in NP by the above argument (Turing machine with witness). Hence NP contains a language (\(L^c\)) which is not in P. QED.

Exercise 3.19

The REACHABILITY problem is to determine whether there is a path between two specified vertices in a graph. Show that REACHABILITY can be solved using \(O(n^2)\) operations if the graph has \(n\) vertices. Use the solution to show that it is possible to decide whether a graph is connected in \(O(n^3)\) operations.

Remark: The exercise statement deviates from the book, see errata.

Proof

Let \(m\) be the number of edges of the graph; note that \(m=O(n^2)\). We show the more informative bounds \(O(m)\) for the first statement and \(O(nm)\) for the second. Note that for unconnected graphs \(m\) might be much smaller than \(n\); even \(m=0\) is possible, which is a bit annoying when writing \(O(nm)\) - in that case we interpret the latter as \(O(n)\). Sorry for the abuse of notation!

Let us first prove the second part. The connectedness of a graph can be checked as follows: take any vertex A and check whether it is connected to each of the other \(n-1\) vertices. If the answer is always "yes" the graph is connected, otherwise it is not. Since we call the REACHABILITY subroutine \(n-1\) times this needs \(O(nm)\) operations - assuming that the subroutine needs \(O(m)\) operations, which is what we prove next.

Let us turn our attention to REACHABILITY of two vertices A and B.

Preliminaries: The algorithm to be described uses two colors, green and red, to color the edges of the graph. One way to implement such operations without modifying the data structure holding the graph is to build up a copy of the graph in a modified data structure where this is possible. This can be done in \(O(\max(n,m))\) operations. It can even be done in \(O(m)\) steps if the auxiliary data structure is built up on the fly - representing just the connected region around A as we explore it. For simplicity of exposition we assume that the original data structure supports these operations.

When we say we toggle the color of an edge we mean that red becomes green and green becomes red.

Now we are in the position to describe the algorithm. First let us fix the pre- and post-conditions:

  • Precondition: The input is a graph with all edges being green.
  • Post condition: All edges are green (again), more generally the graph is not modified in any way. It returns the correct answer to the problem.

And here is how the algorithm operates:

  • (I) Initialize a set pending_nodes with just one element A.
  • (L) Loop over the following until pending_nodes is empty:
    • (L1) Take a vertex V from pending_nodes and remove it from the set.
    • (L2) Look at each edge of V and toggle its color.
    • (L3) Consider all vertices connected to V via red edges and add them to pending_nodes. If one of them is B go to (Y).
  • (N) Cleanup and return no (and halt).
  • (Y) Cleanup and return yes (and halt).

With "Cleanup" we mean to make all remaining red edges green again to ensure the first part of post condition. Of course, if we had a copy of the graph (or of the relevant connected component) this won't be necessary. Anyway, this can be done in \(O(m)\) steps and hence it does not hurt much.

To prevent a double insertion into pending_nodes in (L3) we can, for example, add a flag to each vertex saying whether it is already pending or not. This allows us to implement pending_nodes as a simple list with just two operations: push an element to the end and pop an element from the end. On "ordinary" computers these operations run in (amortized) constant time. Of course, if an implementation of a set with constant-time checked insertion is available (say, a hash set), we could just use that.

We call a vertex visited once it leaves the pending state (gets taken out of pending_nodes). It is not hard to see, by induction over the steps of the algorithm, that a visited vertex never becomes pending again and that each edge becomes red at most once.

Hence we operate at most \(O(m)\) times on the edges. But we also operate at most \(O(m)\) times on vertices since we never leave the connected component in which A lives. QED.
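
For concreteness, here is a small Python sketch of the procedure. Instead of edge colors it uses the per-vertex flag (realized as a hash set) discussed above; the adjacency-list representation is an assumption made for the sketch:

def reachable(adj: dict[int, list[int]], a: int, b: int) -> bool:
    """Explore the connected component of a; O(m) operations on edges."""
    if a == b:
        return True
    pending_nodes = [a]  # (I)
    flagged = {a}        # vertices that are or were pending
    while pending_nodes:  # (L)
        v = pending_nodes.pop()  # (L1)
        for w in adj[v]:  # (L2)/(L3): each edge is handled at most twice
            if w == b:
                return True  # (Y)
            if w not in flagged:
                flagged.add(w)
                pending_nodes.append(w)
    return False  # (N)

def is_connected(adj: dict[int, list[int]]) -> bool:
    """n-1 calls of the REACHABILITY subroutine, as in the argument above."""
    vertices = list(adj)
    return all(reachable(adj, vertices[0], v) for v in vertices[1:])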

Remark
Note that the term operation is somewhat vague (which is OK!). I interpreted it as a simple task which can be done on a real computer in essentially constant time (like random access into memory). I doubt that one can find a Turing machine (using the definition from the book) which solves REACHABILITY in \(O(n^2)\) steps.

Exercise 3.20 (Euler's theorem)

Prove Euler's theorem. In particular, if each vertex has an even number of incident edges, give a constructive procedure for finding an Euler cycle.

Proof

To see Euler's theorem, consider a graph with \(m\) edges and an Euler cycle. Let us draw tiny arrows on each edge showing the direction in which we walk. Moreover, enumerate the edges from \(0\) to \(m-1\) along the path - starting at an arbitrary position - and identify the edges with their numbers.

Consider one of the vertices \(v\). Let \(i\) be one of the edges whose arrow points towards \(v\). Clearly \(i+1\) (modulo \(m\)) is an edge at \(v\) whose arrow points away from \(v\). Conversely, let \(j\) be one of the edges whose arrow points away from \(v\). Then \(j-1\) (modulo \(m\)) is an edge at \(v\) whose arrow points towards \(v\). We see that the edges around \(v\) come in pairs \((i,i+1)\), where the first one is the "entering" edge and the second one the "leaving" edge. Thus, the number of edges at \(v\) must be even. Since \(v\) was arbitrary this proves the theorem.

Our procedure to compute the Euler cycle makes use of the following subroutine PartialCycle. Given a vertex \(v\) and a set of admissible edges, the subroutine computes a nonempty closed path through \(v\) which uses only admissible edges.

  • input: a graph, a vertex \(v_s\) of the graph, and a set of admissible edges of the graph.
  • (I) Set the current vertex equal to \(v_s\).
  • (L) Loop indefinitely over the following
    • (L1) Take an unvisited admissible edge \(e\) of the current vertex to some vertex \(v\).
    • (L2) Walk over \(e\) to \(v\).
    • (L3) Set \(v\) to be the new current vertex.
    • (L4) If \(v=v_s\) return the constructed path and halt.
    • (L5) Mark \(e\) as visited.

Note that (L1) implicitly assumes that we actually find an unvisited admissible edge. If this is always the case, it is easy to see that the algorithm does what we want. Also note that if every vertex has an even number of admissible edges and \(v_s\) has at least one admissible edge, then (L1) is guaranteed to succeed!

Let us formulate the complete procedure for finding an Euler cycle. As a precondition we assume that each vertex has an even number of edges and that the graph is connected (the latter implying that no vertex has zero edges). This is justified by Euler's theorem.

  • Call PartialCycle on some arbitrarily chosen vertex with all edges being admissible.
  • Initialize a path \(p\) with the output of the previous step.
  • Loop indefinitely over the following:
    • Choose a vertex \(v\) on \(p\) having edges not already in \(p\); by connectedness such a vertex exists whenever \(p\) is not yet an Euler cycle. Halt if none exists.
    • Call PartialCycle on \(v\) allowing only the complement of the edges of \(p\) as admissible. Enlarge \(p\) by inserting the output into \(p\) directly after \(v\).

It is not hard to see that the procedure halts if and only if the path \(p\) is already an Euler cycle. QED.
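The procedure above is essentially Hierholzer's algorithm. Below is a compact Python sketch; the edge-list representation is our choice, and the explicit splicing of partial cycles into \(p\) is replaced by a stack, which produces the same result.

from collections import defaultdict

def euler_cycle(edges: list[tuple[int, int]]) -> list[int]:
    """Return an Euler cycle as a sequence of vertices (sketch).

    Assumes the preconditions above: every vertex has even degree
    and the graph is connected (in particular there is an edge).
    """
    adj = defaultdict(list)            # vertex -> list of (neighbor, edge id)
    for k, (u, v) in enumerate(edges):
        adj[u].append((v, k))
        adj[v].append((u, k))
    used = [False] * len(edges)        # "admissible" = not yet used
    stack, cycle = [edges[0][0]], []
    while stack:
        v = stack[-1]
        while adj[v] and used[adj[v][-1][1]]:
            adj[v].pop()               # discard edges no longer admissible
        if adj[v]:                     # the PartialCycle step: walk one edge
            w, k = adj[v].pop()
            used[k] = True
            stack.append(w)
        else:                          # vertex exhausted: splice into cycle
            cycle.append(stack.pop())
    return cycle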

Exercise 3.21 (Transitive property of reduction)

Show that if a language \(L_1\) is reducible to the language \(L_2\) and the language \(L_2\) is reducible to \(L_3\) then the language \(L_1\) is reducible to the language \(L_3\).

Proof

By definition there are polynomial-time reduction functions \(R_{12}\) and \(R_{23}\) from \(L_1\) to \(L_2\) and from \(L_2\) to \(L_3\) respectively.

Clearly \(R_{23}\,\circ\,R_{12}\) is a reduction function from \(L_1\) to \(L_3\). It remains to show that it is polynomial-time.

But first let us note that under a few minor assumptions the composition of two Turing machines can be implemented very naturally. The assumptions are: the input is presented as a "▶" followed by \(x\) followed by infinitely (or sufficiently) many blanks; the output is required to be in the same format; at the start and when the machine halts the head is at the (single) "▶", which is never modified; and the state sets of the two Turing machines are disjoint (if not, rename states to make them disjoint). Under these assumptions the composition of both Turing machines is obtained by concatenating the two corresponding programs (the order does not matter), declaring the start state of the first machine the common start state and the halting states of the second machine the common halting states, and adding program lines which translate each halting state of the first machine into the start state of the second one (it suffices to do this for the symbol "▶", since the head is there when the first machine halts).
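Using the program format of the TuringMachine class from the setup section, the composition can be sketched as plain source manipulation. This is only an illustration: the prefixing scheme, the helper name, and the stay-put move "0" in the bridge line are our choices, and comments are stripped for simplicity.

def compose_sources(src1: str, src2: str,
                    halt1: str = "H", start2: str = "S") -> str:
    """Compose two Turing programs as described above (a sketch).

    Assumes the (state,symbol,state,symbol,move) line format and a
    single halting state per machine; states are made disjoint by
    prefixing with "1" and "2". Since "1H" is no longer a declared
    halting state, the bridge line fires when machine 1 would halt
    with its head at '▶' and hands control to machine 2.
    """
    def prefix(src: str, p: str) -> str:
        out = []
        for line in src.splitlines():
            body = line.split("#")[0].strip()   # drop comments
            if body.startswith("(") and body.endswith(")"):
                s, sym, t, wsym, move = body[1:-1].split(",")
                out.append(f"({p}{s},{sym},{p}{t},{wsym},{move})")
        return "\n".join(out)

    bridge = f"(1{halt1},▶,2{start2},▶,0)"      # halt of 1 becomes start of 2
    return "\n".join([prefix(src1, "1"), bridge, prefix(src2, "2")])

The result could then be run via TuringMachine.fromSource(composed, start_state="1S", halt_state="2H") (under the single-halting-state assumption).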

In the following let us identify the reduction functions with concrete Turing machines implementing them.

Let \(t(R,x)\) be the number of steps needed by \(R\) to compute \(R(x)\). Note that \(|R(x)|\leq|x|+t(R,x)\) since a Turing machine writes at most one symbol to the tape at each time step. By assumption there is a polynomial \(p(n)=C(1+n^k)\) such that

\[ t(R,x) \leq p(|x|) \text{ for } R\in \{R_{12}, R_{23}\} . \]

Hence

\[ t(R_{23}\,\circ\,R_{12}, x) \leq t(R_{12}, x) + \max_{|y| \leq |x| + p(|x|)} t(R_{23}, y) \leq p(|x|) + p(|x| + p(|x|)) . \]

The RHS clearly is a polynomial in the length \(|x|\) of \(x\). QED.

Exercise 3.22

Suppose \(L\) is complete for a complexity class, and \(L'\) is another language in the complexity class such that \(L\) reduces to \(L'\). Show that \(L'\) is complete for the complexity class.

Proof

We assume that the reduction relation for the class is transitive, just like the natural reduction relation of NP (exercise 3.21). I note this here because the same definition of reduction does not make sense for every complexity class. In fact, consider how useless P-completeness would be if we used polynomial-time reductions as for NP (every nontrivial problem in P would be P-complete).

Let us now return to the actual exercise. Using transitivity this is almost trivial. In fact, let \(K\) be any problem in the class then \(K\) reduces to \(L\) by completeness of \(L\). But \(L\) in turn reduces to \(L'\) by assumption. Using transitivity we deduce that \(K\) reduces to \(L'\). Since \(K\) was arbitrary we conclude that \(L'\) is complete too. QED.

Exercise 3.23

Show that SAT is NP-complete by first showing that SAT is in NP, and then showing that CSAT reduces to SAT. (Hint: for the reduction it may help to represent each distinct wire in an instance of CSAT by different variables in a Boolean formula.)

Proof

In boolean formulas we use the following logical connectives: \(\neg\), \(\land\), \(\lor\), \(\leftrightarrow\) with the obvious meaning. We only sketch the proof of the exercise statement in the following.

The fact that SAT is in NP is intuitively clear. In fact, a possible witness of the satisfiability of a Boolean formula is just a concrete satisfying assignment of the input variables, which can be verified in polynomial time.

It remains to show that CSAT reduces to SAT. As suggested we label each wire by a unique variable. In the following we replace each gate by a corresponding formula. For that we call the input wires \(i\) and the output wires \(o\). We give them an index in case more than one input and/or output is present.

NOT
\(o \leftrightarrow \neg i\)
OR
\(o \leftrightarrow i_1 \lor i_2\)
AND
\(o \leftrightarrow i_1 \land i_2\)
FANOUT
\((o_1 \leftrightarrow i) \land (o_2 \leftrightarrow i)\)

Now the Boolean formula for the whole circuit is the conjunction of all these smaller formulas, conjoined with the variable of the output wire (the circuit should evaluate to \(1\)). It is intuitively clear that this can be done in polynomial time with respect to the number of gates (including FANOUT). Note that this does not essentially change if we do not count the FANOUT operations, since it is not hard to see that only polynomially many FANOUT gates (in terms of the number of other gates) are needed to implement any Boolean function; so we may simply disallow circuits with super-polynomially many FANOUT gates. QED.
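To make the reduction concrete, here is a Python sketch that emits the Boolean formula as a string. The gate-list format and the operator symbols are our own conventions, not anything from the book.

def circuit_to_formula(gates: list[tuple[str, ...]], output_wire: str) -> str:
    """Translate a CSAT instance into a SAT formula (sketch).

    Each gate is a tuple like ("NOT", i, o), ("AND", i1, i2, o),
    ("OR", i1, i2, o) or ("FANOUT", i, o1, o2); wires are variable names.
    """
    parts = []
    for g in gates:
        if g[0] == "NOT":
            _, i, o = g
            parts.append(f"({o} <-> ~{i})")
        elif g[0] in ("AND", "OR"):
            op = "&" if g[0] == "AND" else "|"
            _, i1, i2, o = g
            parts.append(f"({o} <-> ({i1} {op} {i2}))")
        elif g[0] == "FANOUT":
            _, i, o1, o2 = g
            parts.append(f"(({o1} <-> {i}) & ({o2} <-> {i}))")
    parts.append(output_wire)      # the circuit output must evaluate to 1
    return " & ".join(parts)

For example, circuit_to_formula([("AND", "x", "y", "w"), ("NOT", "w", "z")], "z") yields a formula which is satisfiable iff some input makes the circuit output \(1\).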

Exercise 3.24 (2SAT has an efficient solution)

Suppose \(\varphi\) is a Boolean formula in conjunctive normal form, in which each clause contains only two literals.

  1. Construct a (directed) graph \(G(\varphi)\) with directed edges in the following way: the vertices of \(G\) correspond to variables \(x_j\) and their negations \(\neg x_j\) in \(\varphi\). There is a (directed) edge \((\alpha, \beta)\) in \(G\) if and only if the clause \((\neg\alpha\lor\beta)\) or the clause \((\beta\lor\neg\alpha)\) is present in \(\varphi\). Show that \(\varphi\) is not satisfiable if and only if there exists a variable \(x\) such that there are paths from \(x\) to \(\neg x\) and from \(\neg x\) to \(x\) in \(G(\varphi)\).
  2. Show that given a directed graph \(G\) containing \(n\) vertices it is possible to determine whether two vertices \(v_1\) and \(v_2\) are connected in polynomial time.
  3. Find an efficient algorithm to solve 2SAT.

Solution

Proof of Statement 1

Note that \((\neg\alpha\lor\beta)\) is logically equivalent to both implications \(\alpha\rightarrow\beta\) and \(\neg\beta\to\neg\alpha\). This makes the meaning of the graph very intuitive.

A path from \(x\) to \(\neg x\) implies \(x\to\neg x\) (since the implication connective is transitive and all edges correspond to implications). This means that a satisfying assignment cannot set \(x\) to true. In the same way a path from \(\neg x\) to \(x\) implies \(\neg x\to x\). So if both paths are present there cannot be a satisfying assignment. That was the easy part.

Now assume that no such pair of paths exists. For this direction it is crucial that the graph has the property that whenever it contains the edge \((\alpha,\beta)\) it also contains \((\neg\beta,\neg\alpha)\). Otherwise a counterexample for this direction of the statement is quickly found:

\[ (x\to y) \land (x\to\neg y) \land (\neg x\to x) . \]

Let us call this property negation symmetry. If there is a path from \(x\) to \(y\) we write \(x\tto y\).

We call a variable \(x\) a root variable if either \(x\tto\neg x\) or \(\neg x\tto x\). Note that the assignment of a root variable is unique: in the former case necessarily \(x=0\) and in the latter case \(x=1\). We say \(x\) is fixed. We say that a variable \(y\) is derived if there is a root variable \(x_1\) or \(x_2\) such that at least one of the following holds:

\begin{align*} \neg x_1 \tto x_1 &\tto y \\ x_1 \tto \neg x_1 &\tto y \\ \neg x_2 \tto x_2 &\tto \neg y \\ x_2 \tto \neg x_2 &\tto \neg y \end{align*}

Note that a derived variable is also fixed: in the former two cases necessarily \(y=1\) and in the latter two cases necessarily \(y=0\). Can it happen that we get a contradictory assignment, that is, can an \(x_1\) and an \(x_2\) as above exist simultaneously? No. Let us consider the case where line 1 and line 3 hold (the other cases are similar). By negation symmetry, \(x_1\tto y\) yields \(\neg y\tto\neg x_1\), and \(x_2\tto\neg y\) yields \(y\tto\neg x_2\). Combining these with the given paths we obtain \(x_2 \tto \neg y \tto \neg x_1 \tto x_1 \tto y \tto \neg x_2\), which together with \(\neg x_2\tto x_2\) (line 3) contradicts our assumption further above that no variable is connected to its negation in both directions.

It is not hard to see that the given assignment for root and derived variables is not contradictory. In fact, it only remains to investigate paths from derived to root variables, from root to root, and from derived to derived. We leave the tedious case analysis to the reader 😉.

There might be variables which are neither root nor derived. One might think that these can be assigned any truth value, but this is not true, as the very simple example \(\varphi=x\to y\) shows.

Let \(z\) be a variable which is neither root nor derived. There are three possible cases:

case 1
\[ \exists y_0 : z\tto y_0 \text{ and } z\tto \neg y_0 \]
case 2
\[ \exists y_1 : \neg z\tto y_1 \text{ and } \neg z\tto \neg y_1 \]
case 3
\[ \text{No such $y_0$ or $y_1$ exists.} \]

It is not hard to see that \(z\) is fixed in cases 1 and 2: in case 1 necessarily \(z=0\) and in case 2 necessarily \(z=1\). It is crucial that cases 1 and 2 do not happen at the same time. If they did, it would follow (using negation symmetry) that

\begin{align*} &y_0 \tto \neg z \tto \neg y_1 \tto z \tto \neg y_0 , \\ &\neg y_0 \tto \neg z \tto y_1 \tto z \tto y_0 . \end{align*}

so \(y_0\) would be connected to its negation in both directions, a contradiction to our assumption. Now let us replace \(\varphi\) by \(\varphi'\) as follows:

  • In case 1: \(\varphi'=\varphi\land(z\to\neg z)\).
  • In case 2: \(\varphi'=\varphi\land(\neg z\to z)\).
  • In case 3: do either of the above; it does not matter which one.

Clearly a solution of \(\varphi'\) is also a solution of \(\varphi\). It is not hard to see that the graph of \(\varphi'\) still has the property that no variable is connected to its negation in both directions. In fact, assume that we added \(z\to\neg z\) (the other case is similar), so we are in case 1 or 3. Assume \(\varphi'\) violates the property. Then there must be a \(y\) which is connected to its negation in both directions, and at least one of the paths contains the segment \(z\to\neg z\). But in fact both paths must lead through this segment, since otherwise \(y\) would be a root and \(z\) would be derived! Both paths passing through the segment means:

\begin{align*} &y \tto z \to \neg z \tto \neg y, \\ &\neg y \tto z \to \neg z \tto y. \end{align*}

and hence case 2 is satisfied, a contradiction, since we already know that we are in case 1 or 3.

Repeated application of this procedure shows that we can replace \(\varphi\) by a formula \(\psi\) which still has the property that no variable is connected both ways to its negation. Moreover \(\psi\) has only root and derived variables. Hence it has a unique solution (we showed this above). This solution is also a satisfying assignment for \(\varphi\). QED.

Proof of Statement 2

In exercise 3.19 we showed this for undirected graphs. The same algorithm also works for directed graphs.

Solution of 3

The proof of statement 1 essentially contains an algorithm. In the following we describe a routine solve to find a satisfying assignment. The algorithm works on the graph representation of \(\varphi\). It makes use of two subroutines reach and traverse.

reach(a,b) decides whether vertex b is reachable from vertex a. This can be done in \(O(n^2)\) operations if \(n\) is the number of variables (exercise 3.19). The function traverse(a,f) starts at vertex \(a\) and walks over all vertices reachable from \(a\), calling f(v) for each vertex it reaches. Below the callback always has the form v→t where t is some truth value; if \(v\) is the vertex of some variable \(y\) or of its negation, this means that we assign \(y=t\) or \(y=\neg t\), respectively. Note that traverse can be implemented by the same procedure that drives reach (just take b to be an artificial vertex connected to nothing), so it also runs in \(O(n^2)\) operations.

  • (R) For each variable \(x\) do:
    • Check if reach(x,¬x) or reach(¬x,x).
    • If both are true halt and return that no assignment exists.
    • If the first checks out set \(x=0\) and mark \(x\) as root.
    • If the second checks out set \(x=1\) and mark \(x\) as root.
  • (D) For each of the root variables \(r\) do:
    • Call traverse(r,v→r).
    • Call traverse(¬r,v→¬r).
  • Remove all vertices from the graph which were visited by the calls to traverse, including the root vertices (the function could mark each vertex upon visit to make this possible). Also remove all "dangling" edges.
  • Take any of the remaining variables \(z\).
    • If no such \(z\) exists we are done and we can return the assignments we found.
    • Try to add \(z\to\neg z\) to the graph and call solve recursively on this new graph.
    • If the former fails, add \(\neg z\to z\) to the graph instead and call solve again (this cannot fail, by what we have shown in the proof of statement 1).

Each call to solve, not counting the recursive call, needs \(O(n^3)\) operations. In the worst case we have to call solve once for each variable, which leads to \(O(n^4)\). QED.
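As an illustration, step (R) can be sketched in Python as follows; the clause and literal encodings are our choices, and reach is the simple search from exercise 3.19.

def two_sat_roots(n: int, clauses: list[tuple[int, int]]):
    """Step (R) above: find the root variables or detect unsatisfiability.

    Variables are 1..n; a literal is +v or -v; a clause (a, b) means a∨b.
    Returns None if unsatisfiable, else a dict {v: bool} of root variables.
    """
    adj: dict[int, list[int]] = {l: [] for v in range(1, n + 1) for l in (v, -v)}
    for a, b in clauses:               # the clause a∨b yields ¬a→b and ¬b→a
        adj[-a].append(b)
        adj[-b].append(a)

    def reach(a: int, b: int) -> bool: # reachability as in exercise 3.19
        seen, stack = {a}, [a]
        while stack:
            for w in adj[stack.pop()]:
                if w == b:
                    return True
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return False

    roots = {}
    for v in range(1, n + 1):
        down, up = reach(v, -v), reach(-v, v)
        if down and up:
            return None                # x ↠ ¬x and ¬x ↠ x: unsatisfiable
        if down:
            roots[v] = False           # x ↠ ¬x forces x = 0
        elif up:
            roots[v] = True            # ¬x ↠ x forces x = 1
    return roots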

Exercise 3.25 (PSPACE ⊆ EXP)

The complexity class EXP (for exponential time) contains all decision problems which may be decided by a Turing machine running in exponential time, that is time \(O(2^{n^k})\), where \(k\) is any constant. Prove that PSPACE ⊆ EXP. (Hint: If a Turing machine has \(l\) internal states, an \(m\) letter alphabet, and uses space \(O(p(n))\), argue that the machine can exist in one of at most \(lm^{p(n)}\) different states, and that if the Turing machine is to avoid infinite loops then it must halt before revisiting a state.)

Proof

Let us proceed as suggested in the hint. Consider a Turing machine and some input of length \(n\). Let \(S\) be the set of states, and \(\Gamma\) the alphabet for the tape. Let us define the configuration of the Turing machine to be

\[ (h, s, t) \in \{1,\ldots,p(n)\} \times S \times \Gamma^{p(n)} , \]

where \(h\) is the head position, \(s\) is the current state and \(t\) the content of the tape up to position \(p(n)\). By assumption the Turing machine does not use more than the first \(p(n)\) tape cells.

Since the Turing machine is deterministic, a configuration appearing twice means that the Turing machine loops forever and never halts. Hence if it halts, no configuration appears twice. The number of possible configurations is:

\[ p(n) \cdot l \cdot m^{p(n)} . \]

Hence this is also the maximal running time if the Turing machine halts. Since \(m^{p(n)}=2^{p(n)\log_2(m)}\) and \(p\) is a polynomial, the bound is of the form \(O(2^{n^k})\) for a suitable constant \(k\). QED.

Exercise 3.26 (L ⊆ P)

The complexity class L (for logarithmic space) contains all decision problems which may be decided by a Turing machine running in logarithmic space, that is, in space \(O(\log(n))\). More precisely, the class \(L\) is defined using a two-tape Turing machine. The first tape contains the problem instance, of size \(n\), and is a read-only tape, in the sense that only program lines which don’t change the contents of the first tape are allowed. The second tape is a working tape which initially contains only blanks. The logarithmic space requirement is imposed on the second, working tape only. Show that L ⊆ P.

Proof

The reasoning is basically the same as in the proof of PSPACE ⊆ EXP (exercise 3.25). There is the caveat, however, that a priori we have no bound on the possible head positions on the read-only tape. On the other hand, most definitions of \(L\) avoid this problem altogether, since they require the head position to be stored on the working tape. Therefore let us first assume that the read-only head can only traverse the \(n\) input cells. We come back to this restriction at the end of the proof.

Following the reasoning of the proof of exercise 3.25 we see that the number of configurations is at most

\[ n \cdot O(\log(n)) \cdot l \cdot m^{O(\log(n))} = O(\mathrm{poly}(n)) , \]

where we multiply the number of possible positions of the read-only head (\(n\)), the number of possible positions of the working head (\(O(\log(n))\)), the number of states (\(l\)), and the number of letters (\(m\)) raised to the number of writable cells (\(O(\log(n))\)). Note that \(m^{O(\log(n))}=n^{O(\log(m))}\) is indeed polynomial in \(n\). This proves the claim.

What happens if we do not restrict the read-only head to the first \(n\) input cells? Let us assume that all other cells are blank (it would not make sense to allow the read head to read anything other than the actual input). How far can the read-only head travel into the blank region? The answer is: only polynomially many cells! In fact, during all that time the head points to a blank and hence effectively does not contribute to the configuration of the machine; if such a reduced configuration repeated, the machine would loop forever. So the amount of consecutive time spent in the blank region is at most polynomial. Hence the number of positions of the read-only head is polynomial too. This implies the claim in the general case. QED.

Exercise 3.27 (Approximation algorithm for VERTEX COVER)

Let \(G=(V,E)\) be an undirected graph. Prove that the following algorithm finds a vertex cover for \(G\) that is within a factor two of being a minimal vertex cover:

VC = ∅
E' = E
do until E' = ∅
  let (α, β) be any edge of E'
  VC = VC ∪ {α, β}
  remove from E' every edge incident on α or β
return VC.

Proof

Consider a minimal vertex cover MVC. The algorithm selects a sequence of edges \((\alpha,\beta)\); since all edges incident on \(\alpha\) or \(\beta\) are removed immediately, the selected edges are pairwise vertex-disjoint (they form a matching). By definition of a vertex cover, MVC must contain at least one of the vertices \(\alpha,\beta\) of each selected edge, and these vertices are distinct for distinct selected edges. The algorithm adds both vertices of each selected edge to VC, hence \(|VC|\leq2\,|MVC|\). Finally, VC is indeed a vertex cover, since the loop only terminates once every edge has been removed, i.e., is incident on some vertex of VC. QED.
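For completeness, here is a direct Python transcription of the algorithm (the list-based edge set is our choice and not particularly efficient, but it keeps the sketch short):

def approx_vertex_cover(edges: list[tuple[int, int]]) -> set[int]:
    """2-approximation for VERTEX COVER as in the exercise statement."""
    vc: set[int] = set()
    remaining = list(edges)            # E'
    while remaining:                   # do until E' = ∅
        a, b = remaining[0]            # any edge of E'
        vc |= {a, b}
        remaining = [(u, v) for (u, v) in remaining
                     if u not in (a, b) and v not in (a, b)]
    return vc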

Exercise 3.28 (Arbitrariness of the constant in BPP)

Suppose \(k\) is a fixed constant, \(1/2 < k\leq1\). Suppose \(L\) is a language such that there exists a Turing machine \(M\) with the property that whenever \(x\in L\), \(M\) accepts \(x\) with probability at least \(k\), and whenever \(x\notin L\), \(M\) rejects \(x\) with probability at least \(k\). Show that \(L\in\mathrm{BPP}\).

Proof

For simplicity we assume that the Turing machine always halts; I think most definitions of BPP actually assume this. But it is not really crucial: since a polynomial time bound is required anyway, one can introduce a step-counter which aborts the machine after a sufficiently large polynomial number of steps. In that case one can halt in either the accepting or the rejecting state (garbage), but "in practice" it would certainly be nicer to introduce a new halting state ABORT.

Let \(\varepsilon=k-1/2\) and \(r\in\NN\). Consider the Turing machine \(M_r\) which runs \(M\) on the same input \(r\) times and accepts if the majority of the individual runs accept; it rejects otherwise. In case of a tie it simply rejects (this does not really matter but fits better into the argumentation of Box 3.4). Let us fix an input \(x\) and assume \(x\in L\). The case \(x\notin L\) is similar and we omit it.

Let \(X_i\) be \(1\) if \(M\) accepts and \(0\) if it rejects in the \(i\)-th run on input \(x\).

It is intuitively clear that the \(X_i\) are iid (independent, identically distributed). They are identically distributed since it is always the same machine and always the same input. They are independent since the individual coin tosses are independent (certainly a reasonable assumption although it was never explicitly mentioned in the definition of the probabilistic Turing machine).

Hence the Chernoff bound (Box 3.4) implies that \(x\) is accepted by \(M_r\) with probability at least \(1-e^{-2\varepsilon^2r}\), which goes to \(1\) as \(r\to\infty\). The case \(x\notin L\) can be treated in the same way (there the \(X_i\) are \(1\) upon rejection and \(0\) upon acceptance). Hence the constant achieved by \(M_r\) satisfies \(k_r\geq1-e^{-2\varepsilon^2r}\), which can be made arbitrarily close to \(1\) by taking \(r\) sufficiently large. QED.
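The effect of the majority vote is easy to observe numerically. The following toy simulation (our illustration, not part of the proof) replaces \(M\) by a biased coin with \(k=0.6\):

import random

def majority_accepts(k: float, r: int) -> bool:
    """One run of M_r: majority vote over r simulated runs of M (ties reject)."""
    accepts = sum(random.random() < k for _ in range(r))
    return accepts > r / 2

k, trials = 0.6, 10_000
for r in (1, 11, 101):
    p = sum(majority_accepts(k, r) for _ in range(trials)) / trials
    print(f"r = {r:3d}: estimated acceptance probability ~ {p:.3f}")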

Exercise 3.29 (Fredkin gate is self-inverse)

Show that applying two consecutive Fredkin gates gives the same outputs as inputs.

Proof

The action of the Fredkin gate can be briefly expressed by the following two laws of action:

\begin{align*} ab0 &\mapsto ab0 , \\ ab1 &\mapsto ba1 . \end{align*}

It is very hard not to see that the Fredkin gate is self-inverse. QED.
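For the skeptical reader, an exhaustive check in a few lines of Python:

def fredkin(a: int, b: int, c: int) -> tuple[int, int, int]:
    return (b, a, c) if c else (a, b, c)    # swap a and b iff c = 1

assert all(fredkin(*fredkin(a, b, c)) == (a, b, c)
           for a in (0, 1) for b in (0, 1) for c in (0, 1))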

TODO Exercise 3.30

Verify that the billiard ball computer in Figure 3.14 computes the Fredkin gate.

Exercise 3.31 (Reversible half-adder)

Construct a reversible circuit which, when two bits \(x\) and \(y\) are input, outputs \((x,y,c,x\oplus y)\), where \(c\) is the carry bit when \(x\) and \(y\) are added.

Solution

Below we give a circuit which is made of Toffoli gates and implements the operation:

\[ (x_0, x_1, 1, a_0, a_1) \; \mapsto \, (x_0, x_1, 1, a_0\oplus c, a_1\oplus x_0\oplus x_1) , \]

where \(c=x_0x_1\) is the carry bit. The circuit follows below; note that no garbage is left behind (since the ancilla bit is not modified).

                       ░
x_0: ──────────────────░───■─────────■──
                       ░   │         │
x_1: ──────────────────░───■────■────┼──
     ┌───────────────┐ ░   │    │    │
anc: ┤ Initialize(1) ├─░───┼────■────■──
     └───────────────┘ ░ ┌─┴─┐  │    │
a_0: ──────────────────░─┤ X ├──┼────┼──
                       ░ └───┘┌─┴─┐┌─┴─┐
a_1: ──────────────────░──────┤ X ├┤ X ├
                       ░      └───┘└───┘

The first Toffoli directly adds the carry bit to \(a_0\). The second and the third Toffoli add \(x_1\) and \(x_0\) to \(a_1\), producing \(x_0\oplus x_1\) (the ancilla, being fixed to \(1\), turns them into controlled-NOTs).
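The circuit is easily verified exhaustively. Here is a small Python check (our sketch, modeling the Toffoli gate as a function on bits):

def toffoli(c1: int, c2: int, t: int) -> int:
    return t ^ (c1 & c2)              # the target flips iff both controls are 1

for x0 in (0, 1):
    for x1 in (0, 1):
        anc, a0, a1 = 1, 0, 0         # ancilla fixed to 1, a_0 = a_1 = 0
        a0 = toffoli(x0, x1, a0)      # first Toffoli: add the carry to a_0
        a1 = toffoli(x1, anc, a1)     # second: add x_1 to a_1
        a1 = toffoli(x0, anc, a1)     # third: add x_0 to a_1
        assert (a0, a1) == (x0 & x1, x0 ^ x1)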

Exercise 3.32 (From Fredkin to Toffoli and back again)

What is the smallest number of Fredkin gates needed to simulate a Toffoli gate? What is the smallest number of Toffoli gates needed to simulate a Fredkin gate?

Solution: Implement Fredkin by Toffoli

The following circuit clearly implements the Fredkin gate in terms of three Toffoli gates. The nice thing is that no ancilla bits are needed.


  c: ──■────■────■──
       │  ┌─┴─┐  │
x_0: ──■──┤ X ├──■──
     ┌─┴─┐└─┬─┘┌─┴─┐
x_1: ┤ X ├──■──┤ X ├
     └───┘     └───┘

Clearly the Fredkin gate cannot be implemented by a single Toffoli gate, since the former modifies (up to) two bits and the latter only one. It is also not very hard to see that two Toffolis are not sufficient either (even with ancilla bits). In fact, the two targets are already forced to lie on the \(x_0\) and \(x_1\) bits. From this it follows that both Toffolis must have one control at \(c\), since in the Fredkin gate \(c\) has an effect on both \(x_0\) and \(x_1\). In the same way the second control is necessarily at \(x_0\) or \(x_1\), so there is no further freedom.
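Again this is quickly verified exhaustively in Python (our sketch):

def toffoli(c1: int, c2: int, t: int) -> int:
    return t ^ (c1 & c2)

def fredkin(c: int, a: int, b: int) -> tuple[int, int, int]:
    return (c, b, a) if c else (c, a, b)

for c in (0, 1):
    for x0 in (0, 1):
        for x1 in (0, 1):
            y0, y1 = x0, x1
            y1 = toffoli(c, y0, y1)   # first Toffoli of the circuit above
            y0 = toffoli(c, y1, y0)   # second
            y1 = toffoli(c, y0, y1)   # third
            assert (c, y0, y1) == fredkin(c, x0, x1)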

Solution: Implement Toffoli by Fredkin

The Toffoli gate can be implemented by three Fredkin gates and two ancilla bits (initialized to \(0\) and \(1\) in that order) if garbage is allowed:

                         ░
  c_0: ──────────────────░─────■────
                         ░     │
  c_1: ──────────────────░─────┼──■─
                         ░     │  │
    x: ──────────────────░──■──┼──X─
       ┌───────────────┐ ░  │  │  │
anc_0: ┤ Initialize(0) ├─░──X──X──X─
       ├───────────────┤ ░  │  │
anc_1: ┤ Initialize(1) ├─░──X──X────
       └───────────────┘ ░

Note that the first Fredkin gate is responsible for filling the last two wires with \(x\) and \(\overline{x}\). The other two Fredkin gates operate on three wires, two of which contain \(x\) while the third contains \(\overline{x}\). The \(\overline{x}\) is moved onto the third wire (the \(x\) wire) iff \(c_0\) and \(c_1\) are both \(1\).
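Once more, an exhaustive Python check of this circuit (our sketch, with the Fredkin gate modeled as a controlled swap):

def fredkin(c: int, a: int, b: int) -> tuple[int, int, int]:
    return (c, b, a) if c else (c, a, b)

for c0 in (0, 1):
    for c1 in (0, 1):
        for x in (0, 1):
            anc0, anc1 = 0, 1                        # the two ancilla bits
            _, anc0, anc1 = fredkin(x, anc0, anc1)   # fill wires with x and ¬x
            _, anc0, anc1 = fredkin(c0, anc0, anc1)
            _, x_out, anc0 = fredkin(c1, x, anc0)
            assert x_out == x ^ (c0 & c1)            # Toffoli action on x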

We need ancilla bits since the Fredkin gate is conservative, that is, it preserves the number of ones (and zeros), whereas the Toffoli gate does not. This also implies that we cannot uncompute the garbage. Note that two ancilla bits, a \(0\) and a \(1\), are indeed required, because the Toffoli gate can either increase or decrease the number of ones. Note also that if ancilla bits are assumed to be zero-initialized (qiskit convention), a NOT gate (in qiskit called X) is needed to prepare the initial \(1\).

Clearly one Fredkin gate cannot produce the Toffoli gate. Let us demonstrate that it is still not possible with two Fredkin gates.

Assume to the contrary that an implementation with just two Fredkin gates exists. None of the targets of the first Fredkin gate can lie on the first or second wire. Otherwise the second gate would have to move the bit of that wire back, and hence would have to occupy the same targets and the same control (there are a few corner cases where this is not needed, but they fail for trivial reasons). It follows that the second gate cannot have a target on one of the first two wires either. On the other hand, the operation must depend on both \(c_0\) and \(c_1\), hence the controls must occupy \(c_0\) and \(c_1\). Without loss of generality we may assume that the first gate has its control at \(c_0\) and the second gate at \(c_1\). Clearly the second gate must have the third wire as one of its targets, since otherwise it would not contribute to the result. The other target must be an ancilla bit. Hence we have the following situation:

c_0: ─■────

c_1: ────■─
         │
  x: ────X─
         │
  ?: ────X─

This means that the second gate is completely determined and that we know the control of the first gate. Right before the execution of the second gate, the fourth wire must contain \(\overline{x}\) whenever \(c_0=1\). Clearly there is no way the first gate could accomplish that in general. Contradiction! Hence three Fredkin gates are necessary.

We have already seen that two ancilla bits (a \(0\) and a \(1\)) are required. Hence the given construction is optimal both in the number of gates and in the number of ancilla bits.