author: Robert R. Tucci |
date: Oct 17, 2000 |
Blackbox1: Grover's Oracle |
Blackbox2: Op-Amp |
Although Grover's algorithm (GA) (Ref.1) is mathematically fairly simple, beginners are often befuddled by the claims made about it (Ref .2). In this article I will point out certain linguistic pitfalls which I believe are in large measure responsible for this confusion. By making the reader aware of these pitfalls, I hope to teach him how to avert them.
The confusion surrounding GA can be traced to its use of an "oracle" or "blackbox", which is supposed to answer "queries" about an "unordered database".
Henceforth, I will replace the term "unordered database" by one which I prefer, "unsorted list". By a list L we mean a function f(.) from a set X to a set Y, where X and Y both have a linear order (<=) defined on them.
By a partial order on a set X we mean a relationship <= such that if x_1, x_2 and x_3 are elements of X, then
A set X with a partial order <= defined on it is called a poset. A partial order is called a linear order (aka full order) if it also satisfies
An example of a partial order is the relationship x Subset y for any two sets x and y. An example of a linear order is x LessThanOrEqual y, for any two integers x and y. Henceforth we will restrict our attention to linear orders.
We write x_1>= x_2 iff x_2<= x_1. We write x_1 < x_2 iff x_1<= x_2 but not (x_1=x_2). We write x_1> x_2 iff x_2< x_1.
We say that the list L is sorted iff for any x_1, x_2 in X, x_1<x_2 implies f(x_1)<=f(x_2). A list which is not sorted will be said to be unsorted. Note that sorted lists have monotonically increasing (or at least nondecreasing) graphs. We will use NE to represent the number of elements in X, and assume NE=2^NB, where NB is the number of bits.
For example, X might equal the integers from 1 to 64 and Y might be a set of names. Our list L could then be specified by (1)Jack (2)Jack (3)Bob (4)Alice...If these names were arranged in alphabetical order, we would say that the list is sorted.
Let Bool= {0,1} and Bool ^NB = set of strings x of the form (x_1, x_2, ..., x_NB) , where all the x_i are in Bool. In quantum computation, X is mapped in a 1-1onto fashion into the set Bool ^NB of NB-bit strings. If q(x_1) and q(x_2) are NB-bit strings corresponding to elements x_1 and x_2 of X, we can say q(x_1)<=q(x_2) iff x_1<= x_2. Likewise, Y is mapped in a 1-1 fashion into Bool ^NB, and NB-bit strings corresponding to elements of Y have an order induced upon them by the order on Y.
Suppose we are looking for the position of string t=target in the set Y of our list. When the list is sorted, its f(x) is monotonic. In that case one can use a binary search on a classical computer to find t in log(NE) steps. Thus, the speed advantage of GA, if any, is for unsorted lists.
In GA, all list information is encoded into an oracle which can be represented by a unitary operation U. Let H be the Hilbert space for NB qubits. H has the standard orthonormal basis of all possible |x>, where x is an NB-bit string. Let e = Elvis represent some fixed NB-bit string that is marked as tha king of strings. Suppose d(x,y) is the Kronecker delta function; it equals one if x=y and zero otherwise. The GA oracle is a unitary transformation U that acts on any |x> as follows:
U|x> = (-1)^d(x, e) |x>
U can be represented very simply in the |x> basis by the NE by NE diagonal matrix whose diagonal entries are all +1 with the exception of a single entry which is -1. The -1 entry is located at the position corresponding to the string e. The matrix U models a black box that will answer Boolean questions like: Is Elvis at my house? (yes or no). But it won't answer (directly) questions like: Where is Elvis?
Let t, an element of set Y, be the string that we decide to look for, our target string. It is reasonable to assume that the list and its oracle U are fixed before we decide what the string t is going to be. But the U just defined depends on a string e that the oracle assumes will be the one we seek. The oracle can either read the future, which is unlikely, or it tries to guess the future, and it will only succeed the fraction 1/NE of the time when t=e. The rest of the time the oracle will fail. According to GA proponents, GA can search an unsorted list in order(sqrt(NE)) steps, and find the location of an entry of that list with probability almost 1. Our previous arguments seem to paint a very different picture: GA can search an unsorted list in order(sqrt(NE)) steps and find the location of an entry with probability 1/NE of success. Viewed this way, GA doesn't look very useful for searching large unsorted lists. So which picture is right? They both are. There is no contradiction because they refer to different experiments.
One possibility is that before building the GA oracle, we are given an unsorted list with string t=target in its set Y, and we are asked to find the position of t. Call this the ask, then build (AB) strategy. Another strategy, call it build, then ask (BA), is to first build the oracle, and then, after it is built, decide which of the strings in Y is the one we seek. The BA strategy should never be used since its success probability is 1/NE (unless the oracle is prescient). When people say that GA can be used to search the Internet, or an unordered phonebook, faster than classical, that is mere speculation which may turn out to be false. The BA strategy fails most of the time, and following an AB strategy would force us to build a different oracle for every conceivable string anyone would ever ask for in the future. This myriad of oracles could be avoided if one could design a variable oracle circuit whose target string could be changed blindly (without any need to know where and how data is stored inside the list) and incrementally (without modifying the circuit too extensively). Call this a build, then ask, then tune (BAT) strategy. The design of GA oracle circuits is still in its infancy, so we don't know much at present about using GA in the BAT strategy (GA/BAT). To do GA/BAT effectively, we will need to learn how to:
GA/BAT oracles have a certain amount of reusability lacked by GA/AB oracles. The price to pay for this reusability may be that GA/BAT oracle circuits may turn out to be larger and slower than their AB counterparts. It may turn out that GA/BAT is not always sqrt(NE) faster and is even slower for certain applications than the classical algorithms that do the same calculation.
GA/AB can be used to solve certain NP problems. Let's see how. In such problems we are given a function C(x), call it a cost function, which maps a set X into the reals. We are asked to calculate the zeros of C. GA can be used to find the zeros if one considers a list with list function f(x) = C(x). Assume f(x) has exactly one zero which we will denote by f -1(0). We define the oracle for this problem to be the unitary transformation U from H to H whose action on the standard basis is:
U|x> = (-1)^d(f(x), 0) |x>.
Since
d(f(x), 0) = d(x, f -1(0)),
if we set e = f -1(0), we get the same oracle U that we had before. In applying GA to NP problems of this kind, the Ask part of the strategy is fully accomplished by specifying the cost function. The oracle is built after this, so an AB strategy is called for.
The AB and BAT strategies can be used by other quantum computer "programs", not just GA, and also by classical computer programs. Examples of computer programs that use the AB strategy are: a program that calculates Pi=3.1415... numerically, or one that calculates the roots of sin(x). An example of a computer program that uses the BAT strategy is: a program that calculates sin(x) for any real x entered by the user. BAT programs have a certain amount of reusability which AB programs lack. Not much is known about the design of BAT quantum computer programs. Quantum compilers such as the one described in Ref.3(shameless self-promotion) may turn out to be helpful for this.
In conclusion, although GA is mathematically fairly simple, beginners are often befuddled by it. I think the main reason for this is that the literature about GA often does not distinguish clearly between AB, BA and BAT. GA/BA is not very useful, GA/AB is very useful, and GA/BAT, if it can be developed, may turn out to be useful too. GA/AB is especially promising for solving certain NP problems. Using GA to search the Internet or an unsorted phonebook would require a BAT strategy about which we know next to nothing. Hence, at this point in time, claims of this are best described as pure science fiction.
Click here to see how many papers at the Los Alamos Library have the word Grover in the author-title-abstract. The number is 86 as I write this.