Ames Laboratory,
Department of Energy,
ISU,
Ames, Iowa

Humans have talents that are hard to program. Driving a car, recognizing
continuous speech, and playing chess have proved far more challenging to
computers than they have to people. To this class we can probably add
algorithm optimization. How is it that a mathematician can look
at a simple algorithm for complex product like

and discover that the number of
operations can be reduced to 3?
One method, which seems far from obvious, is


The conventional method for 2 by 2 matrix products calls for 8 multiplications and 3 additions. (In this paper, we equate additions and subtractions in assessing operation count, since they are computationally similar).
The 'trick' behind the Strassen method for 2 by 2 matrix products [20] is even more abtruse and baffling than the shortcut for complex products (Yuval presents a possible derivation, see [22]):



This algorithm was later improved by Winograd [1, p. 247], who reduced the number of additions from 18 to 15:


Where is the pattern in these methods? Although the theory of trilinear forms [16,17] has helped guide some shortcuts, it appears that these methods are the result of inexplicable intuitive leaps by some very bright people.
In August of 1991, the authors began an experiment to see if computers,
especially massively parallel computers, could discover these tricks,
and perhaps find new ones. Beginning with brute force search methods that
ran for days on very small algorithms, the program became gradually more
sophisticated and able to handle interesting problems within the limits of
our patience. Recently, an nCUBE 2 with 256 processors revealed that
it is possible to compute the cross product of two 3-dimensional vectors
using only 5 multiplications, and this method is new to the best of our
knowledge. The conventional method requires 6 multiplications and 3 subtractions:

The algorithm with 5 multiplications, found by the computer program, is:


The rest of the paper describes our search program. Short-cut ways of computing the given expressions are found by doing a search among all possible expressions that can be derived from the given set of variables and operations. Since this exhaustive enumeration has a combinatorially large number of expressions to explore, strategies are developed which reduce the search. Several `pruning' strategies are used to avoid the exploration of unpromising subtrees. Parallel computers are employed to conduct the search in parallel, to achieve higher speed.
