Ames Laboratory, Department of Energy, ISU, Ames, Iowa

next up previous
Next: Aggressive Pruning Rules Up: Massively Parallel Searching Previous: The Basic Search

Conservative Pruning Rules

To motivate the pruning rules, look at what is probably the simplest "trick" in all of elementary algebra: a2 - b2 = (a + b)(a - b). The factoring reduces the number of multiplications from 2 to 1, at the cost of increasing the number of adds from 1 to 2. If we set up a search tree using Algorithm A, the first step could be any of

Since addition and multiplication are commutative, and we include both subtraction and reverse subtraction, we need not include both input1 op input2 and input2 op input1. Therefore, requiring the first input to be from the level on or above that of the second input is a conservative pruning rule. This pruning rule is easily implemented by changing loop control variables instead of using explicit `if' statements. Some of the pruning strategies are rather straightforward and obvious. Nevertheless, they are very important in view of the significant amount of reduction in the search.

Before we enumerate the pruning rules, it is necessary to introduce some notions. The program always keeps track of the path from the root of the tree to the current node along with the way each expression on the path is derived. Expressions are represented as polynomials in the input variables. The terms of the polynomials are ordered to make it easy for addition and subtraction operations and for checking mathematical equality of expressions. The expressions on the path are numbered starting from the root. The way an expression is derived is represented by the numbers denoting the parent expressions and the operation used. Figure 3 makes the ideas clear.

FIGURE 3: A typical path showing expressions and how they are derived.

We can also define a lexicographic ordering on the expressions based on the way they are derived. Let f1op1g1 and f2op2g2 denote two expressions e1 and e2 on a path. We saye1 precedes e2 in lexicographic order if g1 < g2 or if g1 = g2 and op1 precedes op2 in lexicographic order or if g1 = g2, op1 = op2 and f1 < f2. The ordering on operations is defined to be "" < "" < " " < "". There are several possible choices for defining a lexicographic ordering and there is nothing special about the specific order chosen. The ordering is useful in defining some pruning rules. A number of pruning strategies are discussed below. For each pruning rule that is used, we state the rule along with a brief justification wherever it is appropriate.

  1. Explore only paths of length less than or equal to the total number of operations plus the number of variables.

    Each expression on a path (except the variables) is formed by one operation. Therefore, paths of length more than the budgeted number of operations plus the number of variables cannot provide the desired solution.

  2. Restrict the number of operations of each particular type.

    If the budget of operations of a particular type are consumed on a path from the root to a node, all children of the node using the particular operation can be pruned.

  3. Subtracting an expression from itself is not useful, and can be excluded from the search tree.

    This assumes that 0 is not a goal expression.

  4. If the expression at a node is the same as the expression at one of its parent nodes, delete the node and the subtree under it.

    This is obvious since no shortcut will have the same expression computed twice. In figure 4, the node labeled b(3: 0 - 2) is deleted since it represents the same expression as the node labeled b(1).

    FIGURE 4. A path showing two identical expressions.

  5. Eliminate expressions with a leading negative term, except when one of the goal expressions contains a leading negative term.

    Recall that the terms of expressions are ordered and hence there is no ambiguity in deciding if an expression has a leading negative term. An expression is negative if it has a leading negative term and positive otherwise. For each path P(1) containing a negative expression, we show the existence of another path P(2) which does not contain such expressions and is a solution if P(1) is a solution. P(2) satisfies the property that every expression in it is the same as the corresponding expression in P(1) or its negative. We construct P(2) starting from the root such that at each stage the constructed partial path satisfies the above property. To start with, the first m levels constitute the variables and hence satisfy the property. Let e3(2) be the next expression to be added to P(2) and e3(1) be the corresponding expression on P(1). Let e3(1) = e1(1) op e2(1). We can show that for every choice of op and for every possible combination of signs of e1(1) and e2(1), e3(2) can be constructed such that e3(2) = - e3(1) if e3(1) is negative and e3(2) = e3(1) otherwise.

    For example, if e3(1) = e1(1) + e2(1) and e3(1) and e2(1) are negative, let e3(2) = e1(2) ~ e2(2). Since e1(2) = e1(1) and e2(2) = - e2(1), e3(2) = - e2(1) - e1(1) = - e3(1), as desired. If all goal expressions are positive and P(1) contains a solution, then P(2) must also contain a solution by the above construction. Therefore, we can prune the subtree under any expression that is negative.

  6. The level of the second input node should be the same as or greater than the level of the first input node.

    This rule was explained in the beginning of this section.

  7. Eliminate exploring paths with identical set of expressions, by requiring expressions on each path to be in increasing lexicographic order starting from the root.

    For each path P(1) that does not contain expressions in increasing lexicographic order, we can show the existence of another path P(2) containing the same expressions as P(1) but in the proper order. Let e1(1) and e2(1) be two expressions on P(1) that do not conform to the lexicographic order. Without loss of generality, let e1(1) be the expression nearer to the root. Because e2(1) precedes e1(1) in lexicographic order, the expressions needed to compute e2(1) appear before e1(1) on the path and therefore e1(1) and e2(1) can be swapped. We can construct P(2) by systematically swapping every two expressions on P(1) that do not conform to the lexicographic order. Because P(1) and P(2) contain the same expressions, P(2) represents a solution if P(1) represents a solution.

    Using this rule, we can prune each node that precedes its parent in lexicographic order. In figure 5, the two paths shown have identical expressions and hence the subtrees under them are identical. The tree is pruned at node labeled a - b (3: 0 - 1) since it precedes its parent in lexicographic order. This avoids computing the same subtree twice.

    FIGURE 5: Search tree with two paths having identical set of expressions.

  8. Among all the children of a node with mathematically equivalent expressions, choose the one that is smallest in the lexicographic ordering.

    This rule is also useful to avoid computing duplicate subtrees. The expression that is smallest in the lexicographic ordering is chosen because by rule 7, the subtree under such a node contains the subtree under a node representing the same expression and having the same parent node. For example, in figure 6, the node labeled b - c(5: 3 ~ 4) is pruned as it succeeds the node labeled b - c(5: 2 - 3) in lexicographic order and has a common parent.

    FIGURE 6: Search tree with two children of a node representing the same expression.

  9. Levels to explore should be at least as many as goal expressions to be found.

    A path representing a solution should contain all the goal expressions. Paths with not enough room for all the goals need not be explored.

  10. In a shortcut, each expression is used at least once.

    Each expression is formed by at most 2 expressions on the current path. Also, the number of expressions on the path is bounded. This places a limit on the number of expressions that can be used in forming subsequent expressions on the path. If there is no possibility that all the expressions (except the goal expressions) are used at least once, the tree can be pruned at this node.

    Let l = length of the path constructed so far
          g = number of goals yet to be found
          u = number of unused expressions on the path

    As only paths of length at most m + M + A are explored, the constructed path can be extended to contain (m + M + A) - l more expressions and at most 2 (m + M + A - l) different expressions could be used to construct these. There are u expressions on the path explored that are unused and (m + M + A - l - g) expressions on any extended path that have to be used. Therefore, any extension of the path could be a shortcut and a solution only if u + (m + M + A - l - g) 2 (m + M + A - l) or if u m + M + A - l + g. Otherwise, the subtree under the path constructed so far can be pruned.

It should be noted that some of the pruning rules are more expensive than the others. This can be viewed as moving the tree traversal cost to the node expansion cost, and the trade off in cost should be considered. For example, rule 4 involves determining whether the expression is identical to one of its ancestors. Implementing this rule is fairly cheap and effective when the node is close to the root since there are few nodes to compare and large subtrees to prune. However, when it is close to the maximum level (as defined by rule 1), the cost of comparison increases dramatically but the payoff decreases significantly. Rule 8 involves comparing the expression to its siblings generated before, making it impractical to implement even for nodes fairly close to the root. In fact, the siblings are not available in a depth-first search as only the path from the root to the current node is stored. In such a case, a restricted version of the rule might be more useful. For example, rule 8 implies that any expression in first degree of length two should be constructed using only the given variables and this can be checked with just two comparisons. Except for rule 4 and rule 8, the remaining rules can be implemented using a constant number of operations irrespective of the position in the tree of the node being tested.

There are other pruning rules that guarantee that if a solution were to be found in one of the deleted subtrees, then an equally economical solution is guaranteed to be found in another subtree to be explored. These rules usually stem from the commutativity of the operation the goal expressions represent or from symmetry considerations.

As an example, consider the complex product, which is a commutative operation.

(a + ib)(c + id) = (c + id)(a + ib)

Exchanging a with c and b with d in any solution that computes the complex product also gives a solution for the same. Hence, in the search program, if a path from root to a node can be derived from another path by the above exchange, the subtree under that node need not be explored. In fact, we can easily derive the missed solutions from the solutions found.

Such a pruning rule is very useful because it prunes the tree at the first level. For, if a path can be derived from another, the first node in the path can also be derived and the tree should have been pruned at the first node on the path itself.

As another example of a similar rule, consider the 2 2 matrix product. Interchanging the rows of the first matrix and/or the columns of the second matrix does not affect the goal expressions to be computed. This prunes three out of four nodes at the first level of the tree.



next up previous
Next: Aggressive Pruning Rules Up: Massively Parallel Searching Previous: The Basic Search