### Open Problem: Efficiently Evaluating Subset Sums

Suppose we are given a vector of $n$ items $\begin{bmatrix} x_1 & x_2 & \dots x_n \end{bmatrix}$ and we would like to perform a number of subset sums on these items.  Specifically, we would like to compute $\sum_{i \in S} x_i$ for some collection of subsets $S \subseteq \{ 1, \dots, n \}$.   We would like to evaluate all of the summations using the minimum number of arithmetic operations (additions and subtractions).

In this setting, addition and subtraction of two elements is an expensive operation (e.g., think of them as being 2D numpy arrays).  Therefore, we are willing to spend computational time and resources to find an efficient means to evaluate the subset sums.  Moreover, we may need to compute these subset sums many times, e.g., if they are computed as part of an inner loop to an iterative algorithm.  Therefore, the time spent finding the right way to evaluate the subsets sums will ultimately pay for itself.

A few examples with solutions to make this problem more concrete:

Example 1:

Suppose the subsets sums are structured so

$S_k = \sum_{i=1}^k x_i$ for $k=1, \dots, n$.

Naively evaluating these subset sums one-by-one would require $(k-1)$ additions for each k, for a total of $n(n-1)/2$ additions to evaluate all $n$ subset sums.  However, we can do better by noting that the subset sums overlap.  In particular, if the subset sum has been calculated for a given $k$, we can calculate the subset sum for $k+1$ using only one addition using the simple fact that

$$S_{k+1} = x_{k+1} + S_k$$

This improved method only requires $(n-1)$ additions total, to evaluate all $n$ subset sums.

Example 2:

Suppose the subset sums are structured so

$$S_j = \sum_{i = 1 \\ i \neq j}^n x_i$$

Naively evaluating these subset sums one-by-one would require $(n-2)$ additions for each $j$, for a total of $n (n-2)$ additions to evaluate all $n$ subset sums.  We can do much better by first computing $T = \sum_{i=1}^n x_i$ and then calculating $S_j$ as follows:

$$S_j = T - x_j$$

This improved method requires $(n-1)$ additions to compute $T$, then $n$ subtractions to compute each $S_j$, for a total of $2n - 1$ arithmetic operations.  This is a significant improvement over the naive method.

Example 3:

The above examples were very simple and highly structured.  By analyzing their structure we were able to find efficient ways to evaluate the subset sums.  But the purpose of this blog post is to consider the general problem, and hopefully identify an automated algorithm that can identify a good way to evaluate the subset sums efficiently.  Here is a third example which doesn't have an obvious structure, but is of practical interest nonetheless.  Here, each row represents a subset sum we would like to compute.  The non-zero elements of that row correspond to which elements will be summed.  Here, $n=10$ and the number of subset sums is $8$.  Naively evaluating the subset sums would require $nnz(A)-8$ arithmetic operations, where $A$ is the matrix below and $nnz$ is the number of non-zero entries.  Since this matrix is fairly dense, this is not an ideal approach.  How much better can we do here?
array([[1, 1, 1, 1, 1, 1, 1, 0, 1, 1],
[0, 1, 0, 1, 1, 1, 1, 0, 1, 1],
[0, 0, 1, 0, 0, 1, 0, 0, 1, 1],
[1, 1, 0, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 0, 1, 1, 1, 1, 0, 1],
[1, 1, 1, 1, 0, 1, 0, 1, 0, 0],
[1, 0, 0, 1, 1, 1, 1, 1, 1, 1],
[1, 0, 1, 0, 1, 1, 0, 1, 1, 1]])

Questions:

Is there a polynomial time algorithm that can find the optimal "decomposition" for an arbitrary input $A$?  If so, is there an $n^3$ time algorithm to solve this problem?  If not, is there an efficient approximation algorithm? I don't have the answers to these questions, but I thought the problem was interesting nonetheless.  If you have any thoughts, feel free to post them in the comments below!  I believe a good solution to this problem could be of general and practical interest.