Efficiently navigating the search space: strategies for minimizing repetitions

Our algorithms that search for formulas may sometimes cover the same formulas more than once. Such cases of overlapping scans are wasteful. We are looking for new ways to scan over the search space while avoiding such overlaps. This challenge is becoming of great importance since we are starting to cover larger search spaces with higher risks for overlaps.

The goal: scan large spaces of parameters without wasteful repetitions

A substantial part of our project is to scan the space of polynomial continued fractions and find more conjectures that connect them to fundamental mathematical constants. A polynomial continued fraction is defined by two sequences, \(a_n\) and \(b_n\)

\(a_0 + \frac{b_1}{a_1 + \frac{b_2}{a_2 + \frac{b_3}{a_3 + \ddots}} }\)

We consider polynomial continued fractions with integer coefficients, meaning that both can be expressed using an easy-to-scan structure. Assuming \(a_n\) is of degree 3 and \(b_n\) is of degree 6, they can be written as:

\( a_n = c_0^{(a)} + c_1^{(a)} n + c_2^{(a)} n^2 + c_3^{(a)} n^3 \\\\
b_n = c_0^{(b)} + c_1^{(b)} n + c_2^{(b)} n^2 + c_3^{(b)} n^3 + c_4^{(b)} n^4 + c_5^{(b)} n^5 + c_6^{(b)} n^6 \\\\
\)

where \(c_i^{(a)}, c_i^{(b)} \in \mathbb{Z} \).

In this representation, scanning the space of continued fractions is straightforward – just set a range for every coefficient and run over all of them. When we distribute this search space to multiple clients, every client gets a chunk, which is a sub-range of the original space.

However, the plot thickens when we start issuing focused searches over structures that are similar to other conjectures. For examples, the following results are all valid formulas for \(\zeta(3)\):

\(
a_n = n^3 +(n+1)^3 \\
b_n = -n^6
\)

\(
a_n = n^3 +(n+1)^3 + 4(2n+1) \\
b_n = -n^6
\)

\(
a_n = 9(n^3 +(n+1)^3) + 80(2n+1) \\
b_n = -81n^6
\)

It is quite visible from this short example that a better way to define \(a_n\) and \(b_n\) is through the following scheme:

\( a_n = c_0(n^3 +(n+1)^3) + c_1(2n+1) \\\\
b_n = c_2 n^6
\)

It offers a much smaller and more focused search space. So it is more likely that we will find conjectures in it.

However, in many cases, we run searches on both schemes. This scenario is very common, since we can exhaust a focused search scheme, and then roll back to the more general one, aiming to find new research directions. In this case, keeping track over which polynomial continued were scanned becomes a substantial challenge. We can avoid millions of redundant calculations if we do it well.

So here is the challenge:

Can you create a system that tracks what continued fractions were scanned in different schemes, and enable scanning new ranges in different schemes without repetitions? The system should enable taking a very large search space, and divide it into reasonable chunks (say, 50K items in every chunk) where we only calculate continued fractions that were not scanned before. (For example, imagine polynomial continued fractions where \(a_n\) is of degree 7 and \(b_n\) is of degree 14, where every coefficient ranges from -200 to 200.

Notice that since the search space is very large, we cannot save a list of all the continued fractions that we already calculated. We can only store the ranges, and if it helps, the scheme they were created from.