Here are our latest papers.

  • ASyMOB: Algebraic Symbolic Mathematical Operations Benchmark
    Shalyt M., Elimelech R., Kaminer I.
    arXiv 2505.23851 (2025)
    We introduce ASyMOB, a new benchmark designed to rigorously evaluate LLMs on symbolic mathematics, specifically targeting core tasks such as integration, limits, and differential equations. ASyMOB's 17,092 purely symbolic challenges allow the analysis of model generalization and failure through controlled perturbations. Results show that current LLMs, even high-performing ones, often rely on pattern memorization and struggle significantly when faced with slight problem variations, with performance degrading by up to 70%. However, integrating code execution with LLMs improves accuracy, especially for weaker models. The most advanced models, such as o4-mini and Gemini 2.5 Flash, show both high accuracy and robustness to perturbations, suggesting a possible phase transition in symbolic reasoning capabilities. It remains uncertain whether future progress will come from better LLMs alone or from deeper integration with tools such as computer algebra systems.
    ASyMOB code repository
    ASyMOB dataset

  • The Ramanujan Library — Automated Discovery on the Hypergraph of Integer Relations
    Beit-Halachmi I., Kaminer I.
    arXiv 2412.12361 (2024)
    We introduce the first library dedicated to mathematical constants and their interrelations, aiming to provide a central resource for scientists and a platform for developing new algorithms. Using a novel hypergraph representation, where constants are nodes and formulas are edges, we developed a systematic approach to automatically discover connections between constants using the PSLQ algorithm. This method led to the discovery of 75 previously unknown connections, including new formulas for various constants and generalizations of known relations.
  • Unsupervised Discovery of Formulas for Mathematical Constants
    Shalyt M., Seligmann U., Beit-Halachmi I., David O., Elimelech R., Kaminer I.
    The 38th Conference on Neural Information Processing Systems (NeurIPS 2024) (previous version on arXiv)
    This is a methodology for categorizing and identifying patterns in polynomial continued fraction formulas based on convergence dynamics rather than numerical values, enabling automated clustering. Applied to a set of 1,768,900 unlabeled formulas, this approach autonomously rediscovered known formulas and uncovered new formulas for π, ln(2), and other constants, revealing underlying mathematical structures.
    (For a 5-minute video abstract see here)
  • Algorithm-assisted Discovery of an Intrinsic Order Among Mathematical Constants
    Elimelech R., David O., De la Cruz Mengual C., Kalisch R., Berndt W., Shalyt M., Silberstein M., Hadad Y., & Kaminer I.
    Proceedings of the National Academy of Sciences (PNAS) 121, e2321440121 (2024) (previous version on arXiv)
    A massively parallel computer algorithm has discovered an unprecedented number of continued fraction formulas for fundamental mathematical constants. These formulas unveil a novel mathematical structure that we refer to as the conservative matrix field. This field not only unifies thousands of existing formulas but also generates an infinite array of new formulas. Most importantly, it reveals unexpected relationships among various mathematical constants.
  • The conservative matrix field
    David O.
    arXiv 2303.09318 (2023)
    The conservative matrix field is a mathematical structure for studying mathematical constants by combining polynomial continued fractions. In particular, it is used to reprove and motivate Apéry's original proof of the irrationality of \(\zeta(3)\).
    (see also here for some details and examples).
  • On Euler polynomial continued fractions
    David O.
    arXiv 2308.02567v2 (2023)
    Euler polynomial continued fractions are those that, in a sense, come from “simple” infinite sums via Euler conversion. We describe a method to determine whether a given polynomial continued fraction is of this form, and how to convert it back to an infinite sum.
    (see also here for some details and examples).
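
As a rough illustration of the Euler conversion mentioned in the last entry, the sketch below checks Euler's classical continued fraction formula, a0 + a0*a1 + ... + a0*a1*...*an = a0 / (1 - a1 / (1 + a1 - a2 / (1 + a2 - ...))), on the alternating series for ln(2). It is not the detection method from the paper. The function names (`partial_sum`, `euler_fraction`, `ln2_ratios`) are hypothetical and chosen only for this example, and exact rational arithmetic is used so that both sides can be compared for equality.

```python
from fractions import Fraction


def partial_sum(ratios):
    """Return a0 + a0*a1 + ... + a0*a1*...*an for ratios = [a0, ..., an]."""
    total, term = Fraction(0), Fraction(1)
    for r in ratios:
        term *= r          # running product a0*a1*...*ak
        total += term
    return total


def euler_fraction(ratios):
    """Evaluate a0 / (1 - a1 / (1 + a1 - a2 / (1 + a2 - ...))) bottom-up."""
    first, rest = ratios[0], ratios[1:]
    tail = Fraction(0)     # innermost level: nothing is subtracted
    for a in reversed(rest):
        tail = a / (1 + a - tail)
    return first / (1 - tail)


def ln2_ratios(n):
    """Term ratios of 1 - 1/2 + 1/3 - ...: a0 = 1 and a_k = -k / (k + 1)."""
    return [Fraction(1)] + [Fraction(-k, k + 1) for k in range(1, n)]


# Illustrative check: the finite continued fraction reproduces the partial sum.
ratios = ln2_ratios(10)
print(partial_sum(ratios) == euler_fraction(ratios))
print(float(euler_fraction(ratios)))
```

Evaluating the fraction from the innermost level outward keeps the code to a single loop. The same ratios can be replaced by those of any other series with nonzero terms, provided no intermediate denominator vanishes.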