A Machine Learning Based Splitting Heuristic for Divide-and-Conquer Solvers

From LRDE

Revision as of 10:57, 8 September 2021 by Bot (talk | contribs)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)

Abstract

In this paper, we present a machine learning based splitting heuristic for divide-and-conquer parallel Boolean SAT solvers. Splitting heuristics, whether they are look-ahead or look-back, are designed using proxy metricswhich when optimized, approximate the true metric of minimizing solver runtime on sub-formulas resulting from a split. The rationale for such metrics is that they have been empirically shown to be excellent proxies for runtime of solvers, in addition to being cheap to compute in an online fashion. However, the design of traditional splitting methods are often ad-hoc and do not leverage the copious amounts of data that solvers generate. To address the above-mentioned issues, we propose a machine learning based splitting heuristic that leverages the features of input formulas and data generated during the run of a divide-and-conquer (DC) parallel solver. More precisely, we reformulate the splitting problem as a ranking problem and develop two machine learning models for pairwise ranking and computing the minimum ranked variable. Our model can compare variables according to their splitting qualitywhich is based on a set of features extracted from structural properties of the input formula, as well as dynamic probing statistics, collected during the solver's run. We derive the true labels through offline collection of runtimes of a parallel DC solver on sample formulas and variables within them. At each splitting point, we generate a predicted ranking (pairwise or minimum rank) of candidate variables and split the formula on the top variable. We implemented our heuristic in the Painless parallel SAT framework and evaluated our solver on a set of cryptographic instances encoding the SHA-1 preimage as well as SAT competition 2018 and 2019 benchmarks. We solve significantly more instances compared to the baseline Painless solver and outperform top divide-and-conquer solvers from recent SAT competitions, such as Treengeling. Furthermore, we are much faster than these top solvers on cryptographic benchmarks.

Documents

Bibtex (lrde.bib)

@InProceedings{	  nejati.20.cp,
  author	= {Saeed Nejati and Ludovic {Le Frioux} and Vijay Ganesh},
  title		= {A Machine Learning Based Splitting Heuristic for
		  Divide-and-Conquer Solvers},
  booktitle	= {Proceedings of the 26 th International Conference on
		  Principles and Practice of Constraint Programming (CP'20)},
  year		= 2020,
  month		= sep,
  volume	= {12333},
  pages		= {899--916},
  series	= {Lecture Notes in Computer Science},
  publisher	= {Springer, Cham},
  abstract	= {In this paper, we present a machine learning based
		  splitting heuristic for divide-and-conquer parallel Boolean
		  SAT solvers. Splitting heuristics, whether they are
		  look-ahead or look-back, are designed using proxy metrics,
		  which when optimized, approximate the true metric of
		  minimizing solver runtime on sub-formulas resulting from a
		  split. The rationale for such metrics is that they have
		  been empirically shown to be excellent proxies for runtime
		  of solvers, in addition to being cheap to compute in an
		  online fashion. However, the design of traditional
		  splitting methods are often ad-hoc and do not leverage the
		  copious amounts of data that solvers generate. To address
		  the above-mentioned issues, we propose a machine learning
		  based splitting heuristic that leverages the features of
		  input formulas and data generated during the run of a
		  divide-and-conquer (DC) parallel solver. More precisely, we
		  reformulate the splitting problem as a ranking problem and
		  develop two machine learning models for pairwise ranking
		  and computing the minimum ranked variable. Our model can
		  compare variables according to their splitting quality,
		  which is based on a set of features extracted from
		  structural properties of the input formula, as well as
		  dynamic probing statistics, collected during the solver's
		  run. We derive the true labels through offline collection
		  of runtimes of a parallel DC solver on sample formulas and
		  variables within them. At each splitting point, we generate
		  a predicted ranking (pairwise or minimum rank) of candidate
		  variables and split the formula on the top variable. We
		  implemented our heuristic in the Painless parallel SAT
		  framework and evaluated our solver on a set of
		  cryptographic instances encoding the SHA-1 preimage as well
		  as SAT competition 2018 and 2019 benchmarks. We solve
		  significantly more instances compared to the baseline
		  Painless solver and outperform top divide-and-conquer
		  solvers from recent SAT competitions, such as Treengeling.
		  Furthermore, we are much faster than these top solvers on
		  cryptographic benchmarks.}
}