文献学习——Guiding CDCL SAT Search via Random Exploration amid Conflict Depression

时间：2021-06-02 15:33:24 阅读：0 评论：0 收藏：0 [点我收藏+]

标签：begin lmos dep Once gate pattern 检测 form familiar

Guiding CDCL SAT Search via Random Exploration amid Conflict Depression

Md Solimul Chowdhury, Martin M¨ uller, Jia-Huai You
Department of Computing Science, University of Alberta
Edmonton, Alberta, Canada.
{mdsolimu, mmueller, jyou}@ualberta.ca

Abstract

The efficiency of Conflict Driven Clause Learning (CDCL) SAT solving depends crucially on finding conflicts at a
fast rate. State-of-the-art CDCL branching heuristics such as VSIDS, CHB and LRB conform to this goal.

We take a closer look at the way in which conflicts are generated over the course of a CDCL SAT search.

译文：我们将仔细研究在CDCL SAT搜索过程中产生冲突的方式。

Our study of the VSIDS branching heuristic shows that conflicts are typically generated in short bursts, followed by what we call a conflict depression phase in which the search fails to generate any conflicts in a span of decisions. 译文：我们对vsid分支启发式的研究表明，冲突通常在短时间内产生，随后是我们所说的冲突抑郁阶段，在这个阶段中，搜索不能在一系列决策中产生任何冲突。

The lack of conflict indicates that the variables that are currently ranked highest by the branching heuristic fail to generate conflicts.译文：缺少冲突表明，目前在分支启发式中排名最高的变量不能产生冲突。

Based on this analysis, we propose an exploration strategy, called expSAT, which randomly samples variable selection sequences in order to learn an updated heuristic from the generated conflicts. 译文：在此基础上，我们提出了一种探索策略，称为expSAT，它随机抽样变量选择序列，以从产生的冲突中学习得到一个更新的启发式。

The goal is to escape from conflict depressions expeditiously. 译文：目标是迅速摆脱冲突带来的萧条。

The branching heuristic deployed in expSAT combines these updates with the standard VSIDS activity scores. 译文：expSAT中部署的分支启发式方法将这些更新与标准vsid活动分数结合起来。

An extensive empirical evaluation with four state-of-the-art CDCL SAT solvers demonstrates good-to-strong performance gains with the expSAT approach. 译文：一个广泛的经验评估与四个最先进的CDCL SAT解决方案，证明了采用expSAT方法能带来良好的性能收益。

Introduction

	These heuristics reward variables involved in recent conflicts. The intuition is that assignments of these variables are likely to generate further conflicts, leading to useful learned clauses and thus pruning the search space. 译文：这些启发策略奖励最近冲突中涉及的变量。直觉是，这些变量的赋值很可能产生进一步的冲突，导致有用的学习子句，从而修剪搜索空间。

Preliminaries



	We assume familiarity with SAT solving (Biere et al. 2009). Here we briefly review the most relevant concepts. VSIDS Heuristic: VSIDS (Moskewicz et al. 2001) is a popular family of dynamic branching heuristics. We focus on exponential VSIDS as used in gLCM. VSIDS maintains an activity score for each variable in the given formula. It increases the activity score of each variable that is involved in conflict resolution by a variable bumping factor gz, where g > 1 is a constant and z is the count of the number of conflicts in the search so far. This strongly favors variables that participated in the most recent conflicts. Literal Block Distance (LBD): The LBD score (Audemard and Simon 2009) of a learned clause is the number of distinct decision levels in it. If this score is n, then the clause contains n propagation blocks, where each block has been propagated within the same branching decision. As variables in a block are considered to be related, learned clauses with a lower LBD score are likely of higher quality. Especially, when LBD score is 2, they are known to be glue clauses. Global Learning Rate (GLR): Suppose a CDCL solver takes d decisions to solve a given formula F and generates q conflicts. The GLR of the solver for F is defined as q d . GLR measures the overall ability of a solver to generate conflict for a given problem (Liang et al. 2017).

	Software, Hardware and Test Environment In this work, we adopt four baseline solvers: gLCM4, MplCOMSPS5 (winner of SAT-2016), MplCM4 (second runner up of SAT-2018) and MplCBT4 (winner of SAT-2018). While gLCM uses only VSIDS as its branching heuristic, the other three combine VSIDS with other heuristics. All experiments presented in this paper were run on a workstation with 64GB RAM and a processor clock speed of 2.4 GHz. Two test sets were used in experiments. (a) Test Set 1 contains 750 instances from the main track of SAT-2017 (350) and 2018 (400) and is run with a time limit of 5000 seconds per instance. (b) Test Set 2 consists of 52 hard instances from SATCoin (Bit Coin Mining) cryptographic benchmark, which are generated with the instance generator from (Manthey and Heusser 2018). We generated these instances by varying the range parameter, which determines the difficulty of a SATCoin instance. For experiments, we set the time limit to be 36,000 seconds per instance.

Conflict Depression and Conflict Bursts


	Let us represent the conflict history of the search by the sequence of ci and define a conflict depression (CD) phase as a sequence of one or more consecutive decisions with no conflict. 译文：让我们用c_i的序列来表示搜索的冲突历史，并将冲突压抑(CD)阶段定义为一个或多个没有冲突的连续决策的序列。
	Let us define a conflict burst (CB) phase as a sequence of one or more consecutive decisions with at least one conflict.译文：让我们将冲突爆发(CB)阶段定义为一个或多个具有至少一个冲突的连续决策的序列。


	译文：我们注意到，大多数情况下，CD的平均阶段长度都很短，但仍然包含多个决策(蓝色)。译文：不管它们的平均CD阶段长度如何，在给定决策率(黄色)的情况下，几乎所有的CD阶段(橙色)都以很高的速率出现.
	译文：图1右侧的直方图显示了CD相平均长度的分布。译文：这个平均值从2.09到1402.30。263个实例的长度很短(最多3个)。分布是重尾的，有69个平均长度大于25(最右边的bin)。

	Overall, the data indicates that for gLCM on Test Set 1, conflict depressions occur frequently and often last over multiple decisions (high average CD phase length).译文：总的来说，数据表明，在测试集1上的gLCM中，冲突抑郁经常发生，并在多个决策中持续(高平均CD阶段长度)。

	Propagation Depression Amid a CD Phase During a CD phase, VSIDS scores are not a good predictor of a variable’s future performance, since branching decisions fail to produce any conflict and perform only truth value propagations.译文：在CD阶段，vsid分数并不是变量未来性能的良好预测器，因为分支决策不会产生任何冲突，只执行真值传播。 Are there any differences in the pattern of unit propagations between CD and CB phases?译文：CD相和CB阶段的单位传播模式有什么不同吗? We define the Propagation Rate (PR) as the number of propagations per decision. Table 1 compares the average PR values for Test Set 1 over the decisions in CD and CB phases.译文：我们将传播速率(PR)定义为每个决策的传播次数。表1比较了测试集1在CD和CB阶段的平均PR值。 On average, PR values during a CD phase are almost 10 times lower than CB phases. Clearly, this result demonstrates that during a CD phase, VSIDS branching decisions go through propagation depression as well.译文：平均而言，CD阶段的PR值几乎比CB阶段低10倍。显然，这一结果表明，在CD阶段，vsid分支决策也经历了传播抑制。

	Conflict Bursts in gLCM How long are the CB phases compared to CD phases? For the Test Set 1 instances, average value of CB and CD length are 1.67 and 20.63, respectively.译文：与CD相相比CB相有多长?对于Test Set 1实例，CB和CD长度的平均值分别为1.67和20.63。 Thus, on average, shorter CB phases are followed by much longer CD phases.译文：因此，平均而言，较短的转CB阶段之后是较长的CD阶段。

	Bursts of Conflict Generation 冲突的爆发 Table 2 shows the average values of GLR, FDC, FDOC and FDMC for Test Set 1. Column 3 shows the average GLR values for all three types of problems to be close to 0.5.译文：三种问题的平均GLR值均接近0.5。 In contrast, the average FDC values in column 4 are much lower, averaging 0.2507 over all instances.译文：相比之下，第4列中的平均FDC值要低得多，所有实例的平均FDC值为0.2507。 Therefore, on average, about 75% of all the decisions do not produce any conflict and only 25% of all the decisions produce at least one conflict. Further, the majority of the conflict producing decisions produce more than 1 conflict. This is evident in the average FDMC value (0.1535), which is 61% of the total conflict producing decisions (0.2507). As a summary, we have the following conclusions. The typical search behavior contains shorter CB phases, which is followed by longer CD phases, where the search does not find any conflicts. 译文：典型的搜索行为包括较短的CB阶段，随后是较长的CD阶段，在这些阶段中搜索不会发现任何冲突。 ? During a CD phase, the search goes through propagation depression as well.译文：在CD阶段，搜索也经历了传播萧条。 ? The shorter CB phases are conflict intense, i.e., within a few decisions, many conflicts are generated.译文：较短的CB阶段是冲突激烈的阶段，即在几个决策中产生许多冲突。

Exploration Guided VSIDS 探索vsid的引导作用

	Is it possible to correct the course of the search in a CD phase by identifying promising variables that are currently underranked by VSIDS? 译文：是否有可能通过识别目前vsid低估的有希望的变量来纠正CD阶段的搜索过程? In this work, we address this question by formulating a solver framework, called expSAT, which performs random explorations that probe into the future search space.译文：在这项工作中，我们通过制定一个求解器框架来解决这个问题，称为expSAT，它执行对未来搜索空间的随机探索。 The goal is to discover branching variables that are likely to lead to conflicts from which clauses are learned.译文：其目标是发现可能导致冲突的分支变量。 Given a CDCL SAT solver, expSAT modifies it as follows: 1. Before each branching decision, if a substantial CD phase is detected, then with probability pexp, expSAT performs an exploration episode, consisting of a fixed number nW of random walks. 译文：在每个分支决策之前，如果检测到一个重要的CD阶段，那么expSAT将以概率pexp执行一个由固定数量的nW随机漫步组成探索阶段。 Each walk consists of a limited number of random steps. Each such step consists of the uniform random selection of an unassigned step variable, followed by unit propagation (UP).译文：每次行走都包含有限的随机步数。每个这样的步骤由一个未分配的步骤变量的均匀随机选择，然后是单位传播(UP)。 A walk terminates either when a conflict occurs during UP, or after a fixed number lW of random steps have been taken.译文：当UP期间发生冲突时，或者在执行了固定数量的lW随机步骤后，游走终止 After each walk, the search state is restored and the next walk begins.译文：每次行走之后，将恢复搜索状态，并开始下一次行走。 Fig. 2 illustrates an exploration episode with 3 walks and a maximum of 3 random steps per walk.译文：图2显示了3次行走和每次最多3次随机行走的探索阶段。 ? An exploration score is computed for each step variable.译文：为每个步骤变量计算一个探索分数。 ? In the CDCL search, branching variables are chosen that maximize the expVSIDS heuristic, which combines the VSIDS activity score of a variable and its exploration score. Ties are broken randomly.译文：在CDCL搜索中，选择的分支变量将使expVSIDS启发式最大化，该启发式结合了变量的VSIDS活动分数和它的探索分数。平局是随机打破的。 ? All other elements, such as unit propagation, conflict analysis, restarts, and backjumping, remain the same as in the underlying CDCL SAT solver. 译文：所有其他元素，如单元传播、冲突分析、重启和回跳，都与底层的CDCL SAT解决器保持相同。

	Algorithm Details Input and Parameters All these parameters are explained above, except ω, which we explain below. 译文：所有这些参数都在上面解释了，除了ω，我们在下面解释。 When a random walk ends in a conflict after a series of random steps, some combination of the assigned variables has caused the conflict. 译文：当一个随机漫步在一系列随机步骤之后以冲突结束时，某个指定变量的组合导致了冲突。 In expSAT, we assign the most credit to the most recently assigned variable, and exponentially decay the credit for the variables assigned earlier in the walk, by a factor of ω per decision step. This approach is patterned on reward decay in reinforcement learning (Sutton and Barto 1998).译文：在expSAT中，我们为最近分配的变量分配了最多的积分，并以指数形式衰减在行走中较早分配的变量的积分，每个决策步骤的系数为ω。这种方法是以强化学习中的奖励衰减为模式的(Sutton and Barto 1998)。

文献学习——Guiding CDCL SAT Search via Random Exploration amid Conflict Depression

标签：begin lmos dep Once gate pattern 检测 form familiar

原文地址：https://www.cnblogs.com/yuweng1689/p/14822429.html

踩

(0)

评论一句话评论（0）

分享档案

更多>

2021年07月29日 (22)
2021年07月28日 (40)
2021年07月27日 (32)
2021年07月26日 (79)
2021年07月23日 (29)
2021年07月22日 (30)
2021年07月21日 (42)
2021年07月20日 (16)
2021年07月19日 (90)
2021年07月16日 (35)

周排行