Skip to content

Commit c01190e

Browse files
committed
Initialize the pubgrub-rs guide
0 parents  commit c01190e

22 files changed

+1269
-0
lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
book

book.toml

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,9 @@
1+
[book]
2+
authors = ["Matthieu Pizenberg"]
3+
language = "en"
4+
multilingual = false
5+
src = "src"
6+
title = "PubGrub Guide"
7+
8+
[output.html]
9+
mathjax-support = true

src/SUMMARY.md

Lines changed: 21 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,21 @@
1+
# Summary
2+
3+
- [Version solving](./version_solving.md)
4+
- [Using the pubgrub crate](./pubgrub_crate/intro.md)
5+
- [Basic example with OfflineDependencyProvider](./pubgrub_crate/offline_dep_provider.md)
6+
- [Writing your own dependency provider](./pubgrub_crate/custom_dep_provider.md)
7+
- [Caching dependencies in a DependencyProvider](./pubgrub_crate/caching.md)
8+
- [Strategical decision making in a DependencyProvider](./pubgrub_crate/strategy.md)
9+
- [Solution and error reporting](./pubgrub_crate/solution.md)
10+
- [Writing your own error reporting logic](./pubgrub_crate/custom_report.md)
11+
- [Internals of the PubGrub algorithm](./internals/intro.md)
12+
- [Overview of the algorithm](./internals/overview.md)
13+
- [Terms](./internals/terms.md)
14+
- [Incompatibilities](./internals/incompatibilities.md)
15+
- [Partial solution](./internals/partial_solution.md)
16+
- [Conflict resolution](./internals/conflict_resolution.md)
17+
- [Building a report tree](./internals/report_tree.md)
18+
- [Testing and benchmarking](./testing/intro.md)
19+
- [Property testing](./testing/property.md)
20+
- [Benchmarking](./testing/benchmarking.md)
21+
- [How can I contribute? Here are some ideas](./contributing.md)

src/contributing.md

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,15 @@
1+
# How can I contribute? Here are some ideas
2+
3+
- Use it!
4+
Indeed there is quite some work left for custom
5+
dependency providers. So just using the crate, building
6+
you own dependency provider for your favorite programming language
7+
and letting us know how it turned out
8+
would be amazing feedback already!
9+
10+
- Non failing extension for multiple versions.
11+
Currently, the solver works by allowing only one version per package.
12+
In some contexts however, we may want to not fail if multiple versions are required,
13+
and return instead multiple versions per package.
14+
Such situations may be for example allowing multiple major versions of the same crate.
15+
But it could be more general than that exact scenario.

src/internals/conflict_resolution.md

Lines changed: 47 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,47 @@
1+
# Conflict resolution
2+
3+
As stated before, a conflict is a satisfied incompatibility
4+
that we detected in the unit propagation loop.
5+
The goal of conflict resolution is to backtrack the partial solution
6+
such that we have the following guarantees:
7+
8+
1. The root cause incompatibility of the conflict is almost satisfied
9+
(such that we can continue unit propagation).
10+
2. The following derivations will be different than before conflict resolution.
11+
12+
Let the "satisfier" be the earliest assignment in the partial solution
13+
making the incompatibility fully satisfied by the partial solution up to that point.
14+
We know that we want to backtrack the partial solution at least previous to that assignment.
15+
Backtracking only makes sense if done at decision levels frontiers.
16+
As such the conflictual incompatibility can only become "almost satisfied"
17+
if there is only one package term related to incompatibility satisfaction
18+
at the decision level of that satisfier.
19+
When the satisfier is a decision this is trivial since all previous assignments
20+
are of lower decision levels.
21+
When the satisfier is a derivation however we need to check that property.
22+
We do that by computing the "previous satisfier" decision level.
23+
The previous satisfier is (if it exists) the earliest assignment
24+
previous to the satisfier such that the partial solution up to that point,
25+
plus the satisfier, makes the incompatibility satisfied.
26+
Once we found it, we know that property (1) is guaranteed as long as
27+
we backtrack to a decision level between the one of the previous satisfier
28+
and the one of the satisfier, as long as these are different.
29+
30+
If the satisfier and previous satisfier decisions levels are the same,
31+
we cannot guarantee (1) for that incompatibility after backtracking.
32+
Therefore, the key of conflict resolution is to derive a new incompatibility
33+
for which we will be able to guarantee (1).
34+
And we have seen how to do that with the
35+
[rule of resolution](incompatibilities.md#rule-of-resolution).
36+
We will derive a new incompatibility called the "prior cause"
37+
as the resolvent of the current incompatibility and
38+
the incompatibility which is the cause of the satisfier.
39+
If necessary, we repeat that procedure until finding an incompatibility,
40+
called the "root cause" for which we can guarantee that it will
41+
be almost satisfied after backtracking (1).
42+
43+
Now the question is where do we cut?
44+
Is there a reason we cut at the previous satisfier decision level?
45+
Is it to guarantee (2)? Would that not be guaranteed if we picked
46+
another decision level? Is it because backtracking further back
47+
will reduce the number of potential conflicts?

src/internals/incompatibilities.md

Lines changed: 116 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,116 @@
1+
# Incompatibilities
2+
3+
4+
## Definition
5+
6+
Incompatibilities are called "nogoods" in [CDNL-ASP][ass] terminology.
7+
**An incompatibility is a [conjunction][conjunction] of package terms that must
8+
be evaluated false**, meaning at least one package term must be evaluated false.
9+
Otherwise we say that the incompatibility has been "satisfied".
10+
Satisfied incompatibilities represent conflicts and thus
11+
the goal of the PubGrub algorithm is to build a solution
12+
such that none of the produced incompatibilities are ever satisfied.
13+
If one incompatibility becomes satisfied at some point,
14+
the algorithm finds the root cause of it and backtracks the partial solution
15+
before the decision at the origin of that root cause.
16+
17+
> Remark: incompatibilities (nogoods) are the opposite of clauses
18+
> in traditional conflict-driven clause learning ([CDCL][cdcl])
19+
> which are disjunctions of literals that must be evaluated true,
20+
> so have at least one literal evaluated true.
21+
>
22+
> The gist of CDCL is that it builds a solution to satisfy a
23+
> [conjunctive normal form][cnf] (conjunction of clauses) while
24+
> CDNL builds a solution to unsatisfy a [disjunctive normal form][dnf]
25+
> (disjunction of nogoods).
26+
>
27+
> In addition, PubGrub is a lazy CDNL algorithm since the disjunction of nogoods
28+
> (incompatibilities) is built on the fly with the solution.
29+
30+
[ass]: https://www.sciencedirect.com/science/article/pii/S0004370212000409
31+
[cdcl]: https://en.wikipedia.org/wiki/Conflict-driven_clause_learning
32+
[conjunction]: https://en.wikipedia.org/wiki/Logical_conjunction
33+
[cnf]: https://en.wikipedia.org/wiki/Conjunctive_normal_form
34+
[dnf]: https://en.wikipedia.org/wiki/Disjunctive_normal_form
35+
36+
In this guide, we will note incompatibilities with curly braces.
37+
An incompatibility containing one term \\(T_a\\) for package \\(a\\)
38+
and one term \\(T_b\\) for package \\(b\\) will be noted
39+
40+
\\[ \\{ a: T_a, b: T_b \\}. \\]
41+
42+
> Remark: in a more "mathematical" setting, we would probably have noted
43+
> \\( T_a \land T_b \\), but the chosen notation maps well
44+
> with the representation of incompatibilities as hash maps.
45+
46+
47+
## Properties
48+
49+
**Packages only appear once in an incompatibility**.
50+
Since an incompatibility is a conjunction,
51+
multiple terms for the same package are merged with the intersection of those terms.
52+
53+
**Terms that are always satisfied can be removed from an incompatibility**.
54+
We previously explained that the term \\( \neg [\varnothing] \\) is always evaluated true.
55+
As a consequence, it can safely be removed from the conjunction of terms that is the incompatibility.
56+
57+
\\[ \\{ a: T_a, b: T_b, c: \neg [\varnothing] \\} = \\{ a: T_a, b: T_b \\} \\]
58+
59+
**Dependencies can be expressed as incompatibilities**.
60+
Saying that versions in range \\( r_a \\) of package \\( a \\)
61+
depend on versions in range \\( r_b \\) of package \\( b \\)
62+
can be expressed by the incompatibility
63+
64+
\\[ \\{ a: [r_a], b: \neg [r_b] \\}. \\]
65+
66+
67+
## Unit propagation
68+
69+
If all terms but one of an incompatibility are satisfied by a partial solution,
70+
we can deduce that the remaining unsatisfied term must be evaluated false.
71+
We can thus derive a new unit term for the partial solution
72+
which is the negation of the remaining unsatisfied term of the incompatibility.
73+
For example, if we have the incompatibility
74+
\\( \\{ a: T_a, b: T_b, c: T_c \\} \\)
75+
and if \\( T_a \\) and \\( T_b \\) are satisfied by terms in the partial solution
76+
then we can derive that the term \\( \overline{T_c} \\) can be added for package \\( c \\)
77+
in the partial solution.
78+
79+
80+
## Rule of resolution
81+
82+
Intuitively, we are able to deduce things such as if package \\( a \\)
83+
depends and package \\( b \\) which depends on package \\( c \\),
84+
then \\( a \\) depends on \\( c \\).
85+
With incompatibilities, we would note
86+
87+
\\[ \\{ a: T_a, b: \overline{T_b} \\}, \quad
88+
\\{ b: T_b, c: \overline{T_c} \\} \quad
89+
\Rightarrow \quad \\{ a: T_a, c: \overline{T_c} \\}. \\]
90+
91+
This is the simplified version of the rule of resolution.
92+
For the generalization, let's reuse the "more mathematical" notation of conjunctions
93+
for incompatibilities \\( T_a \land T_b \\) and the above rule would be
94+
95+
\\[ T_a \land \overline{T_b}, \quad
96+
T_b \land \overline{T_c} \quad
97+
\Rightarrow \quad T_a \land \overline{T_c}. \\]
98+
99+
In fact, the above rule can also be expressed as follows
100+
101+
\\[ T_a \land \overline{T_b}, \quad
102+
T_b \land \overline{T_c} \quad
103+
\Rightarrow \quad T_a \land (\overline{T_b} \lor T_b) \land \overline{T_c} \\]
104+
105+
since for any term \\( T \\), the disjunction \\( T \lor \overline{T} \\) is always true.
106+
In general, for any two incompatibilities \\( T_a^1 \land T_b^1 \land \cdots \land T_z^1 \\)
107+
and \\( T_a^2 \land T_b^2 \land \cdots \land T_z^2 \\) we can deduce a third,
108+
called the resolvent whose expression is
109+
110+
\\[ (T_a^1 \lor T_a^2) \land (T_b^1 \land T_b^2) \land \cdots \land (T_z^1 \land T_z^2). \\]
111+
112+
In that expression, only one pair of package terms is regrouped as a union (a disjunction),
113+
the others are all intersected (conjunction).
114+
If a term for a package does not exist in one incompatibility,
115+
it can safely be replaced by the term \\( \neg [\varnothing] \\) in the expression above
116+
as we have already explained before.

src/internals/intro.md

Lines changed: 45 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,45 @@
1+
# Internals of the PubGrub algorithm
2+
3+
4+
For an alternative / complementary explanation of the PubGrub algorithm,
5+
you can read the detailed description of the solver
6+
provided by the original PubGrub author in the GitHub repository
7+
of the dart package manager [pub][pub].
8+
9+
PubGrub is an algorithm inspired by conflict-driven nogood learning (CDNL-ASP),
10+
an [approach presented by Gabser, Kaufmann and Schaub in 2012][ass].
11+
The approach consists in iteratively taking decisions
12+
(here picking a package and version) until reaching a conflict.
13+
At that point it records a nogood (an "incompatibility" in PubGrub terminology)
14+
describing the root cause of the conflict
15+
and backtracks to a state previous to the decision leading to that conflict.
16+
CDNL has many similarities with [CDCL][cdcl] (conflict-driven clause learning)
17+
with the difference that nogoods are conjunctions
18+
while clauses are disjunctions of literals.
19+
More documentation of their approach is available on [their website][potassco].
20+
21+
At any moment, the PubGrub algorithm holds a state composed of two things,
22+
(1) a partial solution and (2) a set of incompatibilities.
23+
The partial solution (1) is a chronological list of "assignments",
24+
which are either decisions taken or version constraints
25+
for packages where no decision was made yet.
26+
The set of incompatibilities (2) is an ever-growing collection of
27+
incompatibilities.
28+
We will describe them in more details later but simply put,
29+
an incompatibility describes packages that are dependent or incompatible,
30+
that is packages that must be present at the same time
31+
or that cannot be present at the same time in the solution.
32+
33+
Incompatibilities express facts, and as such are always valid.
34+
Therefore, the set of incompatibilities is never backtracked,
35+
only growing and recording new knowledge along the way.
36+
In contrast, the partial solution contains decisions and deductions
37+
(called "derivations" in PubGrub terminology),
38+
that are dependent on every decision made.
39+
Therefore, PubGrub needs to be able to backtrack the partial solution
40+
to an older state when there is a conflict.
41+
42+
[pub]: https://github.com/dart-lang/pub/blob/master/doc/solver.md
43+
[ass]: https://www.sciencedirect.com/science/article/pii/S0004370212000409
44+
[cdcl]: https://en.wikipedia.org/wiki/Conflict-driven_clause_learning
45+
[potassco]: https://potassco.org/book/

src/internals/overview.md

Lines changed: 40 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,40 @@
1+
# Overview of the algorithm
2+
3+
4+
## Solver main loop
5+
6+
The solver runs in a loop with the following steps:
7+
8+
1. Perform unit propagation on the currently selected package.
9+
2. Make a decision: pick a new package and version
10+
compatible with the current state of the partial solution.
11+
3. Retrieve dependencies for the newly selected package
12+
and transform those into incompatibilities.
13+
14+
At any point within the loop, the algorithm may fail
15+
due to an impossibility to solve a conflict or
16+
an error occuring while trying to retrieve dependencies.
17+
When there is no more decision to be made,
18+
the algorithm returns the decisions from the partial solution.
19+
20+
21+
## Unit propagation overview
22+
23+
Unit propagation is the core mechanism of the algorithm.
24+
For the currently selected package,
25+
unit propagation aims at deriving new constraints
26+
(called "terms" in PubGrub and "literals" in CDNL terminology),
27+
from all incompatibilities referring to that package.
28+
For example, if an incompatibility specifies that packages a and b
29+
are incompatible, and if we just made a decision for package a,
30+
then we can derive a term specifying that package b should not appear in the solution.
31+
32+
While browsing incompatibilities, we may stumble upon one that is already "satisfied"
33+
by the current state of the partial solution.
34+
In our previous example, that would be the case if
35+
we had previously already made a decision for package b
36+
(in practice that exact situation could not happen but let's leave that subtlety for later).
37+
If an incompatibility is satisfied, we call that a conflict and must perform conflict resolution
38+
to backtrack the partial solution to a state previous to that conflict.
39+
Details on conflict resolution are presented in its
40+
[dedicated section](./conflict_resolution.md).

src/internals/partial_solution.md

Lines changed: 40 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,40 @@
1+
# Partial solution
2+
3+
The partial solution is the part of PubGrub state holding all the decisions taken
4+
and the derivations computed from unit propagation on almost satisfied incompatibilities.
5+
We regroup decisions and derivations under the term "assignment".
6+
7+
The partial solution must be backtrackable when conflicts are detected.
8+
For this reason all assignments are recorded with their associated decision level.
9+
The decision level of an assignment is a counter for the number of decisions
10+
that have already been taken (including that one if it is a decision).
11+
If we represent all assignments as a chronological vec, they would look like follows:
12+
13+
```txt
14+
[ (0, root_derivation),
15+
(1, root_decision),
16+
(1, derivation_1a),
17+
(1, derivation_1b),
18+
...,
19+
(2, decision_2),
20+
(2, derivation_2a),
21+
...,
22+
]
23+
```
24+
25+
The partial solution must also enable efficient evaluation of incompatibilities
26+
in the unit propagation loop.
27+
For this, we need to have efficient access to all assignments
28+
referring to the packages present in an incompatibility.
29+
30+
To enable both efficient backtracking and efficient access to specific
31+
package assignments, the current implementation holds a dual representation
32+
of the the partial solution.
33+
One is called `history` and keeps dated (with decision levels) assignments
34+
in an ordered growing vec.
35+
The other is called `memory` and organizes assignments in a hashmap
36+
where they are regrouped by packages which are the hashmap keys.
37+
It would be interresting to see how the partial solution is stored
38+
in other implementations of PubGrub such as the one in [dart pub][pub].
39+
40+
[pub]: https://github.com/dart-lang/pub

0 commit comments

Comments
 (0)