pubgrub-rs
diff --git a/‎.gitignore
Lines changed: 1 addition & 0 deletions b/‎.gitignore
diff --git a/‎book.toml
Lines changed: 9 additions & 0 deletions b/‎book.toml
diff --git a/‎src/SUMMARY.md
Lines changed: 21 additions & 0 deletions b/‎src/SUMMARY.md
diff --git a/‎src/contributing.md
Lines changed: 15 additions & 0 deletions b/‎src/contributing.md
diff --git a/‎src/internals/conflict_resolution.md
Lines changed: 47 additions & 0 deletions b/‎src/internals/conflict_resolution.md
diff --git a/‎src/internals/incompatibilities.md
Lines changed: 116 additions & 0 deletions b/‎src/internals/incompatibilities.md
diff --git a/‎src/internals/intro.md
Lines changed: 45 additions & 0 deletions b/‎src/internals/intro.md
diff --git a/‎src/internals/overview.md
Lines changed: 40 additions & 0 deletions b/‎src/internals/overview.md
diff --git a/‎src/internals/partial_solution.md
Lines changed: 40 additions & 0 deletions b/‎src/internals/partial_solution.md
@@ -0,0 +1 @@
+book
@@ -0,0 +1,9 @@
+[book]
+authors = ["Matthieu Pizenberg"]
+language = "en"
+multilingual = false
+src = "src"
+title = "PubGrub Guide"
+
+[output.html]
+mathjax-support = true
@@ -0,0 +1,21 @@
+# Summary
+
+- [Version solving](./version_solving.md)
+- [Using the pubgrub crate](./pubgrub_crate/intro.md)
+  - [Basic example with OfflineDependencyProvider](./pubgrub_crate/offline_dep_provider.md)
+  - [Writing your own dependency provider](./pubgrub_crate/custom_dep_provider.md)
+  - [Caching dependencies in a DependencyProvider](./pubgrub_crate/caching.md)
+  - [Strategical decision making in a DependencyProvider](./pubgrub_crate/strategy.md)
+  - [Solution and error reporting](./pubgrub_crate/solution.md)
+  - [Writing your own error reporting logic](./pubgrub_crate/custom_report.md)
+- [Internals of the PubGrub algorithm](./internals/intro.md)
+  - [Overview of the algorithm](./internals/overview.md)
+  - [Terms](./internals/terms.md)
+  - [Incompatibilities](./internals/incompatibilities.md)
+  - [Partial solution](./internals/partial_solution.md)
+  - [Conflict resolution](./internals/conflict_resolution.md)
+  - [Building a report tree](./internals/report_tree.md)
+- [Testing and benchmarking](./testing/intro.md)
+  - [Property testing](./testing/property.md)
+  - [Benchmarking](./testing/benchmarking.md)
+- [How can I contribute? Here are some ideas](./contributing.md)
@@ -0,0 +1,15 @@
+# How can I contribute? Here are some ideas
+
+- Use it!
+  There is still quite a lot of work left on custom
+  dependency providers, so just using the crate, building
+  your own dependency provider for your favorite programming language,
+  and letting us know how it turned out
+  would already be amazing feedback!
+
+- Non-failing extension for multiple versions.
+  Currently, the solver allows only one version per package.
+  In some contexts, however, we may want the solver not to fail when multiple versions are required,
+  and to return multiple versions per package instead.
+  One example is allowing multiple major versions of the same crate,
+  but the extension could be more general than that exact scenario.
@@ -0,0 +1,47 @@
+# Conflict resolution
+
+As stated before, a conflict is a satisfied incompatibility
+that we detected in the unit propagation loop.
+The goal of conflict resolution is to backtrack the partial solution
+such that we have the following guarantees:
+
+1. The root cause incompatibility of the conflict is almost satisfied
+   (such that we can continue unit propagation).
+2. The derivations that follow will differ from those made before conflict resolution.
+
+Let the "satisfier" be the earliest assignment in the partial solution
+making the incompatibility fully satisfied by the partial solution up to that point.
+We know that we must backtrack the partial solution to at least before that assignment.
+Backtracking only makes sense at decision level boundaries.
+As such, the conflicting incompatibility can only become "almost satisfied"
+if there is only one package term related to incompatibility satisfaction
+at the decision level of that satisfier.
+When the satisfier is a decision this is trivial since all previous assignments
+are of lower decision levels.
+When the satisfier is a derivation however we need to check that property.
+We do that by computing the "previous satisfier" decision level.
+The previous satisfier is (if it exists) the earliest assignment
+previous to the satisfier such that the partial solution up to that point,
+plus the satisfier, makes the incompatibility satisfied.
+Once we have found it, we know that property (1) is guaranteed as long as
+we backtrack to a decision level between that of the previous satisfier
+and that of the satisfier, provided these are different.
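
The search for the satisfier and the previous satisfier can be sketched as below. This is a simplified illustration under stated assumptions, not the crate's implementation: `Term` is modeled over finite sets of `u32` versions, an assignment is a `(package, term, decision level)` tuple, and the names `Assignment` and `earliest_satisfier` are hypothetical.

```rust
use std::collections::{BTreeSet, HashMap};

type Package = &'static str;

/// Simplified term over a finite set of version numbers (assumption).
#[derive(Clone, Debug, PartialEq)]
enum Term {
    /// The package is selected with a version inside the set.
    Positive(BTreeSet<u32>),
    /// The package is absent, or selected with a version outside the set.
    Negative(BTreeSet<u32>),
}

impl Term {
    fn intersection(&self, other: &Term) -> Term {
        use Term::{Negative, Positive};
        match (self, other) {
            (Positive(a), Positive(b)) => Positive(a.intersection(b).copied().collect()),
            (Positive(a), Negative(b)) | (Negative(b), Positive(a)) => {
                Positive(a.difference(b).copied().collect())
            }
            (Negative(a), Negative(b)) => Negative(a.union(b).copied().collect()),
        }
    }

    /// `self` satisfies `other` when intersecting with `other` changes nothing.
    fn subset_of(&self, other: &Term) -> bool {
        self.intersection(other) == *self
    }
}

/// One entry of the partial solution: (package, term, decision level).
type Assignment = (Package, Term, u32);

/// Index of the earliest assignment such that `seed` (if any) plus all
/// assignments up to and including that index satisfy every term of `incompat`.
fn earliest_satisfier(
    assignments: &[Assignment],
    incompat: &HashMap<Package, Term>,
    seed: Option<&Assignment>,
) -> Option<usize> {
    // Accumulated knowledge per package: the intersection of its terms so far.
    let mut known: HashMap<Package, Term> = HashMap::new();
    if let Some((pkg, term, _)) = seed {
        known.insert(*pkg, term.clone());
    }
    for (idx, (pkg, term, _)) in assignments.iter().enumerate() {
        let merged = match known.get(pkg) {
            Some(previous) => previous.intersection(term),
            None => term.clone(),
        };
        known.insert(*pkg, merged);
        let satisfied = incompat
            .iter()
            .all(|(p, t)| known.get(p).map_or(false, |k| k.subset_of(t)));
        if satisfied {
            return Some(idx);
        }
    }
    None
}
```

The satisfier is `earliest_satisfier(&assignments, &incompat, None)`. If it sits at index `s`, the previous satisfier is obtained by seeding the search with that assignment and scanning only what precedes it: `earliest_satisfier(&assignments[..s], &incompat, Some(&assignments[s]))`. For instance, with assignments on `a` (level 1), `b` (level 2) and `c` (level 2), and an incompatibility over `a` and `c` whose terms match those assignments, the satisfier is the assignment on `c` and the previous satisfier is the assignment on `a`, so their decision levels differ.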
+
+If the satisfier and previous satisfier decision levels are the same,
+we cannot guarantee (1) for that incompatibility after backtracking.
+Therefore, the key to conflict resolution is to derive a new incompatibility
+for which we will be able to guarantee (1).
+And we have seen how to do that with the
+[rule of resolution](incompatibilities.md#rule-of-resolution).
+We will derive a new incompatibility called the "prior cause"
+as the resolvent of the current incompatibility and
+the incompatibility which is the cause of the satisfier.
+If necessary, we repeat that procedure until we find an incompatibility,
+called the "root cause", for which we can guarantee that it will
+be almost satisfied after backtracking (1).
+
+Now the question is where do we cut?
+Is there a reason we cut at the previous satisfier decision level?
+Is it to guarantee (2)? Would that not be guaranteed if we picked
+another decision level? Is it because backtracking further back
+will reduce the number of potential conflicts?
@@ -0,0 +1,116 @@
+# Incompatibilities
+
+
+## Definition
+
+Incompatibilities are called "nogoods" in [CDNL-ASP][ass] terminology.
+**An incompatibility is a [conjunction][conjunction] of package terms that must
+be evaluated false**, meaning at least one package term must be evaluated false.
+Otherwise we say that the incompatibility has been "satisfied".
+Satisfied incompatibilities represent conflicts and thus
+the goal of the PubGrub algorithm is to build a solution
+such that none of the produced incompatibilities are ever satisfied.
+If one incompatibility becomes satisfied at some point,
+the algorithm finds the root cause of it and backtracks the partial solution
+before the decision at the origin of that root cause.
+
+> Remark: incompatibilities (nogoods) are the opposite of clauses
+> in traditional conflict-driven clause learning ([CDCL][cdcl])
+> which are disjunctions of literals that must be evaluated true,
+> so have at least one literal evaluated true.
+>
+> The gist of CDCL is that it builds a solution to satisfy a
+> [conjunctive normal form][cnf] (conjunction of clauses) while
+> CDNL builds a solution to unsatisfy a [disjunctive normal form][dnf]
+> (disjunction of nogoods).
+>
+> In addition, PubGrub is a lazy CDNL algorithm since the disjunction of nogoods
+> (incompatibilities) is built on the fly with the solution.
+
+[ass]: https://www.sciencedirect.com/science/article/pii/S0004370212000409
+[cdcl]: https://en.wikipedia.org/wiki/Conflict-driven_clause_learning
+[conjunction]: https://en.wikipedia.org/wiki/Logical_conjunction
+[cnf]: https://en.wikipedia.org/wiki/Conjunctive_normal_form
+[dnf]: https://en.wikipedia.org/wiki/Disjunctive_normal_form
+
+In this guide, we will denote incompatibilities with curly braces.
+An incompatibility containing one term \\(T_a\\) for package \\(a\\)
+and one term \\(T_b\\) for package \\(b\\) will be written
+
+\\[ \\{ a: T_a, b: T_b \\}. \\]
+
+> Remark: in a more "mathematical" setting, we would probably have written
+> \\( T_a \land T_b \\), but the chosen notation maps well
+> to the representation of incompatibilities as hash maps.
+
+
+## Properties
+
+**Packages only appear once in an incompatibility**.
+Since an incompatibility is a conjunction,
+multiple terms for the same package are merged with the intersection of those terms.
+
+**Terms that are always satisfied can be removed from an incompatibility**.
+We previously explained that the term \\( \neg [\varnothing] \\) is always evaluated true.
+As a consequence, it can safely be removed from the conjunction of terms that is the incompatibility.
+
+\\[ \\{ a: T_a, b: T_b, c: \neg [\varnothing] \\} = \\{ a: T_a, b: T_b \\} \\]
+
+**Dependencies can be expressed as incompatibilities**.
+Saying that versions in range \\( r_a \\) of package \\( a \\)
+depend on versions in range \\( r_b \\) of package \\( b \\)
+can be expressed by the incompatibility
+
+\\[ \\{ a: [r_a], b: \neg [r_b] \\}. \\]
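
As a minimal sketch of this encoding, the code below builds the incompatibility for a dependency and checks when a selection of versions satisfies it (that is, when the selection is a conflict). It assumes a simplified `Term` over finite sets of `u32` versions and two distinct packages; the names `from_dependency`, `evaluate` and `is_satisfied` are hypothetical and not taken from the crate.

```rust
use std::collections::{BTreeSet, HashMap};

type Package = &'static str;

/// Simplified term over a finite set of version numbers (assumption).
#[derive(Clone, Debug, PartialEq)]
enum Term {
    /// [r]: the package is selected with a version inside `r`.
    Positive(BTreeSet<u32>),
    /// not [r]: the package is absent, or selected with a version outside `r`.
    Negative(BTreeSet<u32>),
}

impl Term {
    /// Evaluate the term against the version selected for its package
    /// (`None` when the package is not part of the selection).
    fn evaluate(&self, selected: Option<u32>) -> bool {
        match (self, selected) {
            (Term::Positive(r), Some(v)) => r.contains(&v),
            (Term::Positive(_), None) => false,
            (Term::Negative(r), Some(v)) => !r.contains(&v),
            (Term::Negative(_), None) => true,
        }
    }
}

/// Incompatibilities as hash maps from package to term.
type Incompatibility = HashMap<Package, Term>;

/// Build { a: [r_a], b: not [r_b] } for
/// "versions r_a of a depend on versions r_b of b".
fn from_dependency(
    a: Package,
    r_a: BTreeSet<u32>,
    b: Package,
    r_b: BTreeSet<u32>,
) -> Incompatibility {
    HashMap::from([(a, Term::Positive(r_a)), (b, Term::Negative(r_b))])
}

/// An incompatibility is satisfied (a conflict) when all its terms are true.
fn is_satisfied(incompat: &Incompatibility, selection: &HashMap<Package, u32>) -> bool {
    incompat
        .iter()
        .all(|(pkg, term)| term.evaluate(selection.get(pkg).copied()))
}
```

With `from_dependency("a", {1, 2}, "b", {3})`, selecting `a = 1` and `b = 3` does not satisfy the incompatibility, while selecting `a = 1` with `b = 4`, or `a = 2` without any `b`, does: in both cases the dependency is violated.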
+
+
+## Unit propagation
+
+If all terms but one of an incompatibility are satisfied by a partial solution,
+we can deduce that the remaining unsatisfied term must be evaluated false.
+We can thus derive a new unit term for the partial solution
+which is the negation of the remaining unsatisfied term of the incompatibility.
+For example, if we have the incompatibility
+\\( \\{ a: T_a, b: T_b, c: T_c \\} \\)
+and if \\( T_a \\) and \\( T_b \\) are satisfied by terms in the partial solution
+then we can derive that the term \\( \overline{T_c} \\) can be added for package \\( c \\)
+in the partial solution.
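
The derivation step can be sketched as follows. This is an illustration only: `Term` is a simplified model over finite sets of `u32` versions, the partial solution is reduced to one accumulated term per package, and `unit_propagate` is a hypothetical name. A package with no assignment yet is treated as having the term \\( \neg [\varnothing] \\).

```rust
use std::collections::{BTreeSet, HashMap};

type Package = &'static str;

/// Simplified term over a finite set of version numbers (assumption).
#[derive(Clone, Debug, PartialEq)]
enum Term {
    Positive(BTreeSet<u32>),
    Negative(BTreeSet<u32>),
}

impl Term {
    /// The term "not [empty set]", which is always true.
    fn any() -> Term {
        Term::Negative(BTreeSet::new())
    }

    fn negate(&self) -> Term {
        match self {
            Term::Positive(s) => Term::Negative(s.clone()),
            Term::Negative(s) => Term::Positive(s.clone()),
        }
    }

    fn intersection(&self, other: &Term) -> Term {
        use Term::{Negative, Positive};
        match (self, other) {
            (Positive(a), Positive(b)) => Positive(a.intersection(b).copied().collect()),
            (Positive(a), Negative(b)) | (Negative(b), Positive(a)) => {
                Positive(a.difference(b).copied().collect())
            }
            (Negative(a), Negative(b)) => Negative(a.union(b).copied().collect()),
        }
    }

    /// `self` satisfies `other` when intersecting with `other` changes nothing.
    fn subset_of(&self, other: &Term) -> bool {
        self.intersection(other) == *self
    }
}

/// If exactly one term of `incompat` is not satisfied by `partial`,
/// return its package together with the negation of that term.
fn unit_propagate(
    incompat: &HashMap<Package, Term>,
    partial: &HashMap<Package, Term>,
) -> Option<(Package, Term)> {
    let mut unsatisfied = incompat.iter().filter(|(pkg, term)| {
        let known = partial.get(*pkg).cloned().unwrap_or_else(Term::any);
        !known.subset_of(term)
    });
    match (unsatisfied.next(), unsatisfied.next()) {
        (Some((pkg, term)), None) => Some((*pkg, term.negate())),
        _ => None,
    }
}
```

For brevity, the sketch does not distinguish a remaining term that is merely undecided from one that is already contradicted by the partial solution; in the latter case the derived term is valid but redundant.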
+
+
+## Rule of resolution
+
+Intuitively, we are able to deduce things such as: if package \\( a \\)
+depends on package \\( b \\), which depends on package \\( c \\),
+then \\( a \\) depends on \\( c \\).
+With incompatibilities, we would write
+
+\\[               \\{ a: T_a, b: \overline{T_b} \\}, \quad
+                  \\{ b: T_b, c: \overline{T_c} \\}  \quad
+\Rightarrow \quad \\{ a: T_a, c: \overline{T_c} \\}. \\]
+
+This is the simplified version of the rule of resolution.
+For the generalization, let's reuse the "more mathematical" notation of conjunctions
+for incompatibilities \\( T_a \land T_b \\) and the above rule would be
+
+\\[               T_a \land \overline{T_b}, \quad
+                  T_b \land \overline{T_c}  \quad
+\Rightarrow \quad T_a \land \overline{T_c}. \\]
+
+In fact, the above rule can also be expressed as follows
+
+\\[               T_a \land \overline{T_b}, \quad
+                  T_b \land \overline{T_c}  \quad
+\Rightarrow \quad T_a \land (\overline{T_b} \lor T_b) \land \overline{T_c} \\]
+
+since for any term \\( T \\), the disjunction \\( T \lor \overline{T} \\) is always true.
+In general, for any two incompatibilities \\( T_a^1 \land T_b^1 \land \cdots \land T_z^1 \\)
+and \\( T_a^2 \land T_b^2 \land \cdots \land T_z^2 \\) we can deduce a third,
+called the resolvent, whose expression is
+
+\\[ (T_a^1 \lor T_a^2) \land (T_b^1 \land T_b^2) \land \cdots \land (T_z^1 \land T_z^2). \\]
+
+In that expression, only one pair of package terms is combined as a union (a disjunction);
+the others are all intersected (a conjunction).
+If a term for a package does not exist in one incompatibility,
+it can safely be replaced by the term \\( \neg [\varnothing] \\) in the expression above
+as we have already explained before.
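
The general rule can be sketched in code as below. This is a hedged illustration rather than the crate's implementation: `Term` is a simplified model over finite sets of `u32` versions, incompatibilities are ordered maps from package to term, and `resolvent` and its `pivot` parameter (the package whose terms are united) are hypothetical names.

```rust
use std::collections::{BTreeMap, BTreeSet};

type Package = &'static str;

/// Simplified term over a finite set of version numbers (assumption).
#[derive(Clone, Debug, PartialEq)]
enum Term {
    Positive(BTreeSet<u32>),
    Negative(BTreeSet<u32>),
}

impl Term {
    /// The term "not [empty set]", which is always true.
    fn any() -> Term {
        Term::Negative(BTreeSet::new())
    }

    fn negate(&self) -> Term {
        match self {
            Term::Positive(s) => Term::Negative(s.clone()),
            Term::Negative(s) => Term::Positive(s.clone()),
        }
    }

    fn intersection(&self, other: &Term) -> Term {
        use Term::{Negative, Positive};
        match (self, other) {
            (Positive(a), Positive(b)) => Positive(a.intersection(b).copied().collect()),
            (Positive(a), Negative(b)) | (Negative(b), Positive(a)) => {
                Positive(a.difference(b).copied().collect())
            }
            (Negative(a), Negative(b)) => Negative(a.union(b).copied().collect()),
        }
    }

    /// Union computed with De Morgan's law: A or B = not (not A and not B).
    fn union(&self, other: &Term) -> Term {
        self.negate().intersection(&other.negate()).negate()
    }
}

type Incompat = BTreeMap<Package, Term>;

/// Resolvent of two incompatibilities: terms of `pivot` are united,
/// terms of every other package are intersected.
fn resolvent(i1: &Incompat, i2: &Incompat, pivot: &str) -> Incompat {
    let packages: BTreeSet<Package> = i1.keys().chain(i2.keys()).copied().collect();
    let mut result = Incompat::new();
    for pkg in packages {
        // A package missing from one incompatibility counts as "any".
        let t1 = i1.get(pkg).cloned().unwrap_or_else(Term::any);
        let t2 = i2.get(pkg).cloned().unwrap_or_else(Term::any);
        let merged = if pkg == pivot {
            t1.union(&t2)
        } else {
            t1.intersection(&t2)
        };
        // Terms that are always satisfied are dropped from the result.
        if merged != Term::any() {
            result.insert(pkg, merged);
        }
    }
    result
}
```

Applied to the simplified example above with `b` as pivot, the union of the two terms for `b` is always true and is dropped, leaving only the terms for `a` and `c`.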
@@ -0,0 +1,45 @@
+# Internals of the PubGrub algorithm
+
+
+For an alternative or complementary explanation of the PubGrub algorithm,
+you can read the detailed description of the solver
+provided by the original PubGrub author in the GitHub repository
+of the Dart package manager [pub][pub].
+
+PubGrub is an algorithm inspired by conflict-driven nogood learning (CDNL-ASP),
+an [approach presented by Gebser, Kaufmann and Schaub in 2012][ass].
+The approach consists of iteratively making decisions
+(here picking a package and version) until reaching a conflict.
+At that point it records a nogood (an "incompatibility" in PubGrub terminology)
+describing the root cause of the conflict
+and backtracks to a state previous to the decision leading to that conflict.
+CDNL has many similarities with [CDCL][cdcl] (conflict-driven clause learning)
+with the difference that nogoods are conjunctions
+while clauses are disjunctions of literals.
+More documentation of their approach is available on [their website][potassco].
+
+At any moment, the PubGrub algorithm holds a state composed of two things,
+(1) a partial solution and (2) a set of incompatibilities.
+The partial solution (1) is a chronological list of "assignments",
+which are either decisions taken or version constraints
+for packages where no decision was made yet.
+The set of incompatibilities (2) is an ever-growing collection of
+incompatibilities.
+We will describe them in more detail later but, simply put,
+an incompatibility describes packages that are dependent or incompatible,
+that is, packages that must be present at the same time
+or that cannot be present at the same time in the solution.
+
+Incompatibilities express facts, and as such are always valid.
+Therefore, the set of incompatibilities is never backtracked,
+only growing and recording new knowledge along the way.
+In contrast, the partial solution contains decisions and deductions
+(called "derivations" in PubGrub terminology),
+that are dependent on every decision made.
+Therefore, PubGrub needs to be able to backtrack the partial solution
+to an older state when there is a conflict.
+
+[pub]: https://github.com/dart-lang/pub/blob/master/doc/solver.md
+[ass]: https://www.sciencedirect.com/science/article/pii/S0004370212000409
+[cdcl]: https://en.wikipedia.org/wiki/Conflict-driven_clause_learning
+[potassco]: https://potassco.org/book/
@@ -0,0 +1,40 @@
+# Overview of the algorithm
+
+
+## Solver main loop
+
+The solver runs in a loop with the following steps:
+
+1. Perform unit propagation on the currently selected package.
+2. Make a decision: pick a new package and version
+   compatible with the current state of the partial solution.
+3. Retrieve dependencies for the newly selected package
+   and transform those into incompatibilities.
+
+At any point within the loop, the algorithm may fail
+because a conflict cannot be resolved or
+because an error occurred while trying to retrieve dependencies.
+When there are no more decisions to be made,
+the algorithm returns the decisions from the partial solution.
+
+
+## Unit propagation overview
+
+Unit propagation is the core mechanism of the algorithm.
+For the currently selected package,
+unit propagation aims at deriving new constraints
+(called "terms" in PubGrub and "literals" in CDNL terminology),
+from all incompatibilities referring to that package.
+For example, if an incompatibility specifies that packages a and b
+are incompatible, and if we just made a decision for package a,
+then we can derive a term specifying that package b should not appear in the solution.
+
+While browsing incompatibilities, we may stumble upon one that is already "satisfied"
+by the current state of the partial solution.
+In our previous example, that would be the case if
+we had previously already made a decision for package b
+(in practice that exact situation could not happen but let's leave that subtlety for later).
+If an incompatibility is satisfied, we call that a conflict and must perform conflict resolution
+to backtrack the partial solution to a state previous to that conflict.
+Details on conflict resolution are presented in its
+[dedicated section](./conflict_resolution.md).
@@ -0,0 +1,40 @@
+# Partial solution
+
+The partial solution is the part of PubGrub state holding all the decisions taken
+and the derivations computed from unit propagation on almost satisfied incompatibilities.
+We group decisions and derivations under the term "assignment".
+
+The partial solution must be backtrackable when conflicts are detected.
+For this reason all assignments are recorded with their associated decision level.
+The decision level of an assignment is a counter for the number of decisions
+that have already been taken (including that one if it is a decision).
+If we represent all assignments as a chronological vec, it would look as follows:
+
+```txt
+[ (0, root_derivation),
+  (1, root_decision),
+  (1, derivation_1a),
+  (1, derivation_1b),
+  ...,
+  (2, decision_2),
+  (2, derivation_2a),
+  ...,
+]
+```
+
+The partial solution must also enable efficient evaluation of incompatibilities
+in the unit propagation loop.
+For this, we need to have efficient access to all assignments
+referring to the packages present in an incompatibility.
+
+To enable both efficient backtracking and efficient access to specific
+package assignments, the current implementation holds a dual representation
+of the partial solution.
+One is called `history` and keeps assignments dated with their decision levels
+in an ordered, growing vec.
+The other is called `memory` and organizes assignments in a hashmap
+where they are grouped by package, the packages being the hashmap keys.
+It would be interesting to see how the partial solution is stored
+in other implementations of PubGrub such as the one in [dart pub][pub].
+
+[pub]: https://github.com/dart-lang/pub
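
The dual representation described above could look like the following sketch. Only the field names `history` and `memory` come from the text; the `Assignment` layout, the method names, and the choice to rebuild `memory` from `history` on backtracking are assumptions made to keep the example short, and derivation terms are stored as opaque strings.

```rust
use std::collections::HashMap;

type Package = &'static str;

#[derive(Clone, Debug, PartialEq)]
enum Assignment {
    Decision { package: Package, version: u32 },
    /// The term is kept as an opaque string to keep the sketch short.
    Derivation { package: Package, term: String },
}

impl Assignment {
    fn package(&self) -> Package {
        match self {
            Assignment::Decision { package, .. }
            | Assignment::Derivation { package, .. } => *package,
        }
    }
}

#[derive(Default)]
struct PartialSolution {
    /// Number of decisions taken so far.
    decision_level: u32,
    /// Chronological list of (decision level, assignment).
    history: Vec<(u32, Assignment)>,
    /// Same assignments, grouped by package for fast lookup.
    memory: HashMap<Package, Vec<Assignment>>,
}

impl PartialSolution {
    fn record(&mut self, assignment: Assignment) {
        self.memory
            .entry(assignment.package())
            .or_default()
            .push(assignment.clone());
        self.history.push((self.decision_level, assignment));
    }

    /// A decision opens a new decision level and belongs to it.
    fn add_decision(&mut self, package: Package, version: u32) {
        self.decision_level += 1;
        self.record(Assignment::Decision { package, version });
    }

    /// A derivation is recorded at the current decision level.
    fn add_derivation(&mut self, package: Package, term: &str) {
        self.record(Assignment::Derivation { package, term: term.to_string() });
    }

    /// Drop every assignment whose decision level is above `level`.
    fn backtrack(&mut self, level: u32) {
        // `history` is sorted by decision level, so a binary search works.
        let keep = self.history.partition_point(|(l, _)| *l <= level);
        self.history.truncate(keep);
        self.decision_level = level;
        // Rebuild `memory` from what is left of `history`.
        self.memory.clear();
        for (_, assignment) in &self.history {
            self.memory
                .entry(assignment.package())
                .or_default()
                .push(assignment.clone());
        }
    }
}
```

Rebuilding `memory` from scratch is the simplest way to keep both views consistent; an implementation that cares about performance would more likely remove only the dropped assignments.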