
Alternative learning paths


Currently, Metacademy is based on a single dependency graph, where each concept has a single set of dependencies. (In some cases, the dependencies are annotated at a more fine-grained level, where each of a concept's goals depends on a set of concept goals.) The graph is used to generate a learning plan for a given concept; this learning plan depends only on the concepts a user has already learned. Might we want to offer alternative learning plans, similarly to how Google Maps offers several alternative sets of directions?
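
As a rough illustration of how a plan falls out of the graph (this is a sketch with made-up concept names, not Metacademy's actual code), generating a learning plan amounts to ordering the target's unlearned ancestors so that every concept comes after its prerequisites:

```python
# Sketch only: a learning plan is a dependency-respecting ordering of the
# concepts the user hasn't learned yet. Graph and concept names are made up.

DEPENDENCIES = {  # concept -> its prerequisite concepts
    "belief_propagation": ["factor_graphs", "bayesian_networks"],
    "factor_graphs": ["bayesian_networks"],
    "bayesian_networks": ["conditional_probability"],
    "conditional_probability": [],
}

def learning_plan(target, known, deps=DEPENDENCIES):
    """Return the unlearned prerequisites of `target`, in a valid study order."""
    plan, visited = [], set(known)

    def visit(concept):
        if concept in visited:
            return
        visited.add(concept)
        for prereq in deps.get(concept, []):
            visit(prereq)
        plan.append(concept)  # appended only after all of its prerequisites

    visit(target)
    return plan

print(learning_plan("belief_propagation", known={"conditional_probability"}))
# ['bayesian_networks', 'factor_graphs', 'belief_propagation']
```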

In theory, there may be different sets of prerequisites which would suffice to learn a given concept, and different sets of prereqs may be more efficient for different users. In my experience, this has been a lot less common than I'd expected, but it does happen. I'll list some examples, followed by several proposals for how the graph structure could be extended to allow alternative paths.

One common case is already covered by the existing structure: when a given concept has to be learned to varying depths depending on the situation. E.g., many machine learning concepts depend on "positive definite matrices," but only at a fairly superficial level. This is already handled by goal-specific dependencies. A given concept can depend on a specific set of goals for PD matrices, and those goals each have their own sets of dependencies.
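
To make the idea concrete, here is an illustrative sketch (hypothetical names, not the real schema) of how goal-specific dependencies keep a concept from pulling in everything about PD matrices:

```python
# Illustrative sketch with made-up names, not Metacademy's actual schema.
# A concept can depend on specific goals of another concept, and each goal
# carries its own dependencies.

GOAL_DEPENDENCIES = {
    # (concept, goal) -> prerequisite (concept, goal) pairs
    ("positive_definite_matrices", "definition"): [("linear_algebra", "quadratic_forms")],
    ("positive_definite_matrices", "cholesky_decomposition"): [
        ("positive_definite_matrices", "definition"),
        ("linear_algebra", "triangular_matrices"),
    ],
}

CONCEPT_DEPENDENCIES = {
    # The multivariate Gaussian only needs the definition goal, so the
    # Cholesky goal (and its prerequisites) never enters its learning plan.
    "multivariate_gaussian": [("positive_definite_matrices", "definition")],
}
```

A plan generator then only expands the goals that are actually listed, rather than every goal of the prerequisite concept.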

But here are some examples which aren't handled so well under the current framework:

  • Different resources present belief propagation using different formalisms. Koller's book and Coursera course both use the clique tree formalism (also sometimes known as the junction tree algorithm). Bishop's and MacKay's books both use factor graphs. MIT's graphical models course, 6.438, presents it in terms of MRFs. All of these presentations are closely related and involve essentially the same ideas, but they aren't equivalent. They also have different dependencies. Currently, the factor graph version is treated as the canonical one. The Koller resources are listed for BP, but there is a note explaining the difference. The junction tree version of the algorithm is listed as a separate concept.

  • Regularization is an abstract concept which is needed to understand the motivation for a lot of machine learning algorithms (e.g. SVMs). But you can't just explain regularization in the abstract; it has to be introduced in the context of some particular learning algorithm. Currently, ridge regression is treated as the canonical version of regularization, since linear regression is probably the simplest algorithm to start with. This means other concepts, such as logistic regression and SVMs, depend on ridge regression when it isn't strictly required.

  • There are a variety of models of computation which are all provably equivalent, including Turing machines, lambda calculus, register machines, and recursive functions. Ideas such as decidability, incompleteness of formal logic, and Kolmogorov complexity all depend on formalizing computation one way or another. While the results are independent of the model of computation (as long as it is Turing-complete), the proofs draw upon the details of one particular formalism. We currently use Turing machines as the canonical model of computation, so if a resource made a different choice, that is listed as an additional dependency. (So Turing machines are listed as a dependency even if they aren't actually required.)

  • The Compactness Theorem for first-order logic can be proved as a straightforward extension of the Completeness Theorem. But this is considered somewhat inelegant, since the Compactness Theorem is a purely semantic result, whereas the Completeness Theorem requires defining a specific deductive calculus. There is an alternative purely semantic proof based on the ultraproduct construction, which draws upon results about Boolean algebras. We treat the syntactic proof as the canonical one, since it has the smaller set of dependencies.

Here are several proposals for how to accommodate cases such as these. I think the situation is probably rare enough that we might as well keep the status quo for now, but I mention the alternatives just in case. Maybe the right choice will become more obvious once we've collected more examples.

  • Use goal-specific dependencies. E.g., "prove the Compactness Theorem using the Completeness Theorem" would depend on "Completeness Theorem," whereas "prove the Compactness Theorem using ultraproducts" would depend on "ultraproduct".

    • Pro: doesn't require any changes to the current schema
    • Con: can't recommend alternative paths to different users; annotator has to mark goal-specific dependencies to avoid pulling in all the dependencies from both branches
  • Each concept can have alternative sets of dependencies (see the sketch after this list).

    • Pro: allows fine-grained control; the most efficient path can be computed for a given user based on what they already know
    • Con: need to change database schema; more work to annotate
  • Encode alternative prerequisites by listing them specifically for each resource, e.g. "Turing machine" would be listed as a dependency for resources which use Turing machines. Then take resource-specific dependencies into account when drawing up alternative graph structures and learning plans. (Currently, resource-specific dependencies are ignored in the graphs and learning plans.)

    • Pro: doesn't require changing database structure, only graph generation code
    • Con: graph generation may be complicated to implement; relationship between dependencies and the graph structure is more complicated
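
Here is a hypothetical sketch of the second proposal (alternative dependency sets), with made-up names; for each concept it simply picks whichever alternative leaves the fewest concepts still to learn for the given user:

```python
# Hypothetical sketch of "alternative dependency sets" (not an existing
# Metacademy feature); names are made up. Each concept maps to one or more
# alternative prerequisite sets, and we greedily choose, per concept, the
# alternative that requires learning the fewest new concepts.

ALTERNATIVES = {
    "compactness_theorem": [
        ["completeness_theorem"],               # syntactic proof
        ["ultraproducts", "boolean_algebras"],  # purely semantic proof
    ],
    "completeness_theorem": [["first_order_logic"]],
    "ultraproducts": [["boolean_algebras"]],
    "boolean_algebras": [[]],
    "first_order_logic": [[]],
}

def cheapest_plan(target, known, alts=ALTERNATIVES):
    """Return the unlearned concepts needed under the locally best alternatives."""
    def needed(concept, seen):
        if concept in known or concept in seen:
            return set()
        best = None
        for dep_set in alts.get(concept, [[]]):
            cost = {concept}
            for dep in dep_set:
                cost |= needed(dep, seen | {concept})
            if best is None or len(cost) < len(best):
                best = cost
        return best

    return needed(target, set())

print(sorted(cheapest_plan("compactness_theorem", known={"first_order_logic"})))
# ['compactness_theorem', 'completeness_theorem']  -- the syntactic route wins
```

A real implementation would need to choose alternatives jointly rather than per concept (shared prerequisites across branches can make a locally worse alternative globally cheaper), which is part of why this proposal would mean more annotation and plan-generation work.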