Roger Grosse edited this page Aug 11, 2014 · 17 revisions

Inevitably, making up the conceptual dependency graph requires a lot of judgment calls: what are the appropriate concepts? What is actually a prerequisite for what? Here's an informal set of guidelines we've come up with based on our experience editing the content.

You should view this document as rules of thumb rather than rigid requirements. As we build up more of the graph structure, other strategies might turn out to work better. And our experiences gathering content for our own field, machine learning, might not be representative of areas we decide to add later.

Also, there's a lot of stuff here, so don't worry about following all of it as you go. You can think of this document as more of a reference that we can turn to whenever we spot problems with the graph or when we have disagreements about how particular things should be structured.

Concept nodes

What to include?

How do we know if a concept is important enough to include in the graph? Importance isn't really an issue. Unlike with courses and textbooks, there's no reason to leave something out simply to save space or time. If it's likely that someone will find the material useful, you might as well put it in.

One caveat: we expect that a common use case for Metacademy will be creating dependency maps for individual papers. Since papers don't quite correspond to concepts, it doesn't really make sense to add a concept node for that purpose. Instead, the proper way to do this is to create a roadmap, which is a document giving an overview of a topic and pointers to relevant concepts and resources.

Granularity

The concepts in our graph certainly aren't atomic, and it's possible to subdivide them more and more finely. Conversely, we could try to cram more stuff into individual nodes. There's no obvious answer for the right granularity, but hopefully the tradeoff is pretty clear: with too coarse-grained a dependency graph, a lot of superfluous dependencies will be required. But too fine-grained a graph would become confusing and unwieldy.

Our rule of thumb is: each node corresponds to a concept that you should learn as a unit. If it makes sense to learn only part of a concept, it should probably be split. Conversely, if it only makes sense to learn concept A in conjunction with concept B and vice versa, they should be combined. For instance, "Bayesian networks" and "variable elimination" are separate nodes in our graph, since it could be useful to learn what Bayes nets represent without learning how to actually do inference in them. But the definition of the variable elimination algorithm and the analysis of its running time are grouped together in the same node, since you don't really understand the algorithm unless you know when it is or is not efficient.

Sometimes some aspects of a concept are much harder to learn than others. For instance, proving a theorem may be much harder than understanding and using the statement of the theorem. The cleanest solution is normally to use goal-specific dependencies, but maybe in extreme cases we could include the theorem and its proof as separate concepts.

In general, each of the nodes usually corresponds to on the order of 5 pages of text, or 30 minutes of video, but 2 pages or 15 pages are not uncommon. Unlike with courses or textbooks, there's no reason to divide everything into chunks of equal length. It's more important that the concepts make sense as units.

Overview nodes

Textbooks typically start with an overview chapter explaining what the field is about and giving an overall outline of the topics discussed. Including this in the graph would generally be overkill, since a typical concept in our graph may depend on at least one concept from 5 or 6 different fields. You don't want to have to wade through that many overview nodes. (Also, the overview is often more useful to read after you've learned a lot of the specific concepts, rather than before.)

However, it's sometimes worth including an overview node if the more technical content only makes sense given a certain amount of context. For instance, explanations of graphical model inference algorithms tend to assume some general knowledge: why marginalization is equivalent to partition function computation, how to extend MRF inference algorithms to Bayes nets, that exact inference is intractable in the worst case, etc. We have an "inference in MRFs" node which explains these things, and many nodes for particular algorithms depend on it.

Relationship nodes

Often, two concepts are closely related even though neither one really depends on the other. For instance, maximum likelihood and Bayesian parameter estimation often result in very similar algorithms, and comparing the two provides additional insight into both algorithms. On the other hand, it would be a stretch to say one is a prerequisite for the other. In these cases, the thing to do is create a relationship node which lists both concepts as dependencies, and then provide forward links from both concepts to the relationship node. For instance, we could have a "Maximum likelihood vs. Bayesian parameter estimation" node.

Goals

In addition to the summary, we also show the user a list of goals: the things they should understand or be able to do, or the questions they should be able to answer. This also serves as the canonical list of topics included in a given node, and should be used for the purposes of choosing resources and dependencies.

The idea is that by reading any of the resources, you should be able to learn all of these topics. There may be additional topics which are incidentally covered by the resources but which aren't actually considered to be part of the concept node.

Dependencies

It's also sometimes ambiguous whether one concept depends on another. In general, concept B depends on concept A if you need to understand concept A in order to understand concept B. It's not enough that A is helpful for understanding B; in fact, lots of concepts will mutually reinforce each other without actually being prerequisites. It's best to err on the side of conservatism (including too few dependencies), since the size of the dependency graph is exponential in the depth, and the fan-out factor determines the base of the exponent.

On the other hand, the dependencies should include enough background to understand the concept at a technical level. Even if you can't reproduce the proof of a theorem, you should still have an intuition for why it's true.
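The cost of extra dependencies can be made concrete with a bit of arithmetic. This is a rough sketch, not an exact model of the graph: it treats the unrolled dependency graph as a tree with uniform branching.

```python
# Rough illustration of why superfluous dependencies are costly: if every
# concept has b prerequisites (the fan-out factor) and the graph is d levels
# deep, the unrolled dependency tree has on the order of b**d nodes.
def tree_size(branching, depth):
    """Total nodes in a complete tree with the given branching and depth."""
    return sum(branching ** level for level in range(depth + 1))

# Raising the average fan-out from 2 to 3 at depth 6 grows the tree from
# 127 nodes to 1093: the fan-out factor determines the base of the exponent.
```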

Acyclicity

The dependency structure should be a directed acyclic graph. If you find that you've added a cycle, it's likely because you've included dependencies which are "good to have" rather than necessary. Only include the edges which are actually required for understanding the concept. If two concepts mutually reinforce each other, it's probably better to have a relationship node, as described above.
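Cycles can be caught mechanically with a standard depth-first search. This is a minimal sketch, assuming a made-up representation where `deps` maps each concept to the list of concepts it depends on (the concept names below are invented for illustration):

```python
# Hypothetical representation: deps maps each concept to its prerequisites.
deps = {
    "variable_elimination": ["bayesian_networks"],
    "bayesian_networks": ["conditional_probability"],
    "conditional_probability": [],
}

def find_cycle(deps):
    """Return a list of concepts forming a cycle, or None if the graph is acyclic."""
    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited / on current path / done
    color = {node: WHITE for node in deps}
    stack = []

    def visit(node):
        color[node] = GRAY
        stack.append(node)
        for dep in deps.get(node, []):
            if color.get(dep, WHITE) == GRAY:          # back edge: cycle found
                return stack[stack.index(dep):] + [dep]
            if color.get(dep, WHITE) == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        color[node] = BLACK
        return None

    for node in deps:
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None
```

On the acyclic graph above, `find_cycle` returns `None`; if an edge were added back from "conditional_probability" to "variable_elimination", it would return the offending path so the spurious "good to have" edge can be identified and removed.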

Resource-specific dependencies

Each of the nodes directs the user to one or more resources which explain the concept. Those resources generally aren't stand-alone documents, but excerpts from a larger work with a (usually linear) ordering assumed. In that context, it's often convenient for them to assume additional dependencies beyond what's actually required for the concept. These dependencies should be marked in the entries for individual resources (see below) rather than as part of the main graph structure.

But then, as a practical matter, if basically every resource assumes a dependency you don't think is strictly required, you might as well just add it to the graph structure.

Goal-specific dependencies

As described above, each concept has a specific set of learning goals. By default, if concept B depends on concept A, then all of the learning goals of A are needed for each goal of B. But sometimes, you only need to know a little bit about A to learn B, or A is only required for a subset of the goals. Then you can annotate goal-specific dependencies. This can dramatically shrink the size of the dependency graphs, especially when they include concepts such as positive definite matrices, which are frequently used but only at a surface level.

A common special case of this is when proving a theorem is much harder than applying the statement of the theorem. Most concepts which depend on the theorem don't actually depend on the proof. In this case, "Be able to apply Bob's Theorem" and "Prove Bob's Theorem" should be separate learning goals.
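One way such annotations might be represented is sketched below. The field names and concept names are invented for illustration; the point is just that a dependency defaults to pulling in all of the prerequisite's goals unless a subset is listed:

```python
# Hypothetical annotation format (names invented for illustration).
concept = {
    "id": "spectral_clustering",
    "goals": ["define the graph Laplacian", "run the algorithm"],
    "dependencies": [
        # Full dependency: every goal of the prerequisite is needed.
        {"concept": "k_means"},
        # Goal-specific dependency: only a surface-level goal is needed,
        # so the rest of the prerequisite's subgraph can be pruned.
        {"concept": "positive_definite_matrices",
         "source_goals": ["recognize a positive definite matrix"]},
    ],
}

def goals_needed(dep, all_goals):
    """Resolve which of the prerequisite's goals this dependency requires."""
    return dep.get("source_goals", all_goals)
```

Pruning everything a narrowed goal doesn't reach is what shrinks the dependency graph for frequently-used but surface-level concepts.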

Transitivity

If B depends on A and C depends on B, then A will always be included in C's dependency graph regardless of whether C directly depends on A. Should we explicitly list A as a dependency for C?

In principle, a node's dependencies should be independent of the structure of the rest of the graph. I.e., if A is directly required for understanding C, then it should be included as a dependency. On the other hand, if it's only important insofar as B depends on it, then leave it out. There are three reasons for this:

  • If the graph is re-structured so that B no longer depends on A, this will indicate whether C should still depend on A.

  • All of the dependencies are shown to the user in the Context section of the learning view, even if they are hidden in the graph view. This includes the prerequisites for the concept, as well as the list of concepts in the current dependency graph which depend on it. Seeing the full list of dependencies is useful, since it provides hints about what to review, and gives more precise information about why the concept is needed for the target node.

  • To avoid bloat in the graph, we're thinking of computing bottleneck scores for each node: the fraction of long-distance dependencies which would disappear if that node were removed from the graph. Nodes with high bottleneck scores are good candidates for refactoring. Annotating the A -> C dependency specifically will mean it doesn't "count" towards the bottleneck score for B.

Of course, if we tried to follow this religiously, we'd go crazy noting that everything depends on, say, vectors or partial derivatives. Just mark the dependencies that seem especially relevant, but don't worry about getting every last one.
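The bottleneck score mentioned above could be computed along these lines. This is a sketch under our reading of the definition: a "long-distance dependency" is taken to be an ancestor relationship that is not a direct edge, and the score is the fraction of those relationships severed by removing the node.

```python
def ancestors(deps, node, skip=None):
    """All transitive prerequisites of `node`, optionally ignoring `skip`."""
    seen, frontier = set(), [node]
    while frontier:
        cur = frontier.pop()
        for dep in deps.get(cur, []):
            if dep != skip and dep not in seen:
                seen.add(dep)
                frontier.append(dep)
    return seen

def bottleneck_score(deps, node):
    """Fraction of long-distance (non-edge) ancestor relationships that
    would disappear if `node` were removed from the graph."""
    long_distance = [
        (c, a)
        for c in deps
        for a in ancestors(deps, c)
        if a not in deps.get(c, []) and c != node and a != node
    ]
    if not long_distance:
        return 0.0
    broken = [(c, a) for (c, a) in long_distance
              if a not in ancestors(deps, c, skip=node)]
    return len(broken) / len(long_distance)
```

In a simple chain A <- B <- C, node B severs the only long-distance relationship (C's dependence on A), so its score is 1.0: a node everything funnels through is flagged as a refactoring candidate. An explicitly annotated A -> C edge would keep that pair connected without B, which is why annotating direct dependencies keeps them from counting toward B's score.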

Ordering of dependencies

The ordering of the dependencies is not arbitrary: it is used to determine the ordering of the concepts in the learning plan. In particular, the linear ordering is generated using a depth-first traversal of the dependency graph, so that dependencies earlier in the list tend to appear earlier in the learning plans.

A good rule of thumb is to order the dependencies from most general to most specific. More precisely, start with the concepts which are required for general reasons, followed by the ones which are needed for something specific. For instance, in the "Convergence of loopy belief propagation" node, "Loopy belief propagation" would come before "Brouwer's fixed point theorem," since the latter is used to show a fairly specific claim about BP having a fixed point. This ordering has the effect that concepts are presented as close as possible to when they're actually used.
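The traversal itself can be sketched in a few lines. This is a minimal illustration (the concept names and graph are invented): a depth-first post-order visit emits each concept after its dependencies, and dependencies listed earlier in a node's list tend to appear earlier in the plan.

```python
def learning_plan(deps, target):
    """Linearize the dependency graph below `target` with a depth-first
    traversal, visiting each node's dependencies in their annotated order."""
    plan, seen = [], set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for dep in deps.get(node, []):   # annotated order matters here
            visit(dep)
        plan.append(node)                # node comes after its dependencies

    visit(target)
    return plan

# Invented example mirroring the loopy BP case above: the general
# prerequisite is listed before the theorem used for one specific claim.
deps = {
    "convergence_of_loopy_bp": ["loopy_bp", "brouwer_fixed_point"],
    "loopy_bp": ["belief_propagation"],
    "belief_propagation": [],
    "brouwer_fixed_point": [],
}
```

Here `learning_plan(deps, "convergence_of_loopy_bp")` places Brouwer's theorem immediately before the target node, right where it's used, rather than at the start of the plan.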

Resources

For each node, we list resources where one can go to learn about the topic. These will generally be textbooks, video lectures, lecture notes, and review papers, although other sources such as Ph.D. theses or individual papers are also possible. The resource list means "read one of the following," rather than "read all of the following." The idea is that the user would start by reading/watching one of the resources, and then possibly consult others for additional clarification.

Resources are marked as "free," "free but requires signup," and "paid," and are presented to the users in that order. Otherwise, the ordering is arbitrary.

Which sources are worth including? The rule of thumb is, someone should be able to learn the concept by reading/watching one starred resource. In a bit more detail, the resources should do all or most of the following:

  • cover the important topics

  • provide motivation and intuition

  • for technical concepts, provide figures and worked through examples

  • give sufficient detail, but not so much as to be overwhelming

  • avoid including too many dependencies beyond what's already in the main graph (one or two additional ancestors is OK)

Note: in a previous version of the content database, we distinguished "core" and "supplemental" resources. Generally, the supplemental ones either didn't cover all the goals, or didn't explain things quite as clearly as the core resources. We plan to phase out this distinction, since it isn't very informative to the user, and people found it confusing. Instead, if the resource only covers a subset of the learning goals, that can be annotated directly in the concept editor. In terms of the resource's writing quality, we might eventually introduce some sort of user voting scheme, rather than leaving it to the whims of the annotator.
