support dagitty objects #183

grasshoppermouse · 2023-11-16T04:33:36Z

resolve #179

I'm an anthropologist, not a software engineer, so I hope I'm doing this right

thomasp85 · 2023-11-16T07:10:48Z

Do dagitty objects not contain any node and edge attributes? (Asking in earnest, as I'm not familiar with the package.)

grasshoppermouse · 2023-11-16T14:52:40Z

I'm just learning about the package myself. dagitty is used to specify causal relationships among variables to aid the design of scientific studies. It is an R wrapper around a JavaScript library. Graphs are specified using the Graphviz DOT language, where node and edge attributes are set with square brackets. Ideally, some R function would already exist to parse a DOT string and convert it to igraph, but I couldn't find one.

As far as I can tell, dagitty nodes can optionally have one of these 3 attributes: "exposure", "outcome", "latent". There are getter functions for those. Nodes can also optionally have x and y coordinates for layout, and there is a function to get those.
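For illustration, here is a minimal sketch of how the getters mentioned above could be combined into a node table. The example graph and its variable names are made up, and the column layout is only one possible choice, not necessarily what the PR implements.

```r
library(dagitty)

# Small example DAG; the variable names and positions are made up.
g <- dagitty('dag {
  X [exposure, pos="0,0"]
  Y [outcome, pos="2,0"]
  U [latent, pos="1,1"]
  U -> X
  U -> Y
  X -> Y
}')

# Node names, roles and layout coordinates, using exported getters only.
node_names <- names(g)
xy <- coordinates(g)  # list with named numeric vectors x and y

nodes <- data.frame(
  name     = node_names,
  exposure = node_names %in% exposures(g),
  outcome  = node_names %in% outcomes(g),
  latent   = node_names %in% latents(g),
  x        = unname(xy$x[node_names]),
  y        = unname(xy$y[node_names])
)
```

Indexing the coordinate vectors by name keeps the rows aligned with `node_names` regardless of the order in which dagitty returns the nodes.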

Edges have 3 types: "->", "<->", "--" (directed, bidirectional, undirected), and there is a function to get those. But in tidygraph/igraph, don't all edges have to be either directed or undirected?

Edges can optionally have a "beta" attribute that sets the strength of the causal relationship. As far as I can tell, there is no function to get those values. However, there are internal functions dagitty:::.vertexAttributes(g, a) and dagitty:::.edgeAttributes(g, a), where "g" is a dagitty object and "a" is any user-specified attribute.

So, I could add the node attributes exposure/outcome/latent and the node x & y coordinates, where present, along with the edge direction attributes. I could also add node_attribute and edge_attribute arguments to get user-specified attributes, with defaults node_attribute=NULL and edge_attribute="beta".

How does that sound?

szhorvat · 2023-11-16T17:53:06Z

This is up to dagitty of course, but I wanted to say that DOT is a rather poor choice for data exchange. It is a format that is specific to a single piece of software, Graphviz, and has features that make sense only for that one piece of software. I once looked into how easy it would be to write a robust parser for this format for igraph. It would not be easy, and most likely it will never happen. The solution for these sorts of problems is to use a format which was designed for data exchange, not as the data description language of one specific piece of software. GraphML is a good choice for this. So are various formats built on JSON.

thomasp85 · 2023-11-17T10:05:11Z

Yeah, we should get as much info into the tbl_graph as possible. It is true that tidygraph/igraph doesn't support bidirectional edges, and edges can't be undirected in a directed graph. Maybe it is best to add a "type" attribute to edges that encodes whether each one is undirected, directed, or bidirectional, since we can't really capture that information otherwise.

It seems weird that there are attributes in the structure that the user cannot get to, but if that is so we shouldn't extract them, as we don't want to take a potentially breaking dependency on another package's internals.

As for their choice of DOT, not much we can do about that. The whole point of tidygraph is to some extent to save people from relying on packages with questionable structure when possible :-)
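A minimal sketch of the "type" attribute idea, assuming the edge type is taken from the `e` column returned by `dagitty::edges()` and stored as a plain character column. The example graph is made up, and the column name `type` follows the suggestion above rather than any confirmed implementation.

```r
library(dagitty)
library(tidygraph)

# Made-up graph with one directed and one bidirected edge.
g <- dagitty("dag { A -> B  B <-> C }")

# dagitty::edges() returns one row per edge: v (from), w (to), e (edge type).
e <- edges(g)
edge_df <- data.frame(
  from = as.character(e$v),
  to   = as.character(e$w),
  type = as.character(e$e)
)

# Every edge is stored as directed; the original kind survives in `type`.
tg <- tbl_graph(
  nodes    = data.frame(name = names(g)),
  edges    = edge_df,
  directed = TRUE
)
```

Downstream code can then filter or style edges by `type`, for example with `tg %>% activate(edges) %>% filter(type == "<->")`.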

szhorvat · 2023-11-17T10:25:33Z

That comment was also partly an explanation of why igraph is unlikely to ever support reading DOT files. As for writing DOT files, that's already supported, and I consider that essential. We want to make it easy to plot igraph graphs with Graphviz.
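As a small illustration of the write-only DOT support mentioned here, the sketch below exports a made-up igraph graph with `write_graph()`. The temporary file path is only for demonstration.

```r
library(igraph)

# Made-up directed graph: A -> B -> C.
g <- graph_from_literal(A -+ B, B -+ C)

# igraph can write DOT for use with Graphviz, but has no DOT reader.
path <- tempfile(fileext = ".dot")
write_graph(g, path, format = "dot")

dot_lines <- readLines(path)
```

The resulting file can be passed to Graphviz, e.g. `dot -Tpng file.dot -o file.png`.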


grasshoppermouse · 2023-11-21T13:50:39Z

I pushed an update to add node attributes that are accessible with dagitty functions, as well as one user-defined attribute for nodes and one for edges, with the latter set to default to "beta", which is an important, commonly used edge attribute.

thomasp85 · 2023-11-21T14:34:55Z

Is there only ever one user-defined attribute for nodes and edges?

grasshoppermouse · 2023-11-21T16:49:43Z

In principle, I believe there can be arbitrary numbers of user defined node and edge attributes. In practice, I'm not sure. The attributes are meant to specify aspects relevant to the theory of causal diagrams, and almost all have dedicated functions in dagitty (I think beta might be the only exception).


thomasp85 · 2023-11-22T06:37:56Z

R/dagitty.R

+  for (a in node_attr){
+    if (vctrs::vec_as_names(a, repair = 'unique') != a) stop('each node_attr must be a string of length > 0')
+    nodes[a] <- dagitty:::.vertexAttributes(x, a)$a
+  }


If there is no way to extract these without using a non-exported function, we should simply ignore any additional attributes altogether. Same with edge attributes.

Except that the "beta" edge attributes are important. Perhaps @jtextor, maintainer of the dagitty package, can suggest a solution, e.g., modifying the edges function to also extract the beta values.

Yes, that would be necessary on dagitty's side. Using internal functions from other packages is not allowed, nor wise.

thomasp85 · 2023-11-22T06:38:10Z

R/dagitty.R

+
+  edges <- dagitty::edges(x)
+  if (is_empty(edges)){
+    edges <- tibble::tibble(from = int(), to = int())


Suggested change

-    edges <- tibble::tibble(from = int(), to = int())
+    edges <- tibble::tibble(from = integer(), to = integer())
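To make the suggestion concrete: `int()` is not a base R function, whereas `integer()` with no arguments returns a zero-length integer vector. A minimal sketch of the empty-edges case, outside the PR's surrounding code:

```r
library(tibble)

# integer() with no arguments is a zero-length integer vector,
# so this builds an edge table with the right column types and no rows.
edges <- tibble(from = integer(), to = integer())

n_rows <- nrow(edges)
col_types <- vapply(edges, typeof, character(1))
```

Keeping the columns typed as integer means the empty table can be combined with non-empty edge tables later without type coercion surprises.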

grasshoppermouse · 2023-11-28T21:00:37Z

I found this fairly extensive package, which adds a basic as_tbl_graph method for dagitty objects, but doesn't add node or edge attributes to the tbl_graph object. It also converts dagitty objects to its tidy_dag objects, which preserve some attributes:

https://github.com/r-causal/ggdag

thomasp85 · 2023-11-30T07:22:37Z

So, would you think there is still need for a method in tidygraph proper?

jtextor · 2023-11-30T08:06:03Z

Hi all, interesting discussion here. I'm certainly willing to update the package to export the edge attributes; I was planning to do this anyway.

Regarding my "poor choice" to use the dot syntax, I want to give a few arguments why I did this:

For causal inference applications it's very important to have flexibility with respect to edge types. There are at least 8 different types of edges supported by dagitty and more may be needed in the future.
The syntax can be extremely concise and readable. For instance "a->{b->{c->{d->{e->f}}}}" describes the complete graph with 6 nodes.
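A quick way to check the conciseness claim is to parse the string and count what comes out. This is only a sketch: it wraps the expression in `dag { ... }` and compares the edge count with the number of node pairs.

```r
library(dagitty)

# The nested expression from the comment above, wrapped in a dag block.
g <- dagitty("dag { a->{b->{c->{d->{e->f}}}} }")

n_nodes <- length(names(g))
n_edges <- nrow(edges(g))

# A complete graph on n nodes has choose(n, 2) edges.
is_complete <- n_edges == choose(n_nodes, 2)
```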

So for input by a human, I still feel it's really useful. I agree that it's less suitable as an exchange format and is complicated to parse (I had to write my own parser).

However, dagitty can already export to various other formats, and it would be quite simple to add a GraphML exporter. There is a function "toString" in the dagitty package that implements various output formats already. I could just add GraphML as a further option. The only obstacle is that GraphML only supports directed and undirected edges, as far as I can tell ... so I would have to add some custom nonstandard attributes.

Does your package support edges other than undirected and directed?

grasshoppermouse · 2023-11-30T13:57:10Z

> So, would you think there is still need for a method in tidygraph proper?

I'm curious to get @jtextor's opinion, since he knows the dagitty ecosystem best: where should an as_tbl_graph.dagitty method live? In dagitty, in ggdag (which already has a barebones one), or in tidygraph?

jtextor · 2023-11-30T14:36:16Z

I have no strong opinion on this. But having now read the entire discussion, I think the easiest solution is to just export to GraphML, which can then be read by this package if the developers are open to it. But then we'd have to agree on a way to encode the different types of edges, which is not part of the GraphML standard.
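One hypothetical encoding, sketched below, is a string edge attribute that travels through GraphML as ordinary attribute data. The attribute name `type` and its values are assumptions for illustration, not an agreed convention, and the round trip uses igraph's GraphML reader and writer rather than anything in dagitty.

```r
library(igraph)

# Made-up edge list; `type` is a hypothetical custom attribute that records
# the original edge kind, since GraphML itself only knows directed/undirected.
edge_df <- data.frame(
  from = c("A", "B"),
  to   = c("B", "C"),
  type = c("->", "<->")
)
g <- graph_from_data_frame(edge_df, directed = TRUE)

# Write to GraphML and read it back; the custom attribute is kept as data.
path <- tempfile(fileext = ".graphml")
write_graph(g, path, format = "graphml")
g2 <- read_graph(path, format = "graphml")

types_back <- edge_attr(g2, "type")
```

Any reader that ignores the attribute still sees a valid directed graph, which is the main appeal of putting the extra information in an attribute rather than in the edge syntax.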

grasshoppermouse · 2023-12-01T02:28:34Z

@jtextor, would you be willing to add an as.igraph.dagitty method to dagitty? I see you already suggest igraph, which has a huge number of useful functions, and tidygraph is based on igraph.

thomasp85 · 2023-12-01T07:46:01Z

If dagitty were to get an as.igraph() method, it would work out of the box in tidygraph.
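The mechanism can be sketched with a stand-in class. The class `toy_dag` and its method are hypothetical, and the sketch assumes that tidygraph's default `as_tbl_graph()` method falls back to `igraph::as.igraph()` for classes it does not know.

```r
library(igraph)
library(tidygraph)

# Hypothetical stand-in for a package-specific graph class.
toy <- structure(
  list(edges = data.frame(from = c("A", "B"), to = c("B", "C"))),
  class = "toy_dag"
)

# An as.igraph() method for that class, registered for igraph's generic.
as.igraph.toy_dag <- function(x, ...) {
  graph_from_data_frame(x$edges, directed = TRUE)
}
registerS3method("as.igraph", "toy_dag", as.igraph.toy_dag,
                 envir = asNamespace("igraph"))

# No toy_dag-specific tidygraph method exists; the conversion goes via igraph.
tg <- as_tbl_graph(toy)
```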

jtextor · 2023-12-01T08:11:23Z

To some extent, this is already available. The R package "causaleffect" is based on igraph as well, and there is a function that converts to an igraph object that can be understood by causaleffect. E.g.,

dagitty::convert(dagitty::getExample("Shrier"),"causaleffect")

However, the way in which <-> edges are represented in causaleffect is not straightforward. These are converted to two directed edges, both of which are given a special attribute.

I looked around the igraph manual and now they seem to have a somewhat-standard way to represent mixed graphs with directed, undirected and bi-directed edges. This covers most of what the causal inference community needs. So I'll add a method "as.igraph" that will follow that convention.

thomasp85 · 2023-12-01T12:09:41Z

@jtextor thank you! As this solves support for as_tbl_graph() completely I'll be closing this PR. Still, thank you @grasshoppermouse for pushing this forward 🙏

support dagitty objects

5c0d08c

olivroy mentioned this pull request Nov 17, 2023

read .DOT and generate node/edge dfs? rich-iannone/DiagrammeR#466

Open

Added node attributes for which there are getter functions. Added args to get an optional user specified attribute for nodes and edges (with a default of "beta" for edges)

2cef20c

Complete rewrite. Now handles multiple user-defined node and edge attributes. Better error handling.

b245d80

thomasp85 reviewed Nov 22, 2023


grasshoppermouse and others added 2 commits November 22, 2023 04:02

fixed error in empty edges code

8e18e25

Removed use of internal functions

f6ba04e

thomasp85 closed this Dec 1, 2023
