| 1 | +--- |
| 2 | +title: "On Study Design in Computational Humanities" |
| 3 | +author: "Dennis Yi Tenen" |
| 4 | +date: "May 10, 2025" |
| 5 | +documentclass: texMemo |
| 6 | +mainfont: "fbb" |
| 7 | +header-includes: | |
| 8 | + \usepackage{graphicx} |
| 9 | + \memoto{Recipient Name} |
| 10 | + \memofrom{Dennis Yi Tenen} |
| 11 | + \memosubject{Memo 1: On Study Design in Computational Humanities} |
| 12 | + \memodate{\today} |
| 13 | + \memologo{\includegraphics[width=0.3\textwidth]{cunil-logo.png}} |
| 14 | +--- |
| 15 | + |
| 16 | +Reading Thad Dunning's *Natural Experiments in the Social Sciences* (Cambridge, 2012), I am |
| 17 | +particularly struck by his discussion of study design. "How can causal inference be improved?" |
| 18 | +he asks on page 4 and answers: "In seeking to answer such questions, I place central emphasis |
| 19 | +on natural experiments as a 'design-based' method of research — one in which control over |
| 20 | +confounding variables comes primarily from research-design choices, rather than *ex post* |
| 21 | +adjustment using parametric statistical models (4)." |
| 22 | + |
| 23 | +This approach seems particularly well suited to computational study in the humanities, where |
| 24 | +findings often rest on "the veracity of causal and statistical assumptions that are often difficult to explicate |
| 25 | +and defend — let alone validate." The natural experiment approach seeks to shift reasoning |
| 26 | +about such assumptions from the statistical modeling part of the research process, expressed |
| 27 | +mathematically, to the design process, expressed in the logic of the world observed: "With |
| 28 | +natural experiments, it is the research design, rather than the statistical modeling, that |
| 29 | +compels conviction." |
| 30 | + |
| 31 | +For this reason, Dunning writes, "substantive and contextual knowledge plays an important role |
| 32 | +at every stage of natural-experimental research — from discovery to analysis to evaluation." |
| 33 | +The emphasis on context necessitates thinking about statistical concepts such as "effect" in |
| 34 | +more specific, historical terms. The influence of one author on another, for example, depends |
| 35 | +crucially on contingent facts about their biography, their publication history, ideology, genre |
| 36 | +conventions, and numerous other factors worthy of consideration. The design-based approach asks us |
| 37 | +to ground abstract statistical relationships firmly within concrete historical contexts and |
| 38 | +detailed interpretive frameworks. |
| 39 | + |
| 40 | +As a consequence of reasoning about complicated contexts, the quantitative analysis of natural |
| 41 | +experiments tends to be simple. Dunning writes: "Often, a minimum of mathematical manipulation |
| 42 | +is involved. For example, straightforward contrasts between the treatment and control groups |
| 43 | +— such as the difference in average outcomes in these two groups — often suffices to provide |
| 44 | +evidence of causal effects (105)." The potential simplicity of quantitative data analysis makes |
| 45 | +the statistical results easier to convey and interpret, Dunning writes. "Rather than presenting |
| 46 | +the estimated coefficients from multivariate models in long tables of regression results," |
| 47 | +he concludes, "analysts may have more space in articles to discuss the research design and |
| 48 | +substantive import of the results." I would add that this also makes them easier to peer-review. |
| 49 | + |
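To make the contrast concrete, here is a minimal sketch of the difference-in-means comparison that Dunning describes. The function name and the outcome values are hypothetical, invented purely for illustration. They do not come from Dunning or from any actual study.

```python
from statistics import mean


def difference_in_means(treatment, control):
    """Return the average treatment outcome minus the average control outcome."""
    return mean(treatment) - mean(control)


# Hypothetical outcome values, invented purely for illustration.
treatment_outcomes = [4.0, 6.0, 5.0, 7.0]
control_outcomes = [3.0, 4.0, 2.0, 5.0]

estimate = difference_in_means(treatment_outcomes, control_outcomes)
print(f"Estimated effect (difference in means): {estimate}")
```

The arithmetic is deliberately trivial. Whether the subtraction means anything depends on the design: on whether assignment to the two groups can plausibly be treated as if it were random.
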
| 50 | +Simplicity ultimately breeds transparency. Again, Dunning: "Analyzing data from strong research |
| 51 | +designs — including true and natural experiments — requires analysts to invoke assumptions |
| 52 | +about the process that gives rise to observed data (106)." For me, here finally lies the |
| 53 | +subtle but crucial point of his argument: all of the above remains true not just for natural |
| 54 | +experiments, but for strong study design in computational humanities and social |
| 55 | +sciences more generally. Christopher H. Achen makes a similar point in his wonderful paper on |
| 56 | +"Garbage-Can Regressions," arguing for "sophisticated simplicity" in study design and for engaging |
| 57 | +more "creatively" with the data. |
| 58 | + |
| 59 | +The study-design mindset fits well with my organic inclinations as a humanist. I don't |
| 60 | +normally reason by data manipulation. Reasoning by data manipulation alone risks "cooking the |
| 61 | +books" by losing sight of the underlying social or linguistic dynamics. The vagaries of |
| 62 | +culture force me to think contextually: in terms of processes, timelines, customs, genres, |
| 63 | +relationships, narratives, etc. And I would like to remain firmly grounded in that realm when |
| 64 | +doing computational research. |