Publications
2023
- SI-Sort-ANM: A Scale-Invariant Sorting Criterion to Find a Causal Order in Additive Noise Models. Alexander G. Reisach, Myriam Tami, Christof Seiler, Antoine Chambaz, and Sebastian Weichwald, 2023.
Additive Noise Models (ANMs) are a common model class for causal discovery from observational data. Due to a lack of real-world data for which an underlying ANM is known, ANMs with randomly sampled parameters are commonly used to simulate data for the evaluation of causal discovery algorithms. While some parameters may be fixed by explicit assumptions, fully specifying an ANM requires choosing all parameters. Reisach et al. (2021) show that, for many ANM parameter choices, sorting the variables by increasing variance yields an ordering close to a causal order and introduce var-sortability to quantify this alignment. Since increasing variances may be unrealistic and cannot be exploited when data scales are arbitrary, ANM data are often rescaled to unit variance in causal discovery benchmarking. We show that synthetic ANM data are characterized by another pattern that is scale-invariant and thus persists even after standardization: the explainable fraction of a variable’s variance, as captured by the coefficient of determination R², tends to increase along the causal order. The result is high R²-sortability, meaning that sorting the variables by increasing R² yields an ordering close to a causal order. We propose a computationally efficient baseline algorithm termed R²-SortnRegress that exploits high R²-sortability and that can match and exceed the performance of established causal discovery algorithms. We show analytically that sufficiently high edge weights lead to a relative decrease of the noise contributions along causal chains, resulting in increasingly deterministic relationships and high R². We characterize R²-sortability on synthetic data with different simulation parameters and find high values in common settings. Our findings reveal high R²-sortability as an assumption about the data generating process relevant to causal discovery and implicit in many ANM sampling schemes. It should be made explicit, as its prevalence in real-world data is an open question. For causal discovery benchmarking, we provide implementations of R²-sortability, the R²-SortnRegress algorithm, and ANM simulation procedures in our library CausalDisco (https://causaldisco.github.io/CausalDisco/).
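To illustrate the R²-sortability idea described above, here is a minimal sketch (not the CausalDisco implementation): each variable's R² is computed by ordinary least squares against all other variables, and sorting by increasing R² gives a candidate causal order. The function names `r2_scores` and `r2_order` are illustrative, not from the library.

```python
import numpy as np

def r2_scores(X):
    """R^2 of each column of X when regressed (OLS) on all other columns."""
    _, d = X.shape
    Xc = X - X.mean(axis=0)  # center so no intercept term is needed
    scores = np.empty(d)
    for j in range(d):
        y = Xc[:, j]
        Z = np.delete(Xc, j, axis=1)
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ beta
        scores[j] = 1.0 - resid.var() / y.var()
    return scores

def r2_order(X):
    """Candidate causal order: variables sorted by increasing R^2."""
    return np.argsort(r2_scores(X))
```

On data simulated from a linear chain with large edge weights, the root variable tends to have the lowest R² and is placed first in the recovered order.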
- spillR: Spillover Compensation in Mass Cytometry Data. Marco Guazzini, Alexander G. Reisach, Sebastian Weichwald, and Christof Seiler, 2023.
Channel interference in mass cytometry can cause spillover and may result in miscounting of protein markers. The authors of the R package ‘CATALYST’ introduce an experimental and computational procedure to estimate and compensate for spillover. They assume spillover can be described by a spillover matrix that encodes the ratio between unstained and stained channels, and they estimate this matrix from experiments with beads. We propose to skip the matrix estimation step and work directly with the full bead distributions. We develop a nonparametric finite mixture model and use the mixture components to estimate the probability of spillover. Spillover correction is often a pre-processing step followed by downstream analyses; choosing a flexible model reduces the chance of introducing biases that can propagate downstream. We implement our method in an R package ‘spillR’ using expectation-maximization to fit the mixture model. We test our method on synthetic and real data from ‘CATALYST’. We find that our method compensates low counts accurately, does not introduce negative counts, avoids overcompensating high counts, and preserves correlations between markers that may be biologically meaningful.
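As a simplified stand-in for the mixture-based approach sketched above (the actual spillR model is nonparametric and implemented in R; this is only a two-component Gaussian analogue in Python), expectation-maximization can fit a mixture and yield per-observation posterior probabilities of belonging to each component. All names here are illustrative.

```python
import numpy as np

def fit_two_gaussian_em(x, iters=200):
    """Fit a two-component 1-D Gaussian mixture by EM.

    Returns (mu, sd, pi, resp) where resp[k, i] is the posterior
    probability that observation i belongs to component k."""
    x = np.asarray(x, dtype=float)
    mu = np.quantile(x, [0.25, 0.75])      # crude but scale-aware init
    sd = np.array([x.std(), x.std()])
    pi = np.array([0.5, 0.5])
    for _ in range(iters):
        # E-step: responsibilities from weighted Gaussian densities
        dens = np.stack([
            pi[k] / sd[k] * np.exp(-0.5 * ((x - mu[k]) / sd[k]) ** 2)
            for k in range(2)
        ])
        resp = dens / dens.sum(axis=0)
        # M-step: update mixture weights, means, standard deviations
        nk = resp.sum(axis=1)
        pi = nk / x.size
        mu = (resp * x).sum(axis=1) / nk
        sd = np.sqrt((resp * (x - mu[:, None]) ** 2).sum(axis=1) / nk)
    return mu, sd, pi, resp
```

In a spillover setting one would interpret one component as true signal and the other as contamination, and use the posterior responsibilities in place of a hard spillover-matrix correction.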
2021
- BS-DAG!: Beware of the Simulated DAG! Causal Discovery Benchmarks May Be Easy to Game. Alexander G. Reisach, Christof Seiler, and Sebastian Weichwald, 2021.
Simulated DAG models may exhibit properties that, perhaps inadvertently, render their structure identifiable and unexpectedly affect structure learning algorithms. Here, we show that marginal variance tends to increase along the causal order for generically sampled additive noise models. We introduce varsortability as a measure of the agreement between the order of increasing marginal variance and the causal order. For commonly sampled graphs and model parameters, we show that the remarkable performance of some continuous structure learning algorithms can be explained by high varsortability and matched by a simple baseline method. Yet, this performance may not transfer to real-world data where varsortability may be moderate or dependent on the choice of measurement scales. On standardized data, the same algorithms fail to identify the ground-truth DAG or its Markov equivalence class. While standardization removes the pattern in marginal variance, we show that data generating processes that incur high varsortability also leave a distinct covariance pattern that may be exploited even after standardization. Our findings challenge the significance of generic benchmarks with independently drawn parameters. The code is available at https://github.com/Scriddie/Varsortability.
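A minimal sketch of the variance-ordering effect described above (illustrative only; the paper's varsortability definition also accounts for longer directed paths, while this edge-only version and the names `var_sort_order` and `edge_varsortability` are simplifications):

```python
import numpy as np

def var_sort_order(X):
    """Baseline 'causal order': variables sorted by increasing marginal variance."""
    return np.argsort(X.var(axis=0))

def edge_varsortability(X, A):
    """Fraction of directed edges i -> j (A[i, j] != 0) along which
    marginal variance strictly increases; 1.0 means perfectly var-sortable."""
    v = X.var(axis=0)
    edges = np.argwhere(A != 0)
    agree = sum(v[i] < v[j] for i, j in edges)
    return agree / len(edges)
```

On a generically weighted chain such as X1 -> X2 -> X3 with edge weights above 1, marginal variances accumulate along the chain, so the variance ordering recovers the causal order exactly, which is the easy-to-game benchmark behavior the paper describes.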