Publications
2023
- SSC-ANM: Simple Sorting Criteria Help Find the Causal Order in Additive Noise Models. Alexander G. Reisach, Myriam Tami, Christof Seiler, Antoine Chambaz, and Sebastian Weichwald. 2023.
Additive Noise Models (ANM) encode a popular functional assumption that enables learning causal structure from observational data. Due to a lack of real-world data meeting the assumptions, synthetic ANM data are often used to evaluate causal discovery algorithms. Reisach et al. (2021) show that, for common simulation parameters, a variable ordering by increasing variance is closely aligned with a causal order and introduce var-sortability to quantify the alignment. Here, we show that not only variance, but also the fraction of a variable’s variance explained by all others, as captured by the coefficient of determination R^2, tends to increase along the causal order. Simple baseline algorithms can use R^2-sortability to match the performance of established methods. Since R^2-sortability is invariant under data rescaling, these algorithms perform equally well on standardized or rescaled data, addressing a key limitation of algorithms exploiting var-sortability. We characterize and empirically assess R^2-sortability for different simulation parameters. We show that all simulation parameters can affect R^2-sortability and must be chosen deliberately to control the difficulty of the causal discovery task and the real-world plausibility of the simulated data. We provide an implementation of the sortability measures and sortability-based algorithms in our library CausalDisco (https://github.com/CausalDisco/CausalDisco).
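To make the sorting criterion concrete, the following is a minimal sketch of an R^2-based ordering under a linear-model assumption. It uses numpy and scikit-learn rather than the CausalDisco implementation referenced above, and the toy graph and weights are illustrative choices, not taken from the paper.

```python
# Minimal sketch (not the CausalDisco reference implementation): regress each
# variable on all others, take the coefficient of determination R^2, and use
# the order of increasing R^2 as a candidate causal order.
import numpy as np
from sklearn.linear_model import LinearRegression

def r2_order(X):
    """Return column indices of X (n_samples x d) sorted by increasing R^2."""
    d = X.shape[1]
    r2 = np.empty(d)
    for j in range(d):
        others = np.delete(X, j, axis=1)
        r2[j] = LinearRegression().fit(others, X[:, j]).score(others, X[:, j])
    return np.argsort(r2)

# Toy additive noise model x0 -> x1 -> x2 with an extra edge x0 -> x2;
# for these illustrative weights, R^2 increases along the causal order.
rng = np.random.default_rng(0)
n = 10_000
x0 = rng.normal(size=n)
x1 = x0 + rng.normal(size=n)
x2 = x0 + x1 + rng.normal(size=n)
X = np.column_stack([x2, x0, x1])
print(r2_order(X))  # typically [1 2 0], i.e. x0, x1, x2
```

The ordering step above is only half of a sortability-based baseline; a full baseline would then, for example, regress each variable on its predecessors in the candidate order to select edges.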
@misc{reisach2023simple,
  title = {Simple Sorting Criteria Help Find the Causal Order in Additive Noise Models},
  author = {Reisach, Alexander G. and Tami, Myriam and Seiler, Christof and Chambaz, Antoine and Weichwald, Sebastian},
  year = {2023},
  eprint = {2303.18211},
  archiveprefix = {arXiv},
  primaryclass = {stat.ML},
}
2021
- BS-DAG!: Beware of the Simulated DAG! Causal Discovery Benchmarks May Be Easy to Game. Alexander G. Reisach, Christof Seiler, and Sebastian Weichwald. Advances in Neural Information Processing Systems, 2021.
Simulated DAG models may exhibit properties that, perhaps inadvertently, render their structure identifiable and unexpectedly affect structure learning algorithms. Here, we show that marginal variance tends to increase along the causal order for generically sampled additive noise models. We introduce varsortability as a measure of the agreement between the order of increasing marginal variance and the causal order. For commonly sampled graphs and model parameters, we show that the remarkable performance of some continuous structure learning algorithms can be explained by high varsortability and matched by a simple baseline method. Yet, this performance may not transfer to real-world data where varsortability may be moderate or dependent on the choice of measurement scales. On standardized data, the same algorithms fail to identify the ground-truth DAG or its Markov equivalence class. While standardization removes the pattern in marginal variance, we show that data generating processes that incur high varsortability also leave a distinct covariance pattern that may be exploited even after standardization. Our findings challenge the significance of generic benchmarks with independently drawn parameters. The code is available at https://github.com/Scriddie/Varsortability.
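To illustrate the measure, here is a simplified sketch of varsortability that assumes the ground-truth adjacency matrix is known. The paper's exact definition additionally weights variable pairs by the number of directed paths connecting them and scores ties as 1/2; the reference implementation lives in the repository linked above.

```python
# Simplified sketch of varsortability (illustrative, not the paper's exact
# definition): the fraction of ancestor/descendant pairs i -> ... -> j for
# which the marginal variance increases, i.e. Var(X_i) < Var(X_j).
import numpy as np

def simple_varsortability(X, adjacency):
    """X: n_samples x d data; adjacency: d x d, adjacency[i, j] = 1 for i -> j."""
    d = adjacency.shape[0]
    variances = X.var(axis=0)
    # Collect all ancestor/descendant pairs via powers of the adjacency matrix.
    reach = np.zeros((d, d), dtype=bool)
    walk = adjacency.astype(bool)
    for _ in range(d - 1):
        reach |= walk
        walk = (walk.astype(int) @ adjacency) > 0
    pairs = np.argwhere(reach)
    return float(np.mean([variances[i] < variances[j] for i, j in pairs]))

# Chain x0 -> x1 -> x2 with weights > 1: marginal variance grows along the
# causal order, so varsortability is high.
rng = np.random.default_rng(0)
n = 10_000
x0 = rng.normal(size=n)
x1 = 2 * x0 + rng.normal(size=n)
x2 = 2 * x1 + rng.normal(size=n)
A = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]])
X = np.column_stack([x0, x1, x2])
print(simple_varsortability(X, A))  # 1.0 for this chain
```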
@article{reisach2021beware,
  title = {Beware of the Simulated DAG! Causal Discovery Benchmarks May Be Easy to Game},
  author = {Reisach, Alexander G. and Seiler, Christof and Weichwald, Sebastian},
  journal = {Advances in Neural Information Processing Systems},
  volume = {34},
  year = {2021},
}