Mixture Causal Discovery
NeurIPS 2024: Interventional Causal Discovery in a Mixture of DAGs
TMLR 2024: Separability Analysis for Causal Discovery in Mixture of DAGs
=======================
Project summary:
In various applications, the observed data is heterogeneous and the underlying causal systems are complex:
- Time-varying systems: causal relationships can change over time.
- Cyclic relationships: causal effects can form a cycle, e.g., feedback loops.
- Modeling subpopulations: e.g., subtypes of a cancer do not share exactly the same biological pathways.
Since not all of these causal relationships can be captured by a single Directed Acyclic Graph (DAG), such complex systems are better modeled by a mixture of DAGs.
Causal discovery of mixture models is fundamentally more challenging than that of a single DAG. Two major difficulties stem from (i) an inherent uncertainty about the skeletons of the component DAGs that constitute the mixture and (ii) possibly cyclic relationships across these component DAGs. Despite the practical relevance, these problems are underexplored in the existing literature. In this project, we aim to advance our understanding of causal discovery of mixture models.
Separability Analysis for Causal Discovery in Mixture of DAGs: In this paper, we formalize a framework for addressing the causal discovery objectives in a mixture of DAGs. First, we distinguish the causal relationships that can possibly be learned from those that are impossible to learn. It can be readily verified that some natural objectives commonly pursued for single DAGs become unattainable when working with mixture models. We formalize a framework in which the inferential decisions are centered around separability analysis, which refers to identifying the pairs of random variables that cannot be made conditionally independent in the mixture distribution. Accordingly, we refer to the node pairs associated with inseparable random variables as inseparable pairs and introduce appropriate edge notations to represent them graphically. Given this framework, we investigate what can be learned about the underlying DAGs in a mixture model. We adopt a constraint-based approach and use conditional independence (CI) tests on the mixture distribution to identify the inseparable node pairs.
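A minimal sketch of this constraint-based idea, assuming a two-component mixture of linear Gaussian DAGs and a Fisher-z partial-correlation CI test (the graphs, weights, sample size, and significance level below are illustrative assumptions, and the Gaussian test is applied to the pooled mixture data only for illustration): a node pair is flagged as inseparable if no conditioning set renders it independent.

```python
from itertools import combinations
from math import erf
import numpy as np

rng = np.random.default_rng(0)
N, ALPHA = 5000, 0.01  # samples per component and test level (illustrative)

def sample_linear_dag(weights, n):
    """weights[j] maps each parent i of node j to its linear coefficient;
    nodes are assumed to be listed in a topological order."""
    X = np.zeros((n, len(weights)))
    for j, pa in enumerate(weights):
        X[:, j] = sum(w * X[:, i] for i, w in pa.items()) + rng.normal(size=n)
    return X

# Two hypothetical component DAGs over nodes 0, 1, 2.
dag_a = [{}, {0: 1.0}, {1: 1.0}]      # chain 0 -> 1 -> 2
dag_b = [{}, {}, {0: 1.0, 1: 1.0}]    # collider 0 -> 2 <- 1
X = np.vstack([sample_linear_dag(dag_a, N), sample_linear_dag(dag_b, N)])  # pooled mixture data

def ci_test(X, i, j, S):
    """Fisher-z partial-correlation test; returns True if i ⟂ j | S is accepted."""
    idx = [i, j, *S]
    prec = np.linalg.inv(np.corrcoef(X[:, idx], rowvar=False))
    r = -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])
    z = 0.5 * np.log((1 + r) / (1 - r)) * np.sqrt(len(X) - len(S) - 3)
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / np.sqrt(2))))
    return p_value > ALPHA

# A pair (i, j) is inseparable if no conditioning set makes it independent.
d = X.shape[1]
inseparable = []
for i, j in combinations(range(d), 2):
    others = [k for k in range(d) if k not in (i, j)]
    cond_sets = [S for r in range(len(others) + 1) for S in combinations(others, r)]
    if not any(ci_test(X, i, j, S) for S in cond_sets):
        inseparable.append((i, j))  # candidate edge of the mixture graph
print("inseparable pairs:", inseparable)
```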
Our main contributions are as follows.
- We introduce a mixture graph to represent the inseparable pairs of nodes in the mixture distribution. To interpret the edges of the mixture graph, we present a sufficient condition for two nodes to be separable. This result sheds light on the inadequacy of an existing graphical representation in the literature.
- We strengthen the results for a mixture of tree-structured DAGs and establish necessary and sufficient conditions for two nodes to form an emergent edge in the mixture (an inseparable pair whose nodes are non-adjacent in every component DAG).
- We introduce new edge notations for inseparable node pairs to represent the above conditions and partially orient the skeleton of the mixture graph. We discuss the inference of these oriented edges from unshielded triples and show that v-structures centered at nodes with varying causal mechanisms can be recovered. Finally, we devise an algorithm built on our theoretical results to recover the learnable causal relationships; the classical orientation step it builds on is sketched below.
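The orientation step referenced in the last item builds on the classical rule for unshielded triples. Below is a minimal sketch of that generic rule, assuming a skeleton given as a set of undirected edges and the separating sets recorded during CI testing; the paper's actual procedure additionally relies on the refined edge notations introduced above, which are not reproduced here.

```python
from itertools import combinations

def orient_v_structures(nodes, skeleton, sepset):
    """skeleton: set of frozenset({i, j}) undirected edges;
    sepset[(i, k)]: the set that rendered i and k conditionally independent."""
    v_structures = []
    for i, k in combinations(nodes, 2):
        if frozenset({i, k}) in skeleton:
            continue  # i and k are adjacent, so no unshielded triple i - j - k exists
        for j in nodes:
            if j in (i, k):
                continue
            if frozenset({i, j}) in skeleton and frozenset({j, k}) in skeleton:
                # Unshielded triple i - j - k: orient it as i -> j <- k when the
                # middle node j is absent from the set that separated i and k.
                if j not in sepset.get((min(i, k), max(i, k)), set()):
                    v_structures.append((i, j, k))
    return v_structures

# Hypothetical example: skeleton 0 - 2 - 1, where 0 and 1 were separated by the empty set.
skeleton = {frozenset({0, 2}), frozenset({1, 2})}
print(orient_v_structures([0, 1, 2], skeleton, {(0, 1): set()}))  # -> [(0, 2, 1)]
```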
Interventional Causal Discovery in a Mixture of DAGs: As discussed in our previous paper, observational data is not sufficient to distinguish true causal relationships (true edges) from spurious ones (emergent edges). In this paper, our objective is to identify the true edges via interventions on the mixture model.
- First, we establish matching necessary and sufficient intervention sizes for identifying the mixture parents of a node, i.e., the union of its parent sets across the component DAGs (see the sketch after this list). We show that interventions of size at most \(|{\rm pa}_{\rm m}(i)|+1\) are sufficient to identify the mixture parents \({\rm pa}_{\rm m}(i)\) of node \(i\), and that interventions of size \(|{\rm pa}_{\rm m}(i)|+1\) are necessary in the worst case.
- We specialize this result to a mixture of trees and show that the necessary and sufficient intervention size becomes \(K+1\) for a mixture of \(K\) trees.
- Next, we design an adaptive algorithm that identifies all directed edges of the individual DAGs in the mixture using \(\mathcal{O}(n^2)\) interventions, where \(n\) is the number of variables. Remarkably, the maximum size of the interventions used in our algorithm is optimal if the mixture ancestors of a node (i.e., the union of its ancestors across all DAGs) do not form a cycle.
- Finally, we show that the gap between the maximum intervention size used by the proposed algorithm for a given node and the optimal size is bounded by the cyclic complexity number of that node, defined as the number of nodes that must be intervened on to break the cycles among the node's ancestors, which in turn is upper bounded by the number of such cycles.
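To make the quantities in this list concrete, here is a minimal sketch (using networkx and two illustrative component DAGs; this is not the paper's adaptive algorithm) that computes the mixture parents \({\rm pa}_{\rm m}(i)\), the corresponding sufficient intervention size \(|{\rm pa}_{\rm m}(i)|+1\), the mixture ancestors, and whether those ancestors contain a cycle in the union of the components.

```python
import networkx as nx

def mixture_parents(dags, i):
    """Union of the parent sets of node i across the component DAGs."""
    return set().union(*(set(g.predecessors(i)) for g in dags))

def mixture_ancestors(dags, i):
    """Union of the ancestor sets of node i across the component DAGs."""
    return set().union(*(nx.ancestors(g, i) for g in dags))

def ancestors_form_cycle(dags, i):
    """True if the union of the component DAGs, restricted to the mixture
    ancestors of i, contains a directed cycle."""
    anc = mixture_ancestors(dags, i)
    union = nx.DiGraph()
    union.add_nodes_from(anc)
    for g in dags:
        union.add_edges_from((u, v) for u, v in g.edges if u in anc and v in anc)
    return not nx.is_directed_acyclic_graph(union)

# Two hypothetical component DAGs on nodes 0..3; their union contains the cycle 0 <-> 1.
g1 = nx.DiGraph([(0, 1), (1, 2), (2, 3)])
g2 = nx.DiGraph([(1, 0), (0, 3)])
g2.add_nodes_from(g1.nodes)  # keep both components on the same node set
dags = [g1, g2]
for i in sorted(g1.nodes):
    pa_m = mixture_parents(dags, i)
    print(f"node {i}: pa_m = {pa_m}, sufficient intervention size = {len(pa_m) + 1}, "
          f"cyclic ancestors = {ancestors_form_cycle(dags, i)}")
```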