publications | Burak Varıcı

2025

ICML
Contextures: Representations from Contexts

Runtian Zhai, Kai Yang, Burak Varıcı, Che-Ping Tsai, J. Zico Kolter, and Pradeep Ravikumar

In International Conference on Machine Learning, 2025

Despite the empirical success of foundation models, we do not have a systematic characterization of the representations that these models learn. In this paper, we establish the contexture theory. It shows that a large class of representation learning methods can be characterized as learning from the association between the input and a context variable. Specifically, we show that many popular methods aim to approximate the top-d singular functions of the expectation operator induced by the context, in which case we say that the representation learns the contexture. We demonstrate the generality of the contexture theory by proving that representation learning within various learning paradigms – supervised, self-supervised, and manifold learning – can all be studied from such a perspective. We also prove that the representations that learn the contexture are optimal on those tasks that are compatible with the context. One important implication of the contexture theory is that once the model is large enough to approximate the top singular functions, further scaling up the model size yields diminishing returns. Therefore, scaling is not all we need, and further improvement requires better contexts. To this end, we study how to evaluate the usefulness of a context without knowing the downstream tasks. We propose a metric and show by experiments that it correlates well with the actual performance of the encoder on many real datasets.
@inproceedings{zhai2025contextures,
  title = {Contextures: Representations from Contexts},
  author = {Zhai, Runtian and Yang, Kai and Var{\i}c{\i}, Burak and Tsai, Che-Ping and Kolter, J. Zico and Ravikumar, Pradeep},
  booktitle = {International Conference on Machine Learning},
  year = {2025},
}

AISTATS
On the Consistent Recovery of Joint Distributions from Conditionals

Mahbod Majid, Rattana Pukdee, Vishwajeet Agrawal, Burak Varıcı, and Pradeep Ravikumar

In International Conference on Artificial Intelligence and Statistics, 2025

Self-supervised learning methods that mask parts of the input data and train models to predict the missing components have led to significant advances in machine learning. These approaches learn conditional distributions p(x_T | x_S) simultaneously where x_S, x_T are subsets of the observed variables. In this paper, we examine the core problem of when all these conditional distributions are consistent with some joint distribution, and whether common models used in practice can learn consistent conditionals. We explore this problem in two settings. First, for the complementary conditioning sets where S ∪ T is the full set of variables, we introduce the concept of path consistency, a necessary condition for a consistent joint. Second, we consider the case where we have access to p(x_T | x_S) for all subsets S, T. In this case, we propose the concepts of autoregressive and swap consistency, which we show are necessary and sufficient conditions for a consistent joint. For both settings, we analyze when these consistency conditions hold and show that standard discriminative models may fail to satisfy them. Finally, we corroborate via experiments that proposed consistency measures can be used as proxies for evaluating the consistency of conditionals p(x_T | x_S), and common parameterizations may find it hard to learn true conditionals.
@inproceedings{majid2025consistent,
  title = {On the Consistent Recovery of Joint Distributions from Conditionals},
  author = {Majid, Mahbod and Pukdee, Rattana and Agrawal, Vishwajeet and Var{\i}c{\i}, Burak and Ravikumar, Pradeep},
  booktitle = {International Conference on Artificial Intelligence and Statistics},
  year = {2025},
}

2024

NeurIPS
Linear Causal Representation Learning from Unknown Multi-node Interventions

Burak Varıcı, Emre Acartürk, Karthikeyan Shanmugam, and Ali Tajer

In Proc. Advances in Neural Information Processing Systems, 2024

Despite the multifaceted recent advances in interventional causal representation learning (CRL), they primarily focus on the stylized assumption of single-node interventions. This assumption is not valid in a wide range of applications, and generally, the subset of nodes intervened in an interventional environment is fully unknown. This paper focuses on interventional CRL under unknown multi-node (UMN) interventional environments and establishes the first identifiability results for general latent causal models (parametric or nonparametric) under stochastic interventions (soft or hard) and linear transformation from the latent to observed space. Specifically, it is established that given sufficiently diverse interventional environments, (i) identifiability up to ancestors is possible using only soft interventions, and (ii) perfect identifiability is possible using hard interventions. Remarkably, these guarantees match the best-known results for more restrictive single-node interventions. Furthermore, CRL algorithms are also provided that achieve the identifiability guarantees. A central step in designing these algorithms is establishing the relationships between UMN interventional CRL and score functions associated with the statistical models of different interventional environments. Establishing these relationships also serves as constructive proof of the identifiability guarantees.
@inproceedings{varici2024linear,
  title = {Linear Causal Representation Learning from Unknown Multi-node Interventions},
  author = {Var{\i}c{\i}, Burak and Acart{\"u}rk, Emre and Shanmugam, Karthikeyan and Tajer, Ali},
  booktitle = {Proc. Advances in Neural Information Processing Systems},
  year = {2024},
  video_short = {https://neurips.cc/virtual/2024/poster/93136},
}

NeurIPS
Interventional Causal Discovery in a Mixture of DAGs

Burak Varıcı, Dmitriy A Katz, Dennis Wei, Prasanna Sattigeri, and Ali Tajer

In Proc. Advances in Neural Information Processing Systems, 2024

Causal interactions among a group of variables are often modeled by a single causal graph. In some domains, however, these interactions are best described by multiple co-existing causal graphs, e.g., in dynamical systems or genomics. This paper addresses the hitherto unknown role of interventions in learning causal interactions among variables governed by a mixture of causal systems, each modeled by one directed acyclic graph (DAG). Causal discovery from mixtures is fundamentally more challenging than single-DAG causal discovery. Two major difficulties stem from (i) an inherent uncertainty about the skeletons of the component DAGs that constitute the mixture and (ii) possibly cyclic relationships across these component DAGs. This paper addresses these challenges and aims to identify edges that exist in at least one component DAG of the mixture, referred to as the true edges. First, it establishes matching necessary and sufficient conditions on the size of interventions required to identify the true edges. Next, guided by the necessity results, an adaptive algorithm is designed that learns all true edges using O(n^2) interventions, where n is the number of nodes. Remarkably, the size of the interventions is optimal if the underlying mixture model does not contain cycles across its components. More generally, the gap between the intervention size used by the algorithm and the optimal size is quantified. It is shown to be bounded by the cyclic complexity number of the mixture model, defined as the size of the minimal intervention that can break the cycles in the mixture, which is upper bounded by the number of cycles among the ancestors of a node.
@inproceedings{varici2024interventional,
  title = {Interventional Causal Discovery in a Mixture of DAGs},
  author = {Var{\i}c{\i}, Burak and Katz, Dmitriy A and Wei, Dennis and Sattigeri, Prasanna and Tajer, Ali},
  booktitle = {Proc. Advances in Neural Information Processing Systems},
  year = {2024},
  video_short = {https://neurips.cc/virtual/2024/poster/93767},
}

NeurIPS

Sample Complexity of Interventional Causal Representation Learning

Emre Acartürk, Burak Varıcı, Karthikeyan Shanmugam, and Ali Tajer

In Proc. Advances in Neural Information Processing Systems, 2024

@inproceedings{acarturk2024sample,
  title = {Sample Complexity of Interventional Causal Representation Learning},
  author = {Acart{\"u}rk, Emre and Var{\i}c{\i}, Burak and Shanmugam, Karthikeyan and Tajer, Ali},
  booktitle = {Proc. Advances in Neural Information Processing Systems},
  year = {2024},
}

AISTATS (oral)
General Identifiability and Achievability for Causal Representation Learning

Burak Varıcı, Emre Acartürk, Karthikeyan Shanmugam, and Ali Tajer

In Proc. International Conference on Artificial Intelligence and Statistics, 2024

This paper focuses on causal representation learning (CRL) under a general nonparametric latent causal model and a general transformation model that maps the latent data to the observational data. It establishes identifiability and achievability results using two hard uncoupled interventions per node in the latent causal graph. Notably, one does not know which pair of intervention environments have the same node intervened (hence, uncoupled). For identifiability, the paper establishes that perfect recovery of the latent causal model and variables is guaranteed under uncoupled interventions. For achievability, an algorithm is designed that uses observational and interventional data and recovers the latent causal model and variables with provable guarantees. This algorithm leverages score variations across different environments to estimate the inverse of the transformer and, subsequently, the latent variables. The analysis, additionally, recovers the identifiability result for two hard coupled interventions, that is when metadata about the pair of environments that have the same node intervened is known. This paper also shows that when observational data is available, additional faithfulness assumptions that are adopted by the existing literature are unnecessary.
@inproceedings{varici2024general,
  title = {General Identifiability and Achievability for Causal Representation Learning},
  author = {Var{\i}c{\i}, Burak and Acart{\"u}rk, Emre and Shanmugam, Karthikeyan and Tajer, Ali},
  booktitle = {Proc. International Conference on Artificial Intelligence and Statistics},
  year = {2024},
  address = {Valencia, Spain},
}

arXiv
Score-based causal representation learning: Linear and General Transformations

Burak Varıcı, Emre Acartürk, Karthikeyan Shanmugam, Abhishek Kumar, and Ali Tajer

arXiv:2402.00849, 2024

This paper addresses intervention-based causal representation learning (CRL) under a general nonparametric latent causal model and an unknown transformation that maps the latent variables to the observed variables. Linear and general transformations are investigated. The paper addresses both the identifiability and achievability aspects. Identifiability refers to determining algorithm-agnostic conditions that ensure recovering the true latent causal variables and the latent causal graph underlying them. Achievability refers to the algorithmic aspects and addresses designing algorithms that achieve identifiability guarantees. By drawing novel connections between score functions (i.e., the gradients of the logarithm of density functions) and CRL, this paper designs a score-based class of algorithms that ensures both identifiability and achievability. First, the paper focuses on linear transformations and shows that one stochastic hard intervention per node suffices to guarantee identifiability. It also provides partial identifiability guarantees for soft interventions, including identifiability up to ancestors for general causal models and perfect latent graph recovery for sufficiently non-linear causal models. Secondly, it focuses on general transformations and shows that two stochastic hard interventions per node suffice for identifiability. Notably, one does not need to know which pair of interventional environments have the same node intervened. Finally, the theoretical results are empirically validated via experiments on structured synthetic data and image data.
@article{varici2024score,
  title = {Score-based causal representation learning: Linear and General Transformations},
  author = {Var{\i}c{\i}, Burak and Acart{\"u}rk, Emre and Shanmugam, Karthikeyan and Kumar, Abhishek and Tajer, Ali},
  journal = {arXiv:2402.00849},
  year = {2024},
}

TMLR
Separability Analysis for Causal Discovery in Mixture of DAGs

Burak Varıcı, Dmitriy Katz-Rogozhnikov, Dennis Wei, Prasanna Sattigeri, and Ali Tajer

Transactions on Machine Learning Research, 2024

Directed acyclic graphs (DAGs) are effective for compactly representing causal systems and specifying the causal relationships among the system’s constituents. Specifying such causal relationships in some systems requires a mixture of multiple DAGs – a single DAG is insufficient. Some examples include time-varying causal systems or aggregated subgroups of a population. Recovering the causal structure of the systems represented by single DAGs is investigated extensively, but it remains mainly open for the systems represented by a mixture of DAGs. A major difference between single- versus mixture-DAG recovery is the existence of node pairs that are separable in the individual DAGs but become inseparable in their mixture. This paper provides the theoretical foundations for analyzing such inseparable node pairs. Specifically, the notion of emergent edges is introduced to represent such inseparable pairs that do not exist in the single DAGs but emerge in their mixtures. Necessary conditions for identifying the emergent edges are established. Operationally, these conditions serve as sufficient conditions for separating a pair of nodes in the mixture of DAGs. These results are further extended, and matching necessary and sufficient conditions for identifying the emergent edges in tree-structured DAGs are established. Finally, a novel graphical representation is formalized to specify these conditions, and an algorithm is provided for inferring the learnable causal relations.
@article{varici2024separability,
  title = {Separability Analysis for Causal Discovery in Mixture of {DAG}s},
  author = {Var{\i}c{\i}, Burak and Katz-Rogozhnikov, Dmitriy and Wei, Dennis and Sattigeri, Prasanna and Tajer, Ali},
  journal = {Transactions on Machine Learning Research},
  issn = {2835-8856},
  year = {2024},
}

JSAIT

Robust Causal Bandits for Linear Models

Zirui Yan, Arpan Mukherjee, Burak Varıcı, and Ali Tajer

IEEE Journal on Selected Areas in Information Theory, 2024

@article{yan2024robust,
  title = {Robust Causal Bandits for Linear Models},
  author = {Yan, Zirui and Mukherjee, Arpan and Var{\i}c{\i}, Burak and Tajer, Ali},
  journal = {IEEE Journal on Selected Areas in Information Theory},
  year = {2024},
}

ISIT

Improved Bound for Robust Causal Bandits with Linear Models

Zirui Yan, Arpan Mukherjee, Burak Varıcı, and Ali Tajer

In International Symposium on Information Theory, 2024

@inproceedings{yan2024improved,
  title = {Improved Bound for Robust Causal Bandits with Linear Models},
  author = {Yan, Zirui and Mukherjee, Arpan and Var{\i}c{\i}, Burak and Tajer, Ali},
  booktitle = {International Symposium on Information Theory},
  year = {2024},
  address = {Athens, Greece},
}

2023

JMLR

Causal Bandits for Linear Structural Equation Models

Burak Varıcı, Karthikeyan Shanmugam, Prasanna Sattigeri, and Ali Tajer

Journal of Machine Learning Research, 2023

@article{varici2023causal,
  author = {Var{\i}c{\i}, Burak and Shanmugam, Karthikeyan and Sattigeri, Prasanna and Tajer, Ali},
  title = {Causal Bandits for Linear Structural Equation Models},
  journal = {Journal of Machine Learning Research},
  year = {2023},
  volume = {24},
  number = {297},
  pages = {1--59},
  url = {http://jmlr.org/papers/v24/22-0969.html},
  video_short = {https://neurips.cc/virtual/2024/poster/98317}
}

arXiv
Score-based causal representation learning with interventions

Burak Varıcı, Emre Acartürk, Karthikeyan Shanmugam, Abhishek Kumar, and Ali Tajer

arXiv:2301.08230, 2023

This paper studies the causal representation learning problem when the latent causal variables are observed indirectly through an unknown linear transformation. The objectives are: (i) recovering the unknown linear transformation (up to scaling) and (ii) determining the directed acyclic graph (DAG) underlying the latent variables. Sufficient conditions for DAG recovery are established, and it is shown that a large class of non-linear models in the latent space (e.g., causal mechanisms parameterized by two-layer neural networks) satisfy these conditions. These sufficient conditions ensure that the effect of an intervention can be detected correctly from changes in the score. Capitalizing on this property, recovering a valid transformation is facilitated by the following key property: any valid transformation renders latent variables’ score function to necessarily have the minimal variations across different interventional environments. This property is leveraged for perfect recovery of the latent DAG structure using only soft interventions. For the special case of stochastic hard interventions, with an additional hypothesis testing step, one can also uniquely recover the linear transformation up to scaling and a valid causal ordering.
@article{varici2023score,
  title = {Score-based causal representation learning with interventions},
  author = {Var{\i}c{\i}, Burak and Acart{\"u}rk, Emre and Shanmugam, Karthikeyan and Kumar, Abhishek and Tajer, Ali},
  journal = {arXiv:2301.08230},
  year = {2023},
}

2022

UAI

Intervention target estimation in the presence of latent variables

Burak Varıcı, Karthikeyan Shanmugam, Prasanna Sattigeri, and Ali Tajer

In Proc. Conference on Uncertainty in Artificial Intelligence, 2022

@inproceedings{varici2022intervention,
  title = {Intervention target estimation in the presence of latent variables},
  author = {Var{\i}c{\i}, Burak and Shanmugam, Karthikeyan and Sattigeri, Prasanna and Tajer, Ali},
  booktitle = {Proc. Conference on Uncertainty in Artificial Intelligence},
  pages = {2013--2023},
  year = {2022},
  address = {Eindhoven, Netherlands},
}

2021

AISTATS

Learning Shared Subgraphs in Ising Model Pairs

Burak Varıcı, Saurabh Sihag, and Ali Tajer

In Proc. International Conference on Artificial Intelligence and Statistics, 2021

@inproceedings{varici2021learning,
  title = {Learning Shared Subgraphs in Ising Model Pairs},
  author = {Var{\i}c{\i}, Burak and Sihag, Saurabh and Tajer, Ali},
  booktitle = {Proc. International Conference on Artificial Intelligence and Statistics},
  pages = {3952--3960},
  year = {2021},
}

NeurIPS

Scalable Intervention Target Estimation in Linear Models

Burak Varıcı, Karthikeyan Shanmugam, Prasanna Sattigeri, and Ali Tajer

In Proc. Advances in Neural Information Processing Systems, 2021

@inproceedings{varici2021scalable,
  title = {Scalable Intervention Target Estimation in Linear Models},
  author = {Var{\i}c{\i}, Burak and Shanmugam, Karthikeyan and Sattigeri, Prasanna and Tajer, Ali},
  booktitle = {Proc. Advances in Neural Information Processing Systems},
  year = {2021},
  pages = {1494--1505},
}