JeanKaddour / SIN

Licence: other
Causal Effect Inference for Structured Treatments (SIN) (NeurIPS 2021)


Causal Effect Inference for Structured Treatments

Overview

We address the estimation of conditional average treatment effects (CATEs) for structured treatments (e.g., graphs, images, texts). Given a weak condition on the effect, we propose the generalized Robinson decomposition, which (i) isolates the causal estimand (reducing regularization bias), (ii) allows one to plug in arbitrary models for learning, and (iii) possesses a quasi-oracle convergence guarantee under mild assumptions. In experiments with small-world and molecular graphs we demonstrate that our approach outperforms prior work in CATE estimation.

Link to paper

Requirements

We tested the implementation in Python 3.8.

Dependencies

requirements.txt is an automatically generated file with all dependencies.

Essential packages include:

rdkit
numpy
networkx
scikit-learn
torch
pyg
wandb

Datasets

The TCGA simulation requires the TCGA and QM9 datasets. The code automatically downloads and unzips these datasets if they do not exist. Alternatively, the TCGA dataset can be downloaded from here and the QM9 dataset from here. Both datasets should be located in data/tcga/.

Entry points

There are four runnable Python scripts:

  • generate_data.py: Generates and saves a dataset given the configuration in configs/generate_data/.
    • Stores generated data in data_path with folder structure {data_path}/{task}/seed-{seed}/bias-{bias}/
    • For each task, seed, and bias combination, generates and stores a new dataset
  • run_model_training.py: Trains and evaluates a CATE estimation model given the configuration in configs/run_model/.
    • Evaluation results are logged and can be saved to results_path and/or synced to a wandb.ai account
  • run_hyperparameter_sweeping.py: Sweeps hyper-parameters with wandb as specified in configs/sweeps/
  • run_unseen_treatment_update.py: Runs the GNN baseline on a specified dataset and maps the one-hot encodings of treatments unseen in the test set to the closest treatments seen during training, measured by Euclidean distance in the hidden embedding space.
    • Run this script before running the CAT baseline; otherwise, one-hot encodings of unseen treatments will be fed into the network.
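The unseen-treatment mapping described above amounts to a nearest-neighbour lookup in the hidden embedding space. A minimal sketch with NumPy (the embeddings here are made up for illustration):

```python
import numpy as np

# Hypothetical embeddings: each row is a treatment in the GNN's hidden space.
train_emb = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])  # treatments seen in training
unseen_emb = np.array([[0.9, 1.2], [2.1, -0.1]])            # unseen test treatments

# For each unseen treatment, find the closest training treatment (Euclidean distance).
dists = np.linalg.norm(unseen_emb[:, None, :] - train_emb[None, :, :], axis=-1)
closest = dists.argmin(axis=1)  # indices into the training treatments

print(closest)  # → [1 2]
```

Each unseen treatment's one-hot encoding is then replaced by that of the training treatment at the matching index.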

Quick tour

generate_data.py

Important arguments

  • task: Simulation sw or tcga
  • bias: Treatment selection bias coefficient
  • seed: Random seed
  • data_path: Path to save/load generated datasets
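Combining data_path with the folder structure from the entry-points section, the output directory for a given run can be sketched as follows (the argument values here are illustrative):

```python
from pathlib import Path

# Illustrative values; the actual arguments come from configs/generate_data/.
data_path, task, seed, bias = "data", "sw", 0, 10.0

# Folder structure: {data_path}/{task}/seed-{seed}/bias-{bias}
out_dir = Path(data_path) / task / f"seed-{seed}" / f"bias-{bias}"
print(out_dir)  # → data/sw/seed-0/bias-10.0
```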

run_model_training.py

Important arguments

  • task: Simulation sw or tcga
  • model: SIN, gnn, cat, graphite, zero
  • bias: Treatment selection bias coefficient
  • seed: Random seed

Remarks

TCGA Simulation warnings

When parsing SMILES strings from the QM9 dataset for the TCGA simulation, there may be bad-input warnings for certain molecules. The data generator ignores these molecules. When subsampling 10k molecules, we found that around 1% are faulty.
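This skipping behaviour can be reproduced with RDKit directly: MolFromSmiles returns None for unparsable input, so faulty molecules can simply be filtered out (the SMILES strings below are illustrative):

```python
from rdkit import Chem  # rdkit is listed in requirements.txt

smiles = ["C", "CCO", "not_a_molecule"]  # the last entry is deliberately invalid
mols = []
for s in smiles:
    mol = Chem.MolFromSmiles(s)  # emits a bad-input warning and returns None on failure
    if mol is not None:  # skip faulty molecules, as the data generator does
        mols.append(mol)

print(len(mols))  # → 2
```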

Hyper-parameter tuning and experiment management

For hyper-parameter tuning and experiment management, we use the wandb package. Note that both require an account on wandb.ai. Single experiments can be run without an account; in that case, please ignore the warnings.
