Independence Tests
This section covers conditional tests of independence. These tests are used in many constraint-based learning
algorithms, such as PC, MMPC, MMHC and DMMHC.
Abstract classes
- class pybnesian.learning.independences.IndependenceTest
The IndependenceTest is an abstract class defining an interface for a conditional test of independence.
An IndependenceTest is defined over a set of variables and can calculate the p-value of any conditional test on these variables.
- __init__(self: pybnesian.learning.independences.IndependenceTest) → None
Initializes an IndependenceTest.
- has_variables(self: pybnesian.learning.independences.IndependenceTest, variables: str or List[str]) → bool
Checks whether this IndependenceTest has the given variables.
- Parameters
variables – Name or list of variables.
- Returns
True if the IndependenceTest is defined over the set of variables, False otherwise.
- name(self: pybnesian.learning.independences.IndependenceTest, index: int) → str
Gets the name of the index-th variable.
- Parameters
index – Index of the variable.
- Returns
Variable name at the index position.
- num_variables(self: pybnesian.learning.independences.IndependenceTest) → int
Gets the number of variables of the IndependenceTest.
- Returns
Number of variables of the IndependenceTest.
- pvalue(*args, **kwargs)
Overloaded function.
pvalue(self: pybnesian.learning.independences.IndependenceTest, x: str, y: str) -> float
Calculates the p-value of the unconditional test of independence \(x \perp y\).
- Parameters
x – A variable name.
y – A variable name.
- Returns
The p-value of the unconditional test of independence \(x \perp y\).
pvalue(self: pybnesian.learning.independences.IndependenceTest, x: str, y: str, z: str) -> float
Calculates the p-value of a univariate conditional test of independence \(x \perp y \mid z\).
- Parameters
x – A variable name.
y – A variable name.
z – A variable name.
- Returns
The p-value of a univariate conditional test of independence \(x \perp y \mid z\).
pvalue(self: pybnesian.learning.independences.IndependenceTest, x: str, y: str, z: List[str]) -> float
Calculates the p-value of a multivariate conditional test of independence \(x \perp y \mid \mathbf{z}\).
- Parameters
x – A variable name.
y – A variable name.
z – A list of variable names.
- Returns
The p-value of a multivariate conditional test of independence \(x \perp y \mid \mathbf{z}\).
- variable_names(self: pybnesian.learning.independences.IndependenceTest) → List[str]
Gets the list of variable names of the IndependenceTest.
- Returns
List of variable names of the IndependenceTest.
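As a rough illustration of how this interface is typically consumed, the sketch below mirrors the documented method names in a plain Python abstract class and backs it with a lookup table of p-values. Everything in it (SketchIndependenceTest, TableTest, is_independent, the alpha threshold and the p-values themselves) is hypothetical and not part of pybnesian. It only shows how the three pvalue overloads can be folded into a single call, and how a constraint-based algorithm might turn a p-value into an independence decision.

```python
from abc import ABC, abstractmethod
from typing import List, Union


class SketchIndependenceTest(ABC):
    """Hypothetical stand-in that mirrors the documented method names."""

    def __init__(self, variables: List[str]) -> None:
        self._variables = list(variables)

    def num_variables(self) -> int:
        return len(self._variables)

    def variable_names(self) -> List[str]:
        return list(self._variables)

    def name(self, index: int) -> str:
        return self._variables[index]

    def has_variables(self, variables: Union[str, List[str]]) -> bool:
        names = [variables] if isinstance(variables, str) else variables
        return all(v in self._variables for v in names)

    def pvalue(self, x: str, y: str, z: Union[None, str, List[str]] = None) -> float:
        # Fold the three documented overloads into one conditioning list.
        if z is None:
            cond: List[str] = []
        elif isinstance(z, str):
            cond = [z]
        else:
            cond = list(z)
        return self._pvalue(x, y, cond)

    @abstractmethod
    def _pvalue(self, x: str, y: str, z: List[str]) -> float:
        ...


class TableTest(SketchIndependenceTest):
    """Toy test that looks p-values up in a table of made-up numbers."""

    def __init__(self, variables, table):
        super().__init__(variables)
        self._table = table

    def _pvalue(self, x, y, z):
        # The test is symmetric in x and y, so the pair is stored unordered.
        return self._table[(frozenset((x, y)), frozenset(z))]


def is_independent(test, x, y, z=None, alpha=0.05):
    """Accept independence when the p-value is at least alpha (assumed rule)."""
    return test.pvalue(x, y, z) >= alpha


table = {
    (frozenset(("a", "b")), frozenset()): 0.001,
    (frozenset(("a", "b")), frozenset(("c",))): 0.40,
}
test = TableTest(["a", "b", "c"], table)
```

With this table, is_independent(test, "a", "b") rejects independence, while is_independent(test, "a", "b", "c") accepts it, which is the pattern a constraint-based learner uses to remove an edge once a separating set is found.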
- class pybnesian.learning.independences.DynamicIndependenceTest
A DynamicIndependenceTest adapts the static IndependenceTest to learn dynamic Bayesian networks. It generates a static and a transition independence test to learn the static and transition components of the dynamic Bayesian network.
The dynamic independence tests are usually implemented using a DynamicDataFrame with the methods DynamicDataFrame.static_df and DynamicDataFrame.transition_df.
- has_variables(self: pybnesian.learning.independences.DynamicIndependenceTest, variables: str or List[str]) → bool
Checks whether this DynamicIndependenceTest has the given variables.
- Parameters
variables – Name or list of variables.
- Returns
True if the DynamicIndependenceTest is defined over the set of variables, False otherwise.
- markovian_order(self: pybnesian.learning.independences.DynamicIndependenceTest) → int
Gets the Markovian order used in this DynamicIndependenceTest.
- Returns
Markovian order of the DynamicIndependenceTest.
- name(self: pybnesian.learning.independences.DynamicIndependenceTest, index: int) → str
Gets the name of the index-th variable.
- Parameters
index – Index of the variable.
- Returns
Variable name at the index position.
- num_variables(self: pybnesian.learning.independences.DynamicIndependenceTest) → int
Gets the number of variables of the DynamicIndependenceTest.
- Returns
Number of variables of the DynamicIndependenceTest.
- static_tests(self: pybnesian.learning.independences.DynamicIndependenceTest) → pybnesian.learning.independences.IndependenceTest
Returns the static independence test component of the DynamicIndependenceTest.
- Returns
The static independence test component.
- transition_tests(self: pybnesian.learning.independences.DynamicIndependenceTest) → pybnesian.learning.independences.IndependenceTest
Returns the transition independence test component of the DynamicIndependenceTest.
- Returns
The transition independence test component.
- variable_names(self: pybnesian.learning.independences.DynamicIndependenceTest) → List[str]
Gets the list of variable names of the DynamicIndependenceTest.
- Returns
List of variable names of the DynamicIndependenceTest.
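To make the idea of a transition component more concrete, the sketch below lays a multivariate time series out as lagged rows, so that an ordinary (static) independence test could be run between a variable at the current time step and variables at earlier steps. It is only an illustration in plain Python: the function transition_rows and the column labels of the form name_t_lag are assumptions made for this example and are not claimed to match what DynamicDataFrame.transition_df produces. The static component is not covered here.

```python
def transition_rows(series, markovian_order):
    """Build one row per time step t >= markovian_order with lagged copies.

    `series` maps a variable name to its list of observations over time.
    """
    names = sorted(series)
    length = len(series[names[0]])
    rows = []
    for t in range(markovian_order, length):
        row = {}
        for lag in range(markovian_order + 1):
            for name in names:
                # Hypothetical column label: variable name plus its lag.
                row[f"{name}_t_{lag}"] = series[name][t - lag]
        rows.append(row)
    return rows


series = {"a": [1, 2, 3, 4], "b": [10, 20, 30, 40]}
rows = transition_rows(series, markovian_order=1)
```

Each row now holds the present value (lag 0) next to the values from the previous markovian_order steps, which is the shape of data a test of the form "a at time t is independent of b at time t-1" would need.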
Concrete classes
- class pybnesian.learning.independences.LinearCorrelation
Bases: pybnesian.learning.independences.IndependenceTest
This class implements a partial linear correlation independence test. This independence test is only valid for continuous data.
- __init__(self: pybnesian.learning.independences.LinearCorrelation, df: DataFrame) → None
Initializes a LinearCorrelation for the continuous variables in the DataFrame df.
- Parameters
df – DataFrame on which to calculate the independence tests.
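The idea behind a partial linear correlation test can be sketched with the standard library alone. The code below computes the Pearson correlation of x and y, optionally removes the linear effect of a single conditioning variable z, and converts the result into a two-sided p-value with the Fisher z-transform. The function names, the restriction to at most one conditioning variable, and the choice of the Fisher z approximation are assumptions of this sketch; LinearCorrelation may use a different null distribution internally.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr_pvalue(data, x, y, z=None):
    """Two-sided p-value for x independent of y, optionally given one variable z."""
    n = len(data[x])
    r = pearson(data[x], data[y])
    n_cond = 0
    if z is not None:
        r_xz = pearson(data[x], data[z])
        r_yz = pearson(data[y], data[z])
        # First-order partial correlation of x and y with z removed.
        r = (r - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
        n_cond = 1
    # Keep atanh finite if rounding pushes |r| to 1.
    r = max(min(r, 1 - 1e-12), -1 + 1e-12)
    # Fisher z-transform: approximately standard normal under independence.
    stat = math.atanh(r) * math.sqrt(n - n_cond - 3)
    # erfc(|s| / sqrt(2)) equals 2 * (1 - Phi(|s|)) without cancellation.
    return math.erfc(abs(stat) / math.sqrt(2))


# Synthetic data: a and b are both noisy copies of c.
rng = random.Random(0)
c = [rng.gauss(0, 1) for _ in range(500)]
data = {
    "c": c,
    "a": [v + rng.gauss(0, 1) for v in c],
    "b": [v + rng.gauss(0, 1) for v in c],
}
p_marginal = partial_corr_pvalue(data, "a", "b")
p_conditional = partial_corr_pvalue(data, "a", "b", "c")
```

Because a and b depend on each other only through c, the marginal p-value is tiny, while conditioning on c removes most of the association and yields a much larger p-value.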
- class pybnesian.learning.independences.KMutualInformation
Bases: pybnesian.learning.independences.IndependenceTest
This class implements a non-parametric independence test that is based on the estimation of the mutual information using k-nearest neighbors. This independence test is only implemented for continuous data.
This independence test is based on [CMIknn].
- __init__(self: pybnesian.learning.independences.KMutualInformation, df: DataFrame, k: int, seed: Optional[int] = None, shuffle_neighbors: int = 5, samples: int = 1000) → None
Initializes a KMutualInformation for data df. k is the number of neighbors in the k-nn model used to estimate the mutual information.
This is a permutation independence test, so samples defines the number of permutations. shuffle_neighbors (\(k_{perm}\) in the original paper [CMIknn]) defines how many neighbors are used to perform the conditional permutations.
- Parameters
df – DataFrame on which to calculate the independence tests.
k – Number of neighbors in the k-nn model used to estimate the mutual information.
seed – A random seed number. If not specified or None, a random seed is generated.
shuffle_neighbors – Number of neighbors used to perform the conditional permutation.
samples – Number of permutations for the KMutualInformation.
- mi(*args, **kwargs)
Overloaded function.
mi(self: pybnesian.learning.independences.KMutualInformation, x: str, y: str) -> float
Estimates the unconditional mutual information \(\text{MI}(x, y)\).
- Parameters
x – A variable name.
y – A variable name.
- Returns
The unconditional mutual information \(\text{MI}(x, y)\).
mi(self: pybnesian.learning.independences.KMutualInformation, x: str, y: str, z: str) -> float
Estimates the univariate conditional mutual information \(\text{MI}(x, y \mid z)\).
- Parameters
x – A variable name.
y – A variable name.
z – A variable name.
- Returns
The univariate conditional mutual information \(\text{MI}(x, y \mid z)\).
mi(self: pybnesian.learning.independences.KMutualInformation, x: str, y: str, z: List[str]) -> float
Estimates the multivariate conditional mutual information \(\text{MI}(x, y \mid \mathbf{z})\).
- Parameters
x – A variable name.
y – A variable name.
z – A list of variable names.
- Returns
The multivariate conditional mutual information \(\text{MI}(x, y \mid \mathbf{z})\).
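The combination of a k-nearest-neighbor mutual information estimate with a permutation test can be sketched for the unconditional case as follows. The estimator is a simple Kraskov-Stögbauer-Grassberger style estimate for two one-dimensional samples, and the p-value is the fraction of shuffled datasets whose estimate is at least as large as the observed one. All names, the add-one p-value formula and the brute-force neighbor search are assumptions of this sketch. The conditional case, where shuffle_neighbors controls a local permutation scheme, is not reproduced here.

```python
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma function at a positive integer n, via harmonic numbers."""
    return -EULER_GAMMA + sum(1.0 / i for i in range(1, n))


def knn_mutual_information(xs, ys, k=3):
    """KSG-style k-nn estimate of MI(x, y) in nats for two 1-D samples."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        # Max-norm distance from point i to every other point in (x, y) space.
        dists = sorted(
            max(abs(xs[i] - xs[j]), abs(ys[i] - ys[j]))
            for j in range(n) if j != i
        )
        eps = dists[k - 1]  # distance to the k-th nearest neighbor
        # Count neighbors strictly closer than eps in each marginal space.
        n_x = sum(1 for j in range(n) if j != i and abs(xs[i] - xs[j]) < eps)
        n_y = sum(1 for j in range(n) if j != i and abs(ys[i] - ys[j]) < eps)
        total += digamma_int(n_x + 1) + digamma_int(n_y + 1)
    return digamma_int(k) + digamma_int(n) - total / n


def permutation_pvalue(xs, ys, k=3, samples=99, seed=0):
    """Return (observed MI, p-value) using `samples` random shuffles of y."""
    rng = random.Random(seed)
    observed = knn_mutual_information(xs, ys, k)
    shuffled = list(ys)
    exceed = 0
    for _ in range(samples):
        rng.shuffle(shuffled)  # shuffling y breaks any dependence on x
        if knn_mutual_information(xs, shuffled, k) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (samples + 1)


# Nonlinear dependence that a linear correlation test would largely miss.
rng = random.Random(1)
xs = [rng.uniform(-1, 1) for _ in range(60)]
ys = [x * x + 0.05 * rng.gauss(0, 1) for x in xs]
mi_observed, p_value = permutation_pvalue(xs, ys, k=3, samples=99, seed=0)
```

The smallest p-value this scheme can report is 1 / (samples + 1), which is one reason the number of permutations matters when a small significance level is used.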
- class pybnesian.learning.independences.RCoT
Bases: pybnesian.learning.independences.IndependenceTest
This class implements a non-parametric independence test called Randomized Conditional Correlation Test (RCoT). This method is described in [RCoT]. This independence test is only implemented for continuous data.
This method uses random Fourier features and is designed to be a fast non-parametric independence test.
- __init__(self: pybnesian.learning.independences.RCoT, df: DataFrame, random_fourier_xy: int = 5, random_fourier_z: int = 100) → None
Initializes a RCoT for data df. The number of random Fourier features used for the x and y variables in IndependenceTest.pvalue is random_fourier_xy. The number of random Fourier features used for z is equal to random_fourier_z.
- Parameters
df – DataFrame on which to calculate the independence tests.
random_fourier_xy – Number of random Fourier features for the variables of the independence test.
random_fourier_z – Number of random Fourier features for the conditioning variables of the independence test.
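For readers unfamiliar with random Fourier features, the sketch below shows the basic construction for a scalar variable: random cosine features whose inner product approximates a Gaussian kernel. It illustrates only this building block, not the RCoT statistic or its null distribution. The function names, the bandwidth sigma and the seed are assumptions of the example, and the large feature count is chosen purely to make the approximation easy to see (the defaults documented above are far smaller).

```python
import math
import random


def random_fourier_features(values, num_features, sigma=1.0, seed=0):
    """Map each scalar in `values` to num_features random cosine features."""
    rng = random.Random(seed)
    # Frequencies from N(0, 1/sigma^2) and phases from U(0, 2*pi).
    freqs = [rng.gauss(0.0, 1.0 / sigma) for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)
    return [
        [scale * math.cos(w * v + b) for w, b in zip(freqs, phases)]
        for v in values
    ]


def gaussian_kernel(u, v, sigma=1.0):
    """Exact Gaussian (RBF) kernel that the features approximate."""
    return math.exp(-((u - v) ** 2) / (2.0 * sigma ** 2))


u, v = 0.3, 1.1
feat_u, feat_v = random_fourier_features([u, v], num_features=5000)
approx = sum(p * q for p, q in zip(feat_u, feat_v))
exact = gaussian_kernel(u, v)
```

The approximation error shrinks roughly with the square root of the number of features, so using more features for the conditioning set than for x and y trades extra computation for a better kernel approximation where it is spent.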
- class pybnesian.learning.independences.DynamicLinearCorrelation
Bases: pybnesian.learning.independences.DynamicIndependenceTest
The dynamic adaptation of the LinearCorrelation independence test.
- __init__(self: pybnesian.learning.independences.DynamicLinearCorrelation, ddf: pybnesian.dataset.DynamicDataFrame) → None
Initializes a DynamicLinearCorrelation with the given DynamicDataFrame ddf.
- Parameters
ddf – DynamicDataFrame to create the DynamicLinearCorrelation.
- class pybnesian.learning.independences.DynamicKMutualInformation
Bases: pybnesian.learning.independences.DynamicIndependenceTest
The dynamic adaptation of the KMutualInformation independence test.
- __init__(self: pybnesian.learning.independences.DynamicKMutualInformation, ddf: pybnesian.dataset.DynamicDataFrame, k: int, seed: Optional[int] = None, shuffle_neighbors: int = 5, samples: int = 1000) → None
Initializes a DynamicKMutualInformation with the given DynamicDataFrame ddf. The k, seed, shuffle_neighbors and samples parameters are passed to the static and transition components of KMutualInformation.
- Parameters
ddf – DynamicDataFrame to create the DynamicKMutualInformation.
k – Number of neighbors in the k-nn model used to estimate the mutual information.
seed – A random seed number. If not specified or None, a random seed is generated.
shuffle_neighbors – Number of neighbors used to perform the conditional permutation.
samples – Number of permutations for the KMutualInformation.
- class pybnesian.learning.independences.DynamicRCoT
Bases: pybnesian.learning.independences.DynamicIndependenceTest
The dynamic adaptation of the RCoT independence test.
- __init__(self: pybnesian.learning.independences.DynamicRCoT, ddf: pybnesian.dataset.DynamicDataFrame, random_fourier_xy: int = 5, random_fourier_z: int = 100) → None
Initializes a DynamicRCoT with the given DynamicDataFrame ddf. The random_fourier_xy and random_fourier_z parameters are passed to the static and transition components of RCoT.
- Parameters
ddf – DynamicDataFrame to create the DynamicRCoT.
random_fourier_xy – Number of random Fourier features for the variables of the independence test.
random_fourier_z – Number of random Fourier features for the conditioning variables of the independence test.
Bibliography
- [CMIknn]
Runge, J. (2018). Conditional independence testing based on a nearest-neighbor estimator of conditional mutual information. International Conference on Artificial Intelligence and Statistics, AISTATS 2018, 84, 938–947.
- [RCoT]
Strobl, E. V., Zhang, K., & Visweswaran, S. (2019). Approximate kernel-based conditional independence tests for fast non-parametric causal discovery. Journal of Causal Inference, 7(1).