Structure Scores

This section describes the learning scores that evaluate the goodness of a Bayesian network structure. They are used by score-and-search learning algorithms such as GreedyHillClimbing, MMHC and DMMHC.

Abstract classes

class pybnesian.Score

A Score scores Bayesian network structures.

__init__(self: pybnesian.Score) → None: Initializes a Score.

__str__(self: pybnesian.Score) → str

compatible_bn(self: pybnesian.Score, model: BayesianNetworkBase or ConditionalBayesianNetworkBase) → bool

Checks whether the model is compatible (can be used) with this Score.

Parameters: model – A Bayesian network model.
Returns: True if the Bayesian network model is compatible with this Score, False otherwise.

data(self: pybnesian.Score) → DataFrame

Returns the DataFrame used to calculate the score and local scores.

Returns: DataFrame used to calculate the score. If the score does not use data, it returns None.

has_variables(self: pybnesian.Score, variables: str or List[str]) → bool

Checks whether this Score has the given variables.

Parameters: variables – Name or list of variables.
Returns: True if the Score is defined over the set of variables, False otherwise.

local_score(*args, **kwargs)

Overloaded function.

local_score(self: pybnesian.Score, model: pybnesian.ConditionalBayesianNetworkBase, variable: str) -> float
local_score(self: pybnesian.Score, model: pybnesian.BayesianNetworkBase, variable: str) -> float

Returns the local score value of a node variable in the model.

For example:

>>> score.local_score(m, "a")

returns the local score of node "a" in the model m. This method assumes that the parents of "a" are m.parents("a") and its node type is m.node_type("a").

Parameters

model – Bayesian network model.
variable – A variable name.

Returns

Local score value of node in the model.

local_score(self: pybnesian.Score, model: pybnesian.ConditionalBayesianNetworkBase, variable: str, evidence: List[str]) -> float
local_score(self: pybnesian.Score, model: pybnesian.BayesianNetworkBase, variable: str, evidence: List[str]) -> float

Returns the local score value of a node variable in the model if it had evidence as parents.

For example:

>>> score.local_score(m, "a", ["b"])

returns the local score of node "a" in the model m, with ["b"] as parents. This method assumes that the node type of "a" is m.node_type("a").

Parameters

model – Bayesian network model.
variable – A variable name.
evidence – A list of parent names.

Returns

Local score value of node in the model with evidence as parents.
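
The relationship between the two groups of overloads can be illustrated with a small, library-independent sketch. ToyModel and ToyScore below are hypothetical stand-ins, not PyBNesian classes, and the scoring rule is arbitrary. The only point is that omitting evidence should be equivalent to passing the parents stored in the model.

```python
from typing import Dict, List, Optional


class ToyModel:
    """Hypothetical stand-in for a Bayesian network: only a parent map."""

    def __init__(self, parent_map: Dict[str, List[str]]) -> None:
        self._parents = parent_map

    def parents(self, variable: str) -> List[str]:
        return list(self._parents[variable])


class ToyScore:
    """Hypothetical score with an arbitrary rule: each parent costs 0.5."""

    def local_score(self, model: ToyModel, variable: str,
                    evidence: Optional[List[str]] = None) -> float:
        # Without explicit evidence, use the parents stored in the model.
        if evidence is None:
            evidence = model.parents(variable)
        return -1.0 - 0.5 * len(evidence)


m = ToyModel({"a": ["b"], "b": []})
toy_score = ToyScore()

implicit = toy_score.local_score(m, "a")          # parents taken from m
explicit = toy_score.local_score(m, "a", ["b"])   # parents passed as evidence
hypothetical = toy_score.local_score(m, "a", [])  # "what if a had no parents?"
```

The evidence form is what a search procedure would typically call to evaluate a candidate parent set without modifying the model.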

local_score_node_type(self: pybnesian.Score, model: pybnesian.BayesianNetworkBase, variable_type: pybnesian.FactorType, variable: str, evidence: List[str]) → float

Returns the local score value of a node variable in the model if its conditional distribution were a variable_type factor and it had evidence as parents.

For example:

>>> score.local_score_node_type(m, LinearGaussianCPDType(), "a", ["b"])

returns the local score of node "a" in the model m, with ["b"] as parents assuming the conditional distribution of "a" is a LinearGaussianCPD.

Parameters

model – Bayesian network model.
variable_type – The FactorType of the node variable.
variable – A variable name.
evidence – A list of parent names.

Returns

Local score value of node in the model with evidence as parents and variable_type as conditional distribution.

score(self: pybnesian.Score, model: BayesianNetworkBase or ConditionalBayesianNetworkBase) → float

Returns the score value of the model.

Parameters: model – Bayesian network model.
Returns: Score value of model.

class pybnesian.ValidatedScore

Bases: Score

A ValidatedScore is a score with training and validation scores. In a ValidatedScore, the training is driven by the training score through the functions Score.score(), Score.local_score_variable(), Score.local_score() and Score.local_score_node_type(). The convergence of the structure is evaluated using a validation likelihood (usually defined over different data) through the functions ValidatedScore.vscore(), ValidatedScore.vlocal_score_variable(), ValidatedScore.vlocal_score() and ValidatedScore.vlocal_score_node_type().
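
The division of labour between the two scores can be sketched as a greedy loop in which the training score selects each move and the validation score decides when to stop. This is a simplified illustration, not PyBNesian's search implementation: search_with_validation, train_gain, val_gain and the patience parameter are all hypothetical names introduced here.

```python
from typing import Callable, List, Sequence


def search_with_validation(
    candidates: Sequence[str],
    train_delta: Callable[[List[str], str], float],
    validation_score: Callable[[List[str]], float],
    patience: int = 0,
) -> List[str]:
    """Greedy selection driven by a training score, stopped by a validation score."""
    current: List[str] = []
    best = list(current)
    best_validation = validation_score(current)
    failures = 0
    remaining = list(candidates)
    while remaining:
        # The training score chooses the next move.
        move = max(remaining, key=lambda c: train_delta(current, c))
        if train_delta(current, move) <= 0:
            break
        current = current + [move]
        remaining.remove(move)
        # The validation score judges whether the move really helped.
        value = validation_score(current)
        if value > best_validation:
            best, best_validation, failures = list(current), value, 0
        else:
            failures += 1
            if failures > patience:
                break
    return best


# Toy gains: "d" looks slightly useful on training data but hurts validation.
train_gain = {"b": 3.0, "c": 1.0, "d": 0.2}
val_gain = {"b": 2.0, "c": 0.5, "d": -0.4}
selected = search_with_validation(
    list(train_gain),
    lambda current, c: train_gain[c],
    lambda current: sum(val_gain[p] for p in current),
)
```

In this toy run the loop accepts "b" and "c" and rejects "d", because the validation value drops even though the training gain is still positive.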

__init__(self: pybnesian.ValidatedScore) → None

vlocal_score(*args, **kwargs)

Overloaded function.

vlocal_score(self: pybnesian.ValidatedScore, model: pybnesian.ConditionalBayesianNetworkBase, variable: str) -> float
vlocal_score(self: pybnesian.ValidatedScore, model: pybnesian.BayesianNetworkBase, variable: str) -> float

Returns the validated local score value of a node variable in the model.

For example:

>>> score.vlocal_score(m, "a")

returns the validated local score of node "a" in the model m. This method assumes that the parents of "a" are m.parents("a") and its node type is m.node_type("a").

Parameters

model – Bayesian network model.
variable – A variable name.

Returns

Validated local score value of node in the model.

vlocal_score(self: pybnesian.ValidatedScore, arg0: pybnesian.ConditionalBayesianNetworkBase, arg1: str, arg2: List[str]) -> float
vlocal_score(self: pybnesian.ValidatedScore, model: pybnesian.BayesianNetworkBase, variable: str, evidence: List[str]) -> float

vlocal_score(self: pybnesian.ValidatedScore, model: BayesianNetworkBase or ConditionalBayesianNetworkBase, variable: str, evidence: List[str]) -> float

Returns the validated local score value of a node variable in the model if it had evidence as parents.

For example:

>>> score.vlocal_score(m, "a", ["b"])

returns the validated local score of node "a" in the model m, with ["b"] as parents. This method assumes that the node type of "a" is m.node_type("a").

Parameters

model – Bayesian network model.
variable – A variable name.
evidence – A list of parent names.

Returns

Validated local score value of node in the model with evidence as parents.

vlocal_score_node_type(self: pybnesian.ValidatedScore, model: pybnesian.BayesianNetworkBase, variable_type: pybnesian.FactorType, variable: str, evidence: List[str]) → float

Returns the validated local score value of a node variable in the model if its conditional distribution were a variable_type factor and it had evidence as parents.

For example:

>>> score.vlocal_score_node_type(m, LinearGaussianCPDType(), "a", ["b"])

returns the validated local score of node "a" in the model m, with ["b"] as parents assuming the conditional distribution of "a" is a LinearGaussianCPD.

Parameters

model – Bayesian network model.
variable_type – The FactorType of the node variable.
variable – A variable name.
evidence – A list of parent names.

Returns

Validated local score value of node in the model with evidence as parents and variable_type as conditional distribution.

vscore(self: pybnesian.ValidatedScore, model: BayesianNetworkBase or ConditionalBayesianNetworkBase) → float

Returns the validated score value of the model.

Parameters: model – Bayesian network model.
Returns: Validated score value of model.

class pybnesian.DynamicScore

A DynamicScore adapts the static Score to learn dynamic Bayesian networks. It generates a static and a transition score to learn the static and transition components of the dynamic Bayesian network.

The dynamic scores are usually implemented using a DynamicDataFrame with the methods DynamicDataFrame.static_df and DynamicDataFrame.transition_df.
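
The idea behind the static and transition datasets can be sketched with plain Python lists. This is only an illustration of building lagged copies of each variable for a given Markovian order. The helper lagged_frame, the column naming "<variable>_t_<lag>" and the exact row ranges are assumptions made for this sketch and should be checked against the DynamicDataFrame documentation.

```python
from typing import Dict, List


def lagged_frame(series: Dict[str, List[float]], order: int,
                 first_lag: int) -> Dict[str, List[float]]:
    """Build columns '<name>_t_<lag>' for lags first_lag..order (hypothetical helper)."""
    n = len(next(iter(series.values())))
    # One row per time step t at which every requested lag is available.
    # With first_lag=1 the value at t itself is not needed, so one extra row fits.
    steps = range(order, n + first_lag)
    frame: Dict[str, List[float]] = {}
    for name, values in series.items():
        for lag in range(first_lag, order + 1):
            frame[f"{name}_t_{lag}"] = [values[t - lag] for t in steps]
    return frame


series = {"a": [1.0, 2.0, 3.0, 4.0, 5.0]}

# Assumed layout: the static data holds lags 1..order only,
# the transition data additionally holds the current value (lag 0).
static = lagged_frame(series, order=2, first_lag=1)
transition = lagged_frame(series, order=2, first_lag=0)
```

Under these assumptions, a static score would be computed over the first dataset and a transition score over the second.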

__init__(self: pybnesian.DynamicScore) → None: Initializes a DynamicScore.

has_variables(self: pybnesian.DynamicScore, variables: str or List[str]) → bool

Checks whether this DynamicScore has the given variables.

Parameters: variables – Name or list of variables.
Returns: True if the DynamicScore is defined over the set of variables, False otherwise.

static_score(self: pybnesian.DynamicScore) → pybnesian.Score

Returns the static score component of the DynamicScore.

Returns: The static score component.

transition_score(self: pybnesian.DynamicScore) → pybnesian.Score

Returns the transition score component of the DynamicScore.

Returns: The transition score component.

Concrete classes

class pybnesian.BIC

Bases: Score

This class implements the Bayesian Information Criterion (BIC).

__init__(self: pybnesian.BIC, df: DataFrame) → None

Initializes a BIC with the given DataFrame df.

Parameters: df – DataFrame to compute the BIC score.
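
As a rough illustration of what a BIC local score measures, the following sketch computes the textbook form (maximized log-likelihood minus 0.5 · log(N) · number of free parameters) for a Gaussian node with zero or one continuous parent. The function gaussian_bic is hypothetical and uses only the standard library. PyBNesian's own implementation may differ in details such as the variance estimator.

```python
import math
from typing import List, Optional


def gaussian_bic(y: List[float], x: Optional[List[float]] = None) -> float:
    """Textbook BIC of a (linear) Gaussian node y with at most one parent x."""
    n = len(y)
    mean_y = sum(y) / n
    if x is None:
        residuals = [v - mean_y for v in y]
        n_params = 2  # intercept and variance
    else:
        mean_x = sum(x) / n
        sxx = sum((u - mean_x) ** 2 for u in x)
        sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y))
        slope = sxy / sxx
        intercept = mean_y - slope * mean_x
        residuals = [v - (intercept + slope * u) for u, v in zip(x, y)]
        n_params = 3  # intercept, slope and variance
    variance = sum(r * r for r in residuals) / n  # maximum likelihood estimate
    loglik = -0.5 * n * (math.log(2 * math.pi * variance) + 1)
    return loglik - 0.5 * math.log(n) * n_params


x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.1, 1.9, 4.2, 5.8, 8.1, 9.9]  # roughly y = 2x

bic_without_parent = gaussian_bic(y)
bic_with_parent = gaussian_bic(y, x)
```

Because y depends strongly on x, the gain in likelihood outweighs the penalty for the extra parameter and the score with the parent is higher.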

class pybnesian.BGe

Bases: Score

This class implements the Bayesian Gaussian equivalent (BGe).

__init__(self: pybnesian.BGe, df: DataFrame, iss_mu: float = 1, iss_w: Optional[float] = None, nu: Optional[numpy.ndarray[numpy.float64[m, 1]]] = None) → None

Initializes a BGe with the given DataFrame df.

Parameters

df – DataFrame to compute the BGe score.
iss_mu – Imaginary sample size for the normal component of the normal-Wishart prior.
iss_w – Imaginary sample size for the Wishart component of the normal-Wishart prior.
nu – Mean vector of the normal-Wishart prior.

class pybnesian.BDe

Bases: Score

This class implements the Bayesian Dirichlet equivalent (BDe).

__init__(self: pybnesian.BDe, df: DataFrame, iss: float = 1) → None

Initializes a BDe with the given DataFrame df.

Parameters

df – DataFrame to compute the BDe score.
iss – Imaginary sample size of the Dirichlet prior.
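
The role of the imaginary sample size can be seen in a standard-library sketch of a Bayesian Dirichlet local score with a uniform prior (the BDeu variant). It is an assumption of this sketch that the prior mass iss is spread evenly over all parent configurations and child states. The function bde_local_score and its parameters are hypothetical, not part of PyBNesian.

```python
import math
from collections import Counter
from typing import Hashable, Sequence, Tuple


def bde_local_score(child: Sequence[Hashable],
                    parent_rows: Sequence[Tuple[Hashable, ...]],
                    child_card: int,
                    parent_config_count: int,
                    iss: float = 1.0) -> float:
    """Log marginal likelihood of a discrete node under a uniform Dirichlet prior."""
    alpha_j = iss / parent_config_count                  # prior mass per parent configuration
    alpha_jk = iss / (parent_config_count * child_card)  # prior mass per (configuration, state)
    per_config = Counter(parent_rows)
    joint = Counter(zip(parent_rows, child))
    score = 0.0
    # Unobserved configurations and states contribute exactly zero, so they are skipped.
    for n_j in per_config.values():
        score += math.lgamma(alpha_j) - math.lgamma(alpha_j + n_j)
    for n_jk in joint.values():
        score += math.lgamma(alpha_jk + n_jk) - math.lgamma(alpha_jk)
    return score


# A binary node without parents, observed once in each state.
no_parent_score = bde_local_score(["x", "y"], [(), ()], child_card=2,
                                  parent_config_count=1, iss=1.0)

# A binary child that copies its binary parent.
parent = [0, 0, 0, 0, 1, 1, 1, 1]
child = [0, 0, 0, 0, 1, 1, 1, 1]
with_parent = bde_local_score(child, [(p,) for p in parent], 2, 2)
without_parent = bde_local_score(child, [()] * len(child), 2, 1)
```

For the first example the result equals log(1/8): the first observation has probability 1/2 and the second, differing one has probability 0.5 / 2 = 1/4 under the prior with iss = 1.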

class pybnesian.CVLikelihood

Bases: Score

This class implements an estimate of the log-likelihood on unseen data using k-fold cross validation over the data.

__init__(self: pybnesian.CVLikelihood, df: DataFrame, k: int = 10, seed: Optional[int] = None, construction_args: pybnesian.Arguments = Arguments) → None

Initializes a CVLikelihood with the given DataFrame df. It uses a CrossValidation with k folds and the given seed.

Parameters

df – DataFrame to compute the score.
k – Number of folds of the cross validation.
seed – A random seed number. If not specified or None, a random seed is generated.
construction_args – Additional arguments provided to construct the Factor.

property cv: The underlying CrossValidation object to compute the score.

class pybnesian.HoldoutLikelihood

Bases: Score

This class implements an estimate of the log-likelihood on unseen data using a holdout dataset. The parameters are estimated on the training data, and the score is computed on the holdout data.

__init__(self: pybnesian.HoldoutLikelihood, df: DataFrame, test_ratio: float = 0.2, seed: Optional[int] = None, construction_args: pybnesian.Arguments = Arguments) → None

Initializes a HoldoutLikelihood with the given DataFrame df. It uses a HoldOut with the given test_ratio and seed.

Parameters

df – DataFrame to compute the score.
test_ratio – Proportion of instances left for the holdout data.
seed – A random seed number. If not specified or None, a random seed is generated.
construction_args – Additional arguments provided to construct the Factor.

property holdout: The underlying HoldOut object to compute the score.

test_data(self: pybnesian.HoldoutLikelihood) → DataFrame: Gets the holdout data of the HoldOut object.

training_data(self: pybnesian.HoldoutLikelihood) → DataFrame: Gets the training data of the HoldOut object.
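
A holdout estimate follows the same idea with a single split. The sketch below is library-independent: holdout_split and holdout_loglik are hypothetical helpers, the model is a single Gaussian variable, and rounding test_ratio · N to the nearest integer is an assumption of this sketch.

```python
import math
import random
from typing import List, Optional, Tuple


def holdout_split(values: List[float], test_ratio: float = 0.2,
                  seed: Optional[int] = None) -> Tuple[List[float], List[float]]:
    """Shuffle the rows and reserve a test_ratio fraction as holdout data."""
    indices = list(range(len(values)))
    random.Random(seed).shuffle(indices)
    n_test = int(round(test_ratio * len(values)))
    test = [values[i] for i in indices[:n_test]]
    train = [values[i] for i in indices[n_test:]]
    return train, test


def holdout_loglik(values: List[float], test_ratio: float = 0.2,
                   seed: Optional[int] = None) -> float:
    """Estimate parameters on the training part, score the holdout part."""
    train, test = holdout_split(values, test_ratio, seed)
    mean = sum(train) / len(train)
    var = sum((v - mean) ** 2 for v in train) / (len(train) - 1)
    return sum(-0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
               for v in test)


values = [float(i % 7) for i in range(50)]
train_part, test_part = holdout_split(values, test_ratio=0.2, seed=0)
holdout_estimate = holdout_loglik(values, test_ratio=0.2, seed=0)
```

Compared with cross validation, the holdout estimate fits the parameters only once, at the cost of using a single split of the data.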

class pybnesian.ValidatedLikelihood

Bases: ValidatedScore

This class mixes the functionality of CVLikelihood and HoldoutLikelihood. First, it applies a HoldOut split over the data. Then:

It estimates the training score using a CVLikelihood over the training data.
It estimates the validation score using the training data to estimate the parameters and calculating the log-likelihood on the holdout data.

__init__(self: pybnesian.ValidatedLikelihood, df: DataFrame, test_ratio: float = 0.2, k: int = 10, seed: Optional[int] = None, construction_args: pybnesian.Arguments = Arguments) → None

Initializes a ValidatedLikelihood with the given DataFrame df. The HoldOut is initialized with test_ratio and seed. The CVLikelihood is initialized with k and seed over the training data of the HoldOut.

Parameters

df – DataFrame to compute the score.
test_ratio – Proportion of instances left for the holdout data.
k – Number of folds of the cross validation.
seed – A random seed number. If not specified or None, a random seed is generated.
construction_args – Additional arguments provided to construct the Factor.

property cv_lik: The underlying CVLikelihood to compute the training score.

property holdout_lik: The underlying HoldoutLikelihood to compute the validation score.

training_data(self: pybnesian.ValidatedLikelihood) → DataFrame: The underlying training data of the HoldOut.

validation_data(self: pybnesian.ValidatedLikelihood) → DataFrame: The underlying holdout data of the HoldOut.

class pybnesian.DynamicBIC

Bases: DynamicScore

The dynamic adaptation of the BIC score.

__init__(self: pybnesian.DynamicBIC, ddf: pybnesian.DynamicDataFrame) → None

Initializes a DynamicBIC with the given DynamicDataFrame ddf.

Parameters: ddf – DynamicDataFrame to compute the DynamicBIC score.

class pybnesian.DynamicBGe

Bases: DynamicScore

The dynamic adaptation of the BGe score.

__init__(self: pybnesian.DynamicBGe, ddf: pybnesian.DynamicDataFrame, iss_mu: float = 1, iss_w: Optional[float] = None, nu: Optional[numpy.ndarray[numpy.float64[m, 1]]] = None) → None

Initializes a DynamicBGe with the given DynamicDataFrame ddf.

Parameters

ddf – DynamicDataFrame to compute the DynamicBGe score.
iss_mu – Imaginary sample size for the normal component of the normal-Wishart prior.
iss_w – Imaginary sample size for the Wishart component of the normal-Wishart prior.
nu – Mean vector of the normal-Wishart prior.

class pybnesian.DynamicBDe

Bases: DynamicScore

The dynamic adaptation of the BDe score.

__init__(self: pybnesian.DynamicBDe, ddf: pybnesian.DynamicDataFrame, iss: float = 1) → None

Initializes a DynamicBDe with the given DynamicDataFrame ddf.

Parameters

ddf – DynamicDataFrame to compute the DynamicBDe score.
iss – Imaginary sample size of the Dirichlet prior.

class pybnesian.DynamicCVLikelihood

Bases: DynamicScore

The dynamic adaptation of the CVLikelihood score.

__init__(self: pybnesian.DynamicCVLikelihood, df: pybnesian.DynamicDataFrame, k: int = 10, seed: Optional[int] = None) → None

Initializes a DynamicCVLikelihood with the given DynamicDataFrame df. The k and seed parameters are passed to the static and transition components of CVLikelihood.

Parameters

df – DynamicDataFrame to compute the score.
k – Number of folds of the cross validation.
seed – A random seed number. If not specified or None, a random seed is generated.

class pybnesian.DynamicHoldoutLikelihood

Bases: DynamicScore

The dynamic adaptation of the HoldoutLikelihood score.

__init__(self: pybnesian.DynamicHoldoutLikelihood, df: pybnesian.DynamicDataFrame, test_ratio: float = 0.2, seed: Optional[int] = None) → None

Initializes a DynamicHoldoutLikelihood with the given DynamicDataFrame df. The test_ratio and seed parameters are passed to the static and transition components of HoldoutLikelihood.

Parameters

df – DynamicDataFrame to compute the score.
test_ratio – Proportion of instances left for the holdout data.
seed – A random seed number. If not specified or None, a random seed is generated.

class pybnesian.DynamicValidatedLikelihood

Bases: DynamicScore

The dynamic adaptation of the ValidatedLikelihood score.

__init__(self: pybnesian.DynamicValidatedLikelihood, df: pybnesian.DynamicDataFrame, test_ratio: float = 0.2, k: int = 10, seed: Optional[int] = None) → None

Initializes a DynamicValidatedLikelihood with the given DynamicDataFrame df. The test_ratio, k and seed parameters are passed to the static and transition components of ValidatedLikelihood.

Parameters

df – DynamicDataFrame to compute the score.
test_ratio – Proportion of instances left for the holdout data.
k – Number of folds of the cross validation.
seed – A random seed number. If not specified or None, a random seed is generated.