Structure Scores¶

This section describes the learning scores that evaluate the goodness of a Bayesian network structure. These scores are used by score-and-search learning algorithms such as GreedyHillClimbing, MMHC and DMMHC.

Abstract classes¶

class pybnesian.learning.scores.Score¶

A Score scores Bayesian network structures.

__init__(self: pybnesian.learning.scores.Score) → None ¶: Initializes a Score.

__str__(self: pybnesian.learning.scores.Score) → str ¶

compatible_bn(self: pybnesian.learning.scores.Score, model: BayesianNetworkBase or ConditionalBayesianNetworkBase) → bool ¶

Checks whether the model is compatible (can be used) with this Score.

Parameters: model – A Bayesian network model.
Returns: True if the Bayesian network model is compatible with this Score, False otherwise.

data(self: pybnesian.learning.scores.Score) → DataFrame¶

Returns the DataFrame used to calculate the score and local scores.

Returns: DataFrame used to calculate the score. If the score does not use data, it returns None.

has_variables(self: pybnesian.learning.scores.Score, variables: str or List[str]) → bool ¶

Checks whether this Score has the given variables.

Parameters: variables – Name or list of variables.
Returns: True if the Score is defined over the set of variables, False otherwise.

local_score(*args, **kwargs)¶

Overloaded function.

local_score(self: pybnesian.learning.scores.Score, model: pybnesian.models.ConditionalBayesianNetworkBase, variable: str) -> float
local_score(self: pybnesian.learning.scores.Score, model: pybnesian.models.BayesianNetworkBase, variable: str) -> float

Returns the local score value of a node variable in the model.

For example:

>>> score.local_score(m, "a")

returns the local score of node "a" in the model m. This method assumes that the parents of "a" are m.parents("a") and its node type is m.node_type("a").

Parameters

model – Bayesian network model.
variable – A variable name.

Returns

Local score value of node in the model.

local_score(self: pybnesian.learning.scores.Score, model: pybnesian.models.ConditionalBayesianNetworkBase, variable: str, evidence: List[str]) -> float
local_score(self: pybnesian.learning.scores.Score, model: pybnesian.models.BayesianNetworkBase, variable: str, evidence: List[str]) -> float

Returns the local score value of a node variable in the model if it had evidence as parents.

For example:

>>> score.local_score(m, "a", ["b"])

returns the local score of node "a" in the model m, with ["b"] as parents. This method assumes that the node type of "a" is m.node_type("a").

Parameters

model – Bayesian network model.
variable – A variable name.
evidence – A list of parent names.

Returns

Local score value of node in the model with evidence as parents.
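
The two groups of overloads differ only in where the parent set comes from. The following is a minimal sketch of that idea, not PyBNesian code: the dict `model_parents` stands in for a model, and `toy_local_score` is a hypothetical discrete log-likelihood used purely for illustration.

```python
import math
from collections import Counter

# Toy data set: each row maps a variable name to a discrete value.
ROWS = [
    {"a": 0, "b": 0}, {"a": 0, "b": 0}, {"a": 1, "b": 1},
    {"a": 1, "b": 1}, {"a": 0, "b": 1}, {"a": 1, "b": 0},
    {"a": 1, "b": 1}, {"a": 0, "b": 0},
]

def toy_local_score(rows, variable, evidence):
    """Maximum-likelihood log-likelihood of `variable` given `evidence` (illustrative)."""
    joint = Counter((tuple(r[p] for p in evidence), r[variable]) for r in rows)
    parent = Counter(tuple(r[p] for p in evidence) for r in rows)
    return sum(n * math.log(n / parent[cfg]) for (cfg, _), n in joint.items())

def local_score(rows, parents_of, variable, evidence=None):
    """Use the model's own parents unless an explicit evidence list is given."""
    if evidence is None:
        evidence = parents_of[variable]
    return toy_local_score(rows, variable, evidence)

# Hypothetical stand-in for a model: "a" has no parents, "b" has parent "a".
model_parents = {"a": [], "b": ["a"]}

current = local_score(ROWS, model_parents, "a")         # parents taken from the model
what_if = local_score(ROWS, model_parents, "a", ["b"])  # parents overridden by evidence
```

In this sketch, `what_if` answers the question a search algorithm typically asks: what would the local score of "a" be if "b" were its parent, without modifying the model.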

local_score_node_type(self: pybnesian.learning.scores.Score, model: pybnesian.models.BayesianNetworkBase, variable_type: pybnesian.factors.FactorType, variable: str, evidence: List[str]) → float ¶

Returns the local score value of a node variable in the model if its conditional distribution were a variable_type factor and it had evidence as parents.

For example:

>>> score.local_score_node_type(m, LinearGaussianCPDType(), "a", ["b"])

returns the local score of node "a" in the model m, with ["b"] as parents assuming the conditional distribution of "a" is a LinearGaussianCPD.

Parameters

model – Bayesian network model.
variable_type – The FactorType of the node variable.
variable – A variable name.
evidence – A list of parent names.

Returns

Local score value of node in the model with evidence as parents and variable_type as conditional distribution.

score(self: pybnesian.learning.scores.Score, model: BayesianNetworkBase or ConditionalBayesianNetworkBase) → float ¶

Returns the score value of the model.

Parameters: model – Bayesian network model.
Returns: Score value of model.
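
Many structure scores are decomposable: the score of the network is the sum of the local scores of its nodes. Assuming that property (the text above does not state it for every Score), the sketch below shows why a search algorithm only needs to recompute the local score of the node whose parents changed. `toy_local_score` and `toy_score` are hypothetical helpers, and a dict of parent lists stands in for a model.

```python
import math
from collections import Counter

def toy_local_score(rows, variable, parents):
    """Toy local score: maximum-likelihood log-likelihood of one node."""
    joint = Counter((tuple(r[p] for p in parents), r[variable]) for r in rows)
    marginal = Counter(tuple(r[p] for p in parents) for r in rows)
    return sum(n * math.log(n / marginal[cfg]) for (cfg, _), n in joint.items())

def toy_score(rows, parents_of):
    """Network score as the sum of the local scores of all nodes."""
    return sum(toy_local_score(rows, v, ps) for v, ps in parents_of.items())

# Toy data: "b" copies "a" in the first six rows and disagrees in the last two.
rows = [{"a": i % 2, "b": (i % 2) if i < 6 else 1 - (i % 2)} for i in range(8)]

empty = {"a": [], "b": []}        # no arcs
with_arc = {"a": [], "b": ["a"]}  # arc a -> b

# Gain of adding the arc, computed from the full scores ...
gain = toy_score(rows, with_arc) - toy_score(rows, empty)
# ... equals the change in the local score of the only node whose parents changed.
local_gain = toy_local_score(rows, "b", ["a"]) - toy_local_score(rows, "b", [])
```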

class pybnesian.learning.scores.ValidatedScore¶

Bases: pybnesian.learning.scores.Score

A ValidatedScore is a score with training and validation scores. In a ValidatedScore, the training is driven by the training score through the functions Score.score(), Score.local_score_variable(), Score.local_score() and Score.local_score_node_type(). The convergence of the structure is evaluated using a validation likelihood (usually defined over different data) through the functions ValidatedScore.vscore(), ValidatedScore.vlocal_score_variable(), ValidatedScore.vlocal_score() and ValidatedScore.vlocal_score_node_type().

__init__(self: pybnesian.learning.scores.ValidatedScore) → None ¶

vlocal_score(*args, **kwargs)¶

Overloaded function.

vlocal_score(self: pybnesian.learning.scores.ValidatedScore, model: pybnesian.models.ConditionalBayesianNetworkBase, variable: str) -> float
vlocal_score(self: pybnesian.learning.scores.ValidatedScore, model: pybnesian.models.BayesianNetworkBase, variable: str) -> float

vlocal_score(self: pybnesian.learning.scores.ValidatedScore, model: BayesianNetworkBase or ConditionalBayesianNetworkBase, variable: str) -> float

Returns the validated local score value of a node variable in the model.

For example:

>>> score.vlocal_score(m, "a")

returns the validated local score of node "a" in the model m. This method assumes that the parents of "a" are m.parents("a") and its node type is m.node_type("a").

Parameters

model – Bayesian network model.
variable – A variable name.

Returns

Validated local score value of node in the model.

vlocal_score(self: pybnesian.learning.scores.ValidatedScore, arg0: pybnesian.models.ConditionalBayesianNetworkBase, arg1: str, arg2: List[str]) -> float
vlocal_score(self: pybnesian.learning.scores.ValidatedScore, model: pybnesian.models.BayesianNetworkBase, variable: str, evidence: List[str]) -> float

vlocal_score(self: pybnesian.learning.scores.ValidatedScore, model: BayesianNetworkBase or ConditionalBayesianNetworkBase, variable: str, evidence: List[str]) -> float

Returns the validated local score value of a node variable in the model if it had evidence as parents.

For example:

>>> score.vlocal_score(m, "a", ["b"])

returns the validated local score of node "a" in the model m, with ["b"] as parents. This method assumes that the node type of "a" is m.node_type("a").

Parameters

model – Bayesian network model.
variable – A variable name.
evidence – A list of parent names.

Returns

Validated local score value of node in the model with evidence as parents.

vlocal_score_node_type(self: pybnesian.learning.scores.ValidatedScore, model: pybnesian.models.BayesianNetworkBase, variable_type: pybnesian.factors.FactorType, variable: str, evidence: List[str]) → float ¶

Returns the validated local score value of a node variable in the model if its conditional distribution were a variable_type factor and it had evidence as parents.

For example:

>>> score.vlocal_score_node_type(m, LinearGaussianCPDType(), "a", ["b"])

returns the validated local score of node "a" in the model m, with ["b"] as parents assuming the conditional distribution of "a" is a LinearGaussianCPD.

Parameters

model – Bayesian network model.
variable_type – The FactorType of the node variable.
variable – A variable name.
evidence – A list of parent names.

Returns

Validated local score value of node in the model with evidence as parents and variable_type as conditional distribution.

vscore(self: pybnesian.learning.scores.ValidatedScore, model: BayesianNetworkBase or ConditionalBayesianNetworkBase) → float ¶

Returns the validated score value of the model.

Parameters: model – Bayesian network model.
Returns: Validated score value of model.
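
One way a search procedure might combine the two families of functions is to accept moves according to the training score and to stop once the validation score no longer improves. The sketch below is only an illustration of that division of labour: the function name, the `patience` parameter and the score values are all made up, and no PyBNesian algorithm is claimed to work exactly this way.

```python
def search_with_validation(candidates, patience=1):
    """Pick the best validated candidate.

    `candidates` is a sequence of (name, training_score, validation_score)
    tuples, in the order a search would propose them.
    """
    best_name, best_vscore = None, float("-inf")
    current_train = float("-inf")
    failures = 0
    for name, train, valid in candidates:
        if train <= current_train:
            continue  # the training score decides which moves are accepted
        current_train = train
        if valid > best_vscore:
            best_name, best_vscore = name, valid
            failures = 0
        else:
            failures += 1  # the validation score decides when to stop
            if failures > patience:
                break
    return best_name

# Made-up scores: training keeps improving, validation peaks at the third step.
steps = [
    ("empty", -120.0, -32.0),
    ("a->b", -100.0, -27.0),
    ("a->b, b->c", -95.0, -26.5),
    ("a->b, b->c, a->c", -94.0, -27.2),
    ("dense", -93.5, -28.0),
]
chosen = search_with_validation(steps, patience=0)
```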

class pybnesian.learning.scores.DynamicScore¶

A DynamicScore adapts the static Score to learn dynamic Bayesian networks. It generates a static and a transition score to learn the static and transition components of the dynamic Bayesian network.

The dynamic scores are usually implemented using a DynamicDataFrame with the methods DynamicDataFrame.static_df and DynamicDataFrame.transition_df.

__init__(self: pybnesian.learning.scores.DynamicScore) → None ¶: Initializes a DynamicScore.

has_variables(self: pybnesian.learning.scores.DynamicScore, variables: str or List[str]) → bool ¶

Checks whether this DynamicScore has the given variables.

Parameters: variables – Name or list of variables.
Returns: True if the DynamicScore is defined over the set of variables, False otherwise.

static_score(self: pybnesian.learning.scores.DynamicScore) → pybnesian.learning.scores.Score ¶

Returns the static score component of the DynamicScore.

Returns: The static score component.

transition_score(self: pybnesian.learning.scores.DynamicScore) → pybnesian.learning.scores.Score ¶

Returns the transition score component of the DynamicScore.

Returns: The transition score component.
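
To make the static/transition split more concrete, the sketch below turns a time series into two lagged data sets for a given Markovian order. It is a rough illustration only: the helper names and the `_t_k` column suffix are assumptions of this sketch, and DynamicDataFrame may build its static and transition data differently.

```python
def transition_rows(series, order):
    """Pair each instant (suffix _t_0) with its `order` predecessors (_t_1, ...)."""
    rows = []
    for t in range(order, len(series)):
        row = {}
        for lag in range(order + 1):
            for name, value in series[t - lag].items():
                row[f"{name}_t_{lag}"] = value
        rows.append(row)
    return rows

def static_rows(series, order):
    """In this sketch, the static data keeps only the lagged columns."""
    return [
        {k: v for k, v in row.items() if not k.endswith("_t_0")}
        for row in transition_rows(series, order)
    ]

# A toy series with a single variable "a" observed at four instants.
series = [{"a": 1}, {"a": 2}, {"a": 3}, {"a": 4}]
transition = transition_rows(series, order=1)
static = static_rows(series, order=1)
```

A static score would then be defined over data like `static` and a transition score over data like `transition`, which is the role played by the two components returned by static_score() and transition_score().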

Concrete classes¶

class pybnesian.learning.scores.BIC¶

Bases: pybnesian.learning.scores.Score

This class implements the Bayesian Information Criterion (BIC).

__init__(self: pybnesian.learning.scores.BIC, df: DataFrame) → None ¶

Initializes a BIC with the given DataFrame df.

Parameters: df – DataFrame to compute the BIC score.
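
As a reminder of what the criterion measures, BIC is commonly written as the maximised log-likelihood minus (log N / 2) times the number of free parameters. The sketch below applies that textbook form to a single Gaussian variable without parents. The helper names and the parameter count of 2 (mean and variance) are choices of this sketch, not a statement about how the BIC class counts parameters.

```python
import math

def gaussian_loglik(values, mean, var):
    """Log-likelihood of `values` under a normal distribution N(mean, var)."""
    n = len(values)
    squared_error = sum((x - mean) ** 2 for x in values)
    return -0.5 * n * math.log(2 * math.pi * var) - squared_error / (2 * var)

def toy_bic(values):
    """Maximised log-likelihood minus the BIC complexity penalty."""
    n = len(values)
    mean = sum(values) / n
    var = sum((x - mean) ** 2 for x in values) / n  # maximum-likelihood variance
    free_params = 2  # assumption of this sketch: mean and variance
    return gaussian_loglik(values, mean, var) - 0.5 * math.log(n) * free_params

values = [1.0, 2.0, 3.0, 4.0, 5.0]
bic = toy_bic(values)
```

Because the penalty grows with the number of parameters, a larger parent set has to improve the log-likelihood enough to pay for the extra parameters it introduces.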

class pybnesian.learning.scores.BGe¶

Bases: pybnesian.learning.scores.Score

This class implements the Bayesian Gaussian equivalent (BGe).

__init__(self: pybnesian.learning.scores.BGe, df: DataFrame, iss_mu: float = 1, iss_w: Optional[float] = None, nu: Optional[numpy.ndarray[numpy.float64[m, 1]]] = None) → None ¶

Initializes a BGe with the given DataFrame df.

Parameters

df – DataFrame to compute the BGe score.
iss_mu – Imaginary sample size for the normal component of the normal-Wishart prior.
iss_w – Imaginary sample size for the Wishart component of the normal-Wishart prior.
nu – Mean vector of the normal-Wishart prior.

class pybnesian.learning.scores.CVLikelihood¶

Bases: pybnesian.learning.scores.Score

This class implements an estimation of the log-likelihood on unseen data using k-fold cross validation over the data.

__init__(self: pybnesian.learning.scores.CVLikelihood, df: DataFrame, k: int = 10, seed: Optional[int] = None) → None ¶

Initializes a CVLikelihood with the given DataFrame df. It uses a CrossValidation with k folds and the given seed.

Parameters

df – DataFrame to compute the score.
k – Number of folds of the cross validation.
seed – A random seed number. If not specified or None, a random seed is generated.

property cv¶: The underlying CrossValidation object to compute the score.
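
The idea behind a cross-validated likelihood can be sketched in a few lines: split the data into k folds, estimate the parameters on k - 1 folds, and add up the log-likelihood of the fold left out. The code below does this for a single Gaussian variable using only the standard library. The helper names are hypothetical, and the way folds are built here is an assumption, not the behaviour of CrossValidation.

```python
import math
import random

def fit_gaussian(values):
    """Maximum-likelihood mean and variance."""
    mean = sum(values) / len(values)
    var = sum((x - mean) ** 2 for x in values) / len(values)
    return mean, var

def gaussian_loglik(values, mean, var):
    squared_error = sum((x - mean) ** 2 for x in values)
    return -0.5 * len(values) * math.log(2 * math.pi * var) - squared_error / (2 * var)

def cv_loglik(values, k=10, seed=None):
    """Sum of held-out log-likelihoods over k folds."""
    indices = list(range(len(values)))
    random.Random(seed).shuffle(indices)  # seed=None draws a random seed
    folds = [indices[i::k] for i in range(k)]
    total = 0.0
    for test_idx in folds:
        held_out = set(test_idx)
        train = [values[i] for i in indices if i not in held_out]
        test = [values[i] for i in test_idx]
        mean, var = fit_gaussian(train)  # parameters come from the training folds only
        total += gaussian_loglik(test, mean, var)
    return total

values = [float((i * 37) % 11) for i in range(40)]
score = cv_loglik(values, k=10, seed=0)
```

Fixing the seed makes the fold assignment, and therefore the score, reproducible across runs.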

class pybnesian.learning.scores.HoldoutLikelihood¶

Bases: pybnesian.learning.scores.Score

This class implements an estimation of the log-likelihood on unseen data using a holdout dataset. Thus, the parameters are estimated using the training data, and the score is estimated on the holdout data.

__init__(self: pybnesian.learning.scores.HoldoutLikelihood, df: DataFrame, test_ratio: float = 0.2, seed: Optional[int] = None) → None ¶

Initializes a HoldoutLikelihood with the given DataFrame df. It uses a HoldOut with the given test_ratio and seed.

Parameters

df – DataFrame to compute the score.
test_ratio – Proportion of instances left for the holdout data.
seed – A random seed number. If not specified or None, a random seed is generated.

property holdout¶: The underlying HoldOut object to compute the score.

test_data(self: pybnesian.learning.scores.HoldoutLikelihood) → DataFrame¶: Gets the holdout data of the HoldOut object.

training_data(self: pybnesian.learning.scores.HoldoutLikelihood) → DataFrame¶: Gets the training data of the HoldOut object.
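
The holdout estimate can be sketched in the same spirit: shuffle the data, set aside a test_ratio fraction, estimate the parameters on the remainder and evaluate the log-likelihood on the part set aside. The code below is an illustration for a single Gaussian variable. The helper names and the rounding of the test size are assumptions of this sketch, not the behaviour of HoldOut.

```python
import math
import random

def fit_gaussian(values):
    """Maximum-likelihood mean and variance."""
    mean = sum(values) / len(values)
    var = sum((x - mean) ** 2 for x in values) / len(values)
    return mean, var

def gaussian_loglik(values, mean, var):
    squared_error = sum((x - mean) ** 2 for x in values)
    return -0.5 * len(values) * math.log(2 * math.pi * var) - squared_error / (2 * var)

def holdout_split(values, test_ratio=0.2, seed=None):
    """Return (training, test) after a seeded shuffle."""
    indices = list(range(len(values)))
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(values) * test_ratio))
    test = [values[i] for i in indices[:n_test]]
    train = [values[i] for i in indices[n_test:]]
    return train, test

def holdout_loglik(values, test_ratio=0.2, seed=None):
    """Fit on the training part, score on the holdout part."""
    train, test = holdout_split(values, test_ratio, seed)
    mean, var = fit_gaussian(train)
    return gaussian_loglik(test, mean, var)

values = [float((i * 37) % 11) for i in range(40)]
train, test = holdout_split(values, test_ratio=0.2, seed=1)
score = holdout_loglik(values, test_ratio=0.2, seed=1)
```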

class pybnesian.learning.scores.ValidatedLikelihood¶

Bases: pybnesian.learning.scores.ValidatedScore

This class mixes the functionality of CVLikelihood and HoldoutLikelihood. First, it applies a HoldOut split over the data. Then:

It estimates the training score using a CVLikelihood over the training data.
It estimates the validation score using the training data to estimate the parameters and calculating the log-likelihood on the holdout data.

__init__(self: pybnesian.learning.scores.ValidatedLikelihood, df: DataFrame, test_ratio: float = 0.2, k: int = 10, seed: Optional[int] = None) → None ¶

Initializes a ValidatedLikelihood with the given DataFrame df. The HoldOut is initialized with test_ratio and seed. The CVLikelihood is initialized with k and seed over the training data of the HoldOut.

Parameters

df – DataFrame to compute the score.
test_ratio – Proportion of instances left for the holdout data.
k – Number of folds of the cross validation.
seed – A random seed number. If not specified or None, a random seed is generated.

property cv_lik¶: The underlying CVLikelihood to compute the training score.

property holdout_lik¶: The underlying HoldoutLikelihood to compute the validation score.

training_data(self: pybnesian.learning.scores.ValidatedLikelihood) → DataFrame¶: The underlying training data of the HoldOut.

validation_data(self: pybnesian.learning.scores.ValidatedLikelihood) → DataFrame¶: The underlying holdout data of the HoldOut.
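
The combination described above can be sketched as a small class: a holdout split first, then a cross-validated training score over the training part and a validation score that fits on the training part and evaluates on the holdout part. `ToyValidatedLikelihood` is a hypothetical stand-in for a single Gaussian variable; its fold construction is a simplification and does not reproduce ValidatedLikelihood.

```python
import math
import random

def fit_gaussian(values):
    """Maximum-likelihood mean and variance."""
    mean = sum(values) / len(values)
    var = sum((x - mean) ** 2 for x in values) / len(values)
    return mean, var

def gaussian_loglik(values, mean, var):
    squared_error = sum((x - mean) ** 2 for x in values)
    return -0.5 * len(values) * math.log(2 * math.pi * var) - squared_error / (2 * var)

class ToyValidatedLikelihood:
    def __init__(self, values, test_ratio=0.2, k=10, seed=None):
        shuffled = list(values)
        random.Random(seed).shuffle(shuffled)
        n_test = int(round(len(shuffled) * test_ratio))
        self.validation = shuffled[:n_test]  # holdout part
        self.training = shuffled[n_test:]    # part used for training
        self.k = k

    def score(self):
        """Training score: k-fold cross-validated log-likelihood on the training data."""
        total = 0.0
        for i in range(self.k):
            test = self.training[i::self.k]
            train = [x for j, x in enumerate(self.training) if j % self.k != i]
            total += gaussian_loglik(test, *fit_gaussian(train))
        return total

    def vscore(self):
        """Validation score: fit on the training data, evaluate on the holdout data."""
        return gaussian_loglik(self.validation, *fit_gaussian(self.training))

values = [float((i * 37) % 11) for i in range(50)]
toy = ToyValidatedLikelihood(values, test_ratio=0.2, k=10, seed=0)
training_score, validation_score = toy.score(), toy.vscore()
```

The holdout data never enters `score()`, which is what allows `vscore()` to act as an independent check on convergence.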

class pybnesian.learning.scores.DynamicBIC¶

Bases: pybnesian.learning.scores.DynamicScore

The dynamic adaptation of the BIC score.

__init__(self: pybnesian.learning.scores.DynamicBIC, ddf: pybnesian.dataset.DynamicDataFrame) → None ¶

Initializes a DynamicBIC with the given DynamicDataFrame ddf.

Parameters: ddf – DynamicDataFrame to compute the DynamicBIC score.

class pybnesian.learning.scores.DynamicBGe¶

Bases: pybnesian.learning.scores.DynamicScore

The dynamic adaptation of the BGe score.

__init__(self: pybnesian.learning.scores.DynamicBGe, ddf: pybnesian.dataset.DynamicDataFrame, iss_mu: float = 1, iss_w: Optional[float] = None, nu: Optional[numpy.ndarray[numpy.float64[m, 1]]] = None) → None ¶

Initializes a DynamicBGe with the given DynamicDataFrame ddf.

Parameters

ddf – DynamicDataFrame to compute the DynamicBGe score.
iss_mu – Imaginary sample size for the normal component of the normal-Wishart prior.
iss_w – Imaginary sample size for the Wishart component of the normal-Wishart prior.
nu – Mean vector of the normal-Wishart prior.

class pybnesian.learning.scores.DynamicCVLikelihood¶

Bases: pybnesian.learning.scores.DynamicScore

The dynamic adaptation of the CVLikelihood score.

__init__(self: pybnesian.learning.scores.DynamicCVLikelihood, df: pybnesian.dataset.DynamicDataFrame, k: int = 10, seed: Optional[int] = None) → None ¶

Initializes a DynamicCVLikelihood with the given DynamicDataFrame df. The k and seed parameters are passed to the static and transition components of CVLikelihood.

Parameters

df – DynamicDataFrame to compute the score.
k – Number of folds of the cross validation.
seed – A random seed number. If not specified or None, a random seed is generated.

class pybnesian.learning.scores.DynamicHoldoutLikelihood¶

Bases: pybnesian.learning.scores.DynamicScore

The dynamic adaptation of the HoldoutLikelihood score.

__init__(self: pybnesian.learning.scores.DynamicHoldoutLikelihood, df: pybnesian.dataset.DynamicDataFrame, test_ratio: float = 0.2, seed: Optional[int] = None) → None ¶

Initializes a DynamicHoldoutLikelihood with the given DynamicDataFrame df. The test_ratio and seed parameters are passed to the static and transition components of HoldoutLikelihood.

Parameters

df – DynamicDataFrame to compute the score.
test_ratio – Proportion of instances left for the holdout data.
seed – A random seed number. If not specified or None, a random seed is generated.

class pybnesian.learning.scores.DynamicValidatedLikelihood¶

Bases: pybnesian.learning.scores.DynamicScore

The dynamic adaptation of the ValidatedLikelihood score.

__init__(self: pybnesian.learning.scores.DynamicValidatedLikelihood, df: pybnesian.dataset.DynamicDataFrame, test_ratio: float = 0.2, k: int = 10, seed: Optional[int] = None) → None ¶

Initializes a DynamicValidatedLikelihood with the given DynamicDataFrame df. The test_ratio, k and seed parameters are passed to the static and transition components of ValidatedLikelihood.

Parameters

df – DynamicDataFrame to compute the score.
test_ratio – Proportion of instances left for the holdout data.
k – Number of folds of the cross validation.
seed – A random seed number. If not specified or None, a random seed is generated.