Structure Scores
This section describes the learning scores that evaluate the goodness of a Bayesian network. They are used by score-and-search learning algorithms such as GreedyHillClimbing, MMHC and DMMHC.
Abstract classes
- class pybnesian.learning.scores.Score
A Score scores Bayesian network structures.
- __init__(self: pybnesian.learning.scores.Score) → None
Initializes a Score.
- __str__(self: pybnesian.learning.scores.Score) → str
- compatible_bn(self: pybnesian.learning.scores.Score, model: BayesianNetworkBase or ConditionalBayesianNetworkBase) → bool
Checks whether the model is compatible (can be used) with this Score.
- Parameters
model – A Bayesian network model.
- Returns
True if the Bayesian network model is compatible with this Score, False otherwise.
- data(self: pybnesian.learning.scores.Score) → DataFrame
Returns the DataFrame used to calculate the score and local scores.
- Returns
DataFrame used to calculate the score. If the score does not use data, it returns None.
- has_variables(self: pybnesian.learning.scores.Score, variables: str or List[str]) → bool
Checks whether this Score has the given variables.
- Parameters
variables – Name or list of variables.
- Returns
True if the Score is defined over the set of variables, False otherwise.
- local_score(*args, **kwargs)
Overloaded function.
local_score(self: pybnesian.learning.scores.Score, model: pybnesian.models.ConditionalBayesianNetworkBase, variable: str) -> float
local_score(self: pybnesian.learning.scores.Score, model: pybnesian.models.BayesianNetworkBase, variable: str) -> float
Returns the local score value of a node variable in the model.
For example:
>>> score.local_score(m, "a")
returns the local score of node "a" in the model m. This method assumes that the parents of "a" are m.parents("a") and its node type is m.node_type("a").
- Parameters
model – Bayesian network model.
variable – A variable name.
- Returns
Local score value of variable in the model.
local_score(self: pybnesian.learning.scores.Score, model: pybnesian.models.ConditionalBayesianNetworkBase, variable: str, evidence: List[str]) -> float
local_score(self: pybnesian.learning.scores.Score, model: pybnesian.models.BayesianNetworkBase, variable: str, evidence: List[str]) -> float
Returns the local score value of a node variable in the model if it had evidence as parents.
For example:
>>> score.local_score(m, "a", ["b"])
returns the local score of node "a" in the model m, with ["b"] as parents. This method assumes that the node type of "a" is m.node_type("a").
- Parameters
model – Bayesian network model.
variable – A variable name.
evidence – A list of parent names.
- Returns
Local score value of variable in the model with evidence as parents.
- local_score_node_type(self: pybnesian.learning.scores.Score, model: pybnesian.models.BayesianNetworkBase, variable_type: pybnesian.factors.FactorType, variable: str, evidence: List[str]) → float
Returns the local score value of a node variable in the model if its conditional distribution were a variable_type factor and it had evidence as parents.
For example:
>>> score.local_score_node_type(m, LinearGaussianCPDType(), "a", ["b"])
returns the local score of node "a" in the model m, with ["b"] as parents, assuming the conditional distribution of "a" is a LinearGaussianCPD.
- Parameters
model – Bayesian network model.
variable_type – The FactorType of the node variable.
variable – A variable name.
evidence – A list of parent names.
- Returns
Local score value of variable in the model with evidence as parents and variable_type as conditional distribution.
- score(self: pybnesian.learning.scores.Score, model: BayesianNetworkBase or ConditionalBayesianNetworkBase) → float
Returns the score value of the model.
- Parameters
model – Bayesian network model.
- Returns
Score value of model.
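The relationship between score() and the local_score() overloads can be illustrated with a toy, pure-Python stand-in. This is only a sketch and not PyBNesian code: ToyModel, ToyScore, the fit table and the per-parent penalty are all hypothetical, and it assumes a decomposable score, that is, one whose value for a model is the sum of the local scores of its nodes.

```python
class ToyModel:
    """Hypothetical minimal model: maps each node to its list of parents."""

    def __init__(self, parents):
        self._parents = dict(parents)

    def nodes(self):
        return list(self._parents)

    def parents(self, node):
        return list(self._parents[node])


class ToyScore:
    """Hypothetical decomposable score with a fixed cost per parent."""

    def __init__(self, fit, penalty=1.0):
        # fit maps (node, frozenset(parents)) to a made-up goodness of fit.
        self._fit = fit
        self._penalty = penalty

    def local_score(self, model, variable, evidence=None):
        # Without explicit evidence, use the parents currently in the model.
        parents = model.parents(variable) if evidence is None else evidence
        fit = self._fit.get((variable, frozenset(parents)), 0.0)
        return fit - self._penalty * len(parents)

    def score(self, model):
        # A decomposable score is the sum of the local scores of all nodes.
        return sum(self.local_score(model, n) for n in model.nodes())


fit = {
    ("a", frozenset()): -10.0,
    ("b", frozenset()): -12.0,
    ("b", frozenset({"a"})): -8.0,
}
m = ToyModel({"a": [], "b": []})
s = ToyScore(fit)

total = s.score(m)                          # uses the parents stored in m
candidate = s.local_score(m, "b", ["a"])    # "b" scored as if "a" were its parent
delta = candidate - s.local_score(m, "b")   # gain of adding the arc a -> b
```

Under this assumption, a search algorithm only needs to recompute the local score of the node whose parent set changes, which is why the evidence overload takes a candidate parent list instead of a modified model.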
- class pybnesian.learning.scores.ValidatedScore
Bases: pybnesian.learning.scores.Score
A ValidatedScore is a score with training and validation scores. In a ValidatedScore, the training is driven by the training score through the functions Score.score(), Score.local_score_variable(), Score.local_score() and Score.local_score_node_type(). The convergence of the structure is evaluated using a validation likelihood (usually defined over different data) through the functions ValidatedScore.vscore(), ValidatedScore.vlocal_score_variable(), ValidatedScore.vlocal_score() and ValidatedScore.vlocal_score_node_type().
- __init__(self: pybnesian.learning.scores.ValidatedScore) → None
- vlocal_score(*args, **kwargs)
Overloaded function.
vlocal_score(self: pybnesian.learning.scores.ValidatedScore, model: pybnesian.models.ConditionalBayesianNetworkBase, variable: str) -> float
vlocal_score(self: pybnesian.learning.scores.ValidatedScore, model: pybnesian.models.BayesianNetworkBase, variable: str) -> float
vlocal_score(self: pybnesian.learning.scores.ValidatedScore, model: BayesianNetworkBase or ConditionalBayesianNetworkBase, variable: str) -> float
Returns the validated local score value of a node variable in the model.
For example:
>>> score.vlocal_score(m, "a")
returns the validated local score of node "a" in the model m. This method assumes that the parents of "a" are m.parents("a") and its node type is m.node_type("a").
- Parameters
model – Bayesian network model.
variable – A variable name.
- Returns
Validated local score value of variable in the model.
vlocal_score(self: pybnesian.learning.scores.ValidatedScore, arg0: pybnesian.models.ConditionalBayesianNetworkBase, arg1: str, arg2: List[str]) -> float
vlocal_score(self: pybnesian.learning.scores.ValidatedScore, model: pybnesian.models.BayesianNetworkBase, variable: str, evidence: List[str]) -> float
vlocal_score(self: pybnesian.learning.scores.ValidatedScore, model: BayesianNetworkBase or ConditionalBayesianNetworkBase, variable: str, evidence: List[str]) -> float
Returns the validated local score value of a node variable in the model if it had evidence as parents.
For example:
>>> score.vlocal_score(m, "a", ["b"])
returns the validated local score of node "a" in the model m, with ["b"] as parents. This method assumes that the node type of "a" is m.node_type("a").
- Parameters
model – Bayesian network model.
variable – A variable name.
evidence – A list of parent names.
- Returns
Validated local score value of variable in the model with evidence as parents.
- vlocal_score_node_type(self: pybnesian.learning.scores.ValidatedScore, model: pybnesian.models.BayesianNetworkBase, variable_type: pybnesian.factors.FactorType, variable: str, evidence: List[str]) → float
Returns the validated local score value of a node variable in the model if its conditional distribution were a variable_type factor and it had evidence as parents.
For example:
>>> score.vlocal_score_node_type(m, LinearGaussianCPDType(), "a", ["b"])
returns the validated local score of node "a" in the model m, with ["b"] as parents, assuming the conditional distribution of "a" is a LinearGaussianCPD.
- Parameters
model – Bayesian network model.
variable_type – The FactorType of the node variable.
variable – A variable name.
evidence – A list of parent names.
- Returns
Validated local score value of variable in the model with evidence as parents and variable_type as conditional distribution.
- vscore(self: pybnesian.learning.scores.ValidatedScore, model: BayesianNetworkBase or ConditionalBayesianNetworkBase) → float
Returns the validated score value of the model.
- Parameters
model – Bayesian network model.
- Returns
Validated score value of model.
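The division of labour in a ValidatedScore (the training score picks the moves, the validation score decides when to stop) can be sketched with a toy loop. This is an illustration under stated assumptions, not the library's search procedure: search_with_validation, the patience parameter and the exact stopping rule are hypothetical, and each step is reduced to a pair of precomputed score changes.

```python
def search_with_validation(steps, patience=0):
    """Toy search loop driven by two scores.

    steps: list of (training_delta, validation_delta) pairs, one per
    iteration, for the move that the training score ranks best.
    Returns (number of moves kept, best cumulative validation gain).
    """
    best_val = 0.0      # best cumulative validation gain seen so far
    current_val = 0.0   # cumulative validation gain of the current structure
    best_at = 0         # how many moves the best structure contains
    fails = 0           # consecutive iterations without validation improvement

    for i, (train_delta, val_delta) in enumerate(steps):
        if train_delta <= 0:
            break  # the training score finds no improving move
        current_val += val_delta
        if current_val > best_val:
            best_val, best_at, fails = current_val, i + 1, 0
        else:
            fails += 1
            if fails > patience:
                break  # the validation score says the structure has converged
    return best_at, best_val


steps = [(5.0, 3.0), (4.0, 1.0), (3.0, -0.5), (2.0, -1.0)]
result = search_with_validation(steps, patience=0)
result_patient = search_with_validation(steps, patience=1)
```

In this example the training score would keep accepting moves, but the validation gain peaks after the second move, so both runs keep two moves.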
- class pybnesian.learning.scores.DynamicScore
A DynamicScore adapts the static Score to learn dynamic Bayesian networks. It generates a static and a transition score to learn the static and transition components of the dynamic Bayesian network.
The dynamic scores are usually implemented using a DynamicDataFrame with the methods DynamicDataFrame.static_df and DynamicDataFrame.transition_df.
- __init__(self: pybnesian.learning.scores.DynamicScore) → None
Initializes a DynamicScore.
- has_variables(self: pybnesian.learning.scores.DynamicScore, variables: str or List[str]) → bool
Checks whether this DynamicScore has the given variables.
- Parameters
variables – Name or list of variables.
- Returns
True if the DynamicScore is defined over the set of variables, False otherwise.
- static_score(self: pybnesian.learning.scores.DynamicScore) → pybnesian.learning.scores.Score
Returns the static score component of the DynamicScore.
- Returns
The static score component.
- transition_score(self: pybnesian.learning.scores.DynamicScore) → pybnesian.learning.scores.Score
Returns the transition score component of the DynamicScore.
- Returns
The transition score component.
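To give an intuition for the data that a transition score works on, the following sketch pairs each observation of a series with its previous values. It is a pure-Python illustration, not the DynamicDataFrame implementation: transition_rows is a hypothetical helper, and the `name_t_lag` column naming is an assumption made for this example.

```python
def transition_rows(series, name, order):
    """Pair each observation with its `order` previous values.

    Lag 0 is the current value, lag 1 the previous one, and so on.
    """
    rows = []
    for t in range(order, len(series)):
        rows.append({f"{name}_t_{lag}": series[t - lag] for lag in range(order + 1)})
    return rows


rows = transition_rows([1, 2, 3, 4, 5], "a", order=2)
```

Each row can then be treated as an ordinary instance by a static score, which is how a dynamic score can reuse the static scoring machinery.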
Concrete classes
- class pybnesian.learning.scores.BIC
Bases: pybnesian.learning.scores.Score
This class implements the Bayesian Information Criterion (BIC).
- __init__(self: pybnesian.learning.scores.BIC, df: DataFrame) → None
Initializes a BIC with the given DataFrame df.
- Parameters
df – DataFrame to compute the BIC score.
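As a reminder of what a BIC-style local score measures, the sketch below computes the textbook form, the maximized log-likelihood minus (k / 2) log N for k free parameters and N instances, for a Gaussian node with no parents and with one parent. It is a pure-Python illustration under stated assumptions: the helper names are hypothetical, and the parameter counts (2 and 3) follow the textbook definition and are not taken from the library's source.

```python
import math


def gaussian_loglik(xs, mean, var):
    """Log-likelihood of xs under a univariate normal N(mean, var)."""
    n = len(xs)
    sq = sum((x - mean) ** 2 for x in xs)
    return -0.5 * n * math.log(2 * math.pi * var) - sq / (2 * var)


def bic_no_parents(xs):
    """Textbook BIC of a Gaussian node without parents (mean and variance)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n  # maximum likelihood variance
    return gaussian_loglik(xs, mean, var) - 0.5 * 2 * math.log(n)


def bic_one_parent(xs, ps):
    """Textbook BIC of a linear Gaussian node with one parent (3 parameters)."""
    n = len(xs)
    mx, mp = sum(xs) / n, sum(ps) / n
    # Least-squares slope and intercept of x regressed on its parent p.
    b1 = sum((p - mp) * (x - mx) for p, x in zip(ps, xs)) / sum((p - mp) ** 2 for p in ps)
    b0 = mx - b1 * mp
    resid = [x - (b0 + b1 * p) for x, p in zip(xs, ps)]
    var = sum(r ** 2 for r in resid) / n
    return gaussian_loglik(resid, 0.0, var) - 0.5 * 3 * math.log(n)


child = [1.0, 2.0, 3.0, 4.0, 5.0]
parent = [0.9, 2.1, 2.9, 4.2, 4.9]

bic_none = bic_no_parents(child)
bic_parent = bic_one_parent(child, parent)
```

Here the parent explains the child almost perfectly, so the gain in likelihood outweighs the extra penalty and the one-parent score is higher.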
- class pybnesian.learning.scores.BGe
Bases: pybnesian.learning.scores.Score
This class implements the Bayesian Gaussian equivalent (BGe).
- __init__(self: pybnesian.learning.scores.BGe, df: DataFrame, iss_mu: float = 1, iss_w: Optional[float] = None, nu: Optional[numpy.ndarray[numpy.float64[m, 1]]] = None) → None
Initializes a BGe with the given DataFrame df.
- Parameters
df – DataFrame to compute the BGe score.
iss_mu – Imaginary sample size for the normal component of the normal-Wishart prior.
iss_w – Imaginary sample size for the Wishart component of the normal-Wishart prior.
nu – Mean vector of the normal-Wishart prior.
- class pybnesian.learning.scores.CVLikelihood
Bases: pybnesian.learning.scores.Score
This class implements an estimation of the log-likelihood on unseen data using k-fold cross validation over the data.
- __init__(self: pybnesian.learning.scores.CVLikelihood, df: DataFrame, k: int = 10, seed: Optional[int] = None) → None
Initializes a CVLikelihood with the given DataFrame df. It uses a CrossValidation with k folds and the given seed.
- Parameters
df – DataFrame to compute the score.
k – Number of folds of the cross validation.
seed – A random seed number. If not specified or None, a random seed is generated.
- property cv
The underlying CrossValidation object to compute the score.
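The idea behind a cross-validated likelihood can be sketched for a single Gaussian variable: fit the parameters on k - 1 folds and evaluate the log-likelihood on the remaining fold. This is a pure-Python illustration, not the CVLikelihood implementation: cv_loglik and gaussian_logpdf are hypothetical helpers, and summing (rather than averaging) the fold results is an assumption of this sketch.

```python
import math
import random


def gaussian_logpdf(x, mean, var):
    """Log-density of x under a univariate normal N(mean, var)."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def cv_loglik(xs, k=10, seed=None):
    """Sum of held-out Gaussian log-likelihoods over k folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)          # the seed fixes the fold assignment
    folds = [idx[i::k] for i in range(k)]     # k disjoint index sets
    total = 0.0
    for test_idx in folds:
        held_out = set(test_idx)
        train = [xs[i] for i in idx if i not in held_out]
        # Estimate the parameters on the training folds only.
        mean = sum(train) / len(train)
        var = sum((x - mean) ** 2 for x in train) / len(train)
        # Evaluate them on the held-out fold.
        total += sum(gaussian_logpdf(xs[i], mean, var) for i in test_idx)
    return total


rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(50)]

first = cv_loglik(data, k=5, seed=42)
second = cv_loglik(data, k=5, seed=42)
```

Passing the same seed reproduces the same folds, so the two calls return the same value; with seed=None the folds, and therefore the estimate, can change between runs.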
- class pybnesian.learning.scores.HoldoutLikelihood
Bases: pybnesian.learning.scores.Score
This class implements an estimation of the log-likelihood on unseen data using a holdout dataset. Thus, the parameters are estimated using training data, and the score is estimated in the holdout data.
- __init__(self: pybnesian.learning.scores.HoldoutLikelihood, df: DataFrame, test_ratio: float = 0.2, seed: Optional[int] = None) → None
Initializes a HoldoutLikelihood with the given DataFrame df. It uses a HoldOut with the given test_ratio and seed.
- Parameters
df – DataFrame to compute the score.
test_ratio – Proportion of instances left for the holdout data.
seed – A random seed number. If not specified or None, a random seed is generated.
- property holdout
The underlying HoldOut object to compute the score.
- test_data(self: pybnesian.learning.scores.HoldoutLikelihood) → DataFrame
Gets the holdout data of the HoldOut object.
- training_data(self: pybnesian.learning.scores.HoldoutLikelihood) → DataFrame
Gets the training data of the HoldOut object.
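A holdout likelihood follows the same principle with a single split. The sketch below is a pure-Python illustration for one Gaussian variable and not the HoldoutLikelihood implementation: holdout_split and holdout_loglik are hypothetical helpers, and rounding the test size to the nearest integer is an assumption of this example.

```python
import math
import random


def holdout_split(xs, test_ratio=0.2, seed=None):
    """Shuffle xs and split it into (training, holdout) lists."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    n_test = int(round(test_ratio * len(xs)))
    test = [xs[i] for i in idx[:n_test]]
    train = [xs[i] for i in idx[n_test:]]
    return train, test


def holdout_loglik(train, test):
    """Fit a Gaussian on train and return its log-likelihood on test."""
    mean = sum(train) / len(train)
    var = sum((x - mean) ** 2 for x in train) / len(train)
    return sum(
        -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
        for x in test
    )


rng = random.Random(1)
data = [rng.gauss(5.0, 2.0) for _ in range(100)]

train, test = holdout_split(data, test_ratio=0.2, seed=7)
score = holdout_loglik(train, test)
```

With test_ratio=0.2 and 100 instances, 20 instances are held out and the parameters are estimated from the other 80.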
- class pybnesian.learning.scores.ValidatedLikelihood
Bases: pybnesian.learning.scores.ValidatedScore
This class mixes the functionality of CVLikelihood and HoldoutLikelihood. First, it applies a HoldOut split over the data. Then:
- It estimates the training score using a CVLikelihood over the training data.
- It estimates the validation score using the training data to estimate the parameters and calculating the log-likelihood on the holdout data.
- __init__(self: pybnesian.learning.scores.ValidatedLikelihood, df: DataFrame, test_ratio: float = 0.2, k: int = 10, seed: Optional[int] = None) → None
Initializes a ValidatedLikelihood with the given DataFrame df. The HoldOut is initialized with test_ratio and seed. The CVLikelihood is initialized with k and seed over the training data of the HoldOut.
- Parameters
df – DataFrame to compute the score.
test_ratio – Proportion of instances left for the holdout data.
k – Number of folds of the cross validation.
seed – A random seed number. If not specified or None, a random seed is generated.
- property cv_lik
The underlying CVLikelihood to compute the training score.
- property holdout_lik
The underlying HoldoutLikelihood to compute the validation score.
- training_data(self: pybnesian.learning.scores.ValidatedLikelihood) → DataFrame
The underlying training data of the HoldOut.
- validation_data(self: pybnesian.learning.scores.ValidatedLikelihood) → DataFrame
The underlying holdout data of the HoldOut.
- class pybnesian.learning.scores.DynamicBIC
Bases: pybnesian.learning.scores.DynamicScore
The dynamic adaptation of the BIC score.
- __init__(self: pybnesian.learning.scores.DynamicBIC, ddf: pybnesian.dataset.DynamicDataFrame) → None
Initializes a DynamicBIC with the given DynamicDataFrame ddf.
- Parameters
ddf – DynamicDataFrame to compute the DynamicBIC score.
- class pybnesian.learning.scores.DynamicBGe
Bases: pybnesian.learning.scores.DynamicScore
The dynamic adaptation of the BGe score.
- __init__(self: pybnesian.learning.scores.DynamicBGe, ddf: pybnesian.dataset.DynamicDataFrame, iss_mu: float = 1, iss_w: Optional[float] = None, nu: Optional[numpy.ndarray[numpy.float64[m, 1]]] = None) → None
Initializes a DynamicBGe with the given DynamicDataFrame ddf.
- Parameters
ddf – DynamicDataFrame to compute the DynamicBGe score.
iss_mu – Imaginary sample size for the normal component of the normal-Wishart prior.
iss_w – Imaginary sample size for the Wishart component of the normal-Wishart prior.
nu – Mean vector of the normal-Wishart prior.
- class pybnesian.learning.scores.DynamicCVLikelihood
Bases: pybnesian.learning.scores.DynamicScore
The dynamic adaptation of the CVLikelihood score.
- __init__(self: pybnesian.learning.scores.DynamicCVLikelihood, df: pybnesian.dataset.DynamicDataFrame, k: int = 10, seed: Optional[int] = None) → None
Initializes a DynamicCVLikelihood with the given DynamicDataFrame df. The k and seed parameters are passed to the static and transition components of CVLikelihood.
- Parameters
df – DynamicDataFrame to compute the score.
k – Number of folds of the cross validation.
seed – A random seed number. If not specified or None, a random seed is generated.
- class pybnesian.learning.scores.DynamicHoldoutLikelihood
Bases: pybnesian.learning.scores.DynamicScore
The dynamic adaptation of the HoldoutLikelihood score.
- __init__(self: pybnesian.learning.scores.DynamicHoldoutLikelihood, df: pybnesian.dataset.DynamicDataFrame, test_ratio: float = 0.2, seed: Optional[int] = None) → None
Initializes a DynamicHoldoutLikelihood with the given DynamicDataFrame df. The test_ratio and seed parameters are passed to the static and transition components of HoldoutLikelihood.
- Parameters
df – DynamicDataFrame to compute the score.
test_ratio – Proportion of instances left for the holdout data.
seed – A random seed number. If not specified or None, a random seed is generated.
- class pybnesian.learning.scores.DynamicValidatedLikelihood
Bases: pybnesian.learning.scores.DynamicScore
The dynamic adaptation of the ValidatedLikelihood score.
- __init__(self: pybnesian.learning.scores.DynamicValidatedLikelihood, df: pybnesian.dataset.DynamicDataFrame, test_ratio: float = 0.2, k: int = 10, seed: Optional[int] = None) → None
Initializes a DynamicValidatedLikelihood with the given DynamicDataFrame df. The test_ratio, k and seed parameters are passed to the static and transition components of ValidatedLikelihood.
- Parameters
df – DynamicDataFrame to compute the score.
test_ratio – Proportion of instances left for the holdout data.
k – Number of folds of the cross validation.
seed – A random seed number. If not specified or None, a random seed is generated.