Structure Scores
This section describes the learning scores that evaluate the goodness of a Bayesian network structure. These scores are used by the score-and-search learning algorithms, such as GreedyHillClimbing, MMHC and DMMHC.
Abstract classes
- class pybnesian.Score
A Score scores Bayesian network structures.
- __init__(self: pybnesian.Score) -> None
Initializes a Score.
- __str__(self: pybnesian.Score) -> str
- compatible_bn(self: pybnesian.Score, model: BayesianNetworkBase or ConditionalBayesianNetworkBase) -> bool
Checks whether the model is compatible (can be used) with this Score.
- Parameters
model – A Bayesian network model.
- Returns
True if the Bayesian network model is compatible with this Score, False otherwise.
- data(self: pybnesian.Score) -> DataFrame
Returns the DataFrame used to calculate the score and local scores.
- Returns
DataFrame used to calculate the score. If the score does not use data, it returns None.
- has_variables(self: pybnesian.Score, variables: str or List[str]) -> bool
Checks whether this Score has the given variables.
- Parameters
variables – Name or list of variables.
- Returns
True if the Score is defined over the set of variables, False otherwise.
- local_score(*args, **kwargs)
Overloaded function.
local_score(self: pybnesian.Score, model: pybnesian.ConditionalBayesianNetworkBase, variable: str) -> float
local_score(self: pybnesian.Score, model: pybnesian.BayesianNetworkBase, variable: str) -> float
Returns the local score value of a node variable in the model. For example:
>>> score.local_score(m, "a")
returns the local score of node "a" in the model m. This method assumes that the parents in the score are m.parents("a") and its node type is m.node_type("a").
- Parameters
model – Bayesian network model.
variable – A variable name.
- Returns
Local score value of variable in the model.
local_score(self: pybnesian.Score, model: pybnesian.ConditionalBayesianNetworkBase, variable: str, evidence: List[str]) -> float
local_score(self: pybnesian.Score, model: pybnesian.BayesianNetworkBase, variable: str, evidence: List[str]) -> float
Returns the local score value of a node variable in the model if it had evidence as parents. For example:
>>> score.local_score(m, "a", ["b"])
returns the local score of node "a" in the model m, with ["b"] as parents. This method assumes that the node type of "a" is m.node_type("a").
- Parameters
model – Bayesian network model.
variable – A variable name.
evidence – A list of parent names.
- Returns
Local score value of variable in the model with evidence as parents.
- local_score_node_type(self: pybnesian.Score, model: pybnesian.BayesianNetworkBase, variable_type: pybnesian.FactorType, variable: str, evidence: List[str]) -> float
Returns the local score value of a node variable in the model if its conditional distribution were a variable_type factor and it had evidence as parents. For example:
>>> score.local_score_node_type(m, LinearGaussianCPDType(), "a", ["b"])
returns the local score of node "a" in the model m, with ["b"] as parents, assuming the conditional distribution of "a" is a LinearGaussianCPD.
- Parameters
model – Bayesian network model.
variable_type – The FactorType of the node variable.
variable – A variable name.
evidence – A list of parent names.
- Returns
Local score value of variable in the model with evidence as parents and variable_type as conditional distribution.
- score(self: pybnesian.Score, model: BayesianNetworkBase or ConditionalBayesianNetworkBase) -> float
Returns the score value of the model.
- Parameters
model – Bayesian network model.
- Returns
Score value of model.
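The relationship between a total score and its local scores can be illustrated with a toy example. The sketch below does not use pybnesian: `local_loglik`, `total_score` and the list-of-dicts data layout are hypothetical names chosen for illustration. It assumes a decomposable score (here, a plain maximum-likelihood log-likelihood over discrete data), so that the score of a structure is the sum of one local score per node given its parents.

```python
import math
from collections import Counter

def local_loglik(rows, variable, parents):
    """Toy local score: ML log-likelihood of `variable` given `parents` (discrete data)."""
    # Count each (parent configuration, value) pair and each parent configuration.
    joint = Counter((tuple(r[p] for p in parents), r[variable]) for r in rows)
    parent_counts = Counter(tuple(r[p] for p in parents) for r in rows)
    return sum(n * math.log(n / parent_counts[cfg]) for (cfg, _), n in joint.items())

def total_score(rows, parents_of):
    """Toy total score: sum of the local scores of every node (decomposability assumption)."""
    return sum(local_loglik(rows, v, ps) for v, ps in parents_of.items())

rows = [
    {"a": 0, "b": 0}, {"a": 0, "b": 0}, {"a": 1, "b": 1}, {"a": 1, "b": 0},
]
structure = {"a": [], "b": ["a"]}  # hypothetical structure with the arc a -> b

local_a = local_loglik(rows, "a", [])
local_b = local_loglik(rows, "b", ["a"])       # node "b" with its current parents
local_b_alone = local_loglik(rows, "b", [])    # node "b" as if it had no parents
total = total_score(rows, structure)
```

Evaluating a node with a different parent set, as in `local_b_alone`, plays the same role as passing an explicit `evidence` list: a candidate parent set can be compared without modifying the structure.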
- class pybnesian.ValidatedScore
Bases: Score
A ValidatedScore is a score with training and validation scores. In a ValidatedScore, the training is driven by the training score through the functions Score.score(), Score.local_score_variable(), Score.local_score() and Score.local_score_node_type(). The convergence of the structure is evaluated using a validation likelihood (usually defined over different data) through the functions ValidatedScore.vscore(), ValidatedScore.vlocal_score_variable(), ValidatedScore.vlocal_score() and ValidatedScore.vlocal_score_node_type().
- __init__(self: pybnesian.ValidatedScore) -> None
- vlocal_score(*args, **kwargs)
Overloaded function.
vlocal_score(self: pybnesian.ValidatedScore, model: pybnesian.ConditionalBayesianNetworkBase, variable: str) -> float
vlocal_score(self: pybnesian.ValidatedScore, model: pybnesian.BayesianNetworkBase, variable: str) -> float
vlocal_score(self: pybnesian.ValidatedScore, model: BayesianNetworkBase or ConditionalBayesianNetworkBase, variable: str) -> float
Returns the validated local score value of a node variable in the model. For example:
>>> score.vlocal_score(m, "a")
returns the validated local score of node "a" in the model m. This method assumes that the parents of "a" are m.parents("a") and its node type is m.node_type("a").
- Parameters
model – Bayesian network model.
variable – A variable name.
- Returns
Validated local score value of variable in the model.
vlocal_score(self: pybnesian.ValidatedScore, arg0: pybnesian.ConditionalBayesianNetworkBase, arg1: str, arg2: List[str]) -> float
vlocal_score(self: pybnesian.ValidatedScore, model: pybnesian.BayesianNetworkBase, variable: str, evidence: List[str]) -> float
vlocal_score(self: pybnesian.ValidatedScore, model: BayesianNetworkBase or ConditionalBayesianNetworkBase, variable: str, evidence: List[str]) -> float
Returns the validated local score value of a node variable in the model if it had evidence as parents. For example:
>>> score.vlocal_score(m, "a", ["b"])
returns the validated local score of node "a" in the model m, with ["b"] as parents. This method assumes that the node type of "a" is m.node_type("a").
- Parameters
model – Bayesian network model.
variable – A variable name.
evidence – A list of parent names.
- Returns
Validated local score value of variable in the model with evidence as parents.
- vlocal_score_node_type(self: pybnesian.ValidatedScore, model: pybnesian.BayesianNetworkBase, variable_type: pybnesian.FactorType, variable: str, evidence: List[str]) -> float
Returns the validated local score value of a node variable in the model if its conditional distribution were a variable_type factor and it had evidence as parents. For example:
>>> score.vlocal_score_node_type(m, LinearGaussianCPDType(), "a", ["b"])
returns the validated local score of node "a" in the model m, with ["b"] as parents, assuming the conditional distribution of "a" is a LinearGaussianCPD.
- Parameters
model – Bayesian network model.
variable_type – The FactorType of the node variable.
variable – A variable name.
evidence – A list of parent names.
- Returns
Validated local score value of variable in the model with evidence as parents and variable_type as conditional distribution.
- vscore(self: pybnesian.ValidatedScore, model: BayesianNetworkBase or ConditionalBayesianNetworkBase) -> float
Returns the validated score value of the model.
- Parameters
model – Bayesian network model.
- Returns
Validated score value of model.
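The division of labour between training and validation scores can be sketched with a toy greedy search. This is not pybnesian code: `hill_climb`, the move names and the score deltas are made up for illustration, and the stopping rule is a simplified assumption. The training delta chooses which move to apply, and the validation delta decides whether the search has converged.

```python
def hill_climb(iterations):
    """Toy greedy search over precomputed candidate moves.

    `iterations` is a list; each item lists the candidate moves of one iteration
    as (move_name, training_delta, validation_delta) tuples (all hypothetical).
    """
    applied = []
    for moves in iterations:
        # The training score drives the search: pick the best training delta.
        move, train_delta, val_delta = max(moves, key=lambda m: m[1])
        if train_delta <= 0:
            break  # no move improves the training score
        if val_delta <= 0:
            break  # the validation score stopped improving: treat as converged
        applied.append(move)
    return applied

iterations = [
    [("add a->b", 5.0, 3.0), ("add b->c", 2.0, 4.0)],
    [("add b->c", 2.0, 1.5), ("add a->c", 0.5, 0.1)],
    [("add a->c", 0.4, -0.2)],
    [("add c->d", 1.0, 1.0)],
]
applied = hill_climb(iterations)
```

In the first iteration the move with the larger training delta is chosen even though another move has a larger validation delta, and the search stops at the third iteration, where the chosen move lowers the validation score.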
- class pybnesian.DynamicScore
A DynamicScore adapts the static Score to learn dynamic Bayesian networks. It generates a static and a transition score to learn the static and transition components of the dynamic Bayesian network.
The dynamic scores are usually implemented using a DynamicDataFrame with the methods DynamicDataFrame.static_df and DynamicDataFrame.transition_df.
- __init__(self: pybnesian.DynamicScore) -> None
Initializes a DynamicScore.
- has_variables(self: pybnesian.DynamicScore, variables: str or List[str]) -> bool
Checks whether this DynamicScore has the given variables.
- Parameters
variables – Name or list of variables.
- Returns
True if the DynamicScore is defined over the set of variables, False otherwise.
- static_score(self: pybnesian.DynamicScore) -> pybnesian.Score
Returns the static score component of the DynamicScore.
- Returns
The static score component.
- transition_score(self: pybnesian.DynamicScore) -> pybnesian.Score
Returns the transition score component of the DynamicScore.
- Returns
The transition score component.
Concrete classes
- class pybnesian.BIC
Bases: Score
This class implements the Bayesian Information Criterion (BIC).
- __init__(self: pybnesian.BIC, df: DataFrame) -> None
Initializes a BIC with the given DataFrame df.
- Parameters
df – DataFrame to compute the BIC score.
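As a reminder of what BIC measures, the following sketch computes a penalized log-likelihood for a single Gaussian variable without parents. It uses only the standard library and is not the pybnesian implementation; `gaussian_loglik` and `bic` are hypothetical helpers, and the sign convention (log-likelihood minus penalty, so higher is better) is an assumption of this example.

```python
import math

def gaussian_loglik(xs):
    """Log-likelihood of xs under a Gaussian fitted by maximum likelihood."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    # At the ML estimate the quadratic term sums to n / 2.
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)

def bic(loglik, n_params, n_samples):
    """Penalized log-likelihood: each free parameter costs 0.5 * log(N)."""
    return loglik - 0.5 * n_params * math.log(n_samples)

xs = [1.0, 2.0, 3.0, 4.0]
ll = gaussian_loglik(xs)
# Two free parameters: the mean and the variance.
bic_value = bic(ll, n_params=2, n_samples=len(xs))
```

The penalty grows with the number of parameters, so adding a parent to a node only raises the score when the gain in log-likelihood outweighs the extra parameters.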
- class pybnesian.BGe
Bases: Score
This class implements the Bayesian Gaussian equivalent (BGe).
- __init__(self: pybnesian.BGe, df: DataFrame, iss_mu: float = 1, iss_w: Optional[float] = None, nu: Optional[numpy.ndarray[numpy.float64[m, 1]]] = None) -> None
Initializes a BGe with the given DataFrame df.
- Parameters
df – DataFrame to compute the BGe score.
iss_mu – Imaginary sample size for the normal component of the normal-Wishart prior.
iss_w – Imaginary sample size for the Wishart component of the normal-Wishart prior.
nu – Mean vector of the normal-Wishart prior.
- class pybnesian.BDe
Bases: Score
This class implements the Bayesian Dirichlet equivalent (BDe).
- __init__(self: pybnesian.BDe, df: DataFrame, iss: float = 1) -> None
Initializes a BDe with the given DataFrame df.
- Parameters
df – DataFrame to compute the BDe score.
iss – Imaginary sample size of the Dirichlet prior.
- class pybnesian.CVLikelihood
Bases: Score
This class implements an estimation of the log-likelihood on unseen data using k-fold cross validation over the data.
- __init__(self: pybnesian.CVLikelihood, df: DataFrame, k: int = 10, seed: Optional[int] = None, construction_args: pybnesian.Arguments = Arguments) -> None
Initializes a CVLikelihood with the given DataFrame df. It uses a CrossValidation with k folds and the given seed.
- Parameters
df – DataFrame to compute the score.
k – Number of folds of the cross validation.
seed – A random seed number. If not specified or None, a random seed is generated.
construction_args – Additional arguments provided to construct the Factor.
- property cv
The underlying CrossValidation object to compute the score.
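The idea of a cross-validated likelihood can be sketched with the standard library alone. The code below is an illustration under simplifying assumptions, not the pybnesian implementation: it scores a single Gaussian variable without parents, the helper names are hypothetical, and summing the held-out log-likelihood over folds is an assumption of this example.

```python
import math
import random

def fit_gaussian(xs):
    """Maximum-likelihood mean and variance."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return mean, var

def gaussian_logpdf(x, mean, var):
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def cv_loglik(xs, k=10, seed=None):
    """Sum of held-out log-likelihoods over k folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)          # the seed fixes the fold assignment
    folds = [indices[i::k] for i in range(k)]
    total = 0.0
    for test_idx in folds:
        held_out = set(test_idx)
        train = [xs[i] for i in indices if i not in held_out]
        mean, var = fit_gaussian(train)           # parameters from the training folds
        total += sum(gaussian_logpdf(xs[i], mean, var) for i in test_idx)
    return total

data = [0.1 * i for i in range(50)]
score_a = cv_loglik(data, k=5, seed=0)
score_b = cv_loglik(data, k=5, seed=0)
```

Fixing the seed makes the fold assignment, and therefore the score, reproducible; with no seed the folds differ between runs.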
- class pybnesian.HoldoutLikelihood
Bases: Score
This class implements an estimation of the log-likelihood on unseen data using a holdout dataset. Thus, the parameters are estimated using training data, and the score is estimated on the holdout data.
- __init__(self: pybnesian.HoldoutLikelihood, df: DataFrame, test_ratio: float = 0.2, seed: Optional[int] = None, construction_args: pybnesian.Arguments = Arguments) -> None
Initializes a HoldoutLikelihood with the given DataFrame df. It uses a HoldOut with the given test_ratio and seed.
- Parameters
df – DataFrame to compute the score.
test_ratio – Proportion of instances left for the holdout data.
seed – A random seed number. If not specified or None, a random seed is generated.
construction_args – Additional arguments provided to construct the Factor.
- test_data(self: pybnesian.HoldoutLikelihood) -> DataFrame
Gets the holdout data of the HoldOut object.
- training_data(self: pybnesian.HoldoutLikelihood) -> DataFrame
Gets the training data of the HoldOut object.
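A holdout estimate follows the same pattern with a single split. The sketch below is a standard-library illustration, not the pybnesian implementation: `holdout_split` and `holdout_loglik` are hypothetical helpers, the variable is a single Gaussian without parents, and rounding the holdout size to the nearest integer is an assumption of this example.

```python
import math
import random

def holdout_split(rows, test_ratio=0.2, seed=None):
    """Shuffle the rows and reserve a `test_ratio` fraction as holdout data."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(test_ratio * len(shuffled)))
    return shuffled[n_test:], shuffled[:n_test]   # (training data, holdout data)

def holdout_loglik(train, test):
    """Fit a Gaussian on the training data and score the holdout data."""
    n = len(train)
    mean = sum(train) / n
    var = sum((x - mean) ** 2 for x in train) / n
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var) for x in test
    )

data = [float(i % 7) for i in range(100)]
train, test = holdout_split(data, test_ratio=0.2, seed=42)
holdout_score = holdout_loglik(train, test)
```

The two returned lists correspond conceptually to what `training_data()` and `test_data()` expose: disjoint parts of the original data, with parameters estimated on the first and the score computed on the second.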
- class pybnesian.ValidatedLikelihood
Bases: ValidatedScore
This class mixes the functionality of CVLikelihood and HoldoutLikelihood. First, it applies a HoldOut split over the data. Then:
- It estimates the training score using a CVLikelihood over the training data.
- It estimates the validation score using the training data to estimate the parameters and calculating the log-likelihood on the holdout data.
- __init__(self: pybnesian.ValidatedLikelihood, df: DataFrame, test_ratio: float = 0.2, k: int = 10, seed: Optional[int] = None, construction_args: pybnesian.Arguments = Arguments) -> None
Initializes a ValidatedLikelihood with the given DataFrame df. The HoldOut is initialized with test_ratio and seed. The CVLikelihood is initialized with k and seed over the training data of the HoldOut.
- Parameters
df – DataFrame to compute the score.
test_ratio – Proportion of instances left for the holdout data.
k – Number of folds of the cross validation.
seed – A random seed number. If not specified or None, a random seed is generated.
construction_args – Additional arguments provided to construct the Factor.
- property cv_lik
The underlying CVLikelihood to compute the training score.
- property holdout_lik
The underlying HoldoutLikelihood to compute the validation score.
- training_data(self: pybnesian.ValidatedLikelihood) -> DataFrame
The underlying training data of the HoldOut.
- validation_data(self: pybnesian.ValidatedLikelihood) -> DataFrame
The underlying holdout data of the HoldOut.
- class pybnesian.DynamicBIC
Bases: DynamicScore
The dynamic adaptation of the BIC score.
- __init__(self: pybnesian.DynamicBIC, ddf: pybnesian.DynamicDataFrame) -> None
Initializes a DynamicBIC with the given DynamicDataFrame ddf.
- Parameters
ddf – DynamicDataFrame to compute the DynamicBIC score.
- class pybnesian.DynamicBGe
Bases: DynamicScore
The dynamic adaptation of the BGe score.
- __init__(self: pybnesian.DynamicBGe, ddf: pybnesian.DynamicDataFrame, iss_mu: float = 1, iss_w: Optional[float] = None, nu: Optional[numpy.ndarray[numpy.float64[m, 1]]] = None) -> None
Initializes a DynamicBGe with the given DynamicDataFrame ddf.
- Parameters
ddf – DynamicDataFrame to compute the DynamicBGe score.
iss_mu – Imaginary sample size for the normal component of the normal-Wishart prior.
iss_w – Imaginary sample size for the Wishart component of the normal-Wishart prior.
nu – Mean vector of the normal-Wishart prior.
- class pybnesian.DynamicBDe
Bases: DynamicScore
The dynamic adaptation of the BDe score.
- __init__(self: pybnesian.DynamicBDe, ddf: pybnesian.DynamicDataFrame, iss: float = 1) -> None
Initializes a DynamicBDe with the given DynamicDataFrame ddf.
- Parameters
ddf – DynamicDataFrame to compute the DynamicBDe score.
iss – Imaginary sample size of the Dirichlet prior.
- class pybnesian.DynamicCVLikelihood
Bases: DynamicScore
The dynamic adaptation of the CVLikelihood score.
- __init__(self: pybnesian.DynamicCVLikelihood, df: pybnesian.DynamicDataFrame, k: int = 10, seed: Optional[int] = None) -> None
Initializes a DynamicCVLikelihood with the given DynamicDataFrame df. The k and seed parameters are passed to the static and transition components of CVLikelihood.
- Parameters
df – DynamicDataFrame to compute the score.
k – Number of folds of the cross validation.
seed – A random seed number. If not specified or None, a random seed is generated.
- class pybnesian.DynamicHoldoutLikelihood
Bases: DynamicScore
The dynamic adaptation of the HoldoutLikelihood score.
- __init__(self: pybnesian.DynamicHoldoutLikelihood, df: pybnesian.DynamicDataFrame, test_ratio: float = 0.2, seed: Optional[int] = None) -> None
Initializes a DynamicHoldoutLikelihood with the given DynamicDataFrame df. The test_ratio and seed parameters are passed to the static and transition components of HoldoutLikelihood.
- Parameters
df – DynamicDataFrame to compute the score.
test_ratio – Proportion of instances left for the holdout data.
seed – A random seed number. If not specified or None, a random seed is generated.
- class pybnesian.DynamicValidatedLikelihood
Bases: DynamicScore
The dynamic adaptation of the ValidatedLikelihood score.
- __init__(self: pybnesian.DynamicValidatedLikelihood, df: pybnesian.DynamicDataFrame, test_ratio: float = 0.2, k: int = 10, seed: Optional[int] = None) -> None
Initializes a DynamicValidatedLikelihood with the given DynamicDataFrame df. The test_ratio, k and seed parameters are passed to the static and transition components of ValidatedLikelihood.
- Parameters
df – DynamicDataFrame to compute the score.
test_ratio – Proportion of instances left for the holdout data.
k – Number of folds of the cross validation.
seed – A random seed number. If not specified or None, a random seed is generated.