pysr3.lme.models module

Linear Mixed-Effects Models (simple, relaxed, and regularized)

class pysr3.lme.models.CADLmeModel(tol_solver: float = 1e-05, initializer: str = 'None', max_iter_solver: int = 10000, stepping: str = 'line-search', rho: float = 0.3, lam: float = 1.0, elastic_eps: float = 0.0001, logger_keys: Set = ('converged',), fixed_step_len=None, prior=None, **kwargs)

Bases: SimpleLMEModel

Implements a CAD-regularized Linear Mixed-Effects Model

Initializes the model

Parameters:
  • tol_solver (float) – tolerance for the stop criterion of PGD solver

  • initializer (str) – pre-initialization. Can be “None”, in which case the algorithm starts with “all-ones” starting parameters, or “EM”, in which case the algorithm does one step of EM algorithm

  • max_iter_solver (int) – maximal number of iterations for PGD solver

  • stepping (str) – step-size policy for PGD. Can be either “line-search” or “fixed”

  • lam (float) – strength of CAD regularizer

  • rho (float) – cut-off amplitude above which the coefficients are not penalized

  • logger_keys (List[str]) – list of keys for the parameters that the logger should track

  • fixed_step_len (float) – step-size for PGD algorithm. If “line-search” is used for stepping then the algorithm uses this value as the maximal step possible. If you know the Lipschitz-smoothness constant L of your problem, set fixed_step_len=1/L.

  • prior (Optional[Prior]) – an instance of Prior class. If None then a non-informative prior is used.

  • kwargs – for passing debugging info

instantiate()

Instantiates the model: creates all internal entities such as oracle, regularizer, and solver

Returns:

Tuple of [Oracle, Regularizer, Solver] that correspond to this model
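
The roles of lam and rho can be illustrated with a small sketch. It assumes that CAD stands for a clipped absolute deviation penalty of the form lam * sum(min(|x_i|, rho)); this form and the helper name cad_penalty are assumptions made for illustration, and the scaling used inside pysr3 may differ.

```python
def cad_penalty(x, rho, lam=1.0):
    """Clipped absolute deviation: each |x_i| is penalized only up to rho.

    Assumed form for illustration; not taken from the pysr3 source.
    """
    return lam * sum(min(abs(v), rho) for v in x)


coefficients = [0.1, -0.2, 5.0]
# With rho=0.3 the large coefficient 5.0 contributes only 0.3, the same as
# any other coefficient whose magnitude is at least rho.
penalty = cad_penalty(coefficients, rho=0.3)
print(penalty)
```

Under this assumed form, growing a coefficient beyond rho does not increase the penalty, which matches the description of rho as the cut-off amplitude.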

class pysr3.lme.models.CADLmeModelSR3(tol_oracle: float = 1e-05, tol_solver: float = 1e-05, initializer: str = 'None', max_iter_oracle: int = 10000, max_iter_solver: int = 10000, stepping: str = 'fixed', ell: float = 40.0, rho: float = 0.3, lam: float = 1.0, elastic_eps: float = 0.0001, central_path_neighbourhood_target: float = 0.5, logger_keys: Set = ('converged',), warm_start_oracle=True, practical=False, take_only_positive_part=True, take_expected_value=False, update_prox_every=1, fixed_step_len=None, prior=None, **kwargs)

Bases: SimpleLMEModelSR3

Implements a CAD-regularized SR3-relaxed Linear Mixed-Effect Model

Initializes the model

Parameters:
  • tol_oracle (float) – tolerance for SR3 oracle’s internal numerical subroutines

  • tol_solver (float) – tolerance for the stop criterion of PGD solver

  • initializer (str) – pre-initialization. Can be “None”, in which case the algorithm starts with “all-ones” starting parameters, or “EM”, in which case the algorithm does one step of EM algorithm

  • max_iter_solver (int) – maximal number of iterations for PGD solver

  • stepping (str) – step-size policy for PGD. Can be either “line-search” or “fixed”

  • ell (float) – level of SR3-relaxation

  • lam (float) – strength of CAD regularizer

  • rho (float) – cut-off amplitude above which the coefficients are not penalized

  • participation_in_selection (ndarray of int) – 1 if the feature should be affected by the regularizer, 0 otherwise

  • logger_keys (List[str]) – list of keys for the parameters that the logger should track

  • warm_start_oracle (bool) – if fitting should be started from the current model’s coefficients. Used for fine-tuning and iterative fitting.

  • practical (bool) – whether to use SR3-Practical method, which works faster at the expense of accuracy

  • update_prox_every (int) – how often to update the relaxed variables. Used only if practical=True

  • fixed_step_len (float) – step-size for PGD algorithm. If “line-search” is used for stepping then the algorithm uses this value as the maximal step possible. If you know the Lipschitz-smoothness constant L of your problem, set fixed_step_len=1/L.

  • prior (Optional[Prior]) – an instance of Prior class. If None then a non-informative prior is used.

  • kwargs – for passing debugging info

instantiate()

Instantiates the model: creates all internal entities such as oracle, regularizer, and solver

Returns:

Tuple of [Oracle, Regularizer, Solver] that correspond to this model

class pysr3.lme.models.L0LmeModel(tol_solver: float = 1e-05, initializer: str = 'None', max_iter_solver: int = 10000, stepping: str = 'line-search', nnz_tbeta: int | None = None, nnz_tgamma: int | None = None, elastic_eps: float = 0.0001, logger_keys: Set = ('converged',), fixed_step_len=None, prior=None, **kwargs)

Bases: SimpleLMEModel

Implements an L0-regularized Linear Mixed-Effect Model. It allows specifying the maximal number of non-zero fixed and random effects in your model

Initializes the model

Parameters:
  • tol_solver (float) – tolerance for the stop criterion of PGD solver

  • initializer (str) – pre-initialization. Can be “None”, in which case the algorithm starts with “all-ones” starting parameters, or “EM”, in which case the algorithm does one step of EM algorithm

  • max_iter_solver (int) – maximal number of iterations for PGD solver

  • stepping (str) – step-size policy for PGD. Can be either “line-search” or “fixed”

  • nnz_tbeta (int) – the maximal number of non-zero fixed effects in your model

  • nnz_tgamma (int) – the maximal number of non-zero random effects in your model

  • logger_keys (List[str]) – list of keys for the parameters that the logger should track

  • fixed_step_len (float) – step-size for PGD algorithm. If “line-search” is used for stepping then the algorithm uses this value as the maximal step possible. If you know the Lipschitz-smoothness constant L of your problem, set fixed_step_len=1/L.

  • prior (Optional[Prior]) – an instance of Prior class. If None then a non-informative prior is used.

  • kwargs – for passing debugging info

instantiate()

Instantiates the model: creates all internal entities such as oracle, regularizer, and solver

Returns:

Tuple of [Oracle, Regularizer, Solver] that correspond to this model
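
What a cap such as nnz_tbeta or nnz_tgamma means for a coefficient vector can be sketched as follows. The helper keep_k_largest is hypothetical and only shows the standard projection onto vectors with at most k non-zero entries; it is not claimed to be the routine pysr3 uses internally.

```python
def keep_k_largest(values, k):
    """Zero out all but the k entries of largest absolute value."""
    if k <= 0:
        return [0.0] * len(values)
    # Indices ordered by decreasing magnitude; sorted() is stable, so ties
    # are resolved in favour of the earlier position.
    order = sorted(range(len(values)), key=lambda i: -abs(values[i]))
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(values)]


beta = [0.1, -2.0, 0.7, 3.5, -0.05]
sparse_beta = keep_k_largest(beta, 2)  # at most two non-zero fixed effects
print(sparse_beta)
```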

class pysr3.lme.models.L0LmeModelSR3(tol_oracle: float = 1e-05, tol_solver: float = 1e-05, initializer: str = 'None', max_iter_oracle: int = 1000, max_iter_solver: int = 1000, stepping: str = 'fixed', ell: float = 40, elastic_eps: float = 0.0001, central_path_neighbourhood_target=0.5, nnz_tbeta: int = 1, nnz_tgamma: int = 1, logger_keys: Set = ('converged',), warm_start_oracle=True, practical=False, take_only_positive_part=True, take_expected_value=False, update_prox_every=1, fixed_step_len=None, prior=None, **kwargs)

Bases: SimpleLMEModelSR3

Implements Regularized Linear Mixed-Effects Model functional for given problem:

Y_i = X_i*β + Z_i*u_i + 𝜺_i,

where

β ~ 𝒩(tβ, 1/lb),

||tβ||_0 = nnz(tβ) <= nnz_tbeta,

u_i ~ 𝒩(0, diag(𝛄)),

𝛄 ~ 𝒩(t𝛄, 1/lg),

||t𝛄||_0 = nnz(t𝛄) <= nnz_tgamma,

𝜺_i ~ 𝒩(0, Λ)

Here tβ and t𝛄 are single variables, not multiplications (e.g. not t*β). This oracle is designed for a solver (LinearLMESparseModel) which searches for a sparse solution (tβ, t𝛄) with at most k and j <= k non-zero elements respectively. For more details, see the documentation for LinearLMESparseModel.

The problem should be provided as LMEProblem.

Initializes the model

Parameters:
  • tol_oracle (float) – tolerance for SR3 oracle’s internal numerical subroutines

  • tol_solver (float) – tolerance for the stop criterion of PGD solver

  • initializer (str) – pre-initialization. Can be “None”, in which case the algorithm starts with “all-ones” starting parameters, or “EM”, in which case the algorithm does one step of EM algorithm

  • max_iter_solver (int) – maximal number of iterations for PGD solver

  • stepping (str) – step-size policy for PGD. Can be either “line-search” or “fixed”

  • lb (float) – level of SR3-relaxation for fixed effects

  • lg (float) – level of SR3-relaxation for random effects

  • nnz_tbeta (int) – the maximal number of non-zero fixed effects in your model

  • nnz_tgamma (int) – the maximal number of non-zero random effects in your model

  • logger_keys (List[str]) – list of keys for the parameters that the logger should track

  • warm_start_oracle (bool) – if fitting should be started from the current model’s coefficients. Used for fine-tuning and iterative fitting.

  • practical (bool) – whether to use SR3-Practical method, which works faster at the expense of accuracy

  • update_prox_every (int) – how often to update the relaxed variables. Used only if practical=True

  • fixed_step_len (float) – step-size for PGD algorithm. If “line-search” is used for stepping then the algorithm uses this value as the maximal step possible. If you know the Lipschitz-smoothness constant L of your problem, set fixed_step_len=1/L.

  • prior (Optional[Prior]) – an instance of Prior class. If None then a non-informative prior is used.

  • kwargs – for passing debugging info

instantiate()

Instantiates the model: creates all internal entities such as oracle, regularizer, and solver

Returns:

Tuple of [Oracle, Regularizer, Solver] that correspond to this model

class pysr3.lme.models.L1LmeModel(tol_solver: float = 1e-05, initializer: str = 'None', max_iter_solver: int = 10000, stepping: str = 'line-search', lam: float = 0, elastic_eps: float = 0.0001, logger_keys: Set = ('converged',), fixed_step_len=None, prior=None, **kwargs)

Bases: SimpleLMEModel

Implements a LASSO-regularized Linear Mixed-Effect Model

Initializes the model

Parameters:
  • tol_solver (float) – tolerance for the stop criterion of PGD solver

  • initializer (str) – pre-initialization. Can be “None”, in which case the algorithm starts with “all-ones” starting parameters, or “EM”, in which case the algorithm does one step of EM algorithm

  • max_iter_solver (int) – maximal number of iterations for PGD solver

  • stepping (str) – step-size policy for PGD. Can be either “line-search” or “fixed”

  • lam (float) – strength of LASSO regularizer

  • logger_keys (List[str]) – list of keys for the parameters that the logger should track

  • fixed_step_len (float) – step-size for PGD algorithm. If “line-search” is used for stepping then the algorithm uses this value as the maximal step possible. If you know the Lipschitz-smoothness constant L of your problem, set fixed_step_len=1/L.

  • prior (Optional[Prior]) – an instance of Prior class. If None then a non-informative prior is used.

  • kwargs – for passing debugging info

instantiate()

Instantiates the model: creates all internal entities such as oracle, regularizer, and solver

Returns:

Tuple of [Oracle, Regularizer, Solver] that correspond to this model
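
For a LASSO penalty, the proximal step of a proximal gradient method is the well-known soft-thresholding operator. The sketch below shows that operator for a single coefficient; the function name is illustrative, and how pysr3 combines it with the step size and with constraints on the parameters is not shown here.

```python
def soft_threshold(x, threshold):
    """Proximal operator of threshold * |x|: shrink x towards zero."""
    if x > threshold:
        return x - threshold
    if x < -threshold:
        return x + threshold
    return 0.0  # coefficients within the threshold are set exactly to zero


# A larger lam (times the step size) means a larger threshold,
# so more coefficients become exactly zero.
shrunk = [soft_threshold(v, 1.0) for v in [3.0, -0.5, -3.0]]
print(shrunk)
```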

class pysr3.lme.models.L1LmeModelSR3(tol_oracle: float = 1e-05, tol_solver: float = 1e-05, initializer: str = 'None', max_iter_oracle: int = 10000, max_iter_solver: int = 10000, stepping: str = 'fixed', ell: float = 40, elastic_eps: float = 0.0001, lam: float = 1, logger_keys: Set = ('converged',), warm_start_oracle=True, practical=False, take_only_positive_part=True, take_expected_value=False, update_prox_every=1, central_path_neighbourhood_target=0.5, fixed_step_len=None, prior=None, **kwargs)

Bases: SimpleLMEModelSR3

Implements an SR3-relaxed LASSO-regularized Linear Mixed-Effect Model

Initializes the model

Parameters:
  • tol_oracle (float) – tolerance for SR3 oracle’s internal numerical subroutines

  • tol_solver (float) – tolerance for the stop criterion of PGD solver

  • initializer (str) – pre-initialization. Can be “None”, in which case the algorithm starts with “all-ones” starting parameters, or “EM”, in which case the algorithm does one step of EM algorithm

  • max_iter_solver (int) – maximal number of iterations for PGD solver

  • stepping (str) – step-size policy for PGD. Can be either “line-search” or “fixed”

  • ell (float) – level of SR3-relaxation

  • lam (float) – strength of LASSO regularizer

  • participation_in_selection (ndarray of int) – 1 if the feature should be affected by the regularizer, 0 otherwise

  • logger_keys (List[str]) – list of keys for the parameters that the logger should track

  • warm_start_oracle (bool) – if fitting should be started from the current model’s coefficients. Used for fine-tuning and iterative fitting.

  • practical (bool) – whether to use SR3-Practical method, which works faster at the expense of accuracy

  • update_prox_every (int) – how often to update the relaxed variables. Used only if practical=True

  • fixed_step_len (float) – step-size for PGD algorithm. If “line-search” is used for stepping then the algorithm uses this value as the maximal step possible. If you know the Lipschitz-smoothness constant L of your problem, set fixed_step_len=1/L.

  • prior (Optional[Prior]) – an instance of Prior class. If None then a non-informative prior is used.

  • kwargs – for passing debugging info

instantiate()

Instantiates the model: creates all internal entities such as oracle, regularizer, and solver

Returns:

Tuple of [Oracle, Regularizer, Solver] that correspond to this model
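
The effect of the relaxation level ell can be seen on a one-dimensional toy problem. The sketch assumes the generic SR3 construction: the smooth loss keeps its variable x, the regularizer acts on an auxiliary variable w, and the two are tied by a quadratic coupling term (ell/2)*(x - w)^2. The toy loss 0.5*(x - b)^2 and the function names are illustrative assumptions; pysr3 applies the idea to the LME likelihood and may scale the terms differently.

```python
def soft_threshold(x, threshold):
    """Proximal operator of threshold * |x|."""
    if x > threshold:
        return x - threshold
    if x < -threshold:
        return x + threshold
    return 0.0


def sr3_lasso_1d(b, lam=1.0, ell=40.0, n_iter=2000):
    """Alternating minimization of
    0.5*(x - b)**2 + (ell/2)*(x - w)**2 + lam*|w|  over x and w."""
    x, w = b, 0.0
    for _ in range(n_iter):
        # Exact minimizer in x for fixed w (a quadratic in x).
        x = (b + ell * w) / (1.0 + ell)
        # Exact minimizer in w for fixed x (soft-thresholding).
        w = soft_threshold(x, lam / ell)
    return x, w


x_large, w_large = sr3_lasso_1d(b=3.0)   # w settles at a non-zero value
x_small, w_small = sr3_lasso_1d(b=0.5)   # w is driven exactly to zero
print(w_large, w_small)
```

In this toy problem the sparse variable w ends up at soft_threshold(b, lam*(1 + ell)/ell), so a larger ell brings the relaxed solution closer to the plain LASSO solution soft_threshold(b, lam).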

class pysr3.lme.models.LMEModel(logger_keys: Set = ('converged', 'iteration'), initializer: str = 'None')

Bases: BaseEstimator, RegressorMixin

Solve Linear Mixed Effects problem with projected gradient descent method.

The original statistical model which this loss is based on is:

Y_i = X_i*β + Z_i*u_i + 𝜺_i,

where

u_i ~ 𝒩(0, diag(𝛄)),

𝛄 ~ 𝒩(t𝛄, 1/lg)

β ~ 𝒩(tβ, 1/lb)

𝜺_i ~ 𝒩(0, diag(obs_var))

See the paper for more details.
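
A small simulation may help in reading the model equation Y_i = X_i*β + Z_i*u_i + 𝜺_i. The sketch below draws data from it using only the standard library. It assumes Z_i = X_i (every feature has both a fixed and a random effect) and a single noise variance shared by all observations; the function simulate_lme is hypothetical and is not part of pysr3.

```python
import random


def simulate_lme(n_groups, group_size, beta, gamma, obs_var, seed=0):
    """Draw (group, features, y) rows from Y_i = X_i*beta + Z_i*u_i + eps_i.

    Assumptions of this sketch: Z_i = X_i, and one scalar obs_var for all rows.
    """
    rng = random.Random(seed)
    rows = []
    for group in range(n_groups):
        # One random-effects vector per group: u_i ~ N(0, diag(gamma)).
        # gamma holds variances, so the standard deviation is sqrt(gamma).
        u = [rng.gauss(0.0, g ** 0.5) for g in gamma]
        for _ in range(group_size):
            features = [rng.gauss(0.0, 1.0) for _ in beta]
            noise = rng.gauss(0.0, obs_var ** 0.5)
            fixed_part = sum(b * f for b, f in zip(beta, features))
            random_part = sum(ui * f for ui, f in zip(u, features))
            rows.append((group, features, fixed_part + random_part + noise))
    return rows


rows = simulate_lme(n_groups=3, group_size=4, beta=[1.0, -2.0],
                    gamma=[0.5, 0.0], obs_var=0.1)
print(len(rows))
```

Setting an entry of gamma to zero removes the corresponding random effect, which is the situation the sparsity-promoting models in this module are meant to detect.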

Initializes the model

Parameters:
  • initializer (str) – “EM” or “None”

  • logger_keys (Optional[Set[str]]) – list of values for the logger to track.

check_is_fitted()

Checks if the model was fitted before. Throws an error otherwise.

Returns:

None

fit(x: ndarray, y: ndarray, columns_labels: ndarray | None = None, initial_parameters: dict | None = None, warm_start=False, fit_fixed_intercept=False, fit_random_intercept=False, fe_regularization_weights=None, re_regularization_weights=None, **kwargs)

Fits a Linear Model with Linear Mixed-Effects to the given data.

Parameters:
  • x (np.ndarray) – Data matrix. Rows correspond to objects, columns correspond to features, group labels, and variances.

  • y (np.ndarray) – Answers, real-valued array.

  • columns_labels (List[str]) –

    List of column labels. There shall be only one column of group labels and answers STDs.

    • “fixed” : fixed effect

    • “random” : random effect

    • “fixed+random” : both fixed and random,

    • “group” : groups labels

    • “variance” : answers standard deviations

    • “intercept” : intercept column (fixed or random intercept is controlled by “fit_fixed_intercept”
      and “fit_random_intercept”, respectively).
  • initial_parameters (Dict[np.ndarray]) –

    Dict with possible fields:

    • ‘beta’ : np.ndarray, shape = [p],
      Initial estimate of fixed effects. If None then it defaults to an all-ones vector.
    • ‘gamma’ : np.ndarray, shape = [q],
      Initial estimate of random effects covariances.
      If None then it defaults to an all-ones vector.
  • warm_start (bool, default is False) – Whether to use previous parameters as initial ones. Overrides initial_parameters if given. Throws NotFittedError if set to True when not fitted.

  • fit_fixed_intercept (bool, default = False) – Whether to add the intercept to the model

  • fit_random_intercept (bool, default = False) – Whether to treat the intercept as a random effect.

  • fe_regularization_weights (ndarray[int], 0 or 1) – Vector of length of the number of features where 0 means do not apply regularizer to the coefficients of the corresponding fixed features and 1 means apply as usual.

  • re_regularization_weights (ndarray[int], 0 or 1) – Vector of length of the number of features where 0 means do not apply regularizer to the coefficients of the corresponding random features and 1 means apply as usual.

  • kwargs – Not used currently, left here for passing debugging parameters.

Returns:

self (LinearLMESparseModel) – Fitted regression model.
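
The columns_labels convention described for fit can be checked before fitting. The sketch below is a hypothetical helper, not part of pysr3: it assumes the labels are given as a plain Python list, rejects unknown labels and repeated “group” or “variance” columns, and reports which column indices play which role.

```python
ALLOWED_LABELS = {"fixed", "random", "fixed+random", "group", "variance", "intercept"}


def split_columns(columns_labels):
    """Group column indices by role; hypothetical helper for illustration."""
    unknown = sorted(set(columns_labels) - ALLOWED_LABELS)
    if unknown:
        raise ValueError(f"unknown column labels: {unknown}")
    for label in ("group", "variance"):
        if columns_labels.count(label) > 1:
            raise ValueError(f"more than one '{label}' column")
    return {
        # "fixed+random" columns appear in both the fixed and the random list.
        "fixed": [i for i, l in enumerate(columns_labels) if l in ("fixed", "fixed+random")],
        "random": [i for i, l in enumerate(columns_labels) if l in ("random", "fixed+random")],
        "group": [i for i, l in enumerate(columns_labels) if l == "group"],
        "variance": [i for i, l in enumerate(columns_labels) if l == "variance"],
    }


# Column 0 holds group labels, column 1 has a fixed and a random effect,
# column 2 has a fixed effect only, column 3 holds the "variance" column.
columns_labels = ["group", "fixed+random", "fixed", "variance"]
roles = split_columns(columns_labels)
print(roles)
```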

fit_problem(problem: LMEProblem, initial_parameters: dict | None = None, warm_start=False, fe_regularization_weights=None, re_regularization_weights=None, **kwargs)

Fits the model to a provided problem

Parameters:
  • problem (LMEProblem) – an instance of LMEProblem that contains all data-dependent information

  • initial_parameters (dict) –

    Dict with possible fields:

    • ‘beta’ : np.ndarray, shape = [p],
      Initial estimate of fixed effects. If None then it defaults to an all-ones vector.
    • ‘gamma’ : np.ndarray, shape = [q],
      Initial estimate of random effects covariances. If None then it defaults to an all-ones vector.
  • warm_start (bool, default is False) – Whether to use previous parameters as initial ones. Overrides initial_parameters if given. Throws NotFittedError if set to True when not fitted.

  • fe_regularization_weights (ndarray[int], 0 or 1) – Vector of length of the number of features where 0 means do not apply regularizer to the coefficients of the corresponding fixed features and 1 means apply as usual.

  • re_regularization_weights (ndarray[int], 0 or 1) – Vector of length of the number of features where 0 means do not apply regularizer to the coefficients of the corresponding random features and 1 means apply as usual.

  • kwargs – Not used currently, left here for passing debugging parameters.

Returns:

self

instantiate() → Tuple[LinearLMEOracle | None, Regularizer | None, PGDSolver | None]

Instantiates the model: creates all internal entities such as oracle, regularizer, and solver.

predict(x, columns_labels: List[str] | None = None, fit_fixed_intercept=False, fit_random_intercept=False, **kwargs)

Makes a prediction if .fit(X, y) was called before and throws an error otherwise.

Parameters:
  • x (np.ndarray) – Data matrix. Should have the same format as the data which was used for fitting the model: the number of columns and the columns’ labels should be the same. It may contain new groups, in which case the prediction will be formed using the fixed effects only.

  • columns_labels (List[str]) –

    List of column labels. There shall be only one column of group labels and answers STDs.

    • “fixed” : fixed effect

    • “random” : random effect

    • “fixed+random” : both fixed and random,

    • “group” : groups labels

    • “variance” : answers standard deviations

    • “intercept” : intercept column (fixed or random intercept is controlled by “fit_fixed_intercept”
      and “fit_random_intercept”, respectively).
  • fit_fixed_intercept (bool, default = False) – Whether to add an intercept as a fixed feature

  • fit_random_intercept (bool, default = False) – Whether to add an intercept as a random feature.

Returns:

y (np.ndarray) – Model predictions.

predict_problem(problem, **kwargs)

Makes a prediction if .fit was called before and throws an error otherwise.

Parameters:
  • problem (LMEProblem) – An instance of LMEProblem. Should have the same format as the data which was used for fitting the model. It may contain new groups, in which case the prediction will be formed using the fixed effects only.

  • kwargs – for passing debugging parameters

Returns:

y (np.ndarray) – Model predictions.

score(x, y, columns_labels=None, fit_fixed_intercept=False, fit_random_intercept=False, sample_weight=None)

Returns the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get a R^2 score of 0.0.
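
The definition above can be written out directly. This is a plain-Python sketch of the unweighted formula (1 - u/v); the function name r_squared is illustrative, and the sample_weight option of score is not covered.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination 1 - u/v (unweighted)."""
    mean_true = sum(y_true) / len(y_true)
    # u: residual sum of squares
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # v: total sum of squares around the mean of y_true
    v = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - u / v


y = [1.0, 2.0, 3.0, 4.0]
perfect = r_squared(y, y)                     # exact predictions: 1.0
constant = r_squared(y, [2.5] * len(y))       # always predict the mean: 0.0
reversed_ = r_squared(y, [4.0, 3.0, 2.0, 1.0])  # worse than the mean: -3.0
print(perfect, constant, reversed_)
```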

Parameters:
  • x (np.ndarray) – Data matrix. Should have the same format as the data which was used for fitting the model: the number of columns and the columns’ labels should be the same. It may contain new groups, in which case the prediction will be formed using the fixed effects only.

  • y (np.ndarray) – Answers, real-valued array.

  • columns_labels (List[str]) –

    List of column labels. There shall be only one column of group labels and answers STDs.

    • “fixed” : fixed effect

    • “random” : random effect

    • “fixed+random” : both fixed and random,

    • “group” : groups labels

    • “variance” : answers standard deviations

    • “intercept” : intercept column (fixed or random intercept is controlled by “fit_fixed_intercept”
      and “fit_random_intercept”, respectively).
  • fit_fixed_intercept (bool, default = False) – Whether to add an intercept as a fixed feature

  • fit_random_intercept (bool, default = False) – Whether to add an intercept as a random feature.

  • sample_weight (array_like, Optional) – Weights of samples for calculating the R^2 statistics.

Returns:

r2_score (float) – R^2 score

class pysr3.lme.models.SCADLmeModel(tol_solver: float = 1e-05, initializer: str = 'None', max_iter_solver: int = 10000, stepping: str = 'line-search', rho: float = 3.7, sigma: float = 0.5, lam: float = 1.0, elastic_eps: float = 0.0001, logger_keys: Set = ('converged',), fixed_step_len=None, prior=None, **kwargs)

Bases: SimpleLMEModel

Implements SCAD-regularized Linear Mixed-Effect Model

Initializes the model

Parameters:
  • tol_solver (float) – tolerance for the stop criterion of PGD solver

  • initializer (str) – pre-initialization. Can be “None”, in which case the algorithm starts with “all-ones” starting parameters, or “EM”, in which case the algorithm does one step of EM algorithm

  • max_iter_solver (int) – maximal number of iterations for PGD solver

  • stepping (str) – step-size policy for PGD. Can be either “line-search” or “fixed”

  • lam (float) – strength of SCAD regularizer

  • rho (float, rho > 1) – first knot of the SCAD spline

  • sigma (float) – a positive constant such that sigma*rho is the second knot of the SCAD spline

  • logger_keys (List[str]) – list of keys for the parameters that the logger should track

  • fixed_step_len (float) – step-size for PGD algorithm. If “line-search” is used for stepping then the algorithm uses this value as the maximal step possible. If you know the Lipschitz-smoothness constant L of your problem, set fixed_step_len=1/L.

  • prior (Optional[Prior]) – an instance of Prior class. If None then a non-informative prior is used.

  • kwargs – for passing debugging info

instantiate()

Instantiates the model: creates all internal entities such as oracle, regularizer, and solver

Returns:

Tuple of [Oracle, Regularizer, Solver] that correspond to this model
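
The shape of a SCAD penalty, with its two knots, can be sketched with the textbook (Fan and Li) parameterization: linear up to the first knot lam, quadratic up to the second knot a*lam, and constant afterwards. The names lam and a follow that textbook form and are assumptions for illustration; how rho, sigma, and lam of this class map onto the two knots is not asserted here.

```python
def scad_penalty(x, lam=1.0, a=3.7):
    """Textbook SCAD penalty for one coefficient (requires a > 1)."""
    ax = abs(x)
    if ax <= lam:
        # Below the first knot the penalty grows like the LASSO penalty.
        return lam * ax
    if ax <= a * lam:
        # Between the knots a concave quadratic flattens the growth.
        return (2 * a * lam * ax - ax ** 2 - lam ** 2) / (2 * (a - 1))
    # Beyond the second knot the penalty is constant.
    return lam ** 2 * (a + 1) / 2


values = [scad_penalty(v) for v in [0.5, 1.0, 2.0, 3.7, 10.0]]
print(values)
```

Because the penalty is flat beyond the second knot, large coefficients are not shrunk further, unlike with LASSO.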

class pysr3.lme.models.SCADLmeModelSR3(tol_oracle: float = 1e-05, tol_solver: float = 1e-05, initializer: str = 'None', max_iter_oracle: int = 10000, max_iter_solver: int = 10000, stepping: str = 'fixed', ell: float = 40.0, rho: float = 3.7, sigma: float = 0.5, lam: float = 1.0, elastic_eps: float = 0.0001, central_path_neighbourhood_target: float = 0.5, logger_keys: Set = ('converged',), warm_start_oracle=True, practical=False, take_only_positive_part=True, take_expected_value=False, update_prox_every=1, fixed_step_len=None, prior=None, **kwargs)

Bases: SimpleLMEModelSR3

Implements SCAD-regularized SR3-relaxed Linear Mixed-Effects Model

Initializes the model

Parameters:
  • tol_oracle (float) – tolerance for SR3 oracle’s internal numerical subroutines

  • tol_solver (float) – tolerance for the stop criterion of PGD solver

  • initializer (str) – pre-initialization. Can be “None”, in which case the algorithm starts with “all-ones” starting parameters, or “EM”, in which case the algorithm does one step of EM algorithm

  • max_iter_solver (int) – maximal number of iterations for PGD solver

  • stepping (str) – step-size policy for PGD. Can be either “line-search” or “fixed”

  • ell (float) – level of SR3-relaxation

  • lam (float) – strength of SCAD regularizer

  • rho (float, rho > 1) – first knot of the SCAD spline

  • sigma (float, sigma > 1) – a positive constant such that sigma*rho is the second knot of the SCAD spline

  • participation_in_selection (ndarray of int) – 1 if the feature should be affected by the regularizer, 0 otherwise

  • logger_keys (List[str]) – list of keys for the parameters that the logger should track

  • warm_start_oracle (bool) – if fitting should be started from the current model’s coefficients. Used for fine-tuning and iterative fitting.

  • practical (bool) – whether to use SR3-Practical method, which works faster at the expense of accuracy

  • update_prox_every (int) – how often to update the relaxed variables. Used only if practical=True

  • fixed_step_len (float) – step-size for PGD algorithm. If “line-search” is used for stepping then the algorithm uses this value as the maximal step possible. If you know the Lipschitz-smoothness constant L of your problem, set fixed_step_len=1/L.

  • prior (Optional[Prior]) – an instance of Prior class. If None then a non-informative prior is used.

  • kwargs – for passing debugging info

instantiate()

Instantiates the model: creates all internal entities such as oracle, regularizer, and solver

Returns:

Tuple of [Oracle, Regularizer, Solver] that correspond to this model

class pysr3.lme.models.SimpleLMEModel(tol_solver: float = 1e-05, initializer: str = 'None', max_iter_solver: int = 1000, stepping: str = 'line-search', elastic_eps: float = 0.0001, logger_keys: Set = ('converged',), fixed_step_len=None, prior=None, **kwargs)

Bases: LMEModel

Implements a standard Linear Mixed-Effects Model.

Initializes the model

Parameters:
  • tol_solver (float) – tolerance for the stop criterion of PGD solver

  • initializer (str) – pre-initialization. Can be “None”, in which case the algorithm starts with “all-ones” starting parameters, or “EM”, in which case the algorithm does one step of EM algorithm

  • max_iter_solver (int) – maximal number of iterations for PGD solver

  • stepping (str) – step-size policy for PGD. Can be either “line-search” or “fixed”

  • logger_keys (List[str]) – list of keys for the parameters that the logger should track

  • fixed_step_len (float) – step-size for PGD algorithm. If “line-search” is used for stepping then the algorithm uses this value as the maximal step possible. If you know the Lipschitz-smoothness constant L of your problem, set fixed_step_len=1/L.

  • prior (Optional[Prior]) – an instance of Prior class. If None then a non-informative prior is used.

  • kwargs – for passing debugging info
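
Why 1/L is a natural choice for fixed_step_len can be seen on a toy quadratic whose Lipschitz-smoothness constant is known exactly. The loss, the function name, and the numbers below are illustrative assumptions; the LME likelihood itself is not modelled here.

```python
def gradient_descent(curvatures, x0, step, n_iter):
    """Minimize f(x) = 0.5 * sum(d_i * x_i**2) with a fixed step size."""
    x = list(x0)
    losses = []
    for _ in range(n_iter):
        losses.append(0.5 * sum(d * v * v for d, v in zip(curvatures, x)))
        # The gradient of f has components d_i * x_i.
        x = [v - step * d * v for d, v in zip(curvatures, x)]
    return x, losses


curvatures = [1.0, 4.0, 10.0]
lipschitz = max(curvatures)  # for this f, L is the largest curvature
x_final, losses = gradient_descent(curvatures, [1.0, 1.0, 1.0],
                                   step=1.0 / lipschitz, n_iter=50)
print(losses[0], losses[-1])
```

With step=1/L the loss decreases at every iteration; a step larger than 2/L would make the steepest direction diverge.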

get_information_criterion(x, y, columns_labels=None, ic='muller_ic')

Parameters:
  • x (np.ndarray) – Data matrix. Rows correspond to objects, columns correspond to features, group labels, and variances.

  • y (np.ndarray) – Answers, real-valued array.

  • columns_labels (List[str]) –

    List of column labels. There shall be only one column of group labels and answers STDs.

    • “fixed” : fixed effect

    • “random” : random effect

    • “fixed+random” : both fixed and random,

    • “group” : groups labels

    • “variance” : answers standard deviations

    • “intercept” : intercept column (fixed or random intercept is controlled by “fit_fixed_intercept”
      and “fit_random_intercept”, respectively).
  • ic (str) –

    Information criterion. Can be one of the following:

    • “muller_ic”: IC from (Hui, Muller, 2016)

    • “vaida_aic”: AIC from (Vaida, 2005)

    • “jones_bic”: BIC from (Jones, 2010)

Returns:

value of the requested IC

instantiate()

Instantiates the model: creates all internal entities such as oracle, regularizer, and solver

Returns:

Tuple of [Oracle, Regularizer, Solver] that correspond to this model

class pysr3.lme.models.SimpleLMEModelSR3(tol_oracle: float = 1e-05, tol_solver: float = 1e-05, initializer: str = 'None', max_iter_oracle: int = 10000, max_iter_solver: int = 10000, stepping: str = 'fixed', ell: float = 40, elastic_eps: float = 0.0001, central_path_neighbourhood_target=0.5, logger_keys: Set = ('converged',), warm_start_oracle=True, practical=False, update_prox_every=1, fixed_step_len=None, take_only_positive_part=True, take_expected_value=False, prior=None, **kwargs)

Bases: LMEModel

Implements Regularized Linear Mixed-Effects Model functional for given problem:

Y_i = X_i*β + Z_i*u_i + 𝜺_i,

where

β ~ 𝒩(tβ, 1/lb),

||tβ||_0 = nnz(tβ) <= nnz_tbeta,

u_i ~ 𝒩(0, diag(𝛄)),

𝛄 ~ 𝒩(t𝛄, 1/lg),

||t𝛄||_0 = nnz(t𝛄) <= nnz_tgamma,

𝜺_i ~ 𝒩(0, Λ)

Here tβ and t𝛄 are single variables, not multiplications (e.g. not t*β). This oracle is designed for a solver (LinearLMESparseModel) which searches for a sparse solution (tβ, t𝛄) with at most k and j <= k non-zero elements respectively. For more details, see the documentation for LinearLMESparseModel.

The problem should be provided as LMEProblem.

Initializes the model

Parameters:
  • tol_oracle (float) – tolerance for SR3 oracle’s internal numerical subroutines

  • tol_solver (float) – tolerance for the stop criterion of PGD solver

  • initializer (str) – pre-initialization. Can be “None”, in which case the algorithm starts with “all-ones” starting parameters, or “EM”, in which case the algorithm does one step of EM algorithm

  • max_iter_solver (int) – maximal number of iterations for PGD solver

  • stepping (str) – step-size policy for PGD. Can be either “line-search” or “fixed”

  • ell (float) – level of SR3-relaxation

  • elastic_eps (float) – regularizer coefficient for ||beta|| and ||gamma||. Safeguards against infinite solutions.

  • central_path_neighbourhood_target (float) – how close to the central path MSR3-fast algorithm needs to finish Newton iterations and make a projection

  • logger_keys (List[str]) – list of keys for the parameters that the logger should track

  • warm_start_oracle (bool) – if fitting should be started from the current model’s coefficients. Used for fine-tuning and iterative fitting.

  • practical (bool) – whether to use SR3-Practical method, which works faster at the expense of accuracy

  • update_prox_every (int) – how often to update the relaxed variables. Used only if practical=True

  • fixed_step_len (float) – step-size for PGD algorithm. If “line-search” is used for stepping then the algorithm uses this value as the maximal step possible. If you know the Lipschitz-smoothness constant L of your problem, set fixed_step_len=1/L.

  • prior (Optional[Prior]) – an instance of Prior class. If None then a non-informative prior is used.

  • kwargs – for passing debugging info

get_information_criterion(x, y, columns_labels=None, ic='muller_ic')

Parameters:
  • x (np.ndarray) – Data matrix. Rows correspond to objects, columns correspond to features, group labels, and variances.

  • y (np.ndarray) – Answers, real-valued array.

  • columns_labels (List[str]) –

    List of column labels. There shall be only one column of group labels and answers STDs.

    • “fixed” : fixed effect

    • “random” : random effect

    • “fixed+random” : both fixed and random,

    • “group” : groups labels

    • “variance” : answers standard deviations

    • “intercept” : intercept column (fixed or random intercept is controlled by “fit_fixed_intercept”
      and “fit_random_intercept”, respectively).
  • ic (str) –

    Information criterion. Can be one of the following:

    • “muller_ic”: IC from (Hui, Muller, 2016)

    • “vaida_aic”: AIC from (Vaida, 2005)

    • “jones_bic”: BIC from (Jones, 2010)

Returns:

value of the requested IC

instantiate()

Instantiates the model: creates all internal entities such as oracle, regularizer, and solver

Returns:

Tuple of [Oracle, Regularizer, Solver] that correspond to this model