Linear model


Module Sklearn.Linear_model.ARDRegression wraps Python class sklearn.linear_model.ARDRegression.

type t


constructor and attributes create
val create :
  ?n_iter:int ->
  ?tol:float ->
  ?alpha_1:float ->
  ?alpha_2:float ->
  ?lambda_1:float ->
  ?lambda_2:float ->
  ?compute_score:bool ->
  ?threshold_lambda:float ->
  ?fit_intercept:bool ->
  ?normalize:bool ->
  ?copy_X:bool ->
  ?verbose:int ->
  unit ->
  t

Bayesian ARD regression.

Fit the weights of a regression model using an ARD prior. The weights of the regression model are assumed to follow Gaussian distributions. The parameters lambda (precisions of the distributions of the weights) and alpha (precision of the distribution of the noise) are also estimated. The estimation is done by an iterative procedure (evidence maximization).

Read more in the :ref:User Guide <bayesian_regression>.


  • n_iter : int, default=300 Maximum number of iterations.

  • tol : float, default=1e-3 Stop the algorithm if w has converged.

  • alpha_1 : float, default=1e-6 Hyper-parameter: shape parameter for the Gamma distribution prior over the alpha parameter.

  • alpha_2 : float, default=1e-6 Hyper-parameter: inverse scale parameter (rate parameter) for the Gamma distribution prior over the alpha parameter.

  • lambda_1 : float, default=1e-6 Hyper-parameter: shape parameter for the Gamma distribution prior over the lambda parameter.

  • lambda_2 : float, default=1e-6 Hyper-parameter: inverse scale parameter (rate parameter) for the Gamma distribution prior over the lambda parameter.

  • compute_score : bool, default=False If True, compute the objective function at each step of the model.

  • threshold_lambda : float, default=10000 Threshold for removing (pruning) weights with high precision from the computation.

  • fit_intercept : bool, default=True Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations (i.e. data is expected to be centered).

  • normalize : bool, default=False This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use :class:sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.

  • copy_X : bool, default=True If True, X will be copied; else, it may be overwritten.

  • verbose : bool, default=False Verbose mode when fitting the model.


  • coef_ : array-like of shape (n_features,) Coefficients of the regression model (mean of distribution)

  • alpha_ : float estimated precision of the noise.

  • lambda_ : array-like of shape (n_features,) estimated precisions of the weights.

  • sigma_ : array-like of shape (n_features, n_features) estimated variance-covariance matrix of the weights

  • scores_ : float if computed, value of the objective function (to be maximized)

  • intercept_ : float Independent term in decision function. Set to 0.0 if fit_intercept = False.


>>> from sklearn import linear_model
>>> clf = linear_model.ARDRegression()
>>> clf.fit([[0, 0], [1, 1], [2, 2]], [0, 1, 2])
>>> clf.predict([[1, 1]])


For an example, see :ref:examples/linear_model/ <>.


D. J. C. MacKay, Bayesian nonlinear modeling for the prediction competition, ASHRAE Transactions, 1994.

R. Salakhutdinov, Lecture notes on Statistical Machine Learning.

  • Their beta is our self.alpha_. Their alpha is our self.lambda_. ARD is a little different from the slide: only dimensions/features for which self.lambda_ < self.threshold_lambda are kept and the rest are discarded.
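
The pruning rule in the note above can be illustrated with a minimal sketch in plain Python. The helper name keep_mask and the sample precisions are hypothetical; only the comparison lambda_ < threshold_lambda comes from the text.

```python
def keep_mask(lambdas, threshold_lambda=10_000.0):
    """Return True for each feature whose weight precision is below the threshold."""
    return [lam < threshold_lambda for lam in lambdas]

# Hypothetical per-feature precisions. The second one is very large, meaning the
# corresponding weight is pinned close to zero, so that feature is discarded.
lambdas = [0.5, 2.5e6, 31.0]
mask = keep_mask(lambdas)
kept = [j for j, keep in enumerate(mask) if keep]
print(mask)  # [True, False, True]
print(kept)  # [0, 2]
```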


method fit
val fit :
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  t

Fit the ARDRegression model according to the given training data and parameters.

Iterative procedure to maximize the evidence


  • X : array-like of shape (n_samples, n_features) Training vector, where n_samples is the number of samples and n_features is the number of features.

  • y : array-like of shape (n_samples,) Target values (integers). Will be cast to X's dtype if necessary.


  • self : returns an instance of self.


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  ?return_std:bool ->
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using the linear model.

In addition to the mean of the predictive distribution, its standard deviation can also be returned.


  • X : {array-like, sparse matrix} of shape (n_samples, n_features) Samples.

  • return_std : bool, default=False Whether to return the standard deviation of posterior prediction.


  • y_mean : array-like of shape (n_samples,) Mean of predictive distribution of query points.

  • y_std : array-like of shape (n_samples,) Standard deviation of predictive distribution of query points.
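
As a rough illustration of how y_mean and y_std relate to the fitted attributes, the sketch below applies the standard Bayesian linear regression predictive formulas: mean = x . coef_ + intercept_ and variance = x^T sigma_ x + 1/alpha_. It is plain Python, the function name predict_with_std and all numbers are hypothetical, and it assumes no features were pruned. It is not the wrapped library's code.

```python
import math

def predict_with_std(X, coef, intercept, sigma, alpha):
    """Predictive mean and standard deviation for each row of X."""
    means, stds = [], []
    for x in X:
        mean = sum(xi * wi for xi, wi in zip(x, coef)) + intercept
        # x^T sigma x: variance contributed by uncertainty in the weights
        sigma_x = [sum(s_ij * xj for s_ij, xj in zip(row, x)) for row in sigma]
        weight_var = sum(xi * si for xi, si in zip(x, sigma_x))
        # 1 / alpha: noise variance, since alpha is the noise precision
        means.append(mean)
        stds.append(math.sqrt(weight_var + 1.0 / alpha))
    return means, stds

# Hypothetical fitted values for a two-feature model.
y_mean, y_std = predict_with_std(
    X=[[1.0, 1.0]],
    coef=[0.5, 0.5],
    intercept=0.0,
    sigma=[[0.01, 0.0], [0.0, 0.01]],
    alpha=100.0,
)
```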


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  float

Return the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0, and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.
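
The definition above can be checked by hand with a small plain-Python sketch. The function name r_squared is hypothetical, and the sketch assumes y_true is not constant (otherwise v is zero).

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - u/v."""
    mean_true = sum(y_true) / len(y_true)
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    v = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return 1.0 - u / v

y_true = [0.0, 1.0, 2.0]
perfect = r_squared(y_true, [0.0, 1.0, 2.0])   # 1.0
constant = r_squared(y_true, [1.0, 1.0, 1.0])  # 0.0, always predicts the mean of y
worse = r_squared(y_true, [2.0, 1.0, 0.0])     # -3.0, worse than the constant model
```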


  • X : array-like of shape (n_samples, n_features) Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True values for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float R^2 of self.predict(X) wrt. y.


The R^2 score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 onward, to stay consistent with the default value of :func:~sklearn.metrics.r2_score. This influences the score method of all the multioutput regressors (except for :class:~sklearn.multioutput.MultiOutputRegressor).


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->
  t

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
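
To make the <component>__<parameter> naming concrete, here is a minimal plain-Python sketch of how such a name can be split at the first double underscore. The helper split_param and the component names are hypothetical and only illustrate the naming scheme, not the estimator's actual implementation.

```python
def split_param(name):
    """Split '<component>__<parameter>' at the first double underscore."""
    component, sep, parameter = name.partition("__")
    if not sep:
        return None, name  # a plain parameter of the estimator itself
    return component, parameter

print(split_param("alpha"))                   # (None, 'alpha')
print(split_param("regressor__alpha"))        # ('regressor', 'alpha')
print(split_param("pipe__regressor__alpha"))  # ('pipe', 'regressor__alpha')
```

In the last call the remainder still contains a double underscore, so it would be handed on to the nested component to resolve.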


  • **params : dict Estimator parameters.


  • self : object Estimator instance.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute alpha_
val alpha_ : t -> float
val alpha_opt : t -> (float) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute lambda_
val lambda_ : t -> [>`ArrayLike] Np.Obj.t
val lambda_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute sigma_
val sigma_ : t -> [>`ArrayLike] Np.Obj.t
val sigma_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute scores_
val scores_ : t -> float
val scores_opt : t -> (float) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.Linear_model.BayesianRidge wraps Python class sklearn.linear_model.BayesianRidge.

type t


constructor and attributes create
val create :
  ?n_iter:int ->
  ?tol:float ->
  ?alpha_1:float ->
  ?alpha_2:float ->
  ?lambda_1:float ->
  ?lambda_2:float ->
  ?alpha_init:float ->
  ?lambda_init:float ->
  ?compute_score:bool ->
  ?fit_intercept:bool ->
  ?normalize:bool ->
  ?copy_X:bool ->
  ?verbose:int ->
  unit ->
  t

Bayesian ridge regression.

Fit a Bayesian ridge model. See the Notes section for details on this implementation and the optimization of the regularization parameters lambda (precision of the weights) and alpha (precision of the noise).

Read more in the :ref:User Guide <bayesian_regression>.


  • n_iter : int, default=300 Maximum number of iterations. Should be greater than or equal to 1.

  • tol : float, default=1e-3 Stop the algorithm if w has converged.

  • alpha_1 : float, default=1e-6 Hyper-parameter: shape parameter for the Gamma distribution prior over the alpha parameter.

  • alpha_2 : float, default=1e-6 Hyper-parameter: inverse scale parameter (rate parameter) for the Gamma distribution prior over the alpha parameter.

  • lambda_1 : float, default=1e-6 Hyper-parameter: shape parameter for the Gamma distribution prior over the lambda parameter.

  • lambda_2 : float, default=1e-6 Hyper-parameter: inverse scale parameter (rate parameter) for the Gamma distribution prior over the lambda parameter.

  • alpha_init : float, default=None Initial value for alpha (precision of the noise). If not set, alpha_init is 1/Var(y). Added in version 0.22.

  • lambda_init : float, default=None Initial value for lambda (precision of the weights). If not set, lambda_init is 1. Added in version 0.22.

  • compute_score : bool, default=False If True, compute the log marginal likelihood at each iteration of the optimization.

  • fit_intercept : bool, default=True Whether to calculate the intercept for this model. The intercept is not treated as a probabilistic parameter and thus has no associated variance. If set to False, no intercept will be used in calculations (i.e. data is expected to be centered).

  • normalize : bool, default=False This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use :class:sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.

  • copy_X : bool, default=True If True, X will be copied; else, it may be overwritten.

  • verbose : bool, default=False Verbose mode when fitting the model.
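
The defaults given above for alpha_init and lambda_init amount to the following plain-Python sketch. The function name default_inits is hypothetical, and treating Var(y) as the population variance is an assumption; the text only says 1/Var(y).

```python
from statistics import pvariance

def default_inits(y):
    """Initial (alpha, lambda) when alpha_init and lambda_init are not set."""
    alpha_init = 1.0 / pvariance(y)  # assumption: Var(y) is the population variance
    lambda_init = 1.0
    return alpha_init, lambda_init

alpha0, lambda0 = default_inits([0.0, 1.0, 2.0])
# Var(y) = 2/3 here, so alpha0 is about 1.5
```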


  • coef_ : array-like of shape (n_features,) Coefficients of the regression model (mean of distribution)

  • intercept_ : float Independent term in decision function. Set to 0.0 if fit_intercept = False.

  • alpha_ : float Estimated precision of the noise.

  • lambda_ : float Estimated precision of the weights.

  • sigma_ : array-like of shape (n_features, n_features) Estimated variance-covariance matrix of the weights

  • scores_ : array-like of shape (n_iter_+1,) If compute_score is True, value of the log marginal likelihood (to be maximized) at each iteration of the optimization. The array starts with the value of the log marginal likelihood obtained for the initial values of alpha and lambda and ends with the value obtained for the estimated alpha and lambda.

  • n_iter_ : int The actual number of iterations to reach the stopping criterion.


>>> from sklearn import linear_model
>>> clf = linear_model.BayesianRidge()
>>> clf.fit([[0, 0], [1, 1], [2, 2]], [0, 1, 2])
>>> clf.predict([[1, 1]])


There exist several strategies to perform Bayesian ridge regression. This implementation is based on the algorithm described in Appendix A of (Tipping, 2001) where updates of the regularization parameters are done as suggested in (MacKay, 1992). Note that according to A New View of Automatic Relevance Determination (Wipf and Nagarajan, 2008) these update rules do not guarantee that the marginal likelihood is increasing between two consecutive iterations of the optimization.


D. J. C. MacKay, Bayesian Interpolation, Computation and Neural Systems, Vol. 4, No. 3, 1992.

M. E. Tipping, Sparse Bayesian Learning and the Relevance Vector Machine, Journal of Machine Learning Research, Vol. 1, 2001.


method fit
val fit :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  t

Fit the model


  • X : ndarray of shape (n_samples, n_features) Training data.

  • y : ndarray of shape (n_samples,) Target values. Will be cast to X's dtype if necessary.

  • sample_weight : ndarray of shape (n_samples,), default=None Individual weights for each sample. Added in version 0.20: sample_weight support for BayesianRidge.


  • self : returns an instance of self.


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  ?return_std:bool ->
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using the linear model.

In addition to the mean of the predictive distribution, its standard deviation can also be returned.


  • X : {array-like, sparse matrix} of shape (n_samples, n_features) Samples.

  • return_std : bool, default=False Whether to return the standard deviation of posterior prediction.


  • y_mean : array-like of shape (n_samples,) Mean of predictive distribution of query points.

  • y_std : array-like of shape (n_samples,) Standard deviation of predictive distribution of query points.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  float

Return the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0, and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.


  • X : array-like of shape (n_samples, n_features) Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True values for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float R^2 of self.predict(X) wrt. y.


The R^2 score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 onward, to stay consistent with the default value of :func:~sklearn.metrics.r2_score. This influences the score method of all the multioutput regressors (except for :class:~sklearn.multioutput.MultiOutputRegressor).


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->
  t

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.


  • **params : dict Estimator parameters.


  • self : object Estimator instance.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute alpha_
val alpha_ : t -> float
val alpha_opt : t -> (float) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute lambda_
val lambda_ : t -> float
val lambda_opt : t -> (float) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute sigma_
val sigma_ : t -> [>`ArrayLike] Np.Obj.t
val sigma_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute scores_
val scores_ : t -> [>`ArrayLike] Np.Obj.t
val scores_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> int
val n_iter_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.Linear_model.ElasticNet wraps Python class sklearn.linear_model.ElasticNet.

type t


constructor and attributes create
val create :
  ?alpha:float ->
  ?l1_ratio:float ->
  ?fit_intercept:bool ->
  ?normalize:bool ->
  ?precompute:[`Arr of [>`ArrayLike] Np.Obj.t | `Bool of bool] ->
  ?max_iter:int ->
  ?copy_X:bool ->
  ?tol:float ->
  ?warm_start:bool ->
  ?positive:bool ->
  ?random_state:int ->
  ?selection:[`Cyclic | `Random] ->
  unit ->
  t

Linear regression with combined L1 and L2 priors as regularizer.

Minimizes the objective function:

    1 / (2 * n_samples) * ||y - Xw||^2_2
    + alpha * l1_ratio * ||w||_1
    + 0.5 * alpha * (1 - l1_ratio) * ||w||^2_2

If you are interested in controlling the L1 and L2 penalty separately, keep in mind that this is equivalent to:

    a * L1 + b * L2

where:

    alpha = a + b and l1_ratio = a / (a + b)

The parameter l1_ratio corresponds to alpha in the glmnet R package while alpha corresponds to the lambda parameter in glmnet. Specifically, l1_ratio = 1 is the lasso penalty. Currently, l1_ratio <= 0.01 is not reliable, unless you supply your own sequence of alpha.
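
The objective and the (a, b) reparametrization above can be written out as a small plain-Python sketch. The function names are hypothetical. The sketch assumes no intercept and reads L2 as 0.5 * ||w||^2_2, so that a = alpha * l1_ratio and b = alpha * (1 - l1_ratio) match the objective term by term.

```python
def elastic_net_objective(X, y, w, alpha, l1_ratio):
    """Value of the elastic net objective for weights w (no intercept)."""
    n = len(y)
    residuals = [yi - sum(xij * wj for xij, wj in zip(xi, w))
                 for xi, yi in zip(X, y)]
    data_fit = sum(r * r for r in residuals) / (2 * n)
    l1 = sum(abs(wj) for wj in w)   # ||w||_1
    l2 = sum(wj * wj for wj in w)   # ||w||^2_2
    return data_fit + alpha * l1_ratio * l1 + 0.5 * alpha * (1 - l1_ratio) * l2

def from_separate_penalties(a, b):
    """Map the weights of a * L1 + b * L2 to (alpha, l1_ratio)."""
    alpha = a + b
    return alpha, a / alpha

value = elastic_net_objective(
    X=[[1.0, 0.0], [0.0, 1.0]], y=[1.0, 2.0], w=[1.0, 1.0],
    alpha=1.0, l1_ratio=0.5,
)  # 0.25 + 1.0 + 0.5 = 1.75
alpha, l1_ratio = from_separate_penalties(0.75, 0.25)  # (1.0, 0.75)
```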

Read more in the :ref:User Guide <elastic_net>.


  • alpha : float, default=1.0 Constant that multiplies the penalty terms. See the notes for the exact mathematical meaning of this parameter. alpha = 0 is equivalent to ordinary least squares, solved by the :class:LinearRegression object. For numerical reasons, using alpha = 0 with the Lasso object is not advised. Given this, you should use the :class:LinearRegression object.

  • l1_ratio : float, default=0.5 The ElasticNet mixing parameter, with 0 <= l1_ratio <= 1. For l1_ratio = 0 the penalty is an L2 penalty. For l1_ratio = 1 it is an L1 penalty. For 0 < l1_ratio < 1, the penalty is a combination of L1 and L2.

  • fit_intercept : bool, default=True Whether the intercept should be estimated or not. If False, the data is assumed to be already centered.

  • normalize : bool, default=False This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use :class:sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.

  • precompute : bool or array-like of shape (n_features, n_features), default=False Whether to use a precomputed Gram matrix to speed up calculations. The Gram matrix can also be passed as argument. For sparse input this option is always True to preserve sparsity.

  • max_iter : int, default=1000 The maximum number of iterations

  • copy_X : bool, default=True If True, X will be copied; else, it may be overwritten.

  • tol : float, default=1e-4 The tolerance for the optimization: if the updates are smaller than tol, the optimization code checks the dual gap for optimality and continues until it is smaller than tol.

  • warm_start : bool, default=False When set to True, reuse the solution of the previous call to fit as initialization; otherwise, just erase the previous solution. See :term:the Glossary <warm_start>.

  • positive : bool, default=False When set to True, forces the coefficients to be positive.

  • random_state : int, RandomState instance, default=None The seed of the pseudo random number generator that selects a random feature to update. Used when selection == 'random'. Pass an int for reproducible output across multiple function calls. See :term:Glossary <random_state>.

  • selection : {'cyclic', 'random'}, default='cyclic' If set to 'random', a random coefficient is updated every iteration rather than looping over features sequentially by default. This (setting to 'random') often leads to significantly faster convergence especially when tol is higher than 1e-4.


  • coef_ : ndarray of shape (n_features,) or (n_targets, n_features) parameter vector (w in the cost function formula)

  • sparse_coef_ : sparse matrix of shape (n_features, 1) or (n_targets, n_features) sparse_coef_ is a readonly property derived from coef_

  • intercept_ : float or ndarray of shape (n_targets,) independent term in decision function.

  • n_iter_ : list of int number of iterations run by the coordinate descent solver to reach the specified tolerance.


>>> from sklearn.linear_model import ElasticNet
>>> from sklearn.datasets import make_regression
>>> X, y = make_regression(n_features=2, random_state=0)
>>> regr = ElasticNet(random_state=0)
>>> regr.fit(X, y)
>>> print(regr.coef_)
[18.83816048 64.55968825]
>>> print(regr.intercept_)
>>> print(regr.predict([[0, 0]]))


To avoid unnecessary memory duplication the X argument of the fit method should be directly passed as a Fortran-contiguous numpy array.

See also

  • ElasticNetCV : Elastic net model with best model selection by cross-validation.

  • SGDRegressor: implements elastic net regression with incremental training.

  • SGDClassifier: implements logistic regression with elastic net penalty (SGDClassifier(loss='log', penalty='elasticnet')).


method fit
val fit :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  ?check_input:bool ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  t

Fit model with coordinate descent.


  • X : {ndarray, sparse matrix} of shape (n_samples, n_features) Data.

  • y : {ndarray, sparse matrix} of shape (n_samples,) or (n_samples, n_targets) Target. Will be cast to X's dtype if necessary.

  • sample_weight : float or array-like of shape (n_samples,), default=None Sample weight.

  • check_input : bool, default=True Allows bypassing several input checks. Don't use this parameter unless you know what you are doing.


Coordinate descent is an algorithm that considers one column of the data at a time, hence it will automatically convert the X input to a Fortran-contiguous numpy array if necessary.

To avoid memory re-allocation it is advised to allocate the initial data in memory directly using that format.
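
The column-at-a-time behaviour described above can be sketched in plain Python as a textbook cyclic coordinate descent for the elastic net objective. This is an illustrative sketch, not the wrapped solver: the function names are hypothetical, there is no intercept, and it runs a fixed number of sweeps instead of checking tol and the dual gap.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def coordinate_descent(X, y, alpha=1.0, l1_ratio=0.5, n_sweeps=100):
    n, p = len(X), len(X[0])
    # Column-major copy: every update reads one whole column of X, which is
    # why a Fortran-contiguous layout suits this algorithm.
    columns = [[X[i][j] for i in range(n)] for j in range(p)]
    w = [0.0] * p
    residual = list(y)  # y - Xw for w = 0
    for _ in range(n_sweeps):
        for j, col in enumerate(columns):  # 'cyclic' selection
            z = sum(c * c for c in col) / n
            if z == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the residual that excludes w[j]
            rho = sum(c * (r + c * w[j]) for c, r in zip(col, residual)) / n
            new_wj = soft_threshold(rho, alpha * l1_ratio) / (z + alpha * (1 - l1_ratio))
            delta = new_wj - w[j]
            if delta != 0.0:
                residual = [r - c * delta for c, r in zip(col, residual)]
                w[j] = new_wj
    return w

# With alpha = 0 the penalty vanishes and y = 2 * x is recovered.
w_unpenalized = coordinate_descent([[1.0], [2.0], [3.0]], [2.0, 4.0, 6.0], alpha=0.0)
# A large pure-L1 penalty shrinks the single weight to zero.
w_sparse = coordinate_descent([[1.0], [2.0], [3.0]], [2.0, 4.0, 6.0], alpha=100.0, l1_ratio=1.0)
```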


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using the linear model.


  • X : array-like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape (n_samples,) Returns predicted values.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  float

Return the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0, and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.


  • X : array-like of shape (n_samples, n_features) Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True values for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float R^2 of self.predict(X) wrt. y.


The R^2 score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 onward, to stay consistent with the default value of :func:~sklearn.metrics.r2_score. This influences the score method of all the multioutput regressors (except for :class:~sklearn.multioutput.MultiOutputRegressor).


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->
  t

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.


  • **params : dict Estimator parameters.


  • self : object Estimator instance.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute sparse_coef_
val sparse_coef_ : t -> [`ArrayLike|`Object|`Spmatrix] Np.Obj.t
val sparse_coef_opt : t -> ([`ArrayLike|`Object|`Spmatrix] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> Py.Object.t
val n_iter_opt : t -> (Py.Object.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.Linear_model.ElasticNetCV wraps Python class sklearn.linear_model.ElasticNetCV.

type t


constructor and attributes create
val create :
  ?l1_ratio:[`F of float | `Fs of float list] ->
  ?eps:float ->
  ?n_alphas:int ->
  ?alphas:[>`ArrayLike] Np.Obj.t ->
  ?fit_intercept:bool ->
  ?normalize:bool ->
  ?precompute:[`Arr of [>`ArrayLike] Np.Obj.t | `Auto | `Bool of bool] ->
  ?max_iter:int ->
  ?tol:float ->
  ?cv:[`BaseCrossValidator of [>`BaseCrossValidator] Np.Obj.t | `I of int | `Arr of [>`ArrayLike] Np.Obj.t] ->
  ?copy_X:bool ->
  ?verbose:int ->
  ?n_jobs:int ->
  ?positive:bool ->
  ?random_state:int ->
  ?selection:[`Cyclic | `Random] ->
  unit ->
  t

Elastic Net model with iterative fitting along a regularization path.

See the glossary entry for :term:cross-validation estimator.

Read more in the :ref:User Guide <elastic_net>.


  • l1_ratio : float or list of float, default=0.5 Float between 0 and 1 passed to ElasticNet (scaling between l1 and l2 penalties). For l1_ratio = 0 the penalty is an L2 penalty. For l1_ratio = 1 it is an L1 penalty. For 0 < l1_ratio < 1, the penalty is a combination of L1 and L2. This parameter can be a list, in which case the different values are tested by cross-validation and the one giving the best prediction score is used. Note that a good choice of list of values for l1_ratio is often to put more values close to 1 (i.e. Lasso) and fewer close to 0 (i.e. Ridge), as in [.1, .5, .7, .9, .95, .99, 1].

  • eps : float, default=1e-3 Length of the path. eps=1e-3 means that alpha_min / alpha_max = 1e-3.

  • n_alphas : int, default=100 Number of alphas along the regularization path, used for each l1_ratio.

  • alphas : ndarray, default=None List of alphas where to compute the models. If None, alphas are set automatically.

  • fit_intercept : bool, default=True Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations (i.e. data is expected to be centered).

  • normalize : bool, default=False This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use :class:sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.

  • precompute : 'auto', bool or array-like of shape (n_features, n_features), default='auto' Whether to use a precomputed Gram matrix to speed up calculations. If set to 'auto' let us decide. The Gram matrix can also be passed as argument.

  • max_iter : int, default=1000 The maximum number of iterations

  • tol : float, default=1e-4 The tolerance for the optimization: if the updates are smaller than tol, the optimization code checks the dual gap for optimality and continues until it is smaller than tol.

  • cv : int, cross-validation generator or iterable, default=None Determines the cross-validation splitting strategy. Possible inputs for cv are:

    • None, to use the default 5-fold cross-validation,
    • int, to specify the number of folds.
    • :term:CV splitter,
    • An iterable yielding (train, test) splits as arrays of indices.

    For int/None inputs, :class:KFold is used.

    Refer to the :ref:User Guide <cross_validation> for the various cross-validation strategies that can be used here.

    Changed in version 0.22: the default value of cv when None changed from 3-fold to 5-fold.

  • copy_X : bool, default=True If True, X will be copied; else, it may be overwritten.

  • verbose : bool or int, default=0 Amount of verbosity.

  • n_jobs : int, default=None Number of CPUs to use during the cross validation. None means 1 unless in a :obj:joblib.parallel_backend context. -1 means using all processors. See :term:Glossary <n_jobs> for more details.

  • positive : bool, default=False When set to True, forces the coefficients to be positive.

  • random_state : int, RandomState instance, default=None The seed of the pseudo random number generator that selects a random feature to update. Used when selection == 'random'. Pass an int for reproducible output across multiple function calls.

    See :term:Glossary <random_state>.

  • selection : {'cyclic', 'random'}, default='cyclic' If set to 'random', a random coefficient is updated every iteration rather than looping over features sequentially by default. This (setting to 'random') often leads to significantly faster convergence especially when tol is higher than 1e-4.
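
The eps and n_alphas parameters above describe a path of alphas whose smallest and largest values satisfy alpha_min / alpha_max = eps. The following is a minimal sketch of such a grid, assuming the values are evenly spaced on a log scale. The function name alpha_grid and the alpha_max argument are hypothetical; in the estimator, the largest alpha is derived from the data rather than passed in.

```python
import math


def alpha_grid(alpha_max, eps=1e-3, n_alphas=100):
    """Return n_alphas values from alpha_max down to alpha_max * eps.

    Assumption: the values are evenly spaced on a log10 scale.
    alpha_max is a hypothetical input supplied by the caller.
    """
    if n_alphas == 1:
        return [alpha_max]
    log_max = math.log10(alpha_max)
    log_min = math.log10(alpha_max * eps)
    step = (log_max - log_min) / (n_alphas - 1)
    # Walk down from the largest alpha to the smallest one.
    return [10 ** (log_max - i * step) for i in range(n_alphas)]


alphas = alpha_grid(alpha_max=2.0, eps=1e-3, n_alphas=5)
```

With eps=1e-3, the last value of the grid is one thousandth of the first, whatever n_alphas is.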


  • alpha_ : float The amount of penalization chosen by cross validation

  • l1_ratio_ : float The compromise between l1 and l2 penalization chosen by cross validation

  • coef_ : ndarray of shape (n_features,) or (n_targets, n_features) Parameter vector (w in the cost function formula).

  • intercept_ : float or ndarray of shape (n_targets,) Independent term in the decision function.

  • mse_path_ : ndarray of shape (n_l1_ratio, n_alpha, n_folds) Mean square error for the test set on each fold, varying l1_ratio and alpha.

  • alphas_ : ndarray of shape (n_alphas,) or (n_l1_ratio, n_alphas) The grid of alphas used for fitting, for each l1_ratio.

  • n_iter_ : int number of iterations run by the coordinate descent solver to reach the specified tolerance for the optimal alpha.


>>> from sklearn.linear_model import ElasticNetCV
>>> from sklearn.datasets import make_regression
>>> X, y = make_regression(n_features=2, random_state=0)
>>> regr = ElasticNetCV(cv=5, random_state=0)
>>> regr.fit(X, y)
ElasticNetCV(cv=5, random_state=0)
>>> print(regr.alpha_)
>>> print(regr.intercept_)
>>> print(regr.predict([[0, 0]]))


For an example, see :ref:examples/linear_model/ <>.

To avoid unnecessary memory duplication the X argument of the fit method should be directly passed as a Fortran-contiguous numpy array.

The parameter l1_ratio corresponds to alpha in the glmnet R package while alpha corresponds to the lambda parameter in glmnet. More specifically, the optimization objective is::

1 / (2 * n_samples) * ||y - Xw||^2_2
+ alpha * l1_ratio * ||w||_1
+ 0.5 * alpha * (1 - l1_ratio) * ||w||^2_2
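
As a rough illustration of the objective above, the sketch below evaluates it term by term for a single-target problem in plain Python. The function name elastic_net_objective is hypothetical, and the intercept is left out (the data is assumed to be centered).

```python
def elastic_net_objective(X, y, w, alpha, l1_ratio):
    """Evaluate the elastic net objective for coefficient vector w.

    X is a list of rows, y a list of targets, w a list of coefficients.
    Assumption: no intercept term, i.e. the data is already centered.
    """
    n_samples = len(y)
    residuals = [
        yi - sum(xij * wj for xij, wj in zip(xi, w))
        for xi, yi in zip(X, y)
    ]
    squared_error = sum(r * r for r in residuals)   # ||y - Xw||^2_2
    l1_norm = sum(abs(wj) for wj in w)              # ||w||_1
    l2_norm_sq = sum(wj * wj for wj in w)           # ||w||^2_2
    return (
        squared_error / (2 * n_samples)
        + alpha * l1_ratio * l1_norm
        + 0.5 * alpha * (1 - l1_ratio) * l2_norm_sq
    )


X = [[1.0, 0.0], [0.0, 1.0]]
y = [1.0, 2.0]
w = [1.0, 1.0]
value = elastic_net_objective(X, y, w, alpha=1.0, l1_ratio=0.5)
```

Here the three terms are 0.25, 1.0 and 0.5, so value is 1.75.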

If you are interested in controlling the L1 and L2 penalty separately, keep in mind that this is equivalent to::

a * L1 + b * L2

for::

alpha = a + b and l1_ratio = a / (a + b).
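
The conversion between the two parametrizations can be sketched as follows. The helper name to_alpha_l1_ratio is hypothetical; it only applies the two formulas stated above and checks that the inputs make sense.

```python
def to_alpha_l1_ratio(a, b):
    """Map separate L1 and L2 weights (a, b) to (alpha, l1_ratio)."""
    if a < 0 or b < 0 or a + b == 0:
        raise ValueError("a and b must be non-negative and not both zero")
    alpha = a + b
    l1_ratio = a / alpha
    return alpha, l1_ratio


alpha, l1_ratio = to_alpha_l1_ratio(a=0.3, b=0.1)
# Going back: alpha * l1_ratio recovers a, alpha * (1 - l1_ratio) recovers b.
recovered_a = alpha * l1_ratio
recovered_b = alpha * (1 - l1_ratio)
```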

See also

enet_path, ElasticNet


method fit
val fit :
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit linear model with coordinate descent

Fit is on grid of alphas and best alpha estimated by cross-validation.


  • X : {array-like, sparse matrix} of shape (n_samples, n_features) Training data. Pass directly as Fortran-contiguous data to avoid unnecessary memory duplication. If y is mono-output, X can be sparse.

  • y : array-like of shape (n_samples,) or (n_samples, n_targets) Target values


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using the linear model.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape (n_samples,) Returns predicted values.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get a R^2 score of 0.0.
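
The definition above can be written out in a few lines of plain Python. This is a sketch for a single output without sample weights; the function name r2 is hypothetical, and the case where every true value is identical (v = 0) is not handled.

```python
def r2(y_true, y_pred):
    """Coefficient of determination 1 - u/v for a single output."""
    mean = sum(y_true) / len(y_true)
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    v = sum((t - mean) ** 2 for t in y_true)               # total sum of squares
    return 1 - u / v


y_true = [1.0, 2.0, 3.0]
perfect = r2(y_true, [1.0, 2.0, 3.0])    # exact predictions
constant = r2(y_true, [2.0, 2.0, 2.0])   # always predicts the mean of y_true
reversed_ = r2(y_true, [3.0, 2.0, 1.0])  # worse than the constant model
```

The three calls illustrate the three regimes described above: a score of 1.0, a score of 0.0, and a negative score.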


  • X : array-like of shape (n_samples, n_features) Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True values for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float R^2 of self.predict(X) wrt. y.


The R2 score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of :func:~sklearn.metrics.r2_score. This influences the score method of all the multioutput regressors (except for :class:~sklearn.multioutput.MultiOutputRegressor).


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
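
To make the <component>__<parameter> convention concrete, the sketch below separates a flat list of parameters into those that belong to the estimator itself and those addressed to a nested component. The function name split_nested_params and the component names scaler and model are hypothetical; the estimator's own handling may differ in detail.

```python
def split_nested_params(params):
    """Split parameter names on the first double underscore.

    Returns (own, nested), where nested maps each component name to the
    parameters addressed to it.
    """
    own, nested = {}, {}
    for key, value in params.items():
        name, delim, sub_key = key.partition("__")
        if delim:
            # "scaler__with_mean" targets parameter "with_mean" of "scaler".
            nested.setdefault(name, {})[sub_key] = value
        else:
            own[name] = value
    return own, nested


own, nested = split_nested_params(
    {"alpha": 0.5, "scaler__with_mean": False, "model__alpha": 0.1}
)
```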


  • **params : dict Estimator parameters.


  • self : object Estimator instance.


attribute alpha_
val alpha_ : t -> float
val alpha_opt : t -> (float) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute l1_ratio_
val l1_ratio_ : t -> float
val l1_ratio_opt : t -> (float) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute mse_path_
val mse_path_ : t -> [>`ArrayLike] Np.Obj.t
val mse_path_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute alphas_
val alphas_ : t -> [>`ArrayLike] Np.Obj.t
val alphas_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> int
val n_iter_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​GammaRegressor wraps Python class sklearn.linear_model.GammaRegressor.

type t


constructor and attributes create
val create :
  ?alpha:float ->
  ?fit_intercept:bool ->
  ?max_iter:int ->
  ?tol:float ->
  ?warm_start:bool ->
  ?verbose:int ->
  unit ->

Generalized Linear Model with a Gamma distribution.

Read more in the :ref:User Guide <Generalized_linear_regression>.


  • alpha : float, default=1 Constant that multiplies the penalty term and thus determines the regularization strength. alpha = 0 is equivalent to unpenalized GLMs. In this case, the design matrix X must have full column rank (no collinearities).

  • fit_intercept : bool, default=True Specifies if a constant (a.k.a. bias or intercept) should be added to the linear predictor (X @ coef + intercept).

  • max_iter : int, default=100 The maximal number of iterations for the solver.

  • tol : float, default=1e-4 Stopping criterion. For the lbfgs solver, the iteration will stop when max{ |g_j|, j = 1, ..., d} <= tol where g_j is the j-th component of the gradient (derivative) of the objective function.

  • warm_start : bool, default=False If set to True, reuse the solution of the previous call to fit as initialization for coef_ and intercept_ .

  • verbose : int, default=0 For the lbfgs solver set verbose to any positive number for verbosity.


  • coef_ : array of shape (n_features,) Estimated coefficients for the linear predictor (X * coef_ + intercept_) in the GLM.

  • intercept_ : float Intercept (a.k.a. bias) added to linear predictor.

  • n_iter_ : int Actual number of iterations used in the solver.


method fit
val fit :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit a Generalized Linear Model.


  • X : {array-like, sparse matrix} of shape (n_samples, n_features) Training data.

  • y : array-like of shape (n_samples,) Target values.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • self : returns an instance of self.


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using GLM with feature matrix X.


  • X : {array-like, sparse matrix} of shape (n_samples, n_features) Samples.


  • y_pred : array of shape (n_samples,) Returns predicted values.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Compute D^2, the percentage of deviance explained.

D^2 is a generalization of the coefficient of determination R^2. R^2 uses squared error and D^2 deviance. Note that those two are equal for family='normal'.

D^2 is defined as :math:D^2 = 1-\frac{D(y_{true},y_{pred})}{D_{null}}, :math:D_{null} is the null deviance, i.e. the deviance of a model with intercept alone, which corresponds to :math:y_{pred} = \bar{y}. The mean :math:\bar{y} is averaged by sample_weight. Best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse).
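
The following sketch computes D^2 for a Gamma model in plain Python. It assumes the standard Gamma unit deviance 2 * (log(y_pred / y_true) + y_true / y_pred - 1), which is not spelled out in the text above, and requires strictly positive values. The function names gamma_deviance and d2_gamma are hypothetical.

```python
import math


def gamma_deviance(y_true, y_pred, sample_weight=None):
    """Weighted mean Gamma deviance (assumed standard unit deviance)."""
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)
    total = sum(
        w * 2 * (math.log(p / t) + t / p - 1)
        for t, p, w in zip(y_true, y_pred, sample_weight)
    )
    return total / sum(sample_weight)


def d2_gamma(y_true, y_pred, sample_weight=None):
    """D^2 = 1 - D(y_true, y_pred) / D_null."""
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)
    # The null model predicts the weighted mean of y_true for every sample.
    y_bar = sum(w * t for t, w in zip(y_true, sample_weight)) / sum(sample_weight)
    dev = gamma_deviance(y_true, y_pred, sample_weight)
    dev_null = gamma_deviance(y_true, [y_bar] * len(y_true), sample_weight)
    return 1 - dev / dev_null


y_true = [1.0, 2.0, 4.0]
perfect = d2_gamma(y_true, [1.0, 2.0, 4.0])
null_model = d2_gamma(y_true, [7.0 / 3.0] * 3)
rough = d2_gamma(y_true, [1.5, 2.0, 3.5])
```

As with R^2, exact predictions give 1.0 and the intercept-only model gives 0.0.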


  • X : {array-like, sparse matrix} of shape (n_samples, n_features) Test samples.

  • y : array-like of shape (n_samples,) True values of target.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float D^2 of self.predict(X) w.r.t. y.


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.


  • **params : dict Estimator parameters.


  • self : object Estimator instance.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> int
val n_iter_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​Hinge wraps Python class sklearn.linear_model.Hinge.

type t


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​Huber wraps Python class sklearn.linear_model.Huber.

type t


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​HuberRegressor wraps Python class sklearn.linear_model.HuberRegressor.

type t


constructor and attributes create
val create :
  ?epsilon:float ->
  ?max_iter:int ->
  ?alpha:float ->
  ?warm_start:bool ->
  ?fit_intercept:bool ->
  ?tol:float ->
  unit ->

Linear regression model that is robust to outliers.

The Huber Regressor optimizes the squared loss for the samples where |(y - X'w) / sigma| < epsilon and the absolute loss for the samples where |(y - X'w) / sigma| > epsilon, where w and sigma are parameters to be optimized. The parameter sigma makes sure that if y is scaled up or down by a certain factor, one does not need to rescale epsilon to achieve the same robustness. Note that this does not take into account the fact that the different features of X may be of different scales.

This makes sure that the loss function is not heavily influenced by the outliers while not completely ignoring their effect.
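
The switch between the two regimes can be sketched per sample as below, where z stands for the scaled residual (y - X'w) / sigma. The linear branch 2 * epsilon * |z| - epsilon ** 2 is an assumption: it is the usual form that joins the squared branch continuously at |z| = epsilon, but the text above only calls it the absolute loss. The function names are hypothetical.

```python
def huber_loss(z, epsilon=1.35):
    """Per-sample loss for a scaled residual z.

    Squared loss for small residuals, linear (absolute) loss beyond epsilon.
    Assumption: the linear branch is 2 * epsilon * |z| - epsilon ** 2.
    """
    if abs(z) < epsilon:
        return z * z
    return 2 * epsilon * abs(z) - epsilon ** 2


def outlier_mask(residuals, scale, epsilon=1.35):
    """Flag samples whose scaled absolute residual exceeds epsilon."""
    return [abs(r) / scale > epsilon for r in residuals]


small = huber_loss(1.0)   # squared regime
large = huber_loss(3.0)   # linear regime
mask = outlier_mask([0.5, -4.0, 1.0], scale=1.0)
```

A residual of 3.0 costs 6.2775 here instead of the 9.0 a pure squared loss would charge, which is how outliers are kept from dominating the fit without being ignored.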

Read more in the :ref:User Guide <huber_regression>

.. versionadded:: 0.18


  • epsilon : float, greater than 1.0, default 1.35 The parameter epsilon controls the number of samples that should be classified as outliers. The smaller the epsilon, the more robust it is to outliers.

  • max_iter : int, default 100 Maximum number of iterations that scipy.optimize.minimize(method='L-BFGS-B') should run for.

  • alpha : float, default 0.0001 Regularization parameter.

  • warm_start : bool, default False This is useful if the stored attributes of a previously used model have to be reused. If set to False, then the coefficients will be rewritten for every call to fit. See :term:the Glossary <warm_start>.

  • fit_intercept : bool, default True Whether or not to fit the intercept. This can be set to False if the data is already centered around the origin.

  • tol : float, default 1e-5 The iteration will stop when max{ |proj g_i| : i = 1, ..., n } <= tol, where proj g_i is the i-th component of the projected gradient.


  • coef_ : array, shape (n_features,) Coefficients obtained by optimizing the Huber loss.

  • intercept_ : float Bias.

  • scale_ : float The value by which |y - X'w - c| is scaled down.

  • n_iter_ : int Number of iterations that scipy.optimize.minimize(method='L-BFGS-B') has run for.

    .. versionchanged:: 0.20 In SciPy <= 1.0.0 the number of lbfgs iterations may exceed max_iter. n_iter_ will now report at most max_iter.

  • outliers_ : array, shape (n_samples,) A boolean mask which is set to True where the samples are identified as outliers.


>>> import numpy as np
>>> from sklearn.linear_model import HuberRegressor, LinearRegression
>>> from sklearn.datasets import make_regression
>>> rng = np.random.RandomState(0)
>>> X, y, coef = make_regression(
...     n_samples=200, n_features=2, noise=4.0, coef=True, random_state=0)
>>> X[:4] = rng.uniform(10, 20, (4, 2))
>>> y[:4] = rng.uniform(10, 20, 4)
>>> huber = HuberRegressor().fit(X, y)
>>> huber.score(X, y)
>>> huber.predict(X[:1,])
>>> linear = LinearRegression().fit(X, y)
>>> print('True coefficients:', coef)
True coefficients: [20.4923...  34.1698...]
>>> print('Huber coefficients:', huber.coef_)
Huber coefficients: [17.7906... 31.0106...]
>>> print('Linear Regression coefficients:', linear.coef_)
Linear Regression coefficients: [-1.9221...  7.0226...]


.. [1] Peter J. Huber, Elvezio M. Ronchetti, Robust Statistics, Concomitant scale estimates, pg 172

.. [2] Art B. Owen (2006), A robust hybrid of lasso and ridge regression.



method fit
val fit :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit the model according to the given training data.


  • X : array-like, shape (n_samples, n_features) Training vector, where n_samples is the number of samples and n_features is the number of features.

  • y : array-like, shape (n_samples,) Target vector relative to X.

  • sample_weight : array-like, shape (n_samples,) Weight given to each sample.


  • self : object


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using the linear model.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape (n_samples,) Returns predicted values.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get a R^2 score of 0.0.


  • X : array-like of shape (n_samples, n_features) Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True values for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float R^2 of self.predict(X) wrt. y.


The R2 score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of :func:~sklearn.metrics.r2_score. This influences the score method of all the multioutput regressors (except for :class:~sklearn.multioutput.MultiOutputRegressor).


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.


  • **params : dict Estimator parameters.


  • self : object Estimator instance.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute scale_
val scale_ : t -> float
val scale_opt : t -> (float) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> int
val n_iter_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute outliers_
val outliers_ : t -> [>`ArrayLike] Np.Obj.t
val outliers_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​Lars wraps Python class sklearn.linear_model.Lars.

type t


constructor and attributes create
val create :
  ?fit_intercept:bool ->
  ?verbose:int ->
  ?normalize:bool ->
  ?precompute:[`Arr of [>`ArrayLike] Np.Obj.t | `Auto | `Bool of bool] ->
  ?n_nonzero_coefs:int ->
  ?eps:float ->
  ?copy_X:bool ->
  ?fit_path:bool ->
  ?jitter:float ->
  ?random_state:int ->
  unit ->

Least Angle Regression model a.k.a. LAR

Read more in the :ref:User Guide <least_angle_regression>.


  • fit_intercept : bool, default=True Whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (i.e. data is expected to be centered).

  • verbose : bool or int, default=False Sets the verbosity amount

  • normalize : bool, default=True This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use :class:sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.

  • precompute : bool, 'auto' or array-like , default='auto' Whether to use a precomputed Gram matrix to speed up calculations. If set to 'auto' let us decide. The Gram matrix can also be passed as argument.

  • n_nonzero_coefs : int, default=500 Target number of non-zero coefficients. Use np.inf for no limit.

  • eps : float, optional The machine-precision regularization in the computation of the Cholesky diagonal factors. Increase this for very ill-conditioned systems. Unlike the tol parameter in some iterative optimization-based algorithms, this parameter does not control the tolerance of the optimization. By default, np.finfo(np.float).eps is used.

  • copy_X : bool, default=True If True, X will be copied; else, it may be overwritten.

  • fit_path : bool, default=True If True the full path is stored in the coef_path_ attribute. If you compute the solution for a large problem or many targets, setting fit_path to False will lead to a speedup, especially with a small alpha.

  • jitter : float, default=None Upper bound on a uniform noise parameter to be added to the y values, to satisfy the model's assumption of one-at-a-time computations. Might help with stability.

  • random_state : int, RandomState instance or None (default) Determines random number generation for jittering. Pass an int for reproducible output across multiple function calls.

    See :term:Glossary <random_state>. Ignored if jitter is None.
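
The normalize option above is described as subtracting the mean and dividing by the l2-norm. A minimal sketch of that column-wise operation in plain Python follows. The function name normalize_columns is hypothetical, and leaving constant columns at zero is an assumption made here to avoid dividing by zero.

```python
import math


def normalize_columns(X):
    """Center each column of X, then divide it by its l2-norm."""
    n_samples, n_features = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n_samples for j in range(n_features)]
    centered = [[row[j] - means[j] for j in range(n_features)] for row in X]
    norms = [
        math.sqrt(sum(row[j] ** 2 for row in centered))
        for j in range(n_features)
    ]
    # Assumption: a constant column has norm 0 and is left as all zeros.
    norms = [n if n > 0 else 1.0 for n in norms]
    return [[row[j] / norms[j] for j in range(n_features)] for row in centered]


X = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
X_norm = normalize_columns(X)
first_column = [row[0] for row in X_norm]
```

Each non-constant column ends up with zero mean and unit l2-norm. This differs from standardization, which divides by the standard deviation instead; the l2-norm of a centered column equals its (population) standard deviation times the square root of n_samples.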


  • alphas_ : array-like of shape (n_alphas + 1,) | list of n_targets such arrays Maximum of covariances (in absolute value) at each iteration. n_alphas is either n_nonzero_coefs or n_features, whichever is smaller.

  • active_ : list, length = n_alphas | list of n_targets such lists Indices of active variables at the end of the path.

  • coef_path_ : array-like of shape (n_features, n_alphas + 1) | list of n_targets such arrays The varying values of the coefficients along the path. It is not present if the fit_path parameter is False.

  • coef_ : array-like of shape (n_features,) or (n_targets, n_features) Parameter vector (w in the formulation formula).

  • intercept_ : float or array-like of shape (n_targets,) Independent term in decision function.

  • n_iter_ : array-like or int The number of iterations taken by lars_path to find the grid of alphas for each target.


>>> from sklearn import linear_model
>>> reg = linear_model.Lars(n_nonzero_coefs=1)
>>> reg.fit([[-1, 1], [0, 0], [1, 1]], [-1.1111, 0, -1.1111])
>>> print(reg.coef_)
[ 0. -1.11...]

See also

lars_path, LarsCV, sklearn.decomposition.sparse_encode


method fit
val fit :
  ?xy:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit the model using X, y as training data.


  • X : array-like of shape (n_samples, n_features) Training data.

  • y : array-like of shape (n_samples,) or (n_samples, n_targets) Target values.

  • Xy : array-like of shape (n_samples,) or (n_samples, n_targets), default=None Xy = np.dot(X.T, y) that can be precomputed. It is useful only when the Gram matrix is precomputed.


  • self : object returns an instance of self.


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using the linear model.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape (n_samples,) Returns predicted values.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get a R^2 score of 0.0.


  • X : array-like of shape (n_samples, n_features) Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True values for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float R^2 of self.predict(X) wrt. y.


The R2 score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of :func:~sklearn.metrics.r2_score. This influences the score method of all the multioutput regressors (except for :class:~sklearn.multioutput.MultiOutputRegressor).


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.


  • **params : dict Estimator parameters.


  • self : object Estimator instance.


attribute alphas_
val alphas_ : t -> Py.Object.t
val alphas_opt : t -> (Py.Object.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute active_
val active_ : t -> Py.Object.t
val active_opt : t -> (Py.Object.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute coef_path_
val coef_path_ : t -> Py.Object.t
val coef_path_opt : t -> (Py.Object.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> Py.Object.t
val n_iter_opt : t -> (Py.Object.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​LarsCV wraps Python class sklearn.linear_model.LarsCV.

type t


constructor and attributes create
val create :
  ?fit_intercept:bool ->
  ?verbose:int ->
  ?max_iter:int ->
  ?normalize:bool ->
  ?precompute:[`Arr of [>`ArrayLike] Np.Obj.t | `Auto | `Bool of bool] ->
  ?cv:[`BaseCrossValidator of [>`BaseCrossValidator] Np.Obj.t | `I of int | `Arr of [>`ArrayLike] Np.Obj.t] ->
  ?max_n_alphas:int ->
  ?n_jobs:int ->
  ?eps:float ->
  ?copy_X:bool ->
  unit ->

Cross-validated Least Angle Regression model.

See glossary entry for :term:cross-validation estimator.

Read more in the :ref:User Guide <least_angle_regression>.


  • fit_intercept : bool, default=True whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (i.e. data is expected to be centered).

  • verbose : bool or int, default=False Sets the verbosity amount

  • max_iter : int, default=500 Maximum number of iterations to perform.

  • normalize : bool, default=True This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use :class:sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.

  • precompute : bool, 'auto' or array-like , default='auto' Whether to use a precomputed Gram matrix to speed up calculations. If set to 'auto' let us decide. The Gram matrix cannot be passed as argument since we will use only subsets of X.

  • cv : int, cross-validation generator or an iterable, default=None Determines the cross-validation splitting strategy. Possible inputs for cv are:

    • None, to use the default 5-fold cross-validation,
    • integer, to specify the number of folds.
    • :term:CV splitter,
    • An iterable yielding (train, test) splits as arrays of indices.

    For integer/None inputs, :class:KFold is used.

    Refer to the :ref:User Guide <cross_validation> for the various cross-validation strategies that can be used here.

    .. versionchanged:: 0.22 cv default value if None changed from 3-fold to 5-fold.

  • max_n_alphas : int, default=1000 The maximum number of points on the path used to compute the residuals in the cross-validation

  • n_jobs : int or None, default=None Number of CPUs to use during the cross validation. None means 1 unless in a :obj:joblib.parallel_backend context. -1 means using all processors. See :term:Glossary <n_jobs> for more details.

  • eps : float, optional The machine-precision regularization in the computation of the Cholesky diagonal factors. Increase this for very ill-conditioned systems. By default, np.finfo(np.float).eps is used.

  • copy_X : bool, default=True If True, X will be copied; else, it may be overwritten.
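The cv parameter above accepts "an iterable yielding (train, test) splits as arrays of indices". As a rough illustration of what such an iterable looks like, here is a minimal plain-Python sketch of consecutive, unshuffled K-fold splitting. The function name `k_fold_indices` is hypothetical and this is not the library's implementation; it only assumes that the first `n_samples % n_splits` folds receive one extra sample.

```python
def k_fold_indices(n_samples, n_splits=5):
    """Yield (train, test) index lists for consecutive, unshuffled folds."""
    # Assumption: the first n_samples % n_splits folds get one extra sample.
    fold_sizes = [n_samples // n_splits] * n_splits
    for i in range(n_samples % n_splits):
        fold_sizes[i] += 1

    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


# Seven samples split into three folds of sizes 3, 2 and 2.
splits = list(k_fold_indices(n_samples=7, n_splits=3))
for train, test in splits:
    print("train:", train, "test:", test)
```

Each sample index appears in exactly one test fold, which is the property a cross-validated estimator relies on when it averages errors across folds.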


  • coef_ : array-like of shape (n_features,) parameter vector (w in the formulation formula)

  • intercept_ : float independent term in decision function

  • coef_path_ : array-like of shape (n_features, n_alphas) the varying values of the coefficients along the path

  • alpha_ : float the estimated regularization parameter alpha

  • alphas_ : array-like of shape (n_alphas,) the different values of alpha along the path

  • cv_alphas_ : array-like of shape (n_cv_alphas,) all the values of alpha along the path for the different folds

  • mse_path_ : array-like of shape (n_folds, n_cv_alphas) the mean square error on left-out for each fold along the path (alpha values given by cv_alphas)

  • n_iter_ : array-like or int the number of iterations run by Lars with the optimal alpha.


>>> from sklearn.linear_model import LarsCV
>>> from sklearn.datasets import make_regression
>>> X, y = make_regression(n_samples=200, noise=4.0, random_state=0)
>>> reg = LarsCV(cv=5).fit(X, y)
>>> reg.score(X, y)
>>> reg.alpha_
>>> reg.predict(X[:1,])

See also

lars_path, LassoLars, LassoLarsCV


method fit
val fit :
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit the model using X, y as training data.


  • X : array-like of shape (n_samples, n_features) Training data.

  • y : array-like of shape (n_samples,) Target values.


  • self : object returns an instance of self.


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using the linear model.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape (n_samples,) Returns predicted values.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0, and the score can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.


  • X : array-like of shape (n_samples, n_features) Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True values for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float R^2 of self.predict(X) wrt. y.


The R2 score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of :func:~sklearn.metrics.r2_score. This influences the score method of all the multioutput regressors (except for :class:~sklearn.multioutput.MultiOutputRegressor).
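The definition of R^2 given for score can be checked by hand. The following plain-Python sketch computes 1 - u/v for three cases: a perfect prediction, a constant prediction equal to the mean of y, and a deliberately poor prediction. The helper name `r2_score` and the sample values are illustrative assumptions; the sketch ignores sample weights and multi-output targets.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination, 1 - u/v, for plain lists of floats."""
    mean_true = sum(y_true) / len(y_true)
    # u: residual sum of squares
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # v: total sum of squares
    v = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - u / v


y_true = [3.0, -0.5, 2.0, 7.0]
mean_value = sum(y_true) / len(y_true)

perfect = r2_score(y_true, y_true)                          # best possible score
constant = r2_score(y_true, [mean_value] * len(y_true))     # constant model
worse = r2_score(y_true, [10.0, 10.0, 10.0, 10.0])          # worse than constant

print(perfect, constant, worse)
```

The third case shows why the score is unbounded below: a model that predicts far from every target has a residual sum of squares larger than the total sum of squares.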


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.


  • **params : dict Estimator parameters.


  • self : object Estimator instance.
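To illustrate the `<component>__<parameter>` naming convention described for set_params, here is a small plain-Python sketch that separates an estimator's own parameters from those addressed to nested components. The helper names and the example keys (`scaler`, `model`, `with_mean`) are hypothetical, and the sketch assumes the name is split at the first double underscore; it does not reproduce how the library dispatches the values.

```python
def split_param_name(name):
    """Split '<component>__<parameter>' at the first double underscore."""
    component, delimiter, parameter = name.partition("__")
    if not delimiter:
        return None, name  # plain parameter of the estimator itself
    return component, parameter


def group_params(params):
    """Group parameters into the estimator's own and per-component dicts."""
    own, nested = {}, {}
    for name, value in params.items():
        component, parameter = split_param_name(name)
        if component is None:
            own[parameter] = value
        else:
            nested.setdefault(component, {})[parameter] = value
    return own, nested


own, nested = group_params(
    {"max_iter": 200, "scaler__with_mean": False, "model__cv": 3}
)
print(own)
print(nested)
```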


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute coef_path_
val coef_path_ : t -> [>`ArrayLike] Np.Obj.t
val coef_path_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute alpha_
val alpha_ : t -> float
val alpha_opt : t -> (float) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute alphas_
val alphas_ : t -> [>`ArrayLike] Np.Obj.t
val alphas_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute cv_alphas_
val cv_alphas_ : t -> [>`ArrayLike] Np.Obj.t
val cv_alphas_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute mse_path_
val mse_path_ : t -> [>`ArrayLike] Np.Obj.t
val mse_path_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> Py.Object.t
val n_iter_opt : t -> (Py.Object.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​Lasso wraps Python class sklearn.linear_model.Lasso.

type t


constructor and attributes create
val create :
  ?alpha:float ->
  ?fit_intercept:bool ->
  ?normalize:bool ->
  ?precompute:[`Arr of [>`ArrayLike] Np.Obj.t | `Auto | `Bool of bool] ->
  ?copy_X:bool ->
  ?max_iter:int ->
  ?tol:float ->
  ?warm_start:bool ->
  ?positive:bool ->
  ?random_state:int ->
  ?selection:[`Cyclic | `Random] ->
  unit ->

Linear Model trained with L1 prior as regularizer (aka the Lasso)

The optimization objective for Lasso is::

(1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1

Technically the Lasso model is optimizing the same objective function as the Elastic Net with l1_ratio=1.0 (no L2 penalty).
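To make the objective concrete, the sketch below evaluates it in plain Python on a three-sample data set with two identical features. The explicit `intercept` argument and the three candidate solutions are illustrative assumptions; the formula above leaves the intercept implicit.

```python
def lasso_objective(X, y, w, intercept, alpha):
    """(1 / (2 * n_samples)) * ||y - Xw - b||^2_2 + alpha * ||w||_1"""
    n_samples = len(X)
    residuals = [
        yi - (sum(xij * wj for xij, wj in zip(xi, w)) + intercept)
        for xi, yi in zip(X, y)
    ]
    squared_error = sum(r * r for r in residuals)
    l1_norm = sum(abs(wj) for wj in w)
    return squared_error / (2 * n_samples) + alpha * l1_norm


X = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
y = [0.0, 1.0, 2.0]
alpha = 0.1

# Candidate solutions (assumed for illustration):
at_shrunk = lasso_objective(X, y, [0.85, 0.0], 0.15, alpha)  # shrunken weight
at_exact = lasso_objective(X, y, [1.0, 0.0], 0.0, alpha)     # exact fit, no shrinkage
at_zero = lasso_objective(X, y, [0.0, 0.0], 1.0, alpha)      # all weights zero

print(at_shrunk, at_exact, at_zero)
```

The shrunken candidate gives up a little data fit in exchange for a smaller L1 penalty and ends up with the lowest objective of the three, which is the trade-off alpha controls.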

Read more in the :ref:User Guide <lasso>.


  • alpha : float, default=1.0 Constant that multiplies the L1 term. alpha = 0 is equivalent to ordinary least squares, solved by the :class:LinearRegression object. For numerical reasons, using alpha = 0 with the Lasso object is not advised; use the :class:LinearRegression object instead.

  • fit_intercept : bool, default=True Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations (i.e. data is expected to be centered).

  • normalize : bool, default=False This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use :class:sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.

  • precompute : 'auto', bool or array-like of shape (n_features, n_features), default=False Whether to use a precomputed Gram matrix to speed up calculations. If set to 'auto' let us decide. The Gram matrix can also be passed as argument. For sparse input this option is always True to preserve sparsity.

  • copy_X : bool, default=True If True, X will be copied; else, it may be overwritten.

  • max_iter : int, default=1000 The maximum number of iterations

  • tol : float, default=1e-4 The tolerance for the optimization: if the updates are smaller than tol, the optimization code checks the dual gap for optimality and continues until it is smaller than tol.

  • warm_start : bool, default=False When set to True, reuse the solution of the previous call to fit as initialization; otherwise, just erase the previous solution. See :term:the Glossary <warm_start>.

  • positive : bool, default=False When set to True, forces the coefficients to be positive.

  • random_state : int, RandomState instance, default=None The seed of the pseudo random number generator that selects a random feature to update. Used when selection == 'random'. Pass an int for reproducible output across multiple function calls. See :term:Glossary <random_state>.

  • selection : {'cyclic', 'random'}, default='cyclic' If set to 'random', a random coefficient is updated every iteration rather than looping over features sequentially by default. This (setting to 'random') often leads to significantly faster convergence especially when tol is higher than 1e-4.
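The normalize parameter is described as subtracting the mean and dividing by the l2-norm, which differs from standardizing (dividing by the standard deviation). A minimal plain-Python sketch of that description follows. The function name is hypothetical, and leaving constant columns at zero is an assumption of this sketch rather than documented behavior.

```python
import math


def normalize_columns(X):
    """Center each column, then divide it by the l2-norm of the centered column."""
    n_samples, n_features = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n_samples for j in range(n_features)]
    centered = [[row[j] - means[j] for j in range(n_features)] for row in X]
    norms = [
        math.sqrt(sum(row[j] ** 2 for row in centered)) for j in range(n_features)
    ]
    # Assumption: a constant column has zero norm, so it is left as zeros.
    norms = [norm if norm > 0.0 else 1.0 for norm in norms]
    return [[row[j] / norms[j] for j in range(n_features)] for row in centered]


X = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
normalized = normalize_columns(X)
first_column = [row[0] for row in normalized]
second_column = [row[1] for row in normalized]
print(first_column, second_column)
```

After this transformation each non-constant column has zero mean and unit l2-norm, so its scale depends on the number of samples, unlike a standardized column.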


  • coef_ : ndarray of shape (n_features,) or (n_targets, n_features) parameter vector (w in the cost function formula)

  • sparse_coef_ : sparse matrix of shape (n_features, 1) or (n_targets, n_features) sparse_coef_ is a readonly property derived from coef_

  • intercept_ : float or ndarray of shape (n_targets,) independent term in decision function.

  • n_iter_ : int or list of int number of iterations run by the coordinate descent solver to reach the specified tolerance.


>>> from sklearn import linear_model
>>> clf = linear_model.Lasso(alpha=0.1)
>>> clf.fit([[0,0], [1, 1], [2, 2]], [0, 1, 2])
>>> print(clf.coef_)
[0.85 0.  ]
>>> print(clf.intercept_)
0.15...

See also

lars_path, lasso_path, LassoLars, LassoCV, LassoLarsCV, sklearn.decomposition.sparse_encode


The algorithm used to fit the model is coordinate descent.

To avoid unnecessary memory duplication the X argument of the fit method should be directly passed as a Fortran-contiguous numpy array.
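Since the notes name coordinate descent as the fitting algorithm, a small sketch may help show why it works column by column. The plain-Python code below runs cyclic coordinate descent with soft-thresholding on the Lasso objective. It is a teaching sketch under stated assumptions, not the library's solver: the data is assumed to be centered already (no intercept), the number of sweeps is fixed, and there is no dual-gap stopping test.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold, clipping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_sweeps=100):
    """Minimize (1 / (2 * n)) * ||y - Xw||^2 + alpha * ||w||_1 cyclically."""
    n_samples, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    residual = list(y)  # equals y - Xw because w starts at zero
    for _ in range(n_sweeps):
        for j in range(n_features):
            column = [row[j] for row in X]
            norm_sq = sum(c * c for c in column)
            if norm_sq == 0.0:
                continue
            # Correlation of column j with the residual that excludes feature j.
            rho = sum(c * (r + c * w[j]) for c, r in zip(column, residual))
            new_wj = soft_threshold(rho, alpha * n_samples) / norm_sq
            # Keep the residual in sync with the updated coefficient.
            residual = [r - c * (new_wj - w[j]) for c, r in zip(column, residual)]
            w[j] = new_wj
    return w


# Centered version of a three-sample problem with two identical features.
X = [[-1.0, -1.0], [0.0, 0.0], [1.0, 1.0]]
y = [-1.0, 0.0, 1.0]
w = lasso_coordinate_descent(X, y, alpha=0.1)
print(w)
```

Every update reads a single column of X, which is why column-major (Fortran-contiguous) storage suits this algorithm.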


method fit
val fit :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  ?check_input:bool ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit model with coordinate descent.


  • X : {ndarray, sparse matrix} of shape (n_samples, n_features) Data

  • y : {ndarray, sparse matrix} of shape (n_samples,) or (n_samples, n_targets) Target. Will be cast to X's dtype if necessary

  • sample_weight : float or array-like of shape (n_samples,), default=None Sample weight.

  • check_input : bool, default=True Allows bypassing several input checks. Don't use this parameter unless you know what you are doing.


Coordinate descent considers one column of the data at a time, so it automatically converts the X input to a Fortran-contiguous numpy array if necessary.

To avoid memory re-allocation, allocate the initial data in memory directly in that format.


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using the linear model.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape (n_samples,) Returns predicted values.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0, and the score can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.


  • X : array-like of shape (n_samples, n_features) Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True values for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float R^2 of self.predict(X) wrt. y.


The R2 score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of :func:~sklearn.metrics.r2_score. This influences the score method of all the multioutput regressors (except for :class:~sklearn.multioutput.MultiOutputRegressor).


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.


  • **params : dict Estimator parameters.


  • self : object Estimator instance.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute sparse_coef_
val sparse_coef_ : t -> [`ArrayLike|`Object|`Spmatrix] Np.Obj.t
val sparse_coef_opt : t -> ([`ArrayLike|`Object|`Spmatrix] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> Py.Object.t
val n_iter_opt : t -> (Py.Object.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​LassoCV wraps Python class sklearn.linear_model.LassoCV.

type t


constructor and attributes create
val create :
  ?eps:float ->
  ?n_alphas:int ->
  ?alphas:[>`ArrayLike] Np.Obj.t ->
  ?fit_intercept:bool ->
  ?normalize:bool ->
  ?precompute:[`Arr of [>`ArrayLike] Np.Obj.t | `Auto | `Bool of bool] ->
  ?max_iter:int ->
  ?tol:float ->
  ?copy_X:bool ->
  ?cv:[`BaseCrossValidator of [>`BaseCrossValidator] Np.Obj.t | `I of int | `Arr of [>`ArrayLike] Np.Obj.t] ->
  ?verbose:int ->
  ?n_jobs:int ->
  ?positive:bool ->
  ?random_state:int ->
  ?selection:[`Cyclic | `Random] ->
  unit ->

Lasso linear model with iterative fitting along a regularization path.

See glossary entry for :term:cross-validation estimator.

The best model is selected by cross-validation.

The optimization objective for Lasso is::

(1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1

Read more in the :ref:User Guide <lasso>.


  • eps : float, default=1e-3 Length of the path. eps=1e-3 means that alpha_min / alpha_max = 1e-3.

  • n_alphas : int, default=100 Number of alphas along the regularization path

  • alphas : ndarray, default=None List of alphas where to compute the models. If None alphas are set automatically

  • fit_intercept : bool, default=True whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (i.e. data is expected to be centered).

  • normalize : bool, default=False This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use :class:sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.

  • precompute : 'auto', bool or array-like of shape (n_features, n_features), default='auto' Whether to use a precomputed Gram matrix to speed up calculations. If set to 'auto' let us decide. The Gram matrix can also be passed as argument.

  • max_iter : int, default=1000 The maximum number of iterations

  • tol : float, default=1e-4 The tolerance for the optimization: if the updates are smaller than tol, the optimization code checks the dual gap for optimality and continues until it is smaller than tol.

  • copy_X : bool, default=True If True, X will be copied; else, it may be overwritten.

  • cv : int, cross-validation generator or iterable, default=None Determines the cross-validation splitting strategy. Possible inputs for cv are:

    • None, to use the default 5-fold cross-validation,
    • int, to specify the number of folds.
    • :term:CV splitter,
    • An iterable yielding (train, test) splits as arrays of indices.

    For int/None inputs, :class:KFold is used.

    Refer to the :ref:User Guide <cross_validation> for the various cross-validation strategies that can be used here.

    .. versionchanged:: 0.22 cv default value if None changed from 3-fold to 5-fold.

  • verbose : bool or int, default=False Amount of verbosity.

  • n_jobs : int, default=None Number of CPUs to use during the cross validation. None means 1 unless in a :obj:joblib.parallel_backend context. -1 means using all processors. See :term:Glossary <n_jobs> for more details.

  • positive : bool, default=False If positive, restrict regression coefficients to be positive

  • random_state : int, RandomState instance, default=None The seed of the pseudo random number generator that selects a random feature to update. Used when selection == 'random'. Pass an int for reproducible output across multiple function calls. See :term:Glossary <random_state>.

  • selection : {'cyclic', 'random'}, default='cyclic' If set to 'random', a random coefficient is updated every iteration rather than looping over features sequentially by default. This (setting to 'random') often leads to significantly faster convergence especially when tol is higher than 1e-4.
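The eps and n_alphas parameters above jointly describe a regularization path whose smallest and largest values satisfy alpha_min / alpha_max = eps. As a hedged illustration, the sketch below builds such a path in plain Python. The function name `alpha_grid`, the geometric spacing, and the input `alpha_max` are assumptions of this sketch; how the largest alpha is derived from the data is not covered here.

```python
def alpha_grid(alpha_max, eps=1e-3, n_alphas=100):
    """Geometric grid from alpha_max down to alpha_max * eps (assumed spacing)."""
    if n_alphas == 1:
        return [alpha_max]
    # Constant ratio between consecutive alphas so the last one is alpha_max * eps.
    ratio = eps ** (1.0 / (n_alphas - 1))
    return [alpha_max * ratio ** i for i in range(n_alphas)]


grid = alpha_grid(alpha_max=2.0, eps=1e-3, n_alphas=4)
print(grid)
```

A smaller eps stretches the path toward weaker regularization, while a larger n_alphas samples the same range more finely.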


  • alpha_ : float The amount of penalization chosen by cross validation

  • coef_ : ndarray of shape (n_features,) or (n_targets, n_features) parameter vector (w in the cost function formula)

  • intercept_ : float or ndarray of shape (n_targets,) independent term in decision function.

  • mse_path_ : ndarray of shape (n_alphas, n_folds) mean square error for the test set on each fold, varying alpha

  • alphas_ : ndarray of shape (n_alphas,) The grid of alphas used for fitting

  • dual_gap_ : float or ndarray of shape (n_targets,) The dual gap at the end of the optimization for the optimal alpha (alpha_).

  • n_iter_ : int number of iterations run by the coordinate descent solver to reach the specified tolerance for the optimal alpha.
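The attributes above relate as follows: mse_path_ holds one row per alpha and one column per fold, and alpha_ is the penalization chosen by cross-validation. A plausible reading, sketched below in plain Python, is that the chosen alpha is the one with the lowest error averaged over folds. The helper name and the numbers are made up for illustration, and the selection rule is an assumption of this sketch.

```python
def select_alpha(alphas, mse_path):
    """Pick the alpha whose mean squared error, averaged over folds, is lowest."""
    mean_mse = [sum(fold_errors) / len(fold_errors) for fold_errors in mse_path]
    best = min(range(len(alphas)), key=lambda i: mean_mse[i])
    return alphas[best], mean_mse[best]


# Hypothetical grid of three alphas evaluated on three folds.
alphas = [1.0, 0.1, 0.01]
mse_path = [
    [4.0, 5.0, 6.0],  # errors per fold for alpha = 1.0
    [1.0, 2.0, 1.5],  # errors per fold for alpha = 0.1
    [2.0, 1.0, 3.0],  # errors per fold for alpha = 0.01
]
best_alpha, best_mse = select_alpha(alphas, mse_path)
print(best_alpha, best_mse)
```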


>>> from sklearn.linear_model import LassoCV
>>> from sklearn.datasets import make_regression
>>> X, y = make_regression(noise=4, random_state=0)
>>> reg = LassoCV(cv=5, random_state=0).fit(X, y)
>>> reg.score(X, y)
>>> reg.predict(X[:1,])


For an example, see :ref:examples/linear_model/ <>.

To avoid unnecessary memory duplication the X argument of the fit method should be directly passed as a Fortran-contiguous numpy array.

See also

lars_path, lasso_path, LassoLars, Lasso, LassoLarsCV


method fit
val fit :
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit linear model with coordinate descent

Fit is on grid of alphas and best alpha estimated by cross-validation.


  • X : {array-like, sparse matrix} of shape (n_samples, n_features) Training data. Pass directly as Fortran-contiguous data to avoid unnecessary memory duplication. If y is mono-output, X can be sparse.

  • y : array-like of shape (n_samples,) or (n_samples, n_targets) Target values


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using the linear model.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape (n_samples,) Returns predicted values.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0, and the score can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.


  • X : array-like of shape (n_samples, n_features) Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True values for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float R^2 of self.predict(X) wrt. y.


The R2 score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of :func:~sklearn.metrics.r2_score. This influences the score method of all the multioutput regressors (except for :class:~sklearn.multioutput.MultiOutputRegressor).


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.


  • **params : dict Estimator parameters.


  • self : object Estimator instance.


attribute alpha_
val alpha_ : t -> float
val alpha_opt : t -> (float) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute mse_path_
val mse_path_ : t -> [>`ArrayLike] Np.Obj.t
val mse_path_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute alphas_
val alphas_ : t -> [>`ArrayLike] Np.Obj.t
val alphas_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute dual_gap_
val dual_gap_ : t -> [>`ArrayLike] Np.Obj.t
val dual_gap_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> int
val n_iter_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​LassoLars wraps Python class sklearn.linear_model.LassoLars.

type t


constructor and attributes create
val create :
  ?alpha:float ->
  ?fit_intercept:bool ->
  ?verbose:int ->
  ?normalize:bool ->
  ?precompute:[`Arr of [>`ArrayLike] Np.Obj.t | `Auto | `Bool of bool] ->
  ?max_iter:int ->
  ?eps:float ->
  ?copy_X:bool ->
  ?fit_path:bool ->
  ?positive:bool ->
  ?jitter:float ->
  ?random_state:int ->
  unit ->

Lasso model fit with Least Angle Regression a.k.a. Lars

It is a Linear Model trained with an L1 prior as regularizer.

The optimization objective for Lasso is::

(1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1

Read more in the :ref:User Guide <least_angle_regression>.


  • alpha : float, default=1.0 Constant that multiplies the penalty term. alpha = 0 is equivalent to ordinary least squares, solved by :class:LinearRegression. For numerical reasons, using alpha = 0 with the LassoLars object is not advised and you should prefer the LinearRegression object.

  • fit_intercept : bool, default=True whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (i.e. data is expected to be centered).

  • verbose : bool or int, default=False Sets the verbosity amount

  • normalize : bool, default=True This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use :class:sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.

  • precompute : bool, 'auto' or array-like, default='auto' Whether to use a precomputed Gram matrix to speed up calculations. If set to 'auto' let us decide. The Gram matrix can also be passed as argument.

  • max_iter : int, default=500 Maximum number of iterations to perform.

  • eps : float, optional The machine-precision regularization in the computation of the Cholesky diagonal factors. Increase this for very ill-conditioned systems. Unlike the tol parameter in some iterative optimization-based algorithms, this parameter does not control the tolerance of the optimization. By default, np.finfo(np.float).eps is used.

  • copy_X : bool, default=True If True, X will be copied; else, it may be overwritten.

  • fit_path : bool, default=True If True the full path is stored in the coef_path_ attribute. If you compute the solution for a large problem or many targets, setting fit_path to False will lead to a speedup, especially with a small alpha.

  • positive : bool, default=False Restrict coefficients to be >= 0. Be aware that you might want to remove fit_intercept which is set True by default. Under the positive restriction the model coefficients will not converge to the ordinary-least-squares solution for small values of alpha. Only coefficients up to the smallest alpha value (alphas_[alphas_ > 0.].min() when fit_path=True) reached by the stepwise Lars-Lasso algorithm are typically in congruence with the solution of the coordinate descent Lasso estimator.

  • jitter : float, default=None Upper bound on a uniform noise parameter to be added to the y values, to satisfy the model's assumption of one-at-a-time computations. Might help with stability.

  • random_state : int, RandomState instance or None (default) Determines random number generation for jittering. Pass an int for reproducible output across multiple function calls. See :term:Glossary <random_state>. Ignored if jitter is None.
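The jitter parameter is described as an upper bound on uniform noise added to the y values, with random_state controlling reproducibility. The plain-Python sketch below shows one way to read that description. The function name, the lower bound of zero, and the use of the standard `random` module are assumptions of this sketch, not the library's implementation.

```python
import random


def add_jitter(y, jitter, seed=None):
    """Add uniform noise in [0, jitter] to every target value."""
    rng = random.Random(seed)  # a fixed seed makes the noise reproducible
    return [value + rng.uniform(0.0, jitter) for value in y]


y = [-1.0, 0.0, -1.0]
noisy = add_jitter(y, jitter=1e-3, seed=0)
repeat = add_jitter(y, jitter=1e-3, seed=0)
print(noisy)
```

The perturbation is tiny relative to the targets, so it breaks exact ties between them without materially changing the fit.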


  • alphas_ : array-like of shape (n_alphas + 1,) | list of n_targets such arrays Maximum of covariances (in absolute value) at each iteration. n_alphas is either max_iter, n_features, or the number of nodes in the path with correlation greater than alpha, whichever is smaller.

  • active_ : list, length = n_alphas | list of n_targets such lists Indices of active variables at the end of the path.

  • coef_path_ : array-like of shape (n_features, n_alphas + 1) or list If a list is passed it's expected to be one of n_targets such arrays. The varying values of the coefficients along the path. It is not present if the fit_path parameter is False.

  • coef_ : array-like of shape (n_features,) or (n_targets, n_features) Parameter vector (w in the formulation formula).

  • intercept_ : float or array-like of shape (n_targets,) Independent term in decision function.

  • n_iter_ : array-like or int. The number of iterations taken by lars_path to find the grid of alphas for each target.


>>> from sklearn import linear_model
>>> reg = linear_model.LassoLars(alpha=0.01)
>>> reg.fit([[-1, 1], [0, 0], [1, 1]], [-1, 0, -1])
>>> print(reg.coef_)
[ 0.         -0.963257...]

See also

lars_path, lasso_path, Lasso, LassoCV, LassoLarsCV, LassoLarsIC, sklearn.decomposition.sparse_encode


method fit
val fit :
  ?xy:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit the model using X, y as training data.


  • X : array-like of shape (n_samples, n_features) Training data.

  • y : array-like of shape (n_samples,) or (n_samples, n_targets) Target values.

  • Xy : array-like of shape (n_samples,) or (n_samples, n_targets), default=None Xy = np.dot(X.T, y) that can be precomputed. It is useful only when the Gram matrix is precomputed.


  • self : object returns an instance of self.
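The Xy argument and the precomputed Gram matrix are two products of the same data: the Gram matrix is X transposed times X, and Xy is taken here to be X transposed times y. The sketch below computes both in plain Python on a small three-sample data set, so the shapes are easy to inspect. The function name is hypothetical and the data is chosen only for illustration.

```python
def gram_and_xy(X, y):
    """Return (X^T X, X^T y) for a list-of-rows X and a list y."""
    n_features = len(X[0])
    columns = [[row[j] for row in X] for j in range(n_features)]
    # Gram matrix: inner products between every pair of feature columns.
    gram = [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]
    # Xy: inner product of each feature column with the target.
    xy = [sum(c * t for c, t in zip(column, y)) for column in columns]
    return gram, xy


X = [[-1.0, 1.0], [0.0, 0.0], [1.0, 1.0]]
y = [-1.0, 0.0, -1.0]
gram, xy = gram_and_xy(X, y)
print(gram)
print(xy)
```

Both results depend only on the number of features, not on the number of samples, which is why precomputing them can pay off when the same data is fitted repeatedly.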


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using the linear model.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape (n_samples,) Returns predicted values.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get a R^2 score of 0.0.


  • X : array-like of shape (n_samples, n_features) Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True values for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float R^2 of self.predict(X) wrt. y.


The R2 score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of :func:~sklearn.metrics.r2_score. This influences the score method of all the multioutput regressors (except for :class:~sklearn.multioutput.MultiOutputRegressor).
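The definition of R^2 given above can be checked by hand. The following is a minimal sketch in plain Python, not the library's implementation; the helper name r_squared and the sample values are assumptions made for illustration, and the degenerate case of a constant y_true (v = 0) is not handled.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination 1 - u/v for plain Python sequences."""
    mean_true = sum(y_true) / len(y_true)
    # u: residual sum of squares
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # v: total sum of squares (assumed non-zero here)
    v = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - u / v


y_true = [3.0, -0.5, 2.0, 7.0]
mean_y = sum(y_true) / len(y_true)

perfect = r_squared(y_true, y_true)             # exact predictions
constant = r_squared(y_true, [mean_y] * 4)      # always predicts the mean of y
worse = r_squared(y_true, [10.0] * 4)           # worse than predicting the mean
```

Here perfect is 1.0, constant is 0.0 and worse is negative, matching the three cases described in the text.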


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.


  • **params : dict Estimator parameters.


  • self : object Estimator instance.
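To illustrate the <component>__<parameter> naming convention, here is a small sketch that splits a parameter name at its first double underscore. The helper split_param and the names "reg" and "pipe" are hypothetical; the sketch shows the naming scheme only and is not presented as how the library resolves parameters.

```python
def split_param(name):
    """Split 'component__parameter' at the first double underscore.

    Returns (None, name) for a parameter of the estimator itself.
    """
    component, sep, rest = name.partition("__")
    if not sep:
        return None, name
    return component, rest


simple = split_param("alpha")             # parameter of the estimator itself
nested = split_param("reg__alpha")        # parameter 'alpha' of component 'reg'
deep = split_param("pipe__reg__alpha")    # remainder is resolved by 'pipe'
```

With the OCaml signature above, such names would be passed as the string half of each (string * Py.Object.t) pair.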


attribute alphas_
val alphas_ : t -> Py.Object.t
val alphas_opt : t -> (Py.Object.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute active_
val active_ : t -> Py.Object.t
val active_opt : t -> (Py.Object.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute coef_path_
val coef_path_ : t -> [>`ArrayLike] Np.Obj.t
val coef_path_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> Py.Object.t
val n_iter_opt : t -> (Py.Object.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​LassoLarsCV wraps Python class sklearn.linear_model.LassoLarsCV.

type t


constructor and attributes create
val create :
  ?fit_intercept:bool ->
  ?verbose:int ->
  ?max_iter:int ->
  ?normalize:bool ->
  ?precompute:[`Auto | `Bool of bool] ->
  ?cv:[`BaseCrossValidator of [>`BaseCrossValidator] Np.Obj.t | `I of int | `Arr of [>`ArrayLike] Np.Obj.t] ->
  ?max_n_alphas:int ->
  ?n_jobs:int ->
  ?eps:float ->
  ?copy_X:bool ->
  ?positive:bool ->
  unit ->

Cross-validated Lasso, using the LARS algorithm.

See glossary entry for :term:cross-validation estimator.

The optimization objective for Lasso is::

(1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1

Read more in the :ref:User Guide <least_angle_regression>.
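The Lasso objective stated above can be evaluated directly for a candidate weight vector. This is a minimal sketch in plain Python under the assumption of no intercept term (the formula as written has none); the function name lasso_objective and the toy data are illustrative.

```python
def lasso_objective(X, y, w, alpha):
    """(1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1"""
    n_samples = len(y)
    # residuals y - Xw, one per sample
    residuals = [
        yi - sum(xij * wj for xij, wj in zip(xi, w))
        for xi, yi in zip(X, y)
    ]
    squared_l2 = sum(r * r for r in residuals)
    l1 = sum(abs(wj) for wj in w)
    return squared_l2 / (2 * n_samples) + alpha * l1


X = [[-1.0, 1.0], [0.0, 0.0], [1.0, 1.0]]
y = [-1.0, 0.0, -1.0]

fitted = lasso_objective(X, y, [0.0, -1.0], alpha=0.01)  # zero residual, penalty only
null = lasso_objective(X, y, [0.0, 0.0], alpha=0.01)     # no penalty, residual only
```

For this data the weights [0, -1] reproduce y exactly, so the objective reduces to the penalty alpha * 1 = 0.01, while the all-zero weights pay 2 / 6 in squared error.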


  • fit_intercept : bool, default=True whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (i.e. data is expected to be centered).

  • verbose : bool or int, default=False Sets the verbosity amount

  • max_iter : int, default=500 Maximum number of iterations to perform.

  • normalize : bool, default=True This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use :class:sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.

  • precompute : bool or 'auto' , default='auto' Whether to use a precomputed Gram matrix to speed up calculations. If set to 'auto' let us decide. The Gram matrix cannot be passed as argument since we will use only subsets of X.

  • cv : int, cross-validation generator or an iterable, default=None Determines the cross-validation splitting strategy. Possible inputs for cv are:

    • None, to use the default 5-fold cross-validation,
    • integer, to specify the number of folds.
    • :term:CV splitter,
    • An iterable yielding (train, test) splits as arrays of indices.

    For integer/None inputs, :class:KFold is used.

    Refer to the :ref:User Guide <cross_validation> for the various cross-validation strategies that can be used here.

    .. versionchanged:: 0.22 cv default value if None changed from 3-fold to 5-fold.

  • max_n_alphas : int, default=1000 The maximum number of points on the path used to compute the residuals in the cross-validation

  • n_jobs : int or None, default=None Number of CPUs to use during the cross validation. None means 1 unless in a :obj:joblib.parallel_backend context. -1 means using all processors. See :term:Glossary <n_jobs> for more details.

  • eps : float, optional The machine-precision regularization in the computation of the Cholesky diagonal factors. Increase this for very ill-conditioned systems. By default, np.finfo(np.float).eps is used.

  • copy_X : bool, default=True If True, X will be copied; else, it may be overwritten.

  • positive : bool, default=False Restrict coefficients to be >= 0. Be aware that you might want to remove fit_intercept which is set True by default. Under the positive restriction the model coefficients do not converge to the ordinary-least-squares solution for small values of alpha. Only coefficients up to the smallest alpha value (alphas_[alphas_ > 0.].min() when fit_path=True) reached by the stepwise Lars-Lasso algorithm are typically in congruence with the solution of the coordinate descent Lasso estimator. As a consequence using LassoLarsCV only makes sense for problems where a sparse solution is expected and/or reached.
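The cv parameter accepts an iterable yielding (train, test) index splits. As a rough sketch of what such an iterable looks like, the generator below produces contiguous, unshuffled folds in plain Python. The name kfold_indices is hypothetical and the code is an illustration of the idea, not the library's KFold implementation.

```python
def kfold_indices(n_samples, n_splits=5):
    """Yield (train, test) index lists for contiguous, unshuffled folds."""
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        # the first `extra` folds receive one additional sample
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


splits = list(kfold_indices(7, n_splits=3))
fold_sizes = [len(test) for _, test in splits]
```

Every sample index appears in exactly one test fold, and each train list is the complement of its test list.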


  • coef_ : array-like of shape (n_features,) parameter vector (w in the formulation formula)

  • intercept_ : float independent term in decision function.

  • coef_path_ : array-like of shape (n_features, n_alphas) the varying values of the coefficients along the path

  • alpha_ : float the estimated regularization parameter alpha

  • alphas_ : array-like of shape (n_alphas,) the different values of alpha along the path

  • cv_alphas_ : array-like of shape (n_cv_alphas,) all the values of alpha along the path for the different folds

  • mse_path_ : array-like of shape (n_folds, n_cv_alphas) the mean square error on left-out for each fold along the path (alpha values given by cv_alphas)

  • n_iter_ : array-like or int the number of iterations run by Lars with the optimal alpha.


>>> from sklearn.linear_model import LassoLarsCV
>>> from sklearn.datasets import make_regression
>>> X, y = make_regression(noise=4.0, random_state=0)
>>> reg = LassoLarsCV(cv=5).fit(X, y)
>>> reg.score(X, y)
>>> reg.alpha_
>>> reg.predict(X[:1,])


The object solves the same problem as the LassoCV object. However, unlike LassoCV, it finds the relevant alpha values by itself. In general, because of this property, it will be more stable. However, it is more fragile on heavily multicollinear datasets.

It is more efficient than the LassoCV if only a small number of features are selected compared to the total number, for instance if there are very few samples compared to the number of features.

See also

lars_path, LassoLars, LarsCV, LassoCV


method fit
val fit :
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit the model using X, y as training data.


  • X : array-like of shape (n_samples, n_features) Training data.

  • y : array-like of shape (n_samples,) Target values.


  • self : object returns an instance of self.


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using the linear model.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape (n_samples,) Returns predicted values.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get a R^2 score of 0.0.


  • X : array-like of shape (n_samples, n_features) Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True values for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float R^2 of self.predict(X) wrt. y.


The R2 score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of :func:~sklearn.metrics.r2_score. This influences the score method of all the multioutput regressors (except for :class:~sklearn.multioutput.MultiOutputRegressor).


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.


  • **params : dict Estimator parameters.


  • self : object Estimator instance.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute coef_path_
val coef_path_ : t -> [>`ArrayLike] Np.Obj.t
val coef_path_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute alpha_
val alpha_ : t -> float
val alpha_opt : t -> (float) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute alphas_
val alphas_ : t -> [>`ArrayLike] Np.Obj.t
val alphas_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute cv_alphas_
val cv_alphas_ : t -> [>`ArrayLike] Np.Obj.t
val cv_alphas_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute mse_path_
val mse_path_ : t -> [>`ArrayLike] Np.Obj.t
val mse_path_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> Py.Object.t
val n_iter_opt : t -> (Py.Object.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​LassoLarsIC wraps Python class sklearn.linear_model.LassoLarsIC.

type t


constructor and attributes create
val create :
  ?criterion:[`Bic | `Aic] ->
  ?fit_intercept:bool ->
  ?verbose:int ->
  ?normalize:bool ->
  ?precompute:[`Arr of [>`ArrayLike] Np.Obj.t | `Auto | `Bool of bool] ->
  ?max_iter:int ->
  ?eps:float ->
  ?copy_X:bool ->
  ?positive:bool ->
  unit ->

Lasso model fit with Lars using BIC or AIC for model selection

The optimization objective for Lasso is::

(1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1

AIC is the Akaike information criterion and BIC is the Bayesian information criterion. Such criteria are useful to select the value of the regularization parameter by making a trade-off between the goodness of fit and the complexity of the model. A good model should explain the data well while being simple.

Read more in the :ref:User Guide <least_angle_regression>.


  • criterion : {'bic' , 'aic'}, default='aic' The type of criterion to use.

  • fit_intercept : bool, default=True whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (i.e. data is expected to be centered).

  • verbose : bool or int, default=False Sets the verbosity amount

  • normalize : bool, default=True This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use :class:sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.

  • precompute : bool, 'auto' or array-like, default='auto' Whether to use a precomputed Gram matrix to speed up calculations. If set to 'auto' let us decide. The Gram matrix can also be passed as argument.

  • max_iter : int, default=500 Maximum number of iterations to perform. Can be used for early stopping.

  • eps : float, optional The machine-precision regularization in the computation of the Cholesky diagonal factors. Increase this for very ill-conditioned systems. Unlike the tol parameter in some iterative optimization-based algorithms, this parameter does not control the tolerance of the optimization. By default, np.finfo(np.float).eps is used

  • copy_X : bool, default=True If True, X will be copied; else, it may be overwritten.

  • positive : bool, default=False Restrict coefficients to be >= 0. Be aware that you might want to remove fit_intercept which is set True by default. Under the positive restriction the model coefficients do not converge to the ordinary-least-squares solution for small values of alpha. Only coefficients up to the smallest alpha value (alphas_[alphas_ > 0.].min() when fit_path=True) reached by the stepwise Lars-Lasso algorithm are typically in congruence with the solution of the coordinate descent Lasso estimator. As a consequence using LassoLarsIC only makes sense for problems where a sparse solution is expected and/or reached.


  • coef_ : array-like of shape (n_features,) parameter vector (w in the formulation formula)

  • intercept_ : float independent term in decision function.

  • alpha_ : float the alpha parameter chosen by the information criterion

  • n_iter_ : int number of iterations run by lars_path to find the grid of alphas.

  • criterion_ : array-like of shape (n_alphas,) The value of the information criteria ('aic', 'bic') across all alphas. The alpha which has the smallest information criterion is chosen. This value is larger by a factor of n_samples compared to Eqns. 2.15 and 2.16 in (Zou et al, 2007).
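The selection rule for alpha_ (pick the alpha with the smallest criterion value) can be sketched as follows. The formula used here is an assumption: it takes the criterion to be RSS / sigma2 + K * df with K = 2 for AIC and K = log(n_samples) for BIC, which is n_samples times the form commonly given for Eqns. 2.15 and 2.16 of Zou et al. The path values, n_samples and sigma2 are made up for illustration, and the helper names are hypothetical.

```python
import math


def information_criterion(rss, df, n_samples, sigma2, criterion="aic"):
    """Assumed form: RSS / sigma2 + K * df, with K depending on the criterion."""
    if criterion == "aic":
        k = 2.0
    elif criterion == "bic":
        k = math.log(n_samples)
    else:
        raise ValueError("criterion should be either 'bic' or 'aic'")
    return rss / sigma2 + k * df


# Hypothetical path: (alpha, residual sum of squares, degrees of freedom)
path = [(1.0, 40.0, 0), (0.5, 12.0, 1), (0.1, 9.5, 2), (0.01, 9.4, 3)]
n_samples, sigma2 = 20, 1.0


def select_alpha(criterion):
    """Return the alpha whose criterion value is smallest along the path."""
    scores = [
        information_criterion(rss, df, n_samples, sigma2, criterion)
        for _, rss, df in path
    ]
    best = min(range(len(path)), key=scores.__getitem__)
    return path[best][0]


alpha_aic = select_alpha("aic")
alpha_bic = select_alpha("bic")
```

On this made-up path BIC, which penalizes each degree of freedom more heavily once n_samples exceeds about 7, settles on a sparser model (larger alpha) than AIC.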


>>> from sklearn import linear_model
>>> reg = linear_model.LassoLarsIC(criterion='bic')
>>> reg.fit([[-1, 1], [0, 0], [1, 1]], [-1.1111, 0, -1.1111])
>>> print(reg.coef_)
[ 0.  -1.11...]


The estimation of the number of degrees of freedom is given by:

'On the degrees of freedom of the lasso' Hui Zou, Trevor Hastie, and Robert Tibshirani Ann. Statist. Volume 35, Number 5 (2007), 2173-2192.



See also

lars_path, LassoLars, LassoLarsCV


method fit
val fit :
  ?copy_X:bool ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit the model using X, y as training data.


  • X : array-like of shape (n_samples, n_features) training data.

  • y : array-like of shape (n_samples,) target values. Will be cast to X's dtype if necessary

  • copy_X : bool, default=None If provided, this parameter will override the choice of copy_X made at instance creation. If True, X will be copied; else, it may be overwritten.


  • self : object returns an instance of self.


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using the linear model.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape (n_samples,) Returns predicted values.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get a R^2 score of 0.0.


  • X : array-like of shape (n_samples, n_features) Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True values for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float R^2 of self.predict(X) wrt. y.


The R2 score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of :func:~sklearn.metrics.r2_score. This influences the score method of all the multioutput regressors (except for :class:~sklearn.multioutput.MultiOutputRegressor).


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.


  • **params : dict Estimator parameters.


  • self : object Estimator instance.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute alpha_
val alpha_ : t -> float
val alpha_opt : t -> (float) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> int
val n_iter_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute criterion_
val criterion_ : t -> [>`ArrayLike] Np.Obj.t
val criterion_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​LinearRegression wraps Python class sklearn.linear_model.LinearRegression.

type t


constructor and attributes create
val create :
  ?fit_intercept:bool ->
  ?normalize:bool ->
  ?copy_X:bool ->
  ?n_jobs:int ->
  unit ->

Ordinary least squares Linear Regression.

LinearRegression fits a linear model with coefficients w = (w1, ..., wp) to minimize the residual sum of squares between the observed targets in the dataset, and the targets predicted by the linear approximation.


  • fit_intercept : bool, default=True Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations (i.e. data is expected to be centered).

  • normalize : bool, default=False This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use :class:sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.

  • copy_X : bool, default=True If True, X will be copied; else, it may be overwritten.

  • n_jobs : int, default=None The number of jobs to use for the computation. This will only provide a speedup for n_targets > 1 and sufficiently large problems. None means 1 unless in a :obj:joblib.parallel_backend context. -1 means using all processors. See :term:Glossary <n_jobs> for more details.
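The normalize option is described as subtracting the mean and dividing by the l2-norm. A minimal sketch of that column-wise transformation in plain Python follows. It assumes the norm is taken on the centred column and that no column is constant (a zero norm is not handled); the helper name normalize_columns is hypothetical.

```python
import math


def normalize_columns(X):
    """Centre each column, then divide it by the l2-norm of the centred column."""
    n_samples, n_features = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n_samples for j in range(n_features)]
    centred = [[row[j] - means[j] for j in range(n_features)] for row in X]
    # assumption: norms are computed after centring and are non-zero
    norms = [
        math.sqrt(sum(row[j] ** 2 for row in centred))
        for j in range(n_features)
    ]
    return [[row[j] / norms[j] for j in range(n_features)] for row in centred]


X = [[1.0, 1.0], [1.0, 2.0], [2.0, 2.0], [2.0, 3.0]]
Z = normalize_columns(X)
```

After the transformation each column of Z sums to zero and has unit l2-norm, which differs from standardization to unit variance as performed by StandardScaler.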


  • coef_ : array of shape (n_features, ) or (n_targets, n_features) Estimated coefficients for the linear regression problem. If multiple targets are passed during the fit (y 2D), this is a 2D array of shape (n_targets, n_features), while if only one target is passed, this is a 1D array of length n_features.

  • rank_ : int Rank of matrix X. Only available when X is dense.

  • singular_ : array of shape (min(X, y),) Singular values of X. Only available when X is dense.

  • intercept_ : float or array of shape (n_targets,) Independent term in the linear model. Set to 0.0 if fit_intercept = False.

See Also

  • sklearn.linear_model.Ridge : Ridge regression addresses some of the problems of Ordinary Least Squares by imposing a penalty on the size of the coefficients with l2 regularization.

  • sklearn.linear_model.Lasso : The Lasso is a linear model that estimates sparse coefficients with l1 regularization.

  • sklearn.linear_model.ElasticNet : Elastic-Net is a linear regression model trained with both l1 and l2 -norm regularization of the coefficients.


From the implementation point of view, this is just plain Ordinary Least Squares (scipy.linalg.lstsq) wrapped as a predictor object.


>>> import numpy as np
>>> from sklearn.linear_model import LinearRegression
>>> X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
>>> # y = 1 * x_0 + 2 * x_1 + 3
>>> y = np.dot(X, np.array([1, 2])) + 3
>>> reg = LinearRegression().fit(X, y)
>>> reg.score(X, y)
>>> reg.coef_
array([1., 2.])
>>> reg.intercept_
>>> reg.predict(np.array([[3, 5]]))
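The numbers in the example above can be verified without the library. The sketch below, in plain Python, applies the linear model y = X w + b with the coefficients and intercept from the example and checks that they give a zero residual sum of squares on the training data; the helper names predict and rss are illustrative.

```python
X = [[1, 1], [1, 2], [2, 2], [2, 3]]
y = [6.0, 8.0, 9.0, 11.0]   # y = 1 * x_0 + 2 * x_1 + 3


def predict(rows, coef, intercept):
    """Linear model prediction: dot(row, coef) + intercept for each row."""
    return [sum(x * w for x, w in zip(row, coef)) + intercept for row in rows]


def rss(rows, targets, coef, intercept):
    """Residual sum of squares between targets and predictions."""
    preds = predict(rows, coef, intercept)
    return sum((t - p) ** 2 for t, p in zip(targets, preds))


exact = rss(X, y, [1.0, 2.0], 3.0)       # the generating coefficients
perturbed = rss(X, y, [1.0, 2.5], 3.0)   # any other choice fits worse
new_point = predict([[3, 5]], [1.0, 2.0], 3.0)
```

Because the data were generated without noise, the generating coefficients reach the minimum possible residual of zero, and the prediction for [3, 5] is 3 + 10 + 3 = 16.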


method fit
val fit :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit linear model.


  • X : {array-like, sparse matrix} of shape (n_samples, n_features) Training data

  • y : array-like of shape (n_samples,) or (n_samples, n_targets) Target values. Will be cast to X's dtype if necessary

  • sample_weight : array-like of shape (n_samples,), default=None Individual weights for each sample

    .. versionadded:: 0.17 parameter sample_weight support to LinearRegression.


  • self : returns an instance of self.


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using the linear model.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape (n_samples,) Returns predicted values.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get a R^2 score of 0.0.


  • X : array-like of shape (n_samples, n_features) Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True values for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float R^2 of self.predict(X) wrt. y.


The R2 score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of :func:~sklearn.metrics.r2_score. This influences the score method of all the multioutput regressors (except for :class:~sklearn.multioutput.MultiOutputRegressor).


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.


  • **params : dict Estimator parameters.


  • self : object Estimator instance.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute rank_
val rank_ : t -> int
val rank_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute singular_
val singular_ : t -> [>`ArrayLike] Np.Obj.t
val singular_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​Log wraps Python class sklearn.linear_model.Log.

type t


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​LogisticRegression wraps Python class sklearn.linear_model.LogisticRegression.

type t


constructor and attributes create
val create :
  ?penalty:[`L1 | `L2 | `Elasticnet | `None] ->
  ?dual:bool ->
  ?tol:float ->
  ?c:float ->
  ?fit_intercept:bool ->
  ?intercept_scaling:float ->
  ?class_weight:[`Balanced | `DictIntToFloat of (int * float) list] ->
  ?random_state:int ->
  ?solver:[`Newton_cg | `Lbfgs | `Liblinear | `Sag | `Saga] ->
  ?max_iter:int ->
  ?multi_class:[`Auto | `Ovr | `Multinomial] ->
  ?verbose:int ->
  ?warm_start:bool ->
  ?n_jobs:int ->
  ?l1_ratio:float ->
  unit ->

Logistic Regression (aka logit, MaxEnt) classifier.

In the multiclass case, the training algorithm uses the one-vs-rest (OvR) scheme if the 'multi_class' option is set to 'ovr', and uses the cross-entropy loss if the 'multi_class' option is set to 'multinomial'. (Currently the 'multinomial' option is supported only by the 'lbfgs', 'sag', 'saga' and 'newton-cg' solvers.)

This class implements regularized logistic regression using the 'liblinear' library, 'newton-cg', 'sag', 'saga' and 'lbfgs' solvers. Note that regularization is applied by default. It can handle both dense and sparse input. Use C-ordered arrays or CSR matrices containing 64-bit floats for optimal performance; any other input format will be converted (and copied).

The 'newton-cg', 'sag', and 'lbfgs' solvers support only L2 regularization with primal formulation, or no regularization. The 'liblinear' solver supports both L1 and L2 regularization, with a dual formulation only for the L2 penalty. The Elastic-Net regularization is only supported by the 'saga' solver.

Read more in the :ref:User Guide <logistic_regression>.


  • penalty : {'l1', 'l2', 'elasticnet', 'none'}, default='l2' Used to specify the norm used in the penalization. The 'newton-cg', 'sag' and 'lbfgs' solvers support only l2 penalties. 'elasticnet' is only supported by the 'saga' solver. If 'none' (not supported by the liblinear solver), no regularization is applied.

    .. versionadded:: 0.19 l1 penalty with SAGA solver (allowing 'multinomial' + L1)

  • dual : bool, default=False Dual or primal formulation. Dual formulation is only implemented for l2 penalty with liblinear solver. Prefer dual=False when n_samples > n_features.

  • tol : float, default=1e-4 Tolerance for stopping criteria.

  • C : float, default=1.0 Inverse of regularization strength; must be a positive float. Like in support vector machines, smaller values specify stronger regularization.

  • fit_intercept : bool, default=True Specifies if a constant (a.k.a. bias or intercept) should be added to the decision function.

  • intercept_scaling : float, default=1 Useful only when the solver 'liblinear' is used and self.fit_intercept is set to True. In this case, x becomes [x, self.intercept_scaling], i.e. a 'synthetic' feature with constant value equal to intercept_scaling is appended to the instance vector. The intercept becomes intercept_scaling * synthetic_feature_weight.

    Note! the synthetic feature weight is subject to l1/l2 regularization as all other features. To lessen the effect of regularization on synthetic feature weight (and therefore on the intercept) intercept_scaling has to be increased.

  • class_weight : dict or 'balanced', default=None Weights associated with classes in the form {class_label: weight}. If not given, all classes are supposed to have weight one.

    The 'balanced' mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y)).

    Note that these weights will be multiplied with sample_weight (passed through the fit method) if sample_weight is specified.

    .. versionadded:: 0.17 class_weight='balanced'

  • random_state : int, RandomState instance, default=None Used when solver == 'sag', 'saga' or 'liblinear' to shuffle the data. See :term:Glossary <random_state> for details.

  • solver : {'newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga'}, default='lbfgs'

    Algorithm to use in the optimization problem.

    • For small datasets, 'liblinear' is a good choice, whereas 'sag' and 'saga' are faster for large ones.
    • For multiclass problems, only 'newton-cg', 'sag', 'saga' and 'lbfgs' handle multinomial loss; 'liblinear' is limited to one-versus-rest schemes.
    • 'newton-cg', 'lbfgs', 'sag' and 'saga' handle L2 or no penalty
    • 'liblinear' and 'saga' also handle L1 penalty
    • 'saga' also supports 'elasticnet' penalty
    • 'liblinear' does not support setting penalty='none'

    Note that 'sag' and 'saga' fast convergence is only guaranteed on features with approximately the same scale. You can preprocess the data with a scaler from sklearn.preprocessing.

    .. versionadded:: 0.17 Stochastic Average Gradient descent solver.

    .. versionadded:: 0.19 SAGA solver.

    .. versionchanged:: 0.22 The default solver changed from 'liblinear' to 'lbfgs' in 0.22.

  • max_iter : int, default=100 Maximum number of iterations taken for the solvers to converge.

  • multi_class : {'auto', 'ovr', 'multinomial'}, default='auto' If the option chosen is 'ovr', then a binary problem is fit for each label. For 'multinomial' the loss minimised is the multinomial loss fit across the entire probability distribution, even when the data is binary. 'multinomial' is unavailable when solver='liblinear'. 'auto' selects 'ovr' if the data is binary, or if solver='liblinear', and otherwise selects 'multinomial'.

    .. versionadded:: 0.18 Stochastic Average Gradient descent solver for 'multinomial' case.

    .. versionchanged:: 0.22 Default changed from 'ovr' to 'auto' in 0.22.

  • verbose : int, default=0 For the liblinear and lbfgs solvers set verbose to any positive number for verbosity.

  • warm_start : bool, default=False When set to True, reuse the solution of the previous call to fit as initialization, otherwise, just erase the previous solution. Useless for liblinear solver. See :term:the Glossary <warm_start>.

    .. versionadded:: 0.17 warm_start to support lbfgs, newton-cg, sag, saga solvers.

  • n_jobs : int, default=None Number of CPU cores used when parallelizing over classes if multi_class='ovr'. This parameter is ignored when the solver is set to 'liblinear' regardless of whether 'multi_class' is specified or not. None means 1 unless in a :obj:joblib.parallel_backend context. -1 means using all processors. See :term:Glossary <n_jobs> for more details.

  • l1_ratio : float, default=None The Elastic-Net mixing parameter, with 0 <= l1_ratio <= 1. Only used if penalty='elasticnet'. Setting l1_ratio=0 is equivalent to using penalty='l2', while setting l1_ratio=1 is equivalent to using penalty='l1'. For 0 < l1_ratio < 1, the penalty is a combination of L1 and L2.
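
The 'balanced' formula given for class_weight above can be checked by hand. The following is a minimal sketch in plain Python, not the library's implementation; the helper name balanced_class_weights is hypothetical, and a Counter stands in for np.bincount so that labels need not be integers. It also shows how a per-class weight combines with a per-sample sample_weight by multiplication, as the parameter description states.

```python
from collections import Counter

def balanced_class_weights(y):
    """Return {label: n_samples / (n_classes * count(label))}.

    Hypothetical helper mirroring the documented 'balanced' formula.
    """
    counts = Counter(y)
    n_samples = len(y)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}

y = [0, 0, 0, 1]
weights = balanced_class_weights(y)
# class 0 -> 4 / (2 * 3), class 1 -> 4 / (2 * 1): the rare class weighs more

# Class weights are multiplied with the per-sample weights passed to fit.
sample_weight = [1.0, 1.0, 2.0, 1.0]
effective = [weights[label] * sw for label, sw in zip(y, sample_weight)]
```

With this scheme every class contributes the same total weight when sample_weight is uniform, which is why the rare class receives the larger per-sample weight.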


  • classes_ : ndarray of shape (n_classes, ) A list of class labels known to the classifier.

  • coef_ : ndarray of shape (1, n_features) or (n_classes, n_features) Coefficient of the features in the decision function.

    coef_ is of shape (1, n_features) when the given problem is binary. In particular, when multi_class='multinomial', coef_ corresponds to outcome 1 (True) and -coef_ corresponds to outcome 0 (False).

  • intercept_ : ndarray of shape (1,) or (n_classes,) Intercept (a.k.a. bias) added to the decision function.

    If fit_intercept is set to False, the intercept is set to zero. intercept_ is of shape (1,) when the given problem is binary. In particular, when multi_class='multinomial', intercept_ corresponds to outcome 1 (True) and -intercept_ corresponds to outcome 0 (False).

  • n_iter_ : ndarray of shape (n_classes,) or (1, ) Actual number of iterations for all classes. If binary or multinomial, it returns only 1 element. For liblinear solver, only the maximum number of iterations across all classes is given.

    .. versionchanged:: 0.20

    In SciPy <= 1.0.0 the number of lbfgs iterations may exceed
    ``max_iter``. ``n_iter_`` will now report at most ``max_iter``.

See Also

  • SGDClassifier : Incrementally trained logistic regression (when given the parameter loss='log').

  • LogisticRegressionCV : Logistic regression with built-in cross validation.


The underlying C implementation uses a random number generator to select features when fitting the model. It is thus not uncommon to get slightly different results for the same input data. If that happens, try a smaller tol parameter.

Predict output may not match that of standalone liblinear in certain cases. See :ref:differences from liblinear <liblinear_differences> in the narrative documentation.


L-BFGS-B -- Software for Large-scale Bound-constrained Optimization Ciyou Zhu, Richard Byrd, Jorge Nocedal and Jose Luis Morales.


LIBLINEAR -- A Library for Large Linear Classification


SAG -- Mark Schmidt, Nicolas Le Roux, and Francis Bach Minimizing Finite Sums with the Stochastic Average Gradient


SAGA -- Defazio, A., Bach F. & Lacoste-Julien S. (2014). SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives


Hsiang-Fu Yu, Fang-Lan Huang, Chih-Jen Lin (2011). Dual coordinate descent methods for logistic regression and maximum entropy models. Machine Learning 85(1-2):41-75.



>>> from sklearn.datasets import load_iris
>>> from sklearn.linear_model import LogisticRegression
>>> X, y = load_iris(return_X_y=True)
>>> clf = LogisticRegression(random_state=0).fit(X, y)
>>> clf.predict(X[:2, :])
array([0, 0])
>>> clf.predict_proba(X[:2, :])
array([[9.8...e-01, 1.8...e-02, 1.4...e-08],
       [9.7...e-01, 2.8...e-02, ...e-08]])
>>> clf.score(X, y)


method decision_function
val decision_function :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict confidence scores for samples.

The confidence score for a sample is the signed distance of that sample to the hyperplane.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


array, shape=(n_samples,) if n_classes == 2 else (n_samples, n_classes) Confidence scores per (sample, class) combination. In the binary case, confidence score for self.classes_[1] where >0 means this class would be predicted.
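
To make the binary rule above concrete, here is a small sketch in plain Python. It assumes the confidence score is the raw linear score w·x + b, which is proportional to the signed distance to the hyperplane; the function names and the toy coefficients are illustrative only.

```python
def decision_function(coef, intercept, x):
    """Raw linear score w.x + b for one sample (illustrative helper)."""
    return sum(w * xi for w, xi in zip(coef, x)) + intercept

def predict_binary(classes, score):
    """A score > 0 selects classes[1]; otherwise classes[0] is chosen."""
    return classes[1] if score > 0 else classes[0]

classes = ["neg", "pos"]          # plays the role of classes_
coef, intercept = [2.0, -1.0], 0.5

score_a = decision_function(coef, intercept, [1.0, 1.0])  # 2 - 1 + 0.5
score_b = decision_function(coef, intercept, [0.0, 2.0])  # -2 + 0.5
label_a = predict_binary(classes, score_a)
label_b = predict_binary(classes, score_b)
```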


method densify
val densify :
  [> tag] Obj.t ->

Convert coefficient matrix to dense array format.

Converts the coef_ member (back) to a numpy.ndarray. This is the default format of coef_ and is required for fitting, so calling this method is only required on models that have previously been sparsified; otherwise, it is a no-op.


self Fitted estimator.


method fit
val fit :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit the model according to the given training data.


  • X : {array-like, sparse matrix} of shape (n_samples, n_features) Training vector, where n_samples is the number of samples and n_features is the number of features.

  • y : array-like of shape (n_samples,) Target vector relative to X.

  • sample_weight : array-like of shape (n_samples,), default=None Array of weights that are assigned to individual samples. If not provided, then each sample is given unit weight.

    .. versionadded:: 0.17 sample_weight support to LogisticRegression.


self Fitted estimator.


The SAGA solver supports both float64 and float32 bit arrays.


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict class labels for samples in X.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape [n_samples] Predicted class label per sample.


method predict_log_proba
val predict_log_proba :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict logarithm of probability estimates.

The returned estimates for all classes are ordered by the label of classes.


  • X : array-like of shape (n_samples, n_features) Vector to be scored, where n_samples is the number of samples and n_features is the number of features.


  • T : array-like of shape (n_samples, n_classes) Returns the log-probability of the sample for each class in the model, where classes are ordered as they are in self.classes_.


method predict_proba
val predict_proba :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Probability estimates.

The returned estimates for all classes are ordered by the label of classes.

For a multi_class problem, if multi_class is set to 'multinomial', the softmax function is used to find the predicted probability of each class. Otherwise a one-vs-rest approach is used, i.e. the probability of each class is calculated with the logistic function, assuming that class to be positive, and these values are then normalized across all the classes.


  • X : array-like of shape (n_samples, n_features) Vector to be scored, where n_samples is the number of samples and n_features is the number of features.


  • T : array-like of shape (n_samples, n_classes) Returns the probability of the sample for each class in the model, where classes are ordered as they are in self.classes_.
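
The two ways of turning per-class scores into probabilities described above can be compared side by side. This is a sketch in plain Python under the assumption that the inputs are the per-class decision scores for a single sample; softmax and ovr_proba are illustrative names, not library functions.

```python
import math

def softmax(scores):
    """Multinomial case: exponentiate and normalize (shifted for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def ovr_proba(scores):
    """One-vs-rest case: logistic function per class, then normalize."""
    sig = [1.0 / (1.0 + math.exp(-s)) for s in scores]
    total = sum(sig)
    return [p / total for p in sig]

scores = [2.0, 0.0, -1.0]          # one sample, three classes
p_multinomial = softmax(scores)
p_ovr = ovr_proba(scores)
```

Both results sum to one and rank the classes identically, but the values differ: the softmax concentrates more mass on the top-scoring class than the normalized one-vs-rest probabilities do for these scores.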


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.


  • X : array-like of shape (n_samples, n_features) Test samples.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True labels for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float Mean accuracy of self.predict(X) wrt. y.
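
As an illustration of what the returned value measures, the sketch below computes mean accuracy in plain Python, with optional sample weights treated as a weighted average of per-sample correctness. The helper name mean_accuracy and the toy labels are assumptions made for the example.

```python
def mean_accuracy(y_true, y_pred, sample_weight=None):
    """Fraction of (optionally weighted) samples predicted correctly."""
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)
    correct = sum(w for t, p, w in zip(y_true, y_pred, sample_weight) if t == p)
    return correct / sum(sample_weight)

y_true = [0, 1, 2, 2]
y_pred = [0, 1, 1, 2]              # third sample is misclassified

unweighted = mean_accuracy(y_true, y_pred)
# Doubling the weight of the misclassified sample lowers the score.
weighted = mean_accuracy(y_true, y_pred, sample_weight=[1.0, 1.0, 2.0, 1.0])
```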


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.


  • **params : dict Estimator parameters.


  • self : object Estimator instance.
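
The <component>__<parameter> naming convention can be illustrated without any estimator at all. The sketch below, in plain Python, only shows how such keys separate into parameters of the object itself and parameters routed to a named component; split_nested_params and the component name clf are hypothetical.

```python
def split_nested_params(params):
    """Separate plain keys from '<component>__<parameter>' keys."""
    own, nested = {}, {}
    for key, value in params.items():
        # Split on the first double underscore only, so deeper nesting
        # such as 'a__b__c' is passed down to component 'a' as 'b__c'.
        name, delim, sub_key = key.partition("__")
        if delim:
            nested.setdefault(name, {})[sub_key] = value
        else:
            own[name] = value
    return own, nested

own, nested = split_nested_params(
    {"verbose": 1, "clf__C": 0.5, "clf__max_iter": 200}
)
```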


method sparsify
val sparsify :
  [> tag] Obj.t ->

Convert coefficient matrix to sparse format.

Converts the coef_ member to a scipy.sparse matrix, which for L1-regularized models can be much more memory- and storage-efficient than the usual numpy.ndarray representation.

The intercept_ member is not converted.


self Fitted estimator.


For non-sparse models, i.e. when there are not many zeros in coef_, this may actually increase memory usage, so use this method with care. A rule of thumb is that the number of zero elements, which can be computed with (coef_ == 0).sum(), must be more than 50% for this to provide significant benefits.

After calling this method, further fitting with the partial_fit method (if any) will not work until you call densify.
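
The rule of thumb above amounts to checking the fraction of zero coefficients before converting. A minimal sketch in plain Python, using a nested list in place of the coef_ array; zero_fraction and the toy coefficients are illustrative.

```python
def zero_fraction(coef):
    """Share of exactly-zero entries in a 2-D coefficient table."""
    values = [v for row in coef for v in row]
    return sum(1 for v in values if v == 0) / len(values)

# Toy stand-in for coef_ with shape (n_classes, n_features) = (2, 4).
coef = [[0.0, 0.0, 1.5, 0.0],
        [0.0, -0.2, 0.0, 0.0]]

frac = zero_fraction(coef)         # 6 zeros out of 8 entries
worth_sparsifying = frac > 0.5     # rule of thumb: more than 50% zeros
```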


attribute classes_
val classes_ : t -> [>`ArrayLike] Np.Obj.t
val classes_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> [>`ArrayLike] Np.Obj.t
val n_iter_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​LogisticRegressionCV wraps Python class sklearn.linear_model.LogisticRegressionCV.

type t


constructor and attributes create
val create :
  ?cs:[`Fs of float list | `I of int] ->
  ?fit_intercept:bool ->
  ?cv:[`BaseCrossValidator of [>`BaseCrossValidator] Np.Obj.t | `I of int] ->
  ?dual:bool ->
  ?penalty:[`L1 | `L2 | `Elasticnet] ->
  ?scoring:[`Roc_auc_ovo_weighted | `Callable of Py.Object.t | `Precision | `Roc_auc_ovr | `Recall_micro | `F1_micro | `Precision_micro | `Fowlkes_mallows_score | `F1 | `Jaccard | `Max_error | `Precision_weighted | `Precision_macro | `Neg_brier_score | `Roc_auc_ovo | `F1_weighted | `Average_precision | `Adjusted_mutual_info_score | `Neg_mean_poisson_deviance | `Neg_median_absolute_error | `Jaccard_macro | `Jaccard_micro | `Neg_log_loss | `Recall_samples | `Explained_variance | `Balanced_accuracy | `Normalized_mutual_info_score | `F1_samples | `Completeness_score | `Mutual_info_score | `Accuracy | `Neg_mean_squared_log_error | `Roc_auc | `Precision_samples | `V_measure_score | `Neg_mean_gamma_deviance | `Jaccard_weighted | `R2 | `Recall_weighted | `Recall_macro | `Roc_auc_ovr_weighted | `Homogeneity_score | `Neg_mean_squared_error | `Neg_root_mean_squared_error | `Recall | `Neg_mean_absolute_error | `Adjusted_rand_score | `Jaccard_samples | `F1_macro] ->
  ?solver:[`Newton_cg | `Lbfgs | `Liblinear | `Sag | `Saga] ->
  ?tol:float ->
  ?max_iter:int ->
  ?class_weight:[`Balanced | `DictIntToFloat of (int * float) list] ->
  ?n_jobs:int ->
  ?verbose:int ->
  ?refit:bool ->
  ?intercept_scaling:float ->
  ?multi_class:[`Ovr | `Multinomial | `T_auto of Py.Object.t] ->
  ?random_state:int ->
  ?l1_ratios:float list ->
  unit ->

Logistic Regression CV (aka logit, MaxEnt) classifier.

See glossary entry for :term:cross-validation estimator.

This class implements logistic regression using the liblinear, newton-cg, sag or lbfgs optimizer. The newton-cg, sag and lbfgs solvers support only L2 regularization with primal formulation. The liblinear solver supports both L1 and L2 regularization, with a dual formulation only for the L2 penalty. Elastic-Net penalty is only supported by the saga solver.

For the grid of Cs values and l1_ratios values, the best hyperparameter is selected by the cross-validator :class:~sklearn.model_selection.StratifiedKFold, but it can be changed using the :term:cv parameter. The 'newton-cg', 'sag', 'saga' and 'lbfgs' solvers can warm-start the coefficients (see :term:Glossary<warm_start>).

Read more in the :ref:User Guide <logistic_regression>.


  • Cs : int or list of floats, default=10 Each of the values in Cs describes the inverse of regularization strength. If Cs is an int, then a grid of Cs values is chosen on a logarithmic scale between 1e-4 and 1e4. Like in support vector machines, smaller values specify stronger regularization.

  • fit_intercept : bool, default=True Specifies if a constant (a.k.a. bias or intercept) should be added to the decision function.

  • cv : int or cross-validation generator, default=None The default cross-validation generator used is Stratified K-Folds. If an integer is provided, then it is the number of folds used. See the :mod:sklearn.model_selection module for the list of possible cross-validation objects.

    .. versionchanged:: 0.22 cv default value if None changed from 3-fold to 5-fold.

  • dual : bool, default=False Dual or primal formulation. Dual formulation is only implemented for l2 penalty with liblinear solver. Prefer dual=False when n_samples > n_features.

  • penalty : {'l1', 'l2', 'elasticnet'}, default='l2' Used to specify the norm used in the penalization. The 'newton-cg', 'sag' and 'lbfgs' solvers support only l2 penalties. 'elasticnet' is only supported by the 'saga' solver.

  • scoring : str or callable, default=None A string (see model evaluation documentation) or a scorer callable object / function with signature scorer(estimator, X, y). For a list of scoring functions that can be used, look at :mod:sklearn.metrics. The default scoring option used is 'accuracy'.

  • solver : {'newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga'}, default='lbfgs'

    Algorithm to use in the optimization problem.

    • For small datasets, 'liblinear' is a good choice, whereas 'sag' and 'saga' are faster for large ones.
    • For multiclass problems, only 'newton-cg', 'sag', 'saga' and 'lbfgs' handle multinomial loss; 'liblinear' is limited to one-versus-rest schemes.
    • 'newton-cg', 'lbfgs' and 'sag' only handle L2 penalty, whereas 'liblinear' and 'saga' handle L1 penalty.
    • 'liblinear' might be slower in LogisticRegressionCV because it does not handle warm-starting.

    Note that 'sag' and 'saga' fast convergence is only guaranteed on features with approximately the same scale. You can preprocess the data with a scaler from sklearn.preprocessing.

    .. versionadded:: 0.17 Stochastic Average Gradient descent solver.

    .. versionadded:: 0.19 SAGA solver.

  • tol : float, default=1e-4 Tolerance for stopping criteria.

  • max_iter : int, default=100 Maximum number of iterations of the optimization algorithm.

  • class_weight : dict or 'balanced', default=None Weights associated with classes in the form {class_label: weight}. If not given, all classes are supposed to have weight one.

    The 'balanced' mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y)).

    Note that these weights will be multiplied with sample_weight (passed through the fit method) if sample_weight is specified.

    .. versionadded:: 0.17 class_weight == 'balanced'

  • n_jobs : int, default=None Number of CPU cores used during the cross-validation loop. None means 1 unless in a :obj:joblib.parallel_backend context. -1 means using all processors. See :term:Glossary <n_jobs> for more details.

  • verbose : int, default=0 For the 'liblinear', 'sag' and 'lbfgs' solvers set verbose to any positive number for verbosity.

  • refit : bool, default=True If set to True, the scores are averaged across all folds, the coefs and the C that correspond to the best score are taken, and a final refit is done using these parameters. Otherwise the coefs, intercepts and C that correspond to the best scores across folds are averaged.

  • intercept_scaling : float, default=1 Useful only when the solver 'liblinear' is used and self.fit_intercept is set to True. In this case, x becomes [x, self.intercept_scaling], i.e. a 'synthetic' feature with constant value equal to intercept_scaling is appended to the instance vector. The intercept becomes intercept_scaling * synthetic_feature_weight.

    Note! the synthetic feature weight is subject to l1/l2 regularization as all other features. To lessen the effect of regularization on synthetic feature weight (and therefore on the intercept) intercept_scaling has to be increased.

  • multi_class : {'auto', 'ovr', 'multinomial'}, default='auto' If the option chosen is 'ovr', then a binary problem is fit for each label. For 'multinomial' the loss minimised is the multinomial loss fit across the entire probability distribution, even when the data is binary. 'multinomial' is unavailable when solver='liblinear'. 'auto' selects 'ovr' if the data is binary, or if solver='liblinear', and otherwise selects 'multinomial'.

    .. versionadded:: 0.18 Stochastic Average Gradient descent solver for 'multinomial' case.

    .. versionchanged:: 0.22 Default changed from 'ovr' to 'auto' in 0.22.

  • random_state : int, RandomState instance, default=None Used when solver='sag', 'saga' or 'liblinear' to shuffle the data. Note that this only applies to the solver and not the cross-validation generator. See :term:Glossary <random_state> for details.

  • l1_ratios : list of float, default=None The list of Elastic-Net mixing parameters, with 0 <= l1_ratio <= 1. Only used if penalty='elasticnet'. A value of 0 is equivalent to using penalty='l2', while 1 is equivalent to using penalty='l1'. For 0 < l1_ratio < 1, the penalty is a combination of L1 and L2.
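
When Cs is given as an integer, the description above says the candidate values are spread on a logarithmic scale between 1e-4 and 1e4. The following plain-Python sketch builds such a grid with evenly spaced exponents; the helper name log_grid is hypothetical, and the exact values produced by the library may differ in floating-point detail.

```python
def log_grid(n_values, low_exp=-4.0, high_exp=4.0):
    """Return n_values points spaced evenly in log10 between the bounds."""
    if n_values == 1:
        return [10.0 ** low_exp]
    step = (high_exp - low_exp) / (n_values - 1)
    return [10.0 ** (low_exp + i * step) for i in range(n_values)]

cs = log_grid(5)   # exponents -4, -2, 0, 2, 4
```

Smaller entries of the grid correspond to stronger regularization, so the first candidates are the most heavily penalized models.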


  • classes_ : ndarray of shape (n_classes, ) A list of class labels known to the classifier.

  • coef_ : ndarray of shape (1, n_features) or (n_classes, n_features) Coefficient of the features in the decision function.

    coef_ is of shape (1, n_features) when the given problem is binary.

  • intercept_ : ndarray of shape (1,) or (n_classes,) Intercept (a.k.a. bias) added to the decision function.

    If fit_intercept is set to False, the intercept is set to zero. intercept_ is of shape (1,) when the problem is binary.

  • Cs_ : ndarray of shape (n_cs) Array of C i.e. inverse of regularization parameter values used for cross-validation.

  • l1_ratios_ : ndarray of shape (n_l1_ratios) Array of l1_ratios used for cross-validation. If no l1_ratio is used (i.e. penalty is not 'elasticnet'), this is set to [None]

  • coefs_paths_ : ndarray of shape (n_folds, n_cs, n_features) or (n_folds, n_cs, n_features + 1) dict with classes as the keys, and the path of coefficients obtained during cross-validating across each fold and then across each Cs after doing an OvR for the corresponding class as values. If the 'multi_class' option is set to 'multinomial', then the coefs_paths are the coefficients corresponding to each class. Each dict value has shape (n_folds, n_cs, n_features) or (n_folds, n_cs, n_features + 1) depending on whether the intercept is fit or not. If penalty='elasticnet', the shape is (n_folds, n_cs, n_l1_ratios_, n_features) or (n_folds, n_cs, n_l1_ratios_, n_features + 1).

  • scores_ : dict dict with classes as the keys, and the values as the grid of scores obtained during cross-validating each fold, after doing an OvR for the corresponding class. If the 'multi_class' option given is 'multinomial' then the same scores are repeated across all classes, since this is the multinomial class. Each dict value has shape (n_folds, n_cs) or (n_folds, n_cs, n_l1_ratios) if penalty='elasticnet'.

  • C_ : ndarray of shape (n_classes,) or (n_classes - 1,) Array of C that maps to the best scores across every class. If refit is set to False, then for each class, the best C is the average of the C's that correspond to the best scores for each fold. C_ is of shape (n_classes - 1,) when the problem is binary.

  • l1_ratio_ : ndarray of shape (n_classes,) or (n_classes - 1,) Array of l1_ratio that maps to the best scores across every class. If refit is set to False, then for each class, the best l1_ratio is the average of the l1_ratio's that correspond to the best scores for each fold. l1_ratio_ is of shape (n_classes - 1,) when the problem is binary.

  • n_iter_ : ndarray of shape (n_classes, n_folds, n_cs) or (1, n_folds, n_cs) Actual number of iterations for all classes, folds and Cs. In the binary or multinomial cases, the first dimension is equal to 1. If penalty='elasticnet', the shape is (n_classes, n_folds, n_cs, n_l1_ratios) or (1, n_folds, n_cs, n_l1_ratios).
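
The two selection rules described for refit and C_ can give different answers on the same score grid. The sketch below, in plain Python, assumes a single class and a score table of shape (n_folds, n_cs); the function names and the numbers are made up for illustration.

```python
def best_c_refit(scores, cs):
    """refit=True rule: average scores over folds, then take the best C."""
    n_folds = len(scores)
    mean_scores = [sum(fold[i] for fold in scores) / n_folds
                   for i in range(len(cs))]
    return cs[max(range(len(cs)), key=lambda i: mean_scores[i])]

def average_best_c(scores, cs):
    """refit=False rule: take the best C in each fold, then average them."""
    best = [cs[max(range(len(cs)), key=lambda i: fold[i])] for fold in scores]
    return sum(best) / len(best)

cs = [0.1, 1.0, 10.0]
scores = [[0.80, 0.90, 0.85],      # fold 1: best C is 1.0
          [0.70, 0.75, 0.95],      # fold 2: best C is 10.0
          [0.60, 0.88, 0.80]]      # fold 3: best C is 1.0

c_refit = best_c_refit(scores, cs)
c_avg = average_best_c(scores, cs)
```

Here the fold-averaged scores favor C = 10.0, while averaging the per-fold winners yields 4.0, a value that is not even on the grid.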


>>> from sklearn.datasets import load_iris
>>> from sklearn.linear_model import LogisticRegressionCV
>>> X, y = load_iris(return_X_y=True)
>>> clf = LogisticRegressionCV(cv=5, random_state=0).fit(X, y)
>>> clf.predict(X[:2, :])
array([0, 0])
>>> clf.predict_proba(X[:2, :]).shape
(2, 3)
>>> clf.score(X, y)




method decision_function
val decision_function :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict confidence scores for samples.

The confidence score for a sample is the signed distance of that sample to the hyperplane.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


array, shape=(n_samples,) if n_classes == 2 else (n_samples, n_classes) Confidence scores per (sample, class) combination. In the binary case, confidence score for self.classes_[1] where >0 means this class would be predicted.


method densify
val densify :
  [> tag] Obj.t ->

Convert coefficient matrix to dense array format.

Converts the coef_ member (back) to a numpy.ndarray. This is the default format of coef_ and is required for fitting, so calling this method is only required on models that have previously been sparsified; otherwise, it is a no-op.


self Fitted estimator.


method fit
val fit :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit the model according to the given training data.


  • X : {array-like, sparse matrix} of shape (n_samples, n_features) Training vector, where n_samples is the number of samples and n_features is the number of features.

  • y : array-like of shape (n_samples,) Target vector relative to X.

  • sample_weight : array-like of shape (n_samples,), default=None Array of weights that are assigned to individual samples. If not provided, then each sample is given unit weight.


  • self : object


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict class labels for samples in X.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape [n_samples] Predicted class label per sample.
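
In the multiclass case, a predicted label can be read off the per-class confidence scores by taking the class with the highest score in each row. The sketch below is plain Python and assumes scores laid out as (n_samples, n_classes), in the same order as classes_; predict_from_scores is an illustrative name.

```python
def predict_from_scores(classes, score_rows):
    """Pick, for each sample, the class whose score is largest."""
    return [classes[max(range(len(row)), key=lambda j: row[j])]
            for row in score_rows]

classes = ["a", "b", "c"]          # plays the role of classes_
score_rows = [[0.1, 2.0, -1.0],    # highest score in column 1
              [3.0, 0.0, 1.0]]     # highest score in column 0

labels = predict_from_scores(classes, score_rows)
```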


method predict_log_proba
val predict_log_proba :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict logarithm of probability estimates.

The returned estimates for all classes are ordered by the label of classes.


  • X : array-like of shape (n_samples, n_features) Vector to be scored, where n_samples is the number of samples and n_features is the number of features.


  • T : array-like of shape (n_samples, n_classes) Returns the log-probability of the sample for each class in the model, where classes are ordered as they are in self.classes_.


method predict_proba
val predict_proba :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Probability estimates.

The returned estimates for all classes are ordered by the label of classes.

For a multi_class problem, if multi_class is set to 'multinomial', the softmax function is used to find the predicted probability of each class. Otherwise a one-vs-rest approach is used, i.e. the probability of each class is calculated with the logistic function, assuming that class to be positive, and these values are then normalized across all the classes.


  • X : array-like of shape (n_samples, n_features) Vector to be scored, where n_samples is the number of samples and n_features is the number of features.


  • T : array-like of shape (n_samples, n_classes) Returns the probability of the sample for each class in the model, where classes are ordered as they are in self.classes_.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Returns the score using the scoring option on the given test data and labels.


  • X : array-like of shape (n_samples, n_features) Test samples.

  • y : array-like of shape (n_samples,) True labels for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float Score of self.predict(X) wrt. y.


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.


  • **params : dict Estimator parameters.


  • self : object Estimator instance.


method sparsify
val sparsify :
  [> tag] Obj.t ->

Convert coefficient matrix to sparse format.

Converts the coef_ member to a scipy.sparse matrix, which for L1-regularized models can be much more memory- and storage-efficient than the usual numpy.ndarray representation.

The intercept_ member is not converted.


self Fitted estimator.


For non-sparse models, i.e. when there are not many zeros in coef_, this may actually increase memory usage, so use this method with care. A rule of thumb is that the number of zero elements, which can be computed with (coef_ == 0).sum(), must be more than 50% for this to provide significant benefits.

After calling this method, further fitting with the partial_fit method (if any) will not work until you call densify.


attribute classes_
val classes_ : t -> [>`ArrayLike] Np.Obj.t
val classes_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute cs_
val cs_ : t -> [>`ArrayLike] Np.Obj.t
val cs_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute l1_ratios_
val l1_ratios_ : t -> [>`ArrayLike] Np.Obj.t
val l1_ratios_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute coefs_paths_
val coefs_paths_ : t -> [>`ArrayLike] Np.Obj.t
val coefs_paths_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute scores_
val scores_ : t -> Dict.t
val scores_opt : t -> (Dict.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute c_
val c_ : t -> [>`ArrayLike] Np.Obj.t
val c_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute l1_ratio_
val l1_ratio_ : t -> [>`ArrayLike] Np.Obj.t
val l1_ratio_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> [>`ArrayLike] Np.Obj.t
val n_iter_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​ModifiedHuber wraps Python class sklearn.linear_model.ModifiedHuber.

type t


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​MultiTaskElasticNet wraps Python class sklearn.linear_model.MultiTaskElasticNet.

type t


constructor and attributes create
val create :
  ?alpha:float ->
  ?l1_ratio:float ->
  ?fit_intercept:bool ->
  ?normalize:bool ->
  ?copy_X:bool ->
  ?max_iter:int ->
  ?tol:float ->
  ?warm_start:bool ->
  ?random_state:int ->
  ?selection:[`Cyclic | `Random] ->
  unit ->

Multi-task ElasticNet model trained with L1/L2 mixed-norm as regularizer

The optimization objective for MultiTaskElasticNet is::

(1 / (2 * n_samples)) * ||Y - XW||_Fro^2
+ alpha * l1_ratio * ||W||_21
+ 0.5 * alpha * (1 - l1_ratio) * ||W||_Fro^2
Where::

    ||W||_21 = sum_i sqrt(sum_j W_ij ^ 2)

i.e. the sum of the norms of the rows of W.

Read more in the :ref:User Guide <multi_task_elastic_net>.
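As a worked illustration of the objective above, the sketch below evaluates it term by term on small nested lists. It assumes W has shape (n_features, n_tasks), as in the formula; the helper names are hypothetical and this is not the library's implementation.

```python
import math


def l21_norm(W):
    """Sum over rows of the Euclidean norm of each row."""
    return sum(math.sqrt(sum(w * w for w in row)) for row in W)


def frobenius_sq(M):
    """Squared Frobenius norm: sum of squared entries."""
    return sum(v * v for row in M for v in row)


def matmul(A, B):
    """Plain matrix product of two nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def objective(X, Y, W, alpha, l1_ratio):
    n_samples = len(X)
    residual = [[y - p for y, p in zip(y_row, p_row)]
                for y_row, p_row in zip(Y, matmul(X, W))]
    return (frobenius_sq(residual) / (2 * n_samples)
            + alpha * l1_ratio * l21_norm(W)
            + 0.5 * alpha * (1 - l1_ratio) * frobenius_sq(W))


X = [[1.0, 0.0], [0.0, 1.0]]
W = [[3.0, 4.0], [0.0, 0.0]]   # second feature unused by both tasks
Y = matmul(X, W)               # perfect fit, so the data term is zero
value = objective(X, Y, W, alpha=0.1, l1_ratio=0.5)
print(value)
```

With a zero residual only the two penalty terms remain: 0.1 * 0.5 * 5 for the L21 term and 0.5 * 0.1 * 0.5 * 25 for the squared Frobenius term.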


  • alpha : float, default=1.0 Constant that multiplies the L1/L2 term. Defaults to 1.0

  • l1_ratio : float, default=0.5 The ElasticNet mixing parameter, with 0 < l1_ratio <= 1. For l1_ratio = 1 the penalty is an L1/L2 penalty. For l1_ratio = 0 it is an L2 penalty. For 0 < l1_ratio < 1, the penalty is a combination of L1/L2 and L2.

  • fit_intercept : bool, default=True whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (i.e. data is expected to be centered).

  • normalize : bool, default=False This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use :class:sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.

  • copy_X : bool, default=True If True, X will be copied; else, it may be overwritten.

  • max_iter : int, default=1000 The maximum number of iterations

  • tol : float, default=1e-4 The tolerance for the optimization: if the updates are smaller than tol, the optimization code checks the dual gap for optimality and continues until it is smaller than tol.

  • warm_start : bool, default=False When set to True, reuse the solution of the previous call to fit as initialization; otherwise, just erase the previous solution. See :term:the Glossary <warm_start>.

  • random_state : int, RandomState instance, default=None The seed of the pseudo random number generator that selects a random feature to update. Used when selection == 'random'. Pass an int for reproducible output across multiple function calls. See :term:Glossary <random_state>.

  • selection : {'cyclic', 'random'}, default='cyclic' If set to 'random', a random coefficient is updated every iteration rather than looping over features sequentially by default. This (setting to 'random') often leads to significantly faster convergence especially when tol is higher than 1e-4.


  • intercept_ : ndarray of shape (n_tasks,) Independent term in decision function.

  • coef_ : ndarray of shape (n_tasks, n_features) Parameter vector (W in the cost function formula). If a 1D y is passed in at fit (non multi-task usage), coef_ is then a 1D array. Note that coef_ stores the transpose of W, W.T.

  • n_iter_ : int number of iterations run by the coordinate descent solver to reach the specified tolerance.


>>> from sklearn import linear_model
>>> clf = linear_model.MultiTaskElasticNet(alpha=0.1)
>>> clf.fit([[0,0], [1, 1], [2, 2]], [[0, 0], [1, 1], [2, 2]])
>>> print(clf.coef_)
[[0.45663524 0.45612256]
 [0.45663524 0.45612256]]
>>> print(clf.intercept_)
[0.0872422 0.0872422]

See also

  • MultiTaskElasticNetCV : Multi-task L1/L2 ElasticNet with built-in cross-validation.

  • ElasticNet

  • MultiTaskLasso


The algorithm used to fit the model is coordinate descent.

To avoid unnecessary memory duplication the X and y arguments of the fit method should be directly passed as Fortran-contiguous numpy arrays.


method fit
val fit :
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit MultiTaskElasticNet model with coordinate descent


  • X : ndarray of shape (n_samples, n_features) Data

  • y : ndarray of shape (n_samples, n_tasks) Target. Will be cast to X's dtype if necessary


Coordinate descent is an algorithm that considers one column of the data at a time; hence fit will automatically convert the X input to a Fortran-contiguous numpy array if necessary.

To avoid memory re-allocation it is advised to allocate the initial data in memory directly using that format.
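To see why a column-wise algorithm favours Fortran (column-major) layout, the sketch below flattens a small matrix in both orders and lists the memory offsets of one column. The function names are hypothetical and the nested lists only stand in for a real array.

```python
def flatten(matrix, order):
    """Flatten a 2D nested list in row-major ('C') or column-major ('F') order."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    if order == "C":
        return [matrix[i][j] for i in range(n_rows) for j in range(n_cols)]
    if order == "F":
        return [matrix[i][j] for j in range(n_cols) for i in range(n_rows)]
    raise ValueError("order must be 'C' or 'F'")


def column_offsets(n_rows, n_cols, j, order):
    """Flat offsets of the entries of column j under the given layout."""
    if order == "F":
        return [j * n_rows + i for i in range(n_rows)]
    return [i * n_cols + j for i in range(n_rows)]


M = [[1, 2, 3],
     [4, 5, 6]]
print(flatten(M, "C"))               # rows are stored one after another
print(flatten(M, "F"))               # columns are stored one after another
print(column_offsets(2, 3, 1, "F"))  # adjacent offsets
print(column_offsets(2, 3, 1, "C"))  # offsets separated by a full row
```

In column-major order the entries of a column sit next to each other, which is the access pattern a per-column update needs.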


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using the linear model.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape (n_samples,) Returns predicted values.
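For a linear model, the prediction can be sketched as each sample multiplied by the transpose of coef_, plus intercept_. The standalone function below is an illustration rather than the binding's code; it reuses the coef_ and intercept_ values printed in the MultiTaskElasticNet example above.

```python
def predict(X, coef, intercept):
    """Linear prediction with coef of shape (n_tasks, n_features)."""
    return [
        [sum(x * w for x, w in zip(sample, task_coef)) + b
         for task_coef, b in zip(coef, intercept)]
        for sample in X
    ]


# Values copied from the fitted example shown earlier in this module.
coef = [[0.45663524, 0.45612256],
        [0.45663524, 0.45612256]]
intercept = [0.0872422, 0.0872422]

pred = predict([[1.0, 1.0]], coef, intercept)
print(pred)  # one row per sample, one column per task
```

For the training sample [1, 1] both tasks come out very close to the target value 1.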


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.


  • X : array-like of shape (n_samples, n_features) Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True values for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float R^2 of self.predict(X) wrt. y.


The R2 score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of :func:~sklearn.metrics.r2_score. This influences the score method of all the multioutput regressors (except for :class:~sklearn.multioutput.MultiOutputRegressor).
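The definition of R^2 and the 'uniform_average' behaviour can be sketched in a few lines. The functions below are a hypothetical illustration that ignores sample_weight and assumes every output column is non-constant (so v is non-zero).

```python
def r2_single(y_true, y_pred):
    """R^2 = 1 - u/v for one output column."""
    mean = sum(y_true) / len(y_true)
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    v = sum((t - mean) ** 2 for t in y_true)               # total sum of squares
    return 1 - u / v


def r2_uniform_average(Y_true, Y_pred):
    """Unweighted mean of the per-output R^2 scores."""
    scores = [r2_single(t, p) for t, p in zip(zip(*Y_true), zip(*Y_pred))]
    return sum(scores) / len(scores)


Y_true = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
# First output is predicted exactly; second output always predicts its mean.
Y_pred = [[1.0, 20.0], [2.0, 20.0], [3.0, 20.0]]

score = r2_uniform_average(Y_true, Y_pred)
print(score)
```

The first output scores 1.0 and the constant-mean output scores 0.0, so the uniform average is 0.5.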


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.


  • **params : dict Estimator parameters.


  • self : object Estimator instance.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> int
val n_iter_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​MultiTaskElasticNetCV wraps Python class sklearn.linear_model.MultiTaskElasticNetCV.

type t


constructor and attributes create
val create :
  ?l1_ratio:[`F of float | `Fs of float list] ->
  ?eps:float ->
  ?n_alphas:int ->
  ?alphas:[>`ArrayLike] Np.Obj.t ->
  ?fit_intercept:bool ->
  ?normalize:bool ->
  ?max_iter:int ->
  ?tol:float ->
  ?cv:[`BaseCrossValidator of [>`BaseCrossValidator] Np.Obj.t | `I of int | `Arr of [>`ArrayLike] Np.Obj.t] ->
  ?copy_X:bool ->
  ?verbose:int ->
  ?n_jobs:int ->
  ?random_state:int ->
  ?selection:[`Cyclic | `Random] ->
  unit ->

Multi-task L1/L2 ElasticNet with built-in cross-validation.

See glossary entry for :term:cross-validation estimator.

The optimization objective for MultiTaskElasticNet is::

(1 / (2 * n_samples)) * ||Y - XW||_Fro^2
+ alpha * l1_ratio * ||W||_21
+ 0.5 * alpha * (1 - l1_ratio) * ||W||_Fro^2

Where::

    ||W||_21 = \sum_i \sqrt{\sum_j w_{ij}^2}

i.e. the sum of the norms of the rows of W.

Read more in the :ref:User Guide <multi_task_elastic_net>.

.. versionadded:: 0.15


  • l1_ratio : float or list of float, default=0.5 The ElasticNet mixing parameter, with 0 < l1_ratio <= 1. For l1_ratio = 1 the penalty is an L1/L2 penalty. For l1_ratio = 0 it is an L2 penalty. For 0 < l1_ratio < 1, the penalty is a combination of L1/L2 and L2. This parameter can be a list, in which case the different values are tested by cross-validation and the one giving the best prediction score is used. Note that a good choice of list of values for l1_ratio is often to put more values close to 1 (i.e. Lasso) and less close to 0 (i.e. Ridge), as in [.1, .5, .7, .9, .95, .99, 1]

  • eps : float, default=1e-3 Length of the path. eps=1e-3 means that alpha_min / alpha_max = 1e-3.

  • n_alphas : int, default=100 Number of alphas along the regularization path

  • alphas : array-like, default=None List of alphas where to compute the models. If not provided, set automatically.

  • fit_intercept : bool, default=True whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (i.e. data is expected to be centered).

  • normalize : bool, default=False This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use :class:sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.

  • max_iter : int, default=1000 The maximum number of iterations

  • tol : float, default=1e-4 The tolerance for the optimization: if the updates are smaller than tol, the optimization code checks the dual gap for optimality and continues until it is smaller than tol.

  • cv : int, cross-validation generator or iterable, default=None Determines the cross-validation splitting strategy. Possible inputs for cv are:

    • None, to use the default 5-fold cross-validation,
    • int, to specify the number of folds.
    • :term:CV splitter,
    • An iterable yielding (train, test) splits as arrays of indices.

    For int/None inputs, :class:KFold is used.

    Refer to the :ref:User Guide <cross_validation> for the various cross-validation strategies that can be used here.

    .. versionchanged:: 0.22 cv default value if None changed from 3-fold to 5-fold.

  • copy_X : bool, default=True If True, X will be copied; else, it may be overwritten.

  • verbose : bool or int, default=0 Amount of verbosity.

  • n_jobs : int, default=None Number of CPUs to use during the cross validation. Note that this is used only if multiple values for l1_ratio are given. None means 1 unless in a :obj:joblib.parallel_backend context. -1 means using all processors. See :term:Glossary <n_jobs> for more details.

  • random_state : int, RandomState instance, default=None The seed of the pseudo random number generator that selects a random feature to update. Used when selection == 'random'. Pass an int for reproducible output across multiple function calls. See :term:Glossary <random_state>.

  • selection : {'cyclic', 'random'}, default='cyclic' If set to 'random', a random coefficient is updated every iteration rather than looping over features sequentially by default. This (setting to 'random') often leads to significantly faster convergence especially when tol is higher than 1e-4.
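The relationship between eps and n_alphas can be made concrete with a small sketch. It assumes the grid is logarithmically spaced from alpha_max down to alpha_max * eps, which the text above does not state; `alpha_max` is taken as a given input here, whereas the library derives it from the data.

```python
import math


def alpha_grid(alpha_max, eps=1e-3, n_alphas=100):
    """Log-spaced grid from alpha_max down to alpha_max * eps (a sketch)."""
    if n_alphas < 2:
        raise ValueError("this sketch needs at least two grid points")
    log_hi = math.log10(alpha_max)
    log_lo = math.log10(alpha_max * eps)
    step = (log_hi - log_lo) / (n_alphas - 1)
    # Largest alpha first, so regularization is relaxed along the path.
    return [10 ** (log_hi - k * step) for k in range(n_alphas)]


grid = alpha_grid(alpha_max=1.0, eps=1e-3, n_alphas=4)
print(grid)                # four values, largest first
print(grid[-1] / grid[0])  # ratio alpha_min / alpha_max
```

The ratio of the last to the first grid value equals eps, which is the "length of the path" described for the eps parameter.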


  • intercept_ : ndarray of shape (n_tasks,) Independent term in decision function.

  • coef_ : ndarray of shape (n_tasks, n_features) Parameter vector (W in the cost function formula). Note that coef_ stores the transpose of W, W.T.

  • alpha_ : float The amount of penalization chosen by cross validation

  • mse_path_ : ndarray of shape (n_alphas, n_folds) or (n_l1_ratio, n_alphas, n_folds) mean square error for the test set on each fold, varying alpha

  • alphas_ : ndarray of shape (n_alphas,) or (n_l1_ratio, n_alphas) The grid of alphas used for fitting, for each l1_ratio

  • l1_ratio_ : float best l1_ratio obtained by cross-validation.

  • n_iter_ : int number of iterations run by the coordinate descent solver to reach the specified tolerance for the optimal alpha.
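The attributes alphas_, mse_path_ and alpha_ are related: a plausible reading is that alpha_ is the grid value with the lowest error averaged over folds. The sketch below assumes that selection rule and a single l1_ratio, so mse_path has shape (n_alphas, n_folds); the function name and the numbers are made up for illustration.

```python
def best_alpha(alphas, mse_path):
    """Pick the alpha whose mean error across folds is lowest."""
    mean_mse = [sum(fold_errors) / len(fold_errors) for fold_errors in mse_path]
    i_best = min(range(len(alphas)), key=mean_mse.__getitem__)
    return alphas[i_best], mean_mse[i_best]


alphas = [1.0, 0.1, 0.01]
mse_path = [[4.0, 6.0],   # errors per fold for alpha = 1.0
            [1.0, 2.0],   # errors per fold for alpha = 0.1
            [2.0, 3.0]]   # errors per fold for alpha = 0.01

alpha, mse = best_alpha(alphas, mse_path)
print(alpha, mse)
```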


>>> from sklearn import linear_model
>>> clf = linear_model.MultiTaskElasticNetCV(cv=3)
>>> clf.fit([[0,0], [1, 1], [2, 2]],
...         [[0, 0], [1, 1], [2, 2]])
>>> print(clf.coef_)
[[0.52875032 0.46958558]
 [0.52875032 0.46958558]]
>>> print(clf.intercept_)
[0.00166409 0.00166409]

See also

MultiTaskElasticNet ElasticNetCV MultiTaskLassoCV


The algorithm used to fit the model is coordinate descent.

To avoid unnecessary memory duplication the X and y arguments of the fit method should be directly passed as Fortran-contiguous numpy arrays.


method fit
val fit :
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit linear model with coordinate descent

Fit is on grid of alphas and best alpha estimated by cross-validation.


  • X : {array-like, sparse matrix} of shape (n_samples, n_features) Training data. Pass directly as Fortran-contiguous data to avoid unnecessary memory duplication. If y is mono-output, X can be sparse.

  • y : array-like of shape (n_samples,) or (n_samples, n_targets) Target values


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using the linear model.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape (n_samples,) Returns predicted values.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.


  • X : array-like of shape (n_samples, n_features) Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True values for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float R^2 of self.predict(X) wrt. y.


The R2 score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of :func:~sklearn.metrics.r2_score. This influences the score method of all the multioutput regressors (except for :class:~sklearn.multioutput.MultiOutputRegressor).


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.


  • **params : dict Estimator parameters.


  • self : object Estimator instance.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute alpha_
val alpha_ : t -> float
val alpha_opt : t -> (float) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute mse_path_
val mse_path_ : t -> [>`ArrayLike] Np.Obj.t
val mse_path_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute alphas_
val alphas_ : t -> [>`ArrayLike] Np.Obj.t
val alphas_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute l1_ratio_
val l1_ratio_ : t -> float
val l1_ratio_opt : t -> (float) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> int
val n_iter_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​MultiTaskLasso wraps Python class sklearn.linear_model.MultiTaskLasso.

type t


constructor and attributes create
val create :
  ?alpha:float ->
  ?fit_intercept:bool ->
  ?normalize:bool ->
  ?copy_X:bool ->
  ?max_iter:int ->
  ?tol:float ->
  ?warm_start:bool ->
  ?random_state:int ->
  ?selection:[`Cyclic | `Random] ->
  unit ->

Multi-task Lasso model trained with L1/L2 mixed-norm as regularizer.

The optimization objective for MultiTaskLasso is::

(1 / (2 * n_samples)) * ||Y - XW||^2_Fro + alpha * ||W||_21

Where::

    ||W||_21 = \sum_i \sqrt{\sum_j w_{ij}^2}

i.e. the sum of the norms of the rows of W.

Read more in the :ref:User Guide <multi_task_lasso>.
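One way to see why the L1/L2 mixed norm removes a feature from all tasks at once is its proximal operator, which shrinks each row of W as a group. The sketch below shows that row-wise soft-thresholding step; it illustrates the penalty and is not claimed to be how the library's solver is written. The function name and threshold value are assumptions.

```python
import math


def group_soft_threshold(W, threshold):
    """Shrink each row of W towards zero by `threshold` in Euclidean norm."""
    shrunk = []
    for row in W:
        norm = math.sqrt(sum(w * w for w in row))
        # Rows whose norm is below the threshold are zeroed entirely.
        scale = max(0.0, 1.0 - threshold / norm) if norm > 0 else 0.0
        shrunk.append([scale * w for w in row])
    return shrunk


W = [[3.0, 4.0],    # strong feature: norm 5
     [0.3, 0.4]]    # weak feature: norm 0.5
result = group_soft_threshold(W, threshold=1.0)
print(result)
```

The strong row is only scaled down, while the weak row becomes zero for both tasks. The example output above shows the same pattern: coef_ stores the transpose of W, so its all-zero first column is a feature dropped from both tasks.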


  • alpha : float, default=1.0 Constant that multiplies the L1/L2 term. Defaults to 1.0

  • fit_intercept : bool, default=True whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (i.e. data is expected to be centered).

  • normalize : bool, default=False This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use :class:sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.

  • copy_X : bool, default=True If True, X will be copied; else, it may be overwritten.

  • max_iter : int, default=1000 The maximum number of iterations

  • tol : float, default=1e-4 The tolerance for the optimization: if the updates are smaller than tol, the optimization code checks the dual gap for optimality and continues until it is smaller than tol.

  • warm_start : bool, default=False When set to True, reuse the solution of the previous call to fit as initialization; otherwise, just erase the previous solution. See :term:the Glossary <warm_start>.

  • random_state : int, RandomState instance, default=None The seed of the pseudo random number generator that selects a random feature to update. Used when selection == 'random'. Pass an int for reproducible output across multiple function calls. See :term:Glossary <random_state>.

  • selection : {'cyclic', 'random'}, default='cyclic' If set to 'random', a random coefficient is updated every iteration rather than looping over features sequentially by default. This (setting to 'random') often leads to significantly faster convergence especially when tol is higher than 1e-4


  • coef_ : ndarray of shape (n_tasks, n_features) Parameter vector (W in the cost function formula). Note that coef_ stores the transpose of W, W.T.

  • intercept_ : ndarray of shape (n_tasks,) independent term in decision function.

  • n_iter_ : int number of iterations run by the coordinate descent solver to reach the specified tolerance.


>>> from sklearn import linear_model
>>> clf = linear_model.MultiTaskLasso(alpha=0.1)
>>> clf.fit([[0, 1], [1, 2], [2, 4]], [[0, 0], [1, 1], [2, 3]])
>>> print(clf.coef_)
[[0.         0.60809415]
 [0.         0.94592424]]
>>> print(clf.intercept_)
[-0.41888636 -0.87382323]

See also

  • MultiTaskLassoCV : Multi-task L1/L2 Lasso with built-in cross-validation.

  • Lasso

  • MultiTaskElasticNet


The algorithm used to fit the model is coordinate descent.

To avoid unnecessary memory duplication the X and y arguments of the fit method should be directly passed as Fortran-contiguous numpy arrays.


method fit
val fit :
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit the MultiTaskLasso model with coordinate descent


  • X : ndarray of shape (n_samples, n_features) Data

  • y : ndarray of shape (n_samples, n_tasks) Target. Will be cast to X's dtype if necessary


Coordinate descent is an algorithm that considers one column of the data at a time; hence fit will automatically convert the X input to a Fortran-contiguous numpy array if necessary.

To avoid memory re-allocation it is advised to allocate the initial data in memory directly using that format.


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using the linear model.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape (n_samples,) Returns predicted values.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.


  • X : array-like of shape (n_samples, n_features) Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True values for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float R^2 of self.predict(X) wrt. y.


The R2 score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of :func:~sklearn.metrics.r2_score. This influences the score method of all the multioutput regressors (except for :class:~sklearn.multioutput.MultiOutputRegressor).


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.


  • **params : dict Estimator parameters.


  • self : object Estimator instance.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> int
val n_iter_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​MultiTaskLassoCV wraps Python class sklearn.linear_model.MultiTaskLassoCV.

type t


constructor and attributes create
val create :
  ?eps:float ->
  ?n_alphas:int ->
  ?alphas:[>`ArrayLike] Np.Obj.t ->
  ?fit_intercept:bool ->
  ?normalize:bool ->
  ?max_iter:int ->
  ?tol:float ->
  ?copy_X:bool ->
  ?cv:[`BaseCrossValidator of [>`BaseCrossValidator] Np.Obj.t | `I of int | `Arr of [>`ArrayLike] Np.Obj.t] ->
  ?verbose:int ->
  ?n_jobs:int ->
  ?random_state:int ->
  ?selection:[`Cyclic | `Random] ->
  unit ->

Multi-task Lasso model trained with L1/L2 mixed-norm as regularizer.

See glossary entry for :term:cross-validation estimator.

The optimization objective for MultiTaskLasso is::

(1 / (2 * n_samples)) * ||Y - XW||_Fro^2 + alpha * ||W||_21

Where::

    ||W||_21 = \sum_i \sqrt{\sum_j w_{ij}^2}

i.e. the sum of the norms of the rows of W.

Read more in the :ref:User Guide <multi_task_lasso>.

.. versionadded:: 0.15


  • eps : float, default=1e-3 Length of the path. eps=1e-3 means that alpha_min / alpha_max = 1e-3.

  • n_alphas : int, default=100 Number of alphas along the regularization path

  • alphas : array-like, default=None List of alphas where to compute the models. If not provided, set automatically.

  • fit_intercept : bool, default=True whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (i.e. data is expected to be centered).

  • normalize : bool, default=False This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use :class:sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.

  • max_iter : int, default=1000 The maximum number of iterations.

  • tol : float, default=1e-4 The tolerance for the optimization: if the updates are smaller than tol, the optimization code checks the dual gap for optimality and continues until it is smaller than tol.

  • copy_X : bool, default=True If True, X will be copied; else, it may be overwritten.

  • cv : int, cross-validation generator or iterable, default=None Determines the cross-validation splitting strategy. Possible inputs for cv are:

    • None, to use the default 5-fold cross-validation,
    • int, to specify the number of folds.
    • :term:CV splitter,
    • An iterable yielding (train, test) splits as arrays of indices.

    For int/None inputs, :class:KFold is used.

    Refer to the :ref:User Guide <cross_validation> for the various cross-validation strategies that can be used here.

    .. versionchanged:: 0.22 cv default value if None changed from 3-fold to 5-fold.

  • verbose : bool or int, default=False Amount of verbosity.

  • n_jobs : int, default=None Number of CPUs to use during the cross validation. Note that this is used only if multiple values for l1_ratio are given. None means 1 unless in a :obj:joblib.parallel_backend context. -1 means using all processors. See :term:Glossary <n_jobs> for more details.

  • random_state : int, RandomState instance, default=None The seed of the pseudo random number generator that selects a random feature to update. Used when selection == 'random'. Pass an int for reproducible output across multiple function calls. See :term:Glossary <random_state>.

  • selection : {'cyclic', 'random'}, default='cyclic' If set to 'random', a random coefficient is updated every iteration rather than looping over features sequentially by default. This (setting to 'random') often leads to significantly faster convergence especially when tol is higher than 1e-4.
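To make the relationship between eps and n_alphas concrete, here is a minimal sketch of a regularization path whose smallest and largest values satisfy alpha_min / alpha_max = eps. The helper name `alpha_grid` and the geometric (log-uniform) spacing are assumptions of this sketch, and alpha_max is taken as given rather than derived from the data.

```python
def alpha_grid(alpha_max, eps=1e-3, n_alphas=100):
    """Return n_alphas values decreasing from alpha_max to alpha_max * eps.

    Geometric spacing is an assumption of this sketch.
    """
    if n_alphas < 1:
        raise ValueError("n_alphas must be at least 1")
    if n_alphas == 1:
        return [alpha_max]
    # Constant ratio between consecutive alphas so that the last one is alpha_max * eps.
    ratio = eps ** (1.0 / (n_alphas - 1))
    return [alpha_max * ratio ** i for i in range(n_alphas)]
```

A smaller eps makes the path longer in the sense that it reaches weaker regularization, while n_alphas only controls how finely that range is sampled.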


  • intercept_ : ndarray of shape (n_tasks,) Independent term in decision function.

  • coef_ : ndarray of shape (n_tasks, n_features) Parameter vector (W in the cost function formula). Note that coef_ stores the transpose of W, W.T.

  • alpha_ : float The amount of penalization chosen by cross validation

  • mse_path_ : ndarray of shape (n_alphas, n_folds) mean square error for the test set on each fold, varying alpha

  • alphas_ : ndarray of shape (n_alphas,) The grid of alphas used for fitting.

  • n_iter_ : int number of iterations run by the coordinate descent solver to reach the specified tolerance for the optimal alpha.


>>> from sklearn.linear_model import MultiTaskLassoCV
>>> from sklearn.datasets import make_regression
>>> from sklearn.metrics import r2_score
>>> X, y = make_regression(n_targets=2, noise=4, random_state=0)
>>> reg = MultiTaskLassoCV(cv=5, random_state=0).fit(X, y)
>>> r2_score(y, reg.predict(X))
>>> reg.alpha_
>>> reg.predict(X[:1,])
array([[153.7971...,  94.9015...]])

See also

MultiTaskElasticNet ElasticNetCV MultiTaskElasticNetCV


The algorithm used to fit the model is coordinate descent.

To avoid unnecessary memory duplication the X and y arguments of the fit method should be directly passed as Fortran-contiguous numpy arrays.


method fit
val fit :
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit linear model with coordinate descent

Fit is on grid of alphas and best alpha estimated by cross-validation.


  • X : {array-like, sparse matrix} of shape (n_samples, n_features) Training data. Pass directly as Fortran-contiguous data to avoid unnecessary memory duplication. If y is mono-output, X can be sparse.

  • y : array-like of shape (n_samples,) or (n_samples, n_targets) Target values


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using the linear model.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape (n_samples,) Returns predicted values.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.
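The definition above can be checked by hand on a single-output example. The sketch below uses plain Python lists; `r2_sketch` is a hypothetical name, and it ignores sample weights and multi-output averaging.

```python
def r2_sketch(y_true, y_pred):
    """Compute 1 - u/v for a single-output regression problem."""
    mean = sum(y_true) / len(y_true)
    # u: residual sum of squares between targets and predictions.
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # v: total sum of squares of the targets around their mean.
    v = sum((t - mean) ** 2 for t in y_true)
    return 1 - u / v
```

Predicting the targets exactly gives 1.0, predicting their mean everywhere gives 0.0, and predictions worse than the mean give a negative score.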


  • X : array-like of shape (n_samples, n_features) Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True values for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float R^2 of self.predict(X) wrt. y.


The R2 score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of :func:~sklearn.metrics.r2_score. This influences the score method of all the multioutput regressors (except for :class:~sklearn.multioutput.MultiOutputRegressor).


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
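The <component>__<parameter> naming can be illustrated with a small sketch that splits a parameter name at its first double underscore. The helper `split_param` is hypothetical and only mirrors the naming convention described above, not the library's internal code.

```python
def split_param(name):
    """Split 'component__parameter' into (component, parameter).

    Names without a double underscore belong to the estimator itself,
    which this sketch signals with a component of None.
    """
    component, sep, sub_name = name.partition("__")
    if not sep:
        return (None, name)
    return (component, sub_name)
```

Splitting only at the first separator leaves deeper nesting intact, so the remainder can be handed to the named component for further splitting.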


  • **params : dict Estimator parameters.


  • self : object Estimator instance.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute alpha_
val alpha_ : t -> float
val alpha_opt : t -> (float) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute mse_path_
val mse_path_ : t -> [>`ArrayLike] Np.Obj.t
val mse_path_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute alphas_
val alphas_ : t -> [>`ArrayLike] Np.Obj.t
val alphas_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> int
val n_iter_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​OrthogonalMatchingPursuit wraps Python class sklearn.linear_model.OrthogonalMatchingPursuit.

type t


constructor and attributes create
val create :
  ?n_nonzero_coefs:int ->
  ?tol:float ->
  ?fit_intercept:bool ->
  ?normalize:bool ->
  ?precompute:[`Auto | `Bool of bool] ->
  unit ->

Orthogonal Matching Pursuit model (OMP)

Read more in the :ref:User Guide <omp>.


  • n_nonzero_coefs : int, optional Desired number of non-zero entries in the solution. If None (by default) this value is set to 10% of n_features.

  • tol : float, optional Maximum norm of the residual. If not None, overrides n_nonzero_coefs.

  • fit_intercept : boolean, optional whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (i.e. data is expected to be centered).

  • normalize : boolean, optional, default True This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use :class:sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.

  • precompute : {True, False, 'auto'}, default 'auto' Whether to use a precomputed Gram and Xy matrix to speed up calculations. Improves performance when :term:n_targets or :term:n_samples is very large. Note that if you already have such matrices, you can pass them directly to the fit method.


  • coef_ : array, shape (n_features,) or (n_targets, n_features) parameter vector (w in the formula)

  • intercept_ : float or array, shape (n_targets,) independent term in decision function.

  • n_iter_ : int or array-like Number of active features across every target.


>>> from sklearn.linear_model import OrthogonalMatchingPursuit
>>> from sklearn.datasets import make_regression
>>> X, y = make_regression(noise=4, random_state=0)
>>> reg = OrthogonalMatchingPursuit().fit(X, y)
>>> reg.score(X, y)
>>> reg.predict(X[:1,])


Orthogonal matching pursuit was introduced in G. Mallat, Z. Zhang, Matching pursuits with time-frequency dictionaries, IEEE Transactions on Signal Processing, Vol. 41, No. 12. (December 1993), pp. 3397-3415.

This implementation is based on Rubinstein, R., Zibulevsky, M. and Elad, M., Efficient Implementation of the K-SVD Algorithm using Batch Orthogonal Matching Pursuit Technical Report - CS Technion, April 2008.


See also

orthogonal_mp orthogonal_mp_gram lars_path Lars LassoLars decomposition.sparse_encode OrthogonalMatchingPursuitCV


method fit
val fit :
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit the model using X, y as training data.


  • X : array-like, shape (n_samples, n_features) Training data.

  • y : array-like, shape (n_samples,) or (n_samples, n_targets) Target values. Will be cast to X's dtype if necessary


  • self : object returns an instance of self.


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using the linear model.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape (n_samples,) Returns predicted values.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.


  • X : array-like of shape (n_samples, n_features) Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True values for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float R^2 of self.predict(X) wrt. y.


The R2 score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of :func:~sklearn.metrics.r2_score. This influences the score method of all the multioutput regressors (except for :class:~sklearn.multioutput.MultiOutputRegressor).


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.


  • **params : dict Estimator parameters.


  • self : object Estimator instance.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> Py.Object.t
val n_iter_opt : t -> (Py.Object.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​OrthogonalMatchingPursuitCV wraps Python class sklearn.linear_model.OrthogonalMatchingPursuitCV.

type t


constructor and attributes create
val create :
  ?copy:bool ->
  ?fit_intercept:bool ->
  ?normalize:bool ->
  ?max_iter:int ->
  ?cv:[`BaseCrossValidator of [>`BaseCrossValidator] Np.Obj.t | `I of int | `Arr of [>`ArrayLike] Np.Obj.t] ->
  ?n_jobs:int ->
  ?verbose:int ->
  unit ->

Cross-validated Orthogonal Matching Pursuit model (OMP).

See glossary entry for :term:cross-validation estimator.

Read more in the :ref:User Guide <omp>.


  • copy : bool, optional Whether the design matrix X must be copied by the algorithm. A false value is only helpful if X is already Fortran-ordered, otherwise a copy is made anyway.

  • fit_intercept : boolean, optional whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (i.e. data is expected to be centered).

  • normalize : boolean, optional, default True This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use :class:sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.

  • max_iter : integer, optional Maximum number of iterations to perform, and therefore the maximum number of features to include. Defaults to 10% of n_features, but at least 5 if available.

  • cv : int, cross-validation generator or an iterable, optional Determines the cross-validation splitting strategy. Possible inputs for cv are:

    • None, to use the default 5-fold cross-validation,
    • integer, to specify the number of folds.
    • :term:CV splitter,
    • An iterable yielding (train, test) splits as arrays of indices.

    For integer/None inputs, :class:KFold is used.

    Refer to the :ref:User Guide <cross_validation> for the various cross-validation strategies that can be used here.

    .. versionchanged:: 0.22 cv default value if None changed from 3-fold to 5-fold.

  • n_jobs : int or None, optional (default=None) Number of CPUs to use during the cross validation. None means 1 unless in a :obj:joblib.parallel_backend context. -1 means using all processors. See :term:Glossary <n_jobs> for more details.

  • verbose : boolean or integer, optional Sets the verbosity amount


  • intercept_ : float or array, shape (n_targets,) Independent term in decision function.

  • coef_ : array, shape (n_features,) or (n_targets, n_features) Parameter vector (w in the problem formulation).

  • n_nonzero_coefs_ : int Estimated number of non-zero coefficients giving the best mean squared error over the cross-validation folds.

  • n_iter_ : int or array-like Number of active features across every target for the model refit with the best hyperparameters obtained by cross-validating across all folds.


>>> from sklearn.linear_model import OrthogonalMatchingPursuitCV
>>> from sklearn.datasets import make_regression
>>> X, y = make_regression(n_features=100, n_informative=10,
...                        noise=4, random_state=0)
>>> reg = OrthogonalMatchingPursuitCV(cv=5).fit(X, y)
>>> reg.score(X, y)
>>> reg.n_nonzero_coefs_
>>> reg.predict(X[:1,])

See also

orthogonal_mp orthogonal_mp_gram lars_path Lars LassoLars OrthogonalMatchingPursuit LarsCV LassoLarsCV decomposition.sparse_encode


method fit
val fit :
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit the model using X, y as training data.


  • X : array-like, shape [n_samples, n_features] Training data.

  • y : array-like, shape [n_samples] Target values. Will be cast to X's dtype if necessary


  • self : object returns an instance of self.


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using the linear model.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape (n_samples,) Returns predicted values.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.


  • X : array-like of shape (n_samples, n_features) Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True values for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float R^2 of self.predict(X) wrt. y.


The R2 score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of :func:~sklearn.metrics.r2_score. This influences the score method of all the multioutput regressors (except for :class:~sklearn.multioutput.MultiOutputRegressor).


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.


  • **params : dict Estimator parameters.


  • self : object Estimator instance.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_nonzero_coefs_
val n_nonzero_coefs_ : t -> int
val n_nonzero_coefs_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> Py.Object.t
val n_iter_opt : t -> (Py.Object.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​PassiveAggressiveClassifier wraps Python class sklearn.linear_model.PassiveAggressiveClassifier.

type t


constructor and attributes create
val create :
  ?c:float ->
  ?fit_intercept:bool ->
  ?max_iter:int ->
  ?tol:[`F of float | `None] ->
  ?early_stopping:bool ->
  ?validation_fraction:float ->
  ?n_iter_no_change:int ->
  ?shuffle:bool ->
  ?verbose:int ->
  ?loss:string ->
  ?n_jobs:int ->
  ?random_state:int ->
  ?warm_start:bool ->
  ?class_weight:Py.Object.t ->
  ?average:[`I of int | `Bool of bool] ->
  unit ->

Passive Aggressive Classifier

Read more in the :ref:User Guide <passive_aggressive>.


  • C : float Maximum step size (regularization). Defaults to 1.0.

  • fit_intercept : bool, default=True Whether the intercept should be estimated or not. If False, the data is assumed to be already centered.

  • max_iter : int, optional (default=1000) The maximum number of passes over the training data (aka epochs). It only impacts the behavior in the fit method, and not the :meth:partial_fit method.

    .. versionadded:: 0.19

  • tol : float or None, optional (default=1e-3) The stopping criterion. If it is not None, the iterations will stop when (loss > previous_loss - tol).

    .. versionadded:: 0.19

  • early_stopping : bool, default=False Whether to use early stopping to terminate training when the validation score is not improving. If set to True, it will automatically set aside a stratified fraction of training data as validation and terminate training when the validation score is not improving by at least tol for n_iter_no_change consecutive epochs.

    .. versionadded:: 0.20

  • validation_fraction : float, default=0.1 The proportion of training data to set aside as validation set for early stopping. Must be between 0 and 1. Only used if early_stopping is True.

    .. versionadded:: 0.20

  • n_iter_no_change : int, default=5 Number of iterations with no improvement to wait before early stopping.

    .. versionadded:: 0.20

  • shuffle : bool, default=True Whether or not the training data should be shuffled after each epoch.

  • verbose : integer, optional The verbosity level

  • loss : string, optional The loss function to be used:

    • hinge: equivalent to PA-I in the reference paper.
    • squared_hinge: equivalent to PA-II in the reference paper.

  • n_jobs : int or None, optional (default=None) The number of CPUs to use to do the OVA (One Versus All, for multi-class problems) computation. None means 1 unless in a :obj:joblib.parallel_backend context. -1 means using all processors. See :term:Glossary <n_jobs> for more details.

  • random_state : int, RandomState instance, default=None Used to shuffle the training data, when shuffle is set to True. Pass an int for reproducible output across multiple function calls. See :term:Glossary <random_state>.

  • warm_start : bool, optional When set to True, reuse the solution of the previous call to fit as initialization, otherwise, just erase the previous solution. See :term:the Glossary <warm_start>.

    Repeatedly calling fit or partial_fit when warm_start is True can result in a different solution than when calling fit a single time because of the way the data is shuffled.

  • class_weight : dict, {class_label: weight} or 'balanced' or None, optional Preset for the class_weight fit parameter.

    Weights associated with classes. If not given, all classes are supposed to have weight one.

    The 'balanced' mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y))

    .. versionadded:: 0.17 parameter class_weight to automatically weight samples.

  • average : bool or int, optional When set to True, computes the averaged SGD weights and stores the result in the coef_ attribute. If set to an int greater than 1, averaging will begin once the total number of samples seen reaches average. So average=10 will begin averaging after seeing 10 samples.

    .. versionadded:: 0.19 parameter average to use weights averaging in SGD
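The 'balanced' formula given for class_weight, n_samples / (n_classes * np.bincount(y)), can be reproduced without NumPy for illustration. The helper name `balanced_class_weights` is hypothetical; it returns a dict in the {class_label: weight} form that class_weight accepts.

```python
from collections import Counter


def balanced_class_weights(y):
    """Weight each class by n_samples / (n_classes * count_of_that_class)."""
    counts = Counter(y)
    n_samples = len(y)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }
```

Rare classes receive weights above one and frequent classes weights below one, so that each class contributes equally in aggregate.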


  • coef_ : array, shape = [1, n_features] if n_classes == 2 else [n_classes, n_features] Weights assigned to the features.

  • intercept_ : array, shape = [1] if n_classes == 2 else [n_classes] Constants in decision function.

  • n_iter_ : int The actual number of iterations to reach the stopping criterion. For multiclass fits, it is the maximum over every binary fit.

  • classes_ : array of shape (n_classes,) The unique class labels.

  • t_ : int Number of weight updates performed during training. Same as (n_iter_ * n_samples).

  • loss_function_ : callable Loss function used by the algorithm.


>>> from sklearn.linear_model import PassiveAggressiveClassifier
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification(n_features=4, random_state=0)
>>> clf = PassiveAggressiveClassifier(max_iter=1000, random_state=0,
... tol=1e-3)
>>> clf.fit(X, y)
>>> print(clf.coef_)
[[0.26642044 0.45070924 0.67251877 0.64185414]]
>>> print(clf.intercept_)
>>> print(clf.predict([[0, 0, 0, 0]]))

See also

SGDClassifier Perceptron


Online Passive-Aggressive Algorithms K. Crammer, O. Dekel, J. Keshat, S. Shalev-Shwartz, Y. Singer - JMLR (2006)
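For orientation, the PA-I and PA-II update rules from the reference paper can be sketched for a single binary example with label y in {-1, +1}. This is a simplified illustration, not the library's implementation: the function name `pa_update` is hypothetical, there is no intercept, and shuffling, averaging and multi-class handling are left out.

```python
def pa_update(w, x, y, C=1.0, loss="hinge"):
    """Return the weight vector after one passive-aggressive step.

    loss="hinge" uses the PA-I step size, loss="squared_hinge" the PA-II one.
    """
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    hinge = max(0.0, 1.0 - margin)
    sq_norm = sum(xi * xi for xi in x)
    if hinge == 0.0 or sq_norm == 0.0:
        # Passive case: the example is already classified with enough margin.
        return list(w)
    if loss == "hinge":
        tau = min(C, hinge / sq_norm)            # PA-I: step size capped at C
    elif loss == "squared_hinge":
        tau = hinge / (sq_norm + 1.0 / (2.0 * C))  # PA-II: step size damped by C
    else:
        raise ValueError("loss must be 'hinge' or 'squared_hinge'")
    # Aggressive case: move w just far enough toward satisfying the margin.
    return [wi + tau * y * xi for wi, xi in zip(w, x)]
```

In both variants C limits how far a single example can move the weights, which is why it is described as the maximum step size.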


method decision_function
val decision_function :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict confidence scores for samples.

The confidence score for a sample is the signed distance of that sample to the hyperplane.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


array, shape=(n_samples,) if n_classes == 2 else (n_samples, n_classes) Confidence scores per (sample, class) combination. In the binary case, confidence score for self.classes_[1] where >0 means this class would be predicted.
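In the binary case, the relation between the confidence score and the predicted label can be sketched as follows. The names `decision_score` and `predict_binary` are hypothetical, and the score is assumed to be the linear form w·x + b, whose sign tells which side of the hyperplane the sample lies on.

```python
def decision_score(coef, intercept, x):
    """Linear score w.x + b for one sample (assumed form for this sketch)."""
    return sum(c * xi for c, xi in zip(coef, x)) + intercept


def predict_binary(classes, coef, intercept, x):
    """Return classes[1] when the score is positive, otherwise classes[0]."""
    return classes[1] if decision_score(coef, intercept, x) > 0 else classes[0]
```

For example, with classes ["neg", "pos"], coefficients [1.0, -2.0] and intercept 0.5, the sample [2.0, 1.0] scores 0.5 and is labelled "pos".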


method densify
val densify :
  [> tag] Obj.t ->

Convert coefficient matrix to dense array format.

Converts the coef_ member (back) to a numpy.ndarray. This is the default format of coef_ and is required for fitting, so calling this method is only required on models that have previously been sparsified; otherwise, it is a no-op.


self Fitted estimator.


method fit
val fit :
  ?coef_init:[>`ArrayLike] Np.Obj.t ->
  ?intercept_init:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit linear model with Passive Aggressive algorithm.


  • X : {array-like, sparse matrix} of shape (n_samples, n_features) Training data

  • y : numpy array of shape [n_samples] Target values

  • coef_init : array, shape = [n_classes,n_features] The initial coefficients to warm-start the optimization.

  • intercept_init : array, shape = [n_classes] The initial intercept to warm-start the optimization.


  • self : returns an instance of self.


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method partial_fit
val partial_fit :
  ?classes:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit linear model with Passive Aggressive algorithm.


  • X : {array-like, sparse matrix} of shape (n_samples, n_features) Subset of the training data

  • y : numpy array of shape [n_samples] Subset of the target values

  • classes : array, shape = [n_classes] Classes across all calls to partial_fit. Can be obtained via np.unique(y_all), where y_all is the target vector of the entire dataset. This argument is required for the first call to partial_fit and can be omitted in the subsequent calls. Note that y doesn't need to contain all labels in classes.


  • self : returns an instance of self.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict class labels for samples in X.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape [n_samples] Predicted class label per sample.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.


  • X : array-like of shape (n_samples, n_features) Test samples.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True labels for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float Mean accuracy of self.predict(X) wrt. y.


method set_params
val set_params :
  ?kwargs:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set and validate the parameters of estimator.


  • **kwargs : dict Estimator parameters.


  • self : object Estimator instance.


method sparsify
val sparsify :
  [> tag] Obj.t ->

Convert coefficient matrix to sparse format.

Converts the coef_ member to a scipy.sparse matrix, which for L1-regularized models can be much more memory- and storage-efficient than the usual numpy.ndarray representation.

The intercept_ member is not converted.


self Fitted estimator.


For non-sparse models, i.e. when there are not many zeros in coef_, this may actually increase memory usage, so use this method with care. A rule of thumb is that the number of zero elements, which can be computed with (coef_ == 0).sum(), must be more than 50% for this to provide significant benefits.

After calling this method, further fitting with the partial_fit method (if any) will not work until you call densify.
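The rule of thumb above can be checked before calling sparsify. The sketch below works on a coefficient matrix given as nested lists; `zero_fraction` and `worth_sparsifying` are hypothetical helper names, and the 50% threshold is the one quoted in the text.

```python
def zero_fraction(coef):
    """Fraction of entries in a 2-D coefficient matrix that are exactly zero."""
    flat = [c for row in coef for c in row]
    return sum(1 for c in flat if c == 0) / len(flat)


def worth_sparsifying(coef, threshold=0.5):
    """Apply the rule of thumb: more than `threshold` of the entries are zero."""
    return zero_fraction(coef) > threshold
```

With a NumPy array the same fraction is (coef_ == 0).sum() divided by coef_.size.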


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> int
val n_iter_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute classes_
val classes_ : t -> [>`ArrayLike] Np.Obj.t
val classes_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute t_
val t_ : t -> int
val t_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute loss_function_
val loss_function_ : t -> Py.Object.t
val loss_function_opt : t -> (Py.Object.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​PassiveAggressiveRegressor wraps Python class sklearn.linear_model.PassiveAggressiveRegressor.

type t


constructor and attributes create
val create :
  ?c:float ->
  ?fit_intercept:bool ->
  ?max_iter:int ->
  ?tol:[`F of float | `None] ->
  ?early_stopping:bool ->
  ?validation_fraction:float ->
  ?n_iter_no_change:int ->
  ?shuffle:bool ->
  ?verbose:int ->
  ?loss:string ->
  ?epsilon:float ->
  ?random_state:int ->
  ?warm_start:bool ->
  ?average:[`I of int | `Bool of bool] ->
  unit ->

Passive Aggressive Regressor

Read more in the :ref:User Guide <passive_aggressive>.


  • C : float Maximum step size (regularization). Defaults to 1.0.

  • fit_intercept : bool Whether the intercept should be estimated or not. If False, the data is assumed to be already centered. Defaults to True.

  • max_iter : int, optional (default=1000) The maximum number of passes over the training data (aka epochs). It only impacts the behavior in the fit method, and not the :meth:partial_fit method.

    .. versionadded:: 0.19

  • tol : float or None, optional (default=1e-3) The stopping criterion. If it is not None, the iterations will stop when (loss > previous_loss - tol).

    .. versionadded:: 0.19

  • early_stopping : bool, default=False Whether to use early stopping to terminate training when the validation score is not improving. If set to True, it will automatically set aside a fraction of training data as validation and terminate training when the validation score is not improving by at least tol for n_iter_no_change consecutive epochs.

    .. versionadded:: 0.20

  • validation_fraction : float, default=0.1 The proportion of training data to set aside as validation set for early stopping. Must be between 0 and 1. Only used if early_stopping is True.

    .. versionadded:: 0.20

  • n_iter_no_change : int, default=5 Number of iterations with no improvement to wait before early stopping.

    .. versionadded:: 0.20

  • shuffle : bool, default=True Whether or not the training data should be shuffled after each epoch.

  • verbose : integer, optional The verbosity level

  • loss : string, optional The loss function to be used:

  • epsilon_insensitive: equivalent to PA-I in the reference paper.

  • squared_epsilon_insensitive: equivalent to PA-II in the reference paper.

  • epsilon : float If the difference between the current prediction and the correct label is below this threshold, the model is not updated.

  • random_state : int, RandomState instance, default=None Used to shuffle the training data, when shuffle is set to True. Pass an int for reproducible output across multiple function calls. See :term:Glossary <random_state>.

  • warm_start : bool, optional When set to True, reuse the solution of the previous call to fit as initialization, otherwise, just erase the previous solution. See :term:the Glossary <warm_start>.

    Repeatedly calling fit or partial_fit when warm_start is True can result in a different solution than when calling fit a single time because of the way the data is shuffled.

  • average : bool or int, optional When set to True, computes the averaged SGD weights and stores the result in the coef_ attribute. If set to an int greater than 1, averaging will begin once the total number of samples seen reaches average. So average=10 will begin averaging after seeing 10 samples.

    .. versionadded:: 0.19 parameter average to use weights averaging in SGD


  • coef_ : array, shape = [1, n_features] if n_classes == 2 else [n_classes, n_features] Weights assigned to the features.

  • intercept_ : array, shape = [1] if n_classes == 2 else [n_classes] Constants in decision function.

  • n_iter_ : int The actual number of iterations to reach the stopping criterion.

  • t_ : int Number of weight updates performed during training. Same as (n_iter_ * n_samples).


>>> from sklearn.linear_model import PassiveAggressiveRegressor
>>> from sklearn.datasets import make_regression
>>> X, y = make_regression(n_features=4, random_state=0)
>>> regr = PassiveAggressiveRegressor(max_iter=100, random_state=0,
... tol=1e-3)
>>> regr.fit(X, y)
PassiveAggressiveRegressor(max_iter=100, random_state=0)
>>> print(regr.coef_)
[20.48736655 34.18818427 67.59122734 87.94731329]
>>> print(regr.intercept_)
>>> print(regr.predict([[0, 0, 0, 0]]))

See also



Online Passive-Aggressive Algorithms K. Crammer, O. Dekel, J. Keshat, S. Shalev-Shwartz, Y. Singer - JMLR (2006)
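To make the two loss options above more concrete, the per-sample updates described in the cited paper can be sketched in Python. This is a simplified illustration, not the library's implementation: the function name pa_regression_step is hypothetical, the intercept is left out, and shuffling and averaging are ignored.

```python
def pa_regression_step(w, x, y, C=1.0, epsilon=0.1, loss="epsilon_insensitive"):
    """Return the weights after one passive-aggressive update on sample (x, y)."""
    prediction = sum(wi * xi for wi, xi in zip(w, x))
    error = y - prediction
    # Epsilon-insensitive loss: zero when the prediction is within epsilon of y.
    loss_value = max(0.0, abs(error) - epsilon)
    sq_norm = sum(xi * xi for xi in x)
    if loss_value == 0.0 or sq_norm == 0.0:
        return list(w)  # passive step: the model is left unchanged
    if loss == "epsilon_insensitive":            # PA-I: step size capped at C
        tau = min(C, loss_value / sq_norm)
    elif loss == "squared_epsilon_insensitive":  # PA-II: step size damped by C
        tau = loss_value / (sq_norm + 1.0 / (2.0 * C))
    else:
        raise ValueError("unknown loss: %r" % loss)
    direction = 1.0 if error > 0 else -1.0
    # Aggressive step: move the weights towards the target along x.
    return [wi + direction * tau * xi for wi, xi in zip(w, x)]


w = [0.0, 0.0]
w = pa_regression_step(w, x=[1.0, 0.0], y=2.0, C=1.0)
print(w)
```

In this sketch a small C limits how far a single sample can move the weights, which is why the parameter list describes C as a maximum step size.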


method densify
val densify :
  [> tag] Obj.t ->

Convert coefficient matrix to dense array format.

Converts the coef_ member (back) to a numpy.ndarray. This is the default format of coef_ and is required for fitting, so calling this method is only required on models that have previously been sparsified; otherwise, it is a no-op.


self Fitted estimator.


method fit
val fit :
  ?coef_init:[>`ArrayLike] Np.Obj.t ->
  ?intercept_init:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit linear model with Passive Aggressive algorithm.


  • X : {array-like, sparse matrix} of shape (n_samples, n_features) Training data

  • y : numpy array of shape [n_samples] Target values

  • coef_init : array, shape = [n_features] The initial coefficients to warm-start the optimization.

  • intercept_init : array, shape = [1] The initial intercept to warm-start the optimization.


  • self : returns an instance of self.


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method partial_fit
val partial_fit :
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit linear model with Passive Aggressive algorithm.


  • X : {array-like, sparse matrix} of shape (n_samples, n_features) Subset of training data

  • y : numpy array of shape [n_samples] Subset of target values


  • self : returns an instance of self.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using the linear model


  • X : {array-like, sparse matrix}, shape (n_samples, n_features)


ndarray of shape (n_samples,) Predicted target values per element in X.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.
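The definition above can be written out directly. The following Python sketch is for illustration only: the function name r_squared is hypothetical, sample weights are ignored, and y_true is assumed not to be constant (otherwise v is zero).

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination 1 - u/v for a single output."""
    mean_true = sum(y_true) / len(y_true)
    # u: residual sum of squares between targets and predictions
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # v: total sum of squares of the targets around their mean
    v = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - u / v


y = [1.0, 2.0, 3.0]
print(r_squared(y, [1.0, 2.0, 3.0]))  # perfect predictions
print(r_squared(y, [2.0, 2.0, 2.0]))  # constant model predicting the mean
print(r_squared(y, [3.0, 2.0, 1.0]))  # worse than the constant model
```

The three calls cover the cases named in the text: a perfect fit, the constant mean predictor, and a model that scores below zero.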


  • X : array-like of shape (n_samples, n_features) Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True values for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float R^2 of self.predict(X) wrt. y.


The R^2 score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to stay consistent with the default value of :func:~sklearn.metrics.r2_score. This influences the score method of all the multioutput regressors (except for :class:~sklearn.multioutput.MultiOutputRegressor).


method set_params
val set_params :
  ?kwargs:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set and validate the parameters of estimator.


  • **kwargs : dict Estimator parameters.


  • self : object Estimator instance.


method sparsify
val sparsify :
  [> tag] Obj.t ->

Convert coefficient matrix to sparse format.

Converts the coef_ member to a scipy.sparse matrix, which for L1-regularized models can be much more memory- and storage-efficient than the usual numpy.ndarray representation.

The intercept_ member is not converted.


self Fitted estimator.


For non-sparse models, i.e. when there are not many zeros in coef_, this may actually increase memory usage, so use this method with care. A rule of thumb is that the number of zero elements, which can be computed with (coef_ == 0).sum(), must be more than 50% for this to provide significant benefits.

After calling this method, further fitting with the partial_fit method (if any) will not work until you call densify.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> int
val n_iter_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute t_
val t_ : t -> int
val t_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​Perceptron wraps Python class sklearn.linear_model.Perceptron.

type t


constructor and attributes create
val create :
  ?penalty:[`Elasticnet | `L2 | `L1] ->
  ?alpha:float ->
  ?fit_intercept:bool ->
  ?max_iter:int ->
  ?tol:float ->
  ?shuffle:bool ->
  ?verbose:int ->
  ?eta0:float ->
  ?n_jobs:int ->
  ?random_state:int ->
  ?early_stopping:bool ->
  ?validation_fraction:float ->
  ?n_iter_no_change:int ->
  ?class_weight:[`Balanced | `DictIntToFloat of (int * float) list | `T_class_label_weight_ of Py.Object.t] ->
  ?warm_start:bool ->
  unit ->


Read more in the :ref:User Guide <perceptron>.


  • penalty : {'l2','l1','elasticnet'}, default=None The penalty (aka regularization term) to be used.

  • alpha : float, default=0.0001 Constant that multiplies the regularization term if regularization is used.

  • fit_intercept : bool, default=True Whether the intercept should be estimated or not. If False, the data is assumed to be already centered.

  • max_iter : int, default=1000 The maximum number of passes over the training data (aka epochs). It only impacts the behavior in the fit method, and not the :meth:partial_fit method.

    .. versionadded:: 0.19

  • tol : float, default=1e-3 The stopping criterion. If it is not None, the iterations will stop when (loss > previous_loss - tol).

    .. versionadded:: 0.19

  • shuffle : bool, default=True Whether or not the training data should be shuffled after each epoch.

  • verbose : int, default=0 The verbosity level

  • eta0 : double, default=1 Constant by which the updates are multiplied.

  • n_jobs : int, default=None The number of CPUs to use to do the OVA (One Versus All, for multi-class problems) computation. None means 1 unless in a :obj:joblib.parallel_backend context. -1 means using all processors. See :term:Glossary <n_jobs> for more details.

  • random_state : int, RandomState instance, default=None Used to shuffle the training data, when shuffle is set to True. Pass an int for reproducible output across multiple function calls. See :term:Glossary <random_state>.

  • early_stopping : bool, default=False Whether to use early stopping to terminate training when the validation score is not improving. If set to True, it will automatically set aside a stratified fraction of training data as validation and terminate training when the validation score is not improving by at least tol for n_iter_no_change consecutive epochs.

    .. versionadded:: 0.20

  • validation_fraction : float, default=0.1 The proportion of training data to set aside as validation set for early stopping. Must be between 0 and 1. Only used if early_stopping is True.

    .. versionadded:: 0.20

  • n_iter_no_change : int, default=5 Number of iterations with no improvement to wait before early stopping.

    .. versionadded:: 0.20

  • class_weight : dict, {class_label: weight} or 'balanced', default=None Preset for the class_weight fit parameter.

    Weights associated with classes. If not given, all classes are supposed to have weight one.

    The 'balanced' mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y))

  • warm_start : bool, default=False When set to True, reuse the solution of the previous call to fit as initialization, otherwise, just erase the previous solution. See :term:the Glossary <warm_start>.
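The 'balanced' formula quoted for class_weight, n_samples / (n_classes * np.bincount(y)), can be reproduced with the standard library alone. This is a minimal sketch; the helper name balanced_class_weights is hypothetical and labels are assumed to be hashable Python values.

```python
from collections import Counter


def balanced_class_weights(y):
    """Weight each class inversely proportional to its frequency in y."""
    counts = Counter(y)
    n_samples = len(y)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


labels = [0, 0, 0, 1]  # class 0 is three times as frequent as class 1
print(balanced_class_weights(labels))
```

Rare classes receive weights above one and frequent classes weights below one, so each class contributes the same total weight.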


  • coef_ : ndarray of shape = [1, n_features] if n_classes == 2 else [n_classes, n_features] Weights assigned to the features.

  • intercept_ : ndarray of shape = [1] if n_classes == 2 else [n_classes] Constants in decision function.

  • n_iter_ : int The actual number of iterations to reach the stopping criterion. For multiclass fits, it is the maximum over every binary fit.

  • classes_ : ndarray of shape (n_classes,) The unique class labels.

  • t_ : int Number of weight updates performed during training. Same as (n_iter_ * n_samples).


Perceptron is a classification algorithm which shares the same underlying implementation with SGDClassifier. In fact, Perceptron() is equivalent to SGDClassifier(loss='perceptron', eta0=1, learning_rate='constant', penalty=None).


>>> from sklearn.datasets import load_digits
>>> from sklearn.linear_model import Perceptron
>>> X, y = load_digits(return_X_y=True)
>>> clf = Perceptron(tol=1e-3, random_state=0)
>>> clf.fit(X, y)
>>> clf.score(X, y)

See also



  • and references therein.


method decision_function
val decision_function :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict confidence scores for samples.

The confidence score for a sample is the signed distance of that sample to the hyperplane.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


array, shape=(n_samples,) if n_classes == 2 else (n_samples, n_classes) Confidence scores per (sample, class) combination. In the binary case, confidence score for self.classes_[1] where >0 means this class would be predicted.
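For the binary case, the relationship between the confidence score and the predicted class can be sketched as follows. This is an illustration under assumptions: the function names are hypothetical, coef and x are plain Python lists, and the score is the raw linear value w·x + b, which matches the geometric signed distance only up to the factor 1/||w||.

```python
def decision_score(coef, intercept, x):
    """Raw linear score w . x + b for a single sample."""
    return sum(w * xi for w, xi in zip(coef, x)) + intercept


def predict_binary(coef, intercept, x, classes):
    """Pick classes[1] when the score is positive, classes[0] otherwise."""
    return classes[1] if decision_score(coef, intercept, x) > 0 else classes[0]


coef, intercept = [1.0, -1.0], -0.5
classes = ["negative", "positive"]
print(decision_score(coef, intercept, [2.0, 1.0]))
print(predict_binary(coef, intercept, [2.0, 1.0], classes))
print(predict_binary(coef, intercept, [0.0, 1.0], classes))
```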


method densify
val densify :
  [> tag] Obj.t ->

Convert coefficient matrix to dense array format.

Converts the coef_ member (back) to a numpy.ndarray. This is the default format of coef_ and is required for fitting, so calling this method is only required on models that have previously been sparsified; otherwise, it is a no-op.


self Fitted estimator.


method fit
val fit :
  ?coef_init:[>`ArrayLike] Np.Obj.t ->
  ?intercept_init:[>`ArrayLike] Np.Obj.t ->
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit linear model with Stochastic Gradient Descent.


  • X : {array-like, sparse matrix}, shape (n_samples, n_features) Training data.

  • y : ndarray of shape (n_samples,) Target values.

  • coef_init : ndarray of shape (n_classes, n_features), default=None The initial coefficients to warm-start the optimization.

  • intercept_init : ndarray of shape (n_classes,), default=None The initial intercept to warm-start the optimization.

  • sample_weight : array-like, shape (n_samples,), default=None Weights applied to individual samples. If not provided, uniform weights are assumed. These weights will be multiplied with class_weight (passed through the constructor) if class_weight is specified.


self : Returns an instance of self.


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method partial_fit
val partial_fit :
  ?classes:[>`ArrayLike] Np.Obj.t ->
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Perform one epoch of stochastic gradient descent on given samples.

Internally, this method uses max_iter = 1. Therefore, it is not guaranteed that a minimum of the cost function is reached after calling it once. Matters such as objective convergence and early stopping should be handled by the user.
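Since convergence is left to the caller, one possible way to drive repeated single-epoch updates is sketched below. Everything here is an assumption made for illustration: run_epoch is a hypothetical callable that stands in for one partial_fit call followed by a loss evaluation, and the stopping rule mirrors the tol and n_iter_no_change descriptions given for create.

```python
def fit_with_manual_stopping(run_epoch, tol=1e-3, n_iter_no_change=5,
                             max_epochs=1000):
    """Call run_epoch() until the loss stops improving by at least tol.

    Returns the number of epochs run and the best loss seen.
    """
    best_loss = float("inf")
    stale_epochs = 0
    epoch = 0
    for epoch in range(1, max_epochs + 1):
        loss = run_epoch()
        # Count the epoch as stale if it did not beat the best loss by tol.
        if loss > best_loss - tol:
            stale_epochs += 1
        else:
            stale_epochs = 0
        best_loss = min(best_loss, loss)
        if stale_epochs >= n_iter_no_change:
            break
    return epoch, best_loss


# Simulated per-epoch losses in place of a real model.
simulated_losses = iter([1.0, 0.5, 0.4, 0.4, 0.4, 0.4])
print(fit_with_manual_stopping(lambda: next(simulated_losses),
                               n_iter_no_change=2))
```

In real use, run_epoch would also need to pass classes on the first partial_fit call, as described in the parameter list for this method.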


  • X : {array-like, sparse matrix}, shape (n_samples, n_features) Subset of the training data.

  • y : ndarray of shape (n_samples,) Subset of the target values.

  • classes : ndarray of shape (n_classes,), default=None Classes across all calls to partial_fit. Can be obtained via np.unique(y_all), where y_all is the target vector of the entire dataset. This argument is required for the first call to partial_fit and can be omitted in the subsequent calls. Note that y doesn't need to contain all labels in classes.

  • sample_weight : array-like, shape (n_samples,), default=None Weights applied to individual samples. If not provided, uniform weights are assumed.


self : Returns an instance of self.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict class labels for samples in X.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape [n_samples] Predicted class label per sample.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.


  • X : array-like of shape (n_samples, n_features) Test samples.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True labels for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float Mean accuracy of self.predict(X) wrt. y.


method set_params
val set_params :
  ?kwargs:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set and validate the parameters of estimator.


  • **kwargs : dict Estimator parameters.


  • self : object Estimator instance.


method sparsify
val sparsify :
  [> tag] Obj.t ->

Convert coefficient matrix to sparse format.

Converts the coef_ member to a scipy.sparse matrix, which for L1-regularized models can be much more memory- and storage-efficient than the usual numpy.ndarray representation.

The intercept_ member is not converted.


self Fitted estimator.


For non-sparse models, i.e. when there are not many zeros in coef_, this may actually increase memory usage, so use this method with care. A rule of thumb is that the number of zero elements, which can be computed with (coef_ == 0).sum(), must be more than 50% for this to provide significant benefits.

After calling this method, further fitting with the partial_fit method (if any) will not work until you call densify.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> int
val n_iter_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute classes_
val classes_ : t -> [>`ArrayLike] Np.Obj.t
val classes_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute t_
val t_ : t -> int
val t_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​PoissonRegressor wraps Python class sklearn.linear_model.PoissonRegressor.

type t


constructor and attributes create
val create :
  ?alpha:float ->
  ?fit_intercept:bool ->
  ?max_iter:int ->
  ?tol:float ->
  ?warm_start:bool ->
  ?verbose:int ->
  unit ->

Generalized Linear Model with a Poisson distribution.

Read more in the :ref:User Guide <Generalized_linear_regression>.


  • alpha : float, default=1 Constant that multiplies the penalty term and thus determines the regularization strength. alpha = 0 is equivalent to unpenalized GLMs. In this case, the design matrix X must have full column rank (no collinearities).

  • fit_intercept : bool, default=True Specifies if a constant (a.k.a. bias or intercept) should be added to the linear predictor (X @ coef + intercept).

  • max_iter : int, default=100 The maximal number of iterations for the solver.

  • tol : float, default=1e-4 Stopping criterion. For the lbfgs solver, the iteration will stop when max{ |g_j|, j = 1, ..., d} <= tol where g_j is the j-th component of the gradient (derivative) of the objective function.

  • warm_start : bool, default=False If set to True, reuse the solution of the previous call to fit as initialization for coef_ and intercept_.

  • verbose : int, default=0 For the lbfgs solver set verbose to any positive number for verbosity.


  • coef_ : array of shape (n_features,) Estimated coefficients for the linear predictor (X @ coef_ + intercept_) in the GLM.

  • intercept_ : float Intercept (a.k.a. bias) added to linear predictor.

  • n_iter_ : int Actual number of iterations used in the solver.


method fit
val fit :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit a Generalized Linear Model.


  • X : {array-like, sparse matrix} of shape (n_samples, n_features) Training data.

  • y : array-like of shape (n_samples,) Target values.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • self : returns an instance of self.


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using GLM with feature matrix X.


  • X : {array-like, sparse matrix} of shape (n_samples, n_features) Samples.


  • y_pred : array of shape (n_samples,) Returns predicted values.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Compute D^2, the percentage of deviance explained.

D^2 is a generalization of the coefficient of determination R^2. R^2 uses squared error and D^2 deviance. Note that those two are equal for family='normal'.

D^2 is defined as :math:D^2 = 1-\frac{D(y_{true},y_{pred})}{D_{null}}, where :math:D_{null} is the null deviance, i.e. the deviance of a model with intercept alone, which corresponds to :math:y_{pred} = \bar{y}. The mean :math:\bar{y} is averaged by sample_weight. The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse).
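For the Poisson family, D^2 can be computed by hand as a sanity check. The sketch below is illustrative and makes several assumptions: the function names are hypothetical, sample weights are ignored, predictions must be strictly positive, and the unit Poisson deviance 2 * (y * log(y / mu) - y + mu) is used with the y * log(y / mu) term taken as 0 when y is 0.

```python
import math


def poisson_deviance(y_true, y_pred):
    """Total Poisson deviance between targets and strictly positive predictions."""
    total = 0.0
    for y, mu in zip(y_true, y_pred):
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2.0 * (log_term - y + mu)
    return total


def d2_score(y_true, y_pred):
    """Fraction of the null deviance explained by the predictions."""
    mean_true = sum(y_true) / len(y_true)
    # Null model: an intercept-only fit that always predicts the mean.
    d_null = poisson_deviance(y_true, [mean_true] * len(y_true))
    return 1.0 - poisson_deviance(y_true, y_pred) / d_null


counts = [1.0, 2.0, 3.0]
print(d2_score(counts, [1.0, 2.0, 3.0]))  # perfect predictions
print(d2_score(counts, [2.0, 2.0, 2.0]))  # intercept-only predictions
```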


  • X : {array-like, sparse matrix} of shape (n_samples, n_features) Test samples.

  • y : array-like of shape (n_samples,) True values of target.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float D^2 of self.predict(X) w.r.t. y.


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
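The <component>__<parameter> naming convention can be illustrated with a small Python sketch that separates an estimator's own parameters from those addressed to nested components. The helper name group_nested_params is hypothetical, and the sketch only shows the naming scheme, not how the library applies the values.

```python
def group_nested_params(params):
    """Split {'alpha': ..., 'clf__tol': ...} into own and per-component dicts."""
    own, nested = {}, {}
    for key, value in params.items():
        # Split on the first double underscore only, so deeper levels
        # stay attached to the sub-key and can be resolved recursively.
        name, separator, sub_key = key.partition("__")
        if separator:
            nested.setdefault(name, {})[sub_key] = value
        else:
            own[name] = value
    return own, nested


own, nested = group_nested_params(
    {"alpha": 0.1, "clf__tol": 1e-4, "clf__inner__C": 2}
)
print(own)
print(nested)
```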


  • **params : dict Estimator parameters.


  • self : object Estimator instance.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> int
val n_iter_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​RANSACRegressor wraps Python class sklearn.linear_model.RANSACRegressor.

type t


constructor and attributes create
val create :
  ?base_estimator:[>`BaseEstimator] Np.Obj.t ->
  ?min_samples:[`Float_0_1_ of Py.Object.t | `I of int] ->
  ?residual_threshold:float ->
  ?is_data_valid:Py.Object.t ->
  ?is_model_valid:Py.Object.t ->
  ?max_trials:int ->
  ?max_skips:int ->
  ?stop_n_inliers:int ->
  ?stop_score:float ->
  ?stop_probability:float ->
  ?loss:[`S of string | `Callable of Py.Object.t] ->
  ?random_state:int ->
  unit ->

RANSAC (RANdom SAmple Consensus) algorithm.

RANSAC is an iterative algorithm for the robust estimation of parameters from a subset of inliers from the complete data set.

Read more in the :ref:User Guide <ransac_regression>.


  • base_estimator : object, optional Base estimator object which implements the following methods:

    • fit(X, y): Fit model to given training data and target values.
    • score(X, y): Returns the mean accuracy on the given test data, which is used for the stop criterion defined by stop_score. Additionally, the score is used to decide which of two equally large consensus sets is chosen as the better one.
    • predict(X): Returns predicted values using the linear model, which is used to compute residual error using loss function.

    If base_estimator is None, then base_estimator=sklearn.linear_model.LinearRegression() is used for target values of dtype float.

    Note that the current implementation only supports regression estimators.

  • min_samples : int (>= 1) or float ([0, 1]), optional Minimum number of samples chosen randomly from original data. Treated as an absolute number of samples for min_samples >= 1, treated as a relative number ceil(min_samples * X.shape[0]) for min_samples < 1. This is typically chosen as the minimal number of samples necessary to estimate the given base_estimator. By default a sklearn.linear_model.LinearRegression() estimator is assumed and min_samples is chosen as X.shape[1] + 1.

  • residual_threshold : float, optional Maximum residual for a data sample to be classified as an inlier. By default the threshold is chosen as the MAD (median absolute deviation) of the target values y.

  • is_data_valid : callable, optional This function is called with the randomly selected data before the model is fitted to it: is_data_valid(X, y). If its return value is False the current randomly chosen sub-sample is skipped.

  • is_model_valid : callable, optional This function is called with the estimated model and the randomly selected data: is_model_valid(model, X, y). If its return value is False the current randomly chosen sub-sample is skipped. Rejecting samples with this function is computationally costlier than with is_data_valid. is_model_valid should therefore only be used if the estimated model is needed for making the rejection decision.

  • max_trials : int, optional Maximum number of iterations for random sample selection.

  • max_skips : int, optional Maximum number of iterations that can be skipped due to finding zero inliers or invalid data defined by is_data_valid or invalid models defined by is_model_valid.

    .. versionadded:: 0.19

  • stop_n_inliers : int, optional Stop iteration if at least this number of inliers are found.

  • stop_score : float, optional Stop iteration if the score is greater than or equal to this threshold.

  • stop_probability : float in range [0, 1], optional RANSAC iteration stops if at least one outlier-free set of the training data is sampled in RANSAC. This requires generating at least N samples (iterations)::

    N >= log(1 - probability) / log(1 - e**m)

    where the probability (confidence) is typically set to a high value such as 0.99 (the default), e is the current fraction of inliers w.r.t. the total number of samples, and m is the number of samples drawn per trial (min_samples).

  • loss : string, callable, optional, default 'absolute_loss' The string inputs 'absolute_loss' and 'squared_loss' are supported; they compute the absolute loss and the squared loss per sample, respectively.

    If loss is a callable, then it should be a function that takes two arrays as inputs, the true and predicted value and returns a 1-D array with the i-th value of the array corresponding to the loss on X[i].

    If the loss on a sample is greater than the residual_threshold, then this sample is classified as an outlier.

    .. versionadded:: 0.18

  • random_state : int, RandomState instance, default=None The generator used to initialize the centers. Pass an int for reproducible output across multiple function calls. See :term:Glossary <random_state>.
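Two of the rules in the parameter list above, how min_samples is interpreted and the bound N >= log(1 - probability) / log(1 - e**m) behind stop_probability, can be sketched in Python. The function names are hypothetical and the code is an illustration of the stated formulas, not the library's internal implementation; input validation is kept minimal.

```python
import math


def resolve_min_samples(min_samples, n_samples, n_features):
    """Turn the min_samples setting into an absolute sample count."""
    if min_samples is None:
        # Default stated above for a LinearRegression base estimator.
        return n_features + 1
    if 0 < min_samples < 1:
        # Relative value: a fraction of the available samples, rounded up.
        return math.ceil(min_samples * n_samples)
    # Values >= 1 are treated as absolute counts.
    return int(min_samples)


def required_trials(inlier_fraction, min_samples, probability=0.99):
    """Smallest integer N satisfying the stop_probability bound."""
    if not 0.0 < inlier_fraction < 1.0 or not 0.0 < probability < 1.0:
        raise ValueError("both fractions must lie strictly between 0 and 1")
    # Chance that a single random draw of min_samples points has no outlier.
    p_clean = inlier_fraction ** min_samples
    if p_clean == 0.0:
        return math.inf  # underflow: an outlier-free draw is practically impossible
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - p_clean))


m = resolve_min_samples(None, n_samples=200, n_features=2)
print(m, required_trials(inlier_fraction=0.5, min_samples=m))
```

The required number of trials grows quickly as the inlier fraction drops or as min_samples rises, which is one reason to keep min_samples close to the minimum the base estimator needs.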


  • estimator_ : object Best fitted model (copy of the base_estimator object).

  • n_trials_ : int Number of random selection trials until one of the stop criteria is met. It is always <= max_trials.

  • inlier_mask_ : bool array of shape [n_samples] Boolean mask of inliers classified as True.

  • n_skips_no_inliers_ : int Number of iterations skipped due to finding zero inliers.

    .. versionadded:: 0.19

  • n_skips_invalid_data_ : int Number of iterations skipped due to invalid data defined by is_data_valid.

    .. versionadded:: 0.19

  • n_skips_invalid_model_ : int Number of iterations skipped due to an invalid model defined by is_model_valid.

    .. versionadded:: 0.19


>>> from sklearn.linear_model import RANSACRegressor
>>> from sklearn.datasets import make_regression
>>> X, y = make_regression(
...     n_samples=200, n_features=2, noise=4.0, random_state=0)
>>> reg = RANSACRegressor(random_state=0).fit(X, y)
>>> reg.score(X, y)
>>> reg.predict(X[:1,])




method fit
val fit :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit estimator using RANSAC algorithm.


  • X : array-like or sparse matrix, shape [n_samples, n_features] Training data.

  • y : array-like of shape (n_samples,) or (n_samples, n_targets) Target values.

  • sample_weight : array-like of shape (n_samples,), default=None Individual weights for each sample. Raises an error if sample_weight is passed and the fit method of base_estimator does not support it.

    .. versionadded:: 0.18


ValueError If no valid consensus set could be found. This occurs if is_data_valid and is_model_valid return False for all max_trials randomly chosen sub-samples.
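The fitting procedure can be illustrated with a simplified RANSAC loop for a one-dimensional line. This is a sketch under stated assumptions rather than scikit-learn's implementation: the helpers fit_line and ransac_line are hypothetical, the base estimator is replaced by ordinary least squares, and the stop criteria, is_data_valid and is_model_valid are left out.

```python
import random

def fit_line(points):
    """Ordinary least squares for y = a*x + b on a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0.0:
        return None  # degenerate sample: all x equal, no unique line
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    return a, mean_y - a * mean_x

def ransac_line(points, residual_threshold, max_trials=100, min_samples=2, seed=0):
    rng = random.Random(seed)
    best_model, best_mask, best_count = None, None, 0
    for _ in range(max_trials):
        # 1. Fit a candidate model on a random sub-sample.
        model = fit_line(rng.sample(points, min_samples))
        if model is None:
            continue  # skipped trial
        a, b = model
        # 2. Samples whose absolute loss is within the threshold are inliers.
        mask = [abs(y - (a * x + b)) <= residual_threshold for x, y in points]
        count = sum(mask)
        # 3. Keep the largest consensus set and refit on all of its inliers.
        if count > best_count:
            best_count, best_mask = count, mask
            best_model = fit_line([p for p, keep in zip(points, mask) if keep])
    if best_model is None:
        raise ValueError("no valid consensus set could be found")
    return best_model, best_mask

# Points on y = 2x + 1, with three gross outliers at x = 3, 11 and 17.
points = [(float(x), 2.0 * x + 1.0) for x in range(20)]
points[3], points[11], points[17] = (3.0, 40.0), (11.0, -30.0), (17.0, 90.0)
model, mask = ransac_line(points, residual_threshold=0.5)
print(model, sum(mask))
```

The returned mask plays the role of inlier_mask_, and the refitted line plays the role of estimator_.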


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using the estimated model.

This is a wrapper for estimator_.predict(X).


  • X : numpy array of shape [n_samples, n_features]


  • y : array, shape = [n_samples] or [n_samples, n_targets] Returns predicted values.


method score
val score :
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Returns the score of the prediction.

This is a wrapper for estimator_.score(X, y).


  • X : numpy array or sparse matrix of shape [n_samples, n_features] Training data.

  • y : array, shape = [n_samples] or [n_samples, n_targets] Target values.


  • z : float Score of the prediction.


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.


  • **params : dict Estimator parameters.


  • self : object Estimator instance.
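The <component>__<parameter> naming convention can be illustrated with a small helper that groups keys by component. This is only a sketch of the naming scheme; split_nested_params is a hypothetical function, not part of the library or of these bindings.

```python
def split_nested_params(params):
    """Group 'component__parameter' keys by component.

    Keys without a double underscore belong to the estimator itself and are
    collected under the key None.
    """
    grouped = {}
    for key, value in params.items():
        # Split at the first double underscore only, so deeper nesting such
        # as 'a__b__c' is passed on to component 'a' as 'b__c'.
        component, delimiter, sub_key = key.partition("__")
        if delimiter:
            grouped.setdefault(component, {})[sub_key] = value
        else:
            grouped.setdefault(None, {})[key] = value
    return grouped

grouped = split_nested_params(
    {"max_trials": 50, "base_estimator__fit_intercept": False}
)
print(grouped)
```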


attribute estimator_
val estimator_ : t -> Py.Object.t
val estimator_opt : t -> (Py.Object.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_trials_
val n_trials_ : t -> int
val n_trials_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute inlier_mask_
val inlier_mask_ : t -> [>`ArrayLike] Np.Obj.t
val inlier_mask_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_skips_no_inliers_
val n_skips_no_inliers_ : t -> int
val n_skips_no_inliers_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_skips_invalid_data_
val n_skips_invalid_data_ : t -> int
val n_skips_invalid_data_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_skips_invalid_model_
val n_skips_invalid_model_ : t -> int
val n_skips_invalid_model_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​Ridge wraps Python class sklearn.linear_model.Ridge.

type t


constructor and attributes create
val create :
  ?alpha:[>`ArrayLike] Np.Obj.t ->
  ?fit_intercept:bool ->
  ?normalize:bool ->
  ?copy_X:bool ->
  ?max_iter:int ->
  ?tol:float ->
  ?solver:[`Auto | `Svd | `Cholesky | `Lsqr | `Sparse_cg | `Sag | `Saga] ->
  ?random_state:int ->
  unit ->

Linear least squares with l2 regularization.

Minimizes the objective function::

||y - Xw||^2_2 + alpha * ||w||^2_2

This model solves a regression model where the loss function is the linear least squares function and regularization is given by the l2-norm. Also known as Ridge Regression or Tikhonov regularization. This estimator has built-in support for multi-variate regression (i.e., when y is a 2d-array of shape (n_samples, n_targets)).
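For intuition, the objective above has a simple closed-form minimizer when there is a single feature and no intercept. The sketch below uses plain Python and hypothetical helper names; it illustrates the effect of alpha and is not one of the solvers listed for this class.

```python
def ridge_objective(w, xs, ys, alpha):
    """||y - x*w||^2 + alpha * w^2 for a single feature and no intercept."""
    residual = sum((y - x * w) ** 2 for x, y in zip(xs, ys))
    return residual + alpha * w ** 2

def ridge_weight(xs, ys, alpha):
    """Closed-form minimizer: w = sum(x*y) / (sum(x*x) + alpha)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + alpha)

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # exactly y = 2x
for alpha in (0.0, 1.0, 14.0):
    w = ridge_weight(xs, ys, alpha)
    print(alpha, w, ridge_objective(w, xs, ys, alpha))
```

With alpha = 0 the weight equals the least-squares slope; larger values of alpha shrink it towards zero.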

Read more in the :ref:User Guide <ridge_regression>.


  • alpha : {float, ndarray of shape (n_targets,)}, default=1.0 Regularization strength; must be a positive float. Regularization improves the conditioning of the problem and reduces the variance of the estimates. Larger values specify stronger regularization. Alpha corresponds to 1 / (2C) in other linear models such as :class:~sklearn.linear_model.LogisticRegression or :class:sklearn.svm.LinearSVC. If an array is passed, penalties are assumed to be specific to the targets. Hence they must correspond in number.

  • fit_intercept : bool, default=True Whether to fit the intercept for this model. If set to false, no intercept will be used in calculations (i.e. X and y are expected to be centered).

  • normalize : bool, default=False This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use :class:sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.

  • copy_X : bool, default=True If True, X will be copied; else, it may be overwritten.

  • max_iter : int, default=None Maximum number of iterations for conjugate gradient solver. For 'sparse_cg' and 'lsqr' solvers, the default value is determined by scipy.sparse.linalg. For 'sag' solver, the default value is 1000.

  • tol : float, default=1e-3 Precision of the solution.

  • solver : {'auto', 'svd', 'cholesky', 'lsqr', 'sparse_cg', 'sag', 'saga'}, default='auto' Solver to use in the computational routines:

    • 'auto' chooses the solver automatically based on the type of data.

    • 'svd' uses a Singular Value Decomposition of X to compute the Ridge coefficients. More stable for singular matrices than 'cholesky'.

    • 'cholesky' uses the standard scipy.linalg.solve function to obtain a closed-form solution.

    • 'sparse_cg' uses the conjugate gradient solver as found in scipy.sparse.linalg.cg. As an iterative algorithm, this solver is more appropriate than 'cholesky' for large-scale data (possibility to set tol and max_iter).

    • 'lsqr' uses the dedicated regularized least-squares routine scipy.sparse.linalg.lsqr. It is the fastest and uses an iterative procedure.

    • 'sag' uses a Stochastic Average Gradient descent, and 'saga' uses its improved, unbiased version named SAGA. Both methods also use an iterative procedure, and are often faster than other solvers when both n_samples and n_features are large. Note that 'sag' and 'saga' fast convergence is only guaranteed on features with approximately the same scale. You can preprocess the data with a scaler from sklearn.preprocessing.

    The last five solvers all support both dense and sparse data. However, only 'sag' and 'sparse_cg' support sparse input when fit_intercept is True.

    .. versionadded:: 0.17 Stochastic Average Gradient descent solver.

    .. versionadded:: 0.19 SAGA solver.

  • random_state : int, RandomState instance, default=None Used when solver == 'sag' or 'saga' to shuffle the data. See :term:Glossary <random_state> for details.

    .. versionadded:: 0.17 random_state to support Stochastic Average Gradient.


  • coef_ : ndarray of shape (n_features,) or (n_targets, n_features) Weight vector(s).

  • intercept_ : float or ndarray of shape (n_targets,) Independent term in decision function. Set to 0.0 if fit_intercept = False.

  • n_iter_ : None or ndarray of shape (n_targets,) Actual number of iterations for each target. Available only for sag and lsqr solvers. Other solvers will return None.

    .. versionadded:: 0.17

See also

  • RidgeClassifier : Ridge classifier

  • RidgeCV : Ridge regression with built-in cross validation

  • :class:sklearn.kernel_ridge.KernelRidge : Kernel ridge regression combines ridge regression with the kernel trick


>>> from sklearn.linear_model import Ridge
>>> import numpy as np
>>> n_samples, n_features = 10, 5
>>> rng = np.random.RandomState(0)
>>> y = rng.randn(n_samples)
>>> X = rng.randn(n_samples, n_features)
>>> clf = Ridge(alpha=1.0)
>>> clf.fit(X, y)


method fit
val fit :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit Ridge regression model.


  • X : {ndarray, sparse matrix} of shape (n_samples, n_features) Training data

  • y : ndarray of shape (n_samples,) or (n_samples, n_targets) Target values

  • sample_weight : float or ndarray of shape (n_samples,), default=None Individual weights for each sample. If given a float, every sample will have the same weight.


  • self : returns an instance of self.


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using the linear model.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape (n_samples,) Returns predicted values.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get a R^2 score of 0.0.


  • X : array-like of shape (n_samples, n_features) Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True values for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float R^2 of self.predict(X) wrt. y.


The R2 score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of :func:~sklearn.metrics.r2_score. This influences the score method of all the multioutput regressors (except for :class:~sklearn.multioutput.MultiOutputRegressor).
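The definition of R^2 given for this method can be written out in a few lines of plain Python. This is a minimal sketch for a single output; r_squared is a hypothetical helper, and it assumes y_true is not constant, since v would otherwise be zero.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination 1 - u/v for a single output."""
    mean_true = sum(y_true) / len(y_true)
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    v = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return 1.0 - u / v  # assumes v != 0, i.e. y_true is not constant

y_true = [1.0, 2.0, 3.0, 4.0]
print(r_squared(y_true, y_true))                # perfect prediction
print(r_squared(y_true, [2.5, 2.5, 2.5, 2.5]))  # constant model at the mean
print(r_squared(y_true, [4.0, 3.0, 2.0, 1.0]))  # worse than the constant model
```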


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.


  • **params : dict Estimator parameters.


  • self : object Estimator instance.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> [>`ArrayLike] Np.Obj.t
val n_iter_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​RidgeCV wraps Python class sklearn.linear_model.RidgeCV.

type t


constructor and attributes create
val create :
  ?alphas:[>`ArrayLike] Np.Obj.t ->
  ?fit_intercept:bool ->
  ?normalize:bool ->
  ?scoring:[`Roc_auc_ovo_weighted | `Callable of Py.Object.t | `Precision | `Roc_auc_ovr | `Recall_micro | `F1_micro | `Precision_micro | `Fowlkes_mallows_score | `F1 | `Jaccard | `Max_error | `Precision_weighted | `Precision_macro | `Neg_brier_score | `Roc_auc_ovo | `F1_weighted | `Average_precision | `Adjusted_mutual_info_score | `Neg_mean_poisson_deviance | `Neg_median_absolute_error | `Jaccard_macro | `Jaccard_micro | `Neg_log_loss | `Recall_samples | `Explained_variance | `Balanced_accuracy | `Normalized_mutual_info_score | `F1_samples | `Completeness_score | `Mutual_info_score | `Accuracy | `Neg_mean_squared_log_error | `Roc_auc | `Precision_samples | `V_measure_score | `Neg_mean_gamma_deviance | `Jaccard_weighted | `R2 | `Recall_weighted | `Recall_macro | `Roc_auc_ovr_weighted | `Homogeneity_score | `Neg_mean_squared_error | `Neg_root_mean_squared_error | `Recall | `Neg_mean_absolute_error | `Adjusted_rand_score | `Jaccard_samples | `F1_macro] ->
  ?cv:[`BaseCrossValidator of [>`BaseCrossValidator] Np.Obj.t | `I of int | `Arr of [>`ArrayLike] Np.Obj.t] ->
  ?gcv_mode:[`Svd | `Eigen | `Auto] ->
  ?store_cv_values:bool ->
  unit ->

Ridge regression with built-in cross-validation.

See glossary entry for :term:cross-validation estimator.

By default, it performs Generalized Cross-Validation, which is a form of efficient Leave-One-Out cross-validation.
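The efficiency comes from the fact that, for ridge regression, leave-one-out residuals can be obtained from a single fit instead of n refits. The sketch below checks this identity for one feature without an intercept; the helper names are hypothetical, and the code is an illustration of the idea rather than the library's implementation.

```python
def loo_errors_shortcut(xs, ys, alpha):
    """Leave-one-out residuals from a single fit: e_i / (1 - h_i)."""
    s = sum(x * x for x in xs) + alpha
    w = sum(x * y for x, y in zip(xs, ys)) / s
    # h_i = x_i^2 / s is the leverage of sample i in the ridge fit.
    return [(y - x * w) / (1.0 - x * x / s) for x, y in zip(xs, ys)]

def loo_errors_brute_force(xs, ys, alpha):
    """Leave-one-out residuals by refitting once per held-out sample."""
    errors = []
    for i, (x_i, y_i) in enumerate(zip(xs, ys)):
        rest = [(x, y) for j, (x, y) in enumerate(zip(xs, ys)) if j != i]
        s = sum(x * x for x, _ in rest) + alpha
        w = sum(x * y for x, y in rest) / s
        errors.append(y_i - x_i * w)
    return errors

def mean_squared(values):
    return sum(v * v for v in values) / len(values)

xs = [0.5, 1.0, 1.5, 2.0, 2.5]
ys = [1.1, 1.9, 3.2, 3.9, 5.1]
alphas = (0.1, 1.0, 10.0)
# Score every candidate alpha by its mean squared leave-one-out error.
scores = {a: mean_squared(loo_errors_shortcut(xs, ys, a)) for a in alphas}
best_alpha = min(scores, key=scores.get)
print(scores, best_alpha)
```

Both functions return the same residuals, but the shortcut needs only one fit per candidate alpha.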

Read more in the :ref:User Guide <ridge_regression>.


  • alphas : ndarray of shape (n_alphas,), default=(0.1, 1.0, 10.0) Array of alpha values to try. Regularization strength; must be a positive float. Regularization improves the conditioning of the problem and reduces the variance of the estimates. Larger values specify stronger regularization. Alpha corresponds to 1 / (2C) in other linear models such as :class:~sklearn.linear_model.LogisticRegression or :class:sklearn.svm.LinearSVC. If using generalized cross-validation, alphas must be positive.

  • fit_intercept : bool, default=True Whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (i.e. data is expected to be centered).

  • normalize : bool, default=False This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use :class:sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.

  • scoring : string, callable, default=None A string (see model evaluation documentation) or a scorer callable object / function with signature scorer(estimator, X, y). If None, the negative mean squared error is used when cv is 'auto' or None (i.e. when using generalized cross-validation), and the r2 score otherwise.

  • cv : int, cross-validation generator or an iterable, default=None Determines the cross-validation splitting strategy. Possible inputs for cv are:

    • None, to use the efficient Leave-One-Out cross-validation (also known as Generalized Cross-Validation).
    • integer, to specify the number of folds.
    • :term:CV splitter,
    • An iterable yielding (train, test) splits as arrays of indices.

    For integer/None inputs, if y is binary or multiclass, :class:sklearn.model_selection.StratifiedKFold is used, else, :class:sklearn.model_selection.KFold is used.

    Refer to the :ref:User Guide <cross_validation> for the various cross-validation strategies that can be used here.

  • gcv_mode : {'auto', 'svd', 'eigen'}, default='auto' Flag indicating which strategy to use when performing Generalized Cross-Validation. Options are::

    'auto' : use 'svd' if n_samples > n_features, otherwise use 'eigen'
    'svd' : force use of singular value decomposition of X when X is
        dense, eigenvalue decomposition of X^T.X when X is sparse.
    'eigen' : force computation via eigendecomposition of X.X^T

    The 'auto' mode is the default and is intended to pick the cheaper option of the two depending on the shape of the training data.

  • store_cv_values : bool, default=False Flag indicating if the cross-validation values corresponding to each alpha should be stored in the cv_values_ attribute (see below). This flag is only compatible with cv=None (i.e. using Generalized Cross-Validation).


  • cv_values_ : ndarray of shape (n_samples, n_alphas) or shape (n_samples, n_targets, n_alphas), optional Cross-validation values for each alpha (only available if store_cv_values=True and cv=None). After fit() has been called, this attribute will contain the mean squared errors (by default) or the values of the {loss,score}_func function (if provided in the constructor).

  • coef_ : ndarray of shape (n_features) or (n_targets, n_features) Weight vector(s).

  • intercept_ : float or ndarray of shape (n_targets,) Independent term in decision function. Set to 0.0 if fit_intercept = False.

  • alpha_ : float Estimated regularization parameter.

  • best_score_ : float Score of base estimator with best alpha.


>>> from sklearn.datasets import load_diabetes
>>> from sklearn.linear_model import RidgeCV
>>> X, y = load_diabetes(return_X_y=True)
>>> clf = RidgeCV(alphas=[1e-3, 1e-2, 1e-1, 1]).fit(X, y)
>>> clf.score(X, y)

See also

  • Ridge : Ridge regression

  • RidgeClassifier : Ridge classifier

  • RidgeClassifierCV : Ridge classifier with built-in cross validation


method fit
val fit :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit Ridge regression model with cv.


  • X : ndarray of shape (n_samples, n_features) Training data. If using GCV, will be cast to float64 if necessary.

  • y : ndarray of shape (n_samples,) or (n_samples, n_targets) Target values. Will be cast to X's dtype if necessary.

  • sample_weight : float or ndarray of shape (n_samples,), default=None Individual weights for each sample. If given a float, every sample will have the same weight.


  • self : object


When sample_weight is provided, the selected hyperparameter may depend on whether we use generalized cross-validation (cv=None or cv='auto') or another form of cross-validation, because only generalized cross-validation takes the sample weights into account when computing the validation score.


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using the linear model.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape (n_samples,) Returns predicted values.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get a R^2 score of 0.0.


  • X : array-like of shape (n_samples, n_features) Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True values for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float R^2 of self.predict(X) wrt. y.


The R2 score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of :func:~sklearn.metrics.r2_score. This influences the score method of all the multioutput regressors (except for :class:~sklearn.multioutput.MultiOutputRegressor).


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.


  • **params : dict Estimator parameters.


  • self : object Estimator instance.


attribute cv_values_
val cv_values_ : t -> Py.Object.t
val cv_values_opt : t -> (Py.Object.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute alpha_
val alpha_ : t -> float
val alpha_opt : t -> (float) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute best_score_
val best_score_ : t -> float
val best_score_opt : t -> (float) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​RidgeClassifier wraps Python class sklearn.linear_model.RidgeClassifier.

type t


constructor and attributes create
val create :
  ?alpha:float ->
  ?fit_intercept:bool ->
  ?normalize:bool ->
  ?copy_X:bool ->
  ?max_iter:int ->
  ?tol:float ->
  ?class_weight:[`Balanced | `DictIntToFloat of (int * float) list] ->
  ?solver:[`Auto | `Svd | `Cholesky | `Lsqr | `Sparse_cg | `Sag | `Saga] ->
  ?random_state:int ->
  unit ->

Classifier using Ridge regression.

This classifier first converts the target values into {-1, 1} and then treats the problem as a regression task (multi-output regression in the multiclass case).
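The conversion described above can be sketched for the binary case with a single feature. This is an illustration in plain Python with hypothetical helper names, assuming an unpenalized intercept obtained by centering; it is not the library's code.

```python
def fit_binary_ridge(xs, labels, alpha=1.0):
    """Encode two classes as -1/+1 and fit a 1-D ridge model with intercept."""
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    targets = [1.0 if label == classes[1] else -1.0 for label in labels]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(targets) / n
    # Centering x and the targets handles the intercept; only w is penalized.
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, targets))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxt / (sxx + alpha)
    b = mean_t - w * mean_x
    return classes, w, b

def decision_function(x, w, b):
    """Regression output, used as the confidence score."""
    return w * x + b

def predict(x, classes, w, b):
    """A positive score selects classes[1], otherwise classes[0]."""
    return classes[1] if decision_function(x, w, b) > 0 else classes[0]

xs = [0.0, 1.0, 2.0, 8.0, 9.0, 10.0]
labels = ["a", "a", "a", "b", "b", "b"]
classes, w, b = fit_binary_ridge(xs, labels, alpha=1.0)
print([predict(x, classes, w, b) for x in xs])
```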

Read more in the :ref:User Guide <ridge_regression>.


  • alpha : float, default=1.0 Regularization strength; must be a positive float. Regularization improves the conditioning of the problem and reduces the variance of the estimates. Larger values specify stronger regularization. Alpha corresponds to 1 / (2C) in other linear models such as :class:~sklearn.linear_model.LogisticRegression or :class:sklearn.svm.LinearSVC.

  • fit_intercept : bool, default=True Whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (e.g. data is expected to be already centered).

  • normalize : bool, default=False This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use :class:sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.

  • copy_X : bool, default=True If True, X will be copied; else, it may be overwritten.

  • max_iter : int, default=None Maximum number of iterations for conjugate gradient solver. The default value is determined by scipy.sparse.linalg.

  • tol : float, default=1e-3 Precision of the solution.

  • class_weight : dict or 'balanced', default=None Weights associated with classes in the form {class_label: weight}. If not given, all classes are supposed to have weight one.

    The 'balanced' mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y)).

  • solver : {'auto', 'svd', 'cholesky', 'lsqr', 'sparse_cg', 'sag', 'saga'}, default='auto' Solver to use in the computational routines:

    • 'auto' chooses the solver automatically based on the type of data.

    • 'svd' uses a Singular Value Decomposition of X to compute the Ridge coefficients. More stable for singular matrices than 'cholesky'.

    • 'cholesky' uses the standard scipy.linalg.solve function to obtain a closed-form solution.

    • 'sparse_cg' uses the conjugate gradient solver as found in scipy.sparse.linalg.cg. As an iterative algorithm, this solver is more appropriate than 'cholesky' for large-scale data (possibility to set tol and max_iter).

    • 'lsqr' uses the dedicated regularized least-squares routine scipy.sparse.linalg.lsqr. It is the fastest and uses an iterative procedure.

    • 'sag' uses a Stochastic Average Gradient descent, and 'saga' uses its unbiased and more flexible version named SAGA. Both methods use an iterative procedure, and are often faster than other solvers when both n_samples and n_features are large. Note that 'sag' and 'saga' fast convergence is only guaranteed on features with approximately the same scale. You can preprocess the data with a scaler from sklearn.preprocessing.

    .. versionadded:: 0.17 Stochastic Average Gradient descent solver.

    .. versionadded:: 0.19 SAGA solver.

  • random_state : int, RandomState instance, default=None Used when solver == 'sag' or 'saga' to shuffle the data. See :term:Glossary <random_state> for details.
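The 'balanced' formula given for class_weight, n_samples / (n_classes * np.bincount(y)), can be reproduced without NumPy. The sketch below uses a hypothetical helper, balanced_class_weights, and counts labels with collections.Counter instead of np.bincount.

```python
from collections import Counter

def balanced_class_weights(y):
    """Weight each class by n_samples / (n_classes * count of that class)."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}

# Three samples of class 0 and one of class 1: the rare class gets more weight.
weights = balanced_class_weights([0, 0, 0, 1])
print(weights)
```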


  • coef_ : ndarray of shape (1, n_features) or (n_classes, n_features) Coefficient of the features in the decision function.

    coef_ is of shape (1, n_features) when the given problem is binary.

  • intercept_ : float or ndarray of shape (n_targets,) Independent term in decision function. Set to 0.0 if fit_intercept = False.

  • n_iter_ : None or ndarray of shape (n_targets,) Actual number of iterations for each target. Available only for sag and lsqr solvers. Other solvers will return None.

  • classes_ : ndarray of shape (n_classes,) The class labels.

See Also

  • Ridge : Ridge regression.

  • RidgeClassifierCV : Ridge classifier with built-in cross validation.


For multi-class classification, n_class classifiers are trained in a one-versus-all approach. Concretely, this is implemented by taking advantage of the multi-variate response support in Ridge.


>>> from sklearn.datasets import load_breast_cancer
>>> from sklearn.linear_model import RidgeClassifier
>>> X, y = load_breast_cancer(return_X_y=True)
>>> clf = RidgeClassifier().fit(X, y)
>>> clf.score(X, y)


method decision_function
val decision_function :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict confidence scores for samples.

The confidence score for a sample is the signed distance of that sample to the hyperplane.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


array, shape=(n_samples,) if n_classes == 2 else (n_samples, n_classes) Confidence scores per (sample, class) combination. In the binary case, confidence score for self.classes_[1] where >0 means this class would be predicted.


method fit
val fit :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit Ridge classifier model.


  • X : {ndarray, sparse matrix} of shape (n_samples, n_features) Training data.

  • y : ndarray of shape (n_samples,) Target values.

  • sample_weight : float or ndarray of shape (n_samples,), default=None Individual weights for each sample. If given a float, every sample will have the same weight.

    .. versionadded:: 0.17 sample_weight support to Classifier.


  • self : object Instance of the estimator.


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict class labels for samples in X.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape [n_samples] Predicted class label per sample.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.


  • X : array-like of shape (n_samples, n_features) Test samples.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True labels for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float Mean accuracy of self.predict(X) wrt. y.


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.


  • **params : dict Estimator parameters.


  • self : object Estimator instance.
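
The `<component>__<parameter>` naming convention can be sketched as a split on the first double underscore. This is only an illustration of the convention; `split_param` is a hypothetical helper, not a function of the library.

```python
def split_param(name):
    """Split 'component__parameter' into (component, parameter).

    Names without a double underscore belong to the estimator itself,
    which is signalled here by a component of None.
    """
    component, sep, rest = name.partition("__")
    return (component, rest) if sep else (None, name)


nested = split_param("sgdclassifier__alpha")
deeper = split_param("pipeline__sgdclassifier__alpha")
plain = split_param("alpha")
```

Splitting only on the first separator leaves the remainder to be resolved by the nested component, which is how arbitrarily deep nesting can be supported.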


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> [>`ArrayLike] Np.Obj.t
val n_iter_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute classes_
val classes_ : t -> [>`ArrayLike] Np.Obj.t
val classes_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Return a human-readable string representation of the object.


method show
val show: t -> string

Return a human-readable string representation of the object.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​RidgeClassifierCV wraps Python class sklearn.linear_model.RidgeClassifierCV.

type t


constructor and attributes create
val create :
  ?alphas:[>`ArrayLike] Np.Obj.t ->
  ?fit_intercept:bool ->
  ?normalize:bool ->
  ?scoring:[`Roc_auc_ovo_weighted | `Callable of Py.Object.t | `Precision | `Roc_auc_ovr | `Recall_micro | `F1_micro | `Precision_micro | `Fowlkes_mallows_score | `F1 | `Jaccard | `Max_error | `Precision_weighted | `Precision_macro | `Neg_brier_score | `Roc_auc_ovo | `F1_weighted | `Average_precision | `Adjusted_mutual_info_score | `Neg_mean_poisson_deviance | `Neg_median_absolute_error | `Jaccard_macro | `Jaccard_micro | `Neg_log_loss | `Recall_samples | `Explained_variance | `Balanced_accuracy | `Normalized_mutual_info_score | `F1_samples | `Completeness_score | `Mutual_info_score | `Accuracy | `Neg_mean_squared_log_error | `Roc_auc | `Precision_samples | `V_measure_score | `Neg_mean_gamma_deviance | `Jaccard_weighted | `R2 | `Recall_weighted | `Recall_macro | `Roc_auc_ovr_weighted | `Homogeneity_score | `Neg_mean_squared_error | `Neg_root_mean_squared_error | `Recall | `Neg_mean_absolute_error | `Adjusted_rand_score | `Jaccard_samples | `F1_macro] ->
  ?cv:[`BaseCrossValidator of [>`BaseCrossValidator] Np.Obj.t | `I of int | `Arr of [>`ArrayLike] Np.Obj.t] ->
  ?class_weight:[`Balanced | `DictIntToFloat of (int * float) list] ->
  ?store_cv_values:bool ->
  unit ->

Ridge classifier with built-in cross-validation.

See glossary entry for :term:cross-validation estimator.

By default, it performs Generalized Cross-Validation, which is a form of efficient Leave-One-Out cross-validation. Currently, only the n_features > n_samples case is handled efficiently.
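
The idea behind efficient Leave-One-Out can be sketched on a toy problem. For ridge regression, the leave-one-out residual of sample i equals its ordinary residual divided by (1 - h_ii), where h_ii is the i-th diagonal entry of the hat matrix, so a single fit is enough. The sketch below assumes one feature and no intercept so that everything stays in closed form; it is not the library's implementation, and all function names are hypothetical.

```python
def ridge_1d(xs, ys, alpha):
    """Closed-form ridge weight for one feature and no intercept."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / (sxx + alpha)


def loo_errors_shortcut(xs, ys, alpha):
    """Leave-one-out residuals from a single fit: e_i / (1 - h_ii)."""
    sxx = sum(x * x for x in xs)
    w = ridge_1d(xs, ys, alpha)
    # With one feature, h_ii = x_i^2 / (sum of x^2 + alpha).
    return [(y - w * x) / (1.0 - x * x / (sxx + alpha)) for x, y in zip(xs, ys)]


def loo_errors_refit(xs, ys, alpha):
    """Leave-one-out residuals by refitting n times (slow reference)."""
    errors = []
    for i in range(len(xs)):
        w = ridge_1d(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:], alpha)
        errors.append(ys[i] - w * xs[i])
    return errors


xs, ys = [1.0, 2.0, 3.0], [1.0, 2.0, 2.0]
fast = loo_errors_shortcut(xs, ys, alpha=1.0)
slow = loo_errors_refit(xs, ys, alpha=1.0)
```

Both lists agree up to rounding, which is why one fit per alpha value is enough to score that alpha.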

Read more in the :ref:User Guide <ridge_regression>.


  • alphas : ndarray of shape (n_alphas,), default=(0.1, 1.0, 10.0) Array of alpha values to try. Regularization strength; must be a positive float. Regularization improves the conditioning of the problem and reduces the variance of the estimates. Larger values specify stronger regularization. Alpha corresponds to 1 / (2C) in other linear models such as :class:~sklearn.linear_model.LogisticRegression or :class:sklearn.svm.LinearSVC.

  • fit_intercept : bool, default=True Whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (i.e. data is expected to be centered).

  • normalize : bool, default=False This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use :class:sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.

  • scoring : string, callable, default=None A string (see model evaluation documentation) or a scorer callable object / function with signature scorer(estimator, X, y).

  • cv : int, cross-validation generator or an iterable, default=None Determines the cross-validation splitting strategy. Possible inputs for cv are:

    • None, to use the efficient Leave-One-Out cross-validation
    • integer, to specify the number of folds.
    • :term:CV splitter,
    • An iterable yielding (train, test) splits as arrays of indices.
    Refer to the :ref:User Guide <cross_validation> for the various cross-validation strategies that can be used here.

  • class_weight : dict or 'balanced', default=None Weights associated with classes in the form {class_label: weight}. If not given, all classes are supposed to have weight one.

    The 'balanced' mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y))

  • store_cv_values : bool, default=False Flag indicating if the cross-validation values corresponding to each alpha should be stored in the cv_values_ attribute (see below). This flag is only compatible with cv=None (i.e. using Generalized Cross-Validation).
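
The 'balanced' formula for class_weight, n_samples / (n_classes * np.bincount(y)), can be reproduced with the standard library alone. The sketch below is illustrative; `balanced_class_weights` is a hypothetical helper name.

```python
from collections import Counter


def balanced_class_weights(y):
    """Weights inversely proportional to class frequencies."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in sorted(counts.items())
    }


# Three samples of class 0 and one of class 1: the rare class gets the larger weight.
weights = balanced_class_weights([0, 0, 0, 1])
```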


  • cv_values_ : ndarray of shape (n_samples, n_targets, n_alphas), optional Cross-validation values for each alpha (if store_cv_values=True and cv=None). After fit() has been called, this attribute will contain the mean squared errors (by default) or the values of the {loss,score}_func function (if provided in the constructor). This attribute exists only when store_cv_values is True.

  • coef_ : ndarray of shape (1, n_features) or (n_targets, n_features) Coefficient of the features in the decision function.

    coef_ is of shape (1, n_features) when the given problem is binary.

  • intercept_ : float or ndarray of shape (n_targets,) Independent term in decision function. Set to 0.0 if fit_intercept = False.

  • alpha_ : float Estimated regularization parameter.

  • best_score_ : float Score of base estimator with best alpha.

  • classes_ : ndarray of shape (n_classes,) The class labels.


>>> from sklearn.datasets import load_breast_cancer
>>> from sklearn.linear_model import RidgeClassifierCV
>>> X, y = load_breast_cancer(return_X_y=True)
>>> clf = RidgeClassifierCV(alphas=[1e-3, 1e-2, 1e-1, 1]).fit(X, y)
>>> clf.score(X, y)

See also

  • Ridge : Ridge regression

  • RidgeClassifier : Ridge classifier

  • RidgeCV : Ridge regression with built-in cross validation


For multi-class classification, n_class classifiers are trained in a one-versus-all approach. Concretely, this is implemented by taking advantage of the multi-variate response support in Ridge.
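
A rough sketch of the one-versus-all encoding: each class gets its own target column, and the regression is then fitted on all columns at once. The +1 / -1 coding used here is an assumption made for illustration, and `one_vs_all_targets` is a hypothetical helper.

```python
def one_vs_all_targets(y):
    """Build one +1/-1 target column per class (assumed coding)."""
    classes = sorted(set(y))
    targets = [[1.0 if label == c else -1.0 for c in classes] for label in y]
    return classes, targets


classes, targets = one_vs_all_targets(["b", "a", "c", "a"])
# Each row has exactly one +1, in the column of the sample's own class.
```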


method decision_function
val decision_function :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict confidence scores for samples.

The confidence score for a sample is the signed distance of that sample to the hyperplane.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


array, shape=(n_samples,) if n_classes == 2 else (n_samples, n_classes) Confidence scores per (sample, class) combination. In the binary case, confidence score for self.classes_[1] where >0 means this class would be predicted.
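
The relationship between confidence scores and predicted labels can be sketched as follows. The scores are computed as a dot product plus an intercept per row of coefficients; in the binary case a positive score selects classes[1], otherwise the class with the largest score wins. The function names are hypothetical, and the inputs are plain Python lists rather than the arrays used by the library.

```python
def decision_scores(coef, intercept, x):
    """One score per row of coef: dot(row, x) + intercept."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(coef, intercept)]


def predict_label(classes, scores):
    """Turn confidence scores into a class label."""
    if len(scores) == 1:
        # Binary case: a single score, positive means classes[1].
        return classes[1] if scores[0] > 0 else classes[0]
    # Multi-class case: pick the class with the highest score.
    best = max(range(len(scores)), key=scores.__getitem__)
    return classes[best]


binary_scores = decision_scores([[1.0, -1.0]], [0.5], [1.0, 2.0])
binary_label = predict_label(["neg", "pos"], binary_scores)

multi_scores = decision_scores([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]], [0.0, 0.0, 0.0], [0.2, 0.9])
multi_label = predict_label(["a", "b", "c"], multi_scores)
```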


method fit
val fit :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit Ridge classifier with cv.


  • X : ndarray of shape (n_samples, n_features) Training vectors, where n_samples is the number of samples and n_features is the number of features. When using GCV, will be cast to float64 if necessary.

  • y : ndarray of shape (n_samples,) Target values. Will be cast to X's dtype if necessary.

  • sample_weight : float or ndarray of shape (n_samples,), default=None Individual weights for each sample. If given a float, every sample will have the same weight.


  • self : object


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict class labels for samples in X.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape [n_samples] Predicted class label per sample.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.


  • X : array-like of shape (n_samples, n_features) Test samples.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True labels for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float Mean accuracy of self.predict(X) wrt. y.


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.


  • **params : dict Estimator parameters.


  • self : object Estimator instance.


attribute cv_values_
val cv_values_ : t -> [>`ArrayLike] Np.Obj.t
val cv_values_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute alpha_
val alpha_ : t -> float
val alpha_opt : t -> (float) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute best_score_
val best_score_ : t -> float
val best_score_opt : t -> (float) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute classes_
val classes_ : t -> [>`ArrayLike] Np.Obj.t
val classes_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Return a human-readable string representation of the object.


method show
val show: t -> string

Return a human-readable string representation of the object.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​SGDClassifier wraps Python class sklearn.linear_model.SGDClassifier.

type t


constructor and attributes create
val create :
  ?loss:string ->
  ?penalty:[`L2 | `L1 | `Elasticnet] ->
  ?alpha:float ->
  ?l1_ratio:float ->
  ?fit_intercept:bool ->
  ?max_iter:int ->
  ?tol:float ->
  ?shuffle:bool ->
  ?verbose:int ->
  ?epsilon:float ->
  ?n_jobs:int ->
  ?random_state:int ->
  ?learning_rate:string ->
  ?eta0:float ->
  ?power_t:float ->
  ?early_stopping:bool ->
  ?validation_fraction:float ->
  ?n_iter_no_change:int ->
  ?class_weight:[`Balanced | `DictIntToFloat of (int * float) list | `T_class_label_weight_ of Py.Object.t] ->
  ?warm_start:bool ->
  ?average:[`I of int | `Bool of bool] ->
  unit ->

Linear classifiers (SVM, logistic regression, etc.) with SGD training.

This estimator implements regularized linear models with stochastic gradient descent (SGD) learning: the gradient of the loss is estimated one sample at a time, and the model is updated along the way with a decreasing strength schedule (aka learning rate). SGD allows minibatch (online/out-of-core) learning via the partial_fit method. For best results using the default learning rate schedule, the data should have zero mean and unit variance.

This implementation works with data represented as dense or sparse arrays of floating point values for the features. The model it fits can be controlled with the loss parameter; by default, it fits a linear support vector machine (SVM).
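
To make the effect of the loss parameter concrete, here is a sketch of the classification losses written as functions of the margin z = y * p, where y is the label coded as -1 or +1 and p is the raw decision value. These are the standard textbook definitions; the dictionary `LOSSES` and the function names are illustrative and not part of the library.

```python
import math


def hinge(z):
    """Linear SVM loss: zero once the margin reaches 1."""
    return max(0.0, 1.0 - z)


def squared_hinge(z):
    """Like hinge, but quadratically penalized."""
    return max(0.0, 1.0 - z) ** 2


def perceptron(z):
    """Penalizes only misclassified samples (negative margin)."""
    return max(0.0, -z)


def log_loss(z):
    """Logistic regression loss."""
    return math.log1p(math.exp(-z))


def modified_huber(z):
    """Squared hinge near the margin, linear for large errors."""
    return max(0.0, 1.0 - z) ** 2 if z >= -1.0 else -4.0 * z


LOSSES = {
    "hinge": hinge,
    "squared_hinge": squared_hinge,
    "perceptron": perceptron,
    "log": log_loss,
    "modified_huber": modified_huber,
}

# Loss of a correctly classified sample that is still inside the margin.
inside_margin = {name: fn(0.5) for name, fn in LOSSES.items()}
```

Note that 'perceptron' assigns no loss to the sample at z = 0.5, while 'hinge' still does, which is the practical difference between the two.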

The regularizer is a penalty added to the loss function that shrinks model parameters towards the zero vector using either the squared euclidean norm L2 or the absolute norm L1 or a combination of both (Elastic Net). If the parameter update crosses the 0.0 value because of the regularizer, the update is truncated to 0.0 to allow for learning sparse models and achieve online feature selection.
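
The truncation rule can be illustrated on a single weight with an L1 penalty. This is a simplified sketch: the function name `l1_truncated_step` is hypothetical, and the library's actual bookkeeping for the penalty may differ.

```python
def l1_truncated_step(w, grad, eta, alpha):
    """One SGD step on the loss, then an L1 shrink that is clipped at zero."""
    w = w - eta * grad      # plain gradient step on the loss term
    shrink = eta * alpha    # pull toward zero coming from the L1 penalty
    if w > 0:
        return max(0.0, w - shrink)   # never cross below zero
    if w < 0:
        return min(0.0, w + shrink)   # never cross above zero
    return 0.0


# A small weight is set exactly to zero instead of flipping sign.
small = l1_truncated_step(0.05, grad=0.0, eta=0.1, alpha=1.0)
# A large weight is just shrunk.
large = l1_truncated_step(1.0, grad=0.0, eta=0.1, alpha=1.0)
```

Weights that land exactly on zero stay out of the model, which is what produces sparse solutions.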

Read more in the :ref:User Guide <sgd>.


  • loss : str, default='hinge' The loss function to be used. Defaults to 'hinge', which gives a linear SVM.

    The possible options are 'hinge', 'log', 'modified_huber', 'squared_hinge', 'perceptron', or a regression loss: 'squared_loss', 'huber', 'epsilon_insensitive', or 'squared_epsilon_insensitive'.

    The 'log' loss gives logistic regression, a probabilistic classifier. 'modified_huber' is another smooth loss that brings tolerance to outliers as well as probability estimates. 'squared_hinge' is like hinge but is quadratically penalized. 'perceptron' is the linear loss used by the perceptron algorithm. The other losses are designed for regression but can be useful in classification as well; see :class:~sklearn.linear_model.SGDRegressor for a description.

    More details about the losses formulas can be found in the :ref:User Guide <sgd_mathematical_formulation>.

  • penalty : {'l2', 'l1', 'elasticnet'}, default='l2' The penalty (aka regularization term) to be used. Defaults to 'l2' which is the standard regularizer for linear SVM models. 'l1' and 'elasticnet' might bring sparsity to the model (feature selection) not achievable with 'l2'.

  • alpha : float, default=0.0001 Constant that multiplies the regularization term. The higher the value, the stronger the regularization. Also used to compute the learning rate when learning_rate is set to 'optimal'.

  • l1_ratio : float, default=0.15 The Elastic Net mixing parameter, with 0 <= l1_ratio <= 1. l1_ratio=0 corresponds to L2 penalty, l1_ratio=1 to L1. Only used if penalty is 'elasticnet'.

  • fit_intercept : bool, default=True Whether the intercept should be estimated or not. If False, the data is assumed to be already centered.

  • max_iter : int, default=1000 The maximum number of passes over the training data (aka epochs). It only impacts the behavior in the fit method, and not the :meth:partial_fit method.

    .. versionadded:: 0.19

  • tol : float, default=1e-3 The stopping criterion. If it is not None, training will stop when (loss > best_loss - tol) for n_iter_no_change consecutive epochs.

    .. versionadded:: 0.19

  • shuffle : bool, default=True Whether or not the training data should be shuffled after each epoch.

  • verbose : int, default=0 The verbosity level.

  • epsilon : float, default=0.1 Epsilon in the epsilon-insensitive loss functions; only if loss is 'huber', 'epsilon_insensitive', or 'squared_epsilon_insensitive'. For 'huber', determines the threshold at which it becomes less important to get the prediction exactly right. For epsilon-insensitive, any differences between the current prediction and the correct label are ignored if they are less than this threshold.

  • n_jobs : int, default=None The number of CPUs to use to do the OVA (One Versus All, for multi-class problems) computation. None means 1 unless in a :obj:joblib.parallel_backend context. -1 means using all processors. See :term:Glossary <n_jobs> for more details.

  • random_state : int, RandomState instance, default=None Used for shuffling the data, when shuffle is set to True. Pass an int for reproducible output across multiple function calls.

    See :term:Glossary <random_state>.

  • learning_rate : str, default='optimal' The learning rate schedule:

    • 'constant': eta = eta0
    • 'optimal': eta = 1.0 / (alpha * (t + t0)) where t0 is chosen by a heuristic proposed by Leon Bottou.
    • 'invscaling': eta = eta0 / pow(t, power_t)
    • 'adaptive': eta = eta0, as long as the training loss keeps decreasing. Each time n_iter_no_change consecutive epochs fail to decrease the training loss by tol or fail to increase validation score by tol if early_stopping is True, the current learning rate is divided by 5.

      .. versionadded:: 0.20 Added 'adaptive' option

  • eta0 : double, default=0.0 The initial learning rate for the 'constant', 'invscaling' or 'adaptive' schedules. The default value is 0.0 as eta0 is not used by the default schedule 'optimal'.

  • power_t : double, default=0.5 The exponent for inverse scaling learning rate.

  • early_stopping : bool, default=False Whether to use early stopping to terminate training when validation score is not improving. If set to True, it will automatically set aside a stratified fraction of training data as validation and terminate training when validation score returned by the score method is not improving by at least tol for n_iter_no_change consecutive epochs.

    .. versionadded:: 0.20 Added 'early_stopping' option

  • validation_fraction : float, default=0.1 The proportion of training data to set aside as validation set for early stopping. Must be between 0 and 1. Only used if early_stopping is True.

    .. versionadded:: 0.20 Added 'validation_fraction' option

  • n_iter_no_change : int, default=5 Number of iterations with no improvement to wait before early stopping.

    .. versionadded:: 0.20 Added 'n_iter_no_change' option

  • class_weight : dict, {class_label: weight} or 'balanced', default=None Preset for the class_weight fit parameter.

    Weights associated with classes. If not given, all classes are supposed to have weight one.

    The 'balanced' mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y)).

  • warm_start : bool, default=False When set to True, reuse the solution of the previous call to fit as initialization, otherwise, just erase the previous solution.

    See :term:the Glossary <warm_start>.

    Repeatedly calling fit or partial_fit when warm_start is True can result in a different solution than when calling fit a single time because of the way the data is shuffled. If a dynamic learning rate is used, the learning rate is adapted depending on the number of samples already seen. Calling fit resets this counter, while partial_fit will result in increasing the existing counter.

  • average : bool or int, default=False When set to True, computes the averaged SGD weights across all updates and stores the result in the coef_ attribute. If set to an int greater than 1, averaging will begin once the total number of samples seen reaches average. So average=10 will begin averaging after seeing 10 samples.
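
The learning_rate schedules listed above can be written out directly from their formulas. In this sketch t is the update counter, and t0 is passed in explicitly because the heuristic that chooses it is not described here. The function names are hypothetical.

```python
def learning_rate(schedule, t, eta0=0.01, power_t=0.5, alpha=0.0001, t0=1.0):
    """Step size at update t for the stateless schedules."""
    if schedule == "constant":
        return eta0
    if schedule == "optimal":
        return 1.0 / (alpha * (t + t0))
    if schedule == "invscaling":
        return eta0 / pow(t, power_t)
    raise ValueError("unknown schedule: %s" % schedule)


def adaptive_step(eta, stalled):
    """'adaptive' keeps eta until progress stalls, then divides it by 5."""
    return eta / 5.0 if stalled else eta


constant = learning_rate("constant", t=10, eta0=0.5)
invscaling = learning_rate("invscaling", t=4, eta0=0.1, power_t=0.5)
optimal = learning_rate("optimal", t=9, alpha=0.1, t0=1.0)
```

Here "stalled" stands for n_iter_no_change consecutive epochs without sufficient improvement, as described for the 'adaptive' option.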


  • coef_ : ndarray of shape (1, n_features) if n_classes == 2 else (n_classes, n_features) Weights assigned to the features.

  • intercept_ : ndarray of shape (1,) if n_classes == 2 else (n_classes,) Constants in decision function.

  • n_iter_ : int The actual number of iterations before reaching the stopping criterion. For multiclass fits, it is the maximum over every binary fit.

  • loss_function_ : concrete LossFunction

  • classes_ : array of shape (n_classes,)

  • t_ : int Number of weight updates performed during training. Same as (n_iter_ * n_samples).
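
How tol and n_iter_no_change determine the value reported as n_iter_ can be sketched with a list of per-epoch losses. This is a simplified illustration of the rule "stop when loss > best_loss - tol for n_iter_no_change consecutive epochs"; `epochs_until_stop` is a hypothetical helper.

```python
def epochs_until_stop(epoch_losses, tol=1e-3, n_iter_no_change=5):
    """Number of epochs run before the stopping criterion triggers."""
    best_loss = float("inf")
    no_improvement = 0
    for epoch, loss in enumerate(epoch_losses, start=1):
        if loss > best_loss - tol:
            no_improvement += 1     # did not beat the best loss by at least tol
        else:
            no_improvement = 0
        best_loss = min(best_loss, loss)
        if no_improvement >= n_iter_no_change:
            return epoch
    # The criterion never triggered: every available epoch was used.
    return len(epoch_losses)


losses = [1.0, 0.5, 0.4999, 0.4998, 0.4997]
stopped_at = epochs_until_stop(losses, tol=1e-3, n_iter_no_change=2)
```

In this example the loss still decreases after the second epoch, but by less than tol, so training stops after two such epochs in a row.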

See Also

  • sklearn.svm.LinearSVC: Linear support vector classification.

  • LogisticRegression: Logistic regression.

  • Perceptron: Inherits from SGDClassifier. Perceptron() is equivalent to SGDClassifier(loss='perceptron', eta0=1, learning_rate='constant', penalty=None).


>>> import numpy as np
>>> from sklearn.linear_model import SGDClassifier
>>> from sklearn.preprocessing import StandardScaler
>>> from sklearn.pipeline import make_pipeline
>>> X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
>>> Y = np.array([1, 1, 2, 2])
>>> # Always scale the input. The most convenient way is to use a pipeline.
>>> clf = make_pipeline(StandardScaler(),
...                     SGDClassifier(max_iter=1000, tol=1e-3))
>>> clf.fit(X, Y)
Pipeline(steps=[('standardscaler', StandardScaler()),
                ('sgdclassifier', SGDClassifier())])
>>> print(clf.predict([[-0.8, -1]]))


method decision_function
val decision_function :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict confidence scores for samples.

The confidence score for a sample is the signed distance of that sample to the hyperplane.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


array, shape=(n_samples,) if n_classes == 2 else (n_samples, n_classes) Confidence scores per (sample, class) combination. In the binary case, confidence score for self.classes_[1] where >0 means this class would be predicted.


method densify
val densify :
  [> tag] Obj.t ->

Convert coefficient matrix to dense array format.

Converts the coef_ member (back) to a numpy.ndarray. This is the default format of coef_ and is required for fitting, so calling this method is only required on models that have previously been sparsified; otherwise, it is a no-op.


self Fitted estimator.


method fit
val fit :
  ?coef_init:[>`ArrayLike] Np.Obj.t ->
  ?intercept_init:[>`ArrayLike] Np.Obj.t ->
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit linear model with Stochastic Gradient Descent.


  • X : {array-like, sparse matrix}, shape (n_samples, n_features) Training data.

  • y : ndarray of shape (n_samples,) Target values.

  • coef_init : ndarray of shape (n_classes, n_features), default=None The initial coefficients to warm-start the optimization.

  • intercept_init : ndarray of shape (n_classes,), default=None The initial intercept to warm-start the optimization.

  • sample_weight : array-like, shape (n_samples,), default=None Weights applied to individual samples. If not provided, uniform weights are assumed. These weights will be multiplied with class_weight (passed through the constructor) if class_weight is specified.


self : Returns an instance of self.


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method partial_fit
val partial_fit :
  ?classes:[>`ArrayLike] Np.Obj.t ->
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Perform one epoch of stochastic gradient descent on given samples.

Internally, this method uses max_iter = 1. Therefore, it is not guaranteed that a minimum of the cost function is reached after calling it once. Matters such as objective convergence and early stopping should be handled by the user.


  • X : {array-like, sparse matrix}, shape (n_samples, n_features) Subset of the training data.

  • y : ndarray of shape (n_samples,) Subset of the target values.

  • classes : ndarray of shape (n_classes,), default=None Classes across all calls to partial_fit. Can be obtained via np.unique(y_all), where y_all is the target vector of the entire dataset. This argument is required for the first call to partial_fit and can be omitted in the subsequent calls. Note that y doesn't need to contain all labels in classes.

  • sample_weight : array-like, shape (n_samples,), default=None Weights applied to individual samples. If not provided, uniform weights are assumed.


self : Returns an instance of self.
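
The reason classes must be supplied on the first call can be seen by splitting a dataset into mini-batches: an individual batch may not contain every label. The sketch below only prepares the batches and the class list with the standard library; `minibatches` is a hypothetical helper, and the commented call shows where the values would be passed in the Python API.

```python
def minibatches(X, y, batch_size):
    """Yield consecutive (X_batch, y_batch) slices of the data."""
    for start in range(0, len(X), batch_size):
        yield X[start:start + batch_size], y[start:start + batch_size]


X_all = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y_all = [0, 0, 1, 1, 2]

# Plays the role of np.unique(y_all): every label across all batches.
classes = sorted(set(y_all))

batches = list(minibatches(X_all, y_all, batch_size=2))
# The first batch only contains label 0, so the estimator could not
# infer the full label set from it:
#   clf.partial_fit(X_batch, y_batch, classes=classes)  # first call
#   clf.partial_fit(X_batch, y_batch)                   # later calls
```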


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict class labels for samples in X.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape [n_samples] Predicted class label per sample.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.


  • X : array-like of shape (n_samples, n_features) Test samples.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True labels for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float Mean accuracy of self.predict(X) wrt. y.


method set_params
val set_params :
  ?kwargs:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set and validate the parameters of estimator.


  • **kwargs : dict Estimator parameters.


  • self : object Estimator instance.


method sparsify
val sparsify :
  [> tag] Obj.t ->

Convert coefficient matrix to sparse format.

Converts the coef_ member to a scipy.sparse matrix, which for L1-regularized models can be much more memory- and storage-efficient than the usual numpy.ndarray representation.

The intercept_ member is not converted.


self Fitted estimator.


For non-sparse models, i.e. when there are not many zeros in coef_, this may actually increase memory usage, so use this method with care. A rule of thumb is that the number of zero elements, which can be computed with (coef_ == 0).sum(), must be more than 50% for this to provide significant benefits.

After calling this method, further fitting with the partial_fit method (if any) will not work until you call densify.
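
The 50% rule of thumb can be checked before calling sparsify. The sketch below works on a plain list of rows standing in for coef_; with a NumPy array the same fraction would come from (coef_ == 0).sum() divided by the number of entries. The helper names are hypothetical.

```python
def zero_fraction(coef):
    """Share of coefficients that are exactly zero."""
    values = [v for row in coef for v in row]
    return sum(1 for v in values if v == 0) / len(values)


def worth_sparsifying(coef, threshold=0.5):
    """Apply the rule of thumb: more than half of the entries are zero."""
    return zero_fraction(coef) > threshold


sparse_coef = [[0.0, 0.0, 1.5], [0.0, 2.0, 0.0]]
dense_coef = [[1.0, 0.0]]
```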


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> int
val n_iter_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute loss_function_
val loss_function_ : t -> Np.NumpyRaw.Ndarray.t -> Np.NumpyRaw.Ndarray.t -> float
val loss_function_opt : t -> (Np.NumpyRaw.Ndarray.t -> Np.NumpyRaw.Ndarray.t -> float) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute classes_
val classes_ : t -> [>`ArrayLike] Np.Obj.t
val classes_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute t_
val t_ : t -> int
val t_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Return a human-readable string representation of the object.


method show
val show: t -> string

Return a human-readable string representation of the object.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​SGDRegressor wraps Python class sklearn.linear_model.SGDRegressor.

type t


constructor and attributes create
val create :
  ?loss:string ->
  ?penalty:[`L2 | `L1 | `Elasticnet] ->
  ?alpha:float ->
  ?l1_ratio:float ->
  ?fit_intercept:bool ->
  ?max_iter:int ->
  ?tol:float ->
  ?shuffle:bool ->
  ?verbose:int ->
  ?epsilon:float ->
  ?random_state:int ->
  ?learning_rate:string ->
  ?eta0:float ->
  ?power_t:float ->
  ?early_stopping:bool ->
  ?validation_fraction:float ->
  ?n_iter_no_change:int ->
  ?warm_start:bool ->
  ?average:[`I of int | `Bool of bool] ->
  unit ->

Linear model fitted by minimizing a regularized empirical loss with SGD

SGD stands for Stochastic Gradient Descent: the gradient of the loss is estimated one sample at a time, and the model is updated along the way with a decreasing strength schedule (aka learning rate).
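
The per-sample update can be sketched for a one-feature linear model with squared loss, an L2 penalty, and an inverse-scaling step size. This is a toy illustration, not the library's implementation: `sgd_linear_fit` is a hypothetical function, there is no shuffling, and the values of eta0 and epochs are chosen for this small dataset.

```python
def sgd_linear_fit(xs, ys, eta0=0.1, power_t=0.25, alpha=0.0001, epochs=500):
    """Fit y ~ w * x + b by updating after every single sample."""
    w, b, t = 0.0, 0.0, 1
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            eta = eta0 / pow(t, power_t)        # decreasing step size
            error = (w * x + b) - y             # gradient of 0.5 * error**2 w.r.t. the prediction
            w -= eta * (error * x + alpha * w)  # loss gradient plus L2 penalty on w
            b -= eta * error                    # the intercept is not penalized
            t += 1
    return w, b


# Noise-free data generated from y = 2 * x + 1.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [2.0 * x + 1.0 for x in xs]
w, b = sgd_linear_fit(xs, ys)
```

The inputs here already have zero mean, which keeps the updates for w and b from interfering with each other; real data would normally be standardized first.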

The regularizer is a penalty added to the loss function that shrinks model parameters towards the zero vector using either the squared euclidean norm L2 or the absolute norm L1 or a combination of both (Elastic Net). If the parameter update crosses the 0.0 value because of the regularizer, the update is truncated to 0.0 to allow for learning sparse models and achieve online feature selection.

This implementation works with data represented as dense numpy arrays of floating point values for the features.

Read more in the :ref:User Guide <sgd>.


  • loss : str, default='squared_loss' The loss function to be used. The possible values are 'squared_loss', 'huber', 'epsilon_insensitive', or 'squared_epsilon_insensitive'.

    The 'squared_loss' refers to the ordinary least squares fit. 'huber' modifies 'squared_loss' to focus less on getting outliers correct by switching from squared to linear loss past a distance of epsilon. 'epsilon_insensitive' ignores errors less than epsilon and is linear past that; this is the loss function used in SVR. 'squared_epsilon_insensitive' is the same but becomes squared loss past a tolerance of epsilon.

    More details about the losses formulas can be found in the :ref:User Guide <sgd_mathematical_formulation>.

  • penalty : {'l2', 'l1', 'elasticnet'}, default='l2' The penalty (aka regularization term) to be used. Defaults to 'l2' which is the standard regularizer for linear SVM models. 'l1' and 'elasticnet' might bring sparsity to the model (feature selection) not achievable with 'l2'.

  • alpha : float, default=0.0001 Constant that multiplies the regularization term. The higher the value, the stronger the regularization. Also used to compute the learning rate when learning_rate is set to 'optimal'.

  • l1_ratio : float, default=0.15 The Elastic Net mixing parameter, with 0 <= l1_ratio <= 1. l1_ratio=0 corresponds to L2 penalty, l1_ratio=1 to L1. Only used if penalty is 'elasticnet'.

  • fit_intercept : bool, default=True Whether the intercept should be estimated or not. If False, the data is assumed to be already centered.

  • max_iter : int, default=1000 The maximum number of passes over the training data (aka epochs). It only impacts the behavior in the fit method, and not the :meth:partial_fit method.

    .. versionadded:: 0.19

  • tol : float, default=1e-3 The stopping criterion. If it is not None, training will stop when (loss > best_loss - tol) for n_iter_no_change consecutive epochs.

    .. versionadded:: 0.19

  • shuffle : bool, default=True Whether or not the training data should be shuffled after each epoch.

  • verbose : int, default=0 The verbosity level.

  • epsilon : float, default=0.1 Epsilon in the epsilon-insensitive loss functions; only if loss is 'huber', 'epsilon_insensitive', or 'squared_epsilon_insensitive'. For 'huber', determines the threshold at which it becomes less important to get the prediction exactly right. For epsilon-insensitive, any differences between the current prediction and the correct label are ignored if they are less than this threshold.

  • random_state : int, RandomState instance, default=None Used for shuffling the data, when shuffle is set to True. Pass an int for reproducible output across multiple function calls. See :term:Glossary <random_state>.

  • learning_rate : string, default='invscaling' The learning rate schedule:

    • 'constant': eta = eta0
    • 'optimal': eta = 1.0 / (alpha * (t + t0)) where t0 is chosen by a heuristic proposed by Leon Bottou.
    • 'invscaling': eta = eta0 / pow(t, power_t)
    • 'adaptive': eta = eta0, as long as the training loss keeps decreasing. Each time n_iter_no_change consecutive epochs fail to decrease the training loss by tol, or fail to increase the validation score by tol if early_stopping is True, the current learning rate is divided by 5.

      .. versionadded:: 0.20 Added 'adaptive' option

  • eta0 : double, default=0.01 The initial learning rate for the 'constant', 'invscaling' or 'adaptive' schedules.

  • power_t : double, default=0.25 The exponent for inverse scaling learning rate.

  • early_stopping : bool, default=False Whether to use early stopping to terminate training when validation score is not improving. If set to True, it will automatically set aside a fraction of training data as validation and terminate training when validation score returned by the score method is not improving by at least tol for n_iter_no_change consecutive epochs.

    .. versionadded:: 0.20 Added 'early_stopping' option

  • validation_fraction : float, default=0.1 The proportion of training data to set aside as validation set for early stopping. Must be between 0 and 1. Only used if early_stopping is True.

    .. versionadded:: 0.20 Added 'validation_fraction' option

  • n_iter_no_change : int, default=5 Number of iterations with no improvement to wait before early stopping.

    .. versionadded:: 0.20 Added 'n_iter_no_change' option

  • warm_start : bool, default=False When set to True, reuse the solution of the previous call to fit as initialization, otherwise, just erase the previous solution. See :term:the Glossary <warm_start>.

    Repeatedly calling fit or partial_fit when warm_start is True can result in a different solution than when calling fit a single time because of the way the data is shuffled. If a dynamic learning rate is used, the learning rate is adapted depending on the number of samples already seen. Calling fit resets this counter, while partial_fit will result in increasing the existing counter.

  • average : bool or int, default=False When set to True, computes the averaged SGD weights across all updates and stores the result in the coef_ attribute. If set to an int greater than 1, averaging will begin once the total number of samples seen reaches average. So average=10 will begin averaging after seeing 10 samples.
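The four loss options listed under the loss parameter can be written out for a single prediction. The following is a rough sketch based on the verbal descriptions above. The function name sgd_loss is hypothetical, and the 0.5 scaling of the squared terms is an assumption, since the text only describes the shape of each loss.

```python
def sgd_loss(name, y_true, y_pred, epsilon=0.1):
    """Per-sample loss for the options described above (illustrative only)."""
    r = abs(y_true - y_pred)  # absolute residual
    if name == "squared_loss":
        # Ordinary least squares; the 0.5 factor is an assumed convention.
        return 0.5 * r * r
    if name == "huber":
        # Squared near zero, linear once the residual exceeds epsilon.
        if r <= epsilon:
            return 0.5 * r * r
        return epsilon * r - 0.5 * epsilon * epsilon
    if name == "epsilon_insensitive":
        # Errors smaller than epsilon are ignored, linear beyond that.
        return max(0.0, r - epsilon)
    if name == "squared_epsilon_insensitive":
        # Same tolerance, but squared beyond epsilon.
        return max(0.0, r - epsilon) ** 2
    raise ValueError(f"unknown loss: {name}")
```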


  • coef_ : ndarray of shape (n_features,) Weights assigned to the features.

  • intercept_ : ndarray of shape (1,) The intercept term.

  • average_coef_ : ndarray of shape (n_features,) Averaged weights assigned to the features. Only available if average=True.

    .. deprecated:: 0.23 Attribute average_coef_ was deprecated in version 0.23 and will be removed in 0.25.

  • average_intercept_ : ndarray of shape (1,) The averaged intercept term. Only available if average=True.

    .. deprecated:: 0.23 Attribute average_intercept_ was deprecated in version 0.23 and will be removed in 0.25.

  • n_iter_ : int The actual number of iterations before reaching the stopping criterion.

  • t_ : int Number of weight updates performed during training. Same as (n_iter_ * n_samples).
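The closed-form learning rate schedules listed under learning_rate depend only on the update counter t. A minimal sketch of those formulas follows. The function name is hypothetical, and t0 is passed in as a plain argument here because the heuristic that chooses it is not reproduced. The 'adaptive' schedule is left out since it depends on the training history rather than on t.

```python
def learning_rate(schedule, t, eta0=0.01, power_t=0.25, alpha=0.0001, t0=1.0):
    """Learning rate eta at update step t (t >= 1) for a given schedule."""
    if schedule == "constant":
        return eta0
    if schedule == "optimal":
        # t0 is a placeholder value; the real choice follows a heuristic.
        return 1.0 / (alpha * (t + t0))
    if schedule == "invscaling":
        return eta0 / pow(t, power_t)
    raise ValueError(f"unsupported schedule: {schedule}")
```

With the defaults, 'invscaling' decays slowly: at t = 16 the rate is half of eta0, because 16 ** 0.25 is 2.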


>>> import numpy as np
>>> from sklearn.linear_model import SGDRegressor
>>> from sklearn.pipeline import make_pipeline
>>> from sklearn.preprocessing import StandardScaler
>>> n_samples, n_features = 10, 5
>>> rng = np.random.RandomState(0)
>>> y = rng.randn(n_samples)
>>> X = rng.randn(n_samples, n_features)
>>> # Always scale the input. The most convenient way is to use a pipeline.
>>> reg = make_pipeline(StandardScaler(),
...                     SGDRegressor(max_iter=1000, tol=1e-3))
>>> reg.fit(X, y)
Pipeline(steps=[('standardscaler', StandardScaler()),
                ('sgdregressor', SGDRegressor())])

See also

Ridge, ElasticNet, Lasso, sklearn.svm.SVR


method densify
val densify :
  [> tag] Obj.t ->

Convert coefficient matrix to dense array format.

Converts the coef_ member (back) to a numpy.ndarray. This is the default format of coef_ and is required for fitting, so calling this method is only required on models that have previously been sparsified; otherwise, it is a no-op.


self Fitted estimator.


method fit
val fit :
  ?coef_init:[>`ArrayLike] Np.Obj.t ->
  ?intercept_init:[>`ArrayLike] Np.Obj.t ->
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit linear model with Stochastic Gradient Descent.


  • X : {array-like, sparse matrix}, shape (n_samples, n_features) Training data

  • y : ndarray of shape (n_samples,) Target values

  • coef_init : ndarray of shape (n_features,), default=None The initial coefficients to warm-start the optimization.

  • intercept_init : ndarray of shape (1,), default=None The initial intercept to warm-start the optimization.

  • sample_weight : array-like, shape (n_samples,), default=None Weights applied to individual samples (1. for unweighted).


  • self : returns an instance of self.


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method partial_fit
val partial_fit :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Perform one epoch of stochastic gradient descent on given samples.

Internally, this method uses max_iter = 1. Therefore, it is not guaranteed that a minimum of the cost function is reached after calling it once. Matters such as objective convergence and early stopping should be handled by the user.


  • X : {array-like, sparse matrix}, shape (n_samples, n_features) Subset of training data

  • y : numpy array of shape (n_samples,) Subset of target values

  • sample_weight : array-like, shape (n_samples,), default=None Weights applied to individual samples. If not provided, uniform weights are assumed.


  • self : returns an instance of self.
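Since partial_fit leaves convergence handling to the caller, a training loop needs its own stopping rule. The sketch below applies the criterion described for tol and n_iter_no_change (stop once loss > best_loss - tol for several consecutive epochs). The helper name and the run_epoch callable are hypothetical: run_epoch stands for whatever code performs one pass (for example a call to partial_fit) and returns the resulting loss.

```python
import math


def train_until_stalled(run_epoch, tol=1e-3, n_iter_no_change=5, max_epochs=1000):
    """Call run_epoch() until the loss stops improving by at least tol.

    Returns the number of epochs run and the best loss seen.
    """
    best_loss = math.inf
    stalled = 0
    epochs_run = 0
    for _ in range(max_epochs):
        loss = run_epoch()
        epochs_run += 1
        # An epoch counts as "no improvement" if it fails to beat best_loss by tol.
        if loss > best_loss - tol:
            stalled += 1
        else:
            stalled = 0
        best_loss = min(best_loss, loss)
        if stalled >= n_iter_no_change:
            break
    return epochs_run, best_loss
```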


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using the linear model


  • X : {array-like, sparse matrix}, shape (n_samples, n_features)


ndarray of shape (n_samples,) Predicted target values per element in X.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get a R^2 score of 0.0.


  • X : array-like of shape (n_samples, n_features) Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True values for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float R^2 of self.predict(X) wrt. y.


The R2 score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of :func:~sklearn.metrics.r2_score. This influences the score method of all the multioutput regressors (except for :class:~sklearn.multioutput.MultiOutputRegressor).
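The definition of R^2 given for score can be checked by hand on a few numbers. This is a small sketch for a single output without sample weights. The function name r2 is chosen for illustration, and the case v = 0 (constant targets) is not handled.

```python
def r2(y_true, y_pred):
    """Coefficient of determination 1 - u/v for plain Python sequences."""
    mean = sum(y_true) / len(y_true)
    # u: residual sum of squares.
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # v: total sum of squares around the mean of y_true.
    v = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - u / v
```

Predicting the mean of y_true for every sample makes u equal to v, which gives the score of 0.0 mentioned above.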


method set_params
val set_params :
  ?kwargs:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set and validate the parameters of estimator.


  • **kwargs : dict Estimator parameters.


  • self : object Estimator instance.


method sparsify
val sparsify :
  [> tag] Obj.t ->

Convert coefficient matrix to sparse format.

Converts the coef_ member to a scipy.sparse matrix, which for L1-regularized models can be much more memory- and storage-efficient than the usual numpy.ndarray representation.

The intercept_ member is not converted.


self Fitted estimator.


For non-sparse models, i.e. when there are not many zeros in coef_, this may actually increase memory usage, so use this method with care. A rule of thumb is that the number of zero elements, which can be computed with (coef_ == 0).sum(), must be more than 50% for this to provide significant benefits.

After calling this method, further fitting with the partial_fit method (if any) will not work until you call densify.
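The rule of thumb for sparsify can be expressed as a quick check on the coefficients. The sketch below works on a plain list of numbers, and both helper names are made up for this example.

```python
def zero_fraction(coef):
    """Share of coefficients that are exactly zero."""
    return sum(1 for c in coef if c == 0) / len(coef)


def sparsify_likely_helps(coef, threshold=0.5):
    """True when more than `threshold` of the coefficients are zero."""
    return zero_fraction(coef) > threshold
```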


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute average_coef_
val average_coef_ : t -> [>`ArrayLike] Np.Obj.t
val average_coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute average_intercept_
val average_intercept_ : t -> [>`ArrayLike] Np.Obj.t
val average_intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> int
val n_iter_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute t_
val t_ : t -> int
val t_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​SquaredLoss wraps Python class sklearn.linear_model.SquaredLoss.

type t


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​TheilSenRegressor wraps Python class sklearn.linear_model.TheilSenRegressor.

type t


constructor and attributes create
val create :
  ?fit_intercept:bool ->
  ?copy_X:bool ->
  ?max_subpopulation:int ->
  ?n_subsamples:int ->
  ?max_iter:int ->
  ?tol:float ->
  ?random_state:int ->
  ?n_jobs:int ->
  ?verbose:int ->
  unit ->

Theil-Sen Estimator: robust multivariate regression model.

The algorithm calculates least square solutions on subsets with size n_subsamples of the samples in X. Any value of n_subsamples between the number of features and samples leads to an estimator with a compromise between robustness and efficiency. Since the number of least square solutions is 'n_samples choose n_subsamples', it can be extremely large and can therefore be limited with max_subpopulation. If this limit is reached, the subsets are chosen randomly. In a final step, the spatial median (or L1 median) of all least square solutions is calculated.
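The spatial median used in the final step is the point that minimizes the sum of Euclidean distances to a set of points. One common way to approximate it is Weiszfeld's iteration, sketched below. This is an illustration under that assumption and may differ from the variant the library uses. The max_iter and tol arguments mirror the parameters of the same name.

```python
import math


def spatial_median(points, max_iter=300, tol=1e-3):
    """Approximate the spatial (L1) median of a list of equal-length tuples."""
    dim = len(points[0])
    # Start from the coordinate-wise mean.
    median = [sum(p[i] for p in points) / len(points) for i in range(dim)]
    for _ in range(max_iter):
        weighted_sum = [0.0] * dim
        weight_total = 0.0
        for p in points:
            d = math.dist(p, median)
            if d == 0.0:
                # Skip a point that coincides with the estimate (avoids 1/0).
                continue
            weight_total += 1.0 / d
            for i in range(dim):
                weighted_sum[i] += p[i] / d
        if weight_total == 0.0:
            break
        new_median = [s / weight_total for s in weighted_sum]
        shift = math.dist(new_median, median)
        median = new_median
        if shift < tol:
            break
    return median
```

Unlike the mean, the result is barely moved by a single far-away point, which is the source of the estimator's robustness.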

Read more in the :ref:User Guide <theil_sen_regression>.


  • fit_intercept : boolean, optional, default True Whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations.

  • copy_X : boolean, optional, default True If True, X will be copied; else, it may be overwritten.

  • max_subpopulation : int, optional, default 1e4 Instead of computing with a set of cardinality 'n choose k', where n is the number of samples and k is the number of subsamples (at least number of features), consider only a stochastic subpopulation of a given maximal size if 'n choose k' is larger than max_subpopulation. For other than small problem sizes this parameter will determine memory usage and runtime if n_subsamples is not changed.

  • n_subsamples : int, optional, default None Number of samples to calculate the parameters. This is at least the number of features (plus 1 if fit_intercept=True) and the number of samples as a maximum. A lower number leads to a higher breakdown point and a low efficiency while a high number leads to a low breakdown point and a high efficiency. If None, take the minimum number of subsamples leading to maximal robustness. If n_subsamples is set to n_samples, Theil-Sen is identical to least squares.

  • max_iter : int, optional, default 300 Maximum number of iterations for the calculation of spatial median.

  • tol : float, optional, default 1.e-3 Tolerance when calculating spatial median.

  • random_state : int, RandomState instance, default=None A random number generator instance to define the state of the random permutations generator. Pass an int for reproducible output across multiple function calls. See :term:Glossary <random_state>

  • n_jobs : int or None, optional (default=None) Number of CPUs to use during the cross validation. None means 1 unless in a :obj:joblib.parallel_backend context. -1 means using all processors. See :term:Glossary <n_jobs> for more details.

  • verbose : boolean, optional, default False Verbose mode when fitting the model.


  • coef_ : array, shape = (n_features) Coefficients of the regression model (median of distribution).

  • intercept_ : float Estimated intercept of regression model.

  • breakdown_ : float Approximated breakdown point.

  • n_iter_ : int Number of iterations needed for the spatial median.

  • n_subpopulation_ : int Number of combinations taken into account from 'n choose k', where n is the number of samples and k is the number of subsamples.
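The interaction between 'n choose k' and max_subpopulation can be made concrete with a short sketch. The helper draw_subsets is hypothetical: it enumerates every subset when the count is within the limit and otherwise draws random subsets, which may contain repeats. The library's own sampling may differ.

```python
import itertools
import math
import random


def n_subpopulation(n_samples, n_subsamples, max_subpopulation=10_000):
    """Number of subsets considered: 'n choose k', capped by the limit."""
    return min(math.comb(n_samples, n_subsamples), max_subpopulation)


def draw_subsets(n_samples, n_subsamples, max_subpopulation=10_000, seed=0):
    """Index subsets of size n_subsamples, all of them or a random selection."""
    total = math.comb(n_samples, n_subsamples)
    if total <= max_subpopulation:
        return list(itertools.combinations(range(n_samples), n_subsamples))
    rng = random.Random(seed)
    return [
        tuple(rng.sample(range(n_samples), n_subsamples))
        for _ in range(max_subpopulation)
    ]
```

For 200 samples and subsets of size 3 there are 1,313,400 combinations, so the default limit of 1e4 applies.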


>>> from sklearn.linear_model import TheilSenRegressor
>>> from sklearn.datasets import make_regression
>>> X, y = make_regression(
...     n_samples=200, n_features=2, noise=4.0, random_state=0)
>>> reg = TheilSenRegressor(random_state=0).fit(X, y)
>>> reg.score(X, y)
>>> reg.predict(X[:1,])


  • Theil-Sen Estimators in a Multiple Linear Regression Model, 2009 Xin Dang, Hanxiang Peng, Xueqin Wang and Heping Zhang



method fit
val fit :
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit linear model.


  • X : numpy array of shape [n_samples, n_features] Training data

  • y : numpy array of shape [n_samples] Target values


  • self : returns an instance of self.


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using the linear model.


  • X : array_like or sparse matrix, shape (n_samples, n_features) Samples.


  • C : array, shape (n_samples,) Returns predicted values.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Return the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get a R^2 score of 0.0.


  • X : array-like of shape (n_samples, n_features) Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) True values for X.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float R^2 of self.predict(X) wrt. y.


The R2 score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of :func:~sklearn.metrics.r2_score. This influences the score method of all the multioutput regressors (except for :class:~sklearn.multioutput.MultiOutputRegressor).


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.


  • **params : dict Estimator parameters.


  • self : object Estimator instance.
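The <component>__<parameter> naming used by set_params for nested objects can be illustrated with a small parser. This is only a sketch of the naming convention. The helper name is hypothetical and no estimator is involved.

```python
def split_param_name(name):
    """Split 'component__parameter' at the first double underscore.

    Returns (None, name) for a plain parameter name without a component.
    """
    component, separator, parameter = name.partition("__")
    if not separator:
        return None, name
    return component, parameter
```

With deeper nesting, the remainder still contains a double underscore and can be split again one level down.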


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute breakdown_
val breakdown_ : t -> float
val breakdown_opt : t -> (float) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> int
val n_iter_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_subpopulation_
val n_subpopulation_ : t -> int
val n_subpopulation_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


Module Sklearn.​Linear_model.​TweedieRegressor wraps Python class sklearn.linear_model.TweedieRegressor.

type t


constructor and attributes create
val create :
  ?power:float ->
  ?alpha:float ->
  ?fit_intercept:bool ->
  ?link:[`Auto | `Identity | `Log] ->
  ?max_iter:int ->
  ?tol:float ->
  ?warm_start:bool ->
  ?verbose:int ->
  unit ->

Generalized Linear Model with a Tweedie distribution.

This estimator can be used to model different GLMs depending on the power parameter, which determines the underlying distribution.

Read more in the :ref:User Guide <Generalized_linear_regression>.


  • power : float, default=0 The power determines the underlying target distribution according to the following table:

    | Power | Distribution           |
    | 0     | Normal                 |
    | 1     | Poisson                |
    | (1,2) | Compound Poisson Gamma |
    | 2     | Gamma                  |
    | 3     | Inverse Gaussian       |

    For ``0 < power < 1``, no distribution exists.

  • alpha : float, default=1 Constant that multiplies the penalty term and thus determines the regularization strength. alpha = 0 is equivalent to unpenalized GLMs. In this case, the design matrix X must have full column rank (no collinearities).

  • link : {'auto', 'identity', 'log'}, default='auto' The link function of the GLM, i.e. mapping from linear predictor X @ coef + intercept to prediction y_pred. Option 'auto' sets the link depending on the chosen family as follows:

    • 'identity' for Normal distribution
    • 'log' for Poisson, Gamma and Inverse Gaussian distributions

  • fit_intercept : bool, default=True Specifies if a constant (a.k.a. bias or intercept) should be added to the linear predictor (X @ coef + intercept).

  • max_iter : int, default=100 The maximal number of iterations for the solver.

  • tol : float, default=1e-4 Stopping criterion. For the lbfgs solver, the iteration will stop when max{ |g_j|, j = 1, ..., d} <= tol where g_j is the j-th component of the gradient (derivative) of the objective function.

  • warm_start : bool, default=False If set to True, reuse the solution of the previous call to fit as initialization for coef_ and intercept_ .

  • verbose : int, default=0 For the lbfgs solver set verbose to any positive number for verbosity.
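The power table and the 'auto' link rule from the parameter list can be encoded as a lookup. The sketch below covers only the cases stated in the text. Both function names are hypothetical, and any power or family the text does not mention yields None.

```python
def tweedie_family(power):
    """Distribution name for a Tweedie power, following the table above."""
    if power == 0:
        return "Normal"
    if 0 < power < 1:
        raise ValueError("no distribution exists for 0 < power < 1")
    if power == 1:
        return "Poisson"
    if 1 < power < 2:
        return "Compound Poisson Gamma"
    if power == 2:
        return "Gamma"
    if power == 3:
        return "Inverse Gaussian"
    return None  # not listed in the table


# Link chosen by 'auto' for the families named in the text.
AUTO_LINK = {
    "Normal": "identity",
    "Poisson": "log",
    "Gamma": "log",
    "Inverse Gaussian": "log",
}


def auto_link(power):
    """Link implied by link='auto', or None where the text gives no rule."""
    return AUTO_LINK.get(tweedie_family(power))
```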


  • coef_ : array of shape (n_features,) Estimated coefficients for the linear predictor (X @ coef_ + intercept_) in the GLM.

  • intercept_ : float Intercept (a.k.a. bias) added to linear predictor.

  • n_iter_ : int Actual number of iterations used in the solver.


method fit
val fit :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Fit a Generalized Linear Model.


  • X : {array-like, sparse matrix} of shape (n_samples, n_features) Training data.

  • y : array-like of shape (n_samples,) Target values.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • self : returns an instance of self.


method get_params
val get_params :
  ?deep:bool ->
  [> tag] Obj.t ->

Get parameters for this estimator.


  • deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.


  • params : mapping of string to any Parameter names mapped to their values.


method predict
val predict :
  x:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Predict using GLM with feature matrix X.


  • X : {array-like, sparse matrix} of shape (n_samples, n_features) Samples.


  • y_pred : array of shape (n_samples,) Returns predicted values.


method score
val score :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->

Compute D^2, the percentage of deviance explained.

D^2 is a generalization of the coefficient of determination R^2. R^2 uses squared error and D^2 deviance. Note that those two are equal for family='normal'.

D^2 is defined as :math:D^2 = 1-\frac{D(y_{true},y_{pred})}{D_{null}}, where :math:D_{null} is the null deviance, i.e. the deviance of a model with intercept alone, which corresponds to :math:y_{pred} = \bar{y}. The mean :math:\bar{y} is weighted by sample_weight. The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse).


  • X : {array-like, sparse matrix} of shape (n_samples, n_features) Test samples.

  • y : array-like of shape (n_samples,) True values of target.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.


  • score : float D^2 of self.predict(X) w.r.t. y.
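The D^2 definition can be tried out on small inputs. The sketch below assumes the standard unit deviances for power 0 (Normal), 1 (Poisson) and 2 (Gamma), which are not spelled out in the text, and it ignores sample_weight. The function names are chosen for illustration.

```python
import math


def unit_deviance(y, mu, power):
    """Tweedie unit deviance for power 0, 1 or 2 (assumed standard forms)."""
    if power == 0:
        return (y - mu) ** 2
    if power == 1:
        # y * log(y / mu) is taken as 0 when y == 0.
        term = y * math.log(y / mu) if y > 0 else 0.0
        return 2.0 * (term - y + mu)
    if power == 2:
        return 2.0 * (math.log(mu / y) + y / mu - 1.0)
    raise ValueError("only power 0, 1 and 2 are covered in this sketch")


def d2_score(y_true, y_pred, power=0):
    """D^2 = 1 - D(y_true, y_pred) / D_null, without sample weights."""
    mean = sum(y_true) / len(y_true)
    deviance = sum(unit_deviance(y, mu, power) for y, mu in zip(y_true, y_pred))
    null_deviance = sum(unit_deviance(y, mean, power) for y in y_true)
    return 1.0 - deviance / null_deviance
```

For power=0 the deviance is the squared error, so D^2 coincides with R^2, as noted above.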


method set_params
val set_params :
  ?params:(string * Py.Object.t) list ->
  [> tag] Obj.t ->

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.


  • **params : dict Estimator parameters.


  • self : object Estimator instance.


attribute coef_
val coef_ : t -> [>`ArrayLike] Np.Obj.t
val coef_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute intercept_
val intercept_ : t -> [>`ArrayLike] Np.Obj.t
val intercept_opt : t -> ([>`ArrayLike] Np.Obj.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


attribute n_iter_
val n_iter_ : t -> int
val n_iter_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.


method to_string
val to_string: t -> string

Print the object to a human-readable representation.


method show
val show: t -> string

Print the object to a human-readable representation.


method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.


function enet_path
val enet_path :
  ?l1_ratio:float ->
  ?eps:float ->
  ?n_alphas:int ->
  ?alphas:[>`ArrayLike] Np.Obj.t ->
  ?precompute:[`Arr of [>`ArrayLike] Np.Obj.t | `Auto | `Bool of bool] ->
  ?xy:[>`ArrayLike] Np.Obj.t ->
  ?copy_X:bool ->
  ?coef_init:[>`ArrayLike] Np.Obj.t ->
  ?verbose:int ->
  ?return_n_iter:bool ->
  ?positive:bool ->
  ?check_input:bool ->
  ?params:(string * Py.Object.t) list ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  unit ->
  ([>`ArrayLike] Np.Obj.t * [>`ArrayLike] Np.Obj.t * [>`ArrayLike] Np.Obj.t * Py.Object.t)

Compute elastic net path with coordinate descent.

The elastic net optimization function varies for mono and multi-outputs.

For mono-output tasks it is::

1 / (2 * n_samples) * ||y - Xw||^2_2
+ alpha * l1_ratio * ||w||_1
+ 0.5 * alpha * (1 - l1_ratio) * ||w||^2_2

For multi-output tasks it is::

(1 / (2 * n_samples)) * ||Y - XW||_Fro^2
+ alpha * l1_ratio * ||W||_21
+ 0.5 * alpha * (1 - l1_ratio) * ||W||_Fro^2

Where::

||W||_21 = \sum_i \sqrt{\sum_j w_{ij}^2}

i.e. the sum of the norms of the rows.
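The mono-output objective and the mixed norm ||W||_21 from the formulas above can be evaluated directly on small inputs. This is a plain-Python sketch for checking numbers by hand. The function names are hypothetical, and X and W are lists of rows.

```python
import math


def enet_objective(X, y, w, alpha, l1_ratio):
    """Mono-output elastic net objective for rows X, targets y and weights w."""
    n_samples = len(X)
    residuals = [
        yi - sum(xij * wj for xij, wj in zip(xi, w)) for xi, yi in zip(X, y)
    ]
    squared_error = sum(r * r for r in residuals)  # ||y - Xw||^2_2
    l1 = sum(abs(wj) for wj in w)                  # ||w||_1
    l2_squared = sum(wj * wj for wj in w)          # ||w||^2_2
    return (
        squared_error / (2 * n_samples)
        + alpha * l1_ratio * l1
        + 0.5 * alpha * (1 - l1_ratio) * l2_squared
    )


def l21_norm(W):
    """Sum over rows of the Euclidean norm of each row."""
    return sum(math.sqrt(sum(wij * wij for wij in row)) for row in W)
```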

Read more in the :ref:User Guide <elastic_net>.


  • X : {array-like, sparse matrix} of shape (n_samples, n_features) Training data. Pass directly as Fortran-contiguous data to avoid unnecessary memory duplication. If y is mono-output then X can be sparse.

  • y : {array-like, sparse matrix} of shape (n_samples,) or (n_samples, n_outputs) Target values.

  • l1_ratio : float, default=0.5 Number between 0 and 1 passed to elastic net (scaling between l1 and l2 penalties). l1_ratio=1 corresponds to the Lasso.

  • eps : float, default=1e-3 Length of the path. eps=1e-3 means that alpha_min / alpha_max = 1e-3.

  • n_alphas : int, default=100 Number of alphas along the regularization path.

  • alphas : ndarray, default=None List of alphas where to compute the models. If None alphas are set automatically.

  • precompute : 'auto', bool or array-like of shape (n_features, n_features), default='auto' Whether to use a precomputed Gram matrix to speed up calculations. If set to 'auto' let us decide. The Gram matrix can also be passed as argument.

  • Xy : array-like of shape (n_features,) or (n_features, n_outputs), default=None Xy = np.dot(X.T, y) that can be precomputed. It is useful only when the Gram matrix is precomputed.

  • copy_X : bool, default=True If True, X will be copied; else, it may be overwritten.

  • coef_init : ndarray of shape (n_features, ), default=None The initial values of the coefficients.

  • verbose : bool or int, default=False Amount of verbosity.

  • return_n_iter : bool, default=False Whether to return the number of iterations or not.

  • positive : bool, default=False If set to True, forces coefficients to be positive. (Only allowed when y.ndim == 1).

  • check_input : bool, default=True If set to False, the input validation checks are skipped (including the Gram matrix when provided). It is assumed that they are handled by the caller.

  • **params : kwargs Keyword arguments passed to the coordinate descent solver.


  • alphas : ndarray of shape (n_alphas,) The alphas along the path where models are computed.

  • coefs : ndarray of shape (n_features, n_alphas) or (n_outputs, n_features, n_alphas) Coefficients along the path.

  • dual_gaps : ndarray of shape (n_alphas,) The dual gaps at the end of the optimization for each alpha.

  • n_iters : list of int The number of iterations taken by the coordinate descent optimizer to reach the specified tolerance for each alpha. (Is returned when return_n_iter is set to True).
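The relationship between eps, n_alphas and the returned alphas can be sketched as a grid that runs from alpha_max down to alpha_max * eps. The logarithmic spacing used below is an assumption, since the text only fixes the ratio alpha_min / alpha_max. The function name and the alpha_max argument are likewise illustrative.

```python
def alpha_grid(alpha_max, eps=1e-3, n_alphas=100):
    """Decreasing grid of alphas with alpha_min / alpha_max equal to eps."""
    if n_alphas == 1:
        return [alpha_max]
    # Geometric progression: each step multiplies by eps ** (1 / (n_alphas - 1)).
    return [alpha_max * eps ** (i / (n_alphas - 1)) for i in range(n_alphas)]
```

A smaller eps stretches the path towards weaker regularization, while n_alphas controls how finely it is sampled.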

See Also

MultiTaskElasticNet MultiTaskElasticNetCV ElasticNet ElasticNetCV


For an example, see :ref:examples/linear_model/ <>.


function lars_path
val lars_path :
  ?xy:[>`ArrayLike] Np.Obj.t ->
  ?gram:[`Arr of [>`ArrayLike] Np.Obj.t | `Auto] ->
  ?max_iter:int ->
  ?alpha_min:float ->
  ?method_:[`Lar | `Lasso] ->
  ?copy_X:bool ->
  ?eps:float ->
  ?copy_Gram:bool ->
  ?verbose:int ->
  ?return_path:bool ->
  ?return_n_iter:bool ->
  ?positive:bool ->
  x:[`Arr of [>`ArrayLike] Np.Obj.t | `None] ->
  y:[`Arr of [>`ArrayLike] Np.Obj.t | `None] ->
  unit ->
  ([>`ArrayLike] Np.Obj.t * [>`ArrayLike] Np.Obj.t * [>`ArrayLike] Np.Obj.t * int)

Compute Least Angle Regression or Lasso path using LARS algorithm [1]

The optimization objective for the case method='lasso' is::

(1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1

In the case of method='lar', the objective function is only known in the form of an implicit equation (see discussion in [1]).

Read more in the :ref:User Guide <least_angle_regression>.


  • X : None or array-like of shape (n_samples, n_features) Input data. Note that if X is None then the Gram matrix must be specified, i.e., cannot be None or False.

  • y : None or array-like of shape (n_samples,) Input targets.

  • Xy : array-like of shape (n_features,) or (n_features, n_targets), default=None Xy = np.dot(X.T, y) that can be precomputed. It is useful only when the Gram matrix is precomputed.

  • Gram : None, 'auto', array-like of shape (n_features, n_features), default=None Precomputed Gram matrix (X' * X), if 'auto', the Gram matrix is precomputed from the given X, if there are more samples than features.

  • max_iter : int, default=500 Maximum number of iterations to perform, set to infinity for no limit.

  • alpha_min : float, default=0 Minimum correlation along the path. It corresponds to the regularization parameter alpha in the Lasso.

  • method : {'lar', 'lasso'}, default='lar' Specifies the returned model. Select 'lar' for Least Angle Regression, 'lasso' for the Lasso.

  • copy_X : bool, default=True If False, X is overwritten.

  • eps : float, optional The machine-precision regularization in the computation of the Cholesky diagonal factors. Increase this for very ill-conditioned systems. By default, np.finfo(np.float).eps is used.

  • copy_Gram : bool, default=True If False, Gram is overwritten.

  • verbose : int, default=0 Controls output verbosity.

  • return_path : bool, default=True If return_path==True returns the entire path, else returns only the last point of the path.

  • return_n_iter : bool, default=False Whether to return the number of iterations.

  • positive : bool, default=False Restrict coefficients to be >= 0. This option is only allowed with method 'lasso'. Note that the model coefficients will not converge to the ordinary-least-squares solution for small values of alpha. Only coefficients up to the smallest alpha value (alphas_[alphas_ > 0.].min() when fit_path=True) reached by the stepwise Lars-Lasso algorithm are typically in congruence with the solution of the coordinate descent lasso_path function.


  • alphas : array-like of shape (n_alphas + 1,) Maximum of covariances (in absolute value) at each iteration. n_alphas is either max_iter, n_features or the number of nodes in the path with alpha >= alpha_min, whichever is smaller.

  • active : array-like of shape (n_alphas,) Indices of active variables at the end of the path.

  • coefs : array-like of shape (n_features, n_alphas + 1) Coefficients along the path

  • n_iter : int Number of iterations run. Returned only if return_n_iter is set to True.

See also

lars_path_gram lasso_path lasso_path_gram LassoLars Lars LassoLarsCV LarsCV sklearn.decomposition.sparse_encode


.. [1] 'Least Angle Regression', Efron et al.


.. [2] Wikipedia entry on Least-angle regression

.. [3] Wikipedia entry on the Lasso


function lars_path_gram
val lars_path_gram :
  ?max_iter:int ->
  ?alpha_min:float ->
  ?method_:[`Lar | `Lasso] ->
  ?copy_X:bool ->
  ?eps:float ->
  ?copy_Gram:bool ->
  ?verbose:int ->
  ?return_path:bool ->
  ?return_n_iter:bool ->
  ?positive:bool ->
  xy:[>`ArrayLike] Np.Obj.t ->
  gram:[>`ArrayLike] Np.Obj.t ->
  n_samples:[`F of float | `I of int] ->
  unit ->
  ([>`ArrayLike] Np.Obj.t * [>`ArrayLike] Np.Obj.t * [>`ArrayLike] Np.Obj.t * int)

lars_path in the sufficient stats mode [1]

The optimization objective for the case method='lasso' is::

(1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1

In the case of method='lar', the objective function is only known in the form of an implicit equation (see the discussion in [1]).

Read more in the :ref:User Guide <least_angle_regression>.


  • Xy : array-like of shape (n_features,) or (n_features, n_targets) Xy = np.dot(X.T, y).

  • Gram : array-like of shape (n_features, n_features) Gram = np.dot(X.T, X).

  • n_samples : int or float Equivalent size of sample.

  • max_iter : int, default=500 Maximum number of iterations to perform, set to infinity for no limit.

  • alpha_min : float, default=0 Minimum correlation along the path. It corresponds to the regularization parameter alpha in the Lasso.

  • method : {'lar', 'lasso'}, default='lar' Specifies the returned model. Select 'lar' for Least Angle Regression, 'lasso' for the Lasso.

  • copy_X : bool, default=True If False, X is overwritten.

  • eps : float, optional The machine-precision regularization in the computation of the Cholesky diagonal factors. Increase this for very ill-conditioned systems. By default, np.finfo(np.float).eps is used.

  • copy_Gram : bool, default=True If False, Gram is overwritten.

  • verbose : int, default=0 Controls output verbosity.

  • return_path : bool, default=True If return_path==True returns the entire path, else returns only the last point of the path.

  • return_n_iter : bool, default=False Whether to return the number of iterations.

  • positive : bool, default=False Restrict coefficients to be >= 0. This option is only allowed with method 'lasso'. Note that the model coefficients will not converge to the ordinary-least-squares solution for small values of alpha. Only coefficients up to the smallest alpha value (alphas_[alphas_ > 0.].min() when fit_path=True) reached by the stepwise Lars-Lasso algorithm are typically in congruence with the solution of the coordinate descent lasso_path function.


  • alphas : array-like of shape (n_alphas + 1,) Maximum of covariances (in absolute value) at each iteration. n_alphas is either max_iter, n_features or the number of nodes in the path with alpha >= alpha_min, whichever is smaller.

  • active : array-like of shape (n_alphas,) Indices of active variables at the end of the path.

  • coefs : array-like of shape (n_features, n_alphas + 1) Coefficients along the path

  • n_iter : int Number of iterations run. Returned only if return_n_iter is set to True.
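
The sufficient statistics this function consumes can be formed from raw data as sketched below. This is a minimal pure-Python illustration; the helper name `sufficient_stats` and the toy data are assumptions made for this example, and in practice the products would normally be computed with an array library.

```python
def sufficient_stats(X, y):
    """Return (Gram, Xy, n_samples) with Gram = X^T X and Xy = X^T y."""
    n_samples = len(X)
    n_features = len(X[0])
    # Work column by column: entry (i, j) of X^T X is the dot product
    # of feature columns i and j.
    columns = [[row[j] for row in X] for j in range(n_features)]
    gram = [
        [sum(a * b for a, b in zip(col_i, col_j)) for col_j in columns]
        for col_i in columns
    ]
    xy = [sum(a * t for a, t in zip(col_i, y)) for col_i in columns]
    return gram, xy, n_samples


# Hypothetical toy data: three samples, two features.
X = [[1, 2], [3, 4], [5, 6]]
y = [1, 0, 2]
gram, xy, n_samples = sufficient_stats(X, y)
print(gram)       # [[35, 44], [44, 56]]
print(xy)         # [11, 14]
print(n_samples)  # 3
```

Both Gram and Xy are indexed by feature, so their sizes do not grow with the number of samples.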

See also

lars_path lasso_path lasso_path_gram LassoLars Lars LassoLarsCV LarsCV sklearn.decomposition.sparse_encode


.. [1] 'Least Angle Regression', Efron et al.


.. [2] Wikipedia entry on Least-angle regression

.. [3] Wikipedia entry on the Lasso


function lasso_path
val lasso_path :
  ?eps:float ->
  ?n_alphas:int ->
  ?alphas:[>`ArrayLike] Np.Obj.t ->
  ?precompute:[`Arr of [>`ArrayLike] Np.Obj.t | `Auto | `Bool of bool] ->
  ?xy:[>`ArrayLike] Np.Obj.t ->
  ?copy_X:bool ->
  ?coef_init:[>`ArrayLike] Np.Obj.t ->
  ?verbose:int ->
  ?return_n_iter:bool ->
  ?positive:bool ->
  ?params:(string * Py.Object.t) list ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  unit ->
  ([>`ArrayLike] Np.Obj.t * [>`ArrayLike] Np.Obj.t * [>`ArrayLike] Np.Obj.t * Py.Object.t)

Compute Lasso path with coordinate descent

The Lasso optimization function varies for mono and multi-outputs.

For mono-output tasks it is::

(1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1

For multi-output tasks it is::

(1 / (2 * n_samples)) * ||Y - XW||^2_Fro + alpha * ||W||_21

Where::

    ||W||_21 = \sum_i \sqrt{\sum_j w_{ij}^2}

i.e. the sum of the norms of each row.

Read more in the :ref:User Guide <lasso>.
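
The mixed norm used in the multi-output penalty can be computed directly from its definition. The sketch below is only an illustration; the helper name `l21_norm` and the example matrix are assumptions, with W stored as a list of rows.

```python
import math


def l21_norm(W):
    """Sum over rows of the Euclidean norm of each row of W."""
    return sum(math.sqrt(sum(w_ij * w_ij for w_ij in row)) for row in W)


# Hypothetical coefficient matrix with three rows.
W = [[3.0, 4.0], [0.0, 0.0], [5.0, 12.0]]

# Row norms are 5, 0 and 13, so the mixed norm is 18.
print(l21_norm(W))  # 18.0
```

Because each row contributes its whole Euclidean norm, the penalty tends to zero out entire rows at once rather than individual entries.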


  • X : {array-like, sparse matrix} of shape (n_samples, n_features) Training data. Pass directly as Fortran-contiguous data to avoid unnecessary memory duplication. If y is mono-output then X can be sparse.

  • y : {array-like, sparse matrix} of shape (n_samples,) or (n_samples, n_outputs) Target values

  • eps : float, default=1e-3 Length of the path. eps=1e-3 means that alpha_min / alpha_max = 1e-3

  • n_alphas : int, default=100 Number of alphas along the regularization path

  • alphas : ndarray, default=None List of alphas where to compute the models. If None alphas are set automatically

  • precompute : 'auto', bool or array-like of shape (n_features, n_features), default='auto' Whether to use a precomputed Gram matrix to speed up calculations. If set to 'auto', the choice is made automatically. The Gram matrix can also be passed as argument.

  • Xy : array-like of shape (n_features,) or (n_features, n_outputs), default=None Xy = np.dot(X.T, y) that can be precomputed. It is useful only when the Gram matrix is precomputed.

  • copy_X : bool, default=True If True, X will be copied; else, it may be overwritten.

  • coef_init : ndarray of shape (n_features, ), default=None The initial values of the coefficients.

  • verbose : bool or int, default=False Amount of verbosity.

  • return_n_iter : bool, default=False whether to return the number of iterations or not.

  • positive : bool, default=False If set to True, forces coefficients to be positive. (Only allowed when y.ndim == 1).

  • **params : kwargs keyword arguments passed to the coordinate descent solver.


  • alphas : ndarray of shape (n_alphas,) The alphas along the path where models are computed.

  • coefs : ndarray of shape (n_features, n_alphas) or (n_outputs, n_features, n_alphas) Coefficients along the path.

  • dual_gaps : ndarray of shape (n_alphas,) The dual gaps at the end of the optimization for each alpha.

  • n_iters : list of int The number of iterations taken by the coordinate descent optimizer to reach the specified tolerance for each alpha.


For an example, see :ref:examples/linear_model/ <>.

To avoid unnecessary memory duplication the X argument of the fit method should be directly passed as a Fortran-contiguous numpy array.

Note that in certain cases, the Lars solver may be significantly faster at computing this path. In particular, linear interpolation can be used to retrieve model coefficients between the values output by lars_path.


Comparing lasso_path and lars_path with interpolation:

>>> import numpy as np
>>> from sklearn.linear_model import lasso_path
>>> X = np.array([[1, 2, 3.1], [2.3, 5.4, 4.3]]).T
>>> y = np.array([1, 2, 3.1])
>>> # Use lasso_path to compute a coefficient path
>>> _, coef_path, _ = lasso_path(X, y, alphas=[5., 1., .5])
>>> print(coef_path)
[[0.         0.         0.46874778]
 [0.2159048  0.4425765  0.23689075]]
>>> # Now use lars_path and 1D linear interpolation to compute the
>>> # same path
>>> from sklearn.linear_model import lars_path
>>> alphas, active, coef_path_lars = lars_path(X, y, method='lasso')
>>> from scipy import interpolate
>>> coef_path_continuous = interpolate.interp1d(alphas[::-1],
...                                             coef_path_lars[:, ::-1])
>>> print(coef_path_continuous([5., 1., .5]))
[[0.         0.         0.46915237]
 [0.2159048  0.4425765  0.23668876]]

See also

lars_path Lasso LassoLars LassoCV LassoLarsCV sklearn.decomposition.sparse_encode


function orthogonal_mp
val orthogonal_mp :
  ?n_nonzero_coefs:int ->
  ?tol:float ->
  ?precompute:[`Auto | `Bool of bool] ->
  ?copy_X:bool ->
  ?return_path:bool ->
  ?return_n_iter:bool ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  unit ->
  ([>`ArrayLike] Np.Obj.t * Py.Object.t)

Orthogonal Matching Pursuit (OMP)

Solves n_targets Orthogonal Matching Pursuit problems. An instance of the problem has the form:

When parametrized by the number of non-zero coefficients using n_nonzero_coefs: argmin ||y - X\gamma||^2 subject to ||\gamma||_0 <= n_{nonzero coefs}

When parametrized by error using the parameter tol: argmin ||\gamma||_0 subject to ||y - X\gamma||^2 <= tol

Read more in the :ref:User Guide <omp>.
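
To make the n_nonzero_coefs formulation concrete, here is a minimal pure-Python sketch of the textbook greedy procedure: pick the column most correlated with the residual, then refit all selected coefficients by least squares. It is not the library's implementation. The helper names `solve` and `omp` are hypothetical, and the sketch assumes a single target and linearly independent selected columns.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve(A, b):
    """Solve the small dense system A z = b by Gaussian elimination."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - tail) / M[r][r]
    return z


def omp(X, y, n_nonzero_coefs):
    """Greedy sketch of argmin ||y - X gamma||^2 s.t. ||gamma||_0 <= n_nonzero_coefs."""
    n_features = len(X[0])
    columns = [[row[j] for row in X] for j in range(n_features)]
    residual, support, coefs = list(y), [], []
    for _ in range(n_nonzero_coefs):
        # Greedy step: unused column with the largest |correlation| with the residual.
        unused = [j for j in range(n_features) if j not in support]
        support.append(max(unused, key=lambda j: abs(dot(columns[j], residual))))
        # Orthogonal step: least-squares refit on the selected columns.
        gram = [[dot(columns[a], columns[b]) for b in support] for a in support]
        coefs = solve(gram, [dot(columns[a], y) for a in support])
        fitted = [
            sum(c * columns[j][i] for c, j in zip(coefs, support))
            for i in range(len(y))
        ]
        residual = [y_i - f_i for y_i, f_i in zip(y, fitted)]
    gamma = [0.0] * n_features
    for j, c in zip(support, coefs):
        gamma[j] = c
    return gamma


# Hypothetical toy problem with unit-norm (here orthonormal) columns.
X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = [0.0, 2.0, -3.0]
print(omp(X, y, n_nonzero_coefs=1))  # [0.0, 0.0, -3.0]
print(omp(X, y, n_nonzero_coefs=2))  # [0.0, 2.0, -3.0]
```

The refit step is what distinguishes orthogonal matching pursuit from plain matching pursuit: after each selection, the residual is orthogonal to every column chosen so far.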


  • X : array, shape (n_samples, n_features) Input data. Columns are assumed to have unit norm.

  • y : array, shape (n_samples,) or (n_samples, n_targets) Input targets

  • n_nonzero_coefs : int Desired number of non-zero entries in the solution. If None (by default) this value is set to 10% of n_features.

  • tol : float Maximum norm of the residual. If not None, overrides n_nonzero_coefs.

  • precompute : {True, False, 'auto'}, Whether to perform precomputations. Improves performance when n_targets or n_samples is very large.

  • copy_X : bool, optional Whether the design matrix X must be copied by the algorithm. A false value is only helpful if X is already Fortran-ordered, otherwise a copy is made anyway.

  • return_path : bool, optional. Default: False Whether to return every value of the nonzero coefficients along the forward path. Useful for cross-validation.

  • return_n_iter : bool, optional default False Whether or not to return the number of iterations.


  • coef : array, shape (n_features,) or (n_features, n_targets) Coefficients of the OMP solution. If return_path=True, this contains the whole coefficient path. In this case its shape is (n_features, n_features) or (n_features, n_targets, n_features) and iterating over the last axis yields coefficients in increasing order of active features.

  • n_iters : array-like or int Number of active features across every target. Returned only if return_n_iter is set to True.

See also

OrthogonalMatchingPursuit orthogonal_mp_gram lars_path decomposition.sparse_encode


Orthogonal matching pursuit was introduced in S. Mallat, Z. Zhang, Matching pursuits with time-frequency dictionaries, IEEE Transactions on Signal Processing, Vol. 41, No. 12. (December 1993), pp. 3397-3415.

This implementation is based on Rubinstein, R., Zibulevsky, M. and Elad, M., Efficient Implementation of the K-SVD Algorithm using Batch Orthogonal Matching Pursuit Technical Report - CS Technion, April 2008.



function orthogonal_mp_gram
val orthogonal_mp_gram :
  ?n_nonzero_coefs:int ->
  ?tol:float ->
  ?norms_squared:[>`ArrayLike] Np.Obj.t ->
  ?copy_Gram:bool ->
  ?copy_Xy:bool ->
  ?return_path:bool ->
  ?return_n_iter:bool ->
  gram:[>`ArrayLike] Np.Obj.t ->
  xy:[>`ArrayLike] Np.Obj.t ->
  unit ->
  ([>`ArrayLike] Np.Obj.t * Py.Object.t)

Gram Orthogonal Matching Pursuit (OMP)

Solves n_targets Orthogonal Matching Pursuit problems using only the Gram matrix X.T * X and the product X.T * y.

Read more in the :ref:User Guide <omp>.


  • Gram : array, shape (n_features, n_features) Gram matrix of the input data: X.T * X

  • Xy : array, shape (n_features,) or (n_features, n_targets) Input targets multiplied by X: X.T * y

  • n_nonzero_coefs : int Desired number of non-zero entries in the solution. If None (by default) this value is set to 10% of n_features.

  • tol : float Maximum norm of the residual. If not None, overrides n_nonzero_coefs.

  • norms_squared : array-like, shape (n_targets,) Squared L2 norms of the lines of y. Required if tol is not None.

  • copy_Gram : bool, optional Whether the gram matrix must be copied by the algorithm. A false value is only helpful if it is already Fortran-ordered, otherwise a copy is made anyway.

  • copy_Xy : bool, optional Whether the covariance vector Xy must be copied by the algorithm. If False, it may be overwritten.

  • return_path : bool, optional. Default: False Whether to return every value of the nonzero coefficients along the forward path. Useful for cross-validation.

  • return_n_iter : bool, optional default False Whether or not to return the number of iterations.


  • coef : array, shape (n_features,) or (n_features, n_targets) Coefficients of the OMP solution. If return_path=True, this contains the whole coefficient path. In this case its shape is (n_features, n_features) or (n_features, n_targets, n_features) and iterating over the last axis yields coefficients in increasing order of active features.

  • n_iters : array-like or int Number of active features across every target. Returned only if return_n_iter is set to True.

See also

OrthogonalMatchingPursuit orthogonal_mp lars_path decomposition.sparse_encode


Orthogonal matching pursuit was introduced in S. Mallat, Z. Zhang, Matching pursuits with time-frequency dictionaries, IEEE Transactions on Signal Processing, Vol. 41, No. 12. (December 1993), pp. 3397-3415.

This implementation is based on Rubinstein, R., Zibulevsky, M. and Elad, M., Efficient Implementation of the K-SVD Algorithm using Batch Orthogonal Matching Pursuit Technical Report - CS Technion, April 2008.



function ridge_regression
val ridge_regression :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  ?solver:[`Auto | `Svd | `Cholesky | `Lsqr | `Sparse_cg | `Sag | `Saga] ->
  ?max_iter:int ->
  ?tol:float ->
  ?verbose:int ->
  ?random_state:int ->
  ?return_n_iter:bool ->
  ?return_intercept:bool ->
  ?check_input:bool ->
  x:[`Arr of [>`ArrayLike] Np.Obj.t | `LinearOperator of Py.Object.t] ->
  y:[>`ArrayLike] Np.Obj.t ->
  alpha:[>`ArrayLike] Np.Obj.t ->
  unit ->
  ([>`ArrayLike] Np.Obj.t * int * [>`ArrayLike] Np.Obj.t)

Solve the ridge equation by the method of normal equations.

Read more in the :ref:User Guide <ridge_regression>.
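
As a small worked illustration of the normal-equations approach, the sketch below solves (X^T X + alpha * I) w = X^T y for a problem with exactly two features and no intercept. It assumes the textbook ridge objective ||y - Xw||^2_2 + alpha * ||w||^2_2. The helper name `ridge_two_features` and the toy data are hypothetical, and the library's solvers work differently in general.

```python
def ridge_two_features(X, y, alpha):
    """Solve (X^T X + alpha * I) w = X^T y for exactly two features."""
    # Entries of the symmetric 2x2 matrix X^T X + alpha * I.
    a = sum(r[0] * r[0] for r in X) + alpha
    b = sum(r[0] * r[1] for r in X)
    d = sum(r[1] * r[1] for r in X) + alpha
    # Entries of the right-hand side X^T y.
    p = sum(r[0] * t for r, t in zip(X, y))
    q = sum(r[1] * t for r, t in zip(X, y))
    # Explicit 2x2 inverse; det > 0 whenever alpha > 0.
    det = a * d - b * b
    return [(d * p - b * q) / det, (a * q - b * p) / det]


# Hypothetical toy data: two samples, two features.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [2.0, 4.0]
print(ridge_two_features(X, y, alpha=1.0))  # [1.0, 2.0]
print(ridge_two_features(X, y, alpha=0.0))  # [2.0, 4.0]
```

With alpha = 0 the result is the ordinary least-squares fit. With alpha = 1 each coefficient is halved here, showing how the penalty shrinks the weights and keeps the system well conditioned.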


  • X : {ndarray, sparse matrix, LinearOperator} of shape (n_samples, n_features) Training data

  • y : ndarray of shape (n_samples,) or (n_samples, n_targets) Target values

  • alpha : float or array-like of shape (n_targets,) Regularization strength; must be a positive float. Regularization improves the conditioning of the problem and reduces the variance of the estimates. Larger values specify stronger regularization. Alpha corresponds to 1 / (2C) in other linear models such as :class:~sklearn.linear_model.LogisticRegression or :class:sklearn.svm.LinearSVC. If an array is passed, penalties are assumed to be specific to the targets. Hence they must correspond in number.

  • sample_weight : float or array-like of shape (n_samples,), default=None Individual weights for each sample. If given a float, every sample will have the same weight. If sample_weight is not None and solver='auto', the solver will be set to 'cholesky'.

    .. versionadded:: 0.17

  • solver : {'auto', 'svd', 'cholesky', 'lsqr', 'sparse_cg', 'sag', 'saga'}, default='auto' Solver to use in the computational routines:

    • 'auto' chooses the solver automatically based on the type of data.

    • 'svd' uses a Singular Value Decomposition of X to compute the Ridge coefficients. More stable for singular matrices than 'cholesky'.

    • 'cholesky' uses the standard scipy.linalg.solve function to obtain a closed-form solution via a Cholesky decomposition of dot(X.T, X)

    • 'sparse_cg' uses the conjugate gradient solver as found in scipy.sparse.linalg.cg. As an iterative algorithm, this solver is more appropriate than 'cholesky' for large-scale data (possibility to set tol and max_iter).

    • 'lsqr' uses the dedicated regularized least-squares routine scipy.sparse.linalg.lsqr. It is the fastest and uses an iterative procedure.

    • 'sag' uses a Stochastic Average Gradient descent, and 'saga' uses its improved, unbiased version named SAGA. Both methods also use an iterative procedure, and are often faster than other solvers when both n_samples and n_features are large. Note that 'sag' and 'saga' fast convergence is only guaranteed on features with approximately the same scale. You can preprocess the data with a scaler from sklearn.preprocessing.

    The last five solvers all support both dense and sparse data. However, only 'sag' and 'sparse_cg' support sparse input when fit_intercept is True.

    .. versionadded:: 0.17 Stochastic Average Gradient descent solver.

    .. versionadded:: 0.19 SAGA solver.

  • max_iter : int, default=None Maximum number of iterations for conjugate gradient solver. For the 'sparse_cg' and 'lsqr' solvers, the default value is determined by scipy.sparse.linalg. For the 'sag' and 'saga' solvers, the default value is 1000.

  • tol : float, default=1e-3 Precision of the solution.

  • verbose : int, default=0 Verbosity level. Setting verbose > 0 will display additional information depending on the solver used.

  • random_state : int, RandomState instance, default=None Used when solver == 'sag' or 'saga' to shuffle the data. See :term:Glossary <random_state> for details.

  • return_n_iter : bool, default=False If True, the method also returns n_iter, the actual number of iterations performed by the solver.

    .. versionadded:: 0.17

  • return_intercept : bool, default=False If True and if X is sparse, the method also returns the intercept, and the solver is automatically changed to 'sag'. This is only a temporary fix for fitting the intercept with sparse data. For dense data, use sklearn.linear_model._preprocess_data before your regression.

    .. versionadded:: 0.17

  • check_input : bool, default=True If False, the input arrays X and y will not be checked.

    .. versionadded:: 0.21


  • coef : ndarray of shape (n_features,) or (n_targets, n_features) Weight vector(s).

  • n_iter : int, optional The actual number of iterations performed by the solver. Only returned if return_n_iter is True.

  • intercept : float or ndarray of shape (n_targets,) The intercept of the model. Only returned if return_intercept is True and if X is a scipy sparse array.


This function won't compute the intercept.