Kernel approximation
AdditiveChi2Sampler¶
Module Sklearn.Kernel_approximation.AdditiveChi2Sampler wraps Python class sklearn.kernel_approximation.AdditiveChi2Sampler.
type t
create¶
constructor and attributes create
val create :
?sample_steps:int ->
?sample_interval:float ->
unit ->
t
Approximate feature map for additive chi2 kernel.
Samples the Fourier transform of the kernel characteristic at regular intervals.
Since the kernel that is to be approximated is additive, the components of the input vectors can be treated separately. Each entry in the original space is transformed into 2*sample_steps-1 features, where sample_steps is a parameter of the method. Typical values of sample_steps include 1, 2 and 3.
Optimal choices for the sampling interval for certain data ranges can be computed (see the reference). The default values should be reasonable.
Read more in the :ref:`User Guide <additive_chi_kernel_approx>`.
Parameters
- sample_steps : int, optional Gives the number of (complex) sampling points.
- sample_interval : float, optional Sampling interval. Must be specified when sample_steps not in {1,2,3}.
Attributes
- sample_interval_ : float Stored sampling interval. Specified as a parameter if sample_steps not in {1,2,3}.
Examples
>>> from sklearn.datasets import load_digits
>>> from sklearn.linear_model import SGDClassifier
>>> from sklearn.kernel_approximation import AdditiveChi2Sampler
>>> X, y = load_digits(return_X_y=True)
>>> chi2sampler = AdditiveChi2Sampler(sample_steps=2)
>>> X_transformed = chi2sampler.fit_transform(X, y)
>>> clf = SGDClassifier(max_iter=5, random_state=0, tol=1e-3)
>>> clf.fit(X_transformed, y)
SGDClassifier(max_iter=5, random_state=0)
>>> clf.score(X_transformed, y)
0.9499...
Notes
This estimator approximates a slightly different version of the additive chi squared kernel than metric.additive_chi2 computes.
See also
- SkewedChi2Sampler : A Fourier-approximation to a non-additive variant of the chi squared kernel.
- sklearn.metrics.pairwise.chi2_kernel : The exact chi squared kernel.
- sklearn.metrics.pairwise.additive_chi2_kernel : The exact additive chi squared kernel.
References
See 'Efficient additive kernels via explicit feature maps' <http://www.robots.ox.ac.uk/~vedaldi/assets/pubs/vedaldi11efficient.pdf>, A. Vedaldi and A. Zisserman, Pattern Analysis and Machine Intelligence, 2011.
fit¶
method fit
val fit :
?y:Py.Object.t ->
x:[>`ArrayLike] Np.Obj.t ->
[> tag] Obj.t ->
t
Set the parameters
Parameters
- X : array-like, shape (n_samples, n_features) Training data, where n_samples is the number of samples and n_features is the number of features.
Returns
- self : object Returns the transformer.
fit_transform¶
method fit_transform
val fit_transform :
?y:[>`ArrayLike] Np.Obj.t ->
?fit_params:(string * Py.Object.t) list ->
x:[>`ArrayLike] Np.Obj.t ->
[> tag] Obj.t ->
[>`ArrayLike] Np.Obj.t
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
Parameters
- X : {array-like, sparse matrix, dataframe} of shape (n_samples, n_features)
- y : ndarray of shape (n_samples,), default=None Target values.
- **fit_params : dict Additional fit parameters.
Returns
- X_new : ndarray array of shape (n_samples, n_features_new) Transformed array.
get_params¶
method get_params
val get_params :
?deep:bool ->
[> tag] Obj.t ->
Dict.t
Get parameters for this estimator.
Parameters
- deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns
- params : mapping of string to any Parameter names mapped to their values.
set_params¶
method set_params
val set_params :
?params:(string * Py.Object.t) list ->
[> tag] Obj.t ->
t
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
Parameters
- **params : dict Estimator parameters.
Returns
- self : object Estimator instance.
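The <component>__<parameter> convention can be illustrated with a minimal pure-Python sketch. The Component class below is a hypothetical stand-in, not part of the library; it only shows how a key might be split at the first double underscore and delegated to a nested object.

```python
class Component:
    """Hypothetical stand-in for an estimator whose parameters are plain attributes."""

    def __init__(self, **params):
        for name, value in params.items():
            setattr(self, name, value)

    def set_params(self, **params):
        for key, value in params.items():
            # Split "<component>__<parameter>" at the first double underscore.
            name, delim, sub_key = key.partition("__")
            if not hasattr(self, name):
                raise ValueError(f"Invalid parameter {name!r}")
            if delim:
                # Delegate the rest of the key to the nested object.
                getattr(self, name).set_params(**{sub_key: value})
            else:
                setattr(self, name, value)
        return self


inner = Component(gamma=0.1)
outer = Component(step=inner, n_components=100)
result = outer.set_params(step__gamma=0.5, n_components=300)
```

After the call, the nested object's gamma and the outer object's n_components are both updated, and the outer object itself is returned.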
transform¶
method transform
val transform :
x:[>`ArrayLike] Np.Obj.t ->
[> tag] Obj.t ->
[>`ArrayLike] Np.Obj.t
Apply approximate feature map to X.
Parameters
- X : {array-like, sparse matrix} of shape (n_samples, n_features)
Returns
- X_new : {array, sparse matrix}, shape = (n_samples, n_features * (2*sample_steps - 1)) Whether the return value is an array or sparse matrix depends on the type of the input X.
sample_interval_¶
attribute sample_interval_
val sample_interval_ : t -> float
val sample_interval_opt : t -> (float) option
This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.
to_string¶
method to_string
val to_string: t -> string
Print the object to a human-readable representation.
show¶
method show
val show: t -> string
Print the object to a human-readable representation.
pp¶
method pp
val pp: Format.formatter -> t -> unit
Pretty-print the object to a formatter.
Nystroem¶
Module Sklearn.Kernel_approximation.Nystroem wraps Python class sklearn.kernel_approximation.Nystroem.
type t
create¶
constructor and attributes create
val create :
?kernel:[`S of string | `Callable of Py.Object.t] ->
?gamma:float ->
?coef0:float ->
?degree:float ->
?kernel_params:Dict.t ->
?n_components:int ->
?random_state:int ->
unit ->
t
Approximate a kernel map using a subset of the training data.
Constructs an approximate feature map for an arbitrary kernel using a subset of the data as basis.
Read more in the :ref:`User Guide <nystroem_kernel_approx>`.
.. versionadded:: 0.13
Parameters
- kernel : string or callable, default='rbf' Kernel map to be approximated. A callable should accept two arguments and the keyword arguments passed to this object as kernel_params, and should return a floating point number.
- gamma : float, default=None Gamma parameter for the RBF, laplacian, polynomial, exponential chi2 and sigmoid kernels. Interpretation of the default value is left to the kernel; see the documentation for sklearn.metrics.pairwise. Ignored by other kernels.
- coef0 : float, default=None Zero coefficient for polynomial and sigmoid kernels. Ignored by other kernels.
- degree : float, default=None Degree of the polynomial kernel. Ignored by other kernels.
- kernel_params : mapping of string to any, optional Additional parameters (keyword arguments) for kernel function passed as callable object.
- n_components : int Number of features to construct. How many data points will be used to construct the mapping.
- random_state : int, RandomState instance or None, optional (default=None) Pseudo-random number generator to control the uniform sampling without replacement of n_components of the training data to construct the basis kernel. Pass an int for reproducible output across multiple function calls. See :term:`Glossary <random_state>`.
Attributes
- components_ : array, shape (n_components, n_features) Subset of training points used to construct the feature map.
- component_indices_ : array, shape (n_components) Indices of components_ in the training set.
- normalization_ : array, shape (n_components, n_components) Normalization matrix needed for embedding. Square root of the kernel matrix on components_.
Examples
>>> from sklearn import datasets, svm
>>> from sklearn.kernel_approximation import Nystroem
>>> X, y = datasets.load_digits(n_class=9, return_X_y=True)
>>> data = X / 16.
>>> clf = svm.LinearSVC()
>>> feature_map_nystroem = Nystroem(gamma=.2,
... random_state=1,
... n_components=300)
>>> data_transformed = feature_map_nystroem.fit_transform(data)
>>> clf.fit(data_transformed, y)
LinearSVC()
>>> clf.score(data_transformed, y)
0.9987...
References
- Williams, C.K.I. and Seeger, M. 'Using the Nystroem method to speed up kernel machines', Advances in neural information processing systems 2001
- T. Yang, Y. Li, M. Mahdavi, R. Jin and Z. Zhou 'Nystroem Method vs Random Fourier Features: A Theoretical and Empirical Comparison', Advances in Neural Information Processing Systems 2012
See also
- RBFSampler : An approximation to the RBF kernel using random Fourier features.
- sklearn.metrics.pairwise.kernel_metrics : List of built-in kernels.
fit¶
method fit
val fit :
?y:Py.Object.t ->
x:[>`ArrayLike] Np.Obj.t ->
[> tag] Obj.t ->
t
Fit estimator to data.
Samples a subset of training points, computes kernel on these and computes normalization matrix.
Parameters
- X : array-like of shape (n_samples, n_features) Training data.
fit_transform¶
method fit_transform
val fit_transform :
?y:[>`ArrayLike] Np.Obj.t ->
?fit_params:(string * Py.Object.t) list ->
x:[>`ArrayLike] Np.Obj.t ->
[> tag] Obj.t ->
[>`ArrayLike] Np.Obj.t
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
Parameters
- X : {array-like, sparse matrix, dataframe} of shape (n_samples, n_features)
- y : ndarray of shape (n_samples,), default=None Target values.
- **fit_params : dict Additional fit parameters.
Returns
- X_new : ndarray array of shape (n_samples, n_features_new) Transformed array.
get_params¶
method get_params
val get_params :
?deep:bool ->
[> tag] Obj.t ->
Dict.t
Get parameters for this estimator.
Parameters
- deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns
- params : mapping of string to any Parameter names mapped to their values.
set_params¶
method set_params
val set_params :
?params:(string * Py.Object.t) list ->
[> tag] Obj.t ->
t
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
Parameters
- **params : dict Estimator parameters.
Returns
- self : object Estimator instance.
transform¶
method transform
val transform :
x:[>`ArrayLike] Np.Obj.t ->
[> tag] Obj.t ->
[>`ArrayLike] Np.Obj.t
Apply feature map to X.
Computes an approximate feature map using the kernel between some training points and X.
Parameters
- X : array-like of shape (n_samples, n_features) Data to transform.
Returns
- X_transformed : array, shape=(n_samples, n_components) Transformed data.
components_¶
attribute components_
val components_ : t -> [>`ArrayLike] Np.Obj.t
val components_opt : t -> ([>`ArrayLike] Np.Obj.t) option
This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.
component_indices_¶
attribute component_indices_
val component_indices_ : t -> [>`ArrayLike] Np.Obj.t
val component_indices_opt : t -> ([>`ArrayLike] Np.Obj.t) option
This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.
normalization_¶
attribute normalization_
val normalization_ : t -> [>`ArrayLike] Np.Obj.t
val normalization_opt : t -> ([>`ArrayLike] Np.Obj.t) option
This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.
to_string¶
method to_string
val to_string: t -> string
Print the object to a human-readable representation.
show¶
method show
val show: t -> string
Print the object to a human-readable representation.
pp¶
method pp
val pp: Format.formatter -> t -> unit
Pretty-print the object to a formatter.
RBFSampler¶
Module Sklearn.Kernel_approximation.RBFSampler wraps Python class sklearn.kernel_approximation.RBFSampler.
type t
create¶
constructor and attributes create
val create :
?gamma:float ->
?n_components:int ->
?random_state:int ->
unit ->
t
Approximates feature map of an RBF kernel by Monte Carlo approximation of its Fourier transform.
It implements a variant of Random Kitchen Sinks.[1]
Read more in the :ref:`User Guide <rbf_kernel_approx>`.
Parameters
- gamma : float Parameter of RBF kernel: exp(-gamma * x^2)
- n_components : int Number of Monte Carlo samples per original feature. Equals the dimensionality of the computed feature space.
- random_state : int, RandomState instance or None, optional (default=None) Pseudo-random number generator to control the generation of the random weights and random offset when fitting the training data. Pass an int for reproducible output across multiple function calls. See :term:`Glossary <random_state>`.
Attributes
- random_offset_ : ndarray of shape (n_components,), dtype=float64 Random offset used to compute the projection in the n_components dimensions of the feature space.
- random_weights_ : ndarray of shape (n_features, n_components), dtype=float64 Random projection directions drawn from the Fourier transform of the RBF kernel.
Examples
>>> from sklearn.kernel_approximation import RBFSampler
>>> from sklearn.linear_model import SGDClassifier
>>> X = [[0, 0], [1, 1], [1, 0], [0, 1]]
>>> y = [0, 0, 1, 1]
>>> rbf_feature = RBFSampler(gamma=1, random_state=1)
>>> X_features = rbf_feature.fit_transform(X)
>>> clf = SGDClassifier(max_iter=5, tol=1e-3)
>>> clf.fit(X_features, y)
SGDClassifier(max_iter=5)
>>> clf.score(X_features, y)
1.0
Notes
See 'Random Features for Large-Scale Kernel Machines' by A. Rahimi and Benjamin Recht.
[1] 'Weighted Sums of Random Kitchen Sinks: Replacing minimization with randomization in learning' by A. Rahimi and Benjamin Recht. (https://people.eecs.berkeley.edu/~brecht/papers/08.rah.rec.nips.pdf)
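The Monte Carlo idea described above can be sketched in pure Python. This is an illustrative sketch under stated assumptions, not the library's implementation: the helper names fit_rbf_features and transform_rbf are hypothetical, weights are drawn from a normal distribution with standard deviation sqrt(2 * gamma), and offsets are drawn uniformly from [0, 2*pi]. The dot product of two mapped vectors should then approximate exp(-gamma * ||x - y||^2).

```python
import math
import random


def fit_rbf_features(n_features, n_components, gamma, seed):
    """Draw random weights and offsets for a random Fourier feature map."""
    rng = random.Random(seed)
    scale = math.sqrt(2.0 * gamma)  # std of the Gaussian spectral density
    weights = [[rng.gauss(0.0, scale) for _ in range(n_components)]
               for _ in range(n_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_components)]
    return weights, offsets


def transform_rbf(x, weights, offsets):
    """Map one sample x to n_components cosine features."""
    n_components = len(offsets)
    norm = math.sqrt(2.0 / n_components)
    return [
        norm * math.cos(
            sum(x[i] * weights[i][j] for i in range(len(x))) + offsets[j]
        )
        for j in range(n_components)
    ]


x, y = [0.0, 1.0], [1.0, 0.5]
gamma = 1.0
weights, offsets = fit_rbf_features(2, 5000, gamma, seed=0)
zx = transform_rbf(x, weights, offsets)
zy = transform_rbf(y, weights, offsets)

# Inner product in the random feature space versus the exact kernel value.
approx = sum(a * b for a, b in zip(zx, zy))
exact = math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))
```

The approximation error shrinks roughly with the square root of the number of components, which is why n_components equals the dimensionality of the computed feature space.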
fit¶
method fit
val fit :
?y:Py.Object.t ->
x:[>`ArrayLike] Np.Obj.t ->
[> tag] Obj.t ->
t
Fit the model with X.
Samples random projection according to n_features.
Parameters
- X : {array-like, sparse matrix}, shape (n_samples, n_features) Training data, where n_samples is the number of samples and n_features is the number of features.
Returns
- self : object Returns the transformer.
fit_transform¶
method fit_transform
val fit_transform :
?y:[>`ArrayLike] Np.Obj.t ->
?fit_params:(string * Py.Object.t) list ->
x:[>`ArrayLike] Np.Obj.t ->
[> tag] Obj.t ->
[>`ArrayLike] Np.Obj.t
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
Parameters
- X : {array-like, sparse matrix, dataframe} of shape (n_samples, n_features)
- y : ndarray of shape (n_samples,), default=None Target values.
- **fit_params : dict Additional fit parameters.
Returns
- X_new : ndarray array of shape (n_samples, n_features_new) Transformed array.
get_params¶
method get_params
val get_params :
?deep:bool ->
[> tag] Obj.t ->
Dict.t
Get parameters for this estimator.
Parameters
- deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns
- params : mapping of string to any Parameter names mapped to their values.
set_params¶
method set_params
val set_params :
?params:(string * Py.Object.t) list ->
[> tag] Obj.t ->
t
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
Parameters
- **params : dict Estimator parameters.
Returns
- self : object Estimator instance.
transform¶
method transform
val transform :
x:[>`ArrayLike] Np.Obj.t ->
[> tag] Obj.t ->
[>`ArrayLike] Np.Obj.t
Apply the approximate feature map to X.
Parameters
- X : {array-like, sparse matrix}, shape (n_samples, n_features) New data, where n_samples is the number of samples and n_features is the number of features.
Returns
- X_new : array-like, shape (n_samples, n_components)
random_offset_¶
attribute random_offset_
val random_offset_ : t -> Py.Object.t
val random_offset_opt : t -> (Py.Object.t) option
This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.
random_weights_¶
attribute random_weights_
val random_weights_ : t -> Py.Object.t
val random_weights_opt : t -> (Py.Object.t) option
This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.
to_string¶
method to_string
val to_string: t -> string
Print the object to a human-readable representation.
show¶
method show
val show: t -> string
Print the object to a human-readable representation.
pp¶
method pp
val pp: Format.formatter -> t -> unit
Pretty-print the object to a formatter.
SkewedChi2Sampler¶
Module Sklearn.Kernel_approximation.SkewedChi2Sampler wraps Python class sklearn.kernel_approximation.SkewedChi2Sampler.
type t
create¶
constructor and attributes create
val create :
?skewedness:float ->
?n_components:int ->
?random_state:int ->
unit ->
t
Approximates feature map of the 'skewed chi-squared' kernel by Monte Carlo approximation of its Fourier transform.
Read more in the :ref:`User Guide <skewed_chi_kernel_approx>`.
Parameters
- skewedness : float 'skewedness' parameter of the kernel. Needs to be cross-validated.
- n_components : int Number of Monte Carlo samples per original feature. Equals the dimensionality of the computed feature space.
- random_state : int, RandomState instance or None, optional (default=None) Pseudo-random number generator to control the generation of the random weights and random offset when fitting the training data. Pass an int for reproducible output across multiple function calls. See :term:`Glossary <random_state>`.
Examples
>>> from sklearn.kernel_approximation import SkewedChi2Sampler
>>> from sklearn.linear_model import SGDClassifier
>>> X = [[0, 0], [1, 1], [1, 0], [0, 1]]
>>> y = [0, 0, 1, 1]
>>> chi2_feature = SkewedChi2Sampler(skewedness=.01,
... n_components=10,
... random_state=0)
>>> X_features = chi2_feature.fit_transform(X, y)
>>> clf = SGDClassifier(max_iter=10, tol=1e-3)
>>> clf.fit(X_features, y)
SGDClassifier(max_iter=10)
>>> clf.score(X_features, y)
1.0
References
See 'Random Fourier Approximations for Skewed Multiplicative Histogram Kernels' by Fuxin Li, Catalin Ionescu and Cristian Sminchisescu.
See also
- AdditiveChi2Sampler : A different approach for approximating an additive variant of the chi squared kernel.
- sklearn.metrics.pairwise.chi2_kernel : The exact chi squared kernel.
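As a rough illustration of the Monte Carlo approximation described for this class, the sketch below maps samples through cos(w . log(x + c) + b) with weights drawn by inverse-CDF sampling from the density sech(pi * w). It assumes the skewed chi-squared kernel has the product form 2*sqrt(x_i + c)*sqrt(y_i + c) / (x_i + y_i + 2c), with c the skewedness. The helper names are hypothetical and the code is not the library's implementation.

```python
import math
import random


def fit_skewed_chi2(n_features, n_components, seed):
    """Draw weights from the density sech(pi * w) and uniform offsets."""
    rng = random.Random(seed)
    # 1 - random() lies in (0, 1], which keeps the logarithm finite.
    weights = [
        [math.log(math.tan(math.pi / 2.0 * (1.0 - rng.random()))) / math.pi
         for _ in range(n_components)]
        for _ in range(n_features)
    ]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_components)]
    return weights, offsets


def transform_skewed_chi2(x, skewedness, weights, offsets):
    """Map one sample; every value of x must exceed -skewedness."""
    logs = [math.log(v + skewedness) for v in x]
    n_components = len(offsets)
    norm = math.sqrt(2.0 / n_components)
    return [
        norm * math.cos(
            sum(logs[i] * weights[i][j] for i in range(len(x))) + offsets[j]
        )
        for j in range(n_components)
    ]


def skewed_chi2_kernel(x, y, c):
    """Exact kernel value under the product form assumed above."""
    value = 1.0
    for a, b in zip(x, y):
        value *= 2.0 * math.sqrt(a + c) * math.sqrt(b + c) / (a + b + 2.0 * c)
    return value


x, y, c = [0.2, 0.7], [0.5, 0.1], 0.5
weights, offsets = fit_skewed_chi2(2, 5000, seed=0)
zx = transform_skewed_chi2(x, c, weights, offsets)
zy = transform_skewed_chi2(y, c, weights, offsets)
approx = sum(a * b for a, b in zip(zx, zy))
exact = skewed_chi2_kernel(x, y, c)
```

The logarithm in the transform is the reason inputs must be strictly greater than -skewedness.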
fit¶
method fit
val fit :
?y:Py.Object.t ->
x:[>`ArrayLike] Np.Obj.t ->
[> tag] Obj.t ->
t
Fit the model with X.
Samples random projection according to n_features.
Parameters
- X : array-like, shape (n_samples, n_features) Training data, where n_samples is the number of samples and n_features is the number of features.
Returns
- self : object Returns the transformer.
fit_transform¶
method fit_transform
val fit_transform :
?y:[>`ArrayLike] Np.Obj.t ->
?fit_params:(string * Py.Object.t) list ->
x:[>`ArrayLike] Np.Obj.t ->
[> tag] Obj.t ->
[>`ArrayLike] Np.Obj.t
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
Parameters
- X : {array-like, sparse matrix, dataframe} of shape (n_samples, n_features)
- y : ndarray of shape (n_samples,), default=None Target values.
- **fit_params : dict Additional fit parameters.
Returns
- X_new : ndarray array of shape (n_samples, n_features_new) Transformed array.
get_params¶
method get_params
val get_params :
?deep:bool ->
[> tag] Obj.t ->
Dict.t
Get parameters for this estimator.
Parameters
- deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns
- params : mapping of string to any Parameter names mapped to their values.
set_params¶
method set_params
val set_params :
?params:(string * Py.Object.t) list ->
[> tag] Obj.t ->
t
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
Parameters
- **params : dict Estimator parameters.
Returns
- self : object Estimator instance.
transform¶
method transform
val transform :
x:[>`ArrayLike] Np.Obj.t ->
[> tag] Obj.t ->
[>`ArrayLike] Np.Obj.t
Apply the approximate feature map to X.
Parameters
- X : array-like, shape (n_samples, n_features) New data, where n_samples is the number of samples and n_features is the number of features. All values of X must be strictly greater than '-skewedness'.
Returns
- X_new : array-like, shape (n_samples, n_components)
to_string¶
method to_string
val to_string: t -> string
Print the object to a human-readable representation.
show¶
method show
val show: t -> string
Print the object to a human-readable representation.
pp¶
method pp
val pp: Format.formatter -> t -> unit
Pretty-print the object to a formatter.
as_float_array¶
function as_float_array
val as_float_array :
?copy:bool ->
?force_all_finite:[`Allow_nan | `Bool of bool] ->
x:[>`ArrayLike] Np.Obj.t ->
unit ->
[>`ArrayLike] Np.Obj.t
Converts an array-like to an array of floats.
The new dtype will be np.float32 or np.float64, depending on the original type. The function can create a copy or modify the argument depending on the argument copy.
Parameters
- X : {array-like, sparse matrix}
- copy : bool, optional If True, a copy of X will be created. If False, a copy may still be returned if X's dtype is not a floating point type.
- force_all_finite : boolean or 'allow-nan', (default=True) Whether to raise an error on np.inf, np.nan, pd.NA in X. The possibilities are:
- True: Force all values of X to be finite.
- False: accepts np.inf, np.nan, pd.NA in X.
- 'allow-nan': accepts only np.nan and pd.NA values in X. Values cannot be infinite.
.. versionadded:: 0.20 force_all_finite accepts the string 'allow-nan'.
.. versionchanged:: 0.23 Accepts pd.NA and converts it into np.nan
Returns
- XT : {array, sparse matrix} An array of type np.float
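The three force_all_finite modes listed above can be sketched for a flat list of floats. The function check_finite is a hypothetical illustration of the documented semantics only; it does not reproduce the library's validation code and ignores pd.NA.

```python
import math


def check_finite(values, force_all_finite=True):
    """Raise ValueError for values that the given mode does not accept."""
    if force_all_finite is False:
        return values  # both infinities and NaN are accepted
    for v in values:
        if math.isnan(v):
            if force_all_finite == "allow-nan":
                continue  # NaN is tolerated in this mode
            raise ValueError("input contains NaN")
        if math.isinf(v):
            # Infinite values are rejected by True and by 'allow-nan'.
            raise ValueError("input contains infinity")
    return values


ok_nan = check_finite([1.0, float("nan")], force_all_finite="allow-nan")
ok_inf = check_finite([float("inf")], force_all_finite=False)
```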
check_array¶
function check_array
val check_array :
?accept_sparse:[`S of string | `StringList of string list | `Bool of bool] ->
?accept_large_sparse:bool ->
?dtype:[`Dtypes of Np.Dtype.t list | `S of string | `Dtype of Np.Dtype.t | `None] ->
?order:[`F | `C] ->
?copy:bool ->
?force_all_finite:[`Allow_nan | `Bool of bool] ->
?ensure_2d:bool ->
?allow_nd:bool ->
?ensure_min_samples:int ->
?ensure_min_features:int ->
?estimator:[>`BaseEstimator] Np.Obj.t ->
array:Py.Object.t ->
unit ->
Py.Object.t
Input validation on an array, list, sparse matrix or similar.
By default, the input is checked to be a non-empty 2D array containing only finite values. If the dtype of the array is object, attempt converting to float, raising on failure.
Parameters
- array : object Input object to check / convert.
- accept_sparse : string, boolean or list/tuple of strings (default=False) String[s] representing allowed sparse matrix formats, such as 'csc', 'csr', etc. If the input is sparse but not in the allowed format, it will be converted to the first listed format. True allows the input to be any format. False means that a sparse matrix input will raise an error.
- accept_large_sparse : bool (default=True) If a CSR, CSC, COO or BSR sparse matrix is supplied and accepted by accept_sparse, accept_large_sparse=False will cause it to be accepted only if its indices are stored with a 32-bit dtype.
.. versionadded:: 0.20
- dtype : string, type, list of types or None (default='numeric') Data type of result. If None, the dtype of the input is preserved. If 'numeric', dtype is preserved unless array.dtype is object. If dtype is a list of types, conversion on the first type is only performed if the dtype of the input is not in the list.
- order : 'F', 'C' or None (default=None) Whether an array will be forced to be fortran or c-style. When order is None (default), then if copy=False, nothing is ensured about the memory layout of the output array; otherwise (copy=True) the memory layout of the returned array is kept as close as possible to the original array.
- copy : boolean (default=False) Whether a forced copy will be triggered. If copy=False, a copy might be triggered by a conversion.
- force_all_finite : boolean or 'allow-nan', (default=True) Whether to raise an error on np.inf, np.nan, pd.NA in array. The possibilities are:
- True: Force all values of array to be finite.
- False: accepts np.inf, np.nan, pd.NA in array.
- 'allow-nan': accepts only np.nan and pd.NA values in array. Values cannot be infinite.
.. versionadded:: 0.20 force_all_finite accepts the string 'allow-nan'.
.. versionchanged:: 0.23 Accepts pd.NA and converts it into np.nan
- ensure_2d : boolean (default=True) Whether to raise a value error if array is not 2D.
- allow_nd : boolean (default=False) Whether to allow array.ndim > 2.
- ensure_min_samples : int (default=1) Make sure that the array has a minimum number of samples in its first axis (rows for a 2D array). Setting to 0 disables this check.
- ensure_min_features : int (default=1) Make sure that the 2D array has some minimum number of features (columns). The default value of 1 rejects empty datasets. This check is only enforced when the input data has effectively 2 dimensions or is originally 1D and ensure_2d is True. Setting to 0 disables this check.
- estimator : str or estimator instance (default=None) If passed, include the name of the estimator in warning messages.
Returns
- array_converted : object The converted and validated array.
check_is_fitted¶
function check_is_fitted
val check_is_fitted :
?attributes:[`Arr of [>`ArrayLike] Np.Obj.t | `S of string | `StringList of string list] ->
?msg:string ->
?all_or_any:[`Callable of Py.Object.t | `PyObject of Py.Object.t] ->
estimator:[>`BaseEstimator] Np.Obj.t ->
unit ->
Py.Object.t
Perform is_fitted validation for estimator.
Checks if the estimator is fitted by verifying the presence of fitted attributes (ending with a trailing underscore) and otherwise raises a NotFittedError with the given message.
This utility is meant to be used internally by estimators themselves, typically in their own predict / transform methods.
Parameters
- estimator : estimator instance. Estimator instance for which the check is performed.
- attributes : str, list or tuple of str, default=None Attribute name(s) given as string or a list/tuple of strings. Eg.: ['coef_', 'estimator_', ...], 'coef_'. If None, estimator is considered fitted if there exists an attribute that ends with an underscore and does not start with double underscore.
- msg : string The default error message is, 'This %(name)s instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.' For custom messages if '%(name)s' is present in the message string, it is substituted for the estimator name. Eg. : 'Estimator, %(name)s, must be fitted before sparsifying'.
- all_or_any : callable, {all, any}, default all Specify whether all or any of the given attributes must exist.
Returns
None
Raises
NotFittedError If the attributes are not found.
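The check described above can be approximated in a few lines of pure Python. The names is_fitted_sketch, the local NotFittedError class and the Toy estimator are all hypothetical; the sketch follows the documented rules (explicit attribute names, or any attribute with a trailing underscore that does not start with a double underscore) and is not the library's code.

```python
class NotFittedError(ValueError, AttributeError):
    """Local stand-in for the exception of the same name."""


def is_fitted_sketch(estimator, attributes=None, msg=None, all_or_any=all):
    """Raise NotFittedError unless the estimator looks fitted."""
    if msg is None:
        msg = ("This %(name)s instance is not fitted yet. Call 'fit' with "
               "appropriate arguments before using this estimator.")
    if attributes is not None:
        if isinstance(attributes, str):
            attributes = [attributes]
        fitted = all_or_any([hasattr(estimator, a) for a in attributes])
    else:
        # Fall back to looking for any fitted-style attribute name.
        fitted = any(
            name.endswith("_") and not name.startswith("__")
            for name in vars(estimator)
        )
    if not fitted:
        raise NotFittedError(msg % {"name": type(estimator).__name__})


class Toy:
    """Hypothetical estimator that stores coef_ when fitted."""

    def fit(self):
        self.coef_ = [1.0]
        return self


try:
    is_fitted_sketch(Toy())
    message = None
except NotFittedError as exc:
    message = str(exc)

is_fitted_sketch(Toy().fit(), attributes="coef_")  # passes silently
```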
check_non_negative¶
function check_non_negative
val check_non_negative :
x:[>`ArrayLike] Np.Obj.t ->
whom:string ->
unit ->
Py.Object.t
Check if there is any negative value in an array.
Parameters
-
X : array-like or sparse matrix Input data.
-
whom : string Who passed X to this function.
check_random_state¶
function check_random_state
val check_random_state :
[`Optional of [`I of int | `None] | `RandomState of Py.Object.t] ->
Py.Object.t
Turn seed into a np.random.RandomState instance
Parameters
- seed : None | int | instance of RandomState If seed is None, return the RandomState singleton used by np.random. If seed is an int, return a new RandomState instance seeded with seed. If seed is already a RandomState instance, return it. Otherwise raise ValueError.
pairwise_kernels¶
function pairwise_kernels
val pairwise_kernels :
?y:[>`ArrayLike] Np.Obj.t ->
?metric:[`S of string | `Callable of Py.Object.t] ->
?filter_params:bool ->
?n_jobs:int ->
?kwds:(string * Py.Object.t) list ->
x:[`Arr of [>`ArrayLike] Np.Obj.t | `Otherwise of Py.Object.t] ->
unit ->
[>`ArrayLike] Np.Obj.t
Compute the kernel between arrays X and optional array Y.
This method takes either a vector array or a kernel matrix, and returns a kernel matrix. If the input is a vector array, the kernels are computed. If the input is a kernel matrix, it is returned instead.
This method provides a safe way to take a kernel matrix as input, while preserving compatibility with many other algorithms that take a vector array.
If Y is given (default is None), then the returned matrix is the pairwise kernel between the arrays from both X and Y.
Valid values for metric are: ['additive_chi2', 'chi2', 'linear', 'poly', 'polynomial', 'rbf', 'laplacian', 'sigmoid', 'cosine']
Read more in the :ref:`User Guide <metrics>`.
Parameters
- X : array [n_samples_a, n_samples_a] if metric == 'precomputed', or, [n_samples_a, n_features] otherwise Array of pairwise kernels between samples, or a feature array.
- Y : array [n_samples_b, n_features] A second feature array only if X has shape [n_samples_a, n_features].
- metric : string, or callable The metric to use when calculating kernel between instances in a feature array. If metric is a string, it must be one of the metrics in pairwise.PAIRWISE_KERNEL_FUNCTIONS. If metric is 'precomputed', X is assumed to be a kernel matrix. Alternatively, if metric is a callable function, it is called on each pair of instances (rows) and the resulting value recorded. The callable should take two rows from X as input and return the corresponding kernel value as a single number. This means that callables from :mod:`sklearn.metrics.pairwise` are not allowed, as they operate on matrices, not single samples. Use the string identifying the kernel instead.
- filter_params : boolean Whether to filter invalid parameters or not.
- n_jobs : int or None, optional (default=None) The number of jobs to use for the computation. This works by breaking down the pairwise matrix into n_jobs even slices and computing them in parallel. None means 1 unless in a :obj:`joblib.parallel_backend` context. -1 means using all processors. See :term:`Glossary <n_jobs>` for more details.
- **kwds : optional keyword parameters Any further parameters are passed directly to the kernel function.
Returns
- K : array [n_samples_a, n_samples_a] or [n_samples_a, n_samples_b] A kernel matrix K such that K_{i, j} is the kernel between the ith and jth vectors of the given matrix X, if Y is None. If Y is not None, then K_{i, j} is the kernel between the ith array from X and the jth array from Y.
Notes
If metric is 'precomputed', Y is ignored and X is returned.
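The behaviour for a callable metric and for 'precomputed' can be sketched with nested lists. The function pairwise_kernels_sketch and the linear helper are hypothetical; string metric names other than 'precomputed', filter_params and n_jobs are deliberately left out.

```python
def pairwise_kernels_sketch(X, Y=None, metric=None):
    """Kernel matrix between the rows of X and Y for a callable metric."""
    if metric == "precomputed":
        return X  # X is already a kernel matrix; Y is ignored
    if Y is None:
        Y = X
    # The callable is applied to each pair of rows and returns one number.
    return [[metric(x, y) for y in Y] for x in X]


def linear(x, y):
    """Linear kernel on two rows, used here as an example callable."""
    return sum(a * b for a, b in zip(x, y))


X = [[1.0, 0.0], [0.0, 2.0]]
Y = [[1.0, 1.0]]
K_xx = pairwise_kernels_sketch(X, metric=linear)        # shape 2 x 2
K_xy = pairwise_kernels_sketch(X, Y, metric=linear)     # shape 2 x 1
K_pre = pairwise_kernels_sketch(K_xx, metric="precomputed")
```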
safe_sparse_dot¶
function safe_sparse_dot
val safe_sparse_dot :
?dense_output:Py.Object.t ->
a:[>`ArrayLike] Np.Obj.t ->
b:Py.Object.t ->
unit ->
[>`ArrayLike] Np.Obj.t
Dot product that handles the sparse matrix case correctly.
Parameters
- a : array or sparse matrix
- b : array or sparse matrix
- dense_output : boolean, (default=False) When False, a and b both being sparse will yield sparse output. When True, output will always be a dense array.
Returns
- dot_product : array or sparse matrix. Sparse if a and b are sparse and dense_output=False.
svd¶
function svd
val svd :
?full_matrices:bool ->
?compute_uv:bool ->
?overwrite_a:bool ->
?check_finite:bool ->
?lapack_driver:[`Gesdd | `Gesvd] ->
a:[>`ArrayLike] Np.Obj.t ->
unit ->
([>`ArrayLike] Np.Obj.t * [>`ArrayLike] Np.Obj.t * [>`ArrayLike] Np.Obj.t)
Singular Value Decomposition.
Factorizes the matrix a into two unitary matrices U and Vh, and a 1-D array s of singular values (real, non-negative) such that a == U @ S @ Vh, where S is a suitably shaped matrix of zeros with main diagonal s.
Parameters
- a : (M, N) array_like Matrix to decompose.
- full_matrices : bool, optional If True (default), U and Vh are of shape (M, M), (N, N). If False, the shapes are (M, K) and (K, N), where K = min(M, N).
- compute_uv : bool, optional Whether to compute also U and Vh in addition to s. Default is True.
- overwrite_a : bool, optional Whether to overwrite a; may improve performance. Default is False.
- check_finite : bool, optional Whether to check that the input matrix contains only finite numbers. Disabling may give a performance gain, but may result in problems (crashes, non-termination) if the inputs do contain infinities or NaNs.
- lapack_driver : {'gesdd', 'gesvd'}, optional Whether to use the more efficient divide-and-conquer approach ('gesdd') or general rectangular approach ('gesvd') to compute the SVD. MATLAB and Octave use the 'gesvd' approach. Default is 'gesdd'.
.. versionadded:: 0.18
Returns
- U : ndarray Unitary matrix having left singular vectors as columns. Of shape (M, M) or (M, K), depending on full_matrices.
- s : ndarray The singular values, sorted in non-increasing order. Of shape (K,), with K = min(M, N).
- Vh : ndarray Unitary matrix having right singular vectors as rows. Of shape (N, N) or (K, N) depending on full_matrices.
For compute_uv=False, only s is returned.
Raises
LinAlgError If SVD computation does not converge.
See also
- svdvals : Compute singular values of a matrix.
- diagsvd : Construct the Sigma matrix, given the vector s.
Examples
>>> import numpy as np
>>> from scipy import linalg
>>> m, n = 9, 6
>>> a = np.random.randn(m, n) + 1.j*np.random.randn(m, n)
>>> U, s, Vh = linalg.svd(a)
>>> U.shape, s.shape, Vh.shape
((9, 9), (6,), (6, 6))
Reconstruct the original matrix from the decomposition:
>>> sigma = np.zeros((m, n))
>>> for i in range(min(m, n)):
... sigma[i, i] = s[i]
>>> a1 = np.dot(U, np.dot(sigma, Vh))
>>> np.allclose(a, a1)
True
Alternatively, use full_matrices=False (notice that the shape of U is then (m, n) instead of (m, m)):
>>> U, s, Vh = linalg.svd(a, full_matrices=False)
>>> U.shape, s.shape, Vh.shape
((9, 6), (6,), (6, 6))
>>> S = np.diag(s)
>>> np.allclose(a, np.dot(U, np.dot(S, Vh)))
True
>>> s2 = linalg.svd(a, compute_uv=False)
>>> np.allclose(s, s2)
True