Skip to content

Utils

Bunch

Module Sklearn.​Utils.​Bunch wraps Python class sklearn.utils.Bunch.

type t

create

constructor and attributes create
val create :
  ?kwargs:(string * Py.Object.t) list ->
  unit ->
  t

Container object exposing keys as attributes

Bunch objects are sometimes used as an output for functions and methods. They extend dictionaries by enabling values to be accessed by key, bunch['value_key'], or by an attribute, bunch.value_key.

Examples

>>> b = Bunch(a=1, b=2)
>>> b['b']
2
>>> b.b
2
>>> b.a = 3
>>> b['a']
3
>>> b.c = 6
>>> b['c']
6

to_string

method to_string
val to_string: t -> string

Print the object to a human-readable representation.

show

method show
val show: t -> string

Print the object to a human-readable representation.

pp

method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.

DataConversionWarning

Module Sklearn.​Utils.​DataConversionWarning wraps Python class sklearn.utils.DataConversionWarning.

type t

with_traceback

method with_traceback
val with_traceback :
  tb:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Exception.with_traceback(tb) -- set self.traceback to tb and return self.

to_string

method to_string
val to_string: t -> string

Print the object to a human-readable representation.

show

method show
val show: t -> string

Print the object to a human-readable representation.

pp

method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.

Path

Module Sklearn.​Utils.​Path wraps Python class sklearn.utils.Path.

type t

create

constructor and attributes create
val create :
  Py.Object.t ->
  t

PurePath subclass that can make system calls.

Path represents a filesystem path but unlike PurePath, also offers methods to do system calls on path objects. Depending on your system, instantiating a Path will return either a PosixPath or a WindowsPath object. You can also instantiate a PosixPath or WindowsPath directly, but cannot instantiate a WindowsPath on a POSIX system or vice versa.

absolute

method absolute
val absolute :
  [> tag] Obj.t ->
  Py.Object.t

Return an absolute version of this path. This function works even if the path doesn't point to anything.

No normalization is done, i.e. all '.' and '..' will be kept along. Use resolve() to get the canonical path to a file.

as_posix

method as_posix
val as_posix :
  [> tag] Obj.t ->
  Py.Object.t

Return the string representation of the path with forward (/) slashes.

as_uri

method as_uri
val as_uri :
  [> tag] Obj.t ->
  Py.Object.t

Return the path as a 'file' URI.

chmod

method chmod
val chmod :
  mode:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Change the permissions of the path, like os.chmod().

cwd

method cwd
val cwd :
  [> tag] Obj.t ->
  Py.Object.t

Return a new path pointing to the current working directory (as returned by os.getcwd()).

exists

method exists
val exists :
  [> tag] Obj.t ->
  Py.Object.t

Whether this path exists.

expanduser

method expanduser
val expanduser :
  [> tag] Obj.t ->
  Py.Object.t

Return a new path with expanded ~ and ~user constructs (as returned by os.path.expanduser)

glob

method glob
val glob :
  pattern:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Iterate over this subtree and yield all existing files (of any kind, including directories) matching the given relative pattern.

group

method group
val group :
  [> tag] Obj.t ->
  Py.Object.t

Return the group name of the file gid.

home

method home
val home :
  [> tag] Obj.t ->
  Py.Object.t

Return a new path pointing to the user's home directory (as returned by os.path.expanduser('~')).

is_absolute

method is_absolute
val is_absolute :
  [> tag] Obj.t ->
  Py.Object.t

True if the path is absolute (has both a root and, if applicable, a drive).

is_block_device

method is_block_device
val is_block_device :
  [> tag] Obj.t ->
  Py.Object.t

Whether this path is a block device.

is_char_device

method is_char_device
val is_char_device :
  [> tag] Obj.t ->
  Py.Object.t

Whether this path is a character device.

is_dir

method is_dir
val is_dir :
  [> tag] Obj.t ->
  Py.Object.t

Whether this path is a directory.

is_fifo

method is_fifo
val is_fifo :
  [> tag] Obj.t ->
  Py.Object.t

Whether this path is a FIFO.

is_file

method is_file
val is_file :
  [> tag] Obj.t ->
  Py.Object.t

Whether this path is a regular file (also True for symlinks pointing to regular files).

is_mount

method is_mount
val is_mount :
  [> tag] Obj.t ->
  Py.Object.t

Check if this path is a POSIX mount point

is_reserved

method is_reserved
val is_reserved :
  [> tag] Obj.t ->
  Py.Object.t

Return True if the path contains one of the special names reserved by the system, if any.

is_socket

method is_socket
val is_socket :
  [> tag] Obj.t ->
  Py.Object.t

Whether this path is a socket.

method is_symlink
val is_symlink :
  [> tag] Obj.t ->
  Py.Object.t

Whether this path is a symbolic link.

iterdir

method iterdir
val iterdir :
  [> tag] Obj.t ->
  Py.Object.t

Iterate over the files in this directory. Does not yield any result for the special paths '.' and '..'.

joinpath

method joinpath
val joinpath :
  Py.Object.t list ->
  [> tag] Obj.t ->
  Py.Object.t

Combine this path with one or several arguments, and return a new path representing either a subpath (if all arguments are relative paths) or a totally different path (if one of the arguments is anchored).

lchmod

method lchmod
val lchmod :
  mode:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Like chmod(), except if the path points to a symlink, the symlink's permissions are changed, rather than its target's.

method link_to
val link_to :
  target:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Create a hard link pointing to a path named target.

lstat

method lstat
val lstat :
  [> tag] Obj.t ->
  Py.Object.t

Like stat(), except if the path points to a symlink, the symlink's status information is returned, rather than its target's.

match_

method match_
val match_ :
  path_pattern:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Return True if this path matches the given pattern.

mkdir

method mkdir
val mkdir :
  ?mode:Py.Object.t ->
  ?parents:Py.Object.t ->
  ?exist_ok:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Create a new directory at this given path.

open_

method open_
val open_ :
  ?mode:Py.Object.t ->
  ?buffering:Py.Object.t ->
  ?encoding:Py.Object.t ->
  ?errors:Py.Object.t ->
  ?newline:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Open the file pointed by this path and return a file object, as the built-in open() function does.

owner

method owner
val owner :
  [> tag] Obj.t ->
  Py.Object.t

Return the login name of the file owner.

read_bytes

method read_bytes
val read_bytes :
  [> tag] Obj.t ->
  Py.Object.t

Open the file in bytes mode, read it, and close the file.

read_text

method read_text
val read_text :
  ?encoding:Py.Object.t ->
  ?errors:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Open the file in text mode, read it, and close the file.

relative_to

method relative_to
val relative_to :
  Py.Object.t list ->
  [> tag] Obj.t ->
  Py.Object.t

Return the relative path to another path identified by the passed arguments. If the operation is not possible (because this is not a subpath of the other path), raise ValueError.

rename

method rename
val rename :
  target:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Rename this path to the given path, and return a new Path instance pointing to the given path.

replace

method replace
val replace :
  target:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Rename this path to the given path, clobbering the existing destination if it exists, and return a new Path instance pointing to the given path.

resolve

method resolve
val resolve :
  ?strict:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Make the path absolute, resolving all symlinks on the way and also normalizing it (for example turning slashes into backslashes under Windows).

rglob

method rglob
val rglob :
  pattern:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Recursively yield all existing files (of any kind, including directories) matching the given relative pattern, anywhere in this subtree.

rmdir

method rmdir
val rmdir :
  [> tag] Obj.t ->
  Py.Object.t

Remove this directory. The directory must be empty.

samefile

method samefile
val samefile :
  other_path:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Return whether other_path is the same or not as this file (as returned by os.path.samefile()).

stat

method stat
val stat :
  [> tag] Obj.t ->
  Py.Object.t

Return the result of the stat() system call on this path, like os.stat() does.

method symlink_to
val symlink_to :
  ?target_is_directory:Py.Object.t ->
  target:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Make this path a symlink pointing to the given path. Note the order of arguments (self, target) is the reverse of os.symlink's.

touch

method touch
val touch :
  ?mode:Py.Object.t ->
  ?exist_ok:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Create this file with the given access mode, if it doesn't exist.

method unlink
val unlink :
  ?missing_ok:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Remove this file or link. If the path is a directory, use rmdir() instead.

with_name

method with_name
val with_name :
  name:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Return a new path with the file name changed.

with_suffix

method with_suffix
val with_suffix :
  suffix:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Return a new path with the file suffix changed. If the path has no suffix, add given suffix. If the given suffix is an empty string, remove the suffix from the path.

write_bytes

method write_bytes
val write_bytes :
  data:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Open the file in bytes mode, write to it, and close the file.

write_text

method write_text
val write_text :
  ?encoding:Py.Object.t ->
  ?errors:Py.Object.t ->
  data:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Open the file in text mode, write to it, and close the file.

to_string

method to_string
val to_string: t -> string

Print the object to a human-readable representation.

show

method show
val show: t -> string

Print the object to a human-readable representation.

pp

method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.

Sequence

Module Sklearn.​Utils.​Sequence wraps Python class sklearn.utils.Sequence.

type t

get_item

method get_item
val get_item :
  index:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

iter

method iter
val iter :
  [> tag] Obj.t ->
  Dict.t Seq.t

count

method count
val count :
  value:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

S.count(value) -> integer -- return number of occurrences of value

index

method index
val index :
  ?start:Py.Object.t ->
  ?stop:Py.Object.t ->
  value:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

S.index(value, [start, [stop]]) -> integer -- return first index of value. Raises ValueError if the value is not present.

Supporting start and stop arguments is optional, but recommended.

to_string

method to_string
val to_string: t -> string

Print the object to a human-readable representation.

show

method show
val show: t -> string

Print the object to a human-readable representation.

pp

method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.

Compress

Module Sklearn.​Utils.​Compress wraps Python class sklearn.utils.compress.

type t

create

constructor and attributes create
val create :
  data:Py.Object.t ->
  selectors:Py.Object.t ->
  unit ->
  t

Return data elements corresponding to true selector elements.

Forms a shorter iterator from selected data elements using the selectors to choose the data elements.

iter

method iter
val iter :
  [> tag] Obj.t ->
  Dict.t Seq.t

Implement iter(self).

to_string

method to_string
val to_string: t -> string

Print the object to a human-readable representation.

show

method show
val show: t -> string

Print the object to a human-readable representation.

pp

method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.

Islice

Module Sklearn.​Utils.​Islice wraps Python class sklearn.utils.islice.

type t

create

constructor and attributes create
val create :
  iterable:Py.Object.t ->
  stop:Py.Object.t ->
  unit ->
  t

islice(iterable, stop) --> islice object islice(iterable, start, stop[, step]) --> islice object

Return an iterator whose next() method returns selected values from an iterable. If start is specified, will skip all preceding elements; otherwise, start defaults to zero. Step defaults to one. If specified as another value, step determines how many values are skipped between successive calls. Works like a slice() on a list but returns an iterator.

iter

method iter
val iter :
  [> tag] Obj.t ->
  Dict.t Seq.t

Implement iter(self).

to_string

method to_string
val to_string: t -> string

Print the object to a human-readable representation.

show

method show
val show: t -> string

Print the object to a human-readable representation.

pp

method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.

Itemgetter

Module Sklearn.​Utils.​Itemgetter wraps Python class sklearn.utils.itemgetter.

type t

to_string

method to_string
val to_string: t -> string

Print the object to a human-readable representation.

show

method show
val show: t -> string

Print the object to a human-readable representation.

pp

method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.

Parallel_backend

Module Sklearn.​Utils.​Parallel_backend wraps Python class sklearn.utils.parallel_backend.

type t

create

constructor and attributes create
val create :
  ?n_jobs:Py.Object.t ->
  ?inner_max_num_threads:Py.Object.t ->
  ?backend_params:(string * Py.Object.t) list ->
  backend:Py.Object.t ->
  unit ->
  t

Change the default backend used by Parallel inside a with block.

If backend is a string it must match a previously registered implementation using the register_parallel_backend function.

By default the following backends are available:

  • 'loky': single-host, process-based parallelism (used by default),
  • 'threading': single-host, thread-based parallelism,
  • 'multiprocessing': legacy single-host, process-based parallelism.

'loky' is recommended to run functions that manipulate Python objects. 'threading' is a low-overhead alternative that is most efficient for functions that release the Global Interpreter Lock: e.g. I/O-bound code or CPU-bound code in a few calls to native code that explicitly releases the GIL.

In addition, if the dask and distributed Python packages are installed, it is possible to use the 'dask' backend for better scheduling of nested parallel calls without over-subscription and potentially distribute parallel calls over a networked cluster of several hosts.

Alternatively the backend can be passed directly as an instance.

By default all available workers will be used (n_jobs=-1) unless the caller passes an explicit value for the n_jobs parameter.

This is an alternative to passing a backend='backend_name' argument to the Parallel class constructor. It is particularly useful when calling into library code that uses joblib internally but does not expose the backend argument in its own API.

>>> from operator import neg
>>> with parallel_backend('threading'):
...     print(Parallel()(delayed(neg)(i + 1) for i in range(5)))
...
[-1, -2, -3, -4, -5]
  • Warning: this function is experimental and subject to change in a future version of joblib.

Joblib also tries to limit the oversubscription by limiting the number of threads usable in some third-party library threadpools like OpenBLAS, MKL or OpenMP. The default limit in each worker is set to max(cpu_count() // effective_n_jobs, 1) but this limit can be overwritten with the inner_max_num_threads argument which will be used to set this limit in the child processes.

.. versionadded:: 0.10

unregister

method unregister
val unregister :
  [> tag] Obj.t ->
  Py.Object.t

to_string

method to_string
val to_string: t -> string

Print the object to a human-readable representation.

show

method show
val show: t -> string

Print the object to a human-readable representation.

pp

method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.

Arrayfuncs

Module Sklearn.​Utils.​Arrayfuncs wraps Python module sklearn.utils.arrayfuncs.

cholesky_delete

function cholesky_delete
val cholesky_delete :
  l:Py.Object.t ->
  go_out:Py.Object.t ->
  unit ->
  Py.Object.t

Class_weight

Module Sklearn.​Utils.​Class_weight wraps Python module sklearn.utils.class_weight.

compute_class_weight

function compute_class_weight
val compute_class_weight :
  class_weight:[`Balanced | `DictIntToFloat of (int * float) list | `None] ->
  classes:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  unit ->
  [>`ArrayLike] Np.Obj.t

Estimate class weights for unbalanced datasets.

Parameters

  • class_weight : dict, 'balanced' or None If 'balanced', class weights will be given by n_samples / (n_classes * np.bincount(y)). If a dictionary is given, keys are classes and values are corresponding class weights. If None is given, the class weights will be uniform.

  • classes : ndarray Array of the classes occurring in the data, as given by np.unique(y_org) with y_org the original class labels.

  • y : array-like, shape (n_samples,) Array of original class labels per sample;

Returns

  • class_weight_vect : ndarray, shape (n_classes,) Array with class_weight_vect[i] the weight for i-th class

References

The 'balanced' heuristic is inspired by Logistic Regression in Rare Events Data, King, Zen, 2001.

compute_sample_weight

function compute_sample_weight
val compute_sample_weight :
  ?indices:[>`ArrayLike] Np.Obj.t ->
  class_weight:[`Balanced | `DictIntToFloat of (int * float) list | `List_of_dicts of Py.Object.t | `None] ->
  y:[>`ArrayLike] Np.Obj.t ->
  unit ->
  [>`ArrayLike] Np.Obj.t

Estimate sample weights by class for unbalanced datasets.

Parameters

  • class_weight : dict, list of dicts, 'balanced', or None, optional Weights associated with classes in the form {class_label: weight}. If not given, all classes are supposed to have weight one. For multi-output problems, a list of dicts can be provided in the same order as the columns of y.

    Note that for multioutput (including multilabel) weights should be defined for each class of every column in its own dict. For example, for four-class multilabel classification weights should be [{0: 1, 1: 1}, {0: 1, 1: 5}, {0: 1, 1: 1}, {0: 1, 1: 1}] instead of [{1:1}, {2:5}, {3:1}, {4:1}].

    The 'balanced' mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data: n_samples / (n_classes * np.bincount(y)).

    For multi-output, the weights of each column of y will be multiplied.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) Array of original class labels per sample.

  • indices : array-like, shape (n_subsample,), or None Array of indices to be used in a subsample. Can be of length less than n_samples in the case of a subsample, or equal to n_samples in the case of a bootstrap subsample with repeated indices. If None, the sample weight will be calculated over the full sample. Only 'balanced' is supported for class_weight if this is provided.

Returns

  • sample_weight_vect : ndarray, shape (n_samples,) Array with sample weights as applied to the original y

Deprecation

Module Sklearn.​Utils.​Deprecation wraps Python module sklearn.utils.deprecation.

Extmath

Module Sklearn.​Utils.​Extmath wraps Python module sklearn.utils.extmath.

cartesian

function cartesian
val cartesian :
  ?out:[>`ArrayLike] Np.Obj.t ->
  arrays:Np.Numpy.Ndarray.List.t ->
  unit ->
  [>`ArrayLike] Np.Obj.t

Generate a cartesian product of input arrays.

Parameters

  • arrays : list of array-like 1-D arrays to form the cartesian product of.

  • out : ndarray Array to place the cartesian product in.

Returns

  • out : ndarray 2-D array of shape (M, len(arrays)) containing cartesian products formed of input arrays.

Examples

>>> cartesian(([1, 2, 3], [4, 5], [6, 7]))
array([[1, 4, 6],
       [1, 4, 7],
       [1, 5, 6],
       [1, 5, 7],
       [2, 4, 6],
       [2, 4, 7],
       [2, 5, 6],
       [2, 5, 7],
       [3, 4, 6],
       [3, 4, 7],
       [3, 5, 6],
       [3, 5, 7]])

check_array

function check_array
val check_array :
  ?accept_sparse:[`S of string | `StringList of string list | `Bool of bool] ->
  ?accept_large_sparse:bool ->
  ?dtype:[`Dtypes of Np.Dtype.t list | `S of string | `Dtype of Np.Dtype.t | `None] ->
  ?order:[`F | `C] ->
  ?copy:bool ->
  ?force_all_finite:[`Allow_nan | `Bool of bool] ->
  ?ensure_2d:bool ->
  ?allow_nd:bool ->
  ?ensure_min_samples:int ->
  ?ensure_min_features:int ->
  ?estimator:[>`BaseEstimator] Np.Obj.t ->
  array:Py.Object.t ->
  unit ->
  Py.Object.t

Input validation on an array, list, sparse matrix or similar.

By default, the input is checked to be a non-empty 2D array containing only finite values. If the dtype of the array is object, attempt converting to float, raising on failure.

Parameters

  • array : object Input object to check / convert.

  • accept_sparse : string, boolean or list/tuple of strings (default=False) String[s] representing allowed sparse matrix formats, such as 'csc', 'csr', etc. If the input is sparse but not in the allowed format, it will be converted to the first listed format. True allows the input to be any format. False means that a sparse matrix input will raise an error.

  • accept_large_sparse : bool (default=True) If a CSR, CSC, COO or BSR sparse matrix is supplied and accepted by accept_sparse, accept_large_sparse=False will cause it to be accepted only if its indices are stored with a 32-bit dtype.

    .. versionadded:: 0.20

  • dtype : string, type, list of types or None (default='numeric') Data type of result. If None, the dtype of the input is preserved. If 'numeric', dtype is preserved unless array.dtype is object. If dtype is a list of types, conversion on the first type is only performed if the dtype of the input is not in the list.

  • order : 'F', 'C' or None (default=None) Whether an array will be forced to be fortran or c-style. When order is None (default), then if copy=False, nothing is ensured about the memory layout of the output array; otherwise (copy=True) the memory layout of the returned array is kept as close as possible to the original array.

  • copy : boolean (default=False) Whether a forced copy will be triggered. If copy=False, a copy might be triggered by a conversion.

  • force_all_finite : boolean or 'allow-nan', (default=True) Whether to raise an error on np.inf, np.nan, pd.NA in array. The possibilities are:

    • True: Force all values of array to be finite.
    • False: accepts np.inf, np.nan, pd.NA in array.
    • 'allow-nan': accepts only np.nan and pd.NA values in array. Values cannot be infinite.

    .. versionadded:: 0.20 force_all_finite accepts the string 'allow-nan'.

    .. versionchanged:: 0.23 Accepts pd.NA and converts it into np.nan

  • ensure_2d : boolean (default=True) Whether to raise a value error if array is not 2D.

  • allow_nd : boolean (default=False) Whether to allow array.ndim > 2.

  • ensure_min_samples : int (default=1) Make sure that the array has a minimum number of samples in its first axis (rows for a 2D array). Setting to 0 disables this check.

  • ensure_min_features : int (default=1) Make sure that the 2D array has some minimum number of features (columns). The default value of 1 rejects empty datasets. This check is only enforced when the input data has effectively 2 dimensions or is originally 1D and ensure_2d is True. Setting to 0 disables this check.

  • estimator : str or estimator instance (default=None) If passed, include the name of the estimator in warning messages.

Returns

  • array_converted : object The converted and validated array.

check_random_state

function check_random_state
val check_random_state :
  [`Optional of [`I of int | `None] | `RandomState of Py.Object.t] ->
  Py.Object.t

Turn seed into a np.random.RandomState instance

Parameters

  • seed : None | int | instance of RandomState If seed is None, return the RandomState singleton used by np.random. If seed is an int, return a new RandomState instance seeded with seed. If seed is already a RandomState instance, return it. Otherwise raise ValueError.

density

function density
val density :
  ?kwargs:(string * Py.Object.t) list ->
  w:[>`ArrayLike] Np.Obj.t ->
  unit ->
  Py.Object.t

Compute density of a sparse vector

Parameters

  • w : array_like The sparse vector

Returns

float The density of w, between 0 and 1

fast_logdet

function fast_logdet
val fast_logdet :
  [>`ArrayLike] Np.Obj.t ->
  Py.Object.t

Compute log(det(A)) for A symmetric

Equivalent to : np.log(nl.det(A)) but more robust. It returns -Inf if det(A) is non positive or is not defined.

Parameters

  • A : array_like The matrix

log_logistic

function log_logistic
val log_logistic :
  ?out:[`Arr of [>`ArrayLike] Np.Obj.t | `T_ of Py.Object.t] ->
  x:[>`ArrayLike] Np.Obj.t ->
  unit ->
  [>`ArrayLike] Np.Obj.t

Compute the log of the logistic function, log(1 / (1 + e ** -x)).

This implementation is numerically stable because it splits positive and negative values::

-log(1 + exp(-x_i))     if x_i > 0
x_i - log(1 + exp(x_i)) if x_i <= 0

For the ordinary logistic function, use scipy.special.expit.

Parameters

  • X : array-like, shape (M, N) or (M, ) Argument to the logistic function

  • out : array-like, shape: (M, N) or (M, ), optional: Preallocated output array.

Returns

  • out : array, shape (M, N) or (M, ) Log of the logistic function evaluated at every point in x

Notes

See the blog post describing this implementation:

  • http://fa.bianp.net/blog/2013/numerical-optimizers-for-logistic-regression/

make_nonnegative

function make_nonnegative
val make_nonnegative :
  ?min_value:float ->
  x:[>`ArrayLike] Np.Obj.t ->
  unit ->
  Py.Object.t

Ensure X.min() >= min_value.

Parameters

  • X : array_like The matrix to make non-negative

  • min_value : float The threshold value

Returns

array_like The thresholded array

Raises

ValueError When X is sparse

randomized_range_finder

function randomized_range_finder
val randomized_range_finder :
  ?power_iteration_normalizer:[`Auto | `QR | `LU | `None] ->
  ?random_state:int ->
  a:[>`ArrayLike] Np.Obj.t ->
  size:int ->
  n_iter:int ->
  unit ->
  [>`ArrayLike] Np.Obj.t

Computes an orthonormal matrix whose range approximates the range of A.

Parameters

  • A : 2D array The input data matrix

  • size : integer Size of the return array

  • n_iter : integer Number of power iterations used to stabilize the result

  • power_iteration_normalizer : 'auto' (default), 'QR', 'LU', 'none' Whether the power iterations are normalized with step-by-step QR factorization (the slowest but most accurate), 'none' (the fastest but numerically unstable when n_iter is large, e.g. typically 5 or larger), or 'LU' factorization (numerically stable but can lose slightly in accuracy). The 'auto' mode applies no normalization if n_iter <= 2 and switches to LU otherwise.

    .. versionadded:: 0.18

  • random_state : int, RandomState instance or None, optional (default=None) The seed of the pseudo random number generator to use when shuffling the data, i.e. getting the random vectors to initialize the algorithm. Pass an int for reproducible results across multiple function calls.

  • See :term:Glossary <random_state>.

Returns

  • Q : 2D array A (size x size) projection matrix, the range of which approximates well the range of the input matrix A.

Notes

Follows Algorithm 4.3 of Finding structure with randomness: Stochastic algorithms for constructing approximate matrix decompositions Halko, et al., 2009 (arXiv:909) https://arxiv.org/pdf/0909.4061.pdf

An implementation of a randomized algorithm for principal component analysis A. Szlam et al. 2014

randomized_svd

function randomized_svd
val randomized_svd :
  ?n_oversamples:Py.Object.t ->
  ?n_iter:Py.Object.t ->
  ?power_iteration_normalizer:[`Auto | `QR | `LU | `None] ->
  ?transpose:[`Auto | `Bool of bool] ->
  ?flip_sign:bool ->
  ?random_state:int ->
  m:[>`ArrayLike] Np.Obj.t ->
  n_components:int ->
  unit ->
  Py.Object.t

Computes a truncated randomized SVD

Parameters

  • M : ndarray or sparse matrix Matrix to decompose

  • n_components : int Number of singular values and vectors to extract.

  • n_oversamples : int (default is 10) Additional number of random vectors to sample the range of M so as to ensure proper conditioning. The total number of random vectors used to find the range of M is n_components + n_oversamples. Smaller number can improve speed but can negatively impact the quality of approximation of singular vectors and singular values.

  • n_iter : int or 'auto' (default is 'auto') Number of power iterations. It can be used to deal with very noisy problems. When 'auto', it is set to 4, unless n_components is small (< .1 * min(X.shape)) n_iter in which case is set to 7. This improves precision with few components.

    .. versionchanged:: 0.18

  • power_iteration_normalizer : 'auto' (default), 'QR', 'LU', 'none' Whether the power iterations are normalized with step-by-step QR factorization (the slowest but most accurate), 'none' (the fastest but numerically unstable when n_iter is large, e.g. typically 5 or larger), or 'LU' factorization (numerically stable but can lose slightly in accuracy). The 'auto' mode applies no normalization if n_iter <= 2 and switches to LU otherwise.

    .. versionadded:: 0.18

  • transpose : True, False or 'auto' (default) Whether the algorithm should be applied to M.T instead of M. The result should approximately be the same. The 'auto' mode will trigger the transposition if M.shape[1] > M.shape[0] since this implementation of randomized SVD tend to be a little faster in that case.

    .. versionchanged:: 0.18

  • flip_sign : boolean, (True by default) The output of a singular value decomposition is only unique up to a permutation of the signs of the singular vectors. If flip_sign is set to True, the sign ambiguity is resolved by making the largest loadings for each component in the left singular vectors positive.

  • random_state : int, RandomState instance or None, optional (default=None) The seed of the pseudo random number generator to use when shuffling the data, i.e. getting the random vectors to initialize the algorithm. Pass an int for reproducible results across multiple function calls.

  • See :term:Glossary <random_state>.

Notes

This algorithm finds a (usually very good) approximate truncated singular value decomposition using randomization to speed up the computations. It is particularly fast on large matrices on which you wish to extract only a small number of components. In order to obtain further speed up, n_iter can be set <=2 (at the cost of loss of precision).

References

  • Finding structure with randomness: Stochastic algorithms for constructing approximate matrix decompositions Halko, et al., 2009 https://arxiv.org/abs/0909.4061

  • A randomized algorithm for the decomposition of matrices Per-Gunnar Martinsson, Vladimir Rokhlin and Mark Tygert

  • An implementation of a randomized algorithm for principal component analysis A. Szlam et al. 2014

row_norms

function row_norms
val row_norms :
  ?squared:bool ->
  x:[>`ArrayLike] Np.Obj.t ->
  unit ->
  Py.Object.t

Row-wise (squared) Euclidean norm of X.

Equivalent to np.sqrt((X * X).sum(axis=1)), but also supports sparse matrices and does not create an X.shape-sized temporary.

Performs no input validation.

Parameters

  • X : array_like The input array

  • squared : bool, optional (default = False) If True, return squared norms.

Returns

array_like The row-wise (squared) Euclidean norm of X.

safe_min

function safe_min
val safe_min :
  [>`ArrayLike] Np.Obj.t ->
  Py.Object.t
  • DEPRECATED: safe_min is deprecated in version 0.22 and will be removed in version 0.24.

Returns the minimum value of a dense or a CSR/CSC matrix.

Adapated from https://stackoverflow.com/q/13426580

.. deprecated:: 0.22.0

Parameters

  • X : array_like The input array or sparse matrix

Returns

Float The min value of X

safe_sparse_dot

function safe_sparse_dot
val safe_sparse_dot :
  ?dense_output:Py.Object.t ->
  a:[>`ArrayLike] Np.Obj.t ->
  b:Py.Object.t ->
  unit ->
  [>`ArrayLike] Np.Obj.t

Dot product that handle the sparse matrix case correctly

Parameters

  • a : array or sparse matrix

  • b : array or sparse matrix

  • dense_output : boolean, (default=False) When False, a and b both being sparse will yield sparse output. When True, output will always be a dense array.

Returns

  • dot_product : array or sparse matrix sparse if a and b are sparse and dense_output=False.

softmax

function softmax
val softmax :
  ?copy:bool ->
  x:[>`ArrayLike] Np.Obj.t ->
  unit ->
  [>`ArrayLike] Np.Obj.t

Calculate the softmax function.

The softmax function is calculated by np.exp(X) / np.sum(np.exp(X), axis=1)

This will cause overflow when large values are exponentiated. Hence the largest value in each row is subtracted from each data point to prevent this.

Parameters

  • X : array-like of floats, shape (M, N) Argument to the logistic function

  • copy : bool, optional Copy X or not.

Returns

  • out : array, shape (M, N) Softmax function evaluated at every point in x

squared_norm

function squared_norm
val squared_norm :
  [>`ArrayLike] Np.Obj.t ->
  Py.Object.t

Squared Euclidean or Frobenius norm of x.

Faster than norm(x) ** 2.

Parameters

  • x : array_like

Returns

float The Euclidean norm when x is a vector, the Frobenius norm when x is a matrix (2-d array).

stable_cumsum

function stable_cumsum
val stable_cumsum :
  ?axis:int ->
  ?rtol:float ->
  ?atol:float ->
  arr:[>`ArrayLike] Np.Obj.t ->
  unit ->
  Py.Object.t

Use high precision for cumsum and check that final value matches sum

Parameters

  • arr : array-like To be cumulatively summed as flat

  • axis : int, optional Axis along which the cumulative sum is computed. The default (None) is to compute the cumsum over the flattened array.

  • rtol : float Relative tolerance, see np.allclose

  • atol : float Absolute tolerance, see np.allclose

svd_flip

function svd_flip
val svd_flip :
  ?u_based_decision:bool ->
  u:[>`ArrayLike] Np.Obj.t ->
  v:[>`ArrayLike] Np.Obj.t ->
  unit ->
  Py.Object.t

Sign correction to ensure deterministic output from SVD.

Adjusts the columns of u and the rows of v such that the loadings in the columns in u that are largest in absolute value are always positive.

Parameters

  • u : ndarray u and v are the output of linalg.svd or :func:~sklearn.utils.extmath.randomized_svd, with matching inner dimensions so one can compute np.dot(u * s, v).

  • v : ndarray u and v are the output of linalg.svd or :func:~sklearn.utils.extmath.randomized_svd, with matching inner dimensions so one can compute np.dot(u * s, v).

  • u_based_decision : boolean, (default=True) If True, use the columns of u as the basis for sign flipping. Otherwise, use the rows of v. The choice of which variable to base the decision on is generally algorithm dependent.

Returns

u_adjusted, v_adjusted : arrays with the same dimensions as the input.

weighted_mode

function weighted_mode
val weighted_mode :
  ?axis:int ->
  a:[>`ArrayLike] Np.Obj.t ->
  w:[>`ArrayLike] Np.Obj.t ->
  unit ->
  ([>`ArrayLike] Np.Obj.t * [>`ArrayLike] Np.Obj.t)

Returns an array of the weighted modal (most common) value in a

If there is more than one such value, only the first is returned. The bin-count for the modal bins is also returned.

This is an extension of the algorithm in scipy.stats.mode.

Parameters

  • a : array_like n-dimensional array of which to find mode(s).

  • w : array_like n-dimensional array of weights for each value

  • axis : int, optional Axis along which to operate. Default is 0, i.e. the first axis.

Returns

  • vals : ndarray Array of modal values.

  • score : ndarray Array of weighted counts for each mode.

Examples

>>> from sklearn.utils.extmath import weighted_mode
>>> x = [4, 1, 4, 2, 4, 2]
>>> weights = [1, 1, 1, 1, 1, 1]
>>> weighted_mode(x, weights)
(array([4.]), array([3.]))

The value 4 appears three times: with uniform weights, the result is simply the mode of the distribution.

>>> weights = [1, 3, 0.5, 1.5, 1, 2]  # deweight the 4's
>>> weighted_mode(x, weights)
(array([2.]), array([3.5]))

The value 2 has the highest score: it appears twice with weights of 1.5 and 2: the sum of these is 3.5.

See Also

scipy.stats.mode

Fixes

Module Sklearn.​Utils.​Fixes wraps Python module sklearn.utils.fixes.

LooseVersion

Module Sklearn.​Utils.​Fixes.​LooseVersion wraps Python class sklearn.utils.fixes.LooseVersion.

type t

create

constructor and attributes create
val create :
  ?vstring:Py.Object.t ->
  unit ->
  t

Version numbering for anarchists and software realists. Implements the standard interface for version number classes as described above. A version number consists of a series of numbers, separated by either periods or strings of letters. When comparing version numbers, the numeric components will be compared numerically, and the alphabetic components lexically. The following are all valid version numbers, in no particular order:

1.5.1
1.5.2b2
161
3.10a
8.02
3.4j
1996.07.12
3.2.pl0
3.1.1.6
2g6
11g
0.960923
2.2beta29
1.13++
5.5.kw
2.0b1pl0

In fact, there is no such thing as an invalid version number under this scheme; the rules for comparison are simple and predictable, but may not always give the results you want (for some definition of 'want').

parse

method parse
val parse :
  vstring:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

to_string

method to_string
val to_string: t -> string

Print the object to a human-readable representation.

show

method show
val show: t -> string

Print the object to a human-readable representation.

pp

method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.

lobpcg

function lobpcg
val lobpcg :
  ?b:[`PyObject of Py.Object.t | `Spmatrix of [>`Spmatrix] Np.Obj.t] ->
  ?m:[`PyObject of Py.Object.t | `Spmatrix of [>`Spmatrix] Np.Obj.t] ->
  ?y:[`Arr of [>`ArrayLike] Np.Obj.t | `PyObject of Py.Object.t] ->
  ?tol:[`F of float | `S of string | `I of int | `Bool of bool] ->
  ?maxiter:int ->
  ?largest:bool ->
  ?verbosityLevel:int ->
  ?retLambdaHistory:bool ->
  ?retResidualNormsHistory:bool ->
  a:[`Spmatrix of [>`Spmatrix] Np.Obj.t | `PyObject of Py.Object.t] ->
  x:[`Arr of [>`ArrayLike] Np.Obj.t | `PyObject of Py.Object.t] ->
  unit ->
  ([>`ArrayLike] Np.Obj.t * [>`ArrayLike] Np.Obj.t * Np.Numpy.Ndarray.List.t * Np.Numpy.Ndarray.List.t)

Locally Optimal Block Preconditioned Conjugate Gradient Method (LOBPCG)

LOBPCG is a preconditioned eigensolver for large symmetric positive definite (SPD) generalized eigenproblems.

Parameters

  • A : {sparse matrix, dense matrix, LinearOperator} The symmetric linear operator of the problem, usually a sparse matrix. Often called the 'stiffness matrix'.

  • X : ndarray, float32 or float64 Initial approximation to the k eigenvectors (non-sparse). If A has shape=(n,n) then X should have shape shape=(n,k).

  • B : {dense matrix, sparse matrix, LinearOperator}, optional The right hand side operator in a generalized eigenproblem. By default, B = Identity. Often called the 'mass matrix'.

  • M : {dense matrix, sparse matrix, LinearOperator}, optional Preconditioner to A; by default M = Identity. M should approximate the inverse of A.

  • Y : ndarray, float32 or float64, optional n-by-sizeY matrix of constraints (non-sparse), sizeY < n The iterations will be performed in the B-orthogonal complement of the column-space of Y. Y must be full rank.

  • tol : scalar, optional Solver tolerance (stopping criterion). The default is tol=n*sqrt(eps).

  • maxiter : int, optional Maximum number of iterations. The default is maxiter = 20.

  • largest : bool, optional When True, solve for the largest eigenvalues, otherwise the smallest.

  • verbosityLevel : int, optional Controls solver output. The default is verbosityLevel=0.

  • retLambdaHistory : bool, optional Whether to return eigenvalue history. Default is False.

  • retResidualNormsHistory : bool, optional Whether to return history of residual norms. Default is False.

Returns

  • w : ndarray Array of k eigenvalues

  • v : ndarray An array of k eigenvectors. v has the same shape as X.

  • lambdas : list of ndarray, optional The eigenvalue history, if retLambdaHistory is True.

  • rnorms : list of ndarray, optional The history of residual norms, if retResidualNormsHistory is True.

Notes

If both retLambdaHistory and retResidualNormsHistory are True, the return tuple has the following format (lambda, V, lambda history, residual norms history).

In the following n denotes the matrix size and m the number of required eigenvalues (smallest or largest).

The LOBPCG code internally solves eigenproblems of the size 3m on every iteration by calling the 'standard' dense eigensolver, so if m is not small enough compared to n, it does not make sense to call the LOBPCG code, but rather one should use the 'standard' eigensolver, e.g. numpy or scipy function in this case. If one calls the LOBPCG algorithm for 5m > n, it will most likely break internally, so the code tries to call the standard function instead.

It is not that n should be large for the LOBPCG to work, but rather the ratio n / m should be large. It you call LOBPCG with m=1 and n=10, it works though n is small. The method is intended for extremely large n / m, see e.g., reference [28] in

  • https://arxiv.org/abs/0705.2626

The convergence speed depends basically on two factors:

  1. How well relatively separated the seeking eigenvalues are from the rest of the eigenvalues. One can try to vary m to make this better.

  2. How well conditioned the problem is. This can be changed by using proper preconditioning. For example, a rod vibration test problem (under tests directory) is ill-conditioned for large n, so convergence will be slow, unless efficient preconditioning is used. For this specific problem, a good simple preconditioner function would be a linear solve for A, which is easy to code since A is tridiagonal.

References

.. [1] A. V. Knyazev (2001), Toward the Optimal Preconditioned Eigensolver: Locally Optimal Block Preconditioned Conjugate Gradient Method. SIAM Journal on Scientific Computing 23, no. 2, pp. 517-541. http://dx.doi.org/10.1137/S1064827500366124

.. [2] A. V. Knyazev, I. Lashuk, M. E. Argentati, and E. Ovchinnikov (2007), Block Locally Optimal Preconditioned Eigenvalue Xolvers (BLOPEX) in hypre and PETSc. https://arxiv.org/abs/0705.2626

.. [3] A. V. Knyazev's C and MATLAB implementations:

  • https://bitbucket.org/joseroman/blopex

Examples

Solve A x = lambda x with constraints and preconditioning.

>>> import numpy as np
>>> from scipy.sparse import spdiags, issparse
>>> from scipy.sparse.linalg import lobpcg, LinearOperator
>>> n = 100
>>> vals = np.arange(1, n + 1)
>>> A = spdiags(vals, 0, n, n)
>>> A.toarray()
array([[  1.,   0.,   0., ...,   0.,   0.,   0.],
       [  0.,   2.,   0., ...,   0.,   0.,   0.],
       [  0.,   0.,   3., ...,   0.,   0.,   0.],
       ...,
       [  0.,   0.,   0., ...,  98.,   0.,   0.],
       [  0.,   0.,   0., ...,   0.,  99.,   0.],
       [  0.,   0.,   0., ...,   0.,   0., 100.]])

Constraints:

>>> Y = np.eye(n, 3)

Initial guess for eigenvectors, should have linearly independent columns. Column dimension = number of requested eigenvalues.

>>> X = np.random.rand(n, 3)

Preconditioner in the inverse of A in this example:

>>> invA = spdiags([1./vals], 0, n, n)

The preconditiner must be defined by a function:

>>> def precond( x ):
...     return invA @ x

The argument x of the preconditioner function is a matrix inside lobpcg, thus the use of matrix-matrix product @.

The preconditioner function is passed to lobpcg as a LinearOperator:

>>> M = LinearOperator(matvec=precond, matmat=precond,
...                    shape=(n, n), dtype=float)

Let us now solve the eigenvalue problem for the matrix A:

>>> eigenvalues, _ = lobpcg(A, X, Y=Y, M=M, largest=False)
>>> eigenvalues
array([4., 5., 6.])

Note that the vectors passed in Y are the eigenvectors of the 3 smallest eigenvalues. The results returned are orthogonal to those.

loguniform

function loguniform
val loguniform :
  ?loc:Py.Object.t ->
  ?scale:Py.Object.t ->
  a:Py.Object.t ->
  b:Py.Object.t ->
  unit ->
  Py.Object.t

A loguniform or reciprocal continuous random variable.

As an instance of the rv_continuous class, Distribution object inherits from it a collection of generic methods (see below for the full list), and completes them with details specific for this particular distribution.

Methods

rvs(a, b, loc=0, scale=1, size=1, random_state=None) Random variates. pdf(x, a, b, loc=0, scale=1) Probability density function. logpdf(x, a, b, loc=0, scale=1) Log of the probability density function. cdf(x, a, b, loc=0, scale=1) Cumulative distribution function. logcdf(x, a, b, loc=0, scale=1) Log of the cumulative distribution function. sf(x, a, b, loc=0, scale=1) Survival function (also defined as 1 - cdf, but sf is sometimes more accurate). logsf(x, a, b, loc=0, scale=1) Log of the survival function. ppf(q, a, b, loc=0, scale=1) Percent point function (inverse of cdf --- percentiles). isf(q, a, b, loc=0, scale=1) Inverse survival function (inverse of sf). moment(n, a, b, loc=0, scale=1) Non-central moment of order n stats(a, b, loc=0, scale=1, moments='mv') Mean('m'), variance('v'), skew('s'), and/or kurtosis('k'). entropy(a, b, loc=0, scale=1) (Differential) entropy of the RV. fit(data) Parameter estimates for generic data. See scipy.stats.rv_continuous.fit <https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rv_continuous.fit.html#scipy.stats.rv_continuous.fit>__ for detailed documentation of the keyword arguments. expect(func, args=(a, b), loc=0, scale=1, lb=None, ub=None, conditional=False, **kwds) Expected value of a function (of one argument) with respect to the distribution. median(a, b, loc=0, scale=1) Median of the distribution. mean(a, b, loc=0, scale=1) Mean of the distribution. var(a, b, loc=0, scale=1) Variance of the distribution. std(a, b, loc=0, scale=1) Standard deviation of the distribution. interval(alpha, a, b, loc=0, scale=1) Endpoints of the range that contains alpha percent of the distribution

Notes

The probability density function for this class is:

f(x, a, b) = \frac{1}{x \log(b/a)}
  • for :math:a \le x \le b, :math:b > a > 0. This class takes :math:a and :math:b as shape parameters. The probability density above is defined in the 'standardized' form. To shift and/or scale the distribution use the loc and scale parameters. Specifically, Distribution.pdf(x, a, b, loc, scale) is identically equivalent to Distribution.pdf(y, a, b) / scale with y = (x - loc) / scale.

Examples

>>> from scipy.stats import Distribution
>>> import matplotlib.pyplot as plt
>>> fig, ax = plt.subplots(1, 1)

Calculate a few first moments:

>>> a, b = 
>>> mean, var, skew, kurt = Distribution.stats(a, b, moments='mvsk')

Display the probability density function (pdf):

>>> x = np.linspace(Distribution.ppf(0.01, a, b),
...                 Distribution.ppf(0.99, a, b), 100)
>>> ax.plot(x, Distribution.pdf(x, a, b),
...        'r-', lw=5, alpha=0.6, label='Distribution pdf')

Alternatively, the distribution object can be called (as a function) to fix the shape, location and scale parameters. This returns a 'frozen' RV object holding the given parameters fixed.

Freeze the distribution and display the frozen pdf:

>>> rv = Distribution(a, b)
>>> ax.plot(x, rv.pdf(x), 'k-', lw=2, label='frozen pdf')

Check accuracy of cdf and ppf:

>>> vals = Distribution.ppf([0.001, 0.5, 0.999], a, b)
>>> np.allclose([0.001, 0.5, 0.999], Distribution.cdf(vals, a, b))
True

Generate random numbers:

>>> r = Distribution.rvs(a, b, size=1000)

And compare the histogram:

>>> ax.hist(r, density=True, histtype='stepfilled', alpha=0.2)
>>> ax.legend(loc='best', frameon=False)
>>> plt.show()

This doesn't show the equal probability of 0.01, 0.1 and 1. This is best when the x-axis is log-scaled:

>>> import numpy as np
>>> fig, ax = plt.subplots(1, 1)
>>> ax.hist(np.log10(r))
>>> ax.set_ylabel('Frequency')
>>> ax.set_xlabel('Value of random variable')
>>> ax.xaxis.set_major_locator(plt.FixedLocator([-2, -1, 0]))
>>> ticks = ['$10^{{ {} }}$'.format(i) for i in [-2, -1, 0]]
>>> ax.set_xticklabels(ticks)  # doctest: +SKIP
>>> plt.show()

This random variable will be log-uniform regardless of the base chosen for a and b. Let's specify with base 2 instead:

>>> rvs = Distribution(2**-2, 2**0).rvs(size=1000)

Values of 1/4, 1/2 and 1 are equally likely with this random variable. Here's the histogram:

>>> fig, ax = plt.subplots(1, 1)
>>> ax.hist(np.log2(rvs))
>>> ax.set_ylabel('Frequency')
>>> ax.set_xlabel('Value of random variable')
>>> ax.xaxis.set_major_locator(plt.FixedLocator([-2, -1, 0]))
>>> ticks = ['$2^{{ {} }}$'.format(i) for i in [-2, -1, 0]]
>>> ax.set_xticklabels(ticks)  # doctest: +SKIP
>>> plt.show()

parse_version

function parse_version
val parse_version :
  Py.Object.t ->
  Py.Object.t

sparse_lsqr

function sparse_lsqr
val sparse_lsqr :
  ?damp:float ->
  ?atol:Py.Object.t ->
  ?btol:Py.Object.t ->
  ?conlim:float ->
  ?iter_lim:int ->
  ?show:bool ->
  ?calc_var:bool ->
  ?x0:[>`ArrayLike] Np.Obj.t ->
  a:[`Arr of [>`ArrayLike] Np.Obj.t | `LinearOperator of Py.Object.t] ->
  b:[>`ArrayLike] Np.Obj.t ->
  unit ->
  (Py.Object.t * int * int * float * float * float * float * float * float * Py.Object.t)

Find the least-squares solution to a large, sparse, linear system of equations.

The function solves Ax = b or min ||Ax - b||^2 or min ||Ax - b||^2 + d^2 ||x||^2.

The matrix A may be square or rectangular (over-determined or under-determined), and may have any rank.

::

  1. Unsymmetric equations -- solve A*x = b

  2. Linear least squares -- solve A*x = b in the least-squares sense

  3. Damped least squares -- solve ( A )x = ( b ) ( dampI ) ( 0 ) in the least-squares sense

Parameters

  • A : {sparse matrix, ndarray, LinearOperator} Representation of an m-by-n matrix. Alternatively, A can be a linear operator which can produce Ax and A^T x using, e.g., scipy.sparse.linalg.LinearOperator.

  • b : array_like, shape (m,) Right-hand side vector b.

  • damp : float Damping coefficient. atol, btol : float, optional Stopping tolerances. If both are 1.0e-9 (say), the final residual norm should be accurate to about 9 digits. (The final x will usually have fewer correct digits, depending on cond(A) and the size of damp.)

  • conlim : float, optional Another stopping tolerance. lsqr terminates if an estimate of cond(A) exceeds conlim. For compatible systems Ax = b, conlim could be as large as 1.0e+12 (say). For least-squares problems, conlim should be less than 1.0e+8. Maximum precision can be obtained by setting atol = btol = conlim = zero, but the number of iterations may then be excessive.

  • iter_lim : int, optional Explicit limitation on number of iterations (for safety).

  • show : bool, optional Display an iteration log.

  • calc_var : bool, optional Whether to estimate diagonals of (A'A + damp^2*I)^{-1}.

  • x0 : array_like, shape (n,), optional Initial guess of x, if None zeros are used.

    .. versionadded:: 1.0.0

Returns

  • x : ndarray of float The final solution.

  • istop : int Gives the reason for termination. 1 means x is an approximate solution to Ax = b. 2 means x approximately solves the least-squares problem.

  • itn : int Iteration number upon termination.

  • r1norm : float norm(r), where r = b - Ax.

  • r2norm : float sqrt( norm(r)^2 + damp^2 * norm(x)^2 ). Equal to r1norm if damp == 0.

  • anorm : float Estimate of Frobenius norm of Abar = [[A]; [damp*I]].

  • acond : float Estimate of cond(Abar).

  • arnorm : float Estimate of norm(A'*r - damp^2*x).

  • xnorm : float norm(x)

  • var : ndarray of float If calc_var is True, estimates all diagonals of (A'A)^{-1} (if damp == 0) or more generally (A'A + damp^2*I)^{-1}. This is well defined if A has full column rank or damp > 0. (Not sure what var means if rank(A) < n and damp = 0.)

Notes

LSQR uses an iterative method to approximate the solution. The number of iterations required to reach a certain accuracy depends strongly on the scaling of the problem. Poor scaling of the rows or columns of A should therefore be avoided where possible.

For example, in problem 1 the solution is unaltered by row-scaling. If a row of A is very small or large compared to the other rows of A, the corresponding row of ( A b ) should be scaled up or down.

In problems 1 and 2, the solution x is easily recovered following column-scaling. Unless better information is known, the nonzero columns of A should be scaled so that they all have the same Euclidean norm (e.g., 1.0).

In problem 3, there is no freedom to re-scale if damp is nonzero. However, the value of damp should be assigned only after attention has been paid to the scaling of A.

The parameter damp is intended to help regularize ill-conditioned systems, by preventing the true solution from being very large. Another aid to regularization is provided by the parameter acond, which may be used to terminate iterations before the computed solution becomes very large.

If some initial estimate x0 is known and if damp == 0, one could proceed as follows:

  1. Compute a residual vector r0 = b - A*x0.
  2. Use LSQR to solve the system A*dx = r0.
  3. Add the correction dx to obtain a final solution x = x0 + dx.

This requires that x0 be available before and after the call to LSQR. To judge the benefits, suppose LSQR takes k1 iterations to solve Ax = b and k2 iterations to solve Adx = r0. If x0 is 'good', norm(r0) will be smaller than norm(b). If the same stopping tolerances atol and btol are used for each system, k1 and k2 will be similar, but the final solution x0 + dx should be more accurate. The only way to reduce the total work is to use a larger stopping tolerance for the second system. If some value btol is suitable for Ax = b, the larger value btolnorm(b)/norm(r0) should be suitable for A*dx = r0.

Preconditioning is another way to reduce the number of iterations. If it is possible to solve a related system M*x = b efficiently, where M approximates A in some helpful way (e.g. M - A has low rank or its elements are small relative to those of A), LSQR may converge more rapidly on the system A*M(inverse)*z = b, after which x can be recovered by solving M*x = z.

If A is symmetric, LSQR should not be used!

Alternatives are the symmetric conjugate-gradient method (cg) and/or SYMMLQ. SYMMLQ is an implementation of symmetric cg that applies to any symmetric A and will converge more rapidly than LSQR. If A is positive definite, there are other implementations of symmetric cg that require slightly less work per iteration than SYMMLQ (but will take the same number of iterations).

References

.. [1] C. C. Paige and M. A. Saunders (1982a). 'LSQR: An algorithm for sparse linear equations and sparse least squares', ACM TOMS 8(1), 43-71. .. [2] C. C. Paige and M. A. Saunders (1982b). 'Algorithm 583. LSQR: Sparse linear equations and least squares problems', ACM TOMS 8(2), 195-209. .. [3] M. A. Saunders (1995). 'Solution of sparse rectangular systems using LSQR and CRAIG', BIT 35, 588-604.

Examples

>>> from scipy.sparse import csc_matrix
>>> from scipy.sparse.linalg import lsqr
>>> A = csc_matrix([[1., 0.], [1., 1.], [0., 1.]], dtype=float)

The first example has the trivial solution [0, 0]

>>> b = np.array([0., 0., 0.], dtype=float)
>>> x, istop, itn, normr = lsqr(A, b)[:4]
The exact solution is  x = 0
>>> istop
0
>>> x
array([ 0.,  0.])

The stopping code istop=0 returned indicates that a vector of zeros was found as a solution. The returned solution x indeed contains [0., 0.]. The next example has a non-trivial solution:

>>> b = np.array([1., 0., -1.], dtype=float)
>>> x, istop, itn, r1norm = lsqr(A, b)[:4]
>>> istop
1
>>> x
array([ 1., -1.])
>>> itn
1
>>> r1norm
4.440892098500627e-16

As indicated by istop=1, lsqr found a solution obeying the tolerance limits. The given solution [1., -1.] obviously solves the equation. The remaining return values include information about the number of iterations (itn=1) and the remaining difference of left and right side of the solved equation. The final example demonstrates the behavior in the case where there is no solution for the equation:

>>> b = np.array([1., 0.01, -1.], dtype=float)
>>> x, istop, itn, r1norm = lsqr(A, b)[:4]
>>> istop
2
>>> x
array([ 1.00333333, -0.99666667])
>>> A.dot(x)-b
array([ 0.00333333, -0.00333333,  0.00333333])
>>> r1norm
0.005773502691896255

istop indicates that the system is inconsistent and thus x is rather an approximate solution to the corresponding least-squares problem. r1norm contains the norm of the minimal residual that was found.

Graph

Module Sklearn.​Utils.​Graph wraps Python module sklearn.utils.graph.

single_source_shortest_path_length

function single_source_shortest_path_length
val single_source_shortest_path_length :
  ?cutoff:int ->
  graph:[>`ArrayLike] Np.Obj.t ->
  source:int ->
  unit ->
  Py.Object.t

Return the shortest path length from source to all reachable nodes.

Returns a dictionary of shortest path lengths keyed by target.

Parameters

  • graph : sparse matrix or 2D array (preferably LIL matrix) Adjacency matrix of the graph

  • source : integer Starting node for path

  • cutoff : integer, optional Depth to stop the search - only paths of length <= cutoff are returned.

Examples

>>> from sklearn.utils.graph import single_source_shortest_path_length
>>> import numpy as np
>>> graph = np.array([[ 0, 1, 0, 0],
...                   [ 1, 0, 1, 0],
...                   [ 0, 1, 0, 1],
...                   [ 0, 0, 1, 0]])
>>> list(sorted(single_source_shortest_path_length(graph, 0).items()))
[(0, 0), (1, 1), (2, 2), (3, 3)]
>>> graph = np.ones((6, 6))
>>> list(sorted(single_source_shortest_path_length(graph, 2).items()))
[(0, 1), (1, 1), (2, 0), (3, 1), (4, 1), (5, 1)]

Graph_shortest_path

Module Sklearn.​Utils.​Graph_shortest_path wraps Python module sklearn.utils.graph_shortest_path.

DTYPE

Module Sklearn.​Utils.​Graph_shortest_path.​DTYPE wraps Python class sklearn.utils.graph_shortest_path.DTYPE.

type t

create

constructor and attributes create
val create :
  ?x:Py.Object.t ->
  unit ->
  t

Double-precision floating-point number type, compatible with Python float and C double. Character code: 'd'. Canonical name: np.double.

  • Alias: np.float_. Alias on this platform: np.float64: 64-bit precision floating-point number type: sign bit, 11 bits exponent, 52 bits mantissa.

get_item

method get_item
val get_item :
  key:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Return self[key].

fromhex

method fromhex
val fromhex :
  string:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Create a floating-point number from a hexadecimal string.

>>> float.fromhex('0x1.ffffp10')
2047.984375
>>> float.fromhex('-0x1p-1074')
-5e-324

hex

method hex
val hex :
  [> tag] Obj.t ->
  Py.Object.t

Return a hexadecimal representation of a floating-point number.

>>> (-0.1).hex()
'-0x1.999999999999ap-4'
>>> 3.14159.hex()
'0x1.921f9f01b866ep+1'

is_integer

method is_integer
val is_integer :
  [> tag] Obj.t ->
  Py.Object.t

Return True if the float is an integer.

newbyteorder

method newbyteorder
val newbyteorder :
  ?new_order:string ->
  [> tag] Obj.t ->
  Np.Dtype.t

newbyteorder(new_order='S')

Return a new dtype with a different byte order.

Changes are also made in all fields and sub-arrays of the data type.

The new_order code can be any from the following:

  • 'S' - swap dtype from current to opposite endian
  • {'<', 'L'} - little endian
  • {'>', 'B'} - big endian
  • {'=', 'N'} - native order
  • {'|', 'I'} - ignore (no change to byte order)

Parameters

  • new_order : str, optional Byte order to force; a value from the byte order specifications above. The default value ('S') results in swapping the current byte order. The code does a case-insensitive check on the first letter of new_order for the alternatives above. For example, any of 'B' or 'b' or 'biggish' are valid to specify big-endian.

Returns

  • new_dtype : dtype New dtype object with the given change to the byte order.

to_string

method to_string
val to_string: t -> string

Print the object to a human-readable representation.

show

method show
val show: t -> string

Print the object to a human-readable representation.

pp

method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.

ITYPE

Module Sklearn.​Utils.​Graph_shortest_path.​ITYPE wraps Python class sklearn.utils.graph_shortest_path.ITYPE.

type t

get_item

method get_item
val get_item :
  key:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Return self[key].

newbyteorder

method newbyteorder
val newbyteorder :
  ?new_order:string ->
  [> tag] Obj.t ->
  Np.Dtype.t

newbyteorder(new_order='S')

Return a new dtype with a different byte order.

Changes are also made in all fields and sub-arrays of the data type.

The new_order code can be any from the following:

  • 'S' - swap dtype from current to opposite endian
  • {'<', 'L'} - little endian
  • {'>', 'B'} - big endian
  • {'=', 'N'} - native order
  • {'|', 'I'} - ignore (no change to byte order)

Parameters

  • new_order : str, optional Byte order to force; a value from the byte order specifications above. The default value ('S') results in swapping the current byte order. The code does a case-insensitive check on the first letter of new_order for the alternatives above. For example, any of 'B' or 'b' or 'biggish' are valid to specify big-endian.

Returns

  • new_dtype : dtype New dtype object with the given change to the byte order.

to_string

method to_string
val to_string: t -> string

Print the object to a human-readable representation.

show

method show
val show: t -> string

Print the object to a human-readable representation.

pp

method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.

isspmatrix

function isspmatrix
val isspmatrix :
  Py.Object.t ->
  Py.Object.t

Is x of a sparse matrix type?

Parameters

x object to check for being a sparse matrix

Returns

bool True if x is a sparse matrix, False otherwise

Notes

issparse and isspmatrix are aliases for the same function.

Examples

>>> from scipy.sparse import csr_matrix, isspmatrix
>>> isspmatrix(csr_matrix([[5]]))
True
>>> from scipy.sparse import isspmatrix
>>> isspmatrix(5)
False

isspmatrix_csr

function isspmatrix_csr
val isspmatrix_csr :
  Py.Object.t ->
  Py.Object.t

Is x of csr_matrix type?

Parameters

x object to check for being a csr matrix

Returns

bool True if x is a csr matrix, False otherwise

Examples

>>> from scipy.sparse import csr_matrix, isspmatrix_csr
>>> isspmatrix_csr(csr_matrix([[5]]))
True
>>> from scipy.sparse import csc_matrix, csr_matrix, isspmatrix_csc
>>> isspmatrix_csr(csc_matrix([[5]]))
False

Metaestimators

Module Sklearn.​Utils.​Metaestimators wraps Python module sklearn.utils.metaestimators.

Attrgetter

Module Sklearn.​Utils.​Metaestimators.​Attrgetter wraps Python class sklearn.utils.metaestimators.attrgetter.

type t

to_string

method to_string
val to_string: t -> string

Print the object to a human-readable representation.

show

method show
val show: t -> string

Print the object to a human-readable representation.

pp

method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.

any

function any
val any :
  ?kwds:(string * Py.Object.t) list ->
  Py.Object.t list ->
  Py.Object.t

Internal indicator of special typing constructs. See _doc instance attribute for specific docs.

list

function list
val list :
  ?kwargs:(string * Py.Object.t) list ->
  Py.Object.t list ->
  Py.Object.t

The central part of internal API.

This represents a generic version of type 'origin' with type arguments 'params'. There are two kind of these aliases: user defined and special. The special ones are wrappers around builtin collections and ABCs in collections.abc. These must have 'name' always set. If 'inst' is False, then the alias can't be instantiated, this is used by e.g. typing.List and typing.Dict.

abstractmethod

function abstractmethod
val abstractmethod :
  Py.Object.t ->
  Py.Object.t

A decorator indicating abstract methods.

Requires that the metaclass is ABCMeta or derived from it. A class that has a metaclass derived from ABCMeta cannot be instantiated unless all of its abstract methods are overridden. The abstract methods can be called using any of the normal 'super' call mechanisms. abstractmethod() may be used to declare abstract methods for properties and descriptors.

Usage:

class C(metaclass=ABCMeta):
    @abstractmethod
    def my_abstract_method(self, ...):
        ...

if_delegate_has_method

function if_delegate_has_method
val if_delegate_has_method :
  [`S of string | `StringList of string list] ->
  Py.Object.t

Create a decorator for methods that are delegated to a sub-estimator

This enables ducktyping by hasattr returning True according to the sub-estimator.

Parameters

  • delegate : string, list of strings or tuple of strings Name of the sub-estimator that can be accessed as an attribute of the base object. If a list or a tuple of names are provided, the first sub-estimator that is an attribute of the base object will be used.

update_wrapper

function update_wrapper
val update_wrapper :
  ?assigned:Py.Object.t ->
  ?updated:Py.Object.t ->
  wrapper:Py.Object.t ->
  wrapped:Py.Object.t ->
  unit ->
  Py.Object.t

Update a wrapper function to look like the wrapped function

wrapper is the function to be updated wrapped is the original function assigned is a tuple naming the attributes assigned directly from the wrapped function to the wrapper function (defaults to functools.WRAPPER_ASSIGNMENTS) updated is a tuple naming the attributes of the wrapper that are updated with the corresponding attribute from the wrapped function (defaults to functools.WRAPPER_UPDATES)

Multiclass

Module Sklearn.​Utils.​Multiclass wraps Python module sklearn.utils.multiclass.

Chain

Module Sklearn.​Utils.​Multiclass.​Chain wraps Python class sklearn.utils.multiclass.chain.

type t

create

constructor and attributes create
val create :
  Py.Object.t list ->
  t

chain( *iterables) --> chain object

Return a chain object whose .next() method returns elements from the first iterable until it is exhausted, then elements from the next iterable, until all of the iterables are exhausted.

iter

method iter
val iter :
  [> tag] Obj.t ->
  Dict.t Seq.t

Implement iter(self).

from_iterable

method from_iterable
val from_iterable :
  iterable:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Alternative chain() constructor taking a single iterable argument that evaluates lazily.

to_string

method to_string
val to_string: t -> string

Print the object to a human-readable representation.

show

method show
val show: t -> string

Print the object to a human-readable representation.

pp

method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.

Dok_matrix

Module Sklearn.​Utils.​Multiclass.​Dok_matrix wraps Python class sklearn.utils.multiclass.dok_matrix.

type t

create

constructor and attributes create
val create :
  ?shape:int list ->
  ?dtype:Py.Object.t ->
  ?copy:Py.Object.t ->
  arg1:Py.Object.t ->
  unit ->
  t

Dictionary Of Keys based sparse matrix.

This is an efficient structure for constructing sparse matrices incrementally.

This can be instantiated in several ways: dok_matrix(D) with a dense matrix, D

dok_matrix(S)
    with a sparse matrix, S

dok_matrix((M,N), [dtype])
    create the matrix with initial shape (M,N)
    dtype is optional, defaulting to dtype='d'

Attributes

  • dtype : dtype Data type of the matrix

  • shape : 2-tuple Shape of the matrix

  • ndim : int Number of dimensions (this is always 2) nnz Number of nonzero elements

Notes

Sparse matrices can be used in arithmetic operations: they support addition, subtraction, multiplication, division, and matrix power.

Allows for efficient O(1) access of individual elements. Duplicates are not allowed. Can be efficiently converted to a coo_matrix once constructed.

Examples

>>> import numpy as np
>>> from scipy.sparse import dok_matrix
>>> S = dok_matrix((5, 5), dtype=np.float32)
>>> for i in range(5):
...     for j in range(5):
...         S[i, j] = i + j    # Update element

get_item

method get_item
val get_item :
  key:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

iter

method iter
val iter :
  [> tag] Obj.t ->
  Dict.t Seq.t

setitem

method setitem
val __setitem__ :
  key:Py.Object.t ->
  x:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

asformat

method asformat
val asformat :
  ?copy:bool ->
  format:[`S of string | `None] ->
  [> tag] Obj.t ->
  Py.Object.t

Return this matrix in the passed format.

Parameters

  • format : {str, None} The desired matrix format ('csr', 'csc', 'lil', 'dok', 'array', ...) or None for no conversion.

  • copy : bool, optional If True, the result is guaranteed to not share data with self.

Returns

  • A : This matrix in the passed format.

asfptype

method asfptype
val asfptype :
  [> tag] Obj.t ->
  Py.Object.t

Upcast matrix to a floating point format (if necessary)

astype

method astype
val astype :
  ?casting:[`No | `Equiv | `Safe | `Same_kind | `Unsafe] ->
  ?copy:bool ->
  dtype:[`S of string | `Dtype of Np.Dtype.t] ->
  [> tag] Obj.t ->
  Py.Object.t

Cast the matrix elements to a specified type.

Parameters

  • dtype : string or numpy dtype Typecode or data-type to which to cast the data.

  • casting : {'no', 'equiv', 'safe', 'same_kind', 'unsafe'}, optional Controls what kind of data casting may occur. Defaults to 'unsafe' for backwards compatibility. 'no' means the data types should not be cast at all. 'equiv' means only byte-order changes are allowed. 'safe' means only casts which can preserve values are allowed. 'same_kind' means only safe casts or casts within a kind, like float64 to float32, are allowed. 'unsafe' means any data conversions may be done.

  • copy : bool, optional If copy is False, the result might share some memory with this matrix. If copy is True, it is guaranteed that the result and this matrix do not share any memory.

clear

method clear
val clear :
  [> tag] Obj.t ->
  Py.Object.t

D.clear() -> None. Remove all items from D.

conj

method conj
val conj :
  ?copy:bool ->
  [> tag] Obj.t ->
  Py.Object.t

Element-wise complex conjugation.

If the matrix is of non-complex data type and copy is False, this method does nothing and the data is not copied.

Parameters

  • copy : bool, optional If True, the result is guaranteed to not share data with self.

Returns

  • A : The element-wise complex conjugate.

conjtransp

method conjtransp
val conjtransp :
  [> tag] Obj.t ->
  Py.Object.t

Return the conjugate transpose.

conjugate

method conjugate
val conjugate :
  ?copy:bool ->
  [> tag] Obj.t ->
  Py.Object.t

Element-wise complex conjugation.

If the matrix is of non-complex data type and copy is False, this method does nothing and the data is not copied.

Parameters

  • copy : bool, optional If True, the result is guaranteed to not share data with self.

Returns

  • A : The element-wise complex conjugate.

copy

method copy
val copy :
  [> tag] Obj.t ->
  Py.Object.t

Returns a copy of this matrix.

No data/indices will be shared between the returned value and current matrix.

count_nonzero

method count_nonzero
val count_nonzero :
  [> tag] Obj.t ->
  Py.Object.t

Number of non-zero entries, equivalent to

np.count_nonzero(a.toarray())

Unlike getnnz() and the nnz property, which return the number of stored entries (the length of the data attribute), this method counts the actual number of non-zero entries in data.

diagonal

method diagonal
val diagonal :
  ?k:int ->
  [> tag] Obj.t ->
  Py.Object.t

Returns the kth diagonal of the matrix.

Parameters

  • k : int, optional Which diagonal to get, corresponding to elements a[i, i+k].

  • Default: 0 (the main diagonal).

    .. versionadded:: 1.0

See also

  • numpy.diagonal : Equivalent numpy function.

Examples

>>> from scipy.sparse import csr_matrix
>>> A = csr_matrix([[1, 2, 0], [0, 0, 3], [4, 0, 5]])
>>> A.diagonal()
array([1, 0, 5])
>>> A.diagonal(k=1)
array([2, 3])

dot

method dot
val dot :
  other:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Ordinary dot product

Examples

>>> import numpy as np
>>> from scipy.sparse import csr_matrix
>>> A = csr_matrix([[1, 2, 0], [0, 0, 3], [4, 0, 5]])
>>> v = np.array([1, 0, -1])
>>> A.dot(v)
array([ 1, -3, -1], dtype=int64)

fromkeys

method fromkeys
val fromkeys :
  ?value:Py.Object.t ->
  iterable:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Create a new dictionary with keys from iterable and values set to value.

get

method get
val get :
  ?default:Py.Object.t ->
  key:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

This overrides the dict.get method, providing type checking but otherwise equivalent functionality.

getH

method getH
val getH :
  [> tag] Obj.t ->
  Py.Object.t

Return the Hermitian transpose of this matrix.

See Also

  • numpy.matrix.getH : NumPy's implementation of getH for matrices

get_shape

method get_shape
val get_shape :
  [> tag] Obj.t ->
  Py.Object.t

Get shape of a matrix.

getcol

method getcol
val getcol :
  j:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Returns a copy of column j of the matrix, as an (m x 1) sparse matrix (column vector).

getformat

method getformat
val getformat :
  [> tag] Obj.t ->
  Py.Object.t

Format of a matrix representation as a string.

getmaxprint

method getmaxprint
val getmaxprint :
  [> tag] Obj.t ->
  Py.Object.t

Maximum number of elements to display when printed.

getnnz

method getnnz
val getnnz :
  ?axis:[`Zero | `One] ->
  [> tag] Obj.t ->
  Py.Object.t

Number of stored values, including explicit zeros.

Parameters

  • axis : None, 0, or 1 Select between the number of values across the whole matrix, in each column, or in each row.

See also

  • count_nonzero : Number of non-zero entries

getrow

method getrow
val getrow :
  i:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Returns a copy of row i of the matrix, as a (1 x n) sparse matrix (row vector).

items

method items
val items :
  [> tag] Obj.t ->
  Py.Object.t

D.items() -> a set-like object providing a view on D's items

keys

method keys
val keys :
  [> tag] Obj.t ->
  Py.Object.t

D.keys() -> a set-like object providing a view on D's keys

maximum

method maximum
val maximum :
  other:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Element-wise maximum between this and another matrix.

mean

method mean
val mean :
  ?axis:[`One | `Zero | `PyObject of Py.Object.t] ->
  ?dtype:Np.Dtype.t ->
  ?out:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Compute the arithmetic mean along the specified axis.

Returns the average of the matrix elements. The average is taken over all elements in the matrix by default, otherwise over the specified axis. float64 intermediate and return values are used for integer inputs.

Parameters

  • axis : {-2, -1, 0, 1, None} optional Axis along which the mean is computed. The default is to compute the mean of all elements in the matrix (i.e., axis = None).

  • dtype : data-type, optional Type to use in computing the mean. For integer inputs, the default is float64; for floating point inputs, it is the same as the input dtype.

    .. versionadded:: 0.18.0

  • out : np.matrix, optional Alternative output matrix in which to place the result. It must have the same shape as the expected output, but the type of the output values will be cast if necessary.

    .. versionadded:: 0.18.0

Returns

  • m : np.matrix

See Also

  • numpy.matrix.mean : NumPy's implementation of 'mean' for matrices

minimum

method minimum
val minimum :
  other:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Element-wise minimum between this and another matrix.

multiply

method multiply
val multiply :
  other:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Point-wise multiplication by another matrix

nonzero

method nonzero
val nonzero :
  [> tag] Obj.t ->
  Py.Object.t

nonzero indices

Returns a tuple of arrays (row,col) containing the indices of the non-zero elements of the matrix.

Examples

>>> from scipy.sparse import csr_matrix
>>> A = csr_matrix([[1,2,0],[0,0,3],[4,0,5]])
>>> A.nonzero()
(array([0, 0, 1, 2, 2]), array([0, 1, 2, 0, 2]))

pop

method pop
val pop :
  ?d:Py.Object.t ->
  k:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

D.pop(k[,d]) -> v, remove specified key and return the corresponding value. If key is not found, d is returned if given, otherwise KeyError is raised

popitem

method popitem
val popitem :
  [> tag] Obj.t ->
  Py.Object.t

Remove and return a (key, value) pair as a 2-tuple.

Pairs are returned in LIFO (last-in, first-out) order. Raises KeyError if the dict is empty.

power

method power
val power :
  ?dtype:Py.Object.t ->
  n:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Element-wise power.

reshape

method reshape
val reshape :
  ?kwargs:(string * Py.Object.t) list ->
  Py.Object.t list ->
  [> tag] Obj.t ->
  [`ArrayLike|`Object|`Spmatrix] Np.Obj.t

reshape(self, shape, order='C', copy=False)

Gives a new shape to a sparse matrix without changing its data.

Parameters

  • shape : length-2 tuple of ints The new shape should be compatible with the original shape.

  • order : {'C', 'F'}, optional Read the elements using this index order. 'C' means to read and write the elements using C-like index order; e.g., read entire first row, then second row, etc. 'F' means to read and write the elements using Fortran-like index order; e.g., read entire first column, then second column, etc.

  • copy : bool, optional Indicates whether or not attributes of self should be copied whenever possible. The degree to which attributes are copied varies depending on the type of sparse matrix being used.

Returns

  • reshaped_matrix : sparse matrix A sparse matrix with the given shape, not necessarily of the same format as the current object.

See Also

  • numpy.matrix.reshape : NumPy's implementation of 'reshape' for matrices

resize

method resize
val resize :
  int list ->
  [> tag] Obj.t ->
  Py.Object.t

Resize the matrix in-place to dimensions given by shape

Any elements that lie within the new shape will remain at the same indices, while non-zero elements lying outside the new shape are removed.

Parameters

  • shape : (int, int) number of rows and columns in the new matrix

Notes

The semantics are not identical to numpy.ndarray.resize or numpy.resize. Here, the same data will be maintained at each index before and after reshape, if that index is within the new bounds. In numpy, resizing maintains contiguity of the array, moving elements around in the logical matrix but not within a flattened representation.

We give no guarantees about whether the underlying data attributes (arrays, etc.) will be modified in place or replaced with new objects.

set_shape

method set_shape
val set_shape :
  shape:int list ->
  [> tag] Obj.t ->
  Py.Object.t

See reshape.

setdefault

method setdefault
val setdefault :
  ?default:Py.Object.t ->
  key:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Insert key with a value of default if key is not in the dictionary.

Return the value for key if key is in the dictionary, else default.

setdiag

method setdiag
val setdiag :
  ?k:int ->
  values:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  Py.Object.t

Set diagonal or off-diagonal elements of the array.

Parameters

  • values : array_like New values of the diagonal elements.

    Values may have any length. If the diagonal is longer than values, then the remaining diagonal entries will not be set. If values if longer than the diagonal, then the remaining values are ignored.

    If a scalar value is given, all of the diagonal is set to it.

  • k : int, optional Which off-diagonal to set, corresponding to elements a[i,i+k].

  • Default: 0 (the main diagonal).

sum

method sum
val sum :
  ?axis:[`One | `Zero | `PyObject of Py.Object.t] ->
  ?dtype:Np.Dtype.t ->
  ?out:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Sum the matrix elements over a given axis.

Parameters

  • axis : {-2, -1, 0, 1, None} optional Axis along which the sum is computed. The default is to compute the sum of all the matrix elements, returning a scalar (i.e., axis = None).

  • dtype : dtype, optional The type of the returned matrix and of the accumulator in which the elements are summed. The dtype of a is used by default unless a has an integer dtype of less precision than the default platform integer. In that case, if a is signed then the platform integer is used while if a is unsigned then an unsigned integer of the same precision as the platform integer is used.

    .. versionadded:: 0.18.0

  • out : np.matrix, optional Alternative output matrix in which to place the result. It must have the same shape as the expected output, but the type of the output values will be cast if necessary.

    .. versionadded:: 0.18.0

Returns

  • sum_along_axis : np.matrix A matrix with the same shape as self, with the specified axis removed.

See Also

  • numpy.matrix.sum : NumPy's implementation of 'sum' for matrices

toarray

method toarray
val toarray :
  ?order:[`F | `C] ->
  ?out:[`Arr of [>`ArrayLike] Np.Obj.t | `T2_D of Py.Object.t] ->
  [> tag] Obj.t ->
  Py.Object.t

Return a dense ndarray representation of this matrix.

Parameters

  • order : {'C', 'F'}, optional Whether to store multidimensional data in C (row-major) or Fortran (column-major) order in memory. The default is 'None', indicating the NumPy default of C-ordered. Cannot be specified in conjunction with the out argument.

  • out : ndarray, 2-D, optional If specified, uses this array as the output buffer instead of allocating a new array to return. The provided array must have the same shape and dtype as the sparse matrix on which you are calling the method. For most sparse types, out is required to be memory contiguous (either C or Fortran ordered).

Returns

  • arr : ndarray, 2-D An array with the same shape and containing the same data represented by the sparse matrix, with the requested memory order. If out was passed, the same object is returned after being modified in-place to contain the appropriate values.

tobsr

method tobsr
val tobsr :
  ?blocksize:Py.Object.t ->
  ?copy:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Convert this matrix to Block Sparse Row format.

With copy=False, the data/indices may be shared between this matrix and the resultant bsr_matrix.

When blocksize=(R, C) is provided, it will be used for construction of the bsr_matrix.

tocoo

method tocoo
val tocoo :
  ?copy:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Convert this matrix to COOrdinate format.

With copy=False, the data/indices may be shared between this matrix and the resultant coo_matrix.

tocsc

method tocsc
val tocsc :
  ?copy:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Convert this matrix to Compressed Sparse Column format.

With copy=False, the data/indices may be shared between this matrix and the resultant csc_matrix.

tocsr

method tocsr
val tocsr :
  ?copy:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Convert this matrix to Compressed Sparse Row format.

With copy=False, the data/indices may be shared between this matrix and the resultant csr_matrix.

todense

method todense
val todense :
  ?order:[`F | `C] ->
  ?out:[`Arr of [>`ArrayLike] Np.Obj.t | `T2_D of Py.Object.t] ->
  [> tag] Obj.t ->
  Py.Object.t

Return a dense matrix representation of this matrix.

Parameters

  • order : {'C', 'F'}, optional Whether to store multi-dimensional data in C (row-major) or Fortran (column-major) order in memory. The default is 'None', indicating the NumPy default of C-ordered. Cannot be specified in conjunction with the out argument.

  • out : ndarray, 2-D, optional If specified, uses this array (or numpy.matrix) as the output buffer instead of allocating a new array to return. The provided array must have the same shape and dtype as the sparse matrix on which you are calling the method.

Returns

  • arr : numpy.matrix, 2-D A NumPy matrix object with the same shape and containing the same data represented by the sparse matrix, with the requested memory order. If out was passed and was an array (rather than a numpy.matrix), it will be filled with the appropriate values and returned wrapped in a numpy.matrix object that shares the same memory.

todia

method todia
val todia :
  ?copy:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Convert this matrix to sparse DIAgonal format.

With copy=False, the data/indices may be shared between this matrix and the resultant dia_matrix.

todok

method todok
val todok :
  ?copy:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Convert this matrix to Dictionary Of Keys format.

With copy=False, the data/indices may be shared between this matrix and the resultant dok_matrix.

tolil

method tolil
val tolil :
  ?copy:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Convert this matrix to List of Lists format.

With copy=False, the data/indices may be shared between this matrix and the resultant lil_matrix.

transpose

method transpose
val transpose :
  ?axes:Py.Object.t ->
  ?copy:bool ->
  [> tag] Obj.t ->
  Py.Object.t

Reverses the dimensions of the sparse matrix.

Parameters

  • axes : None, optional This argument is in the signature solely for NumPy compatibility reasons. Do not pass in anything except for the default value.

  • copy : bool, optional Indicates whether or not attributes of self should be copied whenever possible. The degree to which attributes are copied varies depending on the type of sparse matrix being used.

Returns

  • p : self with the dimensions reversed.

See Also

  • numpy.matrix.transpose : NumPy's implementation of 'transpose' for matrices

update

method update
val update :
  val_:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

D.update([E, ]**F) -> None. Update D from dict/iterable E and F. If E is present and has a .keys() method, then does: for k in E: D[k] = E[k] If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v In either case, this is followed by: for k in F: D[k] = F[k]

values

method values
val values :
  [> tag] Obj.t ->
  Py.Object.t

D.values() -> an object providing a view on D's values

dtype

attribute dtype
val dtype : t -> Np.Dtype.t
val dtype_opt : t -> (Np.Dtype.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.

shape

attribute shape
val shape : t -> int list
val shape_opt : t -> (int list) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.

ndim

attribute ndim
val ndim : t -> int
val ndim_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.

nnz

attribute nnz
val nnz : t -> Py.Object.t
val nnz_opt : t -> (Py.Object.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.

to_string

method to_string
val to_string: t -> string

Print the object to a human-readable representation.

show

method show
val show: t -> string

Print the object to a human-readable representation.

pp

method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.

Lil_matrix

Module Sklearn.​Utils.​Multiclass.​Lil_matrix wraps Python class sklearn.utils.multiclass.lil_matrix.

type t

create

constructor and attributes create
val create :
  ?shape:int list ->
  ?dtype:Py.Object.t ->
  ?copy:Py.Object.t ->
  arg1:Py.Object.t ->
  unit ->
  t

Row-based list of lists sparse matrix

This is a structure for constructing sparse matrices incrementally. Note that inserting a single item can take linear time in the worst case; to construct a matrix efficiently, make sure the items are pre-sorted by index, per row.

This can be instantiated in several ways: lil_matrix(D) with a dense matrix or rank-2 ndarray D

lil_matrix(S)
    with another sparse matrix S (equivalent to S.tolil())

lil_matrix((M, N), [dtype])
    to construct an empty matrix with shape (M, N)
    dtype is optional, defaulting to dtype='d'.

Attributes

  • dtype : dtype Data type of the matrix

  • shape : 2-tuple Shape of the matrix

  • ndim : int Number of dimensions (this is always 2) nnz Number of stored values, including explicit zeros data LIL format data array of the matrix rows LIL format row index array of the matrix

Notes

Sparse matrices can be used in arithmetic operations: they support addition, subtraction, multiplication, division, and matrix power.

Advantages of the LIL format - supports flexible slicing - changes to the matrix sparsity structure are efficient

Disadvantages of the LIL format - arithmetic operations LIL + LIL are slow (consider CSR or CSC) - slow column slicing (consider CSC) - slow matrix vector products (consider CSR or CSC)

Intended Usage - LIL is a convenient format for constructing sparse matrices - once a matrix has been constructed, convert to CSR or CSC format for fast arithmetic and matrix vector operations - consider using the COO format when constructing large matrices

Data Structure - An array (self.rows) of rows, each of which is a sorted list of column indices of non-zero elements. - The corresponding nonzero values are stored in similar fashion in self.data.

get_item

method get_item
val get_item :
  key:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

iter

method iter
val iter :
  [> tag] Obj.t ->
  Dict.t Seq.t

setitem

method setitem
val __setitem__ :
  key:Py.Object.t ->
  x:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

asformat

method asformat
val asformat :
  ?copy:bool ->
  format:[`S of string | `None] ->
  [> tag] Obj.t ->
  Py.Object.t

Return this matrix in the passed format.

Parameters

  • format : {str, None} The desired matrix format ('csr', 'csc', 'lil', 'dok', 'array', ...) or None for no conversion.

  • copy : bool, optional If True, the result is guaranteed to not share data with self.

Returns

  • A : This matrix in the passed format.

asfptype

method asfptype
val asfptype :
  [> tag] Obj.t ->
  Py.Object.t

Upcast matrix to a floating point format (if necessary)

astype

method astype
val astype :
  ?casting:[`No | `Equiv | `Safe | `Same_kind | `Unsafe] ->
  ?copy:bool ->
  dtype:[`S of string | `Dtype of Np.Dtype.t] ->
  [> tag] Obj.t ->
  Py.Object.t

Cast the matrix elements to a specified type.

Parameters

  • dtype : string or numpy dtype Typecode or data-type to which to cast the data.

  • casting : {'no', 'equiv', 'safe', 'same_kind', 'unsafe'}, optional Controls what kind of data casting may occur. Defaults to 'unsafe' for backwards compatibility. 'no' means the data types should not be cast at all. 'equiv' means only byte-order changes are allowed. 'safe' means only casts which can preserve values are allowed. 'same_kind' means only safe casts or casts within a kind, like float64 to float32, are allowed. 'unsafe' means any data conversions may be done.

  • copy : bool, optional If copy is False, the result might share some memory with this matrix. If copy is True, it is guaranteed that the result and this matrix do not share any memory.

conj

method conj
val conj :
  ?copy:bool ->
  [> tag] Obj.t ->
  Py.Object.t

Element-wise complex conjugation.

If the matrix is of non-complex data type and copy is False, this method does nothing and the data is not copied.

Parameters

  • copy : bool, optional If True, the result is guaranteed to not share data with self.

Returns

  • A : The element-wise complex conjugate.

conjugate

method conjugate
val conjugate :
  ?copy:bool ->
  [> tag] Obj.t ->
  Py.Object.t

Element-wise complex conjugation.

If the matrix is of non-complex data type and copy is False, this method does nothing and the data is not copied.

Parameters

  • copy : bool, optional If True, the result is guaranteed to not share data with self.

Returns

  • A : The element-wise complex conjugate.

copy

method copy
val copy :
  [> tag] Obj.t ->
  Py.Object.t

Returns a copy of this matrix.

No data/indices will be shared between the returned value and current matrix.

count_nonzero

method count_nonzero
val count_nonzero :
  [> tag] Obj.t ->
  Py.Object.t

Number of non-zero entries, equivalent to

np.count_nonzero(a.toarray())

Unlike getnnz() and the nnz property, which return the number of stored entries (the length of the data attribute), this method counts the actual number of non-zero entries in data.

diagonal

method diagonal
val diagonal :
  ?k:int ->
  [> tag] Obj.t ->
  Py.Object.t

Returns the kth diagonal of the matrix.

Parameters

  • k : int, optional Which diagonal to get, corresponding to elements a[i, i+k].

  • Default: 0 (the main diagonal).

    .. versionadded:: 1.0

See also

  • numpy.diagonal : Equivalent numpy function.

Examples

>>> from scipy.sparse import csr_matrix
>>> A = csr_matrix([[1, 2, 0], [0, 0, 3], [4, 0, 5]])
>>> A.diagonal()
array([1, 0, 5])
>>> A.diagonal(k=1)
array([2, 3])

dot

method dot
val dot :
  other:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Ordinary dot product

Examples

>>> import numpy as np
>>> from scipy.sparse import csr_matrix
>>> A = csr_matrix([[1, 2, 0], [0, 0, 3], [4, 0, 5]])
>>> v = np.array([1, 0, -1])
>>> A.dot(v)
array([ 1, -3, -1], dtype=int64)

getH

method getH
val getH :
  [> tag] Obj.t ->
  Py.Object.t

Return the Hermitian transpose of this matrix.

See Also

  • numpy.matrix.getH : NumPy's implementation of getH for matrices

get_shape

method get_shape
val get_shape :
  [> tag] Obj.t ->
  Py.Object.t

Get shape of a matrix.

getcol

method getcol
val getcol :
  j:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Returns a copy of column j of the matrix, as an (m x 1) sparse matrix (column vector).

getformat

method getformat
val getformat :
  [> tag] Obj.t ->
  Py.Object.t

Format of a matrix representation as a string.

getmaxprint

method getmaxprint
val getmaxprint :
  [> tag] Obj.t ->
  Py.Object.t

Maximum number of elements to display when printed.

getnnz

method getnnz
val getnnz :
  ?axis:[`Zero | `One] ->
  [> tag] Obj.t ->
  Py.Object.t

Number of stored values, including explicit zeros.

Parameters

  • axis : None, 0, or 1 Select between the number of values across the whole matrix, in each column, or in each row.

See also

  • count_nonzero : Number of non-zero entries

getrow

method getrow
val getrow :
  i:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Returns a copy of the 'i'th row.

getrowview

method getrowview
val getrowview :
  i:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Returns a view of the 'i'th row (without copying).

maximum

method maximum
val maximum :
  other:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Element-wise maximum between this and another matrix.

mean

method mean
val mean :
  ?axis:[`One | `Zero | `PyObject of Py.Object.t] ->
  ?dtype:Np.Dtype.t ->
  ?out:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Compute the arithmetic mean along the specified axis.

Returns the average of the matrix elements. The average is taken over all elements in the matrix by default, otherwise over the specified axis. float64 intermediate and return values are used for integer inputs.

Parameters

  • axis : {-2, -1, 0, 1, None} optional Axis along which the mean is computed. The default is to compute the mean of all elements in the matrix (i.e., axis = None).

  • dtype : data-type, optional Type to use in computing the mean. For integer inputs, the default is float64; for floating point inputs, it is the same as the input dtype.

    .. versionadded:: 0.18.0

  • out : np.matrix, optional Alternative output matrix in which to place the result. It must have the same shape as the expected output, but the type of the output values will be cast if necessary.

    .. versionadded:: 0.18.0

Returns

  • m : np.matrix

See Also

  • numpy.matrix.mean : NumPy's implementation of 'mean' for matrices

minimum

method minimum
val minimum :
  other:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Element-wise minimum between this and another matrix.

multiply

method multiply
val multiply :
  other:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Point-wise multiplication by another matrix

nonzero

method nonzero
val nonzero :
  [> tag] Obj.t ->
  Py.Object.t

nonzero indices

Returns a tuple of arrays (row,col) containing the indices of the non-zero elements of the matrix.

Examples

>>> from scipy.sparse import csr_matrix
>>> A = csr_matrix([[1,2,0],[0,0,3],[4,0,5]])
>>> A.nonzero()
(array([0, 0, 1, 2, 2]), array([0, 1, 2, 0, 2]))

power

method power
val power :
  ?dtype:Py.Object.t ->
  n:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Element-wise power.

reshape

method reshape
val reshape :
  ?kwargs:(string * Py.Object.t) list ->
  Py.Object.t list ->
  [> tag] Obj.t ->
  [`ArrayLike|`Object|`Spmatrix] Np.Obj.t

reshape(self, shape, order='C', copy=False)

Gives a new shape to a sparse matrix without changing its data.

Parameters

  • shape : length-2 tuple of ints The new shape should be compatible with the original shape.

  • order : {'C', 'F'}, optional Read the elements using this index order. 'C' means to read and write the elements using C-like index order; e.g., read entire first row, then second row, etc. 'F' means to read and write the elements using Fortran-like index order; e.g., read entire first column, then second column, etc.

  • copy : bool, optional Indicates whether or not attributes of self should be copied whenever possible. The degree to which attributes are copied varies depending on the type of sparse matrix being used.

Returns

  • reshaped_matrix : sparse matrix A sparse matrix with the given shape, not necessarily of the same format as the current object.

See Also

  • numpy.matrix.reshape : NumPy's implementation of 'reshape' for matrices

resize

method resize
val resize :
  int list ->
  [> tag] Obj.t ->
  Py.Object.t

Resize the matrix in-place to dimensions given by shape

Any elements that lie within the new shape will remain at the same indices, while non-zero elements lying outside the new shape are removed.

Parameters

  • shape : (int, int) number of rows and columns in the new matrix

Notes

The semantics are not identical to numpy.ndarray.resize or numpy.resize. Here, the same data will be maintained at each index before and after reshape, if that index is within the new bounds. In numpy, resizing maintains contiguity of the array, moving elements around in the logical matrix but not within a flattened representation.

We give no guarantees about whether the underlying data attributes (arrays, etc.) will be modified in place or replaced with new objects.

set_shape

method set_shape
val set_shape :
  shape:int list ->
  [> tag] Obj.t ->
  Py.Object.t

See reshape.

setdiag

method setdiag
val setdiag :
  ?k:int ->
  values:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  Py.Object.t

Set diagonal or off-diagonal elements of the array.

Parameters

  • values : array_like New values of the diagonal elements.

    Values may have any length. If the diagonal is longer than values, then the remaining diagonal entries will not be set. If values if longer than the diagonal, then the remaining values are ignored.

    If a scalar value is given, all of the diagonal is set to it.

  • k : int, optional Which off-diagonal to set, corresponding to elements a[i,i+k].

  • Default: 0 (the main diagonal).

sum

method sum
val sum :
  ?axis:[`One | `Zero | `PyObject of Py.Object.t] ->
  ?dtype:Np.Dtype.t ->
  ?out:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Sum the matrix elements over a given axis.

Parameters

  • axis : {-2, -1, 0, 1, None} optional Axis along which the sum is computed. The default is to compute the sum of all the matrix elements, returning a scalar (i.e., axis = None).

  • dtype : dtype, optional The type of the returned matrix and of the accumulator in which the elements are summed. The dtype of a is used by default unless a has an integer dtype of less precision than the default platform integer. In that case, if a is signed then the platform integer is used while if a is unsigned then an unsigned integer of the same precision as the platform integer is used.

    .. versionadded:: 0.18.0

  • out : np.matrix, optional Alternative output matrix in which to place the result. It must have the same shape as the expected output, but the type of the output values will be cast if necessary.

    .. versionadded:: 0.18.0

Returns

  • sum_along_axis : np.matrix A matrix with the same shape as self, with the specified axis removed.

See Also

  • numpy.matrix.sum : NumPy's implementation of 'sum' for matrices

toarray

method toarray
val toarray :
  ?order:[`F | `C] ->
  ?out:[`Arr of [>`ArrayLike] Np.Obj.t | `T2_D of Py.Object.t] ->
  [> tag] Obj.t ->
  Py.Object.t

Return a dense ndarray representation of this matrix.

Parameters

  • order : {'C', 'F'}, optional Whether to store multidimensional data in C (row-major) or Fortran (column-major) order in memory. The default is 'None', indicating the NumPy default of C-ordered. Cannot be specified in conjunction with the out argument.

  • out : ndarray, 2-D, optional If specified, uses this array as the output buffer instead of allocating a new array to return. The provided array must have the same shape and dtype as the sparse matrix on which you are calling the method. For most sparse types, out is required to be memory contiguous (either C or Fortran ordered).

Returns

  • arr : ndarray, 2-D An array with the same shape and containing the same data represented by the sparse matrix, with the requested memory order. If out was passed, the same object is returned after being modified in-place to contain the appropriate values.

tobsr

method tobsr
val tobsr :
  ?blocksize:Py.Object.t ->
  ?copy:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Convert this matrix to Block Sparse Row format.

With copy=False, the data/indices may be shared between this matrix and the resultant bsr_matrix.

When blocksize=(R, C) is provided, it will be used for construction of the bsr_matrix.

tocoo

method tocoo
val tocoo :
  ?copy:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Convert this matrix to COOrdinate format.

With copy=False, the data/indices may be shared between this matrix and the resultant coo_matrix.

tocsc

method tocsc
val tocsc :
  ?copy:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Convert this matrix to Compressed Sparse Column format.

With copy=False, the data/indices may be shared between this matrix and the resultant csc_matrix.

tocsr

method tocsr
val tocsr :
  ?copy:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Convert this matrix to Compressed Sparse Row format.

With copy=False, the data/indices may be shared between this matrix and the resultant csr_matrix.

todense

method todense
val todense :
  ?order:[`F | `C] ->
  ?out:[`Arr of [>`ArrayLike] Np.Obj.t | `T2_D of Py.Object.t] ->
  [> tag] Obj.t ->
  Py.Object.t

Return a dense matrix representation of this matrix.

Parameters

  • order : {'C', 'F'}, optional Whether to store multi-dimensional data in C (row-major) or Fortran (column-major) order in memory. The default is 'None', indicating the NumPy default of C-ordered. Cannot be specified in conjunction with the out argument.

  • out : ndarray, 2-D, optional If specified, uses this array (or numpy.matrix) as the output buffer instead of allocating a new array to return. The provided array must have the same shape and dtype as the sparse matrix on which you are calling the method.

Returns

  • arr : numpy.matrix, 2-D A NumPy matrix object with the same shape and containing the same data represented by the sparse matrix, with the requested memory order. If out was passed and was an array (rather than a numpy.matrix), it will be filled with the appropriate values and returned wrapped in a numpy.matrix object that shares the same memory.

todia

method todia
val todia :
  ?copy:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Convert this matrix to sparse DIAgonal format.

With copy=False, the data/indices may be shared between this matrix and the resultant dia_matrix.

todok

method todok
val todok :
  ?copy:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Convert this matrix to Dictionary Of Keys format.

With copy=False, the data/indices may be shared between this matrix and the resultant dok_matrix.

tolil

method tolil
val tolil :
  ?copy:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Convert this matrix to List of Lists format.

With copy=False, the data/indices may be shared between this matrix and the resultant lil_matrix.

transpose

method transpose
val transpose :
  ?axes:Py.Object.t ->
  ?copy:bool ->
  [> tag] Obj.t ->
  Py.Object.t

Reverses the dimensions of the sparse matrix.

Parameters

  • axes : None, optional This argument is in the signature solely for NumPy compatibility reasons. Do not pass in anything except for the default value.

  • copy : bool, optional Indicates whether or not attributes of self should be copied whenever possible. The degree to which attributes are copied varies depending on the type of sparse matrix being used.

Returns

  • p : self with the dimensions reversed.

See Also

  • numpy.matrix.transpose : NumPy's implementation of 'transpose' for matrices

dtype

attribute dtype
val dtype : t -> Np.Dtype.t
val dtype_opt : t -> (Np.Dtype.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.

shape

attribute shape
val shape : t -> int list
val shape_opt : t -> (int list) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.

ndim

attribute ndim
val ndim : t -> int
val ndim_opt : t -> (int) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.

nnz

attribute nnz
val nnz : t -> Py.Object.t
val nnz_opt : t -> (Py.Object.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.

data

attribute data
val data : t -> Py.Object.t
val data_opt : t -> (Py.Object.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.

rows

attribute rows
val rows : t -> Py.Object.t
val rows_opt : t -> (Py.Object.t) option

This attribute is documented in create above. The first version raises Not_found if the attribute is None. The _opt version returns an option.

to_string

method to_string
val to_string: t -> string

Print the object to a human-readable representation.

show

method show
val show: t -> string

Print the object to a human-readable representation.

pp

method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.

Spmatrix

Module Sklearn.​Utils.​Multiclass.​Spmatrix wraps Python class sklearn.utils.multiclass.spmatrix.

type t

create

constructor and attributes create
val create :
  ?maxprint:Py.Object.t ->
  unit ->
  t

This class provides a base class for all sparse matrices. It cannot be instantiated. Most of the work is provided by subclasses.

iter

method iter
val iter :
  [> tag] Obj.t ->
  Dict.t Seq.t

asformat

method asformat
val asformat :
  ?copy:bool ->
  format:[`S of string | `None] ->
  [> tag] Obj.t ->
  Py.Object.t

Return this matrix in the passed format.

Parameters

  • format : {str, None} The desired matrix format ('csr', 'csc', 'lil', 'dok', 'array', ...) or None for no conversion.

  • copy : bool, optional If True, the result is guaranteed to not share data with self.

Returns

  • A : This matrix in the passed format.

asfptype

method asfptype
val asfptype :
  [> tag] Obj.t ->
  Py.Object.t

Upcast matrix to a floating point format (if necessary)

astype

method astype
val astype :
  ?casting:[`No | `Equiv | `Safe | `Same_kind | `Unsafe] ->
  ?copy:bool ->
  dtype:[`S of string | `Dtype of Np.Dtype.t] ->
  [> tag] Obj.t ->
  Py.Object.t

Cast the matrix elements to a specified type.

Parameters

  • dtype : string or numpy dtype Typecode or data-type to which to cast the data.

  • casting : {'no', 'equiv', 'safe', 'same_kind', 'unsafe'}, optional Controls what kind of data casting may occur. Defaults to 'unsafe' for backwards compatibility. 'no' means the data types should not be cast at all. 'equiv' means only byte-order changes are allowed. 'safe' means only casts which can preserve values are allowed. 'same_kind' means only safe casts or casts within a kind, like float64 to float32, are allowed. 'unsafe' means any data conversions may be done.

  • copy : bool, optional If copy is False, the result might share some memory with this matrix. If copy is True, it is guaranteed that the result and this matrix do not share any memory.

conj

method conj
val conj :
  ?copy:bool ->
  [> tag] Obj.t ->
  Py.Object.t

Element-wise complex conjugation.

If the matrix is of non-complex data type and copy is False, this method does nothing and the data is not copied.

Parameters

  • copy : bool, optional If True, the result is guaranteed to not share data with self.

Returns

  • A : The element-wise complex conjugate.

conjugate

method conjugate
val conjugate :
  ?copy:bool ->
  [> tag] Obj.t ->
  Py.Object.t

Element-wise complex conjugation.

If the matrix is of non-complex data type and copy is False, this method does nothing and the data is not copied.

Parameters

  • copy : bool, optional If True, the result is guaranteed to not share data with self.

Returns

  • A : The element-wise complex conjugate.

copy

method copy
val copy :
  [> tag] Obj.t ->
  Py.Object.t

Returns a copy of this matrix.

No data/indices will be shared between the returned value and current matrix.

count_nonzero

method count_nonzero
val count_nonzero :
  [> tag] Obj.t ->
  Py.Object.t

Number of non-zero entries, equivalent to

np.count_nonzero(a.toarray())

Unlike getnnz() and the nnz property, which return the number of stored entries (the length of the data attribute), this method counts the actual number of non-zero entries in data.

diagonal

method diagonal
val diagonal :
  ?k:int ->
  [> tag] Obj.t ->
  Py.Object.t

Returns the kth diagonal of the matrix.

Parameters

  • k : int, optional Which diagonal to get, corresponding to elements a[i, i+k].

  • Default: 0 (the main diagonal).

    .. versionadded:: 1.0

See also

  • numpy.diagonal : Equivalent numpy function.

Examples

>>> from scipy.sparse import csr_matrix
>>> A = csr_matrix([[1, 2, 0], [0, 0, 3], [4, 0, 5]])
>>> A.diagonal()
array([1, 0, 5])
>>> A.diagonal(k=1)
array([2, 3])

dot

method dot
val dot :
  other:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Ordinary dot product

Examples

>>> import numpy as np
>>> from scipy.sparse import csr_matrix
>>> A = csr_matrix([[1, 2, 0], [0, 0, 3], [4, 0, 5]])
>>> v = np.array([1, 0, -1])
>>> A.dot(v)
array([ 1, -3, -1], dtype=int64)

getH

method getH
val getH :
  [> tag] Obj.t ->
  Py.Object.t

Return the Hermitian transpose of this matrix.

See Also

  • numpy.matrix.getH : NumPy's implementation of getH for matrices

get_shape

method get_shape
val get_shape :
  [> tag] Obj.t ->
  Py.Object.t

Get shape of a matrix.

getcol

method getcol
val getcol :
  j:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Returns a copy of column j of the matrix, as an (m x 1) sparse matrix (column vector).

getformat

method getformat
val getformat :
  [> tag] Obj.t ->
  Py.Object.t

Format of a matrix representation as a string.

getmaxprint

method getmaxprint
val getmaxprint :
  [> tag] Obj.t ->
  Py.Object.t

Maximum number of elements to display when printed.

getnnz

method getnnz
val getnnz :
  ?axis:[`Zero | `One] ->
  [> tag] Obj.t ->
  Py.Object.t

Number of stored values, including explicit zeros.

Parameters

  • axis : None, 0, or 1 Select between the number of values across the whole matrix, in each column, or in each row.

See also

  • count_nonzero : Number of non-zero entries

getrow

method getrow
val getrow :
  i:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Returns a copy of row i of the matrix, as a (1 x n) sparse matrix (row vector).

maximum

method maximum
val maximum :
  other:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Element-wise maximum between this and another matrix.

mean

method mean
val mean :
  ?axis:[`One | `Zero | `PyObject of Py.Object.t] ->
  ?dtype:Np.Dtype.t ->
  ?out:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Compute the arithmetic mean along the specified axis.

Returns the average of the matrix elements. The average is taken over all elements in the matrix by default, otherwise over the specified axis. float64 intermediate and return values are used for integer inputs.

Parameters

  • axis : {-2, -1, 0, 1, None} optional Axis along which the mean is computed. The default is to compute the mean of all elements in the matrix (i.e., axis = None).

  • dtype : data-type, optional Type to use in computing the mean. For integer inputs, the default is float64; for floating point inputs, it is the same as the input dtype.

    .. versionadded:: 0.18.0

  • out : np.matrix, optional Alternative output matrix in which to place the result. It must have the same shape as the expected output, but the type of the output values will be cast if necessary.

    .. versionadded:: 0.18.0

Returns

  • m : np.matrix

See Also

  • numpy.matrix.mean : NumPy's implementation of 'mean' for matrices

minimum

method minimum
val minimum :
  other:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Element-wise minimum between this and another matrix.

multiply

method multiply
val multiply :
  other:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Point-wise multiplication by another matrix

nonzero

method nonzero
val nonzero :
  [> tag] Obj.t ->
  Py.Object.t

nonzero indices

Returns a tuple of arrays (row,col) containing the indices of the non-zero elements of the matrix.

Examples

>>> from scipy.sparse import csr_matrix
>>> A = csr_matrix([[1,2,0],[0,0,3],[4,0,5]])
>>> A.nonzero()
(array([0, 0, 1, 2, 2]), array([0, 1, 2, 0, 2]))

power

method power
val power :
  ?dtype:Py.Object.t ->
  n:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Element-wise power.

reshape

method reshape
val reshape :
  ?kwargs:(string * Py.Object.t) list ->
  Py.Object.t list ->
  [> tag] Obj.t ->
  [`ArrayLike|`Object|`Spmatrix] Np.Obj.t

reshape(self, shape, order='C', copy=False)

Gives a new shape to a sparse matrix without changing its data.

Parameters

  • shape : length-2 tuple of ints The new shape should be compatible with the original shape.

  • order : {'C', 'F'}, optional Read the elements using this index order. 'C' means to read and write the elements using C-like index order; e.g., read entire first row, then second row, etc. 'F' means to read and write the elements using Fortran-like index order; e.g., read entire first column, then second column, etc.

  • copy : bool, optional Indicates whether or not attributes of self should be copied whenever possible. The degree to which attributes are copied varies depending on the type of sparse matrix being used.

Returns

  • reshaped_matrix : sparse matrix A sparse matrix with the given shape, not necessarily of the same format as the current object.

See Also

  • numpy.matrix.reshape : NumPy's implementation of 'reshape' for matrices

resize

method resize
val resize :
  shape:int list ->
  [> tag] Obj.t ->
  Py.Object.t

Resize the matrix in-place to dimensions given by shape

Any elements that lie within the new shape will remain at the same indices, while non-zero elements lying outside the new shape are removed.

Parameters

  • shape : (int, int) number of rows and columns in the new matrix

Notes

The semantics are not identical to numpy.ndarray.resize or numpy.resize. Here, the same data will be maintained at each index before and after reshape, if that index is within the new bounds. In numpy, resizing maintains contiguity of the array, moving elements around in the logical matrix but not within a flattened representation.

We give no guarantees about whether the underlying data attributes (arrays, etc.) will be modified in place or replaced with new objects.

set_shape

method set_shape
val set_shape :
  shape:int list ->
  [> tag] Obj.t ->
  Py.Object.t

See reshape.

setdiag

method setdiag
val setdiag :
  ?k:int ->
  values:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  Py.Object.t

Set diagonal or off-diagonal elements of the array.

Parameters

  • values : array_like New values of the diagonal elements.

    Values may have any length. If the diagonal is longer than values, then the remaining diagonal entries will not be set. If values if longer than the diagonal, then the remaining values are ignored.

    If a scalar value is given, all of the diagonal is set to it.

  • k : int, optional Which off-diagonal to set, corresponding to elements a[i,i+k].

  • Default: 0 (the main diagonal).

sum

method sum
val sum :
  ?axis:[`One | `Zero | `PyObject of Py.Object.t] ->
  ?dtype:Np.Dtype.t ->
  ?out:[>`ArrayLike] Np.Obj.t ->
  [> tag] Obj.t ->
  [>`ArrayLike] Np.Obj.t

Sum the matrix elements over a given axis.

Parameters

  • axis : {-2, -1, 0, 1, None} optional Axis along which the sum is computed. The default is to compute the sum of all the matrix elements, returning a scalar (i.e., axis = None).

  • dtype : dtype, optional The type of the returned matrix and of the accumulator in which the elements are summed. The dtype of a is used by default unless a has an integer dtype of less precision than the default platform integer. In that case, if a is signed then the platform integer is used while if a is unsigned then an unsigned integer of the same precision as the platform integer is used.

    .. versionadded:: 0.18.0

  • out : np.matrix, optional Alternative output matrix in which to place the result. It must have the same shape as the expected output, but the type of the output values will be cast if necessary.

    .. versionadded:: 0.18.0

Returns

  • sum_along_axis : np.matrix A matrix with the same shape as self, with the specified axis removed.

See Also

  • numpy.matrix.sum : NumPy's implementation of 'sum' for matrices

toarray

method toarray
val toarray :
  ?order:[`F | `C] ->
  ?out:[`Arr of [>`ArrayLike] Np.Obj.t | `T2_D of Py.Object.t] ->
  [> tag] Obj.t ->
  Py.Object.t

Return a dense ndarray representation of this matrix.

Parameters

  • order : {'C', 'F'}, optional Whether to store multidimensional data in C (row-major) or Fortran (column-major) order in memory. The default is 'None', indicating the NumPy default of C-ordered. Cannot be specified in conjunction with the out argument.

  • out : ndarray, 2-D, optional If specified, uses this array as the output buffer instead of allocating a new array to return. The provided array must have the same shape and dtype as the sparse matrix on which you are calling the method. For most sparse types, out is required to be memory contiguous (either C or Fortran ordered).

Returns

  • arr : ndarray, 2-D An array with the same shape and containing the same data represented by the sparse matrix, with the requested memory order. If out was passed, the same object is returned after being modified in-place to contain the appropriate values.

tobsr

method tobsr
val tobsr :
  ?blocksize:Py.Object.t ->
  ?copy:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Convert this matrix to Block Sparse Row format.

With copy=False, the data/indices may be shared between this matrix and the resultant bsr_matrix.

When blocksize=(R, C) is provided, it will be used for construction of the bsr_matrix.

tocoo

method tocoo
val tocoo :
  ?copy:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Convert this matrix to COOrdinate format.

With copy=False, the data/indices may be shared between this matrix and the resultant coo_matrix.

tocsc

method tocsc
val tocsc :
  ?copy:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Convert this matrix to Compressed Sparse Column format.

With copy=False, the data/indices may be shared between this matrix and the resultant csc_matrix.

tocsr

method tocsr
val tocsr :
  ?copy:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Convert this matrix to Compressed Sparse Row format.

With copy=False, the data/indices may be shared between this matrix and the resultant csr_matrix.

todense

method todense
val todense :
  ?order:[`F | `C] ->
  ?out:[`Arr of [>`ArrayLike] Np.Obj.t | `T2_D of Py.Object.t] ->
  [> tag] Obj.t ->
  Py.Object.t

Return a dense matrix representation of this matrix.

Parameters

  • order : {'C', 'F'}, optional Whether to store multi-dimensional data in C (row-major) or Fortran (column-major) order in memory. The default is 'None', indicating the NumPy default of C-ordered. Cannot be specified in conjunction with the out argument.

  • out : ndarray, 2-D, optional If specified, uses this array (or numpy.matrix) as the output buffer instead of allocating a new array to return. The provided array must have the same shape and dtype as the sparse matrix on which you are calling the method.

Returns

  • arr : numpy.matrix, 2-D A NumPy matrix object with the same shape and containing the same data represented by the sparse matrix, with the requested memory order. If out was passed and was an array (rather than a numpy.matrix), it will be filled with the appropriate values and returned wrapped in a numpy.matrix object that shares the same memory.

todia

method todia
val todia :
  ?copy:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Convert this matrix to sparse DIAgonal format.

With copy=False, the data/indices may be shared between this matrix and the resultant dia_matrix.

todok

method todok
val todok :
  ?copy:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Convert this matrix to Dictionary Of Keys format.

With copy=False, the data/indices may be shared between this matrix and the resultant dok_matrix.

tolil

method tolil
val tolil :
  ?copy:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Convert this matrix to List of Lists format.

With copy=False, the data/indices may be shared between this matrix and the resultant lil_matrix.

transpose

method transpose
val transpose :
  ?axes:Py.Object.t ->
  ?copy:bool ->
  [> tag] Obj.t ->
  Py.Object.t

Reverses the dimensions of the sparse matrix.

Parameters

  • axes : None, optional This argument is in the signature solely for NumPy compatibility reasons. Do not pass in anything except for the default value.

  • copy : bool, optional Indicates whether or not attributes of self should be copied whenever possible. The degree to which attributes are copied varies depending on the type of sparse matrix being used.

Returns

  • p : self with the dimensions reversed.

See Also

  • numpy.matrix.transpose : NumPy's implementation of 'transpose' for matrices

to_string

method to_string
val to_string: t -> string

Print the object to a human-readable representation.

show

method show
val show: t -> string

Print the object to a human-readable representation.

pp

method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.

check_array

function check_array
val check_array :
  ?accept_sparse:[`S of string | `StringList of string list | `Bool of bool] ->
  ?accept_large_sparse:bool ->
  ?dtype:[`Dtypes of Np.Dtype.t list | `S of string | `Dtype of Np.Dtype.t | `None] ->
  ?order:[`F | `C] ->
  ?copy:bool ->
  ?force_all_finite:[`Allow_nan | `Bool of bool] ->
  ?ensure_2d:bool ->
  ?allow_nd:bool ->
  ?ensure_min_samples:int ->
  ?ensure_min_features:int ->
  ?estimator:[>`BaseEstimator] Np.Obj.t ->
  array:Py.Object.t ->
  unit ->
  Py.Object.t

Input validation on an array, list, sparse matrix or similar.

By default, the input is checked to be a non-empty 2D array containing only finite values. If the dtype of the array is object, attempt converting to float, raising on failure.

Parameters

  • array : object Input object to check / convert.

  • accept_sparse : string, boolean or list/tuple of strings (default=False) String[s] representing allowed sparse matrix formats, such as 'csc', 'csr', etc. If the input is sparse but not in the allowed format, it will be converted to the first listed format. True allows the input to be any format. False means that a sparse matrix input will raise an error.

  • accept_large_sparse : bool (default=True) If a CSR, CSC, COO or BSR sparse matrix is supplied and accepted by accept_sparse, accept_large_sparse=False will cause it to be accepted only if its indices are stored with a 32-bit dtype.

    .. versionadded:: 0.20

  • dtype : string, type, list of types or None (default='numeric') Data type of result. If None, the dtype of the input is preserved. If 'numeric', dtype is preserved unless array.dtype is object. If dtype is a list of types, conversion on the first type is only performed if the dtype of the input is not in the list.

  • order : 'F', 'C' or None (default=None) Whether an array will be forced to be fortran or c-style. When order is None (default), then if copy=False, nothing is ensured about the memory layout of the output array; otherwise (copy=True) the memory layout of the returned array is kept as close as possible to the original array.

  • copy : boolean (default=False) Whether a forced copy will be triggered. If copy=False, a copy might be triggered by a conversion.

  • force_all_finite : boolean or 'allow-nan', (default=True) Whether to raise an error on np.inf, np.nan, pd.NA in array. The possibilities are:

    • True: Force all values of array to be finite.
    • False: accepts np.inf, np.nan, pd.NA in array.
    • 'allow-nan': accepts only np.nan and pd.NA values in array. Values cannot be infinite.

    .. versionadded:: 0.20 force_all_finite accepts the string 'allow-nan'.

    .. versionchanged:: 0.23 Accepts pd.NA and converts it into np.nan

  • ensure_2d : boolean (default=True) Whether to raise a value error if array is not 2D.

  • allow_nd : boolean (default=False) Whether to allow array.ndim > 2.

  • ensure_min_samples : int (default=1) Make sure that the array has a minimum number of samples in its first axis (rows for a 2D array). Setting to 0 disables this check.

  • ensure_min_features : int (default=1) Make sure that the 2D array has some minimum number of features (columns). The default value of 1 rejects empty datasets. This check is only enforced when the input data has effectively 2 dimensions or is originally 1D and ensure_2d is True. Setting to 0 disables this check.

  • estimator : str or estimator instance (default=None) If passed, include the name of the estimator in warning messages.

Returns

  • array_converted : object The converted and validated array.

check_classification_targets

function check_classification_targets
val check_classification_targets :
  [>`ArrayLike] Np.Obj.t ->
  Py.Object.t

Ensure that target y is of a non-regression type.

Only the following target types (as defined in type_of_target) are allowed: 'binary', 'multiclass', 'multiclass-multioutput', 'multilabel-indicator', 'multilabel-sequences'

Parameters

  • y : array-like

class_distribution

function class_distribution
val class_distribution :
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  y:[`Arr of [>`ArrayLike] Np.Obj.t | `Sparse_matrix_of_size of Py.Object.t] ->
  unit ->
  (Py.Object.t * Py.Object.t * Py.Object.t)

Compute class priors from multioutput-multiclass target data

Parameters

  • y : array like or sparse matrix of size (n_samples, n_outputs) The labels for each example.

  • sample_weight : array-like of shape (n_samples,), default=None Sample weights.

Returns

  • classes : list of size n_outputs of arrays of size (n_classes,) List of classes for each column.

  • n_classes : list of integers of size n_outputs Number of classes in each column

  • class_prior : list of size n_outputs of arrays of size (n_classes,) Class distribution of each column.

is_multilabel

function is_multilabel
val is_multilabel :
  [>`ArrayLike] Np.Obj.t ->
  bool

Check if y is in a multilabel format.

Parameters

  • y : numpy array of shape [n_samples] Target values.

Returns

  • out : bool, Return True, if y is in a multilabel format, else `False.

Examples

>>> import numpy as np
>>> from sklearn.utils.multiclass import is_multilabel
>>> is_multilabel([0, 1, 0, 1])
False
>>> is_multilabel([[1], [0, 2], []])
False
>>> is_multilabel(np.array([[1, 0], [0, 0]]))
True
>>> is_multilabel(np.array([[1], [0], [0]]))
False
>>> is_multilabel(np.array([[1, 0, 0]]))
True

issparse

function issparse
val issparse :
  Py.Object.t ->
  Py.Object.t

Is x of a sparse matrix type?

Parameters

x object to check for being a sparse matrix

Returns

bool True if x is a sparse matrix, False otherwise

Notes

issparse and isspmatrix are aliases for the same function.

Examples

>>> from scipy.sparse import csr_matrix, isspmatrix
>>> isspmatrix(csr_matrix([[5]]))
True
>>> from scipy.sparse import isspmatrix
>>> isspmatrix(5)
False

type_of_target

function type_of_target
val type_of_target :
  [>`ArrayLike] Np.Obj.t ->
  string

Determine the type of data indicated by the target.

Note that this type is the most specific type that can be inferred. For example:

* ``binary`` is more specific but compatible with ``multiclass``.
* ``multiclass`` of integers is more specific but compatible with
  ``continuous``.
* ``multilabel-indicator`` is more specific but compatible with
  ``multiclass-multioutput``.

Parameters

  • y : array-like

Returns

  • target_type : string One of:

    • 'continuous': y is an array-like of floats that are not all integers, and is 1d or a column vector.
    • 'continuous-multioutput': y is a 2d array of floats that are not all integers, and both dimensions are of size > 1.
    • 'binary': y contains <= 2 discrete values and is 1d or a column vector.
    • 'multiclass': y contains more than two discrete values, is not a sequence of sequences, and is 1d or a column vector.
    • 'multiclass-multioutput': y is a 2d array that contains more than two discrete values, is not a sequence of sequences, and both dimensions are of size > 1.
    • 'multilabel-indicator': y is a label indicator matrix, an array of two dimensions with at least two columns, and at most 2 unique values.
    • 'unknown': y is array-like but none of the above, such as a 3d array, sequence of sequences, or an array of non-sequence objects.

Examples

>>> import numpy as np
>>> type_of_target([0.1, 0.6])
'continuous'
>>> type_of_target([1, -1, -1, 1])
'binary'
>>> type_of_target(['a', 'b', 'a'])
'binary'
>>> type_of_target([1.0, 2.0])
'binary'
>>> type_of_target([1, 0, 2])
'multiclass'
>>> type_of_target([1.0, 0.0, 3.0])
'multiclass'
>>> type_of_target(['a', 'b', 'c'])
'multiclass'
>>> type_of_target(np.array([[1, 2], [3, 1]]))
'multiclass-multioutput'
>>> type_of_target([[1, 2]])
'multilabel-indicator'
>>> type_of_target(np.array([[1.5, 2.0], [3.0, 1.6]]))
'continuous-multioutput'
>>> type_of_target(np.array([[0, 1], [1, 1]]))
'multilabel-indicator'

unique_labels

function unique_labels
val unique_labels :
  Py.Object.t list ->
  [>`ArrayLike] Np.Obj.t

Extract an ordered array of unique labels

We don't allow: - mix of multilabel and multiclass (single label) targets - mix of label indicator matrix and anything else, because there are no explicit labels) - mix of label indicator matrices of different sizes - mix of string and integer labels

At the moment, we also don't allow 'multiclass-multioutput' input type.

Parameters

  • *ys : array-likes

Returns

  • out : numpy array of shape [n_unique_labels] An ordered array of unique labels.

Examples

>>> from sklearn.utils.multiclass import unique_labels
>>> unique_labels([3, 5, 5, 5, 7, 7])
array([3, 5, 7])
>>> unique_labels([1, 2, 3, 4], [2, 2, 3, 4])
array([1, 2, 3, 4])
>>> unique_labels([1, 2, 10], [5, 11])
array([ 1,  2,  5, 10, 11])

Murmurhash

Module Sklearn.​Utils.​Murmurhash wraps Python module sklearn.utils.murmurhash.

Optimize

Module Sklearn.​Utils.​Optimize wraps Python module sklearn.utils.optimize.

line_search_wolfe1

function line_search_wolfe1
val line_search_wolfe1 :
  ?gfk:[>`ArrayLike] Np.Obj.t ->
  ?old_fval:float ->
  ?old_old_fval:float ->
  ?args:Py.Object.t ->
  ?c1:Py.Object.t ->
  ?c2:Py.Object.t ->
  ?amax:Py.Object.t ->
  ?amin:Py.Object.t ->
  ?xtol:Py.Object.t ->
  f:Py.Object.t ->
  fprime:Py.Object.t ->
  xk:[>`ArrayLike] Np.Obj.t ->
  pk:[>`ArrayLike] Np.Obj.t ->
  unit ->
  [>`ArrayLike] Np.Obj.t

As scalar_search_wolfe1 but do a line search to direction pk

Parameters

  • f : callable Function f(x)

  • fprime : callable Gradient of f

  • xk : array_like Current point

  • pk : array_like Search direction

  • gfk : array_like, optional Gradient of f at point xk

  • old_fval : float, optional Value of f at point xk

  • old_old_fval : float, optional Value of f at point preceding xk

The rest of the parameters are the same as for scalar_search_wolfe1.

Returns

stp, f_count, g_count, fval, old_fval As in line_search_wolfe1

  • gval : array Gradient of f at the final point

line_search_wolfe2

function line_search_wolfe2
val line_search_wolfe2 :
  ?gfk:[>`ArrayLike] Np.Obj.t ->
  ?old_fval:float ->
  ?old_old_fval:float ->
  ?args:Py.Object.t ->
  ?c1:float ->
  ?c2:float ->
  ?amax:float ->
  ?extra_condition:Py.Object.t ->
  ?maxiter:int ->
  f:Py.Object.t ->
  myfprime:Py.Object.t ->
  xk:[>`ArrayLike] Np.Obj.t ->
  pk:[>`ArrayLike] Np.Obj.t ->
  unit ->
  (float option * int * int * float option * float * float option)

Find alpha that satisfies strong Wolfe conditions.

Parameters

  • f : callable f(x,*args) Objective function.

  • myfprime : callable f'(x,*args) Objective function gradient.

  • xk : ndarray Starting point.

  • pk : ndarray Search direction.

  • gfk : ndarray, optional Gradient value for x=xk (xk being the current parameter estimate). Will be recomputed if omitted.

  • old_fval : float, optional Function value for x=xk. Will be recomputed if omitted.

  • old_old_fval : float, optional Function value for the point preceding x=xk.

  • args : tuple, optional Additional arguments passed to objective function.

  • c1 : float, optional Parameter for Armijo condition rule.

  • c2 : float, optional Parameter for curvature condition rule.

  • amax : float, optional Maximum step size

  • extra_condition : callable, optional A callable of the form extra_condition(alpha, x, f, g) returning a boolean. Arguments are the proposed step alpha and the corresponding x, f and g values. The line search accepts the value of alpha only if this callable returns True. If the callable returns False for the step length, the algorithm will continue with new iterates. The callable is only called for iterates satisfying the strong Wolfe conditions.

  • maxiter : int, optional Maximum number of iterations to perform.

Returns

  • alpha : float or None Alpha for which x_new = x0 + alpha * pk, or None if the line search algorithm did not converge.

  • fc : int Number of function evaluations made.

  • gc : int Number of gradient evaluations made.

  • new_fval : float or None New function value f(x_new)=f(x0+alpha*pk), or None if the line search algorithm did not converge.

  • old_fval : float Old function value f(x0).

  • new_slope : float or None The local slope along the search direction at the new value <myfprime(x_new), pk>, or None if the line search algorithm did not converge.

Notes

Uses the line search algorithm to enforce strong Wolfe conditions. See Wright and Nocedal, 'Numerical Optimization', 1999, pp. 59-61.

Examples

>>> from scipy.optimize import line_search

A objective function and its gradient are defined.

>>> def obj_func(x):
...     return (x[0])**2+(x[1])**2
>>> def obj_grad(x):
...     return [2*x[0], 2*x[1]]

We can find alpha that satisfies strong Wolfe conditions.

>>> start_point = np.array([1.8, 1.7])
>>> search_gradient = np.array([-1.0, -1.0])
>>> line_search(obj_func, obj_grad, start_point, search_gradient)
(1.0, 2, 1, 1.1300000000000001, 6.13, [1.6, 1.4])

newton_cg

function newton_cg
val newton_cg :
  ?args:Py.Object.t ->
  ?tol:Py.Object.t ->
  ?maxiter:Py.Object.t ->
  ?maxinner:Py.Object.t ->
  ?line_search:Py.Object.t ->
  ?warn:Py.Object.t ->
  grad_hess:Py.Object.t ->
  func:Py.Object.t ->
  grad:Py.Object.t ->
  x0:Py.Object.t ->
  unit ->
  Py.Object.t
  • DEPRECATED: newton_cg is deprecated in version 0.22 and will be removed in version 0.24.

Random

Module Sklearn.​Utils.​Random wraps Python module sklearn.utils.random.

check_random_state

function check_random_state
val check_random_state :
  [`Optional of [`I of int | `None] | `RandomState of Py.Object.t] ->
  Py.Object.t

Turn seed into a np.random.RandomState instance

Parameters

  • seed : None | int | instance of RandomState If seed is None, return the RandomState singleton used by np.random. If seed is an int, return a new RandomState instance seeded with seed. If seed is already a RandomState instance, return it. Otherwise raise ValueError.

random_choice_csc

function random_choice_csc
val random_choice_csc :
  ?class_probability:Py.Object.t ->
  ?random_state:int ->
  n_samples:Py.Object.t ->
  classes:Py.Object.t ->
  unit ->
  Py.Object.t
  • DEPRECATED: random_choice_csc is deprecated in version 0.22 and will be removed in version 0.24.

Sparsefuncs

Module Sklearn.​Utils.​Sparsefuncs wraps Python module sklearn.utils.sparsefuncs.

count_nonzero

function count_nonzero
val count_nonzero :
  ?axis:[`Zero | `One] ->
  ?sample_weight:[>`ArrayLike] Np.Obj.t ->
  x:[>`Csr_matrix] Np.Obj.t ->
  unit ->
  Py.Object.t

A variant of X.getnnz() with extension to weighting on axis 0

Useful in efficiently calculating multilabel metrics.

Parameters

  • X : CSR sparse matrix of shape (n_samples, n_labels) Input data.

  • axis : None, 0 or 1 The axis on which the data is aggregated.

  • sample_weight : array-like of shape (n_samples,), default=None Weight for each row of X.

csc_median_axis_0

function csc_median_axis_0
val csc_median_axis_0 :
  [>`Csc_matrix] Np.Obj.t ->
  [>`ArrayLike] Np.Obj.t

Find the median across axis 0 of a CSC matrix. It is equivalent to doing np.median(X, axis=0).

Parameters

  • X : CSC sparse matrix, shape (n_samples, n_features) Input data.

Returns

  • median : ndarray, shape (n_features,) Median.

incr_mean_variance_axis

function incr_mean_variance_axis
val incr_mean_variance_axis :
  x:[`Csr_matrix of [>`Csr_matrix] Np.Obj.t | `Csc_matrix of [>`Csc_matrix] Np.Obj.t] ->
  axis:int ->
  last_mean:Py.Object.t ->
  last_var:Py.Object.t ->
  last_n:int ->
  unit ->
  (Py.Object.t * Py.Object.t * int)

Compute incremental mean and variance along an axix on a CSR or CSC matrix.

last_mean, last_var are the statistics computed at the last step by this function. Both must be initialized to 0-arrays of the proper size, i.e. the number of features in X. last_n is the number of samples encountered until now.

Parameters

  • X : CSR or CSC sparse matrix, shape (n_samples, n_features) Input data.

  • axis : int (either 0 or 1) Axis along which the axis should be computed.

  • last_mean : float array with shape (n_features,) Array of feature-wise means to update with the new data X.

  • last_var : float array with shape (n_features,) Array of feature-wise var to update with the new data X.

  • last_n : int with shape (n_features,) Number of samples seen so far, excluded X.

Returns

  • means : float array with shape (n_features,) Updated feature-wise means.

  • variances : float array with shape (n_features,) Updated feature-wise variances.

  • n : int with shape (n_features,) Updated number of seen samples.

Notes

NaNs are ignored in the algorithm.

inplace_column_scale

function inplace_column_scale
val inplace_column_scale :
  x:[`Csr_matrix of [>`Csr_matrix] Np.Obj.t | `Csc_matrix of [>`Csc_matrix] Np.Obj.t] ->
  scale:Py.Object.t ->
  unit ->
  Py.Object.t

Inplace column scaling of a CSC/CSR matrix.

Scale each feature of the data matrix by multiplying with specific scale provided by the caller assuming a (n_samples, n_features) shape.

Parameters

  • X : CSC or CSR matrix with shape (n_samples, n_features) Matrix to normalize using the variance of the features.

  • scale : float array with shape (n_features,) Array of precomputed feature-wise values to use for scaling.

inplace_csr_column_scale

function inplace_csr_column_scale
val inplace_csr_column_scale :
  x:[>`Csr_matrix] Np.Obj.t ->
  scale:Py.Object.t ->
  unit ->
  Py.Object.t

Inplace column scaling of a CSR matrix.

Scale each feature of the data matrix by multiplying with specific scale provided by the caller assuming a (n_samples, n_features) shape.

Parameters

  • X : CSR matrix with shape (n_samples, n_features) Matrix to normalize using the variance of the features.

  • scale : float array with shape (n_features,) Array of precomputed feature-wise values to use for scaling.

inplace_csr_row_scale

function inplace_csr_row_scale
val inplace_csr_row_scale :
  x:[>`Csr_matrix] Np.Obj.t ->
  scale:Py.Object.t ->
  unit ->
  Py.Object.t

Inplace row scaling of a CSR matrix.

Scale each sample of the data matrix by multiplying with specific scale provided by the caller assuming a (n_samples, n_features) shape.

Parameters

  • X : CSR sparse matrix, shape (n_samples, n_features) Matrix to be scaled.

  • scale : float array with shape (n_samples,) Array of precomputed sample-wise values to use for scaling.

inplace_row_scale

function inplace_row_scale
val inplace_row_scale :
  x:[`Csr_matrix of [>`Csr_matrix] Np.Obj.t | `Csc_matrix of [>`Csc_matrix] Np.Obj.t] ->
  scale:Py.Object.t ->
  unit ->
  Py.Object.t

Inplace row scaling of a CSR or CSC matrix.

Scale each row of the data matrix by multiplying with specific scale provided by the caller assuming a (n_samples, n_features) shape.

Parameters

  • X : CSR or CSC sparse matrix, shape (n_samples, n_features) Matrix to be scaled.

  • scale : float array with shape (n_features,) Array of precomputed sample-wise values to use for scaling.

inplace_swap_column

function inplace_swap_column
val inplace_swap_column :
  x:[`Csr_matrix of [>`Csr_matrix] Np.Obj.t | `Csc_matrix of [>`Csc_matrix] Np.Obj.t] ->
  m:int ->
  n:int ->
  unit ->
  Py.Object.t

Swaps two columns of a CSC/CSR matrix in-place.

Parameters

  • X : CSR or CSC sparse matrix, shape=(n_samples, n_features) Matrix whose two columns are to be swapped.

  • m : int Index of the column of X to be swapped.

  • n : int Index of the column of X to be swapped.

inplace_swap_row

function inplace_swap_row
val inplace_swap_row :
  x:[`Csr_matrix of [>`Csr_matrix] Np.Obj.t | `Csc_matrix of [>`Csc_matrix] Np.Obj.t] ->
  m:int ->
  n:int ->
  unit ->
  Py.Object.t

Swaps two rows of a CSC/CSR matrix in-place.

Parameters

  • X : CSR or CSC sparse matrix, shape=(n_samples, n_features) Matrix whose two rows are to be swapped.

  • m : int Index of the row of X to be swapped.

  • n : int Index of the row of X to be swapped.

inplace_swap_row_csc

function inplace_swap_row_csc
val inplace_swap_row_csc :
  x:Py.Object.t ->
  m:int ->
  n:int ->
  unit ->
  Py.Object.t

Swaps two rows of a CSC matrix in-place.

Parameters

  • X : scipy.sparse.csc_matrix, shape=(n_samples, n_features) Matrix whose two rows are to be swapped.

  • m : int Index of the row of X to be swapped.

  • n : int Index of the row of X to be swapped.

inplace_swap_row_csr

function inplace_swap_row_csr
val inplace_swap_row_csr :
  x:Py.Object.t ->
  m:int ->
  n:int ->
  unit ->
  Py.Object.t

Swaps two rows of a CSR matrix in-place.

Parameters

  • X : scipy.sparse.csr_matrix, shape=(n_samples, n_features) Matrix whose two rows are to be swapped.

  • m : int Index of the row of X to be swapped.

  • n : int Index of the row of X to be swapped.

mean_variance_axis

function mean_variance_axis
val mean_variance_axis :
  x:[`Csr_matrix of [>`Csr_matrix] Np.Obj.t | `Csc_matrix of [>`Csc_matrix] Np.Obj.t] ->
  axis:int ->
  unit ->
  (Py.Object.t * Py.Object.t)

Compute mean and variance along an axix on a CSR or CSC matrix

Parameters

  • X : CSR or CSC sparse matrix, shape (n_samples, n_features) Input data.

  • axis : int (either 0 or 1) Axis along which the axis should be computed.

Returns

  • means : float array with shape (n_features,) Feature-wise means

  • variances : float array with shape (n_features,) Feature-wise variances

min_max_axis

function min_max_axis
val min_max_axis :
  ?ignore_nan:bool ->
  x:[`Csr_matrix of [>`Csr_matrix] Np.Obj.t | `Csc_matrix of [>`Csc_matrix] Np.Obj.t] ->
  axis:int ->
  unit ->
  (Py.Object.t * Py.Object.t)

Compute minimum and maximum along an axis on a CSR or CSC matrix and optionally ignore NaN values.

Parameters

  • X : CSR or CSC sparse matrix, shape (n_samples, n_features) Input data.

  • axis : int (either 0 or 1) Axis along which the axis should be computed.

  • ignore_nan : bool, default is False Ignore or passing through NaN values.

    .. versionadded:: 0.20

Returns

  • mins : float array with shape (n_features,) Feature-wise minima

  • maxs : float array with shape (n_features,) Feature-wise maxima

Sparsefuncs_fast

Module Sklearn.​Utils.​Sparsefuncs_fast wraps Python module sklearn.utils.sparsefuncs_fast.

assign_rows_csr

function assign_rows_csr
val assign_rows_csr :
  x:Py.Object.t ->
  x_rows:Py.Object.t ->
  out_rows:Py.Object.t ->
  out:Py.Object.t ->
  unit ->
  Py.Object.t

Densify selected rows of a CSR matrix into a preallocated array.

Like out[out_rows] = X[X_rows].toarray() but without copying. No-copy supported for both dtype=np.float32 and dtype=np.float64.

Parameters

  • X : scipy.sparse.csr_matrix, shape=(n_samples, n_features)

  • X_rows : array, dtype=np.intp, shape=n_rows

  • out_rows : array, dtype=np.intp, shape=n_rows

  • out : array, shape=(arbitrary, n_features)

Stats

Module Sklearn.​Utils.​Stats wraps Python module sklearn.utils.stats.

stable_cumsum

function stable_cumsum
val stable_cumsum :
  ?axis:int ->
  ?rtol:float ->
  ?atol:float ->
  arr:[>`ArrayLike] Np.Obj.t ->
  unit ->
  Py.Object.t

Use high precision for cumsum and check that final value matches sum

Parameters

  • arr : array-like To be cumulatively summed as flat

  • axis : int, optional Axis along which the cumulative sum is computed. The default (None) is to compute the cumsum over the flattened array.

  • rtol : float Relative tolerance, see np.allclose

  • atol : float Absolute tolerance, see np.allclose

Validation

Module Sklearn.​Utils.​Validation wraps Python module sklearn.utils.validation.

ComplexWarning

Module Sklearn.​Utils.​Validation.​ComplexWarning wraps Python class sklearn.utils.validation.ComplexWarning.

type t

with_traceback

method with_traceback
val with_traceback :
  tb:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Exception.with_traceback(tb) -- set self.traceback to tb and return self.

to_string

method to_string
val to_string: t -> string

Print the object to a human-readable representation.

show

method show
val show: t -> string

Print the object to a human-readable representation.

pp

method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.

Parameter

Module Sklearn.​Utils.​Validation.​Parameter wraps Python class sklearn.utils.validation.Parameter.

type t

create

constructor and attributes create
val create :
  name:Py.Object.t ->
  kind:Py.Object.t ->
  default:Py.Object.t ->
  annotation:Py.Object.t ->
  unit ->
  t

Represents a parameter in a function signature.

Has the following public attributes:

  • name : str The name of the parameter as a string.
  • default : object The default value for the parameter if specified. If the parameter has no default value, this attribute is set to Parameter.empty.
  • annotation The annotation for the parameter if specified. If the parameter has no annotation, this attribute is set to Parameter.empty.
  • kind : str Describes how argument values are bound to the parameter. Possible values: Parameter.POSITIONAL_ONLY, Parameter.POSITIONAL_OR_KEYWORD, Parameter.VAR_POSITIONAL, Parameter.KEYWORD_ONLY, Parameter.VAR_KEYWORD.

replace

method replace
val replace :
  ?name:Py.Object.t ->
  ?kind:Py.Object.t ->
  ?annotation:Py.Object.t ->
  ?default:Py.Object.t ->
  [> tag] Obj.t ->
  Py.Object.t

Creates a customized copy of the Parameter.

to_string

method to_string
val to_string: t -> string

Print the object to a human-readable representation.

show

method show
val show: t -> string

Print the object to a human-readable representation.

pp

method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.

Suppress

Module Sklearn.​Utils.​Validation.​Suppress wraps Python class sklearn.utils.validation.suppress.

type t

create

constructor and attributes create
val create :
  Py.Object.t list ->
  t

Context manager to suppress specified exceptions

After the exception is suppressed, execution proceeds with the next statement following the with statement.

 with suppress(FileNotFoundError):
     os.remove(somefile)
 # Execution still resumes here if the file was already removed

to_string

method to_string
val to_string: t -> string

Print the object to a human-readable representation.

show

method show
val show: t -> string

Print the object to a human-readable representation.

pp

method pp
val pp: Format.formatter -> t -> unit

Pretty-print the object to a formatter.

as_float_array

function as_float_array
val as_float_array :
  ?copy:bool ->
  ?force_all_finite:[`Allow_nan | `Bool of bool] ->
  x:[>`ArrayLike] Np.Obj.t ->
  unit ->
  [>`ArrayLike] Np.Obj.t

Converts an array-like to an array of floats.

The new dtype will be np.float32 or np.float64, depending on the original type. The function can create a copy or modify the argument depending on the argument copy.

Parameters

  • X : {array-like, sparse matrix}

  • copy : bool, optional If True, a copy of X will be created. If False, a copy may still be returned if X's dtype is not a floating point type.

  • force_all_finite : boolean or 'allow-nan', (default=True) Whether to raise an error on np.inf, np.nan, pd.NA in X. The possibilities are:

    • True: Force all values of X to be finite.
    • False: accepts np.inf, np.nan, pd.NA in X.
    • 'allow-nan': accepts only np.nan and pd.NA values in X. Values cannot be infinite.

    .. versionadded:: 0.20 force_all_finite accepts the string 'allow-nan'.

    .. versionchanged:: 0.23 Accepts pd.NA and converts it into np.nan

Returns

  • XT : {array, sparse matrix} An array of type np.float

assert_all_finite

function assert_all_finite
val assert_all_finite :
  ?allow_nan:bool ->
  x:[>`ArrayLike] Np.Obj.t ->
  unit ->
  Py.Object.t

Throw a ValueError if X contains NaN or infinity.

Parameters

  • X : array or sparse matrix

  • allow_nan : bool

check_X_y

function check_X_y
val check_X_y :
  ?accept_sparse:[`S of string | `StringList of string list | `Bool of bool] ->
  ?accept_large_sparse:bool ->
  ?dtype:[`Dtypes of Np.Dtype.t list | `S of string | `Dtype of Np.Dtype.t | `None] ->
  ?order:[`F | `C] ->
  ?copy:bool ->
  ?force_all_finite:[`Allow_nan | `Bool of bool] ->
  ?ensure_2d:bool ->
  ?allow_nd:bool ->
  ?multi_output:bool ->
  ?ensure_min_samples:int ->
  ?ensure_min_features:int ->
  ?y_numeric:bool ->
  ?estimator:[>`BaseEstimator] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  unit ->
  (Py.Object.t * Py.Object.t)

Input validation for standard estimators.

Checks X and y for consistent length, enforces X to be 2D and y 1D. By default, X is checked to be non-empty and containing only finite values. Standard input checks are also applied to y, such as checking that y does not have np.nan or np.inf targets. For multi-label y, set multi_output=True to allow 2D and sparse y. If the dtype of X is object, attempt converting to float, raising on failure.

Parameters

  • X : nd-array, list or sparse matrix Input data.

  • y : nd-array, list or sparse matrix Labels.

  • accept_sparse : string, boolean or list of string (default=False) String[s] representing allowed sparse matrix formats, such as 'csc', 'csr', etc. If the input is sparse but not in the allowed format, it will be converted to the first listed format. True allows the input to be any format. False means that a sparse matrix input will raise an error.

  • accept_large_sparse : bool (default=True) If a CSR, CSC, COO or BSR sparse matrix is supplied and accepted by accept_sparse, accept_large_sparse will cause it to be accepted only if its indices are stored with a 32-bit dtype.

    .. versionadded:: 0.20

  • dtype : string, type, list of types or None (default='numeric') Data type of result. If None, the dtype of the input is preserved. If 'numeric', dtype is preserved unless array.dtype is object. If dtype is a list of types, conversion on the first type is only performed if the dtype of the input is not in the list.

  • order : 'F', 'C' or None (default=None) Whether an array will be forced to be fortran or c-style.

  • copy : boolean (default=False) Whether a forced copy will be triggered. If copy=False, a copy might be triggered by a conversion.

  • force_all_finite : boolean or 'allow-nan', (default=True) Whether to raise an error on np.inf, np.nan, pd.NA in X. This parameter does not influence whether y can have np.inf, np.nan, pd.NA values. The possibilities are:

    • True: Force all values of X to be finite.
    • False: accepts np.inf, np.nan, pd.NA in X.
    • 'allow-nan': accepts only np.nan or pd.NA values in X. Values cannot be infinite.

    .. versionadded:: 0.20 force_all_finite accepts the string 'allow-nan'.

    .. versionchanged:: 0.23 Accepts pd.NA and converts it into np.nan

  • ensure_2d : boolean (default=True) Whether to raise a value error if X is not 2D.

  • allow_nd : boolean (default=False) Whether to allow X.ndim > 2.

  • multi_output : boolean (default=False) Whether to allow 2D y (array or sparse matrix). If false, y will be validated as a vector. y cannot have np.nan or np.inf values if multi_output=True.

  • ensure_min_samples : int (default=1) Make sure that X has a minimum number of samples in its first axis (rows for a 2D array).

  • ensure_min_features : int (default=1) Make sure that the 2D array has some minimum number of features (columns). The default value of 1 rejects empty datasets. This check is only enforced when X has effectively 2 dimensions or is originally 1D and ensure_2d is True. Setting to 0 disables this check.

  • y_numeric : boolean (default=False) Whether to ensure that y has a numeric type. If dtype of y is object, it is converted to float64. Should only be used for regression algorithms.

  • estimator : str or estimator instance (default=None) If passed, include the name of the estimator in warning messages.

Returns

  • X_converted : object The converted and validated X.

  • y_converted : object The converted and validated y.

check_array

function check_array
val check_array :
  ?accept_sparse:[`S of string | `StringList of string list | `Bool of bool] ->
  ?accept_large_sparse:bool ->
  ?dtype:[`Dtypes of Np.Dtype.t list | `S of string | `Dtype of Np.Dtype.t | `None] ->
  ?order:[`F | `C] ->
  ?copy:bool ->
  ?force_all_finite:[`Allow_nan | `Bool of bool] ->
  ?ensure_2d:bool ->
  ?allow_nd:bool ->
  ?ensure_min_samples:int ->
  ?ensure_min_features:int ->
  ?estimator:[>`BaseEstimator] Np.Obj.t ->
  array:Py.Object.t ->
  unit ->
  Py.Object.t

Input validation on an array, list, sparse matrix or similar.

By default, the input is checked to be a non-empty 2D array containing only finite values. If the dtype of the array is object, attempt converting to float, raising on failure.

Parameters

  • array : object Input object to check / convert.

  • accept_sparse : string, boolean or list/tuple of strings (default=False) String[s] representing allowed sparse matrix formats, such as 'csc', 'csr', etc. If the input is sparse but not in the allowed format, it will be converted to the first listed format. True allows the input to be any format. False means that a sparse matrix input will raise an error.

  • accept_large_sparse : bool (default=True) If a CSR, CSC, COO or BSR sparse matrix is supplied and accepted by accept_sparse, accept_large_sparse=False will cause it to be accepted only if its indices are stored with a 32-bit dtype.

    .. versionadded:: 0.20

  • dtype : string, type, list of types or None (default='numeric') Data type of result. If None, the dtype of the input is preserved. If 'numeric', dtype is preserved unless array.dtype is object. If dtype is a list of types, conversion on the first type is only performed if the dtype of the input is not in the list.

  • order : 'F', 'C' or None (default=None) Whether an array will be forced to be fortran or c-style. When order is None (default), then if copy=False, nothing is ensured about the memory layout of the output array; otherwise (copy=True) the memory layout of the returned array is kept as close as possible to the original array.

  • copy : boolean (default=False) Whether a forced copy will be triggered. If copy=False, a copy might be triggered by a conversion.

  • force_all_finite : boolean or 'allow-nan', (default=True) Whether to raise an error on np.inf, np.nan, pd.NA in array. The possibilities are:

    • True: Force all values of array to be finite.
    • False: accepts np.inf, np.nan, pd.NA in array.
    • 'allow-nan': accepts only np.nan and pd.NA values in array. Values cannot be infinite.

    .. versionadded:: 0.20 force_all_finite accepts the string 'allow-nan'.

    .. versionchanged:: 0.23 Accepts pd.NA and converts it into np.nan

  • ensure_2d : boolean (default=True) Whether to raise a value error if array is not 2D.

  • allow_nd : boolean (default=False) Whether to allow array.ndim > 2.

  • ensure_min_samples : int (default=1) Make sure that the array has a minimum number of samples in its first axis (rows for a 2D array). Setting to 0 disables this check.

  • ensure_min_features : int (default=1) Make sure that the 2D array has some minimum number of features (columns). The default value of 1 rejects empty datasets. This check is only enforced when the input data has effectively 2 dimensions or is originally 1D and ensure_2d is True. Setting to 0 disables this check.

  • estimator : str or estimator instance (default=None) If passed, include the name of the estimator in warning messages.

Returns

  • array_converted : object The converted and validated array.

check_consistent_length

function check_consistent_length
val check_consistent_length :
  Py.Object.t list ->
  Py.Object.t

Check that all arrays have consistent first dimensions.

Checks whether all objects in arrays have the same shape or length.

Parameters

  • *arrays : list or tuple of input objects. Objects that will be checked for consistent length.

check_is_fitted

function check_is_fitted
val check_is_fitted :
  ?attributes:[`Arr of [>`ArrayLike] Np.Obj.t | `S of string | `StringList of string list] ->
  ?msg:string ->
  ?all_or_any:[`Callable of Py.Object.t | `PyObject of Py.Object.t] ->
  estimator:[>`BaseEstimator] Np.Obj.t ->
  unit ->
  Py.Object.t

Perform is_fitted validation for estimator.

Checks if the estimator is fitted by verifying the presence of fitted attributes (ending with a trailing underscore) and otherwise raises a NotFittedError with the given message.

This utility is meant to be used internally by estimators themselves, typically in their own predict / transform methods.

Parameters

  • estimator : estimator instance. estimator instance for which the check is performed.

  • attributes : str, list or tuple of str, default=None Attribute name(s) given as string or a list/tuple of strings

  • Eg.: ['coef_', 'estimator_', ...], 'coef_'

    If None, estimator is considered fitted if there exist an attribute that ends with a underscore and does not start with double underscore.

  • msg : string The default error message is, 'This %(name)s instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.'

    For custom messages if '%(name)s' is present in the message string, it is substituted for the estimator name.

  • Eg. : 'Estimator, %(name)s, must be fitted before sparsifying'.

  • all_or_any : callable, {all, any}, default all Specify whether all or any of the given attributes must exist.

Returns

None

Raises

NotFittedError If the attributes are not found.

check_memory

function check_memory
val check_memory :
  [`Object_with_the_joblib_Memory_interface of Py.Object.t | `S of string | `None] ->
  Py.Object.t

Check that memory is joblib.Memory-like.

joblib.Memory-like means that memory can be converted into a joblib.Memory instance (typically a str denoting the location) or has the same interface (has a cache method).

Parameters

  • memory : None, str or object with the joblib.Memory interface

Returns

  • memory : object with the joblib.Memory interface

Raises

ValueError If memory is not joblib.Memory-like.

check_non_negative

function check_non_negative
val check_non_negative :
  x:[>`ArrayLike] Np.Obj.t ->
  whom:string ->
  unit ->
  Py.Object.t

Check if there is any negative value in an array.

Parameters

  • X : array-like or sparse matrix Input data.

  • whom : string Who passed X to this function.

check_random_state

function check_random_state
val check_random_state :
  [`Optional of [`I of int | `None] | `RandomState of Py.Object.t] ->
  Py.Object.t

Turn seed into a np.random.RandomState instance

Parameters

  • seed : None | int | instance of RandomState If seed is None, return the RandomState singleton used by np.random. If seed is an int, return a new RandomState instance seeded with seed. If seed is already a RandomState instance, return it. Otherwise raise ValueError.

check_scalar

function check_scalar
val check_scalar :
  ?min_val:[`F of float | `I of int] ->
  ?max_val:[`F of float | `I of int] ->
  x:Py.Object.t ->
  name:string ->
  target_type:[`Tuple of Py.Object.t | `Dtype of Np.Dtype.t] ->
  unit ->
  Py.Object.t

Validate scalar parameters type and value.

Parameters

  • x : object The scalar parameter to validate.

  • name : str The name of the parameter to be printed in error messages.

  • target_type : type or tuple Acceptable data types for the parameter.

  • min_val : float or int, optional (default=None) The minimum valid value the parameter can take. If None (default) it is implied that the parameter does not have a lower bound.

  • max_val : float or int, optional (default=None) The maximum valid value the parameter can take. If None (default) it is implied that the parameter does not have an upper bound.

Raises

TypeError If the parameter's type does not match the desired type.

ValueError If the parameter's value violates the given bounds.

check_symmetric

function check_symmetric
val check_symmetric :
  ?tol:float ->
  ?raise_warning:bool ->
  ?raise_exception:bool ->
  array:[>`ArrayLike] Np.Obj.t ->
  unit ->
  [>`ArrayLike] Np.Obj.t

Make sure that array is 2D, square and symmetric.

If the array is not symmetric, then a symmetrized version is returned. Optionally, a warning or exception is raised if the matrix is not symmetric.

Parameters

  • array : nd-array or sparse matrix Input object to check / convert. Must be two-dimensional and square, otherwise a ValueError will be raised.

  • tol : float Absolute tolerance for equivalence of arrays. Default = 1E-10.

  • raise_warning : boolean (default=True) If True then raise a warning if conversion is required.

  • raise_exception : boolean (default=False) If True then raise an exception if array is not symmetric.

Returns

  • array_sym : ndarray or sparse matrix Symmetrized version of the input array, i.e. the average of array and array.transpose(). If sparse, then duplicate entries are first summed and zeros are eliminated.

column_or_1d

function column_or_1d
val column_or_1d :
  ?warn:bool ->
  y:[>`ArrayLike] Np.Obj.t ->
  unit ->
  [>`ArrayLike] Np.Obj.t

Ravel column or 1d numpy array, else raises an error

Parameters

  • y : array-like

  • warn : boolean, default False To control display of warnings.

Returns

  • y : array

has_fit_parameter

function has_fit_parameter
val has_fit_parameter :
  estimator:[>`BaseEstimator] Np.Obj.t ->
  parameter:string ->
  unit ->
  bool

Checks whether the estimator's fit method supports the given parameter.

Parameters

  • estimator : object An estimator to inspect.

  • parameter : str The searched parameter.

Returns

  • is_parameter: bool Whether the parameter was found to be a named parameter of the estimator's fit method.

Examples

>>> from sklearn.svm import SVC
>>> has_fit_parameter(SVC(), 'sample_weight')
True

indexable

function indexable
val indexable :
  Py.Object.t list ->
  Py.Object.t

Make arrays indexable for cross-validation.

Checks consistent length, passes through None, and ensures that everything can be indexed by converting sparse matrices to csr and converting non-interable objects to arrays.

Parameters

  • *iterables : lists, dataframes, arrays, sparse matrices List of objects to ensure sliceability.

isclass

function isclass
val isclass :
  Py.Object.t ->
  Py.Object.t

Return true if the object is a class.

Class objects provide these attributes: doc documentation string module name of module in which this class was defined

parse_version

function parse_version
val parse_version :
  Py.Object.t ->
  Py.Object.t

signature

function signature
val signature :
  ?follow_wrapped:Py.Object.t ->
  obj:Py.Object.t ->
  unit ->
  Py.Object.t

Get a signature object for the passed callable.

wraps

function wraps
val wraps :
  ?assigned:Py.Object.t ->
  ?updated:Py.Object.t ->
  wrapped:Py.Object.t ->
  unit ->
  Py.Object.t

Decorator factory to apply update_wrapper() to a wrapper function

Returns a decorator that invokes update_wrapper() with the decorated function as the wrapper argument and the arguments to wraps() as the remaining arguments. Default arguments are as for update_wrapper(). This is a convenience function to simplify applying partial() to update_wrapper().

all_estimators

function all_estimators
val all_estimators :
  ?type_filter:[`S of string | `StringList of string list] ->
  unit ->
  Py.Object.t

Get a list of all estimators from sklearn.

This function crawls the module and gets all classes that inherit from BaseEstimator. Classes that are defined in test-modules are not included. By default meta_estimators such as GridSearchCV are also not included.

Parameters

  • type_filter : string, list of string, or None, default=None Which kind of estimators should be returned. If None, no filter is applied and all estimators are returned. Possible values are 'classifier', 'regressor', 'cluster' and 'transformer' to get estimators only of these specific types, or a list of these to get the estimators that fit at least one of the types.

Returns

  • estimators : list of tuples List of (name, class), where name is the class name as string and class is the actuall type of the class.

as_float_array

function as_float_array
val as_float_array :
  ?copy:bool ->
  ?force_all_finite:[`Allow_nan | `Bool of bool] ->
  x:[>`ArrayLike] Np.Obj.t ->
  unit ->
  [>`ArrayLike] Np.Obj.t

Converts an array-like to an array of floats.

The new dtype will be np.float32 or np.float64, depending on the original type. The function can create a copy or modify the argument depending on the argument copy.

Parameters

  • X : {array-like, sparse matrix}

  • copy : bool, optional If True, a copy of X will be created. If False, a copy may still be returned if X's dtype is not a floating point type.

  • force_all_finite : boolean or 'allow-nan', (default=True) Whether to raise an error on np.inf, np.nan, pd.NA in X. The possibilities are:

    • True: Force all values of X to be finite.
    • False: accepts np.inf, np.nan, pd.NA in X.
    • 'allow-nan': accepts only np.nan and pd.NA values in X. Values cannot be infinite.

    .. versionadded:: 0.20 force_all_finite accepts the string 'allow-nan'.

    .. versionchanged:: 0.23 Accepts pd.NA and converts it into np.nan

Returns

  • XT : {array, sparse matrix} An array of type np.float

assert_all_finite

function assert_all_finite
val assert_all_finite :
  ?allow_nan:bool ->
  x:[>`ArrayLike] Np.Obj.t ->
  unit ->
  Py.Object.t

Throw a ValueError if X contains NaN or infinity.

Parameters

  • X : array or sparse matrix

  • allow_nan : bool

axis0_safe_slice

function axis0_safe_slice
val axis0_safe_slice :
  x:[>`ArrayLike] Np.Obj.t ->
  mask:[>`ArrayLike] Np.Obj.t ->
  len_mask:int ->
  unit ->
  Py.Object.t

This mask is safer than safe_mask since it returns an empty array, when a sparse matrix is sliced with a boolean mask with all False, instead of raising an unhelpful error in older versions of SciPy.

  • See: https://github.com/scipy/scipy/issues/5361

Also note that we can avoid doing the dot product by checking if the len_mask is not zero in _huber_loss_and_gradient but this is not going to be the bottleneck, since the number of outliers and non_outliers are typically non-zero and it makes the code tougher to follow.

Parameters

  • X : {array-like, sparse matrix} Data on which to apply mask.

  • mask : array Mask to be used on X.

  • len_mask : int The length of the mask.

Returns

mask

check_X_y

function check_X_y
val check_X_y :
  ?accept_sparse:[`S of string | `StringList of string list | `Bool of bool] ->
  ?accept_large_sparse:bool ->
  ?dtype:[`Dtypes of Np.Dtype.t list | `S of string | `Dtype of Np.Dtype.t | `None] ->
  ?order:[`F | `C] ->
  ?copy:bool ->
  ?force_all_finite:[`Allow_nan | `Bool of bool] ->
  ?ensure_2d:bool ->
  ?allow_nd:bool ->
  ?multi_output:bool ->
  ?ensure_min_samples:int ->
  ?ensure_min_features:int ->
  ?y_numeric:bool ->
  ?estimator:[>`BaseEstimator] Np.Obj.t ->
  x:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  unit ->
  (Py.Object.t * Py.Object.t)

Input validation for standard estimators.

Checks X and y for consistent length, enforces X to be 2D and y 1D. By default, X is checked to be non-empty and containing only finite values. Standard input checks are also applied to y, such as checking that y does not have np.nan or np.inf targets. For multi-label y, set multi_output=True to allow 2D and sparse y. If the dtype of X is object, attempt converting to float, raising on failure.

Parameters

  • X : nd-array, list or sparse matrix Input data.

  • y : nd-array, list or sparse matrix Labels.

  • accept_sparse : string, boolean or list of string (default=False) String[s] representing allowed sparse matrix formats, such as 'csc', 'csr', etc. If the input is sparse but not in the allowed format, it will be converted to the first listed format. True allows the input to be any format. False means that a sparse matrix input will raise an error.

  • accept_large_sparse : bool (default=True) If a CSR, CSC, COO or BSR sparse matrix is supplied and accepted by accept_sparse, accept_large_sparse will cause it to be accepted only if its indices are stored with a 32-bit dtype.

    .. versionadded:: 0.20

  • dtype : string, type, list of types or None (default='numeric') Data type of result. If None, the dtype of the input is preserved. If 'numeric', dtype is preserved unless array.dtype is object. If dtype is a list of types, conversion on the first type is only performed if the dtype of the input is not in the list.

  • order : 'F', 'C' or None (default=None) Whether an array will be forced to be fortran or c-style.

  • copy : boolean (default=False) Whether a forced copy will be triggered. If copy=False, a copy might be triggered by a conversion.

  • force_all_finite : boolean or 'allow-nan', (default=True) Whether to raise an error on np.inf, np.nan, pd.NA in X. This parameter does not influence whether y can have np.inf, np.nan, pd.NA values. The possibilities are:

    • True: Force all values of X to be finite.
    • False: accepts np.inf, np.nan, pd.NA in X.
    • 'allow-nan': accepts only np.nan or pd.NA values in X. Values cannot be infinite.

    .. versionadded:: 0.20 force_all_finite accepts the string 'allow-nan'.

    .. versionchanged:: 0.23 Accepts pd.NA and converts it into np.nan

  • ensure_2d : boolean (default=True) Whether to raise a value error if X is not 2D.

  • allow_nd : boolean (default=False) Whether to allow X.ndim > 2.

  • multi_output : boolean (default=False) Whether to allow 2D y (array or sparse matrix). If false, y will be validated as a vector. y cannot have np.nan or np.inf values if multi_output=True.

  • ensure_min_samples : int (default=1) Make sure that X has a minimum number of samples in its first axis (rows for a 2D array).

  • ensure_min_features : int (default=1) Make sure that the 2D array has some minimum number of features (columns). The default value of 1 rejects empty datasets. This check is only enforced when X has effectively 2 dimensions or is originally 1D and ensure_2d is True. Setting to 0 disables this check.

  • y_numeric : boolean (default=False) Whether to ensure that y has a numeric type. If dtype of y is object, it is converted to float64. Should only be used for regression algorithms.

  • estimator : str or estimator instance (default=None) If passed, include the name of the estimator in warning messages.

Returns

  • X_converted : object The converted and validated X.

  • y_converted : object The converted and validated y.

check_array

function check_array
val check_array :
  ?accept_sparse:[`S of string | `StringList of string list | `Bool of bool] ->
  ?accept_large_sparse:bool ->
  ?dtype:[`Dtypes of Np.Dtype.t list | `S of string | `Dtype of Np.Dtype.t | `None] ->
  ?order:[`F | `C] ->
  ?copy:bool ->
  ?force_all_finite:[`Allow_nan | `Bool of bool] ->
  ?ensure_2d:bool ->
  ?allow_nd:bool ->
  ?ensure_min_samples:int ->
  ?ensure_min_features:int ->
  ?estimator:[>`BaseEstimator] Np.Obj.t ->
  array:Py.Object.t ->
  unit ->
  Py.Object.t

Input validation on an array, list, sparse matrix or similar.

By default, the input is checked to be a non-empty 2D array containing only finite values. If the dtype of the array is object, attempt converting to float, raising on failure.

Parameters

  • array : object Input object to check / convert.

  • accept_sparse : string, boolean or list/tuple of strings (default=False) String[s] representing allowed sparse matrix formats, such as 'csc', 'csr', etc. If the input is sparse but not in the allowed format, it will be converted to the first listed format. True allows the input to be any format. False means that a sparse matrix input will raise an error.

  • accept_large_sparse : bool (default=True) If a CSR, CSC, COO or BSR sparse matrix is supplied and accepted by accept_sparse, accept_large_sparse=False will cause it to be accepted only if its indices are stored with a 32-bit dtype.

    .. versionadded:: 0.20

  • dtype : string, type, list of types or None (default='numeric') Data type of result. If None, the dtype of the input is preserved. If 'numeric', dtype is preserved unless array.dtype is object. If dtype is a list of types, conversion on the first type is only performed if the dtype of the input is not in the list.

  • order : 'F', 'C' or None (default=None) Whether an array will be forced to be fortran or c-style. When order is None (default), then if copy=False, nothing is ensured about the memory layout of the output array; otherwise (copy=True) the memory layout of the returned array is kept as close as possible to the original array.

  • copy : boolean (default=False) Whether a forced copy will be triggered. If copy=False, a copy might be triggered by a conversion.

  • force_all_finite : boolean or 'allow-nan', (default=True) Whether to raise an error on np.inf, np.nan, pd.NA in array. The possibilities are:

    • True: Force all values of array to be finite.
    • False: accepts np.inf, np.nan, pd.NA in array.
    • 'allow-nan': accepts only np.nan and pd.NA values in array. Values cannot be infinite.

    .. versionadded:: 0.20 force_all_finite accepts the string 'allow-nan'.

    .. versionchanged:: 0.23 Accepts pd.NA and converts it into np.nan

  • ensure_2d : boolean (default=True) Whether to raise a value error if array is not 2D.

  • allow_nd : boolean (default=False) Whether to allow array.ndim > 2.

  • ensure_min_samples : int (default=1) Make sure that the array has a minimum number of samples in its first axis (rows for a 2D array). Setting to 0 disables this check.

  • ensure_min_features : int (default=1) Make sure that the 2D array has some minimum number of features (columns). The default value of 1 rejects empty datasets. This check is only enforced when the input data has effectively 2 dimensions or is originally 1D and ensure_2d is True. Setting to 0 disables this check.

  • estimator : str or estimator instance (default=None) If passed, include the name of the estimator in warning messages.

Returns

  • array_converted : object The converted and validated array.

check_consistent_length

function check_consistent_length
val check_consistent_length :
  Py.Object.t list ->
  Py.Object.t

Check that all arrays have consistent first dimensions.

Checks whether all objects in arrays have the same shape or length.

Parameters

  • *arrays : list or tuple of input objects. Objects that will be checked for consistent length.

check_matplotlib_support

function check_matplotlib_support
val check_matplotlib_support :
  string ->
  Py.Object.t

Raise ImportError with detailed error message if mpl is not installed.

Plot utilities like :func:plot_partial_dependence should lazily import matplotlib and call this helper before any computation.

Parameters

  • caller_name : str The name of the caller that requires matplotlib.

check_pandas_support

function check_pandas_support
val check_pandas_support :
  string ->
  Py.Object.t

Raise ImportError with detailed error message if pandsa is not installed.

Plot utilities like :func:fetch_openml should lazily import pandas and call this helper before any computation.

Parameters

  • caller_name : str The name of the caller that requires pandas.

check_random_state

function check_random_state
val check_random_state :
  [`Optional of [`I of int | `None] | `RandomState of Py.Object.t] ->
  Py.Object.t

Turn seed into a np.random.RandomState instance

Parameters

  • seed : None | int | instance of RandomState If seed is None, return the RandomState singleton used by np.random. If seed is an int, return a new RandomState instance seeded with seed. If seed is already a RandomState instance, return it. Otherwise raise ValueError.

check_scalar

function check_scalar
val check_scalar :
  ?min_val:[`F of float | `I of int] ->
  ?max_val:[`F of float | `I of int] ->
  x:Py.Object.t ->
  name:string ->
  target_type:[`Tuple of Py.Object.t | `Dtype of Np.Dtype.t] ->
  unit ->
  Py.Object.t

Validate scalar parameters type and value.

Parameters

  • x : object The scalar parameter to validate.

  • name : str The name of the parameter to be printed in error messages.

  • target_type : type or tuple Acceptable data types for the parameter.

  • min_val : float or int, optional (default=None) The minimum valid value the parameter can take. If None (default) it is implied that the parameter does not have a lower bound.

  • max_val : float or int, optional (default=None) The maximum valid value the parameter can take. If None (default) it is implied that the parameter does not have an upper bound.

Raises

TypeError If the parameter's type does not match the desired type.

ValueError If the parameter's value violates the given bounds.

check_symmetric

function check_symmetric
val check_symmetric :
  ?tol:float ->
  ?raise_warning:bool ->
  ?raise_exception:bool ->
  array:[>`ArrayLike] Np.Obj.t ->
  unit ->
  [>`ArrayLike] Np.Obj.t

Make sure that array is 2D, square and symmetric.

If the array is not symmetric, then a symmetrized version is returned. Optionally, a warning or exception is raised if the matrix is not symmetric.

Parameters

  • array : nd-array or sparse matrix Input object to check / convert. Must be two-dimensional and square, otherwise a ValueError will be raised.

  • tol : float Absolute tolerance for equivalence of arrays. Default = 1E-10.

  • raise_warning : boolean (default=True) If True then raise a warning if conversion is required.

  • raise_exception : boolean (default=False) If True then raise an exception if array is not symmetric.

Returns

  • array_sym : ndarray or sparse matrix Symmetrized version of the input array, i.e. the average of array and array.transpose(). If sparse, then duplicate entries are first summed and zeros are eliminated.

column_or_1d

function column_or_1d
val column_or_1d :
  ?warn:bool ->
  y:[>`ArrayLike] Np.Obj.t ->
  unit ->
  [>`ArrayLike] Np.Obj.t

Ravel column or 1d numpy array, else raises an error

Parameters

  • y : array-like

  • warn : boolean, default False To control display of warnings.

Returns

  • y : array

compute_class_weight

function compute_class_weight
val compute_class_weight :
  class_weight:[`Balanced | `DictIntToFloat of (int * float) list | `None] ->
  classes:[>`ArrayLike] Np.Obj.t ->
  y:[>`ArrayLike] Np.Obj.t ->
  unit ->
  [>`ArrayLike] Np.Obj.t

Estimate class weights for unbalanced datasets.

Parameters

  • class_weight : dict, 'balanced' or None If 'balanced', class weights will be given by n_samples / (n_classes * np.bincount(y)). If a dictionary is given, keys are classes and values are corresponding class weights. If None is given, the class weights will be uniform.

  • classes : ndarray Array of the classes occurring in the data, as given by np.unique(y_org) with y_org the original class labels.

  • y : array-like, shape (n_samples,) Array of original class labels per sample;

Returns

  • class_weight_vect : ndarray, shape (n_classes,) Array with class_weight_vect[i] the weight for i-th class

References

The 'balanced' heuristic is inspired by Logistic Regression in Rare Events Data, King, Zen, 2001.

compute_sample_weight

function compute_sample_weight
val compute_sample_weight :
  ?indices:[>`ArrayLike] Np.Obj.t ->
  class_weight:[`Balanced | `DictIntToFloat of (int * float) list | `List_of_dicts of Py.Object.t | `None] ->
  y:[>`ArrayLike] Np.Obj.t ->
  unit ->
  [>`ArrayLike] Np.Obj.t

Estimate sample weights by class for unbalanced datasets.

Parameters

  • class_weight : dict, list of dicts, 'balanced', or None, optional Weights associated with classes in the form {class_label: weight}. If not given, all classes are supposed to have weight one. For multi-output problems, a list of dicts can be provided in the same order as the columns of y.

    Note that for multioutput (including multilabel) weights should be defined for each class of every column in its own dict. For example, for four-class multilabel classification weights should be [{0: 1, 1: 1}, {0: 1, 1: 5}, {0: 1, 1: 1}, {0: 1, 1: 1}] instead of [{1:1}, {2:5}, {3:1}, {4:1}].

    The 'balanced' mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data: n_samples / (n_classes * np.bincount(y)).

    For multi-output, the weights of each column of y will be multiplied.

  • y : array-like of shape (n_samples,) or (n_samples, n_outputs) Array of original class labels per sample.

  • indices : array-like, shape (n_subsample,), or None Array of indices to be used in a subsample. Can be of length less than n_samples in the case of a subsample, or equal to n_samples in the case of a bootstrap subsample with repeated indices. If None, the sample weight will be calculated over the full sample. Only 'balanced' is supported for class_weight if this is provided.

Returns

  • sample_weight_vect : ndarray, shape (n_samples,) Array with sample weights as applied to the original y

contextmanager

function contextmanager
val contextmanager :
  Py.Object.t ->
  Py.Object.t

@contextmanager decorator.

Typical usage:

@contextmanager
def some_generator(<arguments>):
    <setup>
    try:
        yield <value>
    finally:
        <cleanup>

This makes this:

with some_generator(<arguments>) as <variable>:
    <body>

equivalent to this:

<setup>
try:
    <variable> = <value>
    <body>
finally:
    <cleanup>

estimator_html_repr

function estimator_html_repr
val estimator_html_repr :
  [>`BaseEstimator] Np.Obj.t ->
  string

Build a HTML representation of an estimator.

Read more in the :ref:User Guide <visualizing_composite_estimators>.

Parameters

  • estimator : estimator object The estimator to visualize.

Returns

  • html: str HTML representation of estimator.

gen_batches

function gen_batches
val gen_batches :
  ?min_batch_size:int ->
  n:int ->
  batch_size:Py.Object.t ->
  unit ->
  Py.Object.t

Generator to create slices containing batch_size elements, from 0 to n.

The last slice may contain less than batch_size elements, when batch_size does not divide n.

Parameters

  • n : int

  • batch_size : int Number of element in each batch

  • min_batch_size : int, default=0 Minimum batch size to produce.

Yields

slice of batch_size elements

Examples

>>> from sklearn.utils import gen_batches
>>> list(gen_batches(7, 3))
[slice(0, 3, None), slice(3, 6, None), slice(6, 7, None)]
>>> list(gen_batches(6, 3))
[slice(0, 3, None), slice(3, 6, None)]
>>> list(gen_batches(2, 3))
[slice(0, 2, None)]
>>> list(gen_batches(7, 3, min_batch_size=0))
[slice(0, 3, None), slice(3, 6, None), slice(6, 7, None)]
>>> list(gen_batches(7, 3, min_batch_size=2))
[slice(0, 3, None), slice(3, 7, None)]

gen_even_slices

function gen_even_slices
val gen_even_slices :
  ?n_samples:int ->
  n:int ->
  n_packs:Py.Object.t ->
  unit ->
  Py.Object.t

Generator to create n_packs slices going up to n.

Parameters

  • n : int

  • n_packs : int Number of slices to generate.

  • n_samples : int or None (default = None) Number of samples. Pass n_samples when the slices are to be used for sparse matrix indexing; slicing off-the-end raises an exception, while it works for NumPy arrays.

Yields

slice

Examples

>>> from sklearn.utils import gen_even_slices
>>> list(gen_even_slices(10, 1))
[slice(0, 10, None)]
>>> list(gen_even_slices(10, 10))
[slice(0, 1, None), slice(1, 2, None), ..., slice(9, 10, None)]
>>> list(gen_even_slices(10, 5))
[slice(0, 2, None), slice(2, 4, None), ..., slice(8, 10, None)]
>>> list(gen_even_slices(10, 3))
[slice(0, 4, None), slice(4, 7, None), slice(7, 10, None)]

get_chunk_n_rows

function get_chunk_n_rows
val get_chunk_n_rows :
  ?max_n_rows:int ->
  ?working_memory:[`F of float | `I of int] ->
  row_bytes:int ->
  unit ->
  Py.Object.t

Calculates how many rows can be processed within working_memory

Parameters

  • row_bytes : int The expected number of bytes of memory that will be consumed during the processing of each row.

  • max_n_rows : int, optional The maximum return value.

  • working_memory : int or float, optional The number of rows to fit inside this number of MiB will be returned. When None (default), the value of sklearn.get_config()['working_memory'] is used.

Returns

int or the value of n_samples

Warns

Issues a UserWarning if row_bytes exceeds working_memory MiB.

get_config

function get_config
val get_config :
  unit ->
  Dict.t

Retrieve current values for configuration set by :func:set_config

Returns

  • config : dict Keys are parameter names that can be passed to :func:set_config.

See Also

  • config_context: Context manager for global scikit-learn configuration

  • set_config: Set global scikit-learn configuration

import_module

function import_module
val import_module :
  ?package:Py.Object.t ->
  name:Py.Object.t ->
  unit ->
  Py.Object.t

Import a module.

The 'package' argument is required when performing a relative import. It specifies the package to use as the anchor point from which to resolve the relative import to an absolute import.

indexable

function indexable
val indexable :
  Py.Object.t list ->
  Py.Object.t

Make arrays indexable for cross-validation.

Checks consistent length, passes through None, and ensures that everything can be indexed by converting sparse matrices to csr and converting non-interable objects to arrays.

Parameters

  • *iterables : lists, dataframes, arrays, sparse matrices List of objects to ensure sliceability.

indices_to_mask

function indices_to_mask
val indices_to_mask :
  indices:[>`ArrayLike] Np.Obj.t ->
  mask_length:int ->
  unit ->
  Py.Object.t

Convert list of indices to boolean mask.

Parameters

  • indices : list-like List of integers treated as indices.

  • mask_length : int Length of boolean mask to be generated. This parameter must be greater than max(indices)

Returns

  • mask : 1d boolean nd-array Boolean array that is True where indices are present, else False.

Examples

>>> from sklearn.utils import indices_to_mask
>>> indices = [1, 2 , 3, 4]
>>> indices_to_mask(indices, 5)
array([False,  True,  True,  True,  True])

is_scalar_nan

function is_scalar_nan
val is_scalar_nan :
  Py.Object.t ->
  Py.Object.t

Tests if x is NaN

This function is meant to overcome the issue that np.isnan does not allow non-numerical types as input, and that np.nan is not np.float('nan').

Parameters

  • x : any type

Returns

boolean

Examples

>>> is_scalar_nan(np.nan)
True
>>> is_scalar_nan(float('nan'))
True
>>> is_scalar_nan(None)
False
>>> is_scalar_nan('')
False
>>> is_scalar_nan([np.nan])
False

issparse

function issparse
val issparse :
  Py.Object.t ->
  Py.Object.t

Is x of a sparse matrix type?

Parameters

x object to check for being a sparse matrix

Returns

bool True if x is a sparse matrix, False otherwise

Notes

issparse and isspmatrix are aliases for the same function.

Examples

>>> from scipy.sparse import csr_matrix, isspmatrix
>>> isspmatrix(csr_matrix([[5]]))
True
>>> from scipy.sparse import isspmatrix
>>> isspmatrix(5)
False

parse_version

function parse_version
val parse_version :
  Py.Object.t ->
  Py.Object.t

register_parallel_backend

function register_parallel_backend
val register_parallel_backend :
  ?make_default:Py.Object.t ->
  name:Py.Object.t ->
  factory:Py.Object.t ->
  unit ->
  Py.Object.t

Register a new Parallel backend factory.

The new backend can then be selected by passing its name as the backend argument to the Parallel class. Moreover, the default backend can be overwritten globally by setting make_default=True.

The factory can be any callable that takes no argument and return an instance of ParallelBackendBase.

  • Warning: this function is experimental and subject to change in a future version of joblib.

.. versionadded:: 0.10

resample

function resample
val resample :
  ?options:(string * Py.Object.t) list ->
  Py.Object.t list ->
  Py.Object.t

Resample arrays or sparse matrices in a consistent way

The default strategy implements one step of the bootstrapping procedure.

Parameters

  • *arrays : sequence of indexable data-structures Indexable data-structures can be arrays, lists, dataframes or scipy sparse matrices with consistent first dimension.

Other Parameters

  • replace : boolean, True by default Implements resampling with replacement. If False, this will implement (sliced) random permutations.

  • n_samples : int, None by default Number of samples to generate. If left to None this is automatically set to the first dimension of the arrays. If replace is False it should not be larger than the length of arrays.

  • random_state : int, RandomState instance or None, optional (default=None) Determines random number generation for shuffling the data. Pass an int for reproducible results across multiple function calls.

  • See :term:Glossary <random_state>.

  • stratify : array-like or None (default=None) If not None, data is split in a stratified fashion, using this as the class labels.

Returns

  • resampled_arrays : sequence of indexable data-structures Sequence of resampled copies of the collections. The original arrays are not impacted.

Examples

It is possible to mix sparse and dense arrays in the same run::

X = np.array([[1., 0.], [2., 1.], [0., 0.]]) y = np.array([0, 1, 2])

from scipy.sparse import coo_matrix X_sparse = coo_matrix(X)

from sklearn.utils import resample X, X_sparse, y = resample(X, X_sparse, y, random_state=0) X array([[1., 0.], [2., 1.], [1., 0.]])

X_sparse <3x2 sparse matrix of type '<... 'numpy.float64'>' with 4 stored elements in Compressed Sparse Row format>

X_sparse.toarray() array([[1., 0.], [2., 1.], [1., 0.]])

y array([0, 1, 0])

resample(y, n_samples=2, random_state=0) array([0, 1])

Example using stratification::

y = [0, 0, 1, 1, 1, 1, 1, 1, 1] resample(y, n_samples=5, replace=False, stratify=y, ... random_state=0) [1, 1, 1, 0, 1]

See also

:func:sklearn.utils.shuffle

safe_indexing

function safe_indexing
val safe_indexing :
  ?axis:int ->
  x:[`Arr of [>`ArrayLike] Np.Obj.t | `PyObject of Py.Object.t] ->
  indices:[`Arr of [>`ArrayLike] Np.Obj.t | `I of int | `S of string | `Slice of Np.Wrap_utils.Slice.t | `Bool of bool] ->
  unit ->
  Py.Object.t
  • DEPRECATED: safe_indexing is deprecated in version 0.22 and will be removed in version 0.24.

Return rows, items or columns of X using indices.

.. deprecated:: 0.22 This function was deprecated in version 0.22 and will be removed in version 0.24.

Parameters

  • X : array-like, sparse-matrix, list, pandas.DataFrame, pandas.Series Data from which to sample rows, items or columns. list are only supported when axis=0.

  • indices : bool, int, str, slice, array-like

    • If axis=0, boolean and integer array-like, integer slice, and scalar integer are supported.
    • If axis=1:

      • to select a single column, indices can be of int type for all X types and str only for dataframe. The selected subset will be 1D, unless X is a sparse matrix in which case it will be 2D.
      • to select multiples columns, indices can be one of the
  • following: list, array, slice. The type used in these containers can be one of the following: int, 'bool' and str. However, str is only supported when X is a dataframe. The selected subset will be 2D.

  • axis : int, default=0 The axis along which X will be subsampled. axis=0 will select rows while axis=1 will select columns.

Returns

subset Subset of X on axis 0 or 1.

Notes

CSR, CSC, and LIL sparse matrices are supported. COO sparse matrices are not supported.

safe_mask

function safe_mask
val safe_mask :
  x:[>`ArrayLike] Np.Obj.t ->
  mask:[>`ArrayLike] Np.Obj.t ->
  unit ->
  Py.Object.t

Return a mask which is safe to use on X.

Parameters

  • X : {array-like, sparse matrix} Data on which to apply mask.

  • mask : array Mask to be used on X.

Returns

mask

safe_sqr

function safe_sqr
val safe_sqr :
  ?copy:bool ->
  x:[>`ArrayLike] Np.Obj.t ->
  unit ->
  Py.Object.t

Element wise squaring of array-likes and sparse matrices.

Parameters

  • X : array like, matrix, sparse matrix

  • copy : boolean, optional, default True Whether to create a copy of X and operate on it or to perform inplace computation (default behaviour).

Returns

X ** 2 : element wise square

shuffle

function shuffle
val shuffle :
  ?random_state:int ->
  ?n_samples:int ->
  [>`ArrayLike] Np.Obj.t list ->
  [>`ArrayLike] Np.Obj.t list

Shuffle arrays or sparse matrices in a consistent way

This is a convenience alias to resample( *arrays, replace=False) to do random permutations of the collections.

Parameters

  • *arrays : sequence of indexable data-structures Indexable data-structures can be arrays, lists, dataframes or scipy sparse matrices with consistent first dimension.

Other Parameters

  • random_state : int, RandomState instance or None, optional (default=None) Determines random number generation for shuffling the data. Pass an int for reproducible results across multiple function calls.

  • See :term:Glossary <random_state>.

  • n_samples : int, None by default Number of samples to generate. If left to None this is automatically set to the first dimension of the arrays.

Returns

  • shuffled_arrays : sequence of indexable data-structures Sequence of shuffled copies of the collections. The original arrays are not impacted.

Examples

It is possible to mix sparse and dense arrays in the same run::

X = np.array([[1., 0.], [2., 1.], [0., 0.]]) y = np.array([0, 1, 2])

from scipy.sparse import coo_matrix X_sparse = coo_matrix(X)

from sklearn.utils import shuffle X, X_sparse, y = shuffle(X, X_sparse, y, random_state=0) X array([[0., 0.], [2., 1.], [1., 0.]])

X_sparse <3x2 sparse matrix of type '<... 'numpy.float64'>' with 3 stored elements in Compressed Sparse Row format>

X_sparse.toarray() array([[0., 0.], [2., 1.], [1., 0.]])

y array([2, 1, 0])

shuffle(y, n_samples=2, random_state=0) array([0, 1])

See also

:func:sklearn.utils.resample

tosequence

function tosequence
val tosequence :
  [>`ArrayLike] Np.Obj.t ->
  Py.Object.t

Cast iterable x to a Sequence, avoiding a copy if possible.

Parameters

  • x : iterable