asreview.models.classifiers.SVM

class asreview.models.classifiers.SVM(penalty='l2', loss='squared_hinge', *, dual='auto', tol=0.0001, C=1.0, multi_class='ovr', fit_intercept=True, intercept_scaling=1, class_weight=None, verbose=0, random_state=None, max_iter=1000)

Bases: LinearSVC

Support vector machine classifier.

This class subclasses scikit-learn's linear support vector classifier, sklearn.svm.LinearSVC, and takes the same constructor parameters.
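
As a minimal sketch of basic usage, the snippet below fits the classifier on a tiny, made-up dataset and predicts labels. The import path is taken from this page's title; the fallback to sklearn.svm.LinearSVC (the documented base class) is only there so the sketch runs without ASReview installed, and the data are hypothetical.

```python
import numpy as np

try:
    # Import path as given in the title of this page.
    from asreview.models.classifiers import SVM
except ImportError:
    # Fallback (assumption for illustration): use the documented base class.
    from sklearn.svm import LinearSVC as SVM

# Hypothetical toy features: two well-separated clusters around (-1, -1) and (1, 1).
X = np.array(
    [[-1.0, -1.2], [-1.1, -0.9], [-0.8, -1.0], [1.0, 1.1], [0.9, 1.2], [1.2, 0.8]]
)
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVM(C=1.0, random_state=0, max_iter=1000)
clf.fit(X, y)

# One predicted class label per row of X.
predictions = clf.predict(X)
```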

Methods

  • __init__([penalty, loss, dual, tol, C, ...])

  • decision_function(X) – Predict confidence scores for samples.

  • densify() – Convert coefficient matrix to dense array format.

  • fit(X, y[, sample_weight]) – Fit the model according to the given training data.

  • get_metadata_routing() – Get metadata routing of this object.

  • get_params([deep]) – Get parameters for this estimator.

  • predict(X) – Predict class labels for samples in X.

  • score(X, y[, sample_weight]) – Return accuracy on provided data and labels.

  • set_fit_request(*[, sample_weight]) – Configure whether metadata should be requested to be passed to the fit method.

  • set_params(**params) – Set the parameters of this estimator.

  • set_score_request(*[, sample_weight]) – Configure whether metadata should be requested to be passed to the score method.

  • sparsify() – Convert coefficient matrix to sparse format.

Attributes

  • label

  • name

decision_function(X)

Predict confidence scores for samples.

The confidence score for a sample is proportional to the signed distance of that sample to the hyperplane.

Parameters:

X ({array-like, sparse matrix} of shape (n_samples, n_features)) – The data matrix for which we want to get the confidence scores.

Returns:

scores (ndarray of shape (n_samples,) or (n_samples, n_classes)) – Confidence scores, one per (sample, class) combination. In the binary case, the array holds the confidence score for self.classes_[1]; a score > 0 means this class would be predicted.
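
The relationship between scores, the hyperplane, and predicted labels can be sketched as follows. This example uses sklearn.svm.LinearSVC directly (the documented base class) on hypothetical toy data; for a linear model the score is the affine function w·x + b built from coef_ and intercept_.

```python
import numpy as np
from sklearn.svm import LinearSVC

# Hypothetical toy data: two clusters on either side of the origin.
X = np.array(
    [[-1.0, -1.2], [-1.1, -0.9], [-0.8, -1.0], [1.0, 1.1], [0.9, 1.2], [1.2, 0.8]]
)
y = np.array([0, 0, 0, 1, 1, 1])

clf = LinearSVC(random_state=0).fit(X, y)

# Binary case: one score per sample, shape (n_samples,).
scores = clf.decision_function(X)

# The same scores computed by hand from the fitted hyperplane.
manual_scores = X @ clf.coef_.ravel() + clf.intercept_[0]

# Positive scores select classes_[1]; all other scores select classes_[0].
labels_from_scores = clf.classes_[(scores > 0).astype(int)]
```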

densify()

Convert coefficient matrix to dense array format.

Converts the coef_ member (back) to a numpy.ndarray. This is the default format of coef_ and is required for fitting, so calling this method is only required on models that have previously been sparsified; otherwise, it is a no-op.

Returns:

self – Fitted estimator.

fit(X, y, sample_weight=None)

Fit the model according to the given training data.

Parameters:
  • X ({array-like, sparse matrix} of shape (n_samples, n_features)) – Training vector, where n_samples is the number of samples and n_features is the number of features.

  • y (array-like of shape (n_samples,)) – Target vector relative to X.

  • sample_weight (array-like of shape (n_samples,), default=None) –

    Array of weights that are assigned to individual samples. If not provided, then each sample is given unit weight.

    Added in scikit-learn version 0.18.

Returns:

self (object) – An instance of the estimator.
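
A short sketch of how sample_weight can be passed, again using sklearn.svm.LinearSVC as a stand-in and hypothetical data and weights. It also illustrates that fit returns the estimator itself and that omitting sample_weight behaves like passing unit weights.

```python
import numpy as np
from sklearn.svm import LinearSVC

# Hypothetical toy data.
X = np.array(
    [[-1.0, -1.2], [-1.1, -0.9], [-0.8, -1.0], [1.0, 1.1], [0.9, 1.2], [1.2, 0.8]]
)
y = np.array([0, 0, 0, 1, 1, 1])

# Without sample_weight, every sample counts equally.
clf_default = LinearSVC(random_state=0)
returned = clf_default.fit(X, y)  # fit returns the estimator itself

# Explicit unit weights should give the same model as the default.
clf_unit = LinearSVC(random_state=0).fit(X, y, sample_weight=np.ones(len(y)))

# Hypothetical weights that make errors on class 1 five times as costly.
weights = np.array([1.0, 1.0, 1.0, 5.0, 5.0, 5.0])
clf_weighted = LinearSVC(random_state=0).fit(X, y, sample_weight=weights)
```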

get_metadata_routing()

Get metadata routing of this object.

Please check the User Guide on how the routing mechanism works.

Returns:

routing (MetadataRequest) – A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params (dict) – Parameter names mapped to their values.
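
For illustration, the returned dictionary can be inspected or used to build an equivalent, unfitted estimator. The sketch uses sklearn.svm.LinearSVC as a stand-in for this class.

```python
from sklearn.svm import LinearSVC

clf = LinearSVC(C=0.5, max_iter=2000)

# Constructor parameter names mapped to their current values.
params = clf.get_params()

# The same dictionary can configure a second estimator with identical settings.
twin = LinearSVC(**params)
```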

label = 'Support vector machine'

name = 'svm'

predict(X)

Predict class labels for samples in X.

Parameters:

X ({array-like, sparse matrix} of shape (n_samples, n_features)) – The data matrix for which we want to get the predictions.

Returns:

y_pred (ndarray of shape (n_samples,)) – Vector containing the class labels for each sample.

score(X, y, sample_weight=None)

Return accuracy on provided data and labels.

In multi-label classification, this is the subset accuracy, a harsh metric because it requires the entire label set of each sample to be predicted correctly.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Test samples.

  • y (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True labels for X.

  • sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.

Returns:

score (float) – Mean accuracy of self.predict(X) w.r.t. y.
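
As a sketch of what the returned value means in the single-label case, the score can be reproduced by comparing predictions with the true labels. The example uses sklearn.svm.LinearSVC as a stand-in and hypothetical data.

```python
import numpy as np
from sklearn.svm import LinearSVC

# Hypothetical toy data.
X = np.array(
    [[-1.0, -1.2], [-1.1, -0.9], [-0.8, -1.0], [1.0, 1.1], [0.9, 1.2], [1.2, 0.8]]
)
y = np.array([0, 0, 0, 1, 1, 1])

clf = LinearSVC(random_state=0).fit(X, y)

# Mean accuracy as reported by the estimator.
accuracy = clf.score(X, y)

# The same quantity computed by hand: fraction of correct predictions.
manual_accuracy = float(np.mean(clf.predict(X) == y))
```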

set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → SVM

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in scikit-learn version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self (object) – The updated object.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params (dict) – Estimator parameters.

Returns:

self (estimator instance) – Estimator instance.
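
The two forms described above can be sketched as follows, using sklearn.svm.LinearSVC as a stand-in. The step names "scale" and "svm" are hypothetical labels chosen for this example.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Simple estimator: parameters are addressed by their plain names.
clf = LinearSVC()
clf.set_params(C=0.1)

# Nested object: <component>__<parameter> reaches into a pipeline step.
pipe = Pipeline([("scale", StandardScaler()), ("svm", LinearSVC())])
returned = pipe.set_params(svm__C=10.0, svm__max_iter=5000)
```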

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → SVM

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in scikit-learn version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self (object) – The updated object.

sparsify()

Convert coefficient matrix to sparse format.

Converts the coef_ member to a scipy.sparse matrix, which for L1-regularized models can be much more memory- and storage-efficient than the usual numpy.ndarray representation.

The intercept_ member is not converted.

Warning

This method is not supported for estimators fitted with array API inputs (i.e. when sklearn.config_context() is used with array_api_dispatch=True). The call may succeed but subsequent calls to predict() and other methods involving passing arrays may raise or return unexpected results.

Returns:

self – Fitted estimator.

Notes

For non-sparse models, i.e. when there are not many zeros in coef_, this may actually increase memory usage, so use this method with care. A rule of thumb is that more than 50% of the elements of coef_ must be zero for this to provide significant benefits; the number of zero elements can be computed with (coef_ == 0).sum().

After calling this method, further fitting with the partial_fit method (if any) will not work until you call densify.
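
A sketch of a sparsify/densify round trip, using sklearn.svm.LinearSVC as a stand-in with an L1 penalty (which requires dual=False) on synthetic, hypothetical data. The zero fraction is computed first so the rule of thumb above can be checked before converting.

```python
import numpy as np
from scipy import sparse
from sklearn.svm import LinearSVC

# Synthetic data: only the first of 20 features carries signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 20))
y = (X[:, 0] > 0).astype(int)

# The L1 penalty tends to drive many coefficients to exactly zero.
clf = LinearSVC(penalty="l1", dual=False, C=0.1, random_state=0).fit(X, y)

# Fraction of zero coefficients, to compare against the 50% rule of thumb.
zero_fraction = float((clf.coef_ == 0).sum() / clf.coef_.size)

dense_predictions = clf.predict(X)

# Convert coef_ to a scipy.sparse matrix; intercept_ is left unchanged.
clf.sparsify()
coef_is_sparse = sparse.issparse(clf.coef_)
sparse_predictions = clf.predict(X)

# Convert back to a numpy.ndarray, as required before any further fitting.
clf.densify()
coef_is_dense = isinstance(clf.coef_, np.ndarray)
```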