asreview.ActiveLearningCycle

class asreview.ActiveLearningCycle(querier, classifier=None, balancer=None, feature_extractor=None, stopper=None, n_query=1)

Bases: object

Active learner cycle class.

The active learner cycle class is a wrapper around the various learner components.

The classifier is optional. If no classifier is provided, the active learner only ranks the instances based on the query strategy. This is useful, for example, for random screening.

Parameters:
  • querier (BaseQueryStrategy) – The query strategy to use.

  • classifier (BaseTrainClassifier) – The classifier to use. Default is None.

  • balancer (BaseTrainClassifier) – The balance strategy to use. Default is None.

  • feature_extractor (BaseFeatureExtraction) – The feature extraction method to use. Default is None.

  • stopper (BaseStopper) – The stopping criteria. Default is None.

  • n_query (int, callable) – The number of instances to query at once. If None, the querier will determine the number of instances to query. Default is 1.
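The role of the optional classifier can be illustrated with a minimal, self-contained sketch. It does not use asreview itself: `ToyCycle`, `KeywordClassifier`, `max_querier`, and `random_querier` are hypothetical stand-ins, and the real components may behave differently. The sketch only shows the idea that the querier ranks on classifier scores when a classifier is present and ranks on its own otherwise.

```python
import random


class KeywordClassifier:
    """Toy classifier (hypothetical): scores a text by how many of its
    words occurred in texts labeled relevant during fit()."""

    def fit(self, X, y):
        self.words = {w for text, label in zip(X, y) if label == 1 for w in text.split()}

    def score(self, X):
        return [sum(w in self.words for w in text.split()) for text in X]


def max_querier(n, scores):
    # Highest score first; sorted() is stable, so ties keep input order.
    return sorted(range(n), key=lambda i: -scores[i])


def random_querier(n, scores, seed=0):
    # Ignores scores entirely: this is random screening.
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return order


class ToyCycle:
    """Hypothetical stand-in for the wrapper idea, not the asreview class."""

    def __init__(self, querier, classifier=None):
        self.querier = querier
        self.classifier = classifier

    def fit(self, X, y):
        # Without a classifier there is nothing to train.
        if self.classifier is not None:
            self.classifier.fit(X, y)
        return self

    def rank(self, X):
        # The querier gets scores only when a classifier is available.
        scores = None if self.classifier is None else self.classifier.score(X)
        return self.querier(len(X), scores)


X_train = ["active learning review", "cooking pasta"]
y_train = [1, 0]
X_pool = ["pasta recipe", "systematic review with active learning", "learning piano"]

ranked = ToyCycle(max_querier, KeywordClassifier()).fit(X_train, y_train).rank(X_pool)
random_order = ToyCycle(random_querier).fit(X_train, y_train).rank(X_pool)
```

Here `ranked` puts the pool record that shares the most words with the relevant training text first, while `random_order` is a shuffled permutation that never consults a model.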

Methods

  • __init__(querier[, classifier, balancer, ...])

  • fit(X, y) – Fit the classifier to the data.

  • from_file(fp[, load]) – Load the active learner from a file.

  • from_meta(cycle_meta_data[, ...]) – Load the active learner from a metadata object.

  • get_n_query(results, labels) – Get the number of records to query at each step in the active learning.

  • rank(X) – Rank the instances in X.

  • stop(results, data) – Check if the stopping criterion is met.

  • to_file(fp)

  • to_meta()

  • transform(X) – Transform the data.

fit(X, y)

Fit the classifier to the data.

Parameters:
  • X (np.array) – The instances to fit.

  • y (np.array) – The labels of the instances.

classmethod from_file(fp, load=None)

Load the active learner from a file.

Parameters:
  • fp (str, Path) – Review config file.

  • load (object) – Config reader. Default tomllib.load for TOML (.toml) files, otherwise json.load.
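The default choice of config reader described above can be sketched as follows. This is an illustration of the selection rule only, not the asreview implementation: `read_config` is a hypothetical helper, and the `n_query` key in the demo files is an illustrative value rather than a documented config schema.

```python
import json
import tempfile
from pathlib import Path

try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:  # older interpreters
    tomllib = None


def read_config(fp, load=None):
    """Hypothetical helper: read a config file with a suffix-based default reader."""
    fp = Path(fp)
    if load is None:
        if fp.suffix == ".toml":
            if tomllib is None:
                raise RuntimeError("TOML needs Python 3.11+ or an explicit `load`")
            load = tomllib.load
        else:
            load = json.load
    # tomllib.load requires a binary file handle; json.load accepts one too.
    with open(fp, "rb") as f:
        return load(f)


with tempfile.TemporaryDirectory() as tmp:
    json_path = Path(tmp) / "cycle.json"
    json_path.write_text('{"n_query": 5}')
    from_json = read_config(json_path)

    from_toml = None
    if tomllib is not None:
        toml_path = Path(tmp) / "cycle.toml"
        toml_path.write_text("n_query = 5\n")
        from_toml = read_config(toml_path)
```

Passing an explicit `load` callable bypasses the suffix check, which is how a reader for another format could be supplied.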

classmethod from_meta(cycle_meta_data, skip_feature_extraction=False)

Load the active learner from a metadata object.

Parameters:
  • cycle_meta_data (CycleMetaData) – The metadata object with the active learner settings.

  • skip_feature_extraction (bool) – Whether to skip feature extraction. Default is False.

Returns:

ActiveLearningCycle – The active learner cycle object.

get_n_query(results, labels)

Get the number of records to query at each step in the active learning.

n_query can be an integer or a function that takes the results of the simulation as input. If n_query is a function, it should return an integer. n_query cannot be larger than the number of records left to label.

Parameters:
  • n_query (int | callable) – Number of records to query at each step in the active learning process. Default is 1.

  • results (pd.DataFrame) – The results of the simulation.

Returns:

int – Number of records to query at each step in the active learning process.
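The rule described for get_n_query can be sketched with plain Python. This is an assumption-laden illustration, not the library code: `resolve_n_query` is a hypothetical helper, `results` is assumed to hold one row per labeled record, and `n_records` is assumed to be the total number of records in the dataset. The sketch reads the upper bound as a cap applied to the requested number.

```python
def resolve_n_query(n_query, results, n_records):
    """Hypothetical helper: number of records to query in the next step."""
    # A callable receives the results so far and must return an integer.
    n = n_query(results) if callable(n_query) else n_query
    # Never ask for more records than remain unlabeled.
    n_left = n_records - len(results)
    return min(n, n_left)


# Two records labeled so far, out of ten in total (illustrative data).
results = [{"record_id": 3, "label": 1}, {"record_id": 7, "label": 0}]

fixed = resolve_n_query(1, results, n_records=10)
growing = resolve_n_query(lambda r: 2 * len(r), results, n_records=10)
capped = resolve_n_query(50, results, n_records=10)
```

A callable makes the batch size depend on progress, for example querying more records per step once many have been labeled.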

rank(X)

Rank the instances in X.

Parameters:

X (np.array) – The instances to rank.

Returns:

np.array – The ranking of the instances.

stop(results, data)

Check if the stopping criterion is met.

Parameters:
  • results (pd.DataFrame) – The results of the simulation.

  • data (pandas.DataFrame) – The data store object.

Returns:

bool – True if the stopping criterion is met, False otherwise.
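As a rough illustration of what a stopping check returning a bool might look like, the sketch below uses a made-up rule. The function `stop_after_irrelevant_streak`, its `streak` parameter, and the row layout of `results` are all hypothetical; the rule an actual stopper applies depends on the stopper passed to the cycle.

```python
def stop_after_irrelevant_streak(results, n_records, streak=3):
    """Hypothetical stopping rule: True when screening should stop."""
    # Nothing left to label.
    if len(results) >= n_records:
        return True
    # Stop once the last `streak` labeled records were all irrelevant (label 0).
    labels = [row["label"] for row in results]
    return len(labels) >= streak and all(label == 0 for label in labels[-streak:])


# Illustrative labeling histories, oldest first.
still_finding = [{"label": 0}, {"label": 1}, {"label": 0}]
dry_spell = [{"label": 1}, {"label": 0}, {"label": 0}, {"label": 0}]

keep_going = stop_after_irrelevant_streak(still_finding, n_records=10)
should_stop = stop_after_irrelevant_streak(dry_spell, n_records=10)
exhausted = stop_after_irrelevant_streak(still_finding, n_records=3)
```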

to_file(fp)

to_meta()

transform(X)

Transform the data.

Parameters:

X (np.array) – The instances to transform.

Returns:

np.array, scipy.sparse.csr_matrix – The transformed instances.