API Reference

Data and datasets

Dataset object

`load_dataset`(name, **kwargs)	Load data from file, URL, or plugin.
`Dataset`([df, column_spec])	Dataset object holding the texts, labels, DOIs, etc.

Readers and writers

Functions and classes for file reading and writing.

`data.list_readers`()	List available dataset reader classes.
`data.list_writers`()	List available dataset writer classes.
`data.CSVReader`()	CSV file reader.
`data.CSVWriter`()	CSV file writer.
`data.ExcelReader`()	Excel file reader.
`data.ExcelWriter`()	Excel file writer.
`data.RISReader`()	RIS file reader.
`data.RISWriter`()	RIS file writer.
`data.TSVWriter`()	TSV file writer.

Statistics

`data.statistics.abstract_length`(data)	Return the average length of the abstracts.
`data.statistics.n_duplicates`(data[, pid])	Number of duplicates.
`data.statistics.n_irrelevant`(data)	Return the number of irrelevant records.
`data.statistics.n_keywords`(data)	Return the number of keywords.
`data.statistics.n_missing_abstract`(data)	Return the number of records with missing abstracts.
`data.statistics.n_missing_title`(data)	Return the number of records with missing titles.
`data.statistics.n_records`(data)	Return the number of records.
`data.statistics.n_relevant`(data)	Return the number of relevant records.
`data.statistics.n_unlabeled`(data)	Return the number of unlabeled records.
`data.statistics.title_length`(data)	Return the average length of the titles.
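
The counting functions listed above can be pictured with plain Python. What follows is a minimal sketch, not ASReview's implementation: the record dictionaries, the field names, the label encoding (`1` relevant, `0` irrelevant, `None` unlabeled), and the duplicate rule (same title and abstract) are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical records; field names and label encoding are assumptions.
records = [
    {"title": "Active learning for screening", "abstract": "We study screening.", "label": 1},
    {"title": "Deep learning survey", "abstract": "", "label": 0},
    {"title": "", "abstract": "An abstract without a title.", "label": None},
    {"title": "Active learning for screening", "abstract": "We study screening.", "label": 1},
]


def n_records(data):
    """Total number of records."""
    return len(data)


def n_relevant(data):
    """Records labeled relevant (assumed encoding: 1)."""
    return sum(1 for r in data if r["label"] == 1)


def n_irrelevant(data):
    """Records labeled irrelevant (assumed encoding: 0)."""
    return sum(1 for r in data if r["label"] == 0)


def n_unlabeled(data):
    """Records without a label (assumed encoding: None)."""
    return sum(1 for r in data if r["label"] is None)


def n_missing_title(data):
    """Records whose title is empty or whitespace only."""
    return sum(1 for r in data if not r["title"].strip())


def n_missing_abstract(data):
    """Records whose abstract is empty or whitespace only."""
    return sum(1 for r in data if not r["abstract"].strip())


def title_length(data):
    """Average title length in characters, counted over all records."""
    return mean(len(r["title"]) for r in data)


def n_duplicates(data):
    """Records whose (title, abstract) pair was already seen earlier."""
    seen, duplicates = set(), 0
    for r in data:
        key = (r["title"].strip().lower(), r["abstract"].strip().lower())
        if key in seen:
            duplicates += 1
        seen.add(key)
    return duplicates
```

In this sketch every function takes the whole collection and returns a single number, mirroring the `(data)` signatures in the table.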

Datasets

Available datasets

`asreview.datasets.SynergyDataGroup`()	Datasets available in the SYNERGY dataset.
`asreview.datasets.NaturePublicationDataGroup`()	Datasets used in the paper Van de Schoot et al. 2020.

Dataset managers

`asreview.datasets.BaseDataSet`(dataset_id[, ...])
`asreview.datasets.BaseDataGroup`(*datasets)
`asreview.datasets.DatasetManager`()

Reviewer

`simulation.Simulate`(as_data, project[, ...])	ASReview Simulation mode class.

Models

This section provides an overview of the models available for active learning in ASReview. For command-line usage, use the short name given in parentheses after each model description (for example, `nb`), or see the `name` property of the model. Some models require additional dependencies; see the model class for details and installation instructions.
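
Each model family below pairs its classes with three lookup helpers: one that lists the classes, one that returns a class by name, and one that returns an instance by name. The following is a minimal sketch of that pattern under stated assumptions. The class names, the `_REGISTRY` dictionary, and the helper names are hypothetical and do not reproduce how ASReview registers its models.

```python
class BaseModel:
    """Hypothetical base class; every subclass carries a short `name`."""

    name = "base"


class NaiveBayes(BaseModel):
    name = "nb"


class RandomForest(BaseModel):
    name = "rf"


# Hypothetical registry mapping the short name to the class.
_REGISTRY = {cls.name: cls for cls in (NaiveBayes, RandomForest)}


def list_models():
    """Return the registered classes, ordered by short name."""
    return [_REGISTRY[name] for name in sorted(_REGISTRY)]


def get_model_class(name):
    """Return the class registered under `name`."""
    try:
        return _REGISTRY[name]
    except KeyError:
        known = ", ".join(sorted(_REGISTRY))
        raise ValueError(f"Unknown model {name!r}; choose from: {known}") from None


def get_model(name, *args, **kwargs):
    """Return an instance of the class registered under `name`."""
    return get_model_class(name)(*args, **kwargs)
```

Looking models up by a short string is what lets a command-line option such as a model name map onto a class without the caller importing it.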

Base class

`models.base.BaseModel`()	Abstract class for any kind of model.

`asreview.models.feature_extraction`

Classes

`feature_extraction.base.BaseFeatureExtraction`([...])	Base class for feature extraction methods.
`feature_extraction.Tfidf`(*args[, ngram_max, ...])	TF-IDF feature extraction technique (`tfidf`).
`feature_extraction.Doc2Vec`(*args[, ...])	Doc2Vec feature extraction technique (`doc2vec`).
`feature_extraction.EmbeddingIdf`(*args[, ...])	Embedding IDF feature extraction technique (`embedding-idf`).
`feature_extraction.EmbeddingLSTM`(*args[, ...])	Embedding LSTM feature extraction technique (`embedding-lstm`).
`feature_extraction.SBERT`(*args[, ...])	Sentence BERT feature extraction technique (`sbert`).

Functions

`feature_extraction.get_feature_model`(name, *args)	Get an instance of a feature extraction model from a string.
`feature_extraction.get_feature_class`(name)	Get class of feature extraction from string.
`feature_extraction.list_feature_extraction`()	List available feature extraction method classes.

`asreview.models.classifiers`

Classes

`classifiers.base.BaseTrainClassifier`()	Base model, abstract class to be implemented by derived ones.
`classifiers.NaiveBayesClassifier`([alpha])	Naive Bayes classifier (`nb`).
`classifiers.RandomForestClassifier`([...])	Random forest classifier (`rf`).
`classifiers.SVMClassifier`([gamma, ...])	Support vector machine classifier (`svm`).
`classifiers.LogisticClassifier`([C, ...])	Logistic regression classifier (`logistic`).
`classifiers.NN2LayerClassifier`([...])	Fully connected neural network (2 hidden layers) classifier (`nn-2-layer`).

Functions

`classifiers.get_classifier`(name, *args[, ...])	Get an instance of a model from a string.
`classifiers.get_classifier_class`(name)	Get class of model from string.
`classifiers.list_classifiers`()	List available classifier classes.

`asreview.models.query`

Classes

`query.base.BaseQueryStrategy`()	Abstract class for query strategies.
`query.base.ProbaQueryStrategy`()
`query.MaxQuery`()	Maximum query strategy (`max`).
`query.MixedQuery`([strategy_1, strategy_2, ...])	Mixed query strategy.
`query.MaxRandomQuery`([mix_ratio, random_state])	Mixed (95% Maximum and 5% Random) query strategy (`max_random`).
`query.MaxUncertaintyQuery`([mix_ratio, ...])	Mixed (95% Maximum and 5% Uncertainty) query strategy (`max_uncertainty`).
`query.UncertaintyQuery`()	Uncertainty query strategy (`uncertainty`).
`query.RandomQuery`([random_state])	Random query strategy (`random`).
`query.ClusterQuery`([cluster_size, ...])	Clustering query strategy (`cluster`).
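
The mixed strategies above combine two simpler ones. As a rough illustration, the sketch below fills most of a batch with the highest-scoring records and the rest with random picks. It is not ASReview's implementation: the function names are hypothetical, and treating `mix_ratio` as the fraction of the batch taken by the maximum strategy is an assumption.

```python
import random


def max_query(scores, candidates, n):
    """Return the n candidates with the highest predicted relevance."""
    return sorted(candidates, key=lambda i: scores[i], reverse=True)[:n]


def random_query(candidates, n, rng):
    """Return n candidates drawn uniformly without replacement."""
    return rng.sample(candidates, n)


def max_random_query(scores, n_instances, mix_ratio=0.95, seed=None):
    """Pick a batch: roughly mix_ratio by maximum score, the rest at random."""
    rng = random.Random(seed)
    candidates = list(range(len(scores)))
    n_instances = min(n_instances, len(candidates))

    # Assumed split: the first strategy gets round(mix_ratio * batch size).
    n_max = round(mix_ratio * n_instances)
    picked = max_query(scores, candidates, n_max)

    # The random part only draws from records not already picked.
    already = set(picked)
    remaining = [i for i in candidates if i not in already]
    picked += random_query(remaining, n_instances - n_max, rng)
    return picked
```

The random share keeps a small amount of exploration in each batch, so records the classifier scores low still get a chance to be screened.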

Functions

`query.get_query_model`(name, *args[, ...])	Get an instance of the query strategy.
`query.get_query_class`(name)	Get class of query strategy from its name.
`query.list_query_strategies`()	List available query strategy classes.

`asreview.models.balance`

Classes

`balance.base.BaseBalance`()	Abstract class for balance strategies.
`balance.SimpleBalance`()	No balance strategy (`simple`).
`balance.DoubleBalance`([a, alpha, b, beta, ...])	Double balance strategy (`double`).
`balance.TripleBalance`([a, alpha, b, beta, ...])	Triple balance strategy (`triple`).
`balance.UndersampleBalance`([ratio, random_state])	Undersampling balance strategy (`undersample`).
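
Balance strategies adjust the training set because relevant records are usually far outnumbered by irrelevant ones. The sketch below shows plain undersampling of the irrelevant class. It is an illustration only: the function name is hypothetical, and the meaning given to `ratio` here (relevant records per kept irrelevant record) is an assumption, not the documented behaviour of `UndersampleBalance`.

```python
import math
import random


def undersample(labels, ratio=1.0, seed=None):
    """Return training indices: all relevant records plus a random
    subset of the irrelevant ones.

    Assumption: at most ceil(n_relevant / ratio) irrelevant records are kept.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    rng = random.Random(seed)
    relevant = [i for i, y in enumerate(labels) if y == 1]
    irrelevant = [i for i, y in enumerate(labels) if y == 0]

    n_keep = min(len(irrelevant), math.ceil(len(relevant) / ratio))
    kept = rng.sample(irrelevant, n_keep)
    return sorted(relevant + kept)
```

With 5 relevant and 95 irrelevant labels, `ratio=1.0` yields a balanced set of 10 indices in this sketch, and `ratio=0.5` keeps 10 irrelevant records for 15 indices in total.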

Functions

`balance.get_balance_model`(name, *args[, ...])	Get an instance of a balance model from a string.
`balance.get_balance_class`(name)	Get class of balance model from string.
`balance.list_balance_strategies`()	List available balancing strategy classes.

Projects and States

Load, interact, and extract information from project files and states (the “diary” of the review).

Project

`Project`(project_path[, project_id])	Project class for ASReview project files.

State

`open_state`(asreview_obj[, review_id, read_only])	Initialize a state class instance from a project folder.
`state.SQLiteState`([read_only])	Class for storing the review state.
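
The state is described above as the "diary" of the review, and the class name `SQLiteState` suggests SQLite storage. The following sketch shows what an append-only labeling diary could look like with Python's standard `sqlite3` module. The table name, column names, and helper functions are hypothetical and do not reflect the actual ASReview state schema or API.

```python
import sqlite3


def open_diary(path=":memory:"):
    """Open (or create) a hypothetical diary database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS decisions ("
        " step INTEGER PRIMARY KEY AUTOINCREMENT,"
        " record_id INTEGER NOT NULL,"
        " label INTEGER NOT NULL)"
    )
    return conn


def add_decision(conn, record_id, label):
    """Append one labeling decision; the transaction commits on success."""
    with conn:
        conn.execute(
            "INSERT INTO decisions (record_id, label) VALUES (?, ?)",
            (record_id, label),
        )


def get_decisions(conn):
    """Return all decisions in the order they were made."""
    rows = conn.execute(
        "SELECT record_id, label FROM decisions ORDER BY step"
    )
    return rows.fetchall()
```

Keeping the decisions in insertion order is what makes such a log usable as a diary: the review can be replayed step by step from it.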

Utils

`project.get_project_path`(folder_id)	Get the project directory.
`project.project_from_id`(f)	Decorator function that takes a user account as parameter, the user account is used to get the correct sub folder in which the projects is
`project.get_projects`([project_paths])	Get the ASReview projects at the given paths.
`project.is_project`(project_path)
`project.is_v0_project`(project_path)	Check if a project file is an ASReview version 0 project.

Misc

Classes

`asreview.settings.ASReviewSettings`(model, ...)	Object to store the configuration of a review session.

Functions

`search.fuzzy_find`(as_data, keywords[, ...])	Find a record using keywords.
`asreview_path`()	Get the location where projects are stored.
`get_data_home`([data_home])	Return the path of the ASReview data directory.
