Command Line
ASReview provides a command line interface for running tasks such as simulations. For a list of available commands, type asreview --help.
Lab
asreview lab launches the ASReview LAB software (the frontend).
asreview lab [options]
- `--ip IP`: The IP address the server will listen on.
- `--port PORT`: The port the server will listen on.
- `--clean_project CLEAN_PROJECT`: Safe cleanup of temporary files in project.
- `--clean_all_projects CLEAN_ALL_PROJECTS`: Safe cleanup of temporary files in all projects.
- `--embedding EMBEDDING_FP`: File path of embedding matrix. Required for LSTM models.
- `--seed SEED`: Seed for the model (classifiers, balance strategies, feature extraction techniques, and query strategies). Use an integer between 0 and 2^32 - 1.
- `-h`, `--help`: Show help message and exit.
Simulate
asreview simulate measures the performance of the software on existing systematic reviews. The software shows how many papers you could have potentially skipped during the systematic review. You can use your own labelled dataset:
asreview simulate [options] [dataset [dataset ...]]
Alternatively, use one of the benchmark datasets (see index.csv for dataset IDs):
asreview simulate [options] benchmark:[dataset_id]
Examples:
asreview simulate YOUR_DATA.csv --state_file myreview.h5
asreview simulate benchmark:van_de_Schoot_2017 --state_file myreview.h5
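When simulations are launched from a script, the example invocations above can be assembled programmatically. The following is a minimal sketch and not part of ASReview: `build_simulate_command` is a hypothetical helper that uses only flags documented on this page, and it assumes the asreview executable is on the PATH when the command is eventually run.

```python
import shlex


def build_simulate_command(dataset, state_file, model="nb", seed=None):
    """Return the argument list for one `asreview simulate` run.

    Hypothetical helper: it only assembles flags listed on this page
    (--state_file, --model, --seed) and does not call ASReview itself.
    """
    cmd = [
        "asreview", "simulate", dataset,
        "--state_file", state_file,
        "--model", model,
    ]
    if seed is not None:
        cmd += ["--seed", str(seed)]
    return cmd


# Assemble the benchmark example with a fixed seed and show it as a shell line.
cmd = build_simulate_command("benchmark:van_de_Schoot_2017", "myreview.h5", seed=42)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where ASReview is installed. Keeping the command as a list avoids shell quoting problems with file names that contain spaces.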
- `dataset`: A dataset to simulate.
- `-m`, `--model MODEL`: The prediction model for Active Learning. Default: `nb`. (See available options below: Classifiers)
- `-q`, `--query_strategy QUERY_STRATEGY`: The query strategy for Active Learning. Default: `max`. (See available options below: Query strategies)
- `-b`, `--balance_strategy BALANCE_STRATEGY`: Data rebalancing strategy. Helps against imbalanced datasets with few inclusions and many exclusions. Default: `double`. (See available options below: Balance strategies)
- `-e`, `--feature_extraction FEATURE_EXTRACTION`: Feature extraction method. Some combinations of feature extraction method and prediction model are not available. Default: `tfidf`. (See available options below: Feature extraction)
- `--embedding EMBEDDING_FP`: File path of embedding matrix. Required for LSTM models.
- `--config_file CONFIG_FILE`: Configuration file with model settings and parameter values.
- `--seed SEED`: Seed for the model (classifiers, balance strategies, feature extraction techniques, and query strategies). Use an integer between 0 and 2^32 - 1.
- `--n_prior_included N_PRIOR_INCLUDED`: The number of prior included papers. Only used when `prior_idx` is not given. Default: 1.
- `--n_prior_excluded N_PRIOR_EXCLUDED`: The number of prior excluded papers. Only used when `prior_idx` is not given. Default: 1.
- `--prior_idx [PRIOR_IDX [PRIOR_IDX ...]]`: Prior indices by row number (0 is the first row).
- `--prior_record_id [PRIOR_RECORD_ID [PRIOR_RECORD_ID ...]]`: Prior indices by record_id. New in version 0.15.
- `--included_dataset [INCLUDED_DATASET [INCLUDED_DATASET ...]]`: A dataset with papers that should be included. Can be used multiple times.
- `--excluded_dataset [EXCLUDED_DATASET [EXCLUDED_DATASET ...]]`: A dataset with papers that should be excluded. Can be used multiple times.
- `--prior_dataset [PRIOR_DATASET [PRIOR_DATASET ...]]`: A dataset with papers from prior studies.
- `--state_file STATE_FILE`, `-s STATE_FILE`: Location to store the (active learning) state of the simulation. The state can be written to a JSON file (extension `.json`) or an HDF5 file (extension `.h5`).
- `--init_seed INIT_SEED`: Seed for setting the prior indices if the `prior_idx` option is not used. If `prior_idx` is given with one or more indices, this option is ignored.
- `--n_instances N_INSTANCES`: Number of papers queried each query. Default: 1.
- `--n_queries N_QUERIES`: The number of queries. By default, the program stops after all documents are reviewed or when it is interrupted by the user.
- `-n N_PAPERS`, `--n_papers N_PAPERS`: The number of papers to be reviewed. By default, the program stops after all documents are reviewed or when it is interrupted by the user.
- `--verbose VERBOSE`, `-v VERBOSE`: Verbosity.
- `-h`, `--help`: Show help message and exit.
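Two of the options above carry constraints that are easy to check before starting a long simulation: the seed range and the state file extension. The sketch below illustrates such a pre-flight check. It is not ASReview code: `validate_seed` and `state_file_format` are hypothetical helpers, the seed bounds are assumed to be inclusive, and ASReview's own validation may behave differently.

```python
from pathlib import Path

# Upper bound stated for --seed; treated as inclusive (an assumption).
MAX_SEED = 2**32 - 1

# Extensions named in the --state_file description.
STATE_FORMATS = {".json": "JSON", ".h5": "HDF5"}


def validate_seed(seed):
    """Return seed unchanged if it is an integer in [0, 2^32 - 1]."""
    if isinstance(seed, bool) or not isinstance(seed, int):
        raise TypeError("seed must be an integer")
    if not 0 <= seed <= MAX_SEED:
        raise ValueError(f"seed must be between 0 and {MAX_SEED}")
    return seed


def state_file_format(path):
    """Map a state file path to the format implied by its extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in STATE_FORMATS:
        raise ValueError(f"unsupported state file extension: {suffix!r}")
    return STATE_FORMATS[suffix]


print(validate_seed(535), state_file_format("myreview.h5"))
```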
Note
Some classifiers (models) and feature extraction algorithms require additional dependencies. Use pip install asreview[all] to install all additional dependencies at once.
Feature Extraction
Name | Reference | Requires |
---|---|---|
tfidf | asreview.models.feature_extraction.Tfidf | |
doc2vec | asreview.models.feature_extraction.Doc2Vec | gensim |
embedding-idf | asreview.models.feature_extraction.EmbeddingIdf | |
embedding-lstm | asreview.models.feature_extraction.EmbeddingLSTM | |
sbert | asreview.models.feature_extraction.SBERT | sentence_transformers |
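Before selecting a feature extraction method, it can help to check whether the package named in the Requires column is importable. The following is a minimal sketch under stated assumptions: the mapping copies the table above as-is (empty cells are taken to mean no extra package), and `missing_modules` is a hypothetical helper, not an ASReview function.

```python
from importlib.util import find_spec

# Module names copied from the "Requires" column of the table above.
REQUIRED_MODULES = {
    "tfidf": [],
    "doc2vec": ["gensim"],
    "embedding-idf": [],
    "embedding-lstm": [],
    "sbert": ["sentence_transformers"],
}


def missing_modules(feature_extraction):
    """List the required modules that cannot be found in this environment."""
    required = REQUIRED_MODULES[feature_extraction]
    return [name for name in required if find_spec(name) is None]


for method in REQUIRED_MODULES:
    missing = missing_modules(method)
    status = "ok" if not missing else "missing: " + ", ".join(missing)
    print(f"{method}: {status}")
```

If anything is reported as missing, pip install asreview[all] installs all additional dependencies at once, as described in the note above.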
Classifiers
Query Strategies
Name | Reference | Requires |
---|---|---|
max | asreview.models.query.MaxQuery | |
random | asreview.models.query.RandomQuery | |
uncertainty | asreview.models.query.UncertaintyQuery | |
cluster | asreview.models.query.ClusterQuery | |
Balance Strategies
Name | Reference | Requires |
---|---|---|
simple | asreview.models.balance.SimpleBalance | |
double | asreview.models.balance.DoubleBalance | |
triple | asreview.models.balance.TripleBalance | |
undersample | asreview.models.balance.UndersampleBalance | |
Simulate-batch
asreview simulate-batch provides the same interface as asreview simulate, but adds an extra option (--n_runs) to run a batch of simulation runs with the same configuration.
asreview simulate-batch [options] [dataset [dataset ...]]
Warning
The behavior of some arguments of asreview simulate-batch will differ slightly from asreview simulate.
- `dataset`: A dataset to simulate.
- `--n_runs`: Number of simulation runs.
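Because simulate-batch shares its interface with simulate, an existing simulate invocation can be converted by swapping the subcommand and appending --n_runs. The sketch below shows that conversion on an argument list. `to_batch_command` is a hypothetical helper, and it assumes --n_runs takes a positive integer value, which this page does not state explicitly.

```python
def to_batch_command(simulate_cmd, n_runs):
    """Turn an `asreview simulate ...` argument list into a simulate-batch one.

    Hypothetical helper; assumes --n_runs accepts a positive integer.
    """
    if simulate_cmd[:2] != ["asreview", "simulate"]:
        raise ValueError("expected a command starting with 'asreview simulate'")
    if n_runs < 1:
        raise ValueError("n_runs must be at least 1")
    return ["asreview", "simulate-batch", *simulate_cmd[2:], "--n_runs", str(n_runs)]


single = ["asreview", "simulate", "YOUR_DATA.csv", "--state_file", "myreview.h5"]
print(" ".join(to_batch_command(single, 8)))
```

As the warning above notes, some arguments behave slightly differently under simulate-batch, so a converted command should be checked against asreview simulate-batch --help before use.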
Algorithms
New in version 0.14.
asreview algorithms provides an overview of all available active learning model elements (classifiers, query strategies, balance strategies, and feature extraction algorithms) in ASReview.
asreview algorithms
Note
asreview algorithms includes models added via extensions. See Extensions for more information on extending ASReview with new models.