Overview
ASReview LAB offers three different solutions to run simulations: the webapp (the frontend of ASReview LAB), the command line interface, and the Python API.
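For instance, a simulation can be started from the command line. A minimal sketch, assuming ASReview LAB v1.x is installed and YOUR_DATASET.csv is a fully labeled dataset (file names are illustrative, and flags may differ between versions):

    # Launch the webapp; Simulate mode is selected during project setup
    asreview lab

    # Or run a simulation directly from the command line;
    # -s/--state_file writes the results to an ASReview project file
    asreview simulate YOUR_DATASET.csv -s simulation.asreview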
What is a simulation?
A simulation mimics the screening process with a given model. Because it is already known which records are labeled as relevant, the software can automatically reenact the screening process as if a human were labeling the records in interaction with the active learning model.
Why run a simulation?
Simulating with ASReview LAB serves multiple purposes. First, the performance of one or more models can be measured with different metrics (see Analyzing results). For example, you can investigate how much work you could have saved by using active learning compared to your manual screening process.
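The metrics are computed on the resulting project file with the separate asreview-insights extension. A sketch, assuming that extension is installed (pip install asreview-insights) and that simulation.asreview is the output of a finished simulation:

    # Print evaluation metrics (e.g., recall, work saved) for a finished simulation
    asreview metrics simulation.asreview

    # Plot the recall curve of the simulated screening process
    asreview plot recall simulation.asreview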
Suppose you don’t know which model to choose for a new (unlabeled) dataset. In that case, you can experiment to find the best-performing combination of classifier, feature extraction technique, query strategy, and balance strategy, and test its performance on a labeled dataset with similar characteristics.
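On the command line, such a model combination can be specified with flags. A sketch, assuming the v1.x CLI (the output file name is hypothetical; flag names may differ in other versions):

    # Simulate with Naive Bayes, TF-IDF features, maximum query strategy,
    # and double balancing
    asreview simulate YOUR_LABELED_DATASET.csv -s nb_tfidf_max_double.asreview \
        -m nb -e tfidf -q max -b double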
You can also use the simulation mode to benchmark your own model against existing models across the available datasets. ASReview LAB allows adding new models via a template.
You can also find ‘odd’ relevant records in a ‘classical’ search. Such records are typically found in isolation from most other records and might be worth closer inspection.
Datasets for simulation
Simulations require fully labeled datasets (labels: 0 = irrelevant, 1 = relevant). Such a dataset can be the result of an earlier study. ASReview also offers fully labeled datasets via the benchmark platform. These datasets are available via the user interface in the Data step of the setup and on the command line with the prefix benchmark: (e.g. benchmark:van_de_schoot_2017).
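For example, a simulation on one of the benchmark datasets can be started as follows (the output file name is just an illustration):

    # Simulate the screening of the fully labeled van de Schoot et al. (2017) dataset
    asreview simulate benchmark:van_de_schoot_2017 -s sim_van_de_schoot_2017.asreview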
Tip
When you import your data, make sure to remove duplicates and to retrieve as many abstracts as possible (see the Importance-of-abstracts blog post for help). With clean data, you benefit most from what active learning has to offer.
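Deduplication can be done in any tool you like; one option is the asreview-datatools extension. A sketch, assuming that extension is installed (pip install asreview-datatools); check its documentation for the exact options:

    # Remove duplicate records before running a simulation
    asreview data dedup YOUR_DATASET.csv -o YOUR_DATASET_deduped.csv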