Overview

ASReview LAB offers three different solutions to run simulations.

What is a simulation?

A simulation mimics the screening process with a given model. Because it is already known which records are labeled as relevant, the software can automatically reenact the screening process as if a human were labeling the records in interaction with the active learning model.
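The idea of reenacting the screening process can be sketched in a few lines of Python. This is a toy illustration and not the ASReview implementation: the `simulate` and `tokens` functions, the word-overlap "model", and the example records are all hypothetical stand-ins chosen to keep the sketch self-contained.

```python
import re


def tokens(text):
    """Lowercase word set used by the toy ranking model."""
    return set(re.findall(r"[a-z]+", text.lower()))


def simulate(records, labels, prior_relevant, prior_irrelevant):
    """Return the order in which records would be labeled.

    records: list of texts; labels: the already known 0/1 labels.
    The "model" is a hypothetical stand-in: it scores each unlabeled
    record by its word overlap with records already labeled relevant.
    """
    order = [prior_relevant, prior_irrelevant]  # prior knowledge
    while len(order) < len(records):
        # "Train": collect the words of all relevant records seen so far.
        relevant_words = set()
        for i in order:
            if labels[i] == 1:
                relevant_words |= tokens(records[i])
        unlabeled = [i for i in range(len(records)) if i not in order]
        # "Query": pick the unlabeled record the model ranks highest.
        best = max(unlabeled, key=lambda i: len(tokens(records[i]) & relevant_words))
        # The known label is revealed here instead of asking a human.
        order.append(best)
    return order


# Made-up example data; 1 = relevant, 0 = irrelevant.
records = [
    "active learning for systematic reviews",
    "deep sea fishing techniques",
    "screening abstracts with active learning",
    "history of medieval castles",
    "machine learning to speed up systematic reviews",
]
labels = [1, 0, 1, 0, 1]
order = simulate(records, labels, prior_relevant=0, prior_irrelevant=1)
```

In this toy run the remaining relevant records are presented before the last irrelevant one, which is the behavior a simulation is meant to measure.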

Why run a simulation?

Simulating with ASReview LAB serves multiple purposes. First, you can measure the performance of one or multiple models with different metrics (see Analyzing results). A convenient use is to investigate how much work you could have saved by using active learning compared to your manual screening process.
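As an illustration of what "work saved" can mean, the sketch below computes a recall curve and Work Saved over Sampling (WSS), one commonly used metric, from the labels in the order a simulation presented them. The function names and the example sequence are assumptions made for this sketch; the metrics reported by ASReview's own analysis tools may be computed differently.

```python
import math


def recall_curve(labels_in_order):
    """Fraction of all relevant records found after each screened record."""
    total = sum(labels_in_order)
    if total == 0:
        raise ValueError("the dataset contains no relevant records")
    found = 0
    curve = []
    for label in labels_in_order:
        found += label
        curve.append(found / total)
    return curve


def wss(labels_in_order, recall=0.95):
    """Work Saved over Sampling at the given recall level.

    WSS = (N - n) / N - (1 - recall), where N is the number of records
    and n is the number screened when the recall level is first reached.
    """
    n_records = len(labels_in_order)
    total = sum(labels_in_order)
    if total == 0:
        raise ValueError("the dataset contains no relevant records")
    target = math.ceil(recall * total)  # relevant records needed
    found = 0
    for screened, label in enumerate(labels_in_order, start=1):
        found += label
        if found >= target:
            return (n_records - screened) / n_records - (1 - recall)


# Made-up screening order: 3 relevant records among 10.
screening_order = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
curve = recall_curve(screening_order)
saved = wss(screening_order, recall=1.0)
```

Here all relevant records are found after screening 4 of the 10 records, so 60% of the screening effort would have been saved at full recall.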

Second, suppose you don’t know which model to choose for a new (unlabeled) dataset. In that case, you can experiment with different combinations of classifier, feature extraction technique, query strategy, and balancing strategy on a labeled dataset with similar characteristics, and choose the best-performing combination.

You can also use the simulation mode to benchmark your own model against existing models on different available datasets. ASReview LAB allows for adding new models via a template.

You can also use simulations to find ‘odd’ relevant records in a ‘classical’ search. Such records are typically isolated from most other records and might be worth closer inspection.

Datasets for simulation

Simulations require fully labeled datasets (labels: 0 = irrelevant, 1 = relevant). Such a dataset can be the result of an earlier study. ASReview also offers fully labeled datasets via the SYNERGY dataset. These datasets are available via the user interface in the Data step of the setup and in the command line with the prefix synergy: (e.g., synergy:van_de_schoot_2018).
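Before simulating with your own data, it can help to check that every record has a label. The following is a minimal sketch using only the Python standard library; the helper `is_fully_labeled` and the column name `included` are assumptions for illustration, so adjust the column name to match your file.

```python
import csv
import io


def is_fully_labeled(csv_text, label_column="included"):
    """Return True if every row has a label of 0 or 1 in label_column.

    The column name is an assumption for this sketch.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return False  # an empty dataset cannot be simulated
    return all(
        (row.get(label_column) or "").strip() in {"0", "1"}
        for row in rows
    )


# Made-up example data.
complete = "title,abstract,included\nPaper A,Text,1\nPaper B,Text,0\n"
partial = "title,abstract,included\nPaper A,Text,1\nPaper B,Text,\n"

complete_ok = is_fully_labeled(complete)
partial_ok = is_fully_labeled(partial)
```

A dataset for which this check fails still contains unlabeled records and is therefore not suitable for a simulation as described above.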

Tip

When you import your data, make sure to remove duplicates and to retrieve as many abstracts as possible (see the Importance-of-abstracts blog for help). With clean data, you benefit most from what active learning has to offer.

Cloud environments

For advanced scenarios, such as executing ASReview simulations in cloud environments or running them in parallel, consult the specialized cloud usage guide. This guide provides tailored instructions for a variety of use cases, including simulations on cloud platforms such as SURF, Digital Ocean, AWS, and Azure, as well as using Kubernetes for large-scale simulation tasks. More information can be found in the paper "Optimizing ASReview simulations: A generic multiprocessing solution for ‘light-data’ and ‘heavy-data’ users".