What is ASReview LAB?
ASReview LAB is a free (Libre) open-source machine learning tool for screening and systematically labeling large collections of textual data. It is often described as a tool for title-and-abstract screening in systematic reviews or meta-analyses, but it can handle any type of textual data that must be screened systematically; see the paper published in Nature Machine Intelligence.
ASReview LAB implements three different modes:
Oracle: Screen textual data in interaction with the active learning model. The reviewer is the ‘oracle’, making the labeling decisions.
Simulation: Evaluate the performance of active learning models on fully labeled data.
Exploration: Explore or demonstrate ASReview LAB with a completely labeled dataset. This mode is suitable for teaching purposes.
ASReview LAB is one of the products of the ASReview research project initiated at Utrecht University, which has grown into a vibrant community of researchers, users, and developers from around the world.
What is active learning?
Artificial intelligence (AI) and machine learning have enabled AI-aided pipelines that assist in finding relevant texts for search tasks. A well-established approach to increasing the efficiency of screening large amounts of textual data is screening prioritization through active learning: a continuous interaction between a human who labels records and a machine learning model that selects the most likely relevant records based on a minimal training dataset. The active learning cycle is repeated until the annotator is sufficiently confident they have seen all relevant records. The machine learning model is thus responsible for ranking the records, while the human provides the labels; this is called Researcher-In-The-Loop (RITL).
Active learning allows large amounts of text to be screened in an intelligent and time-efficient manner. ASReview LAB, published in Nature Machine Intelligence, has demonstrated the benefits of active learning, reducing the required screening time by up to 95%.
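The cycle described above can be sketched in a few lines of Python. The word-overlap scoring below is a toy stand-in for ASReview's real feature extraction and classifier; the function names and the corpus are illustrative assumptions, not the actual implementation.

```python
def word_overlap(text, reference_texts):
    """Toy relevance score: count of words shared with known-relevant texts."""
    words = set(text.split())
    return sum(len(words & set(ref.split())) for ref in reference_texts)

def simulate_screening(records, oracle_labels, prior_ids, max_queries):
    """One researcher-in-the-loop pass on a toy corpus.

    records: dict id -> text; oracle_labels: dict id -> 0/1 (the human's answers);
    prior_ids: ids labeled up front as prior knowledge.
    """
    labeled = {i: oracle_labels[i] for i in prior_ids}
    query_order = []
    for _ in range(max_queries):
        candidates = [i for i in records if i not in labeled]
        if not candidates:
            break
        relevant = [records[i] for i, lab in labeled.items() if lab == 1]
        # Query strategy: show the record most similar to what is already relevant.
        best = max(candidates, key=lambda i: word_overlap(records[i], relevant))
        labeled[best] = oracle_labels[best]  # the human oracle supplies the label
        query_order.append(best)             # the "model" retrains on the next pass
    return labeled, query_order
```

In a real active learning model the scoring step is a trained classifier over extracted features, but the control flow is the same: rank, query, label, retrain.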
Labeling workflow with ASReview
Start and finish a systematic labeling process with ASReview LAB by following these steps:
Create a dataset with the potentially relevant records you want to screen systematically. Improve the quality of the data and specify clear reviewing (inclusion/exclusion) criteria.
Specify a stopping criterion.
Select the four components of the active learning model (feature extractor, classifier, balancing method, query strategy).
Wait until the warm-up of the AI is done (the software extracts the features and trains the classifier on the prior knowledge).
Start screening until you reach your stopping criterion.
At any time, you can export the dataset with the labeling decisions, or the entire project.
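A stopping criterion can be as simple as "stop after n consecutive irrelevant records". The sketch below illustrates that heuristic; both the rule and the threshold are assumptions for illustration, not something prescribed by ASReview, so choose and report your own criterion.

```python
def should_stop(labels_in_order, n_consecutive_irrelevant=50):
    """Heuristic stopping rule: stop once the last n labels were all irrelevant.

    labels_in_order: labeling decisions so far, 1 = relevant, 0 = irrelevant.
    The default threshold of 50 is an illustrative assumption.
    """
    recent = labels_in_order[-n_consecutive_irrelevant:]
    return len(recent) == n_consecutive_irrelevant and not any(recent)
```

Check the rule after every labeling decision; once it fires, finish the review and export your results.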
Quick start
Check if Python 3.7 or later is installed (if not, install Python)
Install ASReview LAB
pip install asreview
Open ASReview LAB
Click Create to create a project
Select a mode (Oracle, Exploration, Simulation)
Name the project and, if you want, add author name(s) and a description
Import a dataset you want to review, or select a benchmark dataset (only available in the Exploration and Simulation modes)
Add prior knowledge. Select at least 1 relevant and 1 irrelevant record to warm up the AI. You can search for a specific record or request random records
Select the four components of the active learning model, or rely on the default settings that have shown fast and excellent performance in many simulation studies
ASReview LAB starts extracting the features and training the classifier on the prior knowledge
You’re ready to start labeling your data! All your labeling actions are automatically saved, so there is no need to click the save button (we don’t even have one).
ASReview LAB terminology
When you screen text for a systematic review in ASReview LAB, it is useful to know some basic concepts about systematic reviewing and machine learning. The following overview describes terms you might encounter as you use ASReview LAB.
- Active learning model
An active learning model is the combination of four elements: a feature extraction technique, a classifier, a balance strategy, and a query strategy.
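The four components can be pictured as a simple configuration mapping. The identifier strings below (tfidf, nb, double, max) mirror commonly cited ASReview defaults (TF-IDF features, naive Bayes, double balancing, maximum-probability querying), but they are assumptions here; check the version you run for the exact names.

```python
# Hypothetical sketch of an active learning model as a plain mapping;
# the component identifiers are assumed, not read from ASReview itself.
model_config = {
    "feature_extraction": "tfidf",   # how text becomes numeric features
    "classifier": "nb",              # model that predicts relevance
    "balance_strategy": "double",    # handles the few-relevant/many-irrelevant imbalance
    "query_strategy": "max",         # which record to show the reviewer next
}
```

Each key corresponds to one of the four elements named in the definition above.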
- ASReview
ASReview stands for “Active learning for Systematic Reviews” or “AI-assisted Systematic Reviews”, depending on the context. Avoid spelling out the acronym; use the expansion only as a tagline.
- ASReview CLI
ASReview CLI is the command line interface that is developed for advanced options or for running simulation studies.
- Dataset
A dataset is the collection of records that the user imports and exports.
- ELAS
ELAS stands for “Electronic Learning Assistant”. It is the name of the ASReview mascot, used for storytelling and to increase explainability.
- Export
Export is the action of exporting a dataset or a project from ASReview LAB.
- Extension
An extension is an additional element to ASReview LAB, such as the ASReview Datatools extension.
- Import
Import is the action of importing a dataset or a project into ASReview LAB.
- Model configuration
Model configuration is the action of the user to configure the active learning model.
- Note
A note is information added by the user in the note field and stored in the project file. It can be edited on the History page.
- Project
A project is a project created in ASReview LAB.
- Projects dashboard
The project dashboard is the landing page containing an overview of all projects in ASReview LAB.
- Project file
The project file is the .asreview file containing the data and model configuration. The file is exported from ASReview LAB and can be imported back.
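An .asreview project file is, in essence, a zipped archive, so its contents can be listed with standard tools. The sketch below builds a dummy archive to stay self-contained; the member name project.json is an illustrative assumption, and a real project file contains more (dataset, model configuration, labeling history).

```python
import zipfile

def list_project_members(path):
    """Return the member file names inside a project archive."""
    with zipfile.ZipFile(path) as zf:
        return zf.namelist()

# Build a dummy archive so this sketch runs on its own; the layout is
# illustrative, not the real project-file structure.
with zipfile.ZipFile("demo.asreview", "w") as zf:
    zf.writestr("project.json", '{"name": "demo"}')
```

Calling `list_project_members("demo.asreview")` on the dummy file returns its single member; on a real exported project it would show the full internal layout of that ASReview version.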
- Project mode
The project mode is one of oracle, exploration, or simulation in ASReview LAB:
Oracle mode is used when a user reviews a dataset systematically with interactive artificial intelligence (AI).
Exploration mode is used when a user explores or demonstrates ASReview LAB with a completely labeled dataset. This mode is suitable for teaching purposes.
Simulation mode is used when a user simulates a review on a completely labeled dataset to see the performance of ASReview LAB.
- Project status
The project status is the stage a project is at in ASReview LAB.
Setup: the user adds project information, imports the dataset, selects prior knowledge, configures the model, and initiates the first iteration of model training.
In Review: in oracle or exploration mode, the user adds labels to records; in simulation mode, the simulation is running.
Finished: in oracle or exploration mode, the user has completed the reviewing process or has labeled all the records; in simulation mode, the simulation has been completed.
Published: the user has published the dataset and project file in a repository, preferably with a Digital Object Identifier (DOI).
- Record
A record is a data point that needs to be labeled. A record can contain both information that is used for training the active learning model and information that is not used for this purpose.
In the case of systematic reviewing, a record is the metadata of a scientific publication. Here, the information used for training is the text in the title and abstract of the publication. The information not used for training typically consists of other metadata, for example, the authors, journal, or DOI of the publication.
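The split between training and non-training information in a record can be sketched with a small data class. The field names below are illustrative assumptions, not ASReview's internal schema.

```python
from dataclasses import dataclass

@dataclass
class Record:
    """Toy record: title and abstract feed the model; other metadata does not.

    Field names are illustrative, not ASReview's actual data model.
    """
    title: str
    abstract: str
    authors: str = ""
    doi: str = ""

    def training_text(self):
        # Only the title and abstract are used to train the active learning model.
        return f"{self.title} {self.abstract}"
```

The authors and DOI travel with the record for display and export, but never reach the feature extractor.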
- Reviewing
Reviewing is the decision-making process on the relevance of records (“relevant” or “irrelevant”). It is interchangeable with labeling, screening, and classifying.
The human annotator is the person who labels records; in PRISMA-based reviewing contexts, this person is called the screener.
The use of ASReview LAB comes with five fundamental principles:
Humans are the oracle;
Code is open & results are transparent;
Decisions are unbiased;
The interface shows an AI is at work;
Users are responsible for importing high-quality data.
The ASReview LAB software doesn't collect any information about its usage or its users. Great, isn't it?