What is ASReview LAB?
ASReview LAB is a free (Libre) open-source machine learning tool for screening and systematically labeling a large collection of textual data. It’s sometimes referred to as a tool for title and abstract screening in systematic reviews or meta-analyses, but it can handle any type of textual data that must be screened systematically.
ASReview LAB offers three project modes:
- Oracle: Screen textual data in interaction with the active learning model. The reviewer is the ‘oracle’, making the labeling decisions.
- Simulation: Evaluate the performance of active learning models on fully labeled data.
- Exploration: Explore or demonstrate ASReview LAB with a completely labeled dataset. This mode is suitable for teaching purposes.
ASReview LAB is one of the products of the ASReview research project initiated at Utrecht University, which has grown into a vibrant community of researchers, users, and developers from around the world.
What is active learning?
Artificial Intelligence (AI) and machine learning have allowed the development of AI-aided pipelines that assist in finding relevant texts for search tasks. A well-established approach to increasing the efficiency of screening large amounts of textual data is screening prioritization through active learning: a constant interaction between a human who labels records and a machine learning model that selects the record most likely to be relevant, starting from a minimal training dataset. It allows large amounts of text to be screened in an intelligent and time-efficient manner. ASReview LAB, published in Nature Machine Intelligence, has shown the benefits of active learning, reducing the required screening time by up to 95%.
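The interaction described above can be pictured with a small, self-contained sketch. It is not ASReview LAB's implementation: the tiny naive Bayes scorer, the toy records, and the function names (`train`, `relevance_score`, `screen`) are all assumptions made for illustration. The loop retrains on everything labeled so far, asks the human (here a stand-in `oracle` function) to label the record that currently scores as most likely relevant, and repeats.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def train(labeled):
    """Fit a tiny naive Bayes model on (text, label) pairs; label 1 = relevant."""
    counts = {0: Counter(), 1: Counter()}
    n_docs = Counter()
    for text, label in labeled:
        counts[label].update(tokenize(text))
        n_docs[label] += 1
    vocab = set(counts[0]) | set(counts[1])
    return counts, n_docs, vocab


def relevance_score(model, text):
    """Log-odds (with add-one smoothing) that a text is relevant."""
    counts, n_docs, vocab = model
    totals = {c: sum(counts[c].values()) for c in (0, 1)}
    score = math.log(n_docs[1] / n_docs[0])
    for word in tokenize(text):
        p_rel = (counts[1][word] + 1) / (totals[1] + len(vocab))
        p_irr = (counts[0][word] + 1) / (totals[0] + len(vocab))
        score += math.log(p_rel / p_irr)
    return score


def screen(records, oracle, prior_ids):
    """Active learning loop: retrain, query the top-ranked record, ask the oracle."""
    # Prior knowledge must hold at least one relevant and one irrelevant record.
    labeled = {i: oracle(i) for i in prior_ids}
    order = []
    while len(labeled) < len(records):
        model = train([(records[i], y) for i, y in labeled.items()])
        pool = [i for i in range(len(records)) if i not in labeled]
        pick = max(pool, key=lambda i: relevance_score(model, records[i]))
        labeled[pick] = oracle(pick)  # the human labeling decision
        order.append(pick)
    return order


# Hypothetical toy data; in a real review the labels are unknown in advance.
records = [
    "active learning for systematic review screening",
    "recipe for tomato soup with basil",
    "machine learning screening of abstracts for review",
    "soup and bread recipe collection",
    "systematic review automation using active learning",
    "gardening tips for growing tomato plants",
]
true_labels = [1, 0, 1, 0, 1, 0]

order = screen(records, oracle=lambda i: true_labels[i], prior_ids=[0, 1])
print("Screening order:", order)
```

On this toy data the two remaining relevant records are proposed before the irrelevant ones, which is the effect that screening prioritization aims for.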
Labeling workflow with ASReview
Start and finish a systematic labeling process with ASReview LAB by following these steps:
- Create a dataset with potentially relevant records you want to screen systematically. Improve the quality of the data and specify clear reviewing (inclusion/exclusion) criteria.
- Specify a stopping criterion
- Start ASReview LAB
- Create a project
- Import your dataset
- Select Prior Knowledge
- Select the four components of the active learning model (feature extractor, classifier, balancing method, query strategy)
- Wait until the AI has warmed up (the software extracts the features and trains the classifier on the prior knowledge)
- Start screening until you reach your stopping criterion
- Export results and Export Project
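The workflow asks for a stopping criterion without prescribing one. As a minimal sketch, assuming a simple heuristic that is not taken from this documentation, one could stop after a fixed number of consecutive irrelevant decisions. The function name `should_stop` and the default threshold of 50 are hypothetical.

```python
def should_stop(labels, max_consecutive_irrelevant=50):
    """Return True once the most recent decisions were all irrelevant.

    `labels` is the sequence of decisions so far (1 = relevant, 0 = irrelevant).
    The threshold is an illustrative assumption; choose your own before screening.
    """
    if max_consecutive_irrelevant < 1:
        raise ValueError("max_consecutive_irrelevant must be at least 1")
    if len(labels) < max_consecutive_irrelevant:
        return False
    return not any(labels[-max_consecutive_irrelevant:])


decisions = [1, 0, 1, 0, 0, 0]
print(should_stop(decisions, max_consecutive_irrelevant=3))
```

Whatever rule is used, fixing it before screening starts keeps the decision to stop independent of what turns up during the review.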
- Check if Python 3.7 or later is installed (if not, install Python)
- Install ASReview LAB
pip install asreview
- Open ASReview LAB
- Click Create to create a project.
- Select a mode (Oracle, Exploration, Simulation)
- Name the project and, if you want, add one or more author names and a description.
- Import a dataset you want to review, or select a benchmark dataset (only available for Exploration and Simulation).
- Add prior knowledge. Select at least 1 relevant and 1 irrelevant record to warm up the AI. You can search for a specific record or request random records.
- Select the four components of the active learning model, or rely on the default settings that have shown fast and excellent performance in many simulation studies.
- ASReview LAB starts extracting the features and runs the classifier with the prior knowledge.
You’re ready to start labeling your data! All your labeling actions are automatically saved, so there is no need to click the save button (we don’t even have one).
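The first step above, checking for Python 3.7 or later, can also be done from Python itself. The following is a minimal sketch; the helper name `python_is_supported` is hypothetical and not part of ASReview LAB.

```python
import sys

MIN_PYTHON = (3, 7)  # minimum version named in the steps above


def python_is_supported(version_info=None):
    """Compare a (major, minor, ...) version tuple against MIN_PYTHON."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_PYTHON


if python_is_supported():
    print("Python version is sufficient; continue with: pip install asreview")
else:
    print("Python 3.7 or later is required; please install a newer Python first")
```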
ASReview LAB terminology
When you do text screening for a systematic review in ASReview LAB, it can be useful to understand some basic concepts from systematic reviewing and machine learning. The following overview describes some terms you might encounter as you use ASReview LAB.
- Active learning model
- An active learning model is the combination of four elements: a feature extraction technique, a classifier, a balance, and a query strategy.
- ASReview stands for Active learning for Systematic Reviews or AI-assisted Systematic Reviews, depending on context. Avoid using this as an explanation; use it only as a tagline.
- ASReview CLI
- ASReview CLI is the command line interface that is developed for advanced options or for running simulation studies.
- Data includes dataset, prior knowledge, labels, and notes.
- A dataset is the collection of records that the user imports and exports.
- ELAS stands for “Electronic Learning Assistant”. It is the name of the ASReview mascot. It is used for storytelling and to increase explainability.
- Export is the action of exporting a dataset or a project from ASReview LAB.
- An extension is an additional element to ASReview LAB, such as the ASReview visualisation extension or the ASReview CORD-19 extension.
- Import is the action of importing a dataset or a project into ASReview LAB.
- Model configuration
- Model configuration is the action of the user to configure the active learning model.
- A note is the information added by the user in the note field and stored in the project file. It can be edited on the History page.
- A project is a project created in ASReview LAB.
- Projects dashboard
- The projects dashboard is the landing page containing an overview of all projects in ASReview LAB.
- Project file
- The project file is the .asreview file containing the data and model configuration. The file is exported from ASReview LAB and can be imported back.
- Project mode
- The project mode is one of oracle, simulation, or exploration in ASReview LAB:
Exploration mode is used when a user explores or demonstrates ASReview LAB with a completely labeled dataset. This mode is suitable for teaching purposes.
Simulation mode is used when a user simulates a review on a completely labeled dataset to see the performance of ASReview LAB.
The project status is the stage that a project is at in ASReview LAB.
In Review refers to the fact that in oracle or exploration mode, the user adds labels to records, or in simulation mode, the simulation is running.
Finished refers to the fact that in oracle or exploration mode, the user decides to complete the reviewing process or has labeled all the records, or in simulation mode, the simulation has been completed.
Published refers to the fact that the user publishes the dataset and project file in a repository, preferably with a Digital Object Identifier (DOI).
A record is the data point that needs to be labeled. A record can contain both information that is used for training the active learning model, and information that is not used for this purpose.
In the case of systematic reviewing, a record is the metadata for a scientific publication. Here, the information that is used for training purposes is the text in the title and abstract of the publication. The information that is not used for training typically consists of other metadata, for example, the authors, journal, or DOI of the publication.
- Reviewing is the decision-making process on the relevancy of records (“irrelevant” or “relevant”). It is interchangeable with Labeling, Screening, and Classifying.
- The human annotator is the person who labels records.
- Replacement term when the context is PRISMA-based reviewing.
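The "record" entry in the terminology above distinguishes information used for training (title and abstract) from information that is not (other metadata). A minimal sketch of that split follows. The `Record` class, its field names, and the example values are hypothetical and do not reflect ASReview LAB's internal data structures.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Record:
    title: str
    abstract: str
    # Other metadata (authors, journal, DOI): kept, but not used for training.
    metadata: dict = field(default_factory=dict)
    label: Optional[int] = None  # 1 = relevant, 0 = irrelevant, None = not yet labeled
    note: str = ""

    def training_text(self):
        """Text the model would learn from: title and abstract only."""
        return f"{self.title} {self.abstract}".strip()


record = Record(
    title="An example title",
    abstract="An example abstract.",
    metadata={"journal": "Example Journal", "doi": "10.0000/example"},
)
print(record.training_text())
```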
The ASReview LAB software doesn’t collect any information about the usage or user. Great, isn’t it!