Simulate with Python API
The ASReview Python API gives fine-grained control over ASReview, for example to customize models or implement different sampling strategies. This example demonstrates how to simulate a systematic review using the ASReview API and save the results in an ASReview project file.
[1]:
import asreview as asr
from synergy_dataset import Dataset
Here, we use a dataset from the SYNERGY collection, accessed via the synergy-dataset package.
[2]:
d = Dataset("Hall_2012").to_frame()
d.head()
[2]:
openalex_id | doi | title | abstract | label_included |
---|---|---|---|---|
https://openalex.org/W2131536587 | https://doi.org/10.1109/indcon.2010.5712716 | Computer vision based offset error computation... | The use of computer vision based approach has ... | 0 |
https://openalex.org/W2557025555 | https://doi.org/10.1109/induscon.2010.5740045 | Design and development of a software for fault... | This paper presents an on-line fault diagnosis... | 0 |
https://openalex.org/W2143148279 | https://doi.org/10.1109/tpwrd.2005.848672 | Analytical Approach to Internal Fault Simulati... | A new method for simulating faulted transforme... | 0 |
https://openalex.org/W2111816457 | https://doi.org/10.1109/icelmach.2008.4799852 | Nonlinear equivalent circuit model of a tracti... | The paper presents the development of an equiv... | 0 |
https://openalex.org/W3142547111 | https://doi.org/10.1109/ipdps.2006.1639408 | Fault tolerance with real-time Java | After having drawn up a state of the art on th... | 0 |
Next, we import the required models for the simulation.
[3]:
from asreview.models.balancers import Balanced
from asreview.models.classifiers import SVM
from asreview.models.feature_extractors import Tfidf
from asreview.models.queriers import Max, TopDown
from asreview.models.stoppers import IsFittable
We create a simulation workflow that begins with a top-down reading strategy until both a relevant and an irrelevant article are identified. Afterward, the simulation transitions to an active learning phase powered by an SVM classifier.
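The hand-off between the two phases can be pictured with a small, dependency-free sketch. This illustrates the idea only and is not ASReview's internal implementation: the helper names `is_fittable` and `top_down_until_fittable` and the toy labels are hypothetical. It assumes that "fittable" means at least one relevant (1) and one irrelevant (0) label have been seen.

```python
def is_fittable(labels):
    """Return True once both classes (0 and 1) appear among the labels seen so far."""
    return 0 in labels and 1 in labels


def top_down_until_fittable(true_labels):
    """Label records in dataset order until a classifier could be trained.

    Returns the number of records labeled during the top-down phase.
    """
    seen = []
    for label in true_labels:
        if is_fittable(seen):
            break  # both classes are known: hand over to active learning
        seen.append(label)
    return len(seen)


# Hypothetical toy labels: the first relevant record is the fourth one.
toy_labels = [0, 0, 0, 1, 0, 1, 0]
n_top_down = top_down_until_fittable(toy_labels)
print(n_top_down)  # 4
```

In this toy run, four records are read top-down before the second phase could start. The fewer relevant records a dataset contains near the top, the longer this first phase lasts.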
[4]:
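learners = [
    # Phase 1: read top-down until both a relevant and an irrelevant record are labeled.
    asr.ActiveLearningCycle(querier=TopDown(), stopper=IsFittable()),
    # Phase 2: active learning with an SVM classifier on TF-IDF features.
    asr.ActiveLearningCycle(
        querier=Max(),
        classifier=SVM(C=3),
        balancer=Balanced(ratio=5),
        feature_extractor=Tfidf(),
    ),
]

# Simulate the review using the known labels as the oracle.
sim = asr.Simulate(
    d,
    d["label_included"],
    learners,
)
sim.review()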
Relevant records found: 100%|██████████| 104/104 [02:06<00:00, 1.22s/it]
Records labeled : 65%|██████▍ | 5672/8793 [02:06<01:09, 44.75it/s]
Loss: 0.022
NDCG: 0.656
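The progress bars already allow a quick back-of-the-envelope reading of the run. The sketch below only redoes the arithmetic on the numbers printed above. It assumes, as the progress bars suggest, that the simulation stopped once all 104 relevant records were found. The variable names are illustrative.

```python
n_records = 8793   # total records in the dataset (from the progress bar)
n_labeled = 5672   # records labeled when the simulation stopped
n_relevant = 104   # relevant records in the dataset

share_screened = n_labeled / n_records     # fraction of the dataset that was read
records_skipped = n_records - n_labeled    # records that never had to be read
prevalence = n_relevant / n_records        # fraction of relevant records

print(f"Screened: {share_screened:.1%}")      # Screened: 64.5%
print(f"Not screened: {records_skipped}")     # Not screened: 3121
print(f"Prevalence: {prevalence:.2%}")        # Prevalence: 1.18%
```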
Finally, we inspect the simulation results. Each row is one labeling decision, recorded together with the model components that were active when the record was selected.
[5]:
sim._results
[5]:
(index) | record_id | label | classifier | querier | balancer | feature_extractor | training_set | time | note | tags | user_id |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | 0 | None | top_down | None | None | 0 | 1.745846e+09 | None | None | None |
1 | 1 | 0 | None | top_down | None | None | 1 | 1.745846e+09 | None | None | None |
2 | 2 | 0 | None | top_down | None | None | 2 | 1.745846e+09 | None | None | None |
3 | 3 | 0 | None | top_down | None | None | 3 | 1.745846e+09 | None | None | None |
4 | 4 | 0 | None | top_down | None | None | 4 | 1.745846e+09 | None | None | None |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
5667 | 8389 | 0 | svm | max | balanced | tfidf | 5667 | 1.745846e+09 | None | None | None |
5668 | 1739 | 0 | svm | max | balanced | tfidf | 5668 | 1.745846e+09 | None | None | None |
5669 | 4807 | 0 | svm | max | balanced | tfidf | 5669 | 1.745846e+09 | None | None | None |
5670 | 5160 | 0 | svm | max | balanced | tfidf | 5670 | 1.745846e+09 | None | None | None |
5671 | 5647 | 1 | svm | max | balanced | tfidf | 5671 | 1.745846e+09 | None | None | None |
5672 rows × 11 columns
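One common way to read such a table is as a recall curve: after each labeling decision, what fraction of all relevant records has been found so far? The following is a minimal sketch using only the standard library. The function `recall_curve` and the toy label sequence are hypothetical and not part of ASReview.

```python
from itertools import accumulate


def recall_curve(labels):
    """Return the cumulative recall after each labeling decision.

    `labels` is the sequence of 0/1 labels in the order the records were screened.
    """
    total = sum(labels)
    if total == 0:
        # No relevant records at all: recall stays at zero.
        return [0.0] * len(labels)
    return [found / total for found in accumulate(labels)]


# Hypothetical screening order with two relevant records.
toy = [0, 1, 0, 0, 1, 0]
curve = recall_curve(toy)
print(curve)  # [0.0, 0.5, 0.5, 0.5, 1.0, 1.0]
```

With the results above, the input could be the label column in screening order, for example `sim._results["label"].tolist()`, assuming `_results` is a pandas DataFrame as its printed form suggests.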