asreview.data.statistics.n_duplicates
- asreview.data.statistics.n_duplicates(data, pid='doi')
Number of duplicates.
Duplicate detection can be a very challenging task. Multiple algorithms can be used, and their results can vary.
- Parameters:
data (asreview.Dataset) – A Dataset object with the records.
pid (string) – Which persistent identifier (PID) to use for deduplication. Default is ‘doi’.
- Returns:
int – Number of duplicates
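To illustrate what counting duplicates by a persistent identifier can look like, here is a minimal, library-independent sketch. It is not ASReview's implementation: the helper `count_duplicates_by_pid`, the list-of-dicts record layout, and the case-insensitive comparison of identifiers are all assumptions made for this example. Records without an identifier are simply skipped here, whereas a real deduplication routine may fall back to comparing other fields.

```python
def count_duplicates_by_pid(records, pid="doi"):
    """Count records whose `pid` value repeats that of an earlier record.

    Hypothetical helper for illustration only; not part of ASReview.
    """
    seen = set()
    duplicates = 0
    for record in records:
        value = record.get(pid)
        if not value:
            # Assumption: records with a missing or empty identifier
            # are never counted as duplicates in this sketch.
            continue
        # Assumption: identifiers are compared case-insensitively,
        # ignoring surrounding whitespace.
        key = value.strip().lower()
        if not key:
            continue
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return duplicates


# Made-up example records.
records = [
    {"title": "A", "doi": "10.1000/abc"},
    {"title": "A (copy)", "doi": "10.1000/ABC"},
    {"title": "B", "doi": None},
    {"title": "C", "doi": None},
    {"title": "D", "doi": "10.1000/xyz"},
]

n = count_duplicates_by_pid(records, pid="doi")
print(n)
```

Only the second record counts as a duplicate in this sketch, because its DOI matches the first record's once case is ignored. The two records without a DOI are not treated as duplicates of each other.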