asreview.Dataset.duplicated

Dataset.duplicated(pid='doi')

Return boolean Series denoting duplicate rows.

Identify duplicates based on titles and abstracts and, if available, on a persistent identifier (PID) such as the Digital Object Identifier (DOI).

Parameters:

pid (string) – Which persistent identifier to use for deduplication. Default is ‘doi’.

Returns:

pandas.Series – Boolean Series marking each duplicated row.
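
Example:

The following is a minimal sketch of the idea described above, written in plain pandas. It is not ASReview's implementation: the helper name `duplicated_sketch`, the column names `title` and `abstract`, and the text normalisation (lowercasing and dropping non-alphanumeric characters) are all assumptions made for illustration. It flags a row as a duplicate if its PID matches an earlier row's PID, or if its normalised title and abstract match an earlier row's.

```python
import pandas as pd


def duplicated_sketch(df, pid="doi"):
    """Hypothetical helper, NOT the library's code: flag duplicate rows."""
    # Assumed normalisation: join title and abstract, lowercase,
    # and keep only letters and digits.
    text = (
        (df["title"].fillna("") + " " + df["abstract"].fillna(""))
        .str.lower()
        .str.replace(r"[^a-z0-9]", "", regex=True)
    )
    # Rows with empty text are never treated as duplicates of each other.
    dup_text = text.duplicated() & text.ne("")

    if pid in df.columns:
        # Compare identifiers case-insensitively; missing PIDs are ignored.
        ids = df[pid].fillna("").astype(str).str.strip().str.lower()
        dup_pid = ids.duplicated() & ids.ne("")
        return dup_pid | dup_text
    return dup_text


df = pd.DataFrame(
    {
        "title": ["A study", "a study!", "Other", "Unique"],
        "abstract": ["Text one", "text one", "Different", "Nothing"],
        "doi": ["10.1/a", None, "10.1/A", None],
    }
)

mask = duplicated_sketch(df)  # boolean Series, one value per row
deduplicated = df[~mask]      # keep only the first occurrence of each record
```

In this sketch the second row is flagged because its title and abstract match the first row after normalisation, and the third row is flagged because its DOI matches the first row's DOI once case is ignored. The boolean mask can be used to filter the records, as in the last line.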