asreview.Dataset

class asreview.Dataset(df=None, column_spec=None)

Data object that holds a dataset's texts, labels, DOIs, etc.

Parameters:

df (pandas.DataFrame) – Dataframe containing the data for the ASReview data object.
column_spec (dict) – Specification of which column corresponds to which standard specification. Each key is the standard specification; each value is the column the data is actually in. Default: None.
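
The key/value direction of `column_spec` is easy to get backwards, so here is a minimal sketch of the mapping using plain dictionaries. The helper `resolve_column` and the column names are hypothetical and exist only for this illustration; this is not ASReview's own lookup code.

```python
# Hypothetical illustration of a column specification; not ASReview's own code.
def resolve_column(standard_name, column_spec, available_columns):
    """Return the actual column for a standard name, or None if it is absent."""
    # Fall back to the standard name itself when the spec has no entry for it.
    actual = column_spec.get(standard_name, standard_name)
    return actual if actual in available_columns else None


# Column names as they might appear in a user's file (made up for this example).
available_columns = ["Article Title", "Summary", "doi"]

# Key: standard specification. Value: the column the data is actually in.
column_spec = {"title": "Article Title", "abstract": "Summary"}

title_col = resolve_column("title", column_spec, available_columns)
doi_col = resolve_column("doi", column_spec, available_columns)
keywords_col = resolve_column("keywords", column_spec, available_columns)
```

Here `title_col` resolves through the specification, `doi_col` falls back to the identically named column, and `keywords_col` is `None` because no such column exists.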

Variables:

record_ids (numpy.ndarray) – Returns an array with the record identifiers, i.e. the values of the dataframe index.
texts (numpy.ndarray) – Returns an array with either headings, bodies, or both.
headings (numpy.ndarray) – Returns an array with dataset headings.
title (numpy.ndarray) – Identical to headings.
bodies (numpy.ndarray) – Returns an array with dataset bodies.
abstract (numpy.ndarray) – Identical to bodies.
notes (numpy.ndarray) – Returns an array with dataset notes.
keywords (numpy.ndarray) – Returns an array with dataset keywords.
authors (numpy.ndarray) – Returns an array with dataset authors.
doi (numpy.ndarray) – Returns an array with dataset DOIs.
included (numpy.ndarray) – Returns an array with document inclusion markers.
final_included (numpy.ndarray) – Pending deprecation! Returns an array with document inclusion markers.
labels (numpy.ndarray) – Identical to included.
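
The description of `texts` ("either headings, bodies, or both") can be sketched as follows. The function `combine_texts` is hypothetical, and joining heading and body with a single space is an assumption made for this example; the page does not state how the library combines them.

```python
# Hypothetical sketch of the "headings, bodies, or both" rule; the single-space
# join is an assumption, not behavior documented on this page.
def combine_texts(headings, bodies):
    """Return one text per record from headings, bodies, or both."""
    if headings is None and bodies is None:
        raise ValueError("Need at least headings or bodies.")
    if headings is None:
        return list(bodies)
    if bodies is None:
        return list(headings)
    # Both are available: join them per record, trimming stray whitespace.
    return [f"{h} {b}".strip() for h, b in zip(headings, bodies)]


both = combine_texts(["Title A", "Title B"], ["Abstract A.", ""])
only_bodies = combine_texts(None, ["Abstract C."])
```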

Methods

`drop_duplicates`([pid, inplace, reset_index])	Drop duplicate records.
`duplicated`([pid])	Return boolean Series denoting duplicate rows.
`get`(name)	Get column with name.
`is_prior`()	Get the labels that are marked as 'prior'.
`record`(i)	Create a record from an index.
`to_dataframe`([labels, ranking, keep_old_labels])	Create new dataframe with updated label (order).
`to_file`(fp[, labels, ranking, writer, ...])	Export data object to file.
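
To make the idea behind `duplicated` and `drop_duplicates` concrete, the sketch below flags repeated records in a list of plain dictionaries. It assumes that `pid` names a persistent-identifier field such as a DOI and that records without one are compared on normalized title text; both are assumptions made for this illustration. `mark_duplicates` is a hypothetical helper, the DOIs are made-up placeholders, and none of this is the library's implementation.

```python
# Hypothetical duplicate detection on plain dicts; not ASReview's implementation.
def mark_duplicates(records, pid="doi"):
    """Return one boolean per record: True if an earlier record matches it."""
    seen = set()
    flags = []
    for record in records:
        identifier = record.get(pid)
        if identifier:
            # Assumption: identifiers are compared case-insensitively.
            key = ("pid", identifier.strip().lower())
        else:
            # Assumption: fall back to whitespace- and case-normalized title.
            text = " ".join(record.get("title", "").lower().split())
            if not text:
                # Nothing to compare on, so never flag this record.
                flags.append(False)
                continue
            key = ("text", text)
        flags.append(key in seen)
        seen.add(key)
    return flags


# Example records with placeholder DOIs (made up for this example).
records = [
    {"title": "Active learning for screening", "doi": "10.1000/abc"},
    {"title": "Active Learning for Screening", "doi": "10.1000/ABC"},
    {"title": "A different study", "doi": None},
    {"title": "a  different study", "doi": None},
]

flags = mark_duplicates(records)
# Dropping duplicates then amounts to keeping the unflagged records.
kept = [r for r, is_dup in zip(records, flags) if not is_dup]
```

The first occurrence of each record is kept and later matches are flagged, which mirrors the usual "keep first" convention for duplicate handling.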