asreview.Dataset

class asreview.Dataset(df=None, column_spec=None)[source]

Dataset object that holds the texts, labels, DOIs, etc. of a dataset.

Parameters:
  • df (pandas.DataFrame) – Dataframe containing the data for the ASReview data object.

  • column_spec (dict) – Specification of which column corresponds to which standard specification. Key is the standard specification, value is the column it is actually in. Default: None.
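
The direction of the column_spec mapping (standard name as key, actual column name as value) can be illustrated with a minimal sketch in plain Python. The helper get_standard_column, the column names, and the sample rows are hypothetical and exist only for illustration; this is not ASReview's implementation.

```python
# Hypothetical illustration of the column_spec mapping direction.
# Keys are standard names; values are the column names found in the data.
column_spec = {
    "title": "Article Title",
    "abstract": "Summary",
}

# Sample records using non-standard column names (made up for this sketch).
rows = [
    {"Article Title": "Active learning for screening", "Summary": "First abstract."},
    {"Article Title": "A second record", "Summary": "Second abstract."},
]


def get_standard_column(rows, column_spec, standard_name):
    """Return the values stored under a standard name, resolved via column_spec."""
    actual_column = column_spec[standard_name]  # value = actual column name
    return [row[actual_column] for row in rows]


titles = get_standard_column(rows, column_spec, "title")
abstracts = get_standard_column(rows, column_spec, "abstract")
```

Here the lookup for "title" reads from the "Article Title" column, which is the key-to-value direction described for column_spec above.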

Variables:
  • record_ids (numpy.ndarray) – Returns an array representing the data in the index.

  • texts (numpy.ndarray) – Returns an array with either headings, bodies, or both.

  • headings (numpy.ndarray) – Returns an array with dataset headings.

  • title (numpy.ndarray) – Identical to headings.

  • bodies (numpy.ndarray) – Returns an array with dataset bodies.

  • abstract (numpy.ndarray) – Identical to bodies.

  • notes (numpy.ndarray) – Returns an array with dataset notes.

  • keywords (numpy.ndarray) – Returns an array with dataset keywords.

  • authors (numpy.ndarray) – Returns an array with dataset authors.

  • doi (numpy.ndarray) – Returns an array with dataset DOIs.

  • included (numpy.ndarray) – Returns an array with document inclusion markers.

  • final_included (numpy.ndarray) – Pending deprecation! Returns an array with document inclusion markers.

  • labels (numpy.ndarray) – Identical to included.

Attributes

abstract

authors

bodies

doi

headings

included

keywords

labels

notes

record_ids

texts

title

url

Methods

drop_duplicates([pid, inplace, reset_index])

Drop duplicate records.

duplicated([pid])

Return boolean Series denoting duplicate rows.

get(name)

Get column with name.

is_prior()

Get the labels that are marked as 'prior'.

record(i)

Create a record from an index.

to_dataframe([labels, ranking, keep_old_labels])

Create a new dataframe with updated labels (and order).

to_file(fp[, labels, ranking, writer, ...])

Export data object to file.
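
The idea behind duplicated and drop_duplicates (a boolean marker per record, then removal of the marked records) can be sketched in plain Python. This sketch assumes that the first occurrence of an identifier is kept and later ones are flagged, which is the default behaviour of the pandas methods of the same name; whether ASReview follows that rule, and how it treats the pid argument, is not stated here. The function duplicated_flags and the sample DOIs are hypothetical.

```python
def duplicated_flags(values):
    """Return one boolean per value: True if the value was already seen."""
    seen = set()
    flags = []
    for value in values:
        flags.append(value in seen)  # first occurrence is never flagged
        seen.add(value)
    return flags


# Made-up identifiers; the third repeats the first.
dois = ["10.1000/a", "10.1000/b", "10.1000/a"]

flags = duplicated_flags(dois)

# Dropping duplicates keeps only the records that were not flagged.
unique_dois = [doi for doi, is_dup in zip(dois, flags) if not is_dup]
```

Computing the flags separately from the removal step mirrors the split between the two methods: one reports which records are duplicates, the other acts on that information.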