asreview.DataStore#

class asreview.DataStore(fp, record_cls=<class 'asreview.data.record.Record'>)[source]#

Bases: object

Data store to hold user input data.

Data input always happens via the record class. To add data to the data store, you first need to clean it, make sure it has the correct columns, and make sure it passes the validations defined in the record class.

Data can be read from the store by row or by column. If you read rows, you get record objects. If you read columns, you get pandas objects: a pandas Series for a single column and a pandas DataFrame for multiple columns.

DataStore uses an SQLite database as its backend and the SQLAlchemy ORM to interact with the database.
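The following is a minimal sketch of the pattern described above, not ASReview's implementation. It uses the standard-library sqlite3 module instead of SQLAlchemy, and the names SketchRecord, SketchStore, get_row and get_column are hypothetical. It shows input passing through a validating record class, row reads returning record objects, and column reads returning plain values (a list here, where DataStore returns pandas objects).

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class SketchRecord:
    """Hypothetical record class; not asreview.data.record.Record."""

    record_id: int
    title: str

    def __post_init__(self):
        # Validation lives in the record class, so invalid data
        # is rejected before it can reach the store.
        if not isinstance(self.title, str) or not self.title.strip():
            raise ValueError("title must be a non-empty string")


class SketchStore:
    """Hypothetical SQLite-backed store that only accepts SketchRecord input."""

    def __init__(self, fp):
        self.conn = sqlite3.connect(fp)

    def create_tables(self):
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS record "
            "(record_id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
        )

    def add_records(self, records):
        self.conn.executemany(
            "INSERT INTO record (record_id, title) VALUES (?, ?)",
            [(r.record_id, r.title) for r in records],
        )
        self.conn.commit()

    def get_row(self, record_id):
        # Reading a row returns a record object, or None if the id is unknown.
        row = self.conn.execute(
            "SELECT record_id, title FROM record WHERE record_id = ?",
            (record_id,),
        ).fetchone()
        return None if row is None else SketchRecord(*row)

    def get_column(self, name):
        # Reading a column returns plain values; the column name is checked
        # against a fixed allow-list because it is interpolated into the SQL.
        if name not in ("record_id", "title"):
            raise KeyError(name)
        query = f"SELECT {name} FROM record ORDER BY record_id"
        return [value for (value,) in self.conn.execute(query)]


store = SketchStore(":memory:")
store.create_tables()
store.add_records([SketchRecord(1, "First paper"), SketchRecord(2, "Second paper")])
first = store.get_row(1)
titles = store.get_column("title")
```

Keeping validation in the record class means the store itself never has to inspect raw user input.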

Methods

__init__(fp[, record_cls])

Initialize the data store.

add_records(records)

Add records to the data store.

create_tables()

Initialize the tables containing the data.

delete_record(record_id)

get_df()

Get all data from the data store as a pandas DataFrame.

get_records(record_id)

Get the records with the given record identifiers.

is_empty()

Attributes

columns

pandas_dtype_mapping

Mapping {column name: pandas data type}

user_version

Version number of the state.

add_records(records)[source]#

Add records to the data store.

Parameters:

records (list[self.record_cls]) – List of records to add to the store.

property columns#

create_tables()[source]#

Initialize the tables containing the data.

If you are creating a new data store, you will need to call this method before adding data to the data store.
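To illustrate why the tables must exist first, here is a small sketch using the standard-library sqlite3 module directly. The table name record and its columns are assumptions made for the example, and the exact error that DataStore raises in this situation may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Inserting into a fresh database fails because the table does not exist yet.
try:
    conn.execute("INSERT INTO record (title) VALUES (?)", ("A paper",))
    insert_failed = False
except sqlite3.OperationalError:
    insert_failed = True

# After creating the table, the same insert succeeds.
conn.execute("CREATE TABLE record (record_id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO record (title) VALUES (?)", ("A paper",))
conn.commit()
n_rows = conn.execute("SELECT COUNT(*) FROM record").fetchone()[0]
```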

delete_record(record_id)[source]#

get_df()[source]#

Get all data from the data store as a pandas DataFrame.

Returns:

pd.DataFrame

get_records(record_id)[source]#

Get the records with the given record identifiers.

Parameters:

record_id (int | list[int]) – Record identifier or list of record identifiers.

Returns:

asreview.data.record.Record | list[asreview.data.record.Record] | None
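One possible reading of this return type is sketched below with a plain dictionary standing in for the database. The function get_records_sketch is hypothetical. Returning None for an unknown single identifier, and skipping unknown identifiers in a list, are assumptions rather than documented behavior.

```python
def get_records_sketch(table, record_id):
    """Hypothetical lookup mirroring the documented signature.

    table: dict mapping record identifiers to record objects.
    """
    if isinstance(record_id, int):
        # Single identifier: one record, or None if it is unknown (assumed).
        return table.get(record_id)
    # List of identifiers: a list of records, skipping unknown ids (assumed).
    return [table[i] for i in record_id if i in table]


table = {1: "record one", 2: "record two"}
single = get_records_sketch(table, 1)
several = get_records_sketch(table, [1, 2])
missing = get_records_sketch(table, 99)
```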

is_empty()[source]#

property pandas_dtype_mapping#

Mapping {column name: pandas data type}

property user_version#

Version number of the state.
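SQLite itself provides a per-database integer for this kind of versioning, PRAGMA user_version. Whether this property is backed by that pragma is an assumption here. The sketch below only shows how the pragma behaves when used through the standard-library sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A new SQLite database starts with user_version 0.
initial_version = conn.execute("PRAGMA user_version").fetchone()[0]

# PRAGMA statements do not accept bound parameters, so the value is a literal.
conn.execute("PRAGMA user_version = 3")
stored_version = conn.execute("PRAGMA user_version").fetchone()[0]
```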