asreview.Project#

class asreview.Project(project_path, project_id=None)[source]#

Bases: object

Project class for ASReview project files.

This class represents the complete data file for a review project. This data is contained in a single directory with the following files and subdirectories:

- project.json: A JSON file containing the configuration and metadata of the project. Its structure is described in schema.py.
- data/: A directory containing the input data file exactly as provided by the user. When exporting, this input data is merged with the results of the review to produce the export file.
- feature_matrices/: A directory containing all the feature matrices that are generated during the review.
- results.db: An SQLite database containing all data generated by ASReview: the data parsed from the input file, the labeled records, the last model ranking, etc.

See asreview/data for information on the parsing of the input. See asreview/state for information on the model and the labeling decisions.
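The directory layout described here can be inspected with nothing but the standard library. The following is a minimal sketch, not part of asreview: the helper names `describe_project_dir` and `make_dummy_project_dir` are hypothetical, and the file names are copied from the `PATH_*` attributes documented on this page.

```python
import json
import tempfile
from pathlib import Path

# Values copied from the PATH_* class attributes documented on this page.
PATH_CONFIG = "project.json"
PATH_DATA_DIR = "data"
PATH_FEATURE_MATRICES = "feature_matrices"
PATH_DB = "results.db"


def describe_project_dir(project_path):
    """Report which documented entries exist under project_path.

    Hypothetical helper for illustration; it is not an asreview function.
    """
    root = Path(project_path)
    return {
        PATH_CONFIG: (root / PATH_CONFIG).is_file(),
        PATH_DATA_DIR: (root / PATH_DATA_DIR).is_dir(),
        PATH_FEATURE_MATRICES: (root / PATH_FEATURE_MATRICES).is_dir(),
        PATH_DB: (root / PATH_DB).is_file(),
    }


def make_dummy_project_dir(project_path):
    """Create a stand-in layout with placeholder contents only."""
    root = Path(project_path)
    (root / PATH_DATA_DIR).mkdir(parents=True, exist_ok=True)
    (root / PATH_FEATURE_MATRICES).mkdir(exist_ok=True)
    (root / PATH_CONFIG).write_text(json.dumps({"id": "demo"}))
    (root / PATH_DB).touch()  # empty placeholder, not a real results database


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        make_dummy_project_dir(tmp)
        print(describe_project_dir(tmp))
```

A check like this only confirms that the expected entries are present; it says nothing about whether project.json or results.db hold valid content.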

Methods

`__init__`(project_path[, project_id])
`add_dataset`(fp[, dataset_id, file_writer])	Add a dataset to the project file.
`add_feature_matrix`(feature_matrix, name)	Add feature matrix to project file.
`add_review`([cycle, reviewer, status])	Add new review metadata.
`close`()	Close the project and release all resources.
`create`(project_path[, project_id, ...])	Initialize the necessary files specific to the web app.
`export`(export_fp)
`get_feature_matrix`(name)	Get the feature matrix from the project file.
`get_input_data_reader`()
`get_model_config`()	Get the current model configuration of the review.
`get_review_error`()
`label_priors`()	Label prior knowledge from a partially labeled dataset.
`load`(asreview_file, project_path[, ...])
`read_input_data`(*args, **kwargs)
`remove_dataset`()	Remove dataset from project.
`remove_review_error`()
`set_review_error`(err)
`update_config`(**kwargs)	Update project info
`update_review`([status, model_name, model])	Update review metadata.

Attributes

`MODE_SIMULATE`
`PATH_CONFIG`
`PATH_CONFIG_LOCK`
`PATH_DATA_DIR`
`PATH_DB`
`PATH_ERROR`
`PATH_FEATURE_MATRICES`
`VERSION`
`config`
`db`
`feature_matrices`
`input_data_fp`
`review`

MODE_SIMULATE = 'simulate'#

PATH_CONFIG = 'project.json'#

PATH_CONFIG_LOCK = 'project.json.lock'#

PATH_DATA_DIR = 'data'#

PATH_DB = 'results.db'#

PATH_ERROR = 'error.json'#

PATH_FEATURE_MATRICES = 'feature_matrices'#

VERSION = 3#

add_dataset(fp, dataset_id=None, file_writer=None)[source]#

Add a dataset to the project file.

Parameters: fp (str, Path) – Filepath to the dataset. It will be copied to the correct location in the project file.
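As a rough illustration of the copy step, the sketch below places a file in a `data/` subdirectory of a project directory. It assumes the "correct location" is the directory named by `PATH_DATA_DIR`. The function `copy_dataset_into_project` is hypothetical, and the real `add_dataset` may do more, such as updating the project configuration.

```python
import shutil
from pathlib import Path

PATH_DATA_DIR = "data"  # value of Project.PATH_DATA_DIR on this page


def copy_dataset_into_project(fp, project_path):
    """Copy the file at fp into <project_path>/data and return the new path.

    Hypothetical stand-in for illustration; not the asreview implementation.
    """
    data_dir = Path(project_path) / PATH_DATA_DIR
    data_dir.mkdir(parents=True, exist_ok=True)
    target = data_dir / Path(fp).name
    shutil.copy2(fp, target)  # copies contents and file metadata
    return target
```

The original file is left untouched, which matches the description of data/ as holding the input exactly as provided by the user.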

add_feature_matrix(feature_matrix, name)[source]#

Add feature matrix to project file.

Parameters:

feature_matrix (numpy.ndarray, scipy.sparse.csr.csr_matrix) – The feature matrix to add to the project file.
name (str) – Name of the feature extractor.

add_review(cycle=None, reviewer=None, status='setup')[source]#

Add new review metadata.

Parameters:

cycle – An active learning cycle object to add to the review. This object is used to store the configuration of the active learning cycle to file.
reviewer (object) – A reviewer object with a to_sql() method.
status (str) – The status of the review. One of ‘setup’, ‘running’, ‘finished’.

close()[source]#

Close the project and release all resources.

Closes the database connection if it was opened. Safe to call multiple times.
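The "safe to call multiple times" behaviour is a common idempotent-close pattern. The sketch below shows one way to get it with `sqlite3`; the class `ClosableStore` is hypothetical and is not how asreview necessarily implements `close()`.

```python
import sqlite3


class ClosableStore:
    """Hypothetical resource holder; not the asreview implementation."""

    def __init__(self, db_path=":memory:"):
        self._db_path = db_path
        self._conn = None  # opened lazily on first access

    @property
    def conn(self):
        if self._conn is None:
            self._conn = sqlite3.connect(self._db_path)
        return self._conn

    def close(self):
        # Close only if a connection was actually opened, then forget it
        # so that any further call is a no-op.
        if self._conn is not None:
            self._conn.close()
            self._conn = None


if __name__ == "__main__":
    store = ClosableStore()
    store.close()                    # nothing opened yet: no-op
    store.conn.execute("SELECT 1")   # opens the connection
    store.close()
    store.close()                    # second call is harmless
```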

property config#

classmethod create(project_path, project_id=None, project_mode='oracle', project_name=None, project_tags=None)[source]#

Initialize the necessary files specific to the web app.

property db#

export(export_fp)[source]#

property feature_matrices#

get_feature_matrix(name)[source]#

Get the feature matrix from the project file.

Parameters: name (str) – Name of the feature extractor for which to get the cached matrix.
Returns: numpy.ndarray, scipy.sparse – (Sparse) feature matrix.

get_input_data_reader()[source]#

get_model_config()[source]#

Get the current model configuration of the review.

Returns: dict | None – Dictionary containing the model configuration. Returns None if there is no review yet in the project.

get_review_error()[source]#

property input_data_fp#

label_priors()[source]#

Label prior knowledge from a partially labeled dataset.

If the input dataset is partially labeled (some records have an included value of 0 or 1 while others are unlabeled), the labeled records are stored as prior knowledge in the results table.

Fully labeled or fully unlabeled datasets are skipped.
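The selection rule described above can be sketched in a few lines. This is an illustration only: the function `select_priors` is hypothetical, and representing unlabeled records as `None` is an assumption, not necessarily how asreview encodes them.

```python
def select_priors(labels):
    """Return {record index: label} for a partially labeled dataset.

    labels: sequence of 0, 1, or None (None = unlabeled; an assumed encoding).
    Fully labeled and fully unlabeled inputs yield an empty dict.
    """
    labeled = {i: v for i, v in enumerate(labels) if v is not None}
    if not labeled or len(labeled) == len(labels):
        return {}  # nothing to treat as prior knowledge
    return labeled


if __name__ == "__main__":
    print(select_priors([1, None, 0, None]))  # partially labeled
    print(select_priors([0, 1, 1]))           # fully labeled: skipped
    print(select_priors([None, None]))        # fully unlabeled: skipped
```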

classmethod load(asreview_file, project_path, safe_import=False, reset_model_if_not_found=False)[source]#

read_input_data(*args, **kwargs)[source]#

remove_dataset()[source]#

Remove dataset from project.

remove_review_error()[source]#

property review#

set_review_error(err)[source]#

update_config(**kwargs)[source]#

Update project info.

update_review(status=None, model_name=None, model=None)[source]#

Update review metadata.