asreview.Project

class asreview.Project(project_path, project_id=None)

Bases: object

Project class for ASReview project files.

This class represents the complete data file for a review project. This data is contained in a single directory with the following files and subdirectories:

  • project.json: A JSON file containing the configuration and metadata of the project. Its structure is described in schema.py.

  • data/: A directory containing the input data file exactly as provided by the user. When exporting, this input data is merged with the results of the review to produce the export file.

  • feature_matrices/: A directory containing all the feature matrices that are generated during the review.

  • results.db: An SQLite database containing all data generated by ASReview: the data parsed from the input file, the labeled records, the last model ranking, etc.

See asreview/data for information on the parsing of the input. See asreview/state for information on the model and the labeling decisions.
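To make the directory layout concrete, the following is a minimal standard-library sketch that builds an empty directory with the same file and directory names. It does not use asreview and is not how Project.create works internally: the helper make_layout, the placeholder JSON content, and the placeholder table are assumptions for illustration only. The path constants are copied from the class attributes listed on this page.

```python
import json
import sqlite3
import tempfile
from pathlib import Path

# Names copied from the Project class attributes documented on this page.
PATH_CONFIG = "project.json"
PATH_DATA_DIR = "data"
PATH_FEATURE_MATRICES = "feature_matrices"
PATH_DB = "results.db"


def make_layout(project_path, project_id):
    """Hypothetical helper: create an empty directory with the documented layout."""
    root = Path(project_path)
    root.mkdir(parents=True, exist_ok=True)
    (root / PATH_DATA_DIR).mkdir(exist_ok=True)
    (root / PATH_FEATURE_MATRICES).mkdir(exist_ok=True)
    # Placeholder metadata only; the real structure is defined in schema.py.
    (root / PATH_CONFIG).write_text(json.dumps({"id": project_id}))
    # Placeholder table (an assumption) so the SQLite file is written to disk.
    conn = sqlite3.connect(root / PATH_DB)
    conn.execute("CREATE TABLE IF NOT EXISTS placeholder (id INTEGER)")
    conn.commit()
    conn.close()
    return root


with tempfile.TemporaryDirectory() as tmp:
    root = make_layout(Path(tmp) / "demo", "demo")
    # Sorted names of everything at the top level of the project directory.
    layout = sorted(p.name for p in root.iterdir())
```

After this runs, layout holds the four top-level entries described in the list above.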

Methods

__init__(project_path[, project_id])

add_dataset(fp[, dataset_id, file_writer])

Add a dataset to the project file.

add_feature_matrix(feature_matrix, name)

Add feature matrix to project file.

add_review([cycle, reviewer, status])

Add new review metadata.

close()

Close the project and release all resources.

create(project_path[, project_id, ...])

Initialize the necessary files specific to the web app.

export(export_fp)

get_feature_matrix(name)

Get the feature matrix from the project file.

get_input_data_reader()

get_model_config()

Get the current model configuration of the review.

get_review_error()

label_priors()

Label prior knowledge from a partially labeled dataset.

load(asreview_file, project_path[, ...])

read_input_data(*args, **kwargs)

remove_dataset()

Remove dataset from project.

remove_review_error()

set_review_error(err)

update_config(**kwargs)

Update project info.

update_review([status, model_name, model])

Update review metadata.

Attributes

MODE_SIMULATE = 'simulate'
PATH_CONFIG = 'project.json'
PATH_CONFIG_LOCK = 'project.json.lock'
PATH_DATA_DIR = 'data'
PATH_DB = 'results.db'
PATH_ERROR = 'error.json'
PATH_FEATURE_MATRICES = 'feature_matrices'
VERSION = 3

add_dataset(fp, dataset_id=None, file_writer=None)

Add a dataset to the project file.

Parameters:

fp (str, Path) – Filepath to the dataset. It will be copied to the correct location in the project file.

add_feature_matrix(feature_matrix, name)

Add feature matrix to project file.

Parameters:
  • feature_matrix (numpy.ndarray, scipy.sparse.csr.csr_matrix) – The feature matrix to add to the project file.

  • name (str) – Name of the feature extractor.

add_review(cycle=None, reviewer=None, status='setup')

Add new review metadata.

Parameters:
  • cycle – An active learning cycle object to add to the review. This object is used to store the configuration of the active learning cycle to file.

  • reviewer (object) – A reviewer object with a to_sql() method.

  • status (str) – The status of the review. One of ‘setup’, ‘running’, ‘finished’.

close()

Close the project and release all resources.

Closes the database connection if it was opened. Safe to call multiple times.
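The "safe to call multiple times" behavior can be illustrated with a small standalone pattern. This is a sketch under assumptions, not the asreview implementation: the class LazyDb and its private attribute _conn are hypothetical, and an in-memory SQLite database stands in for results.db.

```python
import sqlite3


class LazyDb:
    """Hypothetical stand-in for an object that opens its database on demand."""

    def __init__(self, path=":memory:"):
        self._path = path
        self._conn = None  # nothing is opened until .db is first accessed

    @property
    def db(self):
        # Open the connection lazily on first access.
        if self._conn is None:
            self._conn = sqlite3.connect(self._path)
        return self._conn

    def close(self):
        # Close only if a connection exists, then reset so repeat calls are no-ops.
        if self._conn is not None:
            self._conn.close()
            self._conn = None


holder = LazyDb()
holder.close()  # never opened: nothing to do
value = holder.db.execute("SELECT 1").fetchone()[0]
holder.close()
holder.close()  # second call is harmless
```

Resetting the stored connection to None after closing is what makes repeated calls harmless in this sketch.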

property config

classmethod create(project_path, project_id=None, project_mode='oracle', project_name=None, project_tags=None)

Initialize the necessary files specific to the web app.

property db

export(export_fp)

property feature_matrices

get_feature_matrix(name)

Get the feature matrix from the project file.

Parameters:

name (str) – Name of the feature extractor for which to get the cached matrix.

Returns:

numpy.ndarray, scipy.sparse – (Sparse) feature matrix.

get_input_data_reader()

get_model_config()

Get the current model configuration of the review.

Returns:

dict | None – Dictionary containing the model configuration. Returns None if there is no review yet in the project.

get_review_error()

property input_data_fp

label_priors()

Label prior knowledge from a partially labeled dataset.

If the input dataset is partially labeled (some records have an included value of 0 or 1 while others are unlabeled), the labeled records are stored as prior knowledge in the results table.

Fully labeled or fully unlabeled datasets are skipped.
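The rule described above (store priors only for partially labeled data) can be sketched in plain Python. This illustrates the decision logic only and is not the library's code: the helper find_priors is hypothetical, and representing an unlabeled record as None is an assumption about the encoding.

```python
def find_priors(included):
    """Hypothetical helper: return (index, label) pairs to store as priors.

    `included` holds 0 or 1 for labeled records and None for unlabeled
    ones (the None encoding is an assumption for this sketch).
    """
    labeled = [(i, v) for i, v in enumerate(included) if v in (0, 1)]
    # Fully unlabeled or fully labeled datasets are skipped.
    if not labeled or len(labeled) == len(included):
        return []
    return labeled


partial = find_priors([1, None, 0, None])  # mixed: labeled records become priors
full = find_priors([1, 0, 1])              # fully labeled: skipped
empty = find_priors([None, None])          # fully unlabeled: skipped
```

Only the mixed case yields priors; the other two inputs return an empty list.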

classmethod load(asreview_file, project_path, safe_import=False, reset_model_if_not_found=False)

read_input_data(*args, **kwargs)

remove_dataset()

Remove dataset from project.

remove_review_error()

property review

set_review_error(err)

update_config(**kwargs)

Update project info.

update_review(status=None, model_name=None, model=None)

Update review metadata.