asreview.models.balance.TripleBalance

class asreview.models.balance.TripleBalance(a=2.155, alpha=0.94, b=0.789, beta=1.0, c=0.835, gamma=2.0, shuffle=True, random_state=None)

Triple balance strategy (triple).

Broken. Only for internal and experimental use.

This strategy divides the training data into three sets: included papers, excluded papers found with random sampling, and papers found with max sampling. The sets are balanced according to formulas that depend on the percentage of papers read in the dataset, the number of papers found with random and max sampling, and similar quantities. It works best for stochastic training algorithms. With the corresponding parameter values, it reduces to either full sampling or undersampling.
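As a minimal sketch of the three-way split described above, the following partitions training indices by label and by query source. The helper `split_three_ways` and its `max_idx` argument are hypothetical names used for illustration, not part of the ASReview API, and the sketch assumes the third set holds only excluded papers.

```python
import numpy as np


def split_three_ways(y, train_idx, max_idx):
    """Split training indices into three groups (illustrative helper only).

    y         : array of labels, 1 = included, 0 = excluded
    train_idx : indices of papers that have been labeled so far
    max_idx   : indices that were found with max sampling (assumed input)
    """
    train_idx = np.asarray(train_idx)
    labels = np.asarray(y)[train_idx]
    from_max = np.isin(train_idx, max_idx)

    one_idx = train_idx[labels == 1]                       # included papers
    zero_rand_idx = train_idx[(labels == 0) & ~from_max]   # excluded, random sampling
    zero_max_idx = train_idx[(labels == 0) & from_max]     # excluded, max sampling
    return one_idx, zero_rand_idx, zero_max_idx


y = np.array([1, 0, 0, 1, 0, 0])
one_idx, zero_rand_idx, zero_max_idx = split_three_ways(
    y, train_idx=[0, 1, 2, 3, 4], max_idx=[3, 4]
)
```

Paper 5 is not in `train_idx`, so it does not appear in any of the three groups.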

Parameters:
  • a (float) – Governs the weight of the 1’s. Higher values mean linearly more 1’s in your training sample.

  • alpha (float) – Governs the scaling of the weight of the 1’s, as a function of the ratio of ones to zeros. A positive value means that the lower the ratio of zeros to ones, the higher the weight of the ones.

  • b (float) – Governs how strongly we want to sample depending on the total number of samples. A value of 1 means no dependence on the total number of samples, while lower values mean increasingly stronger dependence on the number of samples.

  • beta (float) – Governs the scaling of the weight of the zeros depending on the number of samples. Higher values mean that larger samples penalize zeros more strongly.

  • c (float) – Value between zero and one that governs the weight of samples found with max sampling. Higher values mean higher weight.

  • gamma (float) – Governs the scaling of the weight of the max samples as a function of the percentage of papers read. Higher values mean stronger scaling.
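The actual weighting formulas are not given on this page. The functions below are hypothetical forms, chosen only so that each parameter behaves as the descriptions above state; they should not be read as the library's implementation. All three function names are assumptions made for this sketch.

```python
import math


def sketch_one_weight(n_one, n_zero, a, alpha):
    # Assumed form: linear in a, and (for alpha > 0) larger when the
    # ratio of zeros to ones is lower.
    return a * (n_one / n_zero) ** alpha


def sketch_zero_weight(n_read, b, beta):
    # Assumed form: exactly 1 when b == 1 (no dependence on sample count);
    # for b < 1 it falls towards b as n_read grows, faster for larger beta.
    return b + (1 - b) * (1 + math.log(n_read)) ** (-beta)


def sketch_max_weight(fraction_read, c, gamma):
    # Assumed form: equals c when nothing has been read and rises to 1
    # once everything has been read; larger gamma makes it rise sooner.
    return 1 - (1 - c) * (1 - fraction_read) ** gamma


# Default parameter values taken from the class signature.
defaults = dict(a=2.155, alpha=0.94, b=0.789, beta=1.0, c=0.835, gamma=2.0)

w_one = sketch_one_weight(10, 100, defaults["a"], defaults["alpha"])
w_zero = sketch_zero_weight(500, defaults["b"], defaults["beta"])
w_max = sketch_max_weight(0.25, defaults["c"], defaults["gamma"])
```

Under these assumed forms, setting `b=1` removes any dependence on the number of samples, and `c` acts as the starting weight for papers found with max sampling.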

Attributes

default_param

Get the default parameters of the model.

label

name

param

Get the (assigned) parameters of the model.

Methods

full_hyper_space()

hyper_space()

sample(X, y, train_idx, shared)

Resample the training data.
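To illustrate what resampling with group weights can look like, here is a minimal sketch that draws a new training sample from several index groups in proportion to group size times weight. The function `resample_groups` is hypothetical and does not reproduce the signature or internals of `sample(X, y, train_idx, shared)`; it assumes `n_total > 0` and at least one non-empty group.

```python
import numpy as np


def resample_groups(groups, weights, n_total, random_state=None):
    """Draw n_total indices from the groups, with replacement (sketch only)."""
    rng = np.random.default_rng(random_state)

    # Each group's share is proportional to (group size * group weight).
    sizes = np.array([len(g) for g in groups], dtype=float)
    mass = sizes * np.asarray(weights, dtype=float)
    share = mass / mass.sum()

    # Round down, then give the leftover draws to the heaviest group.
    counts = np.floor(share * n_total).astype(int)
    counts[np.argmax(mass)] += n_total - counts.sum()

    picked = [
        rng.choice(g, size=k, replace=True)
        for g, k in zip(groups, counts)
        if k > 0
    ]
    out = np.concatenate(picked)
    rng.shuffle(out)  # mix the groups, similar in spirit to shuffle=True
    return out


groups = [np.array([0, 3]), np.array([1, 2]), np.array([4])]
resampled = resample_groups(groups, weights=[2.0, 1.0, 0.5], n_total=20, random_state=0)
```

Sampling with replacement lets a small but heavily weighted group, such as the included papers, appear more often than its raw size would allow.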