engforge.eng.prediction.PredictionMixin

class PredictionMixin[source]

Bases: object

Methods

add_prediction_record

Adds a record to the prediction records and calculates the average and variance of the data. Takes record, a dict of the record, and extra_add; if extra_add is true, the record is added to the prediction records even if the data is in bounds. Returns a boolean indicating whether the record was out of bounds of the current data (and therefore should be added).

check_and_retrain

Checks whether there is more data than the threshold needed to train, whether the error is low enough to skip retraining, or whether more data already exists than the window size (in which case no training is done).

check_out_of_domain

checks if the record is in bounds of the current data

observe_and_predict

uses the existing models to predict the row and measure the error

prediction_dataframe

prediction_weights

score_data

scores a dataframe

train_compare

Uses the dataframe to train the models and compares the results to the current models, using test_frac to divide the total samples into training and testing sets unless train_full is set.

training_callback

override to provide a callback when training is complete, such as saving the models

Attributes

basis

prediction_goal_error

prediction_records

train_window

trained

add_prediction_record(record, extra_add=True, mult_sigma=1, target_items=1000)[source]

Adds a record to the prediction records and calculates the average and variance of the data.

Parameters:
  • record – a dict of the record

  • extra_add – if true, the record is added to the prediction records even if the data is in bounds

Returns:

a boolean indicating whether the record was out of bounds of the current data (and therefore should be added)
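The description mentions keeping an average and variance as records arrive. A minimal sketch of one way to do that incrementally (Welford's method) follows. This is not engforge's implementation; `RunningStats` and `add_record` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Hypothetical helper: incremental mean and variance (Welford's method)."""

    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        # Population variance; 0.0 until at least one value has been seen.
        return self.m2 / self.count if self.count else 0.0


def add_record(stats: dict, records: list, record: dict) -> None:
    """Append a record and update per-key statistics (illustrative only)."""
    records.append(record)
    for key, value in record.items():
        stats.setdefault(key, RunningStats()).update(float(value))


stats, records = {}, []
for rec in ({"x": 1.0}, {"x": 2.0}, {"x": 3.0}):
    add_record(stats, records, rec)
```

Updating the statistics per record avoids recomputing them over the whole record list each time a row is added.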

check_and_retrain(records, min_rec=None)[source]

Checks whether there is more data than the threshold needed to train, whether the error is low enough to skip retraining, or whether more data already exists than the window size (in which case no training is done).
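One possible reading of the three conditions above can be sketched as a small decision function. The function name, argument names, and the order in which the conditions are checked are all assumptions; the actual method may combine them differently.

```python
def should_retrain(n_new, n_total, error, min_rec, goal_error, window):
    """Illustrative retraining decision; not the engforge logic.

    n_new      -- records collected since the last training run
    n_total    -- records already held
    error      -- current prediction error, or None if nothing is trained yet
    min_rec    -- threshold of new records needed before training
    goal_error -- error at or below which retraining is skipped
    window     -- maximum number of records considered for training
    """
    if n_total > window:
        return False  # more data than the window size: no training
    if error is not None and error <= goal_error:
        return False  # error already low enough to ignore retraining
    return n_new >= min_rec  # train only once enough data has arrived


enough_data = should_retrain(300, 1000, 0.2, min_rec=250, goal_error=0.05, window=5000)
low_error = should_retrain(300, 1000, 0.01, min_rec=250, goal_error=0.05, window=5000)
over_window = should_retrain(300, 6000, 0.2, min_rec=250, goal_error=0.05, window=5000)
too_few = should_retrain(10, 1000, 0.2, min_rec=250, goal_error=0.05, window=5000)
```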

check_out_of_domain(record, extra_margin=1, target_items=1000)[source]

checks if the record is in bounds of the current data
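A bounds check of this kind can be sketched as follows. The sketch assumes `extra_margin` is a multiplier that widens each range about its midpoint, so that the default of 1 leaves the bounds unchanged; that interpretation is a guess, and `target_items` is left out because its role is not documented here. The `out_of_domain` function and the `bounds` mapping are hypothetical.

```python
def out_of_domain(record, bounds, extra_margin=1.0):
    """Return True if any value lies outside its widened (low, high) range.

    `bounds` maps key -> (low, high). The widening rule is an assumption.
    """
    for key, value in record.items():
        if key not in bounds:
            continue  # no data seen for this key yet, so nothing to compare
        low, high = bounds[key]
        mid = (low + high) / 2.0
        half = (high - low) / 2.0 * extra_margin
        if value < mid - half or value > mid + half:
            return True
    return False


bounds = {"x": (0.0, 10.0)}
inside = out_of_domain({"x": 5.0}, bounds)
outside = out_of_domain({"x": 11.0}, bounds)
widened = out_of_domain({"x": 11.0}, bounds, extra_margin=1.5)
```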

observe_and_predict(row)[source]

uses the existing models to predict the row and measure the error

score_data(df)[source]

scores a dataframe

train_compare(df, test_frac=2, train_full=False, min_rec=250)[source]

Uses the dataframe to train the models and compares the results to the current models, using test_frac to divide the total samples into training and testing sets unless train_full is set.

Parameters:
  • df – dataframe to train with

  • test_frac – N/test_frac will be the size of the training window

  • train_full – boolean to use full training data

Returns:

trained models
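The split described in the parameter list (training size equal to the sample count N divided by the fraction argument) can be sketched with plain Python rows instead of a dataframe. The `split_rows` helper, the shuffling step, and the behaviour when `train_full` is set are assumptions made for illustration.

```python
import random


def split_rows(rows, test_frac=2, train_full=False, seed=0):
    """Split rows into (train, test) lists; illustrative only."""
    shuffled = list(rows)
    if train_full:
        # Assumption: train on everything and score against the same rows.
        return shuffled, list(shuffled)
    random.Random(seed).shuffle(shuffled)  # seeded for repeatability
    n_train = len(shuffled) // test_frac   # N / test_frac rows for training
    return shuffled[:n_train], shuffled[n_train:]


train, test = split_rows(range(10), test_frac=2)
full_train, full_test = split_rows(range(10), train_full=True)
```

With the signature's default of test_frac=2, half of the rows go to training and the other half are held out for comparison.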

training_callback(models)[source]

override to provide a callback when training is complete, such as saving the models
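A sketch of such an override, saving the models with `pickle`, is shown below. `PredictorBase` is a stand-in written for this example and is not the engforge class; `finish_training` is a hypothetical hook, and the shape of the `models` argument (a dict here) is also an assumption.

```python
import pickle
import tempfile
from pathlib import Path


class PredictorBase:
    """Stand-in for a class that uses PredictionMixin; not the engforge class."""

    def training_callback(self, models):
        pass  # default behaviour: do nothing

    def finish_training(self, models):
        # Hypothetical hook point: call the callback once training is complete.
        self.training_callback(models)


class SavingPredictor(PredictorBase):
    def __init__(self, path):
        self.path = Path(path)

    def training_callback(self, models):
        # Persist the trained models so they can be reloaded later.
        with self.path.open("wb") as fh:
            pickle.dump(models, fh)


out_dir = tempfile.mkdtemp()
predictor = SavingPredictor(Path(out_dir) / "models.pkl")
predictor.finish_training({"speed": [0.5, 1.5]})

with predictor.path.open("rb") as fh:
    restored = pickle.load(fh)
```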