engforge.eng.prediction.PredictionMixin

class PredictionMixin

Bases: object

Methods

`add_prediction_record`	Adds a record to the prediction records and calculates the average and variance of the data; returns a boolean indicating whether the record was out of bounds of the current data (and therefore should be added).
`check_and_retrain`	Checks whether there is more data than the threshold needed to train, whether the error is low enough to skip retraining, or whether more data already exists than the window size (no training)
`check_out_of_domain`	checks if the record is in bounds of the current data
`observe_and_predict`	uses the existing models to predict the row and measure the error
`prediction_dataframe`
`prediction_weights`
`score_data`	scores a dataframe
`train_compare`	Use the dataframe to train the models, and compare the results to the current models, using test_frac to divide the total samples into training and testing sets unless train_full is set.
`training_callback`	override to provide a callback when training is complete, such as saving the models

Attributes

`basis`
`prediction_goal_error`
`prediction_records`
`train_window`
`trained`

add_prediction_record(record, extra_add=True, mult_sigma=1, target_items=1000)

Adds a record to the prediction records and calculates the average and variance of the data.

Parameters:

record – a dict of the record
extra_add – if true, the record is added to the prediction records even if the data is in bounds

Returns:

a boolean indicating whether the record was out of bounds of the current data (and therefore should be added)
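The running average and variance mentioned above can be maintained without storing every record. The sketch below uses Welford's online method as one possible approach; `RunningStats` and the `stats` dict are hypothetical names for illustration and are not taken from engforge, whose internal bookkeeping may differ.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Hypothetical helper: running mean and variance of one record field."""

    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the running mean

    def update(self, value: float) -> None:
        # Welford's update: adjust the mean first, then accumulate m2
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        # population variance; 0.0 while no samples have been seen
        return self.m2 / self.count if self.count else 0.0


stats = {}  # one RunningStats per record key (assumed layout)
for record in ({"x": 1.0, "y": 10.0}, {"x": 3.0, "y": 14.0}):
    for key, value in record.items():
        stats.setdefault(key, RunningStats()).update(value)
```

Each record is folded into the per-key statistics as it arrives, so the cost per record stays constant regardless of how many records were seen before.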

check_and_retrain(records, min_rec=None): Checks whether there is more data than the threshold needed to train, whether the error is low enough to skip retraining, or whether more data already exists than the window size (no training)
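One way to read the three conditions above is as a sequence of guards. The function below is a minimal sketch under that reading; `should_retrain`, its arguments, and the order of the checks are assumptions made for illustration, not the engforge implementation.

```python
def should_retrain(n_records, current_error, min_rec, goal_error, train_window):
    """Hypothetical decision helper mirroring the documented conditions."""
    if n_records < min_rec:
        return False  # not enough data to train yet
    if current_error is not None and current_error <= goal_error:
        return False  # error already low enough, skip retraining
    if n_records > train_window:
        return False  # more data than the window size, no training
    return True
```

Here `goal_error` and `train_window` stand in for the `prediction_goal_error` and `train_window` attributes listed above; how the class actually combines them is not stated in this page.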

check_out_of_domain(record, extra_margin=1, target_items=1000): checks if the record is in bounds of the current data
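The page does not say how "in bounds" is defined. A common choice, assumed here purely for illustration, is to treat a value as in bounds when it lies within a margin of standard deviations around the mean. `out_of_domain`, `means`, and `variances` are hypothetical names.

```python
import math


def out_of_domain(record, means, variances, margin=1.0):
    """Return True if any known field lies outside mean +/- margin * sigma."""
    for key, value in record.items():
        if key not in means:
            continue  # no statistics for this field yet, nothing to compare
        sigma = math.sqrt(variances[key])
        if abs(value - means[key]) > margin * sigma:
            return True
    return False


means = {"x": 2.0}
variances = {"x": 1.0}
inside = out_of_domain({"x": 2.5}, means, variances)
outside = out_of_domain({"x": 4.0}, means, variances)
widened = out_of_domain({"x": 4.0}, means, variances, margin=3.0)
```

With these assumed statistics, widening `margin` brings the value 4.0 back in bounds, which is the kind of effect a parameter such as `extra_margin` would plausibly control.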

observe_and_predict(row): uses the existing models to predict the row and measure the error

score_data(df): scores a dataframe

train_compare(df, test_frac=2, train_full=False, min_rec=250)

Use the dataframe to train the models, and compare the results to the current models, using test_frac to divide the total samples into training and testing sets unless train_full is set.

Parameters:

df – dataframe to train with
test_frac – N/test_frac will be the size of the training window, where N is the total number of samples
train_full – if true, use the full dataset for training

Returns:

trained models
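The split described by test_frac can be sketched as below. This is an illustration on a plain list rather than a dataframe; `split_records` is a hypothetical helper, and taking the first N/test_frac rows for training (instead of sampling at random) and reusing the full data for testing when train_full is set are both assumptions.

```python
def split_records(rows, test_frac=2, train_full=False):
    """Hypothetical split: first N // test_frac rows train, the rest test."""
    if train_full:
        # assumed behavior: train on everything and evaluate on the same rows
        return list(rows), list(rows)
    n_train = len(rows) // test_frac
    return list(rows[:n_train]), list(rows[n_train:])


rows = list(range(10))
train, test = split_records(rows)  # default test_frac=2 gives a 5/5 split
full_train, full_test = split_records(rows, train_full=True)
```

With the default test_frac of 2, half of the samples go to training, which matches the "N/test_frac" sizing stated in the parameter description.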

training_callback(models): override to provide a callback when training is complete, such as saving the models
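The override pattern can be sketched as follows. To keep the example runnable without engforge installed, `StandInMixin` is a hypothetical stand-in for PredictionMixin, and `finish_training` is an invented driver that only exists to trigger the hook; in real use the subclass would inherit from PredictionMixin and the library would call `training_callback` itself.

```python
import pickle
import tempfile
from pathlib import Path


class StandInMixin:
    """Hypothetical stand-in for PredictionMixin; only the hook is modeled."""

    def training_callback(self, models):
        pass  # default: do nothing

    def finish_training(self, models):
        # invented driver, not part of engforge: calls the hook after training
        self.training_callback(models)
        return models


class SavingPredictor(StandInMixin):
    def __init__(self, path):
        self.path = Path(path)

    def training_callback(self, models):
        # persist the trained models once training completes
        with self.path.open("wb") as fh:
            pickle.dump(models, fh)


with tempfile.TemporaryDirectory() as tmp:
    predictor = SavingPredictor(Path(tmp) / "models.pkl")
    predictor.finish_training({"speed": [0.1, 0.2]})
    with predictor.path.open("rb") as fh:
        restored = pickle.load(fh)
```

Pickle is used here only because it is in the standard library; the type of object passed as `models` is not specified on this page, so the serialization format should be chosen to suit it.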