engforge.eng.prediction.PredictionMixin
- class PredictionMixin[source]
Bases: object
Methods
add_prediction_record – adds a record to the prediction records and calculates the average and variance of the data
check_and_retrain – checks whether there is more data than the threshold needed to train, whether the error is low enough to skip retraining, or whether more data already exists than the window size (no training)
check_out_of_domain – checks whether the record is within the bounds of the current data
uses the existing models to predict the row and measure the error
prediction_dataframe
prediction_weights
scores a dataframe
train_compare – uses the dataframe to train the models and compares the results to the current models
override to provide a callback when training is complete, such as saving the models
Attributes
basis
prediction_goal_error
prediction_records
train_window
trained
- add_prediction_record(record, extra_add=True, mult_sigma=1, target_items=1000)[source]
Adds a record to the prediction records and calculates the average and variance of the data.
- Parameters:
record – a dict of the record
extra_add – if true, the record is added to the prediction records even if the data is in bounds
- Returns:
a boolean indicating whether the record was out of bounds of the current data (and therefore should be added)
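The source does not say how the average and variance are maintained. As a minimal sketch of one common approach, the block below keeps per-field running statistics with Welford's online algorithm. `RunningStats` and `add_record` are hypothetical names used only for illustration and are not part of engforge.

```python
class RunningStats:
    """Hypothetical helper: online mean/variance for one field (Welford)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Population variance; 0.0 until at least one sample has been seen
        return self._m2 / self.n if self.n else 0.0


def add_record(stats, record):
    """Update per-key statistics from a dict record (illustrative only)."""
    for key, value in record.items():
        stats.setdefault(key, RunningStats()).update(float(value))
    return stats


stats = {}
for rec in ({"x": 1.0, "y": 10.0}, {"x": 2.0, "y": 14.0}, {"x": 3.0, "y": 12.0}):
    add_record(stats, rec)
```

An online update avoids rescanning all stored records each time a new one arrives, which matters when records are added one at a time.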
- check_and_retrain(records, min_rec=None)[source]
Checks whether there is more data than the threshold needed to train, whether the error is low enough to skip retraining, or whether more data already exists than the window size (in which case no training occurs).
- check_out_of_domain(record, extra_margin=1, target_items=1000)[source]
Checks whether the record is within the bounds of the current data.
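How the bounds are defined is not stated here. One plausible reading, sketched below as an assumption, treats a record as out of domain when any field lies farther than `extra_margin` standard deviations from that field's mean. The function `out_of_domain` and the `means` and `stds` dicts are hypothetical, and the `target_items` parameter is not modelled.

```python
def out_of_domain(record, means, stds, extra_margin=1.0):
    """Return True if any field lies outside mean +/- extra_margin * std.

    Assumed rule for illustration; engforge's actual check may differ.
    """
    for key, value in record.items():
        if key not in means:
            continue  # no statistics yet for this field, so skip it
        limit = extra_margin * stds[key]
        if abs(value - means[key]) > limit:
            return True
    return False


means = {"x": 2.0, "y": 12.0}
stds = {"x": 0.8, "y": 1.6}

inside = out_of_domain({"x": 2.5, "y": 11.0}, means, stds)
outside = out_of_domain({"x": 4.0, "y": 11.0}, means, stds)
wider = out_of_domain({"x": 4.0, "y": 11.0}, means, stds, extra_margin=3.0)
```

Under this assumed rule, raising `extra_margin` widens the accepted band, so fewer records are flagged as new information.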
- train_compare(df, test_frac=2, train_full=False, min_rec=250)[source]
Uses the dataframe to train the models and compares the results to the current models, using test_frac to divide the total samples into training and testing sets unless train_full is set.
- Parameters:
df – dataframe to train with
test_frac – N/test_frac will be the size of the training window
train_full – boolean to use the full training data
- Returns:
trained models
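The train-and-compare idea can be illustrated without engforge. The sketch below is an assumption-laden stand-in: it uses plain lists instead of a dataframe, a one-variable least-squares line instead of the real models, and mean squared error as the comparison metric. `fit_line`, `mse`, and `train_compare_sketch` are hypothetical names.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def mse(model, xs, ys):
    """Mean squared error of a (slope, intercept) model."""
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train_compare_sketch(xs, ys, current_model, test_frac=2, train_full=False):
    n = len(xs)
    # N / test_frac samples form the training window unless train_full is set
    n_train = n if train_full else max(2, int(n / test_frac))
    candidate = fit_line(xs[:n_train], ys[:n_train])
    # With no held-out samples, fall back to scoring on all of the data
    if train_full or n_train >= n:
        x_eval, y_eval = xs, ys
    else:
        x_eval, y_eval = xs[n_train:], ys[n_train:]
    # Keep whichever model has the lower error on the evaluation samples
    if mse(candidate, x_eval, y_eval) < mse(current_model, x_eval, y_eval):
        return candidate
    return current_model


xs = [i * 0.25 for i in range(40)]
ys = [3.0 * x + 1.0 for x in xs]
stale_model = (0.0, 0.0)

best = train_compare_sketch(xs, ys, stale_model)
best_full = train_compare_sketch(xs, ys, stale_model, train_full=True)
```

Scoring both models on the same held-out samples keeps the comparison fair; a candidate that only fits its own training window well would not replace the current model.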