src.utils package¶
Subpackages¶
Submodules¶
src.utils.event_attributes module¶
-
src.utils.event_attributes.get_additional_columns(log)¶
-
src.utils.event_attributes.get_event_attributes(log)¶ Get log event attributes that are not name or time
Because the log is a list, it has no global event attributes, so the attributes are read from the first event of the first trace. Attributes that appear only in other events are therefore missed.
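The first-event lookup described above can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes the log behaves like a list of traces, each trace like a list of dict-like events, and that name and time are stored under the XES keys `concept:name` and `time:timestamp`. The function name and the sorted output are hypothetical choices.

```python
def event_attributes_sketch(log):
    """Return attribute keys of the first event of the first trace,
    excluding the name and timestamp keys (assumed XES standard keys)."""
    excluded = {"concept:name", "time:timestamp"}
    if not log or not log[0]:
        return []
    first_event = log[0][0]
    # Sorting is an illustrative choice to make the output deterministic.
    return sorted(key for key in first_event if key not in excluded)


log = [
    [{"concept:name": "A", "time:timestamp": "2010-12-30", "org:resource": "r1"}],
    [{"concept:name": "B", "time:timestamp": "2011-01-06", "cost": 5}],
]
attributes = event_attributes_sketch(log)
print(attributes)  # ['org:resource']
```

Note that `cost` is not reported, because it only occurs in the second trace. This is the limitation the description warns about.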
-
src.utils.event_attributes.get_global_event_attributes(log)¶ Get log event attributes that are not name or time
-
src.utils.event_attributes.get_global_trace_attributes(log)¶
-
src.utils.event_attributes.unique_events(log)¶ List of unique events using event concept:name
Adds all events into a list and removes duplicates while keeping order.
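One way to remove duplicates while keeping order is sketched below. It assumes the log can be iterated as a list of traces whose events are dict-like with a `concept:name` key; the function name is hypothetical and the project's own code may differ.

```python
def unique_events_sketch(log):
    """Collect every event name, then drop repeats while keeping first-seen order."""
    names = [event["concept:name"] for trace in log for event in trace]
    # dict.fromkeys keeps insertion order (Python 3.7+) and discards repeated keys.
    return list(dict.fromkeys(names))


log = [
    [{"concept:name": "register"}, {"concept:name": "check"}],
    [{"concept:name": "register"}, {"concept:name": "pay"}],
]
names = unique_events_sketch(log)
print(names)  # ['register', 'check', 'pay']
```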
-
src.utils.event_attributes.unique_events2(training_log, test_log)¶ Combines unique events from two logs into one list.
Named with a 2 suffix because Python does not support overloading functions by signature: a second definition with the same name in a module replaces the first.
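The naming constraint and the combination step can be illustrated with a short sketch. It assumes both logs are lists of traces with dict-like events keyed by `concept:name`; `unique_events2_sketch` is a hypothetical name, and the two throwaway `unique_events` definitions exist only to show that the second `def` rebinds the name.

```python
from itertools import chain


def unique_events(log):
    return "one-log version"


def unique_events(training_log, test_log):  # rebinds the name; the def above is gone
    return "two-log version"


try:
    unique_events([])  # the one-argument version no longer exists
    overload_survived = True
except TypeError:
    overload_survived = False


def unique_events2_sketch(training_log, test_log):
    """Unique event names across both logs, in first-seen order."""
    names = (
        event["concept:name"]
        for trace in chain(training_log, test_log)
        for event in trace
    )
    return list(dict.fromkeys(names))


training_log = [[{"concept:name": "register"}, {"concept:name": "check"}]]
test_log = [[{"concept:name": "check"}, {"concept:name": "pay"}]]
combined = unique_events2_sketch(training_log, test_log)
print(overload_survived)  # False
print(combined)  # ['register', 'check', 'pay']
```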
src.utils.file_service module¶
-
src.utils.file_service.create_unique_name(name)¶ Return type: str
-
src.utils.file_service.get_log(log)¶ Read in event log from disk
Uses xes_importer to parse log.
Return type: EventLog
-
src.utils.file_service.save_result(results, job, start_time)¶
src.utils.log_metrics module¶
-
src.utils.log_metrics.avg_events_in_log(log)¶ Returns the average number of events per trace
Return 3
Return type: int
-
src.utils.log_metrics.event_executions(log)¶ Creates dict of event execution count
Return {‘Event A’: 7, ‘Event B’: 8} Return type: OrderedDict
-
src.utils.log_metrics.events_by_date(log)¶ Creates dict of events by date ordered by date
Return {‘2010-12-30’: 7, ‘2011-01-06’: 8} Return type: OrderedDict
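A date-ordered count like the one described can be sketched as below. The sketch assumes each event carries a `datetime` under `time:timestamp` and that dates are formatted as `YYYY-MM-DD`, as in the example return value; the function name is hypothetical.

```python
from collections import OrderedDict
from datetime import datetime


def events_by_date_sketch(log):
    """Count events per calendar day and return the counts ordered by date."""
    counts = {}
    for trace in log:
        for event in trace:
            day = event["time:timestamp"].strftime("%Y-%m-%d")
            counts[day] = counts.get(day, 0) + 1
    # ISO-formatted dates sort chronologically when sorted as strings.
    return OrderedDict(sorted(counts.items()))


log = [
    [
        {"time:timestamp": datetime(2011, 1, 6, 9, 0)},
        {"time:timestamp": datetime(2010, 12, 30, 14, 0)},
    ],
    [{"time:timestamp": datetime(2011, 1, 6, 10, 30)}],
]
by_date = events_by_date_sketch(log)
print(list(by_date.items()))  # [('2010-12-30', 1), ('2011-01-06', 2)]
```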
-
src.utils.log_metrics.events_in_trace(log)¶ Creates dict of number of events in trace
Return {‘4’: 11, ‘3’: 8} Return type: OrderedDict
-
src.utils.log_metrics.max_events_in_log(log)¶ Returns the maximum number of events in any trace
Return 3
Return type: int
-
src.utils.log_metrics.new_trace_start(log)¶ Creates dict of new traces by date
Return {‘2010-12-30’: 1, ‘2011-01-06’: 2} Return type: OrderedDict
-
src.utils.log_metrics.resources_by_date(log)¶ Creates dict of used unique resources ordered by date
Resource and timestamp are joined into one string delimited by &&, so a resource name that itself contains && cannot be separated reliably. Returns a dict with a date and the number of unique resources used on that day. Return {‘2010-12-30’: 7, ‘2011-01-06’: 8}
Return type: OrderedDict
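The delimiter problem and one possible way around it are sketched below. This is not the project's implementation: it stores (date, resource) tuples instead of a `&&`-delimited string, assumes the resource is stored under the XES key `org:resource`, and uses a hypothetical function name.

```python
from collections import OrderedDict
from datetime import datetime

# Splitting a delimited string is ambiguous when the resource name contains "&&".
ambiguous_parts = "bob&&co&&2010-12-30".split("&&")
print(len(ambiguous_parts))  # 3 parts instead of the expected 2


def resources_by_date_sketch(log):
    """Count distinct resources per day, ordered by date."""
    pairs = set()
    for trace in log:
        for event in trace:
            day = event["time:timestamp"].strftime("%Y-%m-%d")
            # A tuple keeps the two fields separate, so no delimiter is needed.
            pairs.add((day, event["org:resource"]))
    counts = {}
    for day, _resource in pairs:
        counts[day] = counts.get(day, 0) + 1
    return OrderedDict(sorted(counts.items()))


log = [
    [
        {"time:timestamp": datetime(2010, 12, 30, 9), "org:resource": "ann"},
        {"time:timestamp": datetime(2010, 12, 30, 10), "org:resource": "ann"},
        {"time:timestamp": datetime(2010, 12, 30, 11), "org:resource": "bob&&co"},
    ],
    [{"time:timestamp": datetime(2011, 1, 6, 9), "org:resource": "ann"}],
]
resources = resources_by_date_sketch(log)
print(list(resources.items()))  # [('2010-12-30', 2), ('2011-01-06', 1)]
```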
-
src.utils.log_metrics.std_var_events_in_log(log)¶ Returns the standard deviation of the number of events per trace
Return 3
Return type: int
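The average, maximum, and standard deviation metrics in this module all reduce to statistics over trace lengths. The sketch below assumes the log is a list of traces whose length is the number of events. Whether the project uses the sample or the population standard deviation, and whether it rounds results to `int`, is not stated here, so the sample form is shown as an assumption.

```python
import statistics

# Each inner list stands in for a trace; only its length matters here.
log = [["a", "b", "c"], ["a", "b", "c", "d"], ["a", "b"]]
lengths = [len(trace) for trace in log]

average_events = statistics.mean(lengths)
maximum_events = max(lengths)
# Sample standard deviation (n - 1 denominator); statistics.pstdev gives the population form.
std_events = statistics.stdev(lengths)

print(average_events, maximum_events, std_events)  # 3 4 1.0
```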
-
src.utils.log_metrics.trace_attributes(log)¶ Creates an array of dicts that describe trace attributes. Only looks at first trace. Filters out concept:name.
Return [{name: ‘name’, type: ‘string’, example: 34}] Return type: list
src.utils.result_metrics module¶
-
src.utils.result_metrics.calculate_auc(actual, scores, auc)¶ Return type: float
-
src.utils.result_metrics.calculate_nlevenshtein(actual, predicted)¶ Return type: float
-
src.utils.result_metrics.calculate_results_classification(actual, predicted)¶ Return type: dict
-
src.utils.result_metrics.calculate_results_regression(input_df, label)¶ Return type: dict
-
src.utils.result_metrics.calculate_results_time_series_prediction(actual, predicted)¶ Return type: dict
-
src.utils.result_metrics.get_auc(actual, scores)¶ Return type: float
-
src.utils.result_metrics.get_confusion_matrix(actual, predicted)¶ Return type: dict
src.utils.tests_utils module¶
-
src.utils.tests_utils.create_test_clustering(clustering_type='noCluster', configuration={})¶ Return type: Clustering
-
src.utils.tests_utils.create_test_encoding(prefix_length=1, padding=False, value_encoding='simpleIndex', add_elapsed_time=False, add_remaining_time=False, add_resources_used=False, add_new_traces=False, add_executed_events=False, task_generation_type='only')¶ Return type: Encoding
-
src.utils.tests_utils.create_test_hyperparameter_optimizer(hyperoptim_type='hyperopt', performance_metric='acc', max_evals=10)¶
-
src.utils.tests_utils.create_test_job(split=None, encoding=None, labelling=None, clustering=None, predictive_model=None, job_type='prediction', hyperparameter_optimizer=None)¶
-
src.utils.tests_utils.create_test_labelling(label_type='next_activity', attribute_name=None, threshold_type='threshold_mean', threshold=0.0)¶ Return type: Labelling
-
src.utils.tests_utils.create_test_log(log_name='general_example.xes', log_path='cache/log_cache/test_logs/general_example.xes')¶ Return type: Log
-
src.utils.tests_utils.create_test_predictive_model(predictive_model='classification', prediction_method='randomForest')¶ Return type: PredictiveModel
-
src.utils.tests_utils.create_test_split(split_type='single', split_ordering_method='sequential', test_size=0.2, original_log=None, train_log=None, test_log=None)¶
src.utils.time_metrics module¶
-
src.utils.time_metrics.count_on_event_day(trace, date_dict, event_id)¶ Finds the date of the event and returns the value from date_dict
Parameters: trace (log trace), date_dict (one of the dicts from log_metrics.py), event_id (event id)
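The lookup can be sketched as follows. Several details are assumptions rather than documented behaviour: that `event_id` is a positional index into the trace, that the timestamp is a `datetime` under `time:timestamp`, that `date_dict` is keyed by `YYYY-MM-DD` strings, and that a missing date yields 0. The function name is hypothetical.

```python
from datetime import datetime


def count_on_event_day_sketch(trace, date_dict, event_id):
    """Look up the per-day value for the day on which the given event happened."""
    # Assumption: event_id is a positional index into the trace.
    day = trace[event_id]["time:timestamp"].strftime("%Y-%m-%d")
    # Assumption: dates missing from date_dict count as 0.
    return date_dict.get(day, 0)


date_dict = {"2010-12-30": 7, "2011-01-06": 8}
trace = [
    {"time:timestamp": datetime(2010, 12, 30, 9, 0)},
    {"time:timestamp": datetime(2011, 1, 6, 15, 0)},
]
count = count_on_event_day_sketch(trace, date_dict, 1)
print(count)  # 8
```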
-
src.utils.time_metrics.duration(trace)¶ Calculate the duration of a trace
-
src.utils.time_metrics.elapsed_time(trace, event)¶ Calculate elapsed time by event in trace
-
src.utils.time_metrics.elapsed_time_id(trace, event_index)¶ Calculate elapsed time by event index in trace
-
src.utils.time_metrics.remaining_time(trace, event)¶ Calculate remaining time by event in trace
-
src.utils.time_metrics.remaining_time_id(trace, event_index)¶ Calculate remaining time by event index in trace
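The index-based time metrics can be sketched with plain `datetime` arithmetic. The sketch assumes events are ordered by time, carry a `datetime` under `time:timestamp`, and that results are expressed in seconds; the unit and the function names are assumptions, not taken from the project.

```python
from datetime import datetime


def duration_sketch(trace):
    """Seconds between the first and the last event of the trace."""
    return (trace[-1]["time:timestamp"] - trace[0]["time:timestamp"]).total_seconds()


def elapsed_time_id_sketch(trace, event_index):
    """Seconds from the first event up to the event at event_index."""
    return (trace[event_index]["time:timestamp"] - trace[0]["time:timestamp"]).total_seconds()


def remaining_time_id_sketch(trace, event_index):
    """Seconds from the event at event_index to the last event."""
    return (trace[-1]["time:timestamp"] - trace[event_index]["time:timestamp"]).total_seconds()


trace = [
    {"time:timestamp": datetime(2010, 12, 30, 9, 0)},
    {"time:timestamp": datetime(2010, 12, 30, 9, 30)},
    {"time:timestamp": datetime(2010, 12, 30, 11, 0)},
]
total = duration_sketch(trace)
elapsed = elapsed_time_id_sketch(trace, 1)
remaining = remaining_time_id_sketch(trace, 1)
print(total, elapsed, remaining)  # 7200.0 1800.0 5400.0
```

Under these assumptions, elapsed and remaining time for any event add up to the trace duration.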