src.utils package¶
Subpackages¶
Submodules¶
src.utils.event_attributes module¶
-
src.utils.event_attributes.
get_additional_columns
(log)¶
-
src.utils.event_attributes.
get_event_attributes
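(log)¶ Get the event attributes of a log, excluding the event name and timestamp.
Because the log is a list, it has no global event attributes, so the attributes are read from the first event of the first trace. Attributes that appear only in other events are therefore missed.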
-
src.utils.event_attributes.
get_global_event_attributes
(log)¶ Get the event attributes of a log, excluding the event name and timestamp.
-
src.utils.event_attributes.
get_global_trace_attributes
(log)¶
-
src.utils.event_attributes.
unique_events
(log)¶ List of unique events using event concept:name
Adds all events into a list and removes duplicates while keeping order.
-
src.utils.event_attributes.
unique_events2
(training_log, test_log)¶ Combines unique events from two logs into one list.
Named unique_events2 because Python does not support function overloading: a second function with the same name in the module would replace the first.
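The order-preserving de-duplication described for unique_events and unique_events2 can be sketched as follows. This is a minimal illustration and not the module's actual implementation: it assumes a log is a plain list of traces and each event is a dict with a concept:name key, and the toy logs are hypothetical.

```python
def unique_events(log):
    """Return event names in first-seen order, without duplicates."""
    names = [event["concept:name"] for trace in log for event in trace]
    # dict.fromkeys keeps insertion order (Python 3.7+), so it removes
    # duplicates while preserving the order of first occurrence.
    return list(dict.fromkeys(names))


def unique_events2(training_log, test_log):
    """Combine the unique event names of two logs into one ordered list."""
    combined = unique_events(training_log) + unique_events(test_log)
    return list(dict.fromkeys(combined))


# Hypothetical toy logs: a log is a list of traces, a trace a list of events.
training_log = [
    [{"concept:name": "register"}, {"concept:name": "check"}],
    [{"concept:name": "register"}, {"concept:name": "pay"}],
]
test_log = [[{"concept:name": "check"}, {"concept:name": "reject"}]]

train_events = unique_events(training_log)
all_events = unique_events2(training_log, test_log)
print(train_events)
print(all_events)
```

Events that occur only in the test log are appended after the training events, so indices derived from the training log stay stable in this sketch.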
src.utils.file_service module¶
-
src.utils.file_service.
create_unique_name
(name)¶ Return type: str
-
src.utils.file_service.
get_log
(log)¶ Read in event log from disk
Uses xes_importer to parse log.
Return type: EventLog
-
src.utils.file_service.
save_result
(results, job, start_time)¶
src.utils.log_metrics module¶
-
src.utils.log_metrics.
avg_events_in_log
(log)¶ Returns the average number of events per trace.
Example return: 3
Return type: int
-
src.utils.log_metrics.
event_executions
(log)¶ Creates a dict of execution counts per event.
Example return: {'Event A': 7, '2011-01-06': 8}
Return type: OrderedDict
-
src.utils.log_metrics.
events_by_date
(log)¶ Creates a dict of event counts per date, ordered by date.
Example return: {'2010-12-30': 7, '2011-01-06': 8}
Return type: OrderedDict
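A per-date count of this kind could be computed along the following lines. This is a sketch under assumptions, not the module's code: events are taken to be dicts whose time:timestamp value is a datetime, and dates are keyed as ISO strings to match the example return value.

```python
from collections import OrderedDict
from datetime import datetime


def events_by_date(log):
    """Count events per calendar day, with keys sorted by date."""
    counts = {}
    for trace in log:
        for event in trace:
            day = event["time:timestamp"].date().isoformat()
            counts[day] = counts.get(day, 0) + 1
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    return OrderedDict(sorted(counts.items()))


# Hypothetical toy log with events on two different days.
toy_log = [
    [
        {"concept:name": "A", "time:timestamp": datetime(2011, 1, 6, 9, 0)},
        {"concept:name": "B", "time:timestamp": datetime(2010, 12, 30, 14, 0)},
    ],
    [{"concept:name": "A", "time:timestamp": datetime(2011, 1, 6, 11, 30)}],
]

by_date = events_by_date(toy_log)
print(dict(by_date))
```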
-
src.utils.log_metrics.
events_in_trace
(log)¶ Creates a dict of the number of events in each trace.
Example return: {'4': 11, '3': 8}
Return type: OrderedDict
-
src.utils.log_metrics.
max_events_in_log
(log)¶ Returns the maximum number of events in any trace.
Example return: 3
Return type: int
-
src.utils.log_metrics.
new_trace_start
(log)¶ Creates a dict of the number of new traces started per date.
Example return: {'2010-12-30': 1, '2011-01-06': 2}
Return type: OrderedDict
-
src.utils.log_metrics.
resources_by_date
(log)¶ Creates a dict of the number of unique resources used per day, ordered by date.
Internally, the resource and the timestamp are joined with the delimiter &&. If a resource name itself contains &&, the entry may not split back correctly and the counts may be wrong.
Example return: {'2010-12-30': 7, '2011-01-06': 8}
Return type: OrderedDict
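The delimiter technique can be illustrated with the sketch below. It is not the module's implementation: the event layout (dicts with org:resource and a datetime under time:timestamp) is an assumption, and the use of rsplit is a swapped-in variation that tolerates && inside a resource name, which the original may not do.

```python
from collections import OrderedDict
from datetime import datetime

DELIMITER = "&&"


def resources_by_date(log):
    """Count distinct resources per day using 'resource&&date' string keys."""
    seen = set()
    for trace in log:
        for event in trace:
            day = event["time:timestamp"].date().isoformat()
            seen.add(f"{event['org:resource']}{DELIMITER}{day}")
    counts = {}
    for key in seen:
        # Split at the LAST delimiter: the ISO date never contains '&&',
        # so the date is recovered even if the resource name does.
        _resource, day = key.rsplit(DELIMITER, 1)
        counts[day] = counts.get(day, 0) + 1
    return OrderedDict(sorted(counts.items()))


# Hypothetical toy log; "R&&D team" deliberately contains the delimiter.
toy_log = [
    [
        {"org:resource": "Ann", "time:timestamp": datetime(2010, 12, 30, 9)},
        {"org:resource": "Bob", "time:timestamp": datetime(2010, 12, 30, 10)},
        {"org:resource": "Ann", "time:timestamp": datetime(2010, 12, 30, 11)},
    ],
    [{"org:resource": "R&&D team", "time:timestamp": datetime(2011, 1, 6, 9)}],
]

per_day = resources_by_date(toy_log)
print(dict(per_day))
```

Storing (resource, date) tuples in the set instead of joined strings would avoid the delimiter question entirely.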
-
src.utils.log_metrics.
std_var_events_in_log
(log)¶ Returns the standard deviation of the number of events per trace.
Example return: 3
Return type: int
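The three trace-length statistics (avg_events_in_log, max_events_in_log, std_var_events_in_log) can be sketched together. The sketch assumes a log is a list of traces and uses the population standard deviation; whether the module uses the population or sample formula, and how it rounds to the documented int return type, is not stated here.

```python
import statistics


def trace_lengths(log):
    """Number of events in each trace of the log."""
    return [len(trace) for trace in log]


def avg_events_in_log(log):
    return statistics.mean(trace_lengths(log))


def max_events_in_log(log):
    return max(trace_lengths(log))


def std_var_events_in_log(log):
    # Population standard deviation (an assumption, see the note above).
    return statistics.pstdev(trace_lengths(log))


# Hypothetical toy log with traces of 2, 4 and 6 events.
toy_log = [[{"concept:name": "e"}] * n for n in (2, 4, 6)]

avg = avg_events_in_log(toy_log)
longest = max_events_in_log(toy_log)
spread = std_var_events_in_log(toy_log)
print(avg, longest, round(spread, 3))
```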
-
src.utils.log_metrics.
trace_attributes
(log)¶ Creates a list of dicts that describe the trace attributes. Only the first trace is inspected, and concept:name is filtered out.
Example return: [{name: 'name', type: 'string', example: 34}]
Return type: list
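As an illustration of the returned structure, the sketch below describes the attributes of a single trace. The helper name describe_trace_attributes, the plain-dict input, and the two type labels ("number" and "string") are all hypothetical; the module's own type names are not documented here.

```python
def describe_trace_attributes(first_trace_attributes):
    """Describe each attribute of the first trace, skipping concept:name."""
    described = []
    for name, value in first_trace_attributes.items():
        if name == "concept:name":
            continue
        # Hypothetical type labels: numbers versus everything else.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        described.append(
            {"name": name, "type": "number" if is_number else "string", "example": value}
        )
    return described


# Hypothetical attributes of the first trace in a log.
attributes = {"concept:name": "case-1", "amount": 34, "channel": "web"}
description = describe_trace_attributes(attributes)
print(description)
```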
src.utils.result_metrics module¶
-
src.utils.result_metrics.
calculate_auc
(actual, scores, auc)¶ Return type: float
-
src.utils.result_metrics.
calculate_nlevenshtein
(actual, predicted)¶ Return type: float
-
src.utils.result_metrics.
calculate_results_classification
(actual, predicted)¶ Return type: dict
-
src.utils.result_metrics.
calculate_results_regression
(input_df, label)¶ Return type: dict
-
src.utils.result_metrics.
calculate_results_time_series_prediction
(actual, predicted)¶ Return type: dict
-
src.utils.result_metrics.
get_auc
(actual, scores)¶ Return type: float
-
src.utils.result_metrics.
get_confusion_matrix
(actual, predicted)¶ Return type: dict
src.utils.tests_utils module¶
-
src.utils.tests_utils.
create_test_clustering
(clustering_type='noCluster', configuration={})¶ Return type: Clustering
-
src.utils.tests_utils.
create_test_encoding
(prefix_length=1, padding=False, value_encoding='simpleIndex', add_elapsed_time=False, add_remaining_time=False, add_resources_used=False, add_new_traces=False, add_executed_events=False, task_generation_type='only')¶ Return type: Encoding
-
src.utils.tests_utils.
create_test_hyperparameter_optimizer
(hyperoptim_type='hyperopt', performance_metric='acc', max_evals=10)¶
-
src.utils.tests_utils.
create_test_job
(split=None, encoding=None, labelling=None, clustering=None, predictive_model=None, job_type='prediction', hyperparameter_optimizer=None)¶
-
src.utils.tests_utils.
create_test_labelling
(label_type='next_activity', attribute_name=None, threshold_type='threshold_mean', threshold=0.0)¶ Return type: Labelling
-
src.utils.tests_utils.
create_test_log
(log_name='general_example.xes', log_path='cache/log_cache/test_logs/general_example.xes')¶ Return type: Log
-
src.utils.tests_utils.
create_test_predictive_model
(predictive_model='classification', prediction_method='randomForest')¶ Return type: PredictiveModel
-
src.utils.tests_utils.
create_test_split
(split_type='single', split_ordering_method='sequential', test_size=0.2, original_log=None, train_log=None, test_log=None)¶
src.utils.time_metrics module¶
-
src.utils.time_metrics.
count_on_event_day
(trace, date_dict, event_id)¶ Finds the date of an event and returns the corresponding value from date_dict.
Parameters:
trace: the log trace
date_dict: one of the dicts from log_metrics.py
event_id: the event id
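A lookup of this kind might look like the sketch below. It rests on several assumptions that the documentation does not confirm: event_id is treated as a positional index into the trace, dates are keyed as ISO strings as in the log_metrics examples, and a missing date yields 0.

```python
from datetime import datetime


def count_on_event_day(trace, date_dict, event_id):
    """Return the date_dict value for the day on which the event occurred."""
    day = trace[event_id]["time:timestamp"].date().isoformat()
    # Assumption: dates absent from date_dict count as 0.
    return date_dict.get(day, 0)


# Hypothetical trace and a dict shaped like the log_metrics return values.
trace = [
    {"concept:name": "A", "time:timestamp": datetime(2010, 12, 30, 9, 0)},
    {"concept:name": "B", "time:timestamp": datetime(2011, 1, 6, 15, 0)},
]
date_dict = {"2010-12-30": 7, "2011-01-06": 8}

first = count_on_event_day(trace, date_dict, 0)
second = count_on_event_day(trace, date_dict, 1)
print(first, second)
```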
-
src.utils.time_metrics.
duration
(trace)¶ Calculate the duration of a trace
-
src.utils.time_metrics.
elapsed_time
(trace, event)¶ Calculate elapsed time by event in trace
-
src.utils.time_metrics.
elapsed_time_id
(trace, event_index)¶ Calculate elapsed time by event index in trace
-
src.utils.time_metrics.
remaining_time
(trace, event)¶ Calculate remaining time by event in trace
-
src.utils.time_metrics.
remaining_time_id
(trace, event_index)¶ Calculate remaining time by event index in trace
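The index-based time metrics relate to each other as shown in this sketch. It is an illustration under assumptions, not the module's code: events are dicts with a datetime under time:timestamp, traces are ordered by time, and results are expressed in seconds (the actual unit is not documented here).

```python
from datetime import datetime

TS = "time:timestamp"


def duration(trace):
    """Seconds between the first and last event of the trace."""
    return (trace[-1][TS] - trace[0][TS]).total_seconds()


def elapsed_time_id(trace, event_index):
    """Seconds from the first event up to the event at event_index."""
    return (trace[event_index][TS] - trace[0][TS]).total_seconds()


def remaining_time_id(trace, event_index):
    """Seconds from the event at event_index to the last event."""
    return (trace[-1][TS] - trace[event_index][TS]).total_seconds()


# Hypothetical trace with events at 09:00, 09:30 and 11:00 on the same day.
trace = [
    {"concept:name": "A", TS: datetime(2011, 1, 6, 9, 0)},
    {"concept:name": "B", TS: datetime(2011, 1, 6, 9, 30)},
    {"concept:name": "C", TS: datetime(2011, 1, 6, 11, 0)},
]

total = duration(trace)
elapsed = elapsed_time_id(trace, 1)
remaining = remaining_time_id(trace, 1)
print(total, elapsed, remaining)
```

For any event index, elapsed time plus remaining time equals the trace duration in this sketch.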