msticpy.transform.auditdextract module

Auditd extractor.

Module to load and decode Linux audit logs. It collapses messages that share a message ID into single events, decodes hex-encoded data fields, and performs some event-specific formatting and normalization (e.g. for process start events it reassembles the process command-line arguments into a single string). This module is still a work in progress.
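The processing described above can be pictured with a minimal standard-library sketch. This is not msticpy's implementation: the regular expressions, the helper names (collapse_by_message_id, decode_hex, command_line), the assumed line layout and the sample lines are all illustrative assumptions.

```python
import re
from collections import defaultdict

# Assumed line layout: type=<TYPE> msg=audit(<timestamp>:<id>): <key=value ...>
LINE_RE = re.compile(
    r"type=(?P<type>\w+)\s+msg=audit\((?P<msg_id>[^)]+)\):\s*(?P<body>.*)"
)
FIELD_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')


def decode_hex(value):
    """Return the decoded text if value looks like hex-encoded bytes, else value."""
    if len(value) % 2 == 0 and re.fullmatch(r"[0-9A-Fa-f]+", value):
        try:
            return bytes.fromhex(value).decode("utf-8")
        except UnicodeDecodeError:
            return value
    return value


def collapse_by_message_id(lines):
    """Group records that share a message ID into one event dictionary."""
    events = defaultdict(dict)
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed layout
        fields = {}
        for key, raw in FIELD_RE.findall(match["body"]):
            quoted = raw.startswith('"')
            value = raw.strip('"')
            # In this sketch only unquoted EXECVE arguments (a0, a1, ...) are hex-decoded
            if match["type"] == "EXECVE" and re.fullmatch(r"a\d+", key) and not quoted:
                value = decode_hex(value)
            fields[key] = value
        events[match["msg_id"]][match["type"]] = fields
    return dict(events)


def command_line(execve_fields):
    """Join a0..aN of an EXECVE record into a single command-line string."""
    argc = int(execve_fields.get("argc", 0))
    return " ".join(execve_fields.get(f"a{i}", "") for i in range(argc))


# Made-up sample records that share one message ID
sample = [
    "type=SYSCALL msg=audit(1521757638.392:262): arch=c000003e syscall=59 pid=1234 ppid=1000",
    'type=EXECVE msg=audit(1521757638.392:262): argc=3 a0="ls" a1="-l" a2=2F746D702F6D792066696C65',
    'type=CWD msg=audit(1521757638.392:262): cwd="/home/user"',
]
events = collapse_by_message_id(sample)
event = events["1521757638.392:262"]
cmdline = command_line(event["EXECVE"])
```

In this sketch the three sample lines collapse into one event keyed by the message ID, and the hex-encoded third argument is decoded before the arguments are joined.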

msticpy.transform.auditdextract.extract_events_to_df(data: DataFrame, input_column: str = 'AuditdMessage', event_type: str | None = None, verbose: bool = False) DataFrame

Extract auditd raw messages into a dataframe.

Parameters:
  • data (pd.DataFrame) – The input dataframe with raw auditd data in a single string column

  • input_column (str, optional) – the input column name (the default is ‘AuditdMessage’)

  • event_type (str, optional) – The event type to extract; if None, all event types are processed (the default is None)

  • verbose (bool, optional) – Give feedback on stages of processing (the default is False)

Returns:

The resultant DataFrame

Return type:

pd.DataFrame

msticpy.transform.auditdextract.generate_process_tree(audit_data: DataFrame, branch_depth: int = 4, processes: DataFrame | None = None) DataFrame

Generate process tree data from auditd logs.

Parameters:
  • audit_data (pd.DataFrame) – The Audit data containing process creation events

  • branch_depth (int, optional) – The maximum depth of parent or child processes to extract from the data (the default is 4)

  • processes (pd.DataFrame, optional) – DataFrame of processes to generate the tree for (the default is None)

Returns:

The formatted process tree data

Return type:

pd.DataFrame
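To illustrate what a depth limit such as branch_depth bounds, here is a minimal sketch that walks parent and child links in a plain dictionary. It is not the library's algorithm: the function name related_processes and the pid-to-parent-pid input shape are hypothetical, chosen only to show a depth-limited traversal.

```python
def related_processes(parents, target, branch_depth=4):
    """Return (ancestors, descendants) of target within branch_depth levels.

    parents maps each pid to its parent pid (hypothetical input shape).
    """
    # Walk upwards through at most branch_depth parents
    ancestors = []
    current = target
    for _ in range(branch_depth):
        current = parents.get(current)
        if current is None:
            break
        ancestors.append(current)

    # Build the reverse (parent -> children) index once
    children = {}
    for pid, ppid in parents.items():
        children.setdefault(ppid, []).append(pid)

    # Walk downwards level by level, at most branch_depth levels
    descendants = []
    level = [target]
    for _ in range(branch_depth):
        level = [child for pid in level for child in children.get(pid, [])]
        if not level:
            break
        descendants.extend(level)
    return ancestors, descendants


# pid -> parent pid; pid 1 has no recorded parent in this made-up data
parents = {100: 1, 200: 100, 300: 200, 400: 300, 500: 400}
ancestors, descendants = related_processes(parents, target=300, branch_depth=1)
```

With a depth of 1 only the immediate parent and immediate children are returned; a larger depth extends the walk in both directions until the data runs out.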

msticpy.transform.auditdextract.get_event_subset(data: DataFrame, event_type: str) DataFrame

Return a subset of the events matching type event_type.

Parameters:
  • data (pd.DataFrame) – The input data

  • event_type (str) – The event type to select

Returns:

The subset of the data where data[‘EventType’] == event_type

Return type:

pd.DataFrame

msticpy.transform.auditdextract.read_from_file(filepath: str, event_type: str | None = None, verbose: bool = False, dummy_sep: str = '\t') DataFrame

Extract Audit events from a log file.

Parameters:
  • filepath (str) – path to the input file

  • event_type (str, optional) – The type of event to extract if only a subset required. (the default is None, which processes all types)

  • verbose (bool, optional) – If True, more progress messages are output (the default is False)

  • dummy_sep (str, optional) – Separator to use for reading the ‘csv’ file (the default is tab, ‘\t’)

Returns:

The output DataFrame

Return type:

pd.DataFrame

Notes

The dummy_sep parameter should be a character that does not occur in an input line. This function uses pandas read_csv to read the audit lines into a single column. Using a separator that does appear in the input (e.g. space or comma) will cause data to be parsed into multiple columns and anything after the first separator in a line will be lost.
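The effect of the separator choice can be sketched with the standard-library csv module, used here only as a stand-in for pandas read_csv (whose parsing details may differ). The helper first_column and the sample line are assumptions for illustration.

```python
import csv
import io

# A made-up audit line; it contains spaces but no tab characters
line = 'type=CWD msg=audit(1521757638.392:262): cwd="/home/user"\n'


def first_column(text, sep):
    """Read text as delimited rows and keep only the first column of each row."""
    reader = csv.reader(io.StringIO(text), delimiter=sep, quoting=csv.QUOTE_NONE)
    return [row[0] for row in reader if row]


# A tab never occurs in the line, so the whole line stays in one column
with_tab = first_column(line, "\t")

# A space does occur, so everything after the first space falls into other columns
with_space = first_column(line, " ")
```

With the tab separator the full record survives in the single column; with a space only the leading `type=CWD` token remains in the first column.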

msticpy.transform.auditdextract.unpack_auditd(audit_str: List[Dict[str, str]]) Mapping[str, Mapping[str, Any]]

Unpack an Audit message and return a dictionary of fields.

Parameters:

audit_str (List[Dict[str, str]]) – The auditd raw record

Returns:

The extracted message fields and values

Return type:

Mapping[str, Mapping[str, Any]]
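The shapes in the signature (a list of dictionaries in, a mapping of mappings out) can be pictured with the following minimal sketch. It assumes each input dictionary maps a record type to its raw key=value content; the function name unpack_records, the regular expression and the sample data are hypothetical, and the real function may do more than this.

```python
import re
from typing import Any, Dict, List, Mapping

# Assumed field layout: key=value pairs, where a value may be double-quoted
FIELD_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')


def unpack_records(records: List[Dict[str, str]]) -> Mapping[str, Mapping[str, Any]]:
    """Turn [{record_type: 'k=v k=v'}, ...] into {record_type: {k: v, ...}}."""
    event: Dict[str, Dict[str, Any]] = {}
    for record in records:
        for record_type, content in record.items():
            fields = {key: value.strip('"') for key, value in FIELD_RE.findall(content)}
            # Merge fields if the same record type appears more than once
            event.setdefault(record_type, {}).update(fields)
    return event


# Made-up input in the assumed shape
records = [
    {"SYSCALL": "syscall=59 pid=1234 ppid=1000"},
    {"CWD": 'cwd="/home/user"'},
]
unpacked = unpack_records(records)
```

Each record type becomes a top-level key whose value is the dictionary of fields parsed from that record's content.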