msticpy.analysis.anomalous_sequence.utils.laplace_smooth module
Helper module for Laplace smoothing of counts.
- msticpy.analysis.anomalous_sequence.utils.laplace_smooth.laplace_smooth_cmd_counts(seq1_counts: DefaultDict[str, int], seq2_counts: DefaultDict[str, DefaultDict[str, int]], start_token: str, end_token: str, unk_token: str) → Tuple[DefaultDict[str, int], DefaultDict[str, DefaultDict[str, int]]]
Apply Laplace (add-one) smoothing to the input counts for the commands.
Specifically, add 1 to each of the counts, including the count for the unk_token. Giving the unk_token a nonzero count is what allows unseen commands to be handled.
- Parameters:
seq1_counts (DefaultDict[str, int]) – individual command counts
seq2_counts (DefaultDict[str, DefaultDict[str, int]]) – sequence command (length 2) counts
start_token (str) – dummy command to signify the start of a session (e.g. “##START##”)
end_token (str) – dummy command to signify the end of a session (e.g. “##END##”)
unk_token (str) – dummy command to signify an unseen command (e.g. “##UNK##”)
- Returns:
tuple of Laplace-smoothed counts: individual command counts, sequence command (length 2) counts
- Return type:
Tuple[DefaultDict[str, int], DefaultDict[str, DefaultDict[str, int]]]
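The add-one step described above can be sketched roughly as follows. This is a minimal illustration, not msticpy's own code: the function name `add_one_smooth_cmds` is hypothetical, and the choice to skip bigrams that begin with the end token or finish with the start token is an assumption about how the dummy tokens are meant to behave. The library may weight individual counts differently.

```python
import copy
from collections import defaultdict
from typing import DefaultDict, Tuple

Counts = DefaultDict[str, int]
NestedCounts = DefaultDict[str, DefaultDict[str, int]]


def add_one_smooth_cmds(
    seq1_counts: Counts,
    seq2_counts: NestedCounts,
    start_token: str,
    end_token: str,
    unk_token: str,
) -> Tuple[Counts, NestedCounts]:
    """Hypothetical add-one smoothing sketch (not the msticpy implementation)."""
    # Work on copies so the caller's counts are left untouched.
    seq1 = copy.deepcopy(seq1_counts)
    seq2 = copy.deepcopy(seq2_counts)

    # Vocabulary: every observed command plus the unknown-command token.
    cmds = list(dict.fromkeys([*seq1, unk_token]))

    # Add 1 to every individual command count, including unk_token.
    for cmd in cmds:
        seq1[cmd] += 1

    # Add 1 to every length-2 sequence count.
    # Assumption: nothing follows end_token and nothing precedes start_token.
    for first in cmds:
        if first == end_token:
            continue
        for second in cmds:
            if second == start_token:
                continue
            seq2[first][second] += 1

    return seq1, seq2


# Counts from a single session: ##START## -> Set-User -> ##END##
seq1 = defaultdict(int, {"##START##": 1, "Set-User": 1, "##END##": 1})
seq2 = defaultdict(lambda: defaultdict(int))
seq2["##START##"]["Set-User"] = 1
seq2["Set-User"]["##END##"] = 1

s1, s2 = add_one_smooth_cmds(seq1, seq2, "##START##", "##END##", "##UNK##")
print(dict(s1))
print({cmd: dict(following) for cmd, following in s2.items()})
```

After smoothing, the pair ("Set-User", "##UNK##") has a count of 1 even though it never occurred in the input, which is how an unseen command can later be scored.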
- msticpy.analysis.anomalous_sequence.utils.laplace_smooth.laplace_smooth_param_counts(cmds: List[str], param_counts: DefaultDict[str, int], cmd_param_counts: DefaultDict[str, DefaultDict[str, int]], unk_token: str) → Tuple[DefaultDict[str, int], DefaultDict[str, DefaultDict[str, int]]]
Apply Laplace (add-one) smoothing to the input counts for the params.
Specifically, add 1 to each of the counts, including the count for the unk_token. Giving the unk_token a nonzero count is what allows unseen params to be handled.
- Parameters:
cmds (List[str]) – list of all the possible commands (including the unk_token)
param_counts (DefaultDict[str, int]) – individual param counts
cmd_param_counts (DefaultDict[str, DefaultDict[str, int]]) – param conditional on command counts
unk_token (str) – dummy token to signify an unseen param (e.g. “##UNK##”)
- Returns:
tuple of Laplace-smoothed counts: individual param counts, param conditional on command counts
- Return type:
Tuple[DefaultDict[str, int], DefaultDict[str, DefaultDict[str, int]]]
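As a rough illustration of the inputs and outputs, the sketch below applies add-one smoothing to param counts conditional on commands. The name `add_one_smooth_params` is hypothetical, and the sketch makes the simplifying assumption that every (command, param) pair is incremented. The library function may restrict which pairs receive the extra count.

```python
import copy
from collections import defaultdict
from typing import DefaultDict, List, Tuple

Counts = DefaultDict[str, int]
NestedCounts = DefaultDict[str, DefaultDict[str, int]]


def add_one_smooth_params(
    cmds: List[str],
    param_counts: Counts,
    cmd_param_counts: NestedCounts,
    unk_token: str,
) -> Tuple[Counts, NestedCounts]:
    """Hypothetical add-one smoothing sketch (not the msticpy implementation)."""
    # Copy first so the input counts are not modified.
    smoothed_params = copy.deepcopy(param_counts)
    smoothed_cmd_params = copy.deepcopy(cmd_param_counts)

    # Vocabulary: every observed param plus the unknown token.
    params = list(dict.fromkeys([*param_counts, unk_token]))

    # Add 1 to every individual param count.
    for param in params:
        smoothed_params[param] += 1

    # Assumption: add 1 to every param count under every command.
    for cmd in cmds:
        for param in params:
            smoothed_cmd_params[cmd][param] += 1

    return smoothed_params, smoothed_cmd_params


param_counts = defaultdict(int, {"Identity": 2, "Force": 1})
cmd_param_counts = defaultdict(lambda: defaultdict(int))
cmd_param_counts["Set-User"]["Identity"] = 2
cmd_param_counts["Set-User"]["Force"] = 1

p, cp = add_one_smooth_params(
    ["Set-User", "##UNK##"], param_counts, cmd_param_counts, "##UNK##"
)
print(dict(p))
print({cmd: dict(prm) for cmd, prm in cp.items()})
```

Note that `cmds` is expected to contain the unknown token already, so the unknown command also receives a row of param counts.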
- msticpy.analysis.anomalous_sequence.utils.laplace_smooth.laplace_smooth_value_counts(params: List[str], value_counts: DefaultDict[str, int], param_value_counts: DefaultDict[str, DefaultDict[str, int]], unk_token: str) → Tuple[DefaultDict[str, int], DefaultDict[str, DefaultDict[str, int]]]
Apply Laplace (add-one) smoothing to the input counts for the values.
Specifically, add 1 to each of the counts, including the count for the unk_token. Giving the unk_token a nonzero count is what allows unseen values to be handled.
- Parameters:
params (List[str]) – list of all possible params, including the unk_token
value_counts (DefaultDict[str, int]) – individual value counts
param_value_counts (DefaultDict[str, DefaultDict[str, int]]) – value conditional on param counts
unk_token (str) – dummy token to signify an unseen value (e.g. “##UNK##”)
- Returns:
tuple of Laplace-smoothed counts: individual value counts, value conditional on param counts
- Return type:
Tuple[DefaultDict[str, int], DefaultDict[str, DefaultDict[str, int]]]
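To show why the unk_token count matters, the sketch below turns already-smoothed value counts into a conditional probability and looks up a value that never appeared in the data. Everything here is illustrative: the helper `smoothed_value_prob` is hypothetical, the example data is made up, and normalising by the row total is an assumption for the sake of the example rather than the formula the library uses.

```python
from collections import defaultdict
from typing import DefaultDict

UNK = "##UNK##"


def smoothed_value_prob(
    param_value_counts: DefaultDict[str, DefaultDict[str, int]],
    param: str,
    value: str,
    unk_token: str,
) -> float:
    """Illustrative P(value | param) from smoothed counts (hypothetical helper)."""
    # Fall back to the unknown token when the param itself was never seen.
    row = param_value_counts.get(param) or param_value_counts.get(unk_token, {})
    # Unseen values are scored with the count reserved for unk_token.
    key = value if value in row else unk_token
    total = sum(row.values())
    return row.get(key, 0) / total if total else 0.0


# Raw (unsmoothed) counts for one param; illustrative data only.
raw = {"Identity": {"alice": 3, "bob": 1}}

# Add 1 to every observed value and to the unknown token.
smoothed = defaultdict(lambda: defaultdict(int))
for param, values in raw.items():
    for value in [*values, UNK]:
        smoothed[param][value] = values.get(value, 0) + 1

seen = smoothed_value_prob(smoothed, "Identity", "alice", UNK)
unseen = smoothed_value_prob(smoothed, "Identity", "mallory", UNK)
print(seen, unseen)
```

Without the extra count for the unknown token, the unseen value "mallory" would have no count to fall back on and would score zero.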