msticpy.sectools package

msticpy.sectools.auditdextract module

Auditd extractor.

Module to load and decode Linux audit logs. It collapses messages sharing the same message ID into single events, decodes hex-encoded data fields and performs some event-specific formatting and normalization (e.g. for process start events it will re-assemble the process command line arguments into a single string). This is still a work-in-progress.
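The collapsing and decoding steps described above can be sketched with the standard library alone. This is an illustration, not the module's implementation: the sample log lines, the helper names (`parse_line`, `collapse`, `maybe_hex_decode`) and the rule that only unquoted EXECVE arguments are hex-encoded are all assumptions.

```python
import re
from collections import defaultdict

# Header of an auditd record: type=<TYPE> msg=audit(<timestamp>:<id>): <fields>
HEADER = re.compile(
    r"type=(?P<type>\S+)\s+msg=audit\((?P<ts>[\d.]+):(?P<id>\d+)\):\s*(?P<body>.*)"
)
FIELD = re.compile(r'(?P<key>\w+)=(?P<val>"[^"]*"|\S+)')
HEX = re.compile(r"^(?:[0-9A-Fa-f]{2})+$")


def maybe_hex_decode(value):
    """Decode value if it looks like hex-encoded UTF-8 text, else return it unchanged."""
    if HEX.match(value):
        try:
            return bytes.fromhex(value).decode("utf-8")
        except UnicodeDecodeError:
            return value
    return value


def parse_line(line):
    """Return (message_id, record_type, fields) for one raw audit line, or None."""
    match = HEADER.match(line)
    if not match:
        return None
    fields = {}
    for fld in FIELD.finditer(match["body"]):
        key, val = fld["key"], fld["val"]
        quoted = val.startswith('"')
        val = val.strip('"')
        # Assumption: only unquoted EXECVE arguments (a0, a1, ...) are hex-encoded.
        if match["type"] == "EXECVE" and not quoted and re.fullmatch(r"a\d+", key):
            val = maybe_hex_decode(val)
        fields[key] = val
    return match["id"], match["type"], fields


def collapse(lines):
    """Group records that share a message ID into a single event dictionary."""
    events = defaultdict(dict)
    for line in lines:
        parsed = parse_line(line)
        if parsed:
            msg_id, rec_type, fields = parsed
            events[msg_id][rec_type] = fields
    # Re-assemble the command line for events that contain an EXECVE record.
    for event in events.values():
        execve = event.get("EXECVE")
        if execve:
            argc = int(execve.get("argc", 0))
            event["cmdline"] = " ".join(execve.get(f"a{i}", "") for i in range(argc))
    return dict(events)


sample = [
    'type=SYSCALL msg=audit(1560000000.123:42): arch=c000003e syscall=59 success=yes exe="/bin/cat"',
    'type=EXECVE msg=audit(1560000000.123:42): argc=2 a0="cat" a1=2F6574632F686F737473',
    'type=CWD msg=audit(1560000000.123:42): cwd="/root"',
]
events = collapse(sample)
print(events["42"]["cmdline"])
```

The three sample records share message ID 42, so they end up in one event whose command line is rebuilt from the EXECVE arguments.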

msticpy.sectools.auditdextract.extract_events_to_df(data: pandas.core.frame.DataFrame, input_column: str = 'AuditdMessage', event_type: Optional[str] = None, verbose: bool = False) pandas.core.frame.DataFrame

Extract auditd raw messages into a dataframe.

Parameters
  • data (pd.DataFrame) – The input dataframe with raw auditd data in a single string column

  • input_column (str, optional) – the input column name (the default is ‘AuditdMessage’)

  • event_type (str, optional) – the event type, if None, defaults to all (the default is None)

  • verbose (bool, optional) – Give feedback on stages of processing (the default is False)

Returns

The resultant DataFrame

Return type

pd.DataFrame

msticpy.sectools.auditdextract.generate_process_tree(audit_data: pandas.core.frame.DataFrame, branch_depth: int = 4, processes: Optional[pandas.core.frame.DataFrame] = None) pandas.core.frame.DataFrame

Generate process tree data from auditd logs.

Parameters
  • audit_data (pd.DataFrame) – The Audit data containing process creation events

  • branch_depth (int, optional) – The maximum depth of parent or child processes to extract from the data (The default is 4)

  • processes (pd.DataFrame, optional) – Dataframe of processes to generate tree for

Returns

The formatted process tree data

Return type

pd.DataFrame
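To illustrate what a branch_depth limit means, the sketch below walks a fixed number of levels up (parents) and down (children) from one process. It is a simplified stand-in and not how generate_process_tree is implemented: the `pid -> parent pid` dictionary and the function name `sketch_branch` are hypothetical.

```python
def sketch_branch(processes, pid, branch_depth=4):
    """Return (ancestors, descendants) of pid, each limited to branch_depth levels.

    processes maps pid -> parent pid (a hypothetical, simplified input format).
    """
    # Walk upwards while the parent is itself a known process.
    ancestors = []
    current = pid
    while len(ancestors) < branch_depth and processes.get(current) in processes:
        current = processes[current]
        ancestors.append(current)

    # Walk downwards one generation at a time.
    descendants, frontier = [], [pid]
    for _ in range(branch_depth):
        frontier = [child for child, parent in processes.items() if parent in frontier]
        if not frontier:
            break
        descendants.extend(frontier)
    return ancestors, descendants


procs = {1: 0, 100: 1, 200: 100, 300: 200, 301: 200}
print(sketch_branch(procs, 200, branch_depth=1))
print(sketch_branch(procs, 200, branch_depth=4))
```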

msticpy.sectools.auditdextract.get_event_subset(data: pandas.core.frame.DataFrame, event_type: str) pandas.core.frame.DataFrame

Return a subset of the events matching type event_type.

Parameters
  • data (pd.DataFrame) – The input data

  • event_type (str) – The event type to select

Returns

The subset of the data where data[‘EventType’] == event_type

Return type

pd.DataFrame

msticpy.sectools.auditdextract.read_from_file(filepath: str, event_type: Optional[str] = None, verbose: bool = False, dummy_sep: str = '\t') pandas.core.frame.DataFrame

Extract Audit events from a log file.

Parameters
  • filepath (str) – path to the input file

  • event_type (str, optional) – The type of event to extract if only a subset required. (the default is None, which processes all types)

  • verbose (bool, optional) – If true more progress messages are output (the default is False)

  • dummy_sep (str, optional) – Separator to use for reading the ‘csv’ file (default is tab - ‘\t’)

Returns

The output DataFrame

Return type

pd.DataFrame

Notes

The dummy_sep parameter should be a character that does not occur in an input line. This function uses pandas read_csv to read the audit lines into a single column. Using a separator that does appear in the input (e.g. space or comma) will cause data to be parsed into multiple columns, and anything after the first separator in a line will be lost.
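The effect of the separator choice can be shown without pandas. The sketch below uses the standard library `csv` module as a stand-in for read_csv (an assumption made so the example has no third-party dependency); the sample line and the helper name `first_column` are hypothetical.

```python
import csv
import io

line = 'type=CWD msg=audit(1560000000.123:42): cwd="/home/user,backup"\n'


def first_column(text, sep):
    """Read text as delimited data and keep only the first column of each row."""
    reader = csv.reader(io.StringIO(text), delimiter=sep, quoting=csv.QUOTE_NONE)
    return [row[0] for row in reader if row]


# A tab never occurs in the line, so the whole record stays in one column.
print(first_column(line, "\t"))
# A comma does occur, so everything after it falls into a second column.
print(first_column(line, ","))
```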

msticpy.sectools.auditdextract.unpack_auditd(audit_str: List[Dict[str, str]]) Mapping[str, Mapping[str, Any]]

Unpack an Audit message and return a dictionary of fields.

Parameters

audit_str (str) – The auditd raw record

Returns

The extracted message fields and values

Return type

Mapping[str, Any]

msticpy.sectools.base64unpack module

base64_unpack.

The main function of this module is to decode and unpack strings that are obfuscated using base64 and/or certain compression algorithms such as gzip and zip.

It has the following functions: unpack_items - this is the main entry point and takes either a string or a pandas dataframe (with specified column) as input. It returns a string with obfuscated parts replaced by decoded equivalents (unless the decoding results in an undecodable binary, in which case a placeholder is used).

Other helper functions may also be useful standalone:

  • get_items_from_gzip(binary): Return decompressed gzip content of byte string

  • get_items_from_zip(binary): Return dictionary of zip contents from byte string

  • get_items_from_tar(binary): Return dictionary of tar file contents

  • get_hashes(binary): Return md5, sha1 and sha256 hashes of input byte string

class msticpy.sectools.base64unpack.B64ExtractAccessor(pandas_obj)

Bases: object

Base64 Unpack pandas extension.

Initialize the extension.

extract(column, **kwargs) pandas.core.frame.DataFrame

Base64 decode strings taken from a pandas dataframe.

Parameters
  • data (pd.DataFrame) – dataframe containing column to decode

  • column (str) – Name of dataframe text column

  • trace (bool, optional) – Show additional status (the default is None)

  • utf16 (bool, optional) – Attempt to decode UTF16 byte strings

Returns

Decoded string and additional metadata in dataframe

Return type

pd.DataFrame

Notes

Items that decode to utf-8 or utf-16 strings will be returned as decoded strings replaced in the original string. If the encoded string is a known binary type it will identify the file type and return the hashes of the file. If any binary types are known archives (zip, tar, gzip) it will unpack the contents of the archive. For any binary it will return the decoded file as a byte array, and as a printable list of byte values.

The columns of the output DataFrame are:

  • decoded string: this is the input string with any decoded sections replaced by the results of the decoding

  • reference : this is an index that matches an index number in the decoded string (e.g. <<encoded binary type=pdf index=1.2’).

  • original_string : the string prior to decoding

  • file_type : the type of file if this could be determined

  • file_hashes : a dictionary of hashes (the md5, sha1 and sha256 hashes are broken out into separate columns)

  • input_bytes : the binary image as a byte array

  • decoded_string : printable form of the decoded string (either string or list of hex byte values)

  • encoding_type : utf-8, utf-16 or binary

  • md5, sha1, sha256 : the respective hashes of the binary. file_type, file_hashes, input_bytes, md5, sha1 and sha256 will be null if this item is decoded to a string

  • src_index - the index of the source row in the input frame.

class msticpy.sectools.base64unpack.BinaryRecord(reference, original_string, file_name, file_type, input_bytes, decoded_string, encoding_type, file_hashes, md5, sha1, sha256, printable_bytes)

Bases: tuple

Create new instance of BinaryRecord(reference, original_string, file_name, file_type, input_bytes, decoded_string, encoding_type, file_hashes, md5, sha1, sha256, printable_bytes)

count(value, /)

Return number of occurrences of value.

property decoded_string

Alias for field number 5

property encoding_type

Alias for field number 6

property file_hashes

Alias for field number 7

property file_name

Alias for field number 2

property file_type

Alias for field number 3

index(value, start=0, stop=9223372036854775807, /)

Return first index of value.

Raises ValueError if the value is not present.

property input_bytes

Alias for field number 4

property md5

Alias for field number 8

property original_string

Alias for field number 1

property printable_bytes

Alias for field number 11

property reference

Alias for field number 0

property sha1

Alias for field number 9

property sha256

Alias for field number 10

msticpy.sectools.base64unpack.get_hashes(binary: bytes) Dict[str, str]

Return md5, sha1 and sha256 hashes of input byte string.

Parameters

binary (bytes) – byte string of item to be hashed

Returns

dictionary of hash algorithm + hash value

Return type

Dict[str, str]
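A minimal sketch of how such a dictionary can be produced with the standard library `hashlib` module. The function name `sketch_hashes` and the dictionary keys are assumptions; the keys used by get_hashes itself are not stated here.

```python
import hashlib


def sketch_hashes(binary: bytes) -> dict:
    """Return hex digests of binary keyed by algorithm name."""
    return {
        name: hashlib.new(name, binary).hexdigest()
        for name in ("md5", "sha1", "sha256")
    }


print(sketch_hashes(b"hello"))
```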

msticpy.sectools.base64unpack.get_items_from_gzip(binary: bytes) Tuple[str, Dict[str, bytes]]

Return decompressed gzip contents.

Parameters

binary (bytes) – byte array of gz file

Returns

File type + decompressed file

Return type

Tuple[str, bytes]

msticpy.sectools.base64unpack.get_items_from_tar(binary: bytes) Tuple[str, Dict[str, bytes]]

Return dictionary of tar file contents.

Parameters

binary (bytes) – byte array of tar file

Returns

Filetype + dictionary of file name + file content

Return type

Tuple[str, Dict[str, bytes]]

msticpy.sectools.base64unpack.get_items_from_zip(binary: bytes) Tuple[str, Dict[str, bytes]]

Return dictionary of zip contents.

Parameters

binary (bytes) – byte array of zip file

Returns

Filetype + dictionary of file name + file content

Return type

Tuple[str, Dict[str, bytes]]
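Reading archive members from a byte string, rather than from a file on disk, can be sketched with `zipfile` and `io.BytesIO`. This is an illustration under assumptions: `sketch_items_from_zip` is a hypothetical name and it returns only the name-to-content dictionary, not the file type element of the tuple.

```python
import io
import zipfile


def sketch_items_from_zip(binary: bytes) -> dict:
    """Return {member name: member bytes} for a zip archive held in memory."""
    with zipfile.ZipFile(io.BytesIO(binary)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}


# Build a small archive in memory to exercise the function.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("note.txt", "hello")

contents = sketch_items_from_zip(buffer.getvalue())
print(contents)
```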

msticpy.sectools.base64unpack.unpack(input_string: str, trace: bool = False, utf16: bool = False) Tuple[str, Optional[List[msticpy.sectools.base64unpack.BinaryRecord]]]

Base64 decode an input string.

Parameters
  • input_string (str, optional) – single string to decode (the default is None)

  • trace (bool, optional) – Show additional status (the default is False)

  • utf16 (bool, optional) – Attempt to decode UTF16 byte strings

Returns

Decoded string and additional metadata

Return type

Tuple[str, Optional[List[BinaryRecord]]]

Notes

Items that decode to utf-8 or utf-16 strings will be returned as decoded strings replaced in the original string. If the encoded string is a known binary type it will identify the file type and return the hashes of the file. If any binary types are known archives (zip, tar, gzip) it will unpack the contents of the archive. For any binary it will return the decoded file as a byte array, and as a printable list of byte values. If the input is a string the function returns:

  • decoded string: this is the input string with any decoded sections replaced by the results of the decoding
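The replace-in-place behaviour described in the notes can be sketched as follows. This is not the module's algorithm: the candidate regex (runs of 24 or more base64 characters), the `<binary N bytes>` placeholder and the name `sketch_unpack` are assumptions chosen for illustration, and archive unpacking and UTF-16 handling are left out.

```python
import base64
import binascii
import re

# Assumption: runs of 24+ base64 characters (with optional padding) are candidates.
B64_CANDIDATE = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")


def sketch_unpack(text: str) -> str:
    """Replace base64 runs that decode to UTF-8 text; mark other decodes as binary."""

    def decode(match):
        try:
            raw = base64.b64decode(match.group(0), validate=True)
        except (binascii.Error, ValueError):
            return match.group(0)  # not valid base64, leave untouched
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            return f"<binary {len(raw)} bytes>"  # hypothetical placeholder format

    return B64_CANDIDATE.sub(decode, text)


encoded = base64.b64encode(b"echo 'this is a test string'").decode("ascii")
print(sketch_unpack(f"powershell -enc {encoded}"))
```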

msticpy.sectools.base64unpack.unpack_df(data: pandas.core.frame.DataFrame, column: str, trace: bool = False, utf16: bool = False) pandas.core.frame.DataFrame

Base64 decode strings taken from a pandas dataframe.

Parameters
  • data (pd.DataFrame) – dataframe containing column to decode

  • column (str) – Name of dataframe text column

  • trace (bool, optional) – Show additional status (the default is False)

  • utf16 (bool, optional) – Attempt to decode UTF16 byte strings

Returns

Decoded string and additional metadata in dataframe

Return type

pd.DataFrame

Notes

Items that decode to utf-8 or utf-16 strings will be returned as decoded strings replaced in the original string. If the encoded string is a known binary type it will identify the file type and return the hashes of the file. If any binary types are known archives (zip, tar, gzip) it will unpack the contents of the archive. For any binary it will return the decoded file as a byte array, and as a printable list of byte values.

The columns of the output DataFrame are:

  • decoded string: this is the input string with any decoded sections replaced by the results of the decoding

  • reference : this is an index that matches an index number in the decoded string (e.g. <<encoded binary type=pdf index=1.2’).

  • original_string : the string prior to decoding

  • file_type : the type of file if this could be determined

  • file_hashes : a dictionary of hashes (the md5, sha1 and sha256 hashes are broken out into separate columns)

  • input_bytes : the binary image as a byte array

  • decoded_string : printable form of the decoded string (either string or list of hex byte values)

  • encoding_type : utf-8, utf-16 or binary

  • md5, sha1, sha256 : the respective hashes of the binary. file_type, file_hashes, input_bytes, md5, sha1 and sha256 will be null if this item is decoded to a string

  • src_index - the index of the source row in the input frame.

msticpy.sectools.base64unpack.unpack_items(input_string: Optional[str] = None, data: Optional[pandas.core.frame.DataFrame] = None, column: Optional[str] = None, trace: bool = False, utf16: bool = False) Any

Base64 decode an input string or strings taken from a pandas dataframe.

Parameters
  • input_string (str, optional) – single string to decode (the default is None)

  • data (pd.DataFrame, optional) – dataframe containing column to decode (the default is None)

  • column (str, optional) – Name of dataframe text column (the default is None)

  • trace (bool, optional) – Show additional status (the default is False)

  • utf16 (bool, optional) – Attempt to decode UTF16 byte strings

Returns

  • Tuple[str, pd.DataFrame] (if input_string) – Decoded string and additional metadata

  • pd.DataFrame – Decoded string and additional metadata in dataframe

Notes

If the input is a dataframe you must supply the name of the column to use.

Items that decode to utf-8 or utf-16 strings will be returned as decoded strings replaced in the original string. If the encoded string is a known binary type it will identify the file type and return the hashes of the file. If any binary types are known archives (zip, tar, gzip) it will unpack the contents of the archive. For any binary it will return the decoded file as a byte array, and as a printable list of byte values. If the input is a string the function returns:

  • decoded string: this is the input string with any decoded sections replaced by the results of the decoding

It also returns the data as a Pandas DataFrame with the following columns:

  • reference : this is an index that matches an index number in the returned string (e.g. <<encoded binary type=pdf index=1.2’).

  • original_string : the string prior to decoding

  • file_type : the type of file if this could be determined

  • file_hashes : a dictionary of hashes (the md5, sha1 and sha256 hashes are broken out into separate columns)

  • input_bytes : the binary image as a byte array

  • decoded_string : printable form of the decoded string (either string or list of hex byte values)

  • encoding_type : utf-8, utf-16 or binary

  • md5, sha1, sha256 : the respective hashes of the binary. file_type, file_hashes, input_bytes, md5, sha1 and sha256 will be null if this item is decoded to a string

If the input is a dataframe the output dataframe will also include the following column:

  • src_index : the index of the source row in the input frame. This allows you to re-join the output data to the input data.

msticpy.sectools.cmd_line module

cmd_line - Syslog Command processing module.

Contains a series of functions required to correctly collect, parse and visualise linux syslog data.

Designed to support standard linux syslog for investigations where auditd is not available.

msticpy.sectools.cmd_line.cmd_speed(cmd_events: pandas.core.frame.DataFrame, cmd_field: str, time: int = 5, events: int = 10) list

Detect patterns of cmd_line activity whose speed of execution may be suspicious.

Parameters
  • cmd_events (pd.DataFrame) – A DataFrame of all sudo events to check.

  • cmd_field (str) – The column of the event data that contains command line activity

  • time (int, optional) – Time window in seconds in which to evaluate speed of execution against (Defaults to 5)

  • events (int, optional) – Number of syslog command execution events in which to evaluate speed of execution against (Defaults to 10)

Returns

risky suspicious_actions – A list of commands that match a risky pattern

Return type

list

Raises

AttributeError – If cmd_field is not in the supplied data set or TimeGenerated is not in datetime format
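One way to read the time and events parameters is as a sliding window: flag any run of `events` consecutive commands that falls within `time` seconds. The sketch below implements that reading on plain tuples. It is an assumption about the intent, not the function's actual logic, and `sketch_cmd_speed` is a hypothetical name.

```python
from datetime import datetime, timedelta


def sketch_cmd_speed(events, window_seconds=5, min_events=10):
    """Return runs of min_events commands that occur within window_seconds.

    events is a list of (timestamp, command) tuples.
    """
    ordered = sorted(events, key=lambda item: item[0])
    window = timedelta(seconds=window_seconds)
    bursts = []
    for start in range(len(ordered) - min_events + 1):
        end = start + min_events - 1
        # The run qualifies if its first and last event are close enough in time.
        if ordered[end][0] - ordered[start][0] <= window:
            bursts.append([cmd for _, cmd in ordered[start:end + 1]])
    return bursts


base = datetime(2020, 1, 1, 12, 0, 0)
fast = [(base + timedelta(milliseconds=100 * i), f"cmd{i}") for i in range(10)]
slow = [(base + timedelta(minutes=5 + i), f"slow{i}") for i in range(10)]
bursts = sketch_cmd_speed(fast + slow)
print(len(bursts))
```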

msticpy.sectools.cmd_line.risky_cmd_line(events: pandas.core.frame.DataFrame, log_type: str, detection_rules: str = '/home/docs/checkouts/readthedocs.org/user_builds/msticpy/envs/stable/lib/python3.7/site-packages/msticpy/resources/cmd_line_rules.json', cmd_field: str = 'Command') dict

Detect patterns of risky commands in syslog messages.

Risky patterns are defined in a json format file.

Parameters
  • events (pd.DataFrame) – A DataFrame of all syslog events potentially containing risky command line activity.

  • log_type (str) – The log type of the data included in events. Must correspond to a detection type in detection_rules file.

  • detection_rules (str, optional) – Path to json file containing patterns of risky activity to detect. (Defaults to msticpy/resources/cmd_line_rules.json)

  • cmd_field (str, optional) – The column in the events dataset that contains the command lines to be analysed. (Defaults to “Command”)

Returns

risky actions – A dictionary of commands that match a risky pattern

Return type

dict

Raises

MsticpyException – The provided dataset does not contain the cmd_field field
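Rule-driven matching of this kind can be sketched as a JSON document of regular expressions keyed by log type. The layout of the real cmd_line_rules.json is not described here, so the rule format, the rule names and the function name `sketch_risky_cmd_line` below are all hypothetical.

```python
import json
import re

# Hypothetical rules document: {log type: {rule name: regular expression}}.
RULES_JSON = r"""
{
  "Syslog": {
    "download_and_execute": "(curl|wget)\\s.*\\|\\s*(ba)?sh",
    "history_tampering": "history\\s+-c"
  }
}
"""


def sketch_risky_cmd_line(commands, log_type, rules_json=RULES_JSON):
    """Return {command: rule name} for commands that match a rule for log_type."""
    rules = json.loads(rules_json)[log_type]
    hits = {}
    for command in commands:
        for name, pattern in rules.items():
            if re.search(pattern, command):
                hits[command] = name
                break  # record only the first rule that matches
    return hits


commands = ["ls -la", "curl http://example.test/x.sh | sh", "history -c"]
print(sketch_risky_cmd_line(commands, "Syslog"))
```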

msticpy.sectools.geoip module

Geoip Lookup module using IPStack and Maxmind GeoLite2.

Geographic location lookup for IP addresses. This module has two classes for different services: GeoLiteLookup (MaxMind GeoLite2) and IPStackLookup (IPStack).

Both services offer a free tier for non-commercial use. However, a paid tier will normally get you more accuracy, more detail and a higher throughput rate. Maxmind geolite uses a downloadable database, while IPStack is an online lookup (API key required).

exception msticpy.sectools.geoip.GeoIPDatabaseException

Bases: Exception

Exception when GeoIP database cannot be found.

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

class msticpy.sectools.geoip.GeoIpLookup

Bases: object

Abstract base class for GeoIP Lookup classes.

See also

IPStackLookup

IPStack GeoIP Implementation

GeoLiteLookup

MaxMind GeoIP Implementation

Initialize instance of GeoIpLookup class.

df_lookup_ip(data: pandas.core.frame.DataFrame, column: str) pandas.core.frame.DataFrame

Lookup Geolocation data from a pandas Dataframe.

Parameters
  • data (pd.DataFrame) – pandas dataframe containing IpAddress column

  • column (str) – the name of the dataframe column to use as a source

Returns

Copy of original dataframe with IP Location information columns appended (where a location lookup was successful)

Return type

pd.DataFrame

abstract lookup_ip(ip_address: Optional[str] = None, ip_addr_list: Optional[collections.abc.Iterable] = None, ip_entity: Optional[msticpy.datamodel.entities.ip_address.IpAddress] = None) Tuple[List[Any], List[msticpy.datamodel.entities.ip_address.IpAddress]]

Lookup IP location abstract method.

Parameters
  • ip_address (str, optional) – a single address to look up (the default is None)

  • ip_addr_list (Iterable, optional) – a collection of addresses to lookup (the default is None)

  • ip_entity (IpAddress, optional) – an IpAddress entity (the default is None) - any existing data in the Location property will be overwritten

Returns

raw geolocation results and same results as IpAddress entities with populated Location property.

Return type

Tuple[List[Any], List[IpAddress]]

lookup_ips(data: pandas.core.frame.DataFrame, column: str) pandas.core.frame.DataFrame

Lookup Geolocation data from a pandas Dataframe.

Parameters
  • data (pd.DataFrame) – pandas dataframe containing IpAddress column

  • column (str) – the name of the dataframe column to use as a source

Returns

IpLookup results as DataFrame.

Return type

pd.DataFrame

class msticpy.sectools.geoip.GeoLiteLookup(api_key: Optional[str] = None, db_folder: Optional[str] = None, force_update: bool = False, auto_update: bool = True, debug: bool = False)

Bases: msticpy.sectools.geoip.GeoIpLookup

GeoIP Lookup using MaxMindDB database.

See also

GeoIpLookup

Abstract base class

IPStackLookup

IPStack GeoIP Implementation

Return new instance of GeoLiteLookup class.

Parameters
  • api_key (str, optional) – API key from MaxMind. The default is None, which uses the configuration value from msticpyconfig.yaml. Read more about GeoLite2 (https://dev.maxmind.com/geoip/geoip2/geolite2/), sign up for a MaxMind account (https://www.maxmind.com/en/geolite2/signup), then set your password and create a license key (https://www.maxmind.com/en/accounts/current/license-key)

  • db_folder (str, optional) – Absolute path to the folder containing the MMDB file (e.g. ‘/usr/home’ or ‘C:/maxmind’). If no path is provided, the database is downloaded to .msticpy/GeoLite2 under the user's home directory.

  • force_update (bool, optional) – If True, a new download request is initiated. (the default is False)

  • auto_update (bool, optional) – If True, a new download request is initiated when the age criteria is matched. (the default is True)

  • debug (bool, optional) – Print additional debugging information, default is False.

close()

Close an open GeoIP DB.

df_lookup_ip(data: pandas.core.frame.DataFrame, column: str) pandas.core.frame.DataFrame

Lookup Geolocation data from a pandas Dataframe.

Parameters
  • data (pd.DataFrame) – pandas dataframe containing IpAddress column

  • column (str) – the name of the dataframe column to use as a source

Returns

Copy of original dataframe with IP Location information columns appended (where a location lookup was successful)

Return type

pd.DataFrame

lookup_ip(ip_address: Optional[str] = None, ip_addr_list: Optional[collections.abc.Iterable] = None, ip_entity: Optional[msticpy.datamodel.entities.ip_address.IpAddress] = None) Tuple[List[Any], List[msticpy.datamodel.entities.ip_address.IpAddress]]

Lookup IP location from GeoLite2 data created by MaxMind.

Parameters
  • ip_address (str, optional) – a single address to look up (the default is None)

  • ip_addr_list (Iterable, optional) – a collection of addresses to lookup (the default is None)

  • ip_entity (IpAddress, optional) – an IpAddress entity (the default is None) - any existing data in the Location property will be overwritten

Returns

raw geolocation results and same results as IpAddress entities with populated Location property.

Return type

Tuple[List[Any], List[IpAddress]]

lookup_ips(data: pandas.core.frame.DataFrame, column: str) pandas.core.frame.DataFrame

Lookup Geolocation data from a pandas Dataframe.

Parameters
  • data (pd.DataFrame) – pandas dataframe containing IpAddress column

  • column (str) – the name of the dataframe column to use as a source

Returns

IpLookup results as DataFrame.

Return type

pd.DataFrame

class msticpy.sectools.geoip.IPStackLookup(api_key: Optional[str] = None, bulk_lookup: bool = False)

Bases: msticpy.sectools.geoip.GeoIpLookup

IPStack GeoIP Implementation.

See also

GeoIpLookup

Abstract base class

GeoLiteLookup

MaxMind GeoIP Implementation

Create a new instance of IPStackLookup.

Parameters
  • api_key (str, optional) – API Key from IPStack - see https://ipstack.com default is None - obtain key from msticpyconfig.yaml

  • bulk_lookup (bool, optional) – For Professional and above tiers allowing you to submit multiple IPs in a single request. (the default is False, which submits a single request per address)

df_lookup_ip(data: pandas.core.frame.DataFrame, column: str) pandas.core.frame.DataFrame

Lookup Geolocation data from a pandas Dataframe.

Parameters
  • data (pd.DataFrame) – pandas dataframe containing IpAddress column

  • column (str) – the name of the dataframe column to use as a source

Returns

Copy of original dataframe with IP Location information columns appended (where a location lookup was successful)

Return type

pd.DataFrame

lookup_ip(ip_address: Optional[str] = None, ip_addr_list: Optional[collections.abc.Iterable] = None, ip_entity: Optional[msticpy.datamodel.entities.ip_address.IpAddress] = None) Tuple[List[Any], List[msticpy.datamodel.entities.ip_address.IpAddress]]

Lookup IP location from IPStack web service.

Parameters
  • ip_address (str, optional) – a single address to look up (the default is None)

  • ip_addr_list (Iterable, optional) – a collection of addresses to lookup (the default is None)

  • ip_entity (IpAddress, optional) – an IpAddress entity (the default is None) - any existing data in the Location property will be overwritten

Returns

raw geolocation results and same results as IpAddress entities with populated Location property.

Return type

Tuple[List[Any], List[IpAddress]]

Raises
  • ConnectionError – Invalid status returned from http request

  • PermissionError – Service refused request (e.g. requesting batch of addresses on free tier API key)

lookup_ips(data: pandas.core.frame.DataFrame, column: str) pandas.core.frame.DataFrame

Lookup Geolocation data from a pandas Dataframe.

Parameters
  • data (pd.DataFrame) – pandas dataframe containing IpAddress column

  • column (str) – the name of the dataframe column to use as a source

Returns

IpLookup results as DataFrame.

Return type

pd.DataFrame

msticpy.sectools.geoip.entity_distance(ip_src: msticpy.datamodel.entities.ip_address.IpAddress, ip_dest: msticpy.datamodel.entities.ip_address.IpAddress) float

Return distance between two IP Entities.

Parameters
  • ip_src (IpAddress) – Source/Origin IpAddress Entity

  • ip_dest (IpAddress) – Destination IpAddress Entity

Returns

Distance in kilometers.

Return type

float

Raises

AttributeError – If either entity has no location information

msticpy.sectools.geoip.geo_distance(origin: Tuple[float, float], destination: Tuple[float, float]) float

Calculate the Haversine distance.

Parameters
  • origin (Tuple[float, float]) – Latitude, Longitude of origin of distance measurement.

  • destination (Tuple[float, float]) – Latitude, Longitude of destination of distance measurement.

Returns

Distance in kilometers.

Return type

float

Examples

>>> origin = (48.1372, 11.5756)  # Munich
>>> destination = (52.5186, 13.4083)  # Berlin
>>> round(geo_distance(origin, destination), 1)
504.2

Notes

Author: Martin Thoma - stackoverflow
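For reference, the Haversine formula itself can be sketched in a few lines. The mean Earth radius of 6371 km and the function name `sketch_geo_distance` are assumptions; the radius actually used by geo_distance is not stated here.

```python
import math


def sketch_geo_distance(origin, destination, radius_km=6371.0):
    """Great-circle distance in kilometers between two (latitude, longitude) pairs."""
    lat1, lon1 = map(math.radians, origin)
    lat2, lon2 = map(math.radians, destination)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    # Haversine of the central angle between the two points.
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


munich = (48.1372, 11.5756)
berlin = (52.5186, 13.4083)
print(round(sketch_geo_distance(munich, berlin), 1))
```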

msticpy.sectools.iocextract module

Module for IoCExtract class.

Uses a set of builtin regular expressions to look for Indicator of Compromise (IoC) patterns. Input can be a single string or a pandas dataframe with one or more columns specified as input.

The following types are built-in:

  • IPv4 and IPv6

  • URL

  • DNS domain

  • Hashes (MD5, SHA1, SHA256)

  • Windows file paths

  • Linux file paths (this is kind of noisy because a legal linux file path can have almost any character)

You can modify or add to the regular expressions used at runtime.

class msticpy.sectools.iocextract.IoCExtract

Bases: object

IoC Extractor - looks for common IoC patterns in input strings.

The extract() method takes either a string or a pandas DataFrame as input. When using the string option as an input, extract will return a dictionary of results. When using a DataFrame, the results will be returned as a new DataFrame with the following columns:

  • IoCType: the mnemonic used to distinguish different IoC Types

  • Observable: the actual value of the observable

  • SourceIndex: the index of the row in the input DataFrame from which the source for the IoC observable was extracted.

The class has a number of built-in IoC regex definitions. These can be retrieved using the ioc_types attribute.

Additional IoC definitions can be added using the add_ioc_type method.

Note: due to some ambiguity in the regular expression patterns for different types, an observable may be returned assigned to multiple observable types. E.g. 192.168.0.1 is also a legal file name in both Linux and Windows. Linux file names have a particularly large scope in terms of legal characters, so it will be quite common to see other IoC observables (or parts of them) returned as a possible linux path.

Initialize new instance of IoCExtract.

DNS_REGEX = '((?=[a-z0-9-]{1,63}\\.)[a-z0-9]+(-[a-z0-9]+)*\\.){1,126}[a-z]{2,63}'
IPV4_REGEX = '(?P<ipaddress>(?:[0-9]{1,3}\\.){3}[0-9]{1,3})'
IPV6_REGEX = '(?<![:.\\w])(?:[A-F0-9]{0,4}:){2,7}[A-F0-9]{0,4}(?![:.\\w])'
LXPATH_REGEX = '(?P<root>/+||[.]+)\n            (?P<folder>/(?:[^\\\\/:*?<>|\\r\\n]+/)*)\n            (?P<file>[^/\\0<>|\\r\\n ]+)'
MD5_REGEX = '(?:^|[^A-Fa-f0-9])(?P<hash>[A-Fa-f0-9]{32})(?:$|[^A-Fa-f0-9])'
SHA1_REGEX = '(?:^|[^A-Fa-f0-9])(?P<hash>[A-Fa-f0-9]{40})(?:$|[^A-Fa-f0-9])'
SHA256_REGEX = '(?:^|[^A-Fa-f0-9])(?P<hash>[A-Fa-f0-9]{64})(?:$|[^A-Fa-f0-9])'
URL_REGEX = '\n            (?P<protocol>(https?|ftp|telnet|ldap|file)://)\n            (?P<userinfo>([a-z0-9-._~!$&\\\'()*+,;=:]|%[0-9A-F]{2})*@)?\n            (?P<host>([a-z0-9-._~!$&\\\'()*+,;=]|%[0-9A-F]{2})*)\n            (:(?P<port>\\d*))?\n            (/(?P<path>([^?\\#"<>\\s]|%[0-9A-F]{2})*/?))?\n            (\\?(?P<query>([a-z0-9-._~!$&\'()*+,;=:/?@]|%[0-9A-F]{2})*))?\n            (\\#(?P<fragment>([a-z0-9-._~!$&\'()*+,;=:/?@]|%[0-9A-F]{2})*))?'
WINPATH_REGEX = '\n            (?P<root>[a-z]:|\\\\\\\\[a-z0-9_.$-]+||[.]+)\n            (?P<folder>\\\\(?:[^\\/:*?"\\\'<>|\\r\\n]+\\\\)*)\n            (?P<file>[^\\\\/*?""<>|\\r\\n ]+)'
add_ioc_type(ioc_type: str, ioc_regex: str, priority: int = 0, group: Optional[str] = None)

Add an IoC type and regular expression to use to the built-in set.

Parameters
  • ioc_type (str) – A unique name for the IoC type

  • ioc_regex (str) – A regular expression used to search for the type

  • priority (int, optional) – Priority of the regex match vs. other ioc_patterns. 0 is the highest priority (the default is 0).

  • group (str, optional) – The regex group to match (the default is None, which will match on the whole expression)

Notes

Pattern priorities.

If two IocType patterns match on the same substring, the matched substring is assigned to the pattern/IocType with the highest priority. E.g. foo.bar.com will match types: dns, windows_path and linux_path but since dns has a higher priority, the expression is assigned to the dns matches.
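The priority rule can be sketched as follows, with a lower number meaning a higher priority as stated above. The two simplified regular expressions, their priority values and the name `sketch_assign` are hypothetical and are not the class's built-in definitions.

```python
import re

# Hypothetical patterns with priorities; a lower number is a higher priority.
PATTERNS = {
    "dns": (re.compile(r"\b(?:[a-z0-9-]+\.)+[a-z]{2,63}\b"), 1),
    "linux_path": (re.compile(r"[^\s/]+(?:/[^\s/]+)*"), 2),
}


def sketch_assign(text):
    """Map each matched substring to the highest-priority type that matched it."""
    best = {}
    for ioc_type, (regex, priority) in PATTERNS.items():
        for match in regex.finditer(text):
            value = match.group(0)
            if value not in best or priority < best[value][1]:
                best[value] = (ioc_type, priority)
    return {value: ioc_type for value, (ioc_type, _) in best.items()}


print(sketch_assign("foo.bar.com"))
print(sketch_assign("etc/passwd"))
```

Both patterns match `foo.bar.com`, so the substring goes to the type with the lower priority number; `etc/passwd` matches only the path pattern.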

extract(src: Optional[str] = None, data: Optional[pandas.core.frame.DataFrame] = None, columns: Optional[List[str]] = None, **kwargs) Union[Dict[str, Set[str]], pandas.core.frame.DataFrame]

Extract IoCs from either a string or pandas DataFrame.

Parameters
  • src (str, optional) – source string in which to look for IoC patterns (the default is None)

  • data (pd.DataFrame, optional) – input DataFrame from which to read source strings (the default is None)

  • columns (list, optional) – The list of columns to use as source strings, if the data parameter is used. (the default is None)

  • ioc_types (list, optional) – Restrict matching to just specified types. (default is all types)

  • include_paths (bool, optional) – Whether to include path matches (which can be noisy) (the default is false - excludes ‘windows_path’ and ‘linux_path’). If ioc_types is specified this parameter is ignored.

  • ignore_tlds (bool, optional) – If True, ignore the official Top Level Domains list when determining whether a domain name is a legal domain.

Returns

dict of found observables (if input is a string) or DataFrame of observables

Return type

Any

Notes

Extract takes either a string or a pandas DataFrame as input. When using the string option as an input, extract will return a dictionary of results. When using a DataFrame, the results will be returned as a new DataFrame with the following columns:

  • IoCType: the mnemonic used to distinguish different IoC Types

  • Observable: the actual value of the observable

  • SourceIndex: the index of the row in the input DataFrame from which the source for the IoC observable was extracted.

IoCType pattern selection: the default list is [‘ipv4’, ‘ipv6’, ‘dns’, ‘url’, ‘md5_hash’, ‘sha1_hash’, ‘sha256_hash’] plus any user-defined types. ‘windows_path’ and ‘linux_path’ are excluded unless include_paths is True or they are explicitly included in ioc_types.
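The string case can be sketched with two of the class-level patterns listed earlier (IPV4_REGEX and MD5_REGEX, copied as documented). The surrounding code is an illustration only: `sketch_extract` is a hypothetical name, and TLD validation, priorities and the other built-in types are omitted.

```python
import re
from collections import defaultdict

# Two of the documented class-level patterns.
IPV4_REGEX = r"(?P<ipaddress>(?:[0-9]{1,3}\.){3}[0-9]{1,3})"
MD5_REGEX = r"(?:^|[^A-Fa-f0-9])(?P<hash>[A-Fa-f0-9]{32})(?:$|[^A-Fa-f0-9])"

# type name -> (compiled regex, name of the group holding the observable)
PATTERNS = {
    "ipv4": (re.compile(IPV4_REGEX), "ipaddress"),
    "md5_hash": (re.compile(MD5_REGEX), "hash"),
}


def sketch_extract(src):
    """Return {IoC type: set of observables} found in the source string."""
    found = defaultdict(set)
    for ioc_type, (regex, group) in PATTERNS.items():
        for match in regex.finditer(src):
            found[ioc_type].add(match.group(group))
    return dict(found)


text = "Beacon to 10.0.0.5 dropped file d41d8cd98f00b204e9800998ecf8427e"
print(sketch_extract(text))
```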

extract_df(data: pandas.core.frame.DataFrame, columns: Union[str, List[str]], **kwargs) pandas.core.frame.DataFrame

Extract IoCs from a pandas DataFrame.

Parameters
  • data (pd.DataFrame) – input DataFrame from which to read source strings

  • columns (Union[str, list]) – A single column name as a string or a list of columns to use as source strings

  • ioc_types (list, optional) – Restrict matching to just specified types. (default is all types)

  • include_paths (bool, optional) – Whether to include path matches (which can be noisy) (the default is false - excludes ‘windows_path’ and ‘linux_path’). If ioc_types is specified this parameter is ignored.

  • ignore_tlds (bool, optional) – If True, ignore the official Top Level Domains list when determining whether a domain name is a legal domain.

Returns

DataFrame of observables

Return type

pd.DataFrame

Notes

Extract takes a pandas DataFrame as input. The results are returned as a new DataFrame with the following columns:

  • IoCType: the mnemonic used to distinguish different IoC Types

  • Observable: the actual value of the observable

  • SourceIndex: the index of the row in the input DataFrame from which the source for the IoC observable was extracted.

IoCType Pattern selection

The default list is: [‘ipv4’, ‘ipv6’, ‘dns’, ‘url’, ‘md5_hash’, ‘sha1_hash’, ‘sha256_hash’] plus any user-defined types. ‘windows_path’ and ‘linux_path’ are excluded unless include_paths is True or they are explicitly included in ioc_types.
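The output shape described in the Notes can be illustrated with a minimal sketch that does not use msticpy. The two regular expressions and the `extract_rows` helper are simplified stand-ins invented for this example, not the library's own patterns, and the result is a plain list of records rather than a DataFrame.

```python
import re

# Simplified, illustrative patterns (assumptions, not msticpy's regexes).
PATTERNS = {
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "md5_hash": re.compile(r"\b[a-fA-F0-9]{32}\b"),
}


def extract_rows(rows):
    """Return one record per match, mirroring the documented output columns."""
    results = []
    for index, text in enumerate(rows):
        for ioc_type, pattern in PATTERNS.items():
            for match in pattern.finditer(text):
                results.append(
                    {
                        "IoCType": ioc_type,
                        "Observable": match.group(0),
                        "SourceIndex": index,  # row the observable came from
                    }
                )
    return results


sample = [
    "connection from 10.0.0.5 refused",
    "no indicators here",
    "dropped file hash d41d8cd98f00b204e9800998ecf8427e",
]
found = extract_rows(sample)
```

Each record carries the index of its source row, so results can be joined back to the input data.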

static file_hash_type(file_hash: str) msticpy.sectools.iocextract.IoCType

Return specific IoCType based on hash length.

Parameters

file_hash (str) – File hash string

Returns

Specific hash type or unknown.

Return type

IoCType
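Classifying a hash by length relies on the fixed hex digest sizes of the common algorithms (MD5: 32 characters, SHA-1: 40, SHA-256: 64). The sketch below shows the idea with a hypothetical `HashType` enum and `hash_type_by_length` helper; it is not the library's implementation and does not check that the characters are valid hex.

```python
from enum import Enum


class HashType(Enum):
    """Hypothetical stand-in for the hash-related members of IoCType."""

    md5_hash = "md5_hash"
    sha1_hash = "sha1_hash"
    sha256_hash = "sha256_hash"
    unknown = "unknown"


# Hex digest lengths of the three algorithms.
_LENGTH_TO_TYPE = {
    32: HashType.md5_hash,
    40: HashType.sha1_hash,
    64: HashType.sha256_hash,
}


def hash_type_by_length(file_hash: str) -> HashType:
    """Return the hash type implied by the string length, else unknown."""
    return _LENGTH_TO_TYPE.get(len(file_hash.strip()), HashType.unknown)
```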

get_ioc_type(observable: str) str

Return first matching type.

Parameters

observable (str) – The IoC Observable to check

Returns

The IoC type enumeration (unknown, if no match)

Return type

str

property ioc_types: dict

Return the current set of IoC types and regular expressions.

Returns

dict of IoC Type names and regular expressions

Return type

dict

validate(input_str: str, ioc_type: str, ignore_tlds: bool = False) bool

Check that input_str matches the regex for the specified ioc_type.

Parameters
  • input_str (str) – the string to test

  • ioc_type (str) – the regex pattern to use

  • ignore_tlds (bool, optional) – If True, ignore the official Top Level Domains list when determining whether a domain name is a legal domain.

Returns

True if match.

Return type

bool

class msticpy.sectools.iocextract.IoCExtractAccessor(pandas_obj)

Bases: object

Pandas api extension for IoC Extractor.

Instantiate pandas extension class.

extract(columns, **kwargs)

Extract IoCs from a pandas DataFrame.

Parameters
  • columns (list) – The list of columns to use as source strings

  • ioc_types (list, optional) – Restrict matching to just specified types. (default is all types)

  • include_paths (bool, optional) – Whether to include path matches (which can be noisy) (the default is false - excludes ‘windows_path’ and ‘linux_path’). If ioc_types is specified this parameter is ignored.

Returns

DataFrame of observables

Return type

pd.DataFrame

Notes

Extract takes a pandas DataFrame as input. The results are returned as a new DataFrame with the following columns:

  • IoCType: the mnemonic used to distinguish different IoC Types

  • Observable: the actual value of the observable

  • SourceIndex: the index of the row in the input DataFrame from which the source for the IoC observable was extracted.

IoCType Pattern selection

The default list is: [‘ipv4’, ‘ipv6’, ‘dns’, ‘url’, ‘md5_hash’, ‘sha1_hash’, ‘sha256_hash’] plus any user-defined types. ‘windows_path’ and ‘linux_path’ are excluded unless include_paths is True or they are explicitly included in ioc_types.

class msticpy.sectools.iocextract.IoCPattern(ioc_type, comp_regex, priority, group)

Bases: tuple

Create new instance of IoCPattern(ioc_type, comp_regex, priority, group)

property comp_regex

Alias for field number 1

count(value, /)

Return number of occurrences of value.

property group

Alias for field number 3

index(value, start=0, stop=9223372036854775807, /)

Return first index of value.

Raises ValueError if the value is not present.

property ioc_type

Alias for field number 0

property priority

Alias for field number 2
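The "Alias for field number N" entries are the standard behaviour of a named tuple: each field is reachable both by name and by position. A minimal sketch, rebuilding an equivalent tuple with `collections.namedtuple`; the regex, priority and group values are illustrative assumptions only.

```python
import re
from collections import namedtuple

# Same field order as documented: ioc_type=0, comp_regex=1, priority=2, group=3.
IoCPattern = namedtuple("IoCPattern", ["ioc_type", "comp_regex", "priority", "group"])

pattern = IoCPattern(
    ioc_type="ipv4",
    comp_regex=re.compile(r"(?:\d{1,3}\.){3}\d{1,3}"),  # simplified example regex
    priority=0,   # illustrative value
    group=None,   # illustrative value
)

# Field access by name and by index returns the same object.
same_regex = pattern.comp_regex is pattern[1]
```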

class msticpy.sectools.iocextract.IoCType(value)

Bases: enum.Enum

Enumeration of IoC Types.

dns = 'dns'
email = 'email'
file_hash = 'file_hash'
hostname = 'hostname'
ipv4 = 'ipv4'
ipv6 = 'ipv6'
linux_path = 'linux_path'
md5_hash = 'md5_hash'
classmethod parse(value: str) msticpy.sectools.iocextract.IoCType

Return parsed IoCType of string.

Parameters

value (str) – Enumeration name

Returns

IoCType matching name or unknown if no match

Return type

IoCType

sha1_hash = 'sha1_hash'
sha256_hash = 'sha256_hash'
unknown = 'unknown'
url = 'url'
windows_path = 'windows_path'
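The parse behaviour (return the matching member, or unknown when nothing matches) can be sketched with a reduced enum. `IoCTypeSketch` is a hypothetical subset of the members listed above, and the case-insensitive, whitespace-tolerant matching is an assumption of this sketch rather than documented behaviour.

```python
from enum import Enum


class IoCTypeSketch(Enum):
    """Hypothetical subset of the IoCType members."""

    unknown = "unknown"
    ipv4 = "ipv4"
    dns = "dns"
    url = "url"

    @classmethod
    def parse(cls, value: str) -> "IoCTypeSketch":
        """Return the member named by value, or unknown if there is no match."""
        try:
            # Lookup by member name; lower-casing is an assumption of this sketch.
            return cls[value.strip().lower()]
        except KeyError:
            return cls.unknown
```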

msticpy.sectools.proc_tree_builder module

Process Tree Builder module for Process Tree Visualization.

class msticpy.sectools.proc_tree_builder.ProcSchema(process_name: str, process_id: str, parent_id: str, logon_id: str, cmd_line: str, user_name: str, path_separator: str, host_name_column: str, time_stamp: str = 'TimeGenerated', parent_name: Optional[str] = None, target_logon_id: Optional[str] = None, user_id: Optional[str] = None, event_id_column: Optional[str] = None, event_id_identifier: Optional[Any] = None)

Bases: object

Property name lookup for Process event schema.

Each property maps a generic column name on to the schema of the input data. Most of these are mandatory; some are optional, but not supplying them may result in a less complete tree. The time_stamp column should be supplied, although it defaults to ‘TimeGenerated’.

Method generated by attrs for class ProcSchema.

cmd_line: str
property column_map: Dict[str, str]

Return a dictionary that maps fields to schema names.

property columns

Return list of columns in schema data source.

property event_filter: Any

Return the event type/ID to process for the current schema.

Returns

The value of the event ID to process.

Return type

Any

Raises

ProcessTreeSchemaException – If the schema is not known.

event_id_column: Optional[str]
event_id_identifier: Optional[Any]
property event_type_col: str

Return the column name containing the event identifier.

Returns

The name of the event ID column.

Return type

str

Raises

ProcessTreeSchemaException – If the schema is not known.

get_df_cols(data: pandas.core.frame.DataFrame)

Return the subset of columns that are present in data.

host_name_column: str
logon_id: str
parent_id: str
parent_name: Optional[str]
path_separator: str
process_id: str
process_name: str
property required_columns

Return columns required for Init.

target_logon_id: Optional[str]
time_stamp: str
user_id: Optional[str]
user_name: str
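As a rough illustration of how a schema object of this kind maps generic property names on to source column names, the sketch below uses a plain dataclass. `SchemaSketch`, its reduced field set and the `present_columns` helper are hypothetical; the column names are typical Windows process-creation event fields used only as sample values.

```python
from dataclasses import dataclass, fields
from typing import Dict, List, Optional


@dataclass
class SchemaSketch:
    """Hypothetical reduced schema: generic property name -> source column."""

    process_name: str
    process_id: str
    parent_id: str
    time_stamp: str = "TimeGenerated"
    parent_name: Optional[str] = None

    @property
    def column_map(self) -> Dict[str, str]:
        # Skip optional properties that were not supplied.
        return {
            f.name: getattr(self, f.name)
            for f in fields(self)
            if getattr(self, f.name)
        }

    def present_columns(self, available: List[str]) -> List[str]:
        """Return the schema columns that actually exist in the data."""
        return [col for col in self.column_map.values() if col in available]


schema = SchemaSketch(
    process_name="NewProcessName",
    process_id="NewProcessId",
    parent_id="ProcessId",
)
```

Leaving out an optional property such as parent_name simply removes it from the map, which is why a tree built from such a schema may be less complete.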
exception msticpy.sectools.proc_tree_builder.ProcessTreeSchemaException(*args, help_uri: Optional[Union[Tuple[str, str], str]] = None, **kwargs)

Bases: msticpy.common.exceptions.MsticpyUserError

Custom exception for Process Tree schema.

Create an instance of the MsticpyUserError class.

Parameters
  • args (Iterable of strings) – Args will be printed as text of the exception.

  • help_uri (Union[Tuple[str, str], str, None], optional) – Primary URL, by default “https://msticpy.readthedocs.org”

  • title (str) – If a title keyword argument is supplied it will be used to create the title line.

  • *_uri (str) – Additional keyword arguments whose names end in “_uri” will be used to create a list of references in addition to the primary help_uri

Notes

The exception text is displayed when the exception is created and not when it is raised. We recommend creating the exception within the raise statement. E.g.

raise MsticpyUserException(arg1, arg2…)

Developer note: Any classes derived from MsticpyUserError should be named with an “Error” suffix to distinguish these from standard exception types.

DEF_HELP_URI = ('MSTICPy Process Tree documentation', 'https://msticpy.readthedocs.io/en/latest/visualization/ProcessTree.html')
args
property help_uri: Union[Tuple[str, str], str]

Get the default help URI.

classmethod no_display_exceptions()

Context manager to block exception display to IPython/stdout.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

msticpy.sectools.proc_tree_builder.build_process_tree(procs: pandas.core.frame.DataFrame, schema: Optional[Union[msticpy.sectools.proc_tree_builder.ProcSchema, Dict[str, Any]]] = None, show_summary: bool = False, debug: bool = False) pandas.core.frame.DataFrame

Build process trees from the process events.

Parameters
  • procs (pd.DataFrame) – Process events (Windows 4688 or Linux Auditd)

  • schema (Union[ProcSchema, Dict[str, Any]], optional) – The column schema to use, by default None. If supplied as a dict it must include definitions for the required fields in the ProcSchema class. If None, then the schema is inferred.

  • show_summary (bool) – Shows summary of the built tree, default is False.

  • debug (bool) – If True produces extra debugging output, by default False

Returns

Process tree dataframe.

Return type

pd.DataFrame

See also

ProcSchema

msticpy.sectools.proc_tree_builder.infer_schema(data: Union[pandas.core.frame.DataFrame, pandas.core.series.Series]) Optional[msticpy.sectools.proc_tree_builder.ProcSchema]

Infer the correct schema to use for this data set.

Parameters

data (Union[pd.DataFrame, pd.Series]) – Data set to test

Returns

The schema most closely matching the data set.

Return type

ProcSchema

msticpy.sectools.proc_tree_build_winlx module

Process Tree builder for Windows security and Linux auditd events.

msticpy.sectools.proc_tree_build_winlx.extract_process_tree(procs: pandas.core.frame.DataFrame, schema: ProcSchema, debug: bool = False) pandas.core.frame.DataFrame

Build process trees from the process events.

Parameters
  • procs (pd.DataFrame) – Process events (Windows 4688 or Linux Auditd)

  • schema (Union[ProcSchema, Dict[str, Any]], optional) – The column schema to use, by default None. If supplied as a dict it must include definitions for the required fields in the ProcSchema class. If None, then the schema is inferred.

  • debug (bool) – If True produces extra debugging output, by default False

Returns

Process tree dataframe.

Return type

pd.DataFrame

See also

ProcSchema

msticpy.sectools.proc_tree_build_mde module

Process tree builder routines for MDE process data.

msticpy.sectools.proc_tree_build_mde.extract_process_tree(data: pandas.core.frame.DataFrame, debug: bool = False) pandas.core.frame.DataFrame

Build a process tree from raw MDE process logs.

Parameters
  • data (pd.DataFrame) – DataFrame of process events.

  • debug (bool, optional) – Turn on additional debugging output, by default False.

Returns

Process tree DataFrame with child->parent keys and extracted parent processes from child data.

Return type

pd.DataFrame

msticpy.sectools.process_tree_utils module

Process Tree Visualization.

msticpy.sectools.process_tree_utils.get_ancestors(procs: pandas.core.frame.DataFrame, source, include_source=True) pandas.core.frame.DataFrame

Return the ancestor processes of the source process.

Parameters
  • procs (pd.DataFrame) – Process events (with process tree metadata)

  • source (Union[str, pd.Series]) – source_index of process or the process row

  • include_source (bool, optional) – Include the source process in the results, by default True

Returns

Ancestor processes

Return type

pd.DataFrame
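Finding ancestors amounts to following parent links from the source process up to a root. A minimal sketch over a plain dictionary rather than a process tree DataFrame; the `PARENT_OF` table and the `ancestors` helper are hypothetical and only illustrate the traversal and the include_source switch.

```python
from typing import Dict, List, Optional

# Hypothetical process table: process key -> parent key (None marks a root).
PARENT_OF: Dict[str, Optional[str]] = {
    "init": None,
    "sshd": "init",
    "bash": "sshd",
    "curl": "bash",
}


def ancestors(source: str, include_source: bool = True) -> List[str]:
    """Walk parent links from source to the root; return root-first order."""
    chain = [source] if include_source else []
    current = PARENT_OF[source]
    while current is not None:
        chain.append(current)
        current = PARENT_OF[current]
    return list(reversed(chain))
```

The first element of the result is the root of the tree, which is the value a get_root style function would return.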

msticpy.sectools.process_tree_utils.get_children(procs: pandas.core.frame.DataFrame, source: Union[str, pandas.core.series.Series], include_source: bool = True) pandas.core.frame.DataFrame

Return the child processes for the source process.

Parameters
  • procs (pd.DataFrame) – Process events (with process tree metadata)

  • source (Union[str, pd.Series]) – source_index of process or the process row

  • include_source (bool, optional) – If True include the source process in the results, by default True

Returns

Child processes

Return type

pd.DataFrame

msticpy.sectools.process_tree_utils.get_descendents(procs: pandas.core.frame.DataFrame, source: Union[str, pandas.core.series.Series], include_source: bool = True, max_levels: int = - 1) pandas.core.frame.DataFrame

Return the descendents of the source process.

Parameters
  • procs (pd.DataFrame) – Process events (with process tree metadata)

  • source (Union[str, pd.Series]) – source_index of process or the process row

  • include_source (bool, optional) – Include the source process in the results, by default True

  • max_levels (int, optional) – Maximum number of levels to descend, by default -1 (all levels)

Returns

Descendent processes

Return type

pd.DataFrame
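The max_levels parameter bounds how far below the source the search goes, with -1 meaning no limit. The sketch below shows one way to implement that as a level-by-level (breadth-first) walk over a hypothetical `CHILDREN_OF` mapping; it is an illustration of the idea, not the library's code.

```python
from typing import Dict, List

# Hypothetical process table: parent key -> list of child keys.
CHILDREN_OF: Dict[str, List[str]] = {
    "init": ["sshd", "cron"],
    "sshd": ["bash"],
    "bash": ["curl"],
}


def descendants(
    source: str, include_source: bool = True, max_levels: int = -1
) -> List[str]:
    """Collect descendants level by level, stopping after max_levels levels."""
    result = [source] if include_source else []
    frontier = [source]
    level = 0
    # A negative max_levels means descend until no children remain.
    while frontier and (max_levels < 0 or level < max_levels):
        frontier = [
            child for parent in frontier for child in CHILDREN_OF.get(parent, [])
        ]
        result.extend(frontier)
        level += 1
    return result
```

With max_levels=1 the result is the same set of processes a get_children style call would return.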

msticpy.sectools.process_tree_utils.get_parent(procs: pandas.core.frame.DataFrame, source: Union[str, pandas.core.series.Series]) Optional[pandas.core.series.Series]

Return the parent of the source process.

Parameters
  • procs (pd.DataFrame) – Process events (with process tree metadata)

  • source (Union[str, pd.Series]) – source_index of process or the process row

Returns

Parent Process row or None if no parent was found.

Return type

Optional[pd.Series]

msticpy.sectools.process_tree_utils.get_process(procs: pandas.core.frame.DataFrame, source: Union[str, pandas.core.series.Series]) pandas.core.series.Series

Return the process event as a Series.

Parameters
  • procs (pd.DataFrame) – Process events (with process tree metadata)

  • source (Union[str, pd.Series]) – source_index of process or the process row

Returns

Process row

Return type

pd.Series

Raises

ValueError – If unknown type is supplied as source

msticpy.sectools.process_tree_utils.get_process_key(procs: pandas.core.frame.DataFrame, source_index: int) str

Return the process key of the process given its source_index.

Parameters
  • procs (pd.DataFrame) – Process events

  • source_index (int, optional) – source_index of the process record

Returns

The process key of the process.

Return type

str

msticpy.sectools.process_tree_utils.get_root(procs: pandas.core.frame.DataFrame, source: Union[str, pandas.core.series.Series]) pandas.core.series.Series

Return the root process for the source process.

Parameters
  • procs (pd.DataFrame) – Process events (with process tree metadata)

  • source (Union[str, pd.Series]) – source_index of process or the process row

Returns

Root process

Return type

pd.Series

msticpy.sectools.process_tree_utils.get_root_tree(procs: pandas.core.frame.DataFrame, source: Union[str, pandas.core.series.Series]) pandas.core.frame.DataFrame

Return the process tree to which the source process belongs.

Parameters
  • procs (pd.DataFrame) – Process events (with process tree metadata)

  • source (Union[str, pd.Series]) – source_index of process or the process row

Returns

Process Tree

Return type

pd.DataFrame

msticpy.sectools.process_tree_utils.get_roots(procs: pandas.core.frame.DataFrame) pandas.core.frame.DataFrame

Return the process tree roots for the current data set.

Parameters

procs (pd.DataFrame) – Process events (with process tree metadata)

Returns

Process Tree root processes

Return type

pd.DataFrame

msticpy.sectools.process_tree_utils.get_siblings(procs: pandas.core.frame.DataFrame, source: Union[str, pandas.core.series.Series], include_source: bool = True) pandas.core.frame.DataFrame

Return the processes that share the parent of the source process.

Parameters
  • procs (pd.DataFrame) – Process events (with process tree metadata)

  • source (Union[str, pd.Series]) – source_index of process or the process row

  • include_source (bool, optional) – Include the source process in the results, by default True

Returns

Sibling processes.

Return type

pd.DataFrame

msticpy.sectools.process_tree_utils.get_summary_info(procs: pandas.core.frame.DataFrame) Dict[str, int]

Return summary information about the process trees.

Parameters

procs (pd.DataFrame) – Process events (with process tree metadata)

Returns

Summary statistic about the process tree

Return type

Dict[str, int]

msticpy.sectools.process_tree_utils.get_tree_depth(procs: pandas.core.frame.DataFrame) int

Return the depth of the process tree.

Parameters

procs (pd.DataFrame) – Process events (with process tree metadata)

Returns

Tree depth

Return type

int

msticpy.sectools.syslog_utils module

syslog_utils - Syslog parsing and utility module.

Functions required to correctly collect, parse and visualize syslog data.

Designed to support standard linux syslog for investigations where auditd is not available.

msticpy.sectools.syslog_utils.cluster_syslog_logons_df(logon_events: pandas.core.frame.DataFrame) pandas.core.frame.DataFrame

Cluster logon sessions in syslog by start/end time based on PAM events.

Parameters

logon_events (pd.DataFrame) – A DataFrame of all syslog logon events (can be generated with LinuxSyslog.user_logon query)

Returns

logon_sessions – A DataFrame of logon sessions including start and end times and logged on user

Return type

pd.DataFrame

Raises

MsticpyException – There are no logon sessions in the supplied data set

msticpy.sectools.syslog_utils.create_host_record(syslog_df: pandas.core.frame.DataFrame, heartbeat_df: pandas.core.frame.DataFrame, az_net_df: Optional[pandas.core.frame.DataFrame] = None) msticpy.datamodel.entities.host.Host

Generate host_entity record for selected computer.

Parameters
  • syslog_df (pd.DataFrame) – A dataframe of all syslog events for the host in the time window required

  • heartbeat_df (pd.DataFrame) – A dataframe of heartbeat data for the host

  • az_net_df (pd.DataFrame, optional) – Optional dataframe of Azure network data for the host

Returns

Details of the host data collected

Return type

Host

msticpy.sectools.syslog_utils.risky_sudo_sessions(sudo_sessions: pandas.core.frame.DataFrame, risky_actions: Optional[dict] = None, suspicious_actions: Optional[list] = None) dict

Detect if a sudo session occurs at the point of a suspicious event.

Parameters
  • sudo_sessions (dict) – Dictionary of sudo sessions (as generated by cluster_syslog_logons)

  • risky_actions (dict (Optional)) – Dictionary of risky sudo commands (as generated by cmd_line.risky_cmd_line)

  • suspicious_actions (list (Optional)) – List of risky sudo commands (as generated by cmd_line.cmd_speed)

Returns

risky_sessions – A dictionary of sudo sessions with flags denoting risk

Return type

dict

msticpy.sectools.tilookup module

Module for TILookup classes.

Input can be a single IoC observable or a pandas DataFrame containing multiple observables. Processing may require an API key, and processing performance may be limited to a specific number of requests per minute for the account type that you have.

class msticpy.sectools.tilookup.TILookup(primary_providers: Optional[List[msticpy.sectools.tiproviders.ti_provider_base.TIProvider]] = None, secondary_providers: Optional[List[msticpy.sectools.tiproviders.ti_provider_base.TIProvider]] = None, providers: Optional[List[str]] = None)

Bases: object

Threat Intel observable lookup from providers.

Initialize TILookup instance.

Parameters
  • primary_providers (Optional[List[TIProvider]], optional) – Primary TI Providers, by default None

  • secondary_providers (Optional[List[TIProvider]], optional) – Secondary TI Providers, by default None

  • providers (Optional[List[str]], optional) – List of provider names to load, by default all available providers are loaded. To see the list of available providers call TILookup.list_available_providers(). Note: if primary_providers or secondary_providers is specified, this will override the providers list.

add_provider(provider: msticpy.sectools.tiproviders.ti_provider_base.TIProvider, name: Optional[str] = None, primary: bool = True)

Add a TI provider to the current collection.

Parameters
  • provider (TIProvider) – Provider instance

  • name (str, optional) – The name to use for the provider (overrides the class name of provider)

  • primary (bool, optional) – True to add the provider as “primary”, False to add it as “secondary”, by default True

property available_providers: List[str]

Return a list of builtin providers.

Returns

List of TI Provider classes.

Return type

List[str]

static browse(data: pandas.core.frame.DataFrame, severities: Optional[List[str]] = None, **kwargs)

Return TI Results list browser.

Parameters
  • data (pd.DataFrame) – TI Results data from TIProviders

  • severities (Optional[List[str]], optional) – A list of the severity classes to show. By default these are [‘warning’, ‘high’]. Pass [‘information’, ‘warning’, ‘high’] to see all results.

  • kwargs – passed to SelectItem constructor.

Returns

SelectItem browser for TI Data.

Return type

SelectItem

static browse_results(data: pandas.core.frame.DataFrame, severities: Optional[List[str]] = None, **kwargs)

Return TI Results list browser.

Parameters
  • data (pd.DataFrame) – TI Results data from TIProviders

  • severities (Optional[List[str]], optional) – A list of the severity classes to show. By default these are [‘warning’, ‘high’]. Pass [‘information’, ‘warning’, ‘high’] to see all results.

  • kwargs – passed to SelectItem constructor.

Returns

SelectItem browser for TI Data.

Return type

SelectItem

property configured_providers: List[str]

Return a list of available providers that have configuration details present.

Returns

List of TI Provider classes.

Return type

List[str]

disable_provider(providers: Union[str, Iterable[str]])

Set the provider as secondary (not used by default).

Parameters

providers (Union[str, Iterable[str]]) – Provider name or list of names. Use list_available_providers() to see the list of loaded providers.

Raises

ValueError – If the provider name is not recognized.

enable_provider(providers: Union[str, Iterable[str]])

Set the provider(s) as primary (used by default).

Parameters

providers (Union[str, Iterable[str]]) – Provider name or list of names. Use list_available_providers() to see the list of loaded providers.

Raises

ValueError – If the provider name is not recognized.

classmethod list_available_providers(show_query_types=False, as_list: bool = False) Optional[List[str]]

Print a list of builtin providers with optional usage.

Parameters
  • show_query_types (bool, optional) – Show query types supported by providers, by default False

  • as_list (bool, optional) – Return list of providers instead of printing to stdout. Note: if you specify show_query_types this will be printed irrespective of this parameter setting.

Returns

A list of provider names (if as_list=True)

Return type

Optional[List[str]]

property loaded_providers: Dict[str, msticpy.sectools.tiproviders.ti_provider_base.TIProvider]

Return dictionary of loaded providers.

Returns

Dictionary of loaded providers, keyed by provider name.

Return type

Dict[str, TIProvider]

lookup_ioc(observable: Optional[str] = None, ioc_type: Optional[str] = None, ioc_query_type: Optional[str] = None, providers: Optional[List[str]] = None, prov_scope: str = 'primary', **kwargs) Tuple[bool, List[Tuple[str, msticpy.sectools.tiproviders.ti_provider_base.LookupResult]]]

Lookup single IoC in active providers.

Parameters
  • observable (str) – IoC observable (ioc is also an alias for observable)

  • ioc_type (str, optional) – One of IoCExtract.IoCType, by default None. If None, the IoC type will be inferred.

  • ioc_query_type (str, optional) – The ioc query type (e.g. rep, info, malware)

  • providers (List[str]) – Explicit list of providers to use

  • prov_scope (str, optional) – Use “primary”, “secondary” or “all” providers, by default “primary”

  • kwargs – Additional arguments passed to the underlying provider(s)

Returns

The result returned as a tuple(bool, list): the bool indicates whether a TI record was found in any provider; the list has an entry for each provider result.

Return type

Tuple[bool, List[Tuple[str, LookupResult]]]
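The nested return type is easier to read when unpacked. The sketch below consumes a value of the documented shape using a hypothetical `ResultSketch` dataclass that carries only a few of the LookupResult fields; the provider names and the `positive_providers` helper are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ResultSketch:
    """Hypothetical stand-in carrying a few LookupResult fields."""

    ioc: str
    provider: str
    result: bool
    severity: int = 0


LookupResponse = Tuple[bool, List[Tuple[str, ResultSketch]]]


def positive_providers(response: LookupResponse) -> List[str]:
    """Return the names of providers that reported a hit."""
    found, provider_results = response
    if not found:
        return []
    return [name for name, res in provider_results if res.result]


response: LookupResponse = (
    True,
    [
        ("ProvA", ResultSketch("10.0.0.5", "ProvA", result=True, severity=2)),
        ("ProvB", ResultSketch("10.0.0.5", "ProvB", result=False)),
    ],
)
hits = positive_providers(response)
```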

lookup_iocs(data: Union[pandas.core.frame.DataFrame, Mapping[str, str], Iterable[str]], obs_col: Optional[str] = None, ioc_type_col: Optional[str] = None, ioc_query_type: Optional[str] = None, providers: Optional[List[str]] = None, prov_scope: str = 'primary', **kwargs) pandas.core.frame.DataFrame

Lookup a collection of IoCs.

Parameters
  • data (Union[pd.DataFrame, Mapping[str, str], Iterable[str]]) – Data input in one of three formats: 1. Pandas dataframe (you must supply the column name in obs_col parameter) 2. Mapping (e.g. a dict) of [observable, IoCType] 3. Iterable of observables - IoCTypes will be inferred

  • obs_col (str, optional) – DataFrame column to use for observables, by default None (“col” and “column” are also aliases for this parameter)

  • ioc_type_col (str, optional) – DataFrame column to use for IoCTypes, by default None

  • ioc_query_type (str, optional) – The ioc query type (e.g. rep, info, malware)

  • providers (List[str]) – Explicit list of providers to use

  • prov_scope (str, optional) – Use “primary”, “secondary” or “all” providers, by default “primary”

  • kwargs – Additional arguments passed to the underlying provider(s)

Returns

DataFrame of results

Return type

pd.DataFrame

property provider_status: Iterable[str]

Return loaded provider status.

Returns

List of providers and descriptions.

Return type

Iterable[str]

provider_usage()

Print usage of loaded providers.

classmethod reload_provider_settings()

Reload provider settings from config.

reload_providers()

Reload providers based on current settings in config.

Parameters

clear_keyring (bool, optional) – Clears any secrets cached in keyring, by default False

static result_to_df(ioc_lookup: Tuple[bool, List[Tuple[str, msticpy.sectools.tiproviders.ti_provider_base.LookupResult]]]) pandas.core.frame.DataFrame

Return DataFrame representation of IoC Lookup response.

Parameters

ioc_lookup (Tuple[bool, List[Tuple[str, LookupResult]]]) – Output from lookup_ioc

Returns

The response as a DataFrame with a row for each provider response.

Return type

pd.DataFrame

set_provider_state(prov_dict: Dict[str, bool])

Set a dict of providers to primary/secondary.

Parameters

prov_dict (Dict[str, bool]) – Dictionary of provider name and bool - True if enabled/primary, False if disabled/secondary.
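The primary/secondary distinction and the prov_scope values (“primary”, “secondary”, “all”) can be modelled as a simple name-to-flag mapping. `ProviderStateSketch` and its `select` method are hypothetical and show only the bookkeeping, not how TILookup stores its providers.

```python
from typing import Dict, List


class ProviderStateSketch:
    """Hypothetical registry tracking which providers are primary."""

    def __init__(self, names: List[str]) -> None:
        # Assumption of this sketch: every provider starts as primary.
        self._is_primary: Dict[str, bool] = {name: True for name in names}

    def set_provider_state(self, prov_dict: Dict[str, bool]) -> None:
        """True marks a provider primary (enabled), False secondary."""
        for name, primary in prov_dict.items():
            if name not in self._is_primary:
                raise ValueError(f"Unknown provider: {name}")
            self._is_primary[name] = primary

    def select(self, prov_scope: str = "primary") -> List[str]:
        """Return provider names for the requested scope."""
        if prov_scope == "all":
            return list(self._is_primary)
        want_primary = prov_scope == "primary"
        return [n for n, p in self._is_primary.items() if p == want_primary]


registry = ProviderStateSketch(["ProvA", "ProvB", "ProvC"])
registry.set_provider_state({"ProvB": False})
```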

msticpy.sectools.tiproviders.ti_provider_base module

Module for TILookup classes.

Input can be a single IoC observable or a pandas DataFrame containing multiple observables. Processing may require an API key, and processing performance may be limited to a specific number of requests per minute for the account type that you have.

class msticpy.sectools.tiproviders.ti_provider_base.LookupResult(ioc: str, ioc_type: str, safe_ioc: str = '', query_subtype: Optional[str] = None, provider: Optional[str] = None, result: bool = False, severity: int = 0, details: Optional[Any] = None, raw_result: Optional[Union[str, dict]] = None, reference: Optional[str] = None, status: int = 0)

Bases: object

Lookup result for IoCs.

Method generated by attrs for class LookupResult.

classmethod column_map()

Return a dictionary that maps fields to DF Names.

details: Any
ioc: str
ioc_type: str
provider: Optional[str]
query_subtype: Optional[str]
raw_result: Optional[Union[str, dict]]
property raw_result_fmtd

Print raw results of the Lookup Result.

reference: Optional[str]
result: bool
safe_ioc: str
set_severity(value: Any)

Set the severity from enum, int or string.

Parameters

value (Any) – The severity value to set

severity: int
property severity_name: str

Return text description of severity score.

Returns

Severity description.

Return type

str

status: int
property summary

Print a summary of the Lookup Result.

class msticpy.sectools.tiproviders.ti_provider_base.SanitizedObservable(observable, status)

Bases: tuple

Create new instance of SanitizedObservable(observable, status)

count(value, /)

Return number of occurrences of value.

index(value, start=0, stop=9223372036854775807, /)

Return first index of value.

Raises ValueError if the value is not present.

property observable

Alias for field number 0

property status

Alias for field number 1

class msticpy.sectools.tiproviders.ti_provider_base.TILookupStatus(value)

Bases: enum.Enum

Threat intelligence lookup status.

bad_format = 2
not_supported = 1
ok = 0
other = 10
query_failed = 3
class msticpy.sectools.tiproviders.ti_provider_base.TIProvider(**kwargs)

Bases: abc.ABC

Abstract base class for Threat Intel providers.

Initialize the provider.

property ioc_query_defs: Dict[str, Any]

Return current dictionary of IoC query/request definitions.

Returns

IoC query/request definitions keyed by IoCType

Return type

Dict[str, Any]

classmethod is_known_type(ioc_type: str) bool

Return True if this is a known IoC Type.

Parameters

ioc_type (str) – IoCType string to test

Returns

True if known type.

Return type

bool

is_supported_type(ioc_type: Union[str, msticpy.sectools.iocextract.IoCType]) bool

Return True if the passed type is supported.

Parameters

ioc_type (Union[str, IoCType]) – IoC type name or instance

Returns

True if supported.

Return type

bool

abstract lookup_ioc(ioc: str, ioc_type: Optional[str] = None, query_type: Optional[str] = None, **kwargs) msticpy.sectools.tiproviders.ti_provider_base.LookupResult

Lookup a single IoC observable.

Parameters
  • ioc (str) – IoC Observable value

  • ioc_type (str, optional) – IoC Type, by default None (type will be inferred)

  • query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.

Returns

The returned results.

Return type

LookupResult

lookup_iocs(data: Union[pandas.core.frame.DataFrame, Dict[str, str], Iterable[str]], obs_col: Optional[str] = None, ioc_type_col: Optional[str] = None, query_type: Optional[str] = None, **kwargs) pandas.core.frame.DataFrame

Lookup collection of IoC observables.

Parameters
  • data (Union[pd.DataFrame, Dict[str, str], Iterable[str]]) – Data input in one of three formats: 1. Pandas dataframe (you must supply the column name in obs_col parameter) 2. Dict of observable, IoCType 3. Iterable of observables - IoCTypes will be inferred

  • obs_col (str, optional) – DataFrame column to use for observables, by default None

  • ioc_type_col (str, optional) – DataFrame column to use for IoCTypes, by default None

  • query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.

Returns

DataFrame of results.

Return type

pd.DataFrame

abstract parse_results(response: msticpy.sectools.tiproviders.ti_provider_base.LookupResult) Tuple[bool, msticpy.sectools.tiproviders.ti_provider_base.TISeverity, Any]

Return the details of the response.

Parameters

response (LookupResult) – The returned data response

Returns

A tuple of three items: bool (positive or negative hit), TISeverity (enumeration of severity), and an object with match details.

Return type

Tuple[bool, TISeverity, Any]

static resolve_ioc_type(observable: str) str

Return IoCType determined by IoCExtract.

Parameters

observable (str) – IoC observable string

Returns

IoC Type (or unknown if type could not be determined)

Return type

str

property supported_types: List[str]

Return list of supported IoC types for this provider.

Returns

List of supported type names

Return type

List[str]

classmethod usage()

Print usage of provider.

class msticpy.sectools.tiproviders.ti_provider_base.TISeverity(value)

Bases: enum.Enum

Threat intelligence report severity.

high = 2
information = 0
classmethod parse(value) msticpy.sectools.tiproviders.ti_provider_base.TISeverity

Parse string or numeric value to TISeverity.

Parameters

value (Any) – TISeverity, str or int

Returns

TISeverity instance.

Return type

TISeverity

unknown = -1
warning = 1
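Parsing a severity from a member, a name or a number can be sketched with the values listed above. `SeveritySketch` is a hypothetical copy of the enum; the case-insensitive name matching and the fallback to unknown for unrecognized input are assumptions of this sketch.

```python
from enum import Enum
from typing import Any


class SeveritySketch(Enum):
    """Hypothetical copy of the TISeverity values."""

    unknown = -1
    information = 0
    warning = 1
    high = 2

    @classmethod
    def parse(cls, value: Any) -> "SeveritySketch":
        """Convert a member, name or integer to a member."""
        if isinstance(value, cls):
            return value
        if isinstance(value, str):
            # Name lookup; lower-casing is an assumption of this sketch.
            return cls.__members__.get(value.lower(), cls.unknown)
        if isinstance(value, int):
            try:
                return cls(value)
            except ValueError:
                return cls.unknown
        return cls.unknown
```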
msticpy.sectools.tiproviders.ti_provider_base.entropy(input_str: str) float

Compute entropy of input string.
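The docstring does not say which entropy measure is used. The sketch below assumes Shannon entropy in bits per character, a common choice for scoring how random a string (for example a domain name) looks; `shannon_entropy` is a hypothetical name and the empty-string handling is a choice made here.

```python
import math
from collections import Counter


def shannon_entropy(input_str: str) -> float:
    """Return the Shannon entropy of the string in bits per character."""
    if not input_str:
        return 0.0  # choice made in this sketch for empty input
    length = len(input_str)
    # Sum -p * log2(p) over the relative frequency p of each character.
    return -sum(
        (count / length) * math.log2(count / length)
        for count in Counter(input_str).values()
    )
```

A string of one repeated character scores 0, while a string of four distinct characters scores 2 bits per character.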

msticpy.sectools.tiproviders.ti_provider_base.generate_items(data: Any, obs_col: Optional[str] = None, ioc_type_col: Optional[str] = None) Iterable[Tuple[Optional[str], Optional[str]]]

Generate item pairs from different input types.

Parameters
  • data (Any) – DataFrame, dictionary or iterable

  • obs_col (Optional[str]) – If data is a DataFrame, the column containing the observable value.

  • ioc_type_col (Optional[str]) – If data is a DataFrame, the column containing the observable type.

Returns

Tuples of Observable/Type.

Return type

Iterable[Tuple[Optional[str], Optional[str]]]
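Normalizing different input types into (observable, type) pairs can be sketched as a small generator. The hypothetical `item_pairs` below handles only the dictionary and iterable cases to stay free of pandas; treating a bare string as a single observable is a choice made in this sketch, not documented behaviour.

```python
from typing import Any, Iterator, Optional, Tuple


def item_pairs(data: Any) -> Iterator[Tuple[Optional[str], Optional[str]]]:
    """Yield (observable, ioc_type) pairs from a dict or an iterable."""
    if isinstance(data, dict):
        # Mapping of observable -> IoC type.
        for observable, ioc_type in data.items():
            yield observable, ioc_type
    elif isinstance(data, str):
        # Sketch-only choice: a bare string is one observable, type unknown.
        yield data, None
    else:
        # Plain iterable of observables: the type is left to be inferred.
        for observable in data:
            yield observable, None
```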

msticpy.sectools.tiproviders.ti_provider_base.get_schema_and_host(url: str, require_url_encoding: bool = False) Tuple[Optional[str], Optional[str], Optional[str]]

Return URL scheme and host and cleaned URL.

Parameters
  • url (str) – Input URL

  • require_url_encoding (bool) – Set to True if url needs encoding. Default is False.

Returns

Tuple of URL, scheme, host

Return type

Tuple[Optional[str], Optional[str], Optional[str]]
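
A minimal sketch of this kind of URL splitting, using only the standard library. split_url is a hypothetical name, and both the encoding rule and the all-None result for unparseable input are assumptions rather than the library's documented behaviour.

```python
from typing import Optional, Tuple
from urllib.parse import quote, urlsplit


def split_url(
    url: str, require_url_encoding: bool = False
) -> Tuple[Optional[str], Optional[str], Optional[str]]:
    """Return (clean_url, scheme, host), or three Nones if parsing fails."""
    clean_url = url.strip()
    parts = urlsplit(clean_url)
    if not parts.scheme or not parts.hostname:
        return None, None, None
    if require_url_encoding:
        # Assumption: percent-encode unsafe characters, keep URL delimiters.
        clean_url = quote(clean_url, safe=":/?&=#%")
    return clean_url, parts.scheme, parts.hostname
```
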

msticpy.sectools.tiproviders.ti_provider_base.preprocess_observable(observable, ioc_type, require_url_encoding: bool = False) msticpy.sectools.tiproviders.ti_provider_base.SanitizedObservable

Preprocesses and checks validity of observable against declared IoC type.

Parameters
  • observable – The value of the IoC

  • ioc_type – The IoC type

msticpy.sectools.tiproviders.greynoise module

GreyNoise Lookup.

msticpy.sectools.tiproviders.GreyNoise.ioc_query_defs

Return current dictionary of IoC query/request definitions.

Returns

IoC query/request definitions keyed by IoCType

Return type

Dict[str, Any]

msticpy.sectools.tiproviders.GreyNoise.supported_types

Return list of supported IoC types for this provider.

Returns

List of supported type names

Return type

List[str]

msticpy.sectools.tiproviders.http_base module

HTTP TI Provider base.

Input can be a single IoC observable or a pandas DataFrame containing multiple observables. Processing may require an API key, and processing performance may be limited to a specific number of requests per minute for the account type that you have.

class msticpy.sectools.tiproviders.http_base.HttpProvider(**kwargs)

Bases: msticpy.sectools.tiproviders.ti_provider_base.TIProvider

HTTP TI provider base class.

Initialize a new instance of the class.

property ioc_query_defs: Dict[str, Any]

Return current dictionary of IoC query/request definitions.

Returns

IoC query/request definitions keyed by IoCType

Return type

Dict[str, Any]

classmethod is_known_type(ioc_type: str) bool

Return True if this is a known IoC Type.

Parameters

ioc_type (str) – IoCType string to test

Returns

True if known type.

Return type

bool

is_supported_type(ioc_type: Union[str, msticpy.sectools.iocextract.IoCType]) bool

Return True if the passed type is supported.

Parameters

ioc_type (Union[str, IoCType]) – IoC type name or instance

Returns

True if supported.

Return type

bool

lookup_ioc(ioc: str, ioc_type: str = None, query_type: str = None, **kwargs) msticpy.sectools.tiproviders.ti_provider_base.LookupResult

Lookup a single IoC observable.

Parameters
  • ioc (str) – IoC observable

  • ioc_type (str, optional) – IoCType, by default None (type will be inferred)

  • query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.

Returns

The lookup result:

  • result - Positive/Negative

  • details - Lookup Details (or status if failure)

  • raw_result - Raw Response

  • reference - URL of IoC

Return type

LookupResult

Raises

NotImplementedError – If attempting to use an HTTP method or authentication protocol that is not supported.

Notes

This method uses memoization (lru_cache) to cache results for a particular observable, to try to avoid repeated network calls for the same item.
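
The caching behaviour in this note can be sketched with functools.lru_cache from the standard library. cached_lookup and network_calls are hypothetical names, and the function body only stands in for a real HTTP request.

```python
from functools import lru_cache
from typing import List, Optional, Tuple

# Records every call that actually reaches the (simulated) network.
network_calls: List[str] = []


@lru_cache(maxsize=256)
def cached_lookup(ioc: str, ioc_type: Optional[str] = None) -> Tuple[str, Optional[str]]:
    """Hypothetical lookup; runs only when the arguments are not cached."""
    network_calls.append(ioc)  # stands in for the HTTP request
    return ioc, ioc_type


cached_lookup("8.8.8.8", "ipv4")
cached_lookup("8.8.8.8", "ipv4")  # served from the cache, no second call
```

lru_cache keys on the exact call arguments, so the same observable passed once positionally and once by keyword is cached as two separate entries.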

lookup_iocs(data: Union[pandas.core.frame.DataFrame, Dict[str, str], Iterable[str]], obs_col: Optional[str] = None, ioc_type_col: Optional[str] = None, query_type: Optional[str] = None, **kwargs) pandas.core.frame.DataFrame

Lookup collection of IoC observables.

Parameters
  • data (Union[pd.DataFrame, Dict[str, str], Iterable[str]]) – Data input in one of three formats:

    1. Pandas dataframe (you must supply the column name in obs_col parameter)

    2. Dict of observable, IoCType

    3. Iterable of observables - IoCTypes will be inferred

  • obs_col (str, optional) – DataFrame column to use for observables, by default None

  • ioc_type_col (str, optional) – DataFrame column to use for IoCTypes, by default None

  • query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.

Returns

DataFrame of results.

Return type

pd.DataFrame

abstract parse_results(response: msticpy.sectools.tiproviders.ti_provider_base.LookupResult) Tuple[bool, msticpy.sectools.tiproviders.ti_provider_base.TISeverity, Any]

Return the details of the response.

Parameters

response (LookupResult) – The returned data response

Returns

  • bool = positive or negative hit

  • TISeverity = enumeration of severity

  • Object with match details

Return type

Tuple[bool, TISeverity, Any]

static resolve_ioc_type(observable: str) str

Return IoCType determined by IoCExtract.

Parameters

observable (str) – IoC observable string

Returns

IoC Type (or unknown if type could not be determined)

Return type

str

property supported_types: List[str]

Return list of supported IoC types for this provider.

Returns

List of supported type names

Return type

List[str]

classmethod usage()

Print usage of provider.

class msticpy.sectools.tiproviders.http_base.IoCLookupParams(path: str = '', verb: str = 'GET', full_url: bool = False, headers: Dict[str, str] = NOTHING, params: Dict[str, str] = NOTHING, data: Dict[str, str] = NOTHING, auth_type: str = '', auth_str: List[str] = NOTHING, sub_type: str = '')

Bases: object

IoC HTTP Lookup Params definition.

Method generated by attrs for class IoCLookupParams.

auth_str: List[str]
auth_type: str
data: Dict[str, str]
full_url: bool
headers: Dict[str, str]
params: Dict[str, str]
path: str
sub_type: str
verb: str
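
To illustrate how such a definition might be filled in, the sketch below mirrors the documented fields with a standard-library dataclass instead of attrs. LookupParams is a hypothetical stand-in; the empty-container defaults are an assumption about what NOTHING resolves to, and the path template and header values are invented.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LookupParams:
    """Stand-in mirroring the documented IoCLookupParams fields."""

    path: str = ""
    verb: str = "GET"
    full_url: bool = False
    headers: Dict[str, str] = field(default_factory=dict)
    params: Dict[str, str] = field(default_factory=dict)
    data: Dict[str, str] = field(default_factory=dict)
    auth_type: str = ""
    auth_str: List[str] = field(default_factory=list)
    sub_type: str = ""


# Hypothetical definition: the path template and header are invented.
ipv4_query = LookupParams(
    path="/api/ip/{observable}",
    headers={"Accept": "application/json"},
)
# Substitute the observable into the path template before sending.
request_path = ipv4_query.path.format(observable="8.8.8.8")
```
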

msticpy.sectools.tiproviders.alienvault_otx module

AlienVault OTX Provider.

Input can be a single IoC observable or a pandas DataFrame containing multiple observables. Processing may require an API key, and processing performance may be limited to a specific number of requests per minute for the account type that you have.

class msticpy.sectools.tiproviders.alienvault_otx.OTX(**kwargs)

Bases: msticpy.sectools.tiproviders.http_base.HttpProvider

AlienVault OTX Lookup.

Set OTX specific settings.

property ioc_query_defs: Dict[str, Any]

Return current dictionary of IoC query/request definitions.

Returns

IoC query/request definitions keyed by IoCType

Return type

Dict[str, Any]

classmethod is_known_type(ioc_type: str) bool

Return True if this is a known IoC Type.

Parameters

ioc_type (str) – IoCType string to test

Returns

True if known type.

Return type

bool

is_supported_type(ioc_type: Union[str, msticpy.sectools.iocextract.IoCType]) bool

Return True if the passed type is supported.

Parameters

ioc_type (Union[str, IoCType]) – IoC type name or instance

Returns

True if supported.

Return type

bool

lookup_ioc(ioc: str, ioc_type: str = None, query_type: str = None, **kwargs) msticpy.sectools.tiproviders.ti_provider_base.LookupResult

Lookup a single IoC observable.

Parameters
  • ioc (str) – IoC observable

  • ioc_type (str, optional) – IoCType, by default None (type will be inferred)

  • query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.

Returns

The lookup result:

  • result - Positive/Negative

  • details - Lookup Details (or status if failure)

  • raw_result - Raw Response

  • reference - URL of IoC

Return type

LookupResult

Raises

NotImplementedError – If attempting to use an HTTP method or authentication protocol that is not supported.

Notes

This method uses memoization (lru_cache) to cache results for a particular observable, to try to avoid repeated network calls for the same item.

lookup_iocs(data: Union[pandas.core.frame.DataFrame, Dict[str, str], Iterable[str]], obs_col: Optional[str] = None, ioc_type_col: Optional[str] = None, query_type: Optional[str] = None, **kwargs) pandas.core.frame.DataFrame

Lookup collection of IoC observables.

Parameters
  • data (Union[pd.DataFrame, Dict[str, str], Iterable[str]]) – Data input in one of three formats:

    1. Pandas dataframe (you must supply the column name in obs_col parameter)

    2. Dict of observable, IoCType

    3. Iterable of observables - IoCTypes will be inferred

  • obs_col (str, optional) – DataFrame column to use for observables, by default None

  • ioc_type_col (str, optional) – DataFrame column to use for IoCTypes, by default None

  • query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.

Returns

DataFrame of results.

Return type

pd.DataFrame

parse_results(response: msticpy.sectools.tiproviders.ti_provider_base.LookupResult) Tuple[bool, msticpy.sectools.tiproviders.ti_provider_base.TISeverity, Any]

Return the details of the response.

Parameters

response (LookupResult) – The returned data response

Returns

  • bool = positive or negative hit

  • TISeverity = enumeration of severity

  • Object with match details

Return type

Tuple[bool, TISeverity, Any]

static resolve_ioc_type(observable: str) str

Return IoCType determined by IoCExtract.

Parameters

observable (str) – IoC observable string

Returns

IoC Type (or unknown if type could not be determined)

Return type

str

property supported_types: List[str]

Return list of supported IoC types for this provider.

Returns

List of supported type names

Return type

List[str]

classmethod usage()

Print usage of provider.

msticpy.sectools.tiproviders.ibm_xforce module

IBM XForce Provider.

Input can be a single IoC observable or a pandas DataFrame containing multiple observables. Processing may require an API key, and processing performance may be limited to a specific number of requests per minute for the account type that you have.

class msticpy.sectools.tiproviders.ibm_xforce.XForce(**kwargs)

Bases: msticpy.sectools.tiproviders.http_base.HttpProvider

IBM XForce Lookup.

Initialize a new instance of the class.

property ioc_query_defs: Dict[str, Any]

Return current dictionary of IoC query/request definitions.

Returns

IoC query/request definitions keyed by IoCType

Return type

Dict[str, Any]

classmethod is_known_type(ioc_type: str) bool

Return True if this is a known IoC Type.

Parameters

ioc_type (str) – IoCType string to test

Returns

True if known type.

Return type

bool

is_supported_type(ioc_type: Union[str, msticpy.sectools.iocextract.IoCType]) bool

Return True if the passed type is supported.

Parameters

ioc_type (Union[str, IoCType]) – IoC type name or instance

Returns

True if supported.

Return type

bool

lookup_ioc(ioc: str, ioc_type: str = None, query_type: str = None, **kwargs) msticpy.sectools.tiproviders.ti_provider_base.LookupResult

Lookup a single IoC observable.

Parameters
  • ioc (str) – IoC observable

  • ioc_type (str, optional) – IoCType, by default None (type will be inferred)

  • query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.

Returns

The lookup result:

  • result - Positive/Negative

  • details - Lookup Details (or status if failure)

  • raw_result - Raw Response

  • reference - URL of IoC

Return type

LookupResult

Raises

NotImplementedError – If attempting to use an HTTP method or authentication protocol that is not supported.

Notes

This method uses memoization (lru_cache) to cache results for a particular observable, to try to avoid repeated network calls for the same item.

lookup_iocs(data: Union[pandas.core.frame.DataFrame, Dict[str, str], Iterable[str]], obs_col: Optional[str] = None, ioc_type_col: Optional[str] = None, query_type: Optional[str] = None, **kwargs) pandas.core.frame.DataFrame

Lookup collection of IoC observables.

Parameters
  • data (Union[pd.DataFrame, Dict[str, str], Iterable[str]]) – Data input in one of three formats:

    1. Pandas dataframe (you must supply the column name in obs_col parameter)

    2. Dict of observable, IoCType

    3. Iterable of observables - IoCTypes will be inferred

  • obs_col (str, optional) – DataFrame column to use for observables, by default None

  • ioc_type_col (str, optional) – DataFrame column to use for IoCTypes, by default None

  • query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.

Returns

DataFrame of results.

Return type

pd.DataFrame

parse_results(response: msticpy.sectools.tiproviders.ti_provider_base.LookupResult) Tuple[bool, msticpy.sectools.tiproviders.ti_provider_base.TISeverity, Any]

Return the details of the response.

Parameters

response (LookupResult) – The returned data response

Returns

  • bool = positive or negative hit

  • TISeverity = enumeration of severity

  • Object with match details

Return type

Tuple[bool, TISeverity, Any]

static resolve_ioc_type(observable: str) str

Return IoCType determined by IoCExtract.

Parameters

observable (str) – IoC observable string

Returns

IoC Type (or unknown if type could not be determined)

Return type

str

property supported_types: List[str]

Return list of supported IoC types for this provider.

Returns

List of supported type names

Return type

List[str]

classmethod usage()

Print usage of provider.

msticpy.sectools.tiproviders.virustotal module

VirusTotal Provider.

Input can be a single IoC observable or a pandas DataFrame containing multiple observables. Processing may require an API key, and processing performance may be limited to a specific number of requests per minute for the account type that you have.

class msticpy.sectools.tiproviders.virustotal.VirusTotal(**kwargs)

Bases: msticpy.sectools.tiproviders.http_base.HttpProvider

VirusTotal Lookup.

Initialize a new instance of the class.

property ioc_query_defs: Dict[str, Any]

Return current dictionary of IoC query/request definitions.

Returns

IoC query/request definitions keyed by IoCType

Return type

Dict[str, Any]

classmethod is_known_type(ioc_type: str) bool

Return True if this is a known IoC Type.

Parameters

ioc_type (str) – IoCType string to test

Returns

True if known type.

Return type

bool

is_supported_type(ioc_type: Union[str, msticpy.sectools.iocextract.IoCType]) bool

Return True if the passed type is supported.

Parameters

ioc_type (Union[str, IoCType]) – IoC type name or instance

Returns

True if supported.

Return type

bool

lookup_ioc(ioc: str, ioc_type: str = None, query_type: str = None, **kwargs) msticpy.sectools.tiproviders.ti_provider_base.LookupResult

Lookup a single IoC observable.

Parameters
  • ioc (str) – IoC observable

  • ioc_type (str, optional) – IoCType, by default None (type will be inferred)

  • query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.

Returns

The lookup result:

  • result - Positive/Negative

  • details - Lookup Details (or status if failure)

  • raw_result - Raw Response

  • reference - URL of IoC

Return type

LookupResult

Raises

NotImplementedError – If attempting to use an HTTP method or authentication protocol that is not supported.

Notes

This method uses memoization (lru_cache) to cache results for a particular observable, to try to avoid repeated network calls for the same item.

lookup_iocs(data: Union[pandas.core.frame.DataFrame, Dict[str, str], Iterable[str]], obs_col: Optional[str] = None, ioc_type_col: Optional[str] = None, query_type: Optional[str] = None, **kwargs) pandas.core.frame.DataFrame

Lookup collection of IoC observables.

Parameters
  • data (Union[pd.DataFrame, Dict[str, str], Iterable[str]]) – Data input in one of three formats:

    1. Pandas dataframe (you must supply the column name in obs_col parameter)

    2. Dict of observable, IoCType

    3. Iterable of observables - IoCTypes will be inferred

  • obs_col (str, optional) – DataFrame column to use for observables, by default None

  • ioc_type_col (str, optional) – DataFrame column to use for IoCTypes, by default None

  • query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.

Returns

DataFrame of results.

Return type

pd.DataFrame

parse_results(response: msticpy.sectools.tiproviders.ti_provider_base.LookupResult) Tuple[bool, msticpy.sectools.tiproviders.ti_provider_base.TISeverity, Any]

Return the details of the response.

Parameters

response (LookupResult) – The returned data response

Returns

  • bool = positive or negative hit

  • TISeverity = enumeration of severity

  • Object with match details

Return type

Tuple[bool, TISeverity, Any]

static resolve_ioc_type(observable: str) str

Return IoCType determined by IoCExtract.

Parameters

observable (str) – IoC observable string

Returns

IoC Type (or unknown if type could not be determined)

Return type

str

property supported_types: List[str]

Return list of supported IoC types for this provider.

Returns

List of supported type names

Return type

List[str]

classmethod usage()

Print usage of provider.

msticpy.sectools.vtlookup module

Module for VTLookup class.

Wrapper class around the VirusTotal API. Input can be a single IoC observable or a pandas DataFrame containing multiple observables. Processing requires a VirusTotal account and API key, and processing performance is limited to the number of requests per minute for the account type that you have. Supported IoC types:

  • Filehash

  • URL

  • DNS Domain

  • IPv4 Address

class msticpy.sectools.vtlookup.DuplicateStatus(is_dup, status)

Bases: tuple

Create new instance of DuplicateStatus(is_dup, status)

count(value, /)

Return number of occurrences of value.

index(value, start=0, stop=9223372036854775807, /)

Return first index of value.

Raises ValueError if the value is not present.

property is_dup

Alias for field number 0

property status

Alias for field number 1
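
The "Alias for field number" entries reflect that this class is a named tuple, so each field can be read by name or by position. A small sketch using a hypothetical DupStatus tuple with the same two fields (the status text is invented):

```python
from collections import namedtuple

# Hypothetical stand-in with the same field names as DuplicateStatus.
DupStatus = namedtuple("DupStatus", ["is_dup", "status"])

result = DupStatus(is_dup=True, status="duplicate of row 3")
# Fields unpack like any other tuple.
is_dup, status = result
```
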

class msticpy.sectools.vtlookup.VTLookup(vtkey: str, verbosity: int = 1)

Bases: object

VTLookup: VirusTotal lookup of IoC reports.

Main methods are:

  • lookup_iocs() - accepts input of multiple IoCs in a Pandas DataFrame

  • lookup_ioc() - looks up a single IoC observable

  • supported_ioc_types - a list of valid target types

  • ioc_vt_type_mapping - a dictionary of mappings to recognized VT Types. Types mapped to None will not be submitted to VT.

For urls a full http request can be submitted, query string and fragments will be dropped before submitting. For files MD5, SHA1 and SHA256 hashes are supported. For IP addresses only dotted IPv4 addresses are supported.

Create a new instance of VTLookup class.

Parameters
  • vtkey (str) – VirusTotal API key

  • verbosity (int, optional) –

    The level of detail of reporting

    0 = no reporting

    1 = minimal reporting (default)

    2 = verbose reporting

property ioc_vt_type_mapping: Dict[str, str]

Return mapping between internal and VirusTotal IoC type names.

Returns

Mapping between internal and VirusTotal IoC type names.

Return type

Mapping[str, str]

lookup_ioc(observable: str, ioc_type: str, output: str = 'dict') Any

Look up a single IoC observable.

Parameters
  • observable (str) – The observable value

  • ioc_type (str) – The IoC Type (see ‘supported_ioc_types’ attribute)

  • output (str, optional) – Output results as a dictionary (or list of dicts) if output is any other value the result will be returned in a Pandas DataFrame (the default is ‘dict’)

Returns

  • list{dict} (if output == ‘dict’)

  • pd.DataFrame (otherwise)

Raises

KeyError – Unknown ioc_type

lookup_iocs(data: pandas.core.frame.DataFrame, src_col: str = 'Observable', type_col: str = 'IoCType', src_index_col: str = 'SourceIndex', **kwargs) pandas.core.frame.DataFrame

Retrieve results for IoC observables in the source dataframe.

Parameters
  • data (pd.DataFrame) – Dataframe containing the observables to search for

  • src_col (str, optional) – The column name that contains the observable data (one item per row) (the default is ‘Observable’)

  • type_col (str, optional) – The column name containing the observable type (the default is ‘IoCType’)

  • src_index_col (str, optional) – The name of the column to use as source index. If not specified this defaults to ‘SourceIndex’. If this (or the supplied value) is not in the source dataframe, the index of the source dataframe will be used. This is retained in the output so that you can join the results back to the original data. (the default is ‘SourceIndex’)

  • kwargs – Key/value pairs of additional mappings to supported IoC type names, e.g. ipv4='ipaddress', url='httprequest'. This allows you to specify custom mappings when the source data is tagged with different names.

Returns

Combined results of local pre-processing and VirusTotal Lookups

Return type

pd.DataFrame

Raises

KeyError – Unknown ioc_type

Notes

See supported_ioc_types attribute for a list of valid target types. Not all of these types are supported by VirusTotal. See ioc_vt_type_mapping for current mappings. Types mapped to None will not be submitted to VT.

For urls a full http request can be submitted; query string and fragments will be dropped before submitting. Other supported protocols are ftp, telnet, ldap and file. For files, MD5, SHA1 and SHA256 hashes are supported. For IP addresses, only dotted IPv4 addresses are supported.
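
Dropping the query string and fragment from a URL before submission could be done along these lines with the standard library. strip_query_and_fragment is a hypothetical helper, not the function VTLookup uses internally.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query_and_fragment(url: str) -> str:
    """Return the URL with its query string and fragment removed."""
    parts = urlsplit(url)
    # Rebuild the URL from scheme, host and path only.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


cleaned = strip_query_and_fragment("http://example.com/path/page.html?id=1#top")
```
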

property supported_ioc_types: List[str]

Return list of supported IoC type internal names.

Returns

List of supported IoC type internal names.

Return type

List[str]

property supported_vt_types: List[str]

Return list of VirusTotal supported IoC type names.

Returns

List of VirusTotal supported IoC type names.

Return type

List[str]

class msticpy.sectools.vtlookup.VTParams(api_type, batch_size, batch_delimiter, http_verb, api_var_name, headers)

Bases: tuple

Create new instance of VTParams(api_type, batch_size, batch_delimiter, http_verb, api_var_name, headers)

property api_type

Alias for field number 0

property api_var_name

Alias for field number 4

property batch_delimiter

Alias for field number 2

property batch_size

Alias for field number 1

count(value, /)

Return number of occurrences of value.

property headers

Alias for field number 5

property http_verb

Alias for field number 3

index(value, start=0, stop=9223372036854775807, /)

Return first index of value.

Raises ValueError if the value is not present.

msticpy.sectools.vtlookupv3 module

VirusTotal v3 API.

class msticpy.sectools.vtlookupv3.ColumnNames(value)

Bases: enum.Enum

Column name enum for DataFrame output.

DETECTIONS = 'detections'
ID = 'id'
RELATIONSHIP_TYPE = 'relationship_type'
SCANS = 'scans'
SOURCE = 'source'
SOURCE_TYPE = 'source_type'
TARGET = 'target'
TARGET_TYPE = 'target_type'
TYPE = 'type'
exception msticpy.sectools.vtlookupv3.MsticpyVTGraphSaveGraphError

Bases: Exception

Could not save VT Graph.

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

exception msticpy.sectools.vtlookupv3.MsticpyVTNoDataError

Bases: Exception

No data returned from VT API.

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

class msticpy.sectools.vtlookupv3.VTEntityType(value)

Bases: enum.Enum

VTEntityType: Enum class for VirusTotal entity types.

DOMAIN = 'domain'
FILE = 'file'
IP_ADDRESS = 'ip_address'
URL = 'url'
class msticpy.sectools.vtlookupv3.VTLookupV3(vt_key: Optional[str] = None)

Bases: object

VTLookupV3: VirusTotal lookup of IoC reports.

Create a new instance of VTLookupV3 class.

Parameters

vt_key (str, optional) – VirusTotal API key, if not supplied, this is read from user configuration.

create_vt_graph(relationship_dfs: List[pandas.core.frame.DataFrame], name: str, private: bool) str

Create a VirusTotal Graph with a set of Relationship DataFrames.

Parameters
  • relationship_dfs – List of Relationship DataFrames

  • name – New graph name

  • private – Indicates if the Graph is private or not.

Returns

Graph ID

Return type

str

Raises
  • ValueError when private is not indicated.

  • ValueError when there are no relationship DataFrames

  • MsticpyVTGraphSaveGraphError when Graph can not be saved

get_object(vt_id: str, vt_type: str) pandas.core.frame.DataFrame

Return the full VT object as a DataFrame.

Parameters
  • vt_id (str) – The ID of the object

  • vt_type (str) – The type of object to query.

Returns

Single column DataFrame with attribute names as index and values as data column.

Return type

pd.DataFrame

Raises
lookup_ioc(observable: str, vt_type: str) pandas.core.frame.DataFrame

Look up a single IoC observable.

Parameters
  • observable (str) – The observable value

  • vt_type (str) – The VT entity type

Returns

Attributes Pandas DataFrame with the properties of the entity

Return type

pd.DataFrame

Raises

KeyError – Unknown vt_type
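
A sketch of how a vt_type string might be validated against the documented entity types before a lookup. EntityType re-creates the VTEntityType members so that the example is self-contained; check_vt_type and its error message are hypothetical.

```python
from enum import Enum


class EntityType(Enum):
    """Stand-in for VTEntityType with the documented members."""

    DOMAIN = "domain"
    FILE = "file"
    IP_ADDRESS = "ip_address"
    URL = "url"


def check_vt_type(vt_type: str) -> EntityType:
    """Return the matching EntityType or raise KeyError if unknown."""
    valid = {member.value: member for member in EntityType}
    if vt_type not in valid:
        raise KeyError(f"Unknown vt_type '{vt_type}'; expected one of {sorted(valid)}")
    return valid[vt_type]
```
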

lookup_ioc_relationships(observable: str, vt_type: str, relationship: str, limit: Optional[int] = None) pandas.core.frame.DataFrame

Look up the relationships of a single IoC observable.

Parameters
  • observable (str) – The observable value

  • vt_type (str) – The VT entity type

  • relationship (str) – Desired relationship

  • limit (int) – Relations limit

Returns

Relationship Pandas DataFrame with the relationships of the entity

Return type

pd.DataFrame

lookup_iocs(observables_df: pandas.core.frame.DataFrame, observable_column: str = 'target', observable_type_column: str = 'target_type')

Look up multiple IoC observables.

Parameters
  • observables_df (pd.DataFrame) – A Pandas DataFrame, where each row is an observable

  • observable_column – ID column of each observable

  • observable_type_column – Type column of each observable

Returns

Attributes Pandas DataFrame with the properties of the entities

Return type

pd.DataFrame

lookup_iocs_relationships(observables_df: pandas.core.frame.DataFrame, relationship: str, observable_column: str = 'target', observable_type_column: str = 'target_type', limit: Optional[int] = None) pandas.core.frame.DataFrame

Look up the relationships of multiple IoC observables.

Parameters
  • observables_df (pd.DataFrame) – A Pandas DataFrame, where each row is an observable

  • relationship (str) – Desired relationship

  • observable_column – ID column of each observable

  • observable_type_column – Type column of each observable.

  • limit (int) – Relations limit

Returns

Relationship Pandas DataFrame with the relationships of each observable.

Return type

pd.DataFrame

static render_vt_graph(graph_id: str, width: int = 800, height: int = 600)

Display a VTGraph in a Jupyter Notebook.

Parameters
  • graph_id – Graph ID

  • width – Graph width.

  • height – Graph height

property supported_vt_types: List[str]

Return list of VirusTotal supported IoC type names.

Returns

List of VirusTotal supported IoC type names.

Return type

List[str]

class msticpy.sectools.vtlookupv3.VTObjectProperties(value)

Bases: enum.Enum

Enum for VT Object properties.

ATTRIBUTES = 'attributes'
LAST_ANALYSIS_STATS = 'last_analysis_stats'
MALICIOUS = 'malicious'
RELATIONSHIPS = 'relationship'