msticpy.sectools package

msticpy.sectools.auditdextract module

Auditd extractor.

Module to load and decode Linux audit logs. It collapses messages sharing the same message ID into single events, decodes hex-encoded data fields and performs some event-specific formatting and normalization (e.g. for process start events it will re-assemble the process command line arguments into a single string). This is still a work-in-progress.
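As a rough illustration of the collapsing and decoding steps described above, the following standalone sketch groups raw lines by message ID, decodes hex-encoded EXECVE arguments and joins them into one command line. It does not use this module's code; the regular expressions, the helper names (decode_arg, collapse_events) and the sample lines are assumptions made for the example.

```python
import re
from collections import defaultdict

# Assumed line layout: type=<TYPE> msg=audit(<timestamp>:<id>): key=value ...
HEADER_RE = re.compile(
    r"type=(?P<type>\w+)\s+msg=audit\((?P<ts>[\d.]+):(?P<id>\d+)\):\s*(?P<body>.*)"
)
FIELD_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')
ARG_RE = re.compile(r"a\d+")  # EXECVE argument keys: a0, a1, ...


def decode_arg(raw):
    """Return a quoted argument without quotes; try to hex-decode an unquoted one."""
    if raw.startswith('"'):
        return raw.strip('"')
    try:
        return bytes.fromhex(raw).decode("utf-8")
    except ValueError:  # not hex, or not valid UTF-8
        return raw


def collapse_events(lines):
    """Group audit lines by message ID and rebuild EXECVE command lines."""
    events = defaultdict(dict)
    for line in lines:
        match = HEADER_RE.match(line)
        if not match:
            continue
        fields = events[match["id"]].setdefault(match["type"], {})
        for key, raw in FIELD_RE.findall(match["body"]):
            if match["type"] == "EXECVE" and ARG_RE.fullmatch(key):
                fields[key] = decode_arg(raw)
            else:
                fields[key] = raw.strip('"')
    for event in events.values():
        execve = event.get("EXECVE")
        if execve:
            argc = int(execve.get("argc", 0))
            execve["cmdline"] = " ".join(
                execve.get(f"a{i}", "") for i in range(argc)
            )
    return dict(events)


sample = [
    'type=SYSCALL msg=audit(1364481363.243:24287): arch=c000003e syscall=59 success=yes',
    'type=EXECVE msg=audit(1364481363.243:24287): argc=3 a0="ls" a1="-l" a2=2F746D70',
]
events = collapse_events(sample)
print(events["24287"]["EXECVE"]["cmdline"])
```

Only the EXECVE argument fields are hex-decoded in this sketch, because short numeric values such as syscall=59 would otherwise be misread as hex.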

msticpy.sectools.auditdextract.extract_events_to_df(data: pandas.core.frame.DataFrame, input_column: str = 'AuditdMessage', event_type: str = None, verbose: bool = False) → pandas.core.frame.DataFrame

Extract auditd raw messages into a dataframe.

Parameters:
  • data (pd.DataFrame) – The input dataframe with raw auditd data in a single string column
  • input_column (str, optional) – the input column name (the default is ‘AuditdMessage’)
  • event_type (str, optional) – the event type, if None, defaults to all (the default is None)
  • verbose (bool, optional) – Give feedback on stages of processing (the default is False)
Returns:

The resultant DataFrame

Return type:

pd.DataFrame

msticpy.sectools.auditdextract.generate_process_tree(audit_data: pandas.core.frame.DataFrame, branch_depth: int = 4, processes: pandas.core.frame.DataFrame = None) → pandas.core.frame.DataFrame

Generate process tree data from auditd logs.

Parameters:
  • audit_data (pd.DataFrame) – The Audit data containing process creation events
  • branch_depth (int, optional) – The maximum depth of parent or child processes to extract from the data (The default is 4)
  • processes (pd.DataFrame, optional) – Dataframe of processes to generate tree for
Returns:

The formatted process tree data

Return type:

pd.DataFrame

msticpy.sectools.auditdextract.get_event_subset(data: pandas.core.frame.DataFrame, event_type: str) → pandas.core.frame.DataFrame

Return a subset of the events matching type event_type.

Parameters:
  • data (pd.DataFrame) – The input data
  • event_type (str) – The event type to select
Returns:

The subset of the data where data[‘EventType’] == event_type

Return type:

pd.DataFrame

msticpy.sectools.auditdextract.read_from_file(filepath: str, event_type: str = None, verbose: bool = False, dummy_sep: str = '\t') → pandas.core.frame.DataFrame

Extract Audit events from a log file.

Parameters:
  • filepath (str) – path to the input file
  • event_type (str, optional) – The type of event to extract if only a subset required. (the default is None, which processes all types)
  • verbose (bool, optional) – If true more progress messages are output (the default is False)
  • dummy_sep (str, optional) – Separator to use for reading the ‘csv’ file (default is tab - ‘\t’)
Returns:

The output DataFrame

Return type:

pd.DataFrame

Notes

The dummy_sep parameter should be a character that does not occur in an input line. This function uses pandas read_csv to read the audit lines into a single column. Using a separator that does appear in the input (e.g. space or comma) will cause data to be parsed into multiple columns and anything after the first separator in a line will be lost.
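The effect described in this note can be reproduced in miniature. The sketch below uses the standard-library csv reader as a stand-in for pandas read_csv (an assumption made so the example has no dependencies); first_column is a hypothetical helper and the audit line is sample data.

```python
import csv
import io


def first_column(text, sep):
    """Parse text with the given separator and keep only column 0 of each row."""
    reader = csv.reader(io.StringIO(text), delimiter=sep, quoting=csv.QUOTE_NONE)
    return [row[0] for row in reader if row]


line = "type=SYSCALL msg=audit(1364481363.243:24287): arch=c000003e syscall=59\n"

whole = first_column(line, "\t")  # separator absent: the line stays intact
broken = first_column(line, " ")  # separator present: the line is split
print(whole)
print(broken)
```

With a space as the separator, only the text before the first space survives in the first column, which is the data loss the note warns about.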

msticpy.sectools.auditdextract.unpack_auditd(audit_str: List[Dict[str, str]]) → Mapping[str, Mapping[str, Any]]

Unpack an Audit message and return a dictionary of fields.

Parameters:audit_str (str) – The auditd raw record
Returns:The extracted message fields and values
Return type:Mapping[str, Any]

msticpy.sectools.base64unpack module

base64_unpack.

The main function of this module is to decode and unpack strings that are obfuscated using base64 and/or certain compression algorithms such as gzip and zip.

It has the following functions: unpack_items - this is the main entry point and takes either a string or a pandas dataframe (with specified column) as input. It returns a string with obfuscated parts replaced by decoded equivalents (unless the decoding results in an undecodable binary, in which case a placeholder is used).

Other helper functions may also be useful standalone:

  • get_items_from_gzip(binary): Return decompressed gzip content of byte string
  • get_items_from_zip(binary): Return dictionary of zip contents from byte string
  • get_items_from_tar(binary): Return dictionary of tar file contents
  • get_hashes(binary): Return md5, sha1 and sha256 hashes of input byte string

class msticpy.sectools.base64unpack.B64ExtractAccessor(pandas_obj)

Bases: object

Base64 Unpack pandas extension.

Initialize the extension.

extract(column, **kwargs) → pandas.core.frame.DataFrame

Base64 decode strings taken from a pandas dataframe.

Parameters:
  • data (pd.DataFrame) – dataframe containing column to decode
  • column (str) – Name of dataframe text column
  • trace (bool, optional) – Show additional status (the default is None)
  • utf16 (bool, optional) – Attempt to decode UTF16 byte strings
Returns:

Decoded string and additional metadata in dataframe

Return type:

pd.DataFrame

Notes

Items that decode to utf-8 or utf-16 strings will be returned as decoded strings replaced in the original string. If the encoded string is a known binary type it will identify the file type and return the hashes of the file. If any binary types are known archives (zip, tar, gzip) it will unpack the contents of the archive. For any binary it will return the decoded file as a byte array, and as a printable list of byte values.

The columns of the output DataFrame are:

  • decoded string: this is the input string with any decoded sections replaced by the results of the decoding
  • reference : this is an index that matches an index number in the decoded string (e.g. <<encoded binary type=pdf index=1.2’).
  • original_string : the string prior to decoding
  • file_type : the type of file if this could be determined
  • file_hashes : a dictionary of hashes (the md5, sha1 and sha256 hashes are broken out into separate columns)
  • input_bytes : the binary image as a byte array
  • decoded_string : printable form of the decoded string (either string or list of hex byte values)
  • encoding_type : utf-8, utf-16 or binary
  • md5, sha1, sha256 : the respective hashes of the binary. file_type, file_hashes, input_bytes, md5, sha1, sha256 will be null if this item is decoded to a string
  • src_index - the index of the source row in the input frame.
class msticpy.sectools.base64unpack.BinaryRecord(reference, original_string, file_name, file_type, input_bytes, decoded_string, encoding_type, file_hashes, md5, sha1, sha256, printable_bytes)

Bases: tuple

Create new instance of BinaryRecord(reference, original_string, file_name, file_type, input_bytes, decoded_string, encoding_type, file_hashes, md5, sha1, sha256, printable_bytes)

count()

Return number of occurrences of value.

decoded_string

Alias for field number 5

encoding_type

Alias for field number 6

file_hashes

Alias for field number 7

file_name

Alias for field number 2

file_type

Alias for field number 3

index()

Return first index of value.

Raises ValueError if the value is not present.

input_bytes

Alias for field number 4

md5

Alias for field number 8

original_string

Alias for field number 1

printable_bytes

Alias for field number 11

reference

Alias for field number 0

sha1

Alias for field number 9

sha256

Alias for field number 10

msticpy.sectools.base64unpack.get_hashes(binary: bytes) → Dict[str, str]

Return md5, sha1 and sha256 hashes of input byte string.

Parameters:binary (bytes) – byte string of item to be hashed
Returns:dictionary of hash algorithm + hash value
Return type:Dict[str, str]
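A minimal sketch of what such a hash dictionary could look like, using only hashlib. The function name sketch_hashes and the key names are assumptions; the keys actually returned by get_hashes are not shown in this reference.

```python
import hashlib


def sketch_hashes(binary: bytes) -> dict:
    """Return hex digests of the input keyed by algorithm name."""
    return {
        name: hashlib.new(name, binary).hexdigest()
        for name in ("md5", "sha1", "sha256")
    }


hashes = sketch_hashes(b"")
print(hashes["md5"])
```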
msticpy.sectools.base64unpack.get_items_from_gzip(binary: bytes) → Tuple[str, Dict[str, bytes]]

Return decompressed gzip contents.

Parameters:binary (bytes) – byte array of gz file
Returns:File type + decompressed file
Return type:Tuple[str, bytes]
msticpy.sectools.base64unpack.get_items_from_tar(binary: bytes) → Tuple[str, Dict[str, bytes]]

Return dictionary of tar file contents.

Parameters:binary (bytes) – byte array of tar file
Returns:Filetype + dictionary of file name + file content
Return type:Tuple[str, Dict[str, bytes]]
msticpy.sectools.base64unpack.get_items_from_zip(binary: bytes) → Tuple[str, Dict[str, bytes]]

Return dictionary of zip contents.

Parameters:binary (bytes) – byte array of zip file
Returns:Filetype + dictionary of file name + file content
Return type:Tuple[str, Dict[str, bytes]]
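The three archive helpers share one shape: take a byte string, return a file type plus the unpacked content. The sketch below shows that shape with the standard library only. It is not this module's implementation; the function name, the type labels ("gz", "zip", "tar", "unknown") and the "gzip_file" key are assumptions made for the example.

```python
import gzip
import io
import tarfile
import zipfile


def sketch_unpack_archive(binary):
    """Return (file_type, {name: content}) for gzip, zip or tar input."""
    if binary[:2] == b"\x1f\x8b":  # gzip magic number
        return "gz", {"gzip_file": gzip.decompress(binary)}
    buffer = io.BytesIO(binary)
    if zipfile.is_zipfile(buffer):
        with zipfile.ZipFile(buffer) as archive:
            return "zip", {
                name: archive.read(name) for name in archive.namelist()
            }
    try:
        with tarfile.open(fileobj=io.BytesIO(binary)) as archive:
            return "tar", {
                member.name: archive.extractfile(member).read()
                for member in archive.getmembers()
                if member.isfile()
            }
    except tarfile.ReadError:
        return "unknown", {}


# Build a small zip in memory and unpack it again.
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w") as archive:
    archive.writestr("a.txt", "zip content")
file_type, items = sketch_unpack_archive(zip_buffer.getvalue())
print(file_type, items)
```

The gzip check comes first in this sketch, so a gzip-compressed tar file is reported as gzip rather than tar.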
msticpy.sectools.base64unpack.unpack(input_string: str, trace: bool = False, utf16: bool = False) → Tuple[str, Optional[List[msticpy.sectools.base64unpack.BinaryRecord]]]

Base64 decode an input string.

Parameters:
  • input_string (str) – single string to decode
  • trace (bool, optional) – Show additional status (the default is False)
  • utf16 (bool, optional) – Attempt to decode UTF16 byte strings
Returns:

Decoded string and additional metadata

Return type:

Tuple[str, Optional[List[BinaryRecord]]]

Notes

Items that decode to utf-8 or utf-16 strings will be returned as decoded strings replaced in the original string. If the encoded string is a known binary type it will identify the file type and return the hashes of the file. If any binary types are known archives (zip, tar, gzip) it will unpack the contents of the archive. For any binary it will return the decoded file as a byte array, and as a printable list of byte values. If the input is a string the function returns:

  • decoded string: this is the input string with any decoded sections replaced by the results of the decoding
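The "decoded sections replaced in the original string" behaviour can be sketched as follows. This is a simplified stand-in, not the module's code: the candidate pattern (runs of at least 16 base64 characters), the name sketch_unpack and the choice to leave undecodable candidates unchanged (rather than insert a placeholder) are all assumptions.

```python
import base64
import binascii
import re

# Assumed heuristic: a run of 16+ base64 characters with optional padding.
B64_RE = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def sketch_unpack(input_string):
    """Replace base64 runs that decode to UTF-8 text with the decoded text."""

    def replace(match):
        candidate = match.group(0)
        try:
            return base64.b64decode(candidate, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            return candidate  # simplification: leave undecodable runs as-is

    return B64_RE.sub(replace, input_string)


encoded = base64.b64encode(b"whoami /all && dir").decode("ascii")
decoded = sketch_unpack(f"powershell -enc {encoded}")
print(decoded)
```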
msticpy.sectools.base64unpack.unpack_df(data: pandas.core.frame.DataFrame, column: str, trace: bool = False, utf16: bool = False) → pandas.core.frame.DataFrame

Base64 decode strings taken from a pandas dataframe.

Parameters:
  • data (pd.DataFrame) – dataframe containing column to decode
  • column (str) – Name of dataframe text column
  • trace (bool, optional) – Show additional status (the default is False)
  • utf16 (bool, optional) – Attempt to decode UTF16 byte strings
Returns:

Decoded string and additional metadata in dataframe

Return type:

pd.DataFrame

Notes

Items that decode to utf-8 or utf-16 strings will be returned as decoded strings replaced in the original string. If the encoded string is a known binary type it will identify the file type and return the hashes of the file. If any binary types are known archives (zip, tar, gzip) it will unpack the contents of the archive. For any binary it will return the decoded file as a byte array, and as a printable list of byte values.

The columns of the output DataFrame are:

  • decoded string: this is the input string with any decoded sections replaced by the results of the decoding
  • reference : this is an index that matches an index number in the decoded string (e.g. <<encoded binary type=pdf index=1.2’).
  • original_string : the string prior to decoding
  • file_type : the type of file if this could be determined
  • file_hashes : a dictionary of hashes (the md5, sha1 and sha256 hashes are broken out into separate columns)
  • input_bytes : the binary image as a byte array
  • decoded_string : printable form of the decoded string (either string or list of hex byte values)
  • encoding_type : utf-8, utf-16 or binary
  • md5, sha1, sha256 : the respective hashes of the binary. file_type, file_hashes, input_bytes, md5, sha1, sha256 will be null if this item is decoded to a string
  • src_index - the index of the source row in the input frame.
msticpy.sectools.base64unpack.unpack_items(input_string: str = None, data: pandas.core.frame.DataFrame = None, column: str = None, trace: bool = False, utf16: bool = False) → Any

Base64 decode an input string or strings taken from a pandas dataframe.

Parameters:
  • input_string (str, optional) – single string to decode (the default is None)
  • data (pd.DataFrame, optional) – dataframe containing column to decode (the default is None)
  • column (str, optional) – Name of dataframe text column (the default is None)
  • trace (bool, optional) – Show additional status (the default is False)
  • utf16 (bool, optional) – Attempt to decode UTF16 byte strings
Returns:

  • Tuple[str, pd.DataFrame] (if input_string) – Decoded string and additional metadata
  • pd.DataFrame – Decoded string and additional metadata in dataframe

Notes

If the input is a dataframe you must supply the name of the column to use.

Items that decode to utf-8 or utf-16 strings will be returned as decoded strings replaced in the original string. If the encoded string is a known binary type it will identify the file type and return the hashes of the file. If any binary types are known archives (zip, tar, gzip) it will unpack the contents of the archive. For any binary it will return the decoded file as a byte array, and as a printable list of byte values. If the input is a string the function returns:

  • decoded string: this is the input string with any decoded sections replaced by the results of the decoding

It also returns the data as a Pandas DataFrame with the following columns:

  • reference : this is an index that matches an index number in the returned string (e.g. <<encoded binary type=pdf index=1.2’).
  • original_string : the string prior to decoding
  • file_type : the type of file if this could be determined
  • file_hashes : a dictionary of hashes (the md5, sha1 and sha256 hashes are broken out into separate columns)
  • input_bytes : the binary image as a byte array
  • decoded_string : printable form of the decoded string (either string or list of hex byte values)
  • encoding_type : utf-8, utf-16 or binary
  • md5, sha1, sha256 : the respective hashes of the binary. file_type, file_hashes, input_bytes, md5, sha1, sha256 will be null if this item is decoded to a string

If the input is a dataframe the output dataframe will also include the following column: - src_index - the index of the source row in the input frame. This allows you to re-join the output data to the input data.

msticpy.sectools.cmd_line module

cmd_line - Syslog Command processing module.

Contains a series of functions required to correctly collect, parse and visualise linux syslog data.

Designed to support standard linux syslog for investigations where auditd is not available.

msticpy.sectools.cmd_line.cmd_speed(cmd_events: pandas.core.frame.DataFrame, cmd_field: str, time: int = 5, events: int = 10) → list

Detect patterns of cmd_line activity whose speed of execution may be suspicious.

Parameters:
  • cmd_events (pd.DataFrame) – A DataFrame of all sudo events to check.
  • cmd_field (str) – The column of the event data that contains command line activity
  • time (int, optional) – Time window in seconds in which to evaluate speed of execution against (Defaults to 5)
  • events (int, optional) – Number of syslog command execution events in which to evaluate speed of execution against (Defaults to 10)
Returns:

risky suspicious_actions – A list of commands that match a risky pattern

Return type:

list

Raises:

AttributeError – If cmd_field is not in supplied data set or TimeGenerated is not in datetime format
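One way to picture the speed check is a sliding time window over timestamped commands. The sketch below is a guess at the general technique, not the function's actual logic; the name sketch_cmd_speed, the parameters window_seconds and min_events (standing in for time and events) and the tuple input format are assumptions.

```python
from datetime import datetime, timedelta


def sketch_cmd_speed(cmd_events, window_seconds=5, min_events=10):
    """Return commands that fall in any window holding at least min_events events.

    cmd_events is a list of (timestamp, command) tuples.
    """
    ordered = sorted(cmd_events, key=lambda event: event[0])
    window = timedelta(seconds=window_seconds)
    flagged = set()
    start = 0
    for end in range(len(ordered)):
        # Shrink the window from the left until it spans at most `window`.
        while ordered[end][0] - ordered[start][0] > window:
            start += 1
        if end - start + 1 >= min_events:
            flagged.update(range(start, end + 1))
    return [ordered[i][1] for i in sorted(flagged)]


base = datetime(2021, 1, 1, 12, 0, 0)
fast = [(base + timedelta(seconds=i * 0.1), f"cmd{i}") for i in range(10)]
slow = [(base + timedelta(minutes=10 + i), f"slow{i}") for i in range(3)]
suspicious = sketch_cmd_speed(fast + slow, window_seconds=5, min_events=10)
print(suspicious)
```

Ten commands within one second are flagged, while three commands spaced a minute apart are not.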

msticpy.sectools.cmd_line.risky_cmd_line(events: pandas.core.frame.DataFrame, log_type: str, detection_rules: str = '/home/docs/checkouts/readthedocs.org/user_builds/msticpy/envs/v1.0.0/lib/python3.7/site-packages/msticpy/resources/cmd_line_rules.json', cmd_field: str = 'Command') → dict

Detect patterns of risky commands in syslog messages.

Risky patterns are defined in a json format file.

Parameters:
  • events (pd.DataFrame) – A DataFrame of all syslog events potentially containing risky command line activity.
  • log_type (str) – The log type of the data included in events. Must correspond to a detection type in detection_rules file.
  • detection_rules (str, optional) – Path to json file containing patterns of risky activity to detect. (Defaults to msticpy/resources/cmd_line_rules.json)
  • cmd_field (str, optional) – The column in the events dataset that contains the command lines to be analysed. (Defaults to “Command”)
Returns:

risky actions – A dictionary of commands that match a risky pattern

Return type:

dict

Raises:

MsticpyException – The provided dataset does not contain the cmd_field field

msticpy.sectools.geoip module

Geoip Lookup module using IPStack and Maxmind GeoLite2.

Geographic location lookup for IP addresses. This module has two classes for different services:

  • GeoLiteLookup - MaxMind GeoLite2
  • IPStackLookup - IPStack

Both services offer a free tier for non-commercial use. However, a paid tier will normally get you more accuracy, more detail and a higher throughput rate. Maxmind geolite uses a downloadable database, while IPStack is an online lookup (API key required).

exception msticpy.sectools.geoip.GeoIPDatabaseException

Bases: Exception

Exception when GeoIP database cannot be found.

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

class msticpy.sectools.geoip.GeoIpLookup

Bases: object

Abstract base class for GeoIP Lookup classes.

See also

IPStackLookup
IPStack GeoIP Implementation
GeoLiteLookup
MaxMind GeoIP Implementation

Initialize instance of GeoIpLookup class.

df_lookup_ip(data: pandas.core.frame.DataFrame, column: str) → pandas.core.frame.DataFrame

Lookup Geolocation data from a pandas Dataframe.

Parameters:
  • data (pd.DataFrame) – pandas dataframe containing IpAddress column
  • column (str) – the name of the dataframe column to use as a source
Returns:

Copy of original dataframe with IP Location information columns appended (where a location lookup was successful)

Return type:

pd.DataFrame

lookup_ip(ip_address: str = None, ip_addr_list: collections.abc.Iterable = None, ip_entity: msticpy.datamodel.entities.ip_address.IpAddress = None) → Tuple[List[Any], List[msticpy.datamodel.entities.ip_address.IpAddress]]

Lookup IP location abstract method.

Parameters:
  • ip_address (str, optional) – a single address to look up (the default is None)
  • ip_addr_list (Iterable, optional) – a collection of addresses to lookup (the default is None)
  • ip_entity (IpAddress, optional) – an IpAddress entity (the default is None) - any existing data in the Location property will be overwritten
Returns:

raw geolocation results and same results as IpAddress entities with populated Location property.

Return type:

Tuple[List[Any], List[IpAddress]]

lookup_ips(data: pandas.core.frame.DataFrame, column: str) → pandas.core.frame.DataFrame

Lookup Geolocation data from a pandas Dataframe.

Parameters:
  • data (pd.DataFrame) – pandas dataframe containing IpAddress column
  • column (str) – the name of the dataframe column to use as a source
Returns:

IpLookup results as DataFrame.

Return type:

pd.DataFrame

class msticpy.sectools.geoip.GeoLiteLookup(api_key: Optional[str] = None, db_folder: Optional[str] = None, force_update: bool = False, auto_update: bool = True, debug: bool = False)

Bases: msticpy.sectools.geoip.GeoIpLookup

GeoIP Lookup using MaxMindDB database.

See also

GeoIpLookup
Abstract base class
IPStackLookup
IPStack GeoIP Implementation

Return new instance of GeoLiteLookup class.

Parameters:
  • api_key (str, optional) – Default is None - use configuration value from msticpyconfig.yaml. API Key from MaxMind - Read more about GeoLite2 : https://dev.maxmind.com/geoip/geoip2/geolite2/ Sign up for a MaxMind account: https://www.maxmind.com/en/geolite2/signup Set your password and create a license key: https://www.maxmind.com/en/accounts/current/license-key
  • db_folder (str, optional) – Provide absolute path to the folder containing MMDB file (e.g. ‘/usr/home’ or ‘C:/maxmind’). If no path is provided, it is set to download to .msticpy/GeoLite2 under the user's home directory.
  • force_update (bool, optional) – Set to True to force an update: a new download request will be initiated.
  • auto_update (bool, optional) – Set to True to update automatically: a new download request will be initiated if the age criteria are met.
  • debug (bool, optional) – Print additional debugging information, default is False.
close()

Close an open GeoIP DB.

df_lookup_ip(data: pandas.core.frame.DataFrame, column: str) → pandas.core.frame.DataFrame

Lookup Geolocation data from a pandas Dataframe.

Parameters:
  • data (pd.DataFrame) – pandas dataframe containing IpAddress column
  • column (str) – the name of the dataframe column to use as a source
Returns:

Copy of original dataframe with IP Location information columns appended (where a location lookup was successful)

Return type:

pd.DataFrame

lookup_ip(ip_address: str = None, ip_addr_list: collections.abc.Iterable = None, ip_entity: msticpy.datamodel.entities.ip_address.IpAddress = None) → Tuple[List[Any], List[msticpy.datamodel.entities.ip_address.IpAddress]]

Lookup IP location from GeoLite2 data created by MaxMind.

Parameters:
  • ip_address (str, optional) – a single address to look up (the default is None)
  • ip_addr_list (Iterable, optional) – a collection of addresses to lookup (the default is None)
  • ip_entity (IpAddress, optional) – an IpAddress entity (the default is None) - any existing data in the Location property will be overwritten
Returns:

raw geolocation results and same results as IpAddress entities with populated Location property.

Return type:

Tuple[List[Any], List[IpAddress]]

lookup_ips(data: pandas.core.frame.DataFrame, column: str) → pandas.core.frame.DataFrame

Lookup Geolocation data from a pandas Dataframe.

Parameters:
  • data (pd.DataFrame) – pandas dataframe containing IpAddress column
  • column (str) – the name of the dataframe column to use as a source
Returns:

IpLookup results as DataFrame.

Return type:

pd.DataFrame

class msticpy.sectools.geoip.IPStackLookup(api_key: Optional[str] = None, bulk_lookup: bool = False)

Bases: msticpy.sectools.geoip.GeoIpLookup

IPStack GeoIP Implementation.

See also

GeoIpLookup
Abstract base class
GeoLiteLookup
MaxMind GeoIP Implementation

Create a new instance of IPStackLookup.

Parameters:
  • api_key (str, optional) – API Key from IPStack - see https://ipstack.com default is None - obtain key from msticpyconfig.yaml
  • bulk_lookup (bool, optional) – For Professional and above tiers allowing you to submit multiple IPs in a single request. (the default is False, which submits a single request per address)
df_lookup_ip(data: pandas.core.frame.DataFrame, column: str) → pandas.core.frame.DataFrame

Lookup Geolocation data from a pandas Dataframe.

Parameters:
  • data (pd.DataFrame) – pandas dataframe containing IpAddress column
  • column (str) – the name of the dataframe column to use as a source
Returns:

Copy of original dataframe with IP Location information columns appended (where a location lookup was successful)

Return type:

pd.DataFrame

lookup_ip(ip_address: str = None, ip_addr_list: collections.abc.Iterable = None, ip_entity: msticpy.datamodel.entities.ip_address.IpAddress = None) → Tuple[List[Any], List[msticpy.datamodel.entities.ip_address.IpAddress]]

Lookup IP location from IPStack web service.

Parameters:
  • ip_address (str, optional) – a single address to look up (the default is None)
  • ip_addr_list (Iterable, optional) – a collection of addresses to lookup (the default is None)
  • ip_entity (IpAddress, optional) – an IpAddress entity (the default is None) - any existing data in the Location property will be overwritten
Returns:

raw geolocation results and same results as IpAddress entities with populated Location property.

Return type:

Tuple[List[Any], List[IpAddress]]

Raises:
  • ConnectionError – Invalid status returned from http request
  • PermissionError – Service refused request (e.g. requesting batch of addresses on free tier API key)
lookup_ips(data: pandas.core.frame.DataFrame, column: str) → pandas.core.frame.DataFrame

Lookup Geolocation data from a pandas Dataframe.

Parameters:
  • data (pd.DataFrame) – pandas dataframe containing IpAddress column
  • column (str) – the name of the dataframe column to use as a source
Returns:

IpLookup results as DataFrame.

Return type:

pd.DataFrame

msticpy.sectools.geoip.entity_distance(ip_src: msticpy.datamodel.entities.ip_address.IpAddress, ip_dest: msticpy.datamodel.entities.ip_address.IpAddress) → float

Return distance between two IP Entities.

Parameters:
  • ip_src (IpAddress) – Source/Origin IpAddress Entity
  • ip_dest (IpAddress) – Destination IpAddress Entity
Returns:

Distance in kilometers.

Return type:

float

Raises:

AttributeError – If either entity has no location information

msticpy.sectools.geoip.geo_distance(origin: Tuple[float, float], destination: Tuple[float, float]) → float

Calculate the Haversine distance.

Parameters:
  • origin (Tuple[float, float]) – Latitude, Longitude of origin of distance measurement.
  • destination (Tuple[float, float]) – Latitude, Longitude of destination of distance measurement.
Returns:

Distance in kilometers.

Return type:

float

Examples

>>> origin = (48.1372, 11.5756)  # Munich
>>> destination = (52.5186, 13.4083)  # Berlin
>>> round(geo_distance(origin, destination), 1)
504.2

Notes

Author: Martin Thoma - stackoverflow
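For reference, the Haversine formula itself can be written in a few lines. This is a standalone sketch, not the module's source; the function name haversine_km and the mean Earth radius of 6371 km are assumptions of the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(origin, destination):
    """Great-circle distance in kilometers between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, origin)
    lat2, lon2 = map(math.radians, destination)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = (
        math.sin(dlat / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


munich = (48.1372, 11.5756)
berlin = (52.5186, 13.4083)
print(round(haversine_km(munich, berlin), 1))
```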

msticpy.sectools.iocextract module

Module for IoCExtract class.

Uses a set of builtin regular expressions to look for Indicator of Compromise (IoC) patterns. Input can be a single string or a pandas dataframe with one or more columns specified as input.

The following types are built-in:

  • IPv4 and IPv6
  • URL
  • DNS domain
  • Hashes (MD5, SHA1, SHA256)
  • Windows file paths
  • Linux file paths (this is kind of noisy because a legal linux file path can have almost any character)

You can modify or add to the regular expressions used at runtime.
class msticpy.sectools.iocextract.IoCExtract

Bases: object

IoC Extractor - looks for common IoC patterns in input strings.

The extract() method takes either a string or a pandas DataFrame as input. When using the string option as an input, extract will return a dictionary of results. When using a DataFrame, the results will be returned as a new DataFrame with the following columns:

  • IoCType: the mnemonic used to distinguish different IoC Types
  • Observable: the actual value of the observable
  • SourceIndex: the index of the row in the input DataFrame from which the source for the IoC observable was extracted.

The class has a number of built-in IoC regex definitions. These can be retrieved using the ioc_types attribute.

Additional IoC definitions can be added using the add_ioc_type method.

Note: due to some ambiguity in the regular expression patterns for different types, an observable may be returned assigned to multiple observable types. E.g. 192.168.0.1 is also a legal file name in both Linux and Windows. Linux file names have a particularly large scope in terms of legal characters, so it will be quite common to see other IoC observables (or parts of them) returned as a possible linux path.
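To make the string-in, dictionary-out behaviour concrete, here is a small sketch that applies two of the class's published patterns (IPV4_REGEX and MD5_REGEX) with the standard re module. It does not call IoCExtract; the function name sketch_extract and the sample text are assumptions, and the type names follow the default list quoted in the extract() notes.

```python
import re

# Patterns copied from the IPV4_REGEX and MD5_REGEX class attributes.
IPV4_REGEX = r"(?P<ipaddress>(?:[0-9]{1,3}\.){3}[0-9]{1,3})"
MD5_REGEX = r"(?:^|[^A-Fa-f0-9])(?P<hash>[A-Fa-f0-9]{32})(?:$|[^A-Fa-f0-9])"

# Each entry: IoC type name -> (compiled pattern, named group holding the value).
PATTERNS = {
    "ipv4": (re.compile(IPV4_REGEX), "ipaddress"),
    "md5_hash": (re.compile(MD5_REGEX), "hash"),
}


def sketch_extract(src):
    """Return {ioc_type: set of observables} for the patterns above."""
    results = {}
    for ioc_type, (pattern, group) in PATTERNS.items():
        found = {match.group(group) for match in pattern.finditer(src)}
        if found:
            results[ioc_type] = found
    return results


text = "Connection from 192.168.0.1 dropped file d41d8cd98f00b204e9800998ecf8427e to disk"
observables = sketch_extract(text)
print(observables)
```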

Initialize new instance of IoCExtract.

DNS_REGEX = '((?=[a-z0-9-]{1,63}\\.)[a-z0-9]+(-[a-z0-9]+)*\\.){1,126}[a-z]{2,63}'
IPV4_REGEX = '(?P<ipaddress>(?:[0-9]{1,3}\\.){3}[0-9]{1,3})'
IPV6_REGEX = '(?<![:.\\w])(?:[A-F0-9]{0,4}:){2,7}[A-F0-9]{0,4}(?![:.\\w])'
LXPATH_REGEX = '(?P<root>/+||[.]+)\n (?P<folder>/(?:[^\\\\/:*?<>|\\r\\n]+/)*)\n (?P<file>[^/\\0<>|\\r\\n ]+)'
MD5_REGEX = '(?:^|[^A-Fa-f0-9])(?P<hash>[A-Fa-f0-9]{32})(?:$|[^A-Fa-f0-9])'
SHA1_REGEX = '(?:^|[^A-Fa-f0-9])(?P<hash>[A-Fa-f0-9]{40})(?:$|[^A-Fa-f0-9])'
SHA256_REGEX = '(?:^|[^A-Fa-f0-9])(?P<hash>[A-Fa-f0-9]{64})(?:$|[^A-Fa-f0-9])'
URL_REGEX = '\n (?P<protocol>(https?|ftp|telnet|ldap|file)://)\n (?P<userinfo>([a-z0-9-._~!$&\\\'()*+,;=:]|%[0-9A-F]{2})*@)?\n (?P<host>([a-z0-9-._~!$&\\\'()*+,;=]|%[0-9A-F]{2})*)\n (:(?P<port>\\d*))?\n (/(?P<path>([^?\\#"<>\\s]|%[0-9A-F]{2})*/?))?\n (\\?(?P<query>([a-z0-9-._~!$&\'()*+,;=:/?@]|%[0-9A-F]{2})*))?\n (\\#(?P<fragment>([a-z0-9-._~!$&\'()*+,;=:/?@]|%[0-9A-F]{2})*))?'
WINPATH_REGEX = '\n (?P<root>[a-z]:|\\\\\\\\[a-z0-9_.$-]+||[.]+)\n (?P<folder>\\\\(?:[^\\/:*?"\\\'<>|\\r\\n]+\\\\)*)\n (?P<file>[^\\\\/*?""<>|\\r\\n ]+)'
add_ioc_type(ioc_type: str, ioc_regex: str, priority: int = 0, group: str = None)

Add an IoC type and regular expression to use to the built-in set.

Parameters:
  • ioc_type (str) – A unique name for the IoC type
  • ioc_regex (str) – A regular expression used to search for the type
  • priority (int, optional) – Priority of the regex match vs. other ioc_patterns. 0 is the highest priority (the default is 0).
  • group (str, optional) – The regex group to match (the default is None, which will match on the whole expression)

Notes

Pattern priorities.
If two IocType patterns match on the same substring, the matched substring is assigned to the pattern/IocType with the highest priority. E.g. foo.bar.com will match types: dns, windows_path and linux_path but since dns has a higher priority, the expression is assigned to the dns matches.
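The priority rule can be sketched as "keep the type with the lowest priority number for each matched substring". The example below illustrates that idea only; the function name, the simplified patterns and the priority values are hypothetical and are not the class's built-in definitions.

```python
import re


def resolve_by_priority(text, patterns):
    """Assign each matched substring to the type with the highest priority.

    patterns maps ioc_type -> (regex, priority); 0 is the highest priority.
    """
    assigned = {}  # matched substring -> (priority, ioc_type)
    for ioc_type, (regex, priority) in patterns.items():
        for match in re.finditer(regex, text):
            value = match.group(0)
            current = assigned.get(value)
            if current is None or priority < current[0]:
                assigned[value] = (priority, ioc_type)
    return {value: ioc_type for value, (_, ioc_type) in assigned.items()}


# Hypothetical, heavily simplified patterns and priorities.
demo_patterns = {
    "linux_path": (r"[^\s/]+", 2),
    "dns": (r"(?:[a-z0-9-]+\.)+[a-z]{2,63}", 1),
}
winner = resolve_by_priority("foo.bar.com", demo_patterns)
print(winner)
```

Both patterns match the whole string, and the dns entry wins because its priority number is lower.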
extract(src: str = None, data: pandas.core.frame.DataFrame = None, columns: List[str] = None, **kwargs) → Union[Dict[str, Set[str]], pandas.core.frame.DataFrame]

Extract IoCs from either a string or pandas DataFrame.

Parameters:
  • src (str, optional) – source string in which to look for IoC patterns (the default is None)
  • data (pd.DataFrame, optional) – input DataFrame from which to read source strings (the default is None)
  • columns (list, optional) – The list of columns to use as source strings, if the data parameter is used. (the default is None)
Other Parameters:
 
  • ioc_types (list, optional) – Restrict matching to just specified types. (default is all types)
  • include_paths (bool, optional) – Whether to include path matches (which can be noisy) (the default is false - excludes ‘windows_path’ and ‘linux_path’). If ioc_types is specified this parameter is ignored.
  • ignore_tlds (bool, optional) – If True, ignore the official Top Level Domains list when determining whether a domain name is a legal domain.
Returns:

dict of found observables (if input is a string) or DataFrame of observables

Return type:

Any

Notes

Extract takes either a string or a pandas DataFrame as input. When using the string option as an input, extract will return a dictionary of results. When using a DataFrame, the results will be returned as a new DataFrame with the following columns:

  • IoCType: the mnemonic used to distinguish different IoC Types
  • Observable: the actual value of the observable
  • SourceIndex: the index of the row in the input DataFrame from which the source for the IoC observable was extracted.

IoCType pattern selection: the default list is [‘ipv4’, ‘ipv6’, ‘dns’, ‘url’, ‘md5_hash’, ‘sha1_hash’, ‘sha256_hash’] plus any user-defined types. ‘windows_path’ and ‘linux_path’ are excluded unless include_paths is True or they are explicitly included in ioc_types.

extract_df(data: pandas.core.frame.DataFrame, columns: Union[str, List[str]], **kwargs) → pandas.core.frame.DataFrame

Extract IoCs from a pandas DataFrame.

Parameters:
  • data (pd.DataFrame) – input DataFrame from which to read source strings
  • columns (Union[str, list]) – A single column name as a string or a list of columns to use as source strings
Other Parameters:
 
  • ioc_types (list, optional) – Restrict matching to just specified types. (default is all types)
  • include_paths (bool, optional) – Whether to include path matches (which can be noisy) (the default is false - excludes ‘windows_path’ and ‘linux_path’). If ioc_types is specified this parameter is ignored.
  • ignore_tlds (bool, optional) – If True, ignore the official Top Level Domains list when determining whether a domain name is a legal domain.
Returns:

DataFrame of observables

Return type:

pd.DataFrame

Notes

Extract takes a pandas DataFrame as input. The results are returned as a new DataFrame with the following columns:
  • IoCType: the mnemonic used to distinguish different IoC Types
  • Observable: the actual value of the observable
  • SourceIndex: the index of the row in the input DataFrame from which the source for the IoC observable was extracted.

IoCType pattern selection: the default list is [‘ipv4’, ‘ipv6’, ‘dns’, ‘url’, ‘md5_hash’, ‘sha1_hash’, ‘sha256_hash’] plus any user-defined types. ‘windows_path’ and ‘linux_path’ are excluded unless include_paths is True or they are explicitly included in ioc_types.
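
The output shape described in the notes above can be illustrated with a small stand-alone sketch. It does not call msticpy: the two regular expressions are deliberately simplified stand-ins rather than the library's patterns, and extract_rows and the row layout (a dict of index to record) are hypothetical names chosen for the example.

```python
import re

# Simplified, illustrative patterns; NOT the regular expressions msticpy uses.
PATTERNS = {
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "md5_hash": re.compile(r"\b[a-fA-F0-9]{32}\b"),
}


def extract_rows(rows, column, ioc_types=None):
    """Return one record per match with IoCType, Observable and SourceIndex."""
    selected = ioc_types or list(PATTERNS)
    results = []
    for index, row in rows.items():
        for ioc_type in selected:
            for match in PATTERNS[ioc_type].finditer(row[column]):
                results.append(
                    {
                        "IoCType": ioc_type,
                        "Observable": match.group(),
                        "SourceIndex": index,  # index of the originating input row
                    }
                )
    return results


# Hypothetical input: row index -> record with a single text column.
rows = {
    0: {"CommandLine": "ping 10.0.0.5"},
    1: {"CommandLine": "no indicators here"},
    2: {"CommandLine": "certutil -hashfile d41d8cd98f00b204e9800998ecf8427e"},
}
found = extract_rows(rows, "CommandLine")
md5_only = extract_rows(rows, "CommandLine", ioc_types=["md5_hash"])
```

Rows without a match contribute nothing to the output, which is why SourceIndex is needed to join results back to the input.
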

static file_hash_type(file_hash: str) → msticpy.sectools.iocextract.IoCType

Return specific IoCType based on hash length.

Parameters:file_hash (str) – File hash string
Returns:Specific hash type or unknown.
Return type:IoCType
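
A minimal sketch of classifying a hash by its length, assuming hex-encoded digests (32, 40 and 64 characters for MD5, SHA-1 and SHA-256). HashType and hash_type_from_length are hypothetical names for this example; the library's own enumeration is IoCType.

```python
import hashlib
from enum import Enum


class HashType(Enum):
    """Hypothetical stand-in for the hash members of IoCType."""

    md5_hash = "md5_hash"
    sha1_hash = "sha1_hash"
    sha256_hash = "sha256_hash"
    unknown = "unknown"


# Length of the hex digest for each supported algorithm.
_HEX_LENGTHS = {
    32: HashType.md5_hash,
    40: HashType.sha1_hash,
    64: HashType.sha256_hash,
}


def hash_type_from_length(file_hash):
    """Return the hash type implied by the string length, or unknown."""
    return _HEX_LENGTHS.get(len(file_hash.strip()), HashType.unknown)


md5_kind = hash_type_from_length(hashlib.md5(b"").hexdigest())
sha256_kind = hash_type_from_length(hashlib.sha256(b"").hexdigest())
other_kind = hash_type_from_length("abc123")
```

Length alone cannot tell whether the string is a valid hash, so a check like this is best paired with a pattern match.
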
get_ioc_type(observable: str) → str

Return first matching type.

Parameters:observable (str) – The IoC Observable to check
Returns:The IoC type enumeration (unknown, if no match)
Return type:str
ioc_types

Return the current set of IoC types and regular expressions.

Returns:dict of IoC Type names and regular expressions
Return type:dict
validate(input_str: str, ioc_type: str, ignore_tlds: bool = False) → bool

Check that input_str matches the regex for the specified ioc_type.

Parameters:
  • input_str (str) – the string to test
  • ioc_type (str) – the regex pattern to use
  • ignore_tlds (bool, optional) – If True, ignore the official Top Level Domains list when determining whether a domain name is a legal domain.
Returns:

True if match.

Return type:

bool

class msticpy.sectools.iocextract.IoCExtractAccessor(pandas_obj)

Bases: object

Pandas api extension for IoC Extractor.

Instantiate pandas extension class.

extract(columns, **kwargs)

Extract IoCs from a pandas DataFrame.

Parameters:

columns (list) – The list of columns to use as source strings.

Other Parameters:
 
  • ioc_types (list, optional) – Restrict matching to just specified types. (default is all types)
  • include_paths (bool, optional) – Whether to include path matches (which can be noisy) (the default is false - excludes ‘windows_path’ and ‘linux_path’). If ioc_types is specified this parameter is ignored.
Returns:

DataFrame of observables

Return type:

pd.DataFrame

Notes

Extract takes a pandas DataFrame as input. The results are returned as a new DataFrame with the following columns:
  • IoCType: the mnemonic used to distinguish different IoC Types
  • Observable: the actual value of the observable
  • SourceIndex: the index of the row in the input DataFrame from which the source for the IoC observable was extracted.

IoCType pattern selection: the default list is [‘ipv4’, ‘ipv6’, ‘dns’, ‘url’, ‘md5_hash’, ‘sha1_hash’, ‘sha256_hash’] plus any user-defined types. ‘windows_path’ and ‘linux_path’ are excluded unless include_paths is True or they are explicitly included in ioc_types.

class msticpy.sectools.iocextract.IoCPattern(ioc_type, comp_regex, priority, group)

Bases: tuple

Create new instance of IoCPattern(ioc_type, comp_regex, priority, group)

comp_regex

Alias for field number 1

count()

Return number of occurrences of value.

group

Alias for field number 3

index()

Return first index of value.

Raises ValueError if the value is not present.

ioc_type

Alias for field number 0

priority

Alias for field number 2
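
Because IoCPattern derives from tuple, each named field is also reachable by position. The sketch below rebuilds an equivalent named tuple with the standard library to show the field order listed above; the regex, priority and group values are made up for the example.

```python
import re
from collections import namedtuple

# Same field order as documented: ioc_type=0, comp_regex=1, priority=2, group=3.
IoCPattern = namedtuple("IoCPattern", ["ioc_type", "comp_regex", "priority", "group"])

# Illustrative values only.
pattern = IoCPattern(
    ioc_type="ipv4",
    comp_regex=re.compile(r"(?:\d{1,3}\.){3}\d{1,3}"),
    priority=1,
    group=None,
)

# Unpacking follows the positional order of the fields.
ioc_type, comp_regex, priority, group = pattern
```
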

class msticpy.sectools.iocextract.IoCType

Bases: enum.Enum

Enumeration of IoC Types.

dns = 'dns'
email = 'email'
file_hash = 'file_hash'
hostname = 'hostname'
ipv4 = 'ipv4'
ipv6 = 'ipv6'
linux_path = 'linux_path'
md5_hash = 'md5_hash'
parse = <bound method IoCType.parse of <enum 'IoCType'>>
sha1_hash = 'sha1_hash'
sha256_hash = 'sha256_hash'
unknown = 'unknown'
url = 'url'
windows_path = 'windows_path'

msticpy.sectools.process_tree_utils module

Process Tree Visualization.

msticpy.sectools.process_tree_utils.get_ancestors(procs: pandas.core.frame.DataFrame, source, include_source=True) → pandas.core.frame.DataFrame

Return the ancestor processes of the source process.

Parameters:
  • procs (pd.DataFrame) – Process events (with process tree metadata)
  • source (Union[str, pd.Series]) – source_index of process or the process row
  • include_source (bool, optional) – Include the source process in the results, by default True
Returns:

Ancestor processes

Return type:

pd.DataFrame

msticpy.sectools.process_tree_utils.get_children(procs: pandas.core.frame.DataFrame, source: Union[str, pandas.core.series.Series], include_source: bool = True) → pandas.core.frame.DataFrame

Return the child processes for the source process.

Parameters:
  • procs (pd.DataFrame) – Process events (with process tree metadata)
  • source (Union[str, pd.Series]) – source_index of process or the process row
  • include_source (bool, optional) – If True include the source process in the results, by default True
Returns:

Child processes

Return type:

pd.DataFrame

msticpy.sectools.process_tree_utils.get_descendents(procs: pandas.core.frame.DataFrame, source: Union[str, pandas.core.series.Series], include_source: bool = True, max_levels: int = -1) → pandas.core.frame.DataFrame

Return the descendents of the source process.

Parameters:
  • procs (pd.DataFrame) – Process events (with process tree metadata)
  • source (Union[str, pd.Series]) – source_index of process or the process row
  • include_source (bool, optional) – Include the source process in the results, by default True
  • max_levels (int, optional) – Maximum number of levels to descend, by default -1 (all levels)
Returns:

Descendent processes

Return type:

pd.DataFrame

msticpy.sectools.process_tree_utils.get_parent(procs: pandas.core.frame.DataFrame, source: Union[str, pandas.core.series.Series]) → Optional[pandas.core.series.Series]

Return the parent of the source process.

Parameters:
  • procs (pd.DataFrame) – Process events (with process tree metadata)
  • source (Union[str, pd.Series]) – source_index of process or the process row
Returns:

Parent Process row or None if no parent was found.

Return type:

Optional[pd.Series]

msticpy.sectools.process_tree_utils.get_process(procs: pandas.core.frame.DataFrame, source: Union[str, pandas.core.series.Series]) → pandas.core.series.Series

Return the process event as a Series.

Parameters:
  • procs (pd.DataFrame) – Process events (with process tree metadata)
  • source (Union[str, pd.Series]) – source_index of process or the process row
Returns:

Process row

Return type:

pd.Series

Raises:

ValueError – If unknown type is supplied as source

msticpy.sectools.process_tree_utils.get_process_key(procs: pandas.core.frame.DataFrame, source_index: int) → str

Return the process key of the process given its source_index.

Parameters:
  • procs (pd.DataFrame) – Process events
  • source_index (int, optional) – source_index of the process record
Returns:

The process key of the process.

Return type:

str

msticpy.sectools.process_tree_utils.get_root(procs: pandas.core.frame.DataFrame, source: Union[str, pandas.core.series.Series]) → pandas.core.series.Series

Return the root process for the source process.

Parameters:
  • procs (pd.DataFrame) – Process events (with process tree metadata)
  • source (Union[str, pd.Series]) – source_index of process or the process row
Returns:

Root process

Return type:

pd.Series

msticpy.sectools.process_tree_utils.get_root_tree(procs: pandas.core.frame.DataFrame, source: Union[str, pandas.core.series.Series]) → pandas.core.frame.DataFrame

Return the process tree to which the source process belongs.

Parameters:
  • procs (pd.DataFrame) – Process events (with process tree metadata)
  • source (Union[str, pd.Series]) – source_index of process or the process row
Returns:

Process Tree

Return type:

pd.DataFrame

msticpy.sectools.process_tree_utils.get_roots(procs: pandas.core.frame.DataFrame) → pandas.core.frame.DataFrame

Return the process tree roots for the current data set.

Parameters:procs (pd.DataFrame) – Process events (with process tree metadata)
Returns:Process Tree root processes
Return type:pd.DataFrame
msticpy.sectools.process_tree_utils.get_siblings(procs: pandas.core.frame.DataFrame, source: Union[str, pandas.core.series.Series], include_source: bool = True) → pandas.core.frame.DataFrame

Return the processes that share the parent of the source process.

Parameters:
  • procs (pd.DataFrame) – Process events (with process tree metadata)
  • source (Union[str, pd.Series]) – source_index of process or the process row
  • include_source (bool, optional) – Include the source process in the results, by default True
Returns:

Sibling processes.

Return type:

pd.DataFrame

msticpy.sectools.process_tree_utils.get_summary_info(procs: pandas.core.frame.DataFrame) → Dict[str, int]

Return summary information about the process trees.

Parameters:procs (pd.DataFrame) – Process events (with process tree metadata)
Returns:Summary statistic about the process tree
Return type:Dict[str, int]
msticpy.sectools.process_tree_utils.get_tree_depth(procs: pandas.core.frame.DataFrame) → int

Return the depth of the process tree.

Parameters:procs (pd.DataFrame) – Process events (with process tree metadata)
Returns:Tree depth
Return type:int
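
The traversal functions in this module share one idea: follow parent links upward for ancestors, and expand children level by level for descendants. The sketch below shows that idea on a plain dict of child to parent. It is not msticpy's implementation, which works on a DataFrame with process tree metadata; ancestors_of and descendants_of are hypothetical helpers, and the data is assumed to be acyclic.

```python
def ancestors_of(parents, key, include_source=True):
    """Walk parent links from key to the root; return the chain root-first."""
    chain = [key] if include_source else []
    current = parents.get(key)
    while current is not None:
        chain.append(current)
        current = parents.get(current)
    return chain[::-1]


def descendants_of(parents, key, include_source=True, max_levels=-1):
    """Expand children breadth-first; max_levels=-1 means all levels."""
    children = {}
    for child, parent in parents.items():
        children.setdefault(parent, []).append(child)
    found = [key] if include_source else []
    frontier, level = [key], 0
    while frontier and (max_levels < 0 or level < max_levels):
        frontier = [c for p in frontier for c in children.get(p, [])]
        found.extend(frontier)
        level += 1
    return found


# Hypothetical process keys mapped to their parent key (None marks a root).
parents = {"init": None, "sshd": "init", "bash": "sshd", "curl": "bash", "cron": "init"}

lineage = ancestors_of(parents, "curl")
direct_children = descendants_of(parents, "init", include_source=False, max_levels=1)
whole_tree = descendants_of(parents, "init")
```

With max_levels=1 and include_source=False the descendant walk reduces to a children lookup, and the first element of the ancestor chain is the root.
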

msticpy.sectools.syslog_utils module

syslog_utils - Syslog parsing and utility module.

Functions required to correctly collect, parse and visualize syslog data.

Designed to support standard Linux syslog for investigations where auditd is not available.

msticpy.sectools.syslog_utils.cluster_syslog_logons_df(logon_events: pandas.core.frame.DataFrame) → pandas.core.frame.DataFrame

Cluster logon sessions in syslog by start/end time based on PAM events.

Parameters:logon_events (pd.DataFrame) – A DataFrame of all syslog logon events (can be generated with LinuxSyslog.user_logon query)
Returns:logon_sessions – A DataFrame of logon sessions including start and end times and the logged-on user
Return type:pd.DataFrame
Raises:MsticpyException – There are no logon sessions in the supplied data set
msticpy.sectools.syslog_utils.create_host_record(syslog_df: pandas.core.frame.DataFrame, heartbeat_df: pandas.core.frame.DataFrame, az_net_df: pandas.core.frame.DataFrame = None) → msticpy.datamodel.entities.host.Host

Generate host_entity record for selected computer.

Parameters:
  • syslog_df (pd.DataFrame) – A dataframe of all syslog events for the host in the time window required
  • heartbeat_df (pd.DataFrame) – A dataframe of heartbeat data for the host
  • az_net_df (pd.DataFrame) – Optional dataframe of Azure network data for the host
Returns:

Details of the host data collected

Return type:

Host

msticpy.sectools.syslog_utils.risky_sudo_sessions(sudo_sessions: pandas.core.frame.DataFrame, risky_actions: dict = None, suspicious_actions: list = None) → dict

Detect if a sudo session occurs at the point of a suspicious event.

Parameters:
  • sudo_sessions (pd.DataFrame) – DataFrame of sudo sessions (as generated by cluster_syslog_logons_df)
  • risky_actions (dict (Optional)) – Dictionary of risky sudo commands (as generated by cmd_line.risky_cmd_line)
  • suspicious_actions (list (Optional)) – List of risky sudo commands (as generated by cmd_line.cmd_speed)
Returns:

risky_sessions – A dictionary of sudo sessions with flags denoting risk

Return type:

dict

msticpy.sectools.tilookup module

Module for TILookup classes.

Input can be a single IoC observable or a pandas DataFrame containing multiple observables. Processing may require an API key, and processing performance may be limited to a specific number of requests per minute for the account type that you have.

class msticpy.sectools.tilookup.TILookup(primary_providers: Optional[List[msticpy.sectools.tiproviders.ti_provider_base.TIProvider]] = None, secondary_providers: Optional[List[msticpy.sectools.tiproviders.ti_provider_base.TIProvider]] = None, providers: Optional[List[str]] = None)

Bases: object

Threat Intel observable lookup from providers.

Initialize TILookup instance.

Parameters:
  • primary_providers (Optional[List[TIProvider]], optional) – Primary TI Providers, by default None
  • secondary_providers (Optional[List[TIProvider]], optional) – Secondary TI Providers, by default None
  • providers (Optional[List[str]], optional) – List of provider names to load, by default all available providers are loaded. To see the list of available providers call TILookup.list_available_providers(). Note: if primary_providers or secondary_providers is specified, this will override the providers list.
add_provider(provider: msticpy.sectools.tiproviders.ti_provider_base.TIProvider, name: str = None, primary: bool = True)

Add a TI provider to the current collection.

Parameters:
  • provider (TIProvider) – Provider instance
  • name (str, optional) – The name to use for the provider (overrides the class name of provider)
  • primary (bool, optional) – True to add the provider as “primary”, False to add it as “secondary”, by default True
available_providers

Return a list of builtin providers.

Returns:List of TI Provider classes.
Return type:List[str]
static browse(data: pandas.core.frame.DataFrame, severities: Optional[List[str]] = None, **kwargs)

Return TI Results list browser.

Parameters:
  • data (pd.DataFrame) – TI Results data from TIProviders
  • severities (Optional[List[str]], optional) – A list of the severity classes to show. By default these are [‘warning’, ‘high’]. Pass [‘information’, ‘warning’, ‘high’] to see all results.
Other Parameters:
 

kwargs – passed to SelectItem constructor.

Returns:

SelectItem browser for TI Data.

Return type:

SelectItem

static browse_results(data: pandas.core.frame.DataFrame, severities: Optional[List[str]] = None, **kwargs)

Return TI Results list browser.

Parameters:
  • data (pd.DataFrame) – TI Results data from TIProviders
  • severities (Optional[List[str]], optional) – A list of the severity classes to show. By default these are [‘warning’, ‘high’]. Pass [‘information’, ‘warning’, ‘high’] to see all results.
Other Parameters:
 

kwargs – passed to SelectItem constructor.

Returns:

SelectItem browser for TI Data.

Return type:

SelectItem

configured_providers

Return a list of available providers that have configuration details present.

Returns:List of TI Provider classes.
Return type:List[str]
disable_provider(providers: Union[str, Iterable[str]])

Set the provider as secondary (not used by default).

Parameters:providers (Union[str, Iterable[str]]) – Provider name or list of names. Use list_available_providers() to see the list of loaded providers.
Raises:ValueError – If the provider name is not recognized.
enable_provider(providers: Union[str, Iterable[str]])

Set the provider(s) as primary (used by default).

Parameters:providers (Union[str, Iterable[str]]) – Provider name or list of names. Use list_available_providers() to see the list of loaded providers.
Raises:ValueError – If the provider name is not recognized.
classmethod list_available_providers(show_query_types=False, as_list: bool = False) → Optional[List[str]]

Print a list of builtin providers with optional usage.

Parameters:
  • show_query_types (bool, optional) – Show query types supported by providers, by default False
  • as_list (bool, optional) – Return list of providers instead of printing to stdout. Note: if you specify show_query_types this will be printed irrespective of this parameter setting.
Returns:

A list of provider names (if as_list=True)

Return type:

Optional[List[str]]

loaded_providers

Return dictionary of loaded providers.

Returns:Loaded providers, keyed by provider name.
Return type:Dict[str, TIProvider]
lookup_ioc(observable: str = None, ioc_type: str = None, ioc_query_type: str = None, providers: List[str] = None, prov_scope: str = 'primary', **kwargs) → Tuple[bool, List[Tuple[str, msticpy.sectools.tiproviders.ti_provider_base.LookupResult]]]

Lookup single IoC in active providers.

Parameters:
  • observable (str) – IoC observable (ioc is also an alias for observable)
  • ioc_type (str, optional) – One of IoCExtract.IoCType, by default None. If None, the IoC type will be inferred
  • ioc_query_type (str, optional) – The ioc query type (e.g. rep, info, malware)
  • providers (List[str]) – Explicit list of providers to use
  • prov_scope (str, optional) – Use “primary”, “secondary” or “all” providers, by default “primary”
  • kwargs – Additional arguments passed to the underlying provider(s)
Returns:

The result returned as a tuple (bool, list): the bool indicates whether a TI record was found in any provider; the list has an entry for each provider result.

Return type:

Tuple[bool, List[Tuple[str, LookupResult]]]

lookup_iocs(data: Union[pandas.core.frame.DataFrame, Mapping[str, str], Iterable[str]], obs_col: str = None, ioc_type_col: str = None, ioc_query_type: str = None, providers: List[str] = None, prov_scope: str = 'primary', **kwargs) → pandas.core.frame.DataFrame

Lookup a collection of IoCs.

Parameters:
  • data (Union[pd.DataFrame, Mapping[str, str], Iterable[str]]) – Data input in one of three formats: (1) a pandas DataFrame (you must supply the column name in the obs_col parameter); (2) a Mapping (e.g. a dict) of [observable, IoCType]; (3) an Iterable of observables (IoCTypes will be inferred)
  • obs_col (str, optional) – DataFrame column to use for observables, by default None (“col” and “column” are also aliases for this parameter)
  • ioc_type_col (str, optional) – DataFrame column to use for IoCTypes, by default None
  • ioc_query_type (str, optional) – The ioc query type (e.g. rep, info, malware)
  • providers (List[str]) – Explicit list of providers to use
  • prov_scope (str, optional) – Use “primary”, “secondary” or “all” providers, by default “primary”
  • kwargs – Additional arguments passed to the underlying provider(s)
Returns:

DataFrame of results

Return type:

pd.DataFrame

provider_status

Return loaded provider status.

Returns:List of providers and descriptions.
Return type:Iterable[str]
provider_usage()

Print usage of loaded providers.

classmethod reload_provider_settings()

Reload provider settings from config.

reload_providers()

Reload providers based on current settings in config.

Parameters:clear_keyring (bool, optional) – Clears any secrets cached in keyring, by default False
static result_to_df(ioc_lookup: Tuple[bool, List[Tuple[str, msticpy.sectools.tiproviders.ti_provider_base.LookupResult]]]) → pandas.core.frame.DataFrame

Return DataFrame representation of IoC Lookup response.

Parameters:ioc_lookup (Tuple[bool, List[Tuple[str, LookupResult]]]) – Output from lookup_ioc
Returns:The response as a DataFrame with a row for each provider response.
Return type:pd.DataFrame
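
A sketch of how a lookup_ioc-style result, a tuple of (bool, list of (provider name, result) pairs), can be flattened into one row per provider. MiniResult is a hypothetical reduced stand-in for LookupResult, the provider names and values are sample data, and the rows are plain dicts rather than a DataFrame so the example runs without pandas.

```python
from dataclasses import asdict, dataclass


@dataclass
class MiniResult:
    """Hypothetical subset of the LookupResult fields."""

    ioc: str
    ioc_type: str
    result: bool = False
    severity: int = 0


def result_to_rows(ioc_lookup):
    """Flatten (found, [(provider, result), ...]) into one dict per provider."""
    _found, provider_results = ioc_lookup
    rows = []
    for provider, res in provider_results:
        row = asdict(res)
        row["provider"] = provider  # keep track of which provider answered
        rows.append(row)
    return rows


# Sample data shaped like the documented return value of lookup_ioc.
lookup = (
    True,
    [
        ("OTX", MiniResult("198.51.100.7", "ipv4", result=True, severity=2)),
        ("XForce", MiniResult("198.51.100.7", "ipv4")),
    ],
)
rows = result_to_rows(lookup)
any_hit = any(row["result"] for row in rows)
```
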
set_provider_state(prov_dict: Dict[str, bool])

Set a dict of providers to primary/secondary.

Parameters:prov_dict (Dict[str, bool]) – Dictionary of provider name and bool - True if enabled/primary, False if disabled/secondary.

msticpy.sectools.tiproviders.ti_provider_base module

Module for TILookup classes.

Input can be a single IoC observable or a pandas DataFrame containing multiple observables. Processing may require an API key, and processing performance may be limited to a specific number of requests per minute for the account type that you have.

class msticpy.sectools.tiproviders.ti_provider_base.LookupResult(ioc: str, ioc_type: str, safe_ioc: str = '', query_subtype: Optional[str] = None, provider: Optional[str] = None, result: bool = False, severity: int = 0, details: Any = None, raw_result: Union[str, dict, None] = None, reference: Optional[str] = None, status: int = 0)

Bases: object

Lookup result for IoCs.

Method generated by attrs for class LookupResult.

classmethod column_map()

Return a dictionary that maps fields to DF Names.

raw_result_fmtd

Print raw results of the Lookup Result.

set_severity(value: Any)

Set the severity from enum, int or string.

Parameters:value (Any) – The severity value to set
severity_name

Return text description of severity score.

Returns:Severity description.
Return type:str
summary

Print a summary of the Lookup Result.

class msticpy.sectools.tiproviders.ti_provider_base.SanitizedObservable(observable, status)

Bases: tuple

Create new instance of SanitizedObservable(observable, status)

count()

Return number of occurrences of value.

index()

Return first index of value.

Raises ValueError if the value is not present.

observable

Alias for field number 0

status

Alias for field number 1

class msticpy.sectools.tiproviders.ti_provider_base.TILookupStatus

Bases: enum.Enum

Threat intelligence lookup status.

bad_format = 2
not_supported = 1
ok = 0
other = 10
query_failed = 3
class msticpy.sectools.tiproviders.ti_provider_base.TIPivotProvider

Bases: abc.ABC

A class which provides pivot functions and a means of registering them.

register_pivots(pivot_reg: PivotRegistration, pivot: Pivot)

Register pivot functions for the TI Provider.

Parameters:
class msticpy.sectools.tiproviders.ti_provider_base.TIProvider(**kwargs)

Bases: abc.ABC

Abstract base class for Threat Intel providers.

Initialize the provider.

ioc_query_defs

Return current dictionary of IoC query/request definitions.

Returns:IoC query/request definitions keyed by IoCType
Return type:Dict[str, Any]
classmethod is_known_type(ioc_type: str) → bool

Return True if this is a known IoC Type.

Parameters:ioc_type (str) – IoCType string to test
Returns:True if known type.
Return type:bool
is_supported_type(ioc_type: Union[str, msticpy.sectools.iocextract.IoCType]) → bool

Return True if the passed type is supported.

Parameters:ioc_type (Union[str, IoCType]) – IoC type name or instance
Returns:True if supported.
Return type:bool
lookup_ioc(ioc: str, ioc_type: str = None, query_type: str = None, **kwargs) → msticpy.sectools.tiproviders.ti_provider_base.LookupResult

Lookup a single IoC observable.

Parameters:
  • ioc (str) – IoC Observable value
  • ioc_type (str, optional) – IoC Type, by default None (type will be inferred)
  • query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.
Returns:

The returned results.

Return type:

LookupResult

lookup_iocs(data: Union[pandas.core.frame.DataFrame, Dict[str, str], Iterable[str]], obs_col: str = None, ioc_type_col: str = None, query_type: str = None, **kwargs) → pandas.core.frame.DataFrame

Lookup collection of IoC observables.

Parameters:
  • data (Union[pd.DataFrame, Dict[str, str], Iterable[str]]) – Data input in one of three formats: (1) a pandas DataFrame (you must supply the column name in the obs_col parameter); (2) a Dict of observable, IoCType; (3) an Iterable of observables (IoCTypes will be inferred)
  • obs_col (str, optional) – DataFrame column to use for observables, by default None
  • ioc_type_col (str, optional) – DataFrame column to use for IoCTypes, by default None
  • query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.
Returns:

DataFrame of results.

Return type:

pd.DataFrame

parse_results(response: msticpy.sectools.tiproviders.ti_provider_base.LookupResult) → Tuple[bool, msticpy.sectools.tiproviders.ti_provider_base.TISeverity, Any]

Return the details of the response.

Parameters:response (LookupResult) – The returned data response
Returns:bool = positive or negative hit; TISeverity = enumeration of severity; Object with match details
Return type:Tuple[bool, TISeverity, Any]
resolve_ioc_type

Return IoCType determined by IoCExtract.

Parameters:observable (str) – IoC observable string
Returns:IoC Type (or unknown if type could not be determined)
Return type:str
supported_types

Return list of supported IoC types for this provider.

Returns:List of supported type names
Return type:List[str]
classmethod usage()

Print usage of provider.

class msticpy.sectools.tiproviders.ti_provider_base.TISeverity

Bases: enum.Enum

Threat intelligence report severity.

high = 2
information = 0
parse = <bound method TISeverity.parse of <enum 'TISeverity'>>
unknown = -1
warning = 1
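
A sketch of coercing a severity from an enum member, an int or a name, as set_severity and the parse method suggest. The member values match those listed above; Severity is a hypothetical stand-in, and falling back to unknown for unrecognized input is an assumption of this example, not confirmed library behavior.

```python
from enum import Enum


class Severity(Enum):
    """Stand-in enum with the same member values as documented."""

    unknown = -1
    information = 0
    warning = 1
    high = 2

    @classmethod
    def parse(cls, value):
        """Coerce a member, int value or name to a member."""
        if isinstance(value, cls):
            return value
        if isinstance(value, str) and value.lower() in cls.__members__:
            return cls[value.lower()]
        try:
            return cls(value)
        except ValueError:
            # Assumption: unrecognized input maps to unknown rather than raising.
            return cls.unknown


from_name = Severity.parse("High")
from_int = Severity.parse(1)
from_bad = Severity.parse("bogus")
```
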
msticpy.sectools.tiproviders.ti_provider_base.entropy(input_str: str) → float

Compute entropy of input string.
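
The one-line description does not state the formula. A common choice, assumed here, is Shannon entropy in bits per character, H = -Σ p(c) · log2 p(c), over the character frequencies of the string. shannon_entropy is a hypothetical name for this stand-alone sketch.

```python
import math
from collections import Counter


def shannon_entropy(input_str):
    """Return Shannon entropy of the string in bits per character."""
    if not input_str:
        return 0.0
    length = len(input_str)
    return -sum(
        (count / length) * math.log2(count / length)
        for count in Counter(input_str).values()
    )


uniform = shannon_entropy("abcd")   # four equally likely characters
constant = shannon_entropy("aaaa")  # a single repeated character
```

Under this definition, higher values indicate more random-looking strings, which is why entropy is sometimes used as a rough signal for generated domain names or encoded payloads.
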

msticpy.sectools.tiproviders.ti_provider_base.generate_items(data: Any, obs_col: Optional[str] = None, ioc_type_col: Optional[str] = None) → Iterable[Tuple[Optional[str], Optional[str]]]

Generate item pairs from different input types.

Parameters:
  • data (Any) – DataFrame, dictionary or iterable
  • obs_col (Optional[str]) – If data is a DataFrame, the column containing the observable value.
  • ioc_type_col (Optional[str]) – If data is a DataFrame, the column containing the observable type.
Returns:

An iterable of (observable, IoC type) tuples.

Return type:

Iterable[Tuple[Optional[str], Optional[str]]]
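
A sketch of normalizing different input types into (observable, type) pairs. It covers the mapping and iterable cases only and leaves out the DataFrame case so that it runs without pandas; iter_items is a hypothetical name, and treating a bare string as a single observable is an assumption of this example.

```python
from collections.abc import Mapping


def iter_items(data):
    """Yield (observable, ioc_type) pairs; ioc_type is None when not supplied."""
    if isinstance(data, str):
        # Assumption: a bare string is one observable, not an iterable of characters.
        yield data, None
    elif isinstance(data, Mapping):
        # Mapping of observable -> IoC type.
        yield from data.items()
    else:
        # Plain iterable of observables; the type is left to be inferred later.
        for observable in data:
            yield observable, None


from_dict = list(iter_items({"198.51.100.7": "ipv4", "example.com": "dns"}))
from_list = list(iter_items(["198.51.100.7", "example.com"]))
```
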

msticpy.sectools.tiproviders.ti_provider_base.get_schema_and_host(url: str, require_url_encoding: bool = False) → Tuple[Optional[str], Optional[str], Optional[str]]

Return URL scheme and host and cleaned URL.

Parameters:
  • url (str) – Input URL
  • require_url_encoding (bool) – Set to True if url needs encoding. Default is False.
Returns:

Tuple of URL, scheme, host

Return type:

Tuple[Optional[str], Optional[str], Optional[str]]
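
A sketch of splitting a URL into cleaned URL, scheme and host with the standard library. It is not the library's implementation: split_url is a hypothetical name, and the interpretation of require_url_encoding as percent-encoding with this particular set of safe characters is an assumption.

```python
from urllib.parse import quote, urlparse


def split_url(url, require_url_encoding=False):
    """Return (clean_url, scheme, host); scheme and host are None when absent."""
    # Assumption: "encoding" means percent-encoding characters outside this safe set.
    clean_url = quote(url, safe=":/?&=%#") if require_url_encoding else url
    try:
        parsed = urlparse(clean_url)
        return clean_url, parsed.scheme or None, parsed.hostname
    except ValueError:
        # urlparse rejects some malformed inputs, e.g. unbalanced IPv6 brackets.
        return None, None, None


encoded = split_url("https://Example.com/a b", require_url_encoding=True)
plain = split_url("not a url")
```

Note that urlparse lower-cases the host name, while the cleaned URL keeps its original casing.
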

msticpy.sectools.tiproviders.ti_provider_base.preprocess_observable(observable, ioc_type, require_url_encoding: bool = False) → msticpy.sectools.tiproviders.ti_provider_base.SanitizedObservable

Preprocesses and checks validity of observable against declared IoC type.

Parameters:
  • observable – the value of the IoC
  • ioc_type – the IoC type

msticpy.sectools.tiproviders.http_base module

HTTP TI Provider base.

Input can be a single IoC observable or a pandas DataFrame containing multiple observables. Processing may require an API key, and processing performance may be limited to a specific number of requests per minute for the account type that you have.

class msticpy.sectools.tiproviders.http_base.HttpProvider(**kwargs)

Bases: msticpy.sectools.tiproviders.ti_provider_base.TIProvider

HTTP TI provider base class.

Initialize a new instance of the class.

ioc_query_defs

Return current dictionary of IoC query/request definitions.

Returns:IoC query/request definitions keyed by IoCType
Return type:Dict[str, Any]
classmethod is_known_type(ioc_type: str) → bool

Return True if this is a known IoC Type.

Parameters:ioc_type (str) – IoCType string to test
Returns:True if known type.
Return type:bool
is_supported_type(ioc_type: Union[str, msticpy.sectools.iocextract.IoCType]) → bool

Return True if the passed type is supported.

Parameters:ioc_type (Union[str, IoCType]) – IoC type name or instance
Returns:True if supported.
Return type:bool
lookup_ioc

Lookup a single IoC observable.

Parameters:
  • ioc (str) – IoC observable
  • ioc_type (str, optional) – IocType, by default None (type will be inferred)
  • query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.
Returns:

The lookup result: result - Positive/Negative, details - Lookup Details (or status if failure), raw_result - Raw Response, reference - URL of IoC

Return type:

LookupResult

Raises:

NotImplementedError – If attempting to use an HTTP method or authentication protocol that is not supported.

Notes

Note: this method uses memoization (lru_cache) to cache results for a particular observable, to avoid repeated network calls for the same item.
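
The effect of lru_cache memoization can be seen in a small stand-alone sketch. cached_lookup is a hypothetical function in which appending to a list stands in for the network request; the cache size is arbitrary.

```python
from functools import lru_cache

CALLS = []  # records each time the "network" is actually hit


@lru_cache(maxsize=256)
def cached_lookup(ioc, ioc_type=None):
    """Pretend to query a provider; repeated arguments are served from the cache."""
    CALLS.append(ioc)  # stands in for an HTTP request
    return (ioc, ioc_type, False)  # immutable result, safe to share between callers


first = cached_lookup("198.51.100.7", "ipv4")
second = cached_lookup("198.51.100.7", "ipv4")  # served from the cache
info = cached_lookup.cache_info()
```

A consequence of this kind of caching is that a repeated lookup returns the earlier answer even if the provider's data has since changed; functools exposes cache_clear() on the decorated function to reset it.
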

lookup_iocs(data: Union[pandas.core.frame.DataFrame, Dict[str, str], Iterable[str]], obs_col: str = None, ioc_type_col: str = None, query_type: str = None, **kwargs) → pandas.core.frame.DataFrame

Lookup collection of IoC observables.

Parameters:
  • data (Union[pd.DataFrame, Dict[str, str], Iterable[str]]) – Data input in one of three formats: (1) a pandas DataFrame (you must supply the column name in the obs_col parameter); (2) a Dict of observable, IoCType; (3) an Iterable of observables (IoCTypes will be inferred)
  • obs_col (str, optional) – DataFrame column to use for observables, by default None
  • ioc_type_col (str, optional) – DataFrame column to use for IoCTypes, by default None
  • query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.
Returns:

DataFrame of results.

Return type:

pd.DataFrame

parse_results(response: msticpy.sectools.tiproviders.ti_provider_base.LookupResult) → Tuple[bool, msticpy.sectools.tiproviders.ti_provider_base.TISeverity, Any]

Return the details of the response.

Parameters:response (LookupResult) – The returned data response
Returns:bool = positive or negative hit; TISeverity = enumeration of severity; Object with match details
Return type:Tuple[bool, TISeverity, Any]
resolve_ioc_type

Return IoCType determined by IoCExtract.

Parameters:observable (str) – IoC observable string
Returns:IoC Type (or unknown if type could not be determined)
Return type:str
supported_types

Return list of supported IoC types for this provider.

Returns:List of supported type names
Return type:List[str]
classmethod usage()

Print usage of provider.

class msticpy.sectools.tiproviders.http_base.IoCLookupParams(path: str = '', verb: str = 'GET', full_url: bool = False, headers: Dict[str, str] = NOTHING, params: Dict[str, str] = NOTHING, data: Dict[str, str] = NOTHING, auth_type: str = '', auth_str: List[str] = NOTHING, sub_type: str = '')

Bases: object

IoC HTTP Lookup Params definition.

Method generated by attrs for class IoCLookupParams.

msticpy.sectools.tiproviders.alienvault_otx module

AlienVault OTX Provider.

Input can be a single IoC observable or a pandas DataFrame containing multiple observables. Processing may require an API key, and processing performance may be limited to a specific number of requests per minute for the account type that you have.

class msticpy.sectools.tiproviders.alienvault_otx.OTX(**kwargs)

Bases: msticpy.sectools.tiproviders.http_base.HttpProvider

AlienVault OTX Lookup.

Set OTX specific settings.

ioc_query_defs

Return current dictionary of IoC query/request definitions.

Returns:IoC query/request definitions keyed by IoCType
Return type:Dict[str, Any]
classmethod is_known_type(ioc_type: str) → bool

Return True if this is a known IoC Type.

Parameters:ioc_type (str) – IoCType string to test
Returns:True if known type.
Return type:bool
is_supported_type(ioc_type: Union[str, msticpy.sectools.iocextract.IoCType]) → bool

Return True if the passed type is supported.

Parameters:ioc_type (Union[str, IoCType]) – IoC type name or instance
Returns:True if supported.
Return type:bool
lookup_ioc

Lookup a single IoC observable.

Parameters:
  • ioc (str) – IoC observable
  • ioc_type (str, optional) – IocType, by default None (type will be inferred)
  • query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.
Returns:

The lookup result: result - Positive/Negative, details - Lookup Details (or status if failure), raw_result - Raw Response, reference - URL of IoC

Return type:

LookupResult

Raises:

NotImplementedError – If attempting to use an HTTP method or authentication protocol that is not supported.

Notes

Note: this method uses memoization (lru_cache) to cache results for a particular observable, to avoid repeated network calls for the same item.

lookup_iocs(data: Union[pandas.core.frame.DataFrame, Dict[str, str], Iterable[str]], obs_col: str = None, ioc_type_col: str = None, query_type: str = None, **kwargs) → pandas.core.frame.DataFrame

Lookup collection of IoC observables.

Parameters:
  • data (Union[pd.DataFrame, Dict[str, str], Iterable[str]]) – Data input in one of three formats: (1) a pandas DataFrame (you must supply the column name in the obs_col parameter); (2) a Dict of observable, IoCType; (3) an Iterable of observables (IoCTypes will be inferred)
  • obs_col (str, optional) – DataFrame column to use for observables, by default None
  • ioc_type_col (str, optional) – DataFrame column to use for IoCTypes, by default None
  • query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.
Returns:

DataFrame of results.

Return type:

pd.DataFrame

parse_results(response: msticpy.sectools.tiproviders.ti_provider_base.LookupResult) → Tuple[bool, msticpy.sectools.tiproviders.ti_provider_base.TISeverity, Any]

Return the details of the response.

Parameters:response (LookupResult) – The returned data response
Returns:bool = positive or negative hit; TISeverity = enumeration of severity; Object with match details
Return type:Tuple[bool, TISeverity, Any]
resolve_ioc_type

Return IoCType determined by IoCExtract.

Parameters:observable (str) – IoC observable string
Returns:IoC Type (or unknown if type could not be determined)
Return type:str
supported_types

Return list of supported IoC types for this provider.

Returns:List of supported type names
Return type:List[str]
classmethod usage()

Print usage of provider.

msticpy.sectools.tiproviders.ibm_xforce module

IBM XForce Provider.

Input can be a single IoC observable or a pandas DataFrame containing multiple observables. Processing may require an API key, and processing performance may be limited to a specific number of requests per minute for the account type that you have.
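
When a service enforces a requests-per-minute quota, calling code can pace itself before each lookup. The following is a minimal sketch of a sliding-window throttle; the Throttle class, its parameters and the quota values are hypothetical and are not part of msticpy or the XForce API.

```python
import time


class Throttle:
    """Allow at most max_calls calls per period seconds (hypothetical helper)."""

    def __init__(self, max_calls, period=60.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock    # injectable so the helper can be tested without waiting
        self.sleep = sleep
        self.stamps = []      # times of the calls inside the current window

    def wait(self):
        """Block until another call is allowed, then record it."""
        now = self.clock()
        # Drop timestamps that have left the sliding window.
        self.stamps = [t for t in self.stamps if now - t < self.period]
        if len(self.stamps) >= self.max_calls:
            # Sleep until the oldest recorded call leaves the window.
            self.sleep(self.period - (now - self.stamps[0]))
            now = self.clock()
            self.stamps = [t for t in self.stamps if now - t < self.period]
        self.stamps.append(now)
```

A caller would create one Throttle per account (for example Throttle(4, period=60.0), an assumed quota) and call wait() immediately before each request.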

class msticpy.sectools.tiproviders.ibm_xforce.XForce(**kwargs)

Bases: msticpy.sectools.tiproviders.http_base.HttpProvider

IBM XForce Lookup.

Initialize a new instance of the class.

ioc_query_defs

Return current dictionary of IoC query/request definitions.

Returns:IoC query/request definitions keyed by IoCType
Return type:Dict[str, Any]
classmethod is_known_type(ioc_type: str) → bool

Return True if this is a known IoC Type.

Parameters:ioc_type (str) – IoCType string to test
Returns:True if known type.
Return type:bool
is_supported_type(ioc_type: Union[str, msticpy.sectools.iocextract.IoCType]) → bool

Return True if the passed type is supported.

Parameters:ioc_type (Union[str, IoCType]) – IoC type name or instance
Returns:True if supported.
Return type:bool
lookup_ioc

Lookup a single IoC observable.

Parameters:
  • ioc (str) – IoC observable
  • ioc_type (str, optional) – IoCType, by default None (type will be inferred)
  • query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.
Returns:

The lookup result: result - Positive/Negative, details - Lookup Details (or status if failure), raw_result - Raw Response, reference - URL of IoC

Return type:

LookupResult

Raises:

NotImplementedError – If attempting to use an HTTP method or authentication protocol that is not supported.

Notes

This method uses memoization (lru_cache) to cache results for a particular observable, to try to avoid repeated network calls for the same item.

lookup_iocs(data: Union[pandas.core.frame.DataFrame, Dict[str, str], Iterable[str]], obs_col: str = None, ioc_type_col: str = None, query_type: str = None, **kwargs) → pandas.core.frame.DataFrame

Lookup collection of IoC observables.

Parameters:
  • data (Union[pd.DataFrame, Dict[str, str], Iterable[str]]) – Data input in one of three formats: 1. Pandas DataFrame (you must supply the column name in the obs_col parameter); 2. Dict of observable, IoCType; 3. Iterable of observables (IoCTypes will be inferred)
  • obs_col (str, optional) – DataFrame column to use for observables, by default None
  • ioc_type_col (str, optional) – DataFrame column to use for IoCTypes, by default None
  • query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.
Returns:

DataFrame of results.

Return type:

pd.DataFrame
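
The three accepted input shapes for lookup_iocs can be reduced to one list of (observable, type) pairs before any lookups run. The sketch below shows that idea with the standard library only; normalize_observables is a hypothetical helper, and a list of row dicts stands in for the DataFrame case so that the example has no pandas dependency. It is not the provider's implementation.

```python
from collections.abc import Mapping


def normalize_observables(data, obs_col=None, ioc_type_col=None):
    """Return a list of (observable, ioc_type) pairs; ioc_type may be None."""
    if isinstance(data, Mapping):
        # Format 2: dict of observable -> IoCType
        return list(data.items())
    rows = list(data)
    if rows and isinstance(rows[0], Mapping):
        # Format 1 stand-in: table rows as dicts; the observable column is required
        if obs_col is None:
            raise ValueError("obs_col is required for tabular input")
        return [(row[obs_col], row.get(ioc_type_col)) for row in rows]
    # Format 3: plain iterable of observables; type is left to be inferred later
    return [(obs, None) for obs in rows]
```

A type of None in the output marks an observable whose IoCType still has to be inferred, which matches the "IoCTypes will be inferred" wording for the iterable case.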

parse_results(response: msticpy.sectools.tiproviders.ti_provider_base.LookupResult) → Tuple[bool, msticpy.sectools.tiproviders.ti_provider_base.TISeverity, Any]

Return the details of the response.

Parameters:response (LookupResult) – The returned data response
Returns:bool = positive or negative hit; TISeverity = enumeration of severity; Object with match details
Return type:Tuple[bool, TISeverity, Any]
resolve_ioc_type

Return IoCType determined by IoCExtract.

Parameters:observable (str) – IoC observable string
Returns:IoC Type (or unknown if type could not be determined)
Return type:str
supported_types

Return list of supported IoC types for this provider.

Returns:List of supported type names
Return type:List[str]
classmethod usage()

Print usage of provider.

msticpy.sectools.tiproviders.virustotal module

VirusTotal Provider.

Input can be a single IoC observable or a pandas DataFrame containing multiple observables. Processing may require an API key, and processing performance may be limited to a specific number of requests per minute for the account type that you have.

class msticpy.sectools.tiproviders.virustotal.VirusTotal(**kwargs)

Bases: msticpy.sectools.tiproviders.http_base.HttpProvider

VirusTotal Lookup.

Initialize a new instance of the class.

ioc_query_defs

Return current dictionary of IoC query/request definitions.

Returns:IoC query/request definitions keyed by IoCType
Return type:Dict[str, Any]
classmethod is_known_type(ioc_type: str) → bool

Return True if this is a known IoC Type.

Parameters:ioc_type (str) – IoCType string to test
Returns:True if known type.
Return type:bool
is_supported_type(ioc_type: Union[str, msticpy.sectools.iocextract.IoCType]) → bool

Return True if the passed type is supported.

Parameters:ioc_type (Union[str, IoCType]) – IoC type name or instance
Returns:True if supported.
Return type:bool
lookup_ioc

Lookup a single IoC observable.

Parameters:
  • ioc (str) – IoC observable
  • ioc_type (str, optional) – IoCType, by default None (type will be inferred)
  • query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.
Returns:

The lookup result: result - Positive/Negative, details - Lookup Details (or status if failure), raw_result - Raw Response, reference - URL of IoC

Return type:

LookupResult

Raises:

NotImplementedError – If attempting to use an HTTP method or authentication protocol that is not supported.

Notes

This method uses memoization (lru_cache) to cache results for a particular observable, to try to avoid repeated network calls for the same item.

lookup_iocs(data: Union[pandas.core.frame.DataFrame, Dict[str, str], Iterable[str]], obs_col: str = None, ioc_type_col: str = None, query_type: str = None, **kwargs) → pandas.core.frame.DataFrame

Lookup collection of IoC observables.

Parameters:
  • data (Union[pd.DataFrame, Dict[str, str], Iterable[str]]) – Data input in one of three formats: 1. Pandas DataFrame (you must supply the column name in the obs_col parameter); 2. Dict of observable, IoCType; 3. Iterable of observables (IoCTypes will be inferred)
  • obs_col (str, optional) – DataFrame column to use for observables, by default None
  • ioc_type_col (str, optional) – DataFrame column to use for IoCTypes, by default None
  • query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.
Returns:

DataFrame of results.

Return type:

pd.DataFrame

parse_results(response: msticpy.sectools.tiproviders.ti_provider_base.LookupResult) → Tuple[bool, msticpy.sectools.tiproviders.ti_provider_base.TISeverity, Any]

Return the details of the response.

Parameters:response (LookupResult) – The returned data response
Returns:bool = positive or negative hit; TISeverity = enumeration of severity; Object with match details
Return type:Tuple[bool, TISeverity, Any]
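
The three-part return value of parse_results (hit flag, severity, match details) can be sketched as follows. Everything here is a stand-in: Severity replaces TISeverity with assumed member names, FakeResponse replaces LookupResult with only two fields, and the "score" key and its threshold are invented for illustration. The real response schema for each provider is not described in this reference.

```python
from enum import Enum
from typing import Any, NamedTuple, Tuple


class Severity(Enum):
    """Stand-in for TISeverity; member names and values are assumptions."""
    information = 0
    warning = 1
    high = 2


class FakeResponse(NamedTuple):
    """Hypothetical, trimmed-down stand-in for LookupResult."""
    status: int
    raw_result: Any


def parse_results(response: FakeResponse) -> Tuple[bool, Severity, Any]:
    """Return (hit, severity, details) for a made-up response schema."""
    if response.status != 200 or not isinstance(response.raw_result, dict):
        # Failed or empty lookup: negative hit with a status message as details
        return False, Severity.information, "Not found."
    score = response.raw_result.get("score", 0)  # invented field
    severity = Severity.high if score >= 7 else Severity.warning
    return True, severity, {"score": score}
```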
resolve_ioc_type

Return IoCType determined by IoCExtract.

Parameters:observable (str) – IoC observable string
Returns:IoC Type (or unknown if type could not be determined)
Return type:str
supported_types

Return list of supported IoC types for this provider.

Returns:List of supported type names
Return type:List[str]
classmethod usage()

Print usage of provider.

msticpy.sectools.vtlookup module

Module for VTLookup class.

Wrapper class around the VirusTotal API. Input can be a single IoC observable or a pandas DataFrame containing multiple observables. Processing requires a VirusTotal account and API key, and processing performance is limited to the number of requests per minute for the account type that you have. Supported IoC types:

  • Filehash
  • URL
  • DNS Domain
  • IPv4 Address
class msticpy.sectools.vtlookup.DuplicateStatus(is_dup, status)

Bases: tuple

Create new instance of DuplicateStatus(is_dup, status)

count()

Return number of occurrences of value.

index()

Return first index of value.

Raises ValueError if the value is not present.

is_dup

Alias for field number 0

status

Alias for field number 1
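
The "Alias for field number" entries, together with count() and index(), are the standard surface of a collections.namedtuple. The sketch below builds a local replica of the documented shape rather than importing it from msticpy; the field values are example data only.

```python
from collections import namedtuple

# Local replica of the documented tuple shape (not imported from msticpy).
DuplicateStatus = namedtuple("DuplicateStatus", ["is_dup", "status"])

dup = DuplicateStatus(is_dup=True, status="already seen")  # example values

is_dup, status = dup            # unpacks like a plain tuple
by_position = (dup[0], dup[1])  # field number 0 and field number 1
by_name = (dup.is_dup, dup.status)
```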

class msticpy.sectools.vtlookup.VTLookup(vtkey: str, verbosity: int = 1)

Bases: object

VTLookup: VirusTotal lookup of IoC reports.

Main methods and attributes are:

  • lookup_iocs() - accepts input of multiple IoCs in a Pandas DataFrame
  • lookup_ioc() - looks up a single IoC observable
  • supported_ioc_types - a list of valid target types
  • ioc_vt_type_mapping - a dictionary of mappings to recognized VT types. Types mapped to None will not be submitted to VT.

For URLs, a full HTTP request can be submitted; the query string and fragments will be dropped before submitting. For files, MD5, SHA1 and SHA256 hashes are supported. For IP addresses, only dotted IPv4 addresses are supported.
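
The three supported file hash types can be told apart by the length of their hexadecimal digest (32, 40 and 64 characters). The following sketch shows one way to pre-check a value before submitting it; classify_hash is a hypothetical helper, and how VTLookup itself recognizes hashes is not shown in this reference.

```python
import re

# Hex digest lengths for the three hash types named above.
_HASH_LENGTHS = {32: "md5", 40: "sha1", 64: "sha256"}


def classify_hash(value: str):
    """Return 'md5', 'sha1' or 'sha256', or None if value is not such a digest."""
    if not re.fullmatch(r"[0-9a-fA-F]+", value):
        return None  # empty or contains non-hex characters
    return _HASH_LENGTHS.get(len(value))
```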

Create a new instance of VTLookup class.

Parameters:
  • vtkey (str) – VirusTotal API key
  • verbosity (int, optional) –
    The level of detail of reporting
    0 = no reporting, 1 = minimal reporting (default), 2 = verbose reporting
ioc_vt_type_mapping

Return mapping between internal and VirusTotal IoC type names.

Returns:Mapping between internal and VirusTotal IoC type names.
Return type:Mapping[str, str]
lookup_ioc(observable: str, ioc_type: str, output: str = 'dict') → Any

Look up a single IoC observable.

Parameters:
  • observable (str) – The observable value
  • ioc_type (str) – The IoC Type (see ‘supported_ioc_types’ attribute)
  • output (str, optional) – Output results as a dictionary (or list of dicts). If output is any other value, the result will be returned in a Pandas DataFrame (the default is ‘dict’)
Returns:

  • list{dict} (if output == ‘dict’)
  • pd.DataFrame (otherwise)

Raises:

KeyError – Unknown ioc_type

lookup_iocs(data: pandas.core.frame.DataFrame, src_col: str = 'Observable', type_col: str = 'IoCType', src_index_col: str = 'SourceIndex', **kwargs) → pandas.core.frame.DataFrame

Retrieve results for IoC observables in the source dataframe.

Parameters:
  • data (pd.DataFrame) – Dataframe containing the observables to search for
  • src_col (str, optional) – The column name that contains the observable data (one item per row) (the default is ‘Observable’)
  • type_col (str, optional) – The column name containing the observable type (the default is ‘IoCType’)
  • src_index_col (str, optional) – The name of the column to use as source index. If not specified this defaults to ‘SourceIndex’. If this (or the supplied value) is not in the source dataframe, the index of the source dataframe will be used. This is retained in the output so that you can join the results back to the original data. (the default is ‘SourceIndex’)
Other Parameters:

  • key/value pairs of additional mappings to supported IoC type names, e.g. ipv4=‘ipaddress’, url=‘httprequest’. This allows you to specify custom mappings when the source data is tagged with different names.
Returns:

Combined results of local pre-processing and VirusTotal Lookups

Return type:

pd.DataFrame

Raises:

KeyError – Unknown ioc_type
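
The custom type mappings and the KeyError for unknown types can be sketched together. In this illustration the keyword arguments follow the documented form supported_name=‘source_name’ (for example ipv4=‘ipaddress’). The helper normalize_types and the SUPPORTED_TYPES set are hypothetical; the actual list of names is whatever the supported_ioc_types attribute returns.

```python
# Assumed set of internal type names; consult supported_ioc_types for the real list.
SUPPORTED_TYPES = {"ipv4", "dns", "url", "md5_hash", "sha1_hash", "sha256_hash"}


def normalize_types(source_types, **type_aliases):
    """Translate source type names to supported names.

    type_aliases are given as supported_name='source_name',
    for example ipv4='ipaddress'.
    """
    # Invert the keyword arguments: source name -> supported name
    alias_to_supported = {alias: name for name, alias in type_aliases.items()}
    resolved = []
    for src in source_types:
        name = alias_to_supported.get(src, src)
        if name not in SUPPORTED_TYPES:
            raise KeyError(f"Unknown ioc_type: {src}")
        resolved.append(name)
    return resolved
```

With this helper, normalize_types(["ipaddress", "dns"], ipv4="ipaddress") yields the supported names, while an unmapped tag such as "hostname" raises KeyError.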

Notes

See supported_ioc_types attribute for a list of valid target types. Not all of these types are supported by VirusTotal. See ioc_vt_type_mapping for current mappings. Types mapped to None will not be submitted to VT.

For URLs, a full HTTP request can be submitted; the query string and fragments will be dropped before submitting. Other supported protocols are ftp, telnet, ldap and file. For files, MD5, SHA1 and SHA256 hashes are supported. For IP addresses, only dotted IPv4 addresses are supported.
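
Dropping the query string and fragment from a URL can be done with the standard library's urllib.parse. This is a minimal sketch of that step only; strip_query_and_fragment is a hypothetical helper, and the example URL is invented. It does not reproduce VTLookup's own pre-processing.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query_and_fragment(url: str) -> str:
    """Return url with its query string and fragment removed."""
    parts = urlsplit(url)
    # Rebuild from scheme, host and path; leave query and fragment empty.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


cleaned = strip_query_and_fragment("http://example.com/a/b?x=1&y=2#frag")
```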

supported_ioc_types

Return list of supported IoC type internal names.

Returns:List of supported IoC type internal names.
Return type:List[str]
supported_vt_types

Return list of VirusTotal supported IoC type names.

Returns:List of VirusTotal supported IoC type names.
Return type:List[str]
class msticpy.sectools.vtlookup.VTParams(api_type, batch_size, batch_delimiter, http_verb, api_var_name, headers)

Bases: tuple

Create new instance of VTParams(api_type, batch_size, batch_delimiter, http_verb, api_var_name, headers)

api_type

Alias for field number 0

api_var_name

Alias for field number 4

batch_delimiter

Alias for field number 2

batch_size

Alias for field number 1

count()

Return number of occurrences of value.

headers

Alias for field number 5

http_verb

Alias for field number 3

index()

Return first index of value.

Raises ValueError if the value is not present.

msticpy.sectools.vtlookupv3 module

VirusTotal v3 API.

class msticpy.sectools.vtlookupv3.ColumnNames

Bases: enum.Enum

Column name enum for DataFrame output.

DETECTIONS = 'detections'
ID = 'id'
RELATIONSHIP_TYPE = 'relationship_type'
SCANS = 'scans'
SOURCE = 'source'
SOURCE_TYPE = 'source_type'
TARGET = 'target'
TARGET_TYPE = 'target_type'
TYPE = 'type'
exception msticpy.sectools.vtlookupv3.MsticpyVTGraphSaveGraphError

Bases: Exception

Could not save VT Graph.

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

exception msticpy.sectools.vtlookupv3.MsticpyVTNoDataError

Bases: Exception

No data returned from VT API.

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

class msticpy.sectools.vtlookupv3.VTEntityType

Bases: enum.Enum

VTEntityType: Enum class for VirusTotal entity types.

DOMAIN = 'domain'
FILE = 'file'
IP_ADDRESS = 'ip_address'
URL = 'url'
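
Because the entity types are an Enum with string values, a type string supplied by a caller can be validated by value lookup. The sketch below uses a local replica of the members listed above. check_vt_type is a hypothetical helper; converting the failure to KeyError mirrors the "KeyError – Unknown vt_type" documented for lookup_ioc, but how VTLookupV3 validates types internally is not shown in this reference.

```python
from enum import Enum


class VTEntityType(Enum):
    """Local replica of the documented members (not imported from msticpy)."""
    DOMAIN = "domain"
    FILE = "file"
    IP_ADDRESS = "ip_address"
    URL = "url"


def check_vt_type(vt_type: str) -> VTEntityType:
    """Return the matching member, raising KeyError for an unknown type."""
    try:
        return VTEntityType(vt_type)  # lookup by value, e.g. "ip_address"
    except ValueError as err:
        raise KeyError(f"Unknown vt_type: {vt_type}") from err


valid_names = [member.value for member in VTEntityType]
```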
class msticpy.sectools.vtlookupv3.VTLookupV3(vt_key: Optional[str] = None)

Bases: object

VTLookupV3: VirusTotal lookup of IoC reports.

Create a new instance of VTLookupV3 class.

Parameters:vt_key (str, optional) – VirusTotal API key, if not supplied, this is read from user configuration.
create_vt_graph(relationship_dfs: List[pandas.core.frame.DataFrame], name: str, private: bool) → str

Create a VirusTotal Graph with a set of Relationship DataFrames.

Parameters:
  • relationship_dfs – List of Relationship DataFrames
  • name – New graph name
  • private – Indicates if the Graph is private or not.
Returns:

Graph ID

Return type:

str

Raises:
  • ValueError – when private is not indicated.
  • ValueError – when there are no relationship DataFrames.
  • MsticpyVTGraphSaveGraphError – when the graph cannot be saved.
get_object(vt_id: str, vt_type: str) → pandas.core.frame.DataFrame

Return the full VT object as a DataFrame.

Parameters:
  • vt_id (str) – The ID of the object
  • vt_type (str) – The type of object to query.
Returns:

Single column DataFrame with attribute names as index and values as data column.

Return type:

pd.DataFrame

Raises:
lookup_ioc(observable: str, vt_type: str) → pandas.core.frame.DataFrame

Look up a single IoC observable.

Parameters:
  • observable (str) – The observable value
  • vt_type (str) – The VT entity type
Returns:

Attributes DataFrame with the properties of the entity

Return type:

pd.DataFrame

Raises:

KeyError – Unknown vt_type

lookup_ioc_relationships(observable: str, vt_type: str, relationship: str, limit: int = None) → pandas.core.frame.DataFrame

Look up the relationships of a single IoC observable.

Parameters:
  • observable (str) – The observable value
  • vt_type (str) – The VT entity type
  • relationship (str) – Desired relationship
  • limit (int, optional) – Maximum number of relationships to return, by default None
Returns:

Relationship DataFrame with the relationships of the entity

Return type:

pd.DataFrame

lookup_iocs(observables_df: pandas.core.frame.DataFrame, observable_column: str = 'target', observable_type_column: str = 'target_type')

Look up multiple IoC observables.

Parameters:
  • observables_df (pd.DataFrame) – A Pandas DataFrame, where each row is an observable
  • observable_column – ID column of each observable
  • observable_type_column – Type column of each observable
Returns:

Attributes DataFrame with the properties of the entities

Return type:

pd.DataFrame

lookup_iocs_relationships(observables_df: pandas.core.frame.DataFrame, relationship: str, observable_column: str = 'target', observable_type_column: str = 'target_type', limit: int = None) → pandas.core.frame.DataFrame

Look up the relationships of multiple IoC observables.

Parameters:
  • observables_df (pd.DataFrame) – A Pandas DataFrame, where each row is an observable
  • relationship (str) – Desired relationship
  • observable_column – ID column of each observable
  • observable_type_column – Type column of each observable.
  • limit (int, optional) – Maximum number of relationships to return, by default None
Returns:

Relationship DataFrame with the relationships of each observable

Return type:

pd.DataFrame

static render_vt_graph(graph_id: str, width: int = 800, height: int = 600)

Display a VTGraph in a Jupyter Notebook.

Parameters:
  • graph_id – Graph ID
  • width – Graph width.
  • height – Graph height
supported_vt_types

Return list of VirusTotal supported IoC type names.

Returns:List of VirusTotal supported IoC type names.
Return type:List[str]
class msticpy.sectools.vtlookupv3.VTObjectProperties

Bases: enum.Enum

Enum for VT Object properties.

ATTRIBUTES = 'attributes'
LAST_ANALYSIS_STATS = 'last_analysis_stats'
MALICIOUS = 'malicious'
RELATIONSHIPS = 'relationship'