msticpy.sectools package
msticpy.sectools.auditdextract module
Auditd extractor.
Module to load and decode Linux audit logs. It collapses messages sharing the same message ID into single events, decodes hex-encoded data fields and performs some event-specific formatting and normalization (e.g. for process start events it will re-assemble the process command line arguments into a single string). This is still a work-in-progress.
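The collapsing and decoding steps described above can be illustrated with a minimal standard-library sketch. This is not the module's implementation: the helper names (`collapse_events`, `decode_hex`), the sample log lines, and the rule that unquoted EXECVE `aN` arguments are hex-encoded are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Matches the "msg=audit(<timestamp>:<message id>)" part of an auditd line.
MSG_ID = re.compile(r"msg=audit\((?P<ts>\d+\.\d+):(?P<id>\d+)\)")
# Matches key=value pairs, where the value is quoted or a run of non-spaces.
FIELD = re.compile(r'(?P<key>\w+)=(?P<val>"[^"]*"|\S+)')
ARG_KEY = re.compile(r"a\d+")


def decode_hex(value):
    """Return the UTF-8 text for a hex string, or the input if it is not hex."""
    try:
        return bytes.fromhex(value).decode("utf-8")
    except ValueError:  # covers bad hex and undecodable bytes
        return value


def collapse_events(lines):
    """Group records by message ID: {message_id: {record_type: fields}}."""
    events = defaultdict(dict)
    for line in lines:
        id_match = MSG_ID.search(line)
        if not id_match:
            continue
        type_match = re.match(r"type=(\w+)", line)
        rec_type = type_match.group(1) if type_match else "UNKNOWN"
        fields = {}
        for field in FIELD.finditer(line[id_match.end():]):
            key, val = field.group("key"), field.group("val")
            if val.startswith('"'):
                val = val.strip('"')
            elif rec_type == "EXECVE" and ARG_KEY.fullmatch(key):
                # Assumption: unquoted EXECVE arguments are hex-encoded.
                val = decode_hex(val)
            fields[key] = val
        if rec_type == "EXECVE":
            # Re-assemble the arguments into a single command line string.
            argc = int(fields.get("argc", 0))
            fields["cmdline"] = " ".join(
                fields.get(f"a{i}", "") for i in range(argc)
            )
        events[id_match.group("id")][rec_type] = fields
    return dict(events)


SAMPLE = [
    'type=SYSCALL msg=audit(1538166234.123:4567): syscall=59 pid=1234 exe="/bin/cat"',
    'type=EXECVE msg=audit(1538166234.123:4567): argc=2 a0="cat" a1=2F746D702F6D792066696C65',
]
events = collapse_events(SAMPLE)
```

Both sample records share message ID 4567, so they end up under one event, and the hex-encoded second argument is decoded before the command line is joined.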
- msticpy.sectools.auditdextract.extract_events_to_df(data: pandas.core.frame.DataFrame, input_column: str = 'AuditdMessage', event_type: Optional[str] = None, verbose: bool = False) pandas.core.frame.DataFrame
Extract auditd raw messages into a dataframe.
- Parameters
data (pd.DataFrame) – The input dataframe with raw auditd data in a single string column
input_column (str, optional) – the input column name (the default is ‘AuditdMessage’)
event_type (str, optional) – the event type, if None, defaults to all (the default is None)
verbose (bool, optional) – Give feedback on stages of processing (the default is False)
- Returns
The resultant DataFrame
- Return type
pd.DataFrame
- msticpy.sectools.auditdextract.generate_process_tree(audit_data: pandas.core.frame.DataFrame, branch_depth: int = 4, processes: Optional[pandas.core.frame.DataFrame] = None) pandas.core.frame.DataFrame
Generate process tree data from auditd logs.
- Parameters
audit_data (pd.DataFrame) – The Audit data containing process creation events
branch_depth (int, optional) – The maximum depth of parent or child processes to extract from the data (The default is 4)
processes (pd.DataFrame, optional) – Dataframe of processes to generate tree for
- Returns
The formatted process tree data
- Return type
pd.DataFrame
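As a rough illustration of what a branch_depth limit means when walking a process tree, the sketch below follows parent and child links over plain dictionaries. It is not how generate_process_tree is implemented; the `pid`/`ppid` record keys and the `ancestors`/`descendants` helpers are hypothetical.

```python
from collections import defaultdict


def ancestors(processes, pid, branch_depth=4):
    """Return up to branch_depth parent PIDs, nearest parent first."""
    by_pid = {proc["pid"]: proc for proc in processes}
    chain = []
    current = by_pid.get(pid)
    while current is not None and len(chain) < branch_depth:
        parent = by_pid.get(current["ppid"])
        if parent is None:
            break
        chain.append(parent["pid"])
        current = parent
    return chain


def descendants(processes, pid, branch_depth=4):
    """Return child PIDs level by level, down to branch_depth levels."""
    children = defaultdict(list)
    for proc in processes:
        children[proc["ppid"]].append(proc["pid"])
    found, frontier = [], [pid]
    for _ in range(branch_depth):
        frontier = [child for parent in frontier for child in children[parent]]
        if not frontier:
            break
        found.extend(frontier)
    return found


# Hypothetical process creation records.
PROCESSES = [
    {"pid": 1, "ppid": 0},
    {"pid": 100, "ppid": 1},
    {"pid": 200, "ppid": 100},
    {"pid": 300, "ppid": 200},
    {"pid": 301, "ppid": 200},
]
```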
- msticpy.sectools.auditdextract.get_event_subset(data: pandas.core.frame.DataFrame, event_type: str) pandas.core.frame.DataFrame
Return a subset of the events matching type event_type.
- Parameters
data (pd.DataFrame) – The input data
event_type (str) – The event type to select
- Returns
The subset of the data where data[‘EventType’] == event_type
- Return type
pd.DataFrame
- msticpy.sectools.auditdextract.read_from_file(filepath: str, event_type: Optional[str] = None, verbose: bool = False, dummy_sep: str = '\t') pandas.core.frame.DataFrame
Extract Audit events from a log file.
- Parameters
filepath (str) – path to the input file
event_type (str, optional) – The type of event to extract if only a subset required. (the default is None, which processes all types)
verbose (bool, optional) – If true more progress messages are output (the default is False)
dummy_sep (str, optional) – Separator to use for reading the ‘csv’ file (default is tab - ‘\t’)
- Returns
The output DataFrame
- Return type
pd.DataFrame
Notes
The dummy_sep parameter should be a character that does not occur in an input line. This function uses pandas read_csv to read the audit lines into a single column. Using a separator that does appear in the input (e.g. space or comma) will cause data to be parsed into multiple columns and anything after the first separator in a line will be lost.
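The effect of the separator choice can be shown with a small sketch. It uses the standard-library csv module as a stand-in for pandas read_csv, so it only approximates the behaviour described in the note; the `first_column` helper and the sample line are assumptions for illustration.

```python
import csv
import io

AUDIT_TEXT = 'type=EXECVE msg=audit(1538166234.123:4567): argc=2 a0="ls" a1="-l"\n'


def first_column(text, sep):
    """Split each line on sep and keep only the first column."""
    reader = csv.reader(io.StringIO(text), delimiter=sep, quoting=csv.QUOTE_NONE)
    return [row[0] for row in reader if row]


# A tab never occurs in the line, so the whole record stays in one column.
whole_lines = first_column(AUDIT_TEXT, "\t")
# A space does occur, so everything after the first space is dropped.
truncated = first_column(AUDIT_TEXT, " ")
```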
- msticpy.sectools.auditdextract.unpack_auditd(audit_str: List[Dict[str, str]]) Mapping[str, Mapping[str, Any]]
Unpack an Audit message and return a dictionary of fields.
- Parameters
audit_str (str) – The auditd raw record
- Returns
The extracted message fields and values
- Return type
Mapping[str, Any]
msticpy.sectools.base64unpack module
base64_unpack.
The main function of this module is to decode and unpack strings that are obfuscated using base64 and/or certain compression algorithms such as gzip and zip.
It has the following functions:
unpack_items - this is the main entry point and takes either a string or a pandas dataframe (with specified column) as input. It returns a string with obfuscated parts replaced by decoded equivalents (unless the decoding results in an undecodable binary, in which case a placeholder is used).
Other helper functions may also be useful standalone:
get_items_from_gzip(binary): Return decompressed gzip content of byte string
get_items_from_zip(binary): Return dictionary of zip contents from byte string
get_items_from_tar(binary): Return dictionary of tar file contents
get_hashes(binary): Return md5, sha1 and sha256 hashes of input byte string
- class msticpy.sectools.base64unpack.B64ExtractAccessor(pandas_obj)
Bases:
object
Base64 Unpack pandas extension.
Initialize the extension.
- extract(column, **kwargs) pandas.core.frame.DataFrame
Base64 decode strings taken from a pandas dataframe.
- Parameters
data (pd.DataFrame) – dataframe containing column to decode
column (str) – Name of dataframe text column
trace (bool, optional) – Show additional status (the default is None)
utf16 (bool, optional) – Attempt to decode UTF16 byte strings
- Returns
Decoded string and additional metadata in dataframe
- Return type
pd.DataFrame
Notes
Items that decode to utf-8 or utf-16 strings will be returned as decoded strings replaced in the original string. If the encoded string is a known binary type it will identify the file type and return the hashes of the file. If any binary types are known archives (zip, tar, gzip) it will unpack the contents of the archive. For any binary it will return the decoded file as a byte array, and as a printable list of byte values.
The columns of the output DataFrame are:
decoded string: this is the input string with any decoded sections replaced by the results of the decoding
reference : this is an index that matches an index number in the decoded string (e.g. <<encoded binary type=pdf index=1.2’).
original_string : the string prior to decoding
file_type : the type of file if this could be determined
file_hashes : a dictionary of hashes (the md5, sha1 and sha256 hashes are broken out into separate columns)
input_bytes : the binary image as a byte array
decoded_string : printable form of the decoded string (either string or list of hex byte values)
encoding_type : utf-8, utf-16 or binary
md5, sha1, sha256 : the respective hashes of the binary
file_type, file_hashes, input_bytes, md5, sha1, sha256 will be null if this item is decoded to a string
src_index : the index of the source row in the input frame.
- class msticpy.sectools.base64unpack.BinaryRecord(reference, original_string, file_name, file_type, input_bytes, decoded_string, encoding_type, file_hashes, md5, sha1, sha256, printable_bytes)
Bases:
tuple
Create new instance of BinaryRecord(reference, original_string, file_name, file_type, input_bytes, decoded_string, encoding_type, file_hashes, md5, sha1, sha256, printable_bytes)
- count(value, /)
Return number of occurrences of value.
- property decoded_string
Alias for field number 5
- property encoding_type
Alias for field number 6
- property file_hashes
Alias for field number 7
- property file_name
Alias for field number 2
- property file_type
Alias for field number 3
- index(value, start=0, stop=9223372036854775807, /)
Return first index of value.
Raises ValueError if the value is not present.
- property input_bytes
Alias for field number 4
- property md5
Alias for field number 8
- property original_string
Alias for field number 1
- property printable_bytes
Alias for field number 11
- property reference
Alias for field number 0
- property sha1
Alias for field number 9
- property sha256
Alias for field number 10
- msticpy.sectools.base64unpack.get_hashes(binary: bytes) Dict[str, str]
Return md5, sha1 and sha256 hashes of input byte string.
- Parameters
binary (bytes) – byte string of item to be hashed
- Returns
dictionary of hash algorithm + hash value
- Return type
Dict[str, str]
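A dictionary of this kind can be produced with the standard hashlib module. The sketch below is an illustration only; the function name `hash_bytes` and the dictionary keys are assumptions, not necessarily those used by get_hashes.

```python
import hashlib


def hash_bytes(binary):
    """Return hex digests of the input keyed by algorithm name."""
    return {
        "md5": hashlib.md5(binary).hexdigest(),
        "sha1": hashlib.sha1(binary).hexdigest(),
        "sha256": hashlib.sha256(binary).hexdigest(),
    }


empty_hashes = hash_bytes(b"")
```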
- msticpy.sectools.base64unpack.get_items_from_gzip(binary: bytes) Tuple[str, Dict[str, bytes]]
Return decompressed gzip contents.
- Parameters
binary (bytes) – byte array of gz file
- Returns
File type + decompressed file
- Return type
Tuple[str, bytes]
- msticpy.sectools.base64unpack.get_items_from_tar(binary: bytes) Tuple[str, Dict[str, bytes]]
Return dictionary of tar file contents.
- Parameters
binary (bytes) – byte array of tar file
- Returns
Filetype + dictionary of file name + file content
- Return type
Tuple[str, Dict[str, bytes]]
- msticpy.sectools.base64unpack.get_items_from_zip(binary: bytes) Tuple[str, Dict[str, bytes]]
Return dictionary of zip contents.
- Parameters
binary (bytes) – byte array of zip file
- Returns
Filetype + dictionary of file name + file content
- Return type
Tuple[str, Dict[str, bytes]]
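Unpacking archive content from an in-memory byte string, as the gzip, tar and zip helpers above do, can be sketched with the standard library. The helpers `gunzip` and `zip_contents` below are hypothetical and return only the content, without the file type element that the documented functions include in their return tuples.

```python
import gzip
import io
import zipfile


def gunzip(binary):
    """Return the decompressed bytes of a gzip byte string."""
    return gzip.decompress(binary)


def zip_contents(binary):
    """Return {file name: file content} for a zip archive held in memory."""
    with zipfile.ZipFile(io.BytesIO(binary)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}


# Build small sample archives in memory to exercise the helpers.
_buffer = io.BytesIO()
with zipfile.ZipFile(_buffer, "w") as _archive:
    _archive.writestr("a.txt", b"alpha")
    _archive.writestr("b.txt", b"beta")
SAMPLE_ZIP = _buffer.getvalue()
SAMPLE_GZIP = gzip.compress(b"payload")
```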
- msticpy.sectools.base64unpack.unpack(input_string: str, trace: bool = False, utf16: bool = False) Tuple[str, Optional[List[msticpy.sectools.base64unpack.BinaryRecord]]]
Base64 decode an input string.
- Parameters
input_string (str) – single string to decode
trace (bool, optional) – Show additional status (the default is False)
utf16 (bool, optional) – Attempt to decode UTF16 byte strings
- Returns
Decoded string and additional metadata
- Return type
Tuple[str, Optional[List[BinaryRecord]]]
Notes
Items that decode to utf-8 or utf-16 strings will be returned as decoded strings replaced in the original string. If the encoded string is a known binary type it will identify the file type and return the hashes of the file. If any binary types are known archives (zip, tar, gzip) it will unpack the contents of the archive. For any binary it will return the decoded file as a byte array, and as a printable list of byte values. If the input is a string the function returns:
decoded string: this is the input string with any decoded sections replaced by the results of the decoding
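The idea of replacing encoded sections in place can be sketched as follows. This is a simplified illustration, not the module's algorithm: the candidate regex, the 16-character minimum length, and the `decode_embedded` helper are assumptions, and it handles only UTF-8 text (no binaries, archives or UTF-16).

```python
import base64
import binascii
import re

# Assumption: treat runs of 16+ base64 characters (plus padding) as candidates.
B64_CANDIDATE = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def decode_embedded(text):
    """Replace base64 sections that decode to UTF-8 text; keep the rest."""

    def _replace(match):
        try:
            raw = base64.b64decode(match.group(0), validate=True)
            return raw.decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            # Not valid base64, or not text: leave the original in place.
            return match.group(0)

    return B64_CANDIDATE.sub(_replace, text)


encoded = base64.b64encode(b"echo hello world").decode("ascii")
decoded_line = decode_embedded(f"powershell -enc {encoded}")
```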
- msticpy.sectools.base64unpack.unpack_df(data: pandas.core.frame.DataFrame, column: str, trace: bool = False, utf16: bool = False) pandas.core.frame.DataFrame
Base64 decode strings taken from a pandas dataframe.
- Parameters
data (pd.DataFrame) – dataframe containing column to decode
column (str) – Name of dataframe text column
trace (bool, optional) – Show additional status (the default is False)
utf16 (bool, optional) – Attempt to decode UTF16 byte strings
- Returns
Decoded string and additional metadata in dataframe
- Return type
pd.DataFrame
Notes
Items that decode to utf-8 or utf-16 strings will be returned as decoded strings replaced in the original string. If the encoded string is a known binary type it will identify the file type and return the hashes of the file. If any binary types are known archives (zip, tar, gzip) it will unpack the contents of the archive. For any binary it will return the decoded file as a byte array, and as a printable list of byte values.
The columns of the output DataFrame are:
decoded string: this is the input string with any decoded sections replaced by the results of the decoding
reference : this is an index that matches an index number in the decoded string (e.g. <<encoded binary type=pdf index=1.2’).
original_string : the string prior to decoding
file_type : the type of file if this could be determined
file_hashes : a dictionary of hashes (the md5, sha1 and sha256 hashes are broken out into separate columns)
input_bytes : the binary image as a byte array
decoded_string : printable form of the decoded string (either string or list of hex byte values)
encoding_type : utf-8, utf-16 or binary
md5, sha1, sha256 : the respective hashes of the binary
file_type, file_hashes, input_bytes, md5, sha1, sha256 will be null if this item is decoded to a string
src_index : the index of the source row in the input frame.
- msticpy.sectools.base64unpack.unpack_items(input_string: Optional[str] = None, data: Optional[pandas.core.frame.DataFrame] = None, column: Optional[str] = None, trace: bool = False, utf16: bool = False) Any
Base64 decode an input string or strings taken from a pandas dataframe.
- Parameters
input_string (str, optional) – single string to decode (the default is None)
data (pd.DataFrame, optional) – dataframe containing column to decode (the default is None)
column (str, optional) – Name of dataframe text column (the default is None)
trace (bool, optional) – Show additional status (the default is False)
utf16 (bool, optional) – Attempt to decode UTF16 byte strings
- Returns
Tuple[str, pd.DataFrame] (if input_string) – Decoded string and additional metadata
pd.DataFrame – Decoded string and additional metadata in dataframe
Notes
If the input is a dataframe you must supply the name of the column to use.
Items that decode to utf-8 or utf-16 strings will be returned as decoded strings replaced in the original string. If the encoded string is a known binary type it will identify the file type and return the hashes of the file. If any binary types are known archives (zip, tar, gzip) it will unpack the contents of the archive. For any binary it will return the decoded file as a byte array, and as a printable list of byte values. If the input is a string the function returns:
decoded string: this is the input string with any decoded sections replaced by the results of the decoding
It also returns the data as a Pandas DataFrame with the following columns:
reference : this is an index that matches an index number in the returned string (e.g. <<encoded binary type=pdf index=1.2’).
original_string : the string prior to decoding
file_type : the type of file if this could be determined
file_hashes : a dictionary of hashes (the md5, sha1 and sha256 hashes are broken out into separate columns)
input_bytes : the binary image as a byte array
decoded_string : printable form of the decoded string (either string or list of hex byte values)
encoding_type : utf-8, utf-16 or binary
md5, sha1, sha256 : the respective hashes of the binary
file_type, file_hashes, input_bytes, md5, sha1, sha256 will be null if this item is decoded to a string
If the input is a dataframe the output dataframe will also include the following column:
src_index : the index of the source row in the input frame. This allows you to re-join the output data to the input data.
msticpy.sectools.cmd_line module
cmd_line - Syslog Command processing module.
Contains a series of functions required to correctly collect, parse and visualise Linux syslog data.
Designed to support standard Linux syslog for investigations where auditd is not available.
- msticpy.sectools.cmd_line.cmd_speed(cmd_events: pandas.core.frame.DataFrame, cmd_field: str, time: int = 5, events: int = 10) list
Detect patterns of cmd_line activity whose speed of execution may be suspicious.
- Parameters
cmd_events (pd.DataFrame) – A DataFrame of all sudo events to check.
cmd_field (str) – The column of the event data that contains command line activity
time (int, optional) – Time window in seconds in which to evaluate speed of execution against (Defaults to 5)
events (int, optional) – Number of syslog command execution events in which to evaluate speed of execution against (Defaults to 10)
- Returns
risky suspicious_actions – A list of commands that match a risky pattern
- Return type
list
- Raises
AttributeError – If cmd_field is not in supplied data set or TimeGenerated is not in datetime format
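One way to read the time and events parameters is as a sliding window: flag commands when at least `events` of them fall within `time` seconds. The sketch below implements that reading over plain tuples. It is an assumption about the intent, not the function's actual logic, and `fast_bursts` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def fast_bursts(events, window_seconds=5, min_events=10):
    """Return commands that sit in any window holding min_events or more.

    events is a list of (timestamp, command) tuples.
    """
    ordered = sorted(events, key=lambda event: event[0])
    window = timedelta(seconds=window_seconds)
    flagged = set()
    start = 0
    for end in range(len(ordered)):
        # Shrink the window from the left until it spans at most `window`.
        while ordered[end][0] - ordered[start][0] > window:
            start += 1
        if end - start + 1 >= min_events:
            flagged.update(range(start, end + 1))
    return [ordered[i][1] for i in sorted(flagged)]


BASE = datetime(2021, 1, 1, 12, 0, 0)
# Ten commands within 3.6 seconds, then one isolated command later.
SAMPLE_EVENTS = [
    (BASE + timedelta(seconds=0.4 * i), f"cmd{i}") for i in range(10)
] + [(BASE + timedelta(minutes=10), "late")]
```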
- msticpy.sectools.cmd_line.risky_cmd_line(events: pandas.core.frame.DataFrame, log_type: str, detection_rules: str = '/home/docs/checkouts/readthedocs.org/user_builds/msticpy/envs/v1.5.0/lib/python3.7/site-packages/msticpy/resources/cmd_line_rules.json', cmd_field: str = 'Command') dict
Detect patterns of risky commands in syslog messages.
Risky patterns are defined in a json format file.
- Parameters
events (pd.DataFrame) – A DataFrame of all syslog events potentially containing risky command line activity.
log_type (str) – The log type of the data included in events. Must correspond to a detection type in detection_rules file.
detection_rules (str, optional) – Path to json file containing patterns of risky activity to detect. (Defaults to msticpy/resources/cmd_line_rules.json)
cmd_field (str, optional) – The column in the events dataset that contains the command lines to be analysed. (Defaults to “Command”)
- Returns
risky actions – A dictionary of commands that match a risky pattern
- Return type
dict
- Raises
MsticpyException – The provided dataset does not contain the cmd_field field
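Pattern matching against JSON-defined rules can be sketched as below. The rule file layout shown here (log type, then rule name mapped to a regular expression) is hypothetical and may not match the structure of cmd_line_rules.json; the rule names and the `find_risky` helper are likewise invented for illustration.

```python
import json
import re

# Hypothetical rules document: {log type: {rule name: regex}}.
RULES_JSON = r'''
{
  "Syslog": {
    "download_and_run": "(curl|wget)\\s.*\\|\\s*(ba)?sh",
    "world_writable": "chmod\\s+777"
  }
}
'''


def find_risky(commands, log_type, rules):
    """Return {command: [names of matching rules]} for one log type."""
    compiled = {
        name: re.compile(pattern) for name, pattern in rules[log_type].items()
    }
    hits = {}
    for command in commands:
        matched = [name for name, regex in compiled.items() if regex.search(command)]
        if matched:
            hits[command] = matched
    return hits


RULES = json.loads(RULES_JSON)
COMMANDS = ["curl http://example.test/a.sh | sh", "ls -l", "chmod 777 /tmp/x"]
risky = find_risky(COMMANDS, "Syslog", RULES)
```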
msticpy.sectools.geoip module
Geoip Lookup module using IPStack and Maxmind GeoLite2.
Geographic location lookup for IP addresses. This module has two classes for different services:
GeoLiteLookup - Maxmind Geolite (see https://www.maxmind.com)
IPStackLookup - IPStack (see https://ipstack.com)
Both services offer a free tier for non-commercial use. However, a paid tier will normally get you more accuracy, more detail and a higher throughput rate. Maxmind geolite uses a downloadable database, while IPStack is an online lookup (API key required).
- exception msticpy.sectools.geoip.GeoIPDatabaseException
Bases:
Exception
Exception when GeoIP database cannot be found.
- args
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- class msticpy.sectools.geoip.GeoIpLookup
Bases:
object
Abstract base class for GeoIP Lookup classes.
See also
IPStackLookup
IPStack GeoIP Implementation
GeoLiteLookup
MaxMind GeoIP Implementation
Initialize instance of GeoIpLookup class.
- df_lookup_ip(data: pandas.core.frame.DataFrame, column: str) pandas.core.frame.DataFrame
Lookup Geolocation data from a pandas Dataframe.
- Parameters
data (pd.DataFrame) – pandas dataframe containing IpAddress column
column (str) – the name of the dataframe column to use as a source
- Returns
Copy of original dataframe with IP Location information columns appended (where a location lookup was successful)
- Return type
pd.DataFrame
- abstract lookup_ip(ip_address: Optional[str] = None, ip_addr_list: Optional[collections.abc.Iterable] = None, ip_entity: Optional[msticpy.datamodel.entities.ip_address.IpAddress] = None) Tuple[List[Any], List[msticpy.datamodel.entities.ip_address.IpAddress]]
Lookup IP location abstract method.
- Parameters
ip_address (str, optional) – a single address to look up (the default is None)
ip_addr_list (Iterable, optional) – a collection of addresses to lookup (the default is None)
ip_entity (IpAddress, optional) – an IpAddress entity (the default is None) - any existing data in the Location property will be overwritten
- Returns
raw geolocation results and same results as IpAddress entities with populated Location property.
- Return type
Tuple[List[Any], List[IpAddress]]
- lookup_ips(data: pandas.core.frame.DataFrame, column: str) pandas.core.frame.DataFrame
Lookup Geolocation data from a pandas Dataframe.
- Parameters
data (pd.DataFrame) – pandas dataframe containing IpAddress column
column (str) – the name of the dataframe column to use as a source
- Returns
IpLookup results as DataFrame.
- Return type
pd.DataFrame
- class msticpy.sectools.geoip.GeoLiteLookup(api_key: Optional[str] = None, db_folder: Optional[str] = None, force_update: bool = False, auto_update: bool = True, debug: bool = False)
Bases:
msticpy.sectools.geoip.GeoIpLookup
GeoIP Lookup using MaxMindDB database.
See also
GeoIpLookup
Abstract base class
IPStackLookup
IPStack GeoIP Implementation
Return new instance of GeoLiteLookup class.
- Parameters
api_key (str, optional) – API key from MaxMind (the default is None, which uses the configuration value from msticpyconfig.yaml). Read more about GeoLite2: https://dev.maxmind.com/geoip/geoip2/geolite2/ Sign up for a MaxMind account: https://www.maxmind.com/en/geolite2/signup Set your password and create a license key: https://www.maxmind.com/en/accounts/current/license-key
db_folder (str, optional) – Provide absolute path to the folder containing MMDB file (e.g. ‘/usr/home’ or ‘C:/maxmind’). If no path is provided, the database is downloaded to .msticpy/GeoLite2 under the user's home directory.
force_update (bool, optional) – If True, a new download request will be initiated (the default is False).
auto_update (bool, optional) – If True, a new download request will be initiated if the age criteria is matched (the default is True).
debug (bool, optional) – Print additional debugging information, default is False.
- close()
Close an open GeoIP DB.
- df_lookup_ip(data: pandas.core.frame.DataFrame, column: str) pandas.core.frame.DataFrame
Lookup Geolocation data from a pandas Dataframe.
- Parameters
data (pd.DataFrame) – pandas dataframe containing IpAddress column
column (str) – the name of the dataframe column to use as a source
- Returns
Copy of original dataframe with IP Location information columns appended (where a location lookup was successful)
- Return type
pd.DataFrame
- lookup_ip(ip_address: Optional[str] = None, ip_addr_list: Optional[collections.abc.Iterable] = None, ip_entity: Optional[msticpy.datamodel.entities.ip_address.IpAddress] = None) Tuple[List[Any], List[msticpy.datamodel.entities.ip_address.IpAddress]]
Lookup IP location from GeoLite2 data created by MaxMind.
- Parameters
ip_address (str, optional) – a single address to look up (the default is None)
ip_addr_list (Iterable, optional) – a collection of addresses to lookup (the default is None)
ip_entity (IpAddress, optional) – an IpAddress entity (the default is None) - any existing data in the Location property will be overwritten
- Returns
raw geolocation results and same results as IpAddress entities with populated Location property.
- Return type
Tuple[List[Any], List[IpAddress]]
- lookup_ips(data: pandas.core.frame.DataFrame, column: str) pandas.core.frame.DataFrame
Lookup Geolocation data from a pandas Dataframe.
- Parameters
data (pd.DataFrame) – pandas dataframe containing IpAddress column
column (str) – the name of the dataframe column to use as a source
- Returns
IpLookup results as DataFrame.
- Return type
pd.DataFrame
- class msticpy.sectools.geoip.IPStackLookup(api_key: Optional[str] = None, bulk_lookup: bool = False)
Bases:
msticpy.sectools.geoip.GeoIpLookup
IPStack GeoIP Implementation.
See also
GeoIpLookup
Abstract base class
GeoLiteLookup
MaxMind GeoIP Implementation
Create a new instance of IPStackLookup.
- Parameters
api_key (str, optional) – API Key from IPStack (see https://ipstack.com). The default is None, which obtains the key from msticpyconfig.yaml
bulk_lookup (bool, optional) – For Professional and above tiers allowing you to submit multiple IPs in a single request. (the default is False, which submits a single request per address)
- df_lookup_ip(data: pandas.core.frame.DataFrame, column: str) pandas.core.frame.DataFrame
Lookup Geolocation data from a pandas Dataframe.
- Parameters
data (pd.DataFrame) – pandas dataframe containing IpAddress column
column (str) – the name of the dataframe column to use as a source
- Returns
Copy of original dataframe with IP Location information columns appended (where a location lookup was successful)
- Return type
pd.DataFrame
- lookup_ip(ip_address: Optional[str] = None, ip_addr_list: Optional[collections.abc.Iterable] = None, ip_entity: Optional[msticpy.datamodel.entities.ip_address.IpAddress] = None) Tuple[List[Any], List[msticpy.datamodel.entities.ip_address.IpAddress]]
Lookup IP location from IPStack web service.
- Parameters
ip_address (str, optional) – a single address to look up (the default is None)
ip_addr_list (Iterable, optional) – a collection of addresses to lookup (the default is None)
ip_entity (IpAddress, optional) – an IpAddress entity (the default is None) - any existing data in the Location property will be overwritten
- Returns
raw geolocation results and same results as IpAddress entities with populated Location property.
- Return type
Tuple[List[Any], List[IpAddress]]
- Raises
ConnectionError – Invalid status returned from http request
PermissionError – Service refused request (e.g. requesting batch of addresses on free tier API key)
- lookup_ips(data: pandas.core.frame.DataFrame, column: str) pandas.core.frame.DataFrame
Lookup Geolocation data from a pandas Dataframe.
- Parameters
data (pd.DataFrame) – pandas dataframe containing IpAddress column
column (str) – the name of the dataframe column to use as a source
- Returns
IpLookup results as DataFrame.
- Return type
pd.DataFrame
- msticpy.sectools.geoip.entity_distance(ip_src: msticpy.datamodel.entities.ip_address.IpAddress, ip_dest: msticpy.datamodel.entities.ip_address.IpAddress) float
Return distance between two IP Entities.
- msticpy.sectools.geoip.geo_distance(origin: Tuple[float, float], destination: Tuple[float, float]) float
Calculate the Haversine distance.
- Parameters
origin (Tuple[float, float]) – Latitude, Longitude of origin of distance measurement.
destination (Tuple[float, float]) – Latitude, Longitude of destination of distance measurement.
- Returns
Distance in kilometers.
- Return type
float
Examples
>>> origin = (48.1372, 11.5756)  # Munich
>>> destination = (52.5186, 13.4083)  # Berlin
>>> round(geo_distance(origin, destination), 1)
504.2
Notes
Author: Martin Thoma - stackoverflow
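For reference, the Haversine formula can be written with the standard math module as below. This is a sketch rather than the module's source; the function name `haversine_km` and the mean Earth radius of 6371 km are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(origin, destination):
    """Return the great-circle distance in km between two (lat, lon) pairs."""
    lat1, lon1 = origin
    lat2, lon2 = destination
    dlat = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1)
    a = (
        math.sin(dlat / 2) ** 2
        + math.cos(math.radians(lat1))
        * math.cos(math.radians(lat2))
        * math.sin(dlon / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.atan2(math.sqrt(a), math.sqrt(1 - a))


munich_to_berlin = haversine_km((48.1372, 11.5756), (52.5186, 13.4083))
```

With the Munich and Berlin coordinates from the example above, this gives a value close to the documented 504.2 km.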
msticpy.sectools.iocextract module
Module for IoCExtract class.
Uses a set of builtin regular expressions to look for Indicator of Compromise (IoC) patterns. Input can be a single string or a pandas dataframe with one or more columns specified as input.
The following types are built-in:
IPv4 and IPv6
URL
DNS domain
Hashes (MD5, SHA1, SHA256)
Windows file paths
Linux file paths (this is kind of noisy because a legal linux file path can have almost any character)
You can modify or add to the regular expressions used at runtime.
- class msticpy.sectools.iocextract.IoCExtract
Bases:
object
IoC Extractor - looks for common IoC patterns in input strings.
The extract() method takes either a string or a pandas DataFrame as input. When using the string option as an input, extract will return a dictionary of results. When using a DataFrame the results will be returned as a new DataFrame with the following columns:
IoCType: the mnemonic used to distinguish different IoC Types
Observable: the actual value of the observable
SourceIndex: the index of the row in the input DataFrame from which the source for the IoC observable was extracted.
The class has a number of built-in IoC regex definitions. These can be retrieved using the ioc_types attribute.
Additional IoC definitions can be added using the add_ioc_type method.
Note: due to some ambiguity in the regular expression patterns for different types, an observable may be returned assigned to multiple observable types. E.g. 192.168.0.1 is also a legal file name in both Linux and Windows. Linux file names have a particularly large scope in terms of legal characters so it will be quite common to see other IoC observables (or parts of them) returned as a possible linux path.
Initialize new instance of IoCExtract.
- DNS_REGEX = '((?=[a-z0-9-]{1,63}\\.)[a-z0-9]+(-[a-z0-9]+)*\\.){1,126}[a-z]{2,63}'
- IPV4_REGEX = '(?P<ipaddress>(?:[0-9]{1,3}\\.){3}[0-9]{1,3})'
- IPV6_REGEX = '(?<![:.\\w])(?:[A-F0-9]{0,4}:){2,7}[A-F0-9]{0,4}(?![:.\\w])'
- LXPATH_REGEX = '(?P<root>/+||[.]+)\n (?P<folder>/(?:[^\\\\/:*?<>|\\r\\n]+/)*)\n (?P<file>[^/\\0<>|\\r\\n ]+)'
- MD5_REGEX = '(?:^|[^A-Fa-f0-9])(?P<hash>[A-Fa-f0-9]{32})(?:$|[^A-Fa-f0-9])'
- SHA1_REGEX = '(?:^|[^A-Fa-f0-9])(?P<hash>[A-Fa-f0-9]{40})(?:$|[^A-Fa-f0-9])'
- SHA256_REGEX = '(?:^|[^A-Fa-f0-9])(?P<hash>[A-Fa-f0-9]{64})(?:$|[^A-Fa-f0-9])'
- URL_REGEX = '\n (?P<protocol>(https?|ftp|telnet|ldap|file)://)\n (?P<userinfo>([a-z0-9-._~!$&\\\'()*+,;=:]|%[0-9A-F]{2})*@)?\n (?P<host>([a-z0-9-._~!$&\\\'()*+,;=]|%[0-9A-F]{2})*)\n (:(?P<port>\\d*))?\n (/(?P<path>([^?\\#"<>\\s]|%[0-9A-F]{2})*/?))?\n (\\?(?P<query>([a-z0-9-._~!$&\'()*+,;=:/?@]|%[0-9A-F]{2})*))?\n (\\#(?P<fragment>([a-z0-9-._~!$&\'()*+,;=:/?@]|%[0-9A-F]{2})*))?'
- WINPATH_REGEX = '\n (?P<root>[a-z]:|\\\\\\\\[a-z0-9_.$-]+||[.]+)\n (?P<folder>\\\\(?:[^\\/:*?"\\\'<>|\\r\\n]+\\\\)*)\n (?P<file>[^\\\\/*?""<>|\\r\\n ]+)'
- add_ioc_type(ioc_type: str, ioc_regex: str, priority: int = 0, group: Optional[str] = None)
Add an IoC type and regular expression to use to the built-in set.
- Parameters
ioc_type (str) – A unique name for the IoC type
ioc_regex (str) – A regular expression used to search for the type
priority (int, optional) – Priority of the regex match vs. other ioc_patterns. 0 is the highest priority (the default is 0).
group (str, optional) – The regex group to match (the default is None, which will match on the whole expression)
Notes
- Pattern priorities.
If two IocType patterns match on the same substring, the matched substring is assigned to the pattern/IocType with the highest priority. E.g. foo.bar.com will match types: dns, windows_path and linux_path but since dns has a higher priority, the expression is assigned to the dns matches.
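The priority rule can be illustrated with a small sketch in which the lowest priority number wins when two patterns match the same substring. The two simplified patterns, their priority values and the `assign_by_priority` helper are assumptions for illustration and are much looser than the built-in expressions.

```python
import re

# Simplified stand-in patterns: (compiled regex, priority); lower number wins.
PATTERNS = {
    "dns": (re.compile(r"(?:[a-z0-9-]+\.)+[a-z]{2,63}", re.I), 1),
    "filename": (re.compile(r"[^\s/\\]+\.[^\s/\\]+"), 2),
}


def assign_by_priority(text):
    """Map each matched substring to the highest-priority matching type."""
    best = {}
    for ioc_type, (regex, priority) in PATTERNS.items():
        for match in regex.finditer(text):
            value = match.group(0)
            if value not in best or priority < best[value][1]:
                best[value] = (ioc_type, priority)
    return {value: ioc_type for value, (ioc_type, _) in best.items()}


assigned = assign_by_priority("foo.bar.com archive.7z")
```

Here foo.bar.com matches both patterns and is assigned to dns, while archive.7z matches only the file name pattern.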
- extract(src: Optional[str] = None, data: Optional[pandas.core.frame.DataFrame] = None, columns: Optional[List[str]] = None, **kwargs) Union[Dict[str, Set[str]], pandas.core.frame.DataFrame]
Extract IoCs from either a string or pandas DataFrame.
- Parameters
src (str, optional) – source string in which to look for IoC patterns (the default is None)
data (pd.DataFrame, optional) – input DataFrame from which to read source strings (the default is None)
columns (list, optional) – The list of columns to use as source strings, if the data parameter is used. (the default is None)
ioc_types (list, optional) – Restrict matching to just specified types. (default is all types)
include_paths (bool, optional) – Whether to include path matches (which can be noisy) (the default is false - excludes ‘windows_path’ and ‘linux_path’). If ioc_types is specified this parameter is ignored.
ignore_tlds (bool, optional) – If True, ignore the official Top Level Domains list when determining whether a domain name is a legal domain.
- Returns
dict of found observables (if input is a string) or DataFrame of observables
- Return type
Any
Notes
Extract takes either a string or a pandas DataFrame as input. When using the string option as an input, extract will return a dictionary of results. When using a DataFrame the results will be returned as a new DataFrame with the following columns:
IoCType: the mnemonic used to distinguish different IoC Types
Observable: the actual value of the observable
SourceIndex: the index of the row in the input DataFrame from which the source for the IoC observable was extracted.
IoCType pattern selection: the default list is [‘ipv4’, ‘ipv6’, ‘dns’, ‘url’, ‘md5_hash’, ‘sha1_hash’, ‘sha256_hash’] plus any user-defined types. ‘windows_path’ and ‘linux_path’ are excluded unless include_paths is True or they are explicitly included in ioc_types.
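The string form of extraction, returning a dictionary of sets keyed by IoC type, can be sketched with two of the regular expressions listed above (IPV4_REGEX and MD5_REGEX). The `extract_iocs` helper is hypothetical and omits what the class adds on top, such as priorities, TLD checks and DataFrame input.

```python
import re

# (compiled regex, name of the group holding the observable)
IOC_PATTERNS = {
    "ipv4": (re.compile(r"(?P<ipaddress>(?:[0-9]{1,3}\.){3}[0-9]{1,3})"), "ipaddress"),
    "md5_hash": (
        re.compile(r"(?:^|[^A-Fa-f0-9])(?P<hash>[A-Fa-f0-9]{32})(?:$|[^A-Fa-f0-9])"),
        "hash",
    ),
}


def extract_iocs(text):
    """Return {ioc type: set of observables found in text}."""
    found = {}
    for ioc_type, (regex, group) in IOC_PATTERNS.items():
        values = {match.group(group) for match in regex.finditer(text)}
        if values:
            found[ioc_type] = values
    return found


SAMPLE_TEXT = "conn from 10.0.0.5 dropped d41d8cd98f00b204e9800998ecf8427e twice"
iocs = extract_iocs(SAMPLE_TEXT)
```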
- extract_df(data: pandas.core.frame.DataFrame, columns: Union[str, List[str]], **kwargs) pandas.core.frame.DataFrame
Extract IoCs from a pandas DataFrame.
- Parameters
data (pd.DataFrame) – input DataFrame from which to read source strings
columns (Union[str, list]) – A single column name as a string or a list of columns to use as source strings
ioc_types (list, optional) – Restrict matching to just specified types. (default is all types)
include_paths (bool, optional) – Whether to include path matches (which can be noisy) (the default is false - excludes ‘windows_path’ and ‘linux_path’). If ioc_types is specified this parameter is ignored.
ignore_tlds (bool, optional) – If True, ignore the official Top Level Domains list when determining whether a domain name is a legal domain.
- Returns
DataFrame of observables
- Return type
pd.DataFrame
Notes
Extract takes a pandas DataFrame as input. The results will be returned as a new DataFrame with the following columns:
IoCType: the mnemonic used to distinguish different IoC Types
Observable: the actual value of the observable
SourceIndex: the index of the row in the input DataFrame from which the source for the IoC observable was extracted.
IoCType pattern selection: the default list is [‘ipv4’, ‘ipv6’, ‘dns’, ‘url’, ‘md5_hash’, ‘sha1_hash’, ‘sha256_hash’] plus any user-defined types. ‘windows_path’ and ‘linux_path’ are excluded unless include_paths is True or they are explicitly included in ioc_types.
- static file_hash_type(file_hash: str) msticpy.sectools.iocextract.IoCType
Return specific IoCType based on hash length.
- Parameters
file_hash (str) – File hash string
- Returns
Specific hash type or unknown.
- Return type
IoCType
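As a rough illustration of classifying a hash by length: hex-encoded MD5, SHA-1 and SHA-256 digests are 32, 40 and 64 characters long respectively. The helper name `hash_type_by_length` and the use of plain strings instead of IoCType members are assumptions made for this sketch.

```python
# Hex digest length -> hash type name (names follow the IoCType values listed in this module).
HASH_LENGTHS = {32: "md5_hash", 40: "sha1_hash", 64: "sha256_hash"}


def hash_type_by_length(file_hash: str) -> str:
    """Return the hash type implied by the digest length, or 'unknown'."""
    return HASH_LENGTHS.get(len(file_hash), "unknown")
```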
- get_ioc_type(observable: str) str
Return first matching type.
- Parameters
observable (str) – The IoC Observable to check
- Returns
The IoC type enumeration (unknown, if no match)
- Return type
str
- property ioc_types: dict
Return the current set of IoC types and regular expressions.
- Returns
dict of IoC Type names and regular expressions
- Return type
dict
- validate(input_str: str, ioc_type: str, ignore_tlds: bool = False) bool
Check that input_str matches the regex for the specified ioc_type.
- Parameters
input_str (str) – the string to test
ioc_type (str) – the regex pattern to use
ignore_tlds (bool, optional) – If True, ignore the official Top Level Domains list when determining whether a domain name is a legal domain.
- Returns
True if match.
- Return type
bool
- class msticpy.sectools.iocextract.IoCExtractAccessor(pandas_obj)
Bases:
object
Pandas api extension for IoC Extractor.
Instantiate pandas extension class.
- extract(columns, **kwargs)
Extract IoCs from a pandas DataFrame.
- Parameters
columns (list) – The list of columns to use as source strings.
ioc_types (list, optional) – Restrict matching to just specified types. (default is all types)
include_paths (bool, optional) – Whether to include path matches (which can be noisy) (the default is false - excludes ‘windows_path’ and ‘linux_path’). If ioc_types is specified this parameter is ignored.
- Returns
DataFrame of observables
- Return type
pd.DataFrame
Notes
Extract takes a pandas DataFrame as input. The results will be returned as a new DataFrame with the following columns: - IoCType: the mnemonic used to distinguish different IoC Types - Observable: the actual value of the observable - SourceIndex: the index of the row in the input DataFrame from which the source for the IoC observable was extracted.
IoCType Pattern selection: the default list is [‘ipv4’, ‘ipv6’, ‘dns’, ‘url’, ‘md5_hash’, ‘sha1_hash’, ‘sha256_hash’] plus any user-defined types. ‘windows_path’ and ‘linux_path’ are excluded unless include_paths is True or they are explicitly included in ioc_types.
- class msticpy.sectools.iocextract.IoCPattern(ioc_type, comp_regex, priority, group)
Bases:
tuple
Create new instance of IoCPattern(ioc_type, comp_regex, priority, group)
- property comp_regex
Alias for field number 1
- count(value, /)
Return number of occurrences of value.
- property group
Alias for field number 3
- index(value, start=0, stop=9223372036854775807, /)
Return first index of value.
Raises ValueError if the value is not present.
- property ioc_type
Alias for field number 0
- property priority
Alias for field number 2
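A named tuple with these four fields might be used as follows. This is a sketch only: the regular expressions are simplified, and the rule that a lower priority number is tried first is an assumption made for the example, not something this reference states. `first_matching_type` is a hypothetical helper.

```python
import re
from collections import namedtuple

# Same field names as the documented tuple; the values are illustrative.
IoCPattern = namedtuple("IoCPattern", ["ioc_type", "comp_regex", "priority", "group"])

patterns = [
    IoCPattern("dns", re.compile(r"^[a-z0-9-]+(?:\.[a-z0-9-]+)+$"), 1, None),
    IoCPattern("ipv4", re.compile(r"^(?:\d{1,3}\.){3}\d{1,3}$"), 0, None),
]


def first_matching_type(observable):
    """Return the type of the first pattern that matches, else 'unknown'."""
    # Assumption: a lower priority number is tried first.
    for pattern in sorted(patterns, key=lambda p: p.priority):
        if pattern.comp_regex.match(observable):
            return pattern.ioc_type
    return "unknown"
```

Here "10.1.2.3" would satisfy both simplified patterns, so the ordering decides which type is reported.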
- class msticpy.sectools.iocextract.IoCType(value)
Bases:
enum.Enum
Enumeration of IoC Types.
- dns = 'dns'
- email = 'email'
- file_hash = 'file_hash'
- hostname = 'hostname'
- ipv4 = 'ipv4'
- ipv6 = 'ipv6'
- linux_path = 'linux_path'
- md5_hash = 'md5_hash'
- classmethod parse(value: str) msticpy.sectools.iocextract.IoCType
Return parsed IoCType of string.
- Parameters
value (str) – Enumeration name
- Returns
IoCType matching name or unknown if no match
- Return type
IoCType
- sha1_hash = 'sha1_hash'
- sha256_hash = 'sha256_hash'
- unknown = 'unknown'
- url = 'url'
- windows_path = 'windows_path'
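The "matching name or unknown" behaviour of parse can be sketched with a plain Enum. `IoCKind` is a cut-down, hypothetical stand-in for IoCType with only three members, and the exact matching rules of the real method (for example, case handling) are not covered here.

```python
from enum import Enum


class IoCKind(Enum):
    """Cut-down stand-in for the IoCType enumeration."""

    unknown = "unknown"
    ipv4 = "ipv4"
    dns = "dns"

    @classmethod
    def parse(cls, value: str) -> "IoCKind":
        """Return the member named value, or unknown if there is none."""
        return cls.__members__.get(value, cls.unknown)
```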
msticpy.sectools.proc_tree_builder module
Process Tree Builder module for Process Tree Visualization.
- msticpy.sectools.proc_tree_builder.build_process_tree(procs: pandas.core.frame.DataFrame, schema: Optional[Union[msticpy.sectools.proc_tree_schema.ProcSchema, Dict[str, Any]]] = None, show_summary: bool = False, debug: bool = False) pandas.core.frame.DataFrame
Build process trees from the process events.
- Parameters
procs (pd.DataFrame) – Process events (Windows 4688 or Linux Auditd)
schema (Union[ProcSchema, Dict[str, Any]], optional) – The column schema to use, by default None. If supplied as a dict it must include definitions for the required fields in the ProcSchema class. If None, the schema is inferred.
show_summary (bool) – Shows summary of the built tree, default is False.
debug (bool) – If True produces extra debugging output, by default False
- Returns
Process tree dataframe.
- Return type
pd.DataFrame
See also
ProcSchema
- msticpy.sectools.proc_tree_builder.infer_schema(data: Union[pandas.core.frame.DataFrame, pandas.core.series.Series]) Optional[msticpy.sectools.proc_tree_schema.ProcSchema]
Infer the correct schema to use for this data set.
- Parameters
data (Union[pd.DataFrame, pd.Series]) – Data set to test
- Returns
The schema most closely matching the data set.
- Return type
ProcSchema
msticpy.sectools.proc_tree_build_winlx module
Process Tree builder for Windows security and Linux auditd events.
- msticpy.sectools.proc_tree_build_winlx.extract_process_tree(procs: pandas.core.frame.DataFrame, schema: ProcSchema, debug: bool = False) pandas.core.frame.DataFrame
Build process trees from the process events.
- Parameters
procs (pd.DataFrame) – Process events (Windows 4688 or Linux Auditd)
schema (Union[ProcSchema, Dict[str, Any]], optional) – The column schema to use, by default None. If supplied as a dict it must include definitions for the required fields in the ProcSchema class. If None, the schema is inferred.
debug (bool) – If True produces extra debugging output, by default False
- Returns
Process tree dataframe.
- Return type
pd.DataFrame
See also
ProcSchema
msticpy.sectools.proc_tree_build_mde module
Process tree builder routines for MDE process data.
- msticpy.sectools.proc_tree_build_mde.convert_mde_schema_to_internal(data: pandas.core.frame.DataFrame, schema: msticpy.sectools.proc_tree_schema.ProcSchema) pandas.core.frame.DataFrame
Convert DeviceProcessEvents schema data to internal MDE schema.
- Parameters
data (pd.DataFrame) – Input data in MS Sentinel schema.
schema (ProcSchema) – The mapping schema for the data set.
- Returns
Reformatted data into MDE internal schema.
- Return type
pd.DataFrame
- msticpy.sectools.proc_tree_build_mde.extract_process_tree(data: pandas.core.frame.DataFrame, debug: bool = False) pandas.core.frame.DataFrame
Build a process tree from raw MDE process logs.
- Parameters
data (pd.DataFrame) – DataFrame of process events.
debug (bool, optional) – Turn on additional debugging output, by default False.
- Returns
Process tree DataFrame with child->parent keys and extracted parent processes from child data.
- Return type
pd.DataFrame
msticpy.sectools.process_tree_utils module
Process Tree Visualization.
- msticpy.sectools.process_tree_utils.get_ancestors(procs: pandas.core.frame.DataFrame, source, include_source=True) pandas.core.frame.DataFrame
Return the ancestor processes of the source process.
- Parameters
procs (pd.DataFrame) – Process events (with process tree metadata)
source (Union[str, pd.Series]) – source_index of process or the process row
include_source (bool, optional) – Include the source process in the results, by default True
- Returns
Ancestor processes
- Return type
pd.DataFrame
- msticpy.sectools.process_tree_utils.get_children(procs: pandas.core.frame.DataFrame, source: Union[str, pandas.core.series.Series], include_source: bool = True) pandas.core.frame.DataFrame
Return the child processes for the source process.
- Parameters
procs (pd.DataFrame) – Process events (with process tree metadata)
source (Union[str, pd.Series]) – source_index of process or the process row
include_source (bool, optional) – If True include the source process in the results, by default True
- Returns
Child processes
- Return type
pd.DataFrame
- msticpy.sectools.process_tree_utils.get_descendents(procs: pandas.core.frame.DataFrame, source: Union[str, pandas.core.series.Series], include_source: bool = True, max_levels: int = - 1) pandas.core.frame.DataFrame
Return the descendents of the source process.
- Parameters
procs (pd.DataFrame) – Process events (with process tree metadata)
source (Union[str, pd.Series]) – source_index of process or the process row
include_source (bool, optional) – Include the source process in the results, by default True
max_levels (int, optional) – Maximum number of levels to descend, by default -1 (all levels)
- Returns
Descendent processes
- Return type
pd.DataFrame
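The ancestor and descendant lookups above can be pictured as walks over parent links. The sketch below is a standard-library illustration over a hypothetical child-to-parent dict, not the DataFrame-based implementation in this module; `PARENT_OF`, `ancestors` and `descendants` are made-up names.

```python
from collections import deque

# Hypothetical child -> parent mapping; None marks a root process.
PARENT_OF = {
    "init": None,
    "sshd": "init",
    "bash": "sshd",
    "curl": "bash",
    "cron": "init",
}


def ancestors(proc, include_source=True):
    """Walk parent links from proc up to the root."""
    chain = [proc] if include_source else []
    parent = PARENT_OF[proc]
    while parent is not None:
        chain.append(parent)
        parent = PARENT_OF[parent]
    return chain


def descendants(proc, include_source=True, max_levels=-1):
    """Breadth-first walk of children; max_levels=-1 means no limit."""
    found = [proc] if include_source else []
    queue = deque([(proc, 0)])
    while queue:
        current, level = queue.popleft()
        if max_levels != -1 and level >= max_levels:
            continue  # do not descend below the requested depth
        for child, parent in PARENT_OF.items():
            if parent == current:
                found.append(child)
                queue.append((child, level + 1))
    return found
```

With max_levels=1 only direct children are returned, which corresponds to what get_children describes.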
- msticpy.sectools.process_tree_utils.get_parent(procs: pandas.core.frame.DataFrame, source: Union[str, pandas.core.series.Series]) Optional[pandas.core.series.Series]
Return the parent of the source process.
- Parameters
procs (pd.DataFrame) – Process events (with process tree metadata)
source (Union[str, pd.Series]) – source_index of process or the process row
- Returns
Parent Process row or None if no parent was found.
- Return type
Optional[pd.Series]
- msticpy.sectools.process_tree_utils.get_process(procs: pandas.core.frame.DataFrame, source: Union[str, pandas.core.series.Series]) pandas.core.series.Series
Return the process event as a Series.
- Parameters
procs (pd.DataFrame) – Process events (with process tree metadata)
source (Union[str, pd.Series]) – source_index of process or the process row
- Returns
Process row
- Return type
pd.Series
- Raises
ValueError – If unknown type is supplied as source
- msticpy.sectools.process_tree_utils.get_process_key(procs: pandas.core.frame.DataFrame, source_index: int) str
Return the process key of the process given its source_index.
- Parameters
procs (pd.DataFrame) – Process events
source_index (int, optional) – source_index of the process record
- Returns
The process key of the process.
- Return type
str
- msticpy.sectools.process_tree_utils.get_root(procs: pandas.core.frame.DataFrame, source: Union[str, pandas.core.series.Series]) pandas.core.series.Series
Return the root process for the source process.
- Parameters
procs (pd.DataFrame) – Process events (with process tree metadata)
source (Union[str, pd.Series]) – source_index of process or the process row
- Returns
Root process
- Return type
pd.Series
- msticpy.sectools.process_tree_utils.get_root_tree(procs: pandas.core.frame.DataFrame, source: Union[str, pandas.core.series.Series]) pandas.core.frame.DataFrame
Return the process tree to which the source process belongs.
- Parameters
procs (pd.DataFrame) – Process events (with process tree metadata)
source (Union[str, pd.Series]) – source_index of process or the process row
- Returns
Process Tree
- Return type
pd.DataFrame
- msticpy.sectools.process_tree_utils.get_roots(procs: pandas.core.frame.DataFrame) pandas.core.frame.DataFrame
Return the process tree roots for the current data set.
- Parameters
procs (pd.DataFrame) – Process events (with process tree metadata)
- Returns
Process Tree root processes
- Return type
pd.DataFrame
- msticpy.sectools.process_tree_utils.get_siblings(procs: pandas.core.frame.DataFrame, source: Union[str, pandas.core.series.Series], include_source: bool = True) pandas.core.frame.DataFrame
Return the processes that share the parent of the source process.
- Parameters
procs (pd.DataFrame) – Process events (with process tree metadata)
source (Union[str, pd.Series]) – source_index of process or the process row
include_source (bool, optional) – Include the source process in the results, by default True
- Returns
Sibling processes.
- Return type
pd.DataFrame
- msticpy.sectools.process_tree_utils.get_summary_info(procs: pandas.core.frame.DataFrame) Dict[str, int]
Return summary information about the process trees.
- Parameters
procs (pd.DataFrame) – Process events (with process tree metadata)
- Returns
Summary statistics about the process trees
- Return type
Dict[str, int]
- msticpy.sectools.process_tree_utils.get_tree_depth(procs: pandas.core.frame.DataFrame) int
Return the depth of the process tree.
- Parameters
procs (pd.DataFrame) – Process events (with process tree metadata)
- Returns
Tree depth
- Return type
int
msticpy.sectools.syslog_utils module
syslog_utils - Syslog parsing and utility module.
Functions required to correctly collect, parse and visualize syslog data.
Designed to support standard linux syslog for investigations where auditd is not available.
- msticpy.sectools.syslog_utils.cluster_syslog_logons_df(logon_events: pandas.core.frame.DataFrame) pandas.core.frame.DataFrame
Cluster logon sessions in syslog by start/end time based on PAM events.
- Parameters
logon_events (pd.DataFrame) – A DataFrame of all syslog logon events (can be generated with LinuxSyslog.user_logon query)
- Returns
logon_sessions – A DataFrame of logon sessions including start and end times and logged on user
- Return type
pd.DataFrame
- Raises
MsticpyException – There are no logon sessions in the supplied data set
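One way to picture session clustering is to pair each PAM "session opened" message with the next "session closed" message for the same user. The sketch below does this with the standard library only. It is not the algorithm used by cluster_syslog_logons_df, and the `User`, `Start` and `End` keys are hypothetical names chosen for the example.

```python
import re
from datetime import datetime

OPENED = re.compile(r"session opened for user (\S+)")
CLOSED = re.compile(r"session closed for user (\S+)")


def cluster_sessions(events):
    """Pair each 'opened' event with the next 'closed' event for that user."""
    open_sessions = {}
    sessions = []
    for timestamp, message in sorted(events):
        opened = OPENED.search(message)
        closed = CLOSED.search(message)
        if opened:
            # Keep the earliest open time if a user opens twice.
            open_sessions.setdefault(opened.group(1), timestamp)
        elif closed and closed.group(1) in open_sessions:
            user = closed.group(1)
            sessions.append(
                {"User": user, "Start": open_sessions.pop(user), "End": timestamp}
            )
    return sessions


events = [
    (datetime(2024, 1, 1, 9, 0), "pam_unix(sshd:session): session opened for user alice by (uid=0)"),
    (datetime(2024, 1, 1, 9, 5), "pam_unix(sshd:session): session opened for user bob by (uid=0)"),
    (datetime(2024, 1, 1, 9, 30), "pam_unix(sshd:session): session closed for user alice"),
]
sessions = cluster_sessions(events)
```

Sessions that never close (bob in this sample) are left out of the result in this sketch.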
- msticpy.sectools.syslog_utils.create_host_record(syslog_df: pandas.core.frame.DataFrame, heartbeat_df: pandas.core.frame.DataFrame, az_net_df: Optional[pandas.core.frame.DataFrame] = None) msticpy.datamodel.entities.host.Host
Generate host_entity record for selected computer.
- Parameters
syslog_df (pd.DataFrame) – A dataframe of all syslog events for the host in the time window required
heartbeat_df (pd.DataFrame) – A dataframe of heartbeat data for the host
az_net_df (pd.DataFrame) – Optional dataframe of Azure network data for the host
- Returns
Details of the host data collected
- Return type
Host
- msticpy.sectools.syslog_utils.risky_sudo_sessions(sudo_sessions: pandas.core.frame.DataFrame, risky_actions: Optional[dict] = None, suspicious_actions: Optional[list] = None) dict
Detect if a sudo session occurs at the point of a suspicious event.
- Parameters
sudo_sessions (pd.DataFrame) – Sudo sessions (as generated by cluster_syslog_logons_df)
risky_actions (dict (Optional)) – Dictionary of risky sudo commands (as generated by cmd_line.risky_cmd_line)
suspicious_actions (list (Optional)) – List of risky sudo commands (as generated by cmd_line.cmd_speed)
- Returns
risky_sessions – A dictionary of sudo sessions with flags denoting risk
- Return type
dict
msticpy.sectools.tilookup module
Module for TILookup classes.
Input can be a single IoC observable or a pandas DataFrame containing multiple observables. Processing may require an API key and processing performance may be limited to a specific number of requests per minute for the account type that you have.
- class msticpy.sectools.tilookup.TILookup(primary_providers: Optional[List[msticpy.sectools.tiproviders.ti_provider_base.TIProvider]] = None, secondary_providers: Optional[List[msticpy.sectools.tiproviders.ti_provider_base.TIProvider]] = None, providers: Optional[List[str]] = None)
Bases:
object
Threat Intel observable lookup from providers.
Initialize TILookup instance.
- Parameters
primary_providers (Optional[List[TIProvider]], optional) – Primary TI Providers, by default None
secondary_providers (Optional[List[TIProvider]], optional) – Secondary TI Providers, by default None
providers (Optional[List[str]], optional) – List of provider names to load, by default all available providers are loaded. To see the list of available providers call TILookup.list_available_providers(). Note: if primary_providers or secondary_providers is specified, this will override the providers list.
- add_provider(provider: msticpy.sectools.tiproviders.ti_provider_base.TIProvider, name: Optional[str] = None, primary: bool = True)
Add a TI provider to the current collection.
- Parameters
provider (TIProvider) – Provider instance
name (str, optional) – The name to use for the provider (overrides the class name of provider)
primary (bool, optional) – If True, add the provider as “primary”; if False, as “secondary”. By default True.
- property available_providers: List[str]
Return a list of builtin providers.
- Returns
List of TI Provider classes.
- Return type
List[str]
- static browse(data: pandas.core.frame.DataFrame, severities: Optional[List[str]] = None, **kwargs)
Return TI Results list browser.
- Parameters
data (pd.DataFrame) – TI Results data from TIProviders
severities (Optional[List[str]], optional) – A list of the severity classes to show. By default these are [‘warning’, ‘high’]. Pass [‘information’, ‘warning’, ‘high’] to see all results.
kwargs – passed to SelectItem constructor.
- Returns
SelectItem browser for TI Data.
- Return type
SelectItem
- static browse_results(data: pandas.core.frame.DataFrame, severities: Optional[List[str]] = None, **kwargs)
Return TI Results list browser.
- Parameters
data (pd.DataFrame) – TI Results data from TIProviders
severities (Optional[List[str]], optional) – A list of the severity classes to show. By default these are [‘warning’, ‘high’]. Pass [‘information’, ‘warning’, ‘high’] to see all results.
kwargs – passed to SelectItem constructor.
- Returns
SelectItem browser for TI Data.
- Return type
SelectItem
- property configured_providers: List[str]
Return a list of available providers that have configuration details present.
- Returns
List of TI Provider classes.
- Return type
List[str]
- disable_provider(providers: Union[str, Iterable[str]])
Set the provider as secondary (not used by default).
- Parameters
providers (Union[str, Iterable[str]]) – Provider name or list of names. Use list_available_providers() to see the list of loaded providers.
- Raises
ValueError – If the provider name is not recognized.
- enable_provider(providers: Union[str, Iterable[str]])
Set the provider(s) as primary (used by default).
- Parameters
providers (Union[str, Iterable[str]]) – Provider name or list of names. Use list_available_providers() to see the list of loaded providers.
- Raises
ValueError – If the provider name is not recognized.
- classmethod list_available_providers(show_query_types=False, as_list: bool = False) Optional[List[str]]
Print a list of builtin providers with optional usage.
- Parameters
show_query_types (bool, optional) – Show query types supported by providers, by default False
as_list (bool, optional) – Return list of providers instead of printing to stdout. Note: if you specify show_query_types this will be printed irrespective of this parameter setting.
- Returns
A list of provider names (if as_list=True)
- Return type
Optional[List[str]]
- property loaded_providers: Dict[str, msticpy.sectools.tiproviders.ti_provider_base.TIProvider]
Return dictionary of loaded providers.
- Returns
Dictionary of loaded providers.
- Return type
Dict[str, TIProvider]
- lookup_ioc(observable: Optional[str] = None, ioc_type: Optional[str] = None, ioc_query_type: Optional[str] = None, providers: Optional[List[str]] = None, prov_scope: str = 'primary', **kwargs) Tuple[bool, List[Tuple[str, msticpy.sectools.tiproviders.ti_provider_base.LookupResult]]]
Lookup single IoC in active providers.
- Parameters
observable (str) – IoC observable (ioc is also an alias for observable)
ioc_type (str, optional) – One of IoCExtract.IoCType, by default None. If None, the IoC type will be inferred
ioc_query_type (str, optional) – The ioc query type (e.g. rep, info, malware)
providers (List[str]) – Explicit list of providers to use
prov_scope (str, optional) – Use “primary”, “secondary” or “all” providers, by default “primary”
kwargs – Additional arguments passed to the underlying provider(s)
- Returns
The result returned as a tuple(bool, list): the bool indicates whether a TI record was found in any provider; the list has an entry for each provider result.
- Return type
Tuple[bool, List[Tuple[str, LookupResult]]]
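Unpacking a return value of this shape might look like the following. `FakeLookupResult` is a stand-in dataclass carrying a few of the documented LookupResult field names, `summarize` is a hypothetical helper, and the provider names are invented for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FakeLookupResult:
    """Stand-in with a few of the documented LookupResult fields."""

    ioc: str
    ioc_type: str
    result: bool = False
    severity: int = 0


def summarize(lookup: Tuple[bool, List[Tuple[str, FakeLookupResult]]]) -> List[str]:
    """Return the names of providers that reported a hit, highest severity first."""
    found, provider_results = lookup
    if not found:
        return []
    hits = [(name, res) for name, res in provider_results if res.result]
    hits.sort(key=lambda item: item[1].severity, reverse=True)
    return [name for name, _ in hits]


sample = (
    True,
    [
        ("ProvA", FakeLookupResult("10.1.2.3", "ipv4", True, 1)),
        ("ProvB", FakeLookupResult("10.1.2.3", "ipv4", False, 0)),
        ("ProvC", FakeLookupResult("10.1.2.3", "ipv4", True, 2)),
    ],
)
```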
- lookup_iocs(data: Union[pandas.core.frame.DataFrame, Mapping[str, str], Iterable[str]], obs_col: Optional[str] = None, ioc_type_col: Optional[str] = None, ioc_query_type: Optional[str] = None, providers: Optional[List[str]] = None, prov_scope: str = 'primary', **kwargs) pandas.core.frame.DataFrame
Lookup a collection of IoCs.
- Parameters
data (Union[pd.DataFrame, Mapping[str, str], Iterable[str]]) – Data input in one of three formats: 1. Pandas dataframe (you must supply the column name in obs_col parameter) 2. Mapping (e.g. a dict) of [observable, IoCType] 3. Iterable of observables - IoCTypes will be inferred
obs_col (str, optional) – DataFrame column to use for observables, by default None (“col” and “column” are also aliases for this parameter)
ioc_type_col (str, optional) – DataFrame column to use for IoCTypes, by default None
ioc_query_type (str, optional) – The ioc query type (e.g. rep, info, malware)
providers (List[str]) – Explicit list of providers to use
prov_scope (str, optional) – Use “primary”, “secondary” or “all” providers, by default “primary”
kwargs – Additional arguments passed to the underlying provider(s)
- Returns
DataFrame of results
- Return type
pd.DataFrame
- property provider_status: Iterable[str]
Return loaded provider status.
- Returns
List of providers and descriptions.
- Return type
Iterable[str]
- provider_usage()
Print usage of loaded providers.
- classmethod reload_provider_settings()
Reload provider settings from config.
- reload_providers()
Reload providers based on current settings in config.
- Parameters
clear_keyring (bool, optional) – Clears any secrets cached in keyring, by default False
- static result_to_df(ioc_lookup: Tuple[bool, List[Tuple[str, msticpy.sectools.tiproviders.ti_provider_base.LookupResult]]]) pandas.core.frame.DataFrame
Return DataFrame representation of IoC Lookup response.
- Parameters
ioc_lookup (Tuple[bool, List[Tuple[str, LookupResult]]]) – Output from lookup_ioc
- Returns
The response as a DataFrame with a row for each provider response.
- Return type
pd.DataFrame
- set_provider_state(prov_dict: Dict[str, bool])
Set a dict of providers to primary/secondary.
- Parameters
prov_dict (Dict[str, bool]) – Dictionary of provider name and bool - True if enabled/primary, False if disabled/secondary.
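The primary/secondary split and the prov_scope values ("primary", "secondary", "all") can be modelled with a toy registry. This is a sketch of the idea only: `ProviderRegistry` and its methods are hypothetical and do not mirror TILookup internals.

```python
class ProviderRegistry:
    """Toy registry that keeps providers in a primary or secondary group."""

    def __init__(self):
        self.primary = {}
        self.secondary = {}

    def add(self, name, provider, primary=True):
        (self.primary if primary else self.secondary)[name] = provider

    def set_state(self, prov_dict):
        """Move each named provider to primary (True) or secondary (False)."""
        for name, enabled in prov_dict.items():
            if name in self.primary:
                provider = self.primary.pop(name)
            elif name in self.secondary:
                provider = self.secondary.pop(name)
            else:
                raise ValueError(f"Unknown provider: {name}")
            self.add(name, provider, primary=enabled)

    def select(self, prov_scope="primary"):
        """Return provider names for 'primary', 'secondary' or 'all'."""
        if prov_scope == "primary":
            return sorted(self.primary)
        if prov_scope == "secondary":
            return sorted(self.secondary)
        return sorted({**self.primary, **self.secondary})


registry = ProviderRegistry()
registry.add("ProvA", object())
registry.add("ProvB", object(), primary=False)
registry.set_state({"ProvA": False, "ProvB": True})
```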
msticpy.sectools.tiproviders.ti_provider_base module
Module for TILookup classes.
Input can be a single IoC observable or a pandas DataFrame containing multiple observables. Processing may require an API key and processing performance may be limited to a specific number of requests per minute for the account type that you have.
- class msticpy.sectools.tiproviders.ti_provider_base.LookupResult(ioc: str, ioc_type: str, safe_ioc: str = '', query_subtype: Optional[str] = None, provider: Optional[str] = None, result: bool = False, severity: int = 0, details: Optional[Any] = None, raw_result: Optional[Union[str, dict]] = None, reference: Optional[str] = None, status: int = 0)
Bases:
object
Lookup result for IoCs.
Method generated by attrs for class LookupResult.
- classmethod column_map()
Return a dictionary that maps fields to DF Names.
- details: Any
- ioc: str
- ioc_type: str
- provider: Optional[str]
- query_subtype: Optional[str]
- raw_result: Optional[Union[str, dict]]
- property raw_result_fmtd
Print raw results of the Lookup Result.
- reference: Optional[str]
- result: bool
- safe_ioc: str
- set_severity(value: Any)
Set the severity from enum, int or string.
- Parameters
value (Any) – The severity value to set
- severity: int
- property severity_name: str
Return text description of severity score.
- Returns
Severity description.
- Return type
str
- status: int
- property summary
Print a summary of the Lookup Result.
- class msticpy.sectools.tiproviders.ti_provider_base.SanitizedObservable(observable, status)
Bases:
tuple
Create new instance of SanitizedObservable(observable, status)
- count(value, /)
Return number of occurrences of value.
- index(value, start=0, stop=9223372036854775807, /)
Return first index of value.
Raises ValueError if the value is not present.
- property observable
Alias for field number 0
- property status
Alias for field number 1
- class msticpy.sectools.tiproviders.ti_provider_base.TILookupStatus(value)
Bases:
enum.Enum
Threat intelligence lookup status.
- bad_format = 2
- not_supported = 1
- ok = 0
- other = 10
- query_failed = 3
- class msticpy.sectools.tiproviders.ti_provider_base.TIPivotProvider
Bases:
abc.ABC
A class which provides pivot functions and a means of registering them.
- abstract register_pivots(pivot_reg: PivotRegistration, pivot: Pivot)
Register pivot functions for the TI Provider.
- Parameters
pivot_reg (PivotRegistration) – Pivot registration settings.
pivot (Pivot) – Pivot library instance
- class msticpy.sectools.tiproviders.ti_provider_base.TIProvider(**kwargs)
Bases:
abc.ABC
Abstract base class for Threat Intel providers.
Initialize the provider.
- property ioc_query_defs: Dict[str, Any]
Return current dictionary of IoC query/request definitions.
- Returns
IoC query/request definitions keyed by IoCType
- Return type
Dict[str, Any]
- classmethod is_known_type(ioc_type: str) bool
Return True if this is a known IoC Type.
- Parameters
ioc_type (str) – IoCType string to test
- Returns
True if known type.
- Return type
bool
- is_supported_type(ioc_type: Union[str, msticpy.sectools.iocextract.IoCType]) bool
Return True if the passed type is supported.
- Parameters
ioc_type (Union[str, IoCType]) – IoC type name or instance
- Returns
True if supported.
- Return type
bool
- abstract lookup_ioc(ioc: str, ioc_type: Optional[str] = None, query_type: Optional[str] = None, **kwargs) msticpy.sectools.tiproviders.ti_provider_base.LookupResult
Lookup a single IoC observable.
- Parameters
ioc (str) – IoC Observable value
ioc_type (str, optional) – IoC Type, by default None (type will be inferred)
query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.
- Returns
The returned results.
- Return type
LookupResult
- lookup_iocs(data: Union[pandas.core.frame.DataFrame, Dict[str, str], Iterable[str]], obs_col: Optional[str] = None, ioc_type_col: Optional[str] = None, query_type: Optional[str] = None, **kwargs) pandas.core.frame.DataFrame
Lookup collection of IoC observables.
- Parameters
data (Union[pd.DataFrame, Dict[str, str], Iterable[str]]) – Data input in one of three formats: 1. Pandas dataframe (you must supply the column name in obs_col parameter) 2. Dict of observable, IoCType 3. Iterable of observables - IoCTypes will be inferred
obs_col (str, optional) – DataFrame column to use for observables, by default None
ioc_type_col (str, optional) – DataFrame column to use for IoCTypes, by default None
query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.
- Returns
DataFrame of results.
- Return type
pd.DataFrame
- abstract parse_results(response: msticpy.sectools.tiproviders.ti_provider_base.LookupResult) Tuple[bool, msticpy.sectools.tiproviders.ti_provider_base.TISeverity, Any]
Return the details of the response.
- Parameters
response (LookupResult) – The returned data response
- Returns
Tuple of: bool (positive or negative hit), TISeverity (enumeration of severity), and an object with match details.
- Return type
Tuple[bool, TISeverity, Any]
- static resolve_ioc_type(observable: str) str
Return IoCType determined by IoCExtract.
- Parameters
observable (str) – IoC observable string
- Returns
IoC Type (or unknown if type could not be determined)
- Return type
str
- property supported_types: List[str]
Return list of supported IoC types for this provider.
- Returns
List of supported type names
- Return type
List[str]
- classmethod usage()
Print usage of provider.
- class msticpy.sectools.tiproviders.ti_provider_base.TISeverity(value)
Bases:
enum.Enum
Threat intelligence report severity.
- high = 2
- information = 0
- classmethod parse(value) msticpy.sectools.tiproviders.ti_provider_base.TISeverity
Parse string or numeric value to TISeverity.
- Parameters
value (Any) – TISeverity, str or int
- Returns
TISeverity instance.
- Return type
TISeverity
- unknown = -1
- warning = 1
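Parsing a severity from an enum member, a name or a number could be sketched as below. The member values match those listed for TISeverity, but `Severity` is a stand-in class, and falling back to unknown for unrecognised input is an assumption of this sketch.

```python
from enum import Enum


class Severity(Enum):
    """Stand-in using the same names and values as TISeverity."""

    unknown = -1
    information = 0
    warning = 1
    high = 2

    @classmethod
    def parse(cls, value):
        """Accept a Severity, a member name or a numeric value."""
        if isinstance(value, cls):
            return value
        if isinstance(value, str) and value.lower() in cls.__members__:
            return cls[value.lower()]
        if isinstance(value, int) and value in {member.value for member in cls}:
            return cls(value)
        # Assumption: anything unrecognised maps to unknown.
        return cls.unknown
```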
- msticpy.sectools.tiproviders.ti_provider_base.entropy(input_str: str) float
Compute entropy of input string.
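The reference does not say which entropy measure is used. Assuming it is Shannon entropy over character frequencies (an assumption), a minimal sketch looks like this; `shannon_entropy` is a hypothetical name.

```python
import math
from collections import Counter


def shannon_entropy(input_str: str) -> float:
    """Return the Shannon entropy of input_str in bits per character."""
    if not input_str:
        return 0.0
    counts = Counter(input_str)
    total = len(input_str)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

A string of one repeated character scores 0, while a string whose characters are all distinct scores log2 of its length.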
- msticpy.sectools.tiproviders.ti_provider_base.generate_items(data: Any, obs_col: Optional[str] = None, ioc_type_col: Optional[str] = None) Iterable[Tuple[Optional[str], Optional[str]]]
Generate item pairs from different input types.
- Parameters
data (Any) – DataFrame, dictionary or iterable
obs_col (Optional[str]) – If data is a DataFrame, the column containing the observable value.
ioc_type_col (Optional[str]) – If data is a DataFrame, the column containing the observable type.
- Returns
Iterable of (observable, type) tuples.
- Return type
Iterable[Tuple[Optional[str], Optional[str]]]
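Normalising different input types into (observable, type) pairs might be sketched as follows for the mapping and iterable cases. The DataFrame case is omitted to keep the example free of third-party imports. `item_pairs` is a hypothetical helper, and yielding None as the type for bare observables is an assumption.

```python
from collections.abc import Mapping


def item_pairs(data):
    """Yield (observable, ioc_type) pairs from a mapping, string or iterable."""
    if isinstance(data, Mapping):
        yield from data.items()
    elif isinstance(data, str):
        # Treat a bare string as a single observable, not as characters.
        yield data, None
    else:
        for observable in data:
            yield observable, None  # type left for later inference
```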
- msticpy.sectools.tiproviders.ti_provider_base.get_schema_and_host(url: str, require_url_encoding: bool = False) Tuple[Optional[str], Optional[str], Optional[str]]
Return URL scheme and host and cleaned URL.
- Parameters
url (str) – Input URL
require_url_encoding (bool) – Set to True if url needs encoding. Default is False.
- Returns
Tuple of URL, scheme, host
- Return type
Tuple[Optional[str], Optional[str], Optional[str]]
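Splitting a URL into scheme and host can be done with the standard library, as in this sketch. `scheme_and_host` is a hypothetical helper: it leaves out URL encoding, and returning three None values for unparseable input is an assumption suggested by the Optional return types.

```python
from urllib.parse import urlparse


def scheme_and_host(url: str):
    """Return (clean_url, scheme, host), or (None, None, None) if unparseable."""
    clean_url = url.strip()
    parsed = urlparse(clean_url)
    if not parsed.scheme or not parsed.hostname:
        return None, None, None
    return clean_url, parsed.scheme, parsed.hostname
```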
- msticpy.sectools.tiproviders.ti_provider_base.preprocess_observable(observable, ioc_type, require_url_encoding: bool = False) msticpy.sectools.tiproviders.ti_provider_base.SanitizedObservable
Preprocesses and checks validity of observable against declared IoC type.
- Parameters
observable – the value of the IoC
ioc_type – the IoC type
msticpy.sectools.tiproviders.greynoise module
GreyNoise Lookup.
- msticpy.sectools.tiproviders.GreyNoise.ioc_query_defs
Return current dictionary of IoC query/request definitions.
- Returns
IoC query/request definitions keyed by IoCType
- Return type
Dict[str, Any]
- msticpy.sectools.tiproviders.GreyNoise.supported_types
Return list of supported IoC types for this provider.
- Returns
List of supported type names
- Return type
List[str]
msticpy.sectools.tiproviders.http_base module
HTTP TI Provider base.
Input can be a single IoC observable or a pandas DataFrame containing multiple observables. Processing may require an API key and processing performance may be limited to a specific number of requests per minute for the account type that you have.
- class msticpy.sectools.tiproviders.http_base.HttpProvider(**kwargs)
Bases:
msticpy.sectools.tiproviders.ti_provider_base.TIProvider
HTTP TI provider base class.
Initialize a new instance of the class.
- property ioc_query_defs: Dict[str, Any]
Return current dictionary of IoC query/request definitions.
- Returns
IoC query/request definitions keyed by IoCType
- Return type
Dict[str, Any]
- classmethod is_known_type(ioc_type: str) bool
Return True if this is a known IoC Type.
- Parameters
ioc_type (str) – IoCType string to test
- Returns
True if known type.
- Return type
bool
- is_supported_type(ioc_type: Union[str, msticpy.sectools.iocextract.IoCType]) bool
Return True if the passed type is supported.
- Parameters
ioc_type (Union[str, IoCType]) – IoC type name or instance
- Returns
True if supported.
- Return type
bool
- lookup_ioc(ioc: str, ioc_type: str = None, query_type: str = None, **kwargs) msticpy.sectools.tiproviders.ti_provider_base.LookupResult
Lookup a single IoC observable.
- Parameters
ioc (str) – IoC observable
ioc_type (str, optional) – IocType, by default None (type will be inferred)
query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.
- Returns
The lookup result: result - Positive/Negative, details - Lookup Details (or status if failure), raw_result - Raw Response, reference - URL of IoC
- Return type
LookupResult
- Raises
NotImplementedError – If attempting to use an HTTP method or authentication protocol that is not supported.
Notes
Note: this method uses memoization (lru_cache) to cache results for a particular observable, to avoid repeated network calls for the same item.
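The effect of lru_cache memoization can be seen in a small standalone sketch. `cached_lookup` is a hypothetical function in which appending to `CALLS` stands in for the network request; the cache size of 256 is arbitrary.

```python
from functools import lru_cache

CALLS = []  # records each time the "network" is actually hit


@lru_cache(maxsize=256)
def cached_lookup(ioc: str, ioc_type: str) -> str:
    """Pretend to query a provider; repeated arguments are served from cache."""
    CALLS.append(ioc)  # stands in for a network request
    return f"{ioc_type}:{ioc}"


cached_lookup("10.1.2.3", "ipv4")
cached_lookup("10.1.2.3", "ipv4")  # same arguments: answered from the cache
cached_lookup("example.com", "dns")
```

Because results are keyed on the arguments, a stale answer stays cached until it is evicted or the cache is cleared with `cached_lookup.cache_clear()`.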
- lookup_iocs(data: Union[pandas.core.frame.DataFrame, Dict[str, str], Iterable[str]], obs_col: Optional[str] = None, ioc_type_col: Optional[str] = None, query_type: Optional[str] = None, **kwargs) pandas.core.frame.DataFrame
Lookup collection of IoC observables.
- Parameters
data (Union[pd.DataFrame, Dict[str, str], Iterable[str]]) – Data input in one of three formats: 1. Pandas dataframe (you must supply the column name in obs_col parameter) 2. Dict of observable, IoCType 3. Iterable of observables - IoCTypes will be inferred
obs_col (str, optional) – DataFrame column to use for observables, by default None
ioc_type_col (str, optional) – DataFrame column to use for IoCTypes, by default None
query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.
- Returns
DataFrame of results.
- Return type
pd.DataFrame
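As a rough illustration of how the dict and iterable input formats could be reduced to a common shape, the sketch below turns either into a list of (observable, IoCType) pairs. It is an assumption-laden stand-in, not the library's code: `infer_type` and `normalize` are hypothetical helpers, the type inference is deliberately crude, and the DataFrame case is left out to keep the example dependency-free.

```python
from typing import Dict, Iterable, List, Tuple, Union


def infer_type(observable: str) -> str:
    """Hypothetical, very rough type inference (illustration only)."""
    parts = observable.split(".")
    if len(parts) == 4 and all(part.isdigit() for part in parts):
        return "ipv4"
    return "dns" if "." in observable else "unknown"


def normalize(data: Union[Dict[str, str], Iterable[str]]) -> List[Tuple[str, str]]:
    """Turn a dict or a plain iterable into (observable, ioc_type) pairs."""
    if isinstance(data, dict):
        # Dict input already carries the type for each observable.
        return list(data.items())
    # Plain iterable: the type has to be inferred per observable.
    return [(obs, infer_type(obs)) for obs in data]


pairs_from_dict = normalize({"example.com": "dns"})
pairs_from_list = normalize(["203.0.113.5", "example.org"])
print(pairs_from_dict, pairs_from_list)
```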
- abstract parse_results(response: msticpy.sectools.tiproviders.ti_provider_base.LookupResult) Tuple[bool, msticpy.sectools.tiproviders.ti_provider_base.TISeverity, Any]
Return the details of the response.
- Parameters
response (LookupResult) – The returned data response
- Returns
bool = positive or negative hit, TISeverity = enumeration of severity, Object with match details
- Return type
Tuple[bool, TISeverity, Any]
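The shape of the abstract parse_results contract (a hit flag, a severity and a details object) can be sketched with stand-in types. Everything below is hypothetical: `Severity` stands in for TISeverity, a plain dict stands in for LookupResult, and the `hit_count` field and thresholds are invented for illustration.

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import Any, Dict, Tuple


class Severity(Enum):
    """Hypothetical stand-in for TISeverity."""

    information = 0
    warning = 1
    high = 2


class ProviderSketch(ABC):
    """Stand-in base class that only declares the parsing contract."""

    @abstractmethod
    def parse_results(self, response: Dict[str, Any]) -> Tuple[bool, Severity, Any]:
        """Return (hit, severity, details) for a lookup response."""


class ExampleProvider(ProviderSketch):
    """Concrete sketch: grades a response by an invented hit_count field."""

    def parse_results(self, response: Dict[str, Any]) -> Tuple[bool, Severity, Any]:
        raw = response.get("raw_result") or {}
        hits = raw.get("hit_count", 0)  # hypothetical field name
        if hits == 0:
            return False, Severity.information, "no matches"
        severity = Severity.high if hits > 5 else Severity.warning
        return True, severity, {"hit_count": hits}


provider = ExampleProvider()
print(provider.parse_results({"raw_result": {"hit_count": 7}}))
```

Keeping the parsing step abstract lets each provider translate its own response format into the same three-part result.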
- static resolve_ioc_type(observable: str) str
Return IoCType determined by IoCExtract.
- Parameters
observable (str) – IoC observable string
- Returns
IoC Type (or unknown if type could not be determined)
- Return type
str
- property supported_types: List[str]
Return list of supported IoC types for this provider.
- Returns
List of supported type names
- Return type
List[str]
- classmethod usage()
Print usage of provider.
- class msticpy.sectools.tiproviders.http_base.IoCLookupParams(path: str = '', verb: str = 'GET', full_url: bool = False, headers: Dict[str, str] = NOTHING, params: Dict[str, str] = NOTHING, data: Dict[str, str] = NOTHING, auth_type: str = '', auth_str: List[str] = NOTHING, sub_type: str = '')
Bases:
object
IoC HTTP Lookup Params definition.
Method generated by attrs for class IoCLookupParams.
- auth_str: List[str]
- auth_type: str
- data: Dict[str, str]
- full_url: bool
- headers: Dict[str, str]
- params: Dict[str, str]
- path: str
- sub_type: str
- verb: str
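To show how a parameter record with these fields might be filled in, here is a dependency-free sketch using a standard-library dataclass in place of the attrs-generated class. It is an approximation under assumptions: `LookupParamsSketch` is a hypothetical name, the NOTHING defaults are assumed to behave like empty dict/list factories, and the path and header values are invented.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LookupParamsSketch:
    """Stand-in mirroring the documented IoCLookupParams fields."""

    path: str = ""
    verb: str = "GET"
    full_url: bool = False
    headers: Dict[str, str] = field(default_factory=dict)
    params: Dict[str, str] = field(default_factory=dict)
    data: Dict[str, str] = field(default_factory=dict)
    auth_type: str = ""
    auth_str: List[str] = field(default_factory=list)
    sub_type: str = ""


# Hypothetical query definition for an IP address lookup.
ip_query = LookupParamsSketch(
    path="/api/ip/{observable}",
    headers={"X-Api-Key": "{API_KEY}"},
)
print(ip_query.verb, ip_query.path)
```

Using a factory for the mutable defaults means each instance gets its own dictionaries, so editing one query definition does not leak into another.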
msticpy.sectools.tiproviders.alienvault_otx module
AlienVault OTX Provider.
Input can be a single IoC observable or a pandas DataFrame containing multiple observables. Processing may require an API key and processing performance may be limited to a specific number of requests per minute for the account type that you have.
- class msticpy.sectools.tiproviders.alienvault_otx.OTX(**kwargs)
Bases:
msticpy.sectools.tiproviders.http_base.HttpProvider
AlienVault OTX Lookup.
Set OTX specific settings.
- property ioc_query_defs: Dict[str, Any]
Return current dictionary of IoC query/request definitions.
- Returns
IoC query/request definitions keyed by IoCType
- Return type
Dict[str, Any]
- classmethod is_known_type(ioc_type: str) bool
Return True if this is a known IoC Type.
- Parameters
ioc_type (str) – IoCType string to test
- Returns
True if known type.
- Return type
bool
- is_supported_type(ioc_type: Union[str, msticpy.sectools.iocextract.IoCType]) bool
Return True if the passed type is supported.
- Parameters
ioc_type (Union[str, IoCType]) – IoC type name or instance
- Returns
True if supported.
- Return type
bool
- lookup_ioc(ioc: str, ioc_type: str = None, query_type: str = None, **kwargs) msticpy.sectools.tiproviders.ti_provider_base.LookupResult
Lookup a single IoC observable.
- Parameters
ioc (str) – IoC observable
ioc_type (str, optional) – IocType, by default None (type will be inferred)
query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.
- Returns
The lookup result: result - Positive/Negative, details - Lookup Details (or status if failure), raw_result - Raw Response, reference - URL of IoC
- Return type
LookupResult
- Raises
NotImplementedError – If attempting to use an HTTP method or authentication protocol that is not supported.
Notes
Note: this method uses memoization (lru_cache) to cache results for a particular observable, to try to avoid repeated network calls for the same item.
- lookup_iocs(data: Union[pandas.core.frame.DataFrame, Dict[str, str], Iterable[str]], obs_col: Optional[str] = None, ioc_type_col: Optional[str] = None, query_type: Optional[str] = None, **kwargs) pandas.core.frame.DataFrame
Lookup collection of IoC observables.
- Parameters
data (Union[pd.DataFrame, Dict[str, str], Iterable[str]]) – Data input in one of three formats: (1) a pandas DataFrame (you must supply the column name in the obs_col parameter); (2) a dict of observable, IoCType; (3) an iterable of observables, in which case IoCTypes will be inferred
obs_col (str, optional) – DataFrame column to use for observables, by default None
ioc_type_col (str, optional) – DataFrame column to use for IoCTypes, by default None
query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.
- Returns
DataFrame of results.
- Return type
pd.DataFrame
- parse_results(response: msticpy.sectools.tiproviders.ti_provider_base.LookupResult) Tuple[bool, msticpy.sectools.tiproviders.ti_provider_base.TISeverity, Any]
Return the details of the response.
- Parameters
response (LookupResult) – The returned data response
- Returns
bool = positive or negative hit, TISeverity = enumeration of severity, Object with match details
- Return type
Tuple[bool, TISeverity, Any]
- static resolve_ioc_type(observable: str) str
Return IoCType determined by IoCExtract.
- Parameters
observable (str) – IoC observable string
- Returns
IoC Type (or unknown if type could not be determined)
- Return type
str
- property supported_types: List[str]
Return list of supported IoC types for this provider.
- Returns
List of supported type names
- Return type
List[str]
- classmethod usage()
Print usage of provider.
msticpy.sectools.tiproviders.ibm_xforce module
IBM XForce Provider.
Input can be a single IoC observable or a pandas DataFrame containing multiple observables. Processing may require an API key and processing performance may be limited to a specific number of requests per minute for the account type that you have.
- class msticpy.sectools.tiproviders.ibm_xforce.XForce(**kwargs)
Bases:
msticpy.sectools.tiproviders.http_base.HttpProvider
IBM XForce Lookup.
Initialize a new instance of the class.
- property ioc_query_defs: Dict[str, Any]
Return current dictionary of IoC query/request definitions.
- Returns
IoC query/request definitions keyed by IoCType
- Return type
Dict[str, Any]
- classmethod is_known_type(ioc_type: str) bool
Return True if this is a known IoC Type.
- Parameters
ioc_type (str) – IoCType string to test
- Returns
True if known type.
- Return type
bool
- is_supported_type(ioc_type: Union[str, msticpy.sectools.iocextract.IoCType]) bool
Return True if the passed type is supported.
- Parameters
ioc_type (Union[str, IoCType]) – IoC type name or instance
- Returns
True if supported.
- Return type
bool
- lookup_ioc(ioc: str, ioc_type: str = None, query_type: str = None, **kwargs) msticpy.sectools.tiproviders.ti_provider_base.LookupResult
Lookup a single IoC observable.
- Parameters
ioc (str) – IoC observable
ioc_type (str, optional) – IocType, by default None (type will be inferred)
query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.
- Returns
The lookup result: result - Positive/Negative, details - Lookup Details (or status if failure), raw_result - Raw Response, reference - URL of IoC
- Return type
LookupResult
- Raises
NotImplementedError – If attempting to use an HTTP method or authentication protocol that is not supported.
Notes
Note: this method uses memoization (lru_cache) to cache results for a particular observable, to try to avoid repeated network calls for the same item.
- lookup_iocs(data: Union[pandas.core.frame.DataFrame, Dict[str, str], Iterable[str]], obs_col: Optional[str] = None, ioc_type_col: Optional[str] = None, query_type: Optional[str] = None, **kwargs) pandas.core.frame.DataFrame
Lookup collection of IoC observables.
- Parameters
data (Union[pd.DataFrame, Dict[str, str], Iterable[str]]) – Data input in one of three formats: (1) a pandas DataFrame (you must supply the column name in the obs_col parameter); (2) a dict of observable, IoCType; (3) an iterable of observables, in which case IoCTypes will be inferred
obs_col (str, optional) – DataFrame column to use for observables, by default None
ioc_type_col (str, optional) – DataFrame column to use for IoCTypes, by default None
query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.
- Returns
DataFrame of results.
- Return type
pd.DataFrame
- parse_results(response: msticpy.sectools.tiproviders.ti_provider_base.LookupResult) Tuple[bool, msticpy.sectools.tiproviders.ti_provider_base.TISeverity, Any]
Return the details of the response.
- Parameters
response (LookupResult) – The returned data response
- Returns
bool = positive or negative hit, TISeverity = enumeration of severity, Object with match details
- Return type
Tuple[bool, TISeverity, Any]
- static resolve_ioc_type(observable: str) str
Return IoCType determined by IoCExtract.
- Parameters
observable (str) – IoC observable string
- Returns
IoC Type (or unknown if type could not be determined)
- Return type
str
- property supported_types: List[str]
Return list of supported IoC types for this provider.
- Returns
List of supported type names
- Return type
List[str]
- classmethod usage()
Print usage of provider.
msticpy.sectools.tiproviders.virustotal module
VirusTotal Provider.
Input can be a single IoC observable or a pandas DataFrame containing multiple observables. Processing may require an API key and processing performance may be limited to a specific number of requests per minute for the account type that you have.
- class msticpy.sectools.tiproviders.virustotal.VirusTotal(**kwargs)
Bases:
msticpy.sectools.tiproviders.http_base.HttpProvider
VirusTotal Lookup.
Initialize a new instance of the class.
- property ioc_query_defs: Dict[str, Any]
Return current dictionary of IoC query/request definitions.
- Returns
IoC query/request definitions keyed by IoCType
- Return type
Dict[str, Any]
- classmethod is_known_type(ioc_type: str) bool
Return True if this is a known IoC Type.
- Parameters
ioc_type (str) – IoCType string to test
- Returns
True if known type.
- Return type
bool
- is_supported_type(ioc_type: Union[str, msticpy.sectools.iocextract.IoCType]) bool
Return True if the passed type is supported.
- Parameters
ioc_type (Union[str, IoCType]) – IoC type name or instance
- Returns
True if supported.
- Return type
bool
- lookup_ioc(ioc: str, ioc_type: str = None, query_type: str = None, **kwargs) msticpy.sectools.tiproviders.ti_provider_base.LookupResult
Lookup a single IoC observable.
- Parameters
ioc (str) – IoC observable
ioc_type (str, optional) – IocType, by default None (type will be inferred)
query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.
- Returns
The lookup result: result - Positive/Negative, details - Lookup Details (or status if failure), raw_result - Raw Response, reference - URL of IoC
- Return type
LookupResult
- Raises
NotImplementedError – If attempting to use an HTTP method or authentication protocol that is not supported.
Notes
Note: this method uses memoization (lru_cache) to cache results for a particular observable, to try to avoid repeated network calls for the same item.
- lookup_iocs(data: Union[pandas.core.frame.DataFrame, Dict[str, str], Iterable[str]], obs_col: Optional[str] = None, ioc_type_col: Optional[str] = None, query_type: Optional[str] = None, **kwargs) pandas.core.frame.DataFrame
Lookup collection of IoC observables.
- Parameters
data (Union[pd.DataFrame, Dict[str, str], Iterable[str]]) – Data input in one of three formats: (1) a pandas DataFrame (you must supply the column name in the obs_col parameter); (2) a dict of observable, IoCType; (3) an iterable of observables, in which case IoCTypes will be inferred
obs_col (str, optional) – DataFrame column to use for observables, by default None
ioc_type_col (str, optional) – DataFrame column to use for IoCTypes, by default None
query_type (str, optional) – Specify the data subtype to be queried, by default None. If not specified the default record type for the IoC type will be returned.
- Returns
DataFrame of results.
- Return type
pd.DataFrame
- parse_results(response: msticpy.sectools.tiproviders.ti_provider_base.LookupResult) Tuple[bool, msticpy.sectools.tiproviders.ti_provider_base.TISeverity, Any]
Return the details of the response.
- Parameters
response (LookupResult) – The returned data response
- Returns
bool = positive or negative hit, TISeverity = enumeration of severity, Object with match details
- Return type
Tuple[bool, TISeverity, Any]
- static resolve_ioc_type(observable: str) str
Return IoCType determined by IoCExtract.
- Parameters
observable (str) – IoC observable string
- Returns
IoC Type (or unknown if type could not be determined)
- Return type
str
- property supported_types: List[str]
Return list of supported IoC types for this provider.
- Returns
List of supported type names
- Return type
List[str]
- classmethod usage()
Print usage of provider.
msticpy.sectools.vtlookup module
Module for VTLookup class.
Wrapper class around the VirusTotal API. Input can be a single IoC observable or a pandas DataFrame containing multiple observables. Processing requires a VirusTotal account and API key, and processing performance is limited to the number of requests per minute for the account type that you have. Supported IoC Types:
Filehash
URL
DNS Domain
IPv4 Address
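Since throughput is bounded by a requests-per-minute quota, callers sometimes pace their own submissions. The generator below is a minimal sketch of that idea and is not part of this module: `throttled` is a hypothetical helper, and the `sleep` and `clock` parameters exist only so the pacing can be exercised without real waiting.

```python
import time
from typing import Callable, Iterable, Iterator, Optional


def throttled(
    items: Iterable[str],
    per_minute: int,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[str]:
    """Yield items no faster than per_minute items per minute."""
    interval = 60.0 / per_minute
    last: Optional[float] = None
    for item in items:
        now = clock()
        if last is not None and now - last < interval:
            # Wait out the remainder of the interval before releasing the item.
            sleep(interval - (now - last))
        last = clock()
        yield item


# Pacing a handful of observables at a hypothetical quota of 600 per minute.
for observable in throttled(["example.com", "example.org"], per_minute=600):
    print(observable)
```

The quota value itself depends on the account type, so it is passed in rather than hard-coded.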
- class msticpy.sectools.vtlookup.DuplicateStatus(is_dup, status)
Bases:
tuple
Create new instance of DuplicateStatus(is_dup, status)
- count(value, /)
Return number of occurrences of value.
- index(value, start=0, stop=9223372036854775807, /)
Return first index of value.
Raises ValueError if the value is not present.
- property is_dup
Alias for field number 0
- property status
Alias for field number 1
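The "alias for field number" wording comes from the named-tuple machinery. A small sketch, assuming the class is an ordinary `collections.namedtuple` with the two documented fields (the status text used here is invented):

```python
from collections import namedtuple

# Re-created locally for illustration; field names follow the documentation.
DuplicateStatus = namedtuple("DuplicateStatus", ["is_dup", "status"])

status = DuplicateStatus(is_dup=True, status="Duplicate")

# Fields can be read by name, by position, or by unpacking.
print(status.is_dup, status[1])
is_dup, message = status
```

The same access patterns apply to any other named tuple with documented field positions.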
- class msticpy.sectools.vtlookup.VTLookup(vtkey: str, verbosity: int = 1)
Bases:
object
VTLookup: VirusTotal lookup of IoC reports.
Main methods are:
lookup_iocs() - accepts input of multiple IoCs in a Pandas DataFrame
lookup_ioc() - looks up a single IoC observable
supported_ioc_types - a list of valid target types
ioc_vt_type_mapping - a dictionary of mappings to recognized VT Types. Types mapped to None will not be submitted to VT.
For URLs a full HTTP request can be submitted; the query string and fragments will be dropped before submitting. For files, MD5, SHA1 and SHA256 hashes are supported. For IP addresses, only dotted IPv4 addresses are supported.
Create a new instance of VTLookup class.
- Parameters
vtkey (str) – VirusTotal API key
verbosity (int, optional) – The level of detail of reporting: 0 = no reporting, 1 = minimal reporting (default), 2 = verbose reporting
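The class description says query strings and fragments are dropped from URLs before submission. One way to perform that kind of trimming with the standard library is sketched below; `strip_query_and_fragment` is a hypothetical helper and may differ from what the class does internally.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query_and_fragment(url: str) -> str:
    """Return the URL with its query string and fragment removed."""
    scheme, netloc, path, _query, _fragment = urlsplit(url)
    return urlunsplit((scheme, netloc, path, "", ""))


print(strip_query_and_fragment("http://example.com/a/b.php?id=1&x=2#top"))
```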
- property ioc_vt_type_mapping: Dict[str, str]
Return mapping between internal and VirusTotal IoC type names.
- Returns
Return mapping between internal and VirusTotal IoC type names.
- Return type
Mapping[str, str]
- lookup_ioc(observable: str, ioc_type: str, output: str = 'dict') Any
Look up a single IoC observable.
- Parameters
observable (str) – The observable value
ioc_type (str) – The IoC Type (see ‘supported_ioc_types’ attribute)
output (str, optional) – Output results as a dictionary (or list of dicts) if output is any other value the result will be returned in a Pandas DataFrame (the default is ‘dict’)
- Returns
list{dict} (if output == ‘dict’)
pd.DataFrame (otherwise)
- Raises
KeyError – Unknown ioc_type
- lookup_iocs(data: pandas.core.frame.DataFrame, src_col: str = 'Observable', type_col: str = 'IoCType', src_index_col: str = 'SourceIndex', **kwargs) pandas.core.frame.DataFrame
Retrieve results for IoC observables in the source dataframe.
- Parameters
data (pd.DataFrame) – Dataframe containing the observables to search for
src_col (str, optional) – The column name that contains the observable data (one item per row) (the default is ‘Observable’)
type_col (str, optional) – The column name containing the observable type (the default is ‘IoCType’)
src_index_col (str, optional) – The name of the column to use as source index. If not specified this defaults to ‘SourceIndex’. If this (or the supplied value) is not in the source dataframe, the index of the source dataframe will be used. This is retained in the output so that you can join the results back to the original data. (the default is ‘SourceIndex’)
**kwargs – key/value pairs of additional mappings to supported IoC type names, e.g. ipv4='ipaddress', url='httprequest'. This allows you to specify custom mappings when the source data is tagged with different names.
- Returns
Combined results of local pre-processing and VirusTotal Lookups
- Return type
pd.DataFrame
- Raises
KeyError – Unknown ioc_type
Notes
See supported_ioc_types attribute for a list of valid target types. Not all of these types are supported by VirusTotal. See ioc_vt_type_mapping for current mappings. Types mapped to None will not be submitted to VT.
For URLs a full HTTP request can be submitted; the query string and fragments will be dropped before submitting. Other supported protocols are ftp, telnet, ldap and file. For files, MD5, SHA1 and SHA256 hashes are supported. For IP addresses, only dotted IPv4 addresses are supported.
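The custom type-name mappings accepted by lookup_iocs can be pictured as a small alias table. The sketch below is one reading of the documented keyword shape (supported type name as the key, the tag used in the source data as the value); `INTERNAL_TO_SERVICE` and `resolve_service_type` are hypothetical and the mapping values are invented.

```python
from typing import Dict

# Hypothetical internal-name to service-name table (illustration only).
INTERNAL_TO_SERVICE: Dict[str, str] = {
    "ipv4": "ip-address",
    "dns": "domain",
    "url": "url",
}


def resolve_service_type(tag: str, **type_aliases: str) -> str:
    """Map a source-data tag to a service type, honouring custom aliases."""
    # Invert the keyword arguments so that source tags point at internal names.
    alias_to_internal = {alias: internal for internal, alias in type_aliases.items()}
    internal = alias_to_internal.get(tag, tag)
    if internal not in INTERNAL_TO_SERVICE:
        raise KeyError(f"Unknown ioc_type: {tag}")
    return INTERNAL_TO_SERVICE[internal]


print(resolve_service_type("ipaddress", ipv4="ipaddress", url="httprequest"))
```

Tags that already use a supported name pass through unchanged, and anything unrecognised raises KeyError, in line with the documented exception.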
- property supported_ioc_types: List[str]
Return list of supported IoC type internal names.
- Returns
List of supported IoC type internal names.
- Return type
List[str]
- property supported_vt_types: List[str]
Return list of VirusTotal supported IoC type names.
- Returns
List of VirusTotal supported IoC type names.
- Return type
List[str]
- class msticpy.sectools.vtlookup.VTParams(api_type, batch_size, batch_delimiter, http_verb, api_var_name, headers)
Bases:
tuple
Create new instance of VTParams(api_type, batch_size, batch_delimiter, http_verb, api_var_name, headers)
- property api_type
Alias for field number 0
- property api_var_name
Alias for field number 4
- property batch_delimiter
Alias for field number 2
- property batch_size
Alias for field number 1
- count(value, /)
Return number of occurrences of value.
- property headers
Alias for field number 5
- property http_verb
Alias for field number 3
- index(value, start=0, stop=9223372036854775807, /)
Return first index of value.
Raises ValueError if the value is not present.
msticpy.sectools.vtlookupv3 module
VirusTotal v3 API.
- class msticpy.sectools.vtlookupv3.ColumnNames(value)
Bases:
enum.Enum
Column name enum for DataFrame output.
- DETECTIONS = 'detections'
- ID = 'id'
- RELATIONSHIP_TYPE = 'relationship_type'
- SCANS = 'scans'
- SOURCE = 'source'
- SOURCE_TYPE = 'source_type'
- TARGET = 'target'
- TARGET_TYPE = 'target_type'
- TYPE = 'type'
- exception msticpy.sectools.vtlookupv3.MsticpyVTGraphSaveGraphError
Bases:
Exception
Could not save VT Graph.
- args
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- exception msticpy.sectools.vtlookupv3.MsticpyVTNoDataError
Bases:
Exception
No data returned from VT API.
- args
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- class msticpy.sectools.vtlookupv3.VTEntityType(value)
Bases:
enum.Enum
VTEntityType: Enum class for VirusTotal entity types.
- DOMAIN = 'domain'
- FILE = 'file'
- IP_ADDRESS = 'ip_address'
- URL = 'url'
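Because the lookup methods below take the entity type as a string and document a KeyError for unknown values, it can help to validate the string up front. This sketch re-creates the enum locally from the members listed above; `check_vt_type` is a hypothetical helper, not a function of this module.

```python
from enum import Enum


class VTEntityType(Enum):
    """Local re-creation of the documented enum, for illustration."""

    DOMAIN = "domain"
    FILE = "file"
    IP_ADDRESS = "ip_address"
    URL = "url"


def check_vt_type(vt_type: str) -> VTEntityType:
    """Return the matching enum member or raise KeyError for unknown types."""
    try:
        return VTEntityType(vt_type)
    except ValueError as err:
        supported = ", ".join(member.value for member in VTEntityType)
        raise KeyError(f"Unknown vt_type '{vt_type}'; expected one of: {supported}") from err


print(check_vt_type("ip_address"))
```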
- class msticpy.sectools.vtlookupv3.VTLookupV3(vt_key: Optional[str] = None)
Bases:
object
VTLookupV3: VirusTotal lookup of IoC reports.
Create a new instance of VTLookupV3 class.
- Parameters
vt_key (str, optional) – VirusTotal API key, if not supplied, this is read from user configuration.
- create_vt_graph(relationship_dfs: List[pandas.core.frame.DataFrame], name: str, private: bool) str
Create a VirusTotal Graph with a set of Relationship DataFrames.
- Parameters
relationship_dfs – List of Relationship DataFrames
name – New graph name
private – Indicates if the Graph is private or not.
- Returns
Graph ID
- Return type
str
- Raises
ValueError – when private is not indicated.
ValueError – when there are no relationship DataFrames.
MsticpyVTGraphSaveGraphError – when the Graph cannot be saved.
- get_object(vt_id: str, vt_type: str) pandas.core.frame.DataFrame
Return the full VT object as a DataFrame.
- Parameters
vt_id (str) – The ID of the object
vt_type (str) – The type of object to query.
- Returns
Single column DataFrame with attribute names as index and values as data column.
- Return type
pd.DataFrame
- Raises
KeyError – Unrecognized VT Type
MsticpyVTNoDataError – Error requesting data from VT.
- lookup_ioc(observable: str, vt_type: str) pandas.core.frame.DataFrame
Look up a single IoC observable.
- Parameters
observable (str) – The observable value
vt_type (str) – The VT entity type
- Returns
Attributes Pandas DataFrame with the properties of the entity
- Return type
pd.DataFrame
- Raises
KeyError – Unknown vt_type
- lookup_ioc_relationships(observable: str, vt_type: str, relationship: str, limit: Optional[int] = None) pandas.core.frame.DataFrame
Look up the relationships of a single IoC observable.
- Parameters
observable (str) – The observable value
vt_type (str) – The VT entity type
relationship (str) – Desired relationship
limit (int) – Relations limit
- Returns
Relationship Pandas DataFrame with the relationships of the entity
- Return type
pd.DataFrame
- lookup_iocs(observables_df: pandas.core.frame.DataFrame, observable_column: str = 'target', observable_type_column: str = 'target_type')
Look up multiple IoC observables.
- Parameters
observables_df (pd.DataFrame) – A Pandas DataFrame, where each row is an observable
observable_column – ID column of each observable
observable_type_column – Type column of each observable
- Returns
Attributes Pandas DataFrame with the properties of the entities
- Return type
pd.DataFrame
- lookup_iocs_relationships(observables_df: pandas.core.frame.DataFrame, relationship: str, observable_column: str = 'target', observable_type_column: str = 'target_type', limit: Optional[int] = None) pandas.core.frame.DataFrame
Look up the relationships of multiple IoC observables.
- Parameters
observables_df (pd.DataFrame) – A Pandas DataFrame, where each row is an observable
relationship (str) – Desired relationship
observable_column – ID column of each observable
observable_type_column – Type column of each observable.
limit (int) – Relations limit
- Returns
Relationship Pandas DataFrame with the relationships of each observable.
- Return type
pd.DataFrame
- static render_vt_graph(graph_id: str, width: int = 800, height: int = 600)
Display a VTGraph in a Jupyter Notebook.
- Parameters
graph_id – Graph ID
width – Graph width.
height – Graph height
- property supported_vt_types: List[str]
Return list of VirusTotal supported IoC type names.
- Returns
List of VirusTotal supported IoC type names.
- Return type
List[str]