Adding Pivot functions
See Pivot Functions for more details on use of pivot functions.
The pivot library supports adding functions as pivot functions from any importable Python library. Not all functions will be wrappable. Currently Pivot supports functions that take input parameters as either scalar values (I’m including strings in this although that isn’t exactly correct), iterables or DataFrames with column specifications.
You can create a persistent pivot function definition (that can be loaded from a yaml file) or an ad hoc definition that you can create in code. The next two sections describe creating a persistent function definition. Programmatically creating pivots follows this.
Information needed to define a pivot function
If you have a library function that you want to expose as a pivot function you need to gather a bit of information about it.
This table describes the configuration parameters needed to create a pivot function (most are optional).
Item |
Description |
Required |
Default |
---|---|---|---|
src_module |
The src_module to containing the class or function |
Yes |
|
class |
The class containing function |
No |
|
src_func_name |
The name of the function to wrap |
Yes |
|
func_new_name |
Rename the function |
No |
|
input type |
The input type that the wrapped function expects (DataFrame iterable value) |
Yes |
|
entity_map |
Mapping of entity and attribute used for function |
Yes |
|
func_df_param_name |
The param name that the function uses as input param for DataFrame |
If DF input |
|
func_df_col_param_name |
The param name that function uses to identify the input column name |
If DF input |
|
func_out_column_name |
Name of the column in the output DF to use as a key to join |
If DF output |
|
func_static_params |
dict of static name/value params always sent to the function |
No |
|
func_input_value_arg |
Name of the param that the wrapped function uses for its input value |
If not DF input |
|
can_iterate |
True if the function supports being called multiple times |
No |
Yes |
entity_container_name |
The name of the container in the entity where the func will appear |
No |
custom |
The entity_map
item specifies which entity or entities the pivot function
will be added to. Each
entry requires an Entity name (see
entities
) and an
entity attribute name. The attribute name is only used if you want to
use an instance of the entity as a parameter to the function.
If you don’t care about this you can pick any attribute.
For IpAddress
in the (partial) example
below, the pivot function will try to extract the value of the
Address
attribute when an instance of IpAddress is used as a
function parameter.
...
entity_map:
IpAddress: Address
Host: HostName
Account: Name
...
This means that you can specify different attributes of the same entity for different functions (or even for two instances of the same function)
The func_df_param_name
and func_df_col_param_name
are needed
only if the source function (i.e. the function to be wrapped as a pivot
function) takes a DataFrame and column name as input
parameters.
func_out_column_name
is needed if the source function returns a
DataFrame. In order to join input data with output data this needs to be
the column in the output that has the same value as the function input
For example, if your function processes IP addresses and returns the IP
in a column named “ip_addr”, put “ip_addr” as the value of func_out_column_name
.
Adding ad hoc pivot functions in code
You can also add ad hoc functions as pivot functions in code.
To do this use the Pivot method
add_pivot_function
You can supply the pivot registration parameters as keyword arguments to this function:
def my_func(input: str):
return input.upper()
Pivot.add_pivot_function(
func=my_func,
container="change_case",
input_type="value",
entity_map={"Host": "HostName"},
func_input_value_arg="input",
func_new_name="upper_name",
)
Alternatively, you can create a
PivotRegistration
object and supply that (along with the func
parameter), to
add_pivot_function
.
from msticpy.init.pivot_core.pivot_register import PivotRegistration
def my_func(input: str):
return input.upper()
piv_reg = PivotRegistration(
input_type="value",
entity_map={"Host": "HostName"},
func_input_value_arg="input",
func_new_name="upper_name"
)
Pivot.add_pivot_function(my_func, piv_reg, container="change_case")
The function parameters and PivotRegistration attributes are described in the previous section Information needed to define a pivot function.
Creating a persistent pivot function definition
You can also add pivot definitions to yaml files and load your pivots from this.
The top-level element of the file is pivot_providers
.
Example from the MSTICPy ip_utils who_is
function
pivot_providers:
...
who_is:
src_module: msticpy.context.ip_utils
src_func_name: get_whois_df
func_new_name: whois
input_type: dataframe
entity_map:
IpAddress: Address
func_df_param_name: data
func_df_col_param_name: ip_column
func_out_column_name: query
func_static_params:
all_columns: True
show_progress: False
func_input_value_arg: ip_address
Note
the library also support creating pivots from ad hoc functions created in the current notebook (see below).
You can also put this function into a Python module.
If your module is in the current directory and is called
my_new_module
, the value you specify for
src_module
will be “my_new_module”.
Once you have your yaml definition file you can call
register_pivot_providers
Pivot.register_pivot_providers(
pivot_reg_path=path_to_your_yaml,
namespace=globals(),
def_container="my_container",
force_container=True
)
Warning
The pivot functions created will not persist
across notebook sessions. You will need to call
register_pivot_providers
with your yaml files each
time you start a new session. Currently there is no option
to automate this via msticpyconfig.yaml but this would be
easy to add - please open an issue in
the MSTICPy repo
if you would like to see this.
Creating and deleting shortcut pivot functions
If you are adding pivot functions of your own, you can add shortcuts (i.e. direct methods of the entity, rather than methods in sub-containers) to those functions.
Every entity class has the class method
make_pivot_shortcut
.
You can use this to add a shortcut to an existing pivot function on that
entity. Note that you must call this method on the entity class and not
on an instance of that Entity.
The parameters that you must supply are func_name
and target
. The former
is the relative path to the pivot function that you want to make the shortcut
to, e.g. for IpAddress.util.whois
you would use the string “util.whois”.
target
is the string name that you want the shortcut function to be called.
This should be a valid Python identifier - a string starting with a letter or
underscore, followed by any combination of letters, digits and underscores. If
you supply a string that is not a valid identifier, the function will try to
transform it into one.
>>> IpAddress.make_pivot_shortcut(func_name="util.whois", target="my_whois")
>>> IpAddress.my_whois("157.53.1.1")
ip_column AsnDescription whois_result
157.53.1.1 NA {'nir': None, 'asn_registry': 'arin', ...
If the shortcut function already exists, you will get an error (AttributeError).
You can force overwriting of an existing shortcut by adding overwrite=True
.
To delete a shortcut use
del_pivot_shortcut
,
giving the single parameter func_name
with the name of the shortcut function
you want to remove.
Removing pivot functions from an entity or all entities
Although not a common operation you can remove all pivot functions from an entity or from all entities.
See
remove_pivot_funcs
for more details.