msticpy.init.pivot_core.pivot_pipeline module

Pivot pipeline class.

class msticpy.init.pivot_core.pivot_pipeline.Pipeline(name, description=None, steps=None)

Bases: object

Pivot pipeline.

Create Pipeline instance.

Parameters:

name (str) – The pipeline name.
description (Optional[str]) – The pipeline description, by default None.
steps (Optional[Iterable[PipelineStep]]) – Pipeline steps, by default None.

classmethod from_yaml(yml_str)

Parse pipelines from a YAML string.

Parameters: yml_str (str) – YAML dict of pipelines.
Yields: Pipeline – Iterable of pipeline instances.
Return type: Iterable[Pipeline]

classmethod parse_pipeline(pipeline)

Parse single pipeline from dictionary.

Parameters: pipeline (Dict[str, Dict[str, Any]]) – Single pipeline as a dictionary: {name: {pipeline_dict…}}.
Returns: The pivot pipeline.
Return type: Pipeline
Raises: ValueError – The dictionary could not be parsed as a pipeline.
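
The `{name: {pipeline_dict…}}` shape can be illustrated with plain dictionaries. The following is a minimal sketch, not msticpy code: the inner keys are assumed to mirror the Pipeline and PipelineStep parameter names listed in this reference, the values (including the `"pandas"` step type) are made up for illustration, and `split_pipelines` is a hypothetical helper that shows how a dict of several pipelines relates to the single-pipeline dictionaries this method expects.

```python
from typing import Any, Dict, Iterator

# Hypothetical pipeline definitions. Key names are assumed from the
# Pipeline and PipelineStep parameters in this reference; values are invented.
pipelines: Dict[str, Dict[str, Any]] = {
    "pipeline1": {
        "description": "Example pipeline",
        "steps": [
            {
                "name": "sort_rows",
                "step_type": "pandas",
                "function": "sort_values",
                "pos_params": ["TimeGenerated"],
                "params": {"ascending": False},
            },
        ],
    },
    "pipeline2": {"description": "Empty pipeline", "steps": []},
}


def split_pipelines(
    pipes: Dict[str, Dict[str, Any]],
) -> Iterator[Dict[str, Dict[str, Any]]]:
    """Yield one {name: definition} dict per pipeline (hypothetical helper)."""
    for name, definition in pipes.items():
        yield {name: definition}


# Each item has the single-pipeline shape {name: {pipeline_dict...}}.
single = list(split_pipelines(pipelines))
```

Each element of `single` has exactly one top-level key, the pipeline name, which is the shape described for the `pipeline` parameter.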

static parse_pipelines(pipelines)

Parse dict of pipelines.

Parameters: pipelines (Dict[str, Dict[str, Any]]) – Dict of pipelines.
Yields: Pipeline – Iterable of pipeline instances.
Return type: Iterable[Pipeline]

print_pipeline(df_name='input_df', comments=True)

Return the pipeline as text that can be executed in Python.

Parameters:

df_name (str, optional) – Name of the input DataFrame to use in the returned code, by default “input_df”.
comments (bool, optional) – If True, show step comments, by default True.

Returns:

The executable pipeline text.

Return type:

str
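
To make the roles of `df_name` and `comments` concrete, here is a rough sketch of how a chain of steps could be rendered as executable text. It is not msticpy's implementation, and the exact layout that `print_pipeline` emits is not shown in this reference; the parenthesized method-chain layout, the `render_pipeline` function, and the `(text, comment)` step tuples are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def render_pipeline(
    steps: List[Tuple[str, Optional[str]]],
    df_name: str = "input_df",
    comments: bool = True,
) -> str:
    """Render (text, comment) pairs as one parenthesized method chain."""
    lines = ["(", f"    {df_name}"]
    for text, comment in steps:
        # Emit the step comment only when requested and present.
        if comments and comment:
            lines.append(f"    # {comment}")
        lines.append(f"    {text}")
    lines.append(")")
    return "\n".join(lines)


# Hypothetical steps: each is a text fragment plus an optional comment.
example_steps = [
    (".sort_values('TimeGenerated')", "Order by time"),
    (".head(5)", None),
]
code_with_comments = render_pipeline(example_steps, df_name="my_df")
code_without_comments = render_pipeline(example_steps, comments=False)
print(code_with_comments)
```

Because the chain is wrapped in parentheses, the rendered text remains a valid Python expression whether or not the comment lines are included.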

run(data, verbose=True, debug=False)

Run the pipeline on the supplied DataFrame.

Parameters:

data (pd.DataFrame) – Input DataFrame for pipeline
verbose (bool, optional) – If True, report progress, by default True
debug (bool, optional) – If True, report more detailed progress, by default False

Returns:

The output of the last stage of the pipeline

Return type:

Any
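
The return type is Any because the result is whatever the final step produces, which need not be a DataFrame. The sketch below shows that idea only, using plain Python callables and lists in place of pipeline steps and DataFrames; `run_steps` is a hypothetical function and does not reflect how msticpy executes steps internally.

```python
from functools import reduce
from typing import Any, Callable, Iterable


def run_steps(data: Any, steps: Iterable[Callable[[Any], Any]]) -> Any:
    """Feed data through each step in order and return the last output."""
    return reduce(lambda current, step: step(current), steps, data)


# Stand-in data: a list of dicts instead of a DataFrame.
rows = [{"ip": "10.0.0.2", "hits": 3}, {"ip": "10.0.0.1", "hits": 7}]

# Two hypothetical steps: sort by hits (descending), then pick the top IP.
steps = [
    lambda rs: sorted(rs, key=lambda r: r["hits"], reverse=True),
    lambda rs: rs[0]["ip"],
]

# The result has the type returned by the final step (here, a str).
result = run_steps(rows, steps)
```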

to_yaml()

Return YAML representation of pipeline.

Returns: Pipeline as YAML.
Return type: str

class msticpy.init.pivot_core.pivot_pipeline.PipelineExecStep(accessor, pos_params, params, text, comment, step_type)

Bases: tuple

Create new instance of PipelineExecStep(accessor, pos_params, params, text, comment, step_type)

accessor: Alias for field number 0

comment: Alias for field number 4

count(value, /): Return number of occurrences of value.

index(value, start=0, stop=9223372036854775807, /)

Return first index of value.

Raises ValueError if the value is not present.

params: Alias for field number 2

pos_params: Alias for field number 1

step_type: Alias for field number 5

text: Alias for field number 3

class msticpy.init.pivot_core.pivot_pipeline.PipelineStep(name, step_type, function=None, entity=None, comment=None, pos_params=NOTHING, params=NOTHING)

Bases: object

Pivot pipeline step class.

Method generated by attrs for class PipelineStep.

Parameters:

name (str)
step_type (str)
function (str | None)
entity (str | None)
comment (str | None)
pos_params (list[str])
params (dict[str, Any])

comment: str | None

entity: str | None

function: str | None

get_exec_step()

Return the executable step details.

Returns: Named tuple with the following fields:

accessor – the name of the pandas DataFrame accessor function
pos_params – positional parameters to be passed to the function
params – keyword parameters to be passed to the function
text – the text representation of the accessor + params
comment – optional comment that the pipeline builder can use to add Python comments to the output
step_type – the type of pipeline step

Return type: PipelineExecStep

name: str

params: dict[str, Any]

pos_params: list[str]

step_type: str