msticpy.init.pivot_core.pivot_pipeline module

Pivot pipeline class.

class msticpy.init.pivot_core.pivot_pipeline.Pipeline(name, description=None, steps=None)

Bases: object

Pivot pipeline.

Create Pipeline instance.

Parameters:
  • name (str) – The pipeline name.

  • description (Optional[str]) – The pipeline description, by default None.

  • steps (Optional[Iterable[PipelineStep]]) – Pipeline steps, by default None.

classmethod from_yaml(yml_str)

Parse pipelines from a YAML string.

Parameters:

yml_str (str) – YAML string containing a dictionary of pipelines.

Yields:

Pipeline – The parsed pipeline instances, one at a time.

Return type:

Iterable[Pipeline]

classmethod parse_pipeline(pipeline)

Parse a single pipeline from a dictionary.

Parameters:

pipeline (Dict[str, Dict[str, Any]]) – Single pipeline as a dictionary: {name: {pipeline_dict…}}.

Returns:

The pivot pipeline.

Return type:

Pipeline

Raises:

ValueError – The dictionary could not be parsed as a pipeline.
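The `{name: {pipeline_dict…}}` shape can be illustrated with a stand-alone sketch. It does not import msticpy: `parse_single_pipeline` is a hypothetical helper, and the `description` and `steps` keys are assumed to mirror the `Pipeline` constructor parameters rather than being taken from the library source. The step values are illustrative only.

```python
from typing import Any, Dict, List, Tuple


def parse_single_pipeline(
    pipeline: Dict[str, Dict[str, Any]]
) -> Tuple[str, Any, List[Dict[str, Any]]]:
    """Hypothetical stand-in: unpack a {name: {...}} pipeline dictionary."""
    # The documented shape is a dictionary with exactly one top-level key.
    if not isinstance(pipeline, dict) or len(pipeline) != 1:
        raise ValueError("Expected a dictionary with a single pipeline name.")
    name, body = next(iter(pipeline.items()))
    if not isinstance(body, dict):
        raise ValueError("Pipeline body must be a dictionary.")
    # Assumed keys, mirroring the Pipeline(name, description, steps) signature.
    return name, body.get("description"), list(body.get("steps", []))


example = {
    "whois_lookup": {
        "description": "Look up IP ownership",
        "steps": [{"name": "whois", "step_type": "pivot"}],
    }
}
name, description, steps = parse_single_pipeline(example)
```

In this sketch, a dictionary with more than one top-level key is rejected with `ValueError`, which matches the documented failure mode for input that cannot be parsed as a single pipeline.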

static parse_pipelines(pipelines)

Parse a dictionary of pipelines.

Parameters:

pipelines (Dict[str, Dict[str, Any]]) – Dictionary of pipelines.

Yields:

Pipeline – The parsed pipeline instances, one at a time.

Return type:

Iterable[Pipeline]

print_pipeline(df_name='input_df', comments=True)

Return the pipeline as text that can be executed in Python.

Parameters:
  • df_name (str, optional) – Name of the input DataFrame to use in the returned code, by default "input_df".

  • comments (bool, optional) – If True, include step comments, by default True.

Returns:

The executable pipeline text.

Return type:

str
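As a rough illustration of what "text that can be executed in Python" might look like, the sketch below joins per-step text into one chained expression on the named DataFrame. It is not msticpy's implementation: `ExecStep` and `render_pipeline` are hypothetical names, and the exact output layout is an assumption.

```python
from typing import List, NamedTuple, Optional


class ExecStep(NamedTuple):
    """Hypothetical stand-in carrying only the fields this sketch needs."""

    text: str
    comment: Optional[str] = None


def render_pipeline(
    steps: List[ExecStep], df_name: str = "input_df", comments: bool = True
) -> str:
    """Join step text into one parenthesized, chained expression."""
    lines = ["(", f"    {df_name}"]
    for step in steps:
        # Emit the step comment as a Python comment when requested.
        if comments and step.comment:
            lines.append(f"    # {step.comment}")
        lines.append(f"    .{step.text}")
    lines.append(")")
    return "\n".join(lines)


steps = [
    ExecStep("head(5)", "Keep the first five rows"),
    ExecStep("reset_index(drop=True)"),
]
code_text = render_pipeline(steps, df_name="my_df")
print(code_text)
```

Wrapping the chain in parentheses lets each step sit on its own line, with comments between steps, while the whole text remains a single valid Python expression.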

run(data, verbose=True, debug=False)

Run the pipeline on the supplied DataFrame.

Parameters:
  • data (pd.DataFrame) – Input DataFrame for the pipeline.

  • verbose (bool, optional) – If True, report progress, by default True

  • debug (bool, optional) – If True, report more detailed progress, by default False

Returns:

The output of the last step of the pipeline.

Return type:

Any

to_yaml()

Return the YAML representation of the pipeline.

Returns:

Pipeline as YAML.

Return type:

str

class msticpy.init.pivot_core.pivot_pipeline.PipelineExecStep(accessor, pos_params, params, text, comment, step_type)

Bases: tuple

Create new instance of PipelineExecStep(accessor, pos_params, params, text, comment, step_type)

accessor

Alias for field number 0

comment

Alias for field number 4

count(value, /)

Return number of occurrences of value.

index(value, start=0, stop=9223372036854775807, /)

Return first index of value.

Raises ValueError if the value is not present.

params

Alias for field number 2

pos_params

Alias for field number 1

step_type

Alias for field number 5

text

Alias for field number 3
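Because `PipelineExecStep` is a named tuple, each field can be read by name or by the position given in its "Alias for field number" entry. The sketch below uses a stand-in built with `collections.namedtuple` that copies the documented field order; `ExecStepSketch` and all field values are illustrative and are not taken from msticpy.

```python
from collections import namedtuple

# Stand-in with the same field order as the documented PipelineExecStep.
ExecStepSketch = namedtuple(
    "ExecStepSketch",
    ["accessor", "pos_params", "params", "text", "comment", "step_type"],
)

step = ExecStepSketch(
    accessor="head",
    pos_params=["5"],
    params={},
    text="head(5)",
    comment=None,
    step_type="pd_accessor",
)

# Named access and positional access return the same values.
same_accessor = step.accessor == step[0]
same_step_type = step.step_type == step[5]

# The inherited tuple methods search the field values.
text_position = step.index("head(5)")
none_count = step.count(None)

# A named tuple also unpacks like an ordinary tuple.
accessor, pos_params, params, text, comment, step_type = step
```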

class msticpy.init.pivot_core.pivot_pipeline.PipelineStep(name, step_type, function=None, entity=None, comment=None, pos_params=NOTHING, params=NOTHING)

Bases: object

Pivot pipeline step class.

Method generated by attrs for class PipelineStep.

Parameters:
  • name (str)

  • step_type (str)

  • function (str | None)

  • entity (str | None)

  • comment (str | None)

  • pos_params (list[str])

  • params (dict[str, Any])

comment: str | None
entity: str | None
function: str | None

get_exec_step()

Return the executable step details.

Returns:

Named tuple with the following fields:
  • accessor – The name of the pandas DataFrame accessor function.

  • pos_params – Positional parameters to be passed to the function.

  • params – Keyword parameters to be passed to the function.

  • text – The text representation of the accessor and its parameters.

  • comment – Optional comment that the pipeline builder can use to add Python comments to the output.

  • step_type – The type of pipeline step.

Return type:

PipelineExecStep

name: str
params: dict[str, Any]
pos_params: list[str]
step_type: str