msticpy.init.pivot_core.pivot_pipeline module

Pivot pipeline class.

class msticpy.init.pivot_core.pivot_pipeline.Pipeline(name: str, description: str | None = None, steps: Iterable[PipelineStep] | None = None)

Bases: object

Pivot pipeline.

Create Pipeline instance.

Parameters:
  • name (str) – The pipeline name.

  • description (Optional[str]) – The pipeline description, by default None.

  • steps (Optional[Iterable[PipelineStep]]) – Pipeline steps, by default None.

classmethod from_yaml(yml_str: str) → Iterable[Pipeline]

Parse pipelines from a YAML string.

Parameters:

yml_str (str) – YAML string containing a dictionary of pipelines.

Yields:

Pipeline – Each pipeline instance parsed from the YAML string.

classmethod parse_pipeline(pipeline: Dict[str, Dict[str, Any]]) → Pipeline

Parse single pipeline from dictionary.

Parameters:

pipeline (Dict[str, Dict[str, Any]]) – Single pipeline as a dictionary: {name: {pipeline_dict…}}.

Returns:

The pivot pipeline.

Return type:

Pipeline

Raises:

ValueError – The dictionary could not be parsed as a pipeline.
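The {name: {pipeline_dict…}} shape described above can be sketched as a plain Python dictionary. This is an illustration only: the inner key names (description, steps, and the per-step keys) are assumed to mirror the Pipeline constructor arguments and the PipelineStep attributes, all values are hypothetical, and check_single_pipeline is a made-up helper rather than msticpy's own validation.

```python
from typing import Any, Dict

# Hypothetical pipeline definition in the {name: {pipeline_dict}} shape.
# Key names are assumed to mirror the documented constructor arguments.
pipeline_def: Dict[str, Dict[str, Any]] = {
    "whois_lookup": {
        "description": "Enrich IP addresses with WHOIS data",
        "steps": [
            {
                "name": "get_whois",
                "step_type": "pivot",
                "function": "util.whois",
                "entity": "IpAddress",
                "comment": "Standard pivot function",
                "params": {"column": "IpAddress", "join": "inner"},
            },
        ],
    }
}


def check_single_pipeline(pipeline: Dict[str, Dict[str, Any]]) -> str:
    """Return the pipeline name, or raise ValueError if the shape is wrong.

    Illustrative helper only; not part of msticpy.
    """
    if not isinstance(pipeline, dict) or len(pipeline) != 1:
        raise ValueError("Expected exactly one {name: {...}} entry.")
    name, body = next(iter(pipeline.items()))
    if not isinstance(body, dict) or "steps" not in body:
        raise ValueError(f"Pipeline '{name}' has no 'steps' key.")
    return name


pipeline_name = check_single_pipeline(pipeline_def)
```

A dictionary of this shape is what would be passed as the pipeline argument; a dictionary holding several such entries corresponds to the input of parse_pipelines.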

static parse_pipelines(pipelines: Dict[str, Dict[str, Any]]) → Iterable[Pipeline]

Parse dict of pipelines.

Parameters:

pipelines (Dict[str, Dict[str, Any]]) – Dict of pipelines.

Yields:

Pipeline – Each pipeline instance parsed from the dictionary.

print_pipeline(df_name: str = 'input_df', comments: bool = True) → str

Return the pipeline as text that can be executed in Python.

Parameters:
  • df_name (str, optional) – Name of the input dataframe to be used in the returned code, by default “input_df”

  • comments (bool, optional) – If True show step comments, by default True

Returns:

The executable pipeline text.

Return type:

str
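To illustrate how df_name and comments could shape the returned text, here is a minimal sketch that chains step text onto a DataFrame name. ExecStep and render_pipeline are hypothetical stand-ins written for this example; the exact layout that print_pipeline produces is not specified here and may differ.

```python
from typing import List, NamedTuple, Optional


class ExecStep(NamedTuple):
    """Hypothetical stand-in holding only the fields used for rendering."""

    text: str
    comment: Optional[str]


def render_pipeline(
    steps: List[ExecStep], df_name: str = "input_df", comments: bool = True
) -> str:
    """Join step text into one parenthesized, chained Python expression."""
    lines = ["(", f"    {df_name}"]
    for step in steps:
        # Emit the step comment as a Python comment when requested.
        if comments and step.comment:
            lines.append(f"    # {step.comment}")
        lines.append(f"    {step.text}")
    lines.append(")")
    return "\n".join(lines)


steps = [
    ExecStep(text=".query('Score > 5')", comment="Keep high scores"),
    ExecStep(text=".head(10)", comment=None),
]
code = render_pipeline(steps, df_name="my_df")
print(code)
```

Because the chained calls sit inside parentheses, the comment lines are legal Python and the rendered text can be pasted into a notebook cell where a DataFrame with the chosen name exists.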

run(data: DataFrame, verbose: bool = True, debug: bool = False) → Any | None

Run the pipeline on the supplied DataFrame.

Parameters:
  • data (pd.DataFrame) – Input DataFrame for pipeline

  • verbose (bool, optional) – If True, report progress, by default True

  • debug (bool, optional) – If True, report more detailed progress, by default False

Returns:

The output of the last stage of the pipeline

Return type:

Any

to_yaml() → str

Return the YAML representation of the pipeline.

Returns:

The pipeline as YAML.

Return type:

str

class msticpy.init.pivot_core.pivot_pipeline.PipelineExecStep(accessor, pos_params, params, text, comment, step_type)

Bases: tuple

Create new instance of PipelineExecStep(accessor, pos_params, params, text, comment, step_type)

accessor

Alias for field number 0

comment

Alias for field number 4

count(value, /)

Return number of occurrences of value.

index(value, start=0, stop=9223372036854775807, /)

Return first index of value.

Raises ValueError if the value is not present.

params

Alias for field number 2

pos_params

Alias for field number 1

step_type

Alias for field number 5

text

Alias for field number 3
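The field aliases and inherited tuple methods listed above behave like those of any named tuple. The sketch below uses a stand-in built with collections.namedtuple in the documented field order, not the class imported from msticpy, and every field value is hypothetical.

```python
from collections import namedtuple

# Stand-in with the same field order as the documented signature.
# It is not imported from msticpy.
PipelineExecStep = namedtuple(
    "PipelineExecStep",
    ["accessor", "pos_params", "params", "text", "comment", "step_type"],
)

step = PipelineExecStep(
    accessor="mp_pivot.run",  # hypothetical values throughout
    pos_params=["IpAddress.util.whois"],
    params={"column": "IpAddress"},
    text=".mp_pivot.run(IpAddress.util.whois, column='IpAddress')",
    comment="Look up WHOIS data",
    step_type="pivot",
)

# Named access and positional access refer to the same fields.
assert step.accessor == step[0]
assert step.comment == step[4]

# Inherited tuple methods operate on the field values.
position = step.index("pivot")  # first field whose value equals "pivot"
occurrences = step.count("pivot")  # number of fields equal to "pivot"
```

Here index and count compare whole field values, so the string "pivot" matches only the step_type field and not the accessor value that merely contains it.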

class msticpy.init.pivot_core.pivot_pipeline.PipelineStep(name: str, step_type: str, function: str | None = None, entity: str | None = None, comment: str | None = None, pos_params: List[str] = NOTHING, params: Dict[str, Any] = NOTHING)

Bases: object

Pivot pipeline step class.

Method generated by attrs for class PipelineStep.

comment: str | None
entity: str | None
function: str | None
get_exec_step() → PipelineExecStep

Return the executable step details.

Returns:

Named tuple with the following fields:
  • accessor – the name of the pandas DataFrame accessor function

  • pos_params – positional parameters to be passed to the function

  • params – parameters to be passed to the function

  • text – the text representation of the accessor + params

  • comment – optional comment that the pipeline builder can use to add Python comments to the output

  • step_type – the type of pipeline step

Return type:

PipelineExecStep

name: str
params: Dict[str, Any]
pos_params: List[str]
step_type: str