Kedro Aim

📝 Description

kedro-aim is a Kedro plugin that enables tracking of metrics and parameters with Aim from within Kedro. Kedro is a great tool for data engineering and data science, but it lacks a clear way to track metrics and parameters. Aim is a great tool for tracking metrics and parameters, but it lacks a clear way to integrate with Kedro. This plugin aims to solve both problems.

🎖 Features

Automatic registration of the Aim Run in the DataCatalog
Tracking of artifacts with the Aim DataSet
Configuration via aim.yml

⚙️ Installation

Install the package with pip:

pip install kedro-aim

💡 Usage Examples

The plugin automatically registers a Run instance in the DataCatalog under the name run, where all nodes can access it to log metrics and parameters. This Run instance can be used to track metrics and parameters in the same way as in any other Python project.

First, you need to initialize the aim.yml config file inside your pre-existing Kedro project. This can be done by running the following command:

kedro aim init

To use Aim inside a node, pass the run object as an argument of the node function. Inside the function, you can then use the run object to log metrics and parameters.

# nodes.py
import pandas as pd
from aim import Run


def logging_in_node(run: Run, data: pd.DataFrame) -> None:
    # track metric
    run.track(0.5, "score")

    # track parameter
    run["parameter"] = "abc"
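
Because the node only calls run.track and item assignment on the object it receives, its logging behaviour can be checked without starting a real Aim run. The following is a minimal sketch, assuming a hypothetical FakeRun stand-in that is not part of Aim or kedro-aim and only records what the node logs. The type hints are loosened so the sketch needs no third-party imports.

```python
from typing import Any, Dict, List, Tuple


class FakeRun:
    """Hypothetical stand-in for aim.Run; records calls instead of writing to an Aim repo."""

    def __init__(self) -> None:
        self.tracked: List[Tuple[Any, str]] = []
        self.params: Dict[str, Any] = {}

    def track(self, value: Any, name: str) -> None:
        # Mirrors the positional call style used in the node: run.track(value, name)
        self.tracked.append((value, name))

    def __setitem__(self, key: str, value: Any) -> None:
        # Mirrors parameter logging: run["parameter"] = "abc"
        self.params[key] = value


def logging_in_node(run: Any, data: Any) -> None:
    # Same body as the node above, with loosened type hints
    run.track(0.5, "score")
    run["parameter"] = "abc"


fake_run = FakeRun()
logging_in_node(fake_run, data=None)
print(fake_run.tracked)
print(fake_run.params)
```

In a test suite, the two recorded attributes can be compared against the values the node is expected to log.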

When defining the pipeline, pass the run dataset as an input to the node. kedro-aim creates the run dataset automatically and adds it to the DataCatalog, so Kedro passes it to the node as an argument.

# pipeline.py
from kedro.pipeline import node, Pipeline
from kedro.pipeline.modular_pipeline import pipeline

# assumes nodes.py sits in the same package as pipeline.py
from .nodes import logging_in_node


def create_pipeline(**kwargs) -> Pipeline:
    return pipeline(
        [
            node(
                func=logging_in_node,
                inputs=["run", "input_data"],
                outputs=None,
                name="logging_in_node",
            )
        ]
    )

🧰 Config File

The plugin is configured via the aim.yml file, which should be placed inside the conf/base folder. A default config file can be generated by running kedro aim init from the shell.

You can enable schema validation in VS Code to get real-time validation, autocompletion, and inline information about the different fields in your aim.yml as you write it. To do so, make sure the YAML extension is installed, then add the following to your settings.json file:

{
  "yaml.schemas": {
    "https://raw.githubusercontent.com/AnH0ang/kedro-aim/master/static/jsonschema/kedro_aim_schema.json": "**/*aim*.yml"
  }
}
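
If the schema mapping should be added from a script rather than by hand, the entry above can be merged into an existing settings file. The following is a minimal sketch, assuming the settings file is plain JSON without comments (VS Code tolerates comments in settings.json, but Python's json module does not). The helper name add_aim_schema is hypothetical and not part of kedro-aim.

```python
import json
from pathlib import Path

# URL and glob are taken from the settings.json snippet above
SCHEMA_URL = (
    "https://raw.githubusercontent.com/AnH0ang/kedro-aim/"
    "master/static/jsonschema/kedro_aim_schema.json"
)
FILE_GLOB = "**/*aim*.yml"


def add_aim_schema(settings_path: Path) -> dict:
    """Merge the kedro-aim schema mapping into a settings.json file (hypothetical helper)."""
    settings = {}
    if settings_path.exists():
        settings = json.loads(settings_path.read_text(encoding="utf-8"))
    # Keep any schema mappings that are already configured
    settings.setdefault("yaml.schemas", {})[SCHEMA_URL] = FILE_GLOB
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

Calling add_aim_schema(Path(".vscode/settings.json")) from the project root would update the workspace settings while leaving other keys in the file untouched.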

🙏 Acknowledgement

This project was inspired by kedro-mlflow, a Kedro plugin that enables tracking of metrics and parameters with MLflow from within Kedro.
