MLflow utils
============

These MLflow utils are used when we run our workflow jobs in Databricks. They only work on a Databricks ML runtime cluster.

ML runtime cluster
------------------

Please use only the `11.3` ML runtime cluster. Our MLflow version is tied to `1.29.0`, which is the version that ships with the 11.3 cluster.

.. image:: ../_static/mlflow_databricks_runtime_version.png
   :align: center

MLflow load model
-----------------

This loads an ML model from a list of allowed model types.

.. code-block:: python

    from hip_data_ml_utils.core.config import settings

    # dictionary of the allowed model types
    settings.model_func_dict

.. image:: ../_static/allowed_model_types.png
   :align: center

Here is how we call the load model function. A successful call returns the loaded model; otherwise an exception is raised.

.. code-block:: python

    model = mlflow_load_model(
        model_uri="models:/hackathon-model-l2r/Production",
        type_of_model="sk_model",
        model_func_dict=settings.model_func_dict,
    )

Once the model is loaded, we can call the functions that its model type provides.

.. image:: ../_static/mlflow_load_model_utils.png
   :align: center

MLflow load artifact
--------------------

This function loads an artifact from an MLflow run. A successful call returns the artifact; otherwise an exception is raised.

.. code-block:: python

    # for joblib, pkl, dict
    mlflow_load_artifact(
        artifact_uri="runs:/xxx/yyy",
        artifact_name="overall_evaluation_dataset.joblib",
    ).head()

    # for yaml
    mlflow_load_artifact(
        artifact_uri="runs:/zzz",
        artifact_name="features.yaml",
        type_of_artifact="yaml"
    )

.. image:: ../_static/mlflow_load_artifact_utils.png
   :align: center

At the moment, the supported artifact types are `pkl`, `joblib`, `dict` and `yaml`.

MLflow retrieve model evaluation metrics
----------------------------------------

This function retrieves either all of the model evaluation metrics or a single metric by key.

A successful call returns all of the evaluation metrics if no key is specified, or the value of the specified metric otherwise.


.. code-block:: python

    # return a specific key value evaluation metric
    mlflow_get_model_metrics(
        run_id="xx",
        key_value_metrics="mrr_best"
    )

    # return all evaluation metrics
    mlflow_get_model_metrics(
        run_id="xx",
    )

.. image:: ../_static/mlflow_get_metrics.png
   :align: center

MLflow retrieve registered run info and run_id
----------------------------------------------

This function returns the registered model information for the specified MLflow run_id. It also returns the MLflow run_id of the model in the specified stage: Staging, Archived or Production.

A successful call returns both the MLflow run_id for the specified stage and the registered model information for the specified MLflow run_id.

.. code-block:: python

    from mlflow.tracking import MlflowClient

    mlflow_client = MlflowClient()
    mlflow_runid, model_registered_information = mlflow_get_both_registered_model_info_run_id(
        name="hackathon-model-l2r",
        mlflow_client=mlflow_client,
        run_id="xx",
        stage="Production"
    )

.. image:: ../_static/mlflow_registered_model_info_runid_utils.png
   :align: center

MLflow promote model
--------------------

This function decides whether we need to promote the model to the specified stage, for instance when there is no model in that stage yet.

A successful call returns a string response.

.. code-block:: python

    from mlflow.tracking import MlflowClient

    mlflow_client = MlflowClient()
    mlflow_promote_model(
        name="hackathon-model-l2r",
        retrained_run_id="xx",
        # metric of the retrained run, looked up by key
        retrained_metric=mlflow_get_model_metrics(run_id="xx", key_value_metrics="mrr_best"),
        start_date="2022-11-01",
        eval_date="2023-02-01",
        env="prod",
        mlflow_client=mlflow_client,
        metrics_name="MRR"
    )

.. image:: ../_static/mlflow_promote_model_utils.png
   :align: center
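
To make the promotion decision more concrete, here is a minimal, self-contained sketch. It is not the implementation of ``mlflow_promote_model``. The helper name ``decide_promotion``, the ``higher_is_better`` flag, the returned strings, and the rule that a retrained model is promoted only when its metric is strictly better are all assumptions made for illustration. Only the "promote when the stage is empty" case comes from the description above.

.. code-block:: python

    from typing import Optional


    def decide_promotion(
        current_metric: Optional[float],
        retrained_metric: float,
        higher_is_better: bool = True,
    ) -> str:
        """Hypothetical helper: return a string describing the promotion decision."""
        # No model in the target stage yet, so the retrained model is promoted.
        if current_metric is None:
            return "promoted: no model in stage"

        # Assumed rule: promote only when the retrained metric is strictly better.
        if higher_is_better:
            is_better = retrained_metric > current_metric
        else:
            is_better = retrained_metric < current_metric

        return "promoted: retrained model is better" if is_better else "not promoted"


    # current_metric=None stands for an empty stage
    print(decide_promotion(None, 0.42))
    print(decide_promotion(0.40, 0.42))
    print(decide_promotion(0.45, 0.42))

For a metric such as MRR, where larger values are better, the default ``higher_is_better=True`` would apply. For an error-style metric the flag would be set to ``False``.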