dialoguekit.utils.dialogue_evaluation
=====================================

.. py:module:: dialoguekit.utils.dialogue_evaluation

.. autoapi-nested-parse::

   Evaluation module.

Classes
-------

.. autoapisummary::

   dialoguekit.utils.dialogue_evaluation.Evaluator

Module Contents
---------------

.. py:class:: Evaluator(dialogues: List[dialoguekit.core.dialogue.Dialogue], reward_config: Dict[str, Any])

   Dialogue evaluator.

   Evaluates a set of dialogues using standard metrics.

   :param dialogues: A list of Dialogue objects to be evaluated.
   :param reward_config: A dictionary with reward settings. For an example
       config, consult the documentation.

   .. py:method:: avg_turns() -> float

      Calculates AvgTurns for the dialogues.

      AvgTurns reflects the average number of system-user turn pairs
      across a list of dialogues.

      :returns: The computed metric as a float value.

   .. py:method:: user_act_ratio() -> Dict[str, float]

      Computes UserActRatio for the dialogues.

      UserActRatio per dialogue is computed as the ratio of user actions
      observed in the dialogue.

      :returns: A dictionary with participant and ActRatio as key-value
          pairs.

   .. py:method:: reward() -> Dict[str, List[Dict[str, float]]]

      Computes reward for the dialogues, according to the reward config.

      Reward is used to penalize agents that do not support the set of
      intents defined in the config file, as well as long dialogues.

      :raises TypeError: If utterances are not annotated.

      :returns:

          .. code:: python

             {
                 "missing_intents": [],
                 "dialogues": [{
                     "reward": int,
                     "user_turns": int,
                     "repeats": int,
                 }]
             }

      :rtype: A dictionary with the following structure (the most
          important field is "reward")

   .. py:method:: satisfaction(satisfaction_classifier: dialoguekit.nlu.models.satisfaction_classifier.SatisfactionClassifierSVM) -> List[int]

      Classifies dialogue-level satisfaction score.

      Satisfaction is scored using a SatisfactionClassifier model. Based
      on the last n turns, it computes a satisfaction score.

      :param satisfaction_classifier: The satisfaction classifier model
          used for scoring.

      :returns: A list with a satisfaction score for each dialogue.
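
The AvgTurns and UserActRatio metrics above can be illustrated with a small,
self-contained sketch. The ``SimpleDialogue`` stand-in below (a plain list of
``(participant, utterance)`` tuples) is hypothetical and is not dialoguekit's
actual ``Dialogue`` class; it only mirrors the metric definitions given in the
docstrings.

.. code:: python

   from typing import Dict, List, Tuple

   # Hypothetical stand-in for a Dialogue: a list of (participant, utterance)
   # tuples, e.g. ("USER", "Hello."). Not dialoguekit's actual Dialogue class.
   SimpleDialogue = List[Tuple[str, str]]


   def avg_turns(dialogues: List[SimpleDialogue]) -> float:
       """Average number of system-user turn pairs per dialogue."""
       pair_counts = []
       for dialogue in dialogues:
           # A turn pair is one system utterance plus one user reply, so the
           # number of pairs is half the utterance count.
           pair_counts.append(len(dialogue) / 2)
       return sum(pair_counts) / len(pair_counts)


   def user_act_ratio(dialogues: List[SimpleDialogue]) -> Dict[str, float]:
       """Ratio of utterances per participant over all dialogues."""
       counts: Dict[str, int] = {}
       total = 0
       for dialogue in dialogues:
           for participant, _ in dialogue:
               counts[participant] = counts.get(participant, 0) + 1
               total += 1
       return {participant: n / total for participant, n in counts.items()}


   dialogues = [
       [("AGENT", "Hi!"), ("USER", "Hello."), ("AGENT", "Bye."), ("USER", "Bye.")],
       [("AGENT", "Hi!"), ("USER", "Hello.")],
   ]
   print(avg_turns(dialogues))       # 1.5 (2 pairs + 1 pair, over 2 dialogues)
   print(user_act_ratio(dialogues))  # {"AGENT": 0.5, "USER": 0.5}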
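
The reward computation might look roughly like the following. This is a
hypothetical sketch based only on the docstring (penalize missing intents and
long dialogues); the config keys (``full_set_points``, ``intents``,
``cost_per_turn``) and the scoring formula are illustrative assumptions, and
the actual logic in dialoguekit may differ.

.. code:: python

   from typing import Any, Dict, List

   # Hypothetical reward config; the real config format is described in the
   # dialoguekit documentation.
   REWARD_CONFIG: Dict[str, Any] = {
       "full_set_points": 20,                 # starting reward per dialogue
       "intents": ["GREETING", "RECOMMEND"],  # intents the agent must support
       "cost_per_turn": 1,                    # penalty per user turn
   }


   def reward(
       dialogue_intents: List[List[str]], user_turns: List[int]
   ) -> Dict[str, Any]:
       """Sketch of a reward: penalize missing intents and long dialogues.

       dialogue_intents: per-dialogue list of agent intents observed.
       user_turns: per-dialogue count of user turns.
       """
       observed = {intent for intents in dialogue_intents for intent in intents}
       missing = [i for i in REWARD_CONFIG["intents"] if i not in observed]
       results = []
       for intents, turns in zip(dialogue_intents, user_turns):
           score = REWARD_CONFIG["full_set_points"]
           score -= len(missing)                            # unsupported intents
           score -= REWARD_CONFIG["cost_per_turn"] * turns  # dialogue length
           results.append({"reward": max(score, 0), "user_turns": turns})
       return {"missing_intents": missing, "dialogues": results}


   print(reward([["GREETING"]], [3]))

The returned dictionary follows the shape documented for ``reward()`` above
(minus the ``repeats`` field, which this sketch omits for brevity).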