Patent attributes
A system is described that uses large language models (LLMs) with multimodal inputs and achieves a low-latency response time (within 500 ms), with all calculations performed on consumer hardware. The low latency contributes to a more engaging and empathetic user experience. To this end, the LLM derives the emotional state of the user from message sentiment analysis and/or facial behavior and/or voice parameters. It does so by reviewing: (a) the user's mood state while providing an input prompt to the LLM in the current turn; (b) the user's mood and mental states while reacting to the LLM's response in the previous turn; (c) the sentiment of the LLM's response in the previous turn; and (d) the desired user mood and mental state as determined by the system's empathetic goal.
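As an illustration only, the four reviewed signals (a) through (d) could be gathered into a single per-turn record and rendered as text for the LLM. The sketch below is a minimal example under stated assumptions: the names `EmpatheticContext` and `render_context`, the use of plain string labels for moods and sentiment, and the output wording are all hypothetical and are not taken from the described system, which does not specify a data format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EmpatheticContext:
    """Hypothetical per-turn record of the four signals (a)-(d)."""
    current_mood: str                            # (a) mood while prompting, current turn
    previous_reaction: Optional[str]             # (b) reaction to the previous response
    previous_response_sentiment: Optional[str]   # (c) sentiment of the previous response
    target_mood: str                             # (d) mood set by the empathetic goal


def render_context(ctx: EmpatheticContext) -> str:
    """Render the signals as plain text lines that could precede the user prompt.

    Signals (b) and (c) are skipped when absent, e.g. on the first turn,
    where no previous response exists.
    """
    lines = [f"(a) user mood, current turn: {ctx.current_mood}"]
    if ctx.previous_reaction is not None:
        lines.append(f"(b) user reaction, previous turn: {ctx.previous_reaction}")
    if ctx.previous_response_sentiment is not None:
        lines.append(
            f"(c) response sentiment, previous turn: {ctx.previous_response_sentiment}"
        )
    lines.append(f"(d) desired user mood: {ctx.target_mood}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Assumed example labels; a real system would derive them from
    # sentiment, facial, and voice analysis.
    ctx = EmpatheticContext("anxious", "relieved", "reassuring", "calm")
    print(render_context(ctx))
```

Keeping (b) and (c) optional reflects that they refer to the previous turn and so have no value at the start of a conversation; how the described system handles that case is not stated.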