GitHub - IYENTeam/Hent-ai: Emotion Image Attachment Plugin for AI agents — Auto-classify emotions via LLM and attach matching images to Discord messages
Key Points
- Hent-ai is a tool that automatically classifies the emotion of AI agent responses using an LLM and attaches corresponding visual emotion images to Discord messages.
- Users must provide six distinct character-based images, one per supported emotion (happy, neutral, confused, and so on). Generating all six from a single base character reference is recommended for consistency.
- The system integrates with agent platforms. The agent's persona file should define clear emotional behaviors for Hent-ai to interpret, rather than directly specifying image attachments.
Hent-ai is a system designed to enhance AI agent interactions by automatically classifying the emotion of bot responses using a Large Language Model (LLM) and attaching a corresponding visual emotion image to Discord messages. The term "Hent" is a coined word meaning "intent." It supports integration with both OpenClaw and Hermes Agent platforms.
The core methodology of Hent-ai involves two primary components: emotion classification from text and the generation/management of emotion-specific visual assets.
Emotion Classification:
Hent-ai employs an LLM to perform text-based emotion classification. For every response generated by the AI agent, the LLM analyzes the text to determine the underlying emotion expressed. Six distinct emotions are supported:
- Happy: Signifying success, completion, or celebration.
- Neutral: For general, informational, or default responses.
- Loyalty: Indicating acknowledgment, greeting, or attentiveness.
- Sorry: Conveying apology or acknowledging mistakes.
- Confused: Expressing uncertainty or questions.
- Focused: Denoting working, investigating, or debugging.
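The classification step can be sketched as a prompt builder plus a reply parser. This is a minimal illustration, not Hent-ai's actual implementation: the function names, the prompt wording, and the fallback to "neutral" for unrecognized replies are all assumptions, and the call to the LLM itself is left out.

```python
# Hypothetical sketch: names and prompt wording are assumptions, not Hent-ai's API.
EMOTIONS = ("happy", "neutral", "loyalty", "sorry", "confused", "focused")


def build_classifier_prompt(response_text: str) -> str:
    """Build a prompt asking an LLM to pick one of the six emotion labels."""
    labels = ", ".join(EMOTIONS)
    return (
        "Classify the emotion of the following message. "
        f"Answer with exactly one word from: {labels}.\n\n"
        f"Message: {response_text}"
    )


def parse_emotion(llm_reply: str, default: str = "neutral") -> str:
    """Map a raw LLM reply onto one of the six labels, else return the default."""
    cleaned = llm_reply.strip().lower().strip(".!\"'` ")
    if cleaned in EMOTIONS:
        return cleaned
    # Tolerate replies such as "Emotion: happy" by scanning for a known label.
    for word in cleaned.replace(":", " ").split():
        candidate = word.strip(".,!\"'`")
        if candidate in EMOTIONS:
            return candidate
    # Assumption: anything unrecognized is treated as the default label.
    return default
```

Constraining the reply to a closed label set and normalizing it afterwards keeps the downstream image lookup simple, since every result maps to exactly one of the six image files.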
Emotion Image Generation:
The system requires six unique images, one for each supported emotion. The recommended approach for creating these images is "Character + Reference-Based Generation," a technique leveraging image-to-image (img2img) capabilities of generative AI models (e.g., DALL-E, Midjourney, Stable Diffusion).
- Base Character Generation: A single base character image is first created using any image generation tool. This character serves as the visual identity of the AI agent.
- Emotion Variant Generation: The base character image is then used as a reference input for the image generator. For each of the six emotions, a specific text prompt (e.g., "Same character as the reference image, expressing [emotion]. Simple background, consistent art style.") is provided along with the base image to generate variants depicting the character expressing the desired emotion. Examples of emotional cues for prompts include:
  - Happy: "smiling, thumbs up, celebrating"
  - Neutral: "calm, relaxed, default expression"
  - Loyalty: "saluting, nodding, attentive"
  - Sorry: "apologetic, bowing, sheepish"
  - Confused: "head tilt, question mark, puzzled"
  - Focused: "concentrating, working, determined"
- File Naming and Placement: Generated images are renamed according to their emotion (e.g., happy.png, neutral.png) and placed in a designated assets/ directory.
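The variant-generation and naming steps above can be sketched as a small job list, one entry per emotion. This is only an illustration: the function name and the dictionary layout are hypothetical, the cue phrases are substituted into the "[emotion]" slot of the example prompt, and the actual call to an image generator is not shown because it depends on the tool used.

```python
from pathlib import Path

# Prompt text taken from the example above; {cues} stands in for "[emotion]".
PROMPT_TEMPLATE = (
    "Same character as the reference image, expressing {cues}. "
    "Simple background, consistent art style."
)

# Cue phrases as listed above, keyed by the emotion's file name stem.
EMOTION_CUES = {
    "happy": "smiling, thumbs up, celebrating",
    "neutral": "calm, relaxed, default expression",
    "loyalty": "saluting, nodding, attentive",
    "sorry": "apologetic, bowing, sheepish",
    "confused": "head tilt, question mark, puzzled",
    "focused": "concentrating, working, determined",
}


def build_variant_jobs(assets_dir: str = "assets") -> list[dict]:
    """Return one generation job per emotion: prompt text and target file path."""
    jobs = []
    for emotion, cues in EMOTION_CUES.items():
        jobs.append(
            {
                "emotion": emotion,
                "prompt": PROMPT_TEMPLATE.format(cues=cues),
                # Hypothetical layout: assets/<emotion>.png
                "output_path": Path(assets_dir) / f"{emotion}.png",
            }
        )
    return jobs
```

Each job's prompt would be passed to the image generator together with the base character image, and the result saved at the job's output path.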
To ensure visual consistency and effectiveness, the following guidelines are provided for image creation:
- Maintain a consistent art style across all images.
- Use simple backgrounds to ensure readability as small thumbnails.
- Ensure emotions are visually distinct to make the image swap meaningful.
- Prefer square aspect ratios (1:1) for optimal Discord rendering.
- Keep file sizes under 500KB for fast uploads.
- Use PNG format for transparency and clean edges.
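Some of these guidelines (PNG format, square aspect ratio, file size) can be checked mechanically. The sketch below is a hypothetical helper, not part of Hent-ai. It reads the width and height straight from the PNG header using only the standard library, and it interprets "under 500KB" as fewer than 500 * 1024 bytes, which is an assumption.

```python
import struct
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_BYTES = 500 * 1024  # Assumption: "500KB" read as 500 * 1024 bytes.


def check_emotion_image(path: Path) -> list[str]:
    """Return a list of guideline problems found for one image file."""
    problems = []
    data = path.read_bytes()
    if len(data) >= MAX_BYTES:
        problems.append("file is not under 500KB")
    # A PNG starts with an 8-byte signature followed by the IHDR chunk:
    # 4-byte length, 4-byte type "IHDR", then big-endian width and height.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        problems.append("not a PNG file")
        return problems
    width, height = struct.unpack(">II", data[16:24])
    if width != height:
        problems.append(f"not square ({width}x{height})")
    return problems
```

An empty list means the file passed all three checks. Art-style consistency and how distinct the emotions look still need a visual review.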
Integration with Agent's SOUL.md (Persona File):
Hent-ai works best when the agent's text conveys emotion implicitly, rather than when the agent is given explicit image instructions. The agent's persona file (typically SOUL.md or equivalent) needs to be configured to facilitate this:
- Removal of Media Tags: Any existing instructions for the agent to output MEDIA: tags for images must be removed, as Hent-ai automates image attachment.
- Defining Emotional Behaviors: Instead of direct image commands, SOUL.md should define clear behavioral patterns linked to emotions. For example, "When a task is completed successfully, celebrate briefly" signals a "happy" response; "When you make a mistake, own it immediately" signals "sorry." This allows the LLM classifier to infer emotion from the agent's natural language.
- Promoting Personality Range: Agents should be designed with a varied personality to avoid monotone responses, which would consistently result in "neutral" classifications. A broader range of expressions (excitement, frustration, curiosity) allows for more distinct emotional classifications.
- Plugin Acknowledgment: A simple note can be added to SOUL.md informing the agent about the emotion-image plugin and its automatic handling of images.
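As a rough aid for the first step, a persona file can be scanned for leftover media-tag instructions before editing it by hand. The helper below is a hypothetical sketch. It assumes that any line containing the literal text "MEDIA:" is worth reviewing, and it only reports such lines rather than deleting anything.

```python
def find_media_tag_lines(persona_text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs for lines that mention MEDIA: tags."""
    hits = []
    for number, line in enumerate(persona_text.splitlines(), start=1):
        # Assumption: the literal substring "MEDIA:" marks a media-tag instruction.
        if "MEDIA:" in line:
            hits.append((number, line.strip()))
    return hits
```

The text would typically come from reading SOUL.md, for example with `Path("SOUL.md").read_text()`. Reporting instead of rewriting leaves the decision about each line to the person maintaining the persona.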
Hent-ai is licensed under the MIT License.