GitHub - eduly-ai/eduly: Manim Coding Agent

eduly-ai
2026.01.19
·GitHub·by 이호민
#LLM#Agent#Manim#AI#Education

Key Points

  • Eduly is an open-source Manim Coding Agent, powered by Gemini 2.5 and LangChain, designed to transform various content types into engaging 3Blue1Brown-style animated educational videos.
  • Its core functionality is a three-stage pipeline that breaks content down into atomic topics, generates visual storyboards, and then uses an iterative Manim animation agent to produce the final videos.
  • Eduly aims to improve understanding by providing visual explanations of complex concepts. It serves users ranging from students and educators to researchers and enterprises, and is built on the belief that learning tools should be free and accessible.

Eduly is an open-source Manim Coding Agent designed to transform various content types into visually compelling, 3Blue1Brown-style animated educational explanations. It addresses the challenge of abstract concepts being difficult to grasp from static text by converting them into dynamic visualizations. The system is built upon Gemini 2.5 and LangChain.

Its intended users include students (converting PDFs and notes into explainer videos), educators (animating PowerPoint slides), researchers (visualizing academic papers), content creators (production-ready animations from scripts), enterprises (visual explainers for training), and open-source projects (animated onboarding guides for technical documentation).

The core methodology of Eduly operates through a three-stage pipeline:

  1. Document Breakdown: The EdulyBreakdownClient, leveraging a Gemini model (e.g., gemini-2.5-flash-preview) in conjunction with Google Search capabilities, processes input content such as PDFs. Its objective is to deconstruct the document into atomic, self-contained topics, each suitable for a ~5-minute video. For each identified topic, it generates a comprehensive explanation, incorporates modern context, and distills key takeaways. The output is a structured Breakdown object containing document_title and a list of Topic objects, each with a name.
  2. Storyboard Generation: For each atomic topic identified in the breakdown phase, the EdulyBreakdownClient generates a detailed visual storyboard. This stage aims to create a narrative structure aligned with the 3Blue1Brown aesthetic. Key elements of each storyboard include:
    • Defining a core visual metaphor to intuitively represent the concept.
    • Structuring 8-15 distinct scenes with precise visual descriptions for each.
    • Crafting narration optimized for text-to-speech synthesis.
    • Including concrete examples, often with specific numerical values, to ground abstract ideas.
The output is a Storyboard object for each topic.

  3. Manim Animation Agent (Iterative Refinement): The EdulyAnimationClient, powered by a LangChain integration with ChatGoogleGenerativeAI (specifically, gemini-2.5-pro-preview), acts as an iterative coding agent responsible for generating the Manim animation code from the storyboards. This stage operates within a continuous feedback loop:
    • Prompting: The agent receives the generated storyboard with full contextual information.
    • LLM Code Generation: The underlying Large Language Model (LLM) generates Manim scene classes based on the storyboard instructions.
    • Manim Agent Execution: The generated Manim code is then executed. The agent has access to several tools for interacting with the file system and documentation, including ls, glob, grep, read_file, edit_file, and write_file. Crucially, it has access to the full Manim documentation, enabling it to research and utilize Manim functionalities effectively.
    • Rendering: The Manim code is rendered into a video.
    • Success Check & Iteration: If the rendering is successful, the process for that topic is complete, yielding an MP4 video (1080x1920 resolution, optimized for mobile). If rendering fails, the error message is fed back to the LLM as "error feedback," prompting the agent to iteratively refine its Manim code until it renders successfully or a maximum iteration limit is reached (e.g., max_iterations = 5). This iterative self-correction mechanism is central to the agent's ability to produce robust animations.
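
The render-and-retry loop in stage 3 can be sketched roughly as follows. This is a minimal illustration, not Eduly's actual code: the names `RenderResult`, `refine_until_rendered`, `fake_generate`, and `fake_render` are hypothetical, and the LLM call and the Manim render are replaced with stand-ins (the "render" step just checks that the generated source compiles) so the sketch runs without Gemini or Manim installed. Only the `max_iterations = 5` limit and the idea of feeding the error message back come from the description above.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class RenderResult:
    """Outcome of one render attempt (hypothetical type, for illustration)."""
    ok: bool
    error: str = ""
    video_path: Optional[str] = None


def refine_until_rendered(
    storyboard: str,
    generate_code: Callable[[str, Optional[str]], str],
    render: Callable[[str], RenderResult],
    max_iterations: int = 5,
) -> Tuple[Optional[str], int]:
    """Return (video_path, attempts); video_path is None if every attempt failed."""
    feedback: Optional[str] = None  # no error feedback on the first attempt
    attempts = 0
    for attempts in range(1, max_iterations + 1):
        code = generate_code(storyboard, feedback)
        result = render(code)
        if result.ok:
            return result.video_path, attempts
        feedback = result.error  # pass the error back into the next prompt
    return None, attempts


# --- Stand-ins so the sketch runs without an LLM or Manim installed ---
def fake_generate(storyboard: str, feedback: Optional[str]) -> str:
    # First attempt is deliberately broken; once feedback arrives, it is "fixed".
    if feedback is None:
        return "class Scene1(:\n    pass"
    return "class Scene1:\n    pass"


def fake_render(code: str) -> RenderResult:
    # Stand-in for rendering: only checks that the source compiles.
    try:
        compile(code, "<generated_scene>", "exec")
    except SyntaxError as exc:
        return RenderResult(ok=False, error=f"SyntaxError: {exc}")
    return RenderResult(ok=True, video_path="topic_01.mp4")


video, attempts = refine_until_rendered("storyboard text", fake_generate, fake_render)
print(video, attempts)  # topic_01.mp4 2
```

Passing the generator and renderer in as callables keeps the loop itself independent of any particular model or rendering backend, which also makes the retry behaviour easy to test in isolation.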

The resulting MP4 videos are designed to be mobile-ready. The roadmap includes integrating text-to-speech APIs for narration, developing a mobile-first web platform, implementing multimodal feedback loops that use vision-language models for quality assessment, and creating an RL environment to train specialized Manim animators. Eduly is distributed under the CC-BY-NC-SA-4.0 license, which permits non-commercial use and sharing with attribution, in keeping with its goal of making knowledge more broadly accessible.