GitHub - DangJin/pdf2video: Transform PDF documents into engaging video presentations with smooth animations.

DangJin
2026.01.26
·GitHub·by 네루
#video generation #PDF to video #animation #React #Remotion

Key Points

  • pdf2video is a Remotion-based tool designed to transform PDF documents into engaging video presentations, offering various dynamic scene types and smooth animations.
  • It allows for highly customizable video flows through basic props or detailed scripts, enabling control over scene transitions, page focus, titles, descriptions, and dynamic durations.
  • The project supports high-quality rendering and includes a Claude Code skill for automated PDF content analysis and streamlined video generation.

"pdf2video" is an open-source project that transforms PDF documents into video presentations with smooth animations using the Remotion framework. It generates video programmatically, driven either by a declarative configuration (props.json) or by a custom script, so that static PDF content can be presented in different ways.

Core Methodology and Technical Details:

The system uses Remotion, a React-based video framework, to render video frames; react-pdf to render PDF pages; and pdfjs-dist to parse PDF content. Zod validates the configuration inputs against a schema. The output video is 1920x1080 at 30 frames per second (fps), and its duration is computed from the content and configuration.

The transformation process involves two primary configuration modes:

  1. Basic Props Configuration: This mode allows users to specify fundamental properties such as:
    • src: Path to the PDF document (e.g., /sample.pdf).
    • title, subtitle: Main video titles.
    • highlights: An array of page numbers (1-indexed) to feature in the video. The system automatically orchestrates a sequence of scenes (likely focus or switch types) for these highlighted pages.
    • pageTitles, pageDescriptions: Maps for custom titles and descriptions per page, displayed as typing effects.
  2. Custom Script Configuration: For granular control, users can define a script array, where each element specifies a scene type and its duration in frames. This provides explicit control over the video's narrative flow. Each script item is an object with type and duration properties. For focus, switch, and fan types, a page property is also required.
    • Scene Types:
      • stack: Displays a "card stack" of PDF pages, featuring an entrance animation. Default duration is 60 frames.
      • focus: Extracts and zooms into a specific PDF page, supporting scroll functionality within the zoomed view. Default duration is 120 frames.
      • switch: Transitions smoothly between pages using slide animations. Default duration is 120 frames.
      • fan: Arranges pages in a "fan" or "wheel" layout with rotation and a focus effect on a selected page. Default duration is 150 frames.
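
As a rough illustration of the custom script described above, the sketch below models script items as plain TypeScript types and checks the rule that focus, switch, and fan items carry a page. The names ScriptItem, DEFAULT_DURATION, validateScript, and withDefaults are hypothetical and not taken from the project, and treating duration as optional with a per-type default is an assumption. The project itself is said to validate its inputs with Zod, which this sketch does not use.

```typescript
// Hypothetical model of a script item. Only the field names `type`,
// `duration`, and `page` come from the description; the rest is assumed.
type SceneType = "stack" | "focus" | "switch" | "fan";

interface ScriptItem {
  type: SceneType;
  duration?: number; // frames; assumed to fall back to the per-type default
  page?: number;     // 1-indexed; required for focus, switch, and fan
}

// Default durations in frames, as listed in the scene type descriptions.
const DEFAULT_DURATION: Record<SceneType, number> = {
  stack: 60,
  focus: 120,
  switch: 120,
  fan: 150,
};

// Returns a list of problems; an empty list means the script passed the checks.
function validateScript(script: ScriptItem[], pageCount: number): string[] {
  const errors: string[] = [];
  script.forEach((item, i) => {
    if (item.type !== "stack") {
      if (item.page === undefined) {
        errors.push(`item ${i}: "${item.type}" requires a page`);
      } else if (!Number.isInteger(item.page) || item.page < 1 || item.page > pageCount) {
        errors.push(`item ${i}: page ${item.page} is outside 1..${pageCount}`);
      }
    }
    if (item.duration !== undefined && (!Number.isInteger(item.duration) || item.duration <= 0)) {
      errors.push(`item ${i}: duration must be a positive integer`);
    }
  });
  return errors;
}

// Fills in the default duration for items that omit one.
function withDefaults(script: ScriptItem[]): Array<ScriptItem & { duration: number }> {
  return script.map((item) => ({
    ...item,
    duration: item.duration ?? DEFAULT_DURATION[item.type],
  }));
}
```

A script such as a stack scene followed by a focus on page 2 and a fan on page 3 would pass these checks for any PDF with at least three pages.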

Smart Animations and Presentation Elements:

The system incorporates several "smart animations" to enhance visual appeal and user engagement:

  • Natural Card Spread: When a page is focused, the surrounding pages spread out like dealt poker cards, presumably by interpolating each card's position and rotation.
  • Breathing Effect: A subtle animation applied before scrolling, likely a slight scale pulsation, that prepares the viewer for movement.
  • Bounce Effect: Applied where a scroll stops, giving the page movement an elastic feel.
  • Collapse Animation: Provides seamless transitions between scene types or states by animating elements collapsing into or expanding from a point.
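
The source does not show how these effects are implemented. As a minimal sketch of what frame-driven breathing and bounce curves could look like, the functions below use a sine wave and a damped oscillation. The function names, parameters, and constants are all assumptions chosen for illustration; in a real Remotion composition, the framework's own interpolate and spring helpers would be the more idiomatic choice.

```typescript
// Breathing (assumed form): the scale oscillates gently around 1.
// `periodFrames` and `amplitude` are illustrative values, not project settings.
function breathingScale(frame: number, periodFrames = 60, amplitude = 0.01): number {
  return 1 + amplitude * Math.sin((2 * Math.PI * frame) / periodFrames);
}

// Bounce (assumed form): a damped oscillation that overshoots the resting
// position right after a scroll stops and then settles back to 0.
function bounceOffset(
  framesSinceStop: number,
  overshootPx = 20,
  decay = 0.15,
  periodFrames = 12
): number {
  if (framesSinceStop < 0) return 0; // the scroll has not stopped yet
  const envelope = overshootPx * Math.exp(-decay * framesSinceStop);
  return envelope * Math.sin((2 * Math.PI * framesSinceStop) / periodFrames);
}
```

The scale would be applied to the page container before scrolling starts, and the offset would be added to the scroll position after it stops.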

Key presentation elements include:

  • Title System: Comprises a main title and optional subtitle at the opening, a persistent corner title throughout page viewing, and customizable per-page titles.
  • Bottom Info Bar: Displays the current scene title with a typing effect for descriptions, a progress indicator (e.g., "1/5"), and per-page custom descriptions.
  • Ending Scene: A dedicated segment where the PDF stack moves to the left with staggered cards, accompanied by a "Thank you" message, title, and an animated decorative line.
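
The typing effect and progress indicator in the info bar can both be derived from the current frame and scene index. The sketch below shows one plausible way to compute them; typedText, progressLabel, and the charsPerFrame rate are hypothetical and not taken from the project.

```typescript
// Typing effect (assumed form): reveal a growing prefix of the text as frames pass.
// Array.from splits by code point, so emoji and similar characters are not cut in half.
function typedText(text: string, frame: number, charsPerFrame = 0.5): string {
  const visibleCount = Math.max(0, Math.floor(frame * charsPerFrame));
  return Array.from(text).slice(0, visibleCount).join("");
}

// Progress indicator such as "1/5"; `index` is the 0-based position of the
// current highlighted page and `total` is the number of highlighted pages.
function progressLabel(index: number, total: number): string {
  return `${index + 1}/${total}`;
}
```

At 30 fps and the assumed rate of 0.5 characters per frame, a 30-character description would finish typing after two seconds.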

Dynamic Features:

  • Dynamic Duration Calculation: The video length is automatically determined either by summing the duration of all entries in a custom script or by inferring it from the highlights configuration in basic props.
  • Background Music: Supports automatic fade-in and fade-out (2 seconds each) and ensures the music duration matches the generated video length, facilitating seamless audio integration.
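
To make the two dynamic features concrete, the sketch below sums script durations into a total frame count and computes a music volume that ramps up over the first 2 seconds and down over the last 2 seconds at 30 fps. The function names are hypothetical and the linear fade shape is an assumption. Remotion's Audio component accepts a per-frame volume callback, which is one place such a function could be plugged in, though the source does not confirm that the project does so.

```typescript
const FPS = 30;              // frame rate stated for the output video
const FADE_FRAMES = 2 * FPS; // 2-second fades, as described

// Total video length: the sum of every script item's duration in frames.
function totalDurationInFrames(script: Array<{ duration: number }>): number {
  return script.reduce((sum, item) => sum + item.duration, 0);
}

// Linear fade-in and fade-out (assumed shape), clamped to the range 0..1.
function musicVolume(frame: number, totalFrames: number): number {
  const fadeIn = frame / FADE_FRAMES;
  const fadeOut = (totalFrames - 1 - frame) / FADE_FRAMES;
  return Math.max(0, Math.min(1, fadeIn, fadeOut));
}
```

For example, a script with durations of 60, 120, and 150 frames totals 330 frames, which is 11 seconds at 30 fps.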

Rendering Quality:
Focused pages are rendered at 2x resolution to ensure sharpness and clarity even when zoomed in, addressing potential pixelation issues.

Claude Code Skill Integration:
A notable feature is the inclusion of a Claude Code skill (.claude/skills/pdf-to-video/). This skill automates the PDF to video conversion workflow. Upon invocation with a PDF path, Claude:

  1. Reads and analyzes the PDF content (likely extracting text, headings, and possibly image descriptions).
  2. Extracts key points and relevant page titles, possibly through content analysis or heuristics.
  3. Generates the props.json configuration automatically based on the extracted information, selecting suitable highlights and creating descriptions.
  4. Renders the video automatically using the generated configuration.

The skill streamlines the creation process, requiring minimal manual configuration from the user.
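
The source does not describe how the skill chooses highlights, so the following is only a sketch of how step 3 might turn per-page analysis results into a props object. The PageSummary shape, the relevance score, the buildProps function, and the use of string page numbers as map keys are all assumptions; only the field names src, title, highlights, pageTitles, and pageDescriptions come from the basic props described earlier.

```typescript
// Hypothetical result of analyzing one PDF page.
interface PageSummary {
  page: number;    // 1-indexed page number
  title: string;
  summary: string;
  score: number;   // assumed relevance score; higher means more important
}

// Props fields named in the basic configuration; the key format of the maps is assumed.
interface VideoProps {
  src: string;
  title: string;
  highlights: number[];
  pageTitles: Record<string, string>;
  pageDescriptions: Record<string, string>;
}

// Picks the highest-scoring pages, restores reading order, and builds the props.
function buildProps(
  src: string,
  title: string,
  pages: PageSummary[],
  maxHighlights = 5
): VideoProps {
  const picked = [...pages]
    .sort((a, b) => b.score - a.score)
    .slice(0, maxHighlights)
    .sort((a, b) => a.page - b.page);

  const pageTitles: Record<string, string> = {};
  const pageDescriptions: Record<string, string> = {};
  for (const p of picked) {
    pageTitles[String(p.page)] = p.title;
    pageDescriptions[String(p.page)] = p.summary;
  }
  return { src, title, highlights: picked.map((p) => p.page), pageTitles, pageDescriptions };
}
```

Serializing the result with JSON.stringify would give a props.json-style document that could then be passed to the renderer.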