GitHub - DangJin/pdf2video: Transform PDF documents into engaging video presentations with smooth animations.
Key Points
- pdf2video is a Remotion-based tool designed to transform PDF documents into engaging video presentations, offering various dynamic scene types and smooth animations.
- It allows for highly customizable video flows through basic props or detailed scripts, enabling control over scene transitions, page focus, titles, descriptions, and dynamic durations.
- The project supports high-quality rendering and includes a Claude Code skill for automated PDF content analysis and streamlined video generation.
pdf2video is an open-source project that transforms PDF documents into video presentations with smooth animations, built on the Remotion framework. Videos are generated programmatically from either a declarative configuration (props.json) or a custom scene script, so the same static PDF can be presented in different ways.
Core Methodology and Technical Details:
The system leverages Remotion, a React-based video framework, for rendering video frames, and integrates react-pdf for rendering PDF pages and pdfjs-dist for parsing PDF content. Zod is employed for schema validation of configuration inputs, ensuring robust data handling. The output video adheres to a standard 1920x1080 resolution at 30 frames per second (fps), with dynamic duration determined by the content and configuration.
The transformation process involves two primary configuration modes:
- Basic Props Configuration: This mode allows users to specify fundamental properties such as:
  - src: Path to the PDF document (e.g., /sample.pdf).
  - title, subtitle: Main video titles.
  - highlights: An array of page numbers (1-indexed) to feature in the video. The system automatically orchestrates a sequence of scenes (likely focus or switch types) for these highlighted pages.
  - pageTitles, pageDescriptions: Maps for custom titles and descriptions per page, displayed as typing effects.
- Custom Script Configuration: For granular control, users can define a script array, where each element specifies a scene type and its duration in frames. This provides explicit control over the video's narrative flow. Each script item is an object with type and duration properties. For focus, switch, and fan types, a page property is also required.
  - Scene Types:
    - stack: Displays a "card stack" of PDF pages, featuring an entrance animation. Default duration is 60 frames.
    - focus: Extracts and zooms into a specific PDF page, supporting scroll functionality within the zoomed view. Default duration is 120 frames.
    - switch: Transitions smoothly between pages using slide animations. Default duration is 120 frames.
    - fan: Arranges pages in a "fan" or "wheel" layout with rotation and a focus effect on a selected page. Default duration is 150 frames.
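The script rules described here (four scene types, a page required for focus, switch, and fan, and per-type default durations) can be sketched as plain TypeScript types with a small validator. This is only an illustration: the field names type, duration, and page and the default frame counts come from the description, while the names SceneType, ScriptItem, DEFAULT_SCENE_FRAMES, validateScript, and resolveDuration are hypothetical. Treating duration as optional with a fallback to the default is also an assumption. The project itself is described as using Zod for validation, and its actual schema is not shown in this summary.

```typescript
// Field names (type, duration, page) and the default durations come from the
// description; all type and function names here are hypothetical.
type SceneType = "stack" | "focus" | "switch" | "fan";

interface ScriptItem {
  type: SceneType;
  duration?: number; // in frames; assumed optional, falling back to the default
  page?: number; // 1-indexed; required for focus, switch, and fan
}

// Default durations in frames for each scene type.
const DEFAULT_SCENE_FRAMES: Record<SceneType, number> = {
  stack: 60,
  focus: 120,
  switch: 120,
  fan: 150,
};

// Returns a list of human-readable problems; an empty list means the script passed.
function validateScript(script: ScriptItem[], pageCount: number): string[] {
  const problems: string[] = [];
  script.forEach((item, index) => {
    const needsPage = item.type !== "stack";
    if (needsPage && item.page === undefined) {
      problems.push(`item ${index}: scene type "${item.type}" requires a page`);
    }
    if (
      item.page !== undefined &&
      (!Number.isInteger(item.page) || item.page < 1 || item.page > pageCount)
    ) {
      problems.push(`item ${index}: page ${item.page} is outside 1..${pageCount}`);
    }
    if (
      item.duration !== undefined &&
      (!Number.isInteger(item.duration) || item.duration <= 0)
    ) {
      problems.push(`item ${index}: duration must be a positive whole number of frames`);
    }
  });
  return problems;
}

// Uses the explicit duration when present, otherwise the scene type's default.
function resolveDuration(item: ScriptItem): number {
  return item.duration ?? DEFAULT_SCENE_FRAMES[item.type];
}
```

With a five-page PDF, a script such as a stack scene followed by a focus scene on page 1 passes validation, while a fan scene with no page produces a single reported problem instead of failing at render time.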
Smart Animations and Presentation Elements:
The system incorporates several "smart animations" to enhance visual appeal and user engagement:
- Natural Card Spread: When focusing on a page, surrounding pages spread out naturally, akin to dealing poker cards, implying sophisticated interpolation for position and rotation.
- Breathing Effect: A subtle animation applied before scrolling, suggesting a slight scale or pulsation, preparing the viewer for movement.
- Bounce Effect: Implemented at scroll stops, providing a tactile, elastic feel to the page movement.
- Collapse Animation: Enables seamless transitions between different scene types or states by animating elements collapsing into or expanding from a point.
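The summary does not say how the breathing and bounce effects are computed. As a rough sketch of one common approach, the breathing effect can be modelled as a small sinusoidal scale change and the bounce as a damped oscillation that decays after the scroll stops. The function names and every parameter value below are hypothetical and are not taken from the project. Remotion also ships a spring() helper that could serve the same purpose; whether pdf2video uses it is not stated.

```typescript
// Breathing: scale oscillates gently around 1 (hypothetical period and amplitude).
function breathingScale(frame: number, periodFrames = 45, amplitude = 0.01): number {
  return 1 + amplitude * Math.sin((2 * Math.PI * frame) / periodFrames);
}

// Bounce: vertical offset in pixels after a scroll stop, modelled as a damped
// cosine. It starts at overshootPx and decays toward 0 as frames pass.
function bounceOffset(
  framesSinceStop: number,
  overshootPx = 20,
  damping = 0.15,
  angularSpeed = 0.6
): number {
  if (framesSinceStop < 0) {
    return 0; // the scroll has not stopped yet, so there is no bounce
  }
  return (
    overshootPx *
    Math.exp(-damping * framesSinceStop) *
    Math.cos(angularSpeed * framesSinceStop)
  );
}
```

Both functions are pure functions of the frame number, which fits Remotion's model of rendering each frame independently from the current frame index.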
Key presentation elements include:
- Title System: Comprises a main title and optional subtitle at the opening, a persistent corner title throughout page viewing, and customizable per-page titles.
- Bottom Info Bar: Displays the current scene title with a typing effect for descriptions, a progress indicator (e.g., "1/5"), and per-page custom descriptions.
- Ending Scene: A dedicated segment where the PDF stack moves to the left with staggered cards, accompanied by a "Thank you" message, title, and an animated decorative line.
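The typing effect and progress indicator in the bottom info bar can both be derived from the current frame and the current position in the highlights list. The following is a minimal sketch under assumptions: the function names and the charsPerFrame reveal speed are hypothetical, and the project's actual timing is not documented in this summary.

```typescript
// Returns the portion of `text` visible at `frame`, revealing characters at a
// fixed rate. charsPerFrame is a hypothetical tuning parameter.
function typedText(text: string, frame: number, charsPerFrame = 0.5): string {
  const revealed = Math.floor(frame * charsPerFrame);
  const clamped = Math.max(0, Math.min(text.length, revealed));
  return text.slice(0, clamped);
}

// Formats a progress label such as "1/5" from a zero-based index.
function progressLabel(highlightIndex: number, highlightCount: number): string {
  return `${highlightIndex + 1}/${highlightCount}`;
}
```

At the default rate one character appears every two frames, so at 30 fps a 30-character description finishes typing in two seconds.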
Dynamic Features:
- Dynamic Duration Calculation: The video length is automatically determined either by summing the duration of all entries in a custom script or by inferring it from the highlights configuration in basic props.
- Background Music: Supports automatic fade-in and fade-out (2 seconds each) and ensures the music duration matches the generated video length, facilitating seamless audio integration.
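For the custom-script case, the arithmetic is simple: the total length is the sum of the scene durations, and at 30 fps a 2-second fade spans 60 frames. The sketch below works through both. The function names are hypothetical, and the linear fade curve is an assumption, since the summary does not say which curve the project uses.

```typescript
// Constants from the description: 30 fps output and 2-second fades.
const OUTPUT_FPS = 30;
const FADE_FRAMES = 2 * OUTPUT_FPS; // 60 frames

// Total video length in frames for a custom script: the sum of scene durations.
function totalFrames(sceneDurations: number[]): number {
  return sceneDurations.reduce((sum, frames) => sum + frames, 0);
}

// Music volume in [0, 1] at a given frame: a linear ramp up over the first
// FADE_FRAMES and a linear ramp down over the last FADE_FRAMES (assumed curve).
function musicVolume(frame: number, total: number): number {
  const fadeIn = frame / FADE_FRAMES;
  const fadeOut = (total - frame) / FADE_FRAMES;
  return Math.max(0, Math.min(1, fadeIn, fadeOut));
}

// One scene of each type at its default duration: 60 + 120 + 120 + 150.
const exampleFrames = totalFrames([60, 120, 120, 150]); // 450 frames
const exampleSeconds = exampleFrames / OUTPUT_FPS; // 15 seconds
```

A function of this shape could be passed as the volume callback of Remotion's Audio component, which accepts a per-frame volume function. Taking the minimum of the two ramps keeps the volume well behaved even for videos shorter than the combined fade length.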
Rendering Quality:
Focused pages are rendered at 2x resolution to ensure sharpness and clarity even when zoomed in, addressing potential pixelation issues.
Claude Code Skill Integration:
A notable feature is the inclusion of a Claude Code skill (.claude/skills/pdf-to-video/). This skill automates the PDF to video conversion workflow. Upon invocation with a PDF path, Claude:
- Reads and analyzes the PDF content (likely extracting text, headings, and possibly image descriptions).
- Extracts key points and relevant page titles, possibly through content analysis or heuristics.
- Generates the props.json configuration automatically based on the extracted information, selecting suitable highlights and creating descriptions.
- Renders the video automatically using the generated configuration.
This skill streamlines the creation process, requiring minimal manual configuration from the user.