30 open-source projects similar to zulko/moviepy, ranked by how many features each shares with it. Compare stars, activity, and what each one does to find the best MoviePy alternative.
FFmpeg is a cross-platform multimedia framework designed for the recording, conversion, and streaming of audio and video content. It functions as a comprehensive toolkit that provides both a command-line utility for direct media manipulation and a collection of low-level libraries for integration into custom applications. At its core, the project utilizes a packet-based stream engine and a format-agnostic abstraction layer to handle diverse media standards, containers, and network protocols. The framework distinguishes itself through a modular, graph-based filter execution model that allows f
Editly is a headless, programmatic video engine and automated assembler. It functions as a declarative video editor that generates MP4 and GIF exports from structured data or code, removing the need for a manual graphical user interface. The system is distinguished by its ability to integrate GLSL fragment shaders as visual layers within a programmatic timeline. It uses a configuration-based model to define clips, layers, and audio tracks, allowing for reproducible video assembly and the generation of custom programmatic graphics. The engine covers a broad range of media production capabilit
Autocut is a text-based video editor and automatic speech recognition tool. It allows users to cut and merge video clips by modifying a text transcript instead of using a traditional timeline. The system operates as an FFmpeg video processor and subtitle manipulation utility. It converts spoken audio into text and compacts subtitle files into simplified formats, enabling the removal of unwanted video segments by deleting corresponding sentences from a transcription file. The project covers automated video transcription, non-linear video cutting, and subtitle file management. It supports hard
Recordly is a screen recording and video editing suite designed for creating product demonstrations. It combines screen and audio capture software with a dedicated demo video editor and tools for merging webcam overlays and exporting projects as MP4 files or looping GIFs. The platform features a specialized cursor animation engine that applies smoothing, motion blur, and click animations to the rendered mouse movements. It also provides customizable webcam bubbles and a system for placing recordings inside styled containers with custom wallpapers and drop shadows. The editing workflow center
node-fluent-ffmpeg is a Node.js wrapper for FFmpeg that provides a fluent interface for executing media commands and processing files. It functions as a process manager that handles the lifecycle of external FFmpeg binaries, enabling programmatic media transcoding, video thumbnail generation, and metadata extraction via ffprobe. The library distinguishes itself through a command builder that translates JavaScript method calls into command-line arguments. It features event-driven progress monitoring to track processed frames and throughput, as well as the ability to route processed media data
ffmpeg-python is a Python wrapper that translates programmatic method calls into command-line arguments for executing FFmpeg media processing tasks. It functions as a multimedia transcoding interface and a media stream capture tool, allowing for the recording of live audio and video from hardware devices and network sources. The library features a fluent interface for constructing complex directed graphs of audio and video filters through method chaining. It also includes an FFprobe metadata extractor that retrieves structured technical properties from media files and returns them as Python d
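As a rough illustration of how chained method calls can be turned into command-line arguments, here is a minimal sketch. The `FluentCommand` class and its methods are hypothetical and are not the ffmpeg-python API. It handles only a linear chain of video filters rather than a full directed graph, and it builds the argument list without running anything.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class FluentCommand:
    """Hypothetical builder for illustration; not the ffmpeg-python API."""

    source: str
    filters: List[str] = field(default_factory=list)

    def filter(self, name: str, *args: object) -> "FluentCommand":
        # Each call returns a new object, so a partial chain can be reused.
        spec = name if not args else name + "=" + ":".join(str(a) for a in args)
        return FluentCommand(self.source, self.filters + [spec])

    def output(self, target: str) -> List[str]:
        # Translate the recorded calls into an FFmpeg argument list.
        argv = ["ffmpeg", "-i", self.source]
        if self.filters:
            argv += ["-vf", ",".join(self.filters)]
        return argv + [target]


argv = (
    FluentCommand("in.mp4")
    .filter("hflip")
    .filter("scale", 640, 360)
    .output("out.mp4")
)
print(argv)
```

The resulting list could be handed to `subprocess.run`, which is roughly the division of labor a wrapper of this kind implies: the library builds arguments and the external FFmpeg binary does the media work.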
Kdenlive is an open-source non-linear video editing suite designed for digital video post-production. Built on the MLT Framework and utilizing KDE Frameworks for its user interface, it provides a multi-track environment for assembling clips, applying transitions, and rendering final video files. The editor distinguishes itself through a comprehensive set of animation and effect tools, including keyframe-based parameter animation with a visual curve editor for fine-tuning transitions. It supports advanced visual modifications such as clip speed remapping, effect region masking, and the integra
Jumpcutter is an audio-based video cutter and automatic editor designed to eliminate dead air from video files. It functions as a utility that condenses footage by detecting and removing silent sections based on audio track analysis. The tool utilizes FFmpeg to automatically identify quiet gaps and strip them from recordings. This process focuses on removing silent video sections to create faster-paced content without the need for manual editing. The system operates by calculating decibel levels against a defined volume threshold to generate a list of timestamps for audible segments. These s
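The threshold step described for Jumpcutter can be sketched in a few lines. This is an illustrative sketch, not Jumpcutter's actual code: the function name `audible_segments`, the RMS-per-window measure, and the default values are all assumptions, and samples are assumed to be floats normalized to the range -1.0 to 1.0.

```python
import math
from typing import List, Sequence, Tuple


def audible_segments(
    samples: Sequence[float],
    sample_rate: int,
    window: int,
    threshold_db: float = -40.0,
) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds of windows at or above the threshold.

    Hypothetical helper: `window` is the analysis window length in samples.
    """
    segments: List[Tuple[float, float]] = []
    start = None
    for offset in range(0, len(samples), window):
        chunk = samples[offset:offset + window]
        rms = math.sqrt(sum(s * s for s in chunk) / len(chunk))
        # Level relative to full scale; digital silence maps to -infinity.
        level_db = 20 * math.log10(rms) if rms > 0 else float("-inf")
        t = offset / sample_rate
        if level_db >= threshold_db and start is None:
            start = t  # an audible run begins
        elif level_db < threshold_db and start is not None:
            segments.append((start, t))  # the run ended at this window
            start = None
    if start is not None:
        # The audio ended while still audible.
        segments.append((start, len(samples) / sample_rate))
    return segments


# One second of silence, two seconds of tone-level signal, one second of silence.
demo = [0.0] * 4 + [0.5] * 8 + [0.0] * 4
print(audible_segments(demo, sample_rate=4, window=4))
```

A list of timestamps like this is the kind of input a cutting step would then pass to FFmpeg to keep only the audible ranges.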
PHP-FFmpeg is an object-oriented wrapper for executing FFmpeg binary commands within PHP applications. It serves as a multimedia processing library and toolkit for transcoding, clipping, merging, and filtering audio and video files through a standardized programmatic interface. The project provides specialized drivers for video manipulation, audio editing, and media metadata extraction. These drivers allow for the application of visual filters, the modification of audio sample rates, and the probing of multimedia files to retrieve technical specifications and validate file integrity. The lib
Pydub is a Python audio manipulation library and digital audio processor used for editing, slicing, and converting audio files and segments. It serves as a programmatic wrapper for FFmpeg to import and export a wide variety of audio formats. The library functions as an audio signal generator capable of creating synthetic waveforms, such as sine waves and white noise. It also provides tools for digital signal processing, including the application of filters, fades, crossfades, and gain adjustments to sound signals. Its broader capabilities cover programmatic audio editing through concatenatio
Librosa is a Python audio analysis library and digital signal processing framework. It functions as a feature extraction suite and music information retrieval tool designed to analyze the structural and sonic characteristics of audio signals. The library provides specialized capabilities for music analysis, including dynamic tempo tracking to identify rhythmic pulses and spectral feature extraction to compute harmonic spectra, chroma variants, and onset points. It also serves as a time-series audio processor for synchronizing audio streams. The system covers a broad range of audio processing
This project is an open-source video production suite and non-linear video editor. It provides a multi-track timeline for cutting, splicing, and arranging video and audio clips with frame-level precision, serving as a comprehensive workspace for video post-production. The suite includes specialized tools for keyframe animation, allowing for the creation of 2D and 3D visual effects and motion graphics. It also features a multi-track audio mixer for blending sound sources and adjusting levels to accompany visual content. Capability areas cover a full post-production workflow, including color c
RxFFmpeg is an Android multimedia framework and media transcoder based on FFmpeg. It provides a set of tools for video and audio editing, transcoding, and processing on Android devices. The framework integrates a video player component for rendering local files and network streams with zoom and rotation support. It also includes specialized libraries for Android video editing, such as cropping and splicing, and Android audio processing for mixing tracks and modifying voice pitch. The project covers broad media manipulation capabilities, including the conversion of images to video, the extrac
Flowblade is a non-linear video editor and multitrack video compositor. It provides a professional environment for composing multitrack timelines, trimming media clips, and managing assets through a visual effects processing engine. The project distinguishes itself with a hardware-synced playback controller that allows for manual scrubbing using external USB shuttle and jog devices. It also includes a hardware-accelerated video encoder that utilizes CPU and GPU acceleration to render project timelines into final formats. The software covers a broad range of production capabilities, including
Auto-editor is a command-line automated video editor that uses FFmpeg to remove silence and inactive footage from video files. It functions as a processing suite with specialized cut generators that identify segments to trim based on loudness thresholds, motion analysis, and speech-to-text transcription. The tool distinguishes itself by offering a flexible post-production workflow, allowing users to export automated cut timelines as XML or JSON files for use in professional non-linear editing software. Beyond simple deletion, it can perform dynamic playback adjustments, such as increasing the
Olive is an open-source non-linear video editor and multi-track compositor designed for digital video production. It provides a desktop application for arranging, cutting, and trimming video and audio clips on a timeline to create high-resolution cinematic content. The software employs a proxy-based editing workflow, substituting high-resolution assets with low-bitrate temporary files to maintain fluid playback and scrubbing performance. This is supported by a non-destructive timeline engine that maps media clips using metadata pointers rather than modifying original source files. The system
GPAC is an open-source multimedia framework built around a pluggable filter graph pipeline, where modular processing units called filters connect into a directed graph to handle media workflows. At its core, the framework centers all media packaging and manipulation on the ISO Base Media File Format (ISOBMFF), with specialized tools for reading, writing, fragmenting, and encrypting MP4 and related containers. It also provides a declarative scene graph composition system for describing interactive multimedia scenes using MPEG-4 BIFS, X3D, SVG, or VRML syntax, alongside a hardware-accelerated re
Cap is a self-hosted screen recording and video collaboration platform designed for teams to replace synchronous meetings with asynchronous video updates. It provides a comprehensive suite for capturing high-resolution desktop activity, including system audio, microphone input, and camera overlays, which are then processed through an integrated post-production workflow. The platform distinguishes itself by offering full data sovereignty through containerized deployment and object storage abstractions, allowing users to host their media assets on private infrastructure or S3-compatible buckets
LosslessCut is a desktop application designed for the precise editing of video and audio files without re-encoding the underlying media streams. By performing stream copying and container remuxing, the software allows users to cut, merge, and rearrange media segments while maintaining the original bit-perfect quality of the source content. The application distinguishes itself by utilizing a stream-copying data pipeline that transfers raw media packets directly from source to destination, significantly reducing processing time compared to traditional transcoding workflows. It also functions as
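To make the stream-copy idea concrete, here is a small sketch that builds an FFmpeg command for cutting a segment without re-encoding. It is not taken from LosslessCut; the helper name `stream_copy_cut_args` is hypothetical, and only standard FFmpeg options (`-ss`, `-to`, `-i`, `-c copy`) are assumed.

```python
from typing import List


def stream_copy_cut_args(src: str, dst: str, start_s: float, end_s: float) -> List[str]:
    """Build FFmpeg arguments that copy packets for [start_s, end_s] into dst.

    Hypothetical helper for illustration; it does not invoke FFmpeg.
    """
    return [
        "ffmpeg",
        "-ss", str(start_s),  # seek in the input to the start of the cut
        "-to", str(end_s),    # stop reading at the end of the cut
        "-i", src,
        "-c", "copy",         # copy all streams instead of transcoding them
        dst,
    ]


args = stream_copy_cut_args("in.mp4", "clip.mp4", 12.5, 30)
print(" ".join(args))
```

Because no frames are decoded, a copied video stream generally has to start on a keyframe, so cut points made this way tend to be less exact than those of a re-encoding workflow.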
Gyroflow is a gyroscope video stabilization software and IMU telemetry processor designed to remove camera shake from video files. It functions as a hardware-accelerated video renderer and lens calibration tool, utilizing embedded or external gyroscope and accelerometer data to perform pixel-level stabilization. The system is distinguished by its ability to integrate with professional non-linear video editing software via plugins, allowing stabilization to be applied directly to timelines without transcoding original footage. It supports diverse telemetry ingestion from camera brands, flight
Albumentations is a computer vision image augmentation library designed to increase training data diversity for deep learning models. It provides a toolset for applying geometric and color transformations to images and annotations, including a specialized collection of 3D operations for volumetric data used in medical and scientific imaging. The library functions as an image mask and bounding box transformer, automatically updating masks, bounding boxes, and keypoints when images undergo geometric changes. This ensures that spatial alterations remain synchronized across images and their assoc
Albumentations is an image augmentation library and computer vision preprocessing tool designed to expand datasets for deep learning models. It provides a collection of transformations that modify pixel values and spatial geometry to increase the diversity of training samples and improve model generalization. The library supports both 2D image augmentation and 3D volumetric data augmentation. It handles a variety of labels alongside images, ensuring that bounding boxes, keypoints, and segmentation masks remain accurately aligned when spatial transformations are applied. The tool incorporates
Auto-subs is an AI transcription and automatic captioning tool that converts spoken audio from video files into synchronized subtitles. It functions as a subtitle generator and a transcription bridge, enabling the conversion of speech to text with automatic speaker identification and multi-language translation support. The software prioritizes data privacy by utilizing on-device AI inference to process audio and video files locally on the user's hardware. It distinguishes itself by offering deep integration with professional video editing workflows, allowing users to export timing and transcr
Remotion is a programmatic video framework that enables the creation of video content using component-based logic and standard web technologies. By leveraging a declarative animation engine, it allows developers to structure visual content as a hierarchy of reusable components, ensuring that animations and state updates remain consistent through deterministic frame execution. The framework distinguishes itself by utilizing a headless browser renderer that captures visual output frame-by-frame to generate high-quality video files. This architecture supports a cloud-native media pipeline, allow
Detectron2 is a PyTorch computer vision framework and visual recognition platform designed for training and deploying models for object detection, image segmentation, and visual recognition. It provides a research-oriented environment for training complex vision models with multi-GPU acceleration. The project includes a specialized object detection library for identifying and locating multiple objects via bounding boxes, as well as an image segmentation toolkit for creating pixel-level masks through instance, semantic, and panoptic segmentation. Additionally, it features a human pose estimati
Libvips is a C-based image processing library designed to manipulate large visual assets through a low-memory, parallel processing pipeline. It functions as a streaming image processor that avoids loading entire files into system memory, enabling the handling of massive images in resource-constrained environments. The library distinguishes itself through a demand-driven architecture that constructs a deferred execution plan, computing only the necessary pixels for a final output. By utilizing a cache-friendly tiled processing model and memory-mapped file access, it minimizes latency and redun
KittenTTS is a neural text-to-speech engine and text-to-audio synthesis tool that converts written text into spoken audio using lightweight neural network models. It functions as both a speech synthesizer and an audio file generator, producing spoken audio for offline playback. The system includes a text normalization processor that expands numbers and abbreviations into full spoken words to improve the naturalness of the synthesized speech. It supports diverse voice options and provides the ability to adjust playback speed.
A high-performance, cross-platform video processing framework for Python, packed with unique features.
This project is a Python wrapper for the OpenCV computer vision library, providing a bridge that exposes high-performance C++ functions to the Python programming language. It serves as a collection of tools for real-time image processing, object detection, and machine learning on visual data. The project provides precompiled binary distributions, allowing for the integration of vision capabilities into Python applications without requiring a local C++ compiler. It offers multi-variant package distributions, including headless versions designed for server or cloud environments where a graphica
Python library and CLI tool to interface with Google Translate's text-to-speech API