Releases: Huanshere/VideoLingo
v1.8.0
Release Notes
🔧 Improvements:
- Refined and optimized the overall code structure for better performance and maintainability.
- Enhanced the prompt to be more concise and applicable to a wider range of models.
- Improved fuzzy and precise matching in translation processes.
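As a rough illustration of combining precise and fuzzy matching, the sketch below tries an exact match first and falls back to a similarity score. It is not VideoLingo's actual code: the function name `find_line`, the 0.8 threshold, and the use of `difflib` are all assumptions made for this example.

```python
from difflib import SequenceMatcher
from typing import List, Optional


def find_line(target: str, candidates: List[str], threshold: float = 0.8) -> Optional[int]:
    """Return the index of the candidate that matches target, or None.

    Hypothetical helper: the name and the 0.8 threshold are illustrative only.
    """
    normalized = target.strip()

    # Precise pass: an exact match (ignoring surrounding whitespace) wins outright.
    for i, cand in enumerate(candidates):
        if cand.strip() == normalized:
            return i

    # Fuzzy pass: keep the most similar candidate whose ratio reaches the threshold.
    best_index, best_score = None, threshold
    for i, cand in enumerate(candidates):
        score = SequenceMatcher(None, normalized.lower(), cand.strip().lower()).ratio()
        if score >= best_score:
            best_index, best_score = i, score
    return best_index
```

Running the exact pass first keeps the common case cheap and avoids a near-duplicate line outranking the true match.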
🐛 Bug Fixes:
- Fixed several errors occurring during the FFmpeg compression process.
- Resolved phrase errors caused by missing initialization and by null results returned from model translations.
📝 Updates:
- Demucs vocal separation is no longer performed by default before transcription, addressing the issue of missing sentences and improving processing speed.
- Removed support for the whisperX Replicate API to simplify the open-source project.
- Adjusted the translation process to handle smaller segments, reducing the likelihood of errors.
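To illustrate what translating in smaller segments can look like, here is a minimal sketch that groups sentences into chunks under a character budget. The function name `split_into_chunks` and the 600-character default are hypothetical and are not taken from the project.

```python
from typing import List


def split_into_chunks(sentences: List[str], max_chars: int = 600) -> List[List[str]]:
    """Group sentences into chunks whose combined length stays within max_chars.

    Hypothetical helper: the budget of 600 characters is an illustrative default.
    A single sentence longer than the budget still gets a chunk of its own.
    """
    chunks: List[List[str]] = []
    current: List[str] = []
    size = 0
    for sentence in sentences:
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and size + len(sentence) > max_chars:
            chunks.append(current)
            current, size = [], 0
        current.append(sentence)
        size += len(sentence)
    if current:
        chunks.append(current)
    return chunks
```

Smaller chunks give the model less text to keep aligned per request, which is one plausible reason they reduce errors.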
v1.7.1
Release Notes
🚀 New Features:
- Added MPS support for Demucs to improve performance.
- Implemented an error retry mechanism for batch processing.
- Set the default to use the Gemini model and updated documentation accordingly.
- Added an auto-update feature for ytdlp.
🐛 Bug Fixes:
- Corrected a long video segmentation error.
- Fixed loading issues with the local Chinese Whisper model.
- Improved audio splitting robustness and encoding handling.
- Resolved issues with handling reference audio prerequisites for GPT-SoVITS batch processing.
- Correctly implemented retry on translation failures.
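A retry mechanism of the kind mentioned above could be sketched as follows. This is a generic pattern under stated assumptions, not the project's implementation: `call_with_retry`, the default of three attempts, and the linearly growing delay are all illustrative choices.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retry(func: Callable[[], T], attempts: int = 3, delay: float = 1.0) -> T:
    """Call func until it succeeds or the attempts run out.

    Hypothetical helper: the attempt count and delay are illustrative defaults.
    """
    last_error: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as exc:  # a real pipeline would catch narrower errors
            last_error = exc
            if attempt < attempts:
                # Wait a little longer after each failure before trying again.
                time.sleep(delay * attempt)
    raise RuntimeError(f"failed after {attempts} attempts") from last_error
```

A translation step would then be wrapped as `call_with_retry(lambda: translate(chunk))`, where `translate` stands in for whatever function performs the request.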
🔧 Improvements:
- Updated the CPU-specific torch version in the installation process.
- Simplified the prompt reasoning chain, since the longer chain yielded only minimal improvement.
📝 Updates:
- Removed the install option in the OneKey batch script.
- Added a section for the SaaS website in the documentation.
v1.7.0
🚀 New Features:
- Enabled GPU acceleration for FFmpeg encoding (5x speed boost)
- Replaced UVR with Demucs for vocal isolation (5x speed boost)
- Automated torch version selection based on GPU in install.py
🐛 Bug Fixes:
- Resolved local whisperX video segmentation issue
- Fixed support for uppercase file extensions
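Uppercase-extension bugs usually come from comparing the raw suffix against a lowercase allow-list. The sketch below shows one common fix; the `ALLOWED_EXTENSIONS` set and the `is_supported` name are assumptions for illustration, not the project's actual list or function.

```python
from pathlib import Path

# Hypothetical allow-list; the formats the project really accepts may differ.
ALLOWED_EXTENSIONS = {".mp4", ".mov", ".mkv", ".mp3", ".wav"}


def is_supported(filename: str) -> bool:
    """Return True when the file's extension is allowed, ignoring letter case."""
    # Lowercasing the suffix makes "CLIP.MP4" and "clip.mp4" behave the same.
    return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS
```

Normalizing once at the comparison point avoids listing every case variant in the allow-list.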
🔧 Improvements:
- Simplified code structure
- Improved spacing between Chinese and English in subtitles with autocorrect
- Streamlined PyPI sources to official and Tsinghua mirrors
📝 Updates:
- One-click package and free test key are no longer provided
- Commercial SaaS version will be released tomorrow
v1.6.4
🚀 New Features:
- Added m4a file support
- Automated Chinese transcription model download
- Optimized multilingual PyPI source selection
🐛 Bug Fixes:
- Fixed uppercase file extension issue
- Adjusted summary length to 4k characters
📝 Updates:
- New logo and documentation improvements
- Model uploaded to Docker Hub
v1.6.3
- 🌐 Simplified the implementation of multilingual support. Chinese users, please apply the localization patch according to the installation documentation 🇨🇳. Support for other languages is TODO 📝
- 📚 Updated the technical documentation on the official website
v1.6.2
📢 Announcement: Starting from this version, the open-source edition will only receive stability updates. Our commercial SaaS version is coming soon, stay tuned!
🎉 New Features
- Added support for audio file uploads
- Added Docker support
🛠️ Stability Fixes
- Fixed upload-related bugs
- Fixed audio format processing issues
- Optimized configuration file import
- Fixed OpenAI TTS configuration issues
- Fixed and optimized ffmpeg-related issues
- Added int8 support for older GPUs
- Optimized pip source selection and Hugging Face mirror choice during installation
v1.6.1
Enhance System Stability
- Revert FFmpeg installation due to encountered issues
- Resolve SoVITS configuration import problems
- Address YAML installation errors
- Replace Librosa with FFmpeg to mitigate compatibility issues
- Boost stability in batch processing mode
v1.6
- Refactored config.py into config.yaml, with corresponding restructuring of the codebase. The UI no longer requires clicking a save button.
- Automatic source selection during local installation
- Fixed an issue where language settings in batch mode were ineffective
- Improved stability for accessing Replicate
v1.5.1
- Emergency fix for the blank-field NaN bug and the zh check bug in batch mode
- Fixed an issue where the language set in the tasks settings of version 1.5 was ineffective
- Added mirror source selection for installation steps
v1.5
Major Updates:
- Added batch processing functionality
- Simplified JSON key structure in prompts
Minor Improvements:
- Support for running in environments with 6GB of GPU memory
- Improved empty line detection in NLP step
- Fixed font issues in Linux environments
- Handled phrase alignment errors, prompting the user to retry UVR
- Limited file upload size to 500MB in Streamlit
- Made video compression a separate optional step
- Restructured README documentation