Hadithi is an open-source tool that helps AI and ML developers build high-quality video datasets for fine-tuning large language models (LLMs). It is a bash-based command-line tool that streamlines the preparation of video data, which makes it easier for developers to fine-tune their models.
With Hadithi, developers can convert videos from various sources, including YouTube, torrents, and enterprise platforms, into datasets suitable for training LLMs. The tool organizes videos into folders, segments them, detects scenes, and performs other preprocessing tasks, which simplifies data preparation for model training.
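As a rough illustration of the segmenting step, the sketch below builds an ffmpeg command that cuts one video into fixed-length clips. It is not taken from Hadithi's own scripts: the function name `segment_command`, the output naming pattern, and the 10-second default are assumptions made for this example, and it assumes ffmpeg is installed if the command is actually run.

```python
import shlex
from pathlib import Path


def segment_command(src: Path, out_dir: Path, seconds: int = 10) -> list[str]:
    """Build an ffmpeg command that cuts `src` into clips of roughly `seconds` each.

    Hypothetical helper for illustration; not part of Hadithi.
    """
    # Numbered output files, e.g. clips/talk_000.mp4, clips/talk_001.mp4, ...
    pattern = out_dir / f"{src.stem}_%03d{src.suffix}"
    return [
        "ffmpeg", "-i", str(src),
        "-map", "0",                    # keep all streams from the input
        "-c", "copy",                   # stream copy: no re-encode, so cuts fall on keyframes
        "-f", "segment",                # ffmpeg's segment muxer
        "-segment_time", str(seconds),  # target clip length in seconds
        "-reset_timestamps", "1",       # each clip starts at t=0
        str(pattern),
    ]


if __name__ == "__main__":
    cmd = segment_command(Path("raw/talk.mp4"), Path("clips"), seconds=30)
    # Print the command instead of running it, so the sketch works without ffmpeg.
    print(shlex.join(cmd))
```

Because the command uses stream copy, clip lengths are only approximate; re-encoding would give exact cuts at the cost of speed.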
The tool also renames videos with timestamps, removes audio, filters out short videos, resizes videos, and extracts frames. With these features, Hadithi acts as a data factory for generative video models and helps developers create and refine large language models.
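The operations listed above map onto common ffmpeg invocations. The following is a minimal sketch of how each could be expressed, assuming ffmpeg is available; every function name, the filename format, and the default width and frame rate are hypothetical choices for this example and do not describe Hadithi's actual commands or options.

```python
from datetime import datetime
from pathlib import Path


def timestamped_name(src: Path, when: datetime) -> str:
    """Prefix a file name with a sortable timestamp (format is an assumption)."""
    return f"{when:%Y%m%d_%H%M%S}_{src.name}"


def strip_audio_command(src: Path, dst: Path) -> list[str]:
    """ffmpeg command that drops all audio (-an) and copies the video stream as-is."""
    return ["ffmpeg", "-i", str(src), "-an", "-c:v", "copy", str(dst)]


def resize_command(src: Path, dst: Path, width: int = 640) -> list[str]:
    """ffmpeg command that scales to `width`; -2 keeps the aspect ratio with an even height."""
    return ["ffmpeg", "-i", str(src), "-vf", f"scale={width}:-2", str(dst)]


def extract_frames_command(src: Path, out_dir: Path, fps: int = 1) -> list[str]:
    """ffmpeg command that writes `fps` frames per second as numbered JPEG files."""
    return ["ffmpeg", "-i", str(src), "-vf", f"fps={fps}",
            str(out_dir / "frame_%05d.jpg")]


def keep_long_enough(durations: dict[str, float], min_seconds: float) -> list[str]:
    """Return the names of videos at least `min_seconds` long, sorted by name.

    `durations` maps file name to length in seconds; in practice these values
    could come from ffprobe, which is outside the scope of this sketch.
    """
    return sorted(name for name, secs in durations.items() if secs >= min_seconds)
```

The helpers only build argument lists or filter plain data, so they can be inspected or unit-tested without touching any video files; passing a list to `subprocess.run(cmd, check=True)` would execute it.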
To learn more about Hadithi and its capabilities, visit the project's GitHub repository.