datajuicer

datajuicer/data-juicer

Python · Apache-2.0 · active · popular
Health: 88

Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Stars: 6.4k
Forks: 371
Open Issues: 67
Contributors: 371
Last Push: 3d ago

Health Breakdown

Activity: 25
Community: 25
Maintenance: 13
Popularity: 25
#data #data-analysis #data-pipeline #data-processing #data-science #data-visualization #foundation-models #instruction-tuning #large-language-models #llm #llms #multi-modal #pre-training #synthetic-data

More Python repos

isaac-sim
isaac-sim/IsaacLab
Unified framework for robot learning built on NVIDIA Isaac Sim
Stars: 7.2k · Health: 92
home-assistant
home-assistant/core
🏡 Open source home automation that puts local control and privacy first.
Stars: 87.1k · Health: 92
hardikvasa
hardikvasa/google-images-download
Python Script to download hundreds of images from 'Google Images'. It is a ready-to-run code!
Stars: 8.7k · Health: 91