Repository for the Data-Centric AI Competition
Code for a top 5% finish in the Data-Centric AI Competition organized by Andrew Ng and DeepLearning.AI
Customer churn training and prediction library with automatic dataset size optimisation features.
A curated list of awesome open source tools and commercial products to catalog, version, and manage data 🚀
An Empirical Study of Memorization in NLP (ACL 2022)
Find illustrations in historic books using computer vision
📕 flyswot book on developing a pragmatic machine learning workflow in a library setting
Lab assignments for Introduction to Data-Centric AI, MIT IAP 2023 👩🏽‍💻
nbsynthetic is a simple and robust tabular synthetic data generation library for small and medium-sized datasets
Data-SUITE: Data-centric identification of in-distribution incongruous examples (ICML 2022)
Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data (NeurIPS 2022)
Data-Centric AI Competition 2023, organised by Cleanlab and MachineHack. This is one of the solutions I tried; it achieved 13th rank.
Implementation of data typology for imbalanced datasets.
[ECCV 2022] Official Implementation for Unsupervised Selective Labeling for More Effective Semi-Supervised Learning
The implementation for our paper, "Improving Simultaneous Machine Translation with Monolingual Data," accepted to AAAI 2023. 🎉
Input-Agnostic Face Detection
Quickly set up an image labelling web application for manually tagging images for machine learning tasks.
AQuA: A Benchmarking Tool for Label Quality Assessment