
A Commander for modern Go CLI interactions

Go 38,171 2,851 Updated Nov 5, 2024

An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.

Python 1,289 212 Updated Mar 9, 2020

Example end to end data engineering project.

Python 1,135 223 Updated Dec 8, 2022

Easy to maintain open source documentation websites.

TypeScript 56,555 8,487 Updated Nov 4, 2024

🔯 Modern, batteries-included Hugo theme for creating beautiful doc, blog and static websites

CSS 699 166 Updated Nov 3, 2024

Hands-on MLOps projects to explore and learn the practical aspects of machine learning engineering for production.

Jupyter Notebook 25 4 Updated Jul 2, 2024

Production ML rental prediction system.

Jupyter Notebook 29 1 Updated Feb 28, 2024

A list of publicly available datasets with real-time data maintained by the team at bytewax.io

604 25 Updated May 28, 2024

A Deep Learning NLP repository that uses TensorFlow to cover everything from text preprocessing to downstream tasks for recent models such as Topic Models, BERT, GPT, and LLMs.

Jupyter Notebook 529 267 Updated Sep 6, 2024

Cool DE Projects

Jupyter Notebook 18 3 Updated Jun 27, 2024

A summary of data engineer skills

15 5 Updated Jan 22, 2024

🥪🦘 An open source sandbox project exploring dbt workflows via a fictional sandwich shop's data.

107 141 Updated Oct 3, 2024

A self-contained dbt project for testing purposes

453 931 Updated Sep 12, 2024

A streaming ETL pipeline for Realtime Tweet Collection, Analysis and Reporting

Python 9 3 Updated Sep 22, 2021

Processing TfL data for bike usage with Google Cloud Platform.

Python 42 5 Updated Jul 15, 2022

This is a PySpark-based data pipeline that fetches weather data for a few cities, performs some basic processing and transformation on the data, and then writes the processed data to a Google Cloud…

Python 5 2 Updated May 1, 2023

A data pipeline with Kafka, Spark Streaming, dbt, Docker, Airflow, and GCP!

Jupyter Notebook 11 Updated Jul 6, 2023

docker-compose.yml files for cp-all-in-one, cp-all-in-one-community, cp-all-in-one-cloud, Apache Kafka Confluent Platform

Python 958 683 Updated Nov 5, 2024

Deploying Data Pipelines - Kubernetes Way

Python 2 2 Updated Feb 25, 2019

Apache Pinot - A realtime distributed OLAP datastore

Java 5,494 1,284 Updated Nov 6, 2024

The official home of the Presto distributed SQL query engine for big data

Java 16,035 5,370 Updated Nov 6, 2024

Apache Superset is a Data Visualization and Data Exploration Platform

TypeScript 62,591 13,796 Updated Nov 5, 2024

End-to-end data engineering project with Kafka, Airflow, Spark, Postgres, and Docker.

Python 63 30 Updated Aug 7, 2024

A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more!

Python 566 118 Updated Apr 16, 2022

Free Data Engineering course!

Jupyter Notebook 25,051 5,361 Updated Nov 4, 2024

Streaming Synthetic Sales Data Generator: Streaming sales data generator for Apache Kafka, written in Python

Python 43 20 Updated Dec 28, 2022