This is a PySpark-based data pipeline that fetches weather data for a few cities, performs some basic processing and transformation on the data, and then writes the processed data to a Google Cloud…

Python 5 2 Updated May 1, 2023

sarthak-sarbahi / covid19datapipeline

Python 1 1 Updated Dec 17, 2023

nburkett / ShowPulse_Ticketmaster

A data pipeline with Kafka, Spark Streaming, dbt, Docker, Airflow, and GCP!

Jupyter Notebook 11 Updated Jul 6, 2023

confluentinc / cp-all-in-one

docker-compose.yml files for cp-all-in-one, cp-all-in-one-community, cp-all-in-one-cloud, Apache Kafka Confluent Platform

Python 958 683 Updated Nov 5, 2024

sontivr / data-pipeline-in-k8s

Forked from agdsouza/OnTheSamePage

Deploying Data Pipelines - Kubernetes Way

Python 2 2 Updated Feb 25, 2019

dragonhail

Stars

spf13 / cobra

san089 / goodreads_etl_pipeline

damklis / DataEngineeringProject

facebook / docusaurus

imfing / hextra

solygambas / mlops-projects

alexandergirardet / london_rightmove

bytewax / awesome-public-real-time-datasets

ukairia777 / tensorflow-nlp-tutorial

vedanthv / data-engineering-portfolio

mjs1995 / muse-data-engineer

dbt-labs / jaffle-shop

dbt-labs / jaffle-shop-classic

AlphanAksoyoglu / tweeter-etl-pipeline

jackgisby / tfl-bikes-data-pipeline

24jmwangi / weather_data_pipeline

sarthak-sarbahi / covid19datapipeline

nburkett / ShowPulse_Ticketmaster

confluentinc / cp-all-in-one

sontivr / data-pipeline-in-k8s

apache / pinot

prestodb / presto

apache / superset

HamzaG737 / data-engineering-project

simardeep1792 / Real-Time-E-Commerce-Analytics-Pipeline

simardeep1792 / Data-Engineering-Streaming-Project

ankurchavda / streamify

DataTalksClub / data-engineering-zoomcamp

garystafford / streaming-sales-generator

ABZ-Aaron / Reddit-API-Pipeline