spark-rdd

This project uses PySpark DataFrames and PySpark RDDs to implement item-based collaborative filtering. By calculating cosine similarity scores or identifying the movies with the highest number of shared viewers, the system recommends 10 movies similar to a given target movie that align with users' preferences.

python spark apache-spark collaborative-filtering pyspark movie-recommendation spark-dataframes spark-rdd

Updated Jun 29, 2024
Jupyter Notebook
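
The cosine-similarity step described above can be sketched in plain Python. This is only an illustration of the idea, not the repository's PySpark code: the function name `top_similar_items`, the `(user, item, rating)` tuple layout, and the `min_shared` threshold are all assumptions made for the example.

```python
from collections import defaultdict
from math import sqrt


def top_similar_items(ratings, target, k=10, min_shared=1):
    """Rank items by cosine similarity to `target` over co-rating users.

    `ratings` is an iterable of (user, item, rating) tuples (assumed layout).
    Returns up to `k` tuples of (item, similarity, number_of_shared_users).
    """
    # Group ratings by user so co-rated items can be paired.
    by_user = defaultdict(dict)
    for user, item, rating in ratings:
        by_user[user][item] = rating

    # For every item co-rated with the target, accumulate the dot product,
    # both squared norms, and the count of shared users.
    stats = defaultdict(lambda: [0.0, 0.0, 0.0, 0])
    for items in by_user.values():
        if target not in items:
            continue
        x = items[target]
        for other, y in items.items():
            if other == target:
                continue
            acc = stats[other]
            acc[0] += x * y
            acc[1] += x * x
            acc[2] += y * y
            acc[3] += 1

    scored = []
    for other, (xy, xx, yy, shared) in stats.items():
        if shared >= min_shared and xx > 0 and yy > 0:
            scored.append((other, xy / (sqrt(xx) * sqrt(yy)), shared))

    # Highest similarity first; break ties by shared-user count, then name.
    scored.sort(key=lambda t: (-t[1], -t[2], t[0]))
    return scored[:k]
```

Raising `min_shared` would filter out pairs supported by only a handful of viewers, which is one common way to keep spurious high scores out of the top 10.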

MaxineXiong / Degrees-of-Separation-with-Breadth-first-Search

This project uses PySpark RDDs and the breadth-first search (BFS) algorithm to find the shortest path and the degrees of separation between two given Marvel superheroes, based on their appearances together in the same comic books. It lets users discover connections between their favourite superheroes in the Marvel universe.

python spark apache-spark pyspark breadth-first-search bfs-algorithm marvel-characters spark-rdd degrees-of-separation

Updated Jun 29, 2024
Jupyter Notebook
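
The breadth-first search described above can be sketched in plain Python on a small in-memory graph. This is a hedged illustration of the technique rather than the repository's RDD-based implementation; the helper names `build_graph` and `shortest_path` and the input layout (one list of hero names per comic book) are assumptions.

```python
from collections import defaultdict, deque


def build_graph(comics):
    """Connect every pair of heroes that appear in the same comic book.

    `comics` is an iterable of hero-name lists (assumed layout).
    """
    graph = defaultdict(set)
    for heroes in comics:
        for hero in heroes:
            graph[hero].update(h for h in heroes if h != hero)
    return graph


def shortest_path(graph, start, goal):
    """Return one shortest path from `start` to `goal`, or None if unreachable."""
    if start == goal:
        return [start]
    parent = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour in parent:
                continue
            parent[neighbour] = node
            if neighbour == goal:
                # Walk the parent links back to the start.
                path = [neighbour]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None
```

The degrees of separation are then the number of edges on the returned path, that is, `len(path) - 1`.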

adityajn105 / Apache-Spark-Tutorials

Apache Spark is a big data analysis framework.

spark bigdata pyspark spark-ml spark-rdd spark-tutorials

Updated Apr 11, 2019
Jupyter Notebook

ricardoariasalazar / Flights-Delay

In this project, we visualize, manipulate, model, and stream historical flight-delay data using Spark RDD, Spark SQL, and Kafka.

pyspark kafka-streams spark-sql big-data-analytics spark-rdd

Updated Jan 5, 2022
Jupyter Notebook

manojpawar94 / Spark-Scala-Examples

I have implemented sample programs using Apache Spark. The programs are built on the concepts of Spark RDD and Spark SQL DataFrames.

spark apache-spark spark-sql spark-rdd

Updated Aug 31, 2021
Scala

nikhilkumawat03 / Extracting-Relevant-Document

Contains projects based on Big Data.

hadoop java-8 mapreduce spark-sql spark-rdd

Updated Feb 15, 2020
Java

mohammad-safari / spark-hadoop-exercise

Spark and Hadoop exercise for a cloud computing course - AUT, fall 1402-1403

big-data spark hadoop hdfs mapreduce spark-sql spark-dataframes hadoop-yarn spark-rdd

Updated Feb 1, 2024
Jupyter Notebook

ShreeshaN / SparkBigDataTutorials

Demonstration of basic data transformations using Spark RDD and Spark DataFrame in Scala

spark spark-sql spark-scala spark-rdd scala-sbt spark-sql-udf

Updated Nov 18, 2022
Scala

vaibhav50596 / DeerfootTrailAnalysis

The goal is to train a linear regression model to predict Deerfoot commute times given weather and accident conditions using Spark RDD and MLlib

spark spark-mllib spark-rdd

Updated Apr 12, 2020
Jupyter Notebook

firedent / Data-curation-and-indexing-with-ElasticSearch

This program processes legal reports via Stanford CoreNLP and indexes them in ElasticSearch.

elasticsearch json scala xml spark-rdd

Updated Dec 4, 2019
Scala

on2e / ntua-atdb

Advanced Topics in Databases course project - NTUA ECE - 2022-23

apache-spark pyspark spark-dataframes advanced-database apache-hadoop ntua-ece spark-rdd

Updated Mar 30, 2023
Python

RiccardoRevalor / Spark

Spark exercises

spark pyspark spark-sql spark-rdd

Updated Nov 16, 2024
Jupyter Notebook

demanejar / spark-rdd

Spark RDD basic

spark project spark-rdd

Updated Jul 15, 2021
Java

contactsunny / spring-spark-s3-file-read

A POC written in Java with the Spring framework that uses Apache Spark to read a file from Amazon S3 and count the number of lines in the file.

java spark apache-spark spring spring-boot poc spark-rdd spark-s3 thetechcheck rdd-s3 spark-rdd-s3

Updated May 30, 2018
Java

Here are 23 public repositories matching this topic...

mahmoudparsian / pyspark-tutorial

mahmoudparsian / big-data-mapreduce-course

Thomas-George-T / Movies-Analytics-in-Spark-and-Scala

yennanliu / spark-etl-pipeline

nipunmanral / Community-Detection-In-Graphs

Ren294 / Log-Analysis-Project

MaxineXiong / Item-based-collaborative-filtering
