You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Parallel implementation of Breadth-First Search algorith in Java MapReduce and PySpark. This implementation finds degrees of separation between Twitter Users
A REST-based service that translates the SQL query into MapReduce and Spark jobs. It runs these jobs and provides the JSON object. SQL to MapReduce and Spark translator.