fegaras/array

SAC: Scalable Array Comprehensions

An array comprehension is a monolithic array construction. Its group-by syntax makes it as expressive as basic SQL and lets us capture many array computations in declarative form.
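
To illustrate the group-by idea, here is a minimal sketch written with plain Scala collections. It is not SAC syntax and not code generated by SAC. The object name `GroupByComprehension`, the `Matrix` type alias, and the `multiply` function are hypothetical names chosen for this example. The sketch computes a matrix product by joining entries on the shared index, grouping the partial products by output position, and summing each group.

```scala
// Hypothetical illustration: plain Scala collections, not SAC syntax.
object GroupByComprehension {
  // A matrix stored as a list of ((row, column), value) entries.
  type Matrix = List[((Int, Int), Double)]

  // C(i, j) = sum over k of A(i, k) * B(k, j)
  def multiply(a: Matrix, b: Matrix): Matrix = {
    // Join the two matrices on the shared index k.
    val products = for {
      ((i, k), x) <- a
      ((kk, j), y) <- b
      if k == kk
    } yield ((i, j), x * y)

    // Group the partial products by output position and sum each group.
    products
      .groupBy(_._1)
      .map { case (position, group) => (position, group.map(_._2).sum) }
      .toList
      .sortBy(_._1)
  }

  def main(args: Array[String]): Unit = {
    val a: Matrix = List(((0, 0), 1.0), ((0, 1), 2.0), ((1, 0), 3.0), ((1, 1), 4.0))
    val b: Matrix = List(((0, 0), 5.0), ((0, 1), 6.0), ((1, 0), 7.0), ((1, 1), 8.0))
    multiply(a, b).foreach(println)
  }
}
```

The join followed by a group-by and an aggregation mirrors a SQL query with a `GROUP BY` clause, which is the sense in which a comprehension with group-by can express array computations declaratively.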

SAC translates array comprehensions to Scala code that calls Spark RDD operations, whose functional arguments call Scala's Parallel Collections library for multicore parallelism.

Benchmarks

The SAC benchmarks were evaluated on SDSC Comet. The SBATCH shell script used to run the benchmarks on Comet is comet.run in the tests/spark directory. The log files generated by the scripts, which contain the run times, are run*.log in the same directory.

The cluster should support the Slurm Workload Manager, Hadoop 2.*, and myhadoop.

To compile SAC, run mvn install in the top-level directory.

Steps to run the scripts on Comet (or on any Slurm-managed cluster):

  1. Install Scala 2.12.
  2. Install Spark 3.0 on Hadoop 2.7.
  3. Change SCALA_HOME and SPARK_HOME in the SBATCH scripts to point to your installations.
  4. Execute the scripts using sbatch, e.g., sbatch comet.run.
