- A UI dashboard built on top of Spark to browse knowledge (a.k.a. data)
- Query Spark in real time and visualise the results as a graph.
- Supports SQL query syntax.
- This is a sample application meant to show how to build an analytics dashboard on top of data processed by Spark. You can customize it to your needs.
- When you run this project, the dashboard page will look something like the one shown above.
- The user types a query and hits submit.
- Upon submit, Spark processes the query and returns the result as JSON (a minimal sketch of this flow follows the build and run steps below).
- The JSON result is rendered as a graph using D3 and AngularJS.
- The demo above illustrates a simple country-profile search
CountryCode IN ("USA", "IND", "WLD")
and shows how the three countries' profile information is displayed as a graph. Notice that any relationship shared by two countries is linked via a common node.
- Build:
mvn clean install
- Run:
spark-submit --class com.spoddutur.MainApp graph-knowledge-browser.jar
- Open http://localhost:8002/index.html in a browser and start querying.
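As described above, the server-side flow boils down to wrapping the WHERE clause typed in the UI into a Spark SQL statement and returning the matching rows as JSON. Below is a minimal sketch of that flow, assuming the CSV has a header row and is registered under the temp-view name `countries` (both assumptions; the project's actual view name and options may differ):

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch of the submit -> Spark SQL -> JSON flow.
// The temp-view name "countries" and the CSV read options are assumptions.
object QueryFlowSketch {

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("graph-knowledge-browser-sketch")
      .master("local[*]")
      .getOrCreate()

    // Load the knowledge base and expose it to SQL.
    spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("data/countriesProfile.csv")
      .createOrReplaceTempView("countries")

    // The UI sends only the WHERE clause; the server wraps it in a full query.
    val whereClause = """CountryCode IN ("USA", "IND", "WLD")"""
    val jsonRows: Array[String] = spark
      .sql(s"SELECT * FROM countries WHERE $whereClause")
      .toJSON      // one JSON string per row
      .collect()

    // These JSON rows are what D3/AngularJS turn into the graph in the browser.
    jsonRows.take(3).foreach(println)
    spark.stop()
  }
}
```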
I've used open country-profile data from http://data.worldbank.org as the knowledge base in this project.
The following table shows some sample rows to give an idea of the columns and schema of the knowledge-base data:
Query USA profile:
CountryCode = 'USA'
Query all countries whose CountryCode starts with the letter 'I':
CountryCode LIKE 'I%'
Query to get profile info for India, the USA, and the World:
CountryCode IN ('USA', 'WLD', 'IND')
Query total population, mortality rate, and population growth information for India, the USA, and the World:
CountryCode IN ('USA', 'WLD', 'IND') AND SeriesCode IN ('SP.POP.TOTL', 'SH.DYN.MORT', 'SP.POP.GROW')
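If you want to check the exact columns before writing filters like the ones above, a quick way is to load the CSV and print its schema. This is a sketch; the header/inferSchema options are assumptions about the CSV layout:

```scala
import org.apache.spark.sql.SparkSession

// Inspect the knowledge-base schema before writing WHERE clauses.
object InspectSchema {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("inspect-countries-profile")
      .master("local[*]")
      .getOrCreate()

    val df = spark.read
      .option("header", "true")       // assumes the CSV has a header row
      .option("inferSchema", "true")
      .csv("data/countriesProfile.csv")

    df.printSchema()                  // shows columns such as CountryCode and SeriesCode
    df.show(5, truncate = false)      // a few sample rows
    spark.stop()
  }
}
```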
Optionally, you can provide configuration parameters such as the host and port from the command line. To see the list of configurable parameters, type:
$ spark-submit <path-to-graph-knowledge-browser.jar> --help
Help content will look something like this:
Apart from Spark, this application uses akka-http for browser integration, so it needs configuration parameters such as the akka-http port to bind to, the Spark master, and the Spark application name. (A sketch of how this option parsing might look follows the usage text below.)
Usage: spark-submit graph-knowledge-browser.jar [options]
Options:
-h, --help
-m, --master <master_url> spark://host:port, mesos://host:port, yarn, or local. Default: $sparkMasterDef
-n, --name <name> A name for your application. Default: $sparkAppNameDef
-p, --akkaHttpPort <portnumber> Port to which akka-http is bound. Default: $akkaHttpPortDef
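For reference, option handling that produces usage text like the above could be written with the scopt library along the following lines. This is only a sketch: the use of scopt and the default values (local[*] master, the app name, port 8002) are assumptions, not necessarily what MainApp actually does.

```scala
import scopt.OptionParser

// Assumed defaults; the project's real defaults live in application.conf / MainApp.
case class AppConfig(
  sparkMaster: String  = "local[*]",
  sparkAppName: String = "graph-knowledge-browser",
  akkaHttpPort: Int    = 8002)

object ArgsSketch {

  private val parser = new OptionParser[AppConfig]("spark-submit graph-knowledge-browser.jar") {
    opt[String]('m', "master")
      .valueName("<master_url>")
      .action((v, c) => c.copy(sparkMaster = v))
      .text("spark://host:port, mesos://host:port, yarn, or local.")
    opt[String]('n', "name")
      .valueName("<name>")
      .action((v, c) => c.copy(sparkAppName = v))
      .text("A name for your application.")
    opt[Int]('p', "akkaHttpPort")
      .valueName("<portnumber>")
      .action((v, c) => c.copy(akkaHttpPort = v))
      .text("Port to which akka-http is bound.")
    help("help").text("Prints this usage text.")
  }

  def main(args: Array[String]): Unit =
    parser.parse(args, AppConfig()) match {
      case Some(cfg) => println(s"Running with $cfg") // hand cfg to the Spark/akka-http setup
      case None      => ()                            // scopt has already printed the usage
    }
}
```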
One route is configured:
1. http://host:port/index.html - takes the user to the knowledge-browser page
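In akka-http's routing DSL, the index.html route could look roughly like this. It is a sketch, not the exact contents of Router.scala, and it assumes index.html is packaged as a classpath resource (e.g. under src/main/resources):

```scala
import akka.http.scaladsl.server.Directives._
import akka.http.scaladsl.server.Route

// Sketch of the index.html route; assumes index.html is on the classpath.
object RouterSketch {
  val route: Route =
    path("index.html") {
      get {
        getFromResource("index.html")
      }
    }
}
```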
- src/main/scala/com/spoddutur/MainApp.scala: The main class where application execution begins.
- data/countriesProfile.csv: Sample data used for querying.
- src/main/resources/application.conf: Tweak the command-line argument defaults here before building the jar and running spark-submit.
- src/main/scala/com/spoddutur/web/WebServer.scala: Starts the akka-http web server at the configured host and port and registers the routes (index.html). A minimal sketch of this wiring follows this list.
- src/main/scala/com/spoddutur/web/Router.scala: This is where more routes can be created and registered apart from index.html.
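As mentioned for WebServer.scala above, starting the server essentially means binding the routes with akka-http on the configured host and port. Here is a minimal sketch of that wiring, with localhost and port 8002 hard-coded purely for illustration (the real values come from application.conf or the command-line options):

```scala
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.server.Directives._
import akka.stream.ActorMaterializer

// Sketch of the web-server wiring: bind the routes on the configured host/port.
object WebServerSketch {
  def main(args: Array[String]): Unit = {
    implicit val system: ActorSystem = ActorSystem("graph-knowledge-browser")
    implicit val materializer: ActorMaterializer = ActorMaterializer()

    // The same index.html route registered by Router.scala (sketched above).
    val route = path("index.html") {
      getFromResource("index.html")
    }

    // Bind akka-http; localhost:8002 mirrors the README's default URL.
    Http().bindAndHandle(route, "localhost", 8002)
    println("Knowledge browser available at http://localhost:8002/index.html")
  }
}
```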