Data analysis and comparison of the execution times taken to compute word counts for different input text file sizes, run on the Spark shell in Scala in local cluster mode.

Input text files:
- apache-hadoop-wiki.txt: 46.5 kB
- big.txt: 6.5 MB
Requirements:
- Linux system
- Hadoop
- Spark 2.0 set up in local cluster mode
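Local cluster mode can also be selected explicitly when launching the Spark shell via the standard --master option (shown here only as a reference; the command further below assumes the installation already defaults to local mode):

    spark-shell --master "local[*]"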
In a terminal, execute the following command:
spark-shell -i "SparkWordCount.scala"
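For reference, a minimal sketch of what a SparkWordCount.scala script could look like when run this way (the actual contents of the script are not shown here; the word-splitting rule and the timing code are assumptions, and sc is the SparkContext provided by the Spark shell):

    // Word count sketch for the Spark shell (Spark 2.0, local mode).
    // The input file names are the ones listed above.
    val inputs = Seq("apache-hadoop-wiki.txt", "big.txt")

    for (path <- inputs) {
      val start = System.nanoTime()

      val counts = sc.textFile(path)              // read the input text file as lines
        .flatMap(line => line.split("\\s+"))      // split each line into words
        .filter(_.nonEmpty)                       // drop empty tokens
        .map(word => (word, 1))                   // pair each word with a count of 1
        .reduceByKey(_ + _)                       // sum the counts per word

      println(s"$path: ${counts.count()} distinct words")  // action forces the job to run

      val elapsedSec = (System.nanoTime() - start) / 1e9
      println(f"$path: job completed in $elapsedSec%.2f s")
    }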
The average execution times for the Spark jobs in local mode are:
- apache-hadoop-wiki.txt: 1 second
- big.txt: 3 seconds
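As an illustration of how such averages could be measured (the original measurement procedure is not described here; averageTimeSec is a hypothetical helper, and the run count of 5 is arbitrary), the job can be repeated inside the Spark shell and its wall-clock times averaged:

    // Hypothetical helper: run a job several times and return the mean wall-clock time in seconds.
    def averageTimeSec(runs: Int)(job: => Unit): Double = {
      val times = (1 to runs).map { _ =>
        val start = System.nanoTime()
        job
        (System.nanoTime() - start) / 1e9
      }
      times.sum / runs
    }

    // Example: average the big.txt word count over 5 runs.
    val avgSec = averageTimeSec(5) {
      sc.textFile("big.txt")
        .flatMap(_.split("\\s+"))
        .map((_, 1))
        .reduceByKey(_ + _)
        .count()                                  // action that forces the job to execute
    }
    println(f"big.txt: average of $avgSec%.2f s over 5 runs")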
To view the source code of the word counts, see SparkWordCount.scala in this repository.
To view the results of the word counts, see the output produced when the script is run with the command above.