Hadoop MapReduce Word Count Data Analysis

Description

Analysis and comparison of the execution times of a MapReduce word count program run on Hadoop against sample input text files of different sizes. The Mapper and Reducer classes are written in Java, and the job runs on Hadoop in local (single-node) cluster mode.

Environment

  • Hadoop local mode
  • JDK 8

File Sizes

  • apache-hadoop-wiki.txt: 46.5 kB
  • big.txt: 6.5 MB

Execution

  1. Start the HDFS and YARN daemons

    start-dfs.sh
    start-yarn.sh
    
  2. Create a Java project in Eclipse

  3. In the Java project's build path, use Add External Archives to add the following JARs from the Hadoop installation folder:

    /common/*.jar
    /common/lib/*.jar
    /hdfs/*.jar
    /mapreduce/*.jar
    /yarn/*.jar
    
  4. Write the Mapper, Reducer and Driver classes (a sketch is given after this list)

  5. Export the project as a .jar file

  6. Make a sample input file (Source: text, data set)

  7. Copy the input file from the local file system to HDFS

    hdfs dfs -copyFromLocal <sourcefile> <destinationPath>
    
  8. Submit the job to the Hadoop cluster

    hadoop jar <source.jar Path> MainClassDriver <sourceFile Path in hdfs> <Destination Folder path in hdfs>
    
  9. Read the output file to view the results

    hdfs dfs -cat /DestinationFolder/*
    
  10. Stop the HDFS and YARN daemons

    stop-dfs.sh
    stop-yarn.sh
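
The project's actual classes are linked under Source Code below. As a reference for step 4, here is a minimal sketch based on the standard Hadoop word count example; the class names WordCountDriver, TokenizerMapper and IntSumReducer are illustrative, not necessarily those used in this project:

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCountDriver {

      // Mapper: emits (word, 1) for every token in each input line
      public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
          }
        }
      }

      // Reducer: sums the counts emitted for each word
      public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable val : values) {
            sum += val.get();
          }
          result.set(sum);
          context.write(key, result);
        }
      }

      // Driver: configures and submits the job
      // args[0] = input path in HDFS, args[1] = output folder in HDFS
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCountDriver.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }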

Observations


[Plot: Hadoop - Size vs Time]

The average execution times for the word count program on Hadoop are:

  • apache-hadoop-wiki.txt: 3 seconds
  • big.txt: 12 seconds
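
How these timings were collected is not shown in this README; one simple way (an assumption, not necessarily the method used here) is to time the waitForCompletion call in the driver, e.g. by replacing the last line of the sketch above:

    // Illustrative timing wrapper; not necessarily how this project measured times
    long start = System.currentTimeMillis();
    boolean ok = job.waitForCompletion(true);
    System.out.println("Elapsed: " + (System.currentTimeMillis() - start) / 1000.0 + " s");
    System.exit(ok ? 0 : 1);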

Source Code

File Sources