
Showing posts from July, 2018

STEPS TO RUN WORD COUNT MAP-REDUCE

If you already have Hadoop installed (that is, you have extracted the Hadoop tar file), skip the installation step below.

Installation:

==> Replace "user" with the appropriate user name from your system.
terminal> cd /home/user

==> Download the tarball. Either paste the URL below into a web browser, which starts the download, or fetch it with the wget command.
terminal> wget http://redrockdigimark.com/apachemirror/hadoop/common/stable/hadoop-2.9.1.tar.gz

==> Untar the file using the command below.
terminal> tar -xzf hadoop-2.9.1.tar.gz

==> Use the command below to check whether the files have been successfully extracted.
terminal> cd hadoop-2.9.1/bin

==> The ls command below lists all the commands related to Hadoop.
terminal> ls

==> The command below would show you the current files from the cu

HADOOP MAP-REDUCE WORD COUNT JAVA CODE

WordCount.java

    package org.myorg.hadoop;

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

        // Mapper: receives one line of input text per call.
        public static class TokenizerMapper
                extends Mapper<Object, Text, Text, IntWritable> {

            // Reusable Writable objects for the emitted (word, 1) pairs.
            private final static IntWritable one = new IntWritable(1);
            private Text word = new Text();

            public void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                context.getCurrentKey(); // return value is unused here
                // Split the input line into whitespace-separated tokens.
                StringTokenizer itr = new StringTokenizer(value.toString());
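
The excerpt above is cut off before the tokenizer loop and the reducer. To show roughly what the two phases of a word count do, here is a minimal sketch in plain Java that runs without Hadoop. It is not the post's code: the class name WordCountSketch and the methods map, reduce, and countWords are hypothetical names chosen for illustration, and the in-memory list stands in for the shuffle that Hadoop would perform between the phases.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical, Hadoop-free illustration of the word count idea.
public class WordCountSketch {

    // "Map" phase: emit a (word, 1) pair for every whitespace-separated token.
    public static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            pairs.add(new SimpleEntry<>(itr.nextToken(), 1));
        }
        return pairs;
    }

    // "Reduce" phase: sum the values emitted for each distinct word.
    public static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted by word
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    // Runs the map step on every line, then reduces all emitted pairs.
    public static Map<String, Integer> countWords(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        return reduce(emitted);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("hello hadoop", "hello world");
        System.out.println(countWords(lines)); // prints {hadoop=1, hello=2, world=1}
    }
}
```

In a real Hadoop job the map output is written with context.write and grouped by key by the framework before the reducer sees it; the sketch collapses that into a single in-memory list to keep the example short.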