MapReduce with Two Input Files
[Study Notes]
1. For a MapReduce program, how do we use two files as input?
In this section we continue with the HelloWorld example from Chapter 1 of the big-data introduction and take it one step further: here we look at how to feed the job two input files instead of one. The listing below shows the mapper; the driver, where the two input paths are actually registered, is sketched after it.
package com;

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCountMark_to_win {

    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            // Log each record so we can see which key/value pairs the mapper receives
            System.out.println("key is " + key.toString() + " value is " + value.toString());
            // Split the line into words and emit (word, 1) for each token
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }
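
    // Note: the original excerpt stops after the mapper. What follows is a minimal
    // sketch of the reducer and driver, assuming the standard Hadoop WordCount
    // structure; the argument layout (input1 input2 output) is illustrative, not
    // taken from the source. The point of this section is that
    // FileInputFormat.addInputPath can be called once per input file, so a single
    // job reads both files.

    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // Sum the (word, 1) pairs emitted by the mappers for this word
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // GenericOptionsParser separates Hadoop options from the application arguments
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCountMark_to_win.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // Two input files: each call to addInputPath registers one more input,
        // so the same job processes both files.
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileInputFormat.addInputPath(job, new Path(otherArgs[1]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[2]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

With a driver like this, the job is launched with both input paths followed by the output directory, for example: hadoop jar wordcount.jar com.WordCountMark_to_win input1 input2 output (the file names here are only placeholders).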