Hands-on with Hadoop Serialization and Deserialization
程序员文章站
2022-04-28 18:11:17
0x00 Contents
- Writing the code
- Test results
0x01 Writing the Code
Prerequisite: the code below uses Hadoop, so first pull in the Hadoop client jar, e.g. via Maven:
```xml
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.7.5</version>
</dependency>
```
1. Write the object class
a. Write the BlockWritable class
```java
package com.shaonaiyi.hadoop.serialize;

import org.apache.hadoop.io.Writable;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

/**
 * @Author aaa@qq.com
 * @Date 2019/12/13 16:51
 * @Description A BlockWritable class that plugs into Hadoop's serialization mechanism
 */
public class BlockWritable implements Writable {

    private long blockId;
    private long numBytes;
    private long generationStamp;

    public BlockWritable(long blockId) {
        this.blockId = blockId;
    }

    public BlockWritable(long blockId, long numBytes, long generationStamp) {
        this.blockId = blockId;
        this.numBytes = numBytes;
        this.generationStamp = generationStamp;
    }

    public long getBlockId() {
        return blockId;
    }

    public long getNumBytes() {
        return numBytes;
    }

    public long getGenerationStamp() {
        return generationStamp;
    }

    @Override
    public String toString() {
        return "Block{" +
                "blockId=" + blockId +
                ", numBytes=" + numBytes +
                ", generationStamp=" + generationStamp +
                '}';
    }

    // Serialize: write the fields in a fixed order.
    @Override
    public void write(DataOutput dataOutput) throws IOException {
        dataOutput.writeLong(blockId);
        dataOutput.writeLong(numBytes);
        dataOutput.writeLong(generationStamp);
    }

    // Deserialize: read the fields back in exactly the same order.
    @Override
    public void readFields(DataInput dataInput) throws IOException {
        this.blockId = dataInput.readLong();
        this.numBytes = dataInput.readLong();
        this.generationStamp = dataInput.readLong();
    }
}
```
2. Write the test code
a. Write the serialization and deserialization code
```java
package com.shaonaiyi.hadoop.serialize;

import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableFactories;

import java.io.*;

/**
 * @Author aaa@qq.com
 * @Date 2019/12/13 16:53
 * @Description Test code for Hadoop's serialization mechanism
 */
public class HadoopSerializableTest {

    public static void main(String[] args) throws IOException {
        String fileName = "blockWritable.txt";
        serialize(fileName);
        // deSerialize(fileName);
    }

    private static void serialize(String fileName) throws IOException {
        BlockWritable block = new BlockWritable(78062621L, 39447651L, 56737546L);
        File file = new File(fileName);
        if (file.exists()) {
            file.delete();
        }
        file.createNewFile();
        FileOutputStream fileOutputStream = new FileOutputStream(file);
        DataOutputStream dataOutputStream = new DataOutputStream(fileOutputStream);
        block.write(dataOutputStream);
        dataOutputStream.close();
    }

    private static void deSerialize(String fileName) throws IOException {
        FileInputStream fileInputStream = new FileInputStream(fileName);
        DataInputStream dataInputStream = new DataInputStream(fileInputStream);
        Writable writable = WritableFactories.newInstance(BlockWritable.class);
        writable.readFields(dataInputStream);
        dataInputStream.close();
        System.out.println((BlockWritable) writable);
    }
}
```
0x02 Test Results
1. Testing serialization
a. Run the serialize method. A blockWritable.txt file appears in the directory at the same level as the main folder; open it, then check its size: 24 bytes.
In our BlockWritable class, a long is 8 bytes, so the three long fields take exactly 24 bytes, with no extra information added. That is exactly the result we want; compare it with the previous tutorial: Hands-on with Java Serialization and Deserialization.
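The 24-byte figure is easy to verify with plain JDK streams and no Hadoop dependency. The sketch below (class and method names are my own, for illustration) writes the same three-long layout that BlockWritable.write produces:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class RawLayoutCheck {

    // Write three longs in a fixed order, exactly like BlockWritable.write.
    static byte[] writeBlock(long blockId, long numBytes, long generationStamp) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.writeLong(blockId);
        out.writeLong(numBytes);
        out.writeLong(generationStamp);
        out.close();
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = writeBlock(78062621L, 39447651L, 56737546L);
        // Three longs at 8 bytes each, and nothing else: 24 bytes total.
        System.out.println("size=" + bytes.length); // prints "size=24"
    }
}
```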
2. Testing deserialization
a. Swap the comments in main so the deserialization method runs:
```java
// serialize(fileName);
deSerialize(fileName);
```
The BlockWritable object is deserialized back successfully.
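For comparison, the whole round trip can also be done with nothing but JDK streams, skipping the WritableFactories step. This sketch (the file name is illustrative) writes and then reads back the same fixed-order layout:

```java
import java.io.*;

public class FileRoundTrip {

    // Write three longs to a file, then read them back in the same order.
    static long[] roundTrip(File file, long blockId, long numBytes, long generationStamp) throws IOException {
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream(file))) {
            out.writeLong(blockId);
            out.writeLong(numBytes);
            out.writeLong(generationStamp);
        }
        try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
            return new long[]{in.readLong(), in.readLong(), in.readLong()};
        }
    }

    public static void main(String[] args) throws IOException {
        File file = new File("blockWritable-demo.txt"); // illustrative file name
        long[] fields = roundTrip(file, 78062621L, 39447651L, 56737546L);
        System.out.println("Block{blockId=" + fields[0]
                + ", numBytes=" + fields[1]
                + ", generationStamp=" + fields[2] + '}');
        file.delete();
    }
}
```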
3. Testing a change to the object class
a. Modify the BlockWritable class, for example by adding a no-argument constructor:
```java
public BlockWritable() {
}
```
b. Run the deserialization again: it completes without errors, with the same result as before. (As an aside, reflection-based factories such as WritableFactories.newInstance typically need a no-argument constructor to instantiate the class, so keeping one in every Writable is good practice.)
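Why does changing the class not break anything? With Writable, the byte layout is defined entirely by write and readFields, so constructors and other members can change freely; what must stay in sync is the field order. Here is a plain-JDK sketch (names are illustrative) of what goes wrong when the read order diverges from the write order:

```java
import java.io.*;

public class FieldOrderDemo {

    // Write blockId then numBytes, but read them back in the swapped order.
    static long[] misread(long blockId, long numBytes) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeLong(blockId);
            out.writeLong(numBytes);
        }
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            long readNumBytes = in.readLong(); // actually consumes blockId!
            long readBlockId = in.readLong();  // actually consumes numBytes!
            return new long[]{readBlockId, readNumBytes};
        }
    }

    public static void main(String[] args) throws IOException {
        long[] fields = misread(78062621L, 39447651L);
        // The values come back silently swapped; no exception is thrown.
        System.out.println("blockId=" + fields[0] + ", numBytes=" + fields[1]);
    }
}
```

This is why the tutorial's write and readFields methods list the three fields in the same order.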
0xFF Summary
- Hadoop's serialization mechanism fixes the shortcomings of Java's built-in serialization interface.
- This tutorial has a prerequisite: Hands-on with Java Serialization and Deserialization!
- Hadoop's serialization wrappers are as follows:
| Java type | Writable | Size after serialization (bytes) |
|---|---|---|
| boolean | BooleanWritable | 1 |
| byte | ByteWritable | 1 |
| int | IntWritable | 4 |
| float | FloatWritable | 4 |
| long | LongWritable | 8 |
| double | DoubleWritable | 8 |
| String | Text | varies |
- In everyday code, the wrappers are used like this:
```java
Text text = new Text();
String word = "hello";
text.set(word);
IntWritable intWritable = new IntWritable(3);
```
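To make the summary's first point concrete, the sketch below (class names are my own) serializes the same three longs once with Java's built-in ObjectOutputStream and once with a raw DataOutputStream. The Writable-style layout is exactly 24 bytes, while Java serialization adds a stream header and class descriptor on top of the payload:

```java
import java.io.*;

public class OverheadComparison {

    // A java.io.Serializable twin of BlockWritable (illustrative).
    static class BlockSerializable implements Serializable {
        long blockId = 78062621L;
        long numBytes = 39447651L;
        long generationStamp = 56737546L;
    }

    // Size when using Java's built-in object serialization.
    static int javaSerializedSize() throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(new BlockSerializable());
        }
        return buffer.size();
    }

    // Size when writing only the field payload, Writable-style.
    static int rawSize() throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeLong(78062621L);
            out.writeLong(39447651L);
            out.writeLong(56737546L);
        }
        return buffer.size();
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Java serialization: " + javaSerializedSize() + " bytes");
        System.out.println("Writable-style layout: " + rawSize() + " bytes"); // 24
    }
}
```

The exact ObjectOutputStream size depends on the class name and JDK, but it is always well above the 24-byte payload.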
About the author: Shao Naiyi (邵奈一)
Full-stack engineer, market observer, column editor
Original content by 邵奈一; please credit the source when reposting.