Hash结构小谈

程序员文章站 2022-07-15 14:15:17

...

要理解Hash结构首先得明白Hash算法。Hash算法是将一组数据映射到一组紧凑的数据上，通常是不可逆的数学函数。但是不同的输入在这一映射上可能得到相同的输出,产生冲突,所以需要处理冲突。Hash结构将大量数据映射到一组比较紧凑的数据上，查找容易。下面举例谈谈自定义的Hash结构。这里我用的Hash算法是比较常见的取余法，处理冲突的方法是挂链法。既然采用挂链法,那么就需要一个链表存储相互冲突的数据。可以一个数组来存储比较紧凑的哈希表。存储的数据可以是任意类型的数据，而取余法适用的是int型数据，那么就需要给每一个数据设立一个一一对应的键值，通过键值来寻找对应的数据。这便是一个Map结构。链表是由一个一个的结点连接而成的，每个结点存储着我们需要存储的数据。下面我们定义结点类如下代码所示：


public class MyNode {
	private MyNode next;
	private MyNode parent;
	private int key;
	private Object value;

	public MyNode getNext() {
		return next;
	}

	public void setNext(MyNode next) {
		this.next = next;
	}

	public MyNode getParent() {
		return parent;
	}

	public void setParent(MyNode parent) {
		this.parent = parent;
	}

	public MyNode(int key, Object value) {
		this.key = key;
		this.value = value;
	}

	public String toString(){
		return value.toString();
	}

	public int getKey(){
		return key;
	}

	public Object getValue(){
		return value;
	}

}

定义了结点类，下面需要定义的是链表类。代码如下所示：


public class MyLinkedList {
	private MyNode head=null;
	public MyNode getHead(){
		return head;
	}
	public void setHead(MyNode head){
		this.head=head;
	}
	/**
	 * 在链表前面添加元素的方法
	 * @param mn
	 */
	public void add(MyNode nmn){		
		if(head==null){
			head=nmn;
		}else{
			nmn.setNext(head);
			head.setParent(nmn);
			head=nmn;
		}
	}
	/**
	 * 遍历过程中不能改变链表头
	 */
	public void travelList(){
		MyNode temp=head;
		while(temp!=null){
			System.out.print(temp.toString()+"-->");
			temp=temp.getNext();
		}		
	}
}

下面定义我们的Hash结构。在定义Hash结构之前我们得明白Hash结构中的初始容量与加载因子两个概念。初始容量是在创建Hash表时的容量，也就是存放Hash表的数组的长度。加载因子是对Hash结构中存储数据数量的一种限制。当存储的数据量超过了初使容量*加载因子的积时调用重构Hash的方法，使Hash结构中能够存储的数据更多一点。可以通过增加初始容量或加载因子来rehash()。代码如下所示：


public class MyHashMap {
	private int length = 20;// 初始容量
	private float stowage_factor = 3;// 加载因子
	private int num = 0;
	private MyLinkedList[] listArray = new MyLinkedList[length];

	/**
	 * 添加数据的方法
	 * 
	 * @param key
	 * @param value
	 */
	public void put(int key, Object value) {
		int index = hash(key);
		if (listArray[index] == null) {
			listArray[index] = new MyLinkedList();
		}
		listArray[index].add(new MyNode(key, value));
		num++;
	}

	/**
	 *取余法hash函数
	 * 
	 * @param key
	 * @return
	 */
	private int hash(int key) {
		return key % length;
	}

	/**
	 * 遍历HashMap的方法
	 * 
	 * @param listArray
	 * @return
	 */
	public void travel() {
		for (int i = 0; i < listArray.length; i++) {
			if (listArray[i] != null) {
				listArray[i].travelList();
				System.out.println("数组第" + i + "个挂链");
			}
		}
	}

	/**
	 *查找的数据的方法,输出查找次数
	 * 
	 * @param key
	 * @返回查找的数据
	 */
	public Object getValue(int key) {
		int count = 0;
		MyNode temp = listArray[hash(key)].getHead();
		while (temp != null) {
			count++;
			if (temp.getKey() == key) {
				System.out.println("查找次数:" + count);
				return temp.getValue();
			}
			temp = temp.getNext();
		}
		System.out.println("所对应的键值不存在");
		return null;
	}

	/**
	 * 重构哈希的方法
	 */
	public boolean rehash() {
		System.out.println("Hash重构了!");
		length += 2;
		listArray = new MyLinkedList[length];
		return isFull();
	}

	public boolean isFull() {
		return num > length * stowage_factor;
	}

	/**
	 * 删除数据的方法
	 * 
	 * @param key
	 * @return
	 */
	public boolean delete(int key) {
		boolean flag = false;
		MyNode temp = listArray[hash(key)].getHead();
		while (temp != null) {
			if (key == temp.getKey()) {
				if (temp.getNext() == null) {
					temp.getParent().setNext(null);
				} else if (temp.getParent() == null) {
					listArray[hash(key)].setHead(temp.getNext());
				} else {
					temp.getParent().setNext(temp.getNext());
				}
				flag = true;
			}
			temp = temp.getNext();
		}
		return flag;
	}
}

下面我们来测试一下刚才自定义好的Hash结构。代码如下所示：


public class HashTest {
	public static void main(String args[]){
		MyHashMap mhm=new MyHashMap();
		for(int i=0;i<60;i++){
			if(mhm.isFull()){
				mhm.rehash();
			}else{
				mhm.put(i, "hello"+i);
			}
		}
		mhm.travel();
		//链表的第一个
		if(mhm.delete(40)){
			System.out.println("键值50对应的数据被删除了!");
		}
		//链表中间的结点
		if(mhm.delete(22)){
			System.out.println("键值22对应的数据被删除了!");
		}
		//链表的最后一个结点
		if(mhm.delete(4)){
			System.out.println("键值4对应的数据被删除了!");
		}
		mhm.travel();
	}
}

代码运行结果如下图所示：
[color=red]hello40-->hello20-->hello0-->数组第0个挂链
hello41-->hello21-->hello1-->数组第1个挂链
hello42-->hello22-->hello2-->数组第2个挂链
hello43-->hello23-->hello3-->数组第3个挂链
hello44-->hello24-->hello4-->数组第4个挂链
hello45-->hello25-->hello5-->数组第5个挂链
hello46-->hello26-->hello6-->数组第6个挂链
hello47-->hello27-->hello7-->数组第7个挂链
hello48-->hello28-->hello8-->数组第8个挂链
hello49-->hello29-->hello9-->数组第9个挂链
hello50-->hello30-->hello10-->数组第10个挂链
hello51-->hello31-->hello11-->数组第11个挂链
hello52-->hello32-->hello12-->数组第12个挂链
hello53-->hello33-->hello13-->数组第13个挂链
hello54-->hello34-->hello14-->数组第14个挂链
hello55-->hello35-->hello15-->数组第15个挂链
hello56-->hello36-->hello16-->数组第16个挂链
hello57-->hello37-->hello17-->数组第17个挂链
hello58-->hello38-->hello18-->数组第18个挂链
hello59-->hello39-->hello19-->数组第19个挂链
键值50对应的数据被删除了!
键值22对应的数据被删除了!
键值4对应的数据被删除了!
hello20-->hello0-->数组第0个挂链
hello41-->hello21-->hello1-->数组第1个挂链
hello42-->hello2-->数组第2个挂链
hello43-->hello23-->hello3-->数组第3个挂链
hello44-->hello24-->数组第4个挂链
hello45-->hello25-->hello5-->数组第5个挂链
hello46-->hello26-->hello6-->数组第6个挂链
hello47-->hello27-->hello7-->数组第7个挂链
hello48-->hello28-->hello8-->数组第8个挂链
hello49-->hello29-->hello9-->数组第9个挂链
hello50-->hello30-->hello10-->数组第10个挂链
hello51-->hello31-->hello11-->数组第11个挂链
hello52-->hello32-->hello12-->数组第12个挂链
hello53-->hello33-->hello13-->数组第13个挂链
hello54-->hello34-->hello14-->数组第14个挂链
hello55-->hello35-->hello15-->数组第15个挂链
hello56-->hello36-->hello16-->数组第16个挂链
hello57-->hello37-->hello17-->数组第17个挂链
hello58-->hello38-->hello18-->数组第18个挂链
hello59-->hello39-->hello19-->数组第19个挂链[/color]
需要注意的是这里存储的数据并没有超过Hash结构中初始容量与加载因子的积，因而没有触发rehash()方法。如果触发了rehash()方法，虽然哈希重构要、实现比较简单，但是重构了的哈希存储数据就面临两个选择：
1.把重构前的数据再存储一遍；
2.把源数据再存储一遍；
这其实就取决于取得数据源的难易程度。如果源数据不易取得那么就需要在重构前保存原Hash结构中的数据，重构后再存储一遍。
下面再来重新谈谈加载因子。我这里定久的加载因子为3,而我们知道JDK中实现的HashMap中的加载因子默认为0.75，这就需要我们反思下为什么加载因子要设置成小于1的数呢？
加载因子表示Hash填满的一种程度,取0.75是为了在空间和时间利用率上达到一种折中。当加载因子越大时空间利用率是高了但是增加了冲突的机会，会增加查找时间开销；同理当加载因子越小时虽然冲突少了从而减少查找时间开销，但是空间利用率不高。

相关标签： java 算法 Hash 数据结构

上一篇：【算法笔记】散列表（中）

下一篇：散列表中的平方探测法

Hash结构小谈

小谈php正则提取图片地址

从VantComponent谈小程序维护-随笔-SegmentFault思否

微信小程序学习笔记之目录结构、基本配置图文详解

golang中json小谈之字符串转浮点数的操作

小谈MP3使用中三个误区

默默小谈PHP&MYSQL分页原理及实现

性能与续航哪个更好小谈笔记本显卡

何苦宗谈营销，视频小技巧解决企业冷门关键词推荐问题

odoo权限小谈

谈耳塞结构对音质及耳朵的影响

Hash结构小谈

小谈php正则提取图片地址

从VantComponent谈小程序维护-随笔-SegmentFault思否

微信小程序学习笔记之目录结构、基本配置图文详解

golang中json小谈之字符串转浮点数的操作

小谈MP3使用中三个误区

默默小谈PHP&MYSQL分页原理及实现

性能与续航哪个更好 小谈笔记本显卡

何苦宗谈营销，视频小技巧解决企业冷门关键词推荐问题

odoo权限小谈

谈耳塞结构对音质及耳朵的影响

性能与续航哪个更好小谈笔记本显卡