Load Factor and Rehashing

Last modified: 2021-10-28 01:45:40 | Author: Mango

Prerequisites: Introduction to Hashing, and Collision Handling by Separate Chaining

How hashing works:

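Inserting a key-value (K, V) pair into a hash map takes three steps:

  1. K is converted into a small integer (called its hash code) using a hash function.
  2. The hash code is used to find an index (hashCode % arrSize), and the entire linked list at that index (separate chaining) is searched first to check whether K is already present.
  3. If K is found, its value is updated; if not, the K-V pair is stored in the list as a new node.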
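The index computation in step 2 can be sketched as follows. This is a minimal illustration, not part of the original program: the class name BucketIndexDemo and the method bucketIndex are hypothetical, and Math.floorMod is used in place of the plain % operator on the assumption that keys with negative hash codes may occur.

```java
class BucketIndexDemo {
    // Maps a key to a bucket index in the range [0, numBuckets).
    // Math.floorMod is used instead of % so that a negative hashCode
    // still yields a non-negative index.
    static int bucketIndex(Object key, int numBuckets) {
        return Math.floorMod(key.hashCode(), numBuckets);
    }

    public static void main(String[] args) {
        int numBuckets = 5;
        // Integer.hashCode() is the int value itself.
        System.out.println(bucketIndex(7, numBuckets));      // prints 2
        System.out.println(bucketIndex(-3, numBuckets));     // prints 2
        // For a String, the hash code is computed from all of its characters.
        System.out.println(bucketIndex("abcd", numBuckets));
    }
}
```

With the plain % operator, the key -3 would give the index -3, which is not a valid array position.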

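Complexity and load factor

  • For the first step, the time taken depends on K and on the hash function.
    For example, if the key is the string "abcd", the cost of its hash function may depend on the length of the string. That cost does not grow with n, the number of entries in the map, and for very large n the key length is almost negligible by comparison, so hash computation can be treated as constant time, i.e. O(1).
  • For the second step, the list of K-V pairs stored at that index has to be traversed. In the worst case, all n entries sit at the same index, so the time complexity is O(n). However, hash functions are designed to spread keys close to uniformly across the array, so with a good hash function this case almost never occurs.
  • On average, therefore, if there are n entries and b is the size of the array, there are n/b entries at each index. This value n/b is called the load factor, and it represents the load on our map.
  • The load factor needs to be kept low so that the number of entries at any one index stays small and the complexity stays almost constant, i.e. O(1).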
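As a small worked example of the definition above, the load factor is simply n divided by b. The class and method names below are hypothetical and chosen only for illustration.

```java
class LoadFactorDemo {
    // Load factor = number of entries (n) / number of buckets (b).
    // The cast to double avoids integer division.
    static double loadFactor(int entries, int numBuckets) {
        return (double) entries / numBuckets;
    }

    public static void main(String[] args) {
        System.out.println(loadFactor(3, 5));  // prints 0.6
        System.out.println(loadFactor(4, 5));  // prints 0.8
        System.out.println(loadFactor(4, 10)); // prints 0.4
    }
}
```

Doubling the number of buckets while keeping the same entries halves the load factor, which is the effect rehashing relies on.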

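Rehashing:

As the name suggests, rehashing means hashing again. When the load factor rises above its predefined threshold (0.75 by default here), the complexity of operations increases. To counter this, the size of the array is increased (doubled), and all entries are hashed again and stored in the new double-sized array, which keeps the load factor and the complexity low.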

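Why rehash?

Rehashing is needed because every time a key-value pair is inserted into the map, the load factor increases, which means the time complexity also increases, as explained above. The map may then no longer deliver the desired O(1) time complexity.

Rehashing must therefore be performed, increasing the size of the bucketArray, to bring the load factor and the time complexity back down.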
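To make the growth pattern concrete, the sketch below counts only sizes, without storing any keys. It assumes the policy described in this article: the load factor is checked after every insertion, and the bucket count doubles whenever it exceeds the threshold. The names ResizeCountDemo and bucketsAfter are hypothetical.

```java
class ResizeCountDemo {
    // Returns the bucket count after inserting `insertions` distinct keys,
    // starting from `initialBuckets` and doubling whenever the load factor
    // exceeds `maxLoadFactor`.
    static int bucketsAfter(int insertions, int initialBuckets, double maxLoadFactor) {
        int numBuckets = initialBuckets;
        for (int size = 1; size <= insertions; size++) {
            double loadFactor = (double) size / numBuckets;
            if (loadFactor > maxLoadFactor) {
                numBuckets *= 2; // this is the point where a rehash would run
            }
        }
        return numBuckets;
    }

    public static void main(String[] args) {
        System.out.println(bucketsAfter(3, 5, 0.75));   // prints 5
        System.out.println(bucketsAfter(4, 5, 0.75));   // prints 10
        System.out.println(bucketsAfter(100, 5, 0.75)); // prints 160
    }
}
```

With 5 initial buckets and a threshold of 0.75, the fourth insertion pushes the load factor to 0.8 and triggers the first doubling. After 100 insertions the array has doubled five times, to 160 buckets.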

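How is rehashing done?

Rehashing can be done as follows:

  • Each time a new entry is added to the map, check the load factor.
  • If it is greater than its predefined value (or the default value of 0.75 if none is given), rehash.
  • To rehash, create a new array of double the previous size and make it the new bucketArray.
  • Then traverse every element in the old bucketArray and call insert() for each one, so that it is inserted into the new, larger bucket array.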
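Every element has to be re-inserted, rather than copied to the same position, because the index depends on the array size. The following sketch shows this for plain Integer keys, using lists of lists instead of linked nodes to stay short. The names RehashSketch, place, and rehash are hypothetical and not part of the program in this article.

```java
import java.util.ArrayList;
import java.util.List;

class RehashSketch {
    // Builds a bucket array of the given size and places each key
    // at index floorMod(hashCode, numBuckets).
    static List<List<Integer>> place(int numBuckets, int... keys) {
        List<List<Integer>> buckets = new ArrayList<>(numBuckets);
        for (int i = 0; i < numBuckets; i++) {
            buckets.add(new ArrayList<>());
        }
        for (int key : keys) {
            buckets.get(Math.floorMod(Integer.hashCode(key), numBuckets)).add(key);
        }
        return buckets;
    }

    // Creates an array of double the size and re-inserts every key,
    // recomputing its index against the new size.
    static List<List<Integer>> rehash(List<List<Integer>> oldBuckets) {
        int newSize = 2 * oldBuckets.size();
        List<List<Integer>> newBuckets = new ArrayList<>(newSize);
        for (int i = 0; i < newSize; i++) {
            newBuckets.add(new ArrayList<>());
        }
        for (List<Integer> chain : oldBuckets) {
            for (Integer key : chain) {
                newBuckets.get(Math.floorMod(key.hashCode(), newSize)).add(key);
            }
        }
        return newBuckets;
    }

    public static void main(String[] args) {
        List<List<Integer>> small = place(5, 2, 7, 12);
        System.out.println(small.get(2)); // prints [2, 7, 12]

        List<List<Integer>> large = rehash(small);
        System.out.println(large.get(2)); // prints [2, 12]
        System.out.println(large.get(7)); // prints [7]
    }
}
```

The keys 2, 7, and 12 all collide at index 2 when there are 5 buckets. After doubling to 10 buckets, key 7 moves to index 7, so the chain at index 2 gets shorter.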

Program to implement rehashing:

// Java program to implement Rehashing

import java.util.ArrayList;

class Map<K, V> {

    // A single node in the chain stored at one bucket index
    static class MapNode<K, V> {

        K key;
        V value;
        MapNode<K, V> next;

        public MapNode(K key, V value)
        {
            this.key = key;
            this.value = value;
            next = null;
        }
    }

    // The bucket array where
    // the nodes containing K-V pairs are stored
    ArrayList<MapNode<K, V>> buckets;

    // No. of pairs stored - n
    int size;

    // Size of the bucketArray - b
    int numBuckets;

    // Default loadFactor
    final double DEFAULT_LOAD_FACTOR = 0.75;

    public Map()
    {
        numBuckets = 5;

        buckets = new ArrayList<>(numBuckets);

        for (int i = 0; i < numBuckets; i++) {
            // Initialising to null
            buckets.add(null);
        }
        System.out.println("HashMap created");
        System.out.println("Number of pairs in the Map: " + size);
        System.out.println("Size of Map: " + numBuckets);
        System.out.println("Default Load Factor : " + DEFAULT_LOAD_FACTOR + "\n");
    }

    private int getBucketInd(K key)
    {

        // Using the inbuilt function from the Object class
        int hashCode = key.hashCode();

        // array index = hashCode mod numBuckets
        // Math.floorMod keeps the index non-negative
        // even when hashCode is negative
        return Math.floorMod(hashCode, numBuckets);
    }

    public void insert(K key, V value)
    {
        // Getting the index at which it needs to be inserted
        int bucketInd = getBucketInd(key);

        // The first node at that index
        MapNode<K, V> head = buckets.get(bucketInd);

        // First, loop through all the nodes present at that index
        // to check if the key already exists
        while (head != null) {

            // If already present the value is updated
            if (head.key.equals(key)) {
                head.value = value;
                return;
            }
            head = head.next;
        }

        // new node with the K and V
        MapNode<K, V> newElementNode = new MapNode<>(key, value);

        // The head node at the index
        head = buckets.get(bucketInd);

        // the new node is inserted
        // by making it the head
        // and its next is the previous head
        newElementNode.next = head;

        buckets.set(bucketInd, newElementNode);

        System.out.println("Pair(" + key + ", " + value + ") inserted successfully.\n");

        // Incrementing size
        // as new K-V pair is added to the map
        size++;

        // Load factor calculated
        double loadFactor = (1.0 * size) / numBuckets;

        System.out.println("Current Load factor = " + loadFactor);

        // If the load factor is > 0.75, rehashing is done
        if (loadFactor > DEFAULT_LOAD_FACTOR) {
            System.out.println(loadFactor + " is greater than " + DEFAULT_LOAD_FACTOR);
            System.out.println("Therefore Rehashing will be done.\n");

            // Rehash
            rehash();

            System.out.println("New Size of Map: " + numBuckets + "\n");
        }

        System.out.println("Number of pairs in the Map: " + size);
        System.out.println("Size of Map: " + numBuckets + "\n");
    }

    private void rehash()
    {

        System.out.println("\n***Rehashing Started***\n");

        // The present bucket list is made temp
        ArrayList<MapNode<K, V>> temp = buckets;

        // New bucketList of double the old size is created
        buckets = new ArrayList<>(2 * numBuckets);

        for (int i = 0; i < 2 * numBuckets; i++) {
            // Initialised to null
            buckets.add(null);
        }
        // Now size is made zero
        // and we loop through all the nodes in the original bucket list (temp)
        // and insert them into the new list
        size = 0;
        numBuckets *= 2;

        for (int i = 0; i < temp.size(); i++) {

            // head of the chain at that index
            MapNode<K, V> head = temp.get(i);

            while (head != null) {
                K key = head.key;
                V val = head.value;

                // calling the insert function for each node in temp
                // as the new list is now the bucketArray
                insert(key, val);
                head = head.next;
            }
        }

        System.out.println("\n***Rehashing Ended***\n");
    }

    public void printMap()
    {

        // The present bucket list is made temp
        ArrayList<MapNode<K, V>> temp = buckets;

        System.out.println("Current HashMap:");
        // loop through all the nodes and print them
        for (int i = 0; i < temp.size(); i++) {

            // head of the chain at that index
            MapNode<K, V> head = temp.get(i);

            while (head != null) {
                System.out.println("key = " + head.key + ", val = " + head.value);

                head = head.next;
            }
        }
        System.out.println();
    }
}

public class GFG {

    public static void main(String[] args)
    {

        // Creating the Map
        Map<Integer, String> map = new Map<>();

        // Inserting elements
        map.insert(1, "Geeks");
        map.printMap();

        map.insert(2, "forGeeks");
        map.printMap();

        map.insert(3, "A");
        map.printMap();

        map.insert(4, "Computer");
        map.printMap();

        map.insert(5, "Portal");
        map.printMap();
    }
}
Output:
HashMap created
Number of pairs in the Map: 0
Size of Map: 5
Default Load Factor : 0.75

Pair(1, Geeks) inserted successfully.

Current Load factor = 0.2
Number of pairs in the Map: 1
Size of Map: 5

Current HashMap:
key = 1, val = Geeks

Pair(2, forGeeks) inserted successfully.

Current Load factor = 0.4
Number of pairs in the Map: 2
Size of Map: 5

Current HashMap:
key = 1, val = Geeks
key = 2, val = forGeeks

Pair(3, A) inserted successfully.

Current Load factor = 0.6
Number of pairs in the Map: 3
Size of Map: 5

Current HashMap:
key = 1, val = Geeks
key = 2, val = forGeeks
key = 3, val = A

Pair(4, Computer) inserted successfully.

Current Load factor = 0.8
0.8 is greater than 0.75
Therefore Rehashing will be done.


***Rehashing Started***

Pair(1, Geeks) inserted successfully.

Current Load factor = 0.1
Number of pairs in the Map: 1
Size of Map: 10

Pair(2, forGeeks) inserted successfully.

Current Load factor = 0.2
Number of pairs in the Map: 2
Size of Map: 10

Pair(3, A) inserted successfully.

Current Load factor = 0.3
Number of pairs in the Map: 3
Size of Map: 10

Pair(4, Computer) inserted successfully.

Current Load factor = 0.4
Number of pairs in the Map: 4
Size of Map: 10


***Rehashing Ended***

New Size of Map: 10

Number of pairs in the Map: 4
Size of Map: 10

Current HashMap:
key = 1, val = Geeks
key = 2, val = forGeeks
key = 3, val = A
key = 4, val = Computer

Pair(5, Portal) inserted successfully.

Current Load factor = 0.5
Number of pairs in the Map: 5
Size of Map: 10

Current HashMap:
key = 1, val = Geeks
key = 2, val = forGeeks
key = 3, val = A
key = 4, val = Computer
key = 5, val = Portal
