Difference between Hash Join and Sort Merge Join

Last modified: 2021-08-24 05:08:00 | Author: Mango

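1. Hash Join:
Among join operators, the hash join is also known as the "go-to guy": it is the join used when no other join is preferred, for example because the inputs are neither sorted nor indexed. Hash join is generally the best choice when large, unsorted, non-indexed inputs (stored in tables) have to be joined. The algorithm has two phases: a build phase, which builds a hash table on the join attribute of one input (usually the smaller one), and a probe phase, which looks up each record of the other input in that hash table.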

Given two relations R and S, joined on R.A = S.B, the hash join algorithm is as follows:

Hash records of R, one by one, using A values
Hash records of S, one by one, using B values
(Use same M buckets and same hash function h)
Join matching pairs of records that fall into the same bucket
End
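
The build and probe phases can be sketched in Python as follows. This is a minimal in-memory illustration, not how any particular database engine implements it: the function name `hash_join`, the list-of-dicts row representation, and the sample tables are all assumptions made for the example, and a Python dictionary stands in for the M buckets and the hash function h.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Equi-join two lists of dict rows on build_key == probe_key.

    Hypothetical helper for illustration; rows are plain dicts.
    """
    # Build phase: hash every row of the (ideally smaller) build input
    # on its join attribute.
    buckets = defaultdict(list)
    for build_row in build_rows:
        buckets[build_row[build_key]].append(build_row)

    # Probe phase: look up each row of the other input in the hash table
    # and emit one merged row per match.
    result = []
    for probe_row in probe_rows:
        for build_row in buckets.get(probe_row[probe_key], []):
            result.append({**build_row, **probe_row})
    return result


# Made-up sample data: neither input is sorted or indexed.
departments = [
    {"id": 10, "dept": "Sales"},
    {"id": 30, "dept": "Legal"},
]
employees = [
    {"emp": "Ann", "dept_id": 10},
    {"emp": "Bob", "dept_id": 20},
    {"emp": "Cid", "dept_id": 10},
]

joined = hash_join(departments, employees, "id", "dept_id")
for row in joined:
    print(row["emp"], row["dept"])
```

The smaller table is passed as the build input so that the hash table stays small; each row of the larger table is then read only once during the probe.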

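2. Sort Merge Join:
As the name suggests, a sort merge join has two phases: a sort phase and a merge phase. The merge itself is a single linear pass over both inputs, so when the relations are already sorted on the join attributes the sort phase can be skipped, and this is why sort merge join is the fastest join for sorted relations. Suppose two sorted relations R and S need to be merged; the algorithm is as follows: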

If R is sorted on A, S is sorted on B do
Merge R and S to get join result
End
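
The sort and merge phases can be sketched in Python as follows. This is a minimal in-memory illustration under stated assumptions: the function name `sort_merge_join`, the list-of-dicts row representation, and the sample tables are hypothetical, and the sketch always sorts its inputs, whereas a real engine would skip the sort for inputs that are already ordered on the join attribute.

```python
def sort_merge_join(r_rows, s_rows, r_key, s_key):
    """Equi-join two lists of dict rows on r_key == s_key.

    Hypothetical helper for illustration; rows are plain dicts.
    """
    # Sort phase: order both inputs on their join attributes.
    r_sorted = sorted(r_rows, key=lambda row: row[r_key])
    s_sorted = sorted(s_rows, key=lambda row: row[s_key])

    # Merge phase: advance two cursors through the sorted inputs.
    result = []
    i = j = 0
    while i < len(r_sorted) and j < len(s_sorted):
        a = r_sorted[i][r_key]
        b = s_sorted[j][s_key]
        if a < b:
            i += 1  # R is behind: take the next row of R
        elif a > b:
            j += 1  # S is behind: take the next row of S
        else:
            # Find the run of S rows that share this key value.
            j_end = j
            while j_end < len(s_sorted) and s_sorted[j_end][s_key] == a:
                j_end += 1
            # Pair every R row with this key with every S row in the run.
            while i < len(r_sorted) and r_sorted[i][r_key] == a:
                for k in range(j, j_end):
                    result.append({**r_sorted[i], **s_sorted[k]})
                i += 1
            j = j_end
    return result


# Made-up sample data, deliberately unsorted.
customers = [
    {"cust_id": 2, "name": "Bo"},
    {"cust_id": 1, "name": "Al"},
    {"cust_id": 3, "name": "Cy"},
]
orders = [
    {"order": "o3", "cid": 2},
    {"order": "o1", "cid": 1},
    {"order": "o2", "cid": 2},
    {"order": "o4", "cid": 5},
]

pairs = sort_merge_join(customers, orders, "cust_id", "cid")
for row in pairs:
    print(row["name"], row["order"])
```

Handling runs of equal keys on both sides is what keeps the merge correct when the join attribute contains duplicates.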

Difference between Hash Join and Sort Merge Join:

| S.No. | Hash Join | Sort Merge Join |
|---|---|---|
| 1. | It is used mainly for joining large tables. | It is usually used to join two independent sources of data represented as tables. |
| 2. | It performs best on large, unsorted and non-indexed inputs. | It can outperform hash join on large tables when the inputs are already sorted on the join attributes. |
| 3. | Its two phases are build and probe. | Its two phases are sort and merge. |
| 4. | A hash table is built on the smaller table. Each row of the second table is then hashed and used to probe the hash table for matching rows. | The first row of each table is taken; while neither table is at its end, the selected rows are checked for a match. If they can be merged, the merged row is returned; otherwise the next row is taken from the table whose key is smaller. The steps repeat until the rows are exhausted. |
| 5. | It is not as fast as sort merge join when the tables are already sorted. | It is the fastest join operation when the tables are already sorted: with the sort phase already done, only the merge phase remains, and the merge is the fastest operation. |
| 6. | Its types are classic hash join, Grace hash join, hybrid hash join, hash anti join, hash semi join, recursive hash join and hash bailout. | It has no further classifications. |
| 7. | It is selected automatically when there is no specific reason to adopt another join algorithm, which is why it is known as the "go-to guy" of all join operators. | It is not selected automatically. |