Efficiently design Insert, Delete and Median queries on a set

Last modified: 2021-04-17 12:10:37 | Author: Mango

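We start with an empty set and must answer a large number of queries on it. Each query is one of the following types:

  1. Insert: insert a new element 'x'.
  2. Delete: delete an existing element 'x'.
  3. Median: print the median of the numbers currently in the set.

Example: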

Input :  Insert 1 
         Insert 4
         Insert 7
         Median 
Output : The first three queries
         should insert 1, 4 and 7
         into an empty set. The
         fourth query should return
         4 (median of 1, 4, 7).

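For illustration we assume the following; none of these assumptions is a limitation of the methods discussed here:
1. At any time, all elements are distinct, i.e. no element occurs more than once.
2. The Median query is made only when the set holds an odd number of elements. (For an even count, we would need two queries on the segment tree.)
3. The elements of the set lie in the range 1 to 10^6.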
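Assumptions 1 and 2 can be relaxed with little extra work. The following is a minimal sketch, not the article's implementation: the names `MAX_ELEM`, `tree`, `find_kth` and the convention that an even-sized set's median is the mean of its two middle elements are all assumptions made for illustration, and the value range is shrunk to keep the example light. It relies on the counting segment tree that Method 2 describes.

```python
MAX_ELEM = 1000                      # assumption: shrunk from 10^6 for the sketch
tree = [0] * (4 * (MAX_ELEM + 1))    # tree[node] = how many stored values fall in the node's range


def update(node, a, b, x, diff):
    """Add diff to the count of value x, descending only toward x."""
    if a == b:
        tree[node] += diff
        return
    mid = (a + b) // 2
    if x <= mid:
        update(2 * node, a, mid, x, diff)
    else:
        update(2 * node + 1, mid + 1, b, x, diff)
    tree[node] = tree[2 * node] + tree[2 * node + 1]


def find_kth(k):
    """Return the k-th smallest stored value (1-based); duplicates count separately."""
    node, a, b = 1, 0, MAX_ELEM
    while a != b:
        mid = (a + b) // 2
        if tree[2 * node] >= k:
            node, b = 2 * node, mid
        else:
            k -= tree[2 * node]
            node, a = 2 * node + 1, mid + 1
    return a


def median():
    """Median for any non-empty size; an even size costs two find_kth calls."""
    n = tree[1]
    if n == 0:
        raise ValueError("median of an empty set")
    lower = find_kth((n + 1) // 2)
    if n % 2 == 1:
        return lower
    upper = find_kth(n // 2 + 1)
    return (lower + upper) / 2       # assumed convention for even sizes


for v in (1, 4, 7, 8):
    update(1, 0, MAX_ELEM, v, 1)
print(median())                      # 5.5, the mean of 4 and 7

update(1, 0, MAX_ELEM, 4, 1)         # insert a duplicate 4
print(median())                      # 4, the middle of {1, 4, 4, 7, 8}
```

Because each leaf stores a count rather than a 0/1 flag, duplicates need no special handling.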

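Method 1 (Naive)
In the naive implementation, the first two queries run in O(1), but the last query takes O(max_elem), where max_elem is the largest element ever inserted (including deleted elements).

Let us use an array count[] (of size 10^6 + 1) that stores the count of each element in the set. Below are simple, self-explanatory algorithms for the 3 queries: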
Insert x query:

  count[x]++;
  if (x > max_elem)
     max_elem = x;
  n++;

Delete x query:

  if (count[x] > 0)
  {
     count[x]--;
     n--;
  }

Median query:

  sum = 0;
  i = 0;
  while (sum <= n / 2)
  {
     i++;
     sum += count[i];
  }
  median = i;
  return median;
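The three naive routines above can be put together as a small runnable sketch. The class name `NaiveMedianSet` and its `max_value` parameter are hypothetical, chosen only for this illustration; the logic follows the pseudocode, with the size counter decremented only when an element is actually removed.

```python
class NaiveMedianSet:
    """Counting-array set: O(1) insert/delete, O(max_value) median."""

    def __init__(self, max_value=10**6):
        self.count = [0] * (max_value + 1)   # count[x] = occurrences of x
        self.n = 0                           # number of stored elements

    def insert(self, x):
        self.count[x] += 1
        self.n += 1

    def delete(self, x):
        # Only shrink the set if x is really present.
        if self.count[x] > 0:
            self.count[x] -= 1
            self.n -= 1

    def median(self):
        """Scan the counts until (n + 1) // 2 elements have been seen (odd n assumed)."""
        if self.n == 0:
            raise ValueError("median of an empty set")
        target = (self.n + 1) // 2
        seen = 0
        for value, c in enumerate(self.count):
            seen += c
            if seen >= target:
                return value


s = NaiveMedianSet(max_value=100)    # small range, just for the demo
for v in (1, 4, 7):
    s.insert(v)
print(s.median())                    # 4
s.insert(8)
s.insert(9)
print(s.median())                    # 7
```

The linear scan in `median` is what the segment tree in Method 2 replaces with a logarithmic descent.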

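Illustration of the array count[] for the set {1, 4, 7, 8, 9}, whose median element is '7' (count[i] is 1 for i in {1, 4, 7, 8, 9} and 0 everywhere else):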

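The Median query aims to find the (n + 1)/2-th '1' in the array, in this case the 3rd '1'. We now do the same using a segment tree.

Method 2 (Using Segment Tree)
We build a segment tree that stores interval sums, where the interval [a, b] holds the number of elements of the set that currently lie in the range [a, b]. For the example above, query(3, 7) returns 2, query(4, 4) returns 1, and query(5, 5) returns 0.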
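The text refers to a range-count `query` that the listings further down do not define. The following is one possible sketch of it, assuming a small value range 0..15 and hypothetical helper names (`LO`, `HI`, `count_in`); it reproduces the three results quoted for the set {1, 4, 7, 8, 9}.

```python
LO, HI = 0, 15                        # assumption: tiny value range for the sketch
tree = [0] * (4 * (HI - LO + 1))      # tree[node] = number of set elements in the node's range


def update(node, a, b, x, diff):
    """Add diff to the count stored for value x."""
    if a == b:
        tree[node] += diff
        return
    mid = (a + b) // 2
    if x <= mid:
        update(2 * node, a, mid, x, diff)
    else:
        update(2 * node + 1, mid + 1, b, x, diff)
    tree[node] = tree[2 * node] + tree[2 * node + 1]


def query(node, a, b, lo, hi):
    """Return how many set elements lie in [lo, hi]."""
    if hi < a or b < lo:              # node range is disjoint from [lo, hi]
        return 0
    if lo <= a and b <= hi:           # node range is fully covered
        return tree[node]
    mid = (a + b) // 2
    return (query(2 * node, a, mid, lo, hi)
            + query(2 * node + 1, mid + 1, b, lo, hi))


def count_in(lo, hi):
    """Convenience wrapper that starts at the root."""
    return query(1, LO, HI, lo, hi)


for v in (1, 4, 7, 8, 9):
    update(1, LO, HI, v, 1)

print(count_in(3, 7))   # 2  (the elements 4 and 7)
print(count_in(4, 4))   # 1
print(count_in(5, 5))   # 0
```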

Insert and Delete queries are simple; both can be implemented with the function update(int x, int diff), which adds 'diff' at index 'x'.

Algorithm

// adds 'diff' at index 'x'
update(node, a, b, x, diff)

  // If leaf node for x
  If a == b and a == x
     segmentTree[node] += diff
     return

  // If non-leaf node and x lies in its range
  If x is in [a, b]

     // Update children recursively
     update(2*node, a, (a + b)/2, x, diff)
     update(2*node + 1, (a + b)/2 + 1, b, x, diff)

     // Update node
     segmentTree[node] = segmentTree[2 * node] +
                         segmentTree[2 * node + 1]

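The recursive function above runs in O(log(max_elem)) (here max_elem is 10^6) and is used for both insertion and deletion with the following calls:

  1. Insertion of 'x' is done with update(1, 0, 10^6, x, 1). The root of the tree is passed with start index 0 and end index 10^6, so that every range containing x is updated.
  2. Deletion of 'x' is done with update(1, 0, 10^6, x, -1), with the same root and index arguments.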
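To see why an update is logarithmic, it can help to list the ranges that contain x on the way from the root to its leaf. The helper `update_path` below is hypothetical and only traces ranges; it does not touch a tree. The article's update also calls the sibling child at each level, but that call returns immediately because x is outside its range.

```python
def update_path(a, b, x):
    """Return the ranges [a, b] visited from the root down to the leaf for x."""
    path = [(a, b)]
    while a != b:
        mid = (a + b) // 2
        if x <= mid:
            b = mid          # x is in the left half
        else:
            a = mid + 1      # x is in the right half
        path.append((a, b))
    return path


print(update_path(0, 15, 7))
# [(0, 15), (0, 7), (4, 7), (6, 7), (7, 7)]

# For the article's range 0..10^6 the path never exceeds 21 ranges.
print(len(update_path(0, 10**6, 123456)) <= 21)   # True
```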

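Next we need a function that finds the index of the k-th '1', where 'k' here is always (n + 1)/2. It works much like binary search; you can think of it as a recursive binary search on the segment tree.

Let us take an example: the set currently holds the elements {1, 4, 7, 8, 9}, so it is represented by a segment tree whose root stores the count 5.

If we are at a non-leaf node, it is guaranteed to have both children. We check whether the left child holds at least 'k' ones. If it does, the index lies in the left subtree. Otherwise, the left subtree holds fewer than k ones, so the index lies in the right subtree, where we look for the one numbered k minus the left count. We do this recursively until we reach the index and return it from there.

Algorithm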

findKth(node, a, b, k)
  If a != b
     If segmentTree[2 * node] >= k
        return findKth(2*node, a, (a + b)/2, k)
     else
        return findKth(2*node + 1, (a + b)/2 + 1,
                       b, k - segmentTree[2 * node])
  else
     return a

The recursive function above runs in O(log(max_elem)).
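The descent can be traced on the example set {1, 4, 7, 8, 9} with k = 3. The sketch below assumes a small value range 0..15 and a hypothetical function `find_kth_traced` that records, for each non-leaf node, its range, the direction taken, and the value of k carried into the child.

```python
HI = 15                               # assumption: value range 0..15 for the sketch
tree = [0] * (4 * (HI + 1))           # tree[node] = number of set elements in the node's range


def update(node, a, b, x, diff):
    """Add diff to the count stored for value x."""
    if a == b:
        tree[node] += diff
        return
    mid = (a + b) // 2
    if x <= mid:
        update(2 * node, a, mid, x, diff)
    else:
        update(2 * node + 1, mid + 1, b, x, diff)
    tree[node] = tree[2 * node] + tree[2 * node + 1]


def find_kth_traced(k):
    """Return (k-th smallest element, list of (a, b, direction, k passed down))."""
    if not 1 <= k <= tree[1]:
        return -1, []                 # no such element, mirroring the -1 in the listings
    node, a, b = 1, 0, HI
    steps = []
    while a != b:
        mid = (a + b) // 2
        left_count = tree[2 * node]
        if left_count >= k:           # enough ones on the left: go left, k unchanged
            steps.append((a, b, "left", k))
            node, b = 2 * node, mid
        else:                         # skip the left ones: go right with a smaller k
            k -= left_count
            steps.append((a, b, "right", k))
            node, a = 2 * node + 1, mid + 1
    return a, steps


for v in (1, 4, 7, 8, 9):
    update(1, 0, HI, v, 1)

answer, steps = find_kth_traced(3)
print(answer)   # 7
print(steps)
# [(0, 15, 'left', 3), (0, 7, 'right', 2), (4, 7, 'right', 1), (6, 7, 'right', 1)]
```

Each step halves the range, which is where the O(log(max_elem)) bound comes from.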

C++
// A C++ program to implement insert, delete and
// median queries using segment tree
#include <iostream>
#define maxn 3000005
#define max_elem 1000000
using namespace std;
   
// A global array to store segment tree.
// Note: Since it is global, all elements are 0.
int segmentTree[maxn];
   
// Update 'node' and its children in segment tree.
// Here 'node' is index in segmentTree[], 'a' and
// 'b' are starting and ending indexes of range stored
//  in current node.
// 'diff' is the value to be added to value 'x'.
void update(int node, int a, int b, int x, int diff)
{
    // If current node is a leaf node
    if (a == b && a == x )
    {
        // add 'diff' and return
        segmentTree[node] += diff;
        return ;
    }
   
    // If current node is non-leaf and 'x' is in its
    // range
    if (x >= a && x <= b)
    {
        // update both sub-trees, left and right
        update(node*2, a, (a + b)/2, x, diff);
        update(node*2 + 1, (a + b)/2 + 1, b, x, diff);
   
        // Finally update current node
        segmentTree[node] = segmentTree[node*2] +
                            segmentTree[node*2 + 1];
    }
}
   
// Returns k'th node in segment tree
int findKth(int node, int a, int b, int k)
{
    // non-leaf node, will definitely have both
    // children; left and right
    if (a != b)
    {
        // If kth element lies in the left subtree
        if (segmentTree[node*2] >= k)
            return findKth(node*2, a, (a + b)/2, k);
   
        // If kth one lies in the right subtree
        return findKth(node*2 + 1, (a + b)/2  + 1,
                       b, k - segmentTree[node*2]);
    }
   
    // if at a leaf node, return the index it stores
    // information about
    return (segmentTree[node])? a : -1;
}
   
// insert x in the set
void insert(int x)
{
    update(1, 0, max_elem, x, 1);
}
   
// delete x from the set
void delet(int x)
{
    update(1, 0, max_elem, x, -1);
}
   
// returns median element of the set with odd
// cardinality only
int median()
{
    int k = (segmentTree[1] + 1) / 2;
    return findKth(1, 0, max_elem, k);
}
   
// Driver code
int main()
{
    insert(1);
    insert(4);
    insert(7);
    cout << "Median for the set {1,4,7} = "
         << median() << endl;
    insert(8);
    insert(9);
    cout << "Median for the set {1,4,7,8,9} = "
         << median() << endl;
    delet(1);
    delet(8);
    cout << "Median for the set {4,7,9} = "
         << median() << endl;
    return 0;
}


Java
// A Java program to implement insert, delete and
// median queries using segment tree
import java.io.*;
 
class GFG{
 
public static int maxn = 3000005;
public static int max_elem = 1000000;
 
// A static array to store the segment tree.
// Note: Java initializes int array elements to 0.
public static int[] segmentTree = new int[maxn];
 
// Update 'node' and its children in segment tree.
// Here 'node' is index in segmentTree[], 'a' and
// 'b' are starting and ending indexes of range stored
//  in current node.
// 'diff' is the value to be added to value 'x'.
public static void update(int node, int a, int b,
                          int x, int diff)
{
     
    // If current node is a leaf node
    if (a == b && a == x )
    {
         
        // Add 'diff' and return
        segmentTree[node] += diff;
        return ;
    }
     
    // If current node is non-leaf and 'x'
    // is in its range
    if (x >= a && x <= b)
    {
         
        // Update both sub-trees, left and right
        update(node * 2, a, (a + b) / 2, x, diff);
        update(node * 2 + 1, (a + b) / 2 + 1,
               b, x, diff);
         
        // Finally update current node
        segmentTree[node] = segmentTree[node * 2] +
                            segmentTree[node * 2 + 1];
    }
}
 
// Returns k'th node in segment tree
public static int findKth(int node, int a,
                          int b, int k)
{
     
    // Non-leaf node, will definitely have both
    // children; left and right
    if (a != b)
    {
         
        // If kth element lies in the left subtree
        if (segmentTree[node * 2] >= k)
        {
            return findKth(node * 2, a, (a + b) / 2, k);
        }
         
        // If kth one lies in the right subtree
        return findKth(node * 2 + 1, (a + b) / 2  + 1,
                       b, k - segmentTree[node * 2]);
         
    }
     
    // If at a leaf node, return the index it stores
    // information about
    return (segmentTree[node] != 0) ? a : -1;
}
 
// Insert x in the set
public static void insert(int x)
{
    update(1, 0, max_elem, x, 1);
}
 
// Delete x from the set
public static void delet(int x)
{
    update(1, 0, max_elem, x, -1);
}
 
// Returns median element of the set
// with odd cardinality only
public static int median()
{
    int k = (segmentTree[1] + 1) / 2;
    return findKth(1, 0, max_elem, k);
}
 
// Driver code
public static void main(String[] args)
{
    insert(1);
    insert(4);
    insert(7);
    System.out.println("Median for the set {1,4,7} = " +
                       median());
    insert(8);
    insert(9);
    System.out.println("Median for the set {1,4,7,8,9} = " +
                       median());
    delet(1);
    delet(8);
    System.out.println("Median for the set {4,7,9} = " +
                       median());
}
}
 
// This code is contributed by avanitrachhadiya2155


Python3
# A Python3 program to implement insert, delete and
# median queries using segment tree
maxn = 3000005
max_elem = 1000000
 
# A global list to store the segment tree.
# Note: every element is initialized to 0.
segmentTree = [0] * maxn
 
# Update 'node' and its children in segment tree.
# Here 'node' is index in segmentTree[], 'a' and
# 'b' are starting and ending indexes of range stored
# in current node.
# 'diff' is the value to be added to value 'x'.
def update(node, a, b, x, diff):
    global segmentTree
     
    # If current node is a leaf node
    if (a == b and a == x ):
         
        # add 'diff' and return
        segmentTree[node] += diff
        return
 
    # If current node is non-leaf and 'x' is in its
    # range
    if (x >= a and x <= b):
         
        # update both sub-trees, left and right
        update(node * 2, a, (a + b)//2, x, diff)
        update(node * 2 + 1, (a + b)//2 + 1, b, x, diff)
 
        # Finally update current node
        segmentTree[node] = segmentTree[node * 2] + segmentTree[node * 2 + 1]
 
# Returns k'th node in segment tree
def findKth(node, a, b, k):
    global segmentTree
     
    # non-leaf node, will definitely have both
    # children left and right
    if (a != b):
         
        # If kth element lies in the left subtree
        if (segmentTree[node * 2] >= k):
            return findKth(node * 2, a, (a + b)//2, k)
 
        # If kth one lies in the right subtree
        return findKth(node * 2 + 1, (a + b)//2 + 1,
                       b, k - segmentTree[node * 2])
 
    # if at a leaf node, return the index it stores
    # information about
    return a if (segmentTree[node]) else -1
 
# insert x in the set
def insert(x):
    update(1, 0, max_elem, x, 1)
 
# delete x from the set
def delet(x):
    update(1, 0, max_elem, x, -1)
 
# returns median element of the set with odd
# cardinality only
def median():
    k = (segmentTree[1] + 1) // 2
    return findKth(1, 0, max_elem, k)
 
# Driver code
if __name__ == '__main__':
    insert(1)
    insert(4)
    insert(7)
    print("Median for the set {1,4,7} =",median())
    insert(8)
    insert(9)
    print("Median for the set {1,4,7,8,9} =",median())
    delet(1)
    delet(8)
    print("Median for the set {4,7,9} =",median())
 
# This code is contributed by mohit kumar 29


C#
// A C# program to implement insert, delete
// and median queries using segment tree
using System;
 
class GFG{
     
public static int maxn = 3000005;
public static int max_elem = 1000000;
 
// A static array to store the segment tree.
// Note: C# initializes int array elements to 0.
public static int[] segmentTree = new int[maxn];
 
// Update 'node' and its children in segment tree.
// Here 'node' is index in segmentTree[], 'a' and
// 'b' are starting and ending indexes of range stored
//  in current node.
// 'diff' is the value to be added to value 'x'.
public static void update(int node, int a,
                          int b, int x, int diff)
{
     
    // If current node is a leaf node
    if (a == b && a == x)
    {
         
        // Add 'diff' and return
        segmentTree[node] += diff;
        return ;
    }
     
    // If current node is non-leaf and 'x'
    // is in its range
    if (x >= a && x <= b)
    {
         
        // Update both sub-trees, left and right
        update(node * 2, a, (a + b) / 2, x, diff);
        update(node * 2 + 1, (a + b) / 2 + 1,
               b, x, diff);
         
        // Finally update current node
        segmentTree[node] = segmentTree[node * 2] +
                            segmentTree[node * 2 + 1];
    }
}
 
// Returns k'th node in segment tree
public static int findKth(int node, int a,
                          int b, int k)
{
     
    // Non-leaf node, will definitely have both
    // children; left and right
    if (a != b)
    {
         
        // If kth element lies in the left subtree
        if (segmentTree[node * 2] >= k)
        {
            return findKth(node * 2, a,
                        (a + b) / 2, k);
        }
         
        // If kth one lies in the right subtree
        return findKth(node * 2 + 1, (a + b) / 2  + 1,
                     b, k - segmentTree[node * 2]);
    }
     
    // If at a leaf node, return the index it
    // stores information about
    if (segmentTree[node] != 0)
    {
        return a;
    }
    else
    {
        return -1;
    }
}
 
// Insert x in the set
public static void insert(int x)
{
    update(1, 0, max_elem, x, 1);
}
 
// Delete x from the set
public static void delet(int x)
{
    update(1, 0, max_elem, x, -1);
}
 
// Returns median element of the set
// with odd cardinality only
public static int median()
{
    int k = (segmentTree[1] + 1) / 2;
    return findKth(1, 0, max_elem, k);
}
 
// Driver code
static public void Main()
{
    insert(1);
    insert(4);
    insert(7);
    Console.WriteLine("Median for the set {1,4,7} = " +
                      median());
    insert(8);
    insert(9);
    Console.WriteLine("Median for the set {1,4,7,8,9} = " +
                      median());
    delet(1);
    delet(8);
    Console.WriteLine("Median for the set {4,7,9} = " +
                      median());
}
}
 
// This code is contributed by rag2127


Output:

Median for the set {1,4,7} = 4
Median for the set {1,4,7,8,9} = 7
Median for the set {4,7,9} = 7

Conclusion:
All three queries run in O(log(max_elem)). Here max_elem = 10^6, so log2(max_elem) is approximately 20.
The segment tree uses O(max_elem) space.
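Both figures can be checked numerically. The sketch below uses a hypothetical helper, `largest_node_index`, that walks every node range of a tree numbered the way the listings number it (children of node i are 2i and 2i + 1) and reports the largest index used; this is one way to confirm that an array of 3000005 entries is large enough for the range 0..10^6. The full walk visits about two million nodes, so it takes a moment.

```python
import math


def largest_node_index(lo, hi):
    """Largest array index used by a segment tree over [lo, hi] rooted at index 1."""
    largest = 0
    stack = [(1, lo, hi)]             # explicit stack instead of recursion
    while stack:
        node, a, b = stack.pop()
        largest = max(largest, node)
        if a != b:
            mid = (a + b) // 2
            stack.append((2 * node, a, mid))
            stack.append((2 * node + 1, mid + 1, b))
    return largest


print(math.log2(10**6))               # about 19.93, hence "approximately 20"
print(largest_node_index(0, 15))      # 31: a perfect tree over 16 leaves

largest = largest_node_index(0, 10**6)
print(largest < 3_000_005)            # True: the array size in the listings suffices
```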

If there were no Delete queries, the problem could also be solved with the well-known algorithm here.