📜  Difference between Dimensionality Reduction and Numerosity Reduction

📅  Last modified: 2021-09-14 01:59:26             🧑  Author: Mango

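1. Dimensionality Reduction:
It is a technique for obtaining a reduced or compressed representation of the original data by cutting down the number of attributes (features). It has two components: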

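  • Feature selection:
    The process of removing irrelevant or redundant features.
  • Feature extraction:
    The process of transforming the data into new features suitable for modelling.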
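The two components can be illustrated with a minimal sketch in plain Python. It is not a reference implementation: the helper names `select_features` and `first_principal_component`, the variance threshold, and the toy data are all assumptions made for illustration. Selection drops a constant (uninformative) column, and extraction projects the remaining columns onto their first principal component, found here by power iteration.

```python
from statistics import mean, pvariance


def select_features(rows, min_variance=1e-12):
    """Feature selection: return indices of columns whose variance exceeds a threshold."""
    columns = list(zip(*rows))
    return [i for i, col in enumerate(columns) if pvariance(col) > min_variance]


def first_principal_component(rows, iterations=200):
    """Feature extraction: direction of maximum variance, found by power iteration.

    Assumes the data is not constant and that the all-ones start vector is not
    orthogonal to the dominant direction (true for the toy data below).
    """
    n_features = len(rows[0])
    means = [mean(col) for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    # Population covariance matrix of the centered data.
    cov = [[sum(r[i] * r[j] for r in centered) / len(rows)
            for j in range(n_features)] for i in range(n_features)]
    vec = [1.0] * n_features
    for _ in range(iterations):
        vec = [sum(cov[i][j] * vec[j] for j in range(n_features))
               for i in range(n_features)]
        norm = sum(v * v for v in vec) ** 0.5
        vec = [v / norm for v in vec]
    # One new feature per row: the projection onto the principal direction.
    scores = [sum(c * v for c, v in zip(row, vec)) for row in centered]
    return vec, scores


# Hypothetical data: the third column is constant and carries no information.
data = [
    [1.0, 2.1, 5.0],
    [2.0, 3.9, 5.0],
    [3.0, 6.2, 5.0],
    [4.0, 7.8, 5.0],
]

kept = select_features(data)                        # [0, 1]
selected = [[row[i] for i in kept] for row in data]
direction, scores = first_principal_component(selected)
print(kept, len(scores))
```

Selection keeps a subset of the original columns unchanged, whereas extraction produces a new column (`scores`) that is a combination of the original ones. In practice a library routine would normally replace the hand-written power iteration.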

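2. Numerosity Reduction:
It is a data reduction technique that reduces the volume of data by replacing it with a smaller, suitable form of data representation. These techniques can be parametric or non-parametric. Parametric methods fit a model to the data, so that typically only the model parameters need to be stored instead of the actual data. Non-parametric methods for storing a reduced representation of the data include histograms, clustering, and sampling.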
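As a rough sketch of the parametric and non-parametric approaches, the snippet below reduces the same synthetic data in three ways: a least-squares line (two stored parameters), an equal-width histogram, and a simple random sample. The function names `fit_line` and `equal_width_histogram`, the bin width, the sample size, and the data are illustrative assumptions.

```python
import random
from collections import Counter


def fit_line(xs, ys):
    """Parametric: least-squares line; only (slope, intercept) need to be stored."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def equal_width_histogram(values, bin_width):
    """Non-parametric: map each bin's lower edge to the number of values in it."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Synthetic, noise-free data, assumed purely for illustration.
xs = list(range(100))
ys = [3 * x + 2 for x in xs]

# 100 (x, y) pairs summarised by two parameters.
slope, intercept = fit_line(xs, ys)

# 100 values summarised by four bin counts.
histogram = equal_width_histogram(xs, bin_width=25)

# 100 values represented by 10 of them (sampling without replacement).
rng = random.Random(0)
sample = rng.sample(xs, k=10)

print(slope, intercept, histogram, len(sample))
```

The number of attributes is unchanged in all three cases; what shrinks is the number of values that have to be stored. With real, noisy data the fitted line and the histogram would only approximate the original values.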

Difference between dimensionality reduction and numerosity reduction:

| Dimensionality Reduction | Numerosity Reduction |
|---|---|
| Data encoding or data transformations are applied to obtain a reduced or compressed form of the original data. | Data volume is reduced by choosing suitable alternative forms of data representation. |
| It can be used to remove irrelevant or redundant attributes. | It only represents the original data in a smaller form. |
| Some data can be lost, ideally only data that is irrelevant. | The aim is to keep the information in the data; the smaller representation (model parameters, histogram, clusters, or sample) approximates the original values. |
| Methods: 1. Wavelet transforms. 2. Principal Component Analysis (PCA). | Methods: 1. Regression or log-linear models (parametric). 2. Histograms, clustering, sampling (non-parametric). |
| Its components are feature selection and feature extraction. | It has no components, only methods that reduce data volume. |
| It can lead to less misleading data and better model accuracy. | It aims to preserve the integrity of the data while reducing its volume. |
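To make the wavelet entry in the comparison concrete, here is a minimal sketch of a single level of the unnormalised Haar transform (pairwise averages and differences). The function names and the sample signal are assumptions for illustration, and the input length is assumed to be even. Keeping all coefficients allows exact reconstruction, while discarding the detail coefficients halves the storage at the cost of an approximation.

```python
def haar_step(signal):
    """One Haar level: pairwise averages (coarse part) and half-differences (detail)."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    averages = [(a + b) / 2 for a, b in pairs]
    details = [(a - b) / 2 for a, b in pairs]
    return averages, details


def inverse_haar_step(averages, details):
    """Rebuild the signal from averages and details."""
    signal = []
    for avg, det in zip(averages, details):
        signal.extend([avg + det, avg - det])
    return signal


# Hypothetical signal with an even number of samples.
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 8.0, 5.0, 3.0]
averages, details = haar_step(signal)

# With every coefficient kept, the original signal is recovered.
restored = inverse_haar_step(averages, details)

# With the detail coefficients dropped, only half the values are stored
# and the reconstruction is an approximation of the original.
approx = inverse_haar_step(averages, [0.0] * len(details))

print(averages)  # [5.0, 11.0, 8.0, 4.0]
print(approx)    # [5.0, 5.0, 11.0, 11.0, 8.0, 8.0, 4.0, 4.0]
```

A full wavelet-based reduction would apply the step recursively and keep only the largest coefficients; this single level is enough to show where information can be lost.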