Difference between Hadoop and Cassandra

Last modified: 2021-09-12 10:56:53             Author: Mango

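Hadoop is an open-source software framework. It is written mainly in Java, with some shell scripts and native code in C.
The framework is used to store, manage, and process data for big-data applications running on clusters of machines. The main components of Hadoop are HDFS, MapReduce, and YARN.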
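
To make the MapReduce component more concrete, here is a minimal single-process sketch of the programming model in Python. It is not Hadoop code: the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are made up for illustration, and a real job would run these phases in parallel across the cluster with input read from HDFS.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over a list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

For example, `word_count(["big data", "Big cluster"])` returns `{"big": 2, "data": 1, "cluster": 1}`. The batch character of Hadoop follows from this shape: the reduce step can only start once the map output for a key has been collected.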

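Cassandra is an open-source distributed database management system. It is a NoSQL database of the wide-column store type, able to handle large amounts of data across many commodity servers while providing high availability with no single point of failure. It is written in Java and developed under the Apache Software Foundation.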
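
The "no single point of failure" property comes from every node being able to own and serve part of the data. The sketch below illustrates the general idea with a simplified hash ring in Python. It is a toy under stated assumptions (MD5 tokens, one token per node, hypothetical names `token`, `Ring`, and `replicas_for`), not Cassandra's actual partitioner or replication strategy.

```python
import bisect
import hashlib


def token(value):
    """Hash a string to an integer position on the ring (toy choice: MD5)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy hash ring: each node owns one token, keys go to the next nodes."""

    def __init__(self, nodes, replication_factor):
        if replication_factor > len(nodes):
            raise ValueError("replication factor cannot exceed node count")
        self.rf = replication_factor
        # Sort nodes by token so the ring can be walked in order.
        self.entries = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.entries]

    def replicas_for(self, partition_key):
        """Return the rf consecutive nodes that would store this key."""
        start = bisect.bisect_right(self.tokens, token(partition_key))
        return [self.entries[(start + i) % len(self.entries)][1]
                for i in range(self.rf)]
```

With `Ring(["node-a", "node-b", "node-c", "node-d"], 3)`, every key maps to three different nodes, so losing any one node still leaves two copies reachable, and no node plays a special coordinating role.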

Differences between Hadoop and Cassandra

S.No. | Hadoop | Cassandra
1 | Hadoop is a scalable framework designed to be deployed on low-cost hardware. | Cassandra is deployed in a highly distributed fashion, as a cluster of nodes that are all aware of each other.
2 | Hadoop is a big-data processing framework built around the MapReduce programming model. | Cassandra is a database used mainly for real-time reads and writes.
3 | Hadoop (HDFS) stores files of any format: structured, semi-structured, or unstructured, including images. | Cassandra stores structured rows in tables and is not well suited to large binary objects such as images.
4 | Hadoop follows a master-slave architecture. | Cassandra follows a peer-to-peer architecture.
5 | A Hadoop cluster is typically deployed in a single data center. | A Cassandra cluster can be spread across multiple data centers.
6 | Data is read and processed through MapReduce jobs. | Data is read and written through the Cassandra Query Language (CQL).
7 | Data is written directly to the DataNodes. | Data is first written to an in-memory memtable (and appended to a commit log for durability), then flushed to disk.
8 | HDFS keeps 3 replicas of each block by default; the replication factor is configurable. | The replication factor is set per keyspace and is bounded in practice by the number of nodes.
9 | Latency is high. | Latency is low.
10 | Hadoop nodes communicate mainly through RPC over TCP. | Cassandra nodes exchange cluster state through a gossip protocol.
11 | Designed for batch processing. | Designed for real-time processing.
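
Row 7 describes Cassandra's write path: a write lands in memory first and reaches disk later. A rough single-node sketch in Python follows, under simplifying assumptions. The class `TinyStore`, its `flush_threshold` parameter, and the in-memory lists standing in for the durability log and the on-disk sorted tables are all hypothetical; a real Cassandra node does considerably more (compaction, bloom filters, tombstones, and so on).

```python
class TinyStore:
    """Toy write path: log the write, buffer it in memory, flush in bulk."""

    def __init__(self, flush_threshold=2):
        self.flush_threshold = flush_threshold
        self.commit_log = []   # stand-in for an append-only durability log
        self.memtable = {}     # recent writes, held in memory
        self.sstables = []     # immutable sorted snapshots, oldest first

    def write(self, key, value):
        self.commit_log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Persist the memtable as a sorted, immutable table and clear it."""
        if self.memtable:
            self.sstables.append(tuple(sorted(self.memtable.items())))
            self.memtable = {}
            self.commit_log.clear()  # flushed data no longer needs the log

    def read(self, key):
        """Check the memtable first, then tables from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            for k, v in table:
                if k == key:
                    return v
        return None
```

Because `write` only appends to a log and updates a dictionary, it returns quickly, which is one reason the write latency in row 9 is low. The cost moves to reads, which may have to consult several flushed tables.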