Introduction

A distributed system is easiest to define in contrast to a centralized system, in which all state is stored on a single computer. Centralized systems are simpler, easier to understand, and can be faster for a single user.

In a distributed system, state is divided across multiple computers. Such a system is more robust and more scalable, but also more complex.

Examples of Distributed Systems :

  • Domain Name System (DNS)
  • Facebook
  • Google
  • Netflix
  • Email servers (SMTP)

Distributed Systems Advantages

1. Scalability

The main advantage is scalability: capacity grows by adding more machines, which is usually cheaper than buying a supercomputer. More machines also mean more parallelism, which can improve performance.

2. Sharing

The same resource can be shared among multiple users.

3. Communication

Machines and users can communicate even when they are geographically far apart.

4. Reliability

The service can remain available even if several machines go down.

Distributed Systems Challenges

Despite these advantages, distributed systems come with their own challenges:

1. Concurrency

Concurrent execution requires some form of coordination.
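As a minimal sketch of what "coordination" can mean, the example below has several threads update one shared counter behind a lock. The names `SharedCounter` and `run_workers` are hypothetical, and threads on one machine only stand in for nodes here: coordinating across real machines needs more than an in-process lock.

```python
import threading


class SharedCounter:
    """A counter that is safe to update from several threads."""

    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # The lock ensures only one thread updates the value at a time.
        with self._lock:
            self.value += 1


def run_workers(workers: int, increments: int) -> int:
    """Start `workers` threads that each increment the counter `increments` times."""
    counter = SharedCounter()

    def work() -> None:
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With the lock in place, `run_workers(4, 1000)` returns 4000 on every run, because no update can be lost to an interleaved read and write.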

2. Fault-tolerance

Any component can fail at any moment because of a software bug or a hardware fault.

3. Security

A single compromised machine can put the entire system at risk.

4. Coordination

There is no global clock, so ordering events and coordinating nodes is non-trivial.
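One well-known way to order events without a shared clock is a Lamport logical clock. The source article does not name this technique, so take the sketch below as an illustration only; the class and method names are assumptions made for this example.

```python
class LamportClock:
    """A logical clock: a counter that orders events without wall-clock time."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """Record a local event by advancing the counter."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Sending counts as an event; the returned timestamp travels with the message."""
        return self.tick()

    def receive(self, message_time: int) -> int:
        """Jump past the sender's timestamp so the receive is ordered after the send."""
        self.time = max(self.time, message_time) + 1
        return self.time


# Two nodes, each with its own clock (hypothetical usage).
node_a = LamportClock()
node_b = LamportClock()
node_a.tick()                 # a local event on node A
stamp = node_a.send()         # A sends a message carrying its timestamp
node_b.receive(stamp)         # B's clock moves past the timestamp it received
```

After the exchange, node B's clock is greater than the timestamp on the message, so the send is ordered before the receive even though the two machines never compared real clocks.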

5. Troubleshooting

Problems are harder to troubleshoot because the behavior of the system as a whole is hard to reason about.

Distributed Systems Categories

Distributed systems can be split into six categories:

1. Data stores (aka distributed databases)

Most distributed databases are NoSQL, non-relational databases. They offer high performance and scalability at the cost of consistency or availability. (Cassandra, Riak, Voldemort)

2. Computing

The goal is to split an enormous task (e.g. processing 100 billion records) into many smaller tasks. When the task gets bigger, you simply include more nodes in the calculation. (Kafka, Apache Spark, Apache Storm)
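The split-and-combine idea can be sketched on a single machine, with a thread pool standing in for the nodes of a cluster. This is an assumption-laden toy, not how Spark or Storm work internally; `chunks`, `count_even` and `distributed_count` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def chunks(records, size):
    """Yield consecutive slices of `records`, each at most `size` items long."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def count_even(chunk):
    """The small task: count the even numbers in one chunk."""
    return sum(1 for record in chunk if record % 2 == 0)


def distributed_count(records, workers=4, chunk_size=1000):
    """Split the big task into chunks, process them in parallel, combine the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = pool.map(count_even, chunks(records, chunk_size))
        return sum(partial_counts)
```

Changing `workers` changes how many chunks are processed at once but not the result, which is the property that lets a real system add nodes as the task grows.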

3. File systems

Distributed file systems can be thought of as distributed data stores. Conceptually they are the same thing: a large amount of data is stored and accessed across a cluster of machines that appears as one. They typically go hand in hand with distributed computing. (Hadoop HDFS, InterPlanetary File System IPFS)

4. Messaging systems

Messaging systems provide a central place for storing and propagating messages/events inside your overall system. They let you decouple your application logic from talking directly to your other systems. (RabbitMQ, Kafka, Apache ActiveMQ, Amazon SQS)
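The decoupling can be sketched with an in-process queue standing in for a real broker such as RabbitMQ. The producer and the consumer below never call each other; they only share the queue. All function names are hypothetical, and a real broker would add networking, persistence and delivery guarantees that this toy leaves out.

```python
import queue
import threading


def producer(broker: queue.Queue, events: list) -> None:
    """Publish events to the broker, then a None sentinel to signal the end."""
    for event in events:
        broker.put(event)
    broker.put(None)


def consumer(broker: queue.Queue, handled: list) -> None:
    """Handle events from the broker until the sentinel arrives."""
    while True:
        event = broker.get()
        if event is None:
            break
        handled.append(event.upper())  # stand-in for real processing


def run_pipeline(events: list) -> list:
    """Wire a producer and a consumer together through a shared queue."""
    broker = queue.Queue()
    handled = []
    worker = threading.Thread(target=consumer, args=(broker, handled))
    worker.start()
    producer(broker, events)
    worker.join()
    return handled
```

Because both sides depend only on the queue, either one could be replaced or scaled without changing the other, which is the point of putting a messaging system in between.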

5. Ledgers

A distributed ledger can be thought of as an immutable, append-only database that is replicated, synchronized and shared across all nodes in the distributed network. (Blockchain, Bitcoin)
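The "immutable, append-only" part can be illustrated with a hash chain, in which every entry stores the hash of the entry before it. The sketch below runs on one machine and leaves out replication and consensus entirely; the field names and helper functions are assumptions made for this example, not the format of any real blockchain.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def block_hash(index: int, data: str, prev_hash: str) -> str:
    """Hash the contents of an entry together with the previous entry's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(ledger: list, data: str) -> None:
    """Add a new entry that is linked to the current last entry."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS_HASH
    index = len(ledger)
    ledger.append({
        "index": index,
        "data": data,
        "prev_hash": prev_hash,
        "hash": block_hash(index, data, prev_hash),
    })


def is_valid(ledger: list) -> bool:
    """Recompute every hash and check every link; any edit breaks the chain."""
    prev_hash = GENESIS_HASH
    for index, block in enumerate(ledger):
        if block["index"] != index or block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(index, block["data"], prev_hash):
            return False
        prev_hash = block["hash"]
    return True
```

Changing the data of an earlier entry makes `is_valid` return `False`, because the stored hash no longer matches the recomputed one. That tamper evidence is what makes rewriting history detectable once copies are shared across nodes.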

6. Applications

A system is distributed only if its nodes communicate with each other to coordinate their actions. Therefore, an application that runs its back-end code on a peer-to-peer network is better classified as a distributed application. (BitTorrent)

Source : https://medium.freecodecamp.org/a-thorough-introduction-to-distributed-systems-3b91562c9b3c
