Learning by doing: Distributed Systems

At ShuttleCloud, we’ve developed a distributed platform that can handle very high loads. This has given me a good knowledge of distributed systems from a practitioner’s perspective. We’re using, for example, CouchDB, Pacemaker and Corosync, Amazon RDS, RabbitMQ, etc.

However, implementing this kind of software is a completely different beast. It’s a broad and complex field. Most of the tech people I admire and follow on Twitter work in that field, and you can see that it is really difficult to keep up. There are too many things to learn! 😀

Lately I’ve been reading a lot about the subject, but reading is not enough. If you want to learn something, you had better start using it. That’s why I’ve decided to implement an in-memory key-value store (a very simple one).

The objective is to learn things like:

  • Should I implement a write-ahead log so that the DB can recover from a crash?
  • Should I store the values in a log-structured file so that the database is not limited by the RAM (well, the keys still have to fit in memory)?
  • How can the database scale reads? And writes? Do I need sharding? Replicas? Should it use leader-based replication?
  • How can the data be replicated across different machines?
  • How does it know when the leader has failed?
  • ….
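
To make the first question concrete, here is a minimal sketch of what a write-ahead log could look like for an in-memory store. It is not the implementation from the repo: the `TinyKV` class, the one-JSON-record-per-line log format and the fsync-on-every-write policy are all assumptions I picked for illustration.

```python
import json
import os
import tempfile


class TinyKV:
    """Hypothetical in-memory key-value store backed by a write-ahead log."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._replay()
        # Append mode: existing records are kept, new ones go to the end.
        self.log = open(log_path, "ab")

    def _replay(self):
        """Rebuild the in-memory state from the log, if one exists."""
        if not os.path.exists(self.log_path):
            return
        good_bytes = 0
        with open(self.log_path, "rb") as f:
            for raw in f:
                # A record without a trailing newline, or one that does not
                # parse, is treated as a torn write from a crash: stop there.
                if not raw.endswith(b"\n"):
                    break
                try:
                    record = json.loads(raw)
                except ValueError:
                    break
                if record["op"] == "set":
                    self.data[record["key"]] = record["value"]
                elif record["op"] == "del":
                    self.data.pop(record["key"], None)
                good_bytes += len(raw)
        # Drop the torn tail so later appends start on a clean line.
        os.truncate(self.log_path, good_bytes)

    def _append(self, record):
        self.log.write(json.dumps(record).encode("utf-8") + b"\n")
        self.log.flush()
        # Assumption: durability matters more than throughput, so every
        # write is forced to disk before it is acknowledged.
        os.fsync(self.log.fileno())

    def set(self, key, value):
        # Log first, then apply in memory: that ordering is the "write-ahead".
        self._append({"op": "set", "key": key, "value": value})
        self.data[key] = value

    def delete(self, key):
        self._append({"op": "del", "key": key})
        self.data.pop(key, None)

    def get(self, key, default=None):
        return self.data.get(key, default)

    def close(self):
        self.log.close()


def demo():
    """Write some keys, 'crash' (drop the object), and recover from the log."""
    path = os.path.join(tempfile.mkdtemp(), "wal.log")
    db = TinyKV(path)
    db.set("a", 1)
    db.set("b", 2)
    db.delete("a")
    db.close()

    recovered = TinyKV(path)  # state is rebuilt only from the log
    result = (recovered.get("a"), recovered.get("b"))
    recovered.close()
    return result
```

Calling `demo()` writes two keys, deletes one, and then opens a fresh `TinyKV` on the same file, so whatever it returns comes purely from replaying the log. Syncing on every write is the simplest policy but also the slowest; batching several writes per fsync is one of the trade-offs I would expect to explore.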

My idea is to write posts about my decisions on those questions and to publish the implementation in this repo, so that I can learn from other people.

Félix
