This post is part 1 of a four-part series.
I have been setting up a microservice architecture recently, and while setting up some of the infrastructure with Docker I came across quite a few hurdles just getting everything up and running, mostly while dealing with two of the complexities below: communication, and tracing and debugging. I am compiling this series of posts in the hope of saving someone some time; it will definitely save me some if I forget anything.
This is the first part of a multi-part series about understanding and implementing a simple set of microservices. In this post I will be attempting to get everyone on the same page.
By the time we are done, I hope you will have learned:

- What my definition of a microservice is
- Where some of the complexity lies

What is a microservice?

A microservice is a single "module", or a collection of "modules", used to fulfill a single responsibility. A microservice can be one or more APIs, databases, console applications, or remote procedure calls: whatever it takes to fulfill that one responsibility and provide value to the business.

A microservice architecture is what you get when you take a collection of microservices (often more than 20) and link them together to build a functioning application. This is where the complexities start to show up; after all, if you are only running one service and it is only doing one thing, there shouldn't be much to test.

Complexities of Microservices

These are what I consider the big three. Solve these and your microservices should stand a chance of surviving.
Communication

In an application, data entered in one area is usually consumed somewhere else in the application as well. In a monolith, you just select from the relevant table and, assuming the database exists, you're good to go. In a microservice architecture, where each service is responsible for its own data, that isn't an option. So how do you get data from one service to another? If you attempt to talk directly between services, you wind up with a pile of problems, from tightly coupled services to an application that never returns. The solution is a message broker of some form: RabbitMQ, Kafka, or Azure Service Bus, to name a few. RabbitMQ is commonly recommended for development and is what I will be demonstrating later. The concept is relatively simple:
- You have a publisher (the service doing something or needing something done)
- A message describing the event (user.created)
- A queue or set of queues (a place to put the message)
- A subscriber (someone that cares enough to watch the queue)
With this pattern, services that care about a particular event can simply watch for it. As long as they can reach the queue, which is usually more stable since it is not under active development, they don't have to worry about the state of the other services or even need to know where they are.
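The publisher, message, queue, and subscriber roles above can be sketched with a toy in-memory stand-in for the broker (a real setup would use RabbitMQ; the `Broker` class here is purely illustrative):

```python
import queue
from collections import defaultdict

class Broker:
    """A toy in-memory message broker: one queue per event type."""

    def __init__(self):
        self.queues = defaultdict(queue.Queue)

    def publish(self, event, payload):
        # The publisher only needs to reach the broker; it never
        # talks to, or knows about, the subscribers directly.
        self.queues[event].put(payload)

    def consume(self, event):
        # A subscriber drains the queue for the one event it cares about.
        q = self.queues[event]
        while not q.empty():
            yield q.get()

broker = Broker()

# Publisher: the user service announces that a user was created.
broker.publish("user.created", {"id": 1, "email": "a@example.com"})

# Subscriber: an email service watches the queue and reacts.
welcomed = [msg["email"] for msg in broker.consume("user.created")]
print(welcomed)  # ['a@example.com']
```

The point of the indirection is that the user service and the email service never reference each other, only the broker and the event name.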
Failure

In a monolith, if there is a failure, there is a failure everywhere: if the database or the server goes down, nobody can access the application. Microservices don't have this problem; if you have two microservices and one fails, the other keeps on working. On the other hand, you tend to have more infrastructure in a microservice architecture, like the message broker. The broker dying shouldn't cause the application to fail, and the application should be able to recover from that failure if it does occur.
Since we have already established that we are going to use a message broker for inter-service communication, I will use it as the example. Let's say the broker goes down; what happens? First, we can't send or receive any messages, so any part of the system downstream will be working off of stale and inconsistent data. Second, the messages that we attempt to send could be lost. To recover from the failure, we need a way to automatically bring a failed service back online; this is what orchestration tools like Kubernetes and Azure Service Fabric do. They detect a failure and restart the service. Next, publishers need to retry the request a couple of times, and if the broker still isn't back, that will need to be handled, either by rolling back the changes or, if possible, by storing the message and retrying at a later time. At least this way the message will still be sent.
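The retry-then-store behavior can be sketched as follows. This is a minimal illustration, not production code: `FlakyBroker`, `BrokerDown`, and `publish_with_retry` are all invented names, and a real "outbox" would be persisted to disk or a local database, not a Python list.

```python
import time

class BrokerDown(Exception):
    pass

class FlakyBroker:
    """Stand-in broker that fails its first few publish attempts."""

    def __init__(self, failures):
        self.failures = failures
        self.delivered = []

    def publish(self, message):
        if self.failures > 0:
            self.failures -= 1
            raise BrokerDown("broker unreachable")
        self.delivered.append(message)

def publish_with_retry(broker, message, outbox, attempts=3, delay=0.01):
    """Retry a few times; if the broker stays down, store the message for later."""
    for _ in range(attempts):
        try:
            broker.publish(message)
            return True
        except BrokerDown:
            time.sleep(delay)  # back off before the next attempt
    outbox.append(message)  # persist locally and retry at a different time
    return False

outbox = []

# Transient outage: the third attempt succeeds.
broker = FlakyBroker(failures=2)
print(publish_with_retry(broker, "user.created", outbox))  # True

# Prolonged outage: the message is stored for a later retry instead of lost.
down = FlakyBroker(failures=99)
print(publish_with_retry(down, "order.placed", outbox))  # False
print(outbox)  # ['order.placed']
```

Either way, the message eventually reaches the broker: immediately on a successful retry, or later when something replays the outbox.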
Tracing and Debugging
In a monolithic architecture you can debug and step through pretty much all of the code. In a microservice architecture running hundreds of services this isn't really feasible, especially if you only have the source code for one service. This makes logging very important. Log data as it comes in (unmodified), log it before it leaves (modified), and log the messages that you are sending; the messages will help you track a request through each of the services. However, your logs could be distributed across multiple servers, so reading them would be a pain. A popular solution to this is the Elastic Stack (formerly the ELK stack): Elasticsearch, Logstash, and Kibana. Elasticsearch is the database where the logs and messages get sent for storage. Logstash is a logging server that allows you to parse, filter, and manipulate the logs before they get sent to Elasticsearch. Kibana is a data visualization service that allows you to query and aggregate the logs to make them more meaningful. The Elastic Stack also has Beats, which you can use to insert data directly into Elasticsearch without Logstash, though you lose the transformation capabilities.
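One common way to make those distributed logs traceable is to attach the same correlation id to the incoming data, the outgoing message, and every log line, and to log in a structured format that Logstash can parse. A minimal sketch, assuming JSON-formatted logs (the `log_event` helper, the field names, and the "orders" service are all made up for illustration):

```python
import json
import logging
import uuid

logger = logging.getLogger("orders")
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(message)s"))
logger.addHandler(handler)
logger.setLevel(logging.INFO)

def log_event(stage, correlation_id, data):
    # One JSON object per line: trivial for Logstash to parse and
    # for Kibana to filter by correlation_id.
    record = json.dumps({
        "service": "orders",
        "stage": stage,  # e.g. "received" or "published"
        "correlation_id": correlation_id,
        "data": data,
    })
    logger.info(record)
    return record

# The same id travels with the request and the message it produces,
# so you can follow one request across every service's logs.
correlation_id = str(uuid.uuid4())
incoming = {"item": "book", "qty": 2}
log_event("received", correlation_id, incoming)    # unmodified, as it came in
outgoing = {**incoming, "price": 12.50}
log_event("published", correlation_id, outgoing)   # modified, before it leaves
```

With that in place, a single Kibana query on `correlation_id` pulls up the request's full journey, regardless of which server each log line landed on.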