You’ve decided to go with ELK to centralize and manage your logs. Wise decision.
The ELK Stack is now the world’s most popular log management platform, with millions of downloads per month. The platform’s open source roots, scalability, speed, and high availability, as well as the huge and ever-growing community of users, are the reasons that drove you to this decision.
But before you go ahead and install Elasticsearch, Logstash, Kibana, and the different Beats, there is one crucial question that you need to answer: Are you going to run the stack on your own, or are you going to opt for a cloud-hosted solution?
The simple answer is: It all boils down to time and money. When contemplating whether to invest your valuable resources in running ELK on your own, you must ask yourself whether you have what it takes to pull it off.
To help guide your decision making, let's look at the variables you need to consider: installation and shipping, parsing, mapping, scaling, performance tuning, data retention and archiving, handling upgrades, infrastructure, security, and the open source path. These reflect what a production ELK deployment needs to include, based on our own extensive experience with ELK and that of our customers. They also rest on the assumption that you are starting from scratch and require a scalable, highly available, and at least medium-sized ELK deployment.
Installation and Shipping
Installing ELK is usually hassle-free. Getting up and running with your first instances of Elasticsearch, Logstash, Kibana, and Beats (usually Filebeat or Metricbeat) is pretty straightforward, and there is plenty of documentation available if you encounter issues during installation. However, connecting the dots is not always error-free. Depending on whether you install the stack on local, cloud, or hybrid infrastructure, you may encounter various configuration and networking issues. Kibana not connecting with Elasticsearch, Kibana failing to fetch mappings, and Logstash not running or not shipping data are all-too-frequent occurrences.
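When Kibana cannot reach Elasticsearch, the first place to look is usually the connection settings in kibana.yml. As a rough illustration (the hostnames and ports below are placeholders, not recommended values for your deployment):

```yaml
# kibana.yml -- example connection settings (hostnames/ports are placeholders)
server.port: 5601
server.host: "0.0.0.0"
# Kibana must be able to reach Elasticsearch at this address; a wrong host
# or a firewalled port is a common cause of "Kibana not connecting" errors.
elasticsearch.hosts: ["http://localhost:9200"]
```

On a hybrid or cloud setup, the Elasticsearch address here often needs to be an internal DNS name or private IP rather than localhost, which is exactly where many of the networking issues mentioned above creep in.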
Once you’ve dealt with those issues, you need to establish a pipeline into the stack. This pipeline will greatly depend on the type of logs you want to ingest and the type of data source from which you are pulling the logs. You could be ingesting database logs, web server logs, or application logs. The logs could be coming in from a local instance, AWS, Docker, or Kubernetes. Most likely, you will be pulling data from multiple, distributed sources. Configuring the various integrations and pipelines in Logstash can be complicated and extremely frustrating, and configuration errors can bring down your entire logging pipeline.
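To make the idea of such a pipeline concrete, here is a minimal sketch of a Logstash configuration that receives logs from Filebeat, parses them, and forwards them to Elasticsearch. The port, index name, and grok pattern are illustrative assumptions, not a drop-in config for your environment:

```
# logstash.conf -- minimal illustrative pipeline (port/index name are placeholders)
input {
  beats {
    port => 5044    # Filebeat's default output port for Logstash
  }
}

filter {
  grok {
    # Parse combined-format web server access logs into structured fields;
    # a pattern that doesn't match your logs is a classic source of
    # pipeline errors and _grokparsefailure tags.
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "web-logs-%{+YYYY.MM.dd}"
  }
}
```

Each additional data source typically means another input or another filter branch, which is how these configurations grow complicated over time.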
It’s one thing to ship the logs into the stack. It’s another thing entirely to have them actually mean something. When trying to analyze your data, you need your messages to be structured in a way that makes sense.
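To see what "structured in a way that makes sense" means in practice, here is a small Python sketch that turns a raw web server access-log line into a structured record. The regular expression assumes the combined log format and is an illustration of the parsing step, not production-grade parsing:

```python
import re

# Combined access-log format, e.g.:
# 127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326
LOG_PATTERN = re.compile(
    r'(?P<client_ip>\S+) \S+ (?P<user>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" '
    r'(?P<status>\d{3}) (?P<bytes>\d+)'
)

def parse_access_log(line: str):
    """Turn a raw access-log line into a structured record, or None on mismatch."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Typed fields make aggregations (status-code counts, byte sums) possible.
    record["status"] = int(record["status"])
    record["bytes"] = int(record["bytes"])
    return record

line = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'
print(parse_access_log(line))
```

A raw string like the one above is nearly useless for querying; the structured record, with its named and typed fields, is what lets Kibana filter by status code or sum response sizes.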