The Cost of Doing the ELK Stack on Your Own


These are just a few examples of the grunt work that is required to maintain your own ELK deployment. Again, it is totally doable — but it can also be very resource-consuming.

Data retention and archiving

What happens to all of the data once it is ingested into Elasticsearch? Indices pile up and, if left unattended, will eventually cause Elasticsearch to crash and lose your data. If you are running your own stack, you can either scale up or remove old indices. Of course, doing this by hand in large deployments is not an option, so you will need to use Elasticsearch Curator or set up cron jobs to handle the cleanup.
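
If you go the cron route, the cleanup logic itself is short. Here is a minimal sketch in Python using the official elasticsearch-py client; it assumes daily time-based indices named like logstash-YYYY.MM.DD, and the retention window, prefix, and cluster address are all illustrative.

```python
# delete_old_indices.py -- minimal retention sketch (run daily from cron).
# Assumes daily indices named like "logstash-2017.08.15"; adjust the
# prefix and retention window to your own deployment.
from datetime import datetime, timedelta

from elasticsearch import Elasticsearch

RETENTION_DAYS = 14           # illustrative retention window
PATTERN = "logstash-"         # illustrative index name prefix
DATE_FORMAT = "%Y.%m.%d"      # Logstash's default date suffix

es = Elasticsearch("http://localhost:9200")
cutoff = datetime.utcnow() - timedelta(days=RETENTION_DAYS)

# List matching indices and delete any whose date suffix is past the cutoff.
for name in es.indices.get(index=PATTERN + "*"):
    try:
        index_date = datetime.strptime(name[len(PATTERN):], DATE_FORMAT)
    except ValueError:
        continue  # skip indices that do not follow the naming scheme
    if index_date < cutoff:
        es.indices.delete(index=name)
        print(f"deleted {name}")
```

Curator wraps this same delete-by-age logic in a configuration-driven tool, which tends to be the saner choice once you have more than one retention policy to enforce.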

Curation is quickly becoming a de facto compliance requirement, so you will also need to figure out how to archive logs in their original formats. Archiving to Amazon S3 is the most common solution, but this again costs more time and money to configure and execute. Cloud-hosted ELK solutions such as our Logz.io platform provide this service as part of the bundle.
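
To make the archiving step concrete, here is a hedged sketch of driving Elasticsearch's snapshot API from the same Python client. It assumes the repository-s3 plugin is installed and that an S3-backed snapshot repository has already been registered; the repository and snapshot names below are made up for illustration.

```python
# snapshot_to_s3.py -- archive indices via the snapshot API (sketch).
# Assumes the repository-s3 plugin is installed and an S3-backed snapshot
# repository has already been registered; names are illustrative.
from datetime import datetime

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

REPO = "s3_archive"  # hypothetical repository registered against an S3 bucket
snapshot_name = "logs-" + datetime.utcnow().strftime("%Y.%m.%d")

# Snapshot the matching indices; wait_for_completion blocks until done.
es.snapshot.create(
    repository=REPO,
    snapshot=snapshot_name,
    body={"indices": "logstash-*", "include_global_state": False},
    wait_for_completion=True,
)
```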

Handling upgrades

Handling an ELK Stack upgrade is one of the biggest issues you must consider when deciding whether to deploy ELK on your own. In fact, upgrading a large ELK deployment in production is such a daunting task that you will find plenty of companies still running extremely old versions.

When upgrading Elasticsearch, making sure that you do not lose data is the top priority, so you must pay attention to replication and data synchronization while upgrading one node at a time. Good luck with that if you are running a multi-node cluster! And this incremental approach is not even an option when upgrading to a major version, which requires a full cluster restart.
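
To give a feel for the bookkeeping involved, here is a rough outline of the per-node choreography of a rolling upgrade, sketched with the Python client. The package upgrade itself is stubbed out as a hypothetical helper, and the exact settings vary between Elasticsearch versions.

```python
# rolling_upgrade.py -- per-node choreography of a rolling upgrade (outline).
# Exact settings vary between Elasticsearch versions; treat this as a sketch.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

def stop_upgrade_and_restart(node_host: str) -> None:
    # Hypothetical stub: stop the Elasticsearch service on the node, upgrade
    # the package, and start it again (SSH plus your package manager, or
    # whatever your deployment uses). Deployment-specific, so unimplemented.
    raise NotImplementedError

def upgrade_node(node_host: str) -> None:
    # 1. Disable shard reallocation so the cluster does not start
    #    rebalancing shards while the node is down.
    es.cluster.put_settings(
        body={"transient": {"cluster.routing.allocation.enable": "primaries"}}
    )

    # 2. Upgrade the node itself (stubbed above).
    stop_upgrade_and_restart(node_host)

    # 3. Re-enable allocation and wait until the cluster is green again
    #    before touching the next node.
    es.cluster.put_settings(
        body={"transient": {"cluster.routing.allocation.enable": "all"}}
    )
    es.cluster.health(wait_for_status="green")

for host in ["es-node-1", "es-node-2", "es-node-3"]:  # illustrative hosts
    upgrade_node(host)
```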

Upgrading Kibana can be a serious hassle as well, with plugins breaking and visualizations sometimes needing total rewrites.

Infrastructure

Think big. As your business grows, more and more logs are going to be ingested into your ELK Stack. This means more servers, more network usage, and more storage. The overall amount of computing resources needed to process all of this traffic can be substantial.

Log management systems consume huge amounts of CPU, network bandwidth, disk space, and memory. Sporadic data bursts are a frequent phenomenon as well: when an error takes place in production, your systems can suddenly generate a large volume of logs, so capacity allocation needs to follow suit. The underlying infrastructure needed can amount to hundreds of thousands of dollars per year.
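
A back-of-the-envelope sizing calculation shows why the bill grows so quickly. The figures below are purely illustrative; plug in your own ingest rate, retention window, and replication settings.

```python
# capacity_estimate.py -- back-of-the-envelope storage sizing (illustrative).
daily_ingest_gb = 100    # raw logs ingested per day (example figure)
retention_days = 30      # how long indices are kept online
replicas = 1             # one replica doubles the storage footprint
index_overhead = 1.2     # rough factor for index structures vs. raw size
burst_headroom = 1.5     # spare capacity for error-driven log bursts

storage_gb = (daily_ingest_gb * retention_days
              * (1 + replicas) * index_overhead * burst_headroom)
print(f"required cluster storage: ~{storage_gb:,.0f} GB")
# -> required cluster storage: ~10,800 GB
```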

Security

In most cases, your log data is likely to contain sensitive information about your business, your customers, or both. Just as you expect your data to be kept safe, so do your customers. As a result, security features such as authentication and authorization are a must, both to protect the logs coming into your ELK Stack specifically and your business in general.

The problem is that the open source ELK Stack does not provide easy ways to implement enterprise-grade data protection. Ironically, ELK is used extensively for PCI compliance and SIEM, yet it does not include proper security functionality out of the box. If you are running your own stack, your options are not great. You can opt for the basic security features provided, but these are limited in scope and lack advanced capabilities such as LDAP/AD support, SSO, and encryption at rest. Or you can try to hack together your own solution, but as far as I know, there is no easy and fast way to do that.
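
One quick way to see the problem: a default open source Elasticsearch node will answer any HTTP request on port 9200 without credentials. A minimal probe, assuming the node is reachable at the default address:

```python
# check_open_cluster.py -- probe whether a cluster accepts unauthenticated
# requests (a default open source deployment typically will).
import requests

resp = requests.get("http://localhost:9200", timeout=5)
if resp.ok:
    # No credentials were sent, yet the cluster answered with its banner:
    # anyone who can reach this port can read (and delete) your data.
    print("cluster is open:", resp.json().get("cluster_name"))
else:
    print("request rejected with status", resp.status_code)
```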

Open Source Path

Elasticsearch, Kibana, and the rest of the ELK Stack components have been open source software (OSS) projects since their inception, distributed under the Apache 2.0 license. This provides clear OSS benefits such as avoiding vendor lock-in, future-proofing, and the freedom to reuse, modify, and adapt the code to your needs.
