Another example is Blackboard, the leader in education technology serving K–12 education, business, and government clients. Blackboard’s product development team used log analytics to monitor cloud deployments of the company’s SaaS-based learning management system (LMS) and to troubleshoot application issues. But when COVID-19 hit, millions of students switched to online learning and log volumes skyrocketed: product usage grew by 3,000% in 2020 when the world went virtual. The company’s custom-managed ELK (Elasticsearch, Logstash, Kibana) stacks and its managed Elasticsearch service for centralized log management couldn’t support the new log volumes, at the very time when that log data was most valuable. The Blackboard team needed to analyze short-term data for troubleshooting as well as long-term data for deeper analysis and compliance purposes. The team moved its log data to a data lake platform that runs directly on Amazon S3 and serves analytics to end users via Kibana, which is included natively. The company now has day-to-day visibility into its cloud computing environments at scale, app troubleshooting and alerting over long periods of time, root cause analysis without data retention limits, and fast resolution of application performance issues.
Now We’re Cooking
Cloud storage has the potential to truly democratize data analytics for businesses. There’s no better or more cost-effective place to store a company’s treasure trove of information. The trick is unlocking cloud object storage for analytics without data movement or pipelining. Many data lake, warehouse, and even lakehouse providers have the right idea, but their underlying architectures are based on 1970s computer science, making the process brittle, complex, and slow.
If you are developing or implementing a data lake and want to avoid building a swamp, ask yourself these questions:
- What business use cases or analytics questions should we be able to address with the data lake?
- How will data get into the data lake?
- How will users across the organization get access to the data in the lake?
- What analytics tools need to be connected to the data lake to facilitate the democratization of insights?
It is important to find a platform that lets you turn up the heat in the data lake: one that is cost-effective, elastically scalable, fast, and easily accessible. A winning solution allows business analysts to query all the data in the data lake using the BI tools they know and love, without any data movement, transformation, or governance risk.