Varada, the data lake query acceleration provider, announced it has open-sourced its Workload Analyzer for Presto, covering both Trino (formerly known as PrestoSQL) and PrestoDB, and has made the source code available to everyone on GitHub.
The Workload Analyzer is a free, easy-to-use tool that shows how big data and analytics workloads are performing and gives users insights into how to improve performance and optimize resources.
“Presto democratized big data, exponentially expanding the number of business users that can ask questions to a Big Data infrastructure and enlarging the number of underlying data sources they can query,” said Ori Reshef, vice president of products at Varada. “But as the number of users within an organization grows, the challenge of DataOps teams is to keep queries running quickly, delivering results in a timely way so that those users can do their jobs. Unfortunately, DataOps teams are only able to get bits and pieces of the information they need to optimize resources from Presto itself. So Varada built the Workload Analyzer to give DataOps teams deep and actionable insights.”
The Workload Analyzer collects details and metrics on every query, aggregates and extracts information, and delivers dozens of charts describing all the facets of cluster performance.
The Workload Analyzer script runs safely within the Presto cluster in the user’s Virtual Private Cloud (VPC), collecting and analyzing per-query statistics stored as JSON. No data leaves the cluster, and the tool does not require any external resources.
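To make the idea of aggregating per-query JSON statistics concrete, here is a minimal Python sketch. It is not the Workload Analyzer's code: the record fields (`user`, `create_time`, `cpu_seconds`) and the `summarize` helper are hypothetical names chosen for illustration, and the real tool's schema may differ. It only shows how CPU time could be rolled up by hour of day and by user.

```python
import json
from collections import defaultdict
from datetime import datetime

# Hypothetical per-query records; the field names are assumptions for this
# sketch, not the schema used by Presto or the Workload Analyzer.
SAMPLE_LINES = [
    '{"user": "alice", "create_time": "2021-03-01T09:15:00", "cpu_seconds": 120.0}',
    '{"user": "bob", "create_time": "2021-03-01T09:40:00", "cpu_seconds": 30.0}',
    '{"user": "alice", "create_time": "2021-03-01T14:05:00", "cpu_seconds": 50.0}',
]


def summarize(records):
    """Total CPU seconds per hour of day, plus users ranked by CPU use."""
    cpu_by_hour = defaultdict(float)
    cpu_by_user = defaultdict(float)
    for rec in records:
        started = datetime.fromisoformat(rec["create_time"])
        cpu_by_hour[started.hour] += rec["cpu_seconds"]
        cpu_by_user[rec["user"]] += rec["cpu_seconds"]
    # Heaviest consumers first.
    top_users = sorted(cpu_by_user.items(), key=lambda kv: kv[1], reverse=True)
    return dict(cpu_by_hour), top_users


records = [json.loads(line) for line in SAMPLE_LINES]
cpu_by_hour, top_users = summarize(records)
print(cpu_by_hour)
print(top_users)
```

A per-hour total of this kind could inform scaling rules, and the ranked user list is one simple way to spot heavy consumers.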
Using the Workload Analyzer, data teams can:
- Learn how resources are used on an hourly and weekly basis and define scaling rules
- Identify heavy spenders and improve the pipeline
- Improve predicate pushdown and significantly reduce IO and CPU
- Identify “hottest” data
- Improve JOIN performance
- Provide a better production roll-out experience and identify upgrade risks upfront
For more information about this news, visit https://varada.io/.