Deci Unveils Latest LLM, Sets New Benchmarks in Accuracy

Deci, the deep learning company harnessing AI to build AI, is adding a large language model, DeciLM-7B, to its suite of innovative generative AI models, setting new benchmarks in accuracy and efficiency.

“With the increasing use of generative AI in various business sectors, there's a growing demand for models that are not only highly performant but also operationally cost-efficient,” said Yonatan Geifman, CEO and co-founder of Deci. “Our latest innovation, DeciLM-7B, combined with Infery-LLM, is a game-changer in this regard. It's adaptable to diverse settings, including on-premise solutions, and its exceptional inference efficiency makes high-quality large language models more accessible to a wider range of users.”

Building upon the success of its predecessor, DeciLM 6B, DeciLM-7B stands out for its performance, surpassing open-source language models of up to 13 billion parameters in both accuracy and speed while demanding less compute, according to the company.

It achieves 1.83x and 2.39x higher throughput than Mistral 7B and Llama 2 7B, respectively, translating to significantly faster processing than competing models. Its compact design also makes it well suited to cost-effective GPUs, striking a balance between affordability and high-end performance.

The performance of DeciLM-7B can be further accelerated when paired with Infery-LLM, Deci’s inference engine, which the company describes as the world’s fastest, designed to deliver high-throughput, low-latency, and cost-effective inference on widely available GPUs.

According to the company, this powerful duo sets a new standard in throughput performance, achieving speeds 4.4 times greater than Mistral 7B running on vLLM without sacrificing quality. Leveraging DeciLM-7B in conjunction with Infery-LLM enables teams to drastically reduce their LLM compute expenses while benefiting from quicker inference times. This integration facilitates the efficient scaling of generative AI workloads and supports the transition to more cost-effective hardware.

This synergy enables the efficient serving of multiple clients simultaneously without excessive compute costs or latency issues.

This is especially crucial in sectors such as telecommunications, online retail, and cloud services, where the ability to respond to a massive influx of concurrent customer inquiries in real time can significantly enhance user experience and operational efficiency, according to the company.

Licensed under Apache 2.0, DeciLM-7B is available for use and deployment anywhere, including local setups, enabling teams to fine-tune it for specific industry applications without compromising on data security or privacy. Its versatility allows teams to easily tailor it for unique use cases across a wide range of business applications, including content creation, translation, conversation modeling, data categorization, summarization, sentiment analysis, and chatbot development, among others.

With DeciLM-7B, companies can now leverage the full potential of AI without the prohibitive costs or complexities previously associated with high-end language models, according to the company.

For more information about this news, visit