100x More Efficient.
100% in Your Cloud.
Welcome to the age of Serverless AI. The latest features in this release improve efficiency by 100x, further driving acceleration and strengthening guardrails for generative AI responses, paving the way for business-user-led innovation in the enterprise.

100x more efficient Serverless Vector Databases

Vector Databases are a key component of Retrieval Augmented Generation (RAG) use cases, but they can be extremely expensive, often eclipsing the cost of the Large Language Model (LLM) itself. Why? Because the Vector Database sticks around, costing you money, whether or not you are actively using it. While some Vector DB vendors have announced serverless versions, these run in a shared environment outside the customer's control. Aible now includes a serverless Vector Database built on open-source ChromaDB that runs fully in the customer's own cloud. Because it is serverless, it is 100 times more cost efficient for most use cases. Because it runs fully in the customer's cloud under the customer's control, it is more secure. And because it is fully integrated into the Aible experience, users do not need any data science skills to apply this technology to their genAI use cases.
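To make the role of the Vector Database concrete, here is a minimal sketch of the retrieval step in RAG. This is purely illustrative and is not Aible's implementation: the "embeddings" are hand-made 3-d vectors standing in for a real embedding model, and the store is a plain in-memory dictionary rather than ChromaDB.

```python
import math

# Toy RAG retrieval: document chunks are stored as embedding vectors, and a
# query is answered by returning the most similar stored chunks, which are
# then handed to the LLM as context.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical document chunks with hand-made embeddings.
store = {
    "invoice policy":    [0.9, 0.1, 0.0],
    "refund policy":     [0.8, 0.2, 0.1],
    "office lunch menu": [0.0, 0.1, 0.9],
}

def top_k(query_vec, k=2):
    ranked = sorted(store.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [name for name, _ in ranked[:k]]

# A billing-like query vector retrieves the two policy chunks, not the menu.
print(top_k([0.85, 0.15, 0.05]))
```

The key cost observation is that the `store` must persist between queries, which is why an always-on vector database bills you around the clock, while a serverless one only bills during a lookup.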


Automated Caching for all types of questions

Our Valentine’s Day release is all about delivering on customer feature requests. Ever since we introduced our comprehensive logging capabilities, customers kept noticing duplicative questions from different users in their logs and asked us: can’t Aible figure out that it has already answered a question and avoid incurring additional LLM, Vector DB, and analysis costs?

We introduced such caching for Natural Language use cases last Halloween. Now caching works for all kinds of questions: Aible detects either exact or conceptual matches for user questions and retrieves the answer from cache whenever possible. The organization saves a significant amount of money, while the user gets an instantaneous answer. Users can easily adjust the cache settings or turn caching off for specific questions.


Serverless Small Models that don’t need GPUs to operate

Multiple projects have demonstrated that small models like Mistral and Llama 2, when fine-tuned for specific use cases, perform very well compared to large models such as OpenAI's GPT models and Google's Gemini.

Some organizations prefer small models because they can run fully within their own cloud instead of being hosted by the cloud provider. Unfortunately, the operating cost benefits of these models disappear once you need a hefty GPU-powered server to run them. For GCP customers, Aible now runs small models like Mistral in quantized form, serverlessly, without requiring GPUs. This not only significantly reduces the operating cost per chat, but also reduces the overall cost by more than 100x compared to running such small models on dedicated servers. This feature is currently available only on GCP for a subset of models such as Mistral and Llama. We expect to add more clouds and models soon.
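The back-of-envelope arithmetic behind the dedicated-vs-serverless comparison looks like this. All prices and volumes below are hypothetical round numbers for illustration; actual cloud pricing and usage vary.

```python
# Dedicated GPU server: you pay for every hour, whether or not anyone chats.
gpu_server_per_hour = 1.20             # hypothetical GPU instance price, $/hour
hours_per_month = 24 * 30
dedicated_monthly = gpu_server_per_hour * hours_per_month   # $864/month

# Serverless CPU inference of a quantized small model: you pay per request.
cost_per_chat = 0.002                  # hypothetical $/chat on serverless CPU
chats_per_month = 3_000                # a lightly used internal app
serverless_monthly = cost_per_chat * chats_per_month        # $6/month

print(f"dedicated:  ${dedicated_monthly:.2f}/month")
print(f"serverless: ${serverless_monthly:.2f}/month")
print(f"ratio:      {dedicated_monthly / serverless_monthly:.0f}x")
```

The point of the sketch is structural, not the specific prices: a dedicated server's cost is fixed per month while serverless cost scales with usage, so for intermittent workloads the ratio easily exceeds 100x.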


One-Click Fine Tuning

As we worked on our Small Model capabilities, we quickly realized that there was a night-and-day difference between the generic versions of such models and models fine-tuned on even hundreds or thousands of examples at a cost of about $100. But model fine tuning requires significant data science expertise today. So Aible automated the process end to end: collecting the fine-tuning data, setting the correct fine-tuning parameters, running the fine tuning automatically, and making the fine-tuned model available as a serverless option in Aible. Essentially, users just provide feedback with a thumbs up/down or by editing chat responses; once enough data has been collected, they just click a button.

No other expertise is required. Of course, expert users can set the fine-tuning parameters themselves if they want. This feature is currently available only on GCP for a subset of models such as Mistral and Llama. We expect to add more clouds and models soon.
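The data-collection step described above can be sketched as turning a chat feedback log into training records. This is illustrative only: the feedback fields and the generic prompt/completion JSONL shape are assumptions for the sketch, not a documented Aible format.

```python
import json

# Hypothetical feedback log: each entry records the prompt, the model's
# response, an optional thumbs rating, and an optional user edit.
feedback_log = [
    {"prompt": "Summarize the Q4 report", "response": "Sales grew 12%...",
     "rating": "up", "edited": None},
    {"prompt": "Draft a refund email", "response": "Dear valued customer...",
     "rating": "down", "edited": None},
    {"prompt": "List top suppliers", "response": "Acme, Globex",
     "rating": None, "edited": "Acme Corp, Globex Inc, Initech"},
]

def to_training_records(log):
    records = []
    for item in log:
        if item["edited"]:
            # A user correction is the best supervision signal: train on it.
            records.append({"prompt": item["prompt"], "completion": item["edited"]})
        elif item["rating"] == "up":
            # A confirmed-good answer is kept as a positive example.
            records.append({"prompt": item["prompt"], "completion": item["response"]})
        # Thumbs-down with no edit gives no target output, so it is skipped.
    return records

records = to_training_records(feedback_log)
for r in records:
    print(json.dumps(r))  # one JSONL line per training example
```

Once enough such records accumulate, they form the fine-tuning dataset that the one-click flow submits with preset parameters, which is why everyday feedback is all end users need to contribute.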


Aible Anywhere

Our enterprise customers are looking for tailored AI and analytics experiences that support end-to-end business process workflows, rather than monolithic or piecemeal applications. They want to transform end-to-end processes such as Order to Cash and Procure to Pay by leveraging AI to reduce risk and friction.

Aible Anywhere stitches together a complete set of composable serverless AI capabilities, securely in the customer's cloud account, to enable real end-to-end enterprise use cases. The custom solution combines multiple AI systems and components, covering both Generative AI (including Q&A and chat functionality, document creation, summarization of structured and unstructured data, and the ability to synthesize and combine information) and Classical AI (including augmented data engineering, analytics, DSML, scenario planning, and model monitoring). This capability is only available to enterprise customers.
