Unleash the Power of GenAI on a Budget with Model Customisation

By leveraging an existing, pre-trained model and customising it to suit your needs, you will avoid the significant upfront costs associated with model design.

The Rise of Generative AI in HPC

Generative AI is the new buzzword in the High-Performance Computing (HPC) and Artificial Intelligence (AI) field. And for good reason!

Be it new content generation, summarisation, classification… the potential use cases are endless. Startups and businesses are all jumping on the opportunity to gain a competitive advantage and improve the relevance of their products.

Hyperion Research recently issued its bi-annual HPC market study at SC’23 in Denver. They find that the growth in compute consumption associated with Large Language Models in particular (a type of Generative AI model) eclipses the growth of the traditional HPC market since its inception in the 1950s.

For people discovering AI and its potential, the concept of implementing a new model can seem quite daunting. In our blog post entitled “Mastering Generative AI: A Strategic Roadmap for Your Business”, we covered how to approach AI from a business strategy perspective.

But what happens next? Does initiating the journey imply that you will spend millions on model development and compute costs to train your AI model?

Of course, you can if you’d like! The good news is: you might not need to. There are ways of approaching AI in an easy, cost-effective, and pain-free way.

The Power of Foundational Models

AI models are programs that analyse data and make predictions. For a model to work, it first needs to be trained on (often large) datasets. Once this phase is complete, the model can be used in production using live data. This is called inference.
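To make the two phases concrete, here is a minimal sketch in Python. It is a toy illustration, not how a generative model is built: the "model" is just a straight line fitted to four made-up data points, and the function names `train` and `infer` are our own.

```python
from typing import List, Tuple

Model = Tuple[float, float]  # (slope, intercept) of a fitted line


def train(xs: List[float], ys: List[float]) -> Model:
    """Training: fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def infer(model: Model, x: float) -> float:
    """Inference: apply the trained model to a new, unseen input."""
    slope, intercept = model
    return slope * x + intercept


if __name__ == "__main__":
    # Made-up training data following y = 2x (illustrative only).
    model = train([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
    print(infer(model, 5.0))
```

The expensive part is `train`, which has to look at the whole dataset; `infer` only reuses what was learned. The same split applies to large models, just at a vastly different scale.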

Some models, called foundational models, have been built on huge datasets and have a far greater impact on society than others. Usually created by leading organisations like OpenAI, foundational models are adaptable and capable of performing tasks applicable to a broad set of use cases. GPT and its subsequent versions (such as GPT-3 or GPT-4) are good examples. They are large foundational models capable of writing high-quality text (blog posts, essays, code…) based on a wide range of inputs (e.g. text or, in the case of GPT-4, images).

Foundational models are, by definition, used as fundamental building blocks to create applications. OpenAI’s ChatGPT chat bot, famous for having human-like conversations of staggering quality, is a prime example of an application leveraging the GPT foundational models.

Creating an AI Model from Scratch

The arduous, expensive way of reaching your AI goals might be to train your own model from scratch. Whilst the resulting AI model might not become a new AI paradigm such as GPT, you would have full control over its development, from the capture of data all the way to production.

Figure 1: Key steps towards the creation of an AI model from scratch

As such, it would meet your (and your customers’) exact needs. It could even be sold to other companies or researchers seeking to create their own version of the model! Such an AI strategy sometimes brings tremendous value to a business. OpenAI is a prime example: by creating state-of-the-art foundational models, it is set to be valued at a mind-boggling $80B.

As you would expect though, reaching such remarkable milestones comes at a dear cost. If you wish to create your own foundational model, expect computing expenditure in the millions, person-years of research & development, and a runway of a good 6 months to 3 years. As an example, analysts estimate that training GPT costs $10M to $20M in compute alone.

Whilst this kind of investment may be the best way to differentiate and create new revenue streams, it is also the most expensive.

Leveraging Existing AI Models

The easiest and fastest way for a startup to use AI is to use an existing model. Many free options exist, although most are based on a consumption model (e.g. a fee charged per inference). This approach is usually the first step towards the creation of a proof of concept, maybe even a minimum viable product.
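For a rough feel of what a consumption model means for your budget, the arithmetic can be sketched in a few lines of Python. The request volume and the fee below are hypothetical placeholders, not quotes from any provider; substitute your own figures.

```python
def monthly_cost(requests_per_day: int, fee_per_request: float, days: int = 30) -> float:
    """Estimate the monthly bill under a simple pay-per-inference model."""
    return requests_per_day * fee_per_request * days


if __name__ == "__main__":
    # Hypothetical figures: 1,000 requests a day at $0.002 per request.
    estimate = monthly_cost(requests_per_day=1000, fee_per_request=0.002)
    print(f"Estimated monthly cost: ${estimate:.2f}")
```

Real pricing is often tiered or based on the amount of text processed rather than on a flat fee per request, so treat this as a starting point for a proof-of-concept budget only.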

As you would expect, there are caveats. Would you like your brand-new customer support bot to start sharing its views on politics or other controversial topics? Probably not! Chances are, you will be looking to make slight adjustments to a foundational model, so that it caters for your specific needs. With this approach, a generic chat bot could be trained to learn domain-specific knowledge and exclude topics outside the scope of its functional domain. Typical customisation techniques include:

Fine-tuning: Re-train an existing model with additional domain-specific data to increase the model’s proficiency in a particular topic
Prompt engineering: Condition an existing model by ensuring the inputs are focused on a particular task
Model control: Make use of guard rails to safeguard interactions with the model
Reinforcement Learning from Human Feedback (RLHF): Continuously improve the model using feedback as the model is used in production

Figure 2: A few customisation techniques for generative AI and large language models
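Two of these techniques, prompt engineering and model control, are simple enough to sketch in plain Python. The example below is a minimal illustration for the customer support bot mentioned above, under our own assumptions: the instruction text, the blocked-topic list, and the `echo_model` stand-in are all hypothetical, and a production guard rail would be considerably more robust than a keyword match.

```python
from typing import Callable

# Hypothetical scope rules for a customer support bot (illustrative only).
BLOCKED_TOPICS = ("politics", "religion", "election")

SYSTEM_INSTRUCTIONS = (
    "You are a customer support assistant. "
    "Answer only questions about our products."
)

REFUSAL = "Sorry, I can only help with questions about our products."


def build_prompt(user_message: str) -> str:
    """Prompt engineering: wrap the user's input in task-focused instructions."""
    return f"{SYSTEM_INSTRUCTIONS}\n\nCustomer: {user_message}\nAssistant:"


def is_in_scope(user_message: str) -> bool:
    """Guard rail: reject messages that mention a blocked topic."""
    lowered = user_message.lower()
    return not any(topic in lowered for topic in BLOCKED_TOPICS)


def answer(user_message: str, model: Callable[[str], str]) -> str:
    """Apply the guard rail first, then send the conditioned prompt to the model."""
    if not is_in_scope(user_message):
        return REFUSAL
    return model(build_prompt(user_message))


def echo_model(prompt: str) -> str:
    """Stand-in for a real model call; it only reports the prompt length."""
    return f"(model received {len(prompt)} characters)"


if __name__ == "__main__":
    print(answer("How do I reset my password?", echo_model))
    print(answer("What do you think about the election?", echo_model))
```

Neither technique touches the model itself, which is why they tend to be the cheapest place to start. Fine-tuning and RLHF change the model's weights and need training data and compute, so they are not shown here.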

Jasper AI, a writing tool aimed at enterprise marketing, is a perfect example of this AI strategy. Jasper AI is built on GPT-3 and has been customised specifically for copywriting. It now serves as the core technology of a business.

By leveraging an existing, pre-trained model and customising it to suit your needs, you will avoid the significant upfront costs associated with model design. Your runway will also be much shorter, as we are now talking about weeks of development rather than months.

Choosing Your Path in AI

Regardless of the path you choose, we would love to hear from you and learn about your project. Not only are we excited to learn about what people are working on, but we can also help you reach AI nirvana.

EscherCloudAI is geared to support companies through their AI journey. Even though AI technologies are undeniably complex, we make it easier. We offer technical assistance and the infrastructure to run AI workloads at scale. And our offering is not just any infrastructure: our servers are based on liquid-cooled NVIDIA A100 GPUs, which are some of the fastest and most energy-efficient hardware available today.

Click here to tell us about yourself! We are looking forward to working with you and getting your models trained and operating in production in record time.

A special thanks to Patrick Wohlschlegel for providing us insight on how to benefit from GenAI on a budget with model customisation.