
Alternatives to Fine-tuning

People commonly reach for fine-tuning when trying to solve their problems with AI. In a recent conversation with an OpenAI Ambassador, I was told that they “convince 90% not to use fine-tuning to solve their problem,” confirming my hunch that fine-tuning is one of the most misunderstood tools in AI.

Fine-tuning is often more complicated and less effective than people realize.

To start, it requires a sufficiently large and properly labeled dataset. Data augmentation techniques like synonym replacement, random insertion, and random deletion can increase the amount of training data artificially, but this still requires a substantial investment.
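To make these techniques concrete, here is a minimal sketch of naive text augmentation in Python. The `augment` function, its parameters, and the toy synonym map are all illustrative assumptions, not a production pipeline:

```python
import random

def augment(sentence, synonyms=None, p_delete=0.1):
    """Naive text augmentation: synonym replacement, random insertion,
    and random deletion. `synonyms` maps a word to possible replacements."""
    synonyms = synonyms or {}
    words = sentence.split()

    # Synonym replacement: swap in a synonym where one is known.
    replaced = [random.choice(synonyms[w]) if w in synonyms else w for w in words]

    # Random insertion: insert a synonym of a random word at a random position.
    candidates = [w for w in words if w in synonyms]
    if candidates:
        word = random.choice(candidates)
        replaced.insert(random.randrange(len(replaced) + 1),
                        random.choice(synonyms[word]))

    # Random deletion: drop each word with probability p_delete.
    kept = [w for w in replaced if random.random() > p_delete]
    return " ".join(kept) if kept else replaced[0]

print(augment("the quick brown fox jumps",
              synonyms={"quick": ["fast", "speedy"]}))
```

Each call produces a slightly different variant of the input sentence, which is how a small labeled dataset gets stretched into a larger one.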

Another potential problem is whether the data fits the intended solution. Fine-tuned models aren’t as flexible as the ChatGPT most people are familiar with. A fine-tuned model is designed to serve a narrow purpose based on your training data. Even ChatGPT, itself a fine-tuned model, has a narrow focus: providing answers for a mainstream audience. So you have to consider whether the training data accurately represents the desired output of the model you’re fine-tuning. If it doesn’t, your fine-tuned solution isn’t likely to perform as expected.


What if you need to classify incoming content as good, bad, relevant, off-topic, important, etc.?

Past machine learning tools required significant overhead in both development time and experience with niche tools to achieve this same outcome. But now there’s an easier solution!

It’s possible to build a simple classifying algorithm with the OpenAI Embeddings API and calculations that can run on any computer or even in a spreadsheet!

The process is quite simple:

  1. Create two or more groups of embeddings based on your available data.
  2. Similarly, get the embedding for the new data you would like to classify.
  3. Compare the embeddings, often using a process known as cosine similarity, to determine which group is most similar to the new data.

Cosine similarity returns a value between -1 and 1, though with OpenAI embeddings the result is almost always between 0 and 1, which translates easily into a percent for interpretation. The higher the similarity, the higher the likelihood that the data belongs to the group.
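The three steps above can be sketched in a few lines of Python. The toy 3-dimensional vectors below stand in for real vectors you would fetch from the OpenAI Embeddings API; the group labels are made up for illustration:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    a, b = np.asarray(a), np.asarray(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def classify(new_embedding, groups):
    """Return the label whose average (centroid) embedding is most
    similar to `new_embedding`, plus the score for every group.
    `groups` maps a label to a list of embeddings."""
    centroids = {label: np.mean(vecs, axis=0) for label, vecs in groups.items()}
    scores = {label: cosine_similarity(new_embedding, c)
              for label, c in centroids.items()}
    return max(scores, key=scores.get), scores

# Toy "embeddings" stand in for vectors from the Embeddings API.
groups = {
    "relevant":  [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "off-topic": [[0.1, 0.9, 0.2], [0.0, 0.8, 0.3]],
}
label, scores = classify([0.85, 0.15, 0.05], groups)
print(label)  # → relevant
```

The same dot-product and magnitude arithmetic can be reproduced in a spreadsheet with `SUMPRODUCT` and `SQRT`, which is why no specialized tooling is required.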


What if you want to ask questions in the context of your existing data? Another common reason people believe they need fine-tuning is to get answers based on their documents.

But what if your data is unstructured, like the files in your Google Drive? No problem: you don’t need fine-tuning for this. This is where combining embeddings with the latest text generation models shines!

This video by David Shapiro does a great job introducing the differences between fine-tuning and semantic search when seeking to integrate your data with AI.

Without getting too technical:

  1. Start by generating “embeddings” for the existing data that will be used as context when the AI responds to your questions.
  2. Use the embeddings to surface the content most relevant to your question before the GPT-3/ChatGPT model receives your prompt.
  3. Insert the results from step #2 into the prompt that’s sent to the GPT-3/ChatGPT model.

By including the search results as context within the prompt, you can instruct the model to return a well-formatted answer to the original question, grounded in that context and including your specific information, all without fine-tuning.
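Here is a minimal sketch of that retrieval step and the prompt assembly. The toy 2-dimensional embeddings stand in for real Embeddings API vectors, and the prompt template is just one plausible way to phrase the instruction:

```python
import numpy as np

def top_k_chunks(question_emb, chunks, k=2):
    """Rank document chunks by cosine similarity to the question embedding
    and return the text of the k most relevant ones."""
    def cos(a, b):
        a, b = np.asarray(a), np.asarray(b)
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    ranked = sorted(chunks, key=lambda c: cos(question_emb, c["embedding"]),
                    reverse=True)
    return [c["text"] for c in ranked[:k]]

def build_prompt(question, context_chunks):
    """Insert the retrieved context into the prompt sent to GPT-3/ChatGPT."""
    context = "\n---\n".join(context_chunks)
    return ("Answer the question using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")

# Toy embeddings stand in for real Embeddings API vectors.
chunks = [
    {"text": "Invoices are due within 30 days.", "embedding": [0.9, 0.1]},
    {"text": "Our office is closed on Fridays.", "embedding": [0.1, 0.9]},
]
prompt = build_prompt("When are invoices due?",
                      top_k_chunks([0.95, 0.05], chunks, k=1))
print(prompt)
```

The resulting string is what actually gets sent to the completion model, so the model answers from your documents rather than from its general training data.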

GPT Index is popular open-source software for implementing this pattern.

When to Fine-tune

As mentioned in the above video, there are valid reasons to use fine-tuning. That said, if you don’t already have an existing AI solution, starting with fine-tuning is probably not the best direction.

If you have an existing solution, for example, that uses the Davinci model to answer a very narrow set of questions at high volume, then maybe fine-tuning a Curie model could significantly reduce your costs.

Much more can be written about the use cases for fine-tuning. But before you research the process further, I encourage you to develop your AI system without it. This will make the system easier to deploy and more flexible for future improvement. And when you’re ready, fine-tuning will be there to help.

What’s next?

Do you need help designing AI systems for your business? I can help. Schedule a consultation to learn more!
