What is Underfitting: LLMs Explained




Underfitting occurs in machine learning, including in Large Language Models (LLMs) such as those behind ChatGPT, when a model fails to capture the underlying pattern of the data. This article explains what underfitting is, what causes it, what effects it has, and how it can be mitigated in LLMs.

Underfitting is the opposite of overfitting. Overfitting occurs when a model learns the detail and noise in the training data so closely that its performance on new data suffers. Underfitting occurs when a model is too simple, or too little trained, to capture the underlying structure of the data. An underfit model therefore performs poorly on both the training data and unseen data.

Understanding Underfitting

Underfitting is a term used in statistics and machine learning to describe a situation where a model has not learned enough from the training data, resulting in a poor fit to the data. This can be due to the model being too simple to capture the complexity of the data structure or due to the model not being trained long enough.

When a model underfits, it has high bias and low variance. High bias means that the model makes strong assumptions about the data and misses the relevant relations between features and output predictions. Low variance indicates that the model does not change much with different training data. The result is a model that is oversimplified, with poor performance.
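
As a toy illustration of high bias, the sketch below fits a straight line to data that is actually quadratic. The data, the `target` function, and the helper names are assumptions made up for this example; the point is only that the error stays high on the training set as well as on the test set.

```python
# Toy illustration: a straight line fitted to quadratic data underfits.

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

def mse(predict, xs, ys):
    """Mean squared error of `predict` over the given points."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def target(x):
    return x * x  # the "true" relationship, assumed for this demo

train_x = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
test_x = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]
train_y = [target(x) for x in train_x]
test_y = [target(x) for x in test_x]

a, b = fit_line(train_x, train_y)

def line(x):
    return a * x + b

train_mse = mse(line, train_x, train_y)
test_mse = mse(line, test_x, test_y)
print(f"train MSE={train_mse:.2f} test MSE={test_mse:.2f}")
```

The best possible line here is flat, so no amount of extra data of the same kind would help: the model family itself is too simple for the curve.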

Causes of Underfitting

Underfitting can occur for several reasons. One of the most common causes is a model that is too simple. In the context of LLMs, this could mean a model with too few layers or neurons, or one that otherwise lacks the capacity to represent the nuances of human language.

Another cause of underfitting is insufficient training. If a model is not trained for long enough, it may not have had enough time to learn the patterns in the data. This is particularly relevant for LLMs, which require extensive training to understand and generate human-like text.
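
The effect of stopping too early can be sketched with plain gradient descent on a tiny linear problem. Everything here (the data, the learning rate, the step counts) is an assumption chosen for illustration. The model is capable of fitting the data exactly, yet after only a few steps its training loss is still large.

```python
def train(xs, ys, steps, lr=0.01):
    """Fit y = w*x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    loss = sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n
    return w, b, loss

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]  # exactly linear, so a perfect fit exists

_, _, loss_short = train(xs, ys, steps=5)     # stopped far too early
_, _, loss_long = train(xs, ys, steps=5000)   # trained to convergence

print(f"loss after 5 steps:    {loss_short:.4f}")
print(f"loss after 5000 steps: {loss_long:.2e}")
```

Here the underfit comes from the training budget, not from the model: the same architecture reaches a near-zero loss once it is given enough steps.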

Effects of Underfitting

When a model underfits, it performs poorly on both the training data and on any new, unseen data. This is because it has not learned the underlying structure of the data, and so cannot make accurate predictions.

In the context of LLMs, an underfitted model may produce text that is nonsensical or not in line with the input prompt. It may also struggle with tasks such as text completion, translation, or question answering, as it has not learned the necessary language patterns.

Identifying Underfitting

Identifying underfitting can be challenging, as it requires a clear understanding of the expected performance of the model and the complexity of the data. However, there are some signs that can indicate underfitting.

One clear sign of underfitting is poor performance on the training data. If a model is underfitting, it will not be able to accurately predict the output even for the data it has been trained on. This is different from overfitting, where the model performs well on the training data but poorly on new data.
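
The contrast between the two failure modes can be written down as a rough rule of thumb. The function below is a hypothetical helper, and both thresholds are assumptions that depend entirely on the task; it is a sketch of the reasoning, not a standard diagnostic.

```python
def diagnose(train_loss, val_loss, acceptable_loss, gap_tolerance):
    """Rough heuristic for reading a pair of training/validation losses.

    acceptable_loss: the training loss you would expect from a model that
                     has learned the task (task-specific assumption).
    gap_tolerance:   how far validation loss may exceed training loss
                     before the gap is treated as overfitting (assumption).
    """
    if train_loss > acceptable_loss:
        # The model cannot even fit the data it was trained on.
        return "underfitting"
    if val_loss - train_loss > gap_tolerance:
        # Fits the training data but does not generalize.
        return "overfitting"
    return "reasonable fit"

print(diagnose(train_loss=2.8, val_loss=2.9, acceptable_loss=1.0, gap_tolerance=0.3))
print(diagnose(train_loss=0.2, val_loss=1.5, acceptable_loss=1.0, gap_tolerance=0.3))
print(diagnose(train_loss=0.6, val_loss=0.7, acceptable_loss=1.0, gap_tolerance=0.3))
```

The training loss is checked first because a high training loss already rules out overfitting as the main problem.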

Performance Metrics

Performance metrics can be used to identify underfitting. For classification tasks, metrics such as accuracy, precision, recall, and F1 score can be used. For regression tasks, metrics such as mean squared error, root mean squared error, and R-squared can be used.
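
As a minimal sketch of the classification metrics, the example below scores a degenerate classifier that always predicts the majority class, a typical symptom of a badly underfit model. The labels are invented for illustration, and undefined ratios (no positive predictions) are reported as 0.0 by assumption.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Undefined ratios are reported as 0.0 (a convention chosen here).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Training labels and the predictions of a model that always outputs 0.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Accuracy alone looks tolerable on this imbalanced set, while recall and F1 expose that the model has learned nothing about the positive class. Computing these on the training set, not only on held-out data, is what distinguishes underfitting from overfitting.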

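In the context of LLMs, perplexity is a common metric. Perplexity is the exponential of the average negative log-likelihood a model assigns to the tokens of a sample, so it measures how well the model predicts that sample, and it can be used to compare language models that share the same tokenization. A high perplexity means the model is uncertain of its predictions; when perplexity stays high on the training data as well as on held-out data, that can be a sign of underfitting.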
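
Perplexity can be computed directly from the probabilities a model assigned to the tokens that actually occurred. The sketch below assumes those probabilities are already available as a plain list; the numbers are invented for illustration.

```python
import math

def perplexity(token_probs):
    """exp of the mean negative log-probability of the observed tokens."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# A model that spreads its probability evenly over 4 candidate tokens
# behaves as if it were guessing among 4 options at every step.
uncertain = perplexity([0.25] * 8)

# A model that gives the correct token 90% probability each time.
confident = perplexity([0.9] * 8)

print(f"uncertain model: {uncertain:.3f}")
print(f"confident model: {confident:.3f}")
```

Perplexity can be read as the effective number of choices the model is hesitating between, which is why lower is better and 1.0 is the floor.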


Visualization

Visualizing the model’s predictions against the actual values can also help identify underfitting. If the model is underfitting, the predictions will not align well with the actual values, indicating that the model has not learned the data structure.

For LLMs, visualization might involve examining the generated text. If the text is nonsensical, irrelevant to the prompt, or lacks the expected language patterns, this could indicate underfitting.

Preventing Underfitting

There are several strategies to prevent underfitting in machine learning models, including increasing the complexity of the model, using more features, and increasing the training time.

In the context of LLMs, increasing the complexity might involve using a model with more layers or neurons. More features could include using a larger vocabulary or more context in the input. Increasing the training time allows the model more opportunity to learn the patterns in the data.

Model Complexity

Increasing the complexity of the model can help prevent underfitting. A more complex model has a greater capacity to learn from the data and can capture more complex patterns.

In the context of LLMs, this could involve using a model with more layers or a larger hidden size. For example, GPT-3, the predecessor of the GPT-3.5 models that ChatGPT launched with, has 175 billion parameters, which made it one of the largest language models available when it was released in 2020.
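
To see how layers and width translate into parameters, a common back-of-the-envelope estimate for a decoder-only transformer is 12 × layers × d_model². The sketch below applies it to the published GPT-3 configuration (96 layers, hidden size 12288). The formula is an approximation: it assumes a feed-forward block four times wider than the hidden size and ignores embeddings, biases, and layer norms.

```python
def approx_transformer_params(n_layers, d_model):
    """Rough non-embedding parameter count of a decoder-only transformer.

    Per layer (approximation, biases and layer norms ignored):
      attention: 4 * d_model**2  (query, key, value, output projections)
      MLP:       8 * d_model**2  (two matrices, hidden size 4 * d_model)
    """
    return 12 * n_layers * d_model ** 2

gpt3_estimate = approx_transformer_params(n_layers=96, d_model=12288)
small_estimate = approx_transformer_params(n_layers=12, d_model=768)

print(f"96 layers, d_model=12288: ~{gpt3_estimate / 1e9:.0f}B parameters")
print(f"12 layers, d_model=768:   ~{small_estimate / 1e6:.0f}M parameters")
```

Because the count grows with the square of the hidden size, widening a model adds capacity much faster than adding the same relative number of layers.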

Feature Engineering

Feature engineering is the process of creating new features or modifying existing ones to improve the performance of a model. This can help prevent underfitting by providing the model with more information to learn from.
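
A small classical example shows the idea: a linear model cannot fit quadratic data from the raw input, but it fits perfectly once a squared feature is supplied. The data and helper names below are assumptions for illustration only.

```python
def fit_line(features, ys):
    """Ordinary least squares for y = a*f + b on a single feature f."""
    n = len(features)
    mean_f = sum(features) / n
    mean_y = sum(ys) / n
    cov = sum((f - mean_f) * (y - mean_y) for f, y in zip(features, ys))
    var = sum((f - mean_f) ** 2 for f in features)
    a = cov / var
    return a, mean_y - a * mean_f

def mse(a, b, features, ys):
    """Mean squared error of the fitted line on the given feature."""
    return sum((a * f + b - y) ** 2 for f, y in zip(features, ys)) / len(ys)

xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [x * x for x in xs]  # quadratic relationship, assumed for the demo

# Raw feature: the same linear model underfits.
a_raw, b_raw = fit_line(xs, ys)
mse_raw = mse(a_raw, b_raw, xs, ys)

# Engineered feature x**2: the linear model now has what it needs.
squared = [x * x for x in xs]
a_sq, b_sq = fit_line(squared, ys)
mse_sq = mse(a_sq, b_sq, squared, ys)

print(f"MSE with raw x:     {mse_raw:.2f}")
print(f"MSE with x squared: {mse_sq:.2e}")
```

The model class did not change between the two fits; only the information handed to it did.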

In the context of LLMs, feature engineering might involve using a larger vocabulary or more context in the input. This could help the model understand more complex language patterns and produce more accurate text.

Training Time

Increasing the training time can help prevent underfitting. The longer a model is trained, the more opportunity it has to learn the patterns in the data.

For LLMs, this might involve training the model on more data or for more epochs. However, it’s important to monitor the model during training, for example by tracking the loss on a held-out validation set, because training for too long can lead to overfitting.
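
One common way to balance the two risks is early stopping: keep training while the validation loss improves and stop once it has failed to improve for a set number of epochs. The sketch below works on a pre-recorded list of validation losses; the function name, the `patience` value, and the loss curve are assumptions for illustration.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the epoch index with the best validation loss.

    Scanning stops once `patience` consecutive epochs have passed
    without a new best loss.
    """
    best_epoch, best_loss = 0, float("inf")
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_epoch, best_loss = epoch, loss
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs
    return best_epoch

# Validation loss falls while the model is still underfit,
# then rises again as it starts to overfit.
val_losses = [2.9, 2.1, 1.6, 1.4, 1.5, 1.7, 2.0]
best = early_stopping_epoch(val_losses, patience=2)
print(f"best epoch: {best}, validation loss: {val_losses[best]}")
```

Stopping at epoch 0 or 1 on this curve would leave the model underfit, while running all seven epochs would overfit it; the checkpoint from the best epoch sits between the two.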


Conclusion

Underfitting is a common issue in machine learning, including in Large Language Models like ChatGPT. It occurs when a model fails to learn the underlying structure of the data, resulting in poor performance. Understanding underfitting, its causes, and how to prevent it is crucial for developing effective machine learning models.

While this article has focused on underfitting in the context of LLMs, the concepts and strategies discussed are applicable to any machine learning model. By understanding and addressing underfitting, we can develop models that accurately capture the complexity of the data and make accurate predictions.
