How does ChatGPT really work?
Everyone is talking about ChatGPT. Learn about the technology behind it in simple terms.
As everyone is talking about ChatGPT, I noticed that there wasn't a lot of information about how it works in simple terms for non-tech people like me. So, I did some research over the weekend, and here's what I learned:
First of all, what is ChatGPT? ChatGPT is a conversational AI tool capable of carrying out an intelligent conversation with a human.
AI research has been around for almost 80 years, but scientists have struggled to make progress in conversational, human-like tech. Why?
Languages are not as precise as mathematics: there are nuances in grammatical structure, complexities in word order, and even words that are written the same way but pronounced differently, with different meanings. Think about learning a new language: it takes time and effort to reach a stage where you can have a fluent conversation.
So, how did scientists overcome the conversational AI challenge?
Scientists looked at how the human brain works, and specifically at how neurons react and activate when an input reaches the brain. There are two main concepts to remember here:
Neuron network: in the human brain, connections form between neurons.
When an input reaches the brain (an image we see, a smell, or pain), neurons react and get activated. The signal propagates to the next connected neuron along a path toward the output neuron.
These patterns are not linear but multidirectional, and neurons activate following different behaviors. These activation behaviors are the foundation of how ChatGPT works.
Supervised vs. unsupervised learning models: as babies, we learn a language from our parents, and our brain develops in our first years based on all inputs and interactions with the environment. This process is completely unsupervised. It is based on feedback and reactions, which allow the brain to revisit and reorganize connections and test new outputs, finding patterns in chaos.
When children enter school, supervised learning starts: they are told which answers are right and which are wrong. This is the fine-tuning of the connections formed during the unsupervised training.
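For readers who like to see things in code, here is a minimal sketch of the first concept: a signal passing through a tiny network of artificial neurons. It is only an illustration, not how ChatGPT is actually built. The function names and the hand-picked weights are my own assumptions, and a real network has billions of weights that are learned rather than typed in.

```python
import math


def sigmoid(x: float) -> float:
    """Squash any number into the range (0, 1). This plays the role of 'activation'."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of its inputs, then an activation."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(total)


def forward(inputs, layers):
    """Pass the signal through each layer in turn, from input toward output.

    Each layer is a list of (weights, bias) pairs, one pair per neuron.
    """
    signal = inputs
    for layer in layers:
        signal = [neuron(signal, weights, bias) for weights, bias in layer]
    return signal


# Hypothetical, hand-picked weights for illustration only.
HIDDEN_LAYER = [([0.5, -0.6], 0.1), ([-0.3, 0.8], 0.0)]  # two neurons
OUTPUT_LAYER = [([1.2, -0.7], -0.2)]                      # one output neuron

result = forward([1.0, 0.0], [HIDDEN_LAYER, OUTPUT_LAYER])
print(result)  # a list with one number between 0 and 1
```

Each neuron reacts more or less strongly depending on its weights, and its output becomes the input of the next layer. Learning, whether unsupervised or supervised, means adjusting those weights.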
Do you see how all of this comes together inside ChatGPT?
ChatGPT is loosely modeled on the human brain, but with "artificial neuron networks" that activate connections and behaviors.
These networks are trained with a mix of unsupervised and supervised learning models.
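To make "unsupervised" a bit more concrete, here is a toy sketch that learns from raw text with no human labels: it simply counts which word tends to follow which. This is a drastic simplification and not ChatGPT's real method. The tiny corpus and the function names are assumptions I made up for the example.

```python
from collections import Counter, defaultdict


def train(text):
    """Count, for every word, which words follow it in the text."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, word):
    """Return the word seen most often after `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical miniature "training data"; nobody marked any answer as right or wrong.
CORPUS = "the cat sat on the mat . the cat ate the fish . the cat slept"

model = train(CORPUS)
print(predict_next(model, "the"))  # "cat" follows "the" most often in this corpus
```

The program finds the pattern on its own just by reading. Nobody told it that "cat" is a good word to put after "the".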
As ChatGPT is a text-based chat tool, it is helpful to think of it as having two main neuron networks.
A user asks a question or gives an input. This is when the AI needs to understand the context of the input: is it a question about history, tech, or science, or is it a creative storytelling request?
At this stage, the network of artificial neurons finds a pattern and context from the input received.
This input-related neuron network in ChatGPT was trained by feeding it information available until 2021 to create connections and patterns.
This process took one year and was based on an unsupervised learning model.
The second part of ChatGPT is answering the input in an ethical and human-like way. This is the response-generating neuron network, which produces the answer to the user. This neuron network has been trained with human supervision (supervised training). This process took 6 months.
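Supervised training can also be sketched in a few lines. In the toy example below, a single artificial neuron is shown labeled examples (an input plus the right answer) and nudges its weights after every mistake. This is a hedged illustration of the general idea only. The labeled data, learning rate, and function names are my own assumptions, and ChatGPT's real training with human feedback is far more involved.

```python
import math


def predict(weights, bias, inputs):
    """One neuron's answer: a number between 0 and 1."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-total))


def train(examples, epochs=2000, rate=0.5):
    """Adjust the weights a little after each example, based on the error."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            error = target - predict(weights, bias, inputs)  # how wrong were we?
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


# Hypothetical labeled examples: the "teacher" supplies the right answer (logical OR).
LABELED = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

weights, bias = train(LABELED)
for inputs, target in LABELED:
    print(inputs, round(predict(weights, bias, inputs)), "expected", target)
```

The key difference from unsupervised learning is the `target` value: someone has to provide the right answer for every example, which is why this kind of training needs human supervision.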
What else do you need to know about ChatGPT?
In total, ChatGPT took 1.5 years to train, counting both the unsupervised and the supervised training, before it was released.
Also, ChatGPT's neuron network does not change between releases (the current version is built on GPT-3.5, with GPT-4 to be launched soon). This will probably change as AI becomes faster and more flexible.
Finally, ChatGPT requires a lot of energy to run.
So, there you have it, a brief explanation of how ChatGPT works.
Personally, I feel even more excited after learning more about it and thinking about its potential to revolutionize communication and enhance our daily lives.
Tl;dr
ChatGPT is a conversational AI tool that can carry out intelligent conversations with humans.
Scientists have struggled to create conversational AI due to the complexity of language, including nuances in grammar, word sequence, and pronunciation.
To overcome this challenge, scientists looked at how human brains work and how neurons react and activate when an input reaches the brain.
There are two main neuron networks inside ChatGPT: one to understand the context of the input, and one to generate human-like responses.
The input-understanding network was trained on information available up until 2021, using an unsupervised learning model.
The response-generating network has been trained with human supervision, which took six months.