GPT-3
Execution Engines
GPT-3 comes with four execution engines of varying sizes and capabilities: Davinci, Ada, Babbage, and Curie. Davinci is the most powerful and the Playground’s default.
Ada
- Ada is the fastest and least expensive of the available engines.
Babbage
- Babbage is faster than Curie but not capable of performing tasks that involve understanding complex intent.
- It is quite capable, and is preferable for semantic search ranking and analyzing how well documents match up with search queries.
Curie
- Curie aims to strike an optimal balance between power and speed, which is important for high-frequency tasks like large-scale classification or putting a model into production.
- If you are building a customer support chatbot, you might choose Curie to serve high-volume requests faster.
- While Davinci is stronger at analyzing complicated texts, Curie performs with much lower latency.
Davinci
- Davinci is the largest execution engine and the default when you first open the Playground.
- The trade-off is that it costs more per API call and is slower than the other engines.
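The trade-offs above can be sketched as a simple lookup. The `pick_engine` helper and its numeric capability/speed rankings are illustrative assumptions, not part of the OpenAI API; only the engine names themselves are the real identifiers you would pass to the API.

```python
# Illustrative sketch of the relative properties of the four GPT-3
# engines described above. The rankings (1 = lowest, 4 = highest) are
# assumptions for illustration; the names are the API identifiers.
ENGINES = {
    "ada":     {"capability": 1, "speed": 4},
    "babbage": {"capability": 2, "speed": 3},
    "curie":   {"capability": 3, "speed": 2},
    "davinci": {"capability": 4, "speed": 1},
}

def pick_engine(min_capability: int) -> str:
    """Hypothetical helper: return the fastest engine that meets a
    minimum capability requirement."""
    candidates = [
        (props["speed"], name)
        for name, props in ENGINES.items()
        if props["capability"] >= min_capability
    ]
    # Among qualifying engines, the highest speed wins.
    return max(candidates)[1]
```

For example, `pick_engine(3)` returns `"curie"` (the high-volume chatbot case above), while `pick_engine(4)` falls back to `"davinci"` for tasks that need the most capable engine.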
Customizing GPT-3
Fine-tuning is about tweaking the whole model so that it consistently performs the way you want. You can use an existing dataset of any shape and size, or incrementally add data based on user feedback.
OpenAI also found that each doubling of the number of examples tends to improve quality linearly.
Existing OpenAI API customers found that customizing GPT-3 could dramatically reduce the frequency of unreliable outputs, and a growing group of customers can vouch for it with their performance numbers.
How to Customize GPT-3 for your application
- Prepare new training data and upload it to the OpenAI server
- must be a JSONL document, where each line is a prompt-completion pair corresponding to a training example.
- A JSONL document looks something like this:
{"prompt": "prompt text", "completion": "ideal generated text"}
- Fine-tune the existing models with the new training data
- Use the fine-tuned model
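The steps above can be sketched in Python. Writing the JSONL training file is shown concretely; the upload and fine-tune steps are indicated only as comments, because they require an API key and the exact commands depend on the version of the OpenAI library you use.

```python
import json

# Step 1: prepare training data as a JSONL file -- one
# {"prompt": ..., "completion": ...} pair per line, as shown above.
training_examples = [
    {"prompt": "prompt text", "completion": "ideal generated text"},
    {"prompt": "another prompt", "completion": "another ideal completion"},
]

with open("training_data.jsonl", "w") as f:
    for example in training_examples:
        f.write(json.dumps(example) + "\n")

# Step 2 (sketch): upload the file and start a fine-tune, e.g. with
# the OpenAI CLI -- command names vary across library versions:
#   openai api fine_tunes.create -t training_data.jsonl -m <base_model>
# Step 3: once training finishes, pass the returned fine-tuned model
# name in place of the base engine when making completion requests.
```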
GPT-3 for Corporations
As soon as the API was released, corporations started experimenting with it, but they ran into a significant barrier: data privacy.
OpenAI has devised several techniques to transform the functioning of language models like GPT-3 from simple next word prediction to more useful NLP tasks such as answering questions, summarizing documents, and generating context-specific text.
GitHub Copilot is powered by OpenAI’s Codex, a descendant of the GPT-3 model that is designed specifically to interpret and write code.
GitHub has more than 73 million developers
How GitHub Copilot works
The Copilot editor extension intelligently chooses which context to send to the GitHub Copilot service, which in turn runs OpenAI’s Codex model to synthesize suggestions. Even though Copilot generates the code, the user is still in charge: you can cycle through options, choose which to accept or reject, and manually edit the suggested code. GitHub Copilot adapts to the edits you make and matches your coding style.
It links natural language with source code so you can use it in both directions. You can use the source code to generate comments or you can use the comments to generate the source code, making it immensely powerful.
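The comment-to-code direction can be illustrated with a small example. This is an illustrative sketch, not an actual Copilot transcript: the developer writes the comment and signature, and a tool like Copilot might suggest the body.

```python
# Comment written by the developer; the function body below is the
# kind of suggestion a comment-to-code tool might synthesize
# (illustrative only, not real Copilot output).

# compute the factorial of n iteratively
def factorial(n: int) -> int:
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result
```

In the reverse direction, the same linkage lets the model summarize an existing function body into a natural-language comment.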
Scaling with the API
Scaling language models has long been undervalued because of theoretical concepts like Occam’s Razor and the diminishing returns expected when you expand a neural network to a significant size.
With conventional deep learning, it has always been a norm to keep the model size small with fewer parameters to avoid the problem of vanishing gradients and introducing complexity in the model training process.
- Occam’s Razor implies “a simple model is the best model.” This principle has been a central point of reference for training new models, which has discouraged people from experimenting with scale.
In 2020, when OpenAI released its marquee language model GPT-3, the potential of scaling came into the limelight and the common conception of the AI community started to shift. People started realizing that the “gift of scale” can give rise to a more generalized artificial intelligence, where a single model like GPT-3 can perform an array of tasks.
GitHub Copilot is a code synthesizer, not a search engine: the vast majority of the code it suggests is uniquely generated and has never been seen before.