OpenAI's latest language generation model, GPT-3, has made quite the splash within AI circles, astounding reporters to the point that even Sam Altman, OpenAI's CEO, cautioned on Twitter that it may be overhyped. Still, there is no doubt that GPT-3 is powerful. Those with early access to OpenAI's GPT-3 API have shown how to translate natural language into website code, answer complex medical questions, create basic tabular financial reports, and even write code to train machine learning models, all with just a few well-crafted examples as input (i.e., via "few-shot learning").
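The "few-shot" pattern is worth unpacking: rather than fine-tuning the model, you stack a handful of worked examples into the prompt and let the model continue the pattern. Below is a minimal sketch of that idea using the OpenAI Python client; the model name, the client interface, and the toy HTML task are assumptions for illustration, not the setups used in the demos above.

```python
# Minimal sketch of few-shot prompting: worked examples are stacked into the
# prompt before the new input. The model name below is an assumed stand-in,
# not one of the original GPT-3 API demos.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

few_shot_prompt = """Translate a plain-English request into an HTML button.

Request: a red button that says Stop
HTML: <button style="background:red">Stop</button>

Request: a green button that says Go
HTML: <button style="background:green">Go</button>

Request: a blue button that says Submit
HTML:"""

response = client.completions.create(
    model="gpt-3.5-turbo-instruct",  # assumed stand-in for a GPT-3-era completion model
    prompt=few_shot_prompt,
    max_tokens=50,
    temperature=0,
)
print(response.choices[0].text.strip())
```

Given two worked examples, the model typically completes the third request in the same style, which is all "few-shot learning" means in practice.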
Machine learning models are only as good, or as bad, as the data fed into them during training. In the case of GPT-3, that data is massive. GPT-3 was trained on the Common Crawl dataset, a broad scrape of roughly 60 million domains on the internet along with a large subset of the sites they link to. This means that GPT-3 ingested many of the internet's more reputable outlets (think the BBC or The New York Times) along with less reputable ones (think Reddit). Yet Common Crawl makes up just 60% of GPT-3's training data; OpenAI researchers also fed in other curated sources, such as Wikipedia and the full text of historically relevant books.
Language models learn which words, phrases and sentences are likely to come next for any given input word or phrase. By "reading" text during training that is largely written by us, language models such as GPT-3 also learn how to "write" like us, complete with all of humanity's best and worst qualities. Tucked away in the GPT-3 paper's supplemental material, the researchers give us some insight into a small fraction of the problematic bias that lurks within. Just as you'd expect from any model trained on a largely unfiltered snapshot of the internet, the findings can be fairly toxic.
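That "learn what comes next" behavior can be seen in miniature with a toy bigram model, which only counts which word tends to follow which. The sketch below is a cartoon of next-word prediction on a made-up two-sentence corpus, not how GPT-3 actually works internally (a transformer over subword tokens at vastly larger scale).

```python
# Toy illustration of next-word prediction: a bigram model counts which word
# follows which in its training text, then samples continuations from those
# counts. Whatever biases sit in the text end up in the transition counts.
import random
from collections import defaultdict, Counter

corpus = (
    "the model writes like us because it reads text written by us . "
    "the model learns which word is likely to come next ."
).split()

# Count word -> next-word frequencies.
transitions = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    transitions[current][nxt] += 1

def generate(start, length=8):
    word, output = start, [start]
    for _ in range(length):
        followers = transitions.get(word)
        if not followers:
            break
        # Sample the next word in proportion to how often it followed `word`.
        word = random.choices(list(followers), weights=followers.values())[0]
        output.append(word)
    return " ".join(output)

print(generate("the"))
```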
But one of the problems with LLMs is that they always have an answer to your prompt, even if that answer is completely wrong. And there have been numerous cases of LLMs making wrong claims and generating text that, although impressive, is utter nonsense.
LLMs are gradually finding their way into real-world applications, from composing emails and writing articles to answering questions and filling in for customer service agents. Accordingly, there is growing interest in finding ways to determine the reliability and trustworthiness of the answers these machine learning models produce. According to a new study by researchers at OpenAI and the University of Oxford, large language models can be calibrated to express their level of certainty in the answers they provide. The study, which focuses on GPT-3, shows that with the right training, LLMs can help make AI systems better aligned with human goals and intents.
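"Calibrated" has a concrete meaning here: when the model says it is 80% sure, it should be right roughly 80% of the time. The sketch below computes a simple expected calibration error over stated-confidence/correctness pairs; the data is made up for illustration and the metric is a generic one, not the study's exact evaluation protocol.

```python
# Simple calibration check: bucket answers by the model's stated confidence
# and compare each bucket's average confidence to its actual accuracy.
# The (confidence, correct) pairs below are made-up illustrative data.
from collections import defaultdict

answers = [
    (0.95, True), (0.90, True), (0.85, False), (0.80, True),
    (0.60, True), (0.55, False), (0.50, False), (0.30, False),
]

def expected_calibration_error(samples, n_bins=5):
    bins = defaultdict(list)
    for confidence, correct in samples:
        bins[min(int(confidence * n_bins), n_bins - 1)].append((confidence, correct))
    ece = 0.0
    for bucket in bins.values():
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(ok for _, ok in bucket) / len(bucket)
        # Weight each bucket's confidence/accuracy gap by its share of samples.
        ece += (len(bucket) / len(samples)) * abs(avg_conf - accuracy)
    return ece

print(f"Expected calibration error: {expected_calibration_error(answers):.3f}")
```

A well-calibrated model drives this gap toward zero; an overconfident one states high certainty while getting many of those answers wrong.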
One of the question-answering tasks GPT-3 was tested on was Natural Questions (NaturalQS), which focuses on factual accuracy. GPT-3 underperformed on this task, while earning high marks on trivia questions. This behavior is troubling because it suggests that the model tends to answer correctly only when similar question-answer pairs appear frequently on the internet. The deeper text understanding required to answer a complex question from a single passage is clearly beyond the capability of the language model. Yet if an answer sounds authoritative and is written in correct English, humans may not spot that it is wrong so easily.
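Benchmarks like Natural Questions are typically scored by exact match against reference answers after light normalization, which is part of why frequently seen, memorizable answers fare well. The sketch below shows that style of scoring; the normalization rules and example pairs are simplified stand-ins, not the benchmark's official evaluation script.

```python
# Illustrative exact-match scoring in the style of open-domain QA benchmarks.
# The normalization and the example pairs are simplified stand-ins, not the
# official Natural Questions evaluation code.
import re
import string

def normalize(text: str) -> str:
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)   # drop articles
    return " ".join(text.split())                 # collapse whitespace

def exact_match(prediction: str, references: list[str]) -> bool:
    return any(normalize(prediction) == normalize(ref) for ref in references)

examples = [
    ("The capital of France is Paris.", ["Paris"]),   # scored wrong: extra words
    ("Paris", ["Paris"]),                             # scored correct
    ("the Pacific Ocean", ["Pacific Ocean"]),         # correct after normalization
]

score = sum(exact_match(pred, refs) for pred, refs in examples) / len(examples)
print(f"Exact-match accuracy: {score:.2f}")
```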
As a matter of fact, it is getting more and more difficult for humans to distinguish news written by a machine from articles written by humans. One of the experiments reported in the GPT-3 paper showed that humans have a hard time identifying machine-generated news. The larger the language model, the more trouble people had correctly identifying machine-written news; with the largest version of GPT-3 (175 billion parameters), the decision was essentially a coin flip.