
The Top500 list of supercomputers compiles statistics on high-performance computers based on items of interest to manufacturers and other high-end users. While specific features and metrics may vary, as is befitting the steady evolution and diversification of modern supercomputers, the bedrock data in each semiannual report seems to consist of the number of installed systems, the applications running on these systems, and performance rankings based on comparative benchmarks.

As one example, this list ranks supercomputers on their performance against the LINPACK benchmark, specifically, how well the machines solve a dense system of linear equations. The result is a measure of peak performance rather than overall performance. The Top500 researchers can also verify the LINPACK results to further ensure accuracy in the rankings.
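To make that concrete, here's a toy Python stand-in for what LINPACK measures (the official HPL benchmark is far more elaborate, and Top500 runs use vastly larger matrices): it times a dense linear solve and converts the elapsed time into floating-point operations per second, the figure the Top500 list reports.

```python
import time
import numpy as np

# A toy stand-in for the LINPACK benchmark: solve Ax = b for a dense
# random system and report achieved FLOP/s. The real HPL benchmark is
# far more elaborate, and Top500 runs use vastly larger matrices.
n = 4000
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)

start = time.perf_counter()
x = np.linalg.solve(A, b)  # LU factorization plus triangular solves
elapsed = time.perf_counter() - start

# LU factorization of an n x n matrix costs roughly (2/3) * n^3 flops.
flops = (2 / 3) * n**3
print(f"Solved {n}x{n} system in {elapsed:.2f} s "
      f"(~{flops / elapsed / 1e9:.1f} GFLOP/s)")

# Sanity check: the residual should be near machine precision.
print("max residual:", np.max(np.abs(A @ x - b)))
```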

Other benchmarks used in the supercomputing industry to judge supercomputer performance include COPA, ReCoRD, and SuperGLUE, with the latter testing reasoning and advanced natural-language-processing (NLP) tasks. The supercomputer jointly built by OpenAI and Microsoft performed well on these three benchmarks too but fell short on two others: word-in-context (WiC) analysis and RACE, a reading-comprehension benchmark built from English exam questions written for middle- and high-school students.

It’s telling that the supercomputer scored poorly on middle- and high-school exam questions (the results of the RACE benchmark) while acing the LINPACK benchmark test in solving dense linear equations. Simple things can trip up AI, yet complexity isn’t the deciding factor in whether errors occur. You shouldn’t expect ChatGPT to perform consistently across varying levels of problem complexity. It can err or perform brilliantly in responding to any prompt, simple or complex.

In any case, it’s safe to say that given the immense size and capabilities of GPT models, training any of them requires a supercomputer more heavily muscled than most of the giants in the high-performance computing field.

Nvidia is the graphics processing unit (GPU) provider and the third partner in this story, and theirs is no minor role. A GPU is a specific type of electronic circuit designed for fast image rendering that is now commonly leveraged for its capability to process many pieces of data simultaneously.



The GPU-accelerated supercomputer developed for OpenAI is a
single system with more than 285,000 CPU cores, 10,000 GPUs,
and 400 gigabits per second of network connectivity for each
GPU server. All OpenAI models were trained on NVIDIA V100 GPUs
operating on Microsoft high-bandwidth clusters.
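As a rough, back-of-envelope illustration of what those numbers imply (125 teraflops is NVIDIA's advertised mixed-precision tensor peak for a single V100; sustained training throughput always falls well below peak):

```python
# Back-of-envelope peak throughput for the cluster described above.
# 125 TFLOP/s is NVIDIA's advertised mixed-precision tensor peak for
# a single V100; real sustained training throughput is much lower.
gpus = 10_000
peak_tflops_per_gpu = 125

cluster_peak_tflops = gpus * peak_tflops_per_gpu
print(f"Theoretical peak: {cluster_peak_tflops / 1e6:.2f} exaFLOP/s")
# Prints: Theoretical peak: 1.25 exaFLOP/s
```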

Further, model training was done on the cuDNN-accelerated PyTorch deep-learning framework for all OpenAI models. But specific architectural parameters for any given AI model are chosen according to optimal computational efficiency and load balancing across GPUs.
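OpenAI hasn't published its training code, but the ingredients named here (PyTorch with cuDNN acceleration on NVIDIA GPUs) look something like this in miniature. The tiny model, random data, and hyperparameters are placeholders, not OpenAI's actual setup:

```python
import torch
import torch.nn as nn

# Minimal sketch of cuDNN-accelerated PyTorch training. The tiny
# model, random data, and hyperparameters are placeholders; they
# are not OpenAI's architecture or dataset.
device = "cuda" if torch.cuda.is_available() else "cpu"
torch.backends.cudnn.benchmark = True  # let cuDNN auto-select fast kernels

model = nn.Sequential(
    nn.Linear(512, 2048), nn.GELU(), nn.Linear(2048, 512)
).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

for step in range(100):
    x = torch.randn(64, 512, device=device)       # stand-in input batch
    target = torch.randn(64, 512, device=device)  # stand-in labels
    loss = loss_fn(model(x), target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```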

Considering the importance of transformers
ChatGPT uses a multilayer transformer network to generate responses to user prompts. A transformer is a type of neural-network architecture. In AI, a neural network is a network of processing nodes that use a set of algorithms to mimic the human brain. You can think of nodes in an AI brain as working like neurons in a human brain.

Different types of transformers are available, each suited to a particular data type, such as text or images. ChatGPT uses transformers suited for language processing.

Transformers were developed by researchers at Google and the University of Toronto in 2017 and were initially designed to handle translations for which context, not word order, was more crucial to delivering a corresponding meaning in another language. But transformers proved to be a cornerstone for much more complex language-processing tasks too. The big advantage of transformers is that they can be efficiently parallelized, meaning they can scale to handle exceptionally large AI models and their training requirements.

Without the advent of transformers, GPT at large and ChatGPT in particular could not render such humanlike outputs.

The specifics of transformers and how they work are highly technical. In this chapter, I touch on one piece of a transformer that is arguably the most significant: the self-attention mechanism. The short, and thus oversimplified, explanation of self-attention is that it enables an AI model to internalize an understanding of the various representations of the same word.
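For the technically curious, here's a bare-bones NumPy sketch of scaled dot-product self-attention, the arithmetic at the heart of the mechanism. It's a minimal illustration, not production code: a real transformer adds learned query, key, and value projections, multiple attention heads, and many stacked layers.

```python
import numpy as np

def self_attention(X):
    """Scaled dot-product self-attention over a sequence of token
    vectors X with shape (seq_len, d_model). Real transformers first
    project X into separate query, key, and value matrices; this
    sketch uses X directly for all three."""
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)  # pairwise similarity of tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over rows
    return weights @ X  # each output row blends in context from all tokens

# Each output vector is a weighted mix of every token's vector, with
# weights reflecting how relevant the other tokens are to it.
tokens = np.random.default_rng(1).standard_normal((5, 8))
print(self_attention(tokens).shape)  # (5, 8)
```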



Consider that many words have multiple meanings. In American English, a lemon can be a fruit or a product that performs badly. Similarly, a server can be a computing device or a waiter. A lift in British English is an elevator, but in American English it means catching a ride in someone else’s vehicle.

ChatGPT can distinguish which meaning a word should carry based on context, that is, by considering the words that surround it in a sentence. That capability is very humanlike and exceedingly difficult for a machine to do.
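You can see this context sensitivity in action with any modern transformer. The following sketch uses the open-source BERT model from Hugging Face's transformers library as a stand-in (not ChatGPT itself, whose internals aren't public) to compare the contextual vectors that the word "server" receives in two different sentences. The model choice and the single-token assumption are illustrative, not definitive:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Compare the contextual embeddings that "server" receives in two
# sentences, using open-source BERT as a stand-in for ChatGPT's own
# (non-public) internals. Assumes the word appears as a single token
# in the model's vocabulary.
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def word_vector(word: str, sentence: str) -> torch.Tensor:
    inputs = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # (tokens, 768)
    idx = inputs.input_ids[0].tolist().index(tok.convert_tokens_to_ids(word))
    return hidden[idx]

tech = word_vector("server", "The server crashed and had to be rebooted.")
diner = word_vector("server", "The server refilled our water and brought the check.")
print(torch.cosine_similarity(tech, diner, dim=0))  # noticeably below 1.0
```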

Setting the stage: training the model


Although many companies are training their own AI (of various
forms and for a variety of uses), that task is best left to those with
the know-how and deep pockets to successfully see it through.
That being the case, you can see the appeal to the masses when
an AI model such as ChatGPT is readily accessible and usable via
a browser or an app.

Despite its debut as a free tool, ChatGPT is an expensive and complex model for OpenAI to build and maintain. For example, ChatGPT uses deep learning, which is a computing and energy hog. And simply storing a database massive enough to train one AI model drains resources at a quick clip. Training any large language model takes enormous amounts of manpower, energy, data, and effort. It’s a very expensive exercise with recurring costs that are just as high.

But in the case of GPT, the result proved worth it. GPT-4 is purported to be the largest language model in the world. Because of the capabilities such an enormous AI model makes possible, ChatGPT is a global sensation. OpenAI, its creator, is estimated to be worth $29 billion and counting, according to the Wall Street Journal.

The ChatGPT model was trained on a massive database composed of text scraped from almost the entire internet as it existed in 2021. OpenAI says that training data included about “570GB of datasets, including web pages, books, and other sources.”

The initial model was also trained on data fine-tuned by human instructors who assumed roles as both human and machine to instruct it on the differences in appropriate versus inappropriate responses to prompts. OpenAI says it then mixed this newly

