SlideShare a Scribd company logo
ประมวลภาษาแบบ
Transfer Learning
ด้วย Transformers (BERT)
Dr. Kobkrit Viriyayudhakorn
iApp Technology Co., Ltd.
Outlines
• ทําไม Transfer Learning ด้วย Transformer ถึงน่าสนใจ?
• ประวัติของการประมวลผลข้อความในยุค Deep Learning
• Bert: Model Architecture
• Bert: Pre-training
• Bert: Fine-Tuning
• Bert: State of the Art
• Better than BERT?
ทําไม Transfer Learning ด้วย Transformer ถึงน่าสนใจ?
• True-Voice Intent Classification
• Thai QA (*Not same test set)
88.967% (BERT, 2019)83.498% (ULMFit,2018)
Dr.QA (Bi-LSTM, 2017)
34.0% (Exact Match)
iApp QA (BERT, 2019)
45.7% (Exact Match)
https://github.com/PyThaiNLP/classification-benchmarks https://ai.iapp.co.th/intent-class
https://ai.iapp.co.th/qaNSC2019 QA
Data Science & Engineering Workshop 2019: The future of Thai NLP
ทําไม Transfer Learning ด้วย Transformer ถึงน่าสนใจ?
• Text Classification
• Sentiment Analysis
• Intent Classification
• Any Classifications
• Question Answering
• Machine Translation
• Text Summarization
• Name Entity Recognition
• Paraphrasing
• Natural Language Inference
• Coreference Resolution
• Sentence Completion
• Word sense disambiguation
• Language Generation
arxiv.org/abs/1910.12840
ประวัติการประมวลผลข้อความในยุค Deep Learning
• One-hot Encoding
• Transfer Learning
• Word Global Representation
• Word-Embbedding
• Word2Vec (Thai2Vec)
• Glove
• ULMFit (Thai2Fit)
• Word Contextual Representation
• LSTM+Conv1D
• Elmo
• Subword Contextual Representations
• Transformers
• GPT-1, GPT-2 (OpenAI)
• BERT (Google)
• Tranformer XL (Google)
• Xlnet (Google)
• XLM (Facebook)
• RoBERTa (Facebook)
• DistilBERT (Victor Sanh)
• CTRL (Saleforce)
• Albert (Google)
One-hot Encoding
Word Sequence One-hot VectorText
I really love
my dog
[ 4 <OOV> 2 1 3 ]
One-hot Encoding
• # Dimension = # Vocabulary
• English Languages: 25,000 (without NE) – 300,000 (with most NE)
vocabulary.
• Curse of dimensionality
• Too sparse dimensional spaces => Never have enough data points => Can not
learn anything.
• Get low accuracy, Process too long.
Transfer Learning in NLP
Pretraining Fine Tuning
Decision
Model
Universal Language Model
• Create Fundamental
Language Model
• Epic Corpus
• Wikipedia
• No Label Need
• 5-7 days on TPUv3
Contextual Language Model
• Adjust Weight for Specific Task
• Small Corpus
• True-Voice Intent Class
• Label data is need
• 1 hour on TPUv3
ML Model
• Word-Embedding: LSTM
• Transformers: Dense
Word-Embedding
I
really
love
my
dog
https://projector.tensorflow.org/
CBOW, Skip Gram
Word2Vec
Word-Embedding
• Libraries
• Word2Vec (Thai2Vec)
• Glove
• ULMFit (Thai2Fit)
• FastText
• Context Independent
• 1 Word = 1 Global Representation
• Does not care sequence in Training
• Word Arithmetic
Pretraining
(Embedding)
Fine Tuning
Word2Vec (Thai2Vec)
Glove
ULMFit (Thai2Fit)
FastText
RNN
GRU
LSTM
Conv1D
Dense
SVM
SVR
Logit
Softmax
…
Decision
Model
X
Static Word Vectors
Text
https://towardsdatascience.com/introduction-to-word-embedding-and-word2vec-652d0c2060fa
Thai Text processing by Transfer Learning using Transformer (Bert)
Elmo
• Context Dependent
= Need whole input
sentence.
• 1 Word = Many
Representations
• He went to the
prison cell with
his cell phone to
extract
blood cell samples
from inmates.
• Bi-LSTM +
Character-based
Pretraining Fine Tuning
Elmo Model Logit
Softmax
…
Decision
Model
Contextual Word Vectors
X
Text
https://arxiv.org/abs/1802.05365
Bert
• Using Transformers
• 1 Subword = Many
Contextual Representations
(Aimed to reduce vocabulary
size)
• Byte-Pair Encoding (BPE)
• Universal Languages
• Don’t care on the splitting
correctness (like in Thai, China)
• Similar to TCC
• https://github.com/bheinzerlin
g/bpemb
Pretraining Fine Tuning
Bert Logit
Softmax
…
Decision
Model
Bert
Fine Tuned Weight
with Label Data
Language Model Weight Weight for Downstream Task
https://github.com/google-research/bert
Bert: Model Architecture
Equal to Open AI Transformer
• 12 Blocks (Heads)
• 768 Hidden Unit
• 12 Attention Heads
• 110M parameters
State of the Art Model
• 24 Blocks
• 1024 Hidden Unit
• 16 Attention Heads
• 340M parameters
https://jalammar.github.io/illustrated-bert/ https://github.com/google-research/bert
Bert: Model Architecture
Size of hidden unit = 768
Sequence Max Length = 512 Units
<1 Head>
12 Layers/Blocks
https://jalammar.github.io/illustrated-bert/
Bert: Inside Encoder
https://jalammar.github.io/illustrated-bert/
Bert: Inside Encoder
https://jalammar.github.io/illustrated-bert/
Bert: Self-Attention
https://colab.research.google.com/drive/1uTof3Dw9wtvWt-KFG_opMnZ4rBHKZj8D
Bert: Self-Attention #1
https://jalammar.github.io/illustrated-bert/
Bert: Self-Attention #2
https://jalammar.github.io/illustrated-bert/
Bert: Self-Attention #3
https://jalammar.github.io/illustrated-bert/
Bert: Self-Attention #4
https://jalammar.github.io/illustrated-bert/
Bert: Self-Attention in Matrix Styles
https://jalammar.github.io/illustrated-bert/
Bert: Inside Encoder
https://jalammar.github.io/illustrated-bert/
Bert: Multi-head Architecture
<Head:1>
<Head:2>
<Head:12>
……
https://jalammar.github.io/illustrated-bert/
Bert: Self-Attention with Multi-head
http://jalammar.github.io/illustrated-transformer/
Bert: Self-Attention with Multi-head
http://jalammar.github.io/illustrated-transformer/
Bert: Self-Attention with Multi-head
http://jalammar.github.io/illustrated-transformer/
Bert: Self-Attention with Multi-head
Single Head Multi Head
http://jalammar.github.io/illustrated-transformer/
Bert: Self-Attention with Multi-head
Bert: Model Summary
http://jalammar.github.io/illustrated-transformer/
Bert: Model Summary #2
http://jalammar.github.io/illustrated-transformer/
Bert: Using Layers Output
https://jalammar.github.io/illustrated-bert/
Bert: Pretraining
https://jalammar.github.io/illustrated-bert/
Bert: Pretraining #1
• Step 1: BERT ไร้ Weight ต้องหาความรู้ด้านภาษาก่อน
ไม่มี Weight ไม่มี Weight
Pretraining
Fine
Tuning
Decision
Model
Bert ไร้ซึFงสมองหรือ
สิFงใดๆทัIงสิIน
?
ตัIงใจเรียน
Bert: Pretraining #2
• Step 2: BERT นําเข้าข้อมูล Wikipedia + Google Book ภาษาอังกฤษ แล้ว Challenge
ตัวเอง (1) หาคําทีFหายไป โดยการใส่ [MARK] แบบสุ่ม
[CLS] [MASK] man [MASK] up , put his [MASK] on phil [MASK] ##mon ' s head [SEP] [SEP]
the man jumped up , put his basket on phil ##am ##mon ' s head
Input to bert
Output from bert
… the man jumped up ,
put his basket on
philammon ' s head …
Preprocessing
+ Random marking
Weight is learning
Learn and Adapt
from correct Answer
Missing (Marking) Word Prediction
Text Representation
Vector
Output from model
Pretraining
Fine
Tuning
Decision
Model
Bert: Pretraining #3
https://github.com/google-research/bert
Bert: Pretraining #4 (Real Example)
• Input: this is one of the most amazing stories i have ever seen . < br / > < br / > if this film had been directed by larry clark , then
this story about a school shooting probably would have been shown through the eyes of the killer and whatever led that person to
go insane in the first place . < br / > < br / > instead , the plot focuses mainly on the aftermath of a school shooting , and how it
effect ##ed the victims who survived . < br / > < br / > i had seen busy phillips in other films before , but her performance in this
movie is by far , her best . the … (512 chopped)
• INFO:tensorflow:tokens: [CLS] this is one of the most amazing stories i have ever seen . < br / > < br / > if this film had been
directed by larry clark , then this story about a school shooting probably would have been shown through the eyes of the killer and
whatever led that person to go insane in the first place . < br / > < br / > instead , the plot focuses mainly on the aftermath of a
school shooting , and how it effect ##ed the victims who survived . < br / > < br / > i had seen busy phillips in other films before ,
but her performance in this movie is by far , her best . the [SEP] [SEP]
• INFO:tensorflow:input_ids: 101 2023 2003 2028 1997 1996 2087 6429 3441 1045 2031 2412 2464 1012 1026 7987 1013 1028
1026 7987 1013 1028 2065 2023 2143 2018 2042 2856 2011 6554 5215 1010 2059 2023 2466 2055 1037 2082 5008 2763 2052
2031 2042 3491 2083 1996 2159 1997 1996 6359 1998 3649 2419 2008 2711 2000 2175 9577 1999 1996 2034 2173 1012 1026
7987 1013 1028 1026 7987 1013 1028 2612 1010 1996 5436 7679 3701 2006 1996 10530 1997 1037 2082 5008 1010 1998 2129
2009 3466 2098 1996 5694 2040 5175 1012 1026 7987 1013 1028 1026 7987 1013 1028 1045 2018 2464 5697 8109 1999 2060
3152 2077 1010 2021 2014 2836 1999 2023 3185 2003 2011 2521 1010 2014 2190 1012 1996 102
• INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
• INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 c
Bert: Pretraining #5 (Position Encoding)
Using SIN() and COS() for telling positions
http://jalammar.github.io/illustrated-transformer/
Bert: Pretraining #6 (Position Encoding)
http://jalammar.github.io/illustrated-transformer/
Bert: Pretraining #6
• Step 3: (2) ให้ประโยคแรกและประโยคถัดไป แล้วให้ BERT บอกว่าเป็นประโยคต่อกันหรือไม่
[CLS] the man went to the store [SEP] he bought a gallon of milk [SEP]
Input to bert
… the man went to the store.
he bought a gallon of milk …
Preprocessing
Weight is learning
Learn from correct
Answer
Predict Next Sentence Task
Output from bert
Text Representation
Vector
Output from model
1 = Yes
Pretraining
Fine
Tuning
Decision
Model
ตัIงใจเรียน
Bert: Pretraining #7
• Step 4: BERT มี Language Model แล้ว
มี Weight มี Weight
Bert เรียนจบแล้ว เย้ๆ
Pretraining
Fine
Tuning
Decision
Model
Bert: Pretraining #8
มี Weight มี Weight
• หรือ เราสามารถ Download Weight ของ Bert ทีF Pre-train แล้วมาจาก
https://github.com/google-research/bert (English) หรือ
https://github.com/ThAIKeras/bert (Thai) ได้เลย
เอาไปเลยน้องพีF
Bert พูดภาษาคนรู้เรืFอง
แล้วววววว
เอาไปเลยพีFน้อง
Pretraining
Fine
Tuning
Decision
Model
มี Weight มี Weight
Bert มาสมัครงานครับ
มี Weight มี Weight
ตอนนีIแปลงประโยคให้เป็นตัวเลข Vector ได้ครับ
ตอนนีIคุณทําอะไรได้บ้าง
ความสามารถของ BERT หลัง
Pre-train
มี Weight มี Weight
• หากใส่ข้อความลงไป ก็จะ Subword Representation ใน Context ของประโยคนัIนออกมา
I mean, part of the beauty of me is that I’m very rich.
[CLS] I mean part of the beaut ##y of me is that I ##m very rich [SEP] [SEP]
Preprocessing
Bert (Base)
Output Parameters=
1 Final Layer x
1 Batch Size x
12 Attention Heads x
512 max-length x
768 hidden features x
= 4,718,592 output parameters /
sentences
มี Weight มี Weight
อะไรคือ Sentimental Analysis หรอครับ??
แล้วคุณทําเป็นพวก
Sentimental Analysis
แยกแยะข้อความว่าเป็น บวก
เป็น ลบได้หรือยัง
มี Weight มี Weight
ว๊อทททททท
ยังใช้งานจริงไม่ได้
คุณยังไม่มีประสบการณ์ทํางานนะ
ฝึกงานฟรีไปก่อนแล้วกัน
BERT ฝึกงาน #1
มี Weight มี Weight
• สมมุติเราอยากให้ BERT ทํางานเป็นคนแยกแยะอารมณ์ของข้อความให้กับเรา (Pos, Neg, Neu)
• เราสามารถเขียน Classification Model ของเราต่อท้ายผลของ Bert จะเป็นอะไรก็ได้ ตามใจเรา Logit, DNN, LSTM ก็ได้
I mean part of the beauty of me is that I’m very rich.
[CLS] I mean part of the beaut ##y of me is that I ##m very rich [SEP] [SEP]
Preprocessing
Sentimental Classification (Logit, DNN
or LSTM or Anythings…)
เขียน Model ต่อท้ายของผลของ BERT
Sentimental คืออะไรฟร่ะ งง งวย ไม่มี
ความรู้ อารายเยย
Pretraining
Fine
Tuning
Decision
Model
BERT ฝึกงาน #2
• หา Dataset ทีFมี Label Class (Pos, Neg, Neu) มาสอนงาน (Fine-tune) น้อง Bert
• อาทิเช่น
• Sentiment140 (English) (https://www.kaggle.com/kazanova/sentiment140)
• Wongnai Corpus (1-5) (https://github.com/wongnai/wongnai-corpus)
• อย่างน้อย 10,000 แถวขึIน (ยิFงเยอะ ยิFงดี)
Pretraining
Fine
Tuning
Decision
Model
Sentiment 140
BERT ฝึกงาน #3 (Fine Tuning)
[CLS] is upset that he can't update
his face book by texting it [SEP] [SEP]
Our’s Sentimental Classification (Logits
or Dense or LSTM or Any models)
Weight Learning
เขียน Model ต่อท้ายของผล
ของ BERT
Sentiment 140
is upset that he can't update
his Facebook by texting it...+ SAD
Happy?
Learn and Adapt
from correct Answer
It is actually
SAD, Go re-learn.
Preprocessing
Pretraining
Fine
Tuning
Decision
Model
Weight Learning
Weight Frozen Weight Frozen
BERT ฝึกงาน #4 (Fine Tuning)
• และแล้วเราก็ได้ Classifier ทีFสามารถทํานายอารมณ์ได้จาก BERT
[CLS] is upset that he can't update
his face book by texting it [SEP] [SEP]
Our’s Sentimental Classification (DNN
or LSTM or Any model you want)
Weight Learned
SADOutput
Input
ผ่าน PRO แล้วจ้า
Pretraining
Fine
Tuning
Decision
Model
Weight Learned
Weight Frozen Weight Frozen
Bert: Finetuning in several tasks
https://github.com/google-research/bert
Bert Rocks!!
https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=bert
DistilBert
Albert
RoBERTa
XLnet
TransformerXL
XLM
Q8Bert
Alberto
CTRL
Better than Bert??
https://towardsdatascience.com/bert-roberta-distilbert-xlnet-which-one-to-use-3d5ab82ba5f8
QA
Any Questions: kobkrit@iapp.co.th
Ad

More Related Content

What's hot (20)

NLP State of the Art | BERT
NLP State of the Art | BERTNLP State of the Art | BERT
NLP State of the Art | BERT
shaurya uppal
 
Fine tune and deploy Hugging Face NLP models
Fine tune and deploy Hugging Face NLP modelsFine tune and deploy Hugging Face NLP models
Fine tune and deploy Hugging Face NLP models
OVHcloud
 
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingBERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Minh Pham
 
BERT
BERTBERT
BERT
Khang Pham
 
Text classification presentation
Text classification presentationText classification presentation
Text classification presentation
Marijn van Zelst
 
BERT introduction
BERT introductionBERT introduction
BERT introduction
Hanwha System / ICT
 
Natural Language Processing with Python
Natural Language Processing with PythonNatural Language Processing with Python
Natural Language Processing with Python
Benjamin Bengfort
 
[Paper Reading] Attention is All You Need
[Paper Reading] Attention is All You Need[Paper Reading] Attention is All You Need
[Paper Reading] Attention is All You Need
Daiki Tanaka
 
A Review of Deep Contextualized Word Representations (Peters+, 2018)
A Review of Deep Contextualized Word Representations (Peters+, 2018)A Review of Deep Contextualized Word Representations (Peters+, 2018)
A Review of Deep Contextualized Word Representations (Peters+, 2018)
Shuntaro Yada
 
Transformers in 2021
Transformers in 2021Transformers in 2021
Transformers in 2021
Grigory Sapunov
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRU
ananth
 
Attention Is All You Need
Attention Is All You NeedAttention Is All You Need
Attention Is All You Need
Illia Polosukhin
 
Attention is All You Need (Transformer)
Attention is All You Need (Transformer)Attention is All You Need (Transformer)
Attention is All You Need (Transformer)
Jeong-Gwan Lee
 
[Paper review] BERT
[Paper review] BERT[Paper review] BERT
[Paper review] BERT
JEE HYUN PARK
 
Deep learning for NLP and Transformer
 Deep learning for NLP  and Transformer Deep learning for NLP  and Transformer
Deep learning for NLP and Transformer
Arvind Devaraj
 
Transformer Zoo
Transformer ZooTransformer Zoo
Transformer Zoo
Grigory Sapunov
 
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Simplilearn
 
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...
Simplilearn
 
Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)
VenkateshMurugadas
 
Introduction For seq2seq(sequence to sequence) and RNN
Introduction For seq2seq(sequence to sequence) and RNNIntroduction For seq2seq(sequence to sequence) and RNN
Introduction For seq2seq(sequence to sequence) and RNN
Hye-min Ahn
 
NLP State of the Art | BERT
NLP State of the Art | BERTNLP State of the Art | BERT
NLP State of the Art | BERT
shaurya uppal
 
Fine tune and deploy Hugging Face NLP models
Fine tune and deploy Hugging Face NLP modelsFine tune and deploy Hugging Face NLP models
Fine tune and deploy Hugging Face NLP models
OVHcloud
 
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingBERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Minh Pham
 
Text classification presentation
Text classification presentationText classification presentation
Text classification presentation
Marijn van Zelst
 
Natural Language Processing with Python
Natural Language Processing with PythonNatural Language Processing with Python
Natural Language Processing with Python
Benjamin Bengfort
 
[Paper Reading] Attention is All You Need
[Paper Reading] Attention is All You Need[Paper Reading] Attention is All You Need
[Paper Reading] Attention is All You Need
Daiki Tanaka
 
A Review of Deep Contextualized Word Representations (Peters+, 2018)
A Review of Deep Contextualized Word Representations (Peters+, 2018)A Review of Deep Contextualized Word Representations (Peters+, 2018)
A Review of Deep Contextualized Word Representations (Peters+, 2018)
Shuntaro Yada
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRU
ananth
 
Attention is All You Need (Transformer)
Attention is All You Need (Transformer)Attention is All You Need (Transformer)
Attention is All You Need (Transformer)
Jeong-Gwan Lee
 
Deep learning for NLP and Transformer
 Deep learning for NLP  and Transformer Deep learning for NLP  and Transformer
Deep learning for NLP and Transformer
Arvind Devaraj
 
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Simplilearn
 
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...
Simplilearn
 
Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)
VenkateshMurugadas
 
Introduction For seq2seq(sequence to sequence) and RNN
Introduction For seq2seq(sequence to sequence) and RNNIntroduction For seq2seq(sequence to sequence) and RNN
Introduction For seq2seq(sequence to sequence) and RNN
Hye-min Ahn
 

Similar to Thai Text processing by Transfer Learning using Transformer (Bert) (20)

Devoxx traitement automatique du langage sur du texte en 2019
Devoxx   traitement automatique du langage sur du texte en 2019 Devoxx   traitement automatique du langage sur du texte en 2019
Devoxx traitement automatique du langage sur du texte en 2019
Alexis Agahi
 
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Jeongkyu Shin
 
Deep-learning based Language Understanding and Emotion extractions
Deep-learning based Language Understanding and Emotion extractionsDeep-learning based Language Understanding and Emotion extractions
Deep-learning based Language Understanding and Emotion extractions
Jeongkyu Shin
 
Data oriented design and c++
Data oriented design and c++Data oriented design and c++
Data oriented design and c++
Mike Acton
 
self-supervised learning and Bert from a
self-supervised learning and Bert from aself-supervised learning and Bert from a
self-supervised learning and Bert from a
FeiXiao19
 
Deep Dive: Parameter-Efficient Model Adaptation with LoRA and Spectrum
Deep Dive: Parameter-Efficient Model Adaptation with LoRA and SpectrumDeep Dive: Parameter-Efficient Model Adaptation with LoRA and Spectrum
Deep Dive: Parameter-Efficient Model Adaptation with LoRA and Spectrum
Julien SIMON
 
5_RNN_LSTM.pdf
5_RNN_LSTM.pdf5_RNN_LSTM.pdf
5_RNN_LSTM.pdf
FEG
 
Code Evolution Day 2024 = Opening talk: Demystifying LLMs
Code Evolution Day 2024 = Opening talk: Demystifying LLMsCode Evolution Day 2024 = Opening talk: Demystifying LLMs
Code Evolution Day 2024 = Opening talk: Demystifying LLMs
riki_kurniawan
 
RNN is recurrent neural networks and deep learning
RNN is recurrent neural networks and deep learningRNN is recurrent neural networks and deep learning
RNN is recurrent neural networks and deep learning
FeiXiao19
 
Introduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVAIntroduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVA
Robert McDermott
 
Introduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVAIntroduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVA
Robert McDermott
 
Conversational AI with Transformer Models
Conversational AI with Transformer ModelsConversational AI with Transformer Models
Conversational AI with Transformer Models
Databricks
 
A Peek into Google's Edge TPU
A Peek into Google's Edge TPUA Peek into Google's Edge TPU
A Peek into Google's Edge TPU
Koan-Sin Tan
 
Non equilibrium Molecular Simulations of Polymers under Flow Saving Energy th...
Non equilibrium Molecular Simulations of Polymers under Flow Saving Energy th...Non equilibrium Molecular Simulations of Polymers under Flow Saving Energy th...
Non equilibrium Molecular Simulations of Polymers under Flow Saving Energy th...
ORAU
 
Python 3.6 Features 20161207
Python 3.6 Features 20161207Python 3.6 Features 20161207
Python 3.6 Features 20161207
Jay Coskey
 
State of GPT - Microsoft Build 2023 RD.pdf
State of GPT - Microsoft Build 2023 RD.pdfState of GPT - Microsoft Build 2023 RD.pdf
State of GPT - Microsoft Build 2023 RD.pdf
solarobin
 
Deep Learning and Design Thinking
Deep Learning and Design ThinkingDeep Learning and Design Thinking
Deep Learning and Design Thinking
Yen-lung Tsai
 
Object Detection with Transformers
Object Detection with TransformersObject Detection with Transformers
Object Detection with Transformers
Databricks
 
Modeling computer networks by colored Petri nets
Modeling computer networks by colored Petri netsModeling computer networks by colored Petri nets
Modeling computer networks by colored Petri nets
DmitryZaitsev5
 
Better Machine Learning with Less Data - Slater Victoroff (Indico Data)
Better Machine Learning with Less Data - Slater Victoroff (Indico Data)Better Machine Learning with Less Data - Slater Victoroff (Indico Data)
Better Machine Learning with Less Data - Slater Victoroff (Indico Data)
Shift Conference
 
Devoxx traitement automatique du langage sur du texte en 2019
Devoxx   traitement automatique du langage sur du texte en 2019 Devoxx   traitement automatique du langage sur du texte en 2019
Devoxx traitement automatique du langage sur du texte en 2019
Alexis Agahi
 
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Jeongkyu Shin
 
Deep-learning based Language Understanding and Emotion extractions
Deep-learning based Language Understanding and Emotion extractionsDeep-learning based Language Understanding and Emotion extractions
Deep-learning based Language Understanding and Emotion extractions
Jeongkyu Shin
 
Data oriented design and c++
Data oriented design and c++Data oriented design and c++
Data oriented design and c++
Mike Acton
 
self-supervised learning and Bert from a
self-supervised learning and Bert from aself-supervised learning and Bert from a
self-supervised learning and Bert from a
FeiXiao19
 
Deep Dive: Parameter-Efficient Model Adaptation with LoRA and Spectrum
Deep Dive: Parameter-Efficient Model Adaptation with LoRA and SpectrumDeep Dive: Parameter-Efficient Model Adaptation with LoRA and Spectrum
Deep Dive: Parameter-Efficient Model Adaptation with LoRA and Spectrum
Julien SIMON
 
5_RNN_LSTM.pdf
5_RNN_LSTM.pdf5_RNN_LSTM.pdf
5_RNN_LSTM.pdf
FEG
 
Code Evolution Day 2024 = Opening talk: Demystifying LLMs
Code Evolution Day 2024 = Opening talk: Demystifying LLMsCode Evolution Day 2024 = Opening talk: Demystifying LLMs
Code Evolution Day 2024 = Opening talk: Demystifying LLMs
riki_kurniawan
 
RNN is recurrent neural networks and deep learning
RNN is recurrent neural networks and deep learningRNN is recurrent neural networks and deep learning
RNN is recurrent neural networks and deep learning
FeiXiao19
 
Introduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVAIntroduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVA
Robert McDermott
 
Introduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVAIntroduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVA
Robert McDermott
 
Conversational AI with Transformer Models
Conversational AI with Transformer ModelsConversational AI with Transformer Models
Conversational AI with Transformer Models
Databricks
 
A Peek into Google's Edge TPU
A Peek into Google's Edge TPUA Peek into Google's Edge TPU
A Peek into Google's Edge TPU
Koan-Sin Tan
 
Non equilibrium Molecular Simulations of Polymers under Flow Saving Energy th...
Non equilibrium Molecular Simulations of Polymers under Flow Saving Energy th...Non equilibrium Molecular Simulations of Polymers under Flow Saving Energy th...
Non equilibrium Molecular Simulations of Polymers under Flow Saving Energy th...
ORAU
 
Python 3.6 Features 20161207
Python 3.6 Features 20161207Python 3.6 Features 20161207
Python 3.6 Features 20161207
Jay Coskey
 
State of GPT - Microsoft Build 2023 RD.pdf
State of GPT - Microsoft Build 2023 RD.pdfState of GPT - Microsoft Build 2023 RD.pdf
State of GPT - Microsoft Build 2023 RD.pdf
solarobin
 
Deep Learning and Design Thinking
Deep Learning and Design ThinkingDeep Learning and Design Thinking
Deep Learning and Design Thinking
Yen-lung Tsai
 
Object Detection with Transformers
Object Detection with TransformersObject Detection with Transformers
Object Detection with Transformers
Databricks
 
Modeling computer networks by colored Petri nets
Modeling computer networks by colored Petri netsModeling computer networks by colored Petri nets
Modeling computer networks by colored Petri nets
DmitryZaitsev5
 
Better Machine Learning with Less Data - Slater Victoroff (Indico Data)
Better Machine Learning with Less Data - Slater Victoroff (Indico Data)Better Machine Learning with Less Data - Slater Victoroff (Indico Data)
Better Machine Learning with Less Data - Slater Victoroff (Indico Data)
Shift Conference
 
Ad

More from Kobkrit Viriyayudhakorn (20)

Thai E-Voting System
Thai E-Voting System Thai E-Voting System
Thai E-Voting System
Kobkrit Viriyayudhakorn
 
Thai National ID Card OCR
Thai National ID Card OCRThai National ID Card OCR
Thai National ID Card OCR
Kobkrit Viriyayudhakorn
 
Chochae Robot - Thai voice communication extension pack for Service Robot
Chochae Robot - Thai voice communication extension pack for Service RobotChochae Robot - Thai voice communication extension pack for Service Robot
Chochae Robot - Thai voice communication extension pack for Service Robot
Kobkrit Viriyayudhakorn
 
ศักยภาพของ AI สู่โอกาสใหม่แห่งการแข่งขันและความสำเร็จ (Thai AI updates in yea...
ศักยภาพของ AI สู่โอกาสใหม่แห่งการแข่งขันและความสำเร็จ (Thai AI updates in yea...ศักยภาพของ AI สู่โอกาสใหม่แห่งการแข่งขันและความสำเร็จ (Thai AI updates in yea...
ศักยภาพของ AI สู่โอกาสใหม่แห่งการแข่งขันและความสำเร็จ (Thai AI updates in yea...
Kobkrit Viriyayudhakorn
 
How Emoticon Affects Chatbot Users
How Emoticon Affects Chatbot UsersHow Emoticon Affects Chatbot Users
How Emoticon Affects Chatbot Users
Kobkrit Viriyayudhakorn
 
หัวใจของปัญญาประดิษฐ์ (Gradient Descent ทำงานอย่างไร)
หัวใจของปัญญาประดิษฐ์ (Gradient Descent ทำงานอย่างไร)หัวใจของปัญญาประดิษฐ์ (Gradient Descent ทำงานอย่างไร)
หัวใจของปัญญาประดิษฐ์ (Gradient Descent ทำงานอย่างไร)
Kobkrit Viriyayudhakorn
 
Check Raka Chatbot Pitching Presentation
Check Raka Chatbot Pitching PresentationCheck Raka Chatbot Pitching Presentation
Check Raka Chatbot Pitching Presentation
Kobkrit Viriyayudhakorn
 
[Lecture 3] AI and Deep Learning: Logistic Regression (Coding)
[Lecture 3] AI and Deep Learning: Logistic Regression (Coding)[Lecture 3] AI and Deep Learning: Logistic Regression (Coding)
[Lecture 3] AI and Deep Learning: Logistic Regression (Coding)
Kobkrit Viriyayudhakorn
 
[Lecture 4] AI and Deep Learning: Neural Network (Theory)
[Lecture 4] AI and Deep Learning: Neural Network (Theory)[Lecture 4] AI and Deep Learning: Neural Network (Theory)
[Lecture 4] AI and Deep Learning: Neural Network (Theory)
Kobkrit Viriyayudhakorn
 
[Lecture 2] AI and Deep Learning: Logistic Regression (Theory)
[Lecture 2] AI and Deep Learning: Logistic Regression (Theory)[Lecture 2] AI and Deep Learning: Logistic Regression (Theory)
[Lecture 2] AI and Deep Learning: Logistic Regression (Theory)
Kobkrit Viriyayudhakorn
 
ITS488 Lecture 6: Music and Sound Effect & GVR Try out.
ITS488 Lecture 6: Music and Sound Effect & GVR Try out.ITS488 Lecture 6: Music and Sound Effect & GVR Try out.
ITS488 Lecture 6: Music and Sound Effect & GVR Try out.
Kobkrit Viriyayudhakorn
 
Lecture 12: React-Native Firebase Authentication
Lecture 12: React-Native Firebase AuthenticationLecture 12: React-Native Firebase Authentication
Lecture 12: React-Native Firebase Authentication
Kobkrit Viriyayudhakorn
 
Unity Google VR Cardboard Deployment on iOS and Android
Unity Google VR Cardboard Deployment on iOS and AndroidUnity Google VR Cardboard Deployment on iOS and Android
Unity Google VR Cardboard Deployment on iOS and Android
Kobkrit Viriyayudhakorn
 
ITS488 Lecture 4: Google VR Cardboard Game Development: Basket Ball Game #2
ITS488 Lecture 4: Google VR Cardboard Game Development: Basket Ball Game #2ITS488 Lecture 4: Google VR Cardboard Game Development: Basket Ball Game #2
ITS488 Lecture 4: Google VR Cardboard Game Development: Basket Ball Game #2
Kobkrit Viriyayudhakorn
 
Lecture 4: ITS488 Digital Content Creation with Unity - Game and VR Programming
Lecture 4: ITS488 Digital Content Creation with Unity - Game and VR Programming Lecture 4: ITS488 Digital Content Creation with Unity - Game and VR Programming
Lecture 4: ITS488 Digital Content Creation with Unity - Game and VR Programming
Kobkrit Viriyayudhakorn
 
Lecture 2: C# Programming for VR application in Unity
Lecture 2: C# Programming for VR application in UnityLecture 2: C# Programming for VR application in Unity
Lecture 2: C# Programming for VR application in Unity
Kobkrit Viriyayudhakorn
 
Lecture 1 Introduction to VR Programming
Lecture 1 Introduction to VR ProgrammingLecture 1 Introduction to VR Programming
Lecture 1 Introduction to VR Programming
Kobkrit Viriyayudhakorn
 
Thai Word Embedding with Tensorflow
Thai Word Embedding with Tensorflow Thai Word Embedding with Tensorflow
Thai Word Embedding with Tensorflow
Kobkrit Viriyayudhakorn
 
Lecture 3 - ES6 Script Advanced for React-Native
Lecture 3 - ES6 Script Advanced for React-NativeLecture 3 - ES6 Script Advanced for React-Native
Lecture 3 - ES6 Script Advanced for React-Native
Kobkrit Viriyayudhakorn
 
สร้างซอฟต์แวร์อย่างไรให้โดนใจผู้คน (How to make software that people love)
สร้างซอฟต์แวร์อย่างไรให้โดนใจผู้คน (How to make software that people love)สร้างซอฟต์แวร์อย่างไรให้โดนใจผู้คน (How to make software that people love)
สร้างซอฟต์แวร์อย่างไรให้โดนใจผู้คน (How to make software that people love)
Kobkrit Viriyayudhakorn
 
Chochae Robot - Thai voice communication extension pack for Service Robot
Chochae Robot - Thai voice communication extension pack for Service RobotChochae Robot - Thai voice communication extension pack for Service Robot
Chochae Robot - Thai voice communication extension pack for Service Robot
Kobkrit Viriyayudhakorn
 
ศักยภาพของ AI สู่โอกาสใหม่แห่งการแข่งขันและความสำเร็จ (Thai AI updates in yea...
ศักยภาพของ AI สู่โอกาสใหม่แห่งการแข่งขันและความสำเร็จ (Thai AI updates in yea...ศักยภาพของ AI สู่โอกาสใหม่แห่งการแข่งขันและความสำเร็จ (Thai AI updates in yea...
ศักยภาพของ AI สู่โอกาสใหม่แห่งการแข่งขันและความสำเร็จ (Thai AI updates in yea...
Kobkrit Viriyayudhakorn
 
หัวใจของปัญญาประดิษฐ์ (Gradient Descent ทำงานอย่างไร)
หัวใจของปัญญาประดิษฐ์ (Gradient Descent ทำงานอย่างไร)หัวใจของปัญญาประดิษฐ์ (Gradient Descent ทำงานอย่างไร)
หัวใจของปัญญาประดิษฐ์ (Gradient Descent ทำงานอย่างไร)
Kobkrit Viriyayudhakorn
 
Check Raka Chatbot Pitching Presentation
Check Raka Chatbot Pitching PresentationCheck Raka Chatbot Pitching Presentation
Check Raka Chatbot Pitching Presentation
Kobkrit Viriyayudhakorn
 
[Lecture 3] AI and Deep Learning: Logistic Regression (Coding)
[Lecture 3] AI and Deep Learning: Logistic Regression (Coding)[Lecture 3] AI and Deep Learning: Logistic Regression (Coding)
[Lecture 3] AI and Deep Learning: Logistic Regression (Coding)
Kobkrit Viriyayudhakorn
 
[Lecture 4] AI and Deep Learning: Neural Network (Theory)
[Lecture 4] AI and Deep Learning: Neural Network (Theory)[Lecture 4] AI and Deep Learning: Neural Network (Theory)
[Lecture 4] AI and Deep Learning: Neural Network (Theory)
Kobkrit Viriyayudhakorn
 
[Lecture 2] AI and Deep Learning: Logistic Regression (Theory)
[Lecture 2] AI and Deep Learning: Logistic Regression (Theory)[Lecture 2] AI and Deep Learning: Logistic Regression (Theory)
[Lecture 2] AI and Deep Learning: Logistic Regression (Theory)
Kobkrit Viriyayudhakorn
 
ITS488 Lecture 6: Music and Sound Effect & GVR Try out.
ITS488 Lecture 6: Music and Sound Effect & GVR Try out.ITS488 Lecture 6: Music and Sound Effect & GVR Try out.
ITS488 Lecture 6: Music and Sound Effect & GVR Try out.
Kobkrit Viriyayudhakorn
 
Lecture 12: React-Native Firebase Authentication
Lecture 12: React-Native Firebase AuthenticationLecture 12: React-Native Firebase Authentication
Lecture 12: React-Native Firebase Authentication
Kobkrit Viriyayudhakorn
 
Unity Google VR Cardboard Deployment on iOS and Android
Unity Google VR Cardboard Deployment on iOS and AndroidUnity Google VR Cardboard Deployment on iOS and Android
Unity Google VR Cardboard Deployment on iOS and Android
Kobkrit Viriyayudhakorn
 
ITS488 Lecture 4: Google VR Cardboard Game Development: Basket Ball Game #2
ITS488 Lecture 4: Google VR Cardboard Game Development: Basket Ball Game #2ITS488 Lecture 4: Google VR Cardboard Game Development: Basket Ball Game #2
ITS488 Lecture 4: Google VR Cardboard Game Development: Basket Ball Game #2
Kobkrit Viriyayudhakorn
 
Lecture 4: ITS488 Digital Content Creation with Unity - Game and VR Programming
Lecture 4: ITS488 Digital Content Creation with Unity - Game and VR Programming Lecture 4: ITS488 Digital Content Creation with Unity - Game and VR Programming
Lecture 4: ITS488 Digital Content Creation with Unity - Game and VR Programming
Kobkrit Viriyayudhakorn
 
Lecture 2: C# Programming for VR application in Unity
Lecture 2: C# Programming for VR application in UnityLecture 2: C# Programming for VR application in Unity
Lecture 2: C# Programming for VR application in Unity
Kobkrit Viriyayudhakorn
 
Lecture 1 Introduction to VR Programming
Lecture 1 Introduction to VR ProgrammingLecture 1 Introduction to VR Programming
Lecture 1 Introduction to VR Programming
Kobkrit Viriyayudhakorn
 
Lecture 3 - ES6 Script Advanced for React-Native
Lecture 3 - ES6 Script Advanced for React-NativeLecture 3 - ES6 Script Advanced for React-Native
Lecture 3 - ES6 Script Advanced for React-Native
Kobkrit Viriyayudhakorn
 
สร้างซอฟต์แวร์อย่างไรให้โดนใจผู้คน (How to make software that people love)
สร้างซอฟต์แวร์อย่างไรให้โดนใจผู้คน (How to make software that people love)สร้างซอฟต์แวร์อย่างไรให้โดนใจผู้คน (How to make software that people love)
สร้างซอฟต์แวร์อย่างไรให้โดนใจผู้คน (How to make software that people love)
Kobkrit Viriyayudhakorn
 
Ad

Recently uploaded (20)

Safety Innovation in Mt. Vernon A Westchester County Model for New Rochelle a...
Safety Innovation in Mt. Vernon A Westchester County Model for New Rochelle a...Safety Innovation in Mt. Vernon A Westchester County Model for New Rochelle a...
Safety Innovation in Mt. Vernon A Westchester County Model for New Rochelle a...
James Francis Paradigm Asset Management
 
Geometry maths presentation for begginers
Geometry maths presentation for begginersGeometry maths presentation for begginers
Geometry maths presentation for begginers
zrjacob283
 
Simple_AI_Explanation_English somplr.pptx
Simple_AI_Explanation_English somplr.pptxSimple_AI_Explanation_English somplr.pptx
Simple_AI_Explanation_English somplr.pptx
ssuser2aa19f
 
Stack_and_Queue_Presentation_Final (1).pptx
Stack_and_Queue_Presentation_Final (1).pptxStack_and_Queue_Presentation_Final (1).pptx
Stack_and_Queue_Presentation_Final (1).pptx
binduraniha86
 
Principles of information security Chapter 5.ppt
Principles of information security Chapter 5.pptPrinciples of information security Chapter 5.ppt
Principles of information security Chapter 5.ppt
EstherBaguma
 
C++_OOPs_DSA1_Presentation_Template.pptx
C++_OOPs_DSA1_Presentation_Template.pptxC++_OOPs_DSA1_Presentation_Template.pptx
C++_OOPs_DSA1_Presentation_Template.pptx
aquibnoor22079
 
Ch3MCT24.pptx measure of central tendency
Ch3MCT24.pptx measure of central tendencyCh3MCT24.pptx measure of central tendency
Ch3MCT24.pptx measure of central tendency
ayeleasefa2
 
LLM finetuning for multiple choice google bert
LLM finetuning for multiple choice google bertLLM finetuning for multiple choice google bert
LLM finetuning for multiple choice google bert
ChadapornK
 
Medical Dataset including visualizations
Medical Dataset including visualizationsMedical Dataset including visualizations
Medical Dataset including visualizations
vishrut8750588758
 
GenAI for Quant Analytics: survey-analytics.ai
GenAI for Quant Analytics: survey-analytics.aiGenAI for Quant Analytics: survey-analytics.ai
GenAI for Quant Analytics: survey-analytics.ai
Inspirient
 
CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...
CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...
CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...
ThanushsaranS
 
Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...
Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...
Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...
Abodahab
 
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptxmd-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
fatimalazaar2004
 
Ppt. Nikhil.pptxnshwuudgcudisisshvehsjks
Ppt. Nikhil.pptxnshwuudgcudisisshvehsjksPpt. Nikhil.pptxnshwuudgcudisisshvehsjks
Ppt. Nikhil.pptxnshwuudgcudisisshvehsjks
panchariyasahil
 
03 Daniel 2-notes.ppt seminario escatologia
03 Daniel 2-notes.ppt seminario escatologia03 Daniel 2-notes.ppt seminario escatologia
03 Daniel 2-notes.ppt seminario escatologia
Alexander Romero Arosquipa
 
Data Analytics Overview and its applications
Data Analytics Overview and its applicationsData Analytics Overview and its applications
Data Analytics Overview and its applications
JanmejayaMishra7
 
AI Competitor Analysis: How to Monitor and Outperform Your Competitors
AI Competitor Analysis: How to Monitor and Outperform Your CompetitorsAI Competitor Analysis: How to Monitor and Outperform Your Competitors
AI Competitor Analysis: How to Monitor and Outperform Your Competitors
Contify
 
183409-christina-rossetti.pdfdsfsdasggsag
183409-christina-rossetti.pdfdsfsdasggsag183409-christina-rossetti.pdfdsfsdasggsag
183409-christina-rossetti.pdfdsfsdasggsag
fardin123rahman07
 
DPR_Expert_Recruitment_notice_Revised.pdf
DPR_Expert_Recruitment_notice_Revised.pdfDPR_Expert_Recruitment_notice_Revised.pdf
DPR_Expert_Recruitment_notice_Revised.pdf
inmishra17121973
 
Calories_Prediction_using_Linear_Regression.pptx
Calories_Prediction_using_Linear_Regression.pptxCalories_Prediction_using_Linear_Regression.pptx
Calories_Prediction_using_Linear_Regression.pptx
TijiLMAHESHWARI
 
Safety Innovation in Mt. Vernon A Westchester County Model for New Rochelle a...
Safety Innovation in Mt. Vernon A Westchester County Model for New Rochelle a...Safety Innovation in Mt. Vernon A Westchester County Model for New Rochelle a...
Safety Innovation in Mt. Vernon A Westchester County Model for New Rochelle a...
James Francis Paradigm Asset Management
 
Geometry maths presentation for begginers
Geometry maths presentation for begginersGeometry maths presentation for begginers
Geometry maths presentation for begginers
zrjacob283
 
Simple_AI_Explanation_English somplr.pptx
Simple_AI_Explanation_English somplr.pptxSimple_AI_Explanation_English somplr.pptx
Simple_AI_Explanation_English somplr.pptx
ssuser2aa19f
 
Stack_and_Queue_Presentation_Final (1).pptx
Stack_and_Queue_Presentation_Final (1).pptxStack_and_Queue_Presentation_Final (1).pptx
Stack_and_Queue_Presentation_Final (1).pptx
binduraniha86
 
Principles of information security Chapter 5.ppt
Principles of information security Chapter 5.pptPrinciples of information security Chapter 5.ppt
Principles of information security Chapter 5.ppt
EstherBaguma
 
C++_OOPs_DSA1_Presentation_Template.pptx
C++_OOPs_DSA1_Presentation_Template.pptxC++_OOPs_DSA1_Presentation_Template.pptx
C++_OOPs_DSA1_Presentation_Template.pptx
aquibnoor22079
 
Ch3MCT24.pptx measure of central tendency
Ch3MCT24.pptx measure of central tendencyCh3MCT24.pptx measure of central tendency
Ch3MCT24.pptx measure of central tendency
ayeleasefa2
 
LLM finetuning for multiple choice google bert
LLM finetuning for multiple choice google bertLLM finetuning for multiple choice google bert
LLM finetuning for multiple choice google bert
ChadapornK
 
Medical Dataset including visualizations
Medical Dataset including visualizationsMedical Dataset including visualizations
Medical Dataset including visualizations
vishrut8750588758
 
GenAI for Quant Analytics: survey-analytics.ai
GenAI for Quant Analytics: survey-analytics.aiGenAI for Quant Analytics: survey-analytics.ai
GenAI for Quant Analytics: survey-analytics.ai
Inspirient
 
CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...
CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...
CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...
ThanushsaranS
 
Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...
Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...
Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...
Abodahab
 
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptxmd-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
fatimalazaar2004
 
Ppt. Nikhil.pptxnshwuudgcudisisshvehsjks
Ppt. Nikhil.pptxnshwuudgcudisisshvehsjksPpt. Nikhil.pptxnshwuudgcudisisshvehsjks
Ppt. Nikhil.pptxnshwuudgcudisisshvehsjks
panchariyasahil
 
Data Analytics Overview and its applications
Data Analytics Overview and its applicationsData Analytics Overview and its applications
Data Analytics Overview and its applications
JanmejayaMishra7
 
AI Competitor Analysis: How to Monitor and Outperform Your Competitors
AI Competitor Analysis: How to Monitor and Outperform Your CompetitorsAI Competitor Analysis: How to Monitor and Outperform Your Competitors
AI Competitor Analysis: How to Monitor and Outperform Your Competitors
Contify
 
183409-christina-rossetti.pdfdsfsdasggsag
183409-christina-rossetti.pdfdsfsdasggsag183409-christina-rossetti.pdfdsfsdasggsag
183409-christina-rossetti.pdfdsfsdasggsag
fardin123rahman07
 
DPR_Expert_Recruitment_notice_Revised.pdf
DPR_Expert_Recruitment_notice_Revised.pdfDPR_Expert_Recruitment_notice_Revised.pdf
DPR_Expert_Recruitment_notice_Revised.pdf
inmishra17121973
 
Calories_Prediction_using_Linear_Regression.pptx
Calories_Prediction_using_Linear_Regression.pptxCalories_Prediction_using_Linear_Regression.pptx
Calories_Prediction_using_Linear_Regression.pptx
TijiLMAHESHWARI
 

Thai Text processing by Transfer Learning using Transformer (Bert)

  • 1. ประมวลภาษาแบบ Transfer Learning ด้วย Transformers (BERT) Dr. Kobkrit Viriyayudhakorn iApp Technology Co., Ltd.
  • 2. Outlines • ทําไม Transfer Learning ด้วย Transformer ถึงน่าสนใจ? • ประวัติของการประมวลผลข้อความในยุค Deep Learning • Bert: Model Architecture • Bert: Pre-training • Bert: Fine-Tuning • Bert: State of the Art • Better than BERT?
  • 3. ทําไม Transfer Learning ด้วย Transformer ถึงน่าสนใจ? • True-Voice Intent Classification • Thai QA (*Not same test set) 88.967% (BERT, 2019)83.498% (ULMFit,2018) Dr.QA (Bi-LSTM, 2017) 34.0% (Exact Match) iApp QA (BERT, 2019) 45.7% (Exact Match) https://github.com/PyThaiNLP/classification-benchmarks https://ai.iapp.co.th/intent-class https://ai.iapp.co.th/qaNSC2019 QA Data Science & Engineering Workshop 2019: The future of Thai NLP
  • 4. ทําไม Transfer Learning ด้วย Transformer ถึงน่าสนใจ? • Text Classification • Sentiment Analysis • Intent Classification • Any Classifications • Question Answering • Machine Translation • Text Summarization • Name Entity Recognition • Paraphrasing • Natural Language Inference • Coreference Resolution • Sentence Completion • Word sense disambiguation • Language Generation arxiv.org/abs/1910.12840
  • 5. ประวัติการประมวลผลข้อความในยุค Deep Learning • One-hot Encoding • Transfer Learning • Word Global Representation • Word-Embbedding • Word2Vec (Thai2Vec) • Glove • ULMFit (Thai2Fit) • Word Contextual Representation • LSTM+Conv1D • Elmo • Subword Contextual Representations • Transformers • GPT-1, GPT-2 (OpenAI) • BERT (Google) • Tranformer XL (Google) • Xlnet (Google) • XLM (Facebook) • RoBERTa (Facebook) • DistilBERT (Victor Sanh) • CTRL (Saleforce) • Albert (Google)
  • 6. One-hot Encoding Word Sequence One-hot VectorText I really love my dog [ 4 <OOV> 2 1 3 ]
  • 7. One-hot Encoding • # Dimension = # Vocabulary • English Languages: 25,000 (without NE) – 300,000 (with most NE) vocabulary. • Curse of dimensionality • Too sparse dimensional spaces => Never have enough data points => Can not learn anything. • Get low accuracy, Process too long.
  • 8. Transfer Learning in NLP Pretraining Fine Tuning Decision Model Universal Language Model • Create Fundamental Language Model • Epic Corpus • Wikipedia • No Label Need • 5-7 days on TPUv3 Contextual Language Model • Adjust Weight for Specific Task • Small Corpus • True-Voice Intent Class • Label data is need • 1 hour on TPUv3 ML Model • Word-Embedding: LSTM • Transformers: Dense
  • 10. Word-Embedding • Libraries • Word2Vec (Thai2Vec) • Glove • ULMFit (Thai2Fit) • FastText • Context Independent • 1 Word = 1 Global Representation • Does not care sequence in Training • Word Arithmetic Pretraining (Embedding) Fine Tuning Word2Vec (Thai2Vec) Glove ULMFit (Thai2Fit) FastText RNN GRU LSTM Conv1D Dense SVM SVR Logit Softmax … Decision Model X Static Word Vectors Text https://towardsdatascience.com/introduction-to-word-embedding-and-word2vec-652d0c2060fa
  • 12. Elmo • Context Dependent = Need whole input sentence. • 1 Word = Many Representations • He went to the prison cell with his cell phone to extract blood cell samples from inmates. • Bi-LSTM + Character-based Pretraining Fine Tuning Elmo Model Logit Softmax … Decision Model Contextual Word Vectors X Text https://arxiv.org/abs/1802.05365
  • 13. Bert • Using Transformers • 1 Subword = Many Contextual Representations (Aimed to reduce vocabulary size) • Byte-Pair Encoding (BPE) • Universal Languages • Don’t care on the splitting correctness (like in Thai, China) • Similar to TCC • https://github.com/bheinzerlin g/bpemb Pretraining Fine Tuning Bert Logit Softmax … Decision Model Bert Fine Tuned Weight with Label Data Language Model Weight Weight for Downstream Task https://github.com/google-research/bert
  • 14. Bert: Model Architecture Equal to Open AI Transformer • 12 Blocks (Heads) • 768 Hidden Unit • 12 Attention Heads • 110M parameters State of the Art Model • 24 Blocks • 1024 Hidden Unit • 16 Attention Heads • 340M parameters https://jalammar.github.io/illustrated-bert/ https://github.com/google-research/bert
  • 15. Bert: Model Architecture Size of hidden unit = 768 Sequence Max Length = 512 Units <1 Head> 12 Layers/Blocks https://jalammar.github.io/illustrated-bert/
  • 23. Bert: Self-Attention in Matrix Styles https://jalammar.github.io/illustrated-bert/
  • 26. Bert: Self-Attention with Multi-head http://jalammar.github.io/illustrated-transformer/
  • 27. Bert: Self-Attention with Multi-head http://jalammar.github.io/illustrated-transformer/
  • 28. Bert: Self-Attention with Multi-head http://jalammar.github.io/illustrated-transformer/
  • 29. Bert: Self-Attention with Multi-head Single Head Multi Head http://jalammar.github.io/illustrated-transformer/
  • 32. Bert: Model Summary #2 http://jalammar.github.io/illustrated-transformer/
  • 33. Bert: Using Layers Output https://jalammar.github.io/illustrated-bert/
  • 35. Bert: Pretraining #1 • Step 1: BERT ไร้ Weight ต้องหาความรู้ด้านภาษาก่อน ไม่มี Weight ไม่มี Weight Pretraining Fine Tuning Decision Model Bert ไร้ซึFงสมองหรือ สิFงใดๆทัIงสิIน ?
  • 36. ตัIงใจเรียน Bert: Pretraining #2 • Step 2: BERT นําเข้าข้อมูล Wikipedia + Google Book ภาษาอังกฤษ แล้ว Challenge ตัวเอง (1) หาคําทีFหายไป โดยการใส่ [MARK] แบบสุ่ม [CLS] [MASK] man [MASK] up , put his [MASK] on phil [MASK] ##mon ' s head [SEP] [SEP] the man jumped up , put his basket on phil ##am ##mon ' s head Input to bert Output from bert … the man jumped up , put his basket on philammon ' s head … Preprocessing + Random marking Weight is learning Learn and Adapt from correct Answer Missing (Marking) Word Prediction Text Representation Vector Output from model Pretraining Fine Tuning Decision Model
  • 38. Bert: Pretraining #4 (Real Example) • Input: this is one of the most amazing stories i have ever seen . < br / > < br / > if this film had been directed by larry clark , then this story about a school shooting probably would have been shown through the eyes of the killer and whatever led that person to go insane in the first place . < br / > < br / > instead , the plot focuses mainly on the aftermath of a school shooting , and how it effect ##ed the victims who survived . < br / > < br / > i had seen busy phillips in other films before , but her performance in this movie is by far , her best . the … (512 chopped) • INFO:tensorflow:tokens: [CLS] this is one of the most amazing stories i have ever seen . < br / > < br / > if this film had been directed by larry clark , then this story about a school shooting probably would have been shown through the eyes of the killer and whatever led that person to go insane in the first place . < br / > < br / > instead , the plot focuses mainly on the aftermath of a school shooting , and how it effect ##ed the victims who survived . < br / > < br / > i had seen busy phillips in other films before , but her performance in this movie is by far , her best . the [SEP] [SEP] • INFO:tensorflow:input_ids: 101 2023 2003 2028 1997 1996 2087 6429 3441 1045 2031 2412 2464 1012 1026 7987 1013 1028 1026 7987 1013 1028 2065 2023 2143 2018 2042 2856 2011 6554 5215 1010 2059 2023 2466 2055 1037 2082 5008 2763 2052 2031 2042 3491 2083 1996 2159 1997 1996 6359 1998 3649 2419 2008 2711 2000 2175 9577 1999 1996 2034 2173 1012 1026 7987 1013 1028 1026 7987 1013 1028 2612 1010 1996 5436 7679 3701 2006 1996 10530 1997 1037 2082 5008 1010 1998 2129 2009 3466 2098 1996 5694 2040 5175 1012 1026 7987 1013 1028 1026 7987 1013 1028 1045 2018 2464 5697 8109 1999 2060 3152 2077 1010 2021 2014 2836 1999 2023 3185 2003 2011 2521 1010 2014 2190 1012 1996 102 • INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 • INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 c
  • 39. Bert: Pretraining #5 (Position Encoding) Using SIN() and COS() for telling positions http://jalammar.github.io/illustrated-transformer/
  • 40. Bert: Pretraining #6 (Position Encoding) http://jalammar.github.io/illustrated-transformer/
  • 41. Bert: Pretraining #6 • Step 3: (2) ให้ประโยคแรกและประโยคถัดไป แล้วให้ BERT บอกว่าเป็นประโยคต่อกันหรือไม่ [CLS] the man went to the store [SEP] he bought a gallon of milk [SEP] Input to bert … the man went to the store. he bought a gallon of milk … Preprocessing Weight is learning Learn from correct Answer Predict Next Sentence Task Output from bert Text Representation Vector Output from model 1 = Yes Pretraining Fine Tuning Decision Model ตัIงใจเรียน
  • 42. Bert: Pretraining #7 • Step 4: BERT มี Language Model แล้ว มี Weight มี Weight Bert เรียนจบแล้ว เย้ๆ Pretraining Fine Tuning Decision Model
  • 43. Bert: Pretraining #8 มี Weight มี Weight • หรือ เราสามารถ Download Weight ของ Bert ทีF Pre-train แล้วมาจาก https://github.com/google-research/bert (English) หรือ https://github.com/ThAIKeras/bert (Thai) ได้เลย เอาไปเลยน้องพีF Bert พูดภาษาคนรู้เรืFอง แล้วววววว เอาไปเลยพีFน้อง Pretraining Fine Tuning Decision Model
  • 44. มี Weight มี Weight Bert มาสมัครงานครับ
  • 45. มี Weight มี Weight ตอนนีIแปลงประโยคให้เป็นตัวเลข Vector ได้ครับ ตอนนีIคุณทําอะไรได้บ้าง
  • 46. ความสามารถของ BERT หลัง Pre-train มี Weight มี Weight • หากใส่ข้อความลงไป ก็จะ Subword Representation ใน Context ของประโยคนัIนออกมา I mean, part of the beauty of me is that I’m very rich. [CLS] I mean part of the beaut ##y of me is that I ##m very rich [SEP] [SEP] Preprocessing Bert (Base) Output Parameters= 1 Final Layer x 1 Batch Size x 12 Attention Heads x 512 max-length x 768 hidden features x = 4,718,592 output parameters / sentences
  • 47. มี Weight มี Weight อะไรคือ Sentimental Analysis หรอครับ?? แล้วคุณทําเป็นพวก Sentimental Analysis แยกแยะข้อความว่าเป็น บวก เป็น ลบได้หรือยัง
  • 48. มี Weight มี Weight ว๊อทททททท ยังใช้งานจริงไม่ได้ คุณยังไม่มีประสบการณ์ทํางานนะ ฝึกงานฟรีไปก่อนแล้วกัน
  • 49. BERT ฝึกงาน #1 มี Weight มี Weight • สมมุติเราอยากให้ BERT ทํางานเป็นคนแยกแยะอารมณ์ของข้อความให้กับเรา (Pos, Neg, Neu) • เราสามารถเขียน Classification Model ของเราต่อท้ายผลของ Bert จะเป็นอะไรก็ได้ ตามใจเรา Logit, DNN, LSTM ก็ได้ I mean part of the beauty of me is that I’m very rich. [CLS] I mean part of the beaut ##y of me is that I ##m very rich [SEP] [SEP] Preprocessing Sentimental Classification (Logit, DNN or LSTM or Anythings…) เขียน Model ต่อท้ายของผลของ BERT Sentimental คืออะไรฟร่ะ งง งวย ไม่มี ความรู้ อารายเยย Pretraining Fine Tuning Decision Model
  • 50. BERT ฝึกงาน #2 • หา Dataset ทีFมี Label Class (Pos, Neg, Neu) มาสอนงาน (Fine-tune) น้อง Bert • อาทิเช่น • Sentiment140 (English) (https://www.kaggle.com/kazanova/sentiment140) • Wongnai Corpus (1-5) (https://github.com/wongnai/wongnai-corpus) • อย่างน้อย 10,000 แถวขึIน (ยิFงเยอะ ยิFงดี) Pretraining Fine Tuning Decision Model Sentiment 140
  • 51. BERT ฝึกงาน #3 (Fine Tuning) [CLS] is upset that he can't update his face book by texting it [SEP] [SEP] Our’s Sentimental Classification (Logits or Dense or LSTM or Any models) Weight Learning เขียน Model ต่อท้ายของผล ของ BERT Sentiment 140 is upset that he can't update his Facebook by texting it...+ SAD Happy? Learn and Adapt from correct Answer It is actually SAD, Go re-learn. Preprocessing Pretraining Fine Tuning Decision Model Weight Learning Weight Frozen Weight Frozen
  • 52. BERT ฝึกงาน #4 (Fine Tuning) • และแล้วเราก็ได้ Classifier ทีFสามารถทํานายอารมณ์ได้จาก BERT [CLS] is upset that he can't update his face book by texting it [SEP] [SEP] Our’s Sentimental Classification (DNN or LSTM or Any model you want) Weight Learned SADOutput Input ผ่าน PRO แล้วจ้า Pretraining Fine Tuning Decision Model Weight Learned Weight Frozen Weight Frozen
  • 53. Bert: Finetuning in several tasks https://github.com/google-research/bert