VNLawBERT: A Vietnamese Legal Answer Selection Approach Using BERT Language Model
Son Nguyen
Ho Chi Minh City University of Science
Abstract— Recently, with the development of NLP (Natural Language Processing) methods and Deep Learning, several solutions to problems in question-answering systems have achieved superior results. However, there are few solutions for question-answering systems in the Vietnamese legal domain. In this research, we propose an answer selection approach by fine-tuning the BERT language model on our Vietnamese legal question-answer pair corpus, achieving an 87% F1-score. We further pre-train the original BERT model on a Vietnamese legal domain-specific corpus and achieve a higher F1-score than the original BERT, 90.6%, on the same task, which reveals the potential of a new pre-trained language model in the legal area.

Keywords— natural language processing, question answering, answer selection, language model, legal document.

I. INTRODUCTION

Asking questions about laws is a clear need in any country, but answering them is not easy: an enormous number of laws have been enacted over the last decades, and understanding them requires knowledge of the legal domain. Building a question-answering system for the legal domain is therefore essential. It not only helps laypeople find answers to their questions based on current legal documents but also supports lawyers in their work.

A question-answering system consists of several parts; one of them is answer selection, which aims to choose the most relevant candidates among retrieved documents by measuring the relevance between a question and each retrieved document. Modern language models have proved to give a great contextual representation of words in sentences. Their impressive results on downstream tasks like sentence-pair classification suggest a promising approach to measuring relevance in the answer selection task.

BERT[1] is a language model that was pre-trained on a large general corpus and achieved state-of-the-art results in several NLP tasks. It is interesting to investigate the effect of applying BERT with a sentence-pair classification task to the answer selection problem in a Q-A system, especially in Vietnamese. Therefore, we introduce VNLawBERT, an approach to selecting relevant candidates by fine-tuning BERT on our question-answer pair dataset. Additionally, we further pre-train BERT on a legal domain-specific corpus to achieve higher performance. Our contributions are:

• We construct a training dataset and a hand-annotated testing dataset for the Vietnamese answer selection task.

• We propose a solution to the answer selection problem in the Vietnamese legal question-answering system.

• We compile a large corpus of text that represents Vietnamese legal documents.

• We achieve a higher-performance model (VNLawBERT) and evaluate and compare it with the original BERT model on the Vietnamese answer selection task.

II. RELATED WORKS

In this section, we present some existing Q-A (Question-Answering) systems, especially for Vietnamese, and answer selection methods.

Q-A systems are divided into two types: knowledge-based and retrieval-based.

A knowledge-based system typically builds a huge graph of linked entities. Dai Quoc Nguyen, Dat Quoc Nguyen, and Son Bao Pham[2] introduced an ontology-based Q-A system for the Vietnamese language; it includes a question analysis module and an answer extraction module. Their experimental results were promising: they achieved an accuracy of 95% in the Question Analysis module and 70% in the Answer Retrieval module.

On the other hand, retrieval-based systems try to retrieve relevant documents and extract the answer from those documents. Huu-Thanh Duong and Bao-Quoc Ho[3] proposed a Q-A system for Vietnamese legal documents. They applied similarity calculation to select and extract the answer from relevant documents retrieved from Lucene, achieving a precision of approximately 70% in their experiment. However, the answer selection method in this system relies on calculating tf-idf similarity scores, so it cannot capture the contextual relationship between words; our approach addresses this problem using a contextual language model like BERT.

Jamshid Mozafari, Afsaneh Fatemi, and Mohammad Ali Nematbakhsh[4] made use of the BERT language model in their proposed answer selection method. Their strong results showed that a pre-trained language model is an essential tool in NLP tasks such as answer selection.

In the evolution of NLP, traditional language models like word2vec[5][6] convert word tokens into vectors in a non-contextual way, in which a word is represented by a single vector in the vocabulary. This is not suitable in some cases; for example, the word "bank" in the sentence "My bank was robbed" and in "I am sitting at the bank of the river" has the same vector representation. Recent unsupervised pre-trained language models like BERT, ELMo[7], and XLNet[8] address this problem by contextually embedding each word token based on its surroundings. In this paper, we focus on BERT, a language model pre-trained on the general-text BooksCorpus and English Wikipedia, which significantly improves the performance of many NLP tasks such as sentence-pair classification, question answering, and language inference. Many studies have also shown that further pre-training BERT, or completely pre-training the model from scratch, on domain-specific corpora can significantly improve its performance. SciBERT[9], BioBERT[10], ClinicalBERT[11], and FinBERT[12] are examples; they all performed better than the original BERT on domain-specific tasks.

To the best of our knowledge, our work is the first to propose an answer selection approach using the BERT language model for the Vietnamese legal Q-A system.

III. BACKGROUND AND DATASETS

A. Background & Problem statement

1) Vietnamese legal documents: In this section, we present an overview of Vietnamese legal documents and their structure. Legal documents in Vietnam are divided into the following categories:

• Constitution
• Code (of Law)
• Ordinance
• Order
• Resolution
• Joint Resolution
• Decree
• Decision
• Circular
• Joint Circular
• Directive

The content of a legal document has different levels: chapter, section, article, paragraph, and point. Each legal document has its own validity; when the government enacts an update of a certain document, the existing one becomes expired or partially expired. Lawyers give their advice based on articles of valid or partially valid documents: they typically quote some articles from legal documents and conclude an answer to the question or situation.

2) Question-Answering system and Problem Statement: In this research, we focus on the retrieval-based question-answering system, whose architecture consists of four parts: Question Processing, Document Retrieval, Answer Selection, and Answer Extraction. The question processing part detects the question's type and generates a query from the question. The document retrieval part takes that query and retrieves relevant documents. Those documents are then evaluated by the answer selection part to pick the most relevant candidates. Finally, the answer extraction part processes the candidates to find the exact answer to the question. Fig. 1 shows the general architecture of a retrieval-based Q-A system.

Fig. 1 The general architecture of a retrieval-based question-answering system.

To address the answer selection problem, we aim to select relevant candidates among the retrieved documents; a retrieved document can be an article, passage, or sentence of a legal document. To this end, we formulate a classification task whose input consists of two sequences representing a question and a retrieved document, and whose output is a confirmation of whether the document is a candidate answer for the question or not.

B. Question-Answering Classification Dataset

To build the dataset for our answer selection task, we need real-world legal questions and relevant answers. We chose Thu Ky Luat's website[13] to extract the data; Thu Ky Luat is a law consulting company that provides lawyers' advice on users' questions. We ran an extractor using Scrapy[14] and acquired about 250,000 question-answer pairs across the 27 domains shown in Table I.

Table I. QUESTION-ANSWER DOMAINS LIST

No. | Name of domain | English name of domain
1 | Doanh nghiệp | Enterprise
2 | Đầu tư | Investment
3 | Thương mại | Commerce
4 | Xuất nhập khẩu | Import and export
5 | Tiền tệ-Ngân hàng | Monetary-Banking
6 | Thuế-Phí-Lệ Phí | Tax-Fee-Charge
7 | Chứng khoán | Stock
8 | Bảo hiểm | Insurance
9 | Kế toán-Kiểm toán | Accounting-Auditing
10 | Lao động-Tiền lương | Labor-Salary
11 | Bất động sản | Real estate
12 | Dịch vụ pháp lý | Legal service
13 | Sở hữu trí tuệ | Intellectual property
14 | Bộ máy hành chính | Bureaucracy
15 | Vi phạm hành chính | Administrative violation
16 | Trách nhiệm hình sự | Criminal responsibility
17 | Thủ tục Tố tụng | Procedures
18 | Tài chính nhà nước | State finance
19 | Xây dựng-Đô thị | Construction-Urban
20 | Giáo dục | Education
21 | Tài nguyên-Môi trường | Resources-Environment
22 | Thể thao-Y tế | Sports-Health
23 | Quyền dân sự | Civil rights
24 | Văn hóa-Xã hội | Sociocultural
25 | Công nghệ thông tin | Information technology
26 | Giao thông-Vận tải | Transportation
27 | Lĩnh vực khác | Other domains
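The sentence-pair framing described above, a question paired with a retrieved document and labeled 1 (candidate) or 0 (non-candidate), can be sketched in a few lines of Python. This is only a toy illustration: the data below is made up, and whitespace splitting stands in for BERT's WordPiece tokenizer.

```python
# Toy sketch: turn question-answer pairs into labeled sentence-pair
# examples for answer selection. Whitespace "tokens" approximate the
# real WordPiece tokenization; the data is invented for illustration.

MAX_TOKENS = 512  # BERT-Base maximum sequence length

def make_example(question: str, document: str, label: int) -> dict:
    """Pair a question with a retrieved document; truncate the document
    so that question + document fit within MAX_TOKENS tokens."""
    q_tokens = question.split()
    d_tokens = document.split()
    budget = MAX_TOKENS - len(q_tokens) - 3  # room for [CLS] and 2x [SEP]
    return {
        "text_a": question,
        "text_b": " ".join(d_tokens[:budget]),
        "label": label,  # 1 = candidate answer, 0 = non-candidate
    }

qa_pairs = [
    ("What is a computer virus?",
     "A computer virus is a program capable of spreading itself..."),
    ("What does HIV positive mean?",
     "HIV positive means a confirmed test result of infection..."),
]

examples = []
for i, (question, answer) in enumerate(qa_pairs):
    examples.append(make_example(question, answer, 1))  # true answer
    for j, (_, other_answer) in enumerate(qa_pairs):
        if j != i:  # answers belonging to other questions
            examples.append(make_example(question, other_answer, 0))

print(len(examples))  # 4 examples: 2 positive, 2 negative
```

In the paper's actual datasets, negatives are not arbitrary: training non-candidates are drawn by Elasticsearch as similar-content sequences, and testing non-candidates are hand-checked.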
Those questions and answers are used to create the training and testing datasets for our answer selection task. For each question, we pair it with the correct answer as a candidate (labeled "1") and with other answers as non-candidates (labeled "0"). The total number of tokens in a question and a candidate/non-candidate is at most 512.

The training dataset is generated automatically. For non-candidate examples, we use Elasticsearch[15] to find sequences from our question-answer data whose content is similar to the candidate. In this dataset, each question has one candidate and two non-candidates.

To make the evaluation accurate, the testing dataset is handpicked by us to make sure non-candidates are correctly labeled. In our testing dataset, each question has one candidate and four non-candidates.

The sizes of the training and testing datasets are described in Table II. An example from our datasets is shown below:

1) Candidate example

• Question: Vi rút máy tính là gì? (What is a computer virus?)

• Candidate: Căn cứ pháp lý: Điều 4 Luật Công nghệ thông tin 2006 Vi rút máy tính là chương trình máy tính có khả năng lây lan, gây ra hoạt động không bình thường cho thiết bị số hoặc sao chép, sửa đổi, xóa bỏ thông tin lưu trữ trong thiết bị số. (Legal basis: Article 4 of the Law on Information Technology 2006. Computer virus means a computer program capable of spreading, causing abnormal operation of digital equipment, or copying, modifying, or deleting information stored in digital equipment.)

• Label: 1

2) Non-candidate example

• Question: Vi rút máy tính là gì? (What is a computer virus?)

• Non-candidate: Căn cứ pháp lý: Điều 2 Luật phòng, chống nhiễm vi rút gây ra hội chứng suy giảm miễn dịch mắc phải ở người (HIV/AIDS) 2006 HIV dương tính là kết quả xét nghiệm mẫu máu, mẫu dịch sinh học của cơ thể người đã được xác định nhiễm HIV. (Legal basis: Article 2 of the Law on Prevention and Control of HIV/AIDS 2006. HIV positive means the test result of a human blood sample or biological fluid sample confirmed to be infected with HIV.)

• Label: 0

Table II. QUESTION-ANSWER DATASET SIZE

| Number of questions | Number of domains | Number of examples | Disk size
Training | 68,174 | 27 | 204,522 | 266 MB
Testing | 350 | 27 | 1,750 | 2.8 MB

C. Pre-train Dataset

We also prepare a dataset used to further pre-train BERT in order to give the model more legal domain-specific knowledge. Legal documents should be accurate and come from a trustworthy source; to this end, we chose the Vietnam Legal Documents National Database's website[16] to extract legal documents. We obtained 23,254 valid or partially valid legal documents and created a 320 MB cased dataset.

IV. METHODOLOGY

In this section, we describe the methods applied in fine-tuning BERT on our answer selection task and in further pre-training it using the datasets from Section III. We use a Colab Pro instance along with a Cloud TPU from Google to run our experiments.

A. Fine-tuning BERT for the answer selection task

The answer selection task's goal is to decide whether a sequence is, or contains, the answer to the question. We build a sentence-pair classifier that uses BERT as the initial checkpoint and fine-tune it on our dataset.

We use the last checkpoint of BERT-Base, Multilingual Cased. We set a maximum sequence length of 512 tokens, a batch size of 128, and a learning rate of 2e-5, and train for three epochs.

We found small randomness in the results, so we run the process three times with the same model configuration and report the average result.

B. Pre-training BERT with legal data

We further pre-trained BERT with our legal documents dataset to give the model more knowledge of the Vietnamese legal domain. Starting from the last checkpoint of Multilingual Cased BERT-Base, we use BERT's original MLM (Masked Language Model) and NSP (Next Sentence Prediction) tasks to further pre-train the model on our legal documents dataset.

We set a maximum sequence length of 512 tokens, a batch size of 128, and a learning rate of 2e-5 (recommended by the BERT documentation), and trained for 20 epochs (181,765 steps) over 2 days to obtain a new pre-trained model called VNLawBERT. We then compare the results of our new model with BERT-Base on our answer selection task.

V. EXPERIMENTS AND RESULTS

We present the results of the models in this section. We use precision, recall, and F1-score as our metrics; all metrics are calculated on the positive class.

A. Results

The results in Table III indicate that BERT can classify candidates and non-candidates quite well, with an F1-score of about 87%.

However, Table III also shows that the legal domain-specific model VNLawBERT performs better than BERT-Base on all metrics, improving the F1-score by 3.5% over BERT-Base. This is because the context of words in legal documents differs considerably from the context in a general-domain corpus like Wikipedia; the model needs a further pre-training process to understand the contexts of legal problems.

Table III. PERFORMANCE OF FINE-TUNED BERT-Base AND VNLawBERT

| Precision | Recall | F1-Score
BERT-Base | 0.804 | 0.952 | 0.872
VNLawBERT | 0.860 | 0.958 | 0.906
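The positive-class precision, recall, and F1-score used as metrics here can be computed as in the following minimal sketch; the predictions below are toy values, not the paper's actual outputs.

```python
# Minimal sketch: precision, recall, and F1 on the positive class
# (label 1 = candidate), as used to evaluate the answer selector.
# The gold labels and predictions below are invented toy values.

def positive_metrics(gold, pred):
    """Precision, recall, and F1 where label 1 is the positive class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Each test question has 1 candidate (1) and 4 non-candidates (0).
gold = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]
pred = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]

p, r, f1 = positive_metrics(gold, pred)
print(round(p, 3), round(r, 3), round(f1, 3))  # -> 0.667 1.0 0.8
```

Computing the metrics on the positive class only is the natural choice here: with one candidate and four non-candidates per test question, always predicting 0 would score high overall accuracy yet zero positive recall.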
B. Additional experiments

We also fine-tune VNLawBERT models on different training datasets to examine the effect of the training size and of the questions' domains on the results.

1) Effect of the question's domain: Since our testing dataset consists of questions from several domains, we hypothesize that training on a multi-domain dataset makes the model perform better than training on a dataset covering only one or two domains. We test this hypothesis in this section.

We fine-tune a model on a training dataset consisting only of examples from specific domains. In this case, we use the "Thủ tục tố tụng" (Procedures) and "Thuế-Phí-Lệ Phí" (Tax-Fee-Charge) domains, since they contribute the fewest examples to our testing dataset; this yields 40,000 examples. We fine-tune another model on a dataset of the same size constructed from questions in all domains. The performance on our answer selection task in Table IV shows that the model with multi-domain knowledge performs better.

Table IV. PERFORMANCE OF 2-DOMAINS VNLawBERT AND MULTI-DOMAINS VNLawBERT

| Precision | Recall | F1-Score
2-domains VNLawBERT | 0.709 | 0.960 | 0.816
VNLawBERT | 0.749 | 0.960 | 0.841

2) Effect of the training size: In this experiment, we evaluate the model with different dataset sizes to explore the number of examples needed to train the model sufficiently. We use the methods described in Section III to build three datasets of different sizes: 20%, 52%, and 100% of our training dataset, respectively.

The results shown in Table V indicate that the more examples we have, the more accurate the model is. In our experience, using more than 204,000 examples does not improve the performance of the model; therefore, a training dataset of 204,000 examples (8,000 examples from each domain) is sufficient for the model to perform at its best.

Table V. PERFORMANCE OF VNLawBERT WITH DIFFERENT TRAINING SIZES

| % of our training dataset | Precision | Recall | F1-Score
40,500 examples | 20 | 0.749 | 0.960 | 0.841
107,000 examples | 52 | 0.818 | 0.976 | 0.890
204,000 examples | 100 | 0.860 | 0.958 | 0.906

VI. CONCLUSION

In this paper, we address the answer selection problem by fine-tuning the BERT language model on our question-answer dataset. We also reveal the potential of a new domain-specific model for the legal area, since our VNLawBERT model outperforms the original BERT model on our answer selection task. With this research, we hope researchers will experiment with the model on other tasks in the legal domain, such as Named Entity Recognition, Reference Extraction, and Question Classification, to build a legal domain-specific language model, which is also our future work.

ACKNOWLEDGMENT

This research is funded by the University of Science, VNU-HCM under grant number CNTT 2020-14.

REFERENCES

[1] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova, "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding", in Proceedings of NAACL, pages 4171-4186, 2019.
[2] Dai Quoc Nguyen, Dat Quoc Nguyen, Son Bao Pham, "A Vietnamese Question Answering System", International Conference on Knowledge and Systems Engineering, 2009.
[3] Huu-Thanh Duong, Bao-Quoc Ho, "A Vietnamese Question Answering System in Vietnam's Legal Documents", IFIP International Conference on Computer Information Systems and Industrial Management, 2016.
[4] Jamshid Mozafari, Afsaneh Fatemi, Mohammad Ali Nematbakhsh, "BAS: An Answer Selection Method Using BERT Language Model", 2019.
[5] Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean, "Efficient Estimation of Word Representations in Vector Space", Proceedings of the International Conference on Learning Representations (ICLR 2013), 2013.
[6] Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean, "Distributed Representations of Words and Phrases and their Compositionality", Advances in Neural Information Processing Systems 26, 2013.
[7] Matthew Peters et al., "Deep Contextualized Word Representations", in Proceedings of NAACL, 2018.
[8] Zhilin Yang et al., "XLNet: Generalized Autoregressive Pretraining for Language Understanding", Conference on Neural Information Processing Systems (NeurIPS 2019), 2019.
[9] Iz Beltagy, Kyle Lo, Arman Cohan, "SciBERT: A Pretrained Language Model for Scientific Text", Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019.
[10] Jinhyuk Lee et al., "BioBERT: a pre-trained biomedical language representation model for biomedical text mining", Bioinformatics, Volume 36, Issue 4, pages 1234-1240, 2020.
[11] Kexin Huang, Jaan Altosaar, Rajesh Ranganath, "ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission", ACM Conference on Health, Inference, and Learning, 2020.
[12] Yi Yang, Mark Christopher Siy UY, Allen Huang, "FinBERT: A Pretrained Language Model for Financial Communications", 2020.
[13] Thu Ky Luat's website [Online] https://nganhangphapluat.thukyluat.vn/
[14] Scrapy [Online] https://scrapy.org/
[15] Elasticsearch [Online] https://www.elastic.co/
[16] Vietnam Legal Documents National Database's website [Online] http://vbpl.vn/pages/portal.aspx