Data clustering, data deduplication, and data visualization. Advanced techniques are used to encode free-format articles and cluster the data with pre-trained LLM models.
2023 Supervised Learning for Orange3 from scratch (FEG)
This document provides an overview of supervised learning and decision tree models. It discusses supervised learning techniques for classification and regression. Decision trees are explained as a method that uses conditional statements to classify examples based on their features. The document reviews node splitting criteria like information gain that help determine the most important features. It also discusses evaluating models for overfitting/underfitting and techniques like bagging and boosting in random forests to improve performance. Homework involves building a classification model on a healthcare dataset and reporting the results.
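The information-gain splitting criterion mentioned in the summary can be sketched in a few lines of Python; the patient labels and the binary split below are made-up toy data, not taken from the homework dataset:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(labels, groups):
    """Entropy reduction from splitting `labels` into `groups`."""
    total = len(labels)
    weighted = sum(len(g) / total * entropy(g) for g in groups)
    return entropy(labels) - weighted

# Toy example: 10 patients labelled sick/healthy, split on a binary feature.
labels = ["sick"] * 5 + ["healthy"] * 5
split = [["sick"] * 4 + ["healthy"] * 1,   # feature = yes
         ["sick"] * 1 + ["healthy"] * 4]   # feature = no
print(round(information_gain(labels, split), 3))  # → 0.278
```

A decision tree would prefer the feature with the largest gain; here the split reduces entropy from 1.0 to about 0.722 bits.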
This document provides an overview of unsupervised learning techniques including k-means clustering and association rule mining. It begins with introductions to the speaker and tutorial topics. It then contrasts supervised vs unsupervised learning, describing how k-means is used for clustering without labels and how association rules can discover relationships between items. The document provides examples of applying these techniques in domains like retail, sports, email marketing and healthcare. It also includes visualizations and discusses important concepts for k-means like data transformation and for association rules like support, confidence and lift. Homework questions are asked about preparing data for these algorithms in Orange.
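The support, confidence, and lift concepts named above can be computed directly from transaction data; the following sketch uses a hypothetical five-basket retail example (the item names and counts are invented for illustration):

```python
# Support, confidence, and lift for the rule {bread} -> {butter},
# computed over a toy set of retail transactions (hypothetical data).
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]
n = len(transactions)
support_a  = sum("bread" in t for t in transactions) / n               # P(bread)
support_b  = sum("butter" in t for t in transactions) / n              # P(butter)
support_ab = sum({"bread", "butter"} <= t for t in transactions) / n   # P(bread and butter)

confidence = support_ab / support_a   # P(butter | bread) = 0.75
lift = confidence / support_b         # 1.25 > 1: a positive association
print(support_ab, confidence, lift)
```

A lift above 1 means the two items co-occur more often than independence would predict, which is what makes the rule interesting for, say, store layout or promotions.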
202312 Exploratory Data Analysis and Visualization (English version) (FEG)
This document provides an overview of exploratory data analysis (EDA) and visualization techniques that can be performed before building a machine learning model. It introduces the Iris dataset as an example and outlines the key steps of EDA, including loading the data, examining correlations, creating scatter plots, and generating distribution and box plots to understand feature statistics. As homework, students are asked to explore another dataset with a numeric target feature called "housing.tab" and explain the visualizations.
202312 Exploratory Data Analysis and Visualization (FEG)
This document provides a tutorial on data visualization and analysis using Orange 3. It discusses different types of charts like pie charts, line charts, histograms, bar charts, scatter plots, box plots, and pivot tables. It demonstrates how to visualize survival rates from the Titanic dataset based on features like sex, passenger class, age, and fare paid. Key findings are that women and higher class passengers had higher survival rates, and survival rates also depended on combinations of these features.
Transfer learning (TL) is a research problem in machine learning (ML) that focuses on applying knowledge gained while solving one task to a related task.
This document provides a summary of image classification using deep learning techniques. It begins with an introduction to the speaker and their background. It then discusses the main types of image AI tasks like classification, detection, and segmentation. The document reviews the history and timeline of deep learning, important datasets like ImageNet, and algorithms such as convolutional neural networks. It presents the typical process flow for image-based deep learning including feature extraction using convolutional and pooling layers, classification layers, and different network architectures. The document concludes by discussing a homework assignment on building a multi-class image classification model using a dataset of dog, cat, and bird images.
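The convolution-and-pooling feature extraction described in the summary can be illustrated with a minimal plain-Python sketch (the toy image and the edge-detection kernel are invented for illustration; a real network learns its kernels during training):

```python
def conv2d(image, kernel):
    """Valid-mode 2-D convolution (cross-correlation, as in most deep
    learning libraries) of a single-channel image with a kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)] for i in range(out_h)]

def max_pool2x2(fmap):
    """2x2 max pooling with stride 2; a trailing row/column that does
    not fill a full window is dropped."""
    return [[max(fmap[i][j], fmap[i][j + 1],
                 fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]

# A 5x5 toy "image" with a vertical edge and a vertical-edge kernel.
image = [[0, 0, 1, 1, 1]] * 5
kernel = [[-1, 0, 1]] * 3
fmap = conv2d(image, kernel)   # 3x3 feature map; strong response at the edge
pooled = max_pool2x2(fmap)     # pooling keeps the strongest activation
print(fmap[0], pooled)         # → [3, 3, 0] [[3]]
```

The convolution responds strongly exactly where the edge sits, and pooling keeps that strongest response while shrinking the map — the same extract-then-downsample pattern the slide deck attributes to CNN feature extraction.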
# Description of the PowerPoint Presentation: *Wind and Snow in Beiping* (《風雪中的北平》投影片)
## **Introduction**
The PowerPoint presentation titled *Wind and Snow in Beiping (《風雪中的北平》投影片)* is a visually and thematically rich exploration of Beiping (modern-day Beijing) during a period of historical turbulence, likely set in the early to mid-20th century. The presentation combines literary excerpts, historical context, and evocative imagery to immerse the audience in the atmosphere of a city caught between tradition and modernity, resilience and despair.
This description will provide a detailed breakdown of the PowerPoint’s structure, content, and thematic elements, analyzing its narrative techniques, visual design, and educational or artistic intentions. The presentation appears to be designed for an academic, literary, or historical audience, possibly as part of a course on modern Chinese literature, history, or cultural studies.
---
## **Structure and Slide Breakdown**
The PowerPoint is structured to guide the viewer through a narrative that blends fiction, history, and visual storytelling. While the exact number of slides is unspecified, the content can be categorized into several key sections:
### **1. Title Slide (《風雪中的北平》)**
The presentation opens with a striking title slide featuring the Chinese characters "《風雪中的北平》" (Wind and Snow in Beiping) in a calligraphic or bold font. The background likely depicts a wintry scene of old Beiping—snow-covered hutongs (alleys), traditional courtyard houses, or a historical black-and-white photograph. The title evokes both the literal weather conditions and the metaphorical "storm" of political and social change.
### **2. Historical Context of Beiping**
This section provides background on Beiping (the former name of Beijing) during the early 20th century, a time of war, revolution, and cultural transformation. Key points may include:
- **Beiping vs. Beijing**: Explanation of the city’s name change and its symbolic meaning.
- **Political Climate**: The fall of the Qing Dynasty, the Republic of China era, Japanese occupation, and the Chinese Civil War.
- **Cultural Significance**: Beiping as a center of intellectual life, literature, and traditional Chinese culture.
Visuals may include maps of Beiping, historical photographs, and timelines.
### **3. Literary Excerpts and Analysis**
The core of the presentation features passages from literary works set in Beiping, possibly including:
- **Lao She’s *Rickshaw Boy* (《骆驼祥子》)**: A novel depicting the struggles of a poor rickshaw puller in Beiping.
- **Shen Congwen’s essays**: Descriptions of the city’s landscapes and social changes.
- **Lu Xun’s writings**: Critical perspectives on Chinese society during this era.
Each excerpt is accompanied by analysis, focusing on themes such as:
- **Poverty and Survival**: How ordinary people endured harsh winters and economic hardship.
- **Nostalgia and Loss**: The erosion of old Beiping's traditions and ways of life amid war and modernization.
2. About me
• Education
  • NCU (MIS), NCCU (CS)
• Experiences
  • Telecom big data innovation
  • Retail Media Network (RMN)
  • Customer Data Platform (CDP)
  • Know-your-customer (KYC)
  • Digital transformation
  • LLM architecture & development
• Research
  • Data Ops (ML Ops)
  • Generative AI research
  • Business data analysis, AI
11. Transformer Models
• Transformers are a more recent and highly effective architecture for sequence modeling.
• They move away from recurrence and rely on a self-attention mechanism to process sequences in parallel and capture long-term dependencies in data, making them more efficient than traditional RNNs.
• They use self-attention to weight the importance of different parts of the input data.
• They have been particularly successful in NLP tasks and have led to models like BERT, GPT, and others.
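As a minimal sketch of the self-attention mechanism described above, here is a single attention head in plain Python; the toy 2-dimensional token vectors and identity matrices are stand-ins for the learned Wq/Wk/Wv projections:

```python
import math

def matmul(A, B):
    """Plain-Python matrix multiply."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention. Every token
    attends to every token at once, so the whole sequence is processed
    in parallel (no recurrence)."""
    Q, K, V = matmul(X, Wq), matmul(X, Wk), matmul(X, Wv)
    d_k = len(K[0])
    scores = [[sum(q * k for q, k in zip(qr, kr)) / math.sqrt(d_k)
               for kr in K] for qr in Q]
    weights = [softmax(row) for row in scores]   # each row sums to 1
    return matmul(weights, V)

# Toy sequence of three 2-dimensional token vectors; identity matrices
# stand in for the learned projections.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
I = [[1.0, 0.0], [0.0, 1.0]]
out = self_attention(X, I, I, I)
print(len(out), len(out[0]))  # one context vector per input token
```

Each output row is a convex combination of the value vectors, weighted by how strongly that token attends to every other token.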
12. Transformer Models
• Introduced in the paper "Attention Is All You Need" (2017): https://arxiv.org/abs/1706.03762
• It abandons traditional CNNs and RNNs.
• It consists of an Encoder and a Decoder.
• The model is built from multiple stacked encoder/decoder layers; each layer contains Multi-head Self-Attention, LayerNorm, a feed-forward network (FFN), and residual connections. The whole architecture is differentiable and is trained by backpropagation.
• In the decoder's cross-attention, Q comes from the Decoder's input, while K and V come from the Encoder's output (4x768).
• The encoder side is similar to BERT; the decoder side is what GPT uses.
• GPT does not generate one time step at a time the way an RNN does; it generates from Tokens + PE (positional encoding) using the attention mechanism.
• Which type does ChatGPT belong to?
• Both RNNs and GPT generate autoregressively.
13. Transformer Models — the Encoder stage (machine-translation example: "How are you?" → "你好嗎?")
• What is a token? The Tokenizer splits the input sentence into tokens:
  <CLS> "How" "are" "you" "?" <SEP>
• Vocabulary (word-to-index mapping): How → 1, are → 10, you → 300, ? → 4. BERT's vocabulary has about 30,000 tokens; the common words, word roots, and subwords are each assigned an index.
• Each token index is mapped to a d = 768 embedding, and the Positional Encoding (PE) is added element-wise. This is how the model knows which words appear earlier and which later, so it can understand word order.
• The PE represents each position with sin and cos waves: sin on the even dimensions, cos on the odd ones; every position pos corresponds to a fixed vector.
• (Diagram) Each encoder Block runs Multi-Head Attention (MHA) over all tokens in parallel, followed by a feed-forward layer, with each sub-layer wrapped in a Residual Connection and a Norm (LayerNorm) step; the Blocks are stacked, and the final output is the context vectors.
• BERT has 12 Blocks; BERT Large has 24 Blocks.
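The embedding-plus-PE step above can be sketched in a few lines of Python; the dimension d = 8 and the constant embedding values are toy stand-ins for BERT's d = 768 learned embeddings:

```python
import math

def positional_encoding(pos, d=8):
    """Sinusoidal PE as in "Attention Is All You Need": sin on even
    dimensions, cos on odd, with wavelength 10000^(2i/d) for pair i."""
    pe = []
    for i in range(d):
        angle = pos / (10000 ** ((i // 2 * 2) / d))
        pe.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return pe

# Hypothetical token embedding (constant values, just for illustration).
embedding = [0.1] * 8
pos0 = positional_encoding(0)                       # position 0: [0, 1, 0, 1, ...]
summed = [e + p for e, p in zip(embedding, pos0)]   # element-wise addition
print(pos0[:4])  # → [0.0, 1.0, 0.0, 1.0]
```

Because each position maps to a distinct fixed vector, two identical tokens at different positions end up with different summed representations, which is exactly what lets the model recover word order.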
14. Multi-Head Attention
• Attention mechanism:
  • Selects the more useful information from the words.
  • Q, K, and V are obtained by applying linear transformations to the input word vector x (x is what a token becomes after word embedding).
  • Each weight matrix W can be learned through training.
• Q: the information to be queried; K: the vectors being queried against; V: the values obtained from the query.
• "Multi-head" means multiple attention mechanisms in parallel; think of the heads as the multiple convolution kernels in a CNN, each capturing different features.
• RNNs struggle to learn long-range dependencies, which is why the attention mechanism is introduced.
1. For "how are you", Q · Kᵀ yields the attention scores [0.8, 1.2, -0.4]; softmax turns these into roughly [0.36, 0.53, 0.11], which means "how" should attend mainly to "are", since it receives about 53% of the weight.
2. The word "how" is then re-represented as the weighted sum 0.36 · v_how + 0.53 · v_are + 0.11 · v_you, a 768-dimensional vector.
3. Once all three words are computed, the resulting [3x768] matrix goes into Layer Norm.
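Working the slide's numbers through in code: the exact softmax of the scores [0.8, 1.2, -0.4] comes out to roughly [0.36, 0.53, 0.11]. The value vectors below are toy 4-dimensional one-hot stand-ins for the real 768-dimensional ones:

```python
import math

def softmax(xs):
    """Numerically stable softmax."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# Attention scores of the query "how" against the keys for how/are/you.
scores = [0.8, 1.2, -0.4]
weights = softmax(scores)   # ≈ [0.36, 0.53, 0.11]; "are" gets the most weight

# Toy value vectors (one-hot, 4-dimensional instead of 768 for readability).
v = {"how": [1, 0, 0, 0], "are": [0, 1, 0, 0], "you": [0, 0, 1, 0]}
context = [sum(w * vec[i] for w, vec in zip(weights, v.values()))
           for i in range(4)]
print([round(x, 2) for x in context])  # → [0.36, 0.53, 0.11, 0.0]
```

Because the value vectors are one-hot, the context vector simply reads back the attention weights, which makes it easy to see that the output for "how" is dominated by the value vector of "are".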