KMSPquickreviewfinal

The document discusses several topics related to knowledge management and knowledge discovery systems: 1) It describes knowledge discovery mechanisms and processes like knowledge discovery in databases (KDD) and data mining (DM), as well as techniques like descriptive and inferential statistics. 2) It discusses knowledge capture systems and techniques for eliciting knowledge from experts, such as concept maps, context-based reasoning (CxBR), and the CxBR-based CITKA method. 3) It covers discovering knowledge from the web, including web structure mining, web usage mining, and web content mining.

Uploaded by

vk

KMPS

Quick review
Ch6-9
How do contingency factors determine KM solutions?

ch9
ch7
ch8
ch6
Knowledge Discovery Systems
• Including the discussion of knowledge creation and the innovation and
advancement of knowledge
• Combining multiple bodies of explicit knowledge
• Existing explicit knowledge may need recontextualization
• Knowledge discovery mechanisms involve socialization processes
• KDD - knowledge discovery in databases (e.g., Efficient power stations)
• Developing DM Software
• CRISP-DM process
• Statistical and Non-Statistical Techniques
• Descriptive statistics (non-inferential statistics)
• Inferential statistics – e.g., MBR, decision trees, and neural networks
• Discovering Knowledge on the Web
KDD - knowledge discovery in databases
• Group Decision Support Software (GDSS) for …
• Group thinking, problem-solving, and decision-making
• Brainstorming (30 to 45 minutes depending on the size of the group and the
complexity of the problem)
• e.g., Brainstorming Camps held by Disney and many Japanese companies (e.g., Honda),
Westinghouse Innovation Group
• Another name for KDD is data mining (DM)
• DM techniques such as neural networks have been used in marketing, retail, banking,
insurance, telecommunications, and operations management.
• E.g., Amazon making use of Market Basket Analysis
• E.g., banking - for identifying fraudulent bank accounts or to detect money laundering and
terrorist financing
• E.g., eBags (a web-based retailer of suitcases, wallets, and related products)
• e.g., Proflowers, a Web-based flower retailer, effectively attracts attention to
lower-selling items through its Web site (Stevens 2001)
• KDD and DM are both interactive and iterative processes that turn data into
information and information into business knowledge.
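As a rough illustration of the Market Basket Analysis mentioned above, the sketch below counts how often pairs of items are bought together and derives the support and confidence of each rule "a -> b". The baskets and item names are invented for the example, and restricting rules to item pairs is a simplifying assumption; real tools consider larger item sets.

```python
from itertools import combinations
from collections import Counter

# Hypothetical transactions; each set is one customer's basket.
baskets = [
    {"suitcase", "luggage tag"},
    {"suitcase", "luggage tag", "wallet"},
    {"wallet"},
    {"suitcase", "wallet"},
]

def pair_rules(baskets):
    """Return {(a, b): (support, confidence)} for every pair rule a -> b."""
    n = len(baskets)
    item_counts = Counter(item for b in baskets for item in b)
    pair_counts = Counter(
        pair for b in baskets for pair in combinations(sorted(b), 2)
    )
    rules = {}
    for (a, b), count in pair_counts.items():
        # support: share of all baskets containing both items
        # confidence: share of baskets with the left-hand item that also hold the right-hand item
        rules[(a, b)] = (count / n, count / item_counts[a])
        rules[(b, a)] = (count / n, count / item_counts[b])
    return rules

rules = pair_rules(baskets)
support, confidence = rules[("luggage tag", "suitcase")]
print(f"luggage tag -> suitcase: support={support:.2f}, confidence={confidence:.2f}")
```

A rule with high confidence but low support applies to few customers, so both numbers are usually read together.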
e.g., Efficient power stations

• The expected temperature is the most influential factor.


• Data mining in this context consists of training neural networks to predict the
energy load in a certain area for a specified period of time.
• The network can be trained by mining a database containing actual recorded data
on ambient temperatures, wind speed, humidity, day of the week (among
others), and the actual power consumed per hour.
• This is supervised training (an inferential technique)
• the relations are embedded in the weights computed by the training algorithm, typically the
back-propagation algorithm.
• Once trained, the network can be fed forecast values for the same attributes and can then
predict the load on a per-hour basis for 24, 48, and 72 hours.
• Positive results in this arena have led the Electric Power Research Institute to
offer neural network-based tools to perform this specific function.
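The supervised training described on this slide can be sketched with a very small network. The example below is only a toy under stated assumptions: it uses a single input (a scaled temperature), an invented temperature-to-load relationship, one hidden layer of four tanh units, and plain stochastic gradient descent with back-propagation. It is not the tool offered by the Electric Power Research Institute.

```python
import math
import random

random.seed(0)

# Hypothetical hourly records: (temperature scaled to [0, 1], load scaled to [0, 1]).
# The relationship below is invented purely for illustration.
data = [(t / 10, 0.2 + 0.6 * (t / 10) ** 2) for t in range(11)]

HIDDEN = 4
w1 = [random.uniform(-1, 1) for _ in range(HIDDEN)]  # input -> hidden weights
b1 = [0.0] * HIDDEN                                   # hidden biases
w2 = [random.uniform(-1, 1) for _ in range(HIDDEN)]  # hidden -> output weights
b2 = 0.0                                              # output bias

def forward(x):
    """Return the hidden activations and the predicted load for input x."""
    hidden = [math.tanh(w1[j] * x + b1[j]) for j in range(HIDDEN)]
    output = sum(w2[j] * hidden[j] for j in range(HIDDEN)) + b2
    return hidden, output

def mean_squared_error():
    return sum((forward(x)[1] - y) ** 2 for x, y in data) / len(data)

def train(epochs=2000, lr=0.05):
    global b2
    for _ in range(epochs):
        for x, y in data:
            hidden, output = forward(x)
            err = output - y  # derivative of 0.5 * err**2 with respect to the output
            for j in range(HIDDEN):
                # Back-propagate through tanh: d tanh(z)/dz = 1 - tanh(z)**2
                grad_hidden = err * w2[j] * (1 - hidden[j] ** 2)
                w2[j] -= lr * err * hidden[j]
                w1[j] -= lr * grad_hidden * x
                b1[j] -= lr * grad_hidden
            b2 -= lr * err

loss_before = mean_squared_error()
train()
loss_after = mean_squared_error()
print(f"MSE before: {loss_before:.4f}, after: {loss_after:.4f}")
```

After training, the learned relation lives only in `w1`, `b1`, `w2`, and `b2`, which is what the slide means by relations being embedded in the weights.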
Developing DM Software
• Business understanding
• Data understanding (data collection, data description, data quality
and verification, exploratory analysis of the data)
• Data preparation (selection, construction and transformation of
variables, data integration, formatting)
• Model building and validation
• Evaluation and Interpretation
• Deployment
CRISP-DM process
• Cross-Industry Standard Process for Data Mining

• Step 1 - The goal of the data mining system is understood


• Step 2 - The data have been collected
• Step 3 - The data have been prepared (cleaned or preprocessed)
• Step 4 - Building and validating the data mining model.
• A Beautiful Mind
https://www.youtube.com/watch?v=EajIlG_OCvw
• (美麗境界)
https://www.youtube.com/watch?v=ge4IVgPHjGw
Statistical and Non-Statistical Techniques
• Data mining techniques include both statistical as well as
nonstatistical techniques.
• Statistical techniques are known as traditional data mining methods including
regression, logistic regression, and multivariate methods.
• Nonstatistical techniques, also known as intelligent techniques or data-
adaptive methods, include memory-based reasoning, decision trees, and
neural networks
• Note that for descriptive techniques such as association and clustering, the
outcome or output variable is not defined.
• In unsupervised learning, there is NOT a previously known outcome in
mind
Statistical and Non-Statistical Techniques
• Descriptive statistics are used to find patterns or define classes of similar
objects in the data collected.
• different descriptive techniques, including both association and clustering
methods, and their applicability pertaining to the characteristics of the
input variables.
• Need to specify a hypothesis before carrying out any analysis
• In other words, stringent assumptions
• E.g., normality of the sample data, uncorrelated error, or homogeneity of
variance. In particular, when the number of explanatory variables is large, model
specification and selection are increasingly difficult, making it harder to
work with statistical techniques.
• Can provide for a more rigorous test of hypotheses
• Inferential statistics are used to generalize from data and thus develop
models that generalize from the observations.
• Memory-based reasoning (MBR)
• a DM technique that looks for the nearest neighbors of known data samples
and combines their values to assign classification or prediction values for new
data samples.
• It is very similar to Case-based Reasoning.
• Decision trees (or rule induction methods)
• Neural networks
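Memory-based reasoning, as described on this slide, can be illustrated with a small nearest-neighbor classifier. The sketch below is one possible reading: the customer records, the two features, the Euclidean distance, and the majority vote over k = 3 neighbors are all assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical labelled samples: ((age, yearly purchases), customer segment).
known = [
    ((25, 2), "occasional"),
    ((30, 3), "occasional"),
    ((45, 20), "frequent"),
    ((50, 25), "frequent"),
    ((35, 18), "frequent"),
]

def classify(sample, known, k=3):
    """Assign the majority label among the k nearest known samples."""
    nearest = sorted(known, key=lambda item: math.dist(sample, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(classify((28, 4), known))   # two of the three nearest samples are "occasional"
print(classify((48, 22), known))  # all three nearest samples are "frequent"
```

For prediction of a numeric value, the vote would be replaced by an average of the neighbors' values.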
Discovering Knowledge on the Web
• Web pages and documents found on the Web can provide important
information at a minimal cost to develop or maintain.
• Web mining is “Web crawling with on-line text mining” (Zanasi 2000).
• Unfortunately, the information and data in the Web are unstructured.
• 80 percent of the world’s online content is based on text (Chen 2001).
• Web mining requires linguistic analysis or natural language processing
(NLP) abilities.
• Web mining requires techniques from both information retrieval and
artificial intelligence domains.
• Gerard Salton (1989) is generally considered the father of information retrieval (IR).
• IR indexing techniques consist of calculating the term frequency-inverse
document frequency (TF-IDF) function.
• Text mining refers to automatically “reading” large collections of documents (called
corpora) of text written in natural language and being able to derive
knowledge from the process.
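The TF-IDF weighting named above can be sketched in a few lines. This is a minimal version assuming whitespace tokenization, a relative term frequency, and the plain log(N / df) form of inverse document frequency; several other variants exist, and the three "documents" are invented.

```python
import math
from collections import Counter

# Hypothetical mini-corpus standing in for crawled Web pages.
docs = [
    "knowledge discovery in databases",
    "data mining turns data into knowledge",
    "web mining needs text mining",
]

def tf_idf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

weights = tf_idf(docs)
for term, weight in sorted(weights[2].items(), key=lambda kv: -kv[1]):
    print(f"{term}: {weight:.3f}")
```

Terms that occur in few documents receive higher weights, which is why TF-IDF is useful for indexing.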
Discovering Knowledge on the Web (Cont’)
• Four layers of techniques
• Linguistic Analysis/NLP
• Statistical and Co-occurrence Analysis
• Statistical and Neural Networks Clustering and Categorization
• Visualization and Human Computer Interfaces
• Three types of uses for Web data mining
• Web Structure Mining
• Web Usage Mining (preprocessing, pattern analysis, pattern discovery)
• Web Content Mining
• DM and Customer relationship management (CRM)
• E.g., enterprise application integration (EAI) technology, data warehouses, OLAP (Online
Analytical Processing), and campaign management software are combined to achieve
a “holistic view” of the customer and thus improve their experience at every one of their
customer touchpoints
• Some barriers and some cases (e.g., Real Estate Appraisal Systems, Expertise
Locator Systems, Novel-Knowledge Discovery on the Web - Google)
Knowledge Capture Systems
• Organizational storytelling
• Two types:
• one serves best to support educational settings;
• the other serves best to capture tactical knowledge.
• Knowledge-elicitation techniques
• Concept maps
• CxBR
• CxBR-based CITKA
• Learning by observation
• RFID technologies
Organizational storytelling
• To support the process of eliciting either explicit or tacit knowledge
that may reside in people, artifacts, and organizational entities.
• knowledge exists either within or outside organizational boundaries, among
employees, consultants, competitors, customers, suppliers, and even prior
employers of the organization’s new employees
• e.g., 3M Corporation
• Approaches
• developing models/prototyping
• learning by observation
• face-to-face meetings
• Some guidelines, steps, and benefits
Knowledge-elicitation techniques
• Concept maps
• a knowledge-modeling tool (e.g., CmapTools and others)
• provide an effective methodology to capture and represent (organize and
structure) the concepts representing the expert’s domain knowledge in
representation models that can later be used by potential students of the
domain
• Context-based reasoning (CxBR)
• to intuitively simulate human behavior
• best suited to capture the tactical knowledge of experts, which requires
assessment of the situation, selecting a plan of action, and acting on the plan.
• Both can be used to construct simulation models of human behavior
Knowledge-elicitation techniques (Cont’)
• CITKA - Context-based Intelligent Tactical Knowledge Acquisition
• based on CxBR
• composes questions and presents them to the expert
• Four modules of independent but cooperating sub-systems
• Knowledge engineering database back-end (KEDB) - a hierarchical data structure presented in
a tabular form created for each context type
• Knowledge engineering interface (KEI) - Maps into the KEDB module by providing eight
interacting dialogs.
• Query rule-based back-end (QRB) - for executing the intelligent dialog with the SMEs by
providing them an interface input screen
• Subject matter expert interface (SMEI) - the graphical user interface (GUI) for the QRB.
• Barriers
• develop some idea of the nature and structure of the knowledge very early in the process.
• familiarize with the structure of knowledge
• perhaps the most adequate representation is the concept map paradigm
Learning by observation
• to automate the knowledge acquisition task
• observation logs were kept from a large set of training examples that recorded the
sensor inputs and appropriate actions, which were used to create decision trees
describing the pilot’s behavior.
• Autonomous Land Vehicle In a Neural Network (ALVINN)
• an autonomous vehicle-driving system
• using neural networks that were trained by observing how human drivers responded to diverse
driving environments.
• OBSERVO-SOAR, which combined behavioral cloning with OBSERVER’s behavioral representation
in order to learn effectively in complex and dynamic domains such as a flight simulator domain.
• Cognitive imitation
• different from observational learning
• involves imitation with observational learning
• The modern approach to learning by observation
• “to design agents that already know something and are trying to learn some more”
• the agent must have some way to obtain the background knowledge that it will use to learn new
episodes via incremental development.
• Edge of Tomorrow
https://www.imdb.com/title/tt1631867/
• (明日邊界)
https://www.youtube.com/watch?v=s4EH7HnK2hM
RFID technologies
• Radio Frequency Identification
• the combination of radio broadcast with radar technology
• An RFID tag consists of two parts
• an integrated circuit that modulates and demodulates a radio frequency
signal and processes and stores information
• an antenna that receives and transmits the signal
• Three kinds of RFID tags
• passive (which have no battery and the power is supplied by the reader),
• active (which have a power supply), and
• semi-passive (which have a power supply that powers the tag's circuitry, while the
reader powers the communication).
• Applications of RFID technology today help improve inventory tracking and
supply chain management (e.g. Walmart’s RFID)
Knowledge Sharing Systems
• Also referred to as knowledge repositories (document management system
or an electronic storage medium) or knowledge markets
• Door/portal technologies are used to build a common entry into multiple
distributed repositories
• To promote knowledge sharing for reuse by other members from the same
organization and propagation of innovation, technology, and strategic
management (Yoo and Ginzberg 2003)
• Types
• Lessons learned systems (LLS)
• Expertise locator systems
• Others: Incident report databases, Alert systems, Best practices databases
• The roles of Ontology and Taxonomy
• WWW (World Wide Web)
• Client, Hypertext Markup Language (HTML)
• Uniform Resource Locator (URL): protocol://computer_name:port/document_name
• Hypertext Transfer Protocol (HTTP)

• Designing the Knowledge Sharing System


• via hyperlink
• some requirements
• some barriers
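The URL layout shown on this slide (protocol, computer name, port, document name) can be checked with Python's standard `urllib.parse` module. The address below is made up for illustration; only the standard-library call is real.

```python
from urllib.parse import urlsplit

# Hypothetical address; the host, port, and path are invented for illustration.
url = "http://kms.example.org:8080/lessons/index.html"

parts = urlsplit(url)
print(parts.scheme)    # protocol
print(parts.hostname)  # computer name
print(parts.port)      # port
print(parts.path)      # document name
```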
Lessons learned systems (LLS)
• The goal of LLS is to support organizational processes
• “to capture and provide lessons that can benefit employees who encounter
situations that closely resemble a previous experience in a similar situation”
(Weber et al. 2001)
• To support distributed project collaborations and their knowledge
sources while actively seeking to capture and reuse lessons from
project report archives.
• LLS Process
• Collect the lessons (passive, reactive, after-action collection, proactive collection, active
collection, interactive collection) >> Verify the lessons >> Store the lessons >>
Distribute the lessons >> Apply the lessons (browsable, executable, outcome
reuse)
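The collect, verify, store, and distribute steps above can be pictured as a small in-memory pipeline. This is only a sketch: the class, method names, and keyword search are hypothetical and do not describe any particular lessons learned system.

```python
from dataclasses import dataclass

@dataclass
class Lesson:
    text: str
    source: str
    verified: bool = False

class LessonsLearnedSystem:
    """Toy in-memory store; every name here is hypothetical."""

    def __init__(self):
        self.repository = []

    def collect(self, text, source):
        return Lesson(text=text, source=source)

    def verify(self, lesson, approved_by_expert):
        lesson.verified = approved_by_expert
        return lesson

    def store(self, lesson):
        # Unverified lessons never reach the repository.
        if not lesson.verified:
            raise ValueError("only verified lessons are stored")
        self.repository.append(lesson)

    def distribute(self, keyword):
        """Return stored lessons mentioning the keyword (a stand-in for distribution)."""
        return [l for l in self.repository if keyword.lower() in l.text.lower()]

lls = LessonsLearnedSystem()
lesson = lls.collect("Check sensor calibration before each test run", "project report")
lls.store(lls.verify(lesson, approved_by_expert=True))
matches = lls.distribute("calibration")
print(len(matches))
```

Applying the lesson is left to the person who retrieves it, which corresponds to the "browsable" mode on the slide.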
Lessons learned systems (LLS) – Cont’
• Corporate memory (also known as an organizational memory) is
made up of the aggregate intellectual assets of an organization.
• Collaborative computing provides a common communication space,
improves sharing of knowledge, provides a mechanism for real-time
feedback on the tasks being performed, helps to optimize processes,
and results in a centralized knowledge warehouse.
• Group decision-making tools
• E.g., Eureka - A Lessons Learned System for Xerox

• Knowledge owners vs. knowledge seekers


Expertise locator systems (ELS)
• To help locate intellectual capital (Becerra-Fernandez 2006).
• To catalog knowledge competencies, including information not typically
captured by human resources systems, in a way that could later be queried
across the organization.
• (see next slide) Characteristics of ELS

• E.g., NSA’s taxonomy was based on O*NET, a standard published by the U.S.
Department of Labor
• E.g., Hewlett Packard based their taxonomy on an existing standard
published by the U.S. Library of Congress augmented by their own
knowledge competencies.
• MIB 3 (men in black)
https://www.youtube.com/watch?v=BV-WEb2oxLk
• (星際戰警3)
https://www.youtube.com/watch?v=Ds66JoAT0TI
Others
• Incident report databases
• bug reports, sensing equipment outages
• Alert systems
• to report problems experienced with a technology, e.g., Toyota car recall.
• Best practices databases
• typically from the re-engineering of business processes (O’Leary 1999) that could be
applicable to organizational processes (e.g., Microsoft Developer Network)

• E.g., Ernst & Young, a professional services organization; Ernst & Young's KM
initiatives specifically support knowledge sharing
• E.g., the Center for Advancing Microbial Risk Assessment (CAMRA), funded by the
Environmental Protection Agency and Department of Homeland Security,
successfully introduced knowledge sharing systems to share important
knowledge.
The roles of ontology and taxonomy
• Ontology is an explicit formal specification of how to represent the objects,
concepts, and other entities that are assumed to exist in some area of interest
and the relationships that hold among them.
• used to represent complex relationships between objects as rules and axioms, which are NOT
included in semantic networks.
• Taxonomies
• also called classification or categorization schemes
• serve to group objects together based on a particular characteristic.
• Knowledge taxonomies allow organizing knowledge or competency areas in the organization.
• Both are related to other knowledge organization systems,
including semantic networks and authority files.
• Semantic networks serve to structure concepts and terms in networks or webs versus the
hierarchies typically used to represent taxonomies.
• Authority files are lists of terms used to control the variant names in a particular field, and
link preferred terms to nonpreferred terms.
• Authority files are used to control the taxonomy vocabulary, in particular within an
organization.
• In other words, authority files are used to ensure that everyone in the organization uses the
same terms to organize similar concepts.
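An authority file as described above can be represented as a mapping from each preferred term to its nonpreferred variants. The sketch below uses invented entries and a hypothetical `normalize` helper to show how variant names are resolved to the preferred term.

```python
# Hypothetical authority file: preferred term -> nonpreferred variants.
authority_file = {
    "knowledge management": ["KM", "knowledge mgmt"],
    "case-based reasoning": ["CBR", "case based reasoning"],
}

# Invert it so any variant (or the preferred term itself) maps to the preferred term.
preferred = {}
for term, variants in authority_file.items():
    preferred[term.lower()] = term
    for variant in variants:
        preferred[variant.lower()] = term

def normalize(tag):
    """Return the preferred term for a tag, or the tag unchanged if it is unknown."""
    return preferred.get(tag.strip().lower(), tag)

print(normalize("CBR"))
print(normalize("ontology"))
```

Running every tag through such a lookup before filing a document is one simple way to keep the taxonomy vocabulary consistent across an organization.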
Knowledge Application Systems
- systems that utilize knowledge -

• Artificial intelligence (AI)

• Rule-based expert systems


• Case-based reasoning (CBR)
• Help desk systems
• Fault diagnosis systems
• Other examples: GenAID, SOS Advisor, Product Quality Analysis for National
Semiconductor, Darty, CLAIM, OFD Systems
Artificial intelligence (AI)
• Definition (Becerra-Fernandez et al. 2004)
The science that provides computers with the ability to represent and
manipulate symbols so they can be used to solve problems not easily
solved through algorithmic models.
• AI includes natural language understanding, classification, diagnostics, design, machine
learning, planning and scheduling, robotics, and computer vision
• Turing test: https://www.javatpoint.com/turing-test-in-ai
• AI research first focused on
• Games: Numerous chess programs, e.g., Mac Hack, Chess 4.5… Komodo Dragon, Fat
Fritz, Stockfish, AlphaZero (DeepMind)
https://pythonawesome.com/chess-ai-game-with-python/
https://www.freecodecamp.org/news/simple-chess-ai-step-by-step-1d55a9266977/
• Natural language translation: Still working hard now!
• General Problem Solver (GPS) (Simon and Newell 1963):
To solve some problems by searching for an answer in a solution space
• The Imitation Game
https://www.youtube.com/watch?v=j2jRs4EAvWM
• (模仿遊戲)
https://www.youtube.com/watch?v=d72LzJxpibM
Rule-based expert systems
• To represent the domain knowledge.
• It requires the collaboration of a subject matter expert.
• Experts develop these rules-of-thumb (e.g., IF-THEN) over years of
practical experience at solving problems.
• Types: Forward chaining & Backward chaining
• Components: (1) Knowledge base, (2) Database, (3) Inference engine,
(4) Explanation facilities, (5) User interface
https://uomustansiriyah.edu.iq/media/lectures/6/6_2019_02_19!06_52_45_PM.pdf
• Expert systems with a large number of rules have many disadvantages:
• Difficulty in coding, verifying, validating, and maintaining the rules; and
• Reduction in the efficiency of the inference engine executing the rules.
• Cases: IKEA Virtual Assistant, Westinghouse Electric Corporation’s GenAID,
SBIR/STTR Online Advisor
https://www.professional-ai.com/rule-based-systems.html
• In addition to rules, other paradigms to represent knowledge include
frames, predicates, associative networks, and objects.
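Forward chaining, one of the two inference types listed on this slide, can be sketched as a loop that keeps firing IF-THEN rules until no new fact is derived. The rules and facts below are invented for a help-desk style example, and the tiny engine is an illustration rather than how any named expert system is built.

```python
# Hypothetical IF-THEN rules: (set of conditions, conclusion).
rules = [
    ({"no power light"}, "check power cable"),
    ({"power light on", "no display"}, "check monitor cable"),
    ({"check monitor cable", "cable ok"}, "suspect graphics card"),
]

def forward_chain(facts, rules):
    """Fire every rule whose conditions are all known until nothing new is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

initial = {"power light on", "no display", "cable ok"}
derived = forward_chain(initial, rules)
print(sorted(derived - initial))
```

Backward chaining would instead start from a goal such as "suspect graphics card" and work back to the facts needed to support it.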
Case-based reasoning (CBR)
• Better than rule-based expert systems
• Based on Schank’s (1982) model of dynamic memory
• A method of analogical reasoning that utilizes old cases or
experiences in an effort to solve problems, critique solutions, explain
anomalous situations, or interpret situations (Aamodt and Plaza 1994;
Kolodner 1991, 1993; Leake 1996; Watson 2003)
• Types:
• Exemplar-based reasoning – classification
• Instance-based reasoning - attribute vectors and automated learning
• Analogy-based reasoning - case reuse or mapping problem; used to solve new
problems from a different domain
Case-based reasoning (CBR) – Cont’
• Processes
• Search the case library for similar cases.
• Select and retrieve the most similar case(s).
• Adapt the solution for the most similar case.
• Apply the generated solution and obtain feedback.
• Add the newly solved problem to the case library.
• Other intelligent systems
• Constraint-based reasoning - “what cannot be done,” e.g., scheduling a meeting
• Model-based reasoning (MBR) - to simulate a real behavior, e.g., hurricane model
• Diagrammatic reasoning - with the intention to understand concepts and ideas for a
specific area
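The CBR process steps listed on this slide (search, retrieve, adapt, apply, add to the library) can be sketched as follows. The case library, the two numeric features, the Euclidean similarity, and the trivial `adapt` step are all assumptions chosen to keep the example short.

```python
import math

# Hypothetical case library: (problem features, solution).
# Features are (error code, minutes since last reboot); values are invented.
case_library = [
    ((101, 5), "restart print spooler"),
    ((205, 300), "replace toner"),
    ((101, 600), "reinstall driver"),
]

def retrieve(problem, library):
    """Search the library and return the most similar case (smallest distance)."""
    return min(library, key=lambda case: math.dist(problem, case[0]))

def adapt(solution, problem):
    """Placeholder adaptation: reuse the retrieved solution unchanged."""
    return solution

def solve(problem, library, worked=True):
    _, old_solution = retrieve(problem, library)
    solution = adapt(old_solution, problem)
    if worked:  # feedback obtained after applying the solution
        library.append((problem, solution))  # add the newly solved problem
    return solution

print(solve((101, 20), case_library))
```

Because solved problems are added back to the library, the system has more cases to draw on each time it is used.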
Help desk systems
• The earliest help desk system was Lockheed’s CLAVIER
• the classic CBR system
• to solve a problem where no explicit decision model existed
• can learn by acquiring new cases and therefore improve their performance
with time (Watson and Marir 1994)
• Compaq Computer's SMART
• an integrated call-tracking and problem-solving system, supported by
hundreds of historical cases that help resolve diagnostic problems resulting
from the use of Compaq products (Allen 1994).
• to support its Customer Service Department when handling user calls through
its toll-free number.
• Other CBR examples
Fault diagnosis systems
• The earliest - CABER at Lockheed Martin Corporation (Mark et al. 1996) resolved
only 20 percent to 40 percent of the system faults.
• Compaq's QuickSource - A case base of over 500 diagnostic cases supported the
QuickSource knowledge applications system
• General Electric's (GE) FormTool - To determine the correct formulas to color
plastics according to the customers’ specifications (Cheetham 2005)
• GE's Support the Customer (STC) tool - helps call takers solve customers’
problems by suggesting questions that could help with the problem diagnosis
(Cheetham and Goebel 2007).
• Verdande Technology’s (2014) CBR-driven Edge platform - to identify, capture,
and analyze data patterns in real-time, predicting future problems based on past
events while offering solutions based on prior actions; launched for the oil and
gas, financial services, and health-care industries
Other CBR examples
• GenAID, the earliest diagnostic knowledge application system, is based on
the use of rules and is still operational today
• SOS Advisor - a Web-based expert system built using a set of rules; a small
number of rules can define the domain
• Product Quality Analysis for National Semiconductor - to achieve lower
costs and just-in-time manufacturing schemes.
• Darty - designed with the goal of reusing the solutions to software quality
problems
• GE's CLAIM - improving the process of healthcare services reimbursement
• OFD Systems for Shuttle Processing - provided preflight, launch, landing,
and recovery services for the Kennedy Space Center
