This document provides an overview of deep learning and its applications in anomaly detection and software vulnerability detection. It discusses key deep learning architectures like feedforward networks, recurrent neural networks, and convolutional neural networks. It also covers unsupervised learning techniques such as word embedding, autoencoders, RBMs, and GANs. For anomaly detection, it describes approaches for multichannel anomaly detection, detecting unusual mixed-data co-occurrences, and modeling object lifetimes. It concludes by discussing applications in detecting malicious URLs, unusual source code, and software vulnerabilities.
This document provides an overview of deep learning 1.0 and discusses potential directions for deep learning 2.0. It summarizes limitations of deep learning 1.0 such as lack of reasoning abilities and discusses how incorporating memory and reasoning capabilities could help address these limitations. The document outlines several approaches being explored for neural memory and reasoning, including memory networks, neural Turing machines, and self-attentive associative memories. It argues that memory and reasoning will be important for developing more human-like artificial general intelligence.
Deep learning and applications in non-cognitive domains II – Deakin University
This document provides an overview of applying deep learning techniques to non-cognitive domains, with a focus on healthcare, software engineering, and anomaly detection. It introduces popular deep learning frameworks like Theano and TensorFlow and discusses best practices for building models. For healthcare, examples are given on using recurrent neural networks (RNNs) with electronic medical record (EMR) data and physiological time-series data from intensive care units. Challenges in software engineering like long-term temporal dependencies are discussed. Overall, the document outlines techniques for structured and unstructured data across different non-cognitive domains.
Deep learning and applications in non-cognitive domains III – Deakin University
This document summarizes a presentation on unsupervised learning and advanced topics in deep learning. It discusses word embeddings, autoencoders, restricted Boltzmann machines, variational autoencoders, generative adversarial networks, graph neural networks, attention mechanisms, and end-to-end memory networks. It emphasizes representing complex domain structures like relations and graphs, and developing memory and attention capabilities in neural networks. The presentation concludes by discussing how to position research in these emerging areas.
Deep learning and applications in non-cognitive domains I – Deakin University
This document outlines an agenda for a presentation on deep learning and its applications in non-cognitive domains. The presentation is divided into three parts: an introduction to deep learning theory, applying deep learning to non-cognitive domains in practice, and advanced topics. The introduction covers neural network architectures like feedforward, recurrent, and convolutional networks. It also discusses techniques for improving training like rectified linear units and skip connections. The practice section will provide hands-on examples in domains like healthcare and software engineering. The advanced topics section will discuss unsupervised learning, structured outputs, and positioning techniques in deep learning.
Deep Learning has taken the digital world by storm. As a general purpose technology, it is now present in all walks of life. Although the fundamental developments in methodology have been slowing down in the past few years, applications are flourishing with major breakthroughs in Computer Vision, NLP and Biomedical Sciences. The primary successes can be attributed to the availability of large labelled data, powerful GPU servers and programming frameworks, and advances in neural architecture engineering. This combination enables rapid construction of large, efficient neural networks that scale to the real world. But the fundamental questions of unsupervised learning, deep reasoning, and rapid contextual adaptation remain unsolved. We shall call what we currently have Deep Learning 1.0, and the next possible breakthroughs Deep Learning 2.0.
This is part 1 of the Tutorial delivered at IEEE SSCI 2020, Canberra, December 1st (Virtual).
Describing latest research in visual reasoning, in particular visual question answering. Covering both images and videos. Dual-process theories approach. Relational memory.
This is the talk given at the Faculty of Information Technology, Monash University on 19/08/2020. It covers our recent research on the topics of learning to reason, including dual-process theory, visual reasoning and neural memories.
Introducing research works in the area of machine reasoning at our Applied AI Institute, Deakin University, Australia. Covering visual & social reasoning, neural Turing machine and System 2.
The current deep learning revolution has brought unprecedented changes to how we live, learn, interact with the digital and physical worlds, run business and conduct sciences. These are made possible thanks to the relative ease of construction of massive neural networks that are flexible to train and scale up to the real world. But the flexibility is hitting the limits due to excessive demand of labelled data, the narrowness of the tasks, the failure to generalize beyond surface statistics to novel combinations, and the lack of the key mental faculty of deliberate reasoning. In this talk, I will present a multi-year research program to push deep learning to overcome these limitations. We aim to build dynamic neural networks that can train themselves with little labelled data, compress on-the-fly in response to resource constraints, and respond to arbitrary queries about a context. The networks are equipped with the capability to make use of external knowledge, and operate at the high level of objects and relations. The long-term goal is to build persistent digital companions that co-live with us and other AI entities, understand our needs and intentions, and share our human values and norms. They will be capable of having natural conversations, remembering lifelong events, and learning in an open-ended fashion.
This document discusses research on improving neural Turing machines through the use of program memory.
It introduces the neural universal Turing machine (NUTM), which augments neural Turing machines with a neural stored-program memory (NSM) to store programs. The NSM allows NUTMs to sequence tasks, do continual learning, and answer questions - addressing limitations of current neural Turing machines. Further research opportunities are outlined, including using memory for graphs and relational structures, memory-supported reasoning, and developing full cognitive architectures.
Deep learning for biomedical discovery and data mining I – Deakin University
The document provides an agenda and slides for a deep learning tutorial focused on biomedical applications. The agenda covers topics such as classic deep learning architectures, genomics applications, healthcare applications, and improving data efficiency. The slides discuss challenges in applying deep learning to biomedicine like small datasets and the complexity of diseases. They also highlight opportunities like recent advances in deep learning techniques and biomedicine being a major source of new problems.
TL;DR: This tutorial was delivered at KDD 2021. Here we review recent developments to extend the capacity of neural networks to “learning to reason” from data, where the task is to determine if the data entails a conclusion.
The rise of big data and big compute has brought modern neural networks to many walks of digital life, thanks to the relative ease of construction of large models that scale to the real world. Current successes of Transformers and self-supervised pretraining on massive data have led some to believe that deep neural networks will be able to do almost everything whenever we have data and computational resources. However, this might not be the case. While neural networks are fast to exploit surface statistics, they fail miserably to generalize to novel combinations. Current neural networks do not perform deliberate reasoning – the capacity to deliberately deduce new knowledge out of the contextualized data. This tutorial reviews recent developments to extend the capacity of neural networks to “learning to reason” from data, where the task is to determine if the data entails a conclusion. This capacity opens up new ways to generate insights from data through arbitrary querying using natural languages without the need of predefining a narrow set of tasks.
Full day lectures @International University, HCM City, Vietnam, May 2019. Review of AI in 2019; outlook into the future; empirical research in AI; introduction to AI research at Deakin University
A discussion of the nature of AI/ML as an empirical science. Covering concepts in the field, how to position ourselves, how to plan for research, what are empirical methods in AI/ML, and how to build up a theory of AI.
This document provides an overview of a tutorial on machine learning and reasoning for drug discovery.
The tutorial covers several topics: molecular representation and property prediction, including fingerprints, string representations, graph representations, and self-supervised learning; protein representation and protein-drug binding; molecular optimization and generation; and knowledge graph reasoning and drug synthesis.
The introduction discusses the drug discovery pipeline and how machine learning can help with various tasks such as molecular property prediction, target identification, and reaction planning. Neural networks are well-suited for drug discovery due to their expressiveness, learnability, generalizability, and ability to handle large amounts of data.
The document discusses advanced techniques of computational intelligence for biomedical image analysis. It provides an overview of computational intelligence, which involves adaptive mechanisms like artificial neural networks, evolutionary computation, fuzzy systems, and swarm intelligence. These techniques exhibit an ability to learn or adapt to new environments. The document also discusses deep learning techniques like convolutional neural networks and recurrent neural networks that are widely used for tasks like image classification.
This document summarizes Melanie Swan's presentation on deep learning. It began with defining key deep learning concepts and techniques, including neural networks, supervised vs. unsupervised learning, and convolutional neural networks. It then explained how deep learning works by using multiple processing layers to extract higher-level features from data and make predictions. Deep learning has various applications like image recognition and speech recognition. The presentation concluded by discussing how deep learning is inspired by concepts from physics and statistical mechanics.
This document provides an overview of deep learning, including definitions of AI, machine learning, and deep learning. It discusses neural network models like artificial neural networks, convolutional neural networks, and recurrent neural networks. The document explains key concepts in deep learning like activation functions, pooling techniques, and the inception model. It provides steps for fitting a deep learning model, including loading data, defining the model architecture, adding layers and functions, compiling, and fitting the model. Examples and visualizations are included to demonstrate how neural networks work.
1. The document discusses model interpretation and techniques for interpreting machine learning models, especially deep neural networks.
2. It describes what model interpretation is, its importance and benefits, and provides examples of interpretability algorithms like dimensionality reduction, manifold learning, and visualization techniques.
3. The document aims to help make machine learning models more transparent and understandable to humans in order to build trust and improve model evaluation, debugging and feature engineering.
Part of the ongoing effort with Skater for enabling better Model Interpretation for Deep Neural Network models presented at the AI Conference.
https://conferences.oreilly.com/artificial-intelligence/ai-ny/public/schedule/detail/65118
Human in the loop: Bayesian Rules Enabling Explainable AI – Pramit Choudhary
The document provides an overview of a presentation on enabling explainable artificial intelligence through Bayesian rule lists. Some key points:
- The presentation will cover challenges with model opacity, defining interpretability, and how Bayesian rule lists can be used to build naturally interpretable models through rule extraction.
- Bayesian rule lists work well for tabular datasets and generate human-understandable "if-then-else" rules. They aim to optimize over pre-mined frequent patterns to construct an ordered set of conditional statements.
- There is often a tension between model performance and interpretability. Bayesian rule lists can achieve accuracy comparable to more opaque models like random forests on benchmark datasets while maintaining interpretability.
The document is a glossary of deep learning terms from A-Z published by Re-Work, defining important concepts like artificial neural networks, convolutional neural networks, deep learning, embeddings, feedforward networks, generative adversarial networks, and more; each term includes a short definition and links to external sources for further explanation. The glossary aims to explain the key concepts underlying recent advances in deep learning that have enabled applications such as driverless cars, healthcare, and fashion recommendations.
The report covers the diverse field of neural networks, compiled from various sources into a compact yet detailed form. It also follows a formal report-writing structure.
The following topics are discussed in this presentation: What is Soft Computing?
What is Hard Computing?
What is Fuzzy Logic Models?
What is Neural Networks (NN)?
What is Genetic Algorithms or Evolutionary Programming?
What is probabilistic reasoning?
Difference between fuzziness and probability
AI and Soft Computing
Future of Soft Computing
This document outlines the syllabus for an MTCSCS302 course on Soft Computing taught by Dr. Sandeep Kumar Poonia. The course covers topics including neural networks, fuzzy logic, probabilistic reasoning, and genetic algorithms. It is divided into five units: (1) neural networks, (2) fuzzy logic, (3) fuzzy arithmetic and logic, (4) neuro-fuzzy systems and applications of fuzzy logic, and (5) genetic algorithms and their applications. The goal of the course is to provide students with knowledge of soft computing fundamentals and approaches for solving complex real-world problems.
Soft computing is an approach to engineering that is inspired by nature. It includes techniques like fuzzy logic, probabilistic reasoning, evolutionary computation, neural networks, and machine learning. These techniques are useful for problems that are too complex or undefined for conventional analytical or hard computing techniques. Soft computing provides approximate solutions and can handle imprecise data. It has applications in areas like robotics, artificial intelligence, and machine translation.
This document discusses anomaly detection using deep auto-encoders. It begins by defining outliers and anomalies, and describes challenges with traditional machine learning techniques for anomaly detection. It then introduces hierarchical feature learning using deep neural networks, specifically using auto-encoders to learn the structure of normal data and detect anomalies based on reconstruction error. Examples of applying this for ECG pulse detection and MNIST digit recognition are provided.
Anomaly Detection using Deep Auto-Encoders | Gianmario Spacagna – Data Science Milan
One of the determinants of a good anomaly detector is finding smart data representations that can easily evince deviations from the normal distribution. Traditional supervised approaches would require a strong assumption about what is normal and what is not, plus a non-negligible effort in labeling the training dataset. Deep auto-encoders work very well in learning high-level abstractions and non-linear relationships in the data without requiring data labels. In this talk we will review a few popular techniques used in shallow machine learning and propose two semi-supervised approaches for novelty detection: one based on reconstruction error and another based on lower-dimensional feature compression.
This document provides an introduction to deep learning. It discusses the history of machine learning and how neural networks work. Specifically, it describes different types of neural networks like deep belief networks, convolutional neural networks, and recurrent neural networks. It also covers applications of deep learning, as well as popular platforms, frameworks and libraries used for deep learning development. Finally, it demonstrates an example of using the Nvidia DIGITS tool to train a convolutional neural network for image classification of car park images.
Big Data Malaysia - A Primer on Deep Learning – Poo Kuan Hoong
This document provides an overview of deep learning, including a brief history of machine learning and neural networks. It discusses various deep learning models such as deep belief networks, convolutional neural networks, and recurrent neural networks. Applications of deep learning in areas like computer vision, natural language processing, and robotics are also covered. Finally, popular platforms, frameworks and libraries for developing deep learning systems are mentioned.
In this talk we walk the audience through how to marry correlation analysis with anomaly detection, discuss how the topics are intertwined, and detail the challenges one may encounter based on production data. We also showcase how deep learning can be leveraged to learn nonlinear correlation, which in turn can be used to further contain the false positive rate of an anomaly detection system. Further, we provide an overview of how correlation can be leveraged for common representation learning.
Deep learning algorithms have drawn the attention of researchers working in the fields of computer vision, speech recognition, malware detection, pattern recognition, and natural language processing. In this paper, we present an overview of deep learning techniques such as convolutional neural networks, deep belief networks, autoencoders, restricted Boltzmann machines, and recurrent neural networks. Current work on deep learning algorithms for malware detection is then surveyed, and suggestions for future research are given with full justification. We also present an experimental analysis to show the importance of deep learning techniques.
Although a relatively new technological advancement, the scope of Deep Learning is expanding rapidly. Advanced Deep Learning technology aims to imitate the biological neural network of the human brain.
https://takeoffprojects.com/advanced-deep-learning-projects
We are providing you with some of the greatest ideas for building Final Year projects with proper guidance and assistance.
GDSC Introduction to Deep Learning Workshop – ssuser540861
This document provides an introduction and overview of deep learning presented by Philippe Bouchet. The presentation covers: who the presenter is, definitions of AI, machine learning, and deep learning, examples of deep learning applications including neural networks, frameworks for developing models, and common uses of deep learning. It is intended as a workshop for those new to AI and deep learning, with no advanced math required. The workshop will involve using Jupyter notebooks and Tensorflow to build an image classification model.
A Quick Overview of Artificial Intelligence and Machine Learning (revised ver... – Hiroki Sayama
This document provides an overview of artificial intelligence and machine learning. It discusses early concepts of intelligence and key contributors. Statistics, data analytics, and optimization are identified as important ingredients. Machine learning techniques like supervised learning, unsupervised learning, and reinforcement learning are explained. Neural networks including recurrent neural networks and deep learning are covered. Research examples and challenges in the field are also summarized.
Deep Qualia: Philosophy of Statistics, Deep Learning, and Blockchain
Deep learning: What is it, why is it important, and what do I need to know?
The aim of this talk is to discuss deep learning as an advanced computational method and its philosophical implications. Computing is a fundamental model by which we are understanding more about ourselves and the world. We think that reality is composed of patterns, which can be detected by machine learning methods.
Deep learning is a complexity optimization technique in which algorithms learn from data by modeling high-level abstractions and assigning probabilities to nodes as they characterize the system and make predictions. An important challenge in deep learning is that these methods work in certain domains (image, speech, and text recognition), but we do not have a good explanation for why, which impedes a wider application of these solutions.
Another recent advance in computational methods is blockchain technology which allows the secure transfer of assets and information, and the automated coordination of operations via a trackable remunerative ledger and smart contracts (automatically-executing Internet-based programs).
This talk looks at how deep learning technology, particularly as coupled with blockchain systems, might be used to produce a new kind of global computing platform. The goal is for blockchain deep learning systems to address higher-dimensional computing challenges that require learning and dynamic response in domains such as economics and financial risk, epidemiology, social modeling, public health (cancer, aging), dark matter, atomic reactions, network-modeling (transportation, energy, smart cities), artificial intelligence, and consciousness.
Generative Adversarial Neural Networks - Essential Reference – Gokul Alex
My presentation on Generative Adversarial Neural Networks and the challenges of adversarial learning conditions in neural networks, presented during the National Symposium on Machine Intelligence organised by Kerala University in 2017 in Thiruvananthapuram.
This is an introduction to deep learning presented to Plymouth University students. The introduction explains how a neural network works. The practical section shows how to use Tensorflow to build simple models. Finally, case studies show how to use deep learning in real-world applications.
A Quick Overview of Artificial Intelligence and Machine Learning – Hiroki Sayama
A revised version is available below:
https://www.slideshare.net/HirokiSayama/a-quick-overview-of-artificial-intelligence-and-machine-learning-revised-version
Anomaly Detection and Spark Implementation - Meetup Presentation.pptx – Impetus Technologies
StreamAnalytix sponsored a meetup on “Anomaly Detection Techniques and Implementation using Apache Spark” which took place on Tuesday December 5, 2017 at Larkspur Landing Milpitas Hotel, Milpitas, CA. The meetup was led by Maxim Shkarayev, Lead Data Scientist, Impetus Technologies along with Punit Shah, Solution Architect, StreamAnalytix and Anand Venugopal, Product Head & AVP, StreamAnalytix, who introduced and summarized the vast field of Anomaly Detection and its applications in various industry problems. The speakers at the event also offered a structured approach to choose the right anomaly detection techniques based on specific use-cases and data characteristics which was followed by a demonstration of some real-world anomaly detection use-cases on Apache Spark based analytics platform.
Machine Learning: Past, Present and Future - by Tom Dietterich – BigML, Inc
There are many uses for Machine Learning. This technology began as a form of Data-Driven Software Engineering, but a more recent development is Machine Learning for Data Science: its tools can help us understand the many forms of data that are collected by companies, scientists and governments. Another important trend is Machine Learning for Optimizing Operations: for example, logistics, scheduling, advertisement placement, etc. Also, recent advances in anomaly detection are helping us understand when the results of previous Machine Learning cannot be trusted or when changes in the inputs are surprising.
Find more details here: http://www.madridml.com/en/.
DSRLab seminar: Introduction to deep learning – Poo Kuan Hoong
Deep learning is a subfield of machine learning that has shown tremendous progress in the past 10 years. The success can be attributed to large datasets, cheap computing like GPUs, and improved machine learning models. Deep learning primarily uses neural networks, which are interconnected nodes that can perform complex tasks like object recognition. Key deep learning models include Restricted Boltzmann Machines (RBMs), Deep Belief Networks (DBNs), Convolutional Neural Networks (CNNs), and Recurrent Neural Networks (RNNs). CNNs are commonly used for computer vision tasks while RNNs are well-suited for sequential data like text or time series. Deep learning provides benefits like automatic feature learning and robustness, but also has weaknesses.
Using Deep Learning to do Real-Time Scoring in Practical Applications – Greg Makowski
http://www.meetup.com/SF-Bay-ACM/events/227480571/
(see also YouTube for a recording of the presentation)
The talk will cover a brief review of neural network basics and the following types of neural network deep learning:
* autocorrelational - unsupervised learning for extracting features. He will describe how additional layers build complexity in the feature extraction.
* convolutional - how to detect shift invariant patterns in various data sources. Horizontal shift invariant detection applies to signals like speech recognition or IoT data. Horizontal and vertical shift invariance applies to images or videos, for faces or self driving cars
* discuss details of applying deep net systems for continuous or real time scoring
* reinforcement learning or Q Learning - such as learning how to play Atari video games
* continuous space word models - such as word2vec, skipgram training, NLP understanding and translation
Artificial intelligence in the post-deep learning era – Deakin University
Deep learning has recently reached the heights that pioneers in the field had aspired to, serving as the driving force behind recent breakthroughs in AI, which have arguably surpassed the Turing test. At present, the spotlight is on scaling Transformers and diffusion models on Internet-scale data. In this talk, I will provide an overview of the fundamental principles of deep learning, its powers, and limitations, and explore the new era of post-deep learning. This new era encompasses novel objectives, dynamic architectures, abstract reasoning, neurosymbolic hybrid systems, and LLM-based agent systems.
Deep learning has recently reached the height the pioneers wished for, serving as the driving force behind recent breakthroughs in AI, which have arguably surpassed the Turing test. In this tutorial, we will provide an overview of the fundamental principles of deep learning and explore the latest advances in the field, including Foundation Models. We will also examine the powers and limitations of deep learning, exploring how reasoning may emerge from carefully crafted neural networks and massively pre-trained models.
AI for automated materials discovery via learning to represent, predict, gene... – Deakin University
A brief overview of how our AI can help automate the materials discovery process, covering a wide range of problems, from drug design to crystal plasticity.
Deep learning, enabled by powerful compute and fuelled by massive data, has delivered unprecedented data analytics capabilities. However, major limitations remain. Chief among them is that deep neural networks tend to exploit the surface statistics in the data, creating short-cuts from the input to the output, without deeply understanding the data. As a result, these networks fail miserably to generalize to novel combinations. This is because the networks perform shallow pattern matching but not deliberate reasoning – the capacity to deliberately deduce new knowledge out of the contextualized data. Second, machine learning is often trained to do just one task at a time, making it impossible to re-define tasks on the fly as needed in a complex operating environment. This talk presents our recent developments to extend the capacity of neural networks to remove these limitations. Our main focus is on learning to reason from data, that is, learning to determine if the data entails a conclusion. This capacity opens up new ways to generate insights from data through arbitrary querying using natural languages without the need of predefining a narrow set of tasks.
Generative AI represents a pivotal moment in computing history, opening up new opportunities for scientific discoveries. By harnessing extensive and diverse datasets, we can construct new general-purpose Foundation Models that can be fine-tuned for specific prediction and exploration tasks. This talk introduces our research program, which focuses on leveraging the power of Generative AI for materials discovery. Generative AI facilitates rapid exploration of vast materials design spaces, enabling the identification of new compounds and combinations. However, this field also presents significant challenges, such as effectively representing crystals in a compact manner and striking the right balance between utilizing known structural regions and venturing into unexplored territories. Our research delves into the development of a new kind of generative models specifically designed to search for diverse molecular/crystal regions that yield high returns, as defined by domain experts. In addition, our toolset includes Large Language Models that have been fine-tuned using materials literature and scientific knowledge. These models possess the ability to comprehend extensive volumes of materials literature, encompassing molecular string representations, mathematical equations in LaTeX, and codebases. We explore the open challenges, including effectively representing deep domain knowledge and implementing efficient querying techniques to address materials discovery problems.
This document discusses generative AI and its potential impacts. It provides an overview of generative AI capabilities like one model for all tasks, emergent behaviors, and in-context learning. Applications discussed include materials discovery, process monitoring, and battery modeling. The document outlines a vision for 2030 where generative AI becomes more general purpose and powerful, enabling new industries and economic growth while also raising risks around concentration of power, misuse, and safe and ethical development.
AI has played a limited role in the COVID-19 pandemic so far, scoring a B- according to one expert. It has helped in some areas like early warning, image-based diagnosis, and optimizing clinical trials. However, it could not demonstrate great impact in regions with complex healthcare systems and high inertia. Going forward, AI may accelerate tasks like forecasting medical resource needs, optimizing logistics, and assisting vaccine and drug discovery for future pandemics if developed with proper objectives, less reliance on historical data, and alignment with human values.
- The document discusses various approaches for applying machine learning and artificial intelligence to drug discovery.
- It describes how molecules and proteins can be represented as graphs, fingerprints, or sequences to be used as input for models.
- Different tasks in drug discovery like target binding prediction, generative design of new molecules, and drug repurposing are framed as questions that AI models can aim to answer.
- Techniques discussed include graph neural networks, reinforcement learning, and conditional generation using techniques like translation models.
- Several recent works applying these approaches for tasks like predicting drug-target interactions and generating synthesizable molecules are referenced.
The document discusses using deep learning models to analyze episodic healthcare data and make predictions. It proposes:
1) Viewing healthcare processes as executable computer programs with hidden "grammars" that can be learned from observational data.
2) Modeling health dynamics as a system of state transitions where treatments shift illness states, and historical events' importance is person-specific.
3) Training models by minimizing prediction loss to forecast outcomes like readmission, mortality, and disease progression based on patients' diseases, treatments, and visits over time.
Deep learning for biomedical discovery and data mining II – Deakin University
(1) The document discusses deep learning techniques for analyzing biomedical data from electronic medical records (EMRs).
(2) It describes models like DeepPatient that use autoencoders to learn representations of patient records that can predict diseases.
(3) Other models like Deepr and DeepCare use convolutional and recurrent neural networks to model temporal patterns in EMRs and predict future health risks and care trajectories.
This document provides an overview of recent advances in applying artificial intelligence and machine learning techniques to matters and materials. It discusses several key ideas and approaches, including:
- Using graph neural networks and message passing algorithms to model molecules as graphs and predict molecular properties.
- Generative models like variational autoencoders and generative adversarial networks to represent molecules in a continuous latent space and generate new molecular structures.
- Reinforcement learning approaches for predicting chemical reactions and planning chemical syntheses.
- Directed generation of molecular graphs using graph variational autoencoders to overcome limitations of string-based representations.
The document outlines many promising directions for using deep learning to tackle important problems in chemistry and materials science.
This document discusses representation learning on graphs. It begins by explaining why graph representation learning is important as graphs are pervasive in many scientific disciplines. It then discusses various techniques for graph representation learning including graph embedding methods like DeepWalk and Node2Vec, message passing neural networks, and graph generation methods like variational autoencoders. The document concludes by discussing challenges in graph reasoning to deduce knowledge from graphs in response to queries.
The document discusses how deep learning can be applied to genomics. It outlines several genomic problems that deep learning may be able to help with, such as gene-disease mapping, binding site identification, and sequence generation. It then provides examples of existing deep learning applications for related tasks like predicting gene expression and identifying binding sites. Overall, the document argues that deep learning is a promising approach for many genomics problems by leveraging its ability to learn from large amounts of data and discover complex patterns.
Deep learning for detecting anomalies and software vulnerabilities
1. 17/1/17 1
Source: rdn consulting
Hanoi, Jan 17th 2017
Trần Thế Truyền
Deakin University
@truyenoz
prada-research.net/~truyen
[email protected]
letdataspeak.blogspot.com
goo.gl/3jJ1O0
DEEP LEARNING FOR
DETECTING ANOMALIES AND
SOFTWARE VULNERABILITIES
tranthetruyen
5. OUR APPROACH TO SECURITY: (DEEP)
MACHINE LEARNING
Usual detection of attacks is based on profiling and human skills
But attacking tactics change over time, creating zero-day attacks
Systems are very complex now, and no human can cover everything
→ It is best to use machines that learn continuously and automatically.
→ Humans can provide feedback for the machine to correct itself.
→ Deep learning is on the rise.
For now: it is best for human and machine to co-operate.
17/1/17 5
7. AGENDA
Part I: Introduction to deep learning
A brief history
Top 3 architectures
Unsupervised learning
Part II: Anomaly detection
Part III: Software vulnerabilities
17/1/17 7
9. DEEP LEARNING IS SUPER HOT
17/1/17 9
[Chart: search trends for “deep learning” + data and “deep learning” + intelligence, Dec 2016]
10. DEEP LEARNING IS NEURAL NETS, BUT …
[Figure: multilayer perceptrons, 1986 vs 2016]
http://blog.refu.co/wp-content/uploads/2009/05/mlp.png
17/1/17 10
11. TWO LEARNING PARADIGMS
Supervised learning (mostly machine): A → B
Unsupervised learning (mostly human)
Will be quickly solved for “easy” problems (Andrew Ng)
17/1/17 11
12. KEY IN MACHINE LEARNING: FEATURE
ENGINEERING
In typical machine learning projects, 80-90% of the effort is on feature engineering
With the right feature representation, not much more work is needed: simple linear methods often work well.
Text: BOW, n-gram, POS, topics, stemming, tf-idf, etc.
Software: token, LOC, API calls, #loops, developer reputation, team complexity, report readability, discussion length, etc.
Try yourself on Kaggle.com!
17/1/17 12
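To make this concrete, here is a minimal sketch (assuming scikit-learn; the toy corpus and labels are invented for illustration) of a bag-of-words/tf-idf representation feeding a simple linear classifier:

```python
# Minimal sketch: tf-idf text features + a linear classifier.
# The toy "documents" and labels below are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

docs = [
    "buffer overflow in the login handler",     # label 1: suspicious
    "fix typo in the README",                   # label 0: benign
    "unchecked memcpy length allows overflow",  # label 1: suspicious
    "update copyright year in headers",         # label 0: benign
]
labels = [1, 0, 1, 0]

# With a good feature representation, a simple linear method often suffices.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(docs, labels)
print(model.predict(["possible overflow in the parser"]))
```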
20. DEEP AUTOENCODER – SELF
RECONSTRUCTION OF DATA
17/1/17 20
[Diagram: raw data → encoder → representation (feature detector) → decoder → reconstruction; stacking such layers gives a deep auto-encoder]
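A minimal sketch of this idea (assuming PyTorch; sizes and data are placeholders): train an autoencoder to reconstruct normal data, then score new points by their reconstruction error.

```python
# Minimal autoencoder sketch: learn to reconstruct "normal" data;
# a large reconstruction error flags a potential anomaly.
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, dim_in=100, dim_hidden=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim_in, dim_hidden), nn.ReLU())
        self.decoder = nn.Linear(dim_hidden, dim_in)

    def forward(self, x):
        return self.decoder(self.encoder(x))

x_normal = torch.randn(1000, 100)          # stand-in for normal training data
model = AutoEncoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for _ in range(200):                       # train on normal data only
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x_normal), x_normal)
    loss.backward()
    opt.step()

# Score new points by reconstruction error; the threshold is data-dependent.
with torch.no_grad():
    x_new = torch.randn(5, 100)
    errors = ((model(x_new) - x_new) ** 2).mean(dim=1)
print(errors)
```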
21. GENERATIVE MODELS
17/1/17 21
Many applications:
• Text to speech
• Simulate data that are hard to obtain/share in real life (e.g.,
healthcare)
• Generate meaningful sentences conditioned on some input
(foreign language, image, video)
• Semi-supervised learning
• Planning
22. A FAMILY: RBM → DBN → DBM
17/1/17 22
[Diagram: energy-based models]
Restricted Boltzmann Machine (~1994, 2001)
Deep Belief Net (2006)
Deep Boltzmann Machine (2009)
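For reference, the energy function underlying this family, in its standard binary-RBM form (textbook notation, not taken from the slide), is

```latex
E(\mathbf{v},\mathbf{h}) = -\mathbf{a}^{\top}\mathbf{v} - \mathbf{b}^{\top}\mathbf{h}
                           - \mathbf{v}^{\top}\mathbf{W}\mathbf{h},
\qquad
P(\mathbf{v},\mathbf{h}) = \frac{\exp\!\big(-E(\mathbf{v},\mathbf{h})\big)}{Z},
```

where v and h are the visible and hidden units, W, a, b are the weights and biases, and Z is the partition function; DBNs and DBMs stack such modules into deeper models.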
23. GAN: GENERATIVE ADVERSARIAL NETS
(GOODFELLOW ET AL, 2014)
Yann LeCun: GAN is one of the best ideas in the past 10 years!
Instead of modeling the entire distribution of data, a GAN learns to map ANY random distribution into the region of data, so that no discriminator can distinguish sampled data from real data.
[Diagram: any random distribution in any space → generator (a neural net that maps z → x) → binary discriminator, usually a neural classifier]
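A minimal GAN training loop could look like the following sketch (assuming PyTorch; the "real" distribution and network sizes are toy placeholders):

```python
# Minimal GAN sketch: a generator G maps noise z to data space;
# a discriminator D learns to tell real samples from generated ones.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))   # z -> x
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))   # x -> logit
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for _ in range(1000):
    x_real = torch.randn(64, 2) * 0.5 + 2.0   # stand-in "real" distribution
    z = torch.randn(64, 8)                    # any random distribution works

    # Discriminator step: push real -> 1, generated -> 0.
    opt_d.zero_grad()
    loss_d = bce(D(x_real), torch.ones(64, 1)) + \
             bce(D(G(z).detach()), torch.zeros(64, 1))
    loss_d.backward()
    opt_d.step()

    # Generator step: try to make the discriminator output 1 on fakes.
    opt_g.zero_grad()
    loss_g = bce(D(G(z)), torch.ones(64, 1))
    loss_g.backward()
    opt_g.step()
```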
24. GAN: GENERATED SAMPLES
The best quality pictures generated thus far!
17/1/17 24
[Figure: real vs. generated samples]
http://kvfrans.com/generative-adversial-networks-explained/
25. DEEP LEARNING IN COGNITIVE DOMAINS
17/1/17 25
http://media.npr.org/
http://cdn.cultofmac.com/
Where humans can recognise, act or answer accurately within seconds
blogs.technet.microsoft.com
26. DEEP LEARNING IN NON-COGNITIVE DOMAINS
Where humans need extensive training to do well
Domains that demand transparency & interpretability
… healthcare
… security
… genetics, foods, water …
(Image credits: dbta.com, TEKsystems)
28. ANOMALY DETECTION
USING UNSUPERVISED LEARNING
dbta.com
This work is partially supported by the Telstra-Deakin Centre of Excellence in Big Data and Machine Learning
29. AGENDA
Part I: Introduction to deep learning
Part II: Anomaly detection
Multichannel
Unusual mixed-data co-occurrence
Object lifetime model
Part III: Software vulnerabilities
30. BUT – we cannot define anomaly a priori
Strategy: learn normality; anything that does not fit is abnormal
31. PROJECT: DISCOVERY IN TELSTRA
SECURITY OPERATIONS
We use smart people and smart tools to
discover unknown malicious or risky behaviour to
inform and protect Telstra and its customers.
Discovery workflow:
32. SOFTWARE: SNAPSHOT
A) Anomaly detection systems
B) The main screen showing
the residual signal, and the
threshold for anomaly
detection
C) Top anomalies
D) Event details for selected
anomaly
33. MULTICHANNEL FRAMEWORK
Detect common anomalous events that happen across multiple
information channels
1. Cross-channel Autoencoder (CC-AE)
General framework
(Figure: channels 1 … N each feed a per-channel anomaly detector; the resulting anomaly lists 1 … N feed a further anomaly detector that outputs the common cross-channel anomalies.)
35. METHOD: CROSS-CHANNEL AUTOENCODER
1. Single-channel anomaly detection
For each channel, model the data with an autoencoder
Determine anomalies by analysing the reconstruction errors
2. Augmenting the reconstruction errors
Augment (concatenate) the reconstruction errors across channels
Model the combined errors with a second autoencoder
3. Cross-channel anomaly detection
Determine the cross-channel anomalies by analysing the second autoencoder's reconstruction errors
(A runnable sketch of this pipeline follows.)
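A minimal sketch of the three steps in Keras. Assumptions (mine, not the authors' settings): `channels` is a list of per-channel feature matrices aligned by sample; layer sizes and the top-10 cut-off are illustrative.

```python
import numpy as np
from tensorflow import keras

def fit_autoencoder(X, code_dim=32, epochs=20):
    # One-hidden-layer autoencoder: X -> code -> reconstruction of X.
    inp = keras.Input(shape=(X.shape[1],))
    code = keras.layers.Dense(code_dim, activation="relu")(inp)
    out = keras.layers.Dense(X.shape[1])(code)
    ae = keras.Model(inp, out)
    ae.compile(optimizer="adam", loss="mse")
    ae.fit(X, X, epochs=epochs, verbose=0)
    return ae

def recon_errors(ae, X):
    # Per-sample, per-feature squared reconstruction error.
    return (X - ae.predict(X, verbose=0)) ** 2

# Step 1: one autoencoder per channel.
channel_aes = [fit_autoencoder(X_c) for X_c in channels]
# Step 2: augment (concatenate) the error signals across channels.
E = np.hstack([recon_errors(ae, X_c)
               for ae, X_c in zip(channel_aes, channels)])
# Step 3: a second autoencoder over the errors; samples whose error
# patterns reconstruct poorly are the common cross-channel anomalies.
error_ae = fit_autoencoder(E)
scores = recon_errors(error_ae, E).sum(axis=1)
top_anomalies = np.argsort(scores)[::-1][:10]
```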
36. RESULTS 1 – NEWS DATA
A channel is defined to be the stream of articles about a specific topic
published by a news agency, e.g. economy-related articles from BBC
3 news agencies: BBC, Reuters, and CNN
9 predefined topics: politics, sports, health, entertainment, world-news,
technology, and Asian news
Free-form text data
Feature extraction: Bag of words representation
Anomaly injection: Breastfeeding articles
38. RESULTS 2 – DEAKIN SQUID DATA
Squid is a caching and forwarding web proxy.
Each server is defined to be a channel
7 channels with inbound and outbound network data
Sample datapoint (one Squid access-log line; the numbered field markers from the original slide are omitted):
1469032590.233 19 10.132.169.158 TCP_HIT/200 4771 GET http://Europe.cnn.com/EUROPE/potd/2001/04/17/tz.pulltizer.ap.jp - NONE/- image/jpeg
Bag-of-words representation
Anomaly injection: URLs from URLBlackList.com
40. AGENDA
Part I: Introduction to deep learning
Part II: Anomaly detection
Multichannel
Unusual mixed-data co-occurrence
Object lifetime model
Part III: Software vulnerabilities
47. AGENDA
Part I: Introduction to deep learning
Part II: Anomaly detection
Multichannel
Unusual mixed-data co-occurrence
Object lifetime model
Part III: Software vulnerabilities
48. OBJECT LIFETIME MODELS
Objects with a life:
Users
Devices
Accounts
Detect unusual behaviour at a given time, given the object's history
i.e., a (low) conditional probability of the next event/action/observation given the history
Two properties:
Irregular timing driven by internal activities
Intervention by external agents
49. DEEPEVOLVE: A MODEL OF EVOLVING BODY
States are a dynamic memory process → LSTM moderated by time and
intervention
Discrete observations → vector embedding
Time and previous intervention → “forgetting” of old states
Current intervention → controlling the current states (one way to write this is sketched below)
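One plausible way to write the time-moderated forgetting (my notation; a sketch in the spirit of the slide, not the exact model): the LSTM memory update is damped by a decay d(Δt) that grows with the elapsed time Δt since the last event, e.g.

```latex
c_t = d(\Delta t)\, f_t \odot c_{t-1} + i_t \odot \tilde{c}_t,
\qquad d(\Delta t) = \frac{1}{\log\!\left(e + \Delta t\right)}
```

Interventions can likewise enter the gates as extra inputs, so that a recent external action suppresses or resets parts of the state.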
55. TRADITIONAL METHOD: FEATURE ENGINEERING + CLASSIFIER
Protocols
Domains, countries
IP-based analysis
Lexical analysis
Query analysis
Handling shortening & dynamically-generated queries
N-grams
Special characters
Blacklist
56. NEW METHOD: LEARNABLE CONVOLUTION AS FEATURE DETECTOR
Learnable kernels act as feature detectors, often many per layer
http://colah.github.io/posts/2015-09-NN-Types-FP/
andreykurenkov.com
57. END-TO-END MODEL OF MALICIOUS URLS
(Figure: the characters of a URL such as "h t t p : / / w w w . s …" are (1) embedded as char vectors (may be one-hot), (2) convolved for motif detection, (3) max-pooled into a record vector, and (4) classified as Safe/Unsafe by a feedforward net.)
Trained on 900K malicious URLs and 1,000K good URLs
Accuracy: 96%
No feature engineering!
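A minimal sketch of such an end-to-end character-level model in Keras. Assumptions (mine): `urls` is a list of strings and `labels` a 0/1 numpy array; the URL length, vocabulary, filter sizes and epochs are illustrative, not the settings behind the 96% figure.

```python
import numpy as np
from tensorflow import keras

MAX_LEN, VOCAB = 200, 128          # truncate/pad URLs; raw ASCII codes as ids

def encode(url):
    ids = [min(ord(c), VOCAB - 1) for c in url[:MAX_LEN]]
    return ids + [0] * (MAX_LEN - len(ids))

X = np.array([encode(u) for u in urls])

model = keras.Sequential([
    keras.Input(shape=(MAX_LEN,)),
    keras.layers.Embedding(VOCAB, 32),              # char vectors (could be one-hot)
    keras.layers.Conv1D(64, 5, activation="relu"),  # convolution: motif detection
    keras.layers.GlobalMaxPooling1D(),              # max-pooling -> record vector
    keras.layers.Dense(64, activation="relu"),      # feedforward prediction
    keras.layers.Dense(1, activation="sigmoid"),    # safe/unsafe
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(X, labels, epochs=5, validation_split=0.1)
```

Global max-pooling makes the model insensitive to URL length, which varies widely in practice.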
58. AGENDA
Part I: Introduction to deep learning
Part II: Anomaly detection
Part III: Software vulnerabilities
Malicious URL detection
Unusual source code
Code vulnerabilities
60. MOTIVATIONS
Software is eating the world.
IoT development is exploding. Software security is an extremely critical issue.
Vulnerable source files: 0.3–5%, depending on code review policy & quality of code.
General approach: machine learning instead of human manual effort and programming heuristics.
Many software metrics have been proposed: bugs, code complexity, churn rate, developer network activity metrics, fault history metrics, …
Question: can the machine learn all of these by itself?
61. APPROACH: CODE MODELING
Open source code is massive.
Bad coding is often a sign of bugs and security holes.
Malicious code may differ from safe code.
Ideas:
A code model assigns a probability to a piece of code.
Given the code context, if the conditional probability per token of a code piece is low compared to the rest → unusual code → more likely to contain defects or security vulnerabilities. (A scoring sketch follows.)
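A minimal sketch of that scoring rule. Here `lm` is a hypothetical trained code language model exposing `prob(context, token)`; the interface and the review policy are assumptions for illustration only.

```python
import math

def mean_surprisal(tokens, lm):
    # Average negative log-probability per token; high values = unusual code.
    nll = [-math.log(lm.prob(tuple(tokens[:i]), t))
           for i, t in enumerate(tokens)]
    return sum(nll) / len(nll)

# Rank source files by mean surprisal and review the top of the list first.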
62. A DEEP LANGUAGE MODEL FOR
SOFTWARE CODE (DAM ET AL, FSE’16 SE+NL)
A good language model for source code would capture the long-term
dependencies
The model can be used for various prediction tasks, e.g. defect
prediction, code duplication, bug localization, etc.
Slide by Hoa Khanh Dam
63. CHARACTERISTICS OF SOFTWARE CODE
Repetitiveness
E.g. for (int i = 0; i < n; i++)
Localness
E.g. for (int size may appear more often than for (int i in some source files.
Rich and explicit structural information
E.g. nested loops, inheritance hierarchies
Long-term dependencies
try and catch (in Java), or file open and close, do not immediately follow each other.
Slide by Hoa Khanh Dam
64. A LANGUAGE MODEL FOR SOFTWARE CODE
Given a code sequence s = <w1, …, wk>, a language model estimates the probability distribution P(s), factorised token by token:
P(s) = P(w1) P(w2 | w1) … P(wk | w1, …, wk−1)
Slide by Hoa Khanh Dam
65. TRADITIONAL MODEL: N-GRAMS
Truncates the history to the previous n−1 words (usually 2 to 5 in practice): P(wi | w1, …, wi−1) ≈ P(wi | wi−n+1, …, wi−1)
Useful and intuitive for exploiting repetitive sequential patterns in code
But the context is limited to a few code elements
Not sufficient for complex SE prediction tasks
As we read a piece of code, we understand each code token based on our understanding of previous code tokens, i.e. the information persists. (A toy sketch follows.)
Slide by Hoa Khanh Dam
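A toy illustration of an n-gram code model (n = 3, Laplace smoothing). `tokens` is assumed to be a list of code tokens; the vocabulary size is borrowed from the corpus statistics quoted in the experiments below.

```python
from collections import Counter, defaultdict

# tokens: a list of code tokens, e.g. ["for", "(", "int", "i", "=", ...]
counts = defaultdict(Counter)
for a, b, c in zip(tokens, tokens[1:], tokens[2:]):
    counts[(a, b)][c] += 1

def p_next(a, b, c, alpha=1.0, vocab_size=81_213):
    # Laplace-smoothed trigram probability P(c | a, b).
    ctx = counts[(a, b)]
    return (ctx[c] + alpha) / (sum(ctx.values()) + alpha * vocab_size)
```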
66. NEW METHOD: LONG SHORT-TERM MEMORY (LSTM)
(Figure: an LSTM cell with input x_t, input gate i_t, forget gate f_t and memory cell c_t.)
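In equations, a common formulation of the cell (matching the gate labels in the figure; o_t is the output gate, σ the logistic sigmoid, ⊙ the element-wise product):

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```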
67. CODE LANGUAGE MODEL
Previous work has applied RNNs to model software code (White et al, MSR 2015)
RNNs however do not capture the long-term dependencies in code
Slide by Hoa Khanh Dam
68. EXPERIMENTS
Built a dataset of 10 Java projects: Ant, Batik, Cassandra, Eclipse-E4, Log4J, Lucene, Maven2, Maven3, Xalan-J, and Xerces.
Comments and blank lines removed. Each source code file is tokenized to produce a sequence of code tokens.
Integers, real numbers, exponential notation and hexadecimal numbers replaced with a <num> token, and constant strings replaced with a <str> token.
Less “popular” tokens replaced with <unk>.
Code corpus of 6,103,191 code tokens, with a vocabulary of 81,213 unique tokens. (A preprocessing sketch follows.)
Slide by Hoa Khanh Dam
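A hedged sketch of this preprocessing; the regexes and the rarity threshold are my assumptions, not the authors' exact rules.

```python
import re
from collections import Counter

NUM = re.compile(r"\b(0[xX][0-9a-fA-F]+|\d+(\.\d+)?([eE][+-]?\d+)?)\b")
STR = re.compile(r'"(?:\\.|[^"\\])*"')

def normalise(code):
    code = STR.sub("<str>", code)   # constant strings -> <str>
    code = NUM.sub("<num>", code)   # ints, reals, exponents, hex -> <num>
    return code.split()             # a real tokenizer would be language-aware

def replace_rare(token_seqs, min_count=5):
    # Replace infrequent ("less popular") tokens with <unk>.
    freq = Counter(t for seq in token_seqs for t in seq)
    return [[t if freq[t] >= min_count else "<unk>" for t in seq]
            for seq in token_seqs]
```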
69. EXPERIMENTS (CONT.)
Both RNN and LSTM improve with more training data (whose size grows with sequence length).
LSTM consistently outperforms RNN: 4.7% to 27.2% improvement when varying sequence length, and 10.7% to 37.9% when varying embedding size.
Slide by Hoa Khanh Dam
70. AGENDA
Part I: Introduction to deep learning
Part II: Anomaly detection
Part III: Software vulnerabilities
Malicious URL detection
Unusual source code
Code vulnerabilities
71. METHOD-1: LD-RNN FOR
SEQUENCE CLASSIFICATION
(CHOETKIERTIKUL ET AL, WORK IN PROGRESS)
LD = Long Deep
LSTM for document representation
Highway net with tied parameters for the vulnerability score
(Figure: tokens W1 … W6 → embedding → LSTM states h1 … h6 → pooling into a document representation → recurrent highway net → regression to a vulnerability score.)
72. METHOD-2: DEEP SEQUENTIAL MULTI-INSTANCE LEARNING
Code file as a bag
Methods as instances
Data are sequential
(Figure: a file's headers and methods 1 … N map to a file-level vulnerability level. A generic sketch follows.)
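A generic deep sequential multi-instance sketch in Keras (not the exact model of the talk): each method is a token sequence scored by a shared LSTM, and the file-level vulnerability is the max over its methods. Vocabulary, lengths and padding scheme are assumptions.

```python
from tensorflow import keras

VOCAB, MAX_TOKENS = 10_000, 300  # assumed token vocabulary and method length

# Instance model: one method (a padded token-id sequence) -> a score.
method_in = keras.Input(shape=(MAX_TOKENS,))
h = keras.layers.Embedding(VOCAB, 64)(method_in)
h = keras.layers.LSTM(64)(h)                       # methods are sequential data
inst_score = keras.layers.Dense(1, activation="sigmoid")(h)
method_net = keras.Model(method_in, inst_score)

# Bag model: a file is a (padded) stack of methods; max-pool instance scores.
bag_in = keras.Input(shape=(None, MAX_TOKENS))     # variable #methods per file
scores = keras.layers.TimeDistributed(method_net)(bag_in)
bag_score = keras.layers.GlobalMaxPooling1D()(scores)
mil_model = keras.Model(bag_in, bag_score)
mil_model.compile(optimizer="adam", loss="binary_crossentropy")
```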
73. COLUMN BUNDLE FOR N-TO-1 MAPPING
(PHAM ET AL, WORK IN PROGRESS)
(Figure: functions A and B are each encoded as a column; the column representations are bundled into a single output.)
75. The good old:
• Representation learning (RBM, DBN, DBM, DDAE)
• Ensemble
The popular:
• Back-propagation
• Adaptive stochastic gradient
• Dropouts & batch-norm
• Rectifier linear transforms & skip-connections
• Highway nets, LSTM & CNN
On the rise:
• Differentiable Turing machines
• Memory, attention & reasoning
• Reinforcement learning & planning
• Lifelong learning
The black sheep:
• Group theory (Lie algebra, renormalisation group, spin-glass)
76. WHY DEEP LEARNING WORKS: PRINCIPLES
Expressiveness
Can represent the complexity of the world → feedforward nets are universal function approximators
Can compute anything computable → recurrent nets are Turing-complete
Learnability
Have a mechanism to learn from training signals → neural nets are highly trainable
Generalizability
Work on unseen data → deep net systems work in the wild (self-driving cars, Google Translate/Voice, AlphaGo)
77. WHEN DEEP LEARNING WORKS
Lots of data (e.g., millions of examples)
Strong, clean training signals (e.g., when humans can provide correct labels – cognitive domains)
Andrew Ng of Baidu: when humans do well within a sub-second
Data structures are well-defined (e.g., image, speech, NLP, video)
Data is compositional (luckily, most data are like this)
The more primitive (raw) the data, the more benefit there is in using deep learning
78. BONUS: HOW TO POSITION
“[…] the dynamics of the game will evolve. In the long
run, the right way of playing football is to position yourself
intelligently and to wait for the ball to come to you. You’ll
need to run up and down a bit, either to respond to how
the play is evolving or to get out of the way of the scrum
when it looks like it might flatten you.” (Neil Lawrence,
7/2015, now with Amazon)
http://inverseprobability.com/2015/07/12/Thoughts-on-ICML-2015/
79. THE ROOM IS WIDE OPEN
Architecture engineering
Non-cognitive apps
Unsupervised learning
Graphs
Learning while preserving privacy
Modelling of domain invariance
Better data efficiency
Multimodality
Learning under adversarial stress
Better optimization
Going Bayesian
http://smerity.com/articles/2016/architectures_are_the_new_feature_engineering.html