This document compares different machine learning algorithms for handwritten digit recognition on the MNIST dataset. Convolutional neural networks achieved the best results, with LeNet5 achieving 0.9% error and boosted LeNet4 achieving the lowest error rate of 0.7%. Neural networks required more training time but had faster recognition times and lower memory requirements compared to nearest neighbor classifiers. Overall, convolutional neural networks were best suited for handwritten digit recognition due to their ability to handle variations in size, position and orientation of digits.
HANDWRITTEN DIGIT RECOGNITION USING k-NN CLASSIFIER vineet raj
This document proposes using a k-nearest neighbor classifier to recognize handwritten digits from the MNIST database. It discusses existing methods that use star-layered histogram feature extraction and class-dependent feature selection, which achieve accuracies of around 93% and 92% respectively. However, these methods require thinning operations or have high computational costs. The document proposes using k-NN classification with pre-processing and feature extraction to achieve higher accuracy of around 96% with lower computation requirements than existing models.
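The k-NN idea described above can be sketched in a few lines. This is a minimal illustration only, not the summarized document's implementation: the function name `knn_predict`, the choice of Euclidean distance, and the toy 2-D feature vectors are all assumptions made for the example.

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs.
    """
    # Order the training points by Euclidean distance to the query.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-D feature vectors standing in for extracted digit features.
train = [
    ((0.0, 0.0), "0"), ((0.1, 0.2), "0"), ((0.2, 0.1), "0"),
    ((1.0, 1.0), "1"), ((0.9, 1.1), "1"), ((1.1, 0.9), "1"),
]

print(knn_predict(train, (0.15, 0.1)))
print(knn_predict(train, (0.95, 1.0)))
```

Because k-NN stores the whole training set and compares every query against it, prediction cost grows with the number of training examples, which is why pre-processing and compact features matter for this approach.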
Handwritten digit recognition using image processing anita maharjan
The document presents a case study on handwritten digit recognition using image processing and neural networks. It discusses collecting handwritten digit images, preprocessing the images by cutting, resizing and extracting features, and then training a neural network using backpropagation to recognize the digits. The system aims to recognize handwritten digits for applications like signature, currency and number plate recognition. It concludes that understanding neural networks makes it easier to apply such intelligent recognition to machines.
GUI based handwritten digit recognition using CNN Abhishek Tiwari
This project builds a model that recognizes handwritten digits, together with a user-friendly GUI on which the user can draw a digit and get the predicted output.
A fast-paced introduction to Deep Learning concepts, such as activation functions, cost functions, back propagation, and then a quick dive into CNNs. Basic knowledge of vectors, matrices, and derivatives is helpful in order to derive the maximum benefit from this session.
A convolutional neural network (CNN, or ConvNet) is a machine learning algorithm widely used in computer vision for image classification, image detection, digit recognition, and many other tasks. https://technoelearn.com .
The document discusses using a convolutional neural network to recognize handwritten digits from the MNIST database. It describes training a CNN on the MNIST training dataset, consisting of 60,000 examples, to classify images of handwritten digits from 0-9. The CNN architecture uses two convolutional layers followed by a flatten layer and fully connected layer with softmax activation. The model achieves high accuracy on the MNIST test set. However, the document notes that the model may struggle with color images or images with more complex backgrounds compared to the simple black and white MNIST digits. Improving preprocessing and adapting the model for more complex real-world images is suggested for future work.
In machine learning, a convolutional neural network is a class of deep, feed-forward artificial neural networks that have been successfully applied to analyzing visual imagery.
This document discusses a project using convolutional neural networks to recognize handwritten digits from the MNIST dataset. It proposes a hierarchical convolutional neural network approach with two levels - an initial CNN to make preliminary predictions and additional CNNs to further classify ambiguous digits. The model is trained and tested on MNIST data, achieving an error rate of 0.82%. Key aspects covered include CNNs, hierarchical networks, and training/testing a model for handwritten digit recognition.
This document discusses convolutional neural networks (CNNs). It explains that CNNs were inspired by research on the human visual system and take a similar approach to teach computers to identify objects in images. The document outlines the key components of CNNs, including convolutional and pooling layers to extract features from images, as well as fully connected layers to classify objects. It also notes that CNNs take pixel data as input and use many examples to generalize and make predictions, similar to how humans learn visual recognition.
This is a presentation on Handwritten Digit Recognition using Convolutional Neural Networks. Convolutional Neural Networks give better results as compared to conventional Artificial Neural Networks.
Deep Learning - Overview of my work II Mohamed Loey
Deep Learning, Machine Learning, MNIST, CIFAR 10, Residual Network, AlexNet, VGGNet, GoogleNet, Nvidia. Deep learning (DL) is a hierarchical network structure that simulates the structure of the human brain to extract features from internal and external input data.
Convolutional Neural Network (CNN) is a type of neural network that can take in an input image, assign importance to areas in the image, and distinguish objects in the image. CNNs use convolutional layers and pooling layers, which help introduce translation invariance to allow the network to recognize patterns and objects regardless of their position in the visual field. CNNs have been very effective for tasks involving visual imagery like image classification but may be less effective for natural language processing tasks that rely more on word order and sequence. Recurrent neural networks (RNNs) that can model sequential data may perform better than CNNs for some natural language processing tasks like text classification.
Tijmen Blankenvoort, co-founder Scyfer BV, presentation at Artificial Intelligence Meetup 15-1-2014. Introduction into Neural Networks and Deep Learning.
This document describes a technique for Sinhala handwritten character recognition using feature extraction and an artificial neural network. The methodology includes preprocessing, segmentation, feature extraction based on character geometry, and classification using an ANN. Features like starters, intersections, and zoning are extracted from segmented characters. The ANN was trained on these feature vectors and tested on 170 characters, achieving an accuracy of 82.1%. While the technique showed some success, the author notes room for improvement, such as making the system more font-independent and improving feature extraction and character separation.
Handwritten Digit Recognition using Convolutional Neural Networks IRJET Journal
This document discusses using a convolutional neural network called LeNet to perform handwritten digit recognition on the MNIST dataset. It begins with an abstract that outlines using LeNet, a type of convolutional network, to accurately classify handwritten digits from 0 to 9. It then provides background on convolutional networks and how they can extract and utilize features from images to classify patterns with translation and scaling invariance. The document implements LeNet using the Keras deep learning library in Python to classify images from the MNIST dataset, which contains labeled images of handwritten digits. It analyzes the architecture of LeNet and how convolutional and pooling layers are used to extract features that are passed to fully connected layers for classification.
Machine Learning - Convolutional Neural Network Richard Kuo
The document provides an overview of convolutional neural networks (CNNs) for visual recognition. It discusses the basic concepts of CNNs such as convolutional layers, activation functions, pooling layers, and network architectures. Examples of classic CNN architectures like LeNet-5 and AlexNet are presented. Modern architectures such as Inception and ResNet are also discussed. Code examples for image classification using TensorFlow, Keras, and Fastai are provided.
An introduction to selective search for object proposals, followed by deep dives into the R-CNN family and the state-of-the-art RetinaNet model for object detection, the mAP metric for evaluating models, and how anchor boxes help a model learn where to draw bounding boxes.
PR-169: EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks Jinwon Lee
This is a review of the 169th paper for PR12, the TensorFlow-KR paper reading group.
The paper covered this time is EfficientNet, published by Google. Research on efficient neural networks has usually focused on small networks for edge devices with limited computing power, such as mobile devices. This paper instead starts from the common practice of making a network progressively larger to improve performance, and studies how to scale a network up more efficiently. Please see the video for details.
Paper link: https://arxiv.org/abs/1905.11946
Video link: https://youtu.be/Vhz0quyvR7I
This document provides an agenda for a presentation on deep learning, neural networks, convolutional neural networks, and interesting applications. The presentation will include introductions to deep learning and how it differs from traditional machine learning by learning feature representations from data. It will cover the history of neural networks and breakthroughs that enabled training of deeper models. Convolutional neural network architectures will be overviewed, including convolutional, pooling, and dense layers. Applications like recommendation systems, natural language processing, and computer vision will also be discussed. There will be a question and answer section.
Convolutional neural networks (CNNs) are a type of neural network used for image recognition tasks. CNNs use convolutional layers that apply filters to input images to extract features, followed by pooling layers that reduce the dimensionality. The extracted features are then fed into fully connected layers for classification. CNNs are inspired by biological processes and are well-suited for computer vision tasks like image classification, detection, and segmentation.
Image classification with Deep Neural Networks Yogendra Tamang
This document discusses image classification using deep neural networks. It provides background on image classification and convolutional neural networks. The document outlines techniques like activation functions, pooling, dropout and data augmentation to prevent overfitting. It summarizes a paper on ImageNet classification using CNNs with multiple convolutional and fully connected layers. The paper achieved state-of-the-art results on ImageNet in 2010 and 2012 by training CNNs on a large dataset using multiple GPUs.
Convolutional neural networks (CNNs) are a type of deep neural network commonly used for analyzing visual imagery. CNNs use various techniques like convolution, ReLU activation, and pooling to extract features from images and reduce dimensionality while retaining important information. CNNs are trained end-to-end using backpropagation to update filter weights and minimize output error. Overall CNN architecture involves an input layer, multiple convolutional and pooling layers to extract features, fully connected layers to classify features, and an output layer. CNNs can be implemented using sequential models in Keras by adding layers, compiling with an optimizer and loss function, fitting on training data over epochs with validation monitoring, and evaluating performance on test data.
This document provides an overview of pattern recognition and supervised learning for machine vision. It discusses what pattern recognition is, examples of pattern recognition applications, the basic steps in a pattern recognition system including data acquisition, preprocessing, feature extraction, supervised/unsupervised learning, and post-processing. For supervised learning, it describes the process of inferring functions from labeled training data. It also provides an example of using multiple features and decision boundaries for texture classification of images.
Convolutional neural networks can be used for handwritten digit recognition. They employ replicated feature detectors with shared weights to achieve translation equivariance. Pooling layers provide some translation invariance while reducing the number of inputs to subsequent layers. The LeNet architecture developed by Yann LeCun used these techniques along with multiple hidden layers and achieved an error rate of around 1% on handwritten digit recognition. Dropout regularization helps convolutional neural networks generalize well when applied to large scale tasks like ImageNet classification by preventing complex co-adaptations between hidden units.
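The difference between translation equivariance (from shared weights) and translation invariance (from pooling) can be shown on a toy 1-D signal. This is an illustrative sketch, not code from the summarized slides: the helper names `correlate1d` and `max_pool1d`, the edge-detecting kernel, and the signals are assumptions chosen for the example.

```python
def correlate1d(signal, kernel):
    """Slide one shared kernel over the signal ('valid' positions only)."""
    n = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(n))
            for i in range(len(signal) - n + 1)]

def max_pool1d(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

kernel = [-1, 1]                         # toy edge detector (illustrative)
signal = [0, 0, 1, 1, 0, 0, 0, 0, 0]
shifted = [0, 0, 0, 1, 1, 0, 0, 0, 0]    # same pattern, one step to the right

a = correlate1d(signal, kernel)
b = correlate1d(shifted, kernel)

# Equivariance: shifting the input shifts the feature map by the same amount.
print(a)
print(b)
print(b[1:] == a[:-1])

# Local invariance: after pooling, the small shift is no longer visible.
print(max_pool1d(a, 4), max_pool1d(b, 4))
```

The pooled outputs match here only because the shift keeps the detected edge inside the same pooling window; a larger shift would move it to a neighboring window, which is why pooling gives only "some" invariance.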
The document discusses classifying handwritten digits from the MNIST dataset using various machine learning classifiers and evaluation metrics. It begins with binary classification of the digit 5 using SGDClassifier, evaluating accuracy which is misleading due to class imbalance. The document then introduces confusion matrices and precision/recall metrics to better evaluate performance. It demonstrates how precision and recall can be traded off by varying the decision threshold, and introduces ROC curves to visualize this tradeoff. Finally, it compares SGDClassifier and RandomForestClassifier on this binary classification task.
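The precision/recall tradeoff mentioned above can be reproduced without any library. The sketch below is an assumption-laden illustration rather than the summarized document's code: `precision_recall` is a hypothetical helper, the scores and labels are made-up data, and returning a precision of 1.0 when nothing is predicted positive is just one possible convention.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall for the positive class at a given score threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(p and y for p, y in zip(predicted, labels))        # true positives
    fp = sum(p and not y for p, y in zip(predicted, labels))    # false positives
    fn = sum((not p) and y for p, y in zip(predicted, labels))  # false negatives
    # Convention (an assumption): precision is 1.0 when nothing is predicted positive.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up decision scores and ground-truth labels ("is this digit a 5?").
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [False, False, True, False, True, False, True, True]

for threshold in (0.25, 0.5, 0.75):
    p, r = precision_recall(scores, labels, threshold)
    print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f}")
```

Raising the threshold makes the classifier more conservative: on this toy data precision rises while recall falls, which is the tradeoff a ROC or precision/recall curve traces out across all thresholds.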
Computer vision uses machine learning techniques to recognize objects in large amounts of images. A key development was the use of deep neural networks, which can recognize new images not in the training data as well as or better than humans. Graphics processing units (GPUs) enabled this breakthrough due to their ability to accelerate deep learning algorithms. Computer vision tasks involve both unsupervised learning, such as clustering visually similar images, and supervised learning, where algorithms are trained on labeled image data to learn visual classifications and recognize objects.
How Machine Learning Helps Organizations to Work More Efficiently? Tuan Yang
The volume of data grows every day, and so does the cost of storing and handling it. Understanding the concepts of machine learning makes it possible to handle this excess data and process it affordably.
The process involves building models with several kinds of algorithms. If a model is built carefully for a specific task, the organization has a much better chance of exploiting profitable opportunities and avoiding hidden risks.
Learn more about:
» Understanding Machine Learning Objectives.
» Data dimensions in Machine Learning.
» Fundamentals of Algorithms and Mapping from Input/Output.
» Parametric and Non-parametric Machine Learning Algorithms.
» Supervised, Unsupervised and Semi-Supervised Learning.
» Estimating Over-fitting and Under-fitting.
» Use Cases.
Machine Learning Basics for Web Application Developers Etsuji Nakai
This document provides an overview of machine learning basics for web application developers. It discusses linear binary classifiers and logistic regression, how to measure model fitness with loss functions, and graphical understandings of linear classifiers. It then covers linear multiclass classifiers using softmax functions, image classification with neural networks, and ways to improve accuracy using convolutional neural networks. Finally, it discusses client applications that use pre-trained machine learning models through API services and examples of smile detection and cucumber classification.
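For the multiclass case, the softmax function turns a linear classifier's raw scores into class probabilities, and cross-entropy is a common loss for measuring fit. The following is a minimal sketch under those assumptions; the function names and the example scores are illustrative and do not come from the summarized slides.

```python
import math

def softmax(logits):
    """Turn a list of real-valued scores into probabilities that sum to 1."""
    m = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, true_index):
    """Loss for one example: small when the true class gets high probability."""
    return -math.log(probs[true_index])

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three classes
probs = softmax(logits)
print(probs)
print(cross_entropy(probs, 0))  # loss if class 0 is the correct label
print(cross_entropy(probs, 2))  # larger loss if class 2 is the correct label
```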
Undergraduate Topics in Computer Science: Concise Computer Vision, An Introduction into Theory and Algorithms, by Reinhard Klette
FEATURE DETECTION AND OBJECT DETECTION - Localization, Classification, and Evaluation - Descriptors, Classifiers and Learning
Image Processing, Facial Expression
- The document presents a neural network model for recognizing handwritten digits. It uses a dataset of 20x20 pixel grayscale images of digits 0-9.
- The proposed neural network has an input layer of 400 nodes, a hidden layer of 25 nodes, and an output layer of 10 nodes. It is trained using backpropagation to classify images.
- The model achieves an accuracy of over 96.5% on test data after 200 iterations of training, outperforming a logistic regression model which achieved 91.5% accuracy. Future work could involve classifying more complex natural images.
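To make the 400-25-10 layout concrete, the sketch below counts its trainable parameters and runs one forward pass. It is an illustration only: the summary does not state the activation function or the initialization, so the sigmoid units, the uniform random weights, and the helper names are assumptions, and the random input stands in for a flattened 20x20 image.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """One dense layer: each row is [bias, w_1, ..., w_n_in] for one output unit."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)]
            for _ in range(n_out)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(layer, inputs):
    """Apply the layer: weighted sum plus bias, then sigmoid (assumed activation)."""
    return [sigmoid(row[0] + sum(w * x for w, x in zip(row[1:], inputs)))
            for row in layer]

rng = random.Random(0)
hidden = init_layer(400, 25, rng)   # 20x20 = 400 input pixels -> 25 hidden units
output = init_layer(25, 10, rng)    # 25 hidden units -> 10 digit classes

# (400 + 1) * 25 + (25 + 1) * 10 trainable parameters, biases included.
n_params = sum(len(row) for row in hidden) + sum(len(row) for row in output)
print(n_params)

image = [rng.random() for _ in range(400)]   # placeholder for a flattened image
scores = forward(output, forward(hidden, image))
predicted_digit = max(range(10), key=scores.__getitem__)
print(predicted_digit)
```

With untrained random weights the predicted digit is meaningless; backpropagation would adjust these same parameters to reduce the classification error.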
Details of Lazy Deep Learning for Images Recognition in ZZ Photo app PAY2 YOU
The talk covers deep learning for image recognition. It examines practical aspects of training deep convolutional networks on GPUs, discusses the speakers' personal experience porting trained neural networks into an application built on the OpenCV library, and compares the resulting pet detector, based on the Lazy Deep Learning approach, with a Viola-Jones detector.
Speakers: Artem Chernodub is an expert in artificial neural networks and artificial intelligence systems. He graduated from the Moscow Institute of Physics and Technology in 2007. He leads the Computer Vision group at ZZ Wolf and also works part-time as a researcher at the Institute of Mathematical Machines and Systems Problems of the National Academy of Sciences of Ukraine.
Yuriy Pashchenko is a specialist in machine vision and machine learning systems and holds a master's degree from NTUU "Kyiv Polytechnic Institute", Faculty of Applied Mathematics (2014). He works at ZZ Wolf as an R&D Engineer.
The document discusses various concepts in machine learning and deep learning including:
1. The semantic gap between what computers can see/read from raw inputs versus higher-level semantics. Deep learning aims to close this gap through hierarchical representations.
2. Traditional computer vision techniques versus deep learning approaches for tasks like face recognition.
3. The differences between rule-based AI, machine learning, and deep learning.
4. Key components of supervised machine learning models including data, models, loss functions, and optimizers.
5. Different problem types in machine learning like regression, classification, and their associated model architectures, activation functions, and loss functions.
6. Frameworks for machine learning like Keras and
- Quiz 1 will be on Wednesday covering material from lecture in multiple choice and short answer format, focusing on topics not covered by projects. Students are advised to review slides and textbook.
- A preview was given of Project 3 on machine learning, computer vision, and clustering strategies like k-means, agglomerative clustering, mean-shift clustering and spectral clustering.
- An overview of machine learning concepts was provided including the framework of applying a prediction function to an image's features to get an output label, the process of training and testing models, and common classifiers and their properties.
- Quiz 1 will be on Wednesday covering material from lecture in multiple choice and short answer format, focusing on topics not covered by projects. Students are advised to review slides and textbook.
- A preview was given of Project 3 and upcoming lecture topics on machine learning, computer vision, and clustering strategies.
- The document contained slides on machine learning frameworks, prediction functions, classifiers like nearest neighbors and SVMs, generalization, and bias-variance tradeoff. References for further machine learning reading were also provided.
- Quiz 1 will be on Wednesday covering material from lecture with an emphasis on topics not covered in projects. It will contain around 20 multiple choice or short answer questions to be completed in class.
- The document previews a machine learning lecture covering topics like clustering strategies, classifiers, generalization, bias-variance tradeoff, and support vector machines. It provides slides and summaries of key concepts.
- Summarizing techniques for reducing error in machine learning models like choosing simpler classifiers, adding regularization, and obtaining more training data.
- Quiz 1 will be on Wednesday covering material from lecture with an emphasis on topics not covered in projects. It will contain around 20 multiple choice or short answer questions to be completed in class.
- The document previews a machine learning lecture covering topics like clustering strategies, classifiers, generalization, bias-variance tradeoff, and support vector machines. It provides slides and summaries of key concepts.
- Summarizing techniques for reducing error in machine learning models like choosing simpler classifiers, collecting more training data, and regularizing parameters.
The document discusses deep learning in computer vision. It provides an overview of research areas in computer vision including 3D reconstruction, shape analysis, and optical flow. It then discusses how deep learning approaches can learn representations from raw data through methods like convolutional neural networks and restricted Boltzmann machines. Deep learning has achieved state-of-the-art results in applications such as handwritten digit recognition, ImageNet classification, learning optical flow, and generating image captions. Convolutional neural networks have been particularly successful due to properties of shared local weights and pooling layers.
Introduction to Convolutional Neural Network.pptx zikoAr
1. General introduction
Convolutional neural networks, or CNNs, represent a major advance in artificial intelligence, and more specifically in supervised machine learning. Inspired by the workings of the human visual cortex, CNNs are now ubiquitous in applications for image and video recognition, natural language processing, medical diagnosis, and much more.
The aim of this presentation is to provide a thorough yet accessible understanding of CNNs. We will explore their structure, how they work, the reasons for their effectiveness, and examples of real-world applications. All of this will be supported by illustrations and, where relevant, practical demonstrations.
2. Context and motivation
a. Origin
CNNs have their roots in the 1980s in the work of Yann LeCun, who introduced the first convolutional networks for handwritten digit recognition. This system was notably used in American banking for the automatic reading of checks.
b. Why CNNs?
Before CNNs, traditional models required a manual feature engineering step: humans had to extract the characteristics of an image by hand (e.g. edges, corners, shapes). CNNs let the machine learn these features automatically from the raw data.
c. Applications
Computer vision: object detection, facial recognition, image segmentation.
Medical: tumor detection on MRI scans.
Security: biometric recognition, intelligent video surveillance.
Autonomous cars: reading road signs, identifying pedestrians.
Art and creation: style transfer, automatic colorization.
3. Anatomy of a CNN
A CNN is made up of several layers, each playing a specific role in processing and analyzing images.
a. Convolution Layer
The heart of the CNN.
Applies a filter (or kernel) to the image to extract local features.
For example, a filter can detect vertical or horizontal edges.
b. ReLU (Rectified Linear Unit)
Non-linear activation function.
Applies f(x) = max(0, x) to each value, removing negative values.
Introduces non-linearity into the model.
c. Pooling Layer (subsampling)
Reduces the size of the representations (feature maps).
The most common types: Max Pooling, Average Pooling.
Reduces complexity and improves robustness.
d. Flatten + Fully Connected Layers
At the end of the CNN, the data is flattened and then processed by fully connected layers.
This is where the final classification is performed (e.g. cat or dog).
4. Fonctionnement global d’un CNN
a. Propagation avant (Forward Propagation)
L’image passe de couche en couche, transformée à chaque étape.
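The layer sequence described above (convolution, ReLU, pooling, flatten) can be sketched in a few lines of NumPy. This is only an illustrative sketch: the 6x6 toy image, the hand-picked 2x2 edge filter, and the helper names are assumptions, not part of the presentation, and a trained CNN would learn its filters rather than use a fixed one.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    Like most deep-learning libraries, this computes a cross-correlation
    (the kernel is not flipped).
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

def relu(x):
    """f(x) = max(0, x), applied element-wise."""
    return np.maximum(0, x)

def max_pool(x, size=2):
    """Keep the maximum of each non-overlapping size x size block."""
    h = x.shape[0] // size * size
    w = x.shape[1] // size * size
    blocks = x[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

# Toy image: left half dark, right half bright, i.e. one vertical edge.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hypothetical hand-picked filter that responds to dark-to-bright vertical edges.
kernel = np.array([[-1.0, 1.0],
                   [-1.0, 1.0]])

feature_map = relu(conv2d_valid(image, kernel))  # shape (5, 5)
pooled = max_pool(feature_map)                   # shape (2, 2)
flat = pooled.ravel()                            # input to a fully connected layer
```

The feature map is non-zero only in the column where the edge sits, and pooling keeps that response while shrinking the map.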
Comparison of Learning Algorithms for Handwritten Digit Recognition
1. Comparison of learning algorithms
for handwritten digit recognition
Y LeCun, L Jackel, L Bottou, A Brunot, C Cortes, J Denker, H Drucker, I
Guyon, U Muller, E Sackinger, P Simard, and V Vapnik
1995
Author | Safaa Alnabulsi
3. Introduction
This paper compares the relative merits of several classification algorithms
developed at Bell Laboratories and elsewhere for the purpose of recognizing
handwritten digits.
Handwritten digit recognition is an excellent benchmark for comparing shape
recognition methods in general, not only digit recognizers.
They consider:
o Raw accuracy
o Rejection
o Training time
o Recognition time
o Memory requirement
4. Database
The MNIST database of handwritten digits was constructed from NIST's Special
Database 3 and Special Database 1 which contain binary images of
handwritten digits:
• The training set was composed of 60,000 patterns and contained examples from
approximately 250 writers, disjoint from the writers of the test set.
• Test set was composed of 10,000 patterns.
All the images were size-normalized to fit in a 20x20 pixel box while
preserving the aspect ratio.
5. The Classifiers
• Linear
• Nearest Neighbor
• Neural Network
• Convolutional Neural Network
6. Linear Classifiers
Baseline Linear Classifier
Pairwise Linear Classifier
PCA and Polynomial Classifier
Optimal Margin Classifier (OMC)
7. Baseline Linear Classifier
The simplest classifier. Each input pixel
value contributes to a weighted sum for
each output unit.
The output unit with the highest sum
indicates the class of the input
character.
Thus, as we can see, the image is
treated as a 1D vector and connected to
a 10-output vector.
The test error rate is 8.4%.
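The weighted-sum rule above amounts to one matrix-vector product followed by an argmax. The sketch below assumes a 20x20 input flattened to 400 values (taken from the Database slide) and uses random placeholder weights; it shows only the decision rule, not the training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_classes = 400, 10  # assumption: 20x20 image flattened, ten digits

# Placeholder weights; a real classifier would learn these from the training set.
W = rng.normal(scale=0.01, size=(n_classes, n_pixels))
b = np.zeros(n_classes)

def classify(image):
    """Treat the image as a 1D vector and pick the output unit with the highest sum."""
    x = image.reshape(-1)
    scores = W @ x + b  # one weighted sum per output unit
    return int(np.argmax(scores)), scores

image = rng.random((20, 20))
label, scores = classify(image)
```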
8. Pairwise Linear Classifier
A simple improvement of the basic linear
classifier. The idea is to train each unit of a
single-layer network to classify one class from
one other class.
The final score for class i is
the sum of the outputs of all the units labelled i/z
minus the sum of the outputs of all the units
labelled y/i, for all z and y.
Error rate on the test set was 7.6%, only slightly
better than a linear classifier.
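The scoring rule can be sketched as follows. The sketch assumes one unit per unordered pair of classes (45 units for 10 classes, as in the speaker notes) and that a positive output of unit i/j favours class i; the unit outputs are made-up values, not results from the paper.

```python
import itertools
import numpy as np

n_classes = 10
# One unit per pair (i, j) with i < j: 10 * 9 / 2 = 45 units.
pairs = list(itertools.combinations(range(n_classes), 2))

def class_scores(unit_outputs):
    """Add each i/j output to class i and subtract it from class j."""
    scores = np.zeros(n_classes)
    for (i, j), out in zip(pairs, unit_outputs):
        scores[i] += out
        scores[j] -= out
    return scores

# Made-up outputs: every unit that involves class 3 votes for class 3.
outputs = np.zeros(len(pairs))
for k, (i, j) in enumerate(pairs):
    if i == 3:
        outputs[k] = 1.0   # unit 3/j favours 3
    elif j == 3:
        outputs[k] = -1.0  # unit i/3 favours 3 over i

scores = class_scores(outputs)
predicted = int(np.argmax(scores))
```

Each class takes part in 9 of the 45 comparisons, so class 3 collects the maximum possible score of 9 here.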
9. PCA and Polynomial Classifier
A first stage computes the projection of the input pattern on the 40
principal components of the set of training vectors.
The 40-dimensional feature vector was used as the input of a second-degree
polynomial classifier, which can be seen as a linear classifier with 821
inputs preceded by a stage that computes products of pairs of the 40 features.
Error on the test set was 3.3%.
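The figure of 821 inputs can be reproduced with a short count. The slides do not say exactly which terms are included, so the decomposition below (a constant term plus all 820 products of pairs of the 40 features, squares included) is an assumption that happens to match the number, not a statement of the paper's construction.

```python
from itertools import combinations_with_replacement
import numpy as np

def quadratic_features(z):
    """Constant term plus all products z[a] * z[b] with a <= b (assumed layout)."""
    index_pairs = combinations_with_replacement(range(len(z)), 2)
    products = [z[a] * z[b] for a, b in index_pairs]
    return np.array([1.0] + products)

z = np.arange(40, dtype=float)  # stand-in for a 40-dimensional PCA projection
phi = quadratic_features(z)     # 1 + 40 * 41 / 2 = 821 values
```

A linear classifier applied to `phi` is a second-degree polynomial classifier in the original 40 features.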
From “Handbook Of Character Recognition And Document Image Analysis” Page 111
10. Optimal Margin Classifier (OMC)
The OMC is now known as the support-vector machine (SVM). It constructs
a hyperplane or set of hyperplanes in a high- or
infinite-dimensional space, which can be used
for classification.
The best hyperplane is the one with the largest
separation, or margin, between the two classes.
Using a regular SVM, a test error of 1.4% was reached.
Using a slightly different technique, the Soft
Margin Classifier (Cortes & Vapnik) with a 4th-degree
decision surface, a test error of 1.1% was
reached.
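To make the notion of margin concrete, the sketch below measures the margin of two candidate separating lines on a toy 2D data set. The data points and both hyperplanes are invented for illustration; the code only evaluates margins and does not perform the optimization that an SVM solver would.

```python
import numpy as np

def geometric_margin(w, b, X, y):
    """Smallest signed distance from any point to the hyperplane w.x + b = 0.

    y holds labels in {-1, +1}; the result is positive only if every point
    lies on its correct side.
    """
    return float(np.min(y * (X @ w + b)) / np.linalg.norm(w))

# Toy data: class -1 at x = 0, class +1 at x = 3.
X = np.array([[0.0, 0.0], [0.0, 1.0], [3.0, 0.0], [3.0, 1.0]])
y = np.array([-1, -1, 1, 1])

w = np.array([1.0, 0.0])
margin_off_center = geometric_margin(w, -0.5, X, y)  # separating line x = 0.5
margin_centered = geometric_margin(w, -1.5, X, y)    # separating line x = 1.5
```

Both lines separate the classes, but the centered one leaves three times as much room to the nearest point, which is the property the optimal margin classifier maximizes.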
12. Baseline Nearest Neighbor Classifier
Another simple classifier, using a Euclidean
distance measure between input images.
A realistic system would operate on feature vectors rather
than directly on the pixels.
It requires no training time and "no brain on the part of
the designer".
The memory requirement and recognition
time are large.
Deslanted 20x20 images were used.
The test error for k = 3 is 2.4%.
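A minimal k-nearest-neighbor sketch with a Euclidean distance is shown below. The 2D toy points stand in for flattened images and are assumptions for illustration; note that the whole training set must be kept in memory and scanned for every query, which is where the large memory and recognition-time costs come from.

```python
from collections import Counter
import numpy as np

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to the query."""
    sq_dists = np.sum((train_x - query) ** 2, axis=1)  # squared Euclidean distance
    nearest = np.argsort(sq_dists)[:k]
    votes = Counter(train_y[nearest].tolist())
    return votes.most_common(1)[0][0]

# Toy "training set": three points of class 0, two of class 1.
train_x = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [1.0, 1.0], [0.9, 1.0]])
train_y = np.array([0, 0, 0, 1, 1])

pred_a = knn_predict(train_x, train_y, np.array([0.05, 0.05]))
pred_b = knn_predict(train_x, train_y, np.array([0.95, 0.95]))
```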
14. Tangent Distance Classifier (TDC)
It is a nearest-neighbor method where the
distance function is made insensitive to small
distortions and translations of the input image.
Tangent plane: if we consider an image as a
point in a high-dimensional pixel space, then
an evolving distortion traces out a curve in
pixel space. Taken together, all these
distortions define a low-dimensional manifold
in pixel space, which can be approximated by a
tangent plane.
An excellent measure of "closeness" for
character images is the distance between
their tangent planes.
A test error rate of 1.1% was achieved using
16x16 pixel images.
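The distance between two tangent planes reduces to a small least-squares problem. The sketch below is a simplified illustration, not the TDC implementation from the paper: it uses a single transformation (horizontal translation), approximates its tangent vector by a finite difference, and tests on a synthetic Gaussian blob. All function names are hypothetical.

```python
import numpy as np

def horizontal_shift_tangent(image):
    """Finite-difference approximation of the tangent vector for a shift to the right."""
    return (np.roll(image, 1, axis=1) - np.roll(image, -1, axis=1)) / 2.0

def tangent_distance(p, e, tangents_p, tangents_e):
    """Minimal distance between the planes p + sum(a_i * Tp_i) and e + sum(b_j * Te_j).

    Minimizing ||(p + Tp a) - (e + Te b)|| over a and b is a linear
    least-squares problem in the stacked coefficients (a, b).
    """
    columns = [t.ravel() for t in tangents_p] + [-t.ravel() for t in tangents_e]
    A = np.column_stack(columns)
    target = (e - p).ravel()
    coeffs, *_ = np.linalg.lstsq(A, target, rcond=None)
    return float(np.linalg.norm(A @ coeffs - target))

# Synthetic 16x16 blob and the same blob shifted right by one pixel.
ys, xs = np.mgrid[0:16, 0:16]
p = np.exp(-((xs - 8.0) ** 2 + (ys - 8.0) ** 2) / (2 * 2.0 ** 2))
e = np.roll(p, 1, axis=1)

euclidean = float(np.linalg.norm(e - p))
tangent = tangent_distance(
    p, e, [horizontal_shift_tangent(p)], [horizontal_shift_tangent(e)]
)
```

Because the two images differ only by a small shift, most of their Euclidean difference lies along the tangent directions, so the tangent distance comes out smaller than the Euclidean one.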
16. Radial Basis Function Network
Architecture:
• The first layer was composed of 1000 Gaussian RBF
units with 400 inputs (20x20). The RBF units were
divided into 10 groups of 100.
• The second layer was a simple linear classifier with 1000 inputs and 10 outputs.
Training:
• Each group of units was trained on all the training
examples of one of the 10 classes using the
adaptive K-means algorithm.
Error rate on the test set was 3.6%
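The two-layer structure can be sketched on toy data as below. Several simplifications are assumptions: the centers are plain per-class means standing in for the adaptive K-means step, there is one Gaussian unit per class instead of 100, the width is fixed by hand, and the second layer is fitted by ridge-regularized least squares, in the spirit of the regularized pseudo-inverse mentioned in the speaker notes.

```python
import numpy as np

def rbf_activations(X, centers, width):
    """First layer: one Gaussian unit per center."""
    sq_dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dists / (2.0 * width ** 2))

def fit_output_layer(H, Y, ridge=1e-3):
    """Second layer: solve (H^T H + ridge * I) W = H^T Y for the linear weights."""
    gram = H.T @ H + ridge * np.eye(H.shape[1])
    return np.linalg.solve(gram, H.T @ Y)

# Toy 2D data with two well-separated classes.
X = np.array([[0.0, 0.0], [0.2, 0.1], [-0.1, 0.2],
              [3.0, 3.0], [3.1, 2.9], [2.8, 3.2]])
labels = np.array([0, 0, 0, 1, 1, 1])

# Stand-in for K-means: one center per class, placed at the class mean.
centers = np.stack([X[labels == c].mean(axis=0) for c in range(2)])

H = rbf_activations(X, centers, width=1.0)
Y = np.eye(2)[labels]             # one-hot targets
W = fit_output_layer(H, Y)
predictions = np.argmax(H @ W, axis=1)
```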
17. Large Fully Connected Multi-Layer Neural Network
Architecture:
• Two layers of weights (one hidden layer)
Training:
• Networks were trained with various numbers of
hidden units.
• Deslanted 20x20 images were used.
• As the learning proceeds, the weights grow,
which progressively increases the effective
capacity of the network.
The best test error was 1.6%.
19. Motivation Behind CNN
To solve the dilemma between small networks that cannot learn the training
set, and large networks that seem overparameterized, one can design
specialized network architectures that are specifically designed to recognize
two-dimensional shapes such as digits, while eliminating irrelevant
distortions and variability.
These considerations lead us to the idea of convolutional network.
20. LeNet1
Because of LeNet 1's small input field, the images were down-sampled to 16x16
pixels and centered in the 28x28 input layer.
Small number of free parameters, only about 3000.
LeNet 1 achieved 1.7% test error.
21. LeNet4
LeNet 4 was designed to make better use of the large training set.
It is an expanded version of LeNet 1 that has a 32x32 input layer in which the
20x20 images (not deslanted) were centered by center of mass.
It includes more feature maps and an additional layer of hidden units that is
fully connected to both the last layer of feature maps and to the output
units.
LeNet 4 contains about 260,000 connections and has about 17,000 free
parameters.
Test error was 1.1%.
22. LeNet5
LeNet 5 has an architecture similar to LeNet 4, but has more feature maps and a
larger fully-connected layer.
LeNet 5 has a total of about 340,000 connections and 60,000 free parameters,
most of them in the last two layers.
The training procedure included a module that distorts the input images during
training using randomly picked affine transformations (shift, scaling, rotation,
and skewing).
It achieved 0.9% error.
24. Boosted LeNet4
Three LeNet 4 networks are combined:
• The first one is trained the usual way.
• The second one is trained on a mix of patterns that are filtered by the
first net (50% of which the first net got right, and 50% of which it got
wrong).
• The third net is trained on new patterns on which the first and the second
nets disagree.
During testing, the outputs of the three nets are simply added.
The test error rate was 0.7%, the best of any of our classifiers.
At first glance, boosting appears to be three times as expensive as a single
net. In fact, when the first net produces a high-confidence answer, the other
nets are not called. The cost is about 1.75 times that of a single net.
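The test-time behaviour described above (add the three outputs, but skip the second and third nets when the first is confident) can be sketched as follows. The slides do not define "high confidence", so the threshold on the largest output of the first net is an assumption, and the three "nets" are stand-in functions returning fixed score vectors.

```python
import numpy as np

def boosted_predict(x, nets, confidence_threshold=0.9):
    """Return (label, number of nets evaluated).

    The confidence test (largest output of the first net against a threshold)
    is a hypothetical stand-in for whatever criterion the authors used.
    """
    first = nets[0](x)
    if first.max() >= confidence_threshold:
        return int(np.argmax(first)), 1
    total = first + nets[1](x) + nets[2](x)  # outputs are simply added
    return int(np.argmax(total)), 3

# Stand-in nets that ignore their input and return fixed two-class scores.
confident_nets = [
    lambda x: np.array([0.05, 0.95]),
    lambda x: np.array([0.50, 0.50]),
    lambda x: np.array([0.50, 0.50]),
]
unsure_nets = [
    lambda x: np.array([0.55, 0.45]),
    lambda x: np.array([0.20, 0.80]),
    lambda x: np.array([0.30, 0.70]),
]

label_fast, evaluated_fast = boosted_predict(None, confident_nets)
label_full, evaluated_full = boosted_predict(None, unsure_nets)
```

In the second case the first net alone would answer class 0, but the summed outputs favour class 1; the average cost depends on how often the early exit is taken.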
26. Discussion – Error Rate
Boosted LeNet 4 is clearly the best, achieving a score of 0.7%, closely followed
by LeNet 5 at 0.9%.
This can be compared to our estimate of human performance, 0.2%.
27. Discussion – Rejection Performance
In many applications, rejection performance is more significant than raw
error rate.
Again Boosted LeNet 4 has the best score.
28. Discussion – Training Time
K-nearest neighbors and TDC have essentially zero training time.
The single-layer net, the pairwise net, and the PCA+quadratic net could be
trained in less than an hour.
The multilayer net training times were, as expected, much longer: 3 days for
LeNet 1, 7 days for the fully connected net, 2 weeks for LeNet 4 and 5, and
about a month for boosted LeNet 4. Training the Soft Margin classifier took
about 10 days.
29. Discussion – Memory
Memory requirements for the neural networks assume 4 bytes per weight.
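As a rough back-of-the-envelope check, the free-parameter counts quoted on the LeNet slides can be turned into weight storage. This assumes the 4 bytes apply to each free parameter and ignores any other storage (activations, code), so the figures are only indicative.

```python
BYTES_PER_WEIGHT = 4  # assumption: 4 bytes per free parameter

# Approximate free-parameter counts quoted on the LeNet slides.
free_parameters = {"LeNet 1": 3_000, "LeNet 4": 17_000, "LeNet 5": 60_000}

# Weight storage in kilobytes (1 kB = 1000 bytes).
memory_kb = {
    name: count * BYTES_PER_WEIGHT / 1000
    for name, count in free_parameters.items()
}
```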
Of the high-accuracy classifiers, LeNet 4 requires the least memory.
30. Conclusions
Performance depends on many factors including high accuracy, low run time, and
low memory requirements.
Future: As computer technology improves, larger-capacity recognizers become
feasible. The neural nets' advantage will become more striking as training
databases continue to increase in size.
Boosting: We find that boosting gives a substantial improvement in accuracy, with
a relatively modest penalty in memory and computing expense.
Training Data: When plenty of data is available, many methods can attain
respectable accuracy.
Optimal margin classifier: it has excellent accuracy, which is most remarkable
because, unlike the other high-performance classifiers, it does not include a priori
knowledge about the problem. It is still much slower and more memory-hungry than the
convolutional nets.
Convolutional networks: are particularly well suited for recognizing or rejecting
shapes with widely varying size, position, and orientation.
Trained neural networks can run much faster and require much less space than
memory-based techniques.
#9:
For the n (= 10) classes you build all n(n-1)/2 = 45 binary classifiers, denoted by i/j where i and j are different classes.
The output of the i/z classifier tells what favors class i over class z.
On the other hand, x/i tells what speaks against i compared to class x.
Then you add up all 9 unique comparisons in which i appears on either side of the slash. If i is on the right, note that x/i is, so to speak, equal to -i/x.
#10: To compute the principal components:
the mean of each input component was first computed and subtracted from the training vectors.
The covariance matrix of the resulting vectors was then computed, and diagonalized using Singular Value Decomposition (SVD).
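The three steps in this note (subtract the mean, form the covariance matrix, diagonalize it with SVD) can be sketched directly in NumPy. The random 5-dimensional data and the choice of 2 components are placeholders; the slides use 40 components of the image vectors.

```python
import numpy as np

def principal_components(X, n_components):
    """Return the mean, the leading principal directions, and their variances."""
    mean = X.mean(axis=0)
    centered = X - mean                    # subtract the mean of each input component
    cov = centered.T @ centered / len(X)   # covariance matrix of the centered vectors
    U, S, _ = np.linalg.svd(cov)           # SVD of a symmetric PSD matrix diagonalizes it
    return mean, U[:, :n_components], S[:n_components]

rng = np.random.default_rng(0)
# Placeholder data: 200 vectors whose spread differs strongly per axis.
X = rng.normal(size=(200, 5)) * np.array([5.0, 2.0, 1.0, 0.5, 0.1])

mean, components, variances = principal_components(X, n_components=2)
projected = (X - mean) @ components        # feature vectors fed to the next stage
```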
#11: Challenge: Polynomial classifiers are well-studied methods for generating complex decision surfaces. Unfortunately, they are impractical for high-dimensional problems.
One reasonable choice as the best hyperplane is the one that represents the largest separation, or margin, between the two classes.
So we choose the hyperplane so that the distance from it to the nearest data point on each side is maximized. If such a hyperplane exists, it is known as the maximum-margin hyperplane and the linear classifier it defines is known as a maximum-margin classifier; or equivalently, the perceptron of optimal stability.
SVM: More formally, a support-vector machine constructs a hyperplane or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification, regression, or other tasks like outlier detection.
Intuitively, a good separation is achieved by the hyperplane that has the largest distance to the nearest training-data point of any class (so-called functional margin),
since in general the larger the margin, the lower the generalization error of the classifier.
The drawing:
H1 does not separate the classes. H2 does, but only with a small margin. H3 separates them with the maximal margin.
Additional info:
In addition to performing linear classification, SVMs can efficiently perform a non-linear classification using what is called the kernel trick, implicitly mapping their inputs into high-dimensional feature spaces.
#13: Naturally, a realistic Euclidean distance nearest-neighbor system would operate on feature vectors rather than directly on the pixels.
#14: An unlabeled image of a "9" must be classified by finding the closest prototype image out of two images representing respectively a "9" and a "4".
According to the Euclidean distance (sum of the squares of the pixel to pixel differences), the "4" is closer even though the "9" is much more similar once it has been rotated and thickened.
The result is an incorrect classification.
The key idea is to construct a distance measure which is invariant with respect to some chosen transformations such as translation, rotation and others.
#15: Explanation of the picture from the paper below:
P and E are patterns.
Sp and Se are manifolds, obtained through small transformations of P and E respectively (rotation, translation, scaling, etc.).
The Euclidean distance between two patterns P and E is in general not appropriate because it is sensitive to irrelevant transformations of P and of E.
In contrast, the distance D(E, P), defined to be the minimal distance between the two manifolds Sp and Se, is truly invariant with respect to the transformations used to generate Sp and Se.
Unfortunately, these manifolds have no analytic expression in general, and finding the distance between them is a hard optimization problem with multiple local minima. Besides, true invariance is not necessarily desirable, since a rotation of a "6" into a "9" does not preserve the correct classification.
https://pdfs.semanticscholar.org/8314/dda1ec43ce57ff877f8f02ed89acb68ca035.pdf
#17: Radial basis function (RBF) networks typically have three layers: an input layer, a hidden layer with a non-linear RBF activation function and a linear output layer.
The second layer weights were computed using a regularized pseudo-inverse method.
#20: Convolution: extracts features from the input image using feature maps. A feature map is produced by a shared filter (hence the small number of parameters) suited to the data type at hand (here, pictures) that, when trained, implicitly learns structured features such as edges in the picture.
Pooling or Sub Sampling: reduces the dimensionality of each feature map but retains the most important information. (parameter-free)
Classification (Fully Connected Layer)
#21: It should be intuitively clear to the audience that convolutions + (down-sampling) lead to small number of parameters, and that mixing those with fully connected layers is still more parameter-efficient compared to deep fully connected networks.
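The parameter saving from weight sharing can be made concrete with a small count. The layer sizes below (four 5x5 filters on a 28x28 single-channel input) are illustrative assumptions chosen for the arithmetic, not a description of any specific LeNet layer.

```python
def conv_params(n_filters, kernel_h, kernel_w, in_channels=1):
    """Each filter shares one set of weights (plus a bias) across all positions."""
    return n_filters * (kernel_h * kernel_w * in_channels + 1)

def dense_params(n_inputs, n_outputs):
    """Every output has its own weight for every input, plus a bias."""
    return n_outputs * (n_inputs + 1)

# Hypothetical layer: four 5x5 filters on a 28x28 image give four 24x24 feature maps.
conv = conv_params(n_filters=4, kernel_h=5, kernel_w=5)

# A fully connected layer producing the same number of outputs without sharing.
dense = dense_params(n_inputs=28 * 28, n_outputs=4 * 24 * 24)
```

With these sizes the convolutional layer needs 104 parameters, while the unshared layer with the same output size needs about 1.8 million.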
#22: Following previous experiments with ZIP code data, the last layer of LeNet 4 was replaced with a Euclidean Nearest Neighbor classifier, and with the “local learning” method of Bottou and Vapnik, in which a local linear classifier is retrained each time a new test pattern is shown.
Neither of those improved the raw error rate, although they did improve the rejection performance.
#24: Boosting is a technique that combines the results of several weak classifiers to get a more accurate result.
#27: Boosted LeNet 4 is clearly the best, achieving a score of 0.7%, closely followed by LeNet 5 at 0.9%.
This can be compared to our estimate of human performance, 0.2%.
#28: In many applications, rejection performance is more significant than raw error rate.
Again boosted LeNet 4 has the best score.
The enhanced LeNet 4 did better than the original LeNet 4.
#29: As expected, memory-based methods are much slower than neural networks.
Single-board hardware designed with LeNet in mind performs recognition at 1000 characters/sec (Säckinger & Graf 94).
Cost-effective hardware implementations of memory-based techniques are more elusive, due to their enormous memory requirements.
Training time was also measured.
However, while the training time is marginally relevant to the designer, it is totally irrelevant to the customer.