Egypt's 11th - Youth Innovations - AUM Team's Documentation
Project's Preface
Contributors ...................... 3
About The Team .................... 3
Our Vision ........................ 3
Team Members ...................... 3
Mechanical System ................. 8
Mechanical Design ................. 8
Vehicle Modeling .................. 9
Hardware Criteria ................. 9
Artificial Intelligence System .... 11
Computer Vision ................... 12
Natural Language Processing ....... 13
The Main AI Assistant ............. 13
The system consists of the driver, the observer, the upper-level control, and the lower-level control.
The driver can turn the system on or off. The driver also provides the system's control parameters:
- the desired headway time (the time after which the host vehicle and the lead vehicle would collide if the lead vehicle made a sudden stop and the host vehicle maintained its relative velocity).
The sensors are responsible for collecting data about the lead vehicle; this information is transmitted to the upper-level controller, which decides the desired states of the host vehicle and its acceleration accordingly. The upper-level controller then sends control signals to the lower-level controllers (DC motor controller, brake controller, transmission controller, etc.).
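To make the split between the upper-level and lower-level controllers concrete, below is a minimal Python sketch of a constant-time-gap upper-level law. The function name, gains, and signal names are illustrative assumptions, not the team's actual implementation.

```python
# Minimal sketch of the upper-level controller (illustrative assumptions only).

def upper_level_control(gap, relative_velocity, host_velocity,
                        desired_headway_time=1.5, k_gap=0.23, k_vel=0.07):
    """Return the desired acceleration of the host vehicle [m/s^2].

    gap                  -- measured distance to the lead vehicle [m]
    relative_velocity    -- lead velocity minus host velocity [m/s]
    host_velocity        -- current speed of the host vehicle [m/s]
    desired_headway_time -- driver-selected headway time [s]
    """
    # The desired gap grows with speed so the headway time stays constant.
    desired_gap = desired_headway_time * host_velocity
    gap_error = gap - desired_gap
    # Simple proportional law on the gap error and the relative velocity.
    return k_gap * gap_error + k_vel * relative_velocity

# The lower-level controllers (DC motor, brake, transmission) would then
# translate this desired acceleration into actuator commands.
```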
Mechanical System

Mechanical Design
The design and production of electric vehicles (EVs) require careful consideration of material choices.
Electrical Components
Four DC motors (one at each wheel)
Power line
Protection fuse (3 A)
Raspberry Pi
Raspberry Pi camera
Raspberry Pi power bank
ESP32
Artificial Intelligence System

In this section we are going to discuss the software part of the system (the programming code). The code written by the developer is effectively the brain of any robot; sometimes the whole process, or almost the whole process, can be controlled with just lines of code. The more accurate and meaningful the code, the more intelligent the system (the robot's "brain") becomes.

Over time, scientists have formalized part of this under the term "Machine Learning (ML)", an extension of Artificial Intelligence (AI) in which predictions are made according to the chosen training mode and the type of data (categorical, numeric, etc.). Briefly, there are three types of machine learning (a small supervised example follows the list):
-- Supervised Learning (e.g. Linear Regression)
-- Unsupervised Learning (e.g. Clustering)
-- Reinforcement Learning (e.g. Markov Decision Processes)
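As a small illustration of the supervised case (the other two follow a similar train-then-use pattern), here is a minimal scikit-learn sketch; the data is made up purely for demonstration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy supervised-learning example: learn y = 2x from labelled samples.
X = np.array([[1.0], [2.0], [3.0], [4.0]])   # features
y = np.array([2.0, 4.0, 6.0, 8.0])           # labels

model = LinearRegression().fit(X, y)
print(model.predict([[5.0]]))                # approximately [10.]
```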
The story doesn't end there: scientists then came up with "Deep Learning (DL)", the next level of machine learning, and it is what we mainly use in our robot, so each application described below is based on DL.

Nowadays all robots, or to be more precise all intelligent robots, are built on full ML and DL systems. To make a robot talk you can use Natural Language Processing techniques, to make it see you can use Computer Vision techniques, and to make it do specific things that you train it for yourself, you can use Reinforcement Learning techniques.
Computer Vision (CV)

This part covers the image-processing tasks, which are specifically focused on detection. For that we used two models.

The first one is the standard pre-trained YOLOv5 model, which normally comes with 80 different classes (based on the COCO128 dataset, which is also a pre-trained dataset) such as person, laptop, phone, sink, toaster, etc. It is very simple to use: you just get access to the main YOLO repository on GitHub, download or clone the repository with Git, and then start calling the model (in our case we used the PyTorch package to load the model, but you can use Keras). Then, using the OpenCV package, you open a video stream from the camera and the model starts creating bounding boxes around the classes it detects.

We actually faced some overfitting in detection despite the very high validation accuracy of 96%; that is fixable by retraining the same classes with a larger annotated dataset (the dataset has to be split into images and their masks). That was the first model we used.

The second YOLO model was entirely custom, from the training phase to the dataset. We collected a dataset of 877 images and annotated them all, so in the end we had 1754 files, half of them masks (bounding boxes on the targets). All of that was trained on an MLOps platform called Roboflow, which also performed data augmentation and preprocessing on the images. After that, we exported everything as ".xml" files containing the locations of the masks on each corresponding image, fed that data into a custom YOLO training pipeline also provided by Roboflow, trained it, and reached an accuracy of 95.7% on our data. Finally, we exported the model as a ".pt" file, ready to use. This model is used to detect only four classes: traffic light, stop sign, speed limit sign, and cross-walk sign.

We did run into a problem here: the model detects these classes perfectly, but it also detects other objects that look similar to the main classes. We intend to fix that by providing the custom model with more classes, which mainly requires more data. In general, though, everything behaves pretty well in the intended situations.
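As a rough sketch of how the detection loop described above can be wired together, here is one possible way to load the exported ".pt" weights through torch.hub and run them on an OpenCV video stream; the weights path and camera index are placeholders.

```python
import cv2
import torch

# Load the custom-trained YOLOv5 weights exported from Roboflow.
# "best.pt" is a placeholder path; use the actual exported file.
model = torch.hub.load("ultralytics/yolov5", "custom", path="best.pt")

cap = cv2.VideoCapture(0)  # Raspberry Pi / USB camera (placeholder index)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = model(frame)            # run detection on the current frame
    annotated = results.render()[0]   # draw bounding boxes on the frame
    cv2.imshow("detections", annotated)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```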
Natural Language Processing (NLP)

Now we turn to one of the most challenging yet most useful topics in our robot, one that also plays a part in the coming section: NLP. It essentially simulates human conversation, but here the responses are generated according to the data provided for training.

The idea of NLP in our case is to make the assistant generate text when the input is something new, i.e. something we did not explicitly provide an answer for, while staying reasonably logical and accurate. The best way to do this is to use RNNs (Recurrent Neural Networks) extended with LSTM (Long Short-Term Memory) layers, which can generate a whole sentence rather than a single word because they can hold information for longer.

That is what we did, but unfortunately the model did not work well: it trained successfully for exactly 4 hours, 37 minutes and 49 seconds and we exported it as an ".h5" file, but once we tried to use it an error occurred and the model seems to have been corrupted.

Describing what we did: first we obtained a dataset of IMDB reviews as a ".csv" file. Our target was the "review" column, so we performed some data analysis on it and exported that column as a ".txt" file, which became the input of our model.

Next we set our hyperparameters, such as the batch size (the number of samples processed before the model is updated) and the number of epochs, then performed data preprocessing: one-hot encoding (as our data are not sparse), encoding the whole dataset so the model can read it, and converting characters to integers and vice versa.

The main model consisted of two LSTM layers and one dropout layer with a 30% rate (which helps prevent overfitting during training); in total the model had 955,010 trainable parameters.
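A minimal Keras sketch of the architecture described above (two LSTM layers plus a 30% dropout layer for character-level generation). The sequence length, vocabulary size, and layer widths are assumptions for illustration, so the parameter count will not match the 955,010 reported above exactly.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense

SEQ_LEN = 100     # characters per training sample (assumed)
VOCAB_SIZE = 60   # number of distinct characters in the corpus (assumed)

model = Sequential([
    # First LSTM returns the full sequence so it can feed the second LSTM.
    LSTM(128, input_shape=(SEQ_LEN, VOCAB_SIZE), return_sequences=True),
    LSTM(128),
    Dropout(0.3),                             # 30% dropout against overfitting
    Dense(VOCAB_SIZE, activation="softmax"),  # next-character probabilities
])
model.compile(loss="categorical_crossentropy", optimizer="adam")
model.summary()              # prints the trainable-parameter count
# model.save("textgen.h5")   # would export the ".h5" file mentioned above
```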
After we found that the model had failed, we chose to do something different, which is discussed in the following section.

The Main AI Assistant

Now we are going to talk about the main assistant, where everything we built comes together. It took about two months to produce this demo with these features, so we are going to discuss what it can do as thoroughly as we can.

Picking up the problem we faced with the failed model from the last section, the solution we settled on was to use a pre-trained text-generation model instead of the one that passed away. We found the Hugging Face platform, a magnificent platform where companies and skilled data scientists publish their models in various fields. The model we chose is GPT-2 small.

Fig.7. GPT-2 model versions

It is a model pre-trained on a very large corpus of English data in a self-supervised fashion. We also enabled early stopping when calling the model, just to avoid damaging the accuracy of the generated text.
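A hedged sketch of how GPT-2 small can be loaded from Hugging Face and called with early stopping enabled; the prompt and generation settings here are illustrative, not the assistant's exact configuration.

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# "gpt2" is the small (~124M parameter) checkpoint on the Hugging Face Hub.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "Hello, I am your assistant and"   # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt")

# Beam search with early_stopping=True stops once all beams are finished.
outputs = model.generate(
    **inputs,
    max_length=50,
    num_beams=5,
    early_stopping=True,
    no_repeat_ngram_size=2,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```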
Pretty good; the text-generation model is now ready to use and has been successfully added to the assistant. Now for an important question: what can this assistant do?

The assistant can do some human things, like telling you jokes and facts. How did we do that? We built something like a database in the form of a ".json" file; when the assistant hears the expected words from the user, it understands that this feature has been called and that it should tell a joke or a fact according to the user's input. It returns a random joke every time it is called, and likewise a random fact from Wikipedia.
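A simplified sketch of the keyword-matching idea behind the ".json" database; the file name, keys, and keywords are hypothetical, and the real assistant pulls its facts from Wikipedia rather than from a static list.

```python
import json
import random

# Hypothetical layout: {"joke": {"keywords": [...], "responses": [...]}, ...}
with open("intents.json") as f:
    intents = json.load(f)

def respond(user_text):
    """Return a random joke/fact when the user's words match a known intent."""
    words = user_text.lower().split()
    for intent in intents.values():
        if any(keyword in words for keyword in intent["keywords"]):
            return random.choice(intent["responses"])
    return None  # nothing matched; fall back to the GPT-2 text generator

print(respond("tell me a joke"))
```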
The assistant also repeats what you said back to you, simply as a debugging step to make sure it heard you correctly; it then works through that text, splitting it into multiple strings and checking which feature it should call at that moment.

One of its features is giving you a complete weather forecast for your city (identified by a unique city ID). That works much like web scraping, except the information is extracted from an API (Application Programming Interface); the API used in our case was the OpenWeatherMap API.

Fig.8. OpenWeatherMap API
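A minimal sketch of the OpenWeatherMap call; the API key and city ID are placeholders.

```python
import requests

API_KEY = "YOUR_OPENWEATHERMAP_KEY"   # placeholder key
CITY_ID = 360630                      # placeholder city ID from OpenWeatherMap's list

url = "https://api.openweathermap.org/data/2.5/weather"
params = {"id": CITY_ID, "appid": API_KEY, "units": "metric"}

data = requests.get(url, params=params, timeout=10).json()
print(data["weather"][0]["description"], data["main"]["temp"], "degrees C")
```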
We also added an emotion-detection model that predicts your emotions according to what you said. We created two models for this. One is self-made using standard machine-learning methods with the "sklearn" package, but its limitation is that the prediction is not very accurate. So we searched and found another, pre-trained model called "NRCLex" (based on the National Research Council Canada's emotion lexicon), an MIT-licensed PyPI project by Mark M. Bailey that predicts the sentiment and emotion of a given text. It has its own problem: when it cannot detect an emotion it returns nothing, and in that case the assistant switches back to the old model (think of it as a plan B, which is better than nothing). A sketch of that fallback logic follows.
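This sketch assumes the self-made sklearn model exposes a standard predict-style interface (its details are not shown here) and uses the top_emotions output of the nrclex package.

```python
from nrclex import NRCLex

def detect_emotion(text, fallback_model=None):
    """Try NRCLex first; fall back to the custom sklearn model if it finds nothing."""
    top = NRCLex(text).top_emotions        # list of (emotion, score) pairs
    if top and top[0][1] > 0:
        return top[0][0]
    if fallback_model is not None:         # "plan B": the self-made model
        return fallback_model.predict([text])[0]
    return None
```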
The assistant can also switch between voices (male, female, Indian accent), as that is what the speech package handling this provides.
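The documentation does not name the exact package that performs the voice switching; assuming a text-to-speech engine such as pyttsx3 is used, switching voices typically looks like the sketch below, where which index gives a male, female, or Indian-accent voice depends on the voices installed on the system.

```python
import pyttsx3

engine = pyttsx3.init()
voices = engine.getProperty("voices")   # available voices depend on the OS

# Pick a different installed voice by index (placeholder choice).
engine.setProperty("voice", voices[1].id)
engine.say("Voice switched successfully.")
engine.runAndWait()
```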
You can also store orders in the assistant's memory; you then have the ability to add more or remove one of them, and the assistant automatically repeats the name of the order or task back to you.
Mobile's Application

Idea of the Application
The main purpose of creating and building this application is to get a live video stream from the robot to the mobile phone and also to display the values of the sensors on it. Once you open the app, you will find a button on the home page which takes you directly to a page that displays the sensor values once you tap it. The app also has the live-stream page and a side drawer containing information about the team members.
Cost Analysis
COMING SOON..
Main Aspects
Terminologies
Robotics | is an interdisciplinary branch of computer science and engineering.
Geared Motor | is a component whose mechanism adjusts the speed of the motor, leading it to operate at a certain speed.
IMU | Inertial Measurement Unit; an electronic device that measures a body's acceleration and angular rate.
Lithium Ion Battery | a lithium-ion or Li-ion battery is a type of rechargeable battery which uses the reversible reduction of lithium ions to store energy.
PID | proportional, integral and derivative; a controller used to control processes by adjusting the manipulated variable to keep the process variable at a set-point.
Infrared Sensor | an infrared (IR) sensor is an electronic device that measures and detects infrared radiation in its surrounding environment.
Ultrasonic Sensor | is an electronic device that measures the distance of a target object by emitting ultrasonic sound waves and converting the reflected sound into an electrical signal.
Microcontroller | is a small computer on a single VLSI integrated circuit chip.
LIDAR Sensor | a type of laser distance sensor that measures the range, or depth, from a surface.
Autonomous Car | is a vehicle capable of sensing its environment and operating without human involvement.
Arduino UNO | is a microcontroller board based on the ATmega328P (datasheet).
Atmega328P | is a high-performance yet low-power 8-bit AVR microcontroller able to execute most of its 131 powerful instructions in a single clock cycle.
Servo Motor | is a rotary or linear actuator that allows for precise control of angular or linear position, velocity and acceleration.
Motor Driver | acts as an interface between the motors and the control circuits.
Pandas | is a software library written for the Python programming language for data manipulation and analysis.
Matplotlib | is a plotting library for the Python programming language and its numerical mathematics extension NumPy.
Seaborn | is a Python data visualization library based on matplotlib.
NumPy | is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.
PyTorch | an open-source machine learning framework that accelerates the path from research prototyping to production deployment.
Sklearn | is a free software machine learning library for the Python programming language. It features various classification, regression and clustering algorithms, including support-vector machines, etc.
DataFrames | a data structure that organizes data into a 2-dimensional table of rows and columns, much like a spreadsheet.
Keras | is an open-source software library that provides a Python interface for artificial neural networks.
OpenCV | a library of programming functions mainly aimed at real-time computer vision.
Web Scraping | scraping a web page involves fetching it and extracting data from it.
Machine Learning | is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks.
Deep Learning | is a subset of machine learning, which is essentially a neural network with three or more layers.
Natural Language Processing | natural language processing (NLP) refers to the branch of computer science, and more specifically the branch of artificial intelligence (AI), concerned with giving computers the ability to understand text and spoken words in much the same way human beings can.
Computer Vision | is a field of artificial intelligence (AI) enabling computers to derive information from images, videos and other inputs.
YOLO | an algorithm that uses neural networks to provide real-time object detection.
CNN | a kind of network architecture for deep learning algorithms, specifically used for image recognition and tasks that involve the processing of pixel data.
RNN | a recurrent neural network (RNN) is a class of artificial neural networks where connections between nodes can create a cycle, allowing output from some nodes to affect subsequent input to the same nodes.
LSTM | long short-term memory (LSTM) is an artificial neural network used in the fields of artificial intelligence and deep learning.
Speech Recognition | is an interdisciplinary sub-field of computer science and computational linguistics that develops methodologies and technologies enabling the recognition and translation of spoken language into text by computers, with the main benefit of search-ability.
API | an application programming interface (API) is a way for two or more computer programs to communicate with each other.
Criteria
Font | Radian
Main Titles Font Size | 45
Sub-Titles Font Size | 26
Page’s Text Font Size | 18
Article’s Title Font Size | 22
Article’s Text Font Size | 12-16