
Human Body Posture Recognition

The document is a dissertation proposal for a master's degree that aims to recognize human body posture using the OpenPose deep learning algorithm. The student proposes to use OpenPose, which can detect body keypoints in real-time, to identify different body postures from images. The research would help applications like activity recognition, emotion analysis, and medical imaging. OpenPose is an effective method as it can run on various hardware platforms and provides multi-person keypoint detection with high accuracy.

Uploaded by

aarshiya kc

Beijing University of Posts and Telecommunications

Dissertation Proposal of Master's Degree

Student Number: 2019194023

Student Name:

School: Computer Technology

Research Field: Machine Learning

Supervisor: Miss Wang Xiaoru

Degree Program: Master’s

Date: December 2020



Title: Human Body Posture Recognition Using the OpenPose Deep Learning Algorithm

Topic Resource:          Thesis Type: Research

Starting Date: 2020-11-25          Working Lab.: Home country (Nepal)

1. Legislation Basis:
(Including research purpose, significance, research status at home and abroad; to discuss
the scientific significance of the proposal topic based on its development trend, or to
discuss the application prospects of the topic by considering urgent key technology
problem in the economic and social development.) (No less than 480 words.)

Research Purpose/Significance
Image processing basically involves three steps: acquiring the input image, analyzing and
manipulating it, and producing the result. Image processing systems are becoming popular
thanks to the availability of trained personnel, computers, large memory, graphics software
and so on. With the advances in image processing techniques over recent decades, it has
become possible to analyze human behaviour by recognizing the posture of the human body,
and this has become one of the most researched topics in the pattern recognition field.
Pose recognition is a broad field in both depth and breadth. Basic human postures include
standing, sleeping, sitting, kneeling and so on. Recently, many studies have paid close
attention to body postures because they also express emotions to some degree.

Human body pose recognition is an evolving and challenging problem in the field of
Artificial Intelligence. It deals with the localization of human joints in a skeletal
representation. It is normally difficult to determine a person's pose in an image, as it
depends on various factors such as image resolution, background clutter, illumination
variations, surroundings and so on. A human posture can also convey information through
non-verbal communication, and many studies have related body postures to human emotions.
Physical posture can be studied using static and dynamic methods. In the dynamic method,
posture is identified while the person performs certain actions. In the static method, the
person stays in one position while the pose is identified.

This project lays the foundation for understanding how the OpenPose deep learning
approach estimates the posture of single or multiple persons, either in real time or in
prestored images and prerecorded videos. The proposed work uses a convolutional neural
network to generate the confidence maps and part affinity fields that play a key part in
estimating the skeletal structure in the given image.

Human body posture recognition has advanced greatly in the past years, evolving from 2D to
3D estimation and from single-person to multi-person estimation. Formally, body posture
estimation predicts the positions of the different parts, that is the joints, of a human
body. Since it is one of the most rapidly evolving research areas, it can be used in
predicting human emotions, pattern recognition, medical image segmentation, human tracking
and so on.

Among the different algorithms for pose estimation, the OpenPose algorithm is used with
TensorFlow in this approach. OpenPose is a real-time multi-person system that takes an RGB
image and jointly detects human body, foot, hand and facial keypoints, 135 keypoints in
total, for each person detected in a single image. Some available 2D pose estimation
libraries, such as AlphaPose, require the user to implement most of the pipeline and their
own display to visualize the results, and body and face keypoint detectors are not
combined, which requires a separate library for each purpose. OpenPose addresses all of
these problems. Another advantage of using OpenPose in this research is that it runs on
various platforms such as Ubuntu, Windows, macOS and embedded systems, and supports
different hardware. Human posture detection using OpenCV is expected to be cheaper than
other methods, and the system also requires minimal equipment compared to others.

Research Status and Development Trend of OpenPose

Today's real-world applications of human posture recognition require both a high degree of
accuracy and real-time inference. OpenPose, developed by researchers at Carnegie Mellon
University, can be regarded as the state of the art for real-time human body posture
recognition. The original OpenPose paper was submitted in 2016 and the most recent in 2018;
the later version made minor changes to the neural network architecture and some other
aspects, resulting in improved speed and accuracy. The basic pipeline of OpenPose is as
follows: first the image is taken as input, then the part confidence maps and the part
affinity fields are predicted, after that bipartite matching is performed, and lastly the
results are combined and parsed. It identifies human body joints using an RGB camera.
OpenPose keypoints include the ears, eyes, neck, nose, elbows, shoulders, knees, wrists,
ankles and hips. It processes input from a camera in real time, pre-recorded videos or
static images and outputs the results. Hence, it finds use in various applications such as
surveillance and activity recognition. The proposed work uses OpenPose for keypoint
identification, followed by a convolutional neural network to recognize the different poses.

The first step is detecting the keypoints of every person in the image, followed by
assigning parts to each distinct individual. The OpenPose network starts by extracting
features from the image using its initial layers. These features are then passed to two
branches of convolutional layers that run in parallel. The first branch predicts a set of
confidence maps, each representing a specific part of the human body. The second branch
predicts the Part Affinity Fields, which denote the degree of association between parts.
Further stages then refine the predictions made by the previous stage. Bipartite graphs are
then formed between the different parts using the part confidence maps. With these steps,
human skeletons are estimated for each person in a single frame. A confidence map is a 2D
representation of the belief that a particular body part is located at any given pixel.
These maps are described by the following equation:

$$S = (S_1, S_2, \ldots, S_J), \quad S_j \in \mathbb{R}^{w \times h}, \quad j \in \{1, \ldots, J\}$$

where J is the number of body part locations.


Similarly, a Part Affinity Field is a set of 2D vector fields that encodes the location and
orientation of the limbs of different people in the image. It is especially needed in
multi-person pose detection, where body parts must be mapped to the correct body, because
with several people there are multiple heads, shoulders, hands and so on, and it is
sometimes difficult to tell them apart when people are closely grouped together. A stronger
Part Affinity Field link between two body parts indicates a higher chance that those body
parts belong to the same person.

It encodes the data in the form of pairwise connections between body parts:

$$L = (L_1, L_2, \ldots, L_C), \quad L_c \in \mathbb{R}^{w \times h \times 2}, \quad c \in \{1, \ldots, C\}$$

where C is the number of limbs, that is, pairs of connected body parts.
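To make the idea of a part affinity field concrete, the following is a minimal sketch of how a field for a single limb could be built. It assumes, for illustration only, that every pixel within a fixed distance of the segment between two joints stores the unit vector pointing from the first joint to the second, and that all other pixels store the zero vector. The function name `paf_for_limb`, the `limb_width` parameter and the joint coordinates are hypothetical.

```python
import math

def paf_for_limb(width, height, joint_a, joint_b, limb_width=1.0):
    """Return a height x width grid of (vx, vy) vectors for one limb.

    Pixels within `limb_width` of the segment joint_a -> joint_b receive the
    unit vector pointing from joint_a to joint_b; all others receive (0, 0).
    """
    ax, ay = joint_a
    bx, by = joint_b
    length = math.hypot(bx - ax, by - ay)
    if length == 0:
        raise ValueError("the two joints must be at different positions")
    ux, uy = (bx - ax) / length, (by - ay) / length
    field = [[(0.0, 0.0) for _ in range(width)] for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = x - ax, y - ay
            along = dx * ux + dy * uy         # projection onto the limb direction
            across = abs(dx * uy - dy * ux)   # perpendicular distance to the limb
            if 0 <= along <= length and across <= limb_width:
                field[y][x] = (ux, uy)
    return field

# Hypothetical limb from an elbow at (1, 4) to a wrist at (6, 4) in an 8 x 8 image.
field = paf_for_limb(8, 8, joint_a=(1, 4), joint_b=(6, 4))
print(field[4][3])  # a pixel lying on the limb
print(field[0][0])  # a pixel far away from the limb
```

Each limb therefore contributes two channels (the x and y components), which is why each field has shape w × h × 2.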


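The confidence map for each person k and each body part j is defined by:

$$S^*_{j,k}(p) = \exp\left(-\frac{\lVert p - x_{j,k} \rVert_2^2}{\sigma^2}\right)$$

where p is a pixel location and $x_{j,k}$ is the position of body part j of person k. It is
a Gaussian curve with gradual changes, where σ controls the spread of the peak. The
confidence map that the network learns to predict is an aggregation of the individual
confidence maps by a maximum operator:

$$S^*_j(p) = \max_k S^*_{j,k}(p)$$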
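The Gaussian confidence map and its aggregation by a maximum operator can be sketched in a few lines of Python. This is only an illustration under assumed values: the image size, the keypoint positions and the value of `sigma` are hypothetical, and the function names are not taken from any library.

```python
import math

def person_confidence_map(width, height, keypoint, sigma=1.5):
    """Gaussian confidence map for one body part of one person."""
    kx, ky = keypoint
    return [
        [math.exp(-((x - kx) ** 2 + (y - ky) ** 2) / sigma ** 2) for x in range(width)]
        for y in range(height)
    ]

def aggregate_max(maps):
    """Combine the per-person maps of the same body part with a pixel-wise maximum."""
    height, width = len(maps[0]), len(maps[0][0])
    return [[max(m[y][x] for m in maps) for x in range(width)] for y in range(height)]

# Two people whose (hypothetical) noses are at (2, 2) and (7, 2) in a 10 x 5 image.
maps = [person_confidence_map(10, 5, (2, 2)), person_confidence_map(10, 5, (7, 2))]
combined = aggregate_max(maps)
print(combined[2][2], combined[2][7])  # both peaks survive the aggregation
```

Taking the maximum rather than the average keeps the peaks of nearby people distinct, which matters when they stand close together.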

The convolutional neural network is divided into three basic parts:

• The first set of stages predicts the part affinity fields L^t, refining them from the
feature maps F of the base network.
• The second set of stages takes the part affinity fields output by the previous
stages and uses them to refine the confidence map predictions.
• The third stage parses the body parts with the help of a matching
algorithm.


2. Research Contents and Objectives:


(To describe specific research contents, research objectives and results, as well as key
scientific problems to be solved. This section should be mainly elaborated.) (No less than
1500 words.)
Research Contents
A human skeleton represents the orientation of a person in a graphical format. Essentially,
it is a set of coordinates that can be connected to describe the pose of the person. Each
coordinate in the skeleton is known as a joint or keypoint. Deep learning models for pose
recognition follow either a top-down or a bottom-up approach. Of these, the bottom-up
approach first detects all the parts in an image before grouping them into individual
people. Deep learning is popularly used for image classification, where the model takes an
image as input and outputs predictions. These algorithms use neural networks to learn the
relationship between input and output. In this project, OpenPose followed by a
convolutional network is proposed: keypoints of human joints are extracted from the image,
and deep learning models are then trained on those keypoints. In the thesis, I will use the
MobileNetV2 model for training, and the dataset will be the COCO keypoint challenge
dataset.

Convolutional Neural Network


A convolutional neural network (CNN) is a type of network that is widely used in the
Artificial Intelligence domain and has proved so effective that it has become the go-to
method for image data. A CNN consists of at least one convolutional layer, which is the
first layer and is responsible for extracting features from the image. It performs feature
extraction by applying convolutional filters to the input, analyzing one part of the input
at a time before passing the result to the next layer. The output generated by a
convolutional filter is called a feature map.

A CNN is used here because it is a promising and highly desirable choice: it can be
trained on the keypoints of the joint locations of the human skeleton.

In the case of keypoints, the CNN extracts features from the 2D coordinates of the
OpenPose keypoints using convolutional filters. Depending on the filter size, the
convolutional filter slides to the next set of inputs. After the convolution, an activation
function is generally applied to add non-linearity to the CNN, as real-world data is mostly
non-linear.
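As a small illustration of a filter sliding over keypoint coordinates followed by an activation function, the sketch below applies a one-dimensional convolution and a ReLU in plain Python. The coordinate values and the two-tap difference filter are hypothetical and chosen only to make the arithmetic easy to follow; a real model would learn its filter weights.

```python
def conv1d(signal, kernel, stride=1):
    """Slide `kernel` over `signal` and return the resulting feature map."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(0, len(signal) - k + 1, stride)
    ]

def relu(values):
    """Activation that keeps positive responses and zeroes out the rest."""
    return [max(0.0, v) for v in values]

# Hypothetical x-coordinates (in pixels) of five consecutive keypoints.
xs = [120, 135, 128, 160, 150]
difference_filter = [-1, 1]  # responds to an increase between neighbouring values

feature_map = conv1d(xs, difference_filter)
print(feature_map)        # raw filter responses
print(relu(feature_map))  # responses after the non-linearity
```

With `stride=2` the filter would skip every other position, which halves the length of the feature map.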

The OpenPose algorithm is divided into three parts: body detection, face detection and hand
detection. The core block is the combined body keypoint detector. It can alternatively use
the original body-only detector trained on the datasets. Based on the output of the body
detector, facial bounding box proposals can be roughly estimated from the locations of some
body parts, in particular the eyes, ears, nose and neck.
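One simple way to picture how a facial bounding box could be proposed from body keypoints is sketched below. This is an assumed heuristic for illustration, not the rule used inside OpenPose: it takes whichever of the nose, eye, ear and neck keypoints were detected, and returns a square box around them enlarged by a margin. The keypoint names, the `margin` value and the coordinates are hypothetical.

```python
def face_box(keypoints, margin=0.2):
    """Estimate a square face box (x, y, width, height) from head-area keypoints.

    `keypoints` maps a part name to an (x, y) tuple, or to None if undetected.
    Returns None when fewer than two usable keypoints are available.
    """
    names = ("nose", "left_eye", "right_eye", "left_ear", "right_ear", "neck")
    points = [keypoints[n] for n in names if keypoints.get(n) is not None]
    if len(points) < 2:
        return None
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    # Side of the square: the larger extent of the points, enlarged by the margin.
    side = max(max(xs) - min(xs), max(ys) - min(ys)) * (1 + margin)
    centre_x = (min(xs) + max(xs)) / 2
    centre_y = (min(ys) + max(ys)) / 2
    return (centre_x - side / 2, centre_y - side / 2, side, side)

# Hypothetical detections for one person; the ears were not detected.
person = {
    "nose": (50, 40),
    "left_eye": (45, 35),
    "right_eye": (55, 35),
    "left_ear": None,
    "right_ear": None,
    "neck": (50, 60),
}
print(face_box(person))
```

The resulting box would then be passed to the face keypoint detector, which only needs to look inside that region.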

Improved Network Structure

In previous work, both the part affinity field branch and the confidence map branch were
refined at each stage. In this improved approach, only the part affinity field branch is
refined over multiple stages, and the confidence maps are predicted in one stage only.
Hence the amount of computation per stage is reduced by half. The refined affinity field
predictions in turn improve the confidence map results. From the part affinity field
channel outputs, the body part locations can be guessed. However, a bunch of body parts
seen with no other information cannot be parsed into different people. In addition, the
network depth is increased. The proposed project needs to be both accurate and fast.
Training an individual part affinity field based network to predict each individual set of
joints would achieve the accuracy goal, but it risks being inefficient, so extending the
body part affinity field framework to whole-body pose estimation requires various
modifications to the network architecture and the way it is trained.

MobileNet_V2 Model

MobileNetV2 is one of the MobileNet models and consists of two types of blocks, both of
which will be used in this project. The first is a residual block with a stride of 1, and
the other is a block with a stride of 2 for downsizing. Both blocks consist of three
layers. The first layer is a 1 × 1 convolution with ReLU6, whose purpose is to expand the
number of channels in the data before it enters the depth-wise convolution; hence this
expansion layer always has more output channels than input channels. The second layer is
the depth-wise convolution, which is similar to the one in the MobileNetV1 architecture.
The third layer is another 1 × 1 convolution but without any non-linearity; it is claimed
that if ReLU were used again, the deep network would only have the power of a linear
classifier.
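To show how the three layers of such a block change the number of channels, the sketch below counts the convolution weights of one block. It assumes an expansion factor of 6 and a 3 × 3 depth-wise kernel, and it ignores biases and batch normalization parameters; the channel counts in the example are hypothetical.

```python
def inverted_residual_params(in_channels, out_channels, expansion=6, kernel=3):
    """Count the convolution weights in one MobileNetV2-style block."""
    hidden = in_channels * expansion
    expand = in_channels * hidden        # 1 x 1 expansion convolution (followed by ReLU6)
    depthwise = kernel * kernel * hidden  # one kernel x kernel filter per hidden channel
    project = hidden * out_channels      # 1 x 1 projection with no non-linearity
    return {
        "hidden_channels": hidden,
        "expand": expand,
        "depthwise": depthwise,
        "project": project,
        "total": expand + depthwise + project,
    }

def uses_residual(in_channels, out_channels, stride):
    """A skip connection is only possible when input and output shapes match."""
    return stride == 1 and in_channels == out_channels

print(inverted_residual_params(24, 24))
print(uses_residual(24, 24, stride=1), uses_residual(24, 32, stride=2))
```

The depth-wise layer is by far the cheapest of the three even though it operates on the widest tensor, which is the main reason the block stays lightweight.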

Research Goals
As discussed earlier, Information Technology has grown and advanced in various fields,
including machine learning, computer vision and image processing. Human body posture
recognition is the task of inferring the pose of a person in an image or video. One can
also think of pose estimation as the problem of determining the position of the camera
relative to the person in the image. The basic goal of the proposed system is to track
human body keypoints in an image or video. A further purpose is to promote the area of
posture estimation and to identify more approaches and techniques that could be used to
improve performance in this field. A key distinction must be made between 2D and 3D
estimation. 2D pose estimation simply estimates the locations of keypoints in 2D space
relative to an image or video frame. 3D estimation transforms an object in a 2D image into
a 3D object by adding a z-dimension to the predictions. Pose recognition matters a lot in
today's world because it makes it possible to track an object or a person, or several
people, in real-world space at an incredibly granular level. Pose estimation differs from
other computer vision tasks such as object detection, which also locates an object within
an image but only coarsely, typically as a bounding box encompassing the object. Pose
recognition goes further and predicts the precise locations of the keypoints associated
with the object. As for the other objective, the power of pose recognition is best seen in
its application to automatically tracking human movement. In addition to tracking human
movement, pose estimation opens up a wide range of applications such as animation,
augmented reality, gaming and robotics.

Limitations/Problems to be Solved
1. Pose estimation is classified into single-person and multi-person pose estimation,
depending on the number of people in an image. Some previous works focused more on
single-person pose estimation; however, with the availability of large multi-person
datasets, multi-person pose estimation is receiving increasing attention.
Multi-task learning is proposed for this task: it modifies the definition of the
confidence maps to be the concatenation of the body, face and hand confidence maps. An
interconnection between the different annotation tasks must be created so that
the different sets of keypoints of the same person can be assembled together.
Multi-task learning is proposed to solve this problem.
Multi-Task Learning
Multi-task learning is an approach that improves generalization by using the domain
information contained in the training signals of related tasks. It is a subfield of machine
learning in which multiple learning tasks are solved at the same time. It has been one of
the successful approaches across machine learning applications, including body keypoint
detection. Different multi-task learning models have been introduced to improve and speed
up the learning process, and multi-task learning is used in fields such as computer vision,
bioinformatics, speech and natural language processing. It can also be viewed as a way for
machines to mimic human learning, as people often transfer knowledge from one task to
another. In this work, a multi-task learning method is combined with an updated CNN
architecture design that is able to train a unified model out of various keypoint detection
tasks with features at different scales, resulting in a single-network method for human
body posture estimation. Moreover, it is trained in a single stage rather than requiring
independent network training for each individual task, which is expected to reduce the
total training time by almost half. The proposed approach is expected to yield higher
accuracy than previous OpenPose work, especially for face and hand keypoint detection.
The multi-network approach, by contrast, uses the existing body, face and hand keypoint
detection algorithms and therefore suffers from early commitment: if the body detector
fails, there is no way to recover, and it is prone to fail when only a hand or face is
visible in the image. Its overall runtime is also proportional to the number of people in
the image. Body posture recognition requires a large receptive field to learn the complex
interactions among people, while hand and face detection have high resolution requirements.

2. Everyday images are the most common type of input for pose recognition, and systems
that work on RGB input have a huge advantage over others in terms of the mobility of
input sources. However, color information is not often used, which needs to be
improved, because identifying objects or anatomical structures is important in image
processing. A preprocessing technique will therefore be discussed to enhance the
image information by suppressing unwanted distortion and strengthening some of the
image features. Complete-linkage clustering, a classic clustering method, will be
used: each point corresponds to one or more pixels in the given image, and the method
merges those points into clusters.
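The complete-linkage method mentioned in the second problem can be sketched as follows. This is a minimal, unoptimized illustration: points are plain (x, y) tuples, the distance between two clusters is the largest distance between any pair of their points, and the two closest clusters are merged until a chosen number of clusters remains. The function name, the stopping rule and the sample points are hypothetical.

```python
import math
from itertools import combinations

def complete_linkage(points, num_clusters):
    """Agglomerative clustering using the complete-linkage criterion."""
    if num_clusters < 1:
        raise ValueError("num_clusters must be at least 1")
    clusters = [[p] for p in points]

    def cluster_distance(a, b):
        # Complete linkage: the distance between the two farthest members.
        return max(math.dist(p, q) for p in a for q in b)

    while len(clusters) > num_clusters:
        # Find the pair of clusters with the smallest complete-linkage distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: cluster_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return clusters

# Hypothetical pixel coordinates forming two well-separated groups.
pixels = [(0, 0), (0, 1), (10, 10), (10, 11)]
print(complete_linkage(pixels, num_clusters=2))
```

Because every pair of clusters is compared in each round, this sketch is only practical for a small number of points; for full images a library implementation would be the better choice.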


3. Study Design and Feasibility Analysis:


(Including Research Methods, Technology Roadmap, Theoretical Analysis and
Computation, as well as Experimental Procedures and their feasibility analysis.) (No less
than 480 words.)
Research Study Design
To address the key research objective, that is, tracking the human skeleton in a given
frame, the OpenPose deep learning approach is used, followed by a convolutional neural
network.

This figure shows the backbone of the network architecture, which uses a preprocessed RGB image
of size w × h to generate the human body keypoints for every person detected in the scene. The
convolutional neural network determines a set of two-dimensional confidence maps S of body
part locations and a set of two-dimensional vector fields L of part affinities. The set
$S = (S_1, S_2, \ldots, S_J)$ has J confidence maps, one per part, where
$S_j \in \mathbb{R}^{w \times h}$, $j \in \{1, \ldots, J\}$. The set
$L = (L_1, L_2, \ldots, L_C)$ has C vector fields, one per limb, where
$L_c \in \mathbb{R}^{w \times h \times 2}$, $c \in \{1, \ldots, C\}$. The steps are
briefly explained below:


1. First, the image is passed through the convolutional neural network to obtain the feature
maps of the input.
2. Once the feature maps are extracted, they are processed by a multi-stage convolutional
neural network that integrates multi-task learning to generate the confidence maps and
part affinity fields.
3. The generated maps and fields are then passed to a matching algorithm for parsing in
order to estimate the postures in the given image.

Confidence Maps:
A confidence map is a 2D representation of the belief that a particular body part is located
at any given pixel. Confidence maps are described by the following equation:

$$S = (S_1, S_2, \ldots, S_J), \quad S_j \in \mathbb{R}^{w \times h}, \quad j \in \{1, \ldots, J\}$$

where J represents the number of body part locations. Each map takes the form of a Gaussian
curve with gradual changes, where σ controls the spread of the peak. The confidence map
predicted by the network is an aggregation of the individual confidence maps by a max
operator.
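Before the matching step, candidate locations for each body part have to be read out of the confidence map. One common way to do this, assumed here for illustration, is to keep every pixel that exceeds a threshold and is strictly larger than its eight neighbours. The function name, the threshold and the small example map are hypothetical.

```python
def find_peaks(conf_map, threshold=0.5):
    """Return (x, y, score) for each local maximum above `threshold`."""
    height, width = len(conf_map), len(conf_map[0])
    peaks = []
    for y in range(height):
        for x in range(width):
            value = conf_map[y][x]
            if value < threshold:
                continue
            # Collect the values of the (up to eight) surrounding pixels.
            neighbours = [
                conf_map[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            ]
            if all(value > n for n in neighbours):
                peaks.append((x, y, value))
    return peaks

# A tiny hypothetical confidence map containing two people's candidates for one part.
conf_map = [
    [0.0, 0.1, 0.0, 0.0, 0.0],
    [0.1, 0.9, 0.1, 0.0, 0.0],
    [0.0, 0.1, 0.0, 0.2, 0.8],
    [0.0, 0.0, 0.0, 0.1, 0.3],
]
print(find_peaks(conf_map))
```

Each returned peak is one candidate for that body part, and an image with several people will usually yield several candidates per part.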
Part Affinity Field:
A Part Affinity Field (PAF) is a set of 2D vector fields that encodes the location and
orientation of the limbs of different people in the image. It encodes the data in the form of
pairwise connections between body parts:

$$L = (L_1, L_2, \ldots, L_C), \quad L_c \in \mathbb{R}^{w \times h \times 2}, \quad c \in \{1, \ldots, C\}$$

The part affinity field is needed especially in multi-person pose estimation, where body parts
must be mapped to the correct body, because with multiple persons there are multiple heads,
hands, shoulders and so on, and it is sometimes difficult to differentiate them when people
are closely grouped together. The PAF provides a connection between the different body parts
that belong to the same person. A stronger part affinity field link between two body parts
indicates a higher chance that those body parts belong to the same person.
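The strength of a part affinity field link between two candidate body parts can be measured by sampling the field along the straight line between them and checking how well the stored vectors agree with the direction of that line. The sketch below is an approximation under assumptions: the field is a grid of (vx, vy) tuples, samples are taken at evenly spaced points rounded to the nearest pixel, and the score is the average dot product. The function name and sample count are hypothetical.

```python
import math

def association_score(paf, start, end, samples=10):
    """Average alignment between the field and the segment start -> end."""
    if samples < 2:
        raise ValueError("at least two samples are required")
    sx, sy = start
    ex, ey = end
    length = math.hypot(ex - sx, ey - sy)
    if length == 0:
        return 0.0
    ux, uy = (ex - sx) / length, (ey - sy) / length
    total = 0.0
    for i in range(samples):
        t = i / (samples - 1)
        x = int(round(sx + t * (ex - sx)))
        y = int(round(sy + t * (ey - sy)))
        vx, vy = paf[y][x]
        total += vx * ux + vy * uy  # dot product with the limb direction
    return total / samples

# Hypothetical 6 x 4 field in which every pixel points to the right.
paf = [[(1.0, 0.0) for _ in range(6)] for _ in range(4)]
print(association_score(paf, start=(0, 2), end=(5, 2)))  # aligned with the field
print(association_score(paf, start=(5, 2), end=(0, 2)))  # opposite to the field
print(association_score(paf, start=(2, 0), end=(2, 3)))  # perpendicular to the field
```

A score close to 1 suggests that the two candidates are joined by a real limb, while a score near 0 or below suggests that they belong to different people.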

Matching Algorithm

Once the candidates for each body part have been detected, the next step is to connect them
into pairs, for which a bipartite matching algorithm is used. A weighted bipartite graph
represents all the possible connections between the candidates of two body parts and holds a
score for every connection. The matching is then treated as an assignment problem: finding the
set of connections that maximizes the total score. The final step is to transform the detected
connections and joints into the final skeleton structure that estimates the posture in the
given image.
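The assignment problem described above can be illustrated with a brute-force search that tries every one-to-one pairing and keeps the one with the highest total score. This is only a sketch for a handful of candidates, since the number of pairings grows factorially; a practical system would use a dedicated assignment algorithm or a greedy approximation. The score values are hypothetical.

```python
from itertools import permutations

def best_assignment(scores):
    """Return (pairs, total) maximizing the summed score of a one-to-one matching.

    scores[i][j] is the connection score between candidate i of the first body
    part and candidate j of the second. Assumes len(scores) <= len(scores[0]).
    """
    num_a, num_b = len(scores), len(scores[0])
    if num_a > num_b:
        raise ValueError("expected at most as many rows as columns")
    best_total, best_pairs = float("-inf"), []
    for perm in permutations(range(num_b), num_a):
        total = sum(scores[i][j] for i, j in enumerate(perm))
        if total > best_total:
            best_total, best_pairs = total, list(enumerate(perm))
    return best_pairs, best_total

# Hypothetical scores (in percent) between two elbow and two wrist candidates.
scores = [
    [90, 80],
    [85, 10],
]
print(best_assignment(scores))
```

In this example, picking the single best connection first (elbow 0 with wrist 0) would leave only a weak connection for the other elbow, whereas the optimal matching pairs elbow 0 with wrist 1 and elbow 1 with wrist 0.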

Feasibility Analysis
The recent growth of Internet technology and the World Wide Web makes it appear that the
world is witnessing the arrival of a completely new technology.

Along with the advancement in different fields of Information Technology such as
computer vision, machine learning, image processing and many more, human body posture
estimation has become an evolving research topic. The task is very helpful in applications
such as pattern recognition, activity recognition, human tracking in videos or images,
video surveillance, motion analysis, medical imaging and so on.

Previous work on this task had problems with both accuracy and speed. The multi-task
learning proposed here is therefore expected to improve and speed up the learning process
and to help learn the tasks more accurately.


4. Possible Innovations in Research Topics:


(No less than 300 words.)
To overcome the dataset problem, multi-task learning is applied. It is a classical machine
learning technique in which related learning tasks are solved at the same time while exploiting
the similarities and differences between them. This multi-task learning approach is combined
with the improved network architecture, which is able to train a unified model out of various
keypoint detection tasks, each with its own properties.
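The idea of a unified model whose confidence maps are the concatenation of the body, face and hand confidence maps can be pictured as stacking the channels of the three tasks while remembering which slice belongs to which task. The sketch below assumes all maps share the same height and width; the channel counts and the helper names are hypothetical and much smaller than in a real model.

```python
def concat_confidence_maps(body, face, hand):
    """Stack the per-task confidence maps and record each task's channel range."""
    groups = {"body": body, "face": face, "hand": hand}
    shapes = {(len(m), len(m[0])) for maps in groups.values() for m in maps}
    if len(shapes) != 1:
        raise ValueError("all confidence maps must share the same height and width")
    stacked, index, start = [], {}, 0
    for name, maps in groups.items():
        stacked.extend(maps)
        index[name] = (start, start + len(maps))  # half-open channel range
        start += len(maps)
    return stacked, index

def blank_maps(count, width=4, height=3):
    """Create `count` empty maps of the given size (placeholder data)."""
    return [[[0.0] * width for _ in range(height)] for _ in range(count)]

# Hypothetical channel counts: 4 body parts, 2 face points, 3 hand points.
stacked, index = concat_confidence_maps(blank_maps(4), blank_maps(2), blank_maps(3))
print(len(stacked), index)
```

Keeping the channel ranges makes it possible to compute the loss for a training image only on the tasks for which that image has annotations.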

Unlike previous work on posture estimation, at test time the network provides constant
real-time inference regardless of the number of people detected in the frame. Training time has
also been an issue in previous work; in this project the network is trained in a single stage
rather than requiring independent network training for each individual task, which is expected
to reduce the total training time. This approach is also expected to yield higher accuracy than
the previous OpenPose, especially for face and hand joint detection. Analogous to Fast R-CNN
(fast region-based convolutional neural network), it brings together multiple, currently
independent keypoint detection tasks into a unified framework.

Going through the different papers, I found that 2D posture estimation libraries such as Mask
R-CNN or AlphaPose require the user to implement most of the pipeline, including their own
display to visualize the results and the generation of output files with the results. In
addition, existing facial and body keypoint detectors are not combined, requiring a different
library for each purpose. OpenPose overcomes these limitations.

5. Research Foundation and Working Conditions:

(Including (1) the accumulation of research foundation related to current research
topics; (2) the available experimental conditions, and the lack of conditions with
proposed solutions.) (No less than 300 words.)
I went through many surveys and research papers relevant to this topic, which showed me that
many researchers have already worked in this area. I started by reading many documents related
to my topic to clarify the basics of what pose estimation is and which algorithms implement it.
Subjects that I studied, such as Artificial Intelligence, Computer Vision and its Applications,
Advanced Computer Networks and Research Integration, became the foundation for my research and
helped me gain knowledge in these fields. From the project I did during my first year,
handwritten digit recognition, I learned a lot, such as how to work in Python along with some of
its libraries, as Python has an ocean of libraries that serve various purposes. The ones I
learned include TensorFlow, Keras, NumPy, PyTorch and pandas. I also learned to use the
Anaconda environment for training and testing on different datasets. Moreover, I went through
different papers and journals relevant to human body posture estimation, analyzing which models
have been used and the limitations that I could improve on in the new approach.

I am doing my research work under the supervision of my supervisor from my home country,
Nepal, as I am currently not present at the university due to the worldwide spread of Covid.
Frankly speaking, it is not very effective to work from home on research; however, I have
managed to set up a proper and favourable working environment with Internet access for my
research work.

Technology is improving day by day, so I have tried my best to include some new suggestions
and ideas in my research work which will make the work better.


Work Plan for Dissertation

Date              Research Contents                                  Expected Outcomes

2020/09-2020/11   Research papers collected and read                 Successfully done

2020/12           Design the network architecture and perform        Build the design
                  basic experiments

2020/12           Write the proposal                                 Done

2021/1            Made some changes in the proposal                  Done


Supervisory Committee Members

Name Title Organization Position

Comments of Supervisor:

Signature:

Date:

Comments of Supervisory Committee:

Signature:

Date:

Comments of School:

Signature:

Date:
