Abstract:
In the evolution of gaming technologies, the Xbox marked a decisive step beyond
earlier systems. The Xbox 360 is the second video game console produced by Microsoft, as the
successor to the original Xbox. It contains an aggressive hardware architecture and
implementation targeted at game console workloads. The core chips include the standard
conceptual blocks of CPU, graphics processing unit (GPU), memory, and I/O.
Kinect for Xbox 360, or simply Kinect (originally known by the code name Project
Natal), is a "controller-free gaming and entertainment experience" by Microsoft for the Xbox
360 video game platform, and may later be supported by PCs via Windows 8. The Kinect sensor
offers a revolutionary new way to play: you’re the controller. Just move around and see what
happens. Control your Xbox 360 with a wave of your hand. Based around a webcam-style add-
on peripheral for the Xbox 360 console, it enables users to control and interact with the Xbox
360 without the need to touch a game controller, through a natural user interface using
gestures, spoken commands, or presented objects and images. Imagine controlling movies and
music with the wave of a hand or the sound of your voice. With Kinect, technology evaporates,
letting the natural magic in all of us shine.
1. Introduction
Over the years, we have become used to playing video games by pressing a controller’s
buttons with our thumbs. More recently, gaming companies have incorporated joystick
mechanisms to enhance game play. Even so, the level of interaction between the game and the
gamer has remained limited to the up and down arrows and an action button.
Sony and Microsoft each designed motion controllers for their consoles, the PlayStation 3 and
the Xbox 360. Sony devised the PlayStation Move, a controller similar to but more advanced than
the Nintendo Wii controller and much more precision-oriented than the Wiimote. Microsoft, in an
attempt to take gaming a step further still, came out with the Xbox 360 Kinect, which requires no
controller at all.
People with a motor disability or without full control of their upper limbs have difficulty
working with tactile user interfaces such as keypads or joysticks. They need more advanced,
hands-free human machine interfaces (HMI) to communicate with computer-based systems.
In particular, pupils with severe motor disabilities, cerebral palsy (CP), or encephalitis have
difficulty using the control pads of video game consoles such as the Xbox. They need a
non-tactile interface to play video games and so improve their abilities and experience. Such
interfaces provide a hands-free way to communicate with electronic systems without any physical
connection. For instance, a head gesture interface can help people with serious spinal disorders
or CP to operate an electronic system with minimal head movements. Feature tracking, also known
as optical flow, is one of the most fundamental approaches in computer vision for extracting
motion information from an image sequence, and it could be employed to recognize head gestures.
Kinect was first announced, under the name Project Natal, at the 2009 Electronic Entertainment
Expo. It is an external peripheral that must be attached to the Xbox 360 to enable motion gaming.
The sensor recognizes both motion input and speech, so the device not only takes motion input
but also interacts with the user.
An approximately nine-inch (23 cm) wide horizontal bar connected to a small circular
base with a ball joint pivot, the Project Natal sensor is designed to be positioned lengthwise
above or below the video display. The device features an "RGB camera, depth sensor, multi-
array microphone, and custom processor running proprietary software", which provides full-
body 3D motion capture, facial recognition, and voice recognition capabilities. The Project Natal
sensor's microphone array enables the Xbox 360 to conduct acoustic source localization and
ambient noise suppression, allowing for things such as headset-free party chat over Xbox Live.
The depth sensor consists of an infrared projector combined with a monochrome CMOS
sensor, and allows the Project Natal sensor to see in 3D under any ambient light conditions.
The Kinect sensor needs to be able to see you, and you need room to move. The sensor
can see you when you play approximately 6 feet (2 meters) from the sensor. For two people, you
should play approximately 8 feet (2.5 meters) from the sensor.
2.3 Microsoft Xbox 360
The Microsoft Xbox 360 is a gaming console that represents an evolution in both technology
and purpose, expanding game play and its applications. Consoles of this kind carry powerful
graphics hardware, and graphics rendering has reached the point where it is almost photorealistic.
This is the hardware that renders the virtual world at the end point, where the client is
sitting.
The Xbox 360, pictured in Figure 1, contains an aggressive hardware architecture and
implementation targeted at game console workloads. The core silicon implements the product
designers’ goal of providing game developers a hardware platform to implement their next-
generation game ambitions. The core chips include the standard conceptual blocks of CPU,
graphics processing unit (GPU), memory, and I/O. Each of these components and their
interconnections are customized to provide a user-friendly game console product.
to please every fan of online video games. The Xbox 360 is easy to set up and can be played by
the whole family: it is designed for teenagers as well as adults.
Kinect Adventures game: Experience the thrill of roaring rapids. Float in outer space. With
Kinect Adventures, you are the controller as you jump, dodge and kick your way through
exciting adventures set in a variety of exotic locations. Tackle mountaintop obstacle courses
and dive into the deep to explore a leaky underwater observatory, all from your living room.
Rated "E" for Everyone.
2.4.3 Product Features
Halo: Reach game - Halo: Reach is the culmination of 10 years of award-winning Halo
games that have raised expectations for what can be achieved in a video game.
Two Xbox 360 Wireless Controllers - This award-winning, high-performance wireless
controller is custom designed to match the console, and features a range of up to 30 feet and
a battery life of up to 30 hours on two AA batteries.
250GB Hard Drive - The internal, removable 250GB hard drive offers plenty of space to
save games, HD TV and movies, music and more.
Xbox 360 Headset (black) - Chat with friends on Xbox LIVE, or team up and jump in with
multiplayer game play.
Built-in Wi-Fi - The new Xbox 360 is the only console with 802.11n Wi-Fi built-in for a
faster and easier connection to Xbox LIVE.
More ports - Connect more accessories and storage solutions with added USB ports. Now
with a total of 5 (3 back/2 front), you'll find more places to plug and play. An integrated
optical audio out port has been added for an easier audio connection.
What's In The Box
Kinect is an add-on peripheral for Microsoft’s Xbox 360 gaming console. It is built around
a controller-free gaming and entertainment experience.
3. Architecture
The architecture is built around the Kinect components. The infrared projector
projects infrared light onto the scene, i.e. onto the actions we perform in front of the Kinect
sensor. The rays reflected from the scene are sensed by the standard CMOS sensor, which sends
the measurements to the SoC (System on Chip) for processing.
The SoC processes the data and calculates the depth of the scene. This depth information
is passed to the Xbox console, which uses it to drive what the user sees on the monitor or
gaming output screen.
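The text does not say how the SoC turns the sensed reflections into depth. One common approach for a projector and camera pair, assumed here purely for illustration, is triangulation: a projected feature appears shifted in the camera image by a disparity d, and the depth is z = f * b / d, where f is the focal length in pixels and b is the distance between projector and camera. The focal length and baseline values below are hypothetical placeholders, not Kinect specifications, and the function names are invented for this sketch.

```python
# Hypothetical calibration values, chosen only to make the example concrete.
FOCAL_LENGTH_PX = 580.0   # assumed focal length of the CMOS sensor, in pixels
BASELINE_M = 0.075        # assumed projector-to-sensor distance, in metres


def depth_from_disparity(disparity_px, focal_length_px=FOCAL_LENGTH_PX,
                         baseline_m=BASELINE_M):
    """Return triangulated depth in metres, or None if there is no valid shift."""
    if disparity_px is None or disparity_px <= 0:
        return None  # no reflection matched at this pixel
    return focal_length_px * baseline_m / disparity_px


def depth_row(disparities_px):
    """Convert one image row of disparities into a row of depths."""
    return [depth_from_disparity(d) for d in disparities_px]


if __name__ == "__main__":
    # Larger disparity means the surface is closer to the sensor.
    for depth in depth_row([43.5, 21.75, 0.0]):
        print(depth)
```

In this sketch the per-pixel depths are what would be handed on to the console; invalid pixels are kept as None rather than guessed.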
4. Kinect Console
The device features an "RGB camera, depth sensor, multi-array microphone, and custom
processor running proprietary software", which provides full-body 3D motion capture, facial
recognition, and voice recognition capabilities. Its main components are described below.
RGB Camera
It is based on the RGB color model, an additive color model in which red, green and blue
light are combined.
It helps capture full-body 3D motion.
IR Depth Sensor
It consists of an infrared projector combined with a monochrome CMOS sensor.
It has a range of 4-11 feet.
It has an angular field of view of 57° horizontally and 43° vertically.
Multi-Array Microphone
It is an array of four microphone capsules, employed for voice recognition of different
users.
Each channel processes 16-bit audio at a sampling rate of 16 kHz.
Motorized Pivot
The motorized pivot is used to tilt the sensor.
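The figures quoted for the depth sensor and the microphone array lend themselves to a quick back-of-the-envelope check. The sketch below is a minimal illustration, not anything taken from the Kinect software: it computes how wide and tall the visible area is at a given distance from the 57 by 43 degree field of view, and the raw data rate of four 16-bit channels sampled at 16 kHz. The function and constant names are hypothetical.

```python
import math

H_FOV_DEG = 57.0         # horizontal field of view, from the text
V_FOV_DEG = 43.0         # vertical field of view, from the text
MIC_CHANNELS = 4         # microphone capsules in the array
BITS_PER_SAMPLE = 16     # audio sample width per channel
SAMPLE_RATE_HZ = 16_000  # audio sampling rate per channel


def visible_extent(distance_m):
    """Width and height (metres) of the area the sensor sees at distance_m."""
    width = 2.0 * distance_m * math.tan(math.radians(H_FOV_DEG) / 2.0)
    height = 2.0 * distance_m * math.tan(math.radians(V_FOV_DEG) / 2.0)
    return width, height


def mic_data_rate_bytes_per_second():
    """Raw, uncompressed data rate of the whole microphone array."""
    return MIC_CHANNELS * (BITS_PER_SAMPLE // 8) * SAMPLE_RATE_HZ


if __name__ == "__main__":
    width, height = visible_extent(2.0)
    print(f"visible area at 2 m: {width:.2f} m x {height:.2f} m")
    print(f"microphone array: {mic_data_rate_bytes_per_second()} bytes/s")
```

At a distance of 2 meters this gives a visible area roughly 2.2 m wide and 1.6 m tall, and the microphone array produces 4 x 2 x 16,000 = 128,000 bytes of raw audio per second.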
5. TECHNOLOGIES BEHIND
Several technologies are used to implement this gaming system, and they play a very
important role in its success. They are as follows:
1. 3D Gesture Recognition
3. Functional Programming
Obtain information
Process the information conveyed
through the gestures of a person, such as hand and leg movements, flapping, gait, etc. The
system has been programmed to analyse photographs, look for a basic human form, and identify
about thirty essential parts, e.g. your head, torso, hips, knees, elbows, and thighs. To do this,
Microsoft relies on an advancing field of AI called machine learning.
This saves programmers the near-impossible task of hand-coding every movement a body can
make. Representatives went into homes around the world and recorded people moving in front of a
specially built rig. Microsoft’s computer farms sift through this gigantic data set, letting the
"brain" come up with probabilities and statistics about the human form. Once the brain is done
learning, it and its statistics are packed into the Natal system. The technology works in the
following steps:
Step -1
As you stand in front of the camera, it judges the distance
to different points on your body.
It sees a so-called “point cloud” representing a 3D
Step-4
It outputs the most likely skeletal structure and maps its shape to a 3D
avatar.
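Step 1 turns per-pixel distance measurements into a point cloud. A minimal sketch of that back-projection follows, assuming a simple pinhole camera model and the 57 by 43 degree field of view quoted for the depth sensor. The function name and the convention that a zero reading means "no measurement" are assumptions made for this example; the source does not describe the actual conversion.

```python
import math


def depth_to_point_cloud(depth, h_fov_deg=57.0, v_fov_deg=43.0):
    """Back-project a depth image into a list of (x, y, z) points in metres.

    depth is a list of rows; each entry is the distance along the optical
    axis in metres, with 0 or None meaning that no reading is available.
    """
    rows, cols = len(depth), len(depth[0])
    # Focal lengths in pixels, derived from the field of view (pinhole model).
    fx = (cols / 2.0) / math.tan(math.radians(h_fov_deg) / 2.0)
    fy = (rows / 2.0) / math.tan(math.radians(v_fov_deg) / 2.0)
    cx, cy = (cols - 1) / 2.0, (rows - 1) / 2.0  # optical centre of the image

    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if not z:
                continue  # skip pixels without a depth reading
            x = (u - cx) * z / fx   # positive x is to the right of the sensor
            y = (cy - v) * z / fy   # image rows grow downward, so flip the sign
            points.append((x, y, z))
    return points


if __name__ == "__main__":
    tiny_depth_image = [
        [2.0, 2.0, 2.0],
        [2.0, 1.5, 0.0],   # the 0.0 entry has no reading and is dropped
        [2.0, 2.0, 2.0],
    ]
    for point in depth_to_point_cloud(tiny_depth_image):
        print(point)
```

Later stages would work on this list of 3D points rather than on the raw image, which is what makes the subsequent body-part and skeleton estimation independent of lighting and clothing colour.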
Human body pose tracking from video inputs has been an active research field motivated
by various applications including human computer interaction, motion capture systems, and
gesture recognition. The major challenges of recovering the large number of degrees of freedom
in human body movements are the difficulties to resolve various ambiguities in the projection of
human motion onto the image plane and the diversity of visual appearance caused by clothing
and varying illumination.
Human body motion tracking and analysis has received a significant amount of attention
in the computer vision research community in the past decade. This has been motivated mainly
by the desire of understanding human pose and gestures for building the next generation user
interface. Inspired from human to human interactions, such an interface will go beyond the
mouse-keyboard interaction, defining a system that responds naturally to user gestures. Naturally
other applications related to a marker-less capture of the human body motion can be considered
within the presented framework.
Human pose tracking remains a challenging problem, primarily because pose is
difficult to track in the presence of occlusion, fast movements, and ambiguity. Generating
multiple hypotheses for the human pose in one image is at times necessary to arrive at a correct
solution. A method has been proposed that demonstrates the potential of integrating pose
estimation results from different modalities to improve robustness and accuracy.
To apply this technology, we follow the sequence above. The first step is to gather the image
information in 2D; from the complete image we then keep only the outline of the body.
limbs [11] while others have proposed a model that contains hand and finger joints.
After we detect the region that is likely to be a person, we want to estimate the motion of
the head, limbs, and torso. In [7], a generative model of human appearance and motion is defined
in a Bayesian framework. The probabilistic formulation of the generative model provides the basis
for evaluating the likelihood of the image measurements given the model parameters. A particle
filtering approach is used to represent and propagate the posterior distribution over time, thus
tracking multiple hypotheses in parallel.
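The particle filtering idea can be illustrated with a deliberately small sketch. It tracks a single joint angle rather than a full body model, and its random-walk motion model, Gaussian measurement likelihood, noise levels, and all names are assumptions made for this example; it is not the generative model of [7]. Each particle is one hypothesis about the angle, and the set of particles approximates the posterior distribution.

```python
import math
import random


def particle_filter_step(particles, weights, measurement, rng,
                         motion_std=0.05, measurement_std=0.1):
    """One predict / update / resample cycle for a 1D state (e.g. a joint angle)."""
    # Predict: move every hypothesis with random motion noise.
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]

    # Update: reweight each hypothesis by how well it explains the measurement.
    new_weights = [
        w * math.exp(-0.5 * ((measurement - p) / measurement_std) ** 2)
        for p, w in zip(predicted, weights)
    ]
    total = sum(new_weights)
    if total == 0.0:
        # Every hypothesis was implausible; fall back to uniform weights.
        new_weights = [1.0 / len(predicted)] * len(predicted)
    else:
        new_weights = [w / total for w in new_weights]

    # Resample: draw a new population in proportion to the weights.
    resampled = rng.choices(predicted, weights=new_weights, k=len(predicted))
    uniform = [1.0 / len(resampled)] * len(resampled)
    return resampled, uniform


def track(measurements, n_particles=500, seed=0):
    """Return the posterior mean estimate after each measurement."""
    rng = random.Random(seed)
    particles = [rng.uniform(-1.0, 1.0) for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    estimates = []
    for z in measurements:
        particles, weights = particle_filter_step(particles, weights, z, rng)
        estimates.append(sum(p * w for p, w in zip(particles, weights)))
    return estimates


if __name__ == "__main__":
    noisy_angles = [0.30, 0.33, 0.28, 0.31, 0.29, 0.30]  # radians, made up
    print(track(noisy_angles))
```

Because many hypotheses survive each step, a briefly occluded or ambiguous measurement does not force the tracker to commit to a single wrong pose.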
We then model the background scene statistically to detect foreground objects, and
distinguish people from other objects by checking for skin-color blobs in the foreground. After
that, we estimate the position of the body in the foreground region. The mechanisms that we
follow are:
Boundary Tracking: First, the user is asked to initialize the points of the snake used for the
initial boundary (see Figure 4). The module then proceeds to track the boundary using snake
deformations. This operation can be performed on its own or together with the Pose Estimation
operation.
Pose Estimation: This operation tries to match a pose from its tree database to the frames of the
video. If the Full Frame radio button is checked, the method is fully autonomous. If the
Bounding Box is selected, the user is prompted to define a bounding box that includes the
human body of interest (see Figure 5). This operation can be performed on its own or together
with the Boundary Tracking operation.
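The statistical background modelling that precedes these two operations can be sketched as follows. The example assumes a per-pixel Gaussian model on grayscale intensities, which is one simple choice among many; the source does not name a specific model, and the threshold k, the noise floor min_std, and the function names are all hypothetical. A pixel is marked as foreground when it deviates from the background mean by more than k standard deviations.

```python
import math


def fit_background(frames):
    """Per-pixel mean and standard deviation over frames of the empty scene."""
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    mean = [[sum(f[r][c] for f in frames) / n for c in range(cols)]
            for r in range(rows)]
    std = [[math.sqrt(sum((f[r][c] - mean[r][c]) ** 2 for f in frames) / n)
            for c in range(cols)]
           for r in range(rows)]
    return mean, std


def foreground_mask(frame, mean, std, k=3.0, min_std=2.0):
    """True where a pixel differs from the background by more than k sigma."""
    return [
        [abs(frame[r][c] - mean[r][c]) > k * max(std[r][c], min_std)
         for c in range(len(frame[0]))]
        for r in range(len(frame))
    ]


if __name__ == "__main__":
    empty_scene = [
        [[100, 101], [99, 100]],
        [[101, 100], [100, 99]],
        [[99, 100], [101, 101]],
    ]
    mean, std = fit_background(empty_scene)
    with_person = [[100, 200], [100, 100]]  # one pixel changed markedly
    print(foreground_mask(with_person, mean, std))
```

The min_std floor keeps nearly constant background pixels from triggering on sensor noise. The skin-color check and the snake initialization would then operate only on the pixels flagged here.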
Fig5.2.1: Interpolation of the pose angles at different positions of the player
5.2.2 Boundary Tracking and Pose Estimation: This combines the above operations, using the
pose detected to help initialize the snake for Boundary Tracking. If the Auto radio button is
checked, then the user does not need to provide any more input. If the user chooses to initialize
once, then the user is first asked to provide a Bounding Box as in the Pose Estimation operation.
Once an initial pose is estimated, the user is presented with the pose and asked if it’s a good
initialization for the snake (see Figure 6). If it is not, the user is asked to provide manual
initialization for the snake, as in the Boundary Tracking Operation. If the Multiple radio button is
checked, the above process is repeated every time the program estimates that the tracked
silhouette has lost its human body appearance.
Fig5.2.2: Filtering over time
One approach uses a motion model that allows us to formulate the tracking problem as
one of minimizing a differentiable objective function with respect to relatively few parameters.
The differential structure of this function is rich enough to yield good convergence properties
using a deterministic optimization scheme at a much reduced computational cost. Furthermore,
by using a multi-activity database, we can partially overcome one of the major limitations of
approaches that rely on motion models.
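To make the idea concrete, the sketch below minimizes a differentiable least-squares objective over just two parameters with plain gradient descent, which is a deterministic scheme. The linear motion model (a joint angle changing at constant speed), the learning rate, the step count, and all names are illustrative assumptions, not the motion model or optimizer of the approach described above.

```python
def fit_linear_motion(times, angles, learning_rate=0.05, steps=2000):
    """Fit angle(t) = offset + speed * t by gradient descent on the mean squared error."""
    offset, speed = 0.0, 0.0
    n = len(times)
    for _ in range(steps):
        # Residual of the motion model at every observed frame.
        residuals = [offset + speed * t - a for t, a in zip(times, angles)]
        # Analytic gradient of the mean squared error w.r.t. both parameters.
        grad_offset = 2.0 * sum(residuals) / n
        grad_speed = 2.0 * sum(r * t for r, t in zip(residuals, times)) / n
        # Deterministic update: step against the gradient.
        offset -= learning_rate * grad_offset
        speed -= learning_rate * grad_speed
    return offset, speed


if __name__ == "__main__":
    frame_times = [0.0, 1.0, 2.0, 3.0, 4.0]
    observed = [0.1 + 0.2 * t for t in frame_times]  # synthetic, noise-free data
    print(fit_linear_motion(frame_times, observed))
```

With only two parameters the search space is tiny compared with tracking every joint independently, which is where the reduced computational cost comes from. The learning rate here is tuned for the short time range in the example and would need adjusting for other data.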
5.3 Functional Programming:
Most of the programming languages you are familiar with (Pascal, Ada, C) are
imperative languages. They emphasize a programming style in which programs execute
commands sequentially, use variables to organize memory, and update variables with assignment
statements. The result of a program thus comprises the contents of all permanent variables (such
as files) at the end of execution.
In contrast, functional programming languages have no variables, no assignment
statements, and no iterative constructs. This design is based on the concept of mathematical
functions, which are often defined by separation into various cases, each of which is separately
defined by appealing (possibly recursively) to function applications.
A function is a good way of specifying a computation. This is the basis of the functional
programming style. A ‘program’ consists of the definition of one or more functions. With the
‘execution’ of a program the function is provided with parameters, and the result must be
calculated. With this calculation there is a certain degree of freedom.
An FP environment comprises the following:
A set of objects. An object is either an atom or a sequence, <x1, ..., xn>, whose
elements are objects.
A set of functions (which are not objects) mapping objects into objects. Functions may be
primitive (predefined), defined (represented by a name), or higher-order (a combination
of functions and objects using a predefined higher-order function).
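These ideas can be sketched in Python, used here in a functional style purely for illustration (Python itself is not a pure functional language). Objects are modelled as atoms or nested tuples, a function is defined by cases with recursion instead of a loop, and new functions are built from existing ones with higher-order functions. The names compose and apply_to_all are chosen for this example.

```python
from functools import reduce


def length(sequence):
    """Length of a sequence, defined by cases and recursion (no loop, no counter)."""
    if sequence == ():                 # case 1: the empty sequence
        return 0
    return 1 + length(sequence[1:])    # case 2: one element plus the rest


def factorial(n):
    """Factorial defined by two cases, the second appealing to the function itself."""
    return 1 if n == 0 else n * factorial(n - 1)


def compose(*functions):
    """Higher-order function: compose(f, g)(x) equals f(g(x))."""
    return reduce(lambda f, g: lambda x: f(g(x)), functions)


def apply_to_all(function):
    """Higher-order function: lift a function on atoms to one on sequences."""
    return lambda sequence: tuple(map(function, sequence))


def double(x):
    return 2 * x


def increment(x):
    return x + 1


# A defined function built only by combining other functions.
double_then_increment_all = apply_to_all(compose(increment, double))

if __name__ == "__main__":
    nested = (1, (2, 3), 4)            # a sequence whose elements are objects
    print(length(nested))
    print(factorial(5))
    print(double_then_increment_all((1, 2, 3)))
```

Nothing in these definitions updates a variable after it is bound: each result is computed purely from the arguments, which is the property the functional style relies on.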
6. Current Applications:
The primary application of Project Natal is as a gaming controller. This extends the boundaries
of gaming into a new dimension. With the Natal sensor, gamers can interact with the gaming
environment as if they were actually living inside it. Because it captures full-body motion
without any gadget attached to the body, gamers are free to express themselves very precisely
through body motions and voice commands. This will probably help Microsoft increase its sales
substantially and expand its gaming empire.
Project Natal could provide good support for the 3D graphic design industry. The current
challenge is to model objects correctly. Traditional controllers, including the keyboard and
mouse, don’t provide much support for that, and even laser scanning is a complex and
time-consuming process. The Natal sensor could provide an accurate interface for artists to
design 3D models without touching any controller, simply through their body motions.
Project Natal can also provide good solutions for motion capture. Capturing accurate human
emotions and transferring them to a 3D model is one of the major challenges faced by the film
industry. In addition, the sensor can capture voice. This kind of capture process would provide
a relaxed environment in which actors can give their best performance without being attached to
so many sensors.
6.4 Military Applications
Another possible application of the Natal sensor is controlling advanced military systems.
Modern unmanned military aircraft are controlled by joystick from a ground base. However, as
these systems improve, they call for better and more advanced controllers, for which Project
Natal might offer a solution. In addition, it could be used to control unmanned war robots.
Finally, let us hope no one ever uses these things against humans, except perhaps against some
kind of alien invasion.
7. Future Applications
8. Conclusion
Based on what Microsoft showed at E3, Natal promises to open up a new world
of gaming. Kinect for Xbox 360 will thus be a revolution in the gaming and entertainment
industry. It is a new way of interacting with computers and machines. In short, Kinect is the
application of tomorrow, launched today by Microsoft.
9. References
http://www.techibuzz.com/iholo-touchscreen-mobile-features/
http://itechfuture.com/future-multimedia/
http://en.wikipedia.org/wiki/Xbox_360
http://www.xbox.com/en-US/kinect
http://www.xbox.com/en-IN
http://www.ehow.com/how-does_4728027_an-xbox-work.html