SECA7019
1.1 COMMUNICATION NEEDS:
M2M (Machine-to-Machine) has come of age. It has been almost a decade since the idea
of expanding the scope of entities connected to “the network” (wireless, wireline; private,
public) beyond mere humans and their preferred communication gadgets emerged around
the notions of the “Internet of Things” (IoT), the “Internet of Objects”, or M2M. The initial
vision was that of a myriad of new devices, largely unnoticed by humans, working together
to expand the footprint of end-user services. This would create new ways to care for safety or
comfort, optimize a variety of goods-delivery mechanisms, enable efficient tracking of
people or vehicles, and at the same time create new systems and generate new value.
As with every vision, it has taken time to materialize. Early efforts concentrated on
refining the initial vision by testing new business models, developing point solutions to test
feasibility, and forecasting the impact of insufficient interoperability. Over the past few
years, the realization that there are new viable sources of demand that can be met and
monetized has created the push for a joint effort by industry to turn a patchwork of
standalone elements and solutions into a coherent “system of systems”, gradually turning the
focus from the “what” to the “how” and developing the appropriate technologies and
standards.
Several communication needs underpin this shift. Real-time operation allows optimal and
timely decisions to be made on top of the large amount of previously collected historical
data. The trend is to move away from decision making based on long-term averages toward
decisions based on real-time or short-term averages, which makes a real difference for the
large number of nonergodic industrial processes.
Scalability implies that all involved stakeholders, i.e., technology providers,
service companies, and finally end users in a given industry vertical, can scale up
the use and deployment where and when needed without jeopardizing prior
deployments. Wireless is instrumental when it comes to scalability!
Ubiquitous deployment and use are important since they allow reaching the critical
mass required to make technologies survive in the long term. A sparse or punctual use
of specific M2M technologies makes sense in the bootstrapping phase of the
market but is unsustainable in the long term.
Reliability, often overlooked, means the industry customer gets sensor readings
that can be relied on because the system has been calibrated, tested, and certified.
A prominent example is the vertical of urban parking, where Google’s
crowdsourced platform for obtaining parking occupancy was discontinued because
stakeholders in this space preferred reliable occupancy messages 24/7 from
certified sensors installed at each parking spot over unreliable crowdsourced data,
even when the latter was offered for free.
Finally, the big data dream can only materialize if heterogeneous data, i.e.,
data from different verticals, are combined to give unprecedented insights into
behavior, trends, and opportunities, and if these insights are then acted upon. In fact,
when referring to “big” in big data, it is less about the quantity of data (i.e., large
volumes of terabytes of information) and more about the quality of data (i.e., different
information streams giving a new comprehension of the field, where one stream
could be only a single bit).
The data being delivered by M2M systems are often referred to as the oil of the twenty-first
century. It might be coincidental, but the M2M data flow resembles the oil processing flow in
that it is composed of (i) upstream, (ii) mash-up, and (iii) downstream stages:
Upstream: Here, data are being collected from sensors in the field that deliver the data
wirelessly to a gateway via single-hop or mesh networking technologies. The data traverse
the Internet and reach dedicated servers or cloud platforms, where they are stored and
profiled for further analytics.
Downstream: Here, business intelligence is applied to the data and intelligent decisions
are taken. These are pushed back down to the customer, by either controlling some actuators,
displaying some information/alerts in control/service platforms, or simply informing users
about issues pertaining to this specific M2M application. Human–computer interfaces (HCI)
or even computer–computer interfaces (beyond the typical APIs) will be instrumental in
ensuring the offtake of M2M technologies.
Mash-up: The data processing and business intelligence platforms are likely the most
important constituent of emerging M2M systems. Data flows from machines, humans, and
the Internet in general will feed intelligent analytics engines, which are able to find trends,
optimize processes, and thus make the industry vertical more efficient and effective. These
three constituents are depicted in Figure 1.1.
Figure 1.1: Upstream (top), downstream (middle), and mash-up (bottom) components for
typical industrial M2M turnkey solutions.
As for wireless M2M technologies, the ecosystem has so far relied on ZigBee-like and
2G/3G cellular technologies; however, new players are entering the space, such as low-power
Wi-Fi, Bluetooth low energy, and proprietary cellular systems. The pros and cons of these are as
follows:
ZigBee-like technologies: IEEE 802.15.4 and derivatives were (and still are)
perceived as the holy grail for wireless sensor networking and M2M usage.
Indeed, with the latest IEEE 802.15.4e amendments, it has become a very energy-
efficient technology, even in the case of multi-hop mesh. However, its very
nature of providing fairly high data rates over short distances runs against the
essence of M2M, which mainly requires very low data rates over large distances.
The need for frequent repeaters/gateways and skilled engineers to handle
connectivity/radio planning, and the lack of a global critical mass when it comes to
coverage and deployment/adoption, have prevented the predicted growth and will
likely be the demise of the technology itself. ZigBee and derivatives have never
made the jump from being a technology to being a turnkey solution, i.e., a system
that is easy to use for customers whose core business is not dimensioning
wireless links. However, this community has achieved something that
none of the others have yet: they have penetrated the control community and were
able, despite clearly being technically inferior to any of the systems below,
to become the certified choice for wireless SCADA-like systems (see,
e.g., WirelessHART, ISA100.11a, and Wireless M-Bus).
Low-power Wi-Fi: An interesting contender in the M2M space is emerging in the
form of a well-tuned Wi-Fi system. Wi-Fi enjoys enormous popularity in both
the business-to-consumer (B2C) and B2B markets, with more than 2 billion access
points installed and a truly vibrant standardization ecosystem. There is Wi-Fi
coverage virtually anywhere it is worth taking M2M data. Interestingly,
it consumes significantly less energy when transmitting information. This is
mainly because data are transmitted in a single hop and only when needed, thus
requiring the radio to be switched on only for events or regular alive beacons. In
contrast, ZigBee-like technologies need to listen periodically for neighbors to
transmit data; in the end this consumes more energy than a simple single-hop
network. It is thus no surprise that chip manufacturing giants, like Broadcom,
have decided to ditch the IEEE 802.15.4 stack for the more promising IEEE
802.11 stack in their gamble for the IoT and M2M market.
Proprietary cellular: Cellular networks have the enormous advantage of covering
large areas to reach multiple M2M devices without the need for repeaters or
numerous base stations. A problem with current standardized cellular solutions
(2G/3G/LTE) is the fairly slow standardization cycle and high licensing costs
associated with enabling every sensor and actuator with such technology.
Proprietary solutions have thus emerged, which bridge this gap. Notable players
in the field are SIGFOX, Neul, Cycleo, On-Ramp, and lately also the chip giants
Texas Instruments (TI) and STMicroelectronics. These technologies are, in
general, fairly narrowband—thus enjoying very large link budgets—and connect
hundreds if not thousands of devices to a single base station over a large area.
While these solutions are technically well engineered, they are not deemed
future-proof since they are not standardized. Also, while coverage is guaranteed
by the said systems, mobility and roaming are not supported.
Standardized cellular: Standardized cellular M2M technologies will therefore be
a crucial constituent of the M2M landscape in the years to come. Cellular
systems enjoy worldwide coverage and interoperability, allowing M2M
devices to roam and be on the move. In addition, interference is handled by
centralized radio resource management (RRM) algorithms, which, to a large
extent, ensure that there are no coexistence, delay, or throughput problems. It is
thus no surprise that the major standardization bodies, such as 3GPP, IEEE, and
ETSI, have started standardizing M2M architectures and protocols.
Before standardized cellular technologies can be used in the M2M space, the following
technical requirements need to be addressed, which contrast completely the requirement on
human-driven communications:
There will be a lot of M2M nodes, i.e., orders of magnitude more than human
connections.
More and more applications are delay-intolerant, mainly control applications.
There will be little traffic per node, mainly in the uplink.
Nodes need to run autonomously for a long time.
Automated security and trust mechanisms need to be in place.
The difference between IoT and M2M can be baffling. In truth, the false impression that
M2M and IoT are identical has been a continuing subject of debate in the tech sphere, now
more than ever as each technology continues to evolve at break-neck speed.
Remote Access: M2M set the principles of device connectivity on which IoT stepped
forward; IoT development is largely built upon it. IoT is the larger vision of connectivity
that is fueled by the advances of M2M applications. M2M is, first and foremost, a business
solution that connects companies to their likewise connected machines. M2M revolutionizes
business operations by enabling companies to monitor and control their machines or pieces
of equipment remotely. The primary deliverable of M2M is to connect a device to the cloud
so that a business can manage that device and collect data from it remotely. IoT is a
mass-market technology that applies to both consumers and enterprises. Consumer IoT
connects people to their devices and, similarly, allows remote control over those devices.
Enterprise IoT, on the other hand, takes this further by facilitating asset tracking and
management.
Business Performance: A vital aspect of both technologies is their data-centricity.
Thanks to the connectivity they provide, data from these devices can be gleaned for
feedback on performance, user experience, and safety. For example, data from M2M
systems are used to find machine faults and to cut maintenance costs by removing the
need for constant manual maintenance. Certainly, the connectivity that both M2M and
IoT offer allows for the gathering of analytics and other valuable insights that companies
can leverage in the development of their systems. Undeniably, M2M and IoT share
common factors. The core similarity is that each offers remote access to machine data,
and each exchanges data between machines without human intervention. They are,
however, not synonymous. Here is where IoT and M2M diverge.
Connectivity: As the name implies, M2M connects machines to machines, while IoT takes
machine-to-machine connectivity, integrates web applications, and connects it to the cloud.
While IoT converges disparate systems into an expansive network to enable new
applications, M2M employs isolated clusters of sensors and islands of remotely
gathered and measured data. Simply put, IoT is more than machine connectivity;
it is the network of connected devices.
Scalability: Because of its cloud-based architecture, IoT is inherently more
scalable than M2M. A cloud-based architecture removes the need for additional
hard-wired connections and M2M SIM card installation.
The goals of the Web of Things (WoT) are to improve the interoperability and usability
of the Internet of Things (IoT). Through a collaboration involving many stakeholders over many
years, several building blocks have been identified that help address these challenges.
This specification is focused on the scope of W3C WoT standardization, which can be
broken down into these building blocks as well as the abstract architecture that defines how they
are related. The building blocks are defined and described in detail in separate specifications.
However, in addition to defining the abstract architecture and its terminology and conceptual
framework, this specification also serves as an introduction to the WoT building blocks and
explains their interworking.
This specification also covers non-normative architectural aspects and conditions for the
deployment of WoT systems. These guidelines are described in the context of example
deployment scenarios, although this specification does not normatively define specific concrete
implementations. This specification serves as an umbrella for W3C WoT specifications and
defines the basics such as terminology and the underlying abstract architecture of the W3C Web
of Things. In summary, the purpose of this specification is to provide:
a set of use cases (in Use Cases) that lead to the W3C WoT Architecture,
a set of requirements for WoT implementations (in Requirements),
a definition of the abstract architecture (in WoT Architecture),
an overview of a set of WoT building blocks and their interplay (in WoT Building
Blocks),
an informative guideline on how to map the abstract architecture to possible
concrete implementations (in Abstract Servient Architecture),
informative examples of possible deployment scenarios (in Example WoT
Deployments),
and a discussion, at a high level, of security and privacy considerations to be
aware of when implementing a system based on the W3C WoT architecture (in
Security and Privacy Considerations).
Terminology:
This specification uses the following terms as defined here. The WoT prefix is used to
avoid ambiguity for terms that are (re)defined specifically for Web of Things concepts.
Common Patterns:
This section introduces common use case patterns that illustrate how devices/things
interact with controllers, other devices, agents and servers. In this section, we use the term
client role as an initiator of a transport protocol, and the term server role as a passive
component of a transport protocol. This does not imply prescribing a specific role on any
system component.
A device can be in a client and a server role simultaneously. One example of this dual role
is a sensor that registers itself with a cloud service and regularly sends sensor readings to the
cloud. In the response messages, the cloud can adjust the transmission rate of the sensor's
messages or select specific sensor attributes that are to be transmitted in future messages.
Since the sensor registers itself with the cloud and initiates connections, it is in the 'client'
role. However, since it also reacts to requests that are transmitted in response messages, it
also fulfills a 'server' role. The following sections illustrate the roles, tasks, and use case
patterns with increasing complexity. They are not exhaustive and are presented to motivate
the WoT architecture and building blocks.
Web Thing:
A Web Thing has four architectural aspects of interest: its behavior, its Interaction
Affordances, its security configuration, and its Protocol Bindings, as depicted in Figure. The
behavior aspect of a Thing includes both the autonomous behavior and the handlers for the
Interaction Affordances. The Interaction Affordances provide a model of how Consumers
can interact with the Thing through abstract operations, but without reference to a specific
network protocol or data encoding. The protocol binding adds the additional detail needed to
map each Interaction Affordance to concrete messages of a certain protocol. In general,
different concrete protocols may be used to support different subsets of Interaction
Affordances, even within a single Thing. The security configuration aspect of a Thing
represents the mechanisms used to control access to the Interaction Affordances and the
management of related Public Security Metadata and Private Security Data.
Computers require very precise, rigidly defined rules or protocol for successful
communication. Even slight variations can render communication impossible. In order for two
computers using different protocols to communicate, some kind of translation must take place.
The device that performs this translation is called a gateway. Protocol translation, like language
translation, is an imperfect art. Concepts that are easily and clearly expressed in one language or
protocol may be difficult or even impossible to translate into another. Consider the English word
"snow." Native peoples of the arctic have many words in their languages that all translate to
snow.
All of the detail and nuance that was associated with the original word is lost when it is
translated to snow. For the English speaking person this may not be much of a problem because
the lost details may not have been useful information. However, when the word "snow" is
translated in the reverse direction the problem is more difficult. Much more detail is expected.
Which possibility is the best match for the current context? What problems might be caused by
choosing the wrong word? These issues must be addressed when designing gateways. How will
the gateway handle differences in the way concepts such as trends, schedules, alarms, and
prioritized commands are represented and manipulated? How will missing details be filled in
without causing unintended consequences? These difficulties often place severe limitations on
the kind of information that can be transferred. The result is often connectivity on a limited scale.
Some control system manufacturers use gateways within their own product line even though they
are using their own proprietary protocols.
A typical example would be field panel controllers and workstations that communicate
using one protocol while the field panels communicate with unitary controllers that they
supervise via a second protocol. The field panel in this example is also a gateway. Since the
same company designs both protocols and they were intended to be used together, the semantic
translation issues are avoided. Typically, translation problems occur only when trying to connect
products made by different manufacturers or, in some cases, different versions of products made
by the same manufacturer. The example shows that gateways may be stand-alone devices or be
included as part of a device that has other control functionality. Gateways are sometimes
confused with routers because gateways also perform routing tasks in addition to translation. The
distinction between a gateway and a router is important.
A router may be used to connect similar or dissimilar networks. If the networks are
similar the router merely forwards the message, changing only the address so that it will go to the
next stop. For the case of dissimilar networks the situation is slightly more complicated. This
case is illustrated in figure 1 for two BACnet local area network (LAN) options. In Figure 1, a
message originating with a device on a Master-Slave/Token-Passing (MS/TP) LAN passes
through a router in order to reach its destination, a workstation residing on an ISO 8802-3 or
"Ethernet" LAN. The content of the message and the binary representation of the message do not
change as it passes through the router. However, some other changes do take place when the
router forwards the message.
Gateway Benefits:
In some situations, a gateway can provide connectivity between devices that would not be
possible any other way. If this connectivity is vital to the building automation system, this factor
can be so critical that it outweighs any negative factors. This is one reason why an initial critical
analysis to determine exactly what information needs to be exchanged is so important. Can the
gateway provide all of the information needed? If not, is the information that it provides
important enough to proceed with the gateway? In some circumstances, options are available that
do not involve a gateway at all.
These options may be available from only a single source or a very limited number of
sources. In this case it may be beneficial to consider allowing gateways simply to create a
competitive bidding situation. The financial savings resulting from competition may make any
limitations of the gateway worthwhile. This approach may permit the involvement of a vendor in
the process with whom you have a very good business or service relationship. This business
relationship may be important enough during the life cycle of the system to make it worthwhile
to accept any limitations that the gateway imposes. Similarly, a vendor’s product may have some
unique feature that is very desirable. It may be beneficial to permit the use of a gateway just to
get this feature.
Traditionally, building automation functions such as lighting control, HVAC control, and
life safety systems have been isolated from each other. Sometimes hard-wired connections
between an HVAC control system and a life safety system exist that permits monitoring of some
limited status information. A growing consensus in the building industry indicates that important
benefits may be obtained through more complete integration of these systems. However,
complete integration may cause concern because it could possibly have an adverse impact on the
ability to guarantee the integrity of critical life safety systems. A gateway provides an
opportunity to meet both goals by significantly improving the ability to integrate and manage the
systems at the same time retaining some isolation because the gateway limits the interaction. In
some cases this may be important for meeting code requirements.
A policy decision might be made to require the use of standard communication protocols,
like BACnet, for all future purchases at a facility. This can be a prohibitively expensive change if
the entire building automation system must be replaced at one time. Gateways provide a way to
retain installed proprietary equipment that is still functioning satisfactorily and to migrate
towards standards-based products in a step-by-step manner as equipment reaches the end of its
useful life. These potential benefits can make gateways desirable. It is important to keep in mind
that all gateways impose limitations. A building automation system with gateways is more
complicated to design and maintain than one where all of the components communicate using the
same protocol. There is a cost to using gateways that must be balanced against the potential
benefits.
Gateway Limitations:
All gateways have finite information storage and processing resources. The gateway must
somehow map the information and concepts from one protocol to the other. This requires
translation algorithms for each kind of information that needs to be translated and tables that
provide mapping information. Sometimes the gateway stores a local copy of the information
from one network in a form that makes it easily accessible to devices on the other network. The
computer resources necessary to implement this process vary from case to case. In general, both
the number of different types of information that are to be translated and the number of
occurrences of each type, affect the amount of memory and processor time needed. The more
information exchange your application requires, the more probable it is that a gateway will not
have the capacity to do the job.
However, problems may arise for more complex concepts such as schedules, alarms,
prioritized commands, and trending data. Different approaches to handling these building
automation functions make gateway design and implementation very difficult. In some cases one
or more of these functions will be simply impossible to do through the gateway. Many control
systems provide a way to program or configure controllers over the network. It is usually not
possible to accomplish these tasks by passing messages through a gateway. Instead, a separate
connection that bypasses the gateway is usually required. Gateways can also introduce time
delays when attempting to retrieve information. In addition to the processing time required to do
the translation, the connected networks may have different transmission speeds and different
rules for gaining access to the media that also contribute to the delay.
Sometimes gateways are designed only to pass a limited amount of very specific
information. The gateway may poll devices for this information and store a local copy. When a
request for this information is received the response is prepared by the gateway based on the
local copy of the data. This approach can speed up the response but it does not scale easily to
large systems. It can also result in returning old and misleading data. What happens, for example,
if the proprietary communication fails temporarily and the stored data cannot be updated? Does
the gateway pass the previously stored data that is now obsolete? Will the recipient be able to tell
the difference? Gateways make troubleshooting network problems more difficult.
Different tools are needed to see and interpret the protocols on both sides of the gateway.
Any ambiguity introduced by the gateway’s imperfect translation introduces the possibility of
error. Even the limitation on the amount of information accessible through the gateway can make
troubleshooting difficult, because it may not be possible to access all of the information needed
to diagnose the problem. It should be clear that, under the right circumstances, a gateway is very
useful. However, it comes with a price. A wrench is very useful and effective tool to tighten a
nut. It is not a particularly helpful tool to drive a nail, even though with persistence it may be
used for that purpose. A gateway is a useful and effective tool to get limited connectivity
between otherwise incompatible devices. As the amount and variety of information that needs to
be exchanged grows, the viability of using a gateway to accomplish the task diminishes.
In some circumstances, a gateway is the only option for a critical integration and is
definitely worthwhile. In other cases, a gateway is acceptable because the advantages and
disadvantages roughly balance each other. However, sometimes a gateway is an unnecessary
liability and should not be used. The key point is that no two jobs are the same and the costs and
benefits must be weighed on a case-by-case basis. The most important part of the process is to
know where to look for the benefits, as well as the trouble, so that an informed decision can be
made. First, determine what information needs to be exchanged and prioritize the list. Then,
request detailed information about the gateway design from the vendor so that the potential
benefits and weaknesses can be evaluated.
Questions to Practice:
PART A
PART B
References:
2.1 Communication Models:
The concept of combining computers, sensors, and networks to monitor and control
devices has been around for decades, but the recent confluence of key technologies and market
trends is bringing about a new reality for the “Internet of Things”. The main vision of IoT is to
usher in a revolutionary, fully interconnected “smart” world, with relationships between
objects and their environment, and between objects and people, becoming more tightly
intertwined. However, as systems grow more complex, a number of potential challenges may
stand in the way of this vision, particularly in the areas of security, privacy, interoperability and
standards, legal, regulatory, and rights issues, as well as the inclusion of emerging
economies. Clearly, the Internet of Things involves a complex and evolving set of
technological, social, and policy considerations across a diverse set of stakeholders.
While the end user benefits from effective communication models, it should be
mentioned that effective IoT communication models also enhance technical innovation and
open opportunities for commercial growth. New products and services can be designed to
take advantage of IoT data streams that didn’t exist previously, acting as a catalyst for
further innovation. The fact is that IoT devices will be found everywhere and will enable
ambient intelligence in the future. Therefore, the overview of the various communication
models used in IoT, given below, is of great value. The IoT allows people and things to be
connected anytime, anyplace, with anything and anyone, ideally using any path/network and
any service. From an operational perspective, it is useful to think about how IoT devices
connect and communicate in terms of their technical communication models before making
design decisions:
By enabling the user to achieve better access to an IoT device and its data, the overall
value of the device is clearly amplified, however, these networked benefits come with trade-offs.
Careful consideration needs to be paid to the incurred cost burdens placed on users to connect to
cloud resources when considering an architecture, especially in regions where user connectivity
costs are high.
Request-Response Model:
HTTP works as a request-response protocol between a client and server. A web browser
may be the client, and an application on a computer that hosts a web site may be the server.
Example: A client (browser) submits an HTTP request to the server; then the server
returns a response to the client. The response contains status information about the request and
may also contain the requested content.
Figure 2.1: Request Response Communication Model
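As a minimal sketch of this pattern (in Python, using the standard library's urllib module; the URL shown is purely illustrative), a client issues a request and then reads the status code and content returned by the server:

from urllib.request import urlopen

# The client submits an HTTP GET request; the server returns a response
# carrying a status code and, optionally, the requested content.
with urlopen("http://example.com/") as response:   # illustrative URL
    print("Status:", response.status)              # e.g. 200 (OK)
    body = response.read()                         # the requested content (bytes)
    print("Received", len(body), "bytes")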
The WebRequest and WebResponse classes contain methods and properties that apply
generically across the different supported protocols and enable applications to download and
upload data given a specific URI and to specify authentication details, proxy information, and
content details such as the type and size of the content. An asynchronous pattern is also provided
for any operation that could be blocking on network activity. Extensibility is achieved through
the WebRequest.RegisterPrefix method or through entries in the System.Net section of the
application-specific or machine-wide configuration file.
Request-response model helper classes perform a specific function that’s usually
common to the different protocols that implement WebRequest and WebResponse. WebProxy
and WebException are both examples of such helper classes. Another example of a request-
response model helper is the WebPermission class. WebPermission is used to define code access
security privileges for classes that implement the WebRequest and WebResponse base classes.
Privileges that are controlled through WebPermission include the ability to access a specific
URI, a wildcard set of URIs as supplied by a regular expression, or all URIs as identified by the
System.Security.Permissions.PermissionState enumeration. Finally, the WebClient class
provides a number of task-focused methods that perform operations that are commonly useful to
applications that deal with resources on the Web. Here are the operations available as one-line
method calls with WebClient:
Because the WebClient class uses the WebRequest and WebResponse classes in its
implementation, it works with any protocol handler that has been registered into the request-
response model.
Publish-Subscribe Model:
Queue: A queue is an entity that supports the notion of named subjects of interest. Queues
can be characterized as:
2.2 Telemetry:
Today’s telemetry systems are built from commercial-off-the-shelf (COTS) products. But
while they all have many common elements, they are each uniquely configured to meet
specific application requirements. A telemetry system is often viewed as two components,
the Airborne System and the Ground System. In actuality, either or both may be in the air or
on the ground.
Data acquisition begins when sensors (aka, transducers) measure the amount of a
physical attribute and transform the measurement to an engineering unit value. Some sensors
produce a voltage directly (thermocouples for temperature or piezoelectric strain gages for
acceleration), while others require excitation (resistive strain gages, potentiometers for
rotation, etc.). Sensors attached to signal conditioners provide power for the sensors to
operate or modify signals for compatibility with the next stage of acquisition. Since
maintaining a separate path for each source is cumbersome and costly, a multiplexer
(historically known as a commutator) is employed. It serially measures each of the analog
voltages and outputs a single stream of pulses, each with a voltage relative to the respective
measured channel. The rigorous merging of data into a single stream is called Time Division
Multiplexing or TDM.
The scheme where the pulse height of the TDM stream is proportional to the measured
value is called Pulse Amplitude Modulation (PAM). A unique set of synchronization pulses
is added to identify the original measurands and their value. PAM has many limitations,
including accuracy, constraints on the number of measurands supported, and the poor ability
to integrate digital data.
Pulse Code Modulation (PCM) is today’s preferred telemetry format for the same reasons
that PAM is inadequate. Accuracy is high, with resolution limited only by the analog to
digital converter (ADC), and thousands of measurands can be acquired along with digital
data from multiple sources, including the contents of the computer’s memory and data buses.
In a PCM-based system, the original PAM multiplexer’s analog output is digitized to a
parallel format. The Output Formatter merges this with synchronization data for measurand
identification, plus other sources of digital data. The Output Formatter then serializes
the composite parallel data stream to a binary string of pulses (1’s and 0’s) for transmission
on copper wire, fiber cable, or "the ether." All components from after the sensor to the
formatter comprise the encoder (see figure below). Other, often remote, encoders are used to
multiplex additional sensor data into the main encoder’s output. Not only does this expand
the number of measurands to thousands per stream, but it also eliminates the weight of cables
required for each sensor.
The output of the main encoder is filtered and transmitted via radio transmitter and
antenna, coax cable, telephone line, tape recorder, etc. Filtering rounds or smooths the
square data pulses to reduce frequency content and thus the required transmitter bandwidth.
At the Ground Station, the received data stream is amplified. Since the transmission path
often distorts the already rounded signal, a bit synchronizer reconstructs it to the original
serial square wave train. Then, a decommutator or decom (similar to that found in L-3’s
Visual Test System or System 550) recognizes the synchronization pattern and returns the
serial digital stream to parallel data. The decom also separates the PCM stream into its
original measurands (also known as prime parameters) and data.
The computer (in the Visual Test System) or the telemetry front end (System 550) selects
prime parameters for real-time processing; archiving to disk or tape; display; output to strip chart
recorders and annunciators; or distribution to other computing resources according to the test
plan.
L-3 Telemetry-West and its sister divisions manufacture virtually the entire telemetry
system — from signal conditioners to antennas for the Airborne System and from antennas to
telemetry receivers for the Ground System. The following table breaks down which L-3 divisions
provide what for today's telemetry system requirements. Just remember that when you need to
put everything together, L-3 Telemetry-West will help specify, integrate, and install all L-3 and
third-party components for a total telemetry solution.
Airborne Systems:
Data Acquisition: A wide variety of sensors (also known as transducers) are used to measure and
acquire a physical property’s value. It is up to the instrumentation engineer to select the device to
meet the environmental, response, accuracy, size, and cost specifications for the application.
Signal conditioners serve as the interface between the transducers and the rest of the data
acquisition system.
Many transducers require ac or dc power (e.g., thermistors, strain gages, and linear variable
differential transformers - LVDTs) while others generate signals (tachometers, thermocouples,
and piezoelectric strain gages). They provide excitation (power), network calibration, signal
amplification, and filtering.
A system under test may be subjected to known physical characteristics and the output
measured to ascertain and verify the relationship between the sensor and its output. For example,
when on the ground, an airplane’s flaps may be moved at known angles, while measurements are
taken on sensor or airborne system output. The plot of angle vs. output will be used by the
ground system for real-time data display in engineering units.
Today, the most popular form of telemetry multiplexing (originally called commutation,
as in an electric motor’s commutator) is TDM. Here, each channel is serially sampled for an
instant by the multiplexer.
When all channels have been sampled, the sequence restarts at the first channel. Thus,
samples from a particular channel are interleaved in time between samples from all of the other
channels. An example of a simple interleaved data stream is shown below as the output of the
commutator. Since no measurand is monitored continuously, sampling must be accomplished
fast enough so that the value of each measurand does not change significantly between samples.
In practice, each parameter is measured at a rate of three to five times the highest frequency of
interest. There is a definite science to selecting efficient sampling rates and measurand positions.
Commutation: A complete scan by the multiplexer (one revolution of the commutator) produces a
frame of the stream of words containing the value of each measurand. Every scan produces the
same sequence of words. Only the value of a measurand is captured, not its address (name). If
only the measurand’s data is captured, there is no way to distinguish the owner of one value from
the next. Thus, a unique word called the frame sync is added at the end of each frame to serve as
a reference for the process of decommutating the stream’s data (i.e., extracting it into individual
measurand values).
In a simple commutator, each data word is sampled once per revolution at a rate
compatible with the measurand with the fastest changing data. Since the rate of change of a
measurand's value varies tremendously, the sampling frequency rate must accommodate it. As an
example, characterizing vibration requires many more samples per second (thousands) than
temperature (fractions). According to the Nyquist theorem, you must sample data at no less than
twice the maximum frequency component of the signal to be acquired. Sampling rates of 5 times
the maximum frequency component are typical.
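As a small worked illustration of these rates (the 2 kHz figure is an assumed example, not taken from the text):

f_max = 2000.0                  # highest frequency component of interest, in Hz (assumed example)
nyquist_minimum = 2 * f_max     # Nyquist theorem: absolute minimum sampling rate (4,000 samples/s)
rule_of_thumb = 5 * f_max       # the 5x guideline quoted above (10,000 samples/s)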
A low pass filter is used to eliminate any frequencies that you cannot accurately digitize
to prevent aliasing. If we were to take a worst-case approach to sampling all measurands at the
highest rate, we could expect much waste in carrier frequency spectrum and power. Sampling
rates should therefore vary with respect to frequency content and be somewhat independent of
other measurands with different periodic acquisition rates. Highly sampled measurands are
super-commutated with multiple occurrences of the measurand in each frame.
Frame Synchronization Pattern: Identifying the end of each minor frame period is the
synchronization (sync) word, which is a unique sequence of 1's and 0's. The pattern is generally a
pseudo-random sequence that is unlikely to occur randomly in the acquired data and usually
occupies two words (or more) in the minor frame. The IRIG-106 Standard lists recommended
patterns for lengths 16 through 33 bits. The first three bits transmitted in a frame sync pattern are
always a "1," regardless of LSB or MSB alignment.
Figure 2.8: Frame Synchronization Pattern
The length of the frame sync is longer than usual data words to reduce the probability of
actual data matching it. The frame sync should also be commensurate with the number of words
in the minor frame (typically, it occupies 1 to 5 percent of the total minor frame). An identical
pattern is repeated for every minor frame on the assumption that random data will not
consistently match the defined pattern. The decommutator can then be programmed to lock onto
this pattern to begin regenerating the original commutated measurands.
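The commutation, frame sync, and decommutation ideas above can be sketched in a few lines of Python. This is a toy illustration only: the two-word sync pattern, the channel values, and the channel count are made up and are not taken from IRIG-106.

FRAME_SYNC = [0xEB, 0x90]          # made-up two-word sync pattern marking the end of a minor frame

def commutate(channels):
    """Serially sample each channel once and append the frame sync word."""
    return list(channels) + FRAME_SYNC

def decommutate(stream, n_channels):
    """Scan for the sync pattern and extract the preceding measurand values."""
    frames = []
    i = 0
    while i + n_channels + len(FRAME_SYNC) <= len(stream):
        frame = stream[i:i + n_channels + len(FRAME_SYNC)]
        if frame[n_channels:] == FRAME_SYNC:        # sync found: the first words are the measurands
            frames.append(frame[:n_channels])
            i += n_channels + len(FRAME_SYNC)
        else:
            i += 1                                   # slide until the decom locks onto the pattern
    return frames

# Two scans of three measurands (e.g. temperature, pressure, vibration), interleaved in time:
stream = commutate([17, 203, 96]) + commutate([18, 199, 95])
print(decommutate(stream, 3))      # [[17, 203, 96], [18, 199, 95]]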
JavaScript origin:
JSON originates from the object literals of JavaScript (ECMAScript) and is specified in
RFC 7159 (The JavaScript Object Notation (JSON) Data Interchange Format). In fact, JSON
is referenced in ECMA-262 (see also ECMA-404, The JSON Data Interchange Format).
Because the JSON syntax originates in JavaScript, it is possible to use eval in JavaScript to
parse a JSON document. Of course, this poses a security risk (like that of SQL injection) if
the JSON document contains executable code.
Tokens:
The grammar of JSON has only six structural characters: { } [ ] : and , . Besides these,
there are three other kinds of tokens:
strings (enclosed in "), each a sequence of Unicode characters
numbers
literals (true, false, or null)
Primitive data types (scalars): JSON specifies four primitive data types: strings, numbers,
booleans (true/false), and null. Each scalar value has one of these data types.
Structured types:
The primitive data types are building blocks to create structured types. There are
two structured types:
objects: unordered sets of name/value pairs
arrays: ordered lists of values (whose data types need not be the same)
The terms object and array have their origins in JavaScript (see JavaScript objects and arrays).
Objects:
JSON objects are enclosed in curly braces ({ … }). A JSON object consists of comma-separated
key-value pairs. The key is a string and is separated from the value by a colon (:).
These key-value pairs are referred to as members.
{
  "item": "Lemon",
  "price": {
    "val": 1.84,
    "currency": "CHF"
  }
}
Arrays:
Arrays are enclosed in square brackets: [ … ]. An array is an ordered, comma-separated
list of JSON values.
[ "foo",
42,
null,
{ "foo": "bar" }
]
Values:
A value is either a primitive or a structured type.
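As a small sketch, the object and array examples above can be handled with Python's standard json module (variable names are arbitrary):

import json

document = '{"item": "Lemon", "price": {"val": 1.84, "currency": "CHF"}}'
data = json.loads(document)                # JSON object -> Python dict
print(data["price"]["val"])                # 1.84
print(json.dumps(["foo", 42, None, {"foo": "bar"}]))   # arrays map to lists; null maps to None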
Questions to Practice:
Part-A
Part-B
1. Analyze the Performance of Telemetry in Communication Patterns
2. Categorize Inquiry and Status in Communication Patterns
4. Classify the concepts of Primitive Data Types and Structural types in JSON
References:
3.1 Introduction:
We have discussed various telecommunication mediums and technologies to move that
data from the WPAN, WLAN, and WAN. There are many complexities and subtleties in
building and bridging these network connections from non-IP-based PAN networks to IP-
based WAN networks. There are also protocol conversions that need to be understood.
Standard protocols are the tools that bind and encapsulate raw data from a sensor to
something meaningful and formatted for the cloud to accept. One of the principal differences
between an IoT system and an M2M system is that M2M may communicate over a WAN to a
dedicated server or system with no encapsulated protocol whatsoever. The IoT, by definition,
is based on communication between endpoints and services, with the internet being the
common network fabric.
IoT Protocols:
A natural question is, why are there any protocols outside of HTTP to transport data
across the WAN? HTTP has provided significant services and abilities for the internet for
over 20 years, yet was designed and architected for general purpose computing in
client/server models. IoT devices can be very constrained, remote, and bandwidth limited.
Therefore, more efficient, secure, and scalable protocols are necessary to manage a plethora
of devices in various network topologies such as mesh networks.
In transporting data to the internet, designs are relegated to the TCP/IP foundation layers.
TCP and UDP are the obvious and only choices for data communication, with TCP
being significantly more complex in its implementation than UDP, a connectionless
protocol that also supports multicast. UDP, however, does not have the stability and
reliability of TCP, forcing some designs to compensate by adding resiliency in the
application layers above UDP. Many of the protocols listed in this chapter are Message
Oriented Middleware (MOM) implementations.
The basic idea of a MOM is that communication between two devices takes place using
distributed message queues. A MOM delivers messages from one userspace application to
another. Some devices produce data to be added to a queue while others consume data stored
in a queue. Some implementations require a broker or middleman be the central service. In
that case, producers and consumers have a publish and subscribe-type relationship with the
broker. AMQP, MQTT, and STOMP are MOM implementations; others include CORBA
and Java messaging services. A MOM implementation using queues can use them for
resilience in design. Data can persist in the queues, even if the server fails.
MQTT:
The IBM WebSphere Message Queue technology was first conceived in 1993 to address
the problem of allowing independent and non-concurrent distributed systems to communicate
securely. A derivative of the WebSphere Message Queue was authored by Andy Stanford-Clark
and Arlen Nipper at IBM in 1999 to address the constraints of connecting remote oil and gas
pipelines over a satellite connection. That protocol became known as MQTT.
MQTT was an internal and proprietary protocol for IBM for many years until being
released in version 3.1 in 2010 as a royalty-free product. In 2013, MQTT was standardized
and accepted into the OASIS consortium. In 2014, OASIS released it publicly as version
MQTT 3.1.1. MQTT is also an ISO standard (ISO/IEC PRF 20922).
MQTT publish-subscribe:
While client-server architectures have been the mainstay for data center services for
years, publish-subscribe models represent an alternative that is useful for IoT applications.
Publish-subscribe, also known as pub/sub, is a way to decouple a client transmitting a message
from another client receiving a message. Unlike a traditional client-server model, the clients are
not aware of any physical identifiers such as the IP address or port. MQTT is a pub/sub
architecture but is not a message queue. Message queues by nature store messages, while MQTT
does not. In MQTT, if no one is subscribing (or listening) to a topic, the message is simply
ignored and lost. Message queues also maintain a client-server topology where one consumer is
paired with one producer.
Subject filtering: By design, clients subscribe to topics and certain topic branches
and do not receive more data than they want. Each published message must
contain a topic and the broker is responsible for either re-broadcasting that
message to subscribed clients or ignoring it.
Content filtering: Brokers have the ability to inspect and filter published data.
Thus, any data not encrypted can be managed by the broker before being stored or
published to other clients.
Type filtering: A client listening to a subscribed data stream can apply their own
filters as well. Incoming data can be parsed and the data stream either processed
further or ignored.
MQTT may have many producers and many consumers, as shown in the following figure:
Figure 3.2: MQTT Publish-Subscribe Topology
MQTT successfully decouples publishers from consumers. Since the broker is the
governing body between publishers and consumers there is no need to directly identify a
publisher and consumer based on physical aspects (such as an IP address). This is very useful
in IoT deployments as the physical identity may be unknown or ubiquitous. MQTT and other
pub/sub models are also time-invariant. This implies that a message published by one client
may be read and responded to at any time by a subscriber. A subscriber could be in a very
low power/bandwidth-limited state (for example, Sigfox communication) and respond
minutes or hours later to a message. Because of the lack of physical and time relationships,
pub/sub models scale very well to extreme capacity.
Cloud-managed MQTT brokers typically can ingest millions of messages per hour and
support tens of thousands of publishers. MQTT is data format agnostic. Any type of data may
reside in the payload, which is why publishers and subscribers must both understand
and agree on the data format. It is possible to transmit text messages, image data, audio data,
encrypted data, binary data, JSON objects, or virtually any other structure in the payload.
JSON text and binary data, however, are the most common payload types. The
maximum allowable packet size in MQTT is 256 MB, which allows for an extremely large
payload.
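As a minimal sketch of this decoupled pub/sub exchange, the following Python snippet uses the third-party paho-mqtt package (1.x client API); the broker address, topic, and JSON payload are illustrative assumptions, not values from the text:

import json
import paho.mqtt.client as mqtt   # third-party package: paho-mqtt

def on_message(client, userdata, msg):
    # The payload format is whatever publisher and subscriber have agreed on; JSON here.
    print(msg.topic, json.loads(msg.payload))

subscriber = mqtt.Client()
subscriber.on_message = on_message
subscriber.connect("broker.example.com", 1883)      # illustrative broker address
subscriber.subscribe("US/Wisconsin/Milwaukee/temperature", qos=1)
subscriber.loop_start()                              # service the network in a background thread

publisher = mqtt.Client()                            # the publisher never learns who subscribes
publisher.connect("broker.example.com", 1883)
publisher.publish("US/Wisconsin/Milwaukee/temperature",
                  json.dumps({"value": 21.5, "unit": "C"}), qos=1)

Note that it is the broker, not the publisher, that forwards the message to every client subscribed to the topic.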
MQTT architecture:
MQTT by name is a misnomer. There are no message queues inherent in the protocol.
While it is possible to queue messages, it's not necessary and often not done. MQTT is TCP
based, and therefore includes some guarantee that a packet is transferred reliably. MQTT is also
an asymmetric protocol, whereas HTTP is a symmetric protocol. Say node A needs to
communicate with node B.
An asymmetric protocol between A and B would only require one side (A) to use the
protocol, however, all information needed for the reassembly of packets must be contained in the
fragmentation header sent by A. Asymmetric systems have one master and one slave (FTP being
a classic example). In a symmetric protocol, both A and B would have the protocol installed. A
or B can assume the role of master or slave (telnet being a primary example). MQTT has distinct
roles, which make sense in a sensor/cloud topology.
MQTT defines an optional facility called Last Will and Testament (LWT). An LWT is a
message a client specifies during the connect phase. The LWT contains the Last Will topic, QoS,
and the actual message. If a client disconnects from a broker connection ungracefully (for
example, keep-alive timeout, I/O error, or the client closing the session without a disconnect), the
broker is then obligated to broadcast the LWT message to all other subscribed clients to that
topic. Even though MQTT is based on TCP, connections can still be lost, especially in a wireless
sensor environment.
A device may lose power, lose signal strength, or simply crash in the field, and a session
will enter a half-open state. Here, the server will believe the connection is still reliable and
expect data. To remedy this half-open state, MQTT uses a keep-alive system. Using this system,
both the MQTT broker and client have assurance that the connection is still valid even if there
hasn't been a transmission for some time. The client sends a PINGREQ packet to the broker,
which in turn acknowledges the message with a PINGRESP. A timer is preset on the client and
broker side. If a message has not been transmitted from either within a predetermined time limit,
a keep-alive packet should be sent. Both a PINGREQ or a message will reset the keep-alive
timer. If a keep-alive is not received and the timer expires, the broker will close the connection
and send out the LWT packet to all clients. The client may at some point later attempt to
reconnect. In that case, the broker will close the half-open connection and open a new
connection to the client.
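Continuing the same hedged paho-mqtt sketch, both the LWT and the keep-alive interval are supplied by the client at connect time (topic name and timings are illustrative assumptions):

client = mqtt.Client()
# Last Will and Testament: stored by the broker and published to subscribers of the
# given topic only if this client disconnects ungracefully (e.g. keep-alive timeout, I/O error).
client.will_set("sensors/node42/status", payload="offline", qos=1, retain=True)
client.connect("broker.example.com", 1883, keepalive=60)   # PINGREQ/PINGRESP expected within 60 s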
QoS in MQTT is defined and controlled by the sender and each sender can have a
different policy. Typical use cases:
QoS-0: Should be used when message queuing isn't needed. It is best for wired
connections or when the system is severely constrained on bandwidth.
QoS-1: This should be the default usage. QoS-1 is much faster than QoS-2 and greatly
reduces transmission cost.
QoS-2: For mission-critical applications. Also, cases where retransmission of
duplicate message could cause faults.
An MQTT packet sits on top of the TCP layer of the OSI model network stack. The
packet consists of a 2-byte fixed header, which must always be present, an optional
variable-sized header, and concludes with an optional payload:
Figure 3.3 MQTT Packet Structure
A communication link using MQTT starts with the client sending a CONNECT message
to a broker. Only a client can initiate a session and no client may communicate directly with
another client. The broker will always respond back to a CONNECT message with a
CONNACK response and status code. Once established, the connection remains open. The
following are the MQTT messages and formats:
CONNECT format (client to server): A typical CONNECT message will contain the
following (only the clientID is required to initiate a session):
CONNECT return codes (server to client): The broker will respond to a CONNECT
message with a response code. A designer should be aware that not all connections may
be approved by a broker.
PUBLISH format (client to server): At this point, a client may publish data to a topic
branch.
SUBSCRIBE format (client to server): The payload of a subscribe packet includes at
least one pair of UTF-8 encoded topicIDs and QoS levels.
Wildcards can be used to subscribe to multiple topics in a single message. For these
examples, the topic will be the full path of
"{country}/{states}/{cities}/{temperature,humidity}".
+ Single level wildcard: Substitutes a single level in a topic string name. For example,
US/+/Milwaukee will substitute the state level and replace it with all 50 states, Alaska to
Wyoming.
# Multi-level wildcard: Replaces multiple levels rather than a single level. It is always the
last character in the topic name. For example, US/Wisconsin/# will subscribe to all
Wisconsin cities: Milwaukee, Madison, Glendale, Whitefish Bay, Brookfield, and so on
(see the short sketch after this list).
$ Special topics: This is a special statistical mode for MQTT brokers. Clients cannot
publish to $ topics. There is no official standard for use at the moment. One model uses
$SYS in the following manner: $SYS/broker/clients/connected.
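With the same illustrative topic layout, wildcard subscriptions in the earlier paho-mqtt sketch would look like this:

# "+" matches exactly one topic level; "#" matches all remaining levels.
client.subscribe("US/+/Milwaukee/temperature")   # Milwaukee temperature, whatever the state level holds
client.subscribe("US/Wisconsin/#")               # every city and measurement under Wisconsin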
MQTT-SN:
In MQTT-SN, gateways can take on two distinct roles. First, a transparent gateway will
manage many independent MQTT-SN streams from sensor devices and convert each stream into
an MQTT message. An aggregating gateway will coalesce a number of MQTT-SN streams into a
reduced number of MQTT streams sent to the cloud MQTT broker. An aggregating gateway is
more complex in design but will reduce the amount of communication overhead and number of
simultaneous connections left open on the server. For an aggregating gateway topology to
function, the clients need to publish or subscribe to the same topic.
Gateway advertisement and discovery:
Since the topology of MQTT-SN is slightly more complicated than MQTT, a discovery
process is used to establish routes through multiple gateways and forwarder nodes. Gateways
joining an MQTT-SN topology start by binding to an MQTT broker first. Afterwards, it may
issue an ADVERTISE packet to connected clients or other gateways. Multiple gateways can
exist on the network, but a client may only connect to a single gateway. The client has the
responsibility of storing a list of active gateways and their network addresses. This list is
constructed from the various ADVERTISEMENT and GWINFO messages being broadcast.
Because of the new types of gateways and topologies in MQTT-SN, several new messages are
available to assist in discovery and advertising:
There are three CONNECT messages in MQTT-SN versus one in MQTT. The
additional two are used to transport the Will topic and Will message explicitly.
MQTT-SN can run over simplified medium and UDP.
Topic names are replaced by short, two-byte long topic ID messages. This is to
assist with bandwidth constraints in wireless networks.
Pre-defined topic IDs and short topic names can be used without any registration.
To use this feature, both the client and server need to use the same topic ID. Short
topic names are short enough to be embedded in the PUBLISH message.
A discovery procedure is introduced to assist clients and to allow them to find the
network addresses of servers and gateways. Multiple gateways may exist in the
topology and can be used to load share communication with clients.
The cleanSession is extended to the Will feature. A client subscription can be
retained and preserved, but now the Will data is also preserved.
A revised keepAlive procedure is used in MQTT-SN. This is to support sleeping
clients, in which all messages destined for them are buffered by a server or
edge router and delivered upon awakening.
The Constrained Application Protocol (CoAP) is a product of the IETF. The IETF Constrained
RESTful Environments (CoRE) working group worked on the protocol for several years and published
the core specification as RFC 7252 in June 2014. It is specifically intended as a communication
protocol for constrained devices. The protocol is unique in that it is tailored first for
machine-to-machine (M2M) communication between edge nodes. It also supports mapping to HTTP
through the use of proxies; this HTTP mapping is the built-in facility for getting data across
the internet. CoAP is excellent at providing a structure of resource addressing familiar to
anyone with experience of using the web, but with reduced resource and bandwidth demands.
Additionally, some implementations of CoAP perform up to 64x better than HTTP equivalents on
similar hardware.
CoAP architecture:
CoAP is based on the concept of mimicking and replacing heavy HTTP abilities and
usage with a lightweight equivalent for the IoT. It is not a full HTTP replacement, as it lacks
some features; full HTTP remains suited to more powerful, service-orientated systems.
CoAP features can be summarized as follows:
HTTP-like
Connection-less protocol
Security through DTLS rather than TLS in a normal HTTP transmission
Asynchronous message exchanges
Lightweight design and resource requirements and low header overhead
Support for URI and content-types
Built upon UDP, versus TCP for a normal HTTP session
A stateless HTTP mapping allowing for proxies to bridge to HTTP sessions
CoAP shares context, syntax, and usage similar to HTTP. Addressing in CoAP is also
styled like HTTP, with an address extending to the URI structure. As in HTTP URIs, the user must
know the address beforehand to gain access to a resource. At the top-most level, CoAP uses
requests such as GET, PUT, POST, and DELETE, as in HTTP, and responds with codes such as:
2.01: Created
2.02: Deleted
2.04: Changed
2.05: Content
4.04: Not found (resource)
4.05: Method not allowed
The form of a typical URI in CoAP would be:
coap://host[:port]/[path][?query]
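A minimal request sketch is shown below; it assumes the third-party Node.js "coap" package,
and the host and resource path are purely illustrative:

const coap = require('coap');

// GET coap://coap.example.net/temperature
const req = coap.request('coap://coap.example.net/temperature');

req.on('response', (res) => {
  // res.code carries the CoAP response code, e.g. 2.05 (Content) or 4.04 (Not found)
  console.log('code: ' + res.code);
  res.pipe(process.stdout);   // response payload
});

req.end();   // transmit the request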
Endpoints: These are the sources and destinations of a CoAP message. The
specific definition of an endpoint depends on the transport being used.
Proxies: A CoAP endpoint that is tasked by CoAP clients to perform requests on
their behalf. Reducing network load, accessing sleeping nodes, and providing a layer of
security are some of the roles of a proxy. Proxies can be explicitly selected by a
client (forward-proxying) or can be used as in-situ servers (reverse-proxying).
Alternatively, a proxy can map from one CoAP request to another CoAP request
or even translate to a different protocol (cross-proxying). A common situation is
an edge router proxying from a CoAP network to HTTP services for cloud-based
internet connections.
Client: The originator of a request. The destination endpoint of a response.
Server: The destination endpoint of a request. The originator of a response.
Intermediary: A client acting as both a server and client towards an origin server.
A proxy is an intermediary.
Origin servers: The server on which a given resource resides.
Observers: An observer client can register itself using a modified GET message.
The observer is then connected to a resource and if the state of that resource
changes, the server will send a notification back to the observer.
Below is an example of CoAP architecture. Being a lightweight HTTP system allows for
CoAP clients to communicate to one another or services in the cloud supporting CoAP.
Alternatively, a proxy can be used to bridge to an HTTP service in the cloud. CoAP
endpoints can establish relationships with each other, even at the sensor level. Observers
allow for subscription-like behavior on resources that change, in a manner similar to MQTT.
The graphic also illustrates origin servers holding the resource being shared.
The two proxies allow for CoAP to perform HTTP translation or allow forwarding
requests on behalf of a client.
Protocols based on UDP transport imply that the connection may not be inherently
reliable. To compensate for reliability issues, CoAP introduces two message types that differ
by either requiring acknowledgment or not. An additional feature of this approach is that
messages can be asynchronous.
While other messaging architectures require a central server to propagate messages between
clients, CoAP allows messages to be sent between any CoAP clients, including sensors and
servers. CoAP includes a simple caching model. Caching is controlled through response codes in
the message header. An option number mask will determine if it is a cache key. The Max_Age
option is used to control cache element lifetimes and ensure freshness of the data. That is,
Max_Age sets the maximum time a response can be cached for before it must be refreshed.
Max_Age defaults to 60 seconds and can span up to 136.1 years.
Proxies play a role in caching; for example, a sleeping edge sensor may use a proxy to cache
data and to preserve power. The CoAP message header is uniquely designed for maximum
efficiency and bandwidth preservation. The header is four bytes in length, with a typical request
message carrying a total header of only 10 to 20 bytes. This is typically 10x smaller than an HTTP header.
The structure consists of Message Type Identifiers (T), which must be included in each header
along with an associated unique Message-ID. The Code field is used to signal errors or success
states across channels. After the header, all other fields are optional and include variable length
Tokens, Options, and Payload.
UDP also can cause duplicate messages to arrive for both CON and NON transmissions.
If identical Message_IDs are delivered to a recipient within the prescribed
EXCHANGE_LIFETIME, a duplicate is said to exist. This clearly can occur, as shown in the
previous figures, when an ACK is missing or dropped and the client retransmits the message
with the same Message_ID. The CoAP specification states the recipient should ACK each
duplicate message it receives but should only process one request or response. This rule may
be relaxed if a CON message transports a request that is idempotent.
As mentioned, CoAP allows for the role of an observer in a system. This is unique as it
allows CoAP to behave in a similar manner to MQTT. The observation process allows a
client to register for observation and the server will notify the client whenever the resource
being monitored changes state. The duration of observation can be defined during
registration. Additionally, the observation relationship ends when the initiating client sends a
RST or another GET message:
Figure 3.7: CoAP Observer registration and update process
A CoAP URI may use the insecure coap:// scheme or the DTLS-secured coaps:// scheme, for example:
//insecure coap://example.net:1234/~temperature/value.xml
//secure coaps://example.net:1234/~temperature/value.xml
CoAP offers resource discovery mechanisms as well. Simply sending a GET request to
/.well-known/core will disclose a list of known resources on the device. Additionally, query
strings can be used in the request for applying specific filters.
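The following sketch ties the discovery and observe features together. It again assumes the
third-party Node.js "coap" package; the host name and the observed resource are hypothetical:

const coap = require('coap');

// resource discovery: list the resources the device exposes
coap.request('coap://coap.example.net/.well-known/core')
    .on('response', (res) => res.pipe(process.stdout))
    .end();

// register as an observer on a (hypothetical) temperature resource
const obs = coap.request({
  hostname: 'coap.example.net',
  pathname: '/temperature',
  observe: true
});
obs.on('response', (res) => {
  // each state change of the resource arrives as another 'data' chunk
  res.on('data', (chunk) => console.log('notification: ' + chunk.toString()));
});
obs.end();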
Other protocols:
There are many other messaging protocols in use or proposed for IoT and M2M
deployments. By far the most prevalent are MQTT and CoAP.
STOMP: STOMP stands for Simple (or Streaming) Text Message-Oriented Middleware
Protocol. It is a text-based protocol designed by Codehaus to operate with message-oriented
middleware. A broker developed in one programming language can receive messages from a
client written in another. The protocol has similarities to HTTP and operates over TCP.
STOMP consists of a frame header and a frame body. The current specification is STOMP
1.2, dated October 22, 2012, and is available under a free license. It is different from many
protocols presented as it does not deal with subscription topics or queues. It simply uses
HTTP-like semantics such as SEND with a destination string. A broker must dissect the
message and map to a topic or queue for the client. A consumer of the data will SUBSCRIBE
to the destinations provided by the broker. STOMP has clients written in Python (Stomp.py),
TCL (tStomp), and Erlang (stomp.erl). Several servers have native STOMP support, such as
RabbitMQ (via a plugin), and some servers have been designed in specific languages (Ruby,
Perl, or OCaml).
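For illustration only, a SEND frame and a matching SUBSCRIBE frame in STOMP 1.2 might look
as follows (the /topic/temperature destination is a hypothetical example, and ^@ denotes the
terminating NUL octet that ends every frame):

SEND
destination:/topic/temperature
content-type:text/plain

21.5^@

SUBSCRIBE
id:0
destination:/topic/temperature
ack:auto

^@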
AMQP: AMQP stands for Advanced Message Queuing Protocol. It is a hardened and
proven MOM protocol used by massive data sources such as JP Morgan Chase in processing
over 1 billion messages per day, and the Ocean Observatory Initiative in collecting over 8
terabytes of oceanographic data each day. It was originally devised at JP Morgan Chase in
2003, who led the creation of a working group of 23 companies in 2006 chartered with the
architecture and governance of the protocol. In 2011, the working group was merged into the
OASIS group where it is currently hosted.
Today it is well established in the banking and credit transaction industries but has a place in
IoT as well. Furthermore, AMQP is standardized by ISO and IEC as ISO/IEC 19464:2014.
A formal AMQP working group can be found at www.amqp.org. The AMQP protocol
resides on top of the TCP stack and uses port 5672 for communication. Data is serialized
over AMQP, meaning that messages are broadcast in unit frames. Frames are transmitted in
virtual channels identified with a unique channel_id. Frames consist of headers, channel_ids,
payload information, and footers.
A channel, however, can only be associated with a single host. Messages are assigned a
unique global identifier. AMQP is a flow-controlled, message-orientated communication
system. It is a wire-level protocol and a low-level interface. A wire protocol refers to APIs
immediately above the physical layer of a network. A wire-level API allows for different
messaging services such as .NET (NMS) and Java (JMS) to communicate with each other.
Similarly, AMQP attempts to decouple publishers from subscribers. Unlike MQTT it has
mechanisms for load balancing and formal queuing. A well-known implementation based on
AMQP is RabbitMQ, an AMQP message broker written in Erlang.
Additionally, several AMQP clients are available, such as the RabbitMQ client written in
Java, C#, Javascript, and Erlang, as well as Apache Qpid written in Python, C++, C#, Java,
and Ruby. One or more virtual hosts with their own namespaces, exchanges, and message
queues will reside on a central server(s).
The network topology of an AMQP deployment is hub-and-spoke with the ability for
hubs to communicate with one another. AMQP consists of nodes and links. A node is a
named source or a sink of messages. A message frame moves between nodes across
unidirectional links. If a message is passed through a node unchanged, its global identifier is
unchanged; if the node performs any transformation, a new ID is assigned. A link can filter
messages. There are three distinct messaging patterns that can be used in AMQP.
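As a brief sketch of basic AMQP publish and consume usage (this assumes the third-party
Node.js "amqplib" package and a RabbitMQ broker running locally on the default port 5672; the
queue name is illustrative):

const amqp = require('amqplib');

async function main() {
  // connect to the broker and open a channel (frames flow inside channels)
  const conn = await amqp.connect('amqp://localhost:5672');
  const channel = await conn.createChannel();

  // declare a named queue (a node that messages flow to and from)
  await channel.assertQueue('sensor.readings');

  // publish a message to the queue
  channel.sendToQueue('sensor.readings', Buffer.from('21.5'));

  // consume messages from the same queue and acknowledge them
  await channel.consume('sensor.readings', (msg) => {
    console.log('received: ' + msg.content.toString());
    channel.ack(msg);
  });
}

main();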
Questions to Practice:
Part-A
1. Evaluate MQ Telemetry Transport (MQTT) by considering its packet format, broker topics, and
QoS levels.
3. Determine an approach for managing MQTT connections using keep alive, client takeover, and
security.
References:
HTTP (Hypertext Transfer Protocol) is perhaps the most popular application protocol
used in the Internet (or the web). HTTP is an asymmetric request-response client-server
protocol, as illustrated. An HTTP client sends a request message to an HTTP server. The
server, in turn, returns a response message. In other words, HTTP is a pull protocol: the
client pulls information from the server (instead of the server pushing information down to the
client). HTTP is a stateless protocol; the current request does not know what
has been done in the previous requests. HTTP permits negotiation of data type and
representation, so as to allow systems to be built independently of the data being transferred.
HTTP Protocol:
As mentioned, whenever you enter a URL in the address box of the browser, the browser
translates the URL into a request message according to the specified protocol; and sends the
request message to the server. For example, the browser translates the URL
http://www.nowhere123.com/docs/index.html into the following request message:
GET /docs/index.html HTTP/1.1
Host: www.nowhere123.com
Accept: image/gif, image/jpeg, */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
(blank line)
When this request message reaches the server, the server can take one of these actions:
The server interprets the request received, maps the request into a file under
the server's document directory, and returns the file requested to the client.
The server interprets the request received, maps the request into a program kept in the
server, executes the program, and returns the output of the program to the client.
If the request cannot be satisfied, the server returns an error message.
An example of the HTTP response message is as shown:
HTTP/1.1 200 OK
Date: Wed, 15 Apr 2020 08:56:53 GMT
Server: Apache/2.2.14 (Win32)
Last-Modified: Thu, 16 Apr 2020 07:16:26 GMT
ETag: "10000000565a5-2c-3e94b66c2e680"
Accept-Ranges: bytes
Content-Length: 44
Connection: close
Content-Type: text/html
X-Pad: avoid browser bug
<html><body><h1>It works!</h1></body></html>
The browser receives the response message, interprets the message and displays the
contents of the message on the browser's window according to the media type of the response
(as in the Content-Type response header). Common media types include "text/plain",
"text/html", "image/gif", "image/jpeg", "audio/mpeg", "video/mpeg", "application/msword",
and "application/pdf". In its idling state, an HTTP server does nothing but listening to the IP
address(es) and port(s) specified in the configuration for incoming request. When a request
arrives, the server analyzes the message header, applies rules specified in the configuration,
and takes the appropriate action. Th
Thee webmaster's main control over the action of web server
is via the configuration, which will be dealt with in greater details in the later sections.
Another important aspect of REST is the resource methods used to perform the desired
transition. Many people wrongly equate resource methods with the HTTP GET/PUT/POST/DELETE
methods. Roy Fielding never recommended which method should be used in which condition; all he
emphasizes is that the interface should be uniform. If you decide that HTTP POST will be used
for updating a resource, rather than HTTP PUT as most people recommend, that is acceptable and
the application interface will still be RESTful. Ideally, everything that is needed to change
the resource state should be part of the API response for that resource, including the methods
and the state in which they will leave the representation.
4.3 CRUD operations
Within computer programming, the acronym CRUD stands for create, read, update and
delete. These are the four basic functions of persistent storage. Also, each letter in the acronym
can refer to all functions executed in relational database applications and mapped to a standard
HTTP method, SQL statement or DDS operation. It can also describe user-interface conventions
that allow viewing, searching and modifying information through computer-based forms and
reports. Entities are read, created, updated and deleted. Those same entities can be modified by
taking the data from a service and changing the setting properties before sending the data back to
the service for an update. Plus, CRUD is data-oriented and maps onto the standardized use of
HTTP action verbs.
Most applications have some form of CRUD functionality. In fact, every programmer has
had to deal with CRUD at some point. Not to mention, a CRUD application is one that utilizes
forms to retrieve and return data from a database. The first reference to CRUD operations came
from Haim Kilov in 1990 in an article titled, “From semantic to object-oriented data modeling.”
However, the term was first made popular by James Martin’s 1983 book, Managing the
Database Environment. Here’s a breakdown:
CREATE procedures: Performs the INSERT statement to create a new record.
READ procedures: Reads the table records based on the primary key noted within
the input parameter.
UPDATE procedures: Executes an UPDATE statement on the table based on the
specified primary key for a record within the WHERE clause of the statement.
DELETE procedures: Deletes a specified row in the WHERE clause.
Based on the requirements of a system, varying users may have different CRUD cycles. A
customer may use CRUD to create an account and access that account when returning to a
particular site. The user may then update personal data or change billing information. On the
other hand, an operations manager might create product records, then call them when needed or
modify line items.
One example is an ASP.NET MVC 4 application implementing CRUD operations using ADO.NET.
During the Web 2.0 era, CRUD operations were at the foundation of most dynamic
websites. However, you should differentiate CRUD from the HTTP action verbs. For example, if
you want to create a new record you should use “POST.” To update a record, you would use
“PUT” or “PATCH.” If you wanted to delete a record, you would use “DELETE.” Through
CRUD, users and administrators had the access rights to edit, delete, create or browse online
records. An application designer has many options for executing CRUD operations. One of the
most efficient choices is to create a set of stored procedures in SQL to execute operations.
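Before turning to stored procedures, here is a short sketch of the CRUD-to-HTTP verb mapping
described above, using the standard fetch API; the endpoint and record identifier are purely
illustrative:

async function crudDemo() {
  const base = 'https://api.example.com/customers';

  // CREATE -> POST
  await fetch(base, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ name: 'Alice' })
  });

  // READ -> GET
  const record = await (await fetch(base + '/42')).json();

  // UPDATE -> PUT (or PATCH for a partial update)
  await fetch(base + '/42', {
    method: 'PUT',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(record)
  });

  // DELETE -> DELETE
  await fetch(base + '/42', { method: 'DELETE' });
}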
With regard to CRUD stored procedures, here are a few common naming conventions:
The procedure name should end with the implemented name of the CRUD
operation. The prefix should not be the same as the prefix used for other user-
defined stored procedures.
CRUD procedures for the same table will be grouped together if you use the table
name after the prefix.
After adding CRUD procedures, you can update the database schema by
identifying the database entity where CRUD operations will be implemented.
Benefits of CRUD:
Instead of using ad-hoc SQL statements, many programmers prefer to use CRUD because
of its performance. When a stored procedure is first executed, the execution plan is stored in
SQL Server’s procedure cache and reused for all applications of the stored procedure.
When a SQL statement is executed in SQL Server, the relational engine searches the
procedure cache to ensure an existing execution plan for that particular SQL statement is
available and uses the current plan to decrease the need for optimization, parsing and
recompiling steps for the SQL statement.
If an execution plan is not available, then the SQL Server will create a new execution
plan for the query. Moreover, when you remove SQL statements from the application code,
all the SQL can be kept in the database while only stored procedure invocations are in the
client application. When you use stored procedures, it helps to decrease database coupling.
Furthermore, using CRUD operations helps to prevent SQL injection attacks. Utilizing
stored procedures instead of string concatenation to build dynamic queries from user input
data for all SQL statements means that everything placed into a parameter gets quoted.
4.4 Websocket
Web sockets are defined as a two-way communication channel between the servers and the
clients, which means both parties communicate and exchange data at the same time. The key
points of Web Sockets are true concurrency and optimization of performance, resulting in more
responsive and rich web applications.
Description of Web Socket Protocol:
This protocol defines full-duplex communication from the ground up. Web sockets take a step
forward in bringing desktop-rich functionalities to the web browsers. It represents an
evolution, which was awaited for a long time in client/server web technology.
The main features of web sockets are as follows. The Web Socket protocol is being
standardized, which means real-time communication between web servers and clients is possible
with the help of this protocol.
Web sockets are transforming into a cross-platform standard for real-time communication
between a client and the server. This standard enables new kinds of applications. Businesses
needing real-time web applications can speed up with the help of this technology. The biggest
advantage of Web Socket is it provides a two-way communication (full duplex) over a single
TCP connection.
Before diving to the need of Web sockets, it is necessary to have a look at the existing
techniques, which are used for duplex communication between the server and the client. They
are as follows −
Polling
Long Polling
Streaming
Postback and AJAX
HTML5
Polling
Polling can be defined as a method, which performs periodic requests regardless of the
data that exists in the transmission. The periodic requests are sent in a synchronous way. The
client makes a periodic request in a specified time interval to the Server. The response of the
server includes available data or some warning message in it.
Long Polling
Long polling, as the name suggests, uses a technique similar to polling. The client
and the server keep the connection active until some data is fetched or a timeout occurs. If the
connection is lost for some reason, the client can start over and perform a sequential
request. Long polling is essentially a performance improvement over the polling process, but
constant requests may slow down the process.
Streaming
It is considered the best option for real-time data transmission. The server keeps the
connection open and active with the client until the required data has been fetched.
In this case, the connection is said to be open indefinitely. Streaming still includes HTTP
headers, which increase the message size and the delay. This can be considered a major drawback.
Internet was conceived to be a collection of Hypertext Mark-up Language (HTML) pages
linking one another to form a conceptual web of information. During the course of time,
static resources increased in number and richer items, such as images, began to be a part
of the web fabric. Server technologies advanced, which allowed dynamic server pages - pages
whose content was generated based on a query. Soon, the requirement to have more dynamic
web pages led to the availability of Dynamic Hypertext Mark-up Language (DHTML), all
thanks to JavaScript. Over the following years, we saw cross-frame communication in an
attempt to avoid page reloads followed by HTTP Polling within frames. However, none of
these solutions offered a truly standardized cross browser solution to real-time bi-directional
communication between a server and a client. This gave rise to the need for the Web Socket
protocol, which provides full-duplex communication, bringing desktop-rich functionality to all
web browsers.
Web Socket represents a major upgrade in the history of web communications. Before its
existence, all communication between the web clients and the servers relied only on HTTP.
Web Socket enables persistent, full-duplex connections with a dynamic flow of data. Full
duplex refers to communication from both ends at considerably fast speed. It is
termed a game changer because of its efficiency in overcoming the drawbacks of
existing protocols.
Web Socket for Developers and Architects:
Importance of Web Socket for developers and architects –
Web Socket is an independent TCP-based protocol, but it is designed to support
any other protocol that would traditionally run only on top of a pure TCP
connection.
Web Socket is a transport layer on top of which any other protocol can run. The
Web Socket API supports the ability to define sub-protocols: protocol libraries
that can interpret specific protocols.
Examples of such protocols include XMPP, STOMP, and AMQP.
The developers no longer have to think in terms of the HTTP request-response
paradigm.
The only requirement on the browser-side is to run a JavaScript library that can
interpret the Web Socket handshake, establish and maintain a Web Socket
connection.
On the server side, the industry standard is to use existing protocol libraries that
run on top of TCP and leverage a Web Socket Gateway.
The following diagram describes the functionalities of Web Sockets –
Web Sockets occupy a key role not only in the web but also in the mobile industry. The
importance of Web Sockets is given below.
Web Sockets as the name indicates, are related to the web.
The web is not merely a bunch of techniques for some browsers; it is a broad
communication platform for a vast number of devices, including desktop
computers, laptops, tablets, and smart phones.
HTML5 app that utilizes Web Sockets will work on any HTML5 enabled web
browser. Web socket is supported in the mainstream operating systems.
All key players in the mobile industry provide Web Socket APIs in their own native
apps. Web sockets provide full-duplex communication.
The approach of Web Sockets works well for certain categories of web
application such as chat room, where the updates from client as well as server are
shared simultaneously.
Web Sockets, a part of the HTML5 specification, allow full duplex communication
between web pages and a remote host. The protocol is designed to achieve the following
benefits, which can be considered as the key points –
Reduce unnecessary network traffic and latency using full duplex through a single
connection (instead of two).
Streaming through proxies and firewalls, with the support of upstream and
downstream communication simultaneously.
It is necessary for the client to initialize the connection to the server before they can
communicate. To initialize the connection, a JavaScript object is created with the URL of
the remote or local server:

var socket = new WebSocket("ws://echo.websocket.org/");
The URL mentioned above is a public address that can be used for testing and
experiments. The websocket.org server is always up; when it receives a message, it sends
it back to the client. This is the most important step in ensuring that the application works correctly.
Web Sockets – Events:
There are four main Web Socket API events –
Open
Message
Close
Error
Each of these events is handled by implementing the onopen, onmessage,
onclose, and onerror functions respectively. They can also be handled with the help of
the addEventListener method.
The brief overview of the events and functions are described as follows –
Open: Once the connection has been established between the client and the server, the
open event is fired from Web Socket instance. It is called as the initial handshake
between client and server. The event, which is raised once the connection is
established, is called onopen.
Message: Message event happens usually when the server sends some data. Messages
sent by the server to the client can include plain text messages, binary data or images.
Whenever the data is sent, the onmessage function is fired.
Close: The close event marks the end of the communication between the server and the
client, and is reported through the onclose event. After the end of communication has
been marked by the onclose event, no further messages can be transferred between the
server and the client. The connection can also close due to poor connectivity.
Error: The error event marks a failure that happens during the communication. It is
reported through the onerror event, and onerror is always followed by termination of
the connection. (A short sketch of all four handlers follows.)
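The following brief sketch (for illustration only) wires up all four handlers against the
public echo test server mentioned above:

var socket = new WebSocket("ws://echo.websocket.org/");

socket.onopen = function (evt) {      // connection established
  socket.send("Hello from the client");
};
socket.onmessage = function (evt) {   // data received from the server
  console.log("Received: " + evt.data);
};
socket.onclose = function (evt) {     // communication finished
  console.log("Connection closed");
};
socket.onerror = function (evt) {     // something went wrong
  console.log("WebSocket error");
};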
Web Sockets – Actions:
Events are usually triggered when something happens. On the other hand, actions are
taken when a user wants something to happen. Actions are made by explicit calls using
functions by users. The Web Socket protocol supports two main actions, namely −
send( )
close( )
send():
This action is used for communication with the server, which includes
sending messages such as text, binary data, or images.
A chat message, which is sent with the help of send() action, is as follows –
// get the text view and the button for submitting the message
var textsend = document.getElementById("text-view");
var submitMsg = document.getElementById("tsend-button");
// send the text view contents over the Web Socket when the button is clicked
submitMsg.onclick = function () { socket.send(textsend.value); };
<!DOCTYPE html>
<html>
<head>
<script language="javascript" type="text/javascript">
var wsUri = "ws://echo.websocket.org/";
var output;

function init() {
  output = document.getElementById("output");
  testWebSocket();
}

function testWebSocket() {
  websocket = new WebSocket(wsUri);
  websocket.onopen = function(evt) {
    onOpen(evt);
  };
}

function onOpen(evt) {
  writeToScreen("CONNECTED");
}

// helper (completed here so the snippet runs): append a status line to the output div
function writeToScreen(message) {
  var p = document.createElement("p");
  p.innerHTML = message;
  output.appendChild(p);
}

window.addEventListener("load", init, false);
</script>
</head>
<body>
<h2>WebSocket Test</h2>
<div id="output"></div>
</body>
</html>
When loaded in a browser, the page shows the "WebSocket Test" heading followed by the
CONNECTED status written by the script.
The above HTML5 and JavaScript file shows the implementation of two events of Web
Socket, namely –
onLoad which helps in creation of JavaScript object and initialization of
connection.
onOpen establishes the connection with the server and also reports the status.
Bridge: A bridge is a network device that works at the data link layer. A bridge connects two
different LANs working on the same protocol, and the format of the packet is not changed. A
gateway, in contrast, will accept and transfer packets between networks that use different
protocols, changing the packet format where necessary.
Part-B
1. Evaluate the CRUD Operations based on HTTP.
2. Validate connectivity to IoT cloud platforms using HTTP REST Services
3. Measure the Significance of gateway design and its characteristics.
4. Justify the differences between HTTP, HTTP REST and Protocol Bridging with an example?
References:
The concept of Edge computing came into the picture in order to sort out this problem.
Edge computing is somewhat similar to the Content Delivery Network (CDN) but edge server
processes the data generated from IoT devices in edge computing instead of the centralized
cloud server. There is an edge server near the IoT device user which serves as a bridge between
the cloud and the user. This decreases the data transmission latency and the user experiences a
speedy performance of the device. There are some areas that require minimal transmission
latency such as smart homes, cloud gaming, video streaming, smart devices, etc. If there is a
delay in data transmission in these areas then the impacts can be dangerous. For example, in the
case of self-driving cars the response from the cloud should be very quick. If there is a delay in
transmission of response after the car sends the data to the central server, consequences can be
dangerous as the car cannot take human-like decisions on its own. Thus, edge computing
reduces the transmission latency when voluminous data has to be transmitted. Edge computing
helps in real-time data processing without any sort of transmission latency. Since the data is
processed near the user device, the usage of Internet bandwidth is also reduced. These features
of edge computing are making it a fundamental constituent for organizations.
According to one Gartner statistic, 50% of the data generated by an enterprise
will be created and processed by edge computing rather than centralized cloud data centres by
2022.
Edge Computing: Advantages and difficulties
The decrease in response time to microseconds can be taken as the major advantage of edge
computing. There is less usage of internet bandwidth, which ultimately saves network resources.
Although edge computing seems similar to cloud computing, the two differ in their advantages
and applications.
Edge computing has a handful of challenges despite its capability to mitigate latency in
data transmission and network usage. Some of them are:
Network Bandwidth: The storage of more data demands the expansion of bandwidth. Thus, the
network bandwidth needs to be increased with the increase in the amount of data while edge
computing is used.
Security: Due to the distributed structure, edge computing is vulnerable to security exploits.
This type of architecture increases attack vectors that may result in malware infections.
Compliances: The data stored over Edge servers need verification for compliances and
regulations as data is one of the key components of any business organization.
Latency: The distance between the user device and cloud affects the transmission latency.
Sometimes there is an increase in latency that may result in disastrous consequences.
Hence, edge computing along with solutions brings some challenges too. These
difficulties can be overcome by consulting with cloud experts to start utilizing the benefits of
edge computing. So, the implementation of edge computing in any business surely helps in the
growth of the business.
In both architectures data is generated from the same source—physical assets such as
pumps, motors, relays, sensors, and so on. These devices perform a task in the physical world
such as pumping water, switching electrical circuits, or sensing the world around them. These are
the “things” that make up the internet of things.
Fog computing: In fog computing, transporting data from things to the cloud requires
many steps.
First the electrical signals from things are traditionally wired to the I/O points of an
automation controller (PLC or PAC). The automation controller executes a control
system program to automate the things.
Next, the data from the control system program is sent to an OPC server or protocol
gateway, which converts the data into a protocol Internet systems understand, such as
MQTT or HTTP.
Then the data is sent to another system, such as a fog node or IoT gateway on the
LAN, which collects the data and performs higher-level processing and analysis. This
system filters, analyzes, processes, and may even store the data for transmission to the
cloud or WAN later.
So fog computing involves many layers of complexity and data conversion. Its
architecture relies on many links in a communication chain to move data from the physical
world of our assets into the digital world of information technology. In a fog computing
architecture, each link in the communication chain is a potential point of failure.
Edge computing: Edge computing simplifies this communication chain and reduces
potential points of failure. In edge computing, physical assets like pumps, motors, and
generators are again physically wired into a control system, but this system is controlled by
an edge programmable industrial controller, or EPIC.
The EPIC automates the physical assets by executing an onboard control system
program, just like a PLC or PAC. But the EPIC has edge computing capabilities that allow it
to also collect, analyze, and process data from the physical assets it's connected to—at the
same time it's running the control system program. EPICs then use edge computing
capabilities to determine what data should be stored locally or sent to the cloud for further
analysis.
In edge computing, intelligence is literally pushed to the network edge, where our
physical assets or things are first connected and where IoT data originates. Edge computing
saves time and money by streamlining IoT communication, reducing system and network
architecture complexity, and decreasing the number of potential failure points in an IoT
application. Reducing system architecture complexity is key to the success of IIoT
applications.
If you place too much at the edge, it's easy to overwhelm the smaller processor
and storage platforms that exist there. In some cases, storage could be limited to a
few gigabytes and processing using a single CPU. Power and size restrictions are
really what set the limits.
Another pattern is failure to integrate security from concept to production.
Security is systemic to both edge and fog computing architectures and centralized
processing. Security needs to span both and use mechanisms such as identity and
access management. Encryption is not a nice-to-have, but rather a requirement for
device safety. Imagine if a jet engine could be hacked in flight.
Fog and edge could create a tipping point, of sorts. Network latency limited IoT's full
evolution and maturity given the limited processing that can occur at sensors. Placing a
micro-platform at the IoT device, as well as providing tools and approaches to leverage this
platform, will likely expand the capabilities of IoT systems and provide more use cases for
IoT in general. Fog and edge are enabling technologies and standards that provide IoT users and
technology providers with more options. Removing the limits of centralized cloud servers
means IoT is much more distributed and flexible in the services providers can offer. The first
step is to understand what edge and fog are, how they can be best exploited within your own
problem domain, and then face real business problems. Consider the number of ways edge
and fog computing can make consumers' and workers' lives better. The value is limitless.
5.3 Edge Analytics at devices and Gateways
Edge Computing/ Edge Analytics is a relatively new approach for many companies. Most
architectures are used to sending all data to the Cloud/Lake. But in Edge computing, that
does not happen. i.e. data can be processed near the source and not all data is sent back to the
Cloud. For large-scale IoT deployments, this functionality is critical because of the sheer
volume of data being generated, which underlines the significance and evolution of IoT edge
analytics. It is believed that hardware capabilities will converge for large vendors (Cisco,
Dell, HPE, Intel, IBM and others); hence, IoT analytics will be the key differentiator.
Companies like Cisco, Intel and others were early proponents of Edge Computing by
positioning their Gateways as Edge devices. Historically, gateways performed the function of
traffic aggregation and routing. In the Edge computing model, the core gateway functionality
has evolved. Gateways do not just route data but also store data and perform computations
on it.
Edge analytics allows us to do some pre-processing or filtering of the data closer to
where the data is being created. Thus, data that falls within normal parameters can be ignored
or stored in low-cost storage, while abnormal readings may be sent to the Lake or the in-
memory database. Now, a new segment of the market is developing, driven by vendors like
Dell, HPE, and others.
These vendors are positioning their servers as Edge devices by adding additional storage,
computing power and analytics capabilities. This has implications for Edge Analytics for
IoT. IoT Edge Analytics is typically applicable for Oil Rigs, Mines and Factories which
operate in low bandwidth, low latency environments. Edge Analytics could apply not just to
sensor data but also to richer forms of data such as Video analytics. IoT datasets are massive.
A typical Formula One car carries 150-300 sensors.
In airlines, for example, the current Airbus A350 model has close to 6,000 sensors and
generates 2.5 TB of data per day. A city (for example, the Smart city of Santander in Spain)
includes a network comprising more than 25,000 sensors. To prevent these sensors from
constantly pinging the Cloud, we need some form of interim processing; hence, the need for
Edge processing in IoT analytics. We can consider Edge devices from two perspectives:
the evolution of the traditional Gateway vendors and the evolution of the traditional server vendors.
where m is the number of physical variables (like temperature and humidity), while n is the
number of samples. At a certain time t, data vector generated by sensor node i is
Xi(t) = [xi,1(t), xi,2(t), ..., xi,m(t)]^T
5.5 Detecting Anomalies:
The Internet of Things (IoT) refers to a network of billions of interconnected devices
that have the ability to communicate and exchange data over the Internet. Such IoT devices
range from sensors, smart phones, computers, vehicles, building appliances, and health
devices. The extension of Internet connectivity to physical devices and everyday objects has
substantially increased worldwide real-time data collection and transmission. As of 2019,
there were approximately 9 billion IoT devices across the world, a number projected to surge
to over 25 billion by 2020. These IoT edge devices, such as smart light bulbs, wearable medical
devices, doors, heaters, sensors for smart agriculture, etc., which typically host a variety of
sensors for temperature, pressure, humidity, light, motion and acceleration, are often
resource-constrained. For example, many of them are battery-powered, with limited
processing power which is just sufficient for the task at hand so that they can be mass-
produced while minimizing costs. On the other hand, the amount of data produced by these
sensors at this scale is staggering. As IoT devices grow in number, the tremendous amount
of sensing data collected has raised great challenges for data transmission overhead (time and
energy) and cloud storage. Applications of cost-cutting edge mining techniques to reduce
packet transmission and remote storage requirements are rapidly growing throughout IoT
networks. One of the most basic edge mining techniques is random sampling to reduce the
number of observations sent to the cloud. More sophisticated methods, such as anomaly
detection, isolate and transmit only the contextually relevant observations. The guiding
principle behind anomaly detection is that only unexpected behavior needs to be notified to
the centralized cloud. Contemporary works specify different anomaly detection methods
ranging from basic thresholding to machine learning algorithms. However, as the resources
available at each IoT edge device can be rather limited in terms of power, memory,
connectivity, bandwidth, and computation, it is critical to choose appropriate anomaly
detection algorithms that can not only effectively identify abnormal behaviors but also
consume limited resources at the edge devices.
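As a simple illustration of the basic thresholding approach mentioned above, the following
sketch (illustrative only; the sendToCloud function stands in for whatever upstream transport
the deployment uses, such as an MQTT publish or HTTP POST) keeps a running mean of readings
at the edge and forwards only strong deviations:

// keep a running mean of readings and forward only strong deviations
let mean = null;
const alpha = 0.1;       // smoothing factor for the running mean
const threshold = 5.0;   // deviation (in sensor units) treated as anomalous

function onReading(value) {
  if (mean === null) mean = value;               // initialise on the first sample
  if (Math.abs(value - mean) > threshold) {
    sendToCloud({ value: value, mean: mean });   // anomaly: transmit upstream
  }                                              // normal readings stay local
  mean = alpha * value + (1 - alpha) * mean;     // update the running mean
}

// placeholder for the upstream transport (e.g. an MQTT publish or HTTP POST)
function sendToCloud(anomaly) {
  console.log('anomaly reported: ' + JSON.stringify(anomaly));
}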
Anomaly detection is one of the most popular edge mining techniques explored in IoT
scenarios. It has been used, for example, to implement an Intrusion Detection System (IDS) for
Wireless Sensor Networks (WSN). One line of work uses LDA to reduce the dimensionality of
network intrusion datasets and applies both Naive Bayes and KNN algorithms for anomaly
classification. Another evaluates the performance of anomaly detection (i.e., false positive
rates) using an unsupervised outlier detection technique based on the RF algorithm. Further
work demonstrates the effectiveness of autoencoders for an unsupervised IDS and proposes a
novel splitting and learning mechanism to lower false positive detection. Although these
works explore novel applications of anomaly detection on the IoT edge, they do not focus on
resource-constrained scenarios and therefore do not delve into the time and energy
consumption of these methods. Anomaly detection has also been investigated in healthcare
applications, for example cardiac anomaly detection with low false negative counts, which
stresses the importance of capturing outliers in healthcare applications. Similar to the IDS
studies, this work focuses extensively on anomaly detection implementation but does not
consider the factor of resource consumption. Several works also tackle anomaly detection in
IoT applications outside of IDS and healthcare: non-machine-learning anomaly detection
algorithms have been proposed for a set of heterogeneous sensors in an IoT WSN, and an
autoencoder neural network has been used for determining anomalous readings from a testbed of
eight temperature and humidity sensors.
Anomaly Detection Models And Datasets:
Random Forests: RF models are applicable to a wide range of classification problems. In a
random forest, each node is split using a subset of features randomly chosen at that node. This
strategy is robust against overfitting and enables RF to perform better than many other
classifiers, including discriminant analysis, support vector machines, and neural networks. The
only major downside of RF is that a large number of trees can slow down the algorithm for real-
time predictions.
Multilayer Perceptron: MLP is the best-known and most frequently used type of neural
network trained with the backpropagation algorithm. In recent years, neural networks have
been extensively used for pattern recognition and optimization. MLP models contain three types
of layers: input layer, output layer, and hidden layer. Each node in the input layer, from top to
bottom, passes an input data point to each neuron in the first hidden layer. Then, each hidden
layer neuron multiplies each value with a weight vector and computes the sum of the multiplied
values. Subsequently each hidden layer neuron applies its activation function to this sum, and
sends the resulting value to the next layer and eventually to the output layer. Suppose there are N
training samples, d features, k hidden layers each containing h neurons, o output neurons, and i
iterations. It is advisable to start with a small number of hidden layers and nodes given the high
time complexity of MLP’s backpropagation algorithm and time-consuming grid search
procedure. One of the most notable advantages of MLP is its low prediction complexity,
O(Nhk), which makes it suitable for time-sensitive anomaly detection tasks.
K-Nearest Neighbors: The K-Nearest Neighbor classifier (KNN) is a relatively simple
learning algorithm. It is very commonly used in text mining, agricultural predictions, and stock
market forecasting. Because KNN does not make any assumptions about the underlying data
distribution, it is particularly suitable for applications with little or no prior knowledge about the
distribution of the dataset. KNNs are generally reputed for high prediction accuracy with respect
to precision and recall. Furthermore, the training phase of KNN classifiers is very efficient.
Nevertheless, there are several drawbacks of KNNs. First, the model has high space complexity
as it stores all training data instances. Second, finding the most optimal value for K is not trivial.
KNN’s most significant drawback, however, is its exorbitant prediction phase overhead, which is
O(NPMlogK) with N training samples, M test samples, K neighbors, and P distance metric time
complexity. The high overhead at the prediction phase may make it less suitable to be carried
out on resource-constrained IoT edge devices.
Discriminant Analysis: In this work, both LDA and QDA are used. LDA first projects a
dataset onto a lower-dimensional space to prevent overfitting and to generate linear class-
separability. QDA also performs dimensionality reduction but generates a quadratic line to fit the
training data. Despite its potential for non-linear data patterns, the number of the parameters
needed by QDA scales quadratically with that of the variables, making it slower for very high
dimensional datasets. These trade-offs shape the use of anomaly detection as an impactful edge
mining technique in different IoT scenarios.
Part-B
6. Create a methodology showing how Edge and Fog Computing act as key players in IoT techniques,
with a real-time application.
References: