Hadoop and Java Questions and Answers
Any data that cannot be stored in a traditional RDBMS is termed Big Data. Most of the data we use today has been generated in the past 20 years, and it is mostly unstructured or semi-structured in nature. More than the volume of the data, it is the nature of the data that determines whether it is considered Big Data or not.
3. How does big data analysis help businesses increase their revenue? Give an example.
Big data analysis helps businesses differentiate themselves. For example, Walmart, the world's largest retailer by revenue in 2014, uses big data analytics to increase its sales through better predictive analytics, customized recommendations, and new products launched based on customer preferences and needs. Walmart observed a significant 10% to 15% increase in online sales, amounting to $1 billion in incremental revenue. Many more companies such as Facebook, Twitter, LinkedIn, Pandora, JPMorgan Chase, and Bank of America use big data analytics to boost their revenue.
Various industries and companies leveraging big data analysis to increase their revenue include:
eBay
Hulu
Spotify
Rubikloud
Twitter
5. Differentiate between Structured and Unstructured data.
Data that can be stored in traditional database systems in the form of rows and columns, for example online purchase transactions, is referred to as structured data. Data that can be stored only partially in traditional database systems, for example data in XML records, is referred to as semi-structured data. Unorganized and raw data that cannot be categorized as structured or semi-structured is referred to as unstructured data. Facebook updates, tweets on Twitter, reviews, web logs, etc. are all examples of unstructured data.
1) HDFS (Hadoop Distributed File System) is the Java-based file system for scalable and reliable storage of large datasets. Data in HDFS is stored in the form of blocks, and it operates on a master-slave architecture.
The core components of the Apache Hadoop framework are:
1) Hadoop Common
2) HDFS
3) Hadoop MapReduce
4) YARN
The data management and monitoring components are Ambari, Oozie and ZooKeeper.
ECC memory is recommended; however, the hardware configuration also depends on the workflow requirements and can change accordingly.
10. What are the most commonly defined input formats in Hadoop?
The most common Input Formats defined in Hadoop are:
Text Input Format - This is the default input format in Hadoop; plain text files are broken down into lines, and each line is treated as a record.
Key Value Input Format - This input format is used for plain text files where each line is split into a key and a value.
Sequence File Input Format - This input format is used for reading files in sequence (Hadoop's binary SequenceFile format).
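As an illustration, here is a minimal, hypothetical driver showing how an input format is selected for a job. It assumes the Hadoop 2.x MapReduce API; the identity mapper and reducer defaults are used, and the input and output paths are passed as arguments:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class InputFormatExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "input-format-example");
        job.setJarByClass(InputFormatExample.class);

        // TextInputFormat is the default; switch to KeyValueTextInputFormat
        // so each line is split into a key and a value (tab-separated by default).
        job.setInputFormatClass(KeyValueTextInputFormat.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```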
Block Scanner - The Block Scanner tracks the list of blocks present on a DataNode and verifies them to find any kind of checksum errors. Block Scanners use a throttling mechanism to limit the disk bandwidth they consume on the DataNode.
edits file - A log of the changes that have been made to the namespace since the last checkpoint.
Checkpoint Node - The Checkpoint Node keeps track of the latest checkpoint in a directory that has the same structure as that of the NameNode's directory. It creates checkpoints for the namespace at regular intervals by downloading the edits and fsimage files from the NameNode and merging them locally. The new image is then uploaded back to the active NameNode.
BackupNode:
The Backup Node also provides checkpointing functionality like that of the Checkpoint Node, but it additionally maintains an up-to-date in-memory copy of the file system namespace that is in sync with the active NameNode.
4. What is the port number for NameNode, Task Tracker and Job
Tracker?
NameNode - 50070
Job Tracker - 50030
Task Tracker - 50060
1) Using the Hadoop FS shell, the replication factor can be changed on a per-file basis with the command below:
$ hadoop fs -setrep -w 2 /my/test_file (test_file is the file whose replication factor will be set to 2)
2) Using the Hadoop FS shell, the replication factor of all files under a given directory can be modified with the command below:
$ hadoop fs -setrep -w 5 -R /my/test_dir (test_dir is the directory; all files under it will have their replication factor set to 5)
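The replication factor can also be changed programmatically. Below is a minimal sketch using the HDFS Java API; the file path is a placeholder, and the cluster configuration is assumed to be available on the classpath:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SetReplicationExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Request a replication factor of 2 for an existing HDFS file.
        boolean accepted = fs.setReplication(new Path("/my/test_file"), (short) 2);
        System.out.println("Replication change accepted: " + accepted);

        fs.close();
    }
}
```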
NAS stores data on dedicated hardware, whereas in HDFS all the data blocks are distributed across the local drives of the machines in the cluster.
The contents of the file are divided into data blocks as soon as the client is ready to load the file into the Hadoop cluster. After consulting the NameNode, the client writes each data block to 3 DataNodes. For each data block, two copies exist in one rack and the third copy is placed in another rack. This is generally referred to as the Replica Placement Policy.
1) setup() - This method of the reducer is used for configuring various parameters like the input data size, distributed cache, heap size, etc.
2) reduce() - This method is called once per key, with the list of values associated with that key, and is where the actual reduce logic runs.
3) cleanup() - This method is called only once, at the end of the reduce task, for clearing the temporary files.
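A minimal Reducer skeleton illustrating these lifecycle methods is sketched below. It assumes the Hadoop 2.x MapReduce API; the summing logic is purely illustrative:

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    protected void setup(Context context) {
        // Called once at the start of the task: read configuration,
        // open side files from the distributed cache, etc.
    }

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // Called once per key with all the values grouped for that key.
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();
        }
        context.write(key, new IntWritable(sum));
    }

    @Override
    protected void cleanup(Context context) {
        // Called once at the end of the task: close resources, delete temp files.
    }
}
```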
A custom partitioner for a Hadoop job is written by extending the Partitioner class. The custom partitioner can then be added to the job as a config file in the wrapper which runs Hadoop MapReduce, or it can be set on the job using the setPartitionerClass method.
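For illustration, a hypothetical custom partitioner might look like the following sketch (Hadoop 2.x MapReduce API; the class name and key/value types are assumptions):

```java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Illustrative partitioner: routes keys to reduce tasks based on the
// first character of the key, so keys starting with the same letter
// land in the same partition.
public class FirstLetterPartitioner extends Partitioner<Text, IntWritable> {
    @Override
    public int getPartition(Text key, IntWritable value, int numPartitions) {
        String k = key.toString();
        int bucket = k.isEmpty() ? 0 : Character.toLowerCase(k.charAt(0));
        return (bucket & Integer.MAX_VALUE) % numPartitions;
    }
}
```

In the driver, it would then be attached to the job with job.setPartitionerClass(FirstLetterPartitioner.class).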
5. What is the relationship between Job and Task in Hadoop?
A single job can be broken down into one or many tasks in Hadoop.
It is not necessary to write Hadoop MapReduce jobs in Java; users can write MapReduce jobs in any desired programming language, such as Ruby, Perl, Python, R or Awk, through the Hadoop Streaming API.
1)Shuffle
2)Sort
3)Reduce
9. What is a TaskInstance?
The actual Hadoop MapReduce tasks that run on each slave node are referred to as task instances. Every task instance has its own JVM process; by default, a new JVM process is spawned for every task instance.
1. When should you use HBase and what are the key components of
HBase?
HBase should be used when the big data application has
3) If the application demands key-based access to data while retrieving it.
ZooKeeper - It takes care of the coordination between the HBase Master component and the client.
Catalog Tables - The two important catalog tables are ROOT and META. The ROOT table tracks where the META table is, and the META table stores all the regions in the system.
Table-level operational commands in HBase are describe, list, drop, disable and scan.
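To illustrate the key-based access pattern that HBase is designed for, here is a minimal, hypothetical read using the HBase Java client API; the table name, column family, and row key are placeholders:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseGetExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("users"))) {

            // Key-based lookup: fetch a single row directly by its row key.
            Get get = new Get(Bytes.toBytes("user#1001"));
            Result result = table.get(get);
            byte[] name = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"));
            System.out.println(name == null ? "not found" : Bytes.toString(name));
        }
    }
}
```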
4. Explain the difference between RDBMS data model and HBase data
model.
RDBMS is a schema-based database, whereas HBase has a schema-less data model. RDBMS has no built-in support for partitioning, whereas HBase provides automated partitioning.
1) Family Delete Marker - This marker marks all the columns of a column family for deletion.
sqoop import \
--connect jdbc:mysql://localhost/db \
--username root \
Incremental load can be performed by using the Sqoop import command or by loading the data into Hive without overwriting it. The different attributes that need to be specified during an incremental load in Sqoop are:
1) Mode (incremental) - The mode defines how Sqoop will determine what the new rows are. The mode can have the value append or lastmodified.
2) Col (check-column) - This attribute specifies the column that should be examined to find out the rows to be imported.
3) Value (last-value) - This denotes the maximum value of the check column from the previous import operation.
1) append
2) lastmodified
To insert only new rows, append should be used in the import command; to insert new rows and also update existing ones, lastmodified should be used in the import command.
6. How can you check all the tables present in a single database using
Sqoop?
The command to list all the tables present in a single database using Sqoop is as follows:
$ sqoop list-tables --connect jdbc:mysql://localhost/db --username root
Large objects in Sqoop are handled by importing them into a file referred to as a LobFile, i.e. a Large Object File. The LobFile has the ability to store records of huge size; each record in a LobFile is a large object.
8. Can free form SQL queries be used with Sqoop import command? If
yes, then how can they be used?
Sqoop allows us to use free-form SQL queries with the import command. The import command should be used with the -e/--query option to execute free-form SQL queries. When using the -e/--query option with the import command, the --target-dir value must be specified.
10. What are the limitations of importing RDBMS tables into Hcatalog
directly?
There is an option to import RDBMS tables into HCatalog directly by making use of the --hcatalog-database option with --hcatalog-table, but the limitation is that several arguments such as --as-avrodatafile, --direct, --as-sequencefile, --target-dir and --export-dir are not supported.
Source - This is the component through which data enters Flume workflows.
Client - The component that transmits events to the source that operates with the agent.
Hadoop Pig Interview Questions and Answers
1) What do you mean by a bag in Pig?
A collection of tuples is referred to as a bag in Apache Pig.
2) Does Pig support multi-line commands?
Yes
3) What are the different modes of execution in Apache Pig?
Apache Pig runs in 2 modes: one is the Pig (Local Mode) Command Mode and the other is the Hadoop MapReduce (Java) Command Mode. Local Mode requires access to only a single machine, where all files are installed and executed on a local host, whereas MapReduce mode requires access to a Hadoop cluster.
Release 2.4.1
2) Explain about the different channel types in Flume. Which channel type is faster?
3) Which is the reliable channel in Flume to ensure that there is no data loss?
8) Is it possible to leverage real time analysis on the big data collected by Flume directly? If
yes, then explain how.
4)Is it possible to change the default location of Managed Tables in Hive, if so how?
8) What is SerDe in Hive? How can you write your own custom SerDe?
9)In case of embedded Hive, can the same metastore be used by multiple users?
5)What are the modules that constitute the Apache Hadoop 2.0 framework?
We hope that these Hadoop Interview Questions and Answers have pre-charged you for your next Hadoop interview. Get the ball rolling and answer the unanswered questions in the comments below. Please do! It's all part of our shared mission to ease Hadoop interviews for all prospective Hadoopers. We invite you to get involved.
50 interview questions:
Hadoop Developer Interview Questions
1) Explain how Hadoop is different from other parallel computing solutions.
3) What will a Hadoop job do if developers try to run it with an output directory that is already
present?
5) Did you ever build a production process in Hadoop? If yes, what was the process when your Hadoop job failed for any reason? (Open Ended Question)
6) Give some examples of companies that are using Hadoop architecture extensively.
10) What are the points to consider when moving from an Oracle database to Hadoop
clusters? How would you decide the correct size and number of nodes in a Hadoop cluster?
11) How do you benchmark your Hadoop Cluster with Hadoop tools?
16) If there are 10 HDFS blocks to be copied from one machine to another, but the other machine can copy only 7.5 blocks, is there a possibility for the blocks to be broken down during the time of replication?
25) Why would a Hadoop developer develop a Map Reduce by disabling the reduce step?
26) What is the functionality of Task Tracker and Job Tracker in Hadoop? How many
instances of a Task Tracker and Job Tracker can be run on a single Hadoop Cluster?
31) In Hadoop, if a custom partitioner is not defined, how is data partitioned before it is sent to the reducer?
32) What is the replication factor in Hadoop and what is the default replication factor that Hadoop comes with?
34) If you are the user of a MapReduce framework, then what are the configuration
parameters you need to specify?
35) Explain about the different parameters of the mapper and reducer functions.
36) How can you set an arbitrary number of mappers and reducers for a Hadoop job?
41) Hadoop attains parallelism by distributing the tasks across various nodes; it is possible for some slow nodes to rate-limit the rest of the program and slow it down. What method does Hadoop provide to combat this?
43) What are combiners and when are these used in a MapReduce job?
44) How does a DataNode know the location of the NameNode in Hadoop cluster?
45) How can you check whether the NameNode is working or not?
47) Are there any problems which can only be solved by MapReduce and cannot be solved by Pig? In which kinds of scenarios will MapReduce jobs be more useful than Pig?
Most Big Data vendors are working to find an ideal solution to this challenging problem, which has paved the way for the advent of a very popular and in-demand alternative named Apache Spark. Spark makes development a pleasurable activity and has a better-performing execution engine than MapReduce, while using the same storage layer, Hadoop HDFS, for processing huge data sets.
Apache Spark has gained great hype in the past few months and is now being regarded as
the most active project of Hadoop Ecosystem.
Before we get into a further discussion of what gives Apache Spark an edge over Hadoop MapReduce, let us get a brief understanding of what Apache Spark actually is and then move on to understanding the differences between the two.
Apache Spark is open source and available for free download, which makes it a user-friendly face of the distributed Big Data programming framework. Spark follows a general execution model that helps with in-memory computing and optimization of arbitrary operator graphs, so that querying data becomes much faster when compared to disk-based engines like MapReduce.
Apache Spark has a well-designed application programming interface that consists of various parallel collections with methods such as groupByKey, map and reduce, so that you get a feel as though you are programming locally. With Apache Spark you can write collection-oriented algorithms using the functional programming language Scala.
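For illustration, here is a minimal word-count sketch written against Spark's Java API rather than Scala. It assumes Spark 2.x submitted via spark-submit; the input and output paths are placeholders:

```java
import java.util.Arrays;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public class SparkWordCount {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("spark-word-count");
        JavaSparkContext sc = new JavaSparkContext(conf);

        JavaRDD<String> lines = sc.textFile("hdfs:///tmp/input");

        // Collection-oriented operations: split lines into words,
        // pair each word with 1, then sum the counts per word.
        JavaPairRDD<String, Integer> counts = lines
                .flatMap(line -> Arrays.asList(line.split("\\s+")).iterator())
                .mapToPair(word -> new Tuple2<>(word, 1))
                .reduceByKey((a, b) -> a + b);

        counts.saveAsTextFile("hdfs:///tmp/output");
        sc.stop();
    }
}
```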
Why was Apache Spark developed?
Hadoop MapReduce, envisioned at Google and successfully implemented in Apache Hadoop, is an extremely famous and widely used execution engine. You will find several applications that know how to decompose their work into a sequence of MapReduce jobs. All these real-time applications will have to continue their operation without any change.
However, users have consistently complained about the high latency problem with Hadoop MapReduce, stating that the batch-mode response for all these real-time applications is highly painful when it comes to processing and analyzing data.
This paved the way for Hadoop Spark, a successor system that is more powerful and flexible than Hadoop MapReduce. Even though it might not be possible for all existing or future applications to completely abandon Hadoop MapReduce, there is scope for most future applications to make use of a general-purpose execution engine such as Hadoop Spark, which comes with many more innovative features and can accomplish much more than is possible with Hadoop MapReduce.
Thus, Hadoop Spark is just the apt choice for the future big data applications that
possibly would require lower latency queries, iterative computation and real time processing
on similar data.
Hadoop Spark has lots of advantages over Hadoop MapReduce framework in terms of a
wide range of computing workloads it can deal with and the speed at which it executes the
batch processing jobs.
Difference between MapReduce and Spark
i) Hadoop vs Spark Performance
Hadoop Spark is said to execute batch processing jobs about 10 to 100 times faster than the Hadoop MapReduce framework, merely by cutting down on the number of reads and writes to disk.
In MapReduce there are Map and Reduce tasks, after which there is a synchronization barrier and the data needs to be persisted to disk. This feature of the MapReduce framework was developed with the intent that jobs can be recovered in case of failure, but the drawback is that it does not leverage the memory of the Hadoop cluster to the maximum.
With Hadoop Spark, however, the concept of RDDs (Resilient Distributed Datasets) lets you keep data in memory and persist it to disk only if required, and there are no synchronization barriers that could slow down the process. Thus, by making use of memory, Spark's general execution engine is much faster than Hadoop MapReduce.
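As a small illustration of this in-memory-first behavior, an RDD can be cached and allowed to spill to disk only when memory runs short. The sketch below uses Spark's Java API; the input path and the error-counting logic are placeholders:

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.storage.StorageLevel;

public class PersistExample {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("persist-example");
        JavaSparkContext sc = new JavaSparkContext(conf);

        JavaRDD<String> events = sc.textFile("hdfs:///tmp/events");

        // Keep partitions in memory; spill to disk only if they do not fit.
        events.persist(StorageLevel.MEMORY_AND_DISK());

        // The cached RDD can be reused across multiple actions
        // without re-reading the input from HDFS.
        long total = events.count();
        long errors = events.filter(line -> line.contains("ERROR")).count();
        System.out.println(total + " events, " + errors + " errors");

        sc.stop();
    }
}
```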
Most real-time applications use Hadoop MapReduce to generate reports that help answer historical queries, and then deploy an altogether different system for stream processing in order to get key metrics in real time. Organizations therefore have to manage and maintain separate systems and develop applications for both computational models.
With Hadoop Spark, however, all these complexities can be eliminated, as it is possible to implement both stream and batch processing on the same system, which simplifies the development, deployment and maintenance of the application. With Spark it is possible to combine different kinds of workloads, and if various workloads interact in the same process it is easier to manage and secure them, something that is a limitation with MapReduce.
With Spark Streaming it is possible to pass data through various software functions for
instance performing data analytics as and when it is collected.
Developers can now as well make use of Apache Spark for Graph processing which maps
the relationships in data amongst various entities such as people and objects. Organizations
can also make use of Apache Spark with predefined machine learning code libraries so that
machine learning can be performed on the data that is stored in various Hadoop clusters.
Hadoop MapReduce is meant for data that does not fit in memory, whereas Apache Spark performs better for data that fits in memory, particularly on dedicated clusters.
Apache Spark and Hadoop MapReduce are both fault tolerant, but comparatively Hadoop MapReduce is more fault tolerant than Spark.
Nevertheless, every coin has two faces, and Hadoop Spark too comes with some drawbacks, such as its inability to cope when the intermediate data is greater than the memory size of the node, problems in case of node failure and, most important of all, the cost factor.
Hadoop Spark makes use of journaling (also known as recomputation) to provide resiliency in case of a node failure, so we can conclude that the recovery behavior on node failure is similar to that of Hadoop MapReduce, except that the recovery process is much faster.
Spark also has a spill-to-disk feature: if a particular node has insufficient RAM for storing its data partitions, it degrades gracefully to disk-based data handling. When it comes to cost, with street RAM prices at about USD 5 per GB, we can have nearly 1 TB of RAM for around USD 5,000, making memory a fairly minor fraction of the overall node cost.
One great advantage of Hadoop MapReduce over Apache Spark is that if the data size is greater than memory, Apache Spark will not be able to leverage its cache, and it is quite likely to be far slower than the batch processing of MapReduce.
Confused between Hadoop and Spark: Which One to Choose?
If the question leaving you confused is whether to choose Hadoop MapReduce or Apache Spark, or rather disk-based computing or RAM-based computing, then the answer is straightforward: it all depends, and the variables on which this decision depends keep changing dynamically with time.
Nevertheless, the current trends are in favor of in-memory techniques like Apache Spark, as industry feedback for them seems positive. So, to conclude, we can state that the choice between Hadoop MapReduce and Apache Spark depends on the use case, and no single choice fits everyone.
In this piece of writing we provide users an insight into the novel Hadoop 2.0 (YARN) and help them understand the need to switch from Hadoop 1.0 to Hadoop 2.0.
The huge data giants on the web, such as Google, Yahoo and Facebook, who had adopted Apache Hadoop had to depend on the partnership of Hadoop HDFS with the resource management environment and MapReduce programming. These technologies collectively enabled users to manage processes and store huge amounts of semi-structured, structured or unstructured data within Hadoop clusters. Nevertheless, there were certain intrinsic drawbacks with the Hadoop MapReduce pairing. For instance, Google and other users of Apache Hadoop had various nagging issues with Hadoop 1.0, chiefly not being able to keep up with the flood of information they were collecting online, due to the batch processing arrangement of MapReduce.
Hadoop 2.0, popularly known as YARN (Yet Another Resource Negotiator), is the latest technology, introduced in October 2013, that is widely used nowadays for processing and managing distributed big data.
Hadoop YARN has a modified architecture unlike the intrinsic characteristics of Hadoop 1.0
so that the systems can scale up to new levels and responsibilities can be clearly assigned
to the various components in Hadoop HDFS.
Click here to know more about our IBM Certified Hadoop Developer course
Hadoop 1.0, or the so-called MRv1, mainly consists of 3 important components, namely:
The major difference with Hadoop 2.0 is that, in this next generation of Hadoop the cluster
resource management capabilities are moved into YARN.
YARN
YARN has taken over the cluster resource management responsibilities from MapReduce, so that MapReduce now just takes care of data processing and the other responsibilities are handled by YARN.
In Hadoop 2.0, the role of the Job Tracker in YARN mainly depends on 3 important components.
RM (Resource Manager) - It is responsible for negotiating the resources of the system amongst the competing applications.
NM (Node Manager)
By separating the cluster resource management function completely from the data processing function, YARN has been capable of giving organizations something that goes far beyond MapReduce. With comparatively less overloaded, sophisticated programming protocols, and being cost effective, companies would prefer to migrate their applications from Hadoop 1.0 to Hadoop 2.0. An edge that YARN provides to Hadoop users is that it is backward compatible (i.e. one can easily run an existing MapReduce job on Hadoop 2.0 without making any modifications), encouraging companies to migrate from Hadoop 1.0 to Hadoop 2.0 without giving it a second thought.
Despite the fact that most of the Hadoop applications have migrated from Hadoop 1.0 to
Hadoop 2.0 there are migrations that are still in progress and companies are consistently
striving hard to accomplish this long needed upgrade for their applications.
With Hadoop YARN, it is now easy for Hadoop developers to build applications directly on Hadoop, without having to bolt them on through outside third-party vendor tools, as was the case with Hadoop 1.0. This is another important reason why companies will establish Hadoop 2.0 as a platform for creating applications and manipulating data more effectively and efficiently.
YARN is the elephant-sized change that Hadoop 2.0 has brought in, and undoubtedly there are lots of challenges involved as companies migrate from Hadoop 1.0 to Hadoop 2.0; however, the basic changes to the MapReduce framework give Hadoop greater usability in the upcoming big data scenarios. With Hadoop 2.0 being more isolated and scalable than the earlier version, it is anticipated that several novel tools will soon emerge to get the most out of the new features in YARN (Hadoop 2.0).
Big Data in healthcare originates from large electronic health datasets; these datasets are very difficult to manage with conventional hardware and software. The use of legacy data management methods and tools also makes it impossible to usefully leverage all this data. Big Data in healthcare is an overpowering concept not just because of the volume of data but also due to the different data types and the pace at which healthcare data needs to be managed. The sum total of data related to patients and their well-being constitutes the Big Data problem in the healthcare industry. Big Data Analytics has become a growing and crucial concern in healthcare informatics as well.
Healthcare informatics also contributes to the development of Big Data analytic technology
by posing novel challenges in terms of data knowledge representation, database design,
data querying and clinical decision support.
Although most of the data in the healthcare sector is stored in printed form, the recent trend is moving towards rapid digitization of this data. Big Data in the healthcare industry
promises to support a diverse range of healthcare data management functions such as
population health management, clinical decision support and disease surveillance. The
Healthcare industry is still in the early stages of getting its feet wet in the large scale
integration and analysis of big data.
With 80% of the healthcare data being unstructured, it is a challenge for the healthcare
industry to make sense of all this data and leverage it effectively for Clinical operations,
Medical research, and Treatment courses.
The volume of Big Data in healthcare is anticipated to grow over the coming years, and the healthcare industry is expected to evolve with changing healthcare reimbursement models, thus posing critical challenges to the healthcare environment. Even though profit is not the sole motivator, it is extremely important for big data healthcare companies to make use of best-in-class techniques and tools that can leverage Big Data in healthcare effectively; otherwise, these big data healthcare companies might have to skate on thin ice when it comes to generating profitable revenue.
of doctors and assistants. All this was successfully achieved using Hadoop ecosystem
components - Hive, Flume, Sqoop, Spark, and Impala.
The analytics tool developed by Explorys is used for data mining so that it helps clinicians
determine the deviations among patients and the effects treatments have on their health.
These insights help the medical practitioners and health care providers find out the best
treatment plans for a set of patient populations or for an individual patient.
Did you like our top 5 healthcare data solutions of Big Data? If you work in the healthcare
industry or have an idea of any other healthcare data solutions that help big data healthcare
companies harness the power of Hadoop, please leave a comment below!
As per the web statistics report in 2014, there are about 3 billion people who are connected
to the world wide web and the amount of time that the internet users spend on the web is
somewhere close to 35 billion hours per month, which is increasing gradually.
With the availability of several mobile and web applications, it is pretty common to have
billions of users- who will generate a lot of unstructured data. There is a need for a database
technology that can render 24/7 support to store, process and analyze this data.
"It's important that you're not just going with a traditional database because that's what everyone else is using," said Evaldo de Oliveira, Business Development Director at FairCom. "Pay attention to what's going on in the NoSQL world because there are some problems that SQL cannot handle."
Relational Databases - The fundamental concept behind databases such as MySQL, Oracle Express Edition and MS-SQL, which use SQL, is that they are all Relational Database Management Systems that make use of relations (generally referred to as tables) for storing data.
In a relational database, the data is correlated with the help of some common characteristics that are present in the dataset, and the outcome of this is referred to as the schema of the RDBMS.
Relational databases are schema-oriented, i.e. the structure of the data should be known in advance, ensuring that the data adheres to the schema.
Examples of such predefined schema-based applications that use SQL:
NoSQL now leads the way for popular internet companies such as LinkedIn, Google, Amazon and Facebook to overcome the drawbacks of the 40-year-old RDBMS.
NoSQL generally scales horizontally and avoids major join operations on the data. NoSQL databases can be referred to as structured storage, of which relational databases are a subset.
NoSQL covers a multitude of databases, each having a different kind of data storage model. The most popular types are graph, key-value, columnar and document.
NoSQL vs SQL 4 Key Differences:
1. Nature of Data and Its Storage- Tables vs. Collections
The foremost criterion for choosing a database is the nature of data that your enterprise is
planning to control and leverage. If the enterprise plans to pull data similar to an accounting
excel spreadsheet, i.e. the basic tabular structured data, then the relational model of the
database would suffice to fulfill your business requirements but the current trends demand
for storing and processing unstructured and unpredictable information.
Web-centric businesses like Amazon, eBay, etc. were in need of a database like NoSQL, as opposed to SQL, that could best match up with their changing data model, rendering them greater levels of flexibility in operations.
RDBMS requires a higher degree of Normalization i.e. data needs to be broken down into
several small logical tables to avoid data redundancy and duplication. Normalization helps
manage data in an efficient way, but the complexity of spanning several related tables
involved with normalization hampers the performance of data processing in relational
databases using SQL.
On the other hand, in NoSQL Databases such as Couchbase, Cassandra, and MongoDB,
data is stored in the form of flat collections where this data is duplicated repeatedly and a
single piece of data is hardly ever partitioned off but rather it is stored in the form of an entity.
Hence, reading or writing operations to a single entity have become easier and faster.
NoSQL databases can also store and process data in real time, something that SQL is not capable of doing.
Under such circumstances, if you are using a relational database, i.e., SQL, you will have to
meticulously replicate and repartition the database so as to fulfill the increasing demand of
the customers.
"Most people who choose NoSQL as their primary data storage are trying to solve two main problems: scalability and simplifying the development process," said Danil Zburivsky, solutions architect at Pythian.
Generally, with an increase in demand, relational databases tend to scale up vertically, which means that they add extra horsepower to the system to enable faster operations on the same dataset. On the contrary, NoSQL databases like HBase, Couchbase and MongoDB scale horizontally with the addition of extra nodes (commodity database servers) to the resource pool, so that the load can be distributed easily.
NoSQL databases work on the concept of CAP priorities: at any time you can choose only 2 of the 3 priorities of the CAP theorem (Consistency, Availability, Partition tolerance), as it is highly difficult to attain all three in a changing distributed node system.
One can term NoSQL databases as BASE, the opposite of ACID, meaning:
BA = Basically Available - The system is guaranteed to remain available for querying even in the presence of partial failures.
S = Soft State - The state of the system can change at any time, even without executing any query, because node updates take place every now and then to fulfill the ever-changing requirements.
E = Eventually Consistent - NoSQL database systems will become consistent in the long run.
2)Big Data needs a flexible data model with a better database architecture
3)To process Big Data, these databases need continuous application availability with
modern transaction support.
NoSQL in Big Data Applications
1. HBase is used by the discovery engine StumbleUpon for data analytics and storage.
2. LinkedIn, Orbitz, and Concur use the Couchbase NoSQL database for various data processing and monitoring tasks.
The Database Landscape is flooded with increased data velocity, growing data variety, and
exploding data volumes and only NoSQL databases like HBase, Cassandra, Couchbase can
keep up with these requirements of Big Data applications.
However, the limitations of the Hadoop MapReduce pairing paved the way for Hadoop 2.0. For instance, Yahoo reported that Hadoop 1.x was not able to keep pace with the flood of information they were collecting online, due to MapReduce's batch processing format, and the NameNode's SPOF had always been a bothersome issue in case of failures.
Hadoop introduced YARN, which has the ability to process terabytes and petabytes of data present in HDFS with the use of various non-MapReduce applications, namely Giraph and MPI.
Hadoop 2.0 divides the responsibilities of the overloaded Job Tracker into 2 different components, i.e. the Application Master (per application) and the global Resource Manager.
NameNode SPOF problem limits the overall availability of the Hadoop Cluster in the
following ways:
The main motive of the Hadoop 2.0 High Availability project is to render availability to big data applications 24/7 by deploying 2 Hadoop NameNodes: one in active configuration and the other as a Standby Node in passive configuration.
Earlier there was one Hadoop NameNode for maintaining the tree hierarchy of the HDFS files and tracking the data storage in the cluster. Hadoop 2.0 High Availability allows users to configure Hadoop clusters with redundant NameNodes so as to eliminate the probability of a SPOF in a given Hadoop cluster. The Hadoop configuration capability allows users to build clusters horizontally with several NameNodes which can operate autonomously through a common data storage pool, thereby offering better computing scalability when compared to Hadoop 1.0.
With Hadoop 2.0, the Hadoop architecture is now configured in a manner that supports automated failover with complete stack resiliency and a hot Standby NameNode. Hadoop 2.0 is keyed up to identify any failures in the NameNode host and processes, so that it can automatically switch to the passive NameNode, i.e. the Standby Node, to ensure high availability of the HDFS services to Big Data applications. With the advent of Hadoop 2.0 HA, it's time for Hadoop administrators to take a breather, as this process does not require manual intervention.
With HDP 2.0 High Availability, the complete Hadoop stack, i.e. HBase, Pig, Hive, MapReduce and Oozie, is equipped to tackle the NameNode failure problem without having to lose job progress or any related data. Thus, any critical long-running jobs that are scheduled to complete at a specific time will not be affected by a NameNode failure.
Hadoop Users Expectations from Hadoop 2.0 High Availability
When Hadoop users were interviewed about the High Availability Requirements from
Hadoop 2.0 Architecture, some of the most common High Availability requirements that they
came up with are:
Tolerate Multiple Failures - Hadoop users stated that with Hadoop 2.0 High Availability, the Hadoop cluster must be able to withstand more than one failure simultaneously. Preferably, the Hadoop configuration must allow the administrator to configure the degree of tolerance, or let the user make a choice at the resource level on how many failures can be tolerated by the cluster.
Self-Recovery from a Failure - Hadoop users stated that with Hadoop 2.0 High Availability, the Hadoop cluster must heal automatically (self-healing), without any manual intervention, to restore itself to a highly available state after the failure, with the pre-assumption that sufficient physical resources are already available.
The limitation of SPOF with the HDFS architecture was overcome with the birth of
AvatarNode at Facebook. Wondering why it has such an unusual name?
Dhruba Borthakur, a famed HDFS developer at Facebook, named it after the James Cameron movie Avatar, released in 2009, around the time of AvatarNode's birth.
AvatarNode has been contributed by Facebook as open source software to the Hadoop community, to offer a highly available NameNode with hot failover and failback. AvatarNode is a two-node cluster that provides a highly available NameNode to Big Data applications with manual failover.
AvatarNode is now the heartthrob inside Facebook as it is a huge WIN to the NameNode
SPOF problem. AvatarNode is running heavy production workloads and contributes to
improved administration and reliability of Hadoop Clusters at Facebook.
Hadoop 2.2 is now supported on Windows, which is drawing interest from organizations dedicated to Microsoft platforms. There is no doubt that there will be growing pains as organizations migrate to the latest release of Hadoop; however, the basic changes to the MapReduce framework will add value for Hadoop in Big Data set-ups. Hadoop 2.0 is just a messenger of the growing technology and a revitalized concept for building and implementing Big Data applications. There is anticipation for various tools that will make the most of Hadoop 2.0 High Availability, the new HDFS architecture and the features supported by YARN.
As is the case in most other industries, Apache Hadoop has come to the rescue of the Telecom sector as well, providing real-time monitoring and Big Data solutions for Telecom data analytics.
Big telecom companies have a number of verticals such as Marketing, Product, Sales,
Human Resources, Information Technology, Research and Development, etc. that are in
constant need of information. Using Hadoop, the existing databases can be suitably mined
and information can be extracted which would eventually be fed to the respective verticals
for decision making.
For example, the Telecom data consumption trend of the past few years can be extrapolated to determine the expected bandwidth usage by consumers in the future, and appropriate products can be pitched to them. Also, the technology division can scale up its network infrastructure well in advance to meet the projected demand.
Click here to know more about our IBM Certified Hadoop Developer course
Modern data architecture using Hadoop
in an efficient, fast and secure manner. It prevents unnecessary clogging of bandwidth which
can be efficiently used for other more important network oriented operations.
Hortonworks puts Hadoop as one of the best placed tools to look into Telecom data
analytics. It states that of the present solutions capable of handling Telecom data effectively,
Hadoop is the best suited for delivering a modern data architecture. It allows Telcos to store
new types of data, retain it for longer periods, join different data sets together and derive
new information from the resultant combination which can be valuable for the business
users.
Hortonworks lists how modern data architectures using Hadoop can provide Big Data solutions to the telecommunication industry in its attempt to gain competitive advantage:
Handling Call Data Records (CDRs)
One of the biggest challenges the telecommunication industry faces is to have an
infrastructure in place for analyzing CDRs which can be ably addressed using Hadoop.
Telecommunications companies carry out a lot of forensics on their data for monitoring the
quality of service. It involves using Hadoop to perform dropped call analysis, monitor and
report poor sound quality, root cause analysis and pattern recognition. Considering the fact
that millions of records flow into the Big data Telecom databases every second, there is a
need to perform real time, accurate analysis and using Hadoop provides exactly that, with
the help of Apache Flume (capable of ingesting millions of CDRs into Hadoop database per
second) and Apache Storm (capable of processing data in real time and identifying irregular,
potentially troublesome patterns). These can be combined to improve the overall customer
satisfaction levels.
Proactive Servicing of Telecom data Equipment
Big telecom companies stay ahead of the market and invest in huge Telecom data
infrastructure well in advance in order to gain competitive advantage and be ready to service
customers as soon as there is a demand for their service. This calls for regular performance
monitoring of the Telecom data equipment such as conductors, signal boosters, antennas, etc.
Using Hadoop enables telecom companies to analyze big data produced by
telecommunications systems through performance indicators (voltage and current levels, up
time, down time, efficiency, etc.). Using Hadoop makes this real time analysis easy to
perform and store.
Pitching new products
Product innovation is one of the key factors for any big telecom company to attain and
maintain its competitive advantage. It is important to analyze the usage history and forecast
the next generation of products which the customers are likely to expect and be ready with
them as soon as the demand arises. This requires complex analysis of terabytes of data, sorted according to customer demographics, geography, profession and a number of other factors.
Using Apache Hadoop has made Telecom data analytics possible in a secure, reliable and
efficient manner.
Network Analytics
Today's telecom customer has a number of service providers to choose from, and switching costs have come down drastically, which means the telecom companies, especially the Telecom giants, need to keep a careful watch on the performance of their network. The network bottlenecks have to be identified and resolved within a matter of minutes for a company to retain its customer base and attract new customers.
Using Hadoop gives them the ability to dig through Petabytes of data and extract
meaningful information in a matter of seconds.
Companies such as Telefonica, China Mobile and Verizon crunch their big data through
Hadoop to grow and maintain their services. "The ability to identify, gather, store and share
large amounts of data is one of the true benefits of cloud computing," said Verizon CMO
John Harrobin. "Providing those capabilities on top of our high-performing, secure and
scalable architecture strengthens the Verizon Cloud offering for our clients."
So, if you relate to the Telecom sector, in any capacity, learning Hadoop will always be a
huge advantage, especially in the current scenario dominated by the boom of the Telecom
data analytics in the telecommunication industry.
India and South Africa, despite their best efforts, were knocked out of the running in the semis. South Africa have been branded as chokers, as they tend to miss out on the most important matches even though they are a top team in the ICC Cricket Rankings.
So how do we know this? How do experts sit and analyze who the top teams are? Which batsman has the better batting average? Which bowler has the best economy rate?
All of the above questions are solved with the help of Big Data Analytics in Cricket, or more
appropriately termed Criclytics.
Cricket is a game in which 10 full-member countries and 96 part-time member countries participate, and the ICC Cricket World Cup is the most awaited event for these participants. Cricket has been played as a professional sport for the past 160 years, and cricket data is generated every day of the year. With ball-by-ball information on 531,253 cricket players in close to 540,290 cricket matches at 11,960 cricket grounds across the world, this Big Data is extremely voluminous. There are definitely tons of permutations and combinations of cricket data to effectively predict the accuracy of the next ball that could be bowled.
Can this data be accrued and analyzed to make accurate predictions for a game that is
known to be unpredictable?
The answer is straightforward. Using Hadoop and other big data technologies, cricket data can be analyzed with the finest precision.
Scope of Big Data in Cricket Predictions for World Cup 2015
Regardless of the kind of sport, the more insight you have and the more you know, the better your chances of enjoying its success, whether you are a broadcaster, a player or a die-hard fan.
It is not difficult to see how much big data analysis is used in post-match reviews and in planning the game strategy for the next match. For instance, with real-time data analysis, experts were able to predict that the teams going forward to the semifinals would be India, Australia, South Africa and New Zealand. This prediction was based on each team's past performances in the year. Experts had also hinted that since India had not won any of their matches against Australia in the past year, it was not likely that they would win in the semis. But we had hoped that the unpredictability of cricket would come through. Unfortunately for India, data won again.
Big Data and Hadoop technology have applications in various industries, and they are now on the verge of targeting sports. Big Data in sports can make a significant difference in preventing injuries, scoring touchdowns and signing contracts, from coaches to players.
Stephen Benjamin, IT Director at Cadence Design Systems Private Limited, says, "To an extent data analytics can help in creating a level playing field as this insightful information is available to all, but the players still require ability and talent to execute their plans on the ground."
Big Data Analytics providers have come up with progressively more refined methods for monitoring and capturing the exponentially growing volumes of cricket data. Wearable computers, CCTV cameras and sensors keep track of each and every aspect of cricket players' performance, training levels, calorie intake, interaction with fans and much more, in the pursuit of improved performance on the pitch.
Big Data in sports falls into 2 main categories of analysis. The first category is analytics done to provide fun statistics that entertain audiences and viewers, and the second is performance-focused analytics that helps teams plan better or improve a player's performance.
The former kind of predictions are exposed to the viewers such as the weather conditions,
swing, average speed of the bowler and many more. Some of these cricket predictions are
just meant to enliven the viewers for example - 'Whenever Virat Kohli scores a century, India
wins' or 'New Zealand have never played a World Cup final before and Australia is chasing
the Cricket World Champion title for the 5th time.'
If we take into consideration data related to a batsman, it will include the number of balls faced, the number of sixes scored, the number of fours scored, overall runs scored, strike rate and so on. Similarly, if we account for the data related to a bowler, it will include the number of wickets taken, the bowling average, the number of runs given to opposing batsmen, the number of balls bowled and so on.
This is just the statistical data we are talking about; what about the video data, which shows how a batsman has responded to a particular ball, how a ball swung in the preliminary phase of the match, and so on? Thus, with the help of Hadoop and other big data technologies, there is a massive opportunity to analyze these statistics and make the best cricket predictions for World Cup Cricket 2015, helping take precise decisions on and off the cricket pitch.
Amitava Ghosh, ex-CTO at TaxiForSure, says, "While raw talent can never be replaced, data science and statistical analysis is going to help in putting together a sharper strategy for the game, just like it helps in business."
How is cricket data used for analytics in World Cup Cricket 2015?
The main goal of ICC is to provide real time and interesting story telling statistics to the
viewers through Big Data Analytics. ICC is using 40 years of historical world cup data to give
out the best cricket predictions and enhance the experience of the viewers at World Cup
Cricket 2015.
Predictive analytics is the secret to achieving this goal of ICC as it foretells specific
outcomes for events which take place in a match by looking at the previous trends in data
and considering various other variables.
With the Insights tool, Big Data Analytics can now tell you the odds of the popular batsman Virat Kohli playing in a particular zone against a left-arm bowler. Isn't that amazing?
The big data functions plot approximately 25 different variables for every ball bowled in a match during World Cup Cricket 2015, which are passed through an extensive analytics process complemented by historical data and the local cricketing intelligence of the ESPNcricinfo team.
Mr. Ramesh Kumar, the Head of ESPN Digital Media India and ESPNcricinfo, says, "Unlike in cricket, big data analytics have become well-defined tools in other sports such as baseball and basketball. We map every cricket ball in multiple dimensions, be it in any time zone. Various coaches and players have been around us to get this data. So, this tool will be available for free only up to a certain layer."
2) ScoreWithData - IBM Game Changer
Just 7 hours before the 1st quarter-final of ICC World Cup Cricket 2015 began, ScoreWithData predicted that the South African cricketer Imran Tahir was going to be ranked as the Power Bowler, and this prediction through Big Data Analytics came true: the South African team won the knockout match against Sri Lanka because of Tahir's outstanding performance, making it an unforgettable experience for all the cricket lovers who take pleasure in watching World Cup Cricket.
This best cricket prediction is owed to IBM's analytics innovation ScoreWithData, which consists of Twitteratti Speaks, the IBM Social Sentiment Index and the Wisden Impact Index, and which provides insights to cricket fans. IBM uses social media, in particular Twitter, for spreading the word about its creative and interesting predictions.
Twitteratti Speaks uses a data analytics engine and social media data to identify and keep track of all the highlight moments of every cricket match. The highlights of a match are identified from the prevailing reactions of the cricket fans by scanning and searching through billions of Twitter feeds, and finally the results are projected visually.
The IBM Social Sentiment Index uses real-time data streams for generating the best cricket predictions on the teams, such as which side has a greater probability of winning the match and who the most talked-about players are.
This paved the way for a novel kind of CRM: the cloud-based CRM. In any cloud-based CRM, applications are hosted by the vendor, and organizations can gain access to data through the web without having to worry about any technical aspects of managing it.
Cloud CRM has become popular over time, upsetting the on-premise CRM model, because there is no software licensing fee involved and the organization does not need dedicated IT staff or infrastructure. This has in turn reduced the cost of monthly services and eased the setup process.
vision to reinvent the Cloud CRM model and now, Salesforce defines the new era of cloud
computing.
What does Salesforce do?
Salesforce is cloud based CRM software, developed to make organizations function
efficiently and profitably by reducing the cost of managing hardware infrastructure.
Salesforce offers a wide range of features in all the functional areas of a company:
that we want to be the platform for all business apps. When a business starts its day, they
open up their apps on us and all day, all their apps are running on our foundation.
5 REASONS WHY SALESFORCE IS THE BEST CRM
1.
Yvonne Genovese, Vice President at Gartner's Marketing Leaders Research, said, "Marketing will be the largest growing Salesforce CRM category through 2017." The International Data Corporation (IDC) expects the overall market for marketing automation to grow from $3.2 billion in 2010 to $4.8 billion in 2015.
According to Gartner research reports, 94% of the Salesforce CRM revenue is generated from support and subscription fees, whilst only 6% of the revenue comes from
The company's CEO Marc Benioff stated that Salesforce.com has recorded constant currency and deferred revenue growth of 30% or more year-over-year. The company expects revenue to rise to between $6.45 billion and $6.50 billion in 2016.
In June 2013, ExactTarget was acquired for $2.5 billion which helped enhance the Marketing
Cloud through email campaign management and marketing automation. ExactTarget
acquisition has also resulted in Salesforce owning Pardot, an application for marketing
automation that primarily works in the area of online marketing campaigns. Pardot helps
boost sales by creating and deploying marketing campaigns. Pardot has a vital significance
in increased revenue and efficiency of Salesforce CRM Software.
The most recent strategic acquisition has been that of RelateIQ for $390 million which helps
eliminate manual data entry by automated tracking of relationships in the CRM space. This
will certainly be a critical value addition offering in Salesforce Marketing Cloud.
Salesforce acquired Heroku in 2010 to provide its customers with PaaS (Platform as a Service) and support for various programming languages. Users can customize their applications with developer tools like the AppExchange and Database.com. With diverse offerings and a wide product portfolio, Salesforce is inventing the future while other competing CRM software applications like Siebel are just trying to get into it.
5. Dawn of Salesforce1
Gartner has stated that by 2015, an overwhelming 60% of Internet users will have a
preference for mobile customer service applications, with various devices and applications
being available on a single platform.
To keep pace with the increasing demand and growing trend, Salesforce launched Salesforce1 in October 2013, an innovative CRM platform for software vendors, developers and customers for connecting applications and third-party services such as Dropbox, Evernote and LinkedIn. Instant and customized customer service, on-screen guided support and live video support are just some of the remarkable features of Salesforce1 which contribute to its dominance in the CRM software space.
Salesforce1 has seen significant growth in active mobile application users, a whopping 96%, with a 46% increase in active users of customized mobile applications. Thus, Salesforce1 has been successful in leveraging the growth in the Customer Relationship Management software market by meeting the increasing demand from mobile device users and service providers.
the two products together merely eliminates the choice of the customer and subtly influences
the purchase behavior.
A survey conducted in June 2013 by Gartner predicts that Big Data spending in retail analytics will cross the $232 billion mark by 2016. Close to two-thirds of the 720 business and IT leaders Gartner surveyed said they have already invested in Big Data or expect to invest in Big Data by the end of June 2015. The Gartner survey also projects that 73% of retail organizations plan to invest in retail analytics two years down the line. The retail Big Data analytics market is anticipated to grow from $1.8 billion in 2014 to $4.5 billion in 2019.
Retail Analytics truly started with Target, which figured out quite early on that data
analytics can take the consumer buying experience to a whole other level.
When Target statistician Andrew Pole built a data mining algorithm which ran test after test
analyzing the data, useful patterns emerged which showed that consumers as a whole
exhibit similar purchase behaviors. It got a little out of hand when Target accurately
predicted that a teen girl was pregnant (even before her family knew) and sent her a
customized product catalogue to ease her buying needs.
Today, the breed of consumers has changed significantly. Big Data related to consumer
purchase behavior is volatile and voluminous, and the veracity of this big data has the big
retail companies scrambling to innovate new data mining techniques and resorting to
cheaper open source software, like Hadoop, to store and analyze data accurately in real
time. Big retail corporations such as Tesco ($130 billion), Walmart ($473 billion) and Target
($73 billion) have all experienced a significant decline in their revenues due to the fickle and
unpredictable shopping habits of the new generation of customers.
The new millennial would rather look at a smartphone for price comparison and
immediately place the order without even walking down to the store. Companies like
Walmart and Tesco have realized that the only way to bring about a change in data analytics
is to partner with startup organizations that have in-built platforms to comprehend and hold
on to consumers through intense data analysis.
Click here to know more about our IBM Certified Hadoop Developer course
Need for Retail Big Data Analytics
The supermarket chain TESCO has 600 million records of retail data growing at a rapid pace
of a million records every week, with 5 years of sales history across 350 stores. It would be
practically impossible to analyze this amount of data at once with the help of legacy
systems. There would be a significant amount of data loss, as the processing speed of
legacy systems is limited.
With the increase in retail channels and the growing use of social media, consumers are able
to compare services, products and prices regardless of whether they shop online or in retail
stores. With access to a pool of information, consumers interact with retail channels via
social media platforms, empowering themselves to influence other customers to shift from
one brand to another through online reviews, comments and tweets.
1) Retail Analytics in Fraud Detection
The recurrence of data breaches has rocketed to such a high point that every week there is
one mega retailer hit by fraud. Fraud detection is a serious concern aimed at avoiding losses
and maintaining customers' trust. The most commonly observed frauds include fraudulent
returns of purchased products and stolen credit or debit card information.
Giant retailer Amazon has an intensive program to detect and prevent credit card fraud,
which has led to a 50% reduction in fraud within the first 6 months. Amazon developed fraud
detection tools that use a scoring approach in predictive analysis. This retail analytics
depends on huge datasets that contain not just the financial information of the transactions
but also browser information, the IP address of the users and any other related technical
data that might help Amazon refine their analytic models to detect and prevent fraudulent
activities.
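Amazon's actual scoring models are proprietary; purely as a rough illustration of the scoring approach described above, the hypothetical Java sketch below combines a few transaction signals (all signal names, weights and the threshold are invented) into a single risk score and flags transactions above a cut-off for review.

import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of a scoring approach to fraud detection.
// Signal names, weights and the threshold are invented for this sketch.
public class FraudScoringSketch {

    // Assign a weight to each suspicious signal observed in a transaction.
    static double riskScore(Map<String, Boolean> signals) {
        Map<String, Double> weights = new HashMap<>();
        weights.put("ipCountryMismatch", 0.4);   // IP country differs from billing country
        weights.put("newDevice", 0.2);           // browser/device never seen for this account
        weights.put("unusuallyLargeOrder", 0.3); // order value far above the account's average
        weights.put("manyRecentReturns", 0.3);   // history of fraudulent-looking returns

        double score = 0.0;
        for (Map.Entry<String, Boolean> e : signals.entrySet()) {
            if (e.getValue()) {
                score += weights.getOrDefault(e.getKey(), 0.0);
            }
        }
        return score;
    }

    public static void main(String[] args) {
        Map<String, Boolean> txn = new HashMap<>();
        txn.put("ipCountryMismatch", true);
        txn.put("newDevice", true);
        txn.put("unusuallyLargeOrder", false);
        txn.put("manyRecentReturns", false);

        double score = riskScore(txn);
        // Transactions above the threshold would be held for manual review.
        System.out.println("Risk score: " + score + (score > 0.5 ? " -> review" : " -> approve"));
    }
}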
2) Retail Analytics in Localization and Personalization of Customer-Driven
Promotions
Retailers personalize a wide range of elements that include store formats, promotion
strategies, product pricing and staffing. Personalization may depend on various
factors such as demographics, location-specific attributes (proximity to certain other
businesses) and the purchase behavior of the customer.
eCommerce companies like Amazon and eBay have mastered the art of personalized
service. Retailers are trying to do the same by providing customized messages, shopping
offers, and seasonal freebies. With big data technologies like Hadoop, a personalized
experience translates into better customer service and, ultimately, happier customers.
Localization and personalization techniques require various analytical approaches to be
implemented, including behavioral targeting, price optimization and store site selection
analytics. If the goal is to localize offerings across store clusters, then a clustering technique
needs to be used. Localization in the retail sector is not usually geographically oriented;
rather, retailers can target pricing, offers and other product assortments depending on the
behavior of the customer to provide a personalized shopping experience.
Amazon pioneered the personalization strategy by using product-based collaborative
filtering for retail analytics. Amazon provides data-driven recommendations to customers
based on previous purchase history, browser cookies, and wish lists.
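Amazon's production recommendation engine is far more sophisticated, but the core idea of product-based (item-to-item) collaborative filtering can be sketched in a few lines of Java: count how often two products appear together in purchase histories and recommend the most frequent co-purchases. All product names and purchase data here are hypothetical.

import java.util.*;

// Minimal item-to-item co-purchase sketch (hypothetical data, not Amazon's algorithm).
public class CoPurchaseSketch {
    public static void main(String[] args) {
        // Each inner list is one customer's purchase history.
        List<List<String>> orders = Arrays.asList(
            Arrays.asList("guitar", "guitar-strings", "tuner"),
            Arrays.asList("guitar", "guitar-strings"),
            Arrays.asList("keyboard", "headphones"),
            Arrays.asList("guitar", "tuner")
        );

        String target = "guitar";
        Map<String, Integer> coCounts = new HashMap<>();
        for (List<String> order : orders) {
            if (!order.contains(target)) continue;
            for (String item : order) {
                if (!item.equals(target)) {
                    coCounts.merge(item, 1, Integer::sum);
                }
            }
        }

        // Rank items by how often they were bought together with the target product.
        coCounts.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
                .forEach(e -> System.out.println(e.getKey() + " co-purchased " + e.getValue() + " times"));
    }
}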
3) Retail Analytics in Supply Chain Management
In 2012, Cognizant's Sethuraman M.S. wrote in the paper "Big Data's Impact on
the Data Supply Chain": "The world's leading commercial information providers
deal with more than 200 million business records, refreshing them more than 1.5
million times a day to provide accurate information to a host of businesses and
consumers. They source data from various organizations in over 250 countries,
100 languages and cover around 200 currencies. Their databases are updated
every four to five seconds."
Big Data has the potential to transform processes across various industries, and this tech
trend could be the way to increase efficiency in retail supply chain management. Supply
chain management is significant to retailers in the long run. Retailers make every effort
to create an optimized, flexible, global and event-driven supply chain model to increase
efficiencies and enhance relationships with supply chain stakeholders.
Supply Chain Management is inefficient without Retail Big Data Analytics because it
would be very difficult to track individual packages in real time and gather useful information
related to shipments. Retail Analytics in Supply chain involves optimizing inventory,
replenishment, and shipment costs.
The Metro Group retailer uses retail analytics to detect the movement of goods inside its
stores and display relevant information to store personnel and customers. For example, if a
consumer takes an item into the trial room, the product recommendation system
recommends other related products while the customer is trying on the apparel. The store
personnel can inform customers whether the products are in stock or not. The retail analytics
system of Metro Group also keeps track of movement patterns on and off the shelf for
customer analytics at a later point in time. Retail analytics also alerts managers at Metro
Group about product abnormalities by identifying unusual patterns, for example a product
that is taken off the shelf several times but never purchased.
4) Retail Analytics in Dynamic Pricing
100% price transparency is a pre-requisite, with customers increasingly comparing online
and showroom prices. There is a need to build a dynamic pricing platform with retail
analytics that can power millions of pricing decisions amongst the biggest retailers.
Dynamic pricing in retail analytics can be implemented in 2 ways-
1) Internal Profitability Intelligence - Every online transaction is tracked at unit-level
profitability by taking into consideration various variable costs such as vendor funding,
COGS (Cost of Goods Sold) and shipping charges.
2) External Competitor Intelligence - For a given set of retailer products, retail analytics
provides real-time intelligence about those products on competitors' websites, with
corresponding prices.
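As a rough sketch of how these two inputs could be combined (all figures, field names and the margin floor below are made up), the Java snippet computes unit-level profitability from variable costs and then nudges the price down only when a competitor is listing the same item cheaper and the margin floor still holds.

// Hypothetical dynamic-pricing sketch combining internal profitability
// and external competitor intelligence; all numbers are illustrative.
public class DynamicPricingSketch {

    // Internal profitability intelligence: unit margin after variable costs.
    static double unitProfit(double price, double cogs, double shipping, double vendorFunding) {
        return price - cogs - shipping + vendorFunding;
    }

    // External competitor intelligence: undercut slightly, but only if still profitable.
    static double reprice(double currentPrice, double competitorPrice,
                          double cogs, double shipping, double vendorFunding, double minMargin) {
        double candidate = Math.min(currentPrice, competitorPrice - 0.01);
        if (unitProfit(candidate, cogs, shipping, vendorFunding) >= minMargin) {
            return candidate;   // match/undercut the competitor
        }
        return currentPrice;    // keep the price; undercutting would break the margin floor
    }

    public static void main(String[] args) {
        double newPrice = reprice(24.99, 22.49, 15.00, 3.00, 1.50, 2.00);
        System.out.println("Suggested price: " + newPrice);
    }
}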
Amazon's analytical platform has a great advantage in dynamic pricing as it responds to the
competitive market rapidly by changing the prices of its products every 2 minutes (if
required), whilst other retailers change the prices of their products every 3 months.
Marcus Collins, a Research Analyst at Gartner, said "Big data analytics and the Apache
Hadoop open source project are rapidly emerging as the preferred Big Data
solutions to address business and technology trends that are disrupting
traditional data management and processing."
Allied Market Research predicts that the Hadoop-as-a-Service market will grow to $50.2
billion by 2020. The global Hadoop market is anticipated to reach $8.74 billion by 2016,
growing at a CAGR of 55.63% from 2012 to 2016.
Wikibon's latest market analysis states that spending on Hadoop software and
subscriptions accounted for less than 1% of the $27.4 billion in overall Big Data spending in
2014, or approximately $187 million. Wikibon predicts that spending on Hadoop software
and subscriptions will increase to approximately $677 million by the end of 2017, with the
overall big data market anticipated to reach the $50 billion mark.
Big Data and Hadoop are on the verge of revolutionizing enterprise data management
architectures. Cloud and enterprise vendors are competing to stake a claim in the big data
gold-rush market, with pure plays from several top Hadoop vendors. Apache Hadoop is an
open source big data technology with HDFS, Hadoop Common, Hadoop MapReduce and
Hadoop YARN as the core components. However, without the packaged solutions and
support of commercial Hadoop vendors, Hadoop distributions could just go unnoticed.
Need for Commercial Hadoop Vendors
Today, Hadoop is an open-source, catch-all technology solution with incredible scalability,
low cost storage and fast-paced big data analytics at economical server costs.
Hadoop vendor distributions overcome the drawbacks and issues with the open source
edition of Hadoop. These distributions add functionalities focused on enterprise needs.
1) Amazon Web Services Elastic MapReduce (EMR) Hadoop Distribution
AWS EMR handles important big data use cases like web indexing, scientific simulation, log
analysis, bioinformatics, machine learning, financial analysis and data warehousing. AWS
EMR is the best choice for organizations that do not want to manage thousands of servers
directly, as they can rent Amazon's cloud-ready infrastructure for big data analysis.
DynamoDB is another major NoSQL database offering from AWS, originally deployed to run
Amazon's giant consumer website. Redshift is a completely managed petabyte-scale data
analytics solution that is cost effective for big data analysis with BI tools, with costs as low as
$1000 per terabyte annually. According to Forrester, Amazon is the "King of the Cloud" for
companies in need of public cloud hosted Hadoop platforms for big data management
services.
2) Hortonworks Hadoop Distribution
Hortonworks features in Red Herring's list of Top 100 winners.
Hortonworks is a pure-play Hadoop company that drives open source Hadoop distributions
in the IT market. The main goal of Hortonworks is to drive all its innovations through the
open Hadoop data platform and build an ecosystem of partners that speeds up Hadoop
adoption amongst enterprises.
Principal Analyst of Forrester, Mike Gualtieri said "Where the open source community
isn't moving fast enough, Hortonworks will start new projects and commit
Hortonworks resources to get them off the ground."
Apache Ambari is an example of a Hadoop cluster management console developed by the
Hortonworks Hadoop vendor for provisioning, managing and monitoring Hadoop clusters.
The Hortonworks Hadoop vendor is reported to attract 60 new customers every quarter, with
some giant accounts like Samsung, Spotify, Bloomberg and eBay. Hortonworks has
garnered strong engineering partnerships with RedHat, Microsoft, SAP and Teradata.
Hortonworks has grown its revenue at a rapid pace. The revenue generated by Hortonworks
totaled $33.38 million in the first nine months of 2013, a significant increase of 109.5% over
the previous year. However, the professional services revenue generated by the
Hortonworks Hadoop vendor has increased at a faster pace than its support and
subscription services revenue.
Click here to know more about our IBM Certified Hadoop Developer course
3) Cloudera Hadoop Distribution
The Cloudera Hadoop vendor ranks top in the big data vendors list for making Hadoop a
reliable platform for business use since 2008. Cloudera, founded by a group of engineers
from Yahoo, Google and Facebook, is focused on providing enterprise-ready Hadoop
solutions with additional customer support and training. The Cloudera Hadoop vendor has
close to 350 paying customers including the U.S. Army, Allstate and Monsanto. Some of
them boast of deploying 1000 nodes on a Hadoop cluster to crunch big data analytics for
one petabyte of data. Cloudera owes its long-term success to corporate partners - Oracle,
IBM, HP, NetApp and MongoDB - that have been consistently pushing its services.
The Cloudera Hadoop vendor is on the right path towards its goal with 53% of the Hadoop
market, compared to 11% possessed by MapR and 16% by Hortonworks. Forrester says
"Cloudera's approach to innovation is to be loyal to core Hadoop but to innovate quickly and
aggressively to meet customer demands and differentiate its solution from those of other
commercial Hadoop vendors."
4) MapR Hadoop Distribution
MapR has made considerable investments to get over the obstacles to worldwide adoption
of Hadoop, which include enterprise-grade reliability, data protection, integrating Hadoop into
existing environments with ease, and infrastructure to render support for real-time
operations. In 2015, MapR plans to make further investments to maintain its significance in
the Big Data vendors list. Apart from this, MapR is all set to announce its technical
innovations for Hadoop with the intent of supporting business-as-it-happens - to increase
revenue, mitigate risks and reduce costs.
The image below compares the top 3 Hadoop vendors and can play a deciding role in
making a better choice.
5) Microsoft Hadoop Distribution
Microsoft is an IT organization not known for embracing open source software solutions, but
it has made efforts to run this open data platform software on Windows. The Hadoop-as-a-
service offering in Microsoft's big data solution is best leveraged through its public cloud
product - Windows Azure's HDInsight, particularly developed to run on Azure. There is
another production-ready feature from Microsoft named Polybase that lets users search for
information available on SQL Server during the execution of Hadoop queries. Microsoft has
great significance in delivering a growing Hadoop stack to its customers.
According to analyst Mike Gualtieri at Forrester: "Hadoop's momentum is unstoppable
as its open source roots grow wildly into enterprises. Its refreshingly unique
approach to data management is transforming how companies store, process,
analyze, and share big data."
Commercial Hadoop vendors continue to mature over time with increased worldwide
adoption of Big Data technologies and growing vendor revenue. There are several top
Hadoop vendors, namely Hortonworks, Cloudera, Microsoft and IBM. These Hadoop vendors
face tough competition in the open data platform market. With the war heating up amongst
big data vendors, nobody is sure who will top the list of commercial Hadoop vendors.
With the Hadoop buying cycle on the upswing, Hadoop vendors must capture market share
at a rapid pace to keep their venture investors happy.
Most organizations today use cloud computing services either directly or indirectly. For
example, when we use the services of Amazon or Google, we are directly storing data in the
cloud. Using Twitter is an example of indirectly using cloud computing services, as Twitter
stores all our tweets in the cloud. Distributed and cloud computing have emerged as novel
computing technologies because there was a need for better networking of computers to
process data faster.
Centralized computing systems, for example IBM Mainframes, have been around in
technological computation for decades. In centralized computing, one central computer
controls all the peripherals and performs complex computations. However, centralized
computing systems were inefficient and costly when processing huge volumes of
transactional data and supporting tons of concurrent online users. This paved the way for
cloud and distributed computing to exploit parallel processing technology commercially.
ATM networks are a familiar everyday example of a distributed computing system.
Click here to know more about our IBM Certified Hadoop Developer course
Cloud computing
In a world of intense competition, users will simply drop you if the application freezes or
slows down. Thus, downtime has to be very close to zero. For users, regardless of whether
they are in California, Japan, New York or England, the application has to be up 24/7, 365
days a year. Mainframes cannot scale up to meet the mission-critical business requirements
of processing huge structured and unstructured datasets. This paved the way for cloud
distributed computing technology, which enables business processes to perform critical
functionalities on large datasets.
Facebook has close to 757 million daily active users, with 2 million photos viewed every
second, more than 3 billion photos uploaded every month, and more than one million
websites using Facebook Connect with 50 million operations every second. Distributed
computing systems alone cannot provide such high availability, resistance to failure and
scalability. Thus, cloud computing - or rather cloud distributed computing - is the need of the
hour to meet these computing challenges.
YouTube is the best example of cloud storage, hosting millions of user-uploaded
video files.
Picasa and Flickr host millions of digital photographs, allowing their users
to create photo albums online by uploading pictures to their services' servers.
Google Docs is another good example of cloud computing that allows users
to upload presentations, word documents and spreadsheets to its data servers.
Google Docs allows users to edit files and publish their documents for other users
to read or edit.
Benefits of Cloud computing
1) A research study found that 42% of working millennials would compromise on the
salary component if they could telecommute, and they would be happy working at a 6% pay
cut on average. Cloud computing globalizes your workforce at an economical cost, as
people across the globe can access your cloud with just an internet connection.
2) A study found that 73% of knowledge workers work in partnership with each other across
varying locations and time zones. If an organization does not use cloud computing, then the
workers have to share files via email, and one single file ends up with multiple names and
formats. With cloud computing services, companies can provide better document control to
their knowledge workers by placing the file in one central location, so that everybody works
on that single central copy of the file with increased efficiency.
Frost & Sullivan conducted a survey and found that companies using cloud computing
services for increased collaboration are generating a 400% ROI. Ryan Park, Operations
Engineer at Pinterest, said "The cloud has enabled us to be more efficient, to try out
new experiments at a very low cost, and enabled us to grow the site very
dramatically while maintaining a very small team."
Cloud Computing vs. Distributed Computing
1) Goals
The goal of Distributed Computing is to provide collaborative resource sharing by connecting
users and resources. Distributed Computing strives to provide administrative scalability
(number of domains in administration), size scalability (number of processes and users), and
geographical scalability (maximum distance between the nodes in the distributed system).
Cloud computing is all about delivering services or applications in an on-demand
environment, with targeted goals of achieving increased scalability, transparency, security,
monitoring and management. In cloud computing systems, services are delivered with
transparency, without regard to the physical implementation within the cloud.
2) Types
Distributed Computing is classified into three types-
visibility into the infrastructure. For example, Google and Microsoft own and
operate their own public cloud infrastructure, providing access to the
public through the Internet.
Distributed cloud computing has become the buzz-phrase of IT, with vendors and
analysts agreeing that distributed cloud technology is gaining traction in the minds
of customers and service providers. Distributed cloud computing services are on the
verge of helping companies be more responsive to market conditions while restraining IT
costs. Cloud has created a story that is "to be continued", with 2015 being a
momentous year for cloud computing services to mature.
women matching his compatibility percentage, he finally found his soul mate Tien Wang on
his 88th date. Technological innovations in big data have paved the way for perfect
matchmaking online. Online dating companies are making every effort to maintain their
credibility by matching the perfect partner to the perfect person at the perfect time.
Each site provides a detailed questionnaire that helps users describe their likes, dislikes,
interests, passions and other useful information. It is not a short questionnaire where you
answer what your favorite sport and color are and the results help you find your life partner.
Online dating companies provide questionnaires of up to as many as 400 questions. Users
have to answer questions on different topics, varying from hypothetical situations to political
views and taste preferences, to increase their online dating success rate.
Online dating sites also use clustering techniques (where users are grouped into sets of
similar users) to recommend dates based on their preferences and tastes.
The unpredictability of human behavior has made big data analytics the key to finding Mr. or
Mrs. Right through online dating sites or apps, because big data never lies. Online dating
data is collected from social media platforms, credit rating agencies, the history of online
shopping websites and various online behaviors like media consumption. Online dating sites
then apply big data analytics to this treasure trove of collected information, which helps them
determine the attributes that are attractive to online daters so that they can provide better
matches and perfect soul mates to their customers. With sophisticated technology in place,
big data analytics promises to help you find true love via various online dating algorithms
and predictive analytics, by sifting through a store of big data of millions of user profiles.
Online Dating giants like Match.com, eHarmony and OkCupid collect online dating data for
big data analytics from Facebook profiles, online shopping pages to determine the likes and
dislikes of a person as the data from these sites is much more helpful in predicting human
behavior based on actions than what the users fill out in the questionnaire.
A McKinsey report states that "Companies must be able to apply advanced analytics
to the large amount of structured and unstructured data at their disposal to gain
a 360-degree view of their customers. Their engagement strategies should be
based on an empirical analysis of customers' recent behaviors and past
experiences with the company, as well as the signals embedded in customers'
mobile or social-media data."
Match.com provides its users with a questionnaire of 15 to 100 questions, and points are
then allocated to the user based on pre-defined parameters in the system such as religion,
income, education, hair color, age, etc. The users are then matched to people who have
similar points. Match.com uses advanced big data analytics to find any discrepancies
between what people actually do on the website and what they claim. If any discrepancies
are found, the matchmaking algorithms adjust the compatible match results based on this
behavior.
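Match.com's real scoring is proprietary; the hypothetical Java sketch below only illustrates the general idea of allocating points from questionnaire answers and pairing users whose totals are close. All attribute names, weights and the matching threshold are assumptions made for this sketch.

import java.util.*;

// Illustrative point-allocation matching sketch (not Match.com's actual algorithm).
public class PointMatchSketch {

    // Allocate points to a user from a few pre-defined questionnaire parameters.
    static int points(Map<String, String> profile) {
        int score = 0;
        if ("postgraduate".equals(profile.get("education"))) score += 30;
        if ("non-smoker".equals(profile.get("smoking")))     score += 20;
        if ("outdoors".equals(profile.get("hobby")))         score += 10;
        return score;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> users = new HashMap<>();
        users.put("alice", Map.of("education", "postgraduate", "smoking", "non-smoker", "hobby", "outdoors"));
        users.put("bob",   Map.of("education", "postgraduate", "smoking", "non-smoker", "hobby", "music"));
        users.put("carol", Map.of("education", "school",       "smoking", "smoker",     "hobby", "music"));

        // Users whose point totals differ by at most 15 are considered potential matches.
        List<String> names = new ArrayList<>(users.keySet());
        for (int i = 0; i < names.size(); i++) {
            for (int j = i + 1; j < names.size(); j++) {
                int a = points(users.get(names.get(i)));
                int b = points(users.get(names.get(j)));
                if (Math.abs(a - b) <= 15) {
                    System.out.println(names.get(i) + " <-> " + names.get(j));
                }
            }
        }
    }
}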
Amarnath Thombre, President at Match.com, said - "People have a checklist of what
they want, but if you look at who they are talking to, they break their own rules.
They might list money as an important quality in a partner, but then we see
them messaging all the artists and guitar players."
Match.com leaves nothing to chance in determining the accuracy of online dating data for big
data analysis. Match.com has started using facial recognition technology that helps in finding
out the category of matches the user prefers and in highlighting the features that users are
more attracted to.
Big data professionals at Match.com say that even if people are not very specific about
height, weight, hair color or race, they definitely have some kind of facial shape they look for
in a partner. Match.com aims to find a person's type by facial feature analysis so that it can
pair them up with the category of people who fit their type. These exclusive services cost
$5,000 for 6 months; however, the price is considered worth paying as it gives Match.com a
sharper edge in a competitive world.
With more than 565,000 couples married successfully and 438 people in the US saying "I
Do" every day because of eHarmony, the credit is owed to the IBM Big Data and Analytics
product IBM PureData System for Hadoop, which renders personalized matches accurately
and quickly. Statistics on the online dating site eHarmony show that it generates
approximately 13 million matches a day for its 54 million user base and altogether has more
than 125 TB of data to analyze - which increases every day.
eHarmony asks its users to fill out a questionnaire of 400 questions when signing up, which
helps it collect online dating data based on physical traits, location-based preferences,
hobbies, passions and much more. eHarmony's dataset is greater than 4 TB of data, photos
excluded. The best thing is that the matchmaking algorithms of eHarmony use all the online
dating data it collects to find the perfect match for its users. The 400-question questionnaire
is not the end. eHarmony also collects data on the behavior of users on the website, such as
how many pictures they upload, how many times they log in, what kind of profiles they visit
frequently, etc. The data collected is sorted by specialized analysis algorithms which help
users find a perfect match.
Jason Chunk, Vice President of eHarmony, said - "From the data, you can tell who is
more introverted, who is likely to be an initiator, and we can also see if we give
people matches at certain times of the day, they would be more likely to make
communication with their matches. It kind of snowballs from there. We use a
number of tools on top of that, as well."
eHarmony uses MongoDB to ease the matchmaking process for couples. eHarmony's big
data and machine learning processes use a flow algorithm which processes a billion
prospective matches a day. The compatibility matching system of eHarmony was initially
built on an RDBMS, but it took more than 2 weeks for the matching algorithm to execute.
eHarmony successfully reduced the execution time of the compatibility matching algorithm
by 95% (to less than 12 hours) by switching to MongoDB.
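eHarmony's matching system itself is proprietary, but a query like the hypothetical one below, written against the MongoDB Java sync driver, gives a flavour of how candidate profiles might be pulled from MongoDB before the compatibility scoring runs. The database, collection and field names are invented for this sketch.

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import org.bson.Document;
import org.bson.conversions.Bson;

// Hypothetical candidate lookup with the MongoDB Java sync driver.
// Database, collection and field names are illustrative only.
public class CandidateLookupSketch {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> profiles =
                    client.getDatabase("datingSketch").getCollection("profiles");

            // Pull profiles in an age range and location before running compatibility scoring.
            Bson filter = Filters.and(
                    Filters.gte("age", 25),
                    Filters.lte("age", 35),
                    Filters.eq("city", "Seattle"));

            for (Document profile : profiles.find(filter).limit(100)) {
                System.out.println(profile.toJson());
            }
        }
    }
}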
It is evident that big data plays a vital role in the online dating revolution. Dating companies
are harnessing the power of big data analytics to become perfectionists in helping people
find true love online. As dating sites continue to collect tons of online dating data through
different sources and refine their matchmaking algorithms to harness the power of big data,
we are not far from witnessing the day when dating sites will know better than we do who
our soul mate is.
These big data trends show how organizations are harnessing the power of big data and
making technological advancements in big data analytics to gain a competitive advantage.
Big data analytics is making waves in every industry sector with novel tools and technology
trends. The ability to harness the intensifying amount of big data being generated has
transformed almost every sector - decoding human DNA in minutes, pinpointing marketing
efforts, controlling blood pressure levels, tracking calories consumed, finding true love by
predicting human behaviour, predicting a player's performance level based on historical
data, foiling terrorist attacks, providing personalized medicine to cancer patients, offering
personalized shopping recommendations to users, etc.
Hadoop, NoSQL, MongoDB, and Apache Spark are the buzzwords among big data
technologies, leaving a digital trace of data in everyone's life that can be used for analysis.
The big data analytics market in 2015 will revolve around the Internet of Things (IoT), social
media sentiment analysis, an increase in sensor-driven wearables, etc.
1) Big Data Analysis to drive Datafication
Eric Schmidt, Executive Chairman at Google, says: "From the dawn of civilization until
2003, humankind generated five exabytes of data. Now we produce five
exabytes every two days, and the pace is accelerating."
The process of making a business data-driven - collecting huge amounts of data from
various sources and storing it in centralized places to find new insights that lead to better
opportunities - can be termed Datafication. Datafication will take big data analysis to new
heights: real insights, future predictions and intelligent decisions.
A recent CivSource news article highlighted the creation of a big data transit team in
Toronto, charting a path for big data analytics in the transportation sector. TomTom, a global
leader in traffic, navigation and map products, found that in Vancouver, Montreal and
Toronto, commuters lose an average of 84 hours a year due to heavy traffic delays. As a
solution to this problem, Toronto created a big data transit team for analysis of big data in the
transportation services department. They partnered with McMaster University to analyse
historical travel data. To establish Toronto as a truly smart city, it has requested vendors to
showcase proven products for measuring and monitoring traffic and travel.
Datafication is not a new trend, but the speed with which data is being generated in real-time
operational analytics systems is breath-taking. This is likely to bring about novel trends in big
data analytics. The datafication of organizations will soon impact our lives in this fast-changing
world by formulating a data-driven society.
2) Big Data Analysis for Data Security
Organizations will increasingly use big data analysis techniques to bridge the security gap in
big data by predicting probable data threats. Industry experts have started looking at big
data analysis as a robust tool for protecting data security by identifying signals of tenacious
security threats. In 2015, big data security has the potential to make more noise in the
market as an emerging trend.
Neil Cook, CTO of Cloudmark, says - "In identifying the source of a current spam attack by
tracking where the attacker has sourced target email addresses, it is possible to identify
other address lists that the attacker has downloaded and use that information to predict, and
prevent, the next attack."
2015 will welcome the dawn of big data analytics security tools to combine text mining,
ontology modelling and machine learning to provide comprehensive and integrated security
threat detection, prediction and prevention programs.
Click here to know more about our IBM Certified Hadoop Developer course.
3) Deep Learning soon to become the buzzword in Big Data Analysis
With deep learning, there could be a day when big data analysis would be used to identify
different kinds of data such as colors, objects or shapes in a video. The world will experience
a great push from big data vendors in cognitive engagement and advanced analytics. Let's
hope for some innovative applications of deep learning to real-time business situations by
the end of 2015. Google's latest deep learning system, built on recurrent neural networks,
aims to identify motion in videos and interpret various objects present in the video by means
of feature pooling networks.
Deep learning is a machine learning technique based on artificial neural networks. Deep
learning involves feeding big data into neural networks to receive predictions in response.
Deep learning is still an evolving technology but has great potential to solve business
problems. IT giants are investing in deep learning research as they strive to keep up with
customer choice and expectations.
Salesforce CEO Benioff said "Salesforce reached $5 billion in annual revenue faster
than any other enterprise software company and now it's our goal to be the
fastest to reach $10 billion."
With approximately 200,000 customer companies using the Salesforce1 platform and close
to 2,000 companies built on top of the Salesforce1 platform, there is an emerging boom for
Salesforce careers in the enterprise application sector. Companies are hiring competent
Salesforce developers, Salesforce administrators and architects to implement innovative
business solutions and maximize their investment in Salesforce.
The demand for Salesforce careers is on the rise, with salesforce.com hitting huge sales
records every quarter. Consultants with the combined skills of a Salesforce developer and
Salesforce administrator will be the need of the hour as enterprises connect Salesforce with
other legacy solutions and cloud applications. Thus, to make themselves marketable and
pursue Salesforce careers, consultants should earn Salesforce certifications to stand out
among other potential employees.
Global business consulting firm Bluewolf confirms that the demand for Salesforce
administrators and Salesforce developers will increase by 25 percent over the next five
years.
Users rate Salesforce 85% higher in terms of overall CRM functionality than
vendors outside the leader category in Gartner's quadrant, and 23% better in
scalability than vendors present in the leader category.
84% of Salesforce customers say they would recommend this CRM to their
peers.
The sky is the limit when we talk about Salesforce Customer Relationship Management
software. You will not find a single field that a Salesforce Administrator cannot add or a
single piece of code that a Salesforce Developer cannot execute. The IT market has a
dearth of Salesforce Administrators and Salesforce Developers.
Consultants looking for high-paying Salesforce jobs must dig deep into various aspects such
as the roles and responsibilities of a Salesforce administrator and developer, undergo
Salesforce admin training and Salesforce developer training, and earn various Salesforce
certifications.
Click here to know more about Salesforce training in ADM 201 and DEV 401
certifications
Salesforce Administrator
A Salesforce Administrator, in broader terms, can be defined as a person responsible for
managing and administering the configuration side of Salesforce. He/She is the one who
performs various declarative changes and manages new releases into the production
environment. A Salesforce Administrator is a professional responsible for keeping existing
Salesforce instances running smoothly. This means that a Salesforce Administrator need not
have a deep grasp of integrations and various other downstream consequences, because
he/she does not configure any new functionality.
The job of a Salesforce Administrator for a small IT enterprise need not be a full-time role. In
the early stages of a Salesforce CRM implementation the administrator will have to devote
about half a day (50% of a full-time position), but once the application is live, managing the
daily activities of Salesforce CRM hardly requires 10-25% of a full-time role.
A Salesforce Administrator will create reports from the data stored in the
Salesforce CRM and produce information assets that help boost business
revenue.
Outstanding project management and analytical skills are a must in order to act
on requested changes and classify customizations.
automation, how to create high-value reports and dashboards, how to import and maintain
clean data, and how to create a safe and secure Salesforce environment. Trained and
certified Salesforce administrators provide enterprises with an assurance that the
professional has in-depth knowledge and is confident enough to get the best out of
Salesforce.
Salesforce Developer
A Salesforce Developer is responsible for building functionality in a sandbox with
Visualforce or Apex before it is handed to the Salesforce Administrator to schedule
deployment.
The role of a Salesforce developer is to code the application logic and carry out the following
responsibilities:
The Big Data market is heating up, with startups building viable products which target
real-world pain points, backed by huge funding and solid management teams. Big data
start-ups are betting that customized data-driven services and products will give them an
edge over giant IT organizations. Startups are finding ways to outperform the competition -
the giant IT organizations - through big data analytics, by customizing their services and
products. From effective marketplaces to frictionless online transactions, from improved
customer interactions to perceptive predictions, from platforms which can instantly mine
huge troves of data to catalogs of information available for sale, the startup ecosystem loves
big data. Big Data start-ups that put data first are able to fine-tune faster in a competitive
market.
Venture capitalists are eager to invest in cloud computing, big data analytics, and software
development. The emerging trend in startup funding is giving big data startups a shot at
working with giant IT customers like Apple, Facebook, and Google.
Funding from venture capitalists helps startups establish a strong customer base through
personalized services and products. For instance, with smartphones and wearable devices
like FitBit fitness trackers and the Apple Watch, novel big data startups are collecting the
huge amounts of unstructured data generated by these devices and analysing it to
develop customized products and services that customers can use. Venture capitalists
perceive a bright future in investing in big data startups, as it is these small companies which
aggregate tons of data into value so that their customers can make the best use of it.
1) Spotify Big Data Startup Success Stories
With 60 million active users worldwide, close to 6 million paying customers, 20 million songs
and approximately 1.5 billion playlists, Spotify produces close to 1.5 TB of compressed data
on a daily basis. Spotify has one of the biggest Hadoop clusters, with 694 heterogeneous
nodes running close to 7,000 jobs a day.
Spotify delivered 4.5 billion hours of listening time in 2013 and is revolutionizing the manner
in which people listen to music. Spotify plays a prominent role in the way the music industry
evolves. Spotify's streaming services not only use big data to improve music engagement
and render a personalized experience, but they also identify upcoming music artists and
predict their potential for success. Spotify uses predictive analysis through big data to add
enjoyment and fun to the user experience. It also uses this huge stream of big data to
predict the winners at the Grammy Awards, with a prediction accuracy of 67%.
2) Uber Big Data Startup Success Stories
pitch in if he/she notices that you choose to stay at any other Starwood property, because
Uber knows this. You will be flooded with several offers from Starwood, ensuring that your
next stay in Seattle is at one of the Starwood properties, helping them generate revenue
over their competitors.
Uber uses regression analysis to determine neighbourhood size, which in turn helps Uber
find the busiest neighbourhoods on Friday nights so that it can add an additional surge
charge to customers' bills. Uber takes ratings from its drivers and riders and leverages this
data to analyse customer satisfaction and loyalty.
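Uber's internal models are not public; as a toy illustration of the regression idea, the sketch below fits a simple least-squares line relating (entirely invented) ride requests per hour in a neighbourhood to an observed surge multiplier, and then predicts the multiplier for a busy evening.

// Toy least-squares regression sketch; the data points are invented and
// this is not Uber's actual surge-pricing model.
public class SurgeRegressionSketch {
    public static void main(String[] args) {
        // x = ride requests per hour in a neighbourhood, y = observed surge multiplier
        double[] x = {50, 80, 120, 200, 300};
        double[] y = {1.0, 1.1, 1.3, 1.7, 2.2};

        int n = x.length;
        double sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
        for (int i = 0; i < n; i++) {
            sumX += x[i];
            sumY += y[i];
            sumXY += x[i] * y[i];
            sumXX += x[i] * x[i];
        }

        // Ordinary least squares: slope and intercept of y = a*x + b
        double slope = (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
        double intercept = (sumY - slope * sumX) / n;

        double demand = 250; // predict the multiplier for a busy Friday night
        System.out.printf("Predicted surge multiplier at %.0f requests/hour: %.2f%n",
                demand, slope * demand + intercept);
    }
}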
This big data start-up has future plans to partner with supreme luxury brands, retailers and
restaurants to collect data about the shopping malls you visit, the clubs you visit and the
places you dine in. It plans to share this information with its customers so that they can make
use of it for targeted marketing.
3) Netflix Big Data Startup Success Stories
Netflix is a big data company that meticulously gathers information from more than 50
million subscribers at a remarkable pace. Netflix collects tons of data from its customers to
understand the behaviour of users and determine their preferences for movies and TV
shows. It uses machine learning to predict an individual's tastes depending on the choices
they make. It collects various customer metrics such as what kind of movies people watch,
when they watch, where they watch, the time spent selecting movies, when users stop
watching or pause a movie, what devices they use to watch, searches, ratings, etc.
The Netflix big data startup leverages all the information collected to provide movie
recommendations by predicting the movies or TV shows the customers are likely to enjoy,
as happy customers are likely to continue their subscriptions while bringing in profitability.
Netflix derives powerful insights from the collected big data to customize and improve the
experience of users on the platform depending on their preferences and tastes.
4) Rubikloud Big Data Startup Success Stories
Rubikloud is among the top big data startups that help online retailers make use of big data
to increase their ROI by predicting consumer behaviour. Rubikloud has signed up several
retailers from health, beauty, fashion and other verticals, together accounting for annual
sales of $25 billion.
solutions into a much larger end market of business users. The biggest
opportunity in our mind, by far, is the Big Data Practitioners that create entirely
new business opportunities based on data, where $1M spent on Hadoop is the
backbone of a $1B business.
A recent survey by Spiceworks about certifications found that half of IT professionals will
be paying for continued IT education, as they think certifications add value to their
career. 80% of IT respondents said they are targeting to complete some kind of training and
certification this year. The number of IT professionals taking Hadoop certification is
increasing exponentially, as Hadoop is anticipated to be the most sought-after skill in IT for
2015. IT professionals are racing to acquire Hadoop certification from top-notch vendors to
bridge the Big Data talent gap.
Sarah Sproehnle, Director of Educational Services at Cloudera, said "Companies are
struggling to hire Hadoop talent."
Expert predictions reveal skyward demand for advanced analytics skills in the next few
years.
Hadoop is gaining huge popularity in the US and all over the world, as companies in all
industries - real estate, media, retail, healthcare, finance, energy, sports, dating, and
utilities - are embracing Hadoop. The industries adopting Hadoop in enterprise big data
projects want to ensure that the professionals they hire are experts in handling zettabytes of
data. Organizations across different vertical industries are in the process of adopting
Hadoop as an enterprise big data solution. Thus, they consider Hadoop training and Hadoop
certification as proof of a person's skill in handling big data.
Success in the Big Data market requires skilled professionals who can demonstrate their
mastery of the tools and techniques of the Hadoop stack. Organizations are on the hunt for
talent that can launch and scale big data projects. Hadoop certification courses help
companies find true big data specialists who have the ability to tackle challenges with live
big data sets.
Hadoop certifications from popular vendors open the door to the most demanding and
recognized Hadoop jobs. There are several top-notch big data vendors like Cloudera,
Hortonworks, IBM, and MapR offering Hadoop Developer Certification and Hadoop
Administrator Certification.
Hadoop Certification Advantages
IT professionals from different technical backgrounds are making efforts to transition to
Hadoop to get high-paying jobs. Professionals opt for Hadoop certification to demonstrate
their exceptional Hadoop skills. A certified Hadoop professional has an edge over a
candidate without a Hadoop certification.
Click here to get a discount of $40 on IBM Certified Hadoop Developer course
Hadoop Training - A Pre-Requisite for Hadoop Certification Preparation
According to Sand Hill Group's survey on Hadoop, respondents said that the inadequate
number of knowledgeable Hadoop professionals and the gap in Hadoop skills are a major
setback when it comes to implementing Hadoop. The only way to address these issues is
proper Hadoop training that will help professionals clear Hadoop certification and meet the
industry's demand for Hadoop knowledge and skills.
Hadoop certification can be obtained from any of the top vendors after completing online or
in-class Hadoop certification training. In-class Hadoop certification training is not practical for
full-time IT professionals due to their busy schedules.
DeZyre offers best-in-class, live faculty-led online Hadoop training that will help
professionals master the concepts of Hadoop and gain the confidence to take up a
challenging Hadoop certification exam from any of the vendors, supported by DeZyre's
Hadoop training material, weekly 1-on-1 meetings with experienced mentors, lifetime access
and IBM-accredited certification.
Please refer to this article to understand the benefits of DeZyre's Hadoop online live
training.
Objectives of Hadoop Certification Training
By the end of Hadoop Training Course a professional will be all set to clear the Hadoop
certification by mastering the following Hadoop concepts:
6) Implement any kind of live Hadoop and Big Data project in the enterprise
Apart from taking Hadoop training, Hadoop certification preparation involves taking the
practice tests provided by the vendors, understanding and answering the most commonly
asked Hadoop certification questions from the various Hadoop certification dumps available
on the Internet, and gaining hands-on experience implementing Hadoop concepts.
With 2015 set to be the year when all industries adopt Hadoop as a cornerstone of the
business technology agenda and CIOs make the Hadoop platform a priority, it is essential to
take Hadoop training and Hadoop certification to grow with the increasing demand for
Hadoop skills. If you have any questions related to Hadoop training or IBM-accredited
Hadoop certification from DeZyre, please leave a comment below and DeZyre experts will
get back to you.
trends will give marketing and sales professionals all the required data inside
their inbox.
Vendors surveyed ahead of 2015 agreed that, with the changing business models of
companies and changing consumer behaviour, their focus would be on cloud CRM solutions
in 2015.
2) Businesses to focus on Effective Cloud CRM Solutions for Security
The high-profile security breaches at Target and Sony demonstrated how vulnerable any
network can be. The increasing cyber-attacks will force businesses to emphasize effective
cloud CRM solutions for security. ReportsnReports, a market research organization, predicts
that with increased demand for security in customer relationship management software
applications, this market is anticipated to grow from $4.2 billion in 2014 to $8.7 billion in
2018.
Microsoft has already accomplished a milestone by adopting an ISO standard to protect
personal information in cloud environments. This ISO standard will be applicable to
Dynamics CRM Online, Intune, Microsoft Azure and Office 365. Microsoft is the first public
cloud service provider to accomplish this momentous goal.
With growing shared knowledge base, cloud security will continue to be refined in sales
CRM software applications. We can expect Big-Data style analytics to emerge as a major
trend in cloud CRM software security for 2015 with complex analytics algorithms detecting
anomalous network behaviour and any other malicious activities on the cloud network.
Alexander Linden, a Gartner analyst, said "The biggest driver for implementing Big
Data is to enhance the customer experience. Organizations need to direct their
Big Data resources to discover opportunities for enhanced business performance
and customer-focused competitive advantage."
Click Here to Read More about the Emerging Trends in Big Data and Hadoop
2015 will witness increased collaboration between security experts, with Security-as-a-
Service (SaaS) playing a vital role in cloud-based customer relationship management
software applications to identify emerging threats with zero-hour accuracy.
3) More powerful social listening tools for Cloud based CRM
Social media is much more than a trend in this ever-changing competitive world, impacting
all areas of business. In 2015, sales CRM software vendors will focus on social engagement
so that businesses become more knowledgeable about their customers. Nowadays,
customers make their decisions based on online reviews and discussions. Conversations
that used to happen in person now take place on Twitter, LinkedIn or Facebook. 2015 will
see sales CRM software vendors investing in social listening tools to empower the sales and
marketing teams of businesses to be customer-centric at every point, leading to more sales,
more connections and 100% customer satisfaction.
The Microsoft Dynamics 2015 customer relationship management software update has
enhanced social engagement. Marketers can now listen to, analyse and drive customer
engagement, with social insights about their campaigns and brands displayed on the
platform. The Microsoft social listening tool that comes with the Dynamics 2015 update is
free if the business has more than 10 professional users. It works well within Microsoft
Dynamics Marketing, Microsoft Dynamics CRM, or standalone. Businesses can now access
social listening information about their brand through the Dynamics customer relationship
management software to identify customer problems, discover more opportunities to upsell,
and add more value to their business than the salespeople of competing organizations.
2015 will see the onset of businesses engaging with customers by spotting trends through
the social pulse in ways never before possible - through cloud CRM applications. As
customers become more informed and connected, businesses have started to rethink their
CRM processes, software and strategies to surface a competitive advantage. Experts
anticipate that the demand for CRM software systems that include social customer
relationship management tools will be on the rise in 2015.
4) Mobile CRM to become more powerful
Earlier, customer relationship management software applications had limited mobile
functionality, but 2015 is going to be somewhat different, with CRM vendors investing a hefty
sum to make the mobile platforms of CRM applications more powerful. CRM vendors with
powerful mobile applications have the potential to effectively organize the sales teams of a
business so that they can spend more time selling.
CIO recently published the launch of Sales Cloud Engage by Salesforce as one of its popular
tech trends stories on April 15, 2015. Sales Cloud Engage will provide data, marketing
content and deep insights on leads before engagement, so that representatives can craft
personalized nurture campaigns without having to deal with the hassle of building ad-hoc
marketing campaigns manually. Sales Cloud Engage will provide a provision to integrate
these lead nurture campaigns with the Salesforce1 mobile app so that salespeople can add
leads directly to their nurture campaigns from their smartphones. Sales Cloud Engage will
also help salespeople pull up complete engagement histories of their prospects with the
company. Salesforce announced the availability of Sales Cloud Engage at $50 per seat per
month by the end of April 2015.
Click here to know more about Salesforce training in ADM 201 and DEV 401
certifications
With Bring Your Own Device (BYOD) policies becoming common in most organizations,
Gartner predicts that 30% of organizations now issue tablets as primary devices for
salespeople. CRM software systems are piled up with so much marketing and sales
historical data that representatives end up spending more time entering data than using it.
Companies will strive to strike a balance between data collection and ease of use in 2015.
CRM software vendors are at the forefront of building a simple and easy platform that
supports mobile devices, rather than a complex system that nobody wants to use.
Companies choosing a CRM software vendor with powerful mobile apps in 2015 will boost
profitability with salespeople closing more deals, reporting more accurately and being geared
up for better meetings.
5) Wearables-The Next Big Thing in CRM Software Systems
Wearables are the next phase of the mobile revolution in CRM software, which organizations
will adopt to connect with their customers in new ways. Integration of wearable computing
devices with CRM applications will help organizations across various industries to have real-
time access to account data, effectively engage with customers, and discover various
opportunities for up-selling and cross-selling to enhance relationships with customers at
every encounter.
A recent news story on enterprise software by CIO revealed that Salesforce is bringing its
cloud service to the Apple Watch. Salesforce has already partnered with wearable device
makers Google and Philips Healthcare, and now its next venture is Apple. The cloud, mobile,
social and data science revolutions will converge on the wrist in 2015 with Salesforce CRM
software.
What do you think will be the most important emerging trends in CRM software for 2015 that
we have missed? Please let us know in the comments below.
Hadoop runs on Linux, thus knowing some basic Linux commands will take
you a long way in pursuing a successful career in Hadoop.
According to Dice, the combined Java-Hadoop skill set is in great demand in the IT
industry, with Hadoop jobs increasing.
Career counsellors at DeZyre frequently answer a question posed by many of the
prospective students or professionals who want to switch their career to big data or
Hadoop - How much Java should I know to learn Hadoop?
Most prospective students exhibit some kind of disappointment when they ask this question -
they feel that not knowing Java is a limitation and that they might have to miss out on a great
career opportunity. It is one of the biggest myths that a person from any programming
background other than Java cannot learn Hadoop.
Several organizations are adopting Apache Hadoop as an enterprise solution to keep up
with changing business requirements and demands. The skill combinations demanded of
Hadoop professionals in the market vary remarkably. Professionals with any of a diverse set
of tech skills - Mainframes, Java, .NET, PHP or any other programming language - can learn
Hadoop.
If an organization runs an application built on mainframes, then it might be looking for
candidates who possess Mainframe + Hadoop skills, whereas an organization that has its
main application built on Java would demand a Hadoop professional with expertise in
Java + Hadoop skills.
The image below shows a job posting on Monster.com for the designation of a Senior Data
Engineer-
The job description clearly states that any candidate who knows Hadoop and has strong
experience in ETL Informatica can apply for this job and build a career in Hadoop technology
without expert knowledge of Java. The mandatory skills for the job, highlighted in red,
include Hadoop, Informatica, Vertica, Netezza, SQL, Pig and Hive. The skill MapReduce in
Java is an additional plus but not required.
Here is another image which shows a job posting on Dice.com for the designation of a Big
Data Engineer-
The job description clearly lists the minimum required skills for this role as Java, Linux and
Hadoop. Only candidates who have expert knowledge of Java, Linux and Hadoop can apply
for this job; anybody with just Java basics would not be the best fit.
Some job roles require the professional to have explicit in-depth knowledge of Java
programming, whereas a few other job roles can be excelled at even by professionals who
are well-versed in Java basics.
To learn Hadoop and build an excellent career in Hadoop, having basic knowledge of Linux
and knowing the basic programming principles of Java is a must. Thus, to truly excel in the
entrenched technology of Apache Hadoop, it is recommended that you at least learn Java
basics.
Click here to know more about our IBM Certified Hadoop Developer course
activated with free Java course
Java and Linux- Building Blocks of Hadoop
Apache Hadoop is an open source platform built on two technologies: the Linux operating
system and the Java programming language. Java is used for storing, analysing and
processing large data sets. The choice of Java as the programming language for the
development of Hadoop was largely incidental rather than carefully planned. Apache Hadoop
was initially a sub-project of the open source search engine Nutch, and the Nutch team at
that point of time was more comfortable using Java than any other programming language.
The choice of Java for Hadoop development nevertheless turned out to be the right decision,
given the many Java experts available in the market. Because Hadoop is Java-based, it
typically helps professionals to learn Java for Hadoop.
Apache Hadoop solves big data processing challenges using distributed parallel processing
in a novel way. Apache Hadoop architecture mainly consists of two components-
1. Hadoop Distributed File System (HDFS) - Java based storage component
2. Hadoop Java MapReduce Programming Model Component - Java based processing system
HDFS is the virtual file system component of Hadoop that splits a huge data file into smaller files to be processed by different processors. These small files are then replicated and stored on various servers for fault tolerance. HDFS is a basic file system abstraction where the user need not bother about how it operates or stores files unless he/she is an administrator.
At times, Hadoop developers might be required to dig deep into Hadoop code to understand the functionality of certain modules or why a particular piece of code is behaving strangely. Under such circumstances, knowledge of Java basics and advanced programming concepts comes as a boon to Hadoop developers. Technology experts advise prospective Hadoopers to learn Java basics before they deep dive into Hadoop for a well-rounded real-world Hadoop implementation. Career counsellors suggest students learn Java for Hadoop before they attempt to work on Hadoop MapReduce.
How to learn Java for Hadoop?
If you are planning to enrol for Hadoop training, ramp up your knowledge of Java beforehand.
Self-study from books or written Java tutorials might not be the best choice for less experienced programmers, as they might not be able to comprehend the code snippets and other examples in the tutorial with ease.
There are several reputed online e-learning classes which provide great options to learn Java for Hadoop. Knowledge experts explain Java basics, and students can clarify any doubts they have then and there and engage in discussion with other students to improve their knowledge base in Java.
Candidates who enrol for DeZyre's IBM certified Hadoop training can activate a free Java course to ramp up their Java knowledge. Individuals who are new to Java can also get started with Hadoop just by understanding the Java basics taught as part of the free Java course curriculum at DeZyre. DeZyre's 20-hour Java course curriculum covers all the Java basics needed to learn Hadoop, such as-
Arrays
Exception Handling
Serialization
Collections
JAVA
Q1. What is a cookie ?show Answer
Ans. A cookie is a small piece of text stored on a user's computer by the browser for a specific
domain. Commonly used for authentication, storing site preferences, and server session identification.
Q2. Can we reduce the visibility of the overridden method ?show Answer
Ans. No
Q3. What are different types of inner classes ?show Answer
Ans. Simple Inner Class, Local Inner Class, Anonymous Inner Class , Static Nested Inner Class.
Q4. Difference between TreeMap and HashMap ?show Answer
Ans. They differ in how the entries are organized in memory. TreeMap stores the keys in sorted order, whereas HashMap stores the key-value pairs in no particular order.
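For illustration, a minimal sketch showing the ordering difference (class and variable names are just for this example):
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class MapOrderDemo {
    public static void main(String[] args) {
        Map<String, Integer> hash = new HashMap<>();
        Map<String, Integer> tree = new TreeMap<>();
        for (String k : new String[]{"banana", "apple", "cherry"}) {
            hash.put(k, k.length());
            tree.put(k, k.length());
        }
        System.out.println("HashMap : " + hash); // iteration order is not guaranteed
        System.out.println("TreeMap : " + tree); // keys come out sorted: apple, banana, cherry
    }
}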
Q5. What is the difference between List, Set and Map ?show Answer
Ans. List - members are stored in sequence and can be accessed through an index.
Set - there is no relevance of sequence or index; a Set doesn't contain duplicates (whereas a multiset can have duplicates).
Map - contains key-value pairs.
Q6. Difference between Public, Private, Default and Protected ?
show Answer
Ans. Private - Not accessible outside object scope.
Public - Accessible from anywhere.
Default - Accessible from anywhere within same package.
Protected - Accessible from object and the sub class objects.
Q7. What is servlet Chaining ?show Answer
Ans. Multiple servlets serving the request in chain.
Q8. What are the Wrapper classes available for primitive types ?show Answer
Ans. boolean - java.lang.Boolean
byte - java.lang.Byte
char - java.lang.Character
double - java.lang.Double
float - java.lang.Float
int - java.lang.Integer
long - java.lang.Long
short - java.lang.Short
void - java.lang.Void
Q9. What are concepts introduced with Java 5 ?show Answer
Ans. Generics , Enums , Autoboxing , Annotations and Static Import.
Q10. Does Constructor creates the object ?show Answer
Ans. New operator in Java creates objects. Constructor is the later step in object creation.
Constructor's job is to initialize the members after the object has reserved memory for itself.
Q11. Can static method access instance variables ?show Answer
Ans. Static methods cannot access instance variables directly; they can access them only through an instance reference.
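A minimal sketch of this rule (the Counter class is hypothetical, only for illustration):
public class Counter {
    private int count = 5;                   // instance variable

    static int readCount(Counter c) {        // static method
        // return count;                     // would not compile: no instance in scope
        return c.count;                      // allowed through an instance reference
    }

    public static void main(String[] args) {
        System.out.println(readCount(new Counter())); // prints 5
    }
}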
Q12. Does Java support Multiple Inheritance ?show Answer
Ans. Interfaces don't provide implementation inheritance, so implementing multiple interfaces doesn't amount to multiple inheritance. Java doesn't support multiple inheritance of classes.
Q13. Difference between == and .equals() ?show Answer
Ans. "equals" is a method of the Object class that returns true if the contents of the objects are the same, whereas "==" evaluates whether the references on the left and right point to the same object in memory.
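A short sketch of the difference (class name is illustrative):
public class EqualsDemo {
    public static void main(String[] args) {
        String a = new String("java");
        String b = new String("java");
        System.out.println(a == b);       // false: two different objects in memory
        System.out.println(a.equals(b));  // true : same character content
    }
}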
Q14. Difference between Checked and Unchecked exceptions ?show Answer
Ans. Checked exceptions are the exceptions for which the compiler throws an error if they are not handled or declared, whereas unchecked exceptions surface only at run time and hence cannot be checked by the compiler.
Q15. What is a Final Variable ?show Answer
Ans. A final variable is a constant that cannot be changed after initialization.
Q16. Which class does not override the equals() and hashCode() methods, inheriting them
directly from class Object?show Answer
Ans. java.lang.StringBuffer.
Q17. What is a final method ?show Answer
Ans. Its a method which cannot be overridden. Compiler throws an error if we try to override a method
which has been declared final in the parent class.
Q18. Which interface does java.util.Hashtable implement?show Answer
Ans. Java.util.Map
Q19. What is an Iterator?show Answer
Ans. Iterator is an interface that provides methods to iterate over any Collection.
Q20. Which interface provides the capability to store objects using a key-value pair?show
Answer
Ans. java.util.map
Q21. Difference between HashMap and Hashtable?show Answer
Ans. Hashtable is synchronized whereas HashMap is not.
HashMap allows null keys and values whereas Hashtable doesn't allow nulls.
Ans. No
Q29. What are the common uses of "this" keyword in java ?show Answer
Ans. "this" keyword is a reference to the current object and can be used for the following -
1. Passing itself to another method.
2. Referring to the instance variable when a local variable has the same name.
3. Calling another constructor in constructor chaining.
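A quick sketch of these three uses (the Point class and its REGISTRY list are hypothetical, only for illustration):
import java.util.ArrayList;
import java.util.List;

public class Point {
    private int x, y;
    private static final List<Point> REGISTRY = new ArrayList<>();

    public Point() {
        this(0, 0);             // 3. calling another constructor (constructor chaining)
    }

    public Point(int x, int y) {
        this.x = x;             // 2. field and parameter share the same name
        this.y = y;
    }

    public void register() {
        REGISTRY.add(this);     // 1. passing the current object to another method
    }
}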
Q30. What are the difference between Threads and Processes ?show Answer
Ans. 1. When an OS starts running a program it creates a new process; a process is a program that is currently executing, and every process has at least one thread running within it.
2. A thread is a path of code execution within the program, which has its own local variables, program counter (pointer to the instruction currently being executed) and lifetime.
3. When the Java Virtual Machine (JavaVM, or just VM) is started by the operating system, a new process is created. Within that process, many threads can be created.
4. Consider an example: when you open Microsoft Word, the task manager shows the running program as a process. When you type something in the open document, it performs more than one piece of work at the same time - it checks the spelling, it formats the words you enter - so within that one process (Word), different paths of execution (threads) carry out different work at the same time.
5. Within a process, every thread has an independent path of execution, but there may be situations where two threads interfere with each other; that is where concurrency issues and deadlock come into the picture.
6. Just as two processes can communicate (for example, dragging and dropping a file from the file explorer into a Word document), two threads can also communicate with each other, and the cost of communication between two threads is relatively low.
7. Every thread in Java is created and controlled by a unique object of the java.lang.Thread class.
8. Prior to JDK 1.5, Java's support for asynchronous programming was limited, so threads were the main way to make the runtime environment asynchronous and allow different tasks to run concurrently.
Q31. How can we run a java program without making any object?show Answer
Ans. By putting code within either static method or static block.
Q32. Explain multithreading in Java ?show Answer
Ans. 1. Multithreading provides better interaction with the user by distributing tasks.
2. Threads in Java appear to run concurrently, so they provide a simulation of simultaneous activities. The processor runs each thread for a short time and switches among the threads to simulate simultaneous execution (context switching), which makes it appear that each thread has its own processor. Using this feature, users can make it appear as if multiple tasks are occurring simultaneously when, in fact, each runs for only a brief time before the context is switched to the next thread.
3. We can do other things while waiting for slow I/O operations. In the java.io package, the class InputStream has a method, read(), that blocks until a byte is read from the stream or until an IOException is thrown. The thread that executes this method cannot do anything else while awaiting the arrival of another byte on the stream.
Q46. what is the use of cookie and session ? and What is the difference between them ?show
Answer
Ans. Cookies and sessions are both used to store user information: a cookie stores it on the client side and a session stores it on the server side. Both are primarily used for authentication, user preferences, and carrying information across multiple requests. One more thing differentiates them: a cookie can store only textual information, whereas a session can store both textual information and objects.
Q47. Which are the different segments of memory ?
show Answer
Ans. 1. Stack Segment - contains local variables and reference variables (variables that hold the address of an object in the heap).
2. Heap Segment - contains all objects created at runtime, along with their instance variables.
3. Code Segment - the segment where the compiled Java bytecode resides when loaded.
Q48. Which memory segment loads the java code ?show Answer
Ans. Code segment.
Q49. which containers use a border Layout as their default layout ?show Answer
Ans. The window, Frame and Dialog classes use a border layout as their default layout.
Q50. Can a lock be acquired on a class ?show Answer
Ans. Yes, a lock can be acquired on a class. This lock is acquired on the class's Class object.
Q51. What state does a thread enter when it terminates its processing?show Answer
Ans. When a thread terminates its processing, it enters the dead state.
Q52. How many bits are used to represent Unicode, ASCII, UTF-16, and UTF-8 characters?show
Answer
Ans. Unicode requires 16 bits per code unit and ASCII requires 7 bits. Although the ASCII character set uses only 7 bits, it is usually represented as 8 bits. UTF-8 represents characters using 8, 16, 24 or 32-bit patterns. UTF-16 uses 16-bit and larger bit patterns.
Q53. Does garbage collection guarantee that a program will not run out of memory?show
Answer
Ans. Garbage collection does not guarantee that a program will not run out of memory. It is possible
for programs to use up memory resources faster than they are garbage collected. It is also possible
for programs to create objects that are not subject to garbage collection
Q54. What is an object's lock and which object's have locks?show Answer
Ans. An object's lock is a mechanism that is used by multiple threads to obtain synchronized access
to the object. A thread may execute a synchronized method of an object only after it has acquired the
object's lock. All objects and classes have locks. A class's lock is acquired on the class's Class
object.
Q55. What is casting?show Answer
Ans. There are two types of casting, casting between primitive numeric types and casting between
object references. Casting between numeric types is used to convert larger values, such as double
values, to smaller values, such as byte values. Casting between object references is used to refer to
an object by a compatible class, interface, or array type reference
Q56. What restrictions are placed on method overriding?show Answer
Ans. Overridden methods must have the same name and argument list, and a return type that is the same as (or, since Java 5, a subtype of) the overridden method's return type. The overriding method may not limit the access of the method it overrides, and it may not throw any checked exceptions that are not thrown by the overridden method.
Q57. How does a try statement determine which catch clause should be used to handle an
exception?show Answer
Ans. When an exception is thrown within the body of a try statement, the catch clauses of the try
statement are examined in the order in which they appear. The first catch clause that is capable of
handling the exception is executed. The remaining catch clauses are ignored.
Q58. Describe what happens when an object is created in Java ?show Answer
Ans. 1. Memory is allocated from the heap to hold all instance variables and implementation-specific data of the object and its superclasses. Implementation-specific data includes pointers to class and method data.
2. The instance variables of the object are initialized to their default values.
3. The constructor for the most derived class is invoked. The first thing a constructor does is call the constructor of its superclass. This process continues until the constructor for java.lang.Object is called, as java.lang.Object is the base class for all objects in Java.
4. Before the body of the constructor is executed, all instance variable initializers and initialization blocks are executed. Then the body of the constructor is executed. Thus, the constructor for the base class completes first and the constructor for the most derived class completes last.
Q59. What is the difference between StringBuffer and String class ?show Answer
Ans. A string buffer implements a mutable sequence of characters. A string buffer is like a String, but
can be modified. At any point in time it contains some particular sequence of characters, but the
length and content of the sequence can be changed through certain method calls. The String class
represents character strings. All string literals in Java programs, such as "abc" are constant and
implemented as instances of this class; their values cannot be changed after they are created.
Q60. Describe, in general, how java's garbage collector works ?show Answer
Ans. The Java runtime environment deletes objects when it determines that they are no longer being used. This process is known as garbage collection. The Java runtime environment supports a garbage collector that periodically frees the memory used by objects that are no longer referenced by a running program.
Q85. Which access specifier can be used with Class ?show Answer
Ans. For top level class we can only use "public" and "default". We can use private with inner class.
Q86. Explain Annotations ?show Answer
Ans. Annotations, a form of metadata, provide data about a program that is not part of the program itself. Annotations have no direct effect on the operation of the code they annotate. Annotations have a number of uses, among them:
Information for the compiler - annotations can be used by the compiler to detect errors or suppress warnings.
Compile-time and deployment-time processing - software tools can process annotation information to generate code, XML files, and so forth.
Runtime processing - some annotations are available to be examined at runtime.
Q88. What are few of the Annotations pre defined by Java?show Answer
Ans. @Deprecated annotation indicates that the marked element is deprecated and should no longer
be used. The compiler generates a warning whenever a program uses a method, class, or field with
the @Deprecated annotation.
@Override annotation informs the compiler that the element is meant to override an element declared
in a superclass.
@SuppressWarnings annotation tells the compiler to suppress specific warnings that it would
otherwise generate.
@SafeVarargs annotation, when applied to a method or constructor, asserts that the code does not perform potentially unsafe operations on its varargs parameter. When this annotation type is used, unchecked warnings relating to varargs usage are suppressed.
@FunctionalInterface annotation, introduced in Java SE 8, indicates that the type declaration is intended to be a functional interface, as defined by the Java Language Specification.
@Inherited annotation indicates that the annotation type can be inherited from the superclass: when an annotation type is queried and the class has no annotation for this type, the class' superclass is queried for the annotation type. This annotation applies only to class declarations.
@Repeatable annotation, introduced in Java SE 8, indicates that the marked annotation can be applied more than once to the same declaration or type use. For more information, see Repeating Annotations.
Q91. How to display and set the Class path in Unix ?show Answer
Ans. To display the current CLASSPATH variable, use these commands in UNIX (Bourne shell):
% echo $CLASSPATH
To delete the current contents of the CLASSPATH variable,
In UNIX: % unset CLASSPATH; export CLASSPATH
To set the CLASSPATH variable,
In UNIX: % CLASSPATH=/home/george/java/classes; export CLASSPATH
Object creation is the process of creating the object in memory and returning its handle. Java provides the new keyword for object creation.
Initialization is the process of setting the initial / default values to the members. The constructor is used for this purpose; if we don't provide any constructor, Java provides a default implementation that sets the default values according to the member data types.
Ans. The "native" keyword is used in method declarations to specify that the method is not implemented in the same Java source file, but rather in another language.
Q115. What is "super" used for ?
show Answer
Ans. Used to access members of the base class.
Q116. What is "this" keyword used for ?
show Answer
Ans. Used to represent an instance of the class in which it appears.
Q117. Difference between boolean and Boolean ?show Answer
Ans. boolean is a primitive type whereas Boolean is a class.
Q118. What is a finalize method ?show Answer
Ans. finalize() method is called just before an object is destroyed.
Q119. What are Marker Interfaces ? Name few Java marker interfaces ?show Answer
Ans. These are interfaces which have no declared methods.
Serializable and Cloneable are marker interfaces.
Q120. Is runnable a Marker interface ?show Answer
Ans. No , it has run method declared.
Q121. Difference between Process and Thread ?show Answer
Ans. Process is a program in execution whereas thread is a separate path of execution in a program.
Q122. What is a Deadlock ?
show Answer
Ans. When two threads are waiting for each other and neither can proceed, the program is said to be deadlocked.
Q123. Difference between Serialization and Deserialization ?
show Answer
Ans. Serialization is the process of writing the state of an object to a byte stream. Deserialization is
the process of restoring these objects.
Q124. Explain Autoboxing ?
show Answer
Ans. Autoboxing is the automatic conversion that the Java compiler makes between the primitive
types and their corresponding object wrapper classes
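A tiny sketch of autoboxing and unboxing in action (class name is illustrative):
import java.util.ArrayList;
import java.util.List;

public class AutoboxDemo {
    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>();
        numbers.add(7);             // autoboxing: int 7 -> Integer.valueOf(7)
        int first = numbers.get(0); // unboxing: Integer -> int
        System.out.println(first);
    }
}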
Q136. Explain the scenerios to choose between String , StringBuilder and StringBuffer ?show
Answer
Ans. If the Object value will not change in a scenario use String Class because a String object is
immutable.
If the Object value can change and will only be modified from a single thread, use a StringBuilder
because StringBuilder is unsynchronized(means faster).
If the Object value may change, and can be modified by multiple threads, use a StringBuffer because
StringBuffer is thread safe(synchronized).
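As a rough illustration of when each type fits (class name is illustrative; the loop is only there to show repeated appends):
public class BuilderDemo {
    public static void main(String[] args) {
        String fixed = "report";                     // value never changes: plain String
        StringBuilder sb = new StringBuilder("id-"); // single-threaded concatenation
        for (int i = 0; i < 3; i++) {
            sb.append(i);
        }
        StringBuffer shared = new StringBuffer();    // synchronized, safe to share across threads
        shared.append("thread-safe");
        System.out.println(fixed + " " + sb + " " + shared);
    }
}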
Q137. Explain java.lang.OutOfMemoryError ?
show Answer
Ans. This Error is thrown when the Java Virtual Machine cannot allocate an object because it is out of
memory, and no more memory could be made available by the garbage collector.
Q138. Can we have multiple servlets in a web application and How can we do that ?show
Answer
Ans. Yes by making entries in web.xml
Q139. How can we manage Error Messages in the web application ?show Answer
Ans. Within message.properties file.
Q140. Is JVM, a compiler or interpretor ?
show Answer
Ans. The JVM is primarily an interpreter of bytecode, though modern JVMs also use a Just-In-Time (JIT) compiler for frequently executed code.
Q141. Difference between implicit and explicit type casting ?
show Answer
Ans. An explicit conversion is where you use a cast to tell the compiler to do the conversion, whereas with implicit type casting (widening) the conversion happens automatically and you need not specify the target type.
Q142. Difference between loadClass and Class.forName ?show Answer
Ans. loadClass only loads the class but doesn't initialize the object whereas Class.forName initialize
the object after loading it.
Q143. Should we override finalize method ?
show Answer
Ans. finalize() is used by Java for garbage collection. It should not be overridden, as we should leave garbage collection to the JVM itself.
Q144. What is assert keyword used for ?show Answer
Ans. The assert keyword is used to make an assertion - a statement which the programmer believes is always true at that point in the program. This keyword is intended to aid in testing and debugging.
Q145. Difference between Factory and Abstract Factory Design Pattern ?show Answer
Ans. Factory Pattern deals with creation of objects delegated to a separate factory class whereas
Abstract Factory patterns works around a super-factory which creates other factories.
Q146. Difference between Factory and Builder Design Pattern ?show Answer
Ans. Builder pattern is the extension of Factory pattern wherein the Builder class builds a complex
object in multiple steps.
Q147. Difference between Proxy and Adapter ?show Answer
Ans. Adapter object has a different input than the real subject whereas Proxy object has the same
input as the real subject. Proxy object is such that it should be placed as it is in place of the real
subject.
Q148. Difference between Adapter and Facade ?
show Answer
Ans. The Difference between these patterns in only the intent. Adapter is used because the objects in
current form cannot communicate where as in Facade , though the objects can communicate , A
Facade object is placed between the client and subject to simplify the interface.
Q149. Difference between Builder and Composite ?
show Answer
Ans. Builder is a creational Design Pattern whereas Composite is a structural design pattern.
Composite creates Parent - Child relations between your objects while Builder is used to create group
of objects of predefined types.
Q150. Example of Chain of Responsibility Design Pattern ?
show Answer
Ans. Exception Handling Throw mechanism.
Q151. Example of Observer Design Pattern ?
show Answer
Ans. Listeners.
Q152. Difference between Factory and Strategy Design Pattern ?show Answer
Ans. Factory is a creational design pattern whereas Strategy is behavioral design pattern. Factory
revolves around the creation of object at runtime whereas Strategy or Policy revolves around the
decision at runtime.
Q153. Shall we use abstract classes or Interfaces in Policy / Strategy Design Pattern ?show
Answer
Ans. Strategy deals only with decision making at runtime so Interfaces should be used.
Q154. Which kind of memory is used for storing object member variables and function local
variables ?show Answer
Ans. Local variables are stored in stack whereas object variables are stored in heap.
Q155. Why do member variables have default values whereas local variables don't have any
default value ?
show Answer
Ans. Member variables live on the heap as part of an object, so they are initialized with default values when an instance of the class is created. Local variables are stored on the stack and must be explicitly initialized before they are used.
Q156. What is a Default Constructor ?show Answer
Ans. The no argument constructor provided by Java Compiler if no constructor is specified.
Q157. Will Compiler creates a default no argument constructor if we specify only multi
argument constructor ?show Answer
Ans. No, Compiler will create default constructor only if we don't specify any constructor.
Q158. Can we overload constructors ?
show Answer
Ans. Yes.
Q159. What will happen if we make the constructor private ?show Answer
Ans. We can't create the objects directly by invoking new operator.
Q160. How can we create objects if we make the constructor private ?show Answer
Ans. We can do so through a static public member method or static block.
Q161. What will happen if we remove the static keyword from main method ?
show Answer
Ans. Program will compile but will give a "NoSuchMethodError" during runtime.
Q162. Why doesn't Java use pointers ?show Answer
Ans. Pointers are vulnerable, and slight carelessness in their use may result in memory problems; hence Java intrinsically manages memory access on the programmer's behalf.
Q163. Can we use both "this" and "super" in a constructor ?show Answer
Ans. No, because both this and super should be the first statement.
Q164. Do we need to import java.lang package ?show Answer
Ans. No. The java.lang package is imported by default for every Java source file.
Spring provides a consistent transaction management interface that can scale down to a local
transaction
Q181. what is the difference between collections class vs collections interface ?show Answer
Ans. Collections class is a utility class having static methods for doing operations on objects of
classes which implement the Collection interface. For example, Collections has methods for finding
the max element in a Collection.
Q182. Will this code give error if i try to add two heterogeneous elements in the arraylist. ? and
Why ?
List list1 = new ArrayList<>();
list1.add(5);
list1.add("5");show Answer
Ans. No, it won't give an error. Since we don't declare the list to be of a specific type, it is treated as a list of Objects: the int 5 is autoboxed to an Integer and "5" is a String, and both are Objects.
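A small sketch contrasting the raw list above with a generic one (class name is illustrative):
import java.util.ArrayList;
import java.util.List;

public class RawVsGeneric {
    public static void main(String[] args) {
        List raw = new ArrayList();        // raw type: elements treated as Object
        raw.add(5);                        // autoboxed Integer
        raw.add("5");                      // String - compiles, both are Objects

        List<Integer> typed = new ArrayList<>();
        typed.add(5);
        // typed.add("5");                 // does not compile: incompatible types
    }
}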
Q183. Difference between Java beans and Spring Beans ?show Answer
Ans. Java beans are plain Java objects that follow the JavaBeans conventions (a no-argument constructor and getter/setter methods), whereas Spring beans are objects that are instantiated, configured and managed by the Spring IoC container.
Q184. What is the difference between System.console.write and System.out.println ?show
Answer
Ans. System.console() returns null if your application is not run in a terminal (though you can handle
this in your application)
System.console() provides methods for reading password without echoing characters
System.out and System.err use the default platform encoding, while the Console class output
methods use the console encoding
Q185. What are various types of Class loaders used by JVM ?show Answer
Ans. Bootstrap - Loads JDK internal classes, java.* packages.
Extensions - Loads jar files from JDK extensions directory - usually lib/ext directory of the JRE
System - Loads classes from system classpath.
Q186. How are classes loaded by JVM ?show Answer
Ans. Class loaders are hierarchical. The very first class is specially loaded with the help of static
main() method declared in your class. All the subsequently loaded classes are loaded by the classes,
which are already loaded and running.
Q187. Difference between C++ and Java ?show Answer
Ans. Java does not support pointers.
Java does not support multiple inheritances.
Java does not support destructors but rather adds a finalize() method. Finalize methods are invoked
by the garbage collector prior to reclaiming the memory occupied by the object, which has the
finalize() method.
Java does not include structures or unions because the traditional data structures are implemented as
an object oriented framework.
C++ compiles to machine language , when Java compiles to byte code .
In C++ the programmer needs to worry about freeing the allocated memory , where in Java the
Garbage Collector takes care of the the unneeded / unused variables.
Java is platform independent language but c++ is depends upon operating system.
Java uses compiler and interpreter both and in c++ their is only compiler.
C++ supports operator overloading whereas Java doesn't.
Internet support is built-in Java but not in C++. However c++ has support for socket programming
which can be used.
Java does not support header file, include library files just like C++ .Java use import to include
different Classes and methods.
There is no goto statement in Java.
There is no scope resolution operator :: in Java. It has . using which we can qualify classes with the
namespace they came from.
Java is pass by value whereas C++ is both pass by value and pass by reference.
Java Enums are objects instead of int values in C++
C++ programs runs as native executable machine code for the target and hence more near to
hardware whereas Java program runs in a virtual machine.
C++ was designed mainly for systems programming, extending the C programming language
whereas Java was created initially to support network computing.
C++ allows low-level addressing of data. You can manipulate machine addresses to look at anything
you want. Java access is controlled.
C++ has several addressing operators . * & -> where Java has only one: the .
We can create our own package in Java(set of classes) but not in c and c++.
Ans. Static loading - classes are statically loaded with Java's new operator.
Dynamic class loading - dynamic loading is a technique for programmatically invoking the functions of a class loader at run time, for example:
Class.forName("TestClassName");
Q189. Tell something about BufferedWriter ? What are flush() and close() used for ?show
Answer
Ans. A buffer is a temporary storage area for data. BufferedWriter is a character output stream class that buffers characters so that writes to the underlying stream are efficient.
flush() pushes all the characters currently held in the buffer to the underlying stream and empties the buffer.
close() flushes any remaining data and then closes the character output stream.
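A minimal sketch of typical usage (the file name "notes.txt" is just an example; try-with-resources closes the writer automatically):
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;

public class WriterDemo {
    public static void main(String[] args) throws IOException {
        try (BufferedWriter out = new BufferedWriter(new FileWriter("notes.txt"))) {
            out.write("buffered line");
            out.newLine();
            out.flush();   // push buffered characters to the file right away
        }                  // close() flushes any remaining data and releases the file handle
    }
}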
Q190. What is Scanner class used for ? when was it introduced in Java ?show Answer
Ans. The Scanner class was introduced in Java 1.5 for reading a data stream from an input device. Previously we used to write code that read input using DataInputStream. After reading a token from the stream, we can convert it into the respective data type using in.next() for String, in.nextInt() for integer, in.nextDouble() for double, etc.
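A short sketch of reading tokens with Scanner (class name is illustrative):
import java.util.Scanner;

public class ScannerDemo {
    public static void main(String[] args) {
        Scanner in = new Scanner(System.in);   // read from standard input
        String name = in.next();               // one whitespace-delimited token
        int age = in.nextInt();                // next token parsed as int
        double height = in.nextDouble();       // next token parsed as double
        System.out.println(name + " " + age + " " + height);
        in.close();
    }
}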
Q191. Why Struts 1 Classes are not Thread Safe whereas Struts 2 classes are thread safe ?
show Answer
Ans. Struts 1 actions are singletons, so all threads operate on the single action object, which makes them thread unsafe.
Struts 2 actions are not singletons; a new action object is created for each request, which makes them thread safe.
Q192. What are some Java related technologies used for distributed computing ?show Answer
Ans. Sockets, RMI, EJB.
Q193. Whats the purpose of marker interfaces ?show Answer
Ans. They just tell the compiler that the objects of the classes implementing the interfaces with no
defined methods need to be treated differently.
Q194. What is the difference between final, finally and finalize() ?show Answer
Ans. final - marks a constant variable, restricts method overriding, and restricts subclassing of a class.
finally - used in exception handling. The finally block is optional and provides a mechanism to clean up regardless of what happens within the try block; use it to close files or to release other system resources like database connections, statements etc.
finalize() - a method that helps in garbage collection. It is invoked before an object is discarded by the garbage collector, allowing it to clean up its state.
Q195. When do you get ClassCastException?show Answer
Ans. We can only downcast to a class lower in the hierarchy; the ClassCastException is thrown to indicate that code has attempted to cast an object to a subclass of which it is not an instance.
Q196. Explain Thread States ?show Answer
Ans. Runnable - waiting for its turn to be picked for execution by the thread scheduler, based on thread priorities.
Running - the processor is actively executing the thread code. It runs until it becomes blocked or voluntarily gives up its turn.
Waiting - a thread is in a blocked state while it waits for some external processing, such as file I/O, to finish.
Sleeping - Java threads are put to sleep (suspended) with Thread.sleep; they resume when the sleep time elapses or the thread is interrupted.
Blocked on I/O - will move to runnable after the I/O condition changes (for example, bytes of data become available to read).
Blocked on synchronization - will move to runnable when the lock is acquired.
Dead - the thread has finished working.
Q197. What are strong, soft, weak and phantom references in Java ?show Answer
Ans. Garbage Collector wont remove a strong reference.
A soft reference will only get removed if memory is low.
A weak reference will get removed on the next garbage collection cycle.
A phantom reference will be finalized but the memory will not be reclaimed. Can be useful when you
want to be notified that an object is about to be collected.
Q198. Difference between yield() and sleeping()? show Answer
Ans. When a task invokes yield(), it changes from running state to runnable state. When a task
invokes sleep(), it changes from running state to waiting/sleeping state.
Q199. What is a daemon thread? Give an Example ?show Answer
Ans. These are threads that normally run at a low priority and provide a basic service to a program or programs when activity on a machine is reduced. The garbage collector thread is an example of a daemon thread.
Q200. What is the difference between AWT and Swing?show Answer
Ans. Swing provides both additional components like JTable, JTree etc and added functionality to
AWT-replacement components.
Swing components can change their appearance based on the current look and feel library that's being used.
Swing components follow the MVC paradigm, and thus can provide a much more flexible UI.
Swing provides extras for components, such as icons on many components, decorative borders for
components, tool tips for components etc.
Swing components are lightweight compared to AWT components.
Swing provides built-in double buffering ,which means an off-screen buffer is used during drawing and
then the resulting bits are copied onto the screen.
Swing provides paint debugging support for when you build your own component.
Q201. What is the order of method invocation in an applet?show Answer
Ans. public void init()
public void start()
public void stop()
public void destroy()
Q202. Name few tools for probing Java Memory Leaks ?show Answer
Ans. JProbe, OptimizeIt
Q203. Which memory areas does instance and static variables use ?show Answer
Ans. Instance variables are stored on the heap as part of the object they belong to, whereas static variables are stored in the class area of the JVM (the method area - PermGen in older JVMs, Metaspace in newer ones).
Q204. What is J2EE? What are J2EE components and services?show Answer
Ans. J2EE or Java 2 Enterprise Edition is an environment for developing and deploying enterprise
applications. The J2EE platform consists of J2EE components, services, Application Programming
Interfaces (APIs) and protocols that provide the functionality for developing multi-tiered and distributed
Web based applications.
Q205. What are the components of J2EE ?show Answer
Ans. applets
Client component like Client side Java codes.
Web component like JSP, Servlet WAR
Enterprise JavaBeans like Session beans, Entity beans, Message driven beans
Enterprise application like WAR, JAR, EAR
Ans. XML or eXtensible Markup Language is a markup language for describing data and its metadata.
Q207. Difference between SAX and DOM Parser ?show Answer
Ans. A DOM (Document Object Model) parser creates a tree structure in memory from an input
document whereas A SAX (Simple API for XML) parser does not create any internal structure.
A SAX parser serves the client application always only with pieces of the document at any given time
whereas A DOM parser always serves the client application with the entire document no matter how
much is actually needed by the client.
A SAX parser, however, is much more space efficient for a big input document, whereas a DOM parser is richer in functionality.
Use a DOM parser if you need to refer to different document areas before giving back the information. Use SAX if you just need discrete pieces of information from different areas.
Xerces and Crimson are SAX parsers, whereas XercesDOM, SunDOM and OracleDOM are DOM parsers.
Q208. What is DTD ?show Answer
Ans. DTD or Document Type Definition is a standard agreed upon way of communication between
two parties. Your application can use a standard DTD to verify that data that you receive
from the outside world is valid and can be parsed by your parser.
Q209. What is XSD ?show Answer
Ans. XSD or Xml Schema Definition is an extension of DTD. XSD is more powerful and extensible
than DTD
Q210. What is JAXP ?show Answer
Ans. Stands for Java API for XML Processing. This provides a common interface for creating and
using SAX, DOM, and XSLT APIs in Java regardless of which vendors implementation is actually
being used.
Q211. What is JAXB ?show Answer
Ans. Stands for Java API for XML Binding. This standard defines a mechanism for writing out Java
objects as XML and for creating Java objects from XML structures.
Q212. What is marshalling ?show Answer
Ans. Its the process of creating XML structures out of Java Objects.
Q213. What is unmarshalling ?show Answer
Ans. Its the process of creating Java Objects out of XML structures.
Q214. Which load testing tools have you used ?show Answer
Q225. What are the phases of the JSP life cycle ?show Answer
Ans. The Model/View/Controller pattern, a strategy for dividing responsibility in a GUI component. The
model is the data for the component. The view is the visual presentation of the component on the
screen. The controller is responsible for reacting to events by changing the model. According to the
MVC pattern, these responsibilities should be handled by different objects.
Q231. What is race condition ?show Answer
Ans. A source of possible errors in parallel programming, where one thread can cause an error in
another thread by changing some aspect of the state of the program that the second thread is
depending on (such as the value of variable).
Q232. What is unicode ?show Answer
Ans. A way of encoding characters as binary numbers. The Unicode character set includes
characters used in many languages, not just English. Unicode is the character set that is
used internally by Java.
Q233. What is ThreadFactory ?show Answer
Ans. ThreadFactory is an interface that is meant for creating threads instead of explicitly creating
threads by calling new Thread(). Its an object that creates new threads on demand. Using thread
factories removes hardwiring of calls to new Thread, enabling applications to use special thread
subclasses, priorities, etc.
Q234. What is PermGen or Permanent Generation ?show Answer
Ans. The memory pool containing all the reflective data of the Java virtual machine itself, such as class and method objects. With Java VMs that use class data sharing, this generation is divided into read-only and read-write areas. The permanent generation contains metadata required by the JVM to describe the classes and methods used in the application; it is populated by the JVM at runtime based on the classes in use by the application. In addition, Java SE library classes and methods may be stored here.
Q235. What is metaspace ?show Answer
Ans. The Permanent Generation (PermGen) space has completely been removed and is kind of
replaced by a new space called Metaspace. The consequences of the PermGen removal is that
obviously the PermSize and MaxPermSize JVM arguments are ignored and you will never get a
java.lang.OutOfMemoryError: PermGen error.
Q236. What is the benefit of inner / nested classes ?show Answer
Ans. You can put related classes together as a single logical group.
Nested classes can access all class members of the enclosing class, which might be useful in certain
cases.
Nested classes are sometimes useful for specific purposes. For example, anonymous inner classes
are useful for writing simpler event-handling code with AWT/Swing.
Q237. Explain Static nested Classes ?show Answer
Ans. The accessibility (public, protected, etc.) of the static nested class is defined by the outer class.
A static nested class is not an inner class, it's a top-level nested class.
The name of the static nested class is expressed with OuterClassName.NestedClassName syntax.
When you define an inner nested class (or interface) inside an interface, the nested class is declared
implicitly public and static.
Static nested classes can be declared abstract or final.
Static nested classes can extend another class or it can be used as a base class.
Static nested classes can have static members.
Static nested classes can access the members of the outer class (only static members, obviously). The outer class can also access the members (even private members) of the nested class through an object of the nested class. If you don't declare an instance of the nested class, the outer class cannot access nested class elements directly.
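A minimal sketch of a static nested class (Outer and Nested are illustrative names):
public class Outer {
    private static String category = "static data";

    static class Nested {                        // top-level nested (static) class
        void describe() {
            // can read static members of the enclosing class directly
            System.out.println("Nested sees: " + category);
        }
    }

    public static void main(String[] args) {
        Outer.Nested n = new Outer.Nested();     // no Outer instance required
        n.describe();
    }
}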
Q238. Explain Inner Classes ?show Answer
Ans. The accessibility (public, protected, etc.) of the inner class is defined by the outer class.
Just like top-level classes, an inner class can extend a class or can implement interfaces. Similarly, an
inner class can be extended by other classes, and an inner interface can be implemented or extended
by other classes or interfaces.
An inner class can be declared final or abstract.
Inner classes can have inner classes, but you'll have a hard time reading or understanding such complex nesting of classes.
Q239. Explain Method Local Inner Classes ?show Answer
Ans. You can create a non-static local class inside a body of code. Interfaces cannot have local
classes, and you cannot create local interfaces.
Local classes are accessible only from the body of the code in which the class is defined. The local
classes are completely inaccessible outside the body of the code in which the class is defined.
You can extend a class or implement interfaces while defining a local class.
A local class can access all the variables available in the body of the code in which it is defined, but you can pass only final (or, since Java 8, effectively final) local variables to a local inner class.
Q240. Explain about anonymous inner classes ?show Answer
Ans. Anonymous classes are defined in the new expression itself, so you cannot create multiple
objects of an anonymous class.
You cannot explicitly extend a class or explicitly implement interfaces when defining an anonymous
class.
An anonymous inner class is always created as part of a statement; don't forget to close the
statement after the class definition with a curly brace. This is a rare case in Java, a curly brace
followed by a semicolon.
Anonymous inner classes have no name, and their type must be either a subclass of the named type
or an implementer of the named interface
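A short sketch of an anonymous inner class implementing a named interface (class name is illustrative; note the closing "};"):
public class AnonymousDemo {
    public static void main(String[] args) {
        // the class body is part of the new expression itself
        Runnable task = new Runnable() {
            @Override
            public void run() {
                System.out.println("running from an anonymous class");
            }
        };
        new Thread(task).start();
    }
}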
Q241. What will happen if class implement two interface having common method?show
Answer
Ans. That would not be a problem, as both interfaces merely specify a contract that the implementing class has to follow.
If class C implements interface A and interface B, and both declare a method print(), class C needs to provide only one implementation of print(); that single implementation satisfies both interfaces.
Q242. What is the advantage of using arrays over variables ?show Answer
Ans. Arrays provide a structure wherein multiple values can be accessed using single reference and
index. This helps in iterating over the values using loops.
Q243. What are the disadvantages of using arrays ?show Answer
Ans. Arrays are of fixed size and have to reserve memory prior to use. Hence, if we don't know the size in advance, arrays are not recommended.
Arrays can store only homogeneous elements.
Arrays store their values in contiguous memory locations, which is not suitable if the content is too large and needs to be distributed in memory.
There is no underlying data structure for arrays and no ready-made method support for arrays; for every requirement we need to code explicitly.
Q244. Difference between Class#getInstance() and new operator ?show Answer
Ans. With the new operator the constructor to invoke is resolved at compile time, so we need a matching constructor (or the compiler-provided default one), whereas reflective creation via Class (for example Class.forName(...).newInstance()) locates and invokes the no-argument constructor at run time.
Q245. Can we create an object if a Class doesn't have any constructor ( not even the default
provided by constructor ) ?show Answer
Ans. Yes , using Class.getInstance.
Q246. What is a cloneable interface and what all methods does it contain?show Answer
Ans. It does not contain any methods because it is a MARKER interface.
Ans. Session tracking is a mechanism that servlets use to maintain state about a series of requests from the same user across some period of time. The methods used for session tracking are:
User Authentication - occurs when a web server restricts access to some of its resources to only
those clients that log in using a recognized username and password
Hidden form fields - fields are added to an HTML form that are not displayed in the client's browser.
When the form containing the fields is submitted, the fields are sent back to the server
URL rewriting - every URL that the user clicks on is dynamically modified or rewritten to include extra
information. The extra information can be in the form of extra path information, added parameters or
some custom, server-specific URL change.
Cookies - a bit of information that is sent by a web server to a browser and which can later be read
back from that browser.
HttpSession- places a limit on the number of sessions that can exist in memory.
Q253. What is connection pooling?show Answer
Ans. It's a technique to allow multiple clients to make use of a cached set of shared and reusable
connection objects providing access to a database or other resource.
Q254. Advantage of Collection classes over Arrays ?show Answer
Ans. Collections are re-sizable in nature; we can increase or decrease the size as per requirement.
Collections can hold both homogeneous and heterogeneous data.
Every collection follows some standard data structure.
Collections provide many useful built-in methods for traversing, sorting and searching.
Q255. What are the Disadvantages of using Collection Classes over Arrays ?show Answer
Ans. Collections can only hold objects; they can't hold primitive data types.
Collections have performance overheads, as they deal with objects and offer dynamic memory expansion. This dynamic expansion could be a bigger overhead if the collection class needs consecutive memory locations, like Vector.
Collections don't allow modification while traversing, as it may lead to ConcurrentModificationException.
Ans. Float can represent up to 7 digits accurately after decimal point, where as double can represent
up to 15 digits accurately after decimal point.
Q259. What is the difference between >> and >>>?show Answer
Ans. Both bitwise right shift operator ( >> ) and bitwise zero fill right shift operator ( >>> ) are used to
shift the bits towards right. The difference is that >> will protect the sign bit whereas the >>> operator
will not protect the sign bit. It always fills 0 in the sign bit.
Q260. What is the difference between System.out ,System.err and System.in?show Answer
Ans. System.out and System.err both represent the monitor by default and hence can be used to
send data or results to the monitor. But System.out is used to display normal messages and results
whereas System.err is used to display error messages and System.in represents InputStream object,
which by default represents standard input device, i.e., keyboard.
Q261. Is it possible to compile and run a Java program without writing main( ) method?show
Answer
Ans. In Java 6 and earlier it was possible by printing from a static block and calling System.exit() before the JVM looked for main(); from Java 7 onwards the JVM requires a main() method, so the program will not run without one.
Q262. What are different ways of object creation in Java ?show Answer
Ans. Using the new operator - new XyzClass()
Using factory methods - XyzFactory.getInstance()
Using the newInstance() method - Class.forName("XyzClass").newInstance()
By cloning an already available object - (XyzClass) obj1.clone()
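A minimal sketch of these approaches using real JDK classes (the unchecked cast on clone() is kept for brevity, and Class.newInstance() is shown although it is deprecated in newer JDKs):
import java.util.ArrayList;

public class CreationDemo {
    public static void main(String[] args) throws Exception {
        ArrayList<String> a = new ArrayList<>();                       // new operator
        Integer b = Integer.valueOf(5);                                // factory method
        Object c = Class.forName("java.util.ArrayList").newInstance(); // reflective creation
        ArrayList<String> d = (ArrayList<String>) a.clone();           // cloning an existing object
        System.out.println(a + " " + b + " " + c + " " + d);
    }
}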
An overriding method cannot throw any new checked exception or any checked exception which is higher in the hierarchy than the exception thrown by the superclass method.
Q273. Why is Java considered Portable Language ?show Answer
Ans. Java is a portable language because, without any modification, Java byte code can be used on any platform which supports Java. So this byte code is portable and can be used on any other major platform.
Q274. Tell something about history of Java ?show Answer
Ans. Java was initially developed in 1991 by James Gosling at Sun Microsystems. At first it was called "Oak"; in 1995 it was renamed "Java". Java was designed from the start as a platform independent language. Currently Oracle (America) owns Java.
Q275. How to find if JVM is 32 or 64 bit from Java program. ?show Answer
Ans. You can find out whether the JVM is 32-bit or 64-bit by reading JVM system properties via System.getProperty() from a Java program.
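A short sketch (the "sun.arch.data.model" property is HotSpot/Oracle specific and may be absent on other JVMs; "os.arch" is a more portable hint):
public class JvmBits {
    public static void main(String[] args) {
        System.out.println(System.getProperty("sun.arch.data.model")); // e.g. "64" on HotSpot
        System.out.println(System.getProperty("os.arch"));             // e.g. "amd64"
    }
}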
Q276. Does every class needs to have one non parameterized constructor ?show Answer
Ans. No. Every Class only needs to have one constructor - With parameters or without parameters.
Compiler provides a default non parameterized constructor if no constructors is defined.
Q277. Difference between throw and throws ?show Answer
Ans. throw is used to explicitly throw an exception especially custom exceptions, whereas throws is
used to declare that the method can throw an exception.
We cannot throw multiple exceptions using throw statement but we can declare that a method can
throw multiple exceptions using throws and comma separator.
Q278. Can we use "this" within static method ? Why ?show Answer
Ans. No. Even though "this" would mean a reference to the current object if the method were called using an object reference, "this" would be ambiguous when the same static method is called using the class name.
Q279. Similarity and Difference between static block and static method ?show Answer
Ans. Both belong to the class as a whole and not to the individual objects. Static methods are
explicitly called for execution whereas Static block gets executed when the Class gets loaded by the
JVM.
Q280. What are the platforms supported by Java Programming Language?show Answer
Ans. Java runs on a variety of platforms, such as Windows, Mac OS, and the various versions of
UNIX/Linux like HP-Unix, Sun Solaris, Redhat Linux, Ubuntu, CentOS, etc
Ans. Java uses Just-In-Time compiler to enable high performance. Just-In-Time compiler is a program
that turns Java bytecode into instructions that can be sent directly to the processor.
Q282. What is IDE ? List few Java IDE ?show Answer
Ans. IDE stands of Integrated Development Environment. Few Java IDE's are WSAD ( Websphhere
Application Developer ) , RAD ( Rational Application Developer ) , Eclipse and Netbeans.
Q283. What is an Object ?show Answer
Ans. An object is a run time entity whose state is stored in fields and whose behavior is exposed via methods. Methods operate on an object's internal state and serve as the primary mechanism for object-to-object communication.
Q284. What is a Class ?show Answer
Ans. A class is a blue print or Mold using which individual objects are created. A class can contain
fields and methods to describe the behavior of an object.
Q285. According to Java Operator precedence, which operator is considered to be with
highest precedence?show Answer
Ans. Postfix operators, i.e. () [] . , are at the highest precedence.
Q286. What data type Variable can be used in a switch statement ?show Answer
Ans. Variables used in a switch statement can be a byte, short, int or char (or their wrapper types), an enum, and, since Java 7, a String.
Q289. What things should be kept in mind while creating your own exceptions in Java?show
Answer
Ans. All exceptions must be a child of Throwable.
If you want to write a checked exception that is automatically enforced by the Handle or Declare Rule, you need to extend the Exception class.
If you want to write a runtime exception, you need to extend the RuntimeException class.
Q292. What is the difference between the Reader/Writer class hierarchy and the
InputStream/OutputStream class hierarchy?show Answer
Ans. The Reader/Writer class hierarchy is character-oriented, and the InputStream/OutputStream
class hierarchy is byte-oriented
Q295. What is the difference between a break statement and a continue statement?show
Answer
Ans. Break statement results in the termination of the statement to which it applies (switch, for, do, or
while). A continue statement is used to end the current loop iteration and return control to the loop
statement.
Q299. What will happen if static modifier is removed from the signature of the main method?
show Answer
Ans. Program throws "NoSuchMethodError" error at runtime .
Q302. What are the advantages and Disadvantages of Sockets ?show Answer
Ans. Sockets are flexible and sufficient. Efficient socket based programming can be easily
implemented for general communications. It cause low network traffic.
Socket based communications allows only to send packets of raw data between applications. Both the
client-side and server-side have to provide mechanisms to make the data useful in any way.
Q303. What environment variables do I need to set on my machine in order to be able to run
Java programs?show Answer
Ans. CLASSPATH and PATH are the two variables.
Q305. What is the difference between the size and capacity of a Vector?show Answer
Ans. The size is the number of elements actually stored in the vector, while capacity is the maximum
number of elements it can store at a given instance of time.
Ans. Data hiding is the broader concept. Encapsulation is an OOP-centric concept which is one way of achieving data hiding in OOP.
Q316. Difference between Abstraction and Implementation hiding ?show Answer
Ans. Implementation Hiding is a broader concept. Abstraction is a way of implementation hiding in
OOP's.
Q317. What are the features of encapsulation ?show Answer
Ans. Combine the data of our application and its manipulation at one place.
Encapsulation Allow the state of an object to be accessed and modified through behaviors.
Reduce the coupling of modules and increase the cohesion inside them.
Q322. Which String class methods are used to make string upper case or lower case?show
Answer
Ans. toUpperCase and toLowerCase
Q323. How to convert String to byte array and vice versa?show Answer
Ans. We can use String getBytes() method to convert String to byte array and we can use String
constructor new String(byte[] arr) to convert byte array to String.
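A small sketch of the round trip, with an explicit charset to keep the conversion deterministic (class name is illustrative):
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BytesDemo {
    public static void main(String[] args) {
        String text = "hadoop";
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);    // String -> byte[]
        String back = new String(bytes, StandardCharsets.UTF_8); // byte[] -> String
        System.out.println(Arrays.toString(bytes) + " -> " + back);
    }
}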
Q324. Why Char array is preferred over String for storing password?
show Answer
Ans. String is immutable in Java and stored in the String pool. Once created it stays in the pool until it is garbage collected, so even though we are done with the password it remains in memory for a longer duration and there is no way to avoid that. It is a security risk, because anyone having access to a memory dump can find the password as clear text. A char array, by contrast, can be explicitly wiped after use.
Q325. Why String is popular HashMap key in Java?show Answer
Ans. Since String is immutable, its hashcode is cached at the time of creation and doesn't need to be calculated again. This makes it a great candidate for a key in a Map, and its processing is faster than that of other HashMap key objects. This is why String is the most commonly used object for HashMap keys.
Q326. What ate the getter and setter methods ?show Answer
Ans. Getter and setter methods are used to read and manipulate the private variables in Java beans. A getter, as the name suggests, retrieves the attribute of the same name; a setter allows you to set the value of the attribute.
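A minimal sketch of a Java bean with a getter and setter (the Employee class is purely illustrative):
public class Employee {
    private String name;                // private state, exposed only through accessors

    public String getName() {           // getter retrieves the attribute
        return name;
    }

    public void setName(String name) {  // setter assigns a new value
        this.name = name;
    }
}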
Q327. public class a {
public static void main(String args[]){
final String s1="job";
final String s2="seeker";
String s3=s1.concat(s2);
String s4="jobseeker";
System.out.println(s3==s4); // Output 1
System.out.println(s3.hashCode()==s4.hashCode()); // Output 2
}
}
What will be the Output 1 and Output 2 ?show Answer
Ans. s3 and s4 point to different memory locations, so Output 1 will be false (concat creates a new String object at run time, which is not interned).
A String's hash code is calculated from its characters and length; since both strings contain the same characters, their hash codes are equal, so Output 2 will be true.
Q328. What is the use of HashCode in objects ?show Answer
Ans. Hashcode is used for bucketing in Hash implementations like HashMap, HashTable, HashSet
etc.
Q329. Difference between Compositions and Inheritance ?show Answer
Ans. Inheritance means an object inherits reusable properties from its base class. Composition means that an object holds other objects.
With inheritance there is only one object in memory (the derived object), whereas with composition the parent object holds references to all composed objects.
From a design perspective - inheritance is an "is a" relationship among objects, whereas composition is a "has a" relationship among objects.
Q330. Will finally be called always if all code has been kept in try block ?show Answer
Ans. The only time finally won't be called is if you call System.exit() or if the JVM crashes first.
Q331. Will the static block be executed in the following code ? Why ?
class Test
{
static
{
System.out.println("Why I am not executing ");
}
public static final int param=20;
}
public class Demo
{
public static void main(String[] args)
{
System.out.println(Test.param);
}
}show Answer
Ans. No, the static block won't get executed because the referenced variable in the Test class is a
compile-time constant (static final with a constant initializer). The compiler inlines the value of the
constant into Demo.main, so no reference to the Test class is actually made and the class is never initialized.
Q332. Will static block for Test Class execute in the following code ?
class Test
{
static
{
System.out.println("Executing Static Block.");
}
public final int param=20;
public int getParam(){
return param;
}
}
public class Demo
{
public static void main(String[] args)
{
System.out.println(new Test().param);
}
}show Answer
Ans. Yes.
Q333. What does String intern() method do?show Answer
Ans. The intern() method returns the canonical copy of the string from the internal string pool: if the pool already contains an equal string that copy is returned, otherwise the string is added to the pool.
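A small sketch showing the effect (class name is illustrative):
public class InternDemo {
    public static void main(String[] args) {
        String a = new String("buggy"); // separate object on the heap
        String b = a.intern();          // canonical copy from the string pool
        System.out.println(a == b);       // false - a is not the pooled instance
        System.out.println(b == "buggy"); // true  - b and the literal share the pooled instance
    }
}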
Q334. Will the following program display "Buggy Bread" ?
class Test{
static void display(){
System.out.println("Buggy Bread");
}
}
class Demo{
public static void main(String... args){
Test t = null;
t.display();
}
}show Answer
Ans. Yes. A static method is resolved against the declared type of the reference, not the instance, so it
can be called through the class name or through a reference, even a null one.
Q335. How substring() method of String class create memory leaks?show Answer
Ans. In older Java versions (before Java 7 update 6), substring() built a new String object that kept a
reference to the whole original char array to avoid copying it. Hence you could inadvertently keep a
reference to a very big character array through just a one-character string. Recent versions copy only
the required characters, so this leak no longer occurs.
Q336. Write a program to reverse a string iteratively and recursively ?show Answer
Ans. Using a library call - new StringBuffer(str).reverse().toString();

Iterative -
public static String getReverseString(String str){
    StringBuffer strBuffer = new StringBuffer(str.length());
    for(int counter = str.length() - 1; counter >= 0; counter--){
        strBuffer.append(str.charAt(counter));
    }
    return strBuffer.toString();
}

Recursive -
public static String getReverseString(String str){
    if(str.length() <= 1){
        return str;
    }
    return getReverseString(str.substring(1)) + str.charAt(0);
}
Q337. If you have access to a function that returns a random integer from one to five, write
another function which returns a random integer from one to seven.show Answer
Ans. One way is to build a number from three random bits, assuming a helper random(2) that returns 0 or 1, and to re-roll whenever the result falls outside 1 to 7:
getRandom7() {
    int value = 0;
    for (int i = 0; i < 3; i++) {
        value = value * 2 + random(2); // append one random bit -> uniform 0..7
    }
    return (value == 0) ? getRandom7() : value; // re-roll 0 so the result is uniform over 1..7
}
A rejection-sampling alternative that works directly from the given one-to-five generator is sketched below.
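The sketch below assumes a helper random5() standing in for the given one-to-five function; it is illustrative, not part of the original answer.
import java.util.Random;

public class Random7 {
    private static final Random RND = new Random();

    // Stand-in for the given generator: uniform integer from 1 to 5.
    static int random5() {
        return RND.nextInt(5) + 1;
    }

    // 5*(random5()-1) + random5() is uniform over 1..25; values 22..25 are
    // rejected and the remaining 1..21 map evenly onto 1..7.
    static int random7() {
        int value;
        do {
            value = 5 * (random5() - 1) + random5();
        } while (value > 21);
        return (value - 1) % 7 + 1;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            System.out.println(random7());
        }
    }
}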
Q339. What will the following code print ?
String s1 = "Buggy Bread";
String s2 = "Buggy Bread";
if(s1 == s2)
System.out.println("equal 1");
String n1 = new String("Buggy Bread");
String n2 = new String("Buggy Bread");
if(n1 == n2)
System.out.println("equal 2"); show Answer
Ans. equal 1
Q340. Difference between new operator and Class.forName().newInstance() ?show Answer
Ans. The new operator is used to statically create an instance of an object. newInstance() is used to
create an object dynamically (for example when the class name needs to be picked up from a configuration
file). If you know at compile time which class needs to be instantiated, new is the more efficient way.
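A short sketch contrasting the two; the class name is read from a string to mimic a configuration value, and getDeclaredConstructor().newInstance() is used since the plain newInstance() is deprecated in newer JDKs:
public class InstantiationDemo {
    public static void main(String[] args) throws Exception {
        // Static creation - the class is known at compile time.
        StringBuilder a = new StringBuilder();

        // Dynamic creation - the class name could come from a configuration file.
        String className = "java.lang.StringBuilder";
        Object b = Class.forName(className).getDeclaredConstructor().newInstance();

        System.out.println(a.getClass() == b.getClass()); // true
    }
}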
Q341. What is Java bytecode ?show Answer
Ans. Java bytecode is the instruction set of the Java Virtual Machine. Each instruction is composed of a
one-byte opcode, followed by zero or more bytes for passing parameters.
Q342. How to find whether a given integer is odd or even without using the modulus operator in
java?show Answer
Ans. public static void main(String ar[])
{
    int n=5;
    if((n/2)*2==n)
    {
        System.out.println("Even Number ");
    }
    else
    {
        System.out.println("Odd Number ");
    }
}
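Another common way (a sketch, not from the original answer) checks the lowest bit with a bitwise AND:
public class OddEvenDemo {
    public static void main(String[] args) {
        int n = 5;
        // The lowest bit is 0 for even numbers and 1 for odd numbers.
        System.out.println((n & 1) == 0 ? "Even Number" : "Odd Number");
    }
}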
Q344. Can we use Ordered Set for performing Binary Search ?show Answer
Ans. No. Binary search needs index-based (random) access to elements, which is not possible with
Sets.
Q345. What is Byte Code ? Why Java's intermediary Code is called Byte Code ?show Answer
Ans. Bytecode is a highly optimized set of instructions designed to be executed by the Java run-time
system. It is called byte code because each opcode is one byte long.
Sample instructions in Byte Code -
1: istore_1
2: iload_1
3: sipush 1000
6: if_icmpge 44
9: iconst_2
10: istore_2
Q346. Difference between ArrayList and LinkedList ?show Answer
Ans. LinkedList and ArrayList are two different implementations of the List interface. LinkedList
implements it with a doubly-linked list. ArrayList implements it with a dynamically resizing array.
Q347. If you are given a choice to use either ArrayList and LinkedList, Which one would you
use and Why ?show Answer
Ans. ArrayList is implemented in memory as an array and hence allows fast retrieval through indices,
but inserting new elements in between other elements is costly.
LinkedList allows constant-time insertions or removals using iterators, but only sequential access
of elements.
1. Retrieval - If elements are to be retrieved sequentially only, LinkedList is fine; for random access by index, ArrayList is preferred.
2. Insertion - If new elements are to be inserted in between other elements, LinkedList is preferred.
3. Search - Binary search and other optimized ways of searching are not possible on a LinkedList.
4. Sorting - Initial sorting could be a pain, but later addition of elements to an already sorted list works well with a LinkedList.
5. Adding elements - If a sufficiently large number of elements needs to be added very frequently, LinkedList is preferable as its elements don't need consecutive memory locations.
Q348. What are the pre-requisite for the collection to perform Binary Search ?show Answer
Ans. 1. The collection should provide index-based random access.
2. The elements of the collection should already be sorted.
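A small sketch using Collections.binarySearch on a sorted, index-based list:
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class BinarySearchDemo {
    public static void main(String[] args) {
        // Random-access list whose elements are already sorted.
        List<Integer> sorted = Arrays.asList(2, 5, 8, 13, 21);
        System.out.println(Collections.binarySearch(sorted, 13)); // 3 - index of the element
    }
}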
Q349. Can you provide some implementation of a Dictionary having large number of
words ? show Answer
Ans. The simplest implementation is a List in which the words are kept sorted, so that we can perform a binary search.
An implementation with better search performance is a HashMap whose key is the first character of the
word and whose value is a LinkedList of the words starting with that character.
Going a level further, we can nest HashMaps, e.g.
hashmap {
a (key) -> hashmap ( key "aa" -> hashmap ( key "aaa" -> value ) )
b (key) -> hashmap ( key "ba" -> hashmap ( key "baa" -> value ) )
...
z (key) -> hashmap ( key "za" -> hashmap ( key "zaa" -> value ) )
}
up to n levels, where n is the average length of a word in the dictionary. A sketch of this nested-map idea follows.
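A minimal sketch of this nested-map (trie-like) idea, with illustrative class and method names:
import java.util.HashMap;
import java.util.Map;

public class NestedDictionary {

    private static class Node {
        Map<Character, Node> children = new HashMap<>();
        boolean isWord;
    }

    private final Node root = new Node();

    // Walk/extend one level per character of the word.
    public void add(String word) {
        Node current = root;
        for (char c : word.toCharArray()) {
            current = current.children.computeIfAbsent(c, k -> new Node());
        }
        current.isWord = true;
    }

    public boolean contains(String word) {
        Node current = root;
        for (char c : word.toCharArray()) {
            current = current.children.get(c);
            if (current == null) {
                return false;
            }
        }
        return current.isWord;
    }

    public static void main(String[] args) {
        NestedDictionary dictionary = new NestedDictionary();
        dictionary.add("job");
        dictionary.add("jobseeker");
        System.out.println(dictionary.contains("job"));    // true
        System.out.println(dictionary.contains("seeker")); // false
    }
}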
counter = 5;
}
BuggyBread2(int x){
counter = x;
}
public static void main(String[] args) {
BuggyBread2 bg = new BuggyBread2();
System.out.println(counter);
}
}show Answer
Ans. Compile time error, as no constructor matching BuggyBread2() is found. The compiler won't
provide the default no-argument constructor because the programmer has already defined a constructor.
The compiler treats the user-defined BuggyBread2() as an ordinary method, because a return type (void)
has been specified for it.
Q358. What will be the output of following code ?
class BuggyBread1 {
public String method() {
return "Base Class - BuggyBread1";
}
}
class BuggyBread2 extends BuggyBread1{
private static int counter = 0;
public String method(int x) {
return "Derived Class - BuggyBread2";
}
public static void main(String[] args) {
BuggyBread1 bg = new BuggyBread2();
System.out.println(bg.method());
}
}show Answer
Ans. Base Class - BuggyBread1
Although the base-class reference holds an object of the derived class, method(int) does not override
method() because the signatures differ. The derived class therefore has both method() and method(int),
which is overloading, and bg.method() resolves to the base-class definition.
Q359. What are RESTful Web Services ?show Answer
Ans. REST or Representational State Transfer is a flexible architectural style for creating web services.
It recommends the following guidelines -
1. Expose resources through URIs.
2. Use the standard HTTP methods (GET, POST, PUT, DELETE) to operate on those resources.
3. Keep the communication stateless - each request carries all the information needed to serve it.
Q360. Which markup languages can be used in restful web services ? show Answer
Ans. XML and JSON ( Javascript Object Notation ).
Q361. Difference between Inner and Outer Join ?show Answer
Ans. An inner join returns only the rows where the joined columns match in both tables, whereas an
outer join also returns the non-matching rows from one or both tables.
Ans. With the advent of the Internet, HTTP has become the most preferred way of communication. Most
clients (web thin clients, web thick clients, mobile apps) are designed to communicate over HTTP only,
so exposing web services over HTTP makes them accessible from a wide variety of client applications.
Q366. what will be the output of this code ?
public static void main(String[] args)
{
StringBuffer s1=new StringBuffer("Buggy");
test(s1);
System.out.println(s1);
}
private static void test(StringBuffer s){
s.append("Bread");
}show Answer
Ans. BuggyBread
Q367. what will be the output of this code ?
public static void main(String[] args)
{
String s1=new String("Buggy");
test(s1);
System.out.println(s1);
}
private static void test(String s){
s = s + "Bread";
}show Answer
Ans. Buggy
Q375. Why using cookie to store session info is a better idea than just using session info in
the request ?show Answer
Ans. Session info placed directly in the request can be intercepted and tampered with, which is a
vulnerability. A cookie can be read and written only by the respective domain, which helps ensure that
the right session information is being passed by the client.
Q376. What are different types of cookies ?show Answer
Ans. Session cookies, which are deleted once the session is over.
Permanent (persistent) cookies, which stay on the client machine even after the session is disconnected.
Q377. http protocol is by default ... ?show Answer
Ans. stateless
Q378. Can finally block throw an exception ?show Answer
Ans. Yes.
Q379. Can we have try and catch blocks within finally ?show Answer
Ans. Yes
Q380. Which of the following is a canonical path ?
1. C:\directory\..\directory\file.txt
2. C:\directory\subDirectory1\directory\file.txt
3. \directory\file.txtshow Answer
Ans. 2nd
Q381. What will the following code print when executed on Windows ?
public static void main(String[] args){
String parent = null;
File file = new File("/file.txt");
System.out.println(file.getPath());
System.out.println(file.getAbsolutePath());
try {
System.out.println(file.getCanonicalPath());
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}show Answer
Ans. \file.txt
C:\file.txt
C:\file.txt
try {
System.out.println(file.getCanonicalPath());
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}show Answer
Ans. ..\file.txt
C:\Workspace\Project\..\file.txt
C:\Workspace\file.txt
Q386. Which is the abstract parent class of FileWriter ?show Answer
Ans. OutputStreamWriter
Q387. Which class is used to read streams of characters from a file?show Answer
Ans. FileReader
Q388. Which class is used to read streams of raw bytes from a file?show Answer
Ans. FileInputStream
Q389. Which is the Parent class of FileInputStream ?show Answer
Ans. InputStream
Q390. Which of the following code is correct ?
a.
c.
File file = new File("../file.txt");
FileWriter fileWriter = new FileWriter(file);
BufferedWriter bufferedOutputWriter = new BufferedWriter(fileWriter);
d.
Ans. Though we are trying to serialize a BuggyBread1 object, we haven't declared the class to
implement Serializable.
This will throw java.io.NotSerializableException upon execution.
Q395. Will this code run fine if BuggyBread2 doesn't implement Serializable interface ?
class BuggyBread1 implements Serializable{
private BuggyBread2 buggybread2 = new BuggyBread2();
public static void main(String[] args){
try {
BuggyBread1 buggybread1 = new BuggyBread1();
ObjectOutputStream objectOutputStream = new ObjectOutputStream(new
FileOutputStream(new File("newFile.txt")));
objectOutputStream.writeObject(buggybread1);
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}show Answer
Ans. No, It will throw java.io.NotSerializableException.
Q396. Will this code work fine if BuggyBread2 doesn't implement Serializable ?
class BuggyBread1 extends BuggyBread2 implements Serializable{
private int x = 5;
public static void main(String[] args){
try {
BuggyBread1 buggybread1 = new BuggyBread1();
ObjectOutputStream objectOutputStream = new ObjectOutputStream(new
FileOutputStream(new File("newFile.txt")));
objectOutputStream.writeObject(buggybread1);
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}show Answer
Ans. Yes, provided the non-serializable superclass (BuggyBread2) has an accessible no-argument constructor; its state is simply not serialized.
Q397. Can we compose the Parent Class object like this ?
}
}show Answer
Ans. matcher.find() should have been used instead of matcher.next() within while.
Q409. Which methods of the Pattern class have equivalent methods in the String class? show
Answer
Ans. split() and matches()
Q410. Can we compare Integers by using equals() in Java ?show Answer
Ans. Yes for the Wrapper class Integer but not for the primitive int.
Q411. What is comparator interface used for ?show Answer
Ans. The purpose of the Comparator interface is to compare objects of the same class to identify their
sorting order. Sorted collection classes (TreeSet, TreeMap) have been designed to look for such an
ordering, which is why the elements must either implement Comparable or a Comparator must be supplied
to the collection for its objects to qualify as part of sorted collections.
Q412. Which are the sorted collections ?show Answer
Ans. TreeSet and TreeMap
Q413. What is the rule regarding overriding equals and hashCode methods ?show Answer
Ans. A class must override the hashCode method if it is overriding the equals method.
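A minimal sketch of a class that follows this rule (names are illustrative):
import java.util.Objects;

public class Point {
    private final int x;
    private final int y;

    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        Point other = (Point) o;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() {
        // Objects that are equal must return the same hash code.
        return Objects.hash(x, y);
    }
}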
Q414. What is the difference between Collection and Collections ?show Answer
Ans. Collection is an interface whereas Collections is a utility class.
Q415. Is Java a statically typed or dynamically typed language ?show Answer
Ans. Statically typed
Q416. What do you mean by "Java is a statically typed language" ?show Answer
Ans. It means that the types of variables are checked at compile time in Java. The main advantage is
that many kinds of checks can be done by the compiler, which reduces bugs.
Q417. How can we reverse the order in the TreeMap ?show Answer
Ans. Using Collections.reverseOrder()
Map tree = new TreeMap(Collections.reverseOrder());
Q419. How TreeMap orders the elements if the Key is a String ?show Answer
Ans. As String implements Comparable, It refers to the String compareTo method to identify the order
relationship among those elements.
Q420. Can we add heterogeneous elements into TreeMap ?show Answer
Ans. No. Sorted collections don't allow the addition of mutually non-comparable (heterogeneous)
elements; attempting to do so results in a ClassCastException.
Q421. Will it create any problem if We add elements with key as user defined object into the
TreeMap ?show Answer
Ans. It won't create any problem if the objects are comparable i.e we have that class implementing
Comparable interface.
Q422. Can we have null keys in TreeMap ?show Answer
Ans. No, it results in a NullPointerException (when natural ordering is used).
Q423. Can value be null in TreeMap ?show Answer
Ans. Yes.
Q424. Which interface TreeMap implements ?show Answer
Ans. TreeMap implements NavigableMap, SortedMap, Serializable and Cloneable.
Q425. Do we have form beans in Struts 2 ?show Answer
Ans. No, because they are no longer required. As action classes are no longer singletons in Struts 2,
user input can be captured in the action itself.
Q426. What is a ConcurrentHashMap ?show Answer
Ans. ConcurrentHashMap is a HashMap that allows concurrent modifications from multiple threads,
because it locks only portions of the map rather than the whole map.
Q427. What is the use of double checked locking in createInstance() of Singleton class?
Double checked locking code:
public static Singleton createInstance() {
if(singleton == null){
synchronized(Singleton.class) {
if(singleton == null) {
singleton = new Singleton();
}
}
}
return singleton;
}
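Ans. The first null check avoids acquiring the lock on every call once the instance has been created; the lock is taken only while the instance is still null, and the second check prevents two threads from creating it. Note that for this idiom to be thread safe the singleton field should be declared volatile, roughly as in the sketch below, otherwise another thread may observe a partially constructed instance:
public class Singleton {

    // volatile is needed so that other threads never see a partially constructed instance.
    private static volatile Singleton singleton;

    private Singleton() {
    }

    public static Singleton createInstance() {
        if (singleton == null) {                 // first check - no locking once initialized
            synchronized (Singleton.class) {
                if (singleton == null) {         // second check - only one thread creates it
                    singleton = new Singleton();
                }
            }
        }
        return singleton;
    }
}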
3. First level Cache is Session specific whereas Second level cache is shared by sessions that is why
First level cache is considered local and second level cache is considered global.
Q433. What are the methods to clear the cache in Hibernate ?show Answer
Ans. evict() and clear(). evict() is used to clear a particular object from the cache, whereas clear()
clears the complete local (first level) cache.
Q434. What are different types of second level cache ?show Answer
Ans. 1. EHCache ( Easy Hibernate )
2. OSCache ( Open Symphony )
3. Swarm Cache ( JBoss )
4. Tree Cache ( JBoss )
Q435. Can we disable first level cache ? What should one do if we don't want an object to be
cached ?show Answer
Ans. No. We can either call evict() after retrieving the object or use separate sessions.
Q436. How to configure second level cache in Hibernate ?show Answer
Ans. 1. Configure Provider class in Hibernate configuration file.
2. Add Cache usage tag ( read-only or read-write ) in mapping files ( hbm ).
3. Create an XML file called ehcache.xml and place it on the classpath; it contains the cache settings -
time-to-live and idle time of the POJOs, how many objects are allowed, and the update behavior of the cache.
Q437. What is Hibernate ?show Answer
Ans. Hibernate is a Java ORM Framework.
Q438. What are the advantages of Hibernate ?show Answer
Ans. 1. Much less need to hand-write SQL or to work directly with the RDBMS and DB schema.
2. The underlying database can be changed without much effort by changing the SQL dialect and the DB
connection.
3. Improved performance by means of caching.
Q439. What are the different types of inheritance in Hibernate ?show Answer
Ans. Table Per Class , Table per Sub Class , Table per Concrete Class
Q440. What is the purpose of dialect configured in Hibernate configuration file ?show Answer
Ans. It tells the framework which SQL variant to generate.
Q441. Please specify in what sequence the objects of following classes will be created ?
Session , SessionFactory, Query , Configurationshow Answer
3. Make sure that we are accessing the dependent objects before closing the session.
4. Using Fetch Join in HQL.
Q467. What is cascade ?show Answer
Ans. Instead of saving parent and child entities individually, Hibernate provides the option to
persist / delete the related entities automatically when the parent is persisted / deleted.
Q468. What are the different Cascade types ?show Answer
Ans. Detach, Merge , Persist , Remove , Refresh
Q469. Which type of associated Entities are Eagerly loaded by Default ?show Answer
Ans. OneToOne and ManyToOne associations are eagerly loaded by default.
Q470. After which Hibernate version , related Entities are initialized lazily ?show Answer
Ans. After Hibernate 3.0
Q471. Can we declare Entity class as final ?show Answer
Ans. Yes, but it is not recommended. Hibernate creates proxy classes that inherit from the entity
classes in order to communicate with the database lazily. Declaring an entity class final prevents those
proxies from being created, so lazy loading is lost and performance suffers.
Q472. What are the restrictions for the entity classes ?show Answer
Ans. 1. Entity classes should have a default constructor.
2. Entity classes should be declared non-final.
3. All elements to be persisted should be declared private and should have public getters and setters
in the Java Bean style.
4. All classes should have an ID that maps to the primary key of the table.
Q473. What is the difference between int[] x; and int x[]; ?show Answer
Ans. No Difference. Both are the acceptable ways to declare an array.
Q474. What are the annotations used in Junit with Junit4 ?show Answer
Ans. @Test
The Test annotation indicates that the public void method to which it is attached can be run as a test
case.
@Before
The Before annotation indicates that this method must be executed before each test in the class, so
as to execute some preconditions necessary for the test.
@BeforeClass
The BeforeClass annotation indicates that the static method to which it is attached must be executed
once, before all tests in the class.
@After
The After annotation indicates that this method gets executed after execution of each test.
@AfterClass
The AfterClass annotation can be used when a method needs to be executed after executing all the
tests in a JUnit Test Case class so as to clean-up the set-up.
@Ignore
The Ignore annotation can be used when you want to temporarily disable the execution of a specific
test.
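A minimal JUnit 4 sketch putting some of these annotations together; it assumes JUnit 4 on the classpath and uses a plain StringBuilder as the subject under test to keep the example self-contained:
import static org.junit.Assert.assertEquals;

import org.junit.Before;
import org.junit.Test;

public class StringReverseTest {

    private StringBuilder builder;

    @Before
    public void setUp() {
        // Runs before each test - sets up the precondition.
        builder = new StringBuilder("Buggy");
    }

    @Test
    public void reversesTheString() {
        assertEquals("ygguB", builder.reverse().toString());
    }
}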
Q475. What is asynchronous I/O ?show Answer
Ans. It is a form of Input Output processing that permits other processing to continue before the I/O
transmission has finished.
Q476. If there is a conflict between Base Class Method definition and Interface Default method
definition, Which definition is Picked ?show Answer
Ans. Base Class Definition.
Q477. What are new features introduced with Java 8 ?show Answer
Ans. Lambda Expressions, Interface Default and Static Methods, Method References, access to Parameter
Names via reflection, Optional, Streams, and Concurrency enhancements.
Q478. Can we have a default method without a Body ?show Answer
Ans. No. Compiler will give error.
Q479. Does java allow implementation of multiple interfaces having Default methods with
Same name and Signature ?show Answer
Ans. Only if the implementing class overrides the conflicting method; otherwise it is a compilation error.
Q480. What are Default Methods ?show Answer
Ans. With Java 8, we can provide method definitions in interfaces; these are inherited by the classes
implementing the interface unless they are overridden by the class. The keyword "default" is used to
mark a default method.
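A minimal sketch of a default method (interface and class names are illustrative):
public class DefaultMethodDemo {

    interface Greeting {
        // Default method - inherited by implementing classes unless overridden.
        default String greet() {
            return "Hello from the interface";
        }
    }

    static class EnglishGreeting implements Greeting {
        // No override, so the interface's default body is used.
    }

    public static void main(String[] args) {
        System.out.println(new EnglishGreeting().greet()); // Hello from the interface
    }
}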
Q481. Can we have a default method definition in the interface without specifying the keyword
"default" ?show Answer
Ans. No. Compiler complains that its an abstract method and hence shouldn't have the body.
Q482. Can a class implement two Interfaces having default method with same name and
signature ?
public interface DefaultMethodInterface {
default public void defaultMethod(){
System.out.println("DefaultMethodInterface");
}
}
public interface DefaultMethodInterface2 {
default public void defaultMethod(){
System.out.println("DefaultMethodInterface2");
}
}
public class HelloJava8 implements DefaultMethodInterface,DefaultMethodInterface2 {
public static void main(String[] args){
DefaultMethodInterface defMethIn = new HelloJava8();
defMethIn.defaultMethod();
}
}show Answer
Ans. No. Compiler gives error saying "Duplicate Default Methods"
}
}show Answer
Ans. Even then the Compiler will give error saying that there is a conflict.
Q484. What if we override the conflicting method in the Class ?
public interface DefaultMethodInterface {
default public void defaultMethod(){
System.out.println("DefaultMethodInterface");
}
}
public interface DefaultMethodInterface2 {
default public void defaultMethod(){
System.out.println("DefaultMethodInterface2");
}
}
public class HelloJava8 implements DefaultMethodInterface,DefaultMethodInterface2 {
public static void main(String[] args){
DefaultMethodInterface defMethIn = new HelloJava8();
defMethIn.defaultMethod();
}
public void defaultMethod(){
System.out.println("HelloJava8");
}
}show Answer
Ans. There won't be any error and upon execution the overriding class method will be executed.
Q485. What will happen if there is a default method conflict as mentioned above and we have
specified the same signature method in the base class instead of overriding in the existing
class ?
show Answer
Ans. There won't be any problem as the Base class method will have precedence over the Interface
Default methods.
Q486. If a method definition has been specified in Class , its Base Class , and the interface
which the class is implementing, Which definition will be picked if we try to access it using
Interface Reference and Class object ? show Answer
Ans. Class method definition is overriding both the definitions and hence will be picked.
Q487. If a method definition has been specified in the Base Class and the interface which the
class is implementing, Which definition will be picked if we try to access it using Interface
Reference and Class object ? show Answer
Ans. Base Class Definition will have precedence over the Interface Default method definition.
Q498. What things you would care about to improve the performance of Application if its
identified that its DB communication that needs to be improved ?show Answer
Ans. 1. Query Optimization ( Query Rewriting , Prepared Statements )
2. Restructuring Indexes.
3. DB Caching Tuning ( if using ORM )
4. Identifying the problems ( if any ) with the ORM Strategy ( If using ORM )