Elastic Search

Elasticsearch is a distributed, open source search and analytics engine that allows storing and searching of data in JSON format. It provides fast search responses by searching an index instead of text directly. Logstash is used to collect, filter and forward logs from various sources to Elasticsearch for storage and analysis. Kibana provides a web interface to visualize Elasticsearch data and create charts, reports and dashboards. The ELK stack (Elasticsearch, Logstash, Kibana) along with Beats is commonly used for log analysis and monitoring.


Reference: https://www.codementor.io/samueljames/using-docker-with-elasticsearch-logstash-and-kibana-elk-dzucc3c94

What is Elastic Search?


Elasticsearch is a real-time, distributed, open-source full-text search and analytics engine.
It achieves fast search responses because, instead of searching the text directly,
it searches an index. Elasticsearch is developed in Java.

Elasticsearch returns JSON objects as responses, which makes it possible to invoke the Elasticsearch
server from a large number of different programming languages.
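Because the responses are plain JSON, any language with a JSON parser can consume them. A minimal Python sketch, using a trimmed, hypothetical search response body (real Elasticsearch responses carry more metadata such as took and _shards):

```python
import json

# Trimmed, hypothetical example of a search response body; the index
# contents and ids here are made up for illustration.
response_body = '''
{
  "hits": {
    "total": {"value": 1},
    "hits": [
      {"_id": "1", "_source": {"title": "Elasticsearch basics"}}
    ]
  }
}
'''

data = json.loads(response_body)

# Pull the matching documents out of the response
titles = [hit["_source"]["title"] for hit in data["hits"]["hits"]]
print(titles)
```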
--------------------------------------------------------------------------------------------------------------------------------------
Advantages of Elastic Search:
Fast - Search results would be returned almost instantly, in order to provide a responsive user
experience.
Flexible - We'll want to be able to modify how the search is performed, in order to optimize for
different datasets and use cases.
Forgiving - If a search contains a typo, we'd still like to return relevant results for what the user
might have been trying to search for.
Full-Text - We don't want to limit our search to specific matching keywords or tags - we want to
search everything in our datastore (including large text fields) for a match.
--------------------------------------------------------------------------------------------------------------------------------------

Elastic Search Installation on Windows:


Prerequisites: Java 1.7 or higher

--------------------------------------------------------------------------------------------------------------------------------------

Elastic Search Installation using an Image:


Command to run:
docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" \
  docker.elastic.co/elasticsearch/elasticsearch:7.1.0

--------------------------------------------------------------------------------------------------------------------------------------
==================================================================================

How can I add a collection?


Data is stored as records (documents) in JSON format: you put values in, retrieve data back,
and can pull multiple records at once. Each field in a document is a key-value pair.
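The ideas above can be made concrete with a small sketch. This is not the Elasticsearch API - just a plain Python dict standing in for an index, with made-up ids and field values - to show the document model: records are JSON documents, each field is a key-value pair, and you can pull one record or many.

```python
import json

# A plain dict standing in for an index; ids and field values are hypothetical.
index = {}

# Put values: each record (document) is a set of key-value pairs (fields)
index["1"] = {"title": "Elasticsearch basics", "views": 162}
index["2"] = {"title": "Logstash pipelines", "views": 98}

# Retrieve data: pull a single record by id
doc = index["1"]

# Multiple records pull: filter records on a field value
popular = [d for d in index.values() if d["views"] > 100]

# Documents serialize naturally to and from JSON
print(json.dumps(doc))
```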

ELK Stack (now renamed the Elastic Stack) + Beats


The purpose of the ELK stack is log monitoring / log analytics.

Any application that is running will generate a log. We call it the application log.

In general, the machine for installing Elasticsearch needs more storage space. Similarly,
Logstash and Kibana need more compute power, so that processing the data and
search operations are really fast.

Advantages of having such a log:


The support team can monitor and understand failures
The DevOps team can plan new features, because we can run analytics on the logs
Beneficial to the support/ops teams and to the business-generation team as well

How do Google/Flipkart/Amazon come up with new features?


They can't invent from scratch every time. These companies analyse your usage patterns, and for
that they make use of these logs.
Ex: the Firefox user experience improves release over release.

Other tools that do a similar job are Splunk, Loggly, etc.


Disadvantages of those tools:
You need to pay heavily for the servers on which Splunk etc. are installed
Licensing costs

ELK itself costs nothing, and enterprise support for ELK is comparatively cheap.

Elasticsearch - the database; logs are stored here.


Logstash - converts the log files into an understandable format
Kibana - the UI; browser based
Beats - lightweight data shippers (covered below)

Example of a log line format:
<DateTime>-<OrgName>-<Product>-<Component>-<Severity: ERROR/WARNING/FATAL>-<LogMessage>

Beats - personalisation/custom alerts for a product;
performance metrics/build performance, etc.
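As a sketch, a log line in the format above can be split into fields with a regular expression. The sample line and all field values here are hypothetical; the severity alternation anchors the split, since the datetime itself contains '-' characters.

```python
import re

# <DateTime>-<OrgName>-<Product>-<Component>-<Severity>-<LogMessage>
LOG_PATTERN = re.compile(
    r"^(?P<datetime>.+?)-(?P<org>\w+)-(?P<product>\w+)-(?P<component>\w+)"
    r"-(?P<severity>ERROR|WARNING|FATAL)-(?P<message>.*)$"
)

# Hypothetical sample line
line = "2024-05-01T10:32:00-Globomantics-Payments-AuthService-ERROR-Connection refused"
m = LOG_PATTERN.match(line)
print(m.group("severity"), "-", m.group("message"))
```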
Architecture:
In production, Elasticsearch, Logstash, Kibana and Beats should run on different servers.

ElasticSearch:
Free
Opensource
Great at full text searching

Highly scalable
Built-in search, aggregation and sharding
Used by Microsoft Azure, WordPress and Stack Exchange
It is a Database

Takes a monitoring setup from non-existent to fully fledged and enterprise-ready


Web based monitoring and historical searching
proactive alerting solution

Distributed, fast, highly scalable document database


Created by Shay Banon in 2010
Can be set up on a simple single-node cluster
It is much like a NoSQL database

For Images, go for Hadoop stack

Logstash:
Aggregates, filters and supplements log data
Forwards logs to Elasticsearch
Sending logs directly to Elasticsearch without logstash can lead to inconsistent data

Kibana:
Web based front-end
Works easily with Elasticsearch for charts, graphs and visualizing data
Free from the Elastic Company

Beats:
Small, lightweight utilities for reading logs from a variety of sources. Usually sends data to Logstash.
It can send data directly to Elasticsearch but it would be inconsistent.
Filebeat: Text log files
Metricbeat: OS and applications
Packetbeat: Network Monitoring
Winlogbeat: Windows Event Log
Libbeat: Write your own

Alerting:
Helps track conditions based on Elasticsearch data
Continually monitors log data for pre-configured conditions
Sends notifications to email, Slack, HipChat and PagerDuty out of the box.

We can monitor disk, CPU & RAM, application and network logs across multiple geographical
locations for each of the servers.
On top, we have multiple environments/operating systems: Windows machines, Linux boxes,
HP-AIX, Solaris, etc.

[Beat Components]
Filebeat -->
Packetbeat --> LogStash --> Elasticsearch --> Kibana
Metricbeat -->
Winlogbeat -->

How do we set up / build the system?


Start from the back and work forward
Usually Elasticsearch clusters comprise many nodes
We are keeping things simple with one Elasticsearch node

System Requirements:
What kind of OS should we use?
Elasticsearch runs fine on Linux & Windows
We are going to choose Linux and use distribution packages
Ubuntu 16.10 Server Edition
We can also install on Windows

Set up machines in different regions (t2.medium) using AWS:


First we will explore the docs: https://github.com/elastic/elasticsearch
Works with OpenJDK or Oracle Java 8 (1.8.0_131); 4 GB RAM and a 2-core CPU are the requirements

Let me be root user: sudo -i


apt-get update

Change the IP machine to Static IP Address in AWS(Steps to change are below):

Installing Java:
apt-get install openjdk-8-jre-headless
java -version

Suppose I install some newer version of Java in the future. At that point 1.8.0_131 would no
longer be the default version of Java; the latest version would be the default. How do I stop that?
Use alternatives. But a much better approach is to pin your version of Java in .bashrc.

(set is the command that prints all the environment variables.)
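For example, a hypothetical .bashrc fragment that pins Java 8 as the default; the path below is the usual Ubuntu install location for openjdk-8 and may differ on your machine:

```shell
# Hypothetical ~/.bashrc lines pinning Java 8 as the default JDK;
# adjust the path to wherever your JDK is installed.
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH="$JAVA_HOME/bin:$PATH"
```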
Download the elasticsearch package to a folder pkg
cd /home/ubuntu
mkdir pkg
cd pkg
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.6.3.deb
(if it was rpm it would be rpm -ivh)
dpkg -i elasticsearch-5.6.3.deb
Next we have to change the configuration of the elasticsearch:
vim /etc/elasticsearch/elasticsearch.yml
In this file, make these changes. Remove the comment(#)
cluster.name: globo-clustering
node.name: ec2-52-32-194-200.us-west-2.compute.amazonaws.com (Provide the public DNS name)
Change network.host: <Public ip address>

Add this port in Inbound rules:


Custom TCP --> TCP --> 9200 --> Custom --> 0.0.0.0/0(which means anywhere) --> Click Save.

Memory-mapped files give the illusion that you are working on files while you are actually
working in RAM, so access is really fast; this helps with indexing, faster search, etc.
Increase the memory map count with "sysctl -w vm.max_map_count=262144"
Start the elasticsearch cluster service as the root user: "service elasticsearch start"
service elasticsearch status --> did it fail or succeed?
Test by executing curl http://<ipaddress>:9200

cd /etc/elasticsearch/
ls
There are no log files here.

which elasticsearch
whereis elasticsearch
/usr/share/elasticsearch
But there are no logs there either.

So where are the default logs? cd /var/log/elasticsearch/


less globo-clustering.log

The error is "failed to bind to [9300-9400]" - the transport ports are blocked.

To fix this error, open the ports in the security group's inbound rules:


All Traffic --> All --> 0-65535 --> Custom --> 0.0.0.0/0 (which means anywhere) --> Click Save.
(Opening everything to 0.0.0.0/0 is acceptable for a demo, not for production.)

service elasticsearch start


service elasticsearch status
It says active(running)

To see if it is working or not: http://52.32.194.200:9200 (EC2 public IP address)

How do you test whether a particular port is open?


mkdir test
cd test
wget http://localhost:9200 --> does it say "Connection refused", or does it work?

In Chrome, press F12; the Network tab will tell me where exactly my network issue is.
Or
the easiest alternative: install Postman from the Chrome Web Store
GET request: 52.32.194.200:9200, then click Send
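The steps above check the port by hand; a small scripted check works too. A minimal Python sketch, assuming only that the host and port are reachable over TCP (the IP shown earlier is this demo's EC2 address):

```python
import socket

def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# e.g. port_open("52.32.194.200", 9200) for the Elasticsearch node above
```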

If it still fails:
vim /etc/elasticsearch/elasticsearch.yml
goto network.host...Replace it with Private IP address(172.31.19.216)
service elasticsearch start
service elasticsearch status

Again try it on postman: http://52.32.194.200:9200

Now try it the same on the chrome browser. http://52.32.194.200:9200

systemctl enable elasticsearch --> each time I restart my machine, the elasticsearch service
should come up automatically; that's why I do this.

Extra Notes:
Add this port in Inbound rules:
All Traffic --> All --> 0-65535 --> Custom --> 0.0.0.0/0(which means anywhere) --> Click Save.
Custom TCP --> TCP --> 9200 --> Custom --> 0.0.0.0/0(which means anywhere) --> Click Save.
To start elastic search on boot: "systemctl enable elasticsearch"

Modify the elasticsearch configuration file present at /etc/elasticsearch/elasticsearch.yml

How to create a static IP address in EC2?


In AWS, go to Elastic IPs --> Allocate new address
A new IP address will be created. Right-click the new address and choose Associate address.
Next select the instance name from the Instance field.
Providing a private IP address is optional --> click Associate.

Instance State: Running

LogStash is a Data Collection Engine


Beats ship the logs to the LogStash server; Beats ingest the logs for us.
Logstash enhances or modifies the logs and forwards them to Elasticsearch

Logstash Configuration:
input{} --> Where is data coming from? Beats or Logs?
filter{} --> How should we parse the data? Ignore some? Modify any?
output{} --> Where should we store the logs? Back end? Elasticsearch?

Logstash plugins:
Out of the box it can read Apache logs, log4j files, Windows Event Logs, and more...
The included filters can read raw text, parse CSV, look up geo/location information by
IP address, or read JSON
Dozens of filters are included by default

Logstash Filters:
grok filters: parse unstructured log text into named fields
geoip filters: add geographical location information based on an IP address
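Putting the three sections together, a pipeline file (for example under /etc/logstash/conf.d/) might look like this sketch. The Beats port 5044, the COMBINEDAPACHELOG grok pattern and the geoip source field are conventional examples, not values taken from this setup:

```
input {
  beats { port => 5044 }                # logs arrive from Beats shippers
}
filter {
  grok {                                # parse raw text into named fields
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  geoip { source => "clientip" }        # add location data from the client IP
}
output {
  elasticsearch { hosts => ["localhost:9200"] }
}
```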

How to install Logstash:


Launch another Ubuntu server:(t2.small)
Inbound rules: Select the one which has both of the below:
All Traffic --> All --> 0-65535 --> Custom --> 0.0.0.0/0(which means anywhere) --> Click Save.
Custom TCP --> TCP --> 9200 --> Custom --> 0.0.0.0/0(which means anywhere) --> Click Save.
Perform the same static/Elastic IP
Login as root
apt-get update
Install java jdk:
apt-get install openjdk-8-jre-headless

Run the following command to import the ElasticSearch public GPG key into apt
wget -qO - https://packages.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

Create the ElasticSearch source list:


echo "deb https://artifacts.elastic.co/packages/5.x/apt stable main" | sudo tee -a
/etc/apt/sources.list.d/elasticsearch-5.x.list
(or)
echo "deb http://packages.elastic.co/elasticsearch/2.x/debian stable main" | sudo tee -a
/etc/apt/sources.list.d/elasticsearch-2.x.list

Execute: apt-get update && apt-get install logstash


Logstash is installed in /usr/share/logstash; move to this directory using cd

How to check logstash is working?


cd /usr/share/logstash
Now execute this command:
bin/logstash -e "input { stdin {} } output { stdout {} }"
Takes nearly 2mins but will output something...
Type hello and press enter

It will output the timestamp and then print the word: hello

Visualising with Kibana: It is the frontend component.


General graphing and visualization tool written in Node.js
Free, works great with Elasticsearch, includes a ton of visualization options and widgets.
Easy to create useful dashboards and share them with coworkers

Installing Kibana on Ubuntu is very easy


Elastic company maintains .deb packages for Debian based systems

Ubuntu t2.micro is enough for Kibana.


Select a security group where all the traffic is allowed just like we did it for Elasticsearch.

apt-get update
We don't need Java here; Kibana runs on Node.js.
wget -qO - https://packages.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
echo "deb https://artifacts.elastic.co/packages/5.x/apt stable main" | sudo tee -a
/etc/apt/sources.list.d/elasticsearch-5.x.list
apt-get update && apt-get install kibana
Modify /etc/kibana/kibana.yml
server.host: 172.31.31.0 --> Enter the private IP address
server.name: Private DNS (ip-172-31-31-0.us-west-2.compute.internal)
elasticsearch.url: "http://52.32.194.200:9200" --> Public IP
Execute: service kibana start
service kibana status

Now try to access using web browser: http://54.69.248.213:5601

Instrumenting Windows Servers:


I need to send details of RAM, CPU, disk and the Event Log

To achieve this, I will use two Beat components: Winlogbeat and Metricbeat

Beat components are written in Go (GoLang)


Go programs are static binaries; there is no need for a JVM or other runtime
They can be "cross-compiled" to work on Windows, Linux, macOS, and BSD
They are usually pretty small and lightweight - great for system utilities
Note:
Docker is also written in GoLang
Usually large companies have dozens, hundreds or even thousands of servers.
For our purposes, we are going to use two windows web servers and one windows file server
Will keep data diverse enough for our demonstrations.

Process: Download and unpack winlogbeat


Configure it to use logstash and add some custom fields and data
Set it up to run as a windows service

Installing Winlogbeat:
Download the winlogbeat file from elastic site
Extract the zip file and change the following in winlogbeat.yml:

tags: ["us-east-1"]
fields:
  globo_environment: production

Enable logstash configuration

From PowerShell, install the Winlogbeat index template using the following command:


Invoke-WebRequest -Method PUT -InFile .\winlogbeat.template.json -Uri http://<elasticsearchserver>:9200/_template/winlogbeat

From PowerShell, install the Winlogbeat service using the following command:


.\install-service-winlogbeat.ps1
Start the service using Start-Service winlogbeat

Type winlogbeat into Google; use the first link, on elastic.co.


In Windows, open Run and type eventvwr; the different logs shown there are what we are going
to parse.
We will send these logs to our Elasticsearch server

Go for the GA release


Download the 64-bit version and extract it. Rename the folder to winlogbeat.
Move the folder to C:\Program Files
Click on GitHub on the right for the documentation
Start and run Windows PowerShell.
cd "C:\Program Files\Winlogbeat"
code .
This opens the folder in Visual Studio Code
In winlogbeat.yml, make these modifications:
tags: ["us-east-1"]
fields:
  globo_environment: production

Comment out the output.elasticsearch section.


Uncomment the output.logstash section.
hosts: ["34.212.62.67:5044"] --> the Logstash server's public IP

Class-2
