Elastic Search
Elasticsearch accepts requests and returns responses as JSON over HTTP, which makes it easy to
invoke the Elasticsearch server from a large number of different programming languages.
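Because responses are plain JSON, any language with a JSON parser can consume them. A minimal sketch in Python; the response body below is illustrative of the 5.x response shape, not captured from a live cluster:

```python
import json

# Illustrative sample of an Elasticsearch 5.x search response (plain JSON).
# A real response would come back from an HTTP GET to http://<host>:9200/<index>/_search
sample_response = '''
{
  "took": 5,
  "timed_out": false,
  "hits": {
    "total": 2,
    "hits": [
      {"_index": "logs", "_id": "1", "_source": {"message": "login ok"}},
      {"_index": "logs", "_id": "2", "_source": {"message": "login failed"}}
    ]
  }
}
'''

# Parsing the response is just ordinary JSON handling, in any language.
data = json.loads(sample_response)
for hit in data["hits"]["hits"]:
    print(hit["_source"]["message"])
```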
--------------------------------------------------------------------------------------------------------------------------------------
Advantages of Elasticsearch:
Fast - Search results would be returned almost instantly, in order to provide a responsive user
experience.
Flexible - We'll want to be able to modify how the search is performed, in order to optimize for
different datasets and use cases.
Forgiving - If a search contains a typo, we'd still like to return relevant results for what the user
might have been trying to search for.
Full-Text - We don't want to limit our search to specific matching keywords or tags - we want to
search everything in our datastore (including large text fields) for a match.
--------------------------------------------------------------------------------------------------------------------------------------
==================================================================================
Any application that is running generates a log; we call this the application log.
In general, Elasticsearch needs a machine with plenty of storage space, while Logstash and
Kibana need more compute power, so that processing the data and running searches stays really
fast.
The core ELK stack is free of cost, and enterprise support from Elastic is comparatively inexpensive.
ElasticSearch:
Free
Opensource
Great at full text searching
Highly scalable
Built-in search, aggregation and sharding
Used by Microsoft Azure, Wordpress and Stack Exchange
It is a document-oriented database
Logstash:
Aggregates, filters and supplements log data
Forwards logs to Elasticsearch
Sending logs directly to Elasticsearch without logstash can lead to inconsistent data
Kibana:
Web based front-end
Works easily with Elasticsearch for charts, graphs and visualizing data
Free from the Elastic Company
Beats:
Small, lightweight utilities for reading logs from a variety of sources; they usually send data to Logstash.
They can send data directly to Elasticsearch, but the resulting data may be inconsistent.
Filebeat: Text log files
Metricbeat: OS and applications
Packetbeat: Network Monitoring
Winlogbeat: Windows Event Log
Libbeat: Write your own
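As an example of the Beats-to-Logstash flow, a minimal Filebeat configuration for the 5.x series used in these notes might look like this (the log path and Logstash host are assumptions):

```
# filebeat.yml (Filebeat 5.x syntax) -- path and host are example values
filebeat.prospectors:
  - input_type: log
    paths:
      - /var/log/myapp/*.log

output.logstash:
  hosts: ["logstash.example.com:5044"]
```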
Alerting:
Helps track conditions based on Elasticsearch data
Continually monitors log data for pre-configured conditions
Sends notifications to email, Slack, HipChat and PagerDuty out of the box.
We can monitor disk, CPU & RAM, application and network logs for each server across multiple
geographical locations.
On top of that, we have multiple environments/operating systems: Windows machines, Linux boxes,
HP-UX, Solaris, etc.
[Beat Components]
Filebeat -->
Packetbeat --> LogStash --> Elasticsearch --> Kibana
Metricbeat -->
Winlogbeat -->
System Requirements:
What kind of OS should we use?
Elasticsearch runs fine on Linux & Windows
We are going to choose Linux and use distribution packages
Ubuntu 16.10 Server Edition
We can also install on Windows
Installing Java:
apt-get install openjdk-8-jre-headless
java -version
Suppose a newer version of Java is installed in the future. At that point 1.8.0_131 would no longer
be the default; the latest version would be. How do we prevent that?
One option is update-alternatives, but a much better approach is to pin your Java version in .bashrc.
(Note: `env` prints the environment variables; `set` also prints all shell variables and functions.)
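A sketch of the .bashrc approach; the JDK path below is an assumption, so check the actual install location under /usr/lib/jvm first:

```
# Append to ~/.bashrc -- pins the Java this user's shells see.
# The JDK path is an assumption; verify yours with: ls /usr/lib/jvm
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH="$JAVA_HOME/bin:$PATH"
```

Putting $JAVA_HOME/bin first in PATH means `java` resolves to this JDK regardless of what the system default later becomes.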
Download the elasticsearch package to a folder pkg
cd /home/ubuntu
mkdir pkg
cd pkg
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.6.3.deb
(on an RPM-based distribution the equivalent would be rpm -ivh)
dpkg -i elasticsearch-5.6.3.deb
Next we have to change the configuration of the elasticsearch:
vim /etc/elasticsearch/elasticsearch.yml
In this file, make these changes (uncomment the lines by removing the leading #):
cluster.name: globo-clustering
node.name: ec2-52-32-194-200.us-west-2.compute.amazonaws.com (Provide the public DNS name)
Change network.host: <Public ip address>
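Collected into one fragment, the edits to elasticsearch.yml look like this (example values from these notes; on EC2 the network interface actually holds the private address, which is why the notes later switch network.host to it):

```
# /etc/elasticsearch/elasticsearch.yml -- example values from these notes
cluster.name: globo-clustering
node.name: ec2-52-32-194-200.us-west-2.compute.amazonaws.com
network.host: 172.31.19.216   # private IP; binding the public IP fails on EC2
```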
Elasticsearch uses memory-mapped files: the OS maps index files into virtual memory, giving the
illusion of working on files while effectively working in RAM. That makes indexing and search really fast.
First, increase the memory map count (as root) with "sysctl -w vm.max_map_count=262144"
(note: sysctl, not systemctl).
Then start the Elasticsearch service as root: service elasticsearch start
service elasticsearch status --> did it fail or start successfully?
Test by executing curl http://<ipaddress>:9200
cd /etc/elasticsearch/
ls
there are no log files alongside the configuration
which elasticsearch
whereis elasticsearch
/usr/share/elasticsearch
But there are no logs there either.
In Chrome, press F12 and use the Network tab to see exactly where the request is failing.
Or
the easiest alternative is Postman (from the Chrome Web Store):
GET request: 52.32.194.200:9200, then click Send
If it still fails:
vim /etc/elasticsearch/elasticsearch.yml
goto network.host...Replace it with Private IP address(172.31.19.216)
service elasticsearch start
service elasticsearch status
systemctl enable elasticsearch --> so that the Elasticsearch service comes up automatically each
time the machine restarts; that is why we do this.
Extra Notes:
Add these inbound rules (0.0.0.0/0 means anywhere; fine for a lab, not recommended in production):
All Traffic --> All --> 0-65535 --> Custom --> 0.0.0.0/0 --> Click Save.
Custom TCP --> TCP --> 9200 --> Custom --> 0.0.0.0/0 --> Click Save.
To start elastic search on boot: "systemctl enable elasticsearch"
Logstash Configuration:
input{} --> Where is data coming from? Beats or Logs?
filter{} --> How should we parse the data? Ignore some? Modify any?
output{} --> Where should we store the logs? Back end? Elasticsearch?
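A minimal logstash.conf following the three-section layout above might look like this (the Beats port, the drop condition, and the index name are assumptions):

```
input {
  beats {
    port => 5044          # standard Beats listener port
  }
}

filter {
  # drop health-check noise before it reaches storage (illustrative condition)
  if [message] =~ /healthcheck/ {
    drop { }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "weblogs-%{+YYYY.MM.dd}"   # one index per day
  }
}
```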
Logstash plugins:
Out of the box can read apache logs, log4j files, Windows Event Logs, and more...
Included filters can parse raw text, CSV, or JSON, and can look up geo/location information by IP
address
Dozens of filters are included by default
Logstash Filters:
grok filters: parse unstructured log text into structured, named fields using pattern matching
geoip filters: enrich events with geographical location information derived from an IP address
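A sketch showing both filters together (the log-line format and field names are assumptions; the grok patterns IP, WORD and URIPATHPARAM are stock patterns):

```
filter {
  # extract client IP, HTTP verb and path from a line such as:
  #   1.2.3.4 GET /index.html
  grok {
    match => { "message" => "%{IP:client_ip} %{WORD:verb} %{URIPATHPARAM:request}" }
  }
  # add location fields (country, city, coordinates) for the extracted IP
  geoip {
    source => "client_ip"
  }
}
```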
Run the following command to import the ElasticSearch public GPG key into apt
wget -qO - https://packages.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
apt-get update
We don't need Java here; Kibana runs on Node.js.
wget -qO - https://packages.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
echo "deb https://artifacts.elastic.co/packages/5.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elasticsearch-5.x.list
apt-get install kibana
Modify /etc/kibana/kibana.yml
server.host: 172.31.31.0 --> Enter the private IP address
server.name: Private DNS (ip-172-31-31-0.us-west-2.compute.internal)
elasticsearch.url: "http://52.32.194.200:9200" --> Public IP
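The resulting kibana.yml fragment, using the example addresses from these notes (elasticsearch.url is the 5.x setting name):

```
# /etc/kibana/kibana.yml -- example values from these notes
server.host: "172.31.31.0"
server.name: "ip-172-31-31-0.us-west-2.compute.internal"
elasticsearch.url: "http://52.32.194.200:9200"
```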
Execute: service kibana start
service kibana status
To achieve this, I would be using two beat components: winlogbeat and metricbeat
Installing Winlogbeat:
Download the winlogbeat file from elastic site
Extract zip file and change the following in winlogbeat.yml
tags: ["us-east-1"]
fields:
  globo_environment: production
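A fuller winlogbeat.yml sketch incorporating the settings above, in the 5.x syntax used by these notes (the event-log channels and the Logstash host are assumptions):

```
# winlogbeat.yml (Winlogbeat 5.x) -- channels and host are example values
winlogbeat.event_logs:
  - name: Application
  - name: System
  - name: Security

tags: ["us-east-1"]
fields:
  globo_environment: production

output.logstash:
  hosts: ["logstash.example.com:5044"]
```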
Class-2