ELASTICSEARCH INTRO
Tom Chen 陳炯廷
ctchen@gmail.com
ABOUT ME
• engineer @ iF+TechArt 當若科技藝術
• full stack / CTO @ House123
• engineer @Trend Micro
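Curation and project management
Management
| Exhibition planning
| Space planning
| Project management
| Partner liaison / execution coordination
| Budget estimation / tabulation
Interactive design
Interactive
| Human-factors interaction design
| Display equipment
| Interactive content production
| Customized end-to-end project delivery
| Industrial-grade electromechanical control / system integration
| Structural design / construction
| Reality games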
So lately I have actually been working more with
tornado, raspberry pi…
but... that's OK!
elasticsearch introduction
Elasticsearch is a flexible and powerful open source, distributed, real-time search and analytics engine. Architected from the ground up for use in distributed environments where reliability and scalability are must haves, Elasticsearch gives you the ability to move easily beyond simple full-text search. Through its robust set of APIs and query DSLs, plus clients for the most popular programming languages, Elasticsearch delivers on the near limitless promises of search technology.
• store data
• search
• scalable
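It is a bit like a database
but not quite the same
You can use it as a
DB or a NoSQL store
but it still depends on what you want to achieve
Let's look at its strengths first
Strength 1: Search
Plain SQL
SELECT * FROM table WHERE field LIKE '%querystring%';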
elasticsearch
Lucene Query Parser Syntax
field:'querystring'
or
Elasticsearch Query DSL
{
    "match": {
        "field": "querystring"
    }
}
http://lucene.apache.org/core/4_10_3/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#package_description
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl.html
behind the scenes (indexing)
original text: "Set the shape to semi-transparent by calling set_trans(5)"
tokens: set the shape to semi transparent by calling set_trans 5
• standard tokenizer (Unicode Standard Annex #29)
• lowercase token filter
• stop token filter
both the fields and the query string are analyzed
behind the scenes (searching)
indexed tokens: set the shape to semi transparent by calling set_trans 5
query string: semi-transparent
analyzed query: semi transparent
both the fields and the query string are analyzed, so the query terms match the indexed tokens
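The analysis chain on these two slides can be approximated in a few lines of Python. This is only a rough sketch, not how Lucene implements it: the regular expression stands in for the standard tokenizer (it happens to behave the same on this ASCII sentence), and `STOP_WORDS` is a small illustrative list rather than the analyzer's real one. Note that once the stop filter runs, words such as "the", "to" and "by" are dropped from the token list.

```python
import re

# Illustrative stop list only; the real stop filter ships its own English list.
STOP_WORDS = {"a", "an", "and", "by", "of", "the", "to"}


def analyze(text, stop_words=STOP_WORDS):
    """Rough stand-in for: standard tokenizer -> lowercase -> stop filter."""
    # Runs of letters, digits and underscores approximate UAX #29 word
    # boundaries for plain ASCII text such as the example sentence.
    tokens = re.findall(r"[A-Za-z0-9_]+", text)
    tokens = [token.lower() for token in tokens]
    return [token for token in tokens if token not in stop_words]


indexed = analyze("Set the shape to semi-transparent by calling set_trans(5)")
query = analyze("semi-transparent")

print(indexed)
print(query)
# Every analyzed query term is present among the indexed tokens.
print(all(term in indexed for term in query))
```

Because the same function is applied to the document and to the query, "semi-transparent" ends up as the two terms `semi` and `transparent` on both sides, which is why the search matches.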
what about Chinese?
use an analyzer that is friendly to Chinese
elasticsearch-analysis-mmseg
MMSEG: A Word Identification System for Mandarin Chinese Text Based on Two Variants of the Maximum Matching Algorithm
mmseg4j
https://github.com/medcl/elasticsearch-analysis-mmseg
https://code.google.com/p/mmseg4j/
http://technology.chtsai.org/mmseg/
elasticsearch-analysis-smartcn
https://github.com/elasticsearch/elasticsearch-analysis-smartcn
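To give a feel for why Chinese needs a dictionary-based analyzer, here is a minimal sketch of plain forward maximum matching, the basic idea that MMSEG builds on. It is not the MMSEG algorithm itself (which adds further disambiguation rules), and the tiny `DICTIONARY` and the `max_len` parameter are assumptions made up for the example.

```python
# Hypothetical mini dictionary; a real analyzer loads a large word list.
DICTIONARY = {"部落格", "搜尋", "引擎", "搜尋引擎"}


def forward_max_match(text, dictionary=DICTIONARY, max_len=4):
    """Greedy segmentation: at each position take the longest dictionary word."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback token.
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


print(forward_max_match("部落格搜尋引擎"))
```

With this dictionary the sentence is split into 部落格 and 搜尋引擎, because the longest match wins at each position; a whitespace-based tokenizer would have nothing to split on.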
Aggregations (Facets)
Strength 2
It lets you build things like this
POST /cars/transactions/_bulk
{ "index": {}}
{ "price" : 10000, "color" : "red", "make" : "honda", "sold" : "2014-10-28" }
{ "index": {}}
{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
{ "index": {}}
{ "price" : 30000, "color" : "green", "make" : "ford", "sold" : "2014-05-18" }
{ "index": {}}
{ "price" : 15000, "color" : "blue", "make" : "toyota", "sold" : "2014-07-02" }
{ "index": {}}
{ "price" : 12000, "color" : "green", "make" : "toyota", "sold" : "2014-08-19" }
{ "index": {}}
{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
{ "index": {}}
{ "price" : 80000, "color" : "red", "make" : "bmw", "sold" : "2014-01-01" }
{ "index": {}}
{ "price" : 25000, "color" : "blue", "make" : "ford", "sold" : "2014-02-12" }
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_aggregation_test_drive.html
GET /cars/transactions/_search?search_type=count
{
    "aggs" : {
        "colors" : {
            "terms" : {
                "field" : "color"
            }
        }
    }
}
{
...
    "hits": {
        "hits": []
    },
    "aggregations": {
        "colors": {
            "buckets": [
                {
                    "key": "red",
                    "doc_count": 4
                },
                {
                    "key": "blue",
                    "doc_count": 2
                },
                {
                    "key": "green",
                    "doc_count": 2
                }
            ]
        }
    }
}
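What the `terms` aggregation computes can be reproduced locally on the same eight car documents. The following is only an illustration in plain Python, not how Elasticsearch executes it; `terms_buckets` is a hypothetical helper name, and ties are broken by key here simply to reproduce the bucket order shown in the response.

```python
from collections import Counter

# The eight documents from the _bulk request.
cars = [
    {"price": 10000, "color": "red", "make": "honda", "sold": "2014-10-28"},
    {"price": 20000, "color": "red", "make": "honda", "sold": "2014-11-05"},
    {"price": 30000, "color": "green", "make": "ford", "sold": "2014-05-18"},
    {"price": 15000, "color": "blue", "make": "toyota", "sold": "2014-07-02"},
    {"price": 12000, "color": "green", "make": "toyota", "sold": "2014-08-19"},
    {"price": 20000, "color": "red", "make": "honda", "sold": "2014-11-05"},
    {"price": 80000, "color": "red", "make": "bmw", "sold": "2014-01-01"},
    {"price": 25000, "color": "blue", "make": "ford", "sold": "2014-02-12"},
]


def terms_buckets(docs, field):
    """Group documents by the value of `field` and count each group."""
    counts = Counter(doc[field] for doc in docs)
    # Largest bucket first; equal counts are ordered by key.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": count} for key, count in ordered]


print(terms_buckets(cars, "color"))
```

Calling `terms_buckets(cars, "make")` in the same way would give the per-manufacturer counts, which corresponds to changing `"field"` in the request body.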
There are some other nice features too
more like this
geolocation
…
but I won't go into them here
so…
let's look at the code XD
https://github.com/yychen/estest
Lots of people have a WordPress site
GOAL
Build a search for my own WordPress site
requests + lxml (XPath) to write the crawler
pyelasticsearch to talk to elasticsearch
tornado to host the web page
Create the mapping
#!/usr/bin/env python

from pyelasticsearch import ElasticSearch
from pyelasticsearch.exceptions import ElasticHttpNotFoundError

from settings import HOST, INDEX, DOCTYPE

index_settings = {
    'mappings': {
        DOCTYPE: {
            'properties': {
                'title': {'type': 'string', 'analyzer': 'mmseg', 'boost': 1.5, 'term_vector': 'with_positions_offsets'},
                'url': {'type': 'string', 'index': 'not_analyzed'},
                'content': {'type': 'string', 'analyzer': 'mmseg', 'boost': 0.7, 'term_vector': 'with_positions_offsets'},
                'categories': {'type': 'nested',
                    'properties': {
                        'url': {'type': 'string', 'index': 'not_analyzed'},
                        'name': {'type': 'string', 'index': 'not_analyzed'},
                    }
                }
            }
        }
    }
}

es = ElasticSearch(HOST)
try:
    es.delete_index(INDEX)
except ElasticHttpNotFoundError:
    # No index found
    pass

es.create_index(INDEX, settings=index_settings)
Writing the crawler with requests + lxml
Use the latest post as the entry point (the first page)
For the next page, follow the link to the previous post (the a[rel="prev"] element)
What to extract from each post:
title
categories
content
url
item
{
    'url': u'http://yychen.joba.cc/dev/archives/164',
    'content': u'\u96d6\u7136\u4e4b...',
    'categories': [
        {'link': 'http://yychen.joba.cc/dev/archives/category/django', 'name': 'django'},
        {'link': 'http://yychen.joba.cc/dev/archives/category/python', 'name': 'python'},
        {'link': 'http://yychen.joba.cc/dev/archives/category/web', 'name': 'web'}],
    'title': 'Django 1.7 Migration'
}

from pyelasticsearch import ElasticSearch

es = ElasticSearch(HOST)
# Use the post URL as the document id, so re-crawling overwrites instead of duplicating
es.index(INDEX, DOCTYPE, doc=item, id=item['url'])
def main():
    url = u'http://yychen.joba.cc/dev/archives/164'
    es = ElasticSearch(HOST)

    # Walk backwards through at most 20 posts, starting from the entry page
    for i in range(20):
        item, url = get_page(url)

        if not url:
            print("\033[1;33mWe've reached the end, breaking...\033[m")
            break

        # put it into es
        print('Indexing \033[1;37m%s\033[m (%s)...' % (item['title'], item['url']))
        es.index(INDEX, DOCTYPE, doc=item, id=item['url'])
import requests
from lxml import etree

# process_tags() is not shown on this slide


def get_page(url):
    # store the to-be-indexed document to item
    item = {
        'categories': [],
    }

    page = requests.get(url)
    # page.encoding = 'utf-8'
    html = etree.HTML(page.text)

    try:
        prev_url = html.xpath('//a[@rel="prev"]/@href')[0]
    except IndexError:
        # We reached the end
        return None, None

    title_parts = html.xpath('//h1//text()')
    content_parts = html.xpath('//div[@class="post-bodycopy cf"]//text()')
    categories = html.xpath('//a[@rel="category tag"]')

    item['url'] = url
    item['title'] = process_tags(title_parts)
    item['content'] = process_tags(content_parts)

    # Process the categories
    for category in categories:
        _cat = {}
        _cat['link'] = category.xpath('./@href')[0]
        _cat['name'] = category.xpath('./text()')[0]
        item['categories'].append(_cat)

    return item, prev_url
tornado
import json

import tornado.web

# es is the ElasticSearch(HOST) client created as on the earlier slides


class SearchHandler(tornado.web.RequestHandler):
    def post(self):
        dsl = {
            'query': {
                'bool': {
                    # A hit may match on the content, the title, or both
                    'should': [
                        {'match': {'content': self.get_argument('q')}},
                        {'match': {'title': self.get_argument('q')}},
                    ]
                }
            },
            'highlight': {
                'pre_tags': ['<em>'],
                'post_tags': ['</em>'],
                'fields': {
                    'content': {'no_match_size': 150, 'number_of_fragments': 1},
                    'title': {'no_match_size': 150, 'number_of_fragments': 0},
                }
            }
        }

        results = es.search(dsl, index=INDEX, doc_type=DOCTYPE)
        hits = results['hits']['hits']
        self.write(json.dumps(hits))
tada~~~
GOOD!
Fork it, channel your hacker spirit,
and write a search engine for your own blog!
Taking what we just built a bit further
search.joba.cc
This service is now offline; it was for demo purposes only
Very, very good :D
A small tip
Create a keyword field in the mapping
and put everything into it
for example: url, title, any id, category etc.
Throw it all in
Then a search directly on the keyword field
catches everything in one go
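One way to apply this tip is to fill the catch-all field just before indexing. The sketch below is an assumption about how that might be done, not code from the repository: `add_keyword_field` is a hypothetical helper, and the `keyword` field would still need its own entry in the mapping.

```python
def add_keyword_field(item):
    """Return a copy of `item` with a catch-all `keyword` field added."""
    parts = [item.get("url", ""), item.get("title", "")]
    # Category names go into the same field so they are searchable too.
    parts.extend(cat.get("name", "") for cat in item.get("categories", []))
    enriched = dict(item)
    enriched["keyword"] = " ".join(part for part in parts if part)
    return enriched


item = {
    "url": "http://yychen.joba.cc/dev/archives/164",
    "title": "Django 1.7 Migration",
    "categories": [{"name": "django"}, {"name": "python"}, {"name": "web"}],
}

print(add_keyword_field(item)["keyword"])
```

The enriched dictionary can then be passed to `es.index(...)` in place of the original item, and a single `match` query on `keyword` covers the url, the title and the categories.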
Conclusion
After today, everyone will know how to draw a horse
photo by derek_b on Flickr
Any questions?
Thank you :D
Ad

More Related Content

What's hot (20)

Webinar: Building Your First App in Node.js
Webinar: Building Your First App in Node.jsWebinar: Building Your First App in Node.js
Webinar: Building Your First App in Node.js
MongoDB
 
Mongodb Aggregation Pipeline
Mongodb Aggregation PipelineMongodb Aggregation Pipeline
Mongodb Aggregation Pipeline
zahid-mian
 
4Developers 2018: Pyt(h)on vs słoń: aktualny stan przetwarzania dużych danych...
4Developers 2018: Pyt(h)on vs słoń: aktualny stan przetwarzania dużych danych...4Developers 2018: Pyt(h)on vs słoń: aktualny stan przetwarzania dużych danych...
4Developers 2018: Pyt(h)on vs słoń: aktualny stan przetwarzania dużych danych...
PROIDEA
 
Getting Started with MongoDB and NodeJS
Getting Started with MongoDB and NodeJSGetting Started with MongoDB and NodeJS
Getting Started with MongoDB and NodeJS
MongoDB
 
Webinar: Exploring the Aggregation Framework
Webinar: Exploring the Aggregation FrameworkWebinar: Exploring the Aggregation Framework
Webinar: Exploring the Aggregation Framework
MongoDB
 
The Aggregation Framework
The Aggregation FrameworkThe Aggregation Framework
The Aggregation Framework
MongoDB
 
MongoDB - Aggregation Pipeline
MongoDB - Aggregation PipelineMongoDB - Aggregation Pipeline
MongoDB - Aggregation Pipeline
Jason Terpko
 
JQuery Flot
JQuery FlotJQuery Flot
JQuery Flot
Arshavski Alexander
 
Creating New Streams: Presented by Dennis Gove, Bloomberg LP
Creating New Streams: Presented by Dennis Gove, Bloomberg LPCreating New Streams: Presented by Dennis Gove, Bloomberg LP
Creating New Streams: Presented by Dennis Gove, Bloomberg LP
Lucidworks
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
Sematext Group, Inc.
 
Aggregation Framework MongoDB Days Munich
Aggregation Framework MongoDB Days MunichAggregation Framework MongoDB Days Munich
Aggregation Framework MongoDB Days Munich
Norberto Leite
 
Aggregation Framework in MongoDB Overview Part-1
Aggregation Framework in MongoDB Overview Part-1Aggregation Framework in MongoDB Overview Part-1
Aggregation Framework in MongoDB Overview Part-1
Anuj Jain
 
Joins and Other Aggregation Enhancements Coming in MongoDB 3.2
Joins and Other Aggregation Enhancements Coming in MongoDB 3.2Joins and Other Aggregation Enhancements Coming in MongoDB 3.2
Joins and Other Aggregation Enhancements Coming in MongoDB 3.2
MongoDB
 
Webinar: Data Processing and Aggregation Options
Webinar: Data Processing and Aggregation OptionsWebinar: Data Processing and Aggregation Options
Webinar: Data Processing and Aggregation Options
MongoDB
 
Node.js: scalability tips - Azure Dev Community Vijayawada
Node.js: scalability tips - Azure Dev Community VijayawadaNode.js: scalability tips - Azure Dev Community Vijayawada
Node.js: scalability tips - Azure Dev Community Vijayawada
Luciano Mammino
 
10gen Presents Schema Design and Data Modeling
10gen Presents Schema Design and Data Modeling10gen Presents Schema Design and Data Modeling
10gen Presents Schema Design and Data Modeling
DATAVERSITY
 
The Aggregation Framework
The Aggregation FrameworkThe Aggregation Framework
The Aggregation Framework
MongoDB
 
HTML5 after the hype - JFokus2015
HTML5 after the hype - JFokus2015HTML5 after the hype - JFokus2015
HTML5 after the hype - JFokus2015
Christian Heilmann
 
Back to Basics Webinar 5: Introduction to the Aggregation Framework
Back to Basics Webinar 5: Introduction to the Aggregation FrameworkBack to Basics Webinar 5: Introduction to the Aggregation Framework
Back to Basics Webinar 5: Introduction to the Aggregation Framework
MongoDB
 
Analytics with MongoDB Aggregation Framework and Hadoop Connector
Analytics with MongoDB Aggregation Framework and Hadoop ConnectorAnalytics with MongoDB Aggregation Framework and Hadoop Connector
Analytics with MongoDB Aggregation Framework and Hadoop Connector
Henrik Ingo
 
Webinar: Building Your First App in Node.js
Webinar: Building Your First App in Node.jsWebinar: Building Your First App in Node.js
Webinar: Building Your First App in Node.js
MongoDB
 
Mongodb Aggregation Pipeline
Mongodb Aggregation PipelineMongodb Aggregation Pipeline
Mongodb Aggregation Pipeline
zahid-mian
 
4Developers 2018: Pyt(h)on vs słoń: aktualny stan przetwarzania dużych danych...
4Developers 2018: Pyt(h)on vs słoń: aktualny stan przetwarzania dużych danych...4Developers 2018: Pyt(h)on vs słoń: aktualny stan przetwarzania dużych danych...
4Developers 2018: Pyt(h)on vs słoń: aktualny stan przetwarzania dużych danych...
PROIDEA
 
Getting Started with MongoDB and NodeJS
Getting Started with MongoDB and NodeJSGetting Started with MongoDB and NodeJS
Getting Started with MongoDB and NodeJS
MongoDB
 
Webinar: Exploring the Aggregation Framework
Webinar: Exploring the Aggregation FrameworkWebinar: Exploring the Aggregation Framework
Webinar: Exploring the Aggregation Framework
MongoDB
 
The Aggregation Framework
The Aggregation FrameworkThe Aggregation Framework
The Aggregation Framework
MongoDB
 
MongoDB - Aggregation Pipeline
MongoDB - Aggregation PipelineMongoDB - Aggregation Pipeline
MongoDB - Aggregation Pipeline
Jason Terpko
 
Creating New Streams: Presented by Dennis Gove, Bloomberg LP
Creating New Streams: Presented by Dennis Gove, Bloomberg LPCreating New Streams: Presented by Dennis Gove, Bloomberg LP
Creating New Streams: Presented by Dennis Gove, Bloomberg LP
Lucidworks
 
Aggregation Framework MongoDB Days Munich
Aggregation Framework MongoDB Days MunichAggregation Framework MongoDB Days Munich
Aggregation Framework MongoDB Days Munich
Norberto Leite
 
Aggregation Framework in MongoDB Overview Part-1
Aggregation Framework in MongoDB Overview Part-1Aggregation Framework in MongoDB Overview Part-1
Aggregation Framework in MongoDB Overview Part-1
Anuj Jain
 
Joins and Other Aggregation Enhancements Coming in MongoDB 3.2
Joins and Other Aggregation Enhancements Coming in MongoDB 3.2Joins and Other Aggregation Enhancements Coming in MongoDB 3.2
Joins and Other Aggregation Enhancements Coming in MongoDB 3.2
MongoDB
 
Webinar: Data Processing and Aggregation Options
Webinar: Data Processing and Aggregation OptionsWebinar: Data Processing and Aggregation Options
Webinar: Data Processing and Aggregation Options
MongoDB
 
Node.js: scalability tips - Azure Dev Community Vijayawada
Node.js: scalability tips - Azure Dev Community VijayawadaNode.js: scalability tips - Azure Dev Community Vijayawada
Node.js: scalability tips - Azure Dev Community Vijayawada
Luciano Mammino
 
10gen Presents Schema Design and Data Modeling
10gen Presents Schema Design and Data Modeling10gen Presents Schema Design and Data Modeling
10gen Presents Schema Design and Data Modeling
DATAVERSITY
 
The Aggregation Framework
The Aggregation FrameworkThe Aggregation Framework
The Aggregation Framework
MongoDB
 
HTML5 after the hype - JFokus2015
HTML5 after the hype - JFokus2015HTML5 after the hype - JFokus2015
HTML5 after the hype - JFokus2015
Christian Heilmann
 
Back to Basics Webinar 5: Introduction to the Aggregation Framework
Back to Basics Webinar 5: Introduction to the Aggregation FrameworkBack to Basics Webinar 5: Introduction to the Aggregation Framework
Back to Basics Webinar 5: Introduction to the Aggregation Framework
MongoDB
 
Analytics with MongoDB Aggregation Framework and Hadoop Connector
Analytics with MongoDB Aggregation Framework and Hadoop ConnectorAnalytics with MongoDB Aggregation Framework and Hadoop Connector
Analytics with MongoDB Aggregation Framework and Hadoop Connector
Henrik Ingo
 

Viewers also liked (20)

Pytables
PytablesPytables
Pytables
gowell
 
Live Performance Effects
Live Performance EffectsLive Performance Effects
Live Performance Effects
Tom Chen
 
真蝦意外接到的Case
真蝦意外接到的Case真蝦意外接到的Case
真蝦意外接到的Case
Tom Chen
 
Xmas
XmasXmas
Xmas
Tom Chen
 
Command line 初級寶典
Command line 初級寶典Command line 初級寶典
Command line 初級寶典
Tom Chen
 
Two scoops of django Introduction
Two scoops of django IntroductionTwo scoops of django Introduction
Two scoops of django Introduction
flywindy
 
Two scoops of Django - Deployment
Two scoops of Django - DeploymentTwo scoops of Django - Deployment
Two scoops of Django - Deployment
flywindy
 
AngularJS Sharing
AngularJS SharingAngularJS Sharing
AngularJS Sharing
Tom Chen
 
Django step0
Django step0Django step0
Django step0
永昇 陳
 
Gitlab
GitlabGitlab
Gitlab
Tom Chen
 
Working with the django admin
Working with the django admin Working with the django admin
Working with the django admin
flywindy
 
愛樂工程師
愛樂工程師愛樂工程師
愛樂工程師
Tom Chen
 
Django 實戰 - 自己的購物網站自己做
Django 實戰 - 自己的購物網站自己做Django 實戰 - 自己的購物網站自己做
Django 實戰 - 自己的購物網站自己做
flywindy
 
連淡水阿嬤都聽得懂的 機器學習入門 scikit-learn
連淡水阿嬤都聽得懂的機器學習入門 scikit-learn 連淡水阿嬤都聽得懂的機器學習入門 scikit-learn
連淡水阿嬤都聽得懂的 機器學習入門 scikit-learn
Cicilia Lee
 
Integrating tornado and webpack
Integrating tornado and webpackIntegrating tornado and webpack
Integrating tornado and webpack
Tom Chen
 
Learning django step 1
Learning django step 1Learning django step 1
Learning django step 1
永昇 陳
 
那些年,我用 Django Admin 接的案子
那些年,我用 Django Admin 接的案子那些年,我用 Django Admin 接的案子
那些年,我用 Django Admin 接的案子
flywindy
 
機器學習簡報 / 机器学习简报 Machine Learning
機器學習簡報 / 机器学习简报 Machine Learning 機器學習簡報 / 机器学习简报 Machine Learning
機器學習簡報 / 机器学习简报 Machine Learning
Will Kuan 官大鈞
 
Django workshop homework 3
Django workshop homework 3Django workshop homework 3
Django workshop homework 3
flywindy
 
解密解密
解密解密解密解密
解密解密
Tom Chen
 
Pytables
PytablesPytables
Pytables
gowell
 
Live Performance Effects
Live Performance EffectsLive Performance Effects
Live Performance Effects
Tom Chen
 
真蝦意外接到的Case
真蝦意外接到的Case真蝦意外接到的Case
真蝦意外接到的Case
Tom Chen
 
Command line 初級寶典
Command line 初級寶典Command line 初級寶典
Command line 初級寶典
Tom Chen
 
Two scoops of django Introduction
Two scoops of django IntroductionTwo scoops of django Introduction
Two scoops of django Introduction
flywindy
 
Two scoops of Django - Deployment
Two scoops of Django - DeploymentTwo scoops of Django - Deployment
Two scoops of Django - Deployment
flywindy
 
AngularJS Sharing
AngularJS SharingAngularJS Sharing
AngularJS Sharing
Tom Chen
 
Working with the django admin
Working with the django admin Working with the django admin
Working with the django admin
flywindy
 
愛樂工程師
愛樂工程師愛樂工程師
愛樂工程師
Tom Chen
 
Django 實戰 - 自己的購物網站自己做
Django 實戰 - 自己的購物網站自己做Django 實戰 - 自己的購物網站自己做
Django 實戰 - 自己的購物網站自己做
flywindy
 
連淡水阿嬤都聽得懂的 機器學習入門 scikit-learn
連淡水阿嬤都聽得懂的機器學習入門 scikit-learn 連淡水阿嬤都聽得懂的機器學習入門 scikit-learn
連淡水阿嬤都聽得懂的 機器學習入門 scikit-learn
Cicilia Lee
 
Integrating tornado and webpack
Integrating tornado and webpackIntegrating tornado and webpack
Integrating tornado and webpack
Tom Chen
 
Learning django step 1
Learning django step 1Learning django step 1
Learning django step 1
永昇 陳
 
那些年,我用 Django Admin 接的案子
那些年,我用 Django Admin 接的案子那些年,我用 Django Admin 接的案子
那些年,我用 Django Admin 接的案子
flywindy
 
機器學習簡報 / 机器学习简报 Machine Learning
機器學習簡報 / 机器学习简报 Machine Learning 機器學習簡報 / 机器学习简报 Machine Learning
機器學習簡報 / 机器学习简报 Machine Learning
Will Kuan 官大鈞
 
Django workshop homework 3
Django workshop homework 3Django workshop homework 3
Django workshop homework 3
flywindy
 
解密解密
解密解密解密解密
解密解密
Tom Chen
 
Ad

Similar to Elasticsearch intro output (20)

Real-time search in Drupal with Elasticsearch @Moldcamp
Real-time search in Drupal with Elasticsearch @MoldcampReal-time search in Drupal with Elasticsearch @Moldcamp
Real-time search in Drupal with Elasticsearch @Moldcamp
Alexei Gorobets
 
Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam...
Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam...Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam...
Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam...
Codemotion
 
Elasticsearch in 15 Minutes
Elasticsearch in 15 MinutesElasticsearch in 15 Minutes
Elasticsearch in 15 Minutes
Karel Minarik
 
Introduction to Azure DocumentDB
Introduction to Azure DocumentDBIntroduction to Azure DocumentDB
Introduction to Azure DocumentDB
Alex Zyl
 
ElasticSearch Hands On
ElasticSearch Hands OnElasticSearch Hands On
ElasticSearch Hands On
Nag Arvind Gudiseva
 
Semantic Web & TYPO3
Semantic Web & TYPO3Semantic Web & TYPO3
Semantic Web & TYPO3
André Wuttig
 
Anwendungsfaelle für Elasticsearch
Anwendungsfaelle für ElasticsearchAnwendungsfaelle für Elasticsearch
Anwendungsfaelle für Elasticsearch
Florian Hopf
 
ELK Stack - Turn boring logfiles into sexy dashboard
ELK Stack - Turn boring logfiles into sexy dashboardELK Stack - Turn boring logfiles into sexy dashboard
ELK Stack - Turn boring logfiles into sexy dashboard
Georg Sorst
 
Relevance trilogy may dream be with you! (dec17)
Relevance trilogy  may dream be with you! (dec17)Relevance trilogy  may dream be with you! (dec17)
Relevance trilogy may dream be with you! (dec17)
Woonsan Ko
 
Peggy elasticsearch應用
Peggy elasticsearch應用Peggy elasticsearch應用
Peggy elasticsearch應用
LearningTech
 
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
 Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data... Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
Big Data Spain
 
Ams adapters
Ams adaptersAms adapters
Ams adapters
Bruno Alló Bacarini
 
Code is not text! How graph technologies can help us to understand our code b...
Code is not text! How graph technologies can help us to understand our code b...Code is not text! How graph technologies can help us to understand our code b...
Code is not text! How graph technologies can help us to understand our code b...
Andreas Dewes
 
Example-driven Web API Specification Discovery
Example-driven Web API Specification DiscoveryExample-driven Web API Specification Discovery
Example-driven Web API Specification Discovery
Javier Canovas
 
Automatic discovery of Web API Specifications: an example-driven approach
Automatic discovery of Web API Specifications: an example-driven approachAutomatic discovery of Web API Specifications: an example-driven approach
Automatic discovery of Web API Specifications: an example-driven approach
Jordi Cabot
 
Elasticsearch sur Azure : Make sense of your (BIG) data !
Elasticsearch sur Azure : Make sense of your (BIG) data !Elasticsearch sur Azure : Make sense of your (BIG) data !
Elasticsearch sur Azure : Make sense of your (BIG) data !
Microsoft
 
Montreal Elasticsearch Meetup
Montreal Elasticsearch MeetupMontreal Elasticsearch Meetup
Montreal Elasticsearch Meetup
Loïc Bertron
 
Elastic Search
Elastic SearchElastic Search
Elastic Search
NexThoughts Technologies
 
曾勇 Elastic search-intro
曾勇 Elastic search-intro曾勇 Elastic search-intro
曾勇 Elastic search-intro
Shaoning Pan
 
Elastic search intro-@lamper

Elasticsearch intro output

  • 2. ABOUT ME • engineer @ iF+TechArt 當若科技藝術 • full stack / CTO @ House123 • engineer @ Trend Micro
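  • 4. Curation and project management (Management) | exhibition planning | space planning | project management | partner liaison and execution coordination | budget estimation and tabulation
  • 5. Interaction design (Interactive) | human-factors interaction design | exhibition equipment | interactive content production | customized turnkey project execution | industrial-grade electromechanical control / system integration | structural design / construction | real-world games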
  • 9. Elasticsearch is a flexible and powerful open source, distributed, real-time search and analytics engine. Architected from the ground up for use in distributed environments where reliability and scalability are must haves, Elasticsearch gives you the ability to move easily beyond simple full-text search. Through its robust set of APIs and query DSLs, plus clients for the most popular programming languages, Elasticsearch delivers on the near limitless promises of search technology.
  • 10. • store data • search • scalable
  • 16. Ordinary SQL: SELECT * FROM table WHERE field LIKE '%querystring%';
  • 17. elasticsearch: field:'querystring' or { "match": { "field": "querystring" } }
  • 18. elasticsearch: Lucene Query Parser Syntax or Elasticsearch Query DSL
      http://lucene.apache.org/core/4_10_3/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#package_description
      http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl.html
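As a rough sketch of the two options above, the same search can be written either as Lucene query parser syntax wrapped in a `query_string` query, or as a `match` query in the Query DSL. The field name `field` and the term `querystring` are the placeholders from the slides; the helper names `lucene_style` and `dsl_style` are made up for illustration.

```python
import json


def lucene_style(field, text):
    """Request body that passes Lucene query parser syntax via query_string."""
    return {"query": {"query_string": {"query": "%s:%s" % (field, text)}}}


def dsl_style(field, text):
    """Request body that uses the Query DSL match query."""
    return {"query": {"match": {field: text}}}


if __name__ == "__main__":
    # Print both request bodies so they can be compared side by side.
    print(json.dumps(lucene_style("field", "querystring"), indent=2))
    print(json.dumps(dsl_style("field", "querystring"), indent=2))
```

Either body could be sent to a `_search` endpoint; the DSL form is usually easier to build programmatically because no string escaping is involved.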
  • 19. behind the scene (indexing): fields and query string are analyzed
      Input: "Set the shape to semi-transparent by calling set_trans(5)"
      Tokens: set the shape to semi transparent by calling set_trans 5
      Analysis steps: standard tokenizer (Unicode Standard Annex #29), lowercase token filter, stop token filter
  • 20. behind the scene (searching): fields and query string are analyzed
      Indexed tokens: set the shape to semi transparent by calling set_trans 5
      Query: semi-transparent, analyzed to: semi transparent
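The analysis chain on this slide can be approximated in a few lines of Python. This is only a sketch: the regular expression `\w+` is a crude stand-in for the standard tokenizer (it happens to split on the hyphen and brackets while keeping the underscore), and `STOP_WORDS` is an assumed, tiny stop-word list rather than the one Elasticsearch ships.

```python
import re

# Assumption: a small illustrative stop-word list, not Elasticsearch's own.
STOP_WORDS = {"a", "an", "and", "by", "the", "to"}


def analyze(text, remove_stop_words=True):
    """Tokenize, lowercase, and optionally drop stop words."""
    # Crude tokenizer: runs of letters, digits and underscores.
    tokens = re.findall(r"\w+", text)
    # Lowercase token filter.
    tokens = [t.lower() for t in tokens]
    # Stop token filter.
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return tokens


if __name__ == "__main__":
    sentence = "Set the shape to semi-transparent by calling set_trans(5)"
    print(analyze(sentence, remove_stop_words=False))
    print(analyze(sentence))
```

With stop-word removal switched off, the tokens match the list shown on the slide; with it on, "the", "to" and "by" are dropped.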
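Why the query "semi-transparent" finds the document can be sketched with a toy inverted index: both the stored text and the query go through the same analysis, so they meet on the tokens "semi" and "transparent". The functions below are illustrative only. This sketch requires every query token to be present, whereas Elasticsearch's `match` query defaults to OR and ranks results by score.

```python
import re
from collections import defaultdict


def analyze(text):
    """Crude analyzer: split into word tokens and lowercase them."""
    return [t.lower() for t in re.findall(r"\w+", text)]


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every analyzed query token."""
    tokens = analyze(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical two-document corpus; document 1 is the sentence from the slide.
DOCS = {
    1: "Set the shape to semi-transparent by calling set_trans(5)",
    2: "Draw an opaque shape",
}
INDEX = build_index(DOCS)

if __name__ == "__main__":
    print(search(INDEX, "semi-transparent"))
    print(search(INDEX, "shape"))
```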
  • 21. what about Chinese? use an analyzer that is friendly to Chinese: elasticsearch-analysis-mmseg
      MMSEG: A Word Identification System for Mandarin Chinese Text Based on Two Variants of the Maximum Matching Algorithm
      mmseg4j
      https://github.com/medcl/elasticsearch-analysis-mmseg
      https://code.google.com/p/mmseg4j/
      http://technology.chtsai.org/mmseg/
  • 22. what about Chinese? use an analyzer that is friendly to Chinese: elasticsearch-analysis-smartcn
      https://github.com/elasticsearch/elasticsearch-analysis-smartcn
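To give a feel for what "maximum matching" means, here is a minimal sketch of simple forward maximum matching: at each position, take the longest dictionary word that fits, and fall back to a single character otherwise. This is not the plugin's code; MMSEG's complex variant adds disambiguation rules that are not shown here, and `DICTIONARY` is a tiny made-up word list.

```python
def forward_max_match(text, dictionary, max_len=4):
    """Greedy segmentation: longest dictionary word first, else one character."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until something matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Assumption: a hypothetical four-word dictionary for illustration only.
DICTIONARY = {"中文", "分詞", "搜尋", "引擎"}

if __name__ == "__main__":
    print(forward_max_match("中文搜尋引擎", DICTIONARY))
    print(forward_max_match("中文的搜尋", DICTIONARY))
```

A whitespace-oriented tokenizer has no word boundaries to work with in Chinese text, which is why a dictionary-driven analyzer is suggested on this slide.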
  • 25. POST /cars/transactions/_bulk
      { "index": {}}
      { "price" : 10000, "color" : "red", "make" : "honda", "sold" : "2014-10-28" }
      { "index": {}}
      { "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
      { "index": {}}
      { "price" : 30000, "color" : "green", "make" : "ford", "sold" : "2014-05-18" }
      { "index": {}}
      { "price" : 15000, "color" : "blue", "make" : "toyota", "sold" : "2014-07-02" }
      { "index": {}}
      { "price" : 12000, "color" : "green", "make" : "toyota", "sold" : "2014-08-19" }
      { "index": {}}
      { "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
      { "index": {}}
      { "price" : 80000, "color" : "red", "make" : "bmw", "sold" : "2014-01-01" }
      { "index": {}}
      { "price" : 25000, "color" : "blue", "make" : "ford", "sold" : "2014-02-12" }
      http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_aggregation_test_drive.html
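The bulk body is newline-delimited JSON: one action line followed by one document line, with a trailing newline at the end. A small sketch of how such a body might be assembled in Python follows; `build_bulk_body` is a hypothetical helper, and sending the result to the `_bulk` endpoint is left out.

```python
import json


def build_bulk_body(docs):
    """Pair every document with an empty "index" action, one JSON object per line."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {}}))
        lines.append(json.dumps(doc))
    # The bulk endpoint expects the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    cars = [
        {"price": 10000, "color": "red", "make": "honda", "sold": "2014-10-28"},
        {"price": 20000, "color": "red", "make": "honda", "sold": "2014-11-05"},
    ]
    print(build_bulk_body(cars), end="")
```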
  • 26. GET /cars/transactions/_search?search_type=count
      {
          "aggs" : {
              "colors" : {
                  "terms" : {
                      "field" : "color"
                  }
              }
          }
      }
      http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_aggregation_test_drive.html
  • 27. {
      ...
          "hits": {
              "hits": []
          },
          "aggregations": {
              "colors": {
                  "buckets": [
                      {
                          "key": "red",
                          "doc_count": 4
                      },
                      {
                          "key": "blue",
                          "doc_count": 2
                      },
                      {
                          "key": "green",
                          "doc_count": 2
                      }
                  ]
              }
          }
      }
      http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_aggregation_test_drive.html
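Conceptually, the `terms` aggregation groups documents by the value of a field and counts each group. The sketch below reproduces the buckets locally from the eight car transactions, using only the `color` values from the bulk request. `terms_buckets` is an illustrative helper, and its ordering (largest count first, ties by key) is this sketch's own choice, which happens to give the order in the response above.

```python
from collections import Counter

# Colors of the eight documents from the bulk request, in order.
TRANSACTIONS = [
    {"color": "red"}, {"color": "red"}, {"color": "green"}, {"color": "blue"},
    {"color": "green"}, {"color": "red"}, {"color": "red"}, {"color": "blue"},
]


def terms_buckets(docs, field):
    """Count documents per distinct field value and return them as buckets."""
    counts = Counter(doc[field] for doc in docs)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": n} for key, n in ordered]


if __name__ == "__main__":
    for bucket in terms_buckets(TRANSACTIONS, "color"):
        print(bucket)
```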
  • 29. so…
  • 34. requests + lxml (XPath) to write the crawler; pyelasticsearch to talk to Elasticsearch; tornado to host the web page https://github.com/yychen/estest
  • 35. Create the mapping
      #!/usr/bin/env python
      from pyelasticsearch import ElasticSearch
      from pyelasticsearch.exceptions import ElasticHttpNotFoundError

      from settings import HOST, INDEX, DOCTYPE

      index_settings = {
          'mappings': {
              DOCTYPE: {
                  'properties': {
                      'title': {'type': 'string', 'analyzer': 'mmseg', 'boost': 1.5, 'term_vector': 'with_positions_offsets'},
                      'url': {'type': 'string', 'index': 'not_analyzed'},
                      'content': {'type': 'string', 'analyzer': 'mmseg', 'boost': 0.7, 'term_vector': 'with_positions_offsets'},
                      'categories': {'type': 'nested',
                          'properties': {
                              # note: the crawler on the later slides stores this field as 'link'
                              'url': {'type': 'string', 'index': 'not_analyzed'},
                              'name': {'type': 'string', 'index': 'not_analyzed'},
                          }
                      }
                  }
              }
          }
      }

      es = ElasticSearch(HOST)
      try:
          es.delete_index(INDEX)
      except ElasticHttpNotFoundError:
          # No index found
          pass

      es.create_index(INDEX, settings=index_settings)
      https://github.com/yychen/estest
  • 36. requests + lxml crawler (1): use the latest post as the entry point (the first page); to get the next page, follow this link https://github.com/yychen/estest
  • 37. requests + lxml crawler (2): fields taken from each page: title, categories, content, url https://github.com/yychen/estest
  • 38. requests + lxml crawler (3): the item built for one page, and the call that indexes it
      item
      {
          'url': u'http://yychen.joba.cc/dev/archives/164',
          'content': u'\u96d6\u7136\u4e4b...',
          'categories': [
              {'link': 'http://yychen.joba.cc/dev/archives/category/django', 'name': 'django'},
              {'link': 'http://yychen.joba.cc/dev/archives/category/python', 'name': 'python'},
              {'link': 'http://yychen.joba.cc/dev/archives/category/web', 'name': 'web'}],
          'title': 'Django 1.7 Migration'
      }

      from pyelasticsearch import ElasticSearch
      es = ElasticSearch(HOST)
      es.index(INDEX, DOCTYPE, doc=item, id=item['url'])
      https://github.com/yychen/estest
  • 39. requests + lxml crawler: the main loop
      def main():
          url = u'http://yychen.joba.cc/dev/archives/164'
          es = ElasticSearch(HOST)
          for i in range(20):
              item, url = get_page(url)
              if not url:
                  print('\033[1;33mWe\'ve reached the end, breaking...\033[m')
                  break
              # put it into es
              print('Indexing \033[1;37m%s\033[m (%s)...' % (item['title'], item['url']))
              es.index(INDEX, DOCTYPE, doc=item, id=item['url'])
      https://github.com/yychen/estest
  • 40. requests + lxml crawler: fetching and parsing one page
      import requests
      from lxml import etree

      def get_page(url):
          # store the to-be-indexed document to item
          item = {
              'categories': [],
          }
          page = requests.get(url)
          # page.encoding = 'utf-8'
          html = etree.HTML(page.text)

          try:
              prev_url = html.xpath('//a[@rel="prev"]/@href')[0]
          except IndexError:
              # We reached the end
              return None, None

          title_parts = html.xpath('//h1//text()')
          content_parts = html.xpath('//div[@class="post-bodycopy cf"]//text()')
          categories = html.xpath('//a[@rel="category tag"]')

          item['url'] = url
          # process_tags is a helper that is not shown on the slides
          item['title'] = process_tags(title_parts)
          item['content'] = process_tags(content_parts)

          # Process the categories
          for category in categories:
              _cat = {}
              _cat['link'] = category.xpath('./@href')[0]
              _cat['name'] = category.xpath('./text()')[0]
              item['categories'].append(_cat)

          return item, prev_url
      https://github.com/yychen/estest
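The extraction step can be tried without network access. The sketch below swaps lxml for the standard library's `xml.etree.ElementTree` (which supports only a subset of XPath and needs well-formed markup) and runs against a made-up page. `SAMPLE_PAGE`, `join_text` and `parse_page` are all assumptions for illustration, not code from the estest repository.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for one blog page.
SAMPLE_PAGE = """
<html><body>
  <h1>Django <b>1.7</b> Migration</h1>
  <a rel="prev" href="http://example.com/archives/163">older post</a>
  <a rel="category tag" href="http://example.com/category/django">django</a>
  <a rel="category tag" href="http://example.com/category/python">python</a>
  <div class="post-bodycopy cf"><p>First paragraph.</p> <p>Second one.</p></div>
</body></html>
"""


def join_text(element):
    """Concatenate all text under an element and collapse whitespace."""
    return " ".join("".join(element.itertext()).split())


def parse_page(html_text, url):
    """Return (item, prev_url); prev_url is None when there is no older post."""
    root = ET.fromstring(html_text)
    prev = root.find(".//a[@rel='prev']")
    item = {
        "url": url,
        "title": join_text(root.find(".//h1")),
        "content": join_text(root.find(".//div[@class='post-bodycopy cf']")),
        "categories": [
            {"link": a.get("href"), "name": a.text}
            for a in root.findall(".//a[@rel='category tag']")
        ],
    }
    prev_url = prev.get("href") if prev is not None else None
    return item, prev_url


if __name__ == "__main__":
    item, prev_url = parse_page(SAMPLE_PAGE, "http://example.com/archives/164")
    print(item["title"], "->", prev_url)
```

Unlike the slide's `get_page`, this variant still returns the item when no "prev" link exists, so the oldest post would not be skipped; that is a design choice of the sketch, not of the original code.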
  • 41. tornado: the search handler
      import json
      import tornado.web

      # es, INDEX and DOCTYPE are set up as on the earlier slides
      class SearchHandler(tornado.web.RequestHandler):
          def post(self):
              dsl = {
                  'query': {
                      'bool': {
                          'should': [
                              {'match': {'content': self.get_argument('q')}},
                              {'match': {'title': self.get_argument('q')}},
                          ]
                      }
                  },
                  'highlight': {
                      'pre_tags': ['<em>'],
                      'post_tags': ['</em>'],
                      'fields': {
                          'content': {'no_match_size': 150, 'number_of_fragments': 1},
                          'title': {'no_match_size': 150, 'number_of_fragments': 0},
                      }
                  }
              }

              results = es.search(dsl, index=INDEX, doc_type=DOCTYPE)
              hits = results['hits']['hits']
              self.write(json.dumps(hits))
      https://github.com/yychen/estest
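The handler returns the raw hits, so the page has to pick the highlighted fragments out of each hit. A possible way to flatten one hit for display is sketched below. `summarize_hit` and `SAMPLE_HIT` are hypothetical; the only thing assumed about the response is the usual shape in which each hit carries `_source`, `_score` and a `highlight` object mapping field names to lists of fragments.

```python
def summarize_hit(hit, fallback_size=150):
    """Prefer highlighted fragments and fall back to the stored source fields."""
    source = hit.get('_source', {})
    highlight = hit.get('highlight', {})
    title = ''.join(highlight.get('title', [])) or source.get('title', '')
    snippet = (' ... '.join(highlight.get('content', []))
               or source.get('content', '')[:fallback_size])
    return {
        'url': source.get('url'),
        'score': hit.get('_score'),
        'title': title,
        'snippet': snippet,
    }


# Made-up hit for illustration only.
SAMPLE_HIT = {
    '_score': 1.2,
    '_source': {
        'url': 'http://example.com/archives/164',
        'title': 'Django 1.7 Migration',
        'content': 'Notes on moving to the built-in migrations.',
    },
    'highlight': {
        'title': ['<em>Django</em> 1.7 Migration'],
        'content': ['Notes on moving to the built-in <em>migrations</em>.'],
    },
}

if __name__ == "__main__":
    print(summarize_hit(SAMPLE_HIT))
```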
  • 43. GOOD!
  • 51. For example: url, title, any id, category, etc. Throw them all in.
  • 56. photo by derek_b on Flickr Any questions?