BASEL | BERN | BRUGG | BUCHAREST | DÜSSELDORF | FRANKFURT A.M. | FREIBURG I.BR. | GENEVA
HAMBURG | COPENHAGEN | LAUSANNE | MANNHEIM | MUNICH | STUTTGART | VIENNA | ZURICH
http://guidoschmutz.wordpress.com | @gschmutz
Location Analytics
Real-Time Geofencing using Kafka
Guido Schmutz
Guido Schmutz
Working at Trivadis for more than 22 years
Consultant, Trainer, Software Architect for Java, Oracle, SOA and Big Data / Fast Data
Oracle Groundbreaker Ambassador & Oracle ACE Director
Head of Trivadis Architecture Board
Technology Manager @ Trivadis
More than 30 years of software development experience
Contact: guido.schmutz@trivadis.com
Blog: http://guidoschmutz.wordpress.com
Slideshare: http://www.slideshare.net/gschmutz
Twitter: gschmutz
167th edition
Agenda
1. Introduction & Motivation
2. Using KSQL
3. Using Kafka Streams
4. Using Tile38
5. Visualization using ArcadiaData
6. Summary
Guido Schmutz
Working at Trivadis for more than 22 years
Oracle Groundbreaker Ambassador & Oracle ACE Director
Consultant, Trainer, Software Architect for Java, AWS, Azure,
Oracle Cloud, SOA and Big Data / Fast Data
Platform Architect & Head of Trivadis Architecture Board
More than 30 years of software development experience
Contact: guido.schmutz@trivadis.com
Blog: http://guidoschmutz.wordpress.com
Slideshare: http://www.slideshare.net/gschmutz
Twitter: gschmutz
155th edition
Introduction
Geofencing – What is it?
• The use of GPS or RFID technology to create a virtual geographic boundary, enabling software to trigger a response when an object/device enters or leaves a particular area
• Possible Events
• OUTSIDE
• INSIDE
• ENTER
• EXIT
Source: https://tile38.com
Geofencing – What can we do with it?
• On-Demand and Delivery Services - assign orders to an area's designated service provider
• On-Demand Transportation - track Electronic Transportation Devices and their distance from charging stations
• Transportation Management - track flow of people using public transport systems
• Commercial Real Estate - identify how many people drive or walk by a specific location
• Retail Shopper Guidance - guide customers to a specific product once they are in your store
• Property Security - open or lock doors as individuals with designated devices approach or leave a building or vehicle
• Property Control - restrict vehicles to be operational only inside a geofenced area – like drones or construction equipment
Geo-Processing
• Well-known text (WKT) is a text markup language for representing vector geometry objects on a map
• GeoTools is a free software GIS toolkit for developing standards-compliant solutions
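The UDFs shown later are built on exactly this combination: parse the geofence's WKT into a JTS geometry and test a vehicle position against it. A minimal, self-contained sketch of the idea (assuming the org.locationtech.jts packages; older GeoTools releases bundle JTS under com.vividsolutions.jts):

import org.locationtech.jts.geom.Coordinate;
import org.locationtech.jts.geom.Geometry;
import org.locationtech.jts.geom.GeometryFactory;
import org.locationtech.jts.geom.Point;
import org.locationtech.jts.io.WKTReader;

public class WktDemo {
    public static void main(String[] args) throws Exception {
        GeometryFactory factory = new GeometryFactory();
        WKTReader reader = new WKTReader(factory);

        // simplified geofence around Berlin, expressed as WKT (lon lat order)
        Geometry berlin = reader.read(
            "POLYGON ((13.29 52.56, 13.24 52.53, 13.26 52.46, 13.50 52.47, 13.29 52.56))");

        // vehicle position: JTS expects x = longitude, y = latitude
        Point position = factory.createPoint(new Coordinate(13.3096, 52.4497));

        System.out.println(position.within(berlin) ? "INSIDE" : "OUTSIDE");
    }
}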
Apache Kafka – A Streaming Platform
[Diagram: Source Connector → Kafka Broker (topic trucking_driver) → Sink Connector, with the KSQL Engine and Kafka Streams processing data on the broker]
High Level Overview of Use Case
[Diagram: Geofence Mgmt, Vehicle Position and Weather Service publish into Kafka; "Join Position & Geofences" combines the object position and geofence topics into "pos & geofences"; "Geo fencing" produces the geofence status consumed by a Dashboard]
Example vehicle position message:
key=10  { "id" : "10", "latitude" : 38.35821, "longitude" : -90.15311}
Example geofence message:
key=3   {"id":3,"name":"Berlin, Germany","geometry_wkt":"POLYGON ((13.297920227050781 52.56195151687443, …))","last_update":1560607149015}
Using KSQL
KSQL – Streams and Tables
[Diagram: vehicle position topic registered as a Stream, geofence topic registered as a Table]
CREATE STREAM vehicle_position_s
  (id VARCHAR,
   latitude DOUBLE,
   longitude DOUBLE)
WITH (KAFKA_TOPIC='vehicle_position',
      VALUE_FORMAT='DELIMITED');

CREATE TABLE geo_fence_t
  (id BIGINT,
   name VARCHAR,
   geometry_wkt VARCHAR)
WITH (KAFKA_TOPIC='geo_fence',
      VALUE_FORMAT='JSON',
      KEY = 'id');

Geofencing – How to determine "inside" or "outside" a geofence?
Only one standard UDF for geo processing in KSQL: GEO_DISTANCE
Implement a custom UDF using functionality from the GeoTools Java library
public String geo_fence(final double latitude, final double longitude, final String geometryWKT) { .. }
public List<String> geo_fence_bulk(final double latitude, final double longitude, List<String> idGeometryListWKT) { .. }
ksql> SELECT geo_fence(latitude, longitude, ' POLYGON ((13.297920227050781
52.56195151687443, 13.2440185546875 52.530216577830124, ...))')
FROM test_geo_udf_s;
52.4497 | 13.3096 | OUTSIDE
52.4556 | 13.3178 | INSIDE
Custom UDF to determine if Point is inside a geometry
@Udf(description = "determines if a lat/long is inside or outside the geometry passed as the 3rd parameter as WKT encoded ...")
public String geo_fence(final double latitude, final double longitude,
                        final String geometryWKT) {
    String status = "";
    GeometryFactory geometryFactory = JTSFactoryFinder.getGeometryFactory();
    WKTReader reader = new WKTReader(geometryFactory);
    try {
        Polygon polygon = (Polygon) reader.read(geometryWKT);
        // JTS uses x = longitude, y = latitude
        Coordinate coord = new Coordinate(longitude, latitude);
        Point point = geometryFactory.createPoint(coord);
        if (point.within(polygon)) {
            status = "INSIDE";
        } else {
            status = "OUTSIDE";
        }
    } catch (ParseException e) {
        // WKTReader.read() declares a checked ParseException, so it has to be handled
        throw new RuntimeException("Cannot parse WKT geometry: " + geometryWKT, e);
    }
    return status;
}
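The bulk variant geo_fence_bulk is not shown on the slides. A plausible sketch that delegates to geo_fence per entry, assuming each list entry is encoded as "<id>:<wkt>" (the same encoding produced later by collect_set(id + ':' + geometry_wkt)) – the splitting logic is illustrative, not the talk's actual implementation:

@Udf(description = "evaluates a lat/long against a list of 'id:wkt' encoded geofences")
public List<String> geo_fence_bulk(final double latitude, final double longitude,
                                   final List<String> idGeometryListWKT) {
    // lives in the same UDF class as geo_fence(); needs java.util.List / java.util.ArrayList
    List<String> result = new ArrayList<>();
    for (String idWkt : idGeometryListWKT) {
        // split "<id>:<wkt>" into the geofence id and its WKT geometry
        int sep = idWkt.indexOf(':');
        String id = idWkt.substring(0, sep);
        String wkt = idWkt.substring(sep + 1);
        // result entries look like "1:OUTSIDE" or "3:INSIDE", matching the output shown later
        result.add(id + ":" + geo_fence(latitude, longitude, wkt));
    }
    return result;
}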
1) Using Cross Join
[Diagram: vehicle position Stream × geofence Table → "Join Position & Geofences" → pos & geofences Stream]
CREATE STREAM vp_join_gf_s
AS
SELECT vp.id, vp.latitude, vp.longitude,
gf.geometry_wkt
FROM vehicle_position_s AS vp
CROSS JOIN geo_fence_t AS gf
There is no Cross Join
in KSQL!
2) INNER Join
[Diagram: geofence Stream and vehicle position Stream are each enriched with a constant group '1' ("geofences by group 1" as a Table, "position by group 1" as a Stream), then "Join Position & Geofences" produces the pos & geofences Stream]
Example messages:
{ "group":"1", "name":"St. Louis", "geometry_wkt":"POLYGON ((13.297920227050781 52.56195151687443, …))", "last_update":1560607149015}
{ "group":"1", "name":"Berlin", "geometry_wkt":"POLYGON ((-90.23345947265625 38.484769753492536,…))", "last_update":1560607149015}
{ "group":"1", "id" : "10", "latitude" : 52.3924, "longitude" : 13.0514}
Cannot insert into a Table from a Stream:
ksql> INSERT INTO geo_fence_t
     >SELECT '1' AS group_id, geof.id, …
     >FROM geo_fence_s geof;
INSERT INTO can only be used to insert into a stream. A02_GEO_FENCE_T is a table.
3) Geofences aggregated in one group
[Diagram: geofence Stream and vehicle position Stream are each enriched with group '1' ("geofences by group 1", "position by group 1"); the geofences are aggregated per group into a Table ("Geofences agg by group"); "Join Position & Geofences" calls geo_fence_bulk and produces the geofence status Stream]
Example messages:
{ "group":"1", "name":"St. Louis", "geometry_wkt":"POLYGON ((13.297920227050781 52.56195151687443, …))", "last_update":1560607149015}
{ "group":"1", "name":"Berlin", "geometry_wkt":"POLYGON ((-90.23345947265625 38.484769753492536,…))", "last_update":1560607149015}
{ "group":"1", "id" : "10", "latitude" : 52.3924, "longitude" : 13.0514}
[Ratings: Scalable (high–low), Latency (low–high), "Code Smell" (low–high) – marked medium / medium / medium]
3) Geofences aggregated in one group
CREATE TABLE a03_geo_fence_aggby_group_t
AS
SELECT group_id
, collect_set(id + ':' + geometry_wkt) AS id_geometry_wkt_list
FROM a03_geo_fence_by_group_s geof
GROUP BY group_id;
CREATE STREAM a03_vehicle_position_by_group_s
AS
SELECT '1' group_id, vehp.id, vehp.latitude, vehp.longitude
FROM vehicle_position_s vehp
PARTITION BY group_id;
3) Geofences aggregated in one group
CREATE STREAM a03_geo_fence_status_s
AS
SELECT vehp.id, vehp.latitude, vehp.longitude,
       geo_fence_bulk(vehp.latitude, vehp.longitude,
                      geofagg.id_geometry_wkt_list) AS geofence_status
FROM a03_vehicle_position_by_group_s vehp
LEFT JOIN a03_geo_fence_aggby_group_t geofagg
ON vehp.group_id = geofagg.group_id;

ksql> SELECT * FROM a03_geo_fence_status_s;
46 | 52.47546 | 13.34851 | [1:OUTSIDE, 3:INSIDE]
46 | 52.47521 | 13.34881 | [1:OUTSIDE, 3:INSIDE]
...
→ every position is checked against as many geofences as there are in total
Geo Hash for a better distribution
Geohash is a geocoding system which encodes a geographic location into a short string of letters and digits
Length  Area (width x height)
1       5,009.4km x 4,992.6km
2       1,252.3km x 624.1km
3       156.5km x 156km
4       39.1km x 19.5km
12      3.7cm x 1.9cm
http://geohash.gofreerange.com/
Geo Hash Custom UDF
ksql> SELECT latitude, longitude, geo_hash(latitude, longitude, 3)
>FROM test_geo_udf_s;
38.484769753492536 | -90.23345947265625 | 9yz
public String geohash(final double latitude, final double longitude, int length)
public List<String> neighbours(String geohash)
public String adjacentHash(String geohash, String directionString)
public List<String> coverBoundingBox(String geometryWKT, int length)
ksql> SELECT geometry_wkt, geo_hash(geometry_wkt, 5)
>FROM test_geo_udf_s;
POLYGON ((-90.23345947265625 38.484769753492536, -90.25886535644531
38.47455675836861, ...)) | [9yzf6, 9yzf7, 9yzfd, 9yzfe, 9yzff, 9yzfg, 9yzfk,
9yzfs, 9yzfu]
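One way to implement such a UDF is to delegate to an existing geohash library. The sketch below assumes the davidmoten "geo" library (com.github.davidmoten:geo) plus JTS for reading the WKT bounding box – the exact API calls are an assumption for illustration, not necessarily what the talk's UDF uses:

import java.util.ArrayList;
import java.util.List;

import org.locationtech.jts.geom.Envelope;
import org.locationtech.jts.geom.Geometry;
import org.locationtech.jts.io.WKTReader;

import com.github.davidmoten.geo.GeoHash;

public class GeoHashUdfSketch {

    // geo_hash(latitude, longitude, length): encode a single point
    public String geoHash(final double latitude, final double longitude, final int length) {
        return GeoHash.encodeHash(latitude, longitude, length);
    }

    // geo_hash(geometry_wkt, length): all geohashes covering the geometry's bounding box
    public List<String> geoHash(final String geometryWKT, final int length) throws Exception {
        Geometry geometry = new WKTReader().read(geometryWKT);
        Envelope env = geometry.getEnvelopeInternal();   // minX/maxX = lon, minY/maxY = lat
        return new ArrayList<>(
            GeoHash.coverBoundingBox(env.getMaxY(), env.getMinX(),    // top-left (lat, lon)
                                     env.getMinY(), env.getMaxX(),    // bottom-right (lat, lon)
                                     length).getHashes());
    }
}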
4) Geofences aggregated by GeoHash
[Diagram: geofence Table and vehicle position Stream are each enriched with a geohash via geo_hash(); the geofences are grouped per geohash into a Table ("Geofences grouped by geohash"); "Join Position & Geofences" calls geo_fence_bulk() and produces the geofence status Stream]
Example messages:
{ "geohash":"u33", "name":"Potsdam", "geometry_wkt":"POLYGON ((13.297920227050781 52.56195151687443, …))", "last_update":1560607149015}
{ "geohash":"u33", "name":"Berlin", "geometry_wkt":"POLYGON ((-90.23345947265625 38.484769753492536,…))", "last_update":1560607149015}
{ "geohash":"u33", "id" : "10", "latitude" : 52.3924, "longitude" : 13.0514}
[Ratings: Scalable (high–low), Latency (low–high), "Code Smell" (low–high) – marked medium / medium / medium]
4) Geofences aggregated by GeoHash
CREATE STREAM a04_geo_fence_by_geohash_s
AS
SELECT geo_hash(geometry_wkt, 3)[0] geo_hash, id, name, geometry_wkt
FROM a04_geo_fence_s
PARTITION BY geo_hash;

INSERT INTO a04_geo_fence_by_geohash_s
SELECT geo_hash(geometry_wkt, 3)[1] geo_hash, id, name, geometry_wkt
FROM a04_geo_fence_s
WHERE geo_hash(geometry_wkt, 3)[1] IS NOT NULL
PARTITION BY geo_hash;

INSERT INTO a04_geo_fence_by_geohash_s
SELECT ...

There is no explode() functionality in KSQL! https://github.com/confluentinc/ksql/issues/527
4) Geofences aggregated by GeoHash
CREATE TABLE a04_geo_fence_by_geohash_t
AS
SELECT geo_hash,
COLLECT_SET(id + ':' + geometry_wkt) AS id_geometry_wkt_list,
COLLECT_SET(id) id_list
FROM a04_geo_fence_by_geohash_s
GROUP BY geo_hash;
CREATE STREAM a04_vehicle_position_by_geohash_s
AS
SELECT vp.id, vp.latitude, vp.longitude,
geo_hash(vp.latitude, vp.longitude, 3) geo_hash
FROM vehicle_position_s vp
PARTITION BY geo_hash;
4) Geofences aggregated by GeoHash
CREATE STREAM a04_geo_fence_status_s
AS
SELECT vp.geo_hash, vp.id, vp.latitude, vp.longitude,
       geo_fence_bulk(vp.latitude, vp.longitude, gf.id_geometry_wkt_list)
         AS fence_status
FROM a04_vehicle_position_by_geohash_s vp
LEFT JOIN a04_geo_fence_by_geohash_t gf
ON (vp.geo_hash = gf.geo_hash);

ksql> SELECT * FROM a04_geo_fence_status_s;
u33 | 46 | 52.3906 | 13.1599 | [3:OUTSIDE]
u33 | 46 | 52.3906 | 13.1599 | [3:OUTSIDE]
9yz | 12 | 38.34409 | -90.15034 | [2:OUTSIDE, 1:OUTSIDE]
...
→ every position is only checked against the geofences in the same geohash
4a) Geofences aggregated by GeoHash
[Diagram: as in 4), geofence Table and vehicle position Stream are enriched with a geohash via geo_hash() and the geofences are grouped per geohash into a Table; "Join Position & Geofences" calls geo_fence_bulk() into a "udf status" Stream, which is then turned into the geofence status Stream]
Example messages:
{ "geohash":"u33", "name":"Potsdam", "geometry_wkt":"POLYGON ((5.668945 51.416016, …))", "last_update":1560607149015}
{ "geohash":"u33", "name":"Berlin", "geometry_wkt":"POLYGON ((-90.23345947265625 38.484769753492536,…))", "last_update":1560607149015}
{ "geohash":"u33", "id" : "10", "latitude" : 52.3924, "longitude" : 13.0514}
[Ratings: Scalable (high–low), Latency (low–high), "Code Smell" (low–high) – marked medium / medium / medium]
4b) Geofences aggregated by GeoHash
[Diagram: geofence Table and vehicle position Stream are enriched with a geohash via geo_hash() and the geofences are grouped per geohash into a Table; "Join Position & Geofences" produces a "position & geofence" Stream, "Explode Geofences" fans it out per geofence, and geo_fence() produces the geofence status Stream]
Example messages:
{ "geohash":"u33", "name":"Potsdam", "geometry_wkt":"POLYGON ((5.668945 51.416016, …))", "last_update":1560607149015}
{ "geohash":"u33", "name":"Berlin", "geometry_wkt":"POLYGON ((-90.23345947265625 38.484769753492536,…))", "last_update":1560607149015}
{ "geohash":"u33", "id" : "10", "latitude" : 52.3924, "longitude" : 13.0514}
[Ratings: Scalable (high–low), Latency (low–high), "Code Smell" (low–high) – marked medium / medium / medium]
4b) Geofences aggregated by GeoHash
CREATE STREAM a04b_geofence_udf_status_s
AS
SELECT id, latitude, longitude, id_list[0] AS geofence_id,
       geo_fence(latitude, longitude, geometry_wkt_list[0]) AS geofence_status
FROM a04_vehicle_position_by_geohash_s vp
LEFT JOIN a04_geo_fence_by_geohash_t gf
ON (vp.geo_hash = gf.geo_hash);

INSERT INTO a04b_geofence_udf_status_s
SELECT id, latitude, longitude, id_list[1] geofence_id,
       geo_fence(latitude, longitude, geometry_wkt_list[1]) AS geofence_status
FROM a04_vehicle_position_by_geohash_s vp
LEFT JOIN a04_geo_fence_by_geohash_t gf
ON (vp.geo_hash = gf.geo_hash)
WHERE id_list[1] IS NOT NULL;
It works … but …
• By re-partitioning by geohash we lose the guaranteed order for a given vehicle
• Can be problematic if there is a backlog in one of the topics/partitions
[Diagram: map around Berne / Fribourg split into geohash cells u0m4, u0m5, u0m6, u0m7, distributed over Consumer 1 and Consumer 2]
Using Kafka Streams
Geo-Fencing with Kafka Streams and Global KTable
[Diagram: the geofence KTable is enriched and grouped by geohash ("geofence by geohash") and published as a GlobalKTable; the vehicle position stream is enriched with a geohash ("position & geohash") and joined against the GlobalKTable ("Enrich Position with GeoHash & Join with Geofences") into "matched geofences"; "Detect Geo Event" turns state changes into geofence_status events]
Example messages:
{ "geohash":"u33", "name":"Potsdam", "geometry_wkt":"POLYGON ((5.668945 51.416016, …))", "last_update":1560607149015}
{ "geohash":"u33", "name":"Berlin", "geometry_wkt":"POLYGON ((-90.23345947265625 38.484769753492536,…))", "last_update":1560607149015}
{ "geohash":"u33", "id" : "10", "latitude" : 52.3924, "longitude" : 13.0514}
{"id":"10", "latitude" : 52.3924, "longitude" : 13.0514, [ {"name":"Berlin"} ] }
{"id":"10", "status" : "ENTER", "geofenceName":"Berlin"}
[Ratings: Scalable (high–low), Latency (low–high), "Code Smell" (low–high) – marked medium / medium / medium]
Geo-Fencing with Kafka Streams and Global KTable
KStream<String, GeoFence> geoFence = builder.stream(GEO_FENCE);
KStream<String, GeoFence> geoFenceByGeoHash =
geoFence.map((k,v) -> KeyValue.<GeoFence, List<String>> pair(v,
GeoHashUtil.coverBoundingBox(v.getWkt().toString(), 5)))
.flatMapValues(v -> v)
.map((k,v) -> KeyValue.<String,GeoFence>pair(v, createFrom(k, v)));
KTable<String, GeoFenceList> geofencesByGeohash =
geoFenceByGeoHash.groupByKey().aggregate(
() -> new GeoFenceList(new ArrayList<GeoFenceItem>()),
(aggKey, newValue, aggValue) -> {
GeoFenceItem geoFenceItem = new
GeoFenceItem(newValue.getId(), newValue.getName(),
newValue.getWkt(), "");
if (!aggValue.getGeoFences().contains(geoFenceItem))
aggValue.getGeoFences().add(geoFenceItem);
return aggValue;
},
Materialized.<String, GeoFenceList,
KeyValueStore<Bytes,byte[]>>as("geofences-by-geohash-store"));
geofencesByGeohash.toStream().to(GEO_FENCES_KEYEDBY_GEOHASH,
Produced.<String, GeoFenceList> keySerde(stringSerde));
Geo-Fencing with Kafka Streams and Global KTable
final GlobalKTable<String, GeoFenceList> geofences =
builder.globalTable(GEO_FENCES_KEYEDBY_GEOHASH);
KStream<String, VehiclePositionWithMatchedGeoFences> positionWithMatchedGeoFences =
vehiclePositionsWithGeoHash.leftJoin(geofences,
(k, pos) -> pos.getGeohash().toString(),
(pos, geofenceList) -> {
List<MatchedGeoFence> matchedGeofences = new ArrayList<MatchedGeoFence>();
if(geofenceList != null) {
for (GeoFenceItem geoFenceItem : geofenceList.getGeoFences()) {
boolean geofenceStatus =
GeoFenceUtil.geofence(pos.getLatitude(), pos.getLongitude(),
geoFenceItem.getWkt().toString());
if(geofenceStatus)
matchedGeofences.add(new MatchedGeoFence(geoFenceItem.getId(),
geoFenceItem.getName(), null));
}
}
return new VehiclePositionWithMatchedGeoFences(pos.getVehicleId(), 0L,
pos.getLatitude(), pos.getLongitude(),
pos.getEventTime(), matchedGeofences);
});
Geo-Fencing with Kafka Streams and Global KTable
final KStream<String, VehiclePositionWithMatchedGeoFences> positionWithMatchedGeoFences =
    builder.stream(MATCHED_FENCE_STREAM);

final StoreBuilder<KeyValueStore<String, VehiclePositionWithMatchedGeoFences>> vehicleGeoFenceStatusStore =
    Stores.keyValueStoreBuilder(Stores.persistentKeyValueStore("GeoFenceSnapshotStore"),
                                Serdes.String(), positionWithMatchedGeoFencesSerde)
          .withCachingEnabled();
builder.addStateStore(vehicleGeoFenceStatusStore);

KStream<String, List<GeoEvent>> geoEvents = positionWithMatchedGeoFences.transformValues(
    () -> new GeoEventEmitter(vehicleGeoFenceStatusStore.name()),
    vehicleGeoFenceStatusStore.name());

KStream<String, GeoEvent> geoEvent = geoEvents.flatMapValues(v -> v);
KStream<String, GeoEvent> geoEventByVehicleId =
    geoEvent.selectKey((k, v) -> v.getVehicleId().toString());
geoEventByVehicleId.to(GEO_EVENT_STREAM);
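The GeoEventEmitter used above is not shown on the slides. A minimal sketch of such a ValueTransformer – the GeoEvent constructor and the getMatchedGeoFences()/getName() accessors are assumptions based on the value classes seen in the previous snippets – compares the geofences matched by the previous position of a vehicle with the ones matched now and emits ENTER/EXIT events:

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.apache.kafka.streams.kstream.ValueTransformer;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.state.KeyValueStore;

public class GeoEventEmitter
        implements ValueTransformer<VehiclePositionWithMatchedGeoFences, List<GeoEvent>> {

    private final String storeName;
    private KeyValueStore<String, VehiclePositionWithMatchedGeoFences> store;

    public GeoEventEmitter(String storeName) {
        this.storeName = storeName;
    }

    @Override
    @SuppressWarnings("unchecked")
    public void init(ProcessorContext context) {
        store = (KeyValueStore<String, VehiclePositionWithMatchedGeoFences>) context.getStateStore(storeName);
    }

    @Override
    public List<GeoEvent> transform(VehiclePositionWithMatchedGeoFences current) {
        String vehicleId = current.getVehicleId().toString();
        VehiclePositionWithMatchedGeoFences previous = store.get(vehicleId);

        Set<String> before = geofenceNames(previous);
        Set<String> now = geofenceNames(current);

        List<GeoEvent> events = new ArrayList<>();
        for (String name : now) {                       // matched now but not before -> ENTER
            if (!before.contains(name)) {
                events.add(new GeoEvent(current.getVehicleId(), "ENTER", name));   // hypothetical constructor
            }
        }
        for (String name : before) {                    // matched before but no longer -> EXIT
            if (!now.contains(name)) {
                events.add(new GeoEvent(current.getVehicleId(), "EXIT", name));    // hypothetical constructor
            }
        }

        store.put(vehicleId, current);                  // remember the snapshot for the next position
        return events;
    }

    private Set<String> geofenceNames(VehiclePositionWithMatchedGeoFences pos) {
        Set<String> names = new HashSet<>();
        if (pos != null) {
            for (MatchedGeoFence fence : pos.getMatchedGeoFences()) {   // assumed accessor
                names.add(fence.getName().toString());
            }
        }
        return names;
    }

    @Override
    public void close() {
    }
}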
Using Tile38
Tile38
• https://tile38.com
• Open Source Geospatial Database & Geofencing Server
• Real Time Geofencing
• Roaming Geofencing
• Fast Spatial Indices
• Pluggable Event Notifications
Tile38 – How does it work?
> SETCHAN berlin WITHIN vehicle FENCE OBJECT {"type":"Polygon","coordinates":[[[13.297920227050781,52.56195151687443],[13.2440185546875,52.530216577830124],[13.267364501953125,52.45998421679598],[13.35113525390625,52.44826791583386],[13.405036926269531,52.44952338289473],[13.501167297363281,52.47148826410652], ...]]}

> SUBSCRIBE berlin
{"ok":true,"command":"subscribe","channel":"berlin","num":1,"elapsed":"5.85µs"}
.
.
.
{"command":"set","group":"5d07581689807d000193ac33","detect":"outside","hook":"berlin","key":"vehicle","time":"2019-06-17T09:06:30.624923584Z","id":"10","object":{"type":"Point","coordinates":[13.3096,52.4497]}}

SET vehicle 10 POINT 52.4497 13.3096
Tile38 – How does it work?
> SETHOOK berlin_hook kafka://broker-1:9092/tile38_geofence_status WITHIN vehicle FENCE OBJECT {"type":"Polygon","coordinates":[[[13.297920227050781,52.56195151687443],[13.2440185546875,52.530216577830124],[13.267364501953125,52.45998421679598],[13.35113525390625,52.44826791583386],[13.405036926269531,52.44952338289473],[13.501167297363281,52.47148826410652], ...]]}

bigdata@bigdata:~$ kafkacat -b localhost -t tile38_geofence_status
% Auto-selecting Consumer mode (use -P or -C to override)
{"command":"set","group":"5d07581689807d000193ac34","detect":"outside","hook":"berlin_hook","key":"vehicle","time":"2019-06-17T09:12:00.488599119Z","id":"10","object":{"type":"Point","coordinates":[13.3096,52.4497]}}

SET vehicle 10 POINT 52.4497 13.3096
1) Enrich with GeoFences – aggregated by geohash
[Diagram: the geofence Stream and the vehicle position Stream each invoke a UDF (set_fence() / set_pos()) against the Geofence Service (Tile38); a "udf status" Stream feeds the geofence status]
Example messages:
{"vehicle_id":"10", "name":"St. Louis", "geometry_wkt":"POLYGON ((13.297920227050781 52.56195151687443, …))", "last_update":1560607149015}
{"vehicle_id":"10", "name":"Berlin", "geometry_wkt":"POLYGON ((-90.23345947265625 38.484769753492536,…))", "last_update":1560607149015}
{ "id" : "10", "latitude" : 38.35821, "longitude" : -90.15311}
[Ratings: Scalable (high–low), Latency (low–high), "Code Smell" (low–high) – marked medium / medium / medium]
2) Using Custom Kafka Connector for Tile38
[Diagram: the geofence and vehicle position topics are pushed into the Geofence Service (Tile38) by two "kafka-to-tile38" connector instances; Tile38 writes the geofence status back to Kafka]
Example messages:
{"vehicle_id":"10", "name":"St. Louis", "geometry_wkt":"POLYGON ((13.297920227050781 52.56195151687443, …))", "last_update":1560607149015}
{"vehicle_id":"10", "name":"Berlin", "geometry_wkt":"POLYGON ((-90.23345947265625 38.484769753492536,…))", "last_update":1560607149015}
{ "id" : "10", "latitude" : 38.35821, "longitude" : -90.15311}
[Ratings: Scalable (high–low), Latency (low–high), "Code Smell" (low–high) – marked medium / medium / medium]
2) Using Custom Kafka Connector for Tile38
curl -X PUT \
  /api/kafka-connect-1/connectors/Tile38SinkConnector/config \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -d '{
    "connector.class": "com.trivadis.geofence.kafka.connect.Tile38SinkConnector",
    "topics": "vehicle_position",
    "tasks.max": "1",
    "tile38.key": "vehicle",
    "tile38.operation": "SET",
    "tile38.hosts": "tile38:9851"
  }'
Currently only supports the SET command
Visualization using Arcadia Data
Arcadia Data https://www.arcadiadata.com/
Summary
Summary & Outlook
• Summary
• Geofencing is doable using Kafka and KSQL
• KSQL is similar to SQL, but don't think relational
• UDFs and UDAFs are a powerful way to extend KSQL
• Use geohashes to partition work
• Outlook
• Performance tests
• Clean up the code of the UDFs and UDAFs
• Implement a Kafka Source Connector for Tile38
Technology on its own won't help you.
You need to know how to use it properly.