Assessment of OpenStreetMap Data - A Review
3 authors:
Hardeep Rai
Guru Nanak Dev Engineering College
International Journal of Computer Applications (0975 - 8887)
Volume 76 - No. 16, August 2013
Some researchers have also analysed spatial data quality using parameters in addition to those discussed above: thematic accuracy, temporal accuracy, logical consistency, semantic accuracy, usage/purpose/constraints, variation in quality, meta-quality, and resolution. Often one dataset is superior to the others in one aspect, but not in all.
3. ASSESSMENT WORK
OpenStreetMap is based on the concept of crowdsourcing and was founded by Steve Coast in the UK in 2004. OpenStreetMap produces a huge volume of spatial data with comparatively little effort, so researchers are devising methods to use the data rather than to collect it. OpenStreetMap produces labelled data. When labelled data is easy to come by, the researcher's focus shifts to working with the labelled data rather than collecting it. This affects not only which problems researchers choose to work on, but also the learning methodology they use to approach them.
In this sense, crowdsourcing provides a two-part cost saving: a greater ability to realize the traditionally claimed savings of active learning, and a reduced cost of crowd annotation compared with traditional annotators. A second important benefit lies in the implications for the use of labelled versus unlabelled data for training when labelled data is plentiful. Instead of comparing to past supervised learning curves, researchers may consider past learning curves for active learning, which will be steeper in comparison. Many prior studies have concluded that active learning has greater potential than supervised learning to benefit from crowdsourcing [16, 17].
In 2008, Haklay [18] analysed OSM data for Great Britain by comparing it with Ordnance Survey (OS) geodata. In 2009, Ather [14] extended this work to the OS MasterMap for selected parts of London and additionally compared the completeness of road names. Then, in 2010, Haklay [19] buffered British Ordnance Survey data to determine what percentage of the OSM roads were covered. A commonly applied technique for matching different road networks is graph matching [20]. The analysis of Germany started with a comparison of OSM against the proprietary Multinet map data from TomTom [21, 22] and against street map data from other proprietary geodata providers.
In 2011, Ciepluch [30] observed that not all contributors provide data of the same quality, owing to a lack of practice and knowledge, which can be improved through experience in map making. Ludwig [22] described a methodology to compare OSM street data with Navteq for all populated roads in Germany. The methodology was based on matching the street objects of OSM and Navteq, adopted from [23], which allows an object-wise comparison of geometries and thematic attributes. Finally, they calculated relative quality measures: relative object completeness, relative attribute completeness, difference in speed limits, and positional differences. Another study [24] statistically analysed the routing process using OpenStreetMap road data for the inner city of Hamburg. A similar approach was used to analyse OSM data in France [25]. The results of that research showed the advantages and flexibility of the data, but also identified the problem of heterogeneity in the data, specifically for France. In the same year, the first study to analyse the quality of OSM outside Europe was conducted [26]. In that work, OSM data was compared with proprietary data from TomTom (Tele Atlas) and Navteq for the entire state of Florida (USA) and four specific cities within the USA. In comparison with the results for Germany or England, the discrepancies between rural and urban areas in the USA showed the opposite tendency. In Florida, the rural data was, in parts, even more complete than that of the proprietary datasets in the relative comparison conducted. Zielstra [27] compared the amount of pedestrian-related data between freely available sources, i.e., OSM and/or TIGER, and proprietary providers, i.e., Tele Atlas Multinet and/or Navteq Discover Cities, and analysed its effect on modelling “spatial aspects of transit accessibility” for pedestrians in five US and four German cities. They concluded that integrating pedestrian-only segments can lead to a more realistic assessment of service areas than using networks that contain only streets passable by cars, and that the assessment of VGI data quality, especially OSM data, is an ongoing issue of high importance for successful geo-applications [7, 19].
Other analysts in 2012 [28] assessed the effect of integrating network data from different sources on the length of computed shortest paths for pedestrians. Their results showed that the combined use of network datasets significantly reduces shortest-path distances compared with the use of single datasets. They concluded that data integration leads to an increased value for users of
pedestrian routing applications, but that combining OSM with commercial datasets cannot be considered for implementation because of current licensing issues. Neis [29] assessed the completeness of the OSM street network via a relative comparison (street network length, number of streets without names, number of turn restrictions) between OSM and a commercial dataset provider (TomTom, formerly known as Tele Atlas). They noted, though, that the TomTom dataset is suitable for comparison only for street network data for car-specific navigation. They also evaluated logical consistency using an internal test, whereby topological and thematic consistency is determined. Concerning turn restrictions (176,000 in TomTom; 21,000 in OSM in Germany in June 2011), they noted that although the number of turn restrictions available in the OSM dataset is continually increasing, it will probably take several more years before OSM achieves the same level as TomTom, based on the current status and development. Apart from England, no studies have been conducted to date over a period of several years and for an entire country [19].
In 2013, many researchers have been working intensively on the assessment of OpenStreetMap, but research for India has not yet been initiated.
4. CONCLUSIONS & FUTURE WORK
This review concludes that OpenStreetMap is generating a huge dataset with the help of non-commercial users with varying levels of experience, so assessment becomes vital to bring maturity to OpenStreetMap data. The findings show that, because of the varying level of user experience, the data is not error free, and that the mapped areas depend on the contributions of the users. The general trend, however, is that the number of absolute and relative errors is falling.
The approaches discussed in this paper are offline approaches that check the data and correct the OSM data afterwards. Another method that still needs attention from the research community is an online quality check or anomaly detection engine that can check the quality of the map while it is being uploaded by the user. The idea may be to create an automated model that can spot mistakes. To do this, machine learning techniques can take into consideration several user parameters, such as the age of the user, the number of edits, and what happens in a changeset, as discussed by Neis [29]. Further, some researchers have compared OSM with proprietary datasets and reported their results. The ground reality of those proprietary datasets may also be checked.
5. REFERENCES
[1] O’Reilly, T. What Is Web 2.0: Design Patterns and Business Models for the Next Generation of Software, O’Reilly Media: Cambridge, MA, USA, 2008. http://oreilly.com/web2/archive/what-is-web-20.html (Accessed on 13/3/13).
[2] Anderson, P. What Is Web 2.0? Ideas, Technologies and Implications for Education, Joint Information Systems Committee: Bristol, UK, 2007. http://www.jisc.ac.uk/media/documents/techwatch/tsw0701b.pdf (Accessed on 13/3/13).
[3] Hudson-Smith, A.; Batty, M.; Milton, R.; Batty, M. NeoGeography and web 2.0: Concepts, tools and applications, in Journal of Location Based Services, Vol. 3, No. 2, 2009, 118–145.
[4] Goodchild, M. NeoGeography and the nature of geographic expertise, in Journal of Location Based Services, Vol. 3, No. 2, 2009, 82–96.
[5] Haklay, M.; Singleton, A.; Parker, C. Web mapping 2.0: The neogeography of the GeoWeb, in Geography Compass, 2(6), 2008, 2011–2039.
[6] Howe, J. The rise of crowdsourcing, in Wired Magazine, Issue 14.06, 2006, 1–4.
[7] Goodchild, M.F. Spatial Accuracy 2.0, in Proceedings of the 8th International Symposium on Spatial Accuracy Assessment in Natural Resources and Environmental Sciences, Shanghai, China, 2008, 25–27.
[8] Goodchild, M.F. Citizens as voluntary sensors: Spatial data infrastructure in the world of web 2.0, in International Journal of Spatial Data Infrastructures Research, 2007, Vol. 2, 24–32.
[9] Fischer, F. Collaborative mapping: How wikinomics is manifest in the geo-information economy, in GEOInformatics, Volume 11, No. 2, 2008, 28–31.
[10] Kounadi, O. Assessing the quality of OpenStreetMap data, MSc. Dissertation, University College of London, Department of Civil, Environmental And Geomatic Engineering, August 2009.
[11] The Data Stat of OpenStreetMap Users, http://wiki.openstreetmap.org/wiki/Stats (Accessed 23/7/2013).
[12] Schmidt, M.; Weiser, P. Web Mapping Services: Development and Trends, in Springer Lecture Notes On Online Maps with APIs and WebServices, 2012, 13–21.
[13] Sui, D.Z. The wikification of GIS and its consequences: Or Angelina Jolie’s new tattoo and the future of GIS, in Computers, Environment and Urban Systems, 2008, 32, 1–5.
[14] Ather, A. A Quality Analysis of OpenStreetMap Data, M.E. Thesis, University College London, London, UK, May 2009.
[15] Android to command nearly half of worldwide smartphone operating system market by year-end 2012. URL http://www.gartner.com/newsroom/id/1622614, Accessed on 19/5/2013.
[16] Lease, M. On Quality Control and Machine Learning in Crowdsourcing, in Proceedings of Human Computation, AAAI Workshop, August 2011, 97–102.
[17] Ambati, V. Active Learning and Crowdsourcing for Machine Translation in Low Resource Scenarios, in Thesis, Kevin Knight, University of Southern California, ISI, 2012.
[18] Haklay, M. How good is OpenStreetMap information? A comparative study of OpenStreetMap and Ordnance Survey datasets for London and the rest of England, in Environment and Planning B: Planning and Design, August 2008.
[19] Haklay, M. How good is volunteered geographical information? A comparative study of OpenStreetMap and Ordnance Survey datasets, in Environment and Planning B: Planning and Design, 2010, 37(4), 682–703.
[20] Zhang, M.; Meng, L. An iterative road-matching approach for the integration of postal data, in Computers, Environment and Urban Systems, 2007, 31, 597–615.
[21] Zielstra, D.; Zipf, A. A Comparative Study of Proprietary Geodata and Volunteered Geographic Information for Germany, in Proceedings of 13th AGILE International Conference on Geographic Information Science, Guimarães, Portugal, May 2010, 10–14.
[22] Ludwig, I.; Voss, A.; Krause-Traudes, M. A Comparison of the Street Networks of Navteq and OSM in Germany, in Advancing Geoinformation Science for a Changing World, Lecture Notes in Geoinformation and Cartography, 2011, published by Springer, 65–84.
[23] Walter, V.; Fritsch, D. Matching spatial data sets: a statistical approach, in International Journal of Geographical Information Science, 13(5), 1999, 445–473.
[24] Fessele, M.; Poplin, A. Statistical analysis of routing processes using OpenStreetMap road data of Hamburg with different completeness of information about one-way streets, in Proceedings of GeoValue’2010, 2010, 87–92.
[25] Girres, J.F.; Touya, G. Quality assessment of the French OpenStreetMap dataset, in Transactions in GIS, 2010, 14, 435–459.
[26] Zielstra, D.; Hochmair, H.H. Digital street data: Free versus proprietary, in GIM International, 2011, 25, 29–33.
[27] Zielstra, D.; Hochmair, H.H. A comparative study of pedestrian accessibility to transit stations using free and proprietary network data, in Transportation Research Record: Journal of the Transportation Research Board, 2011, 2217, 145–152.
[28] Zielstra, D.; Hochmair, H.H. Comparison of Shortest Path Lengths for Pedestrian Routing in Street Networks Using Free and Proprietary Data, in Proceedings of Transportation Research Board - 91st Annual Meeting, Washington, DC, USA, 22–26 January 2012.
[29] Neis, P.; Zielstra, D.; Zipf, A. The Street Network Evolution of Crowdsourced Maps: OpenStreetMap in Germany 2007–2011, in Future Internet, 2012, 4(1), 1–21, doi:10.3390/fi4010001.
[30] Ciepluch, B.; Mooney, P.; Jacob, R.; Winstanley, A. Sketches of Generic Framework for Quality Assessment of Volunteered Geographical Data, in IEEE Geoscience and Remote Sensing Society (GRSS), 2011, 1–5.
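As an illustration of the buffer comparison described in Section 3 for [19], the following minimal Python sketch estimates what fraction of a test road's length lies within a buffer around a reference road. It is a simplification under stated assumptions, not the procedure of the cited study: coordinates are taken to be planar metres, each road is a plain list of (x, y) points, the buffer test is approximated by checking the midpoints of short pieces of the test line, and the function names, the 1 m step, and the 20 m buffer width are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def covered_fraction(test_line, reference_line, buffer_width, step=1.0):
    """Fraction of test_line length within buffer_width of reference_line.

    Each test segment is cut into pieces of at most `step` length; a piece
    counts as covered when its midpoint lies inside the buffer.
    """
    if len(reference_line) < 2:
        raise ValueError("reference_line needs at least two points")
    ref_segments = list(zip(reference_line, reference_line[1:]))
    total = covered = 0.0
    for a, b in zip(test_line, test_line[1:]):
        length = math.hypot(b[0] - a[0], b[1] - a[1])
        if length == 0.0:
            continue
        pieces = max(1, math.ceil(length / step))
        piece_len = length / pieces
        for i in range(pieces):
            t = (i + 0.5) / pieces
            mid = (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
            dist = min(point_segment_distance(mid, s, e) for s, e in ref_segments)
            total += piece_len
            if dist <= buffer_width:
                covered += piece_len
    return covered / total if total else 0.0


# Hypothetical data: a straight reference road and a test road that
# follows it for 50 m and then turns away at a right angle.
reference_road = [(0.0, 0.0), (100.0, 0.0)]
test_road = [(0.0, 5.0), (50.0, 5.0), (50.0, 105.0)]
print(f"covered: {covered_fraction(test_road, reference_road, 20.0):.1%}")
```

A production analysis would more likely use a GIS library with true polygon buffers and projected coordinates; the sampling approach is used here only to keep the sketch free of dependencies.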
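Section 4 suggests an online quality check that considers user parameters such as the age of the user, the number of edits, and what happens in a changeset. The sketch below shows, under stated assumptions, where such a check could sit in an upload pipeline. It does not implement a machine learning model: a hand-written rule score stands in for a trained model, "age of the user" is read as account age, and every field name, weight, and threshold is hypothetical rather than taken from OSM or from [29].

```python
from dataclasses import dataclass


@dataclass
class Changeset:
    """Hypothetical summary of one upload and its author."""
    account_age_days: int  # assumption: "age of the user" means account age
    previous_edits: int    # edits made by the user before this upload
    created: int           # objects created in this changeset
    modified: int          # objects modified in this changeset
    deleted: int           # objects deleted in this changeset


def risk_score(cs: Changeset) -> int:
    """Return a score from 0 to 10; higher means the upload deserves review.

    The weights (3, 3, 4) and cut-offs are illustrative placeholders for
    what a trained model would learn from labelled changesets.
    """
    score = 0
    if cs.account_age_days < 30:
        score += 3
    if cs.previous_edits < 50:
        score += 3
    touched = cs.created + cs.modified + cs.deleted
    if touched and cs.deleted / touched > 0.5:
        score += 4  # changeset consists mostly of deletions
    return score


def needs_review(cs: Changeset, threshold: int = 6) -> bool:
    """Flag the changeset for manual review before it is accepted."""
    return risk_score(cs) >= threshold


# Hypothetical uploads: a new account deleting many objects, and an
# experienced mapper making routine edits.
newcomer = Changeset(account_age_days=2, previous_edits=5,
                     created=0, modified=1, deleted=9)
veteran = Changeset(account_age_days=900, previous_edits=4000,
                    created=10, modified=5, deleted=1)
print(needs_review(newcomer), needs_review(veteran))
```

Integer weights are used so that the threshold comparison is exact. In a learned variant, the same fields would serve as features and `risk_score` would be replaced by the model's prediction.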