Capturing_the_views_of_geoscientists_on
This is the ‘Accepted Version’ of the following Quarterly Journal of Engineering Geology and
Hydrogeology article - for which a suggested citation is given below:
Gilder, C.E.L., Geach, M., Vardanega, P.J., Holcombe, E.A. & Nowak, P. (2021).
Capturing the views of geoscientists on data sharing: a focus on the geotechnical
community. Quarterly Journal of Engineering Geology and Hydrogeology,
54(2): [qjegh2019-138] https://doi.org/10.1144/qjegh2019-138
The final published version of the article can be obtained via the following link:
https://qjegh.lyellcollection.org/content/54/2/qjegh2019-138 [Accessed 10 May 2022].
10 May 2022
Date of version – 5th September 2020
Paul J. Vardanega, BE MEngSc (QldUT) PhD (Cantab) GMICE MASCE MIEAust FHEA
Senior Lecturer in Civil Engineering
University of Bristol
Department of Civil Engineering
Queen’s Building
University Walk
Bristol, BS8 1TR, UK.
Email: [email protected]
ORCID: 0000-0001-7177-7851 (Corresponding Author)
Paul Nowak, BSc (Lond.), ARSM, CEng, MICE, MIMMM, CGeol, FGS
Technical Director,
Atkins,
Epsom, UK
Email: [email protected]
Capturing the views of geoscientists on data sharing: A focus on the geotechnical community
ABSTRACT
The sharing of Ground Investigation (GI) data within the United Kingdom (UK) is commonly practised
only in large infrastructure projects. A vast amount of GI data collected on routine projects is commonly
not made publicly available, which is arguably inefficient and potentially unsustainable. This paper
captures the opinions of the geoscience community and the GI industry on data sharing to better
understand current working practices and potential barriers to data sharing. The results of a survey
carried out at the Janet Watson Meeting 2018: A Data Explosion: The Impact of Big Data in Geoscience
held at the Geological Society of London are reported. This survey is compared with the results of
interviews undertaken during the Dig to Share project, a collaborative project led by Atkins, British
Geological Survey (BGS) and Morgan Sindall. The opinions and practices of geoscientists towards data
sharing across a project life cycle are reviewed. Drivers of risk relating to geotechnical aspects of a
1. INTRODUCTION
The sharing of data and the use of digital tools are becoming ever more important aspects of delivery
and management of large civil engineering projects in the United Kingdom (UK). Existing frameworks
for collection and storage of ground investigation (GI) data (e.g., AGS 2004; BSI 2014; AGS 2017) can
aid the process of providing geological and geotechnical data within the project team, allowing better
In a typical civil engineering project, a conceptual engineering geological model is created to anticipate
what might be encountered onsite based on geological inferences (Fookes 1997; Parry et al. 2014;
Norbury 2020). This is progressed to a preliminary observational model, one that is made from
observations from available boreholes at a site (Parry et al. 2014). This preliminary work forms part of
the Phase I - Desk Study (DS) (BSI 2020) and requires access to relevant historical data held in
databases to help identify the potential presence of technical risks (e.g., Clayton 2001; McMahon 1985).
A ground model includes engineering parameters, and a geotechnical model is built from a
mathematical or physical analysis (Parry et al. 2014). Preliminary information, which would inform a
preliminary observational model, may propagate through the subsequent design, construction, and
service phases of a project in the form of geotechnical risks that will require identification and
management. The DS is, therefore, widely recognised to be the most cost-effective part of this risk
The quality of data collection is required to be consistent with the Geotechnical Category and specific
to the requirements of the project. BSI (2020) requires a ‘risk register’ to be completed at the earliest
point in a project to summarise the likely foreseen sources of unfavourable conditions associated with
the subsurface. Geotechnical risks within a project can be described in terms of ‘Technical’ or
‘Contractual’ risks (Baynes 2010). The technical risks are those that are the result of geotechnical
uncertainties, such as ‘The risk of encountering an unknown geological condition’; ‘The risk of using
the wrong geotechnical criteria’ and ‘The risk of bias and/or variation in the design parameters being
greater than estimated’ (McMahon 1985). Geotechnical engineers usually manage these uncertainties
and their associated risks using factors of safety and engineering judgement (often using limit state
design or partial factoring approaches) (e.g., Simpson et al. 1981; Bolton 1981; Vardanega & Bolton
2016). Risk is also managed by development of documents such as the code of practice for ground
(Section 2) (BSI 2010) describing the careful planning required for collection of geotechnical data, or
other specific documents such as Highways England (HE) document CD 622 (HE 2020) (superseding
This paper provides a brief overview of the progression of geotechnical data (and, therefore, the
associated uncertainties and risks) through the early project phases in a typical UK construction context.
The aim is to explore the current attitudes towards, and behaviour of, GI data storage and sharing of
those working in the industry. The methodology and results of two studies (a semi-quantitative survey
and a set of semi-structured qualitative interviews) are compared to understand: (i) to what extent does
data sharing occur in practice? (ii) is open data useful? and (iii) does data sharing help to improve risk
management? By tracking the path of GI data, along with the workflow of geotechnical engineers and
geoscience professionals, the potentially significant role that data sharing could have in identifying
geotechnical risks and potential for improvement of the risk reduction process is discussed.
Consideration of a centralised database for GI data came about as early as the 1980s. Organisations
were urged to release the internal information they held at the time (Wood et al. 1982), and research
became focused on the development of a suitable database structure relevant for descriptions of soils
and rocks (e.g., Toll & Oliver 1995). The British Geological Survey (BGS) began collecting borehole
records in the 19th century and this collection is maintained by the UK National Geoscience Data Centre (NGDC)
(NGDC 2020). Data from ground investigations in the UK are often transferred within a project in a
data transfer format file (AGS 2017) developed by the Association of Geotechnical Specialists (AGS)
during the early 1990s and extended in various versions to date (AGS version 4.1) (AGS 2020). The
BGS provides, along with ‘pdf copies’ of groundwater wells and ground investigation boreholes, a web
There is often a lack of distinction between geotechnical data and geotechnical information (Chandler
et al. 2012). Geological data describes the factual data from a ground investigation whilst information
describes the interpreted geological layers, 2D and 3D relationships (which includes pdf copies of
borehole logs) (Chandler et al. 2012). In BS 8574:2014 (BSI 2014) the former distinction is termed
‘logical data’. Similarly, in the United States, a data transfer format known as the Data Interchange for
Geotechnical and Geoenvironmental Specialists (DIGGS) (DIGGS 2020), has been developed, but its
adoption in practice has faced challenges. For instance, a study by the Ohio Department of Transport
Office found that data is typically re-input multiple times by those performing ground investigation
projects; the workflow for consultants typically involves manually re-typing data to produce pdf copies
Typical projects in the UK have a workflow similar to that described in Deaton (2018) (Figure 2) and
inefficiencies are caused by the way that data is handled during a project. The individuals closest to the
original GI data at the start of the chain (i.e. the GI contractor, who produces new ground investigation
data at initiation of a project) supply data down the chain to those who use the information to inform
design decisions (see suggested groups of individuals in Figure 2). Where data is supplied as ‘pdf
copies’, which is not logical data, the resulting work processes can cause significant delay in delivery
of GI information. Geotechnical aspects can be submitted to Building Information Models (BIMs), yet
these are commonly not reviewed, and the flow of information is often one way (Chandler et al. 2012).
The data and information commonly do not reach a data repository, so cannot go towards benefitting
engineering practice.
Building Information Modelling (BIM) (e.g., BSI 2019) government mandates have driven the need
for collective development of 3-dimensional models into the wider project environment, which has
affected geotechnical aspects (Chandler et al. 2012), and has driven the need to share data to build
reliable ground models. An emphasis on 3D modelling also features in the BS 5930 amendment (BSI
2020; Norbury 2020). UK Government initiatives have led to the development of the BIM Industry
Working Group to improve project delivery and operational performance (BIM Industry Working
Group 2011; HMG 2013). In the Highways and Rail sectors in the UK, where there is a greater need to
build detailed ground models (e.g., Mooney 2020), organisations maintain
databases of engineering related information acquired during infrastructure projects, e.g. the Highways
Agency Geotechnical Data Management System (HA GDMS). This is a project-based system including
scanned analogue (paper) reports and AGS files. Similarly, the United Kingdom Oil and Gas Authority
(OGA), a government authority created to promote innovation in UK oil exploration, describes the need
for stewardship of a National Data Repository of well and seismic information, in part due to new
requirements of the UK Energy Act 2016. The Geospatial Commission was formed in April 2018 by the
UK Government to co-ordinate driving of value from geospatial data (HMG 2019) in the context of
construction and technology. This industry is looking to technologies including GIS, use of ‘Big data’
In the construction industry greater efficiencies are seen in large infrastructure projects such as HS1
and HS2 (Smale 2017), Crossrail’s Farringdon Station (Aldiss et al. 2012; Gakis et al. 2016) and
modelling of the London Basin (Mathers et al. 2014) where data is shared within a project. These
focused data sharing initiatives in UK infrastructure projects/sectors are presenting an interesting
opportunity for the geotechnical community, as it is not currently understood to what degree an increase
in data sharing would help the industry reduce project risk and impact overall productivity.
To investigate current attitudes to and practices of geoscience data-sharing, a survey of 11
multiple-choice questions was presented to attendees of the Janet Watson Meeting 2018: A Data Explosion: The
Impact of Big Data in Geoscience held from 27th February to 1st March 2018 at the Geological
Society of London, UK. A total of 54 individuals responded to the questionnaire, of whom 44.4%
represented the oil and gas industry, 26% research, 12.9% exploration, 7.4% construction industry,
3.7% remote sensing, 3.7% mining or mineral extractive industries and 1.8% data science (Q1). The
purpose was to provide perspective on these individuals’ attitudes towards ‘open data’ and so understand
if the GI industry could improve their current workflow, by understanding the opinions of those working
closer to data science concepts within the geoscience community. The questions were designed to first
establish whether the individuals use open datasets, followed by an understanding of preferred data
storage types, participation in release of data to open environments, how they use open data in their
To build on this initial research a comparison of the perspectives and practices of this wider geoscience
community was made with those of a focussed population of construction industry professionals. A
series of semi-structured interviews was held by the behavioural research initiative Dig to Share
(2018a), a joint project between Atkins, Morgan Sindall, Fluxx and the BGS. This project produced a
document which detailed interesting quotes from interviews of 23 people from the engineering sector
(Dig to Share 2018b). Participants included individuals representing Utilities Providers, Ground
Sector Bodies. The interviews were designed with a particular focus on how these individuals interact
with the existing BGS database. Participants were encouraged to discuss the following topics:
• value articulation: what is the perception of the value of the BGS database among those dealing with
ground investigation data on a day-to-day basis (additionally those who do not)?
• time and resources: is uploading data to the BGS database considered a commercially viable
activity?
• complexity of parties involved in civil engineering projects: what are the problems around
• data ownership: should owners of the data be obliged to give over their data to open datasets?
• data availability and format: is the current data held in the BGS database at a high enough
• technology: how can the functionality of the BGS database be improved, including the systems
used by its contributors, i.e. methods of data capture and manipulation tools?
This research collates the quotes from the Dig to Share document (Dig to Share 2018b) according to
the participant’s stage on a typical project life cycle. The Janet Watson results are provided as a record
of the number of answers for each multiple-choice question. The insights from both populations, the
engineering sector and the wider geoscience context, are reviewed.
Figure 3 and Figure 4 provide the results of the Janet Watson meeting survey. The number of
participants selecting each multiple-choice answer is provided. A total of 48 out of 54 participants (88.8%)
confirmed the use of open datasets in their work (Q2). The most common open resources selected from
the choices shown included Ocean drilling/seismic based resources and both British and Non-UK
Geological Survey data (Figure 3a). Internal/company held databases were the fourth most used
resources. For the majority, open data is either extrapolated or used to support newly acquired data to
make decisions (Figure 4a) and within sectors that are naturally more reliant on using shared data, such
as Remote Sensing, Exploration and Oil and Gas, the use of open data as a sole informant for decisions
appears to be an accepted approach. This acceptance of the use of open data is mirrored in the perception
of risk: 73.1% of participants perceive a low to moderately low level of risk (when risk is described
as a scale proportioned across four categories i.e. low, moderately low, moderately high and high risk);
58% are from these sectors (Figure 4d). The small representation that are from roles which are either
solely data science, or a mixture of data science and some other geoscience sectors, also held a lower
perception of risk. 50% of participants felt comfortable asking a client for permission to make data
collected during a project open source. The other 50% was made up of 25.9% who were not comfortable
and 24.1% who were in a position where this was not applicable to them (Figure 3c).
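The percentages quoted in this paragraph are simple ratios over the 54 respondents. As a minimal sketch, the Python snippet below recomputes them. Only the total (54) and the 48 open-data users are stated in the text; the per-answer counts for the client-permission question (27, 14 and 13) are assumptions back-calculated from the reported percentages, and the function name is purely illustrative.

```python
# Recompute the survey percentages quoted in the text from respondent counts.
# NOTE: only the total (54) and the 48 open-data users are stated in the paper;
# the remaining counts are assumptions inferred from the reported percentages.
TOTAL_RESPONDENTS = 54


def percentage(count, total=TOTAL_RESPONDENTS):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


open_data_users = 48  # stated in the text (Q2)
permission_counts = {  # assumed counts behind Figure 3c
    "comfortable": 27,
    "not comfortable": 14,
    "not applicable": 13,
}

print(f"Use open datasets: {percentage(open_data_users)}%")
for answer, count in permission_counts.items():
    print(f"{answer}: {percentage(count)}%")
```

Values computed this way may differ in the last digit from those quoted, depending on whether the reported figures were rounded or truncated.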
Participants were asked their perceived level of sharing of geoscience data in their own work. The
results indicated that participation in data sharing is low: in Figure 3d, 75% of survey participants
believed less than 50% of data from their work is made available as open source. Two aspects which
appear from results to be jointly contributing to low data sharing are the allocation of resources and
time. The need to incentivise and prevent data loss is perceived to be of less importance. Also, it can be
inferred that there is a general appreciation of financial advantages and benefits realised by the majority
i.e. the lowest answered categories shown in Figure 3e. It is also evident that many geoscience
professionals are still not prioritising data sharing activities in their workflow (Figure 4b). This question
also identified that collaboration between data scientists and geoscientists is currently low. The
participants were asked to provide an indication of who is most responsible for driving the release of
data to open environments. A total of 44% of the Janet Watson population agree that the drive is required
The Dig to Share results compiled from the document ‘148 interesting things’ (Dig to Share 2018b) have
been summarised into Figure 5 describing the key discussion themes drawn for comparison and
arranged according to stage on a project life cycle. Allocations R-1 to R-6 have been used to show the
groups of individuals (as identified in Figure 2) discussing each topic. The Dig to Share results indicate
that generally, the BGS database is considered a valuable resource; a utilities provider using the words
‘vital’ or ‘incredibly useful’, whilst Multi-disciplinary consultants described advantages gained in
preliminary scoping and effective targeting of borehole logs, enabling preliminary work to be advised
to designers. Interestingly those in roles closest to the data’s original procurement (groups R-1 and R-2)
are expressing most concern for the evaluation of uncertainties and, therefore, risk. From a GI
Contractor at the beginning of the chain, there is a concern for the current quality of records held in the
BGS database due to the lack of sharing of newly acquired data. This group also challenged the quality
of meta-data due to improvement of testing methods or procedures, suggesting archived information
needs to be checked, and describing difficulty in terms of resolution and quality when making 3D
models. This group also indicated that open data will not be able to replace the need to complete a site-
Those usually mid-way in the supply chain, the Multi-disciplinary consultants (R-2), provide discussion
across all key themes. Concerns relate to the processes surrounding quality control, i.e. the risk of
accepting data that has an unknown audit process, and to the need to verify acquired data. This
group noted the importance of ensuring reliability of data, and many discuss the liabilities associated
The differences in opinions across the supply chain represent an unbalanced view. Those at the
infrastructure end consider the BGS open dataset a cost saving asset (Utilities provider), yet those earlier
in the supply chain see withholding data as a commercial advantage (Multi-disciplinary). One
interviewee expresses a frustration of knowing data has been acquired during a nearby project, but they
are unable to obtain it. Several participants of the Dig to Share interviews from Multi-disciplinary
consultancy indicate that the act of sharing data is often forgotten, with no opportunity to bill for time.
Various comments revolved around the cost burden that the additional time and work to share data
would entail, realising the impracticality of digitising data that remains in paper format or in archives
When discussing how communication through the supply chain is currently working, interviewees
described the motivations and priorities of those that are the owners of the data. Several participants
indicate that often the client or those working in the chain of GI to construction are unable to understand
the value of the data that they may hold or own. To counteract this, others would prefer permission to
become a common addition to contracts, enabling prior permission to be granted. Interestingly these
comments are coming from the intermediary owners and users of the data, specifically the roles, R-2,
When looking for future innovation to data sharing, those working on smaller scale developments
indicate that planning authorities could help drive the open data agenda. Interviewees appear to be
looking for solutions from existing work systems which have enabled other innovations in digital
storage of data, i.e. BIM. Some public sector project requirements are described as already
being written into contracts, for instance the Highways England contracts. Other interviewees indicate that
those closest to the data, who are acquiring it, would be most suitable to deliver it to the BGS database.
It is evident that existing use of technologies is currently low. Discussions include the preference to
retain the system of capture of data in hand-written form onsite. Innovations in digital capture have
been opposed by those who had experience testing these methods, due to concerns about the possibility of
missing the quality assurance associated with re-typing of logs by the geological or geotechnical
engineer. Where an AGS digital data transfer format file is available the data is still chosen to be passed
Figure 6 summarises the drivers of risk evidenced from the opinion studies. At project level disparity
of meta-data resolution, low time and resources, perception of risk, participation and communication
are identified. The wider geoscience community is affected by the lack of release of data beyond a
project, which is inhibiting research opportunities and curation of learning outcomes sourced from
industry experiences. This is further driving individual project risk, due to lack of innovation in research
into codes of practice, natural variability and correlation of engineering parameters, which rely on data
availability.
The drivers at project level are realised to be from ‘human’ sources, originating from current practices
and understanding. It is clear from both survey populations that there is an appreciation of the benefits
of data sharing but sharing participation is low. For those who find value in ‘Big Data’ principles and
are working in roles that are dealing with finding solutions for our data needs in geosciences, the
practical aspects of data sharing are the same, but the outlook of individuals is different. Each driver
The uploading of content and data sharing is not seen as commercially viable by the majority in the GI
industry and it is not fully appreciated that the longer-term benefits could outweigh the short-term efforts.
Government strategies are already in place to drive reduction in construction costs and promote faster
delivery (HMG 2013). Government data platforms are also detailing geographic information such as
Lidar data, UK Air quality and the Department for Environment, Food and Rural Affairs (DEFRA) which
have been opened for use. A lot of the benefits are being realised in open data frameworks, developed
to enable smart cities (mruk n.d.) and have government support (Capgemini Consulting 2013). It seems
many want to benefit from the impacts of accessing more data but are still not allocating time and
The profession still encounters issues relating to the ground, which could have been identified prior to
intrusive investigation, and the scope or amount of geotechnical investigation is linked to the likelihood
of under or overdesign (Jaksa et al. 2005). Whyte (1995) reviewed project expenditure on 58 UK
highways projects reporting GI cost was 0.45% of total out-turn cost. Where costs increased above
tender cost, around 54% of the cost was due to geotechnical origins (Whyte 1995). Chapman (2008)
reviews typical project expenditure from a developer’s point of view and finds that the cost of a desk
study (assumed £5K) and ground investigation (assumed £40K), typically constitutes 0.19% of the
building cost, or 0.34% of the structure cost, while ground risks are responsible for causing significant
delay in 20% of projects (where one month of delay is estimated to cost £100M). The continued
pressure to deliver site investigations under low budgets may still be promoting competition between
Child et al. (2014) describe that the ‘data journey’ requires an update, which is more than just a digital
version of the traditional paper process, where data can be accessed at all stages of a project
development. Where a system for data sharing is initialised at the start of a project, this could help
Risk was a common narrative discussed by those in the geotechnical sector and concerns are still held
regarding liability. The multi-disciplinary consultants are often ‘keepers’ of the data and perhaps are
most informed of its quality, understanding the difficulties that are held in communication of the data’s
aspects of uncertainty. Evidence from this study indicates that for other geoscience sectors who rely on
shared data the use of open datasets as a sole source of information is accepted, and these individuals
generally perceive a lower attitude to risk. However, the limit of liability is a complex issue in the
construction sector, and one that needs to be better understood by all parties (e.g., AGS 2018). Problems
of liability and risk are being addressed in other disciplines by using the concept of a ‘data trust’, as is
described by the Open Data Institute (ODI) (ODI 2019). This may hold substantial benefits to those in
construction. Additionally, new ways of writing contracts and project management approaches to
attaining ground investigation data could easily be adopted (e.g., NGDC 2020, specifically ‘Data
sharing agreements’). To establish these new working methods the following points could be
considered.
(1) Methods which allow an ‘Agile’ approach to management, which involve close collaboration with
stakeholders of a project, so that unforeseen circumstances or anomalies found in the ground can be
communicated and reviewed whilst the GI Contractor is still onsite so that risks are better understood
(2) The alignment of geotechnical outcomes to the principles of BIM with the progression of standard
references to Levels of Detail and Levels of Information and/or procedures to enable quality control.
(3) Targeted efforts by supporting Learned Societies to increase understanding and dialogue between
legal representatives and geotechnical professionals, such as accreditation and CPD training courses.
An outcome of this research is that the owners of GI data and those in the supply chain must be
better informed of technological advances in other sectors regarding data sharing and be aware of the
existing frameworks that are currently not being effectively applied. In the GI sector, use of the Code
of Practice for the Management of Geotechnical data (BSI 2014), and more recent modifications to
AGS4 (AGS 2017; Child et al. 2014; AGS 2020), which include tables for the exchange of laboratory
schedules within the format, could be more widely adopted. Griffiths (2014) describes a need for
engineering geologists to broaden their skills base to tackle future societal challenges and it is clear that
In the data science sector, many data management tools are already available which enable ‘Big Data’
application (Yaqoob et al. 2016). The use of ‘Big Data’ solutions in the engineering geoscience industry
is reported to have increased, yet the limiting factor of effective uptake has similarly been described as
human rather than technological (Dabson & Fitzgerald 2018). The current amounts of data in the GI
industry are perhaps not as large as in other sectors; however, concepts such as PropBase (Kingdon et
al. 2016), which can successfully combine data from differing formats, could hold future benefits for
integration of other geoscience datasets, e.g. geophysical data, alongside the geotechnical properties held
within an AGS data transfer format file. Other geoscience areas are also struggling to make the most of
efficiency gains of technology. In the Petroleum and Petrochemical industry, the uptake of ‘Big Data’
differs between upstream and downstream operations (Hassani & Silva 2018). In communities also
dealing with geospatial data, it is understood that there is known financial and sustainability reward
from sharing geographic information and this is actively encouraged, but these communities are also managing the same
problems the GI industry is facing (e.g., AGI 2015; ODI 2018).
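As an illustration of how data held in an AGS data transfer format file might be queried with open-source tools, the following is a minimal Python sketch. It assumes the AGS4 convention of quoted, comma-separated rows tagged GROUP, HEADING, UNIT, TYPE and DATA; the sample borehole values and the function name `parse_ags` are hypothetical, and the sketch is not a validated AGS reader.

```python
import csv
import io

# Hypothetical AGS4-style extract; the borehole values are invented for illustration.
SAMPLE_AGS = '''"GROUP","LOCA"
"HEADING","LOCA_ID","LOCA_NATE","LOCA_NATN","LOCA_GL"
"UNIT","","m","m","m"
"TYPE","ID","2DP","2DP","2DP"
"DATA","BH01","358500.00","173200.00","12.50"
"DATA","BH02","358620.00","173310.00","14.10"
'''


def parse_ags(text):
    """Parse AGS4-style text into {group name: [row dicts keyed by heading]}."""
    groups = {}
    current, headings = None, []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # blank lines separate groups
        tag, values = row[0], row[1:]
        if tag == "GROUP":
            current = values[0]
            groups[current] = []
        elif tag == "HEADING":
            headings = values
        elif tag == "DATA" and current is not None:
            groups[current].append(dict(zip(headings, values)))
        # UNIT and TYPE rows are ignored in this sketch
    return groups


boreholes = parse_ags(SAMPLE_AGS)["LOCA"]
ground_levels = [float(b["LOCA_GL"]) for b in boreholes]
print(len(boreholes), max(ground_levels))
```

Once the rows are held as dictionaries, they could be passed to a statistical or database tool for further analysis; a production workflow would also need to check units, data types and file validity.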
Consideration for data science collaboration is an important outcome of the Janet Watson meeting. The
use of data science concepts and the enhanced methodologies for data analysis this can provide (e.g. use
of open source tools such as Python) should be given as much relevance as any other discipline taught
to a geotechnical engineer. Interestingly, this is not discussed or considered an important competency
in a lot of work seeking effective future development of engineering geology practice (i.e. Turner &
The sharing and efficient storage of newly acquired data must form part of the workflow and be an
integral part of the continued development of site investigation practice. Clear changes to existing
a) Use of existing data science technology for management and querying of GI data (e.g. web-based
systems, integration of analytical tools such as Python, R languages for visualisation and
statistical analysis, using SQL or open-source data processing tools (e.g. Hadoop)) should be
embraced by the industry. The AGS data transfer format should be the principal approach to
c) Increased employment from data science roles to introduce technological efficiencies in
5.4 Lack of communication and, therefore, feedback loop in the Geotechnical Community
The lack of movement to pro-active behaviours has been described previously about geotechnical
engineering, describing some ground investigation practices as ‘business open as usual’ (Knill 2003).
A large issue with the current workflow discussed in this research is communication. This is not only
through lack of understanding by management or the design team (Bridges 2019); it is also a lack of
ownership or responsibility for data sharing described throughout the supply chain. The owners of the
data are generally considered to be the ultimate client or developer of a particular asset. However, this
is not clear in all projects and there is a need to specify data stewardship in contracts. The BGS is in
the process of releasing donated site investigations and borehole logs which have been previously held
as confidential, where the data has been held by them for over 4 years. This is in response to The
Freedom of Information Act and Environmental Information Regulations (EIR) which require BGS to
revisit all previous confidentiality agreements and notify donors (BGS 2020).
The Janet Watson results suggest that lack of sharing is not driven by reluctance to speak to owners
of the data. The Dig to Share participants indicate that approaching the client is low on the list of
priorities: teams can be large over a scheme and individuals cannot bill time to it. In response to these
findings, the Dig to Share project developed the following initiatives which could improve upon
a) Increase discussion of the value of geotechnical data with Clients who can benefit from data
b) Promote the role of Super Users (individuals within a company for instance) whose role is to
c) Develop potential methods of incentivisation in the release and return of data for access to third
Janet Watson meeting participants identified both themselves and the government as relevant drivers
of change. Many participants from the GI industry believe that enforcement from government sources
is required to drive the change in data sharing. There is already significant
evidence of a government drive towards better management of geospatial data (HMG 2019). This
reflects significant governmental investment, coupled with data mandates through the principles of BIM.
Although there are clear benefits of sharing data, it is evident that the geoscience community is still
This paper draws together observations and views on the context of sharing data across the broader
geoscience industry and, more specifically, the disciplines of ground investigation and geotechnical
engineering. The questionnaire and interviews have provided an evidence base for current practices
and some of the barriers to GI data sharing, and have brought greater insight into an issue that has perhaps
been recognised within the GI industry for many years. The main conclusions are: (i) data sharing is
not an active part of the current workflow, and the current working practice for curation of data from
past UK-based civil engineering projects is causing inefficiencies; (ii) in cases where data sharing is
occurring it is providing useful and relevant information for preliminary phases of projects; and (iii) the
lack of data sharing is driving ground-related uncertainties. Current working practices could be
improved by actions such as increasing awareness of those in non-technical roles in the construction
supply chain, allocating time and resources to data sharing, promoting data science in geoscience
Data has historically been important for geotechnical engineering design processes and practices (e.g.,
Kulhawy & Mayne 1990; Phoon & Kulhawy 1999) so it might be expected that the lack of data sharing
is hindering innovation. Research relies heavily on what has been published in the literature, on
producing new data, or on specific collaboration efforts which require funding (e.g., Vardanega et
al. 2020). This is especially important given that ground-related uncertainties are still causing
significant time and financial risks to projects. Concerning the management of geotechnical risks, it is
proposed that data sharing not only holds potential for technical improvements and helps inform
project-level management decisions, but additionally aids other projects and continued research into
geotechnical uncertainties in engineering design. The findings indicate that data sharing is not yet
happening widely enough in the UK and two of the main barriers seem to be the current attitudes and
working practices. The use of BGS as a central independent organisation to curate UK GI data is
working. This suggests that continued collection and management need to be fuelled both from
The numerical data related to the survey at the Janet Watson meeting are presented in the paper and
anonymity is preserved. Data from the Dig to Share research can be sourced from Dig to Share (2018b).
ACKNOWLEDGMENTS
The first author would like to acknowledge the support of the Engineering and Physical Sciences Research
Council, Grant Number: EP/R51245X/1. The first author would also like to thank the Geological
Society of London for supporting the authors’ undertaking of the survey at the Janet Watson Meeting.
The authors thank the reviewers of the manuscript for their helpful comments which have helped
REFERENCES
Association for Geographic Information (AGI) 2015. AGI Foresight Report 2020. The Association for
Geographic Information, London, UK. https://www.agi.org.uk/about/resources/category/100-
foresight?download=160:agi-foresight-2020 [Accessed 05 September 2020]
AGS 2004. Electronic Transfer of Geotechnical and Geoenvironmental Data, Edition 3.1.
Association of Geotechnical and Geoenvironmental Specialists, Kent, UK.
AGS 2017. Electronic Transfer of Geotechnical and Geoenvironmental Data, Edition 4.0.4.
Association of Geotechnical and Geoenvironmental Specialists, Kent, UK.
AGS 2018. LPA 68 – Guidance on Duty of Care arising from Third Party reliance on a geotechnical
report: BDW Trading Ltd v Integral Geotechnique (Wales) Ltd [2018] EWHC 1915 (TCC).
Association of Geotechnical and Geoenvironmental Specialists, Kent, UK.
AGS 2020. AGS Version 4.1. Association of Geotechnical and Geoenvironmental Specialists, Kent,
UK https://www.ags.org.uk/data-format/ags4-data-format/ [Accessed 05 September 2020]
Aldiss, D.T., Black, M.G., Entwisle, D.C., Page, D.P. and Terrington, R.L. 2012. Benefits of a 3D
geological model for major tunnelling works: an example from Farringdon, east-central
London, UK. Quarterly Journal of Engineering Geology and Hydrogeology, 45(4), 405–414.
https://doi.org/10.1144/qjegh2011-066
Baynes, F.J. 2010. Sources of geotechnical risk. Quarterly Journal of Engineering Geology and
Hydrogeology, 43(3), 321–331. https://doi.org/10.1144/1470-9236/08-003
BGS 2020. Site investigation and drilling information frequently asked questions. Depositing records
and digital data. https://www.bgs.ac.uk/services/NGDC/records/depositing.html [Accessed 05
September 2020]
BIM Industry Working Group 2011. A report for the Government Construction Client Group Building
Information Modelling (BIM) Working Party Strategy Paper.
https://www.cdbb.cam.ac.uk/system/files/documents/BISBIMstrategyReport.pdf [Accessed 05
September 2020]
Bolton, M.D. 1981. Limit state design in geotechnical engineering. Ground Engineering, 14(6), 39–
46.
Bridges, C. 2019. Geotechnical Risk: It’s not only the ground. Australian Geomechanics, 54(1), 27–
38.
British Standards Institution (BSI) 2010. Eurocode 7: Geotechnical design – Part 2: Ground
investigation and testing (BS EN 1997-2: 2007+A1:2010). British Standards Institution,
London, UK.
BSI 2014. Code of practice for the management of geotechnical data for ground engineering projects
(BS 8574:2014). British Standards Institution, London, UK.
BSI 2019. Organization and digitization of information about buildings and civil engineering works,
including building information modelling (BIM) – Information management using building
information modelling – Part 2: Delivery phase of the assets (BS EN ISO 19650-2:2018).
British Standards Institution, London, UK.
BSI 2020. Code of practice for ground investigations (BS 5930:2015+A1:2020). British Standards
Institution, London, UK.
Capgemini Consulting. 2013. The Open Data Economy: Unlocking Economic Value by Opening
Government and Public Data. https://www.capgemini.com/wp-
content/uploads/2017/07/the_open_data_economy_unlocking_economic_value_by_opening_go
vernment_and_public_data.pdf [Accessed 05 September 2020]
Chandler, R.J., McGregor, I.D. and Morin, G.R. 2012. The role of geotechnical data in Building
Information Modelling. In: Proceedings of the 11th Australia New Zealand Conference on
Geomechanics (ANZ 2012), 15–18 July 2012, Melbourne, Australia (Narsilio, G., Arulrajah, A.
and Kodikara, J. (eds.)), 511–516.
https://www.issmge.org/uploads/publications/89/82/2_1_10.pdf [Accessed 05 September 2020]
Chapman, T.J.P. 2008. The relevance of developer costs in geotechnical risk management. In:
Foundations: Proceedings of the Second BGA International Conference on Foundations,
ICOF2008 (Brown M.J., Bransby M.F., Brennan A.J. and Knappett J.A. (eds)). IHS BRE Press,
3–26.
Child, P., Grice, C. and Chandler, R. 2014. The Geotechnical Data Journey – How the Way We View
Data is Being Transformed. Information Technology in Geo-Engineering, 3, 83–88.
https://doi.org/10.3233/978-1-61499-417-6-83
Clayton, C.R.I. 2001. Managing geotechnical risk: time for change? Proceedings of the Institution of
Civil Engineers - Geotechnical Engineering, 149(1), 3–11.
https://doi.org/10.1680/geng.2001.149.1.3
Dabson, O. and Fitzgerald, R. 2018. Big Data in the industry: A critical examination of modern data
collection and use in engineering geosciences. In: Janet Watson Meeting 2018: Big Data in
Geoscience Abstract Book
https://www.geolsoc.org.uk/~/media/shared/documents/events/Archive/2018/JW18%20Big%2
0Data%20abstract%20book.pdf [Accessed 05 September 2020]
Deaton, S.L. 2018. What are the Benefits of Geotechnical Data Interchange? Paper presented at: 69th
Highway Geology Symposium (HGS). Proceedings available from:
https://www.highwaygeologysymposium.org/wp-content/uploads/69_HGS-OPT.pdf [Accessed
05 September 2020]
DIGGS 2020. Data Interchange for Geotechnical and Geoenvironmental Specialists.
http://www.diggsml.org [Accessed 05 September 2020]
Dig to Share 2018a. Digital Incubator, Infrastructure Industry Innovation Platform (i3P). Dig to Share.
https://www.i3p.org.uk/projects/digtoshare/ [Accessed 05 September 2020]
Dig to Share 2018b. 148 interesting things we heard while interviewing 23 people in the engineering
sector. Atkins. Dig to Share. https://www.i3p.org.uk/wp-content/uploads/2018/09/03-148-
Interesting-things...article.pdf [Accessed 05 September 2020]
Fookes, P.G. 1997. Geology for engineers: the Geological Model, Prediction and Performance.
Quarterly Journal of Engineering Geology, 30(4), 293–424.
https://doi.org/10.1144/GSL.QJEG.1997.030.P4.02
Gakis, A., Cabrero, P., Entwisle, D. and Kessler, H. 2016. 3D geological model of the completed
Farringdon underground railway station. In: Crossrail Project: Infrastructure, design and
construction (Black, M. (ed.)) Thomas Telford Limited and Crossrail, London, UK, 3, 431–446.
Geach, M. and Grice, C. 2020. What is Agile Site Investigation and Why is Data so Vital?
https://www.keynetix.cloud/technical-articles/agile-site-investigation/ [Accessed 05 September
2020]
Griffiths, J.S. 2014. Feet on the ground: engineering geology past, present and future. The 14th
Glossop lecture. Quarterly Journal of Engineering Geology and Hydrogeology, 47(2), 116–
143. https://doi.org/10.1144/qjegh2013-087
Hassani, H. and Silva, E.S. 2018. Big Data: a big opportunity for the petroleum and petrochemical
industry. OPEC Energy Review, 42(1), 74–89. https://doi.org/10.1111/opec.12118
HE 2020. Managing geotechnical risk (CD 622 Revision 1). Highways England, UK.
https://www.standardsforhighways.co.uk/prod/attachments/ff5ed991-71ed-4ff2-9800-
094e18cd1c4c [Accessed 05 September 2020]
Her Majesty’s Government (HMG) 2013. Construction 2025. Industrial Strategy: government and
industry in partnership. HM Government.
https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/210099/bis-13-
955-construction-2025-industrial-strategy.pdf [Accessed 05 September 2020]
HMG 2019. Geospatial Commission Annual Plan 2019/2020. HM Government.
https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/fi
le/799197/6.5522-CO-GeospatialCommissionAnnualPlan.pdf [Accessed 05 September 2020]
Jaksa, M.B., Goldsworthy, J.S., Fenton, G.A., Kaggwa, W.S., Griffiths, D.V., Kuo, Y.L. and
Poulos, H.G. 2005. Towards reliable and effective site investigations. Géotechnique, 55(2),
109–121. https://doi.org/10.1680/geot.2005.55.2.109
Kingdon, A., Nayembil, M.L., Richardson, A.E. and Smith, A.G. 2016. A geodata warehouse: Using
denormalisation techniques as a tool for delivering spatially enabled integrated geological
information to geologists. Computers and Geosciences, 96, 87–97.
https://doi.org/10.1016/j.cageo.2016.07.016
Knill, J. 2003. Core values: the first Hans-Cloos lecture. Bulletin of Engineering Geology and the
Environment, 62(1), 1–34. https://doi.org/10.1007/s10064-002-0187-9
Kulhawy, F.H. and Mayne, P.W. 1990. Manual on Estimating Soil Properties for Foundation Design.
Rep. No. EL-6800, Electric Power Research Institute, Palo Alto, CA, USA.
Mathers, S.J., Burke, H.F., Terrington, R.L., Thorpe, S., Dearden, R.A., Williamson, J.P. and
Ford, J.R. 2014. A geological model of London and the Thames Valley, southeast England.
Proceedings of the Geologists’ Association, 125(4), 373–382.
https://doi.org/10.1016/j.pgeola.2014.09.001
McMahon, B.K. 1985. Geotechnical Design in the face of uncertainty. Australian Geomechanics
Journal, 10(1), 7–19.
Mooney, N. 2020. Ameys’ Story: Smart use of Historical Geotechnical Data for a Smart Motorway.
https://www.keynetix.cloud/technical-articles/historical-geotechnical-data/ [Accessed 05
September 2020]
mruk n.d. Building A Future City: Future City Glasgow Evaluation. Glasgow City Council.
https://futurecity.glasgow.gov.uk/reports/12826M_FutureCityGlasgow_Evaluation_Final_v10.
0.pdf [Accessed 05 September 2020]
NGDC 2020. Deposit data with NGDC. National Geoscience Data Centre, British Geological Survey.
https://www.bgs.ac.uk/services/ngdc/guidelines.html [Accessed 05 September 2020]
Norbury, D. 2020. Ground models – a brief overview. Quarterly Journal of Engineering Geology and
Hydrogeology, https://doi.org/10.1144/qjegh2020-018
ODI 2018. The UK’s geospatial data infrastructure: challenges and opportunities. Open Data Institute.
https://theodi.org/wp-content/uploads/2018/11/2018-11-ODI-Geospatial-data-infrastructure-
paper.pdf [Accessed 05 September 2020]
ODI 2019. Data Trust for the Royal Borough of Greenwich and Greater London Authority. Pilot 1 –
Sharing Cities Transport and Energy Case Studies. Open Data Institute. http://theodi.org/wp-
content/uploads/2019/04/BPE_PITCH_GREENWICH_GLA_A4-FINAL.pdf [Accessed 05
September 2020]
Parry, S., Baynes, F.J., Culshaw, M.G., Eggers, M., Keaton, J.F., Lentfer, K., Novotny, J. and Paul, D.
2014. Engineering geological models: An introduction: IAEG commission 25. Bulletin of
Engineering Geology and the Environment, 73(3), 689–706. https://doi.org/10.1007/s10064-
014-0576-x
Phoon, K-K. and Kulhawy, F.H. 1999. Characterization of geotechnical variability. Canadian
Geotechnical Journal, 36(4), 612–624. https://doi.org/10.1139/t99-038
Simpson, B., Pappin, J.W. and Croft, D.D. 1981. An approach to limit state calculations in
geotechnics. Ground Engineering, 14(6), 21–26, 28.
Smale, K. 2017. Sharing geotechnical information ‘could cut costs’. New Civil Engineer.
https://www.newcivilengineer.com/business-culture/sharing-geotechnical-information-could-
cut-costs/10025179.article [Accessed 05 September 2020]
Toll, D.G. and Oliver, A.J. 1995. Structuring soil and rock descriptions for storage in geotechnical
databases. Geological Data Management, Geological Society Special Publication, 97(1), 65–
71. https://doi.org/10.1144/GSL.SP.1995.097.01.08
Turner, A.K. and Rengers, N. 2010. A Report Proposing the Adaptation of the ASCE Body of
Knowledge Competency-based Approach to the Assessment of Education and Training Needs
in Geo-Engineering. Progress report to the: Joint Technical Committee JTC-3: Education and
Training. https://www.iaeg.info/wp-content/uploads/2019/01/c4_jtc3-appendix-1-
turnersandrengers-jan2010.pdf [Accessed 05 September 2020]
Vardanega, P.J. and Bolton, M.D. 2016. Design of Geostructural Systems. ASCE-ASME Journal of
Risk and Uncertainty in Engineering Systems, Part A: Civil Engineering, 2(1), [04015017]
https://doi.org/10.1061/AJRUA6.0000849
Vardanega, P.J., Crispin, J.J., Gilder, C.E.L., Voyagaki, E. and Ntassiou, K. 2020. DINGO: A Pile
Load Test Database. Piling 2020 (to be presented).
Whyte, I.L. 1995. The financial benefit from a site investigation strategy. Ground Engineering, 28,
October, 33–36.
Wood, L.A., Tucker, E.V. and Day, R. 1982. Geoshare: The development of a databank of geological
records. Advances in Engineering Software, 4(4), 136–142. https://doi.org/10.1016/0141-
1195(82)90015-8
Yaqoob, I., Hashem, I.A.T, Gani, A., Mokhtar, S., Ahmed, E., Anuar, N.B. and Vasilakos, V. 2018.
Big data: From beginning to future. International Journal of Information Management, 36(6B),
1231–1247. https://doi.org/10.1016/j.ijinfomgt.2016.07.009
Figure 1. Modified from Fookes (1997) schematic. Relationship between time and money spent on a civil
engineering project and the influence of information gathering to enable a complete geotechnical understanding
of a project. © Geological Society of London
Figure 2. Summary of the main stages of ground investigation work undertaken during a civil engineering project
including the data transferred and the professional roles involved in each stage of work.
Figure 3. Results from Janet Watson Survey providing the number of answers for each question. Questions are as
follows: (a) which open datasets do you use? (Q3) Ocean drilling/seismic projects includes seismic data and
sources relevant to oil exploration including: International Ocean Discovery Program (IODP), Deep Sea Drilling
Project (DSDP). Non-UK Geological Surveys include United States Geological Survey (USGS) and Geoscience
Australia. Other participants suggested the following additions to the original multiple-choice options presented
above including: Geological Survey of the Netherlands (TNO), Common Data Access Ltd (CDA) and Norwegian
Petroleum Directorate (NPD) (b) What form of data storage is most useful? (Q5) additional suggestions by
participants are added (c) Are participants comfortable asking a Client for permission to make the data collected
on their project open source? (Q9) (d) What percentage of your data is made accessible as open source?
(Q7) 0% = none of their work is perceived to be shared; 100% = all of their work is perceived to be shared. (e)
What is the main cause of data loss in your industry? No. of answers shows where some participants did not pick
two; each participant answer was weighted so that the total score per participant was equal to one (one participant
did not answer) (Q6).
Figure 4. Results from Janet Watson Survey. The number of answers for each question is presented according to
the participants’ sector of work (colour key) (a) shows answers to question: How do you use open source data
within your work? (Q4) (b) In your opinion what is most holding back the advancements in ‘big data’ for
geoscience industries? (Q8) (c) Who do you feel has most responsibility in driving the ‘open data’ agenda? (Q11)
(d) What is your attitude to risk when using open data? (Q10).
Figure 5. Direct or partial quotes abstracted from original document Dig to Share (2018b), collated and sorted by
interviewee role and position on a typical project lifecycle for the purposes of this research.
Figure 6. Summary of drivers of risk to a civil engineering project lifecycle.