A global effort to make exploration data rock
ExploreSA: The Gawler Challenge
South Australia is firmly in the spotlight as a leader in exploration and geoscience with the completion of the first phase of the Gawler Challenge. The innovative competition with a $250,000 prize pool is the first global government-hosted crowdsourced open data competition to fast-track the discovery of mineral deposits.
The first phase of the competition, the Data Prep Prize, closed on 2 May 2020. The goal was to increase data sharing and collaboration, and to share novel approaches to data preparation for use by the exploration industry. This would help explorers across South Australia, and the world, use exploration data with modern data analytics and machine learning approaches. Teams from across the world took part, sharing their code and cleaned datasets in an open forum. The four winners were announced on 2 June, sharing in a prize pool of $20,000.
The submissions to the Data Prep Prize focused on a range of data types across geophysics, geochemistry and company exploration reports.
The Geological Survey of South Australia curates massive geochemistry datasets from activities such as drilling, and soil and rock-chip sampling. The datasets include assay results spanning multiple decades. Most of the data has been provided by exploration companies over time, and although the majority is correct, numerous data entry errors are scattered throughout. Many of these have been cleaned up by the Geological Survey.
Modern analytical techniques such as machine learning can ingest millions of rows of data at once, automatically, which means the data errors have the potential to cause significant problems in the modelling process. This can result in valuable data, such as anomalous values of economic metals, being concealed.
Several submissions in the Data Prep Prize developed code and cleaned datasets that dealt with common errors in geochemistry data, such as incorrect units, null values and incorrect element names.
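To make these error types concrete, here is a minimal sketch of how a single assay record might be cleaned. It is not taken from any submission: the function name `clean_assay`, the alias table, the list of null markers and the choice of parts per million as the target unit are all assumptions made for illustration.

```python
from __future__ import annotations

from typing import Optional

# Conversion factors to parts per million (ppm). Assumed target unit.
UNIT_TO_PPM = {"ppm": 1.0, "ppb": 0.001, "pct": 10_000.0, "%": 10_000.0}

# Hypothetical alias table mapping common misspellings or full names to symbols.
ELEMENT_ALIASES = {"gold": "Au", "au": "Au", "copper": "Cu", "cu": "Cu"}

# Hypothetical placeholder values that stand in for "no measurement".
NULL_MARKERS = {"", "na", "n/a", "null", "-999", "-9999"}


def clean_assay(element: str, value: str, unit: str) -> Optional[tuple[str, float]]:
    """Return (element symbol, value in ppm), or None if the record is unusable."""
    raw = value.strip().lower()
    if raw in NULL_MARKERS:
        return None  # null value: drop rather than treat as a real reading

    factor = UNIT_TO_PPM.get(unit.strip().lower())
    if factor is None:
        return None  # unrecognised unit: flag for manual review

    try:
        number = float(raw)
    except ValueError:
        return None  # non-numeric entry

    if number < 0:
        return None  # negative concentrations are not physically meaningful

    key = element.strip().lower()
    symbol = ELEMENT_ALIASES.get(key, element.strip().capitalize())
    return symbol, number * factor
```

Calling `clean_assay("gold", "150", "ppb")` would return the symbol `Au` with the value converted to ppm, while a record carrying a placeholder such as `-999` would be dropped. A real workflow would also need to handle detection-limit conventions, which this sketch ignores.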
It is not just the errors in geochemistry data that can cause us headaches. The sheer size of the files, as well as the changes in analytical methods and detection limits over time, provide challenges. One approach a winning submission took was to develop an easy workflow to transform the statewide drilling database into smaller CSV and shapefile formats that could easily be imported into GIS software.
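The idea of breaking a large export into smaller files can be sketched as follows. This is not the winning submission's workflow: the function name `split_csv_by_column` and the column names in the usage note are hypothetical, and only the CSV step is shown (writing shapefiles would need a GIS library).

```python
from __future__ import annotations

import csv
from contextlib import ExitStack
from pathlib import Path


def split_csv_by_column(source: Path, column: str, out_dir: Path) -> dict[str, int]:
    """Stream a large CSV and write one smaller CSV per distinct value of `column`.

    Returns the number of rows written to each output file.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    counts: dict[str, int] = {}
    writers: dict[str, csv.DictWriter] = {}

    # ExitStack closes the input file and every output file on exit.
    with ExitStack() as stack:
        infile = stack.enter_context(source.open(newline="", encoding="utf-8"))
        reader = csv.DictReader(infile)
        for row in reader:
            # Build a filesystem-safe key from the column value.
            value = row[column].strip() or "unknown"
            key = "".join(c if c.isalnum() else "_" for c in value)
            if key not in writers:
                out_path = out_dir / f"{key}.csv"
                handle = stack.enter_context(
                    out_path.open("w", newline="", encoding="utf-8")
                )
                writer = csv.DictWriter(handle, fieldnames=reader.fieldnames)
                writer.writeheader()
                writers[key] = writer
            writers[key].writerow(row)
            counts[key] = counts.get(key, 0) + 1
    return counts
```

For example, `split_csv_by_column(Path("drillholes.csv"), "element", Path("by_element"))` would write one file per element, assuming the export has an `element` column. The sketch keeps one output file open per distinct value, so it suits columns with a modest number of categories rather than, say, unique sample identifiers.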
Geophysics and remote sensing
As with geochemistry, South Australia has a huge resource of open file geophysical datasets. For machine learning applications, data scientists aim to layer all these different datasets to indicate which features are present in the same geographic area. This can pose a challenge, as geophysical data is available in several different formats – raw, processed and gridded images – that may not have a consistent resolution and may not be ‘stitched’ together across the area of investigation.
A number of submissions tackled geophysics and remote sensing data and came up with solutions such as: a process and output datasets for combining and cleaning remote sensing data in QGIS, a free open-source GIS application; effectively combining raw and inverted magnetotelluric data into useful files; and merging geophysical images into a statewide image.
- Access all Data Prep Prize submissions on the Challenge Forum via Unearthed’s ExploreSA: The Gawler Challenge homepage.
- Read about the winners and their solutions on the Unearthed Blog.
Impacts for the exploration industry
Improved data accessibility for explorers
As highlighted in multiple Data Prep Prize submissions, statewide geochemistry is challenging to deliver and visualise in our South Australian Resources Information Gateway (SARIG). Currently, the geochemistry layer is too large to view at the state level and users are required to zoom to a smaller section on the map for display. In addition, no geochemistry data is included within the current spatial dataset, as multi-element interval data becomes complex to display.
The submissions have provided great insights on how SARIG can improve geochemistry display for explorers, so they can easily visualise and ultimately download relevant subsets of interest in a geologist-preferred format. The SARIG team has already started to build requirements to deliver a new series of geochemistry spatial datasets for explorers, so watch this space!
The Geological Survey also faces issues with the volume of geochemical data received, particularly concerning quality management. Several of the Data Prep Prize submissions identified some of the issues with the geochemistry data and effective ways to detect them. Not only can we use this information to help us fix the existing data, but we can also more readily identify the sources of error and develop systems and processes to manage these issues as future data is loaded, rather than trying to apply fixes later. This means users can have more confidence in the data we provide into the future.
Upskilling on modern data science techniques
There has been a drive to use modern analytical techniques such as machine learning on exploration data for some years. This is a developing field and, as such, many of the approaches may feel like a black box and not approachable for explorers. One of the benefits of the Data Prep Prize is that it provides tangible, clear processes to follow to understand how geological data can be used in machine learning. It also highlights how important it is to consider how our data is stored, structured and managed. In particular, it has shown us which things we need to be mindful of if we want to use our data in machine learning processes. This will enable explorers to try more with machine learning, and work more closely with data scientists with an improved understanding of how they can apply their skills to exploration.
- Hear from Geological Survey experts about the impacts in industry via this interview.
ExploreSA: The Gawler Challenge is a partnership between the Department for Energy and Mining and open innovation platform Unearthed. The competition showcases South Australia’s huge mineral resource opportunities and the quality data compiled by our world-recognised Geological Survey.
Using the department’s wealth of historical records, data and research, the competition offers the chance to combine geological expertise with new mathematical, machine learning and artificial intelligence techniques to accelerate mineral discovery.
– Holly Bridgwater, Peter Buxton and Christie Gerrard, July 2020