We have a team that contacts data holders and encourages them to donate their data to the AddressForAll project under free licenses. We always recommend releasing data into the public domain using the CC0 license, but we also accept data licensed under CC-BY and CC-BY-SA.
Our data are available in 3 treatment phases:
The data and licenses in the Digital-Guard project are strictly controlled and preserved exactly as received from the donors. This is the "raw data": not standardized, and in many different formats (CSV, Shapefile, GeoJSON, etc.). They are preserved for 20 years and can be downloaded during that time, exactly as we received them.
Because they come from diverse sources, the Preserved data need to be filtered and standardized. The AddressForAll project extracts a subset focused on addresses. The structure of this extract is standardized and published in GeoJSON format, via PostgreSQL, in git repositories.
The entire filtering and publishing process is open and reproducible, so anyone can audit it. At this stage the results do not go through the validation process, so the same address can be described and repeated by different sources, such as the municipality, the water company, and the logistics company.
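As a rough illustration of what standardizing heterogeneous sources into GeoJSON can look like, here is a minimal Python sketch. It is not the project's actual pipeline (which runs via PostgreSQL). The source names, the column names in `FIELD_MAPS`, and the output property names are all hypothetical, chosen only for the example.

```python
# Hypothetical mapping from each donor's column names to a common schema.
# Real donors will use different names; these are assumptions for illustration.
FIELD_MAPS = {
    "municipality": {"street": "via", "number": "hnum", "lon": "x", "lat": "y"},
    "water_company": {"street": "street_name", "number": "house_no",
                      "lon": "longitude", "lat": "latitude"},
}


def to_feature(source, record):
    """Convert one raw record into a GeoJSON Feature with a common structure."""
    m = FIELD_MAPS[source]
    return {
        "type": "Feature",
        # GeoJSON (RFC 7946) orders coordinates as [longitude, latitude].
        "geometry": {
            "type": "Point",
            "coordinates": [float(record[m["lon"]]), float(record[m["lat"]])],
        },
        "properties": {
            "street": record[m["street"]].strip(),
            "number": str(record[m["number"]]).strip(),
            "source": source,  # kept so the origin of each address stays auditable
        },
    }


def to_feature_collection(rows):
    """rows is an iterable of (source, record) pairs."""
    return {
        "type": "FeatureCollection",
        "features": [to_feature(source, record) for source, record in rows],
    }
```

Note that nothing is deduplicated here: if two sources describe the same address, the collection contains two features, which matches the behavior of this phase.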
Consolidation consists of statistically aggregating the information from various sources about the same address and its neighborhood, and applying validation algorithms. In the process, addresses recognized as duplicates are reduced to a single address, and invalid addresses are discarded.
We obtain both the reliability score of the original data and the most likely position of the address point. Street names receive terminology standardization, and house numbers can be optimized through averaging, repositioning, or interpolation. This database is used for our search and geocoding APIs (under construction).
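To make the idea of consolidation concrete, here is a deliberately simplified Python sketch. It is an assumption-laden toy, not the project's validation or scoring algorithm: duplicates are detected by exact match on a normalized street name and house number, the position is a plain average of the reported coordinates, and the "score" is simply the share of known sources that report the address. The record fields (`street`, `number`, `lon`, `lat`, `source`) are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def normalize_street(name):
    """Toy terminology standardization: upper-case and collapse whitespace."""
    return " ".join(name.upper().split())


def is_valid(rec):
    """Toy validation: non-empty street and number, coordinates within range."""
    return (
        bool(rec["street"].strip())
        and bool(str(rec["number"]).strip())
        and -180 <= rec["lon"] <= 180
        and -90 <= rec["lat"] <= 90
    )


def consolidate(records, total_sources):
    """Merge records describing the same address into one entry each."""
    groups = defaultdict(list)
    for rec in records:
        if not is_valid(rec):
            continue  # invalid addresses are discarded
        key = (normalize_street(rec["street"]), str(rec["number"]).strip())
        groups[key].append(rec)

    consolidated = []
    for (street, number), recs in groups.items():
        sources = {r["source"] for r in recs}
        consolidated.append({
            "street": street,
            "number": number,
            # Averaging is one simple way to estimate the position.
            "lon": mean(r["lon"] for r in recs),
            "lat": mean(r["lat"] for r in recs),
            # Assumed score: fraction of known sources that agree on this address.
            "score": len(sources) / total_sources,
        })
    return consolidated
```

A real consolidation step would also need fuzzy matching of street names, outlier handling before averaging, and interpolation for missing house numbers, none of which is attempted here.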