But if it is so simple, why do consumer apps, and even some commercial apps, return geocodes as much as a quarter mile off the actual location?
The client who asked us that question was struggling with data that appeared to be accurate, until it was time to deliver to a location that couldn’t be found!
- There were six decimal places in the coordinates.
- There were no flags from the app that the data was questionable.
- And no reason to believe the geocoded data was anything but accurate.
1. Not all of the addresses entered for geocoding were properly cleansed and validated. Bad data in equals bad data out. (While geocoding is NOT the same as Address Cleansing and Validation, our software does include an internal cleansing process to return the most accurate data possible. This process, however, is happening behind the scenes. If you want to update your dataset with the cleansed and validated address information, consider adding our Address Validation software to your solution.)
2. The geocoder they used relied on interpolated geocoding rather than point geocoding, and while that may be good enough for some business processes, it was disastrous for theirs.
Interpolated geocoding uses data from a street geographic information system where the street network is mapped within the coordinate space. Each street is assumed to have a certain number and type of addresses and the location is “mapped” in an assumed location along the street – that may or may not reflect its actual location.
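To make the idea concrete, here is a minimal sketch of linear interpolation along a single straight street segment. The function name, the address range, and the coordinates are all illustrative assumptions, not any vendor's implementation; real street networks have curves, uneven lot sizes, and odd/even sides that this ignores.

```python
def interpolate_address(house_number, range_start, range_end, start_point, end_point):
    """Estimate (lat, lon) for a house number on a straight street segment.

    Assumes addresses are spread evenly between range_start and range_end,
    which is exactly the assumption that can put the result far from the
    real building.
    """
    if range_start == range_end:
        fraction = 0.5  # no usable range: fall back to the segment midpoint
    else:
        fraction = (house_number - range_start) / (range_end - range_start)
    # Keep the estimate on the segment even if the number is out of range.
    fraction = min(max(fraction, 0.0), 1.0)
    lat = start_point[0] + fraction * (end_point[0] - start_point[0])
    lon = start_point[1] + fraction * (end_point[1] - start_point[1])
    return (lat, lon)


# Hypothetical segment: house numbers 100 to 200 between two made-up points.
estimate = interpolate_address(150, 100, 200, (35.000, -106.000), (35.002, -106.000))
```

The estimate lands at the middle of the segment because 150 is halfway through the assumed range, whether or not a building actually stands there.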
- Lacked coverage in their targeted areas
- Had outdated, inaccurate, or single-sourced datasets
- Or had weak interpretation and matching algorithms for tying pieces of their input to the underlying databases.
…So in some cases, it “fell back” to a higher level, such as postal code, to produce a geocode, without the client’s approval. He saw six decimal places and assumed it was highly accurate.
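A quick back-of-the-envelope calculation shows why six decimal places say nothing about accuracy. The sketch below assumes roughly 111,320 meters per degree of latitude (an approximation) and an example latitude of 35 degrees; the function name is ours, for illustration only.

```python
import math

METERS_PER_DEGREE_LAT = 111_320  # approximate length of one degree of latitude
METERS_PER_MILE = 1609.344


def decimal_place_resolution_m(places, latitude_deg=0.0):
    """Ground distance represented by one unit in the last decimal place."""
    step_degrees = 10 ** -places
    north_south = step_degrees * METERS_PER_DEGREE_LAT
    # Degrees of longitude shrink with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


ns_m, ew_m = decimal_place_resolution_m(6, latitude_deg=35.0)
quarter_mile_m = 0.25 * METERS_PER_MILE

print(f"Sixth decimal place: about {ns_m:.2f} m north-south, {ew_m:.2f} m east-west")
print(f"Quarter-mile error:  about {quarter_mile_m:.0f} m")
```

The sixth decimal place resolves to roughly a tenth of a meter, while a quarter-mile error is about 400 meters. The digits describe how finely the number is written, not how close it is to the real location.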
Just Need Batch Geocoding?
If you are looking for a cloud-based geocoder for batch geocoding files, then click here for MapMarker.
INTERPOLATED AND FALL BACK DATA
Geocoders “fall back” when they are unable to verify a location at a certain level of accuracy. For example, if a system cannot locate a specific house number, it will fall back to a street, and from a street to a ZIP code.
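The cascade can be sketched as a series of lookups, from most precise to least. Everything here is hypothetical: the lookup tables, the coordinates, and the `geocode_with_fallback` name are made up for illustration. The point is that the result carries the precision level it was matched at, so a fallback is never silent.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical reference data with made-up coordinates.
POINTS = {("123", "MAIN ST", "87101"): (35.0845, -106.6511)}
STREETS = {("MAIN ST", "87101"): (35.0850, -106.6500)}
POSTAL_CODES = {"87101": (35.0900, -106.6400)}


@dataclass
class GeocodeResult:
    lat: float
    lon: float
    precision: str  # "point", "street", or "postal_code"


def geocode_with_fallback(number, street, postal_code) -> Optional[GeocodeResult]:
    """Try the most precise level first, then fall back one level at a time."""
    attempts = [
        ("point", POINTS.get((number, street, postal_code))),
        ("street", STREETS.get((street, postal_code))),
        ("postal_code", POSTAL_CODES.get(postal_code)),
    ]
    for precision, coords in attempts:
        if coords is not None:
            return GeocodeResult(coords[0], coords[1], precision)
    return None  # nothing matched at any level


result = geocode_with_fallback("999", "MAIN ST", "87101")
print(result.precision)  # the house number is unknown, so this is "street"
```

A caller that checks `precision` can decide whether a street-level or postal-level answer is acceptable, instead of trusting the coordinates at face value.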
Do you know what level of accuracy is good enough for your application? Or what level of accuracy you are currently getting?
Location/Geoconfidence Code: Identifies the geographic precision of the coordinate for a matched address.
Our client couldn’t trust his data because he couldn’t trust his geocoder. Can you?
With clean and consistent data, it is possible to surface relevant business insights by understanding the relationships between people, places and things.
- User error is mitigated by running each address through on-the-fly correction before geocoding.
- It covers 251 countries, with 73 countries at full point address accuracy and 149 countries at street level accuracy. Many languages are supported, too.
- Underlying the results is the latest mapping data from TomTom, HERE, USPS, and more. The highest quality option, MLD, has data from NINE sources. These datasets are updated automatically to ensure you have the latest location information available.
- You can send single-line addresses or up to four address lines plus city, state and postal code. You can even geocode building names.
- It provides Match and Location Confidence Codes. (A Match Code tells you what the system had to do to get a match: which parts of the address matched and which didn’t. A Location Code identifies the precision of the geocode, such as point, street, or postal code.)
- In addition, you can control the match process through custom settings, or rely on Precisely’s battle-tested matching algorithms in the default settings. You can even have it return candidate addresses when an exact match can’t be found.
- And yes, it provides for full batch processing.
Systems designed for “consumers” are designed to give a “yes” answer, six decimal places of coordinates and a quality score, but the answers are not always right nor do they tell you what happened. Without a good result code/match code system, you are limited in being able to know when there are problems and how big they are. Those six decimal places may be for a different location than you think. These systems are okay for finding the closest pizza place, but are often not good enough for business use with thousands or millions of records.
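With a result code on every record, finding out how big the problem is becomes a simple audit. The sketch below assumes each geocoded record is a dictionary with a `precision` field holding a plain label such as "point" or "street". Those labels and the record layout are our own stand-ins, not the actual code values any product returns.

```python
from collections import Counter

ACCEPTABLE_PRECISION = {"point"}  # assumption: this workflow needs point-level results


def audit_precision(records, acceptable=ACCEPTABLE_PRECISION):
    """Count records per precision level and list those below the required level."""
    counts = Counter(record["precision"] for record in records)
    flagged = [record for record in records if record["precision"] not in acceptable]
    return counts, flagged


# Made-up batch output for illustration.
batch = [
    {"id": 1, "precision": "point"},
    {"id": 2, "precision": "street"},
    {"id": 3, "precision": "postal_code"},
    {"id": 4, "precision": "point"},
]

counts, flagged = audit_precision(batch)
print(dict(counts))
print([record["id"] for record in flagged])
```

Here half of the batch would be held back for review before a driver is sent out, even though every record has a full set of coordinates.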
Are you trying to implement Artificial Intelligence, Machine Learning, Single View, MDM (Master Data Management)? Or are you just trying to match items in Excel or Access?
Data Quality for location data is difficult. The Albuquerque list shows variations of just the city name. Imagine the issues with full addresses and the consequences of having combinations of problematic or ambiguous data in multiple fields. You may have addresses that cannot be reliably found, that are actually duplicates, or are very similar to other addresses.
No less than state of the art is required for every step in the process. You need consistent, validated, and standardized address and geocoding data PLUS…
- Advanced matching algorithms to analyze, parse and standardize your data.
- And software with machine learning capabilities to incorporate exceptions and provide flexible, easy to control data governance that makes the process of ensuring data quality as quick and painless as possible.
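As a toy illustration of the matching step, the sketch below maps misspelled city names onto a canonical list using Python's standard `difflib` module. This is a simple stand-in technique, not the matching algorithm used in any commercial product; the city list, the cutoff, and the `standardize_city` name are assumptions chosen for the example.

```python
from difflib import get_close_matches

# Hypothetical canonical list; a real one would come from reference data.
CANONICAL_CITIES = ["Albuquerque", "Alamogordo", "Las Cruces"]


def standardize_city(raw, canonical=CANONICAL_CITIES, cutoff=0.8):
    """Return the closest canonical city name, or None if nothing is close enough."""
    lookup = {name.lower(): name for name in canonical}
    # Normalize case and collapse stray whitespace before comparing.
    cleaned = " ".join(raw.strip().lower().split())
    matches = get_close_matches(cleaned, list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else None


for variant in ["  ALBEQUERQUE ", "Alburquerque", "ABQ"]:
    print(repr(variant), "->", standardize_city(variant))
```

Simple misspellings resolve to "Albuquerque", while an abbreviation like "ABQ" returns `None` and would need an exceptions table or a human decision. That gap is where more advanced matching and governance tooling earns its keep.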
Our geocoding performs address validation and correction as part of the geocoding process! If you only need address validation and correction, we have that, too. We also cover you when you need the full features of address validation and correction, including detailed diagnostic information about your addresses. All available in batch, interactive, SaaS or self-hosted.