Geocoding redefined: what’s new in Jawg’s geocoding API and data framework

Let's talk about how data is at the heart of geocoding, which sources Jawg uses for its API (and why), and how the project continuously evolves with the needs of the customer.

Photo by Christian Lue / Unsplash

In this article we’re going to talk about how data is at the heart of geocoding, which sources we use for our API (and why), and how the project continuously evolves with the needs of our customers.

Don't know what geocoding is? Check out our article: What is geocoding and what is it used for?

Data: the heart of geocoding

To run a geocoding search, we need data. We have several data sources that you have probably heard of while using our services: OpenStreetMap (OSM), OpenAddresses (OA), Geonames (GN) and Who's On First (WOF). We pull from different data sources because each of them has its own specialty.

Administrative data improvements

Our main source for places like cities, regions and countries is WOF. Its major advantage is its architecture. Unlike OSM, it provides a clear worldwide hierarchy between elements, which is why our layer system is entirely based on its hierarchy.

However, WOF data can sometimes be inaccurate, which has been an issue for some of our customers. We mainly use it to determine which administrative areas contain the points coming from all other sources. For example, the lack of precision in WOF has caused many countries to be mislabeled and/or their borders to be incorrectly traced.

We tried to reduce this inaccuracy by selecting data from much more precise sources like OSM. In this example, which displays the border between France and Switzerland, you can see some buildings in France that are located outside of the country according to WOF (left).

Original WOF data vs new OSM version

Generated administrative data

The available data is not always sufficient and we are always looking for solutions to improve our services. With this in mind, our R&D team explored the possibility of generating data corresponding to administrative areas.

We ran a proof of concept on postal codes in Greece and were able to generate full coverage for the country. Of course, it is not an official data source and may not be completely accurate, but since no such coverage data exists, it's still an improvement.

Here is an overview of what is now available in our services with the new Jawg data source (sources=jawg). We will try to generate more data for other countries where coverage would be useful but is missing.

Postal codes generated in Greece

OSM as new administrative data source

As stated previously, WOF has a better architecture than OSM for its administrative part. The architecture in OSM is a bit chaotic because the rules differ from country to country, so implementing them all would be long and tedious.

However, this has not stopped us: we have added to our services some OSM areas that were missing from WOF data. If you were used to searching by filtering on WOF data only, you can now update your configuration to include OSM as well, e.g. sources=osm,wof&layers=coarse.

In addition to this, we created a new layer named island. This layer includes all islands that do not already exist in another layer. Let's take a look at Hawaii's islands, as Hawaii is a US state as well as an island and a county.
In the state of Hawaii, we have Maui, which is both a county and an island. With the new layer, we are now able to query the island itself and all the other islands in Maui County.
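
The sources and layers filters mentioned above can be combined in a search request. Here is a minimal sketch of a URL builder, assuming the search endpoint used in the sample requests later in this article. The helper name buildSearchUrl, the YOUR_TOKEN placeholder and the use of island as a value for layers are illustrative assumptions, not confirmed API details.

```javascript
// Search endpoint as it appears in the sample requests of this article.
const SEARCH_ENDPOINT = 'https://api.jawg.io/places/v1/search';

// Hypothetical helper: builds a search URL restricted to some sources/layers.
// `accessToken` is a placeholder for your own Jawg access token.
function buildSearchUrl(text, { sources = [], layers = [], accessToken = '' } = {}) {
  const url = new URL(SEARCH_ENDPOINT);
  url.searchParams.set('text', text);
  if (sources.length > 0) url.searchParams.set('sources', sources.join(','));
  if (layers.length > 0) url.searchParams.set('layers', layers.join(','));
  url.searchParams.set('access-token', accessToken);
  // Note: URLSearchParams percent-encodes the commas (osm%2Cwof).
  return url.toString();
}

// Administrative areas from both OSM and WOF.
const coarseUrl = buildSearchUrl('Maui', {
  sources: ['osm', 'wof'],
  layers: ['coarse'],
  accessToken: 'YOUR_TOKEN',
});

// Assumption: the new layer is selected with layers=island.
const islandUrl = buildSearchUrl('Maui', {
  layers: ['island'],
  accessToken: 'YOUR_TOKEN',
});

console.log(coarseUrl);
console.log(islandUrl);
```

Building the query string with URLSearchParams rather than string concatenation keeps spaces and special characters in the search text correctly encoded.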

The API evolves with the needs of our customers

We already have a rich, highly configurable API, but that doesn't stop our customers from suggesting new features, nor us from improving the user experience. Let's take a look at some of these latest updates.

Get the source geometry

One of our customers needed to retrieve the geometry of our administrative areas, which was not available via our API. After a quick brainstorm, we developed and deployed the new feature for them within 48 hours.
To find out more about this use case, check out this blog post: Find your next travel destination with HomeExchange and Jawg (in French).

Want to use this new feature? Here is a sample request:

https://api.jawg.io/places/v1/place?geometries=source&ids=whosonfirst:localadmin:404428955&access-token=
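
The same request can be assembled in code. The sketch below only builds the URL from the sample above; the helper names parseGid and buildPlaceUrl and the YOUR_TOKEN placeholder are hypothetical. The source:layer:id shape of a gid is inferred from the identifiers shown in this article, so treat the validation as an assumption.

```javascript
// Place endpoint taken from the sample request above.
const PLACE_ENDPOINT = 'https://api.jawg.io/places/v1/place';

// Split a gid such as "whosonfirst:localadmin:404428955" into its parts.
// Assumption: gids follow the source:layer:id pattern seen in this article.
function parseGid(gid) {
  const parts = String(gid).split(':');
  if (parts.length < 3 || parts.some((part) => part === '')) {
    throw new Error(`Unexpected gid format: ${gid}`);
  }
  const [source, layer, ...rest] = parts;
  // Keep the id intact if it contains colons itself.
  return { source, layer, id: rest.join(':') };
}

// Hypothetical helper: URL that asks for the source geometry of one place.
function buildPlaceUrl(gid, accessToken) {
  parseGid(gid); // throws early on a malformed gid
  const url = new URL(PLACE_ENDPOINT);
  url.searchParams.set('geometries', 'source');
  url.searchParams.set('ids', gid); // colons are percent-encoded as %3A
  url.searchParams.set('access-token', accessToken);
  return url.toString();
}

const placeUrl = buildPlaceUrl('whosonfirst:localadmin:404428955', 'YOUR_TOKEN');
console.log(placeUrl);
```

The resulting URL can then be passed to fetch or any other HTTP client.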

Focus on specific countries

It was already possible to filter searches on particular countries. Now, it's also possible to add a focus on specific countries.
Your company is present all over the world but wants to highlight a specific number of countries? focus.country is the perfect option for you!

Try it out with our library places-js and some examples like San Antonio, Santa Maria, Santa Cruz, Oxford, Manchester...

With USA focus:
<input
  id="places-w-focus"
  class="places-js"
  placeholder="Search with focus on USA">
<script>
  new JawgPlaces.Input({
    input: '#places-w-focus',
    searchOnTyping: true,
    focus: { countries: 'USA' },
    clearCross: true,
  })
</script>

Get hidden objects

We automatically remove duplicated elements (those that may represent the same physical place) from our responses. This feature lets you see the similar places that were hidden.

In some use cases, those elements can be interesting, especially when we need the geometry. That's why we exposed them: you can access them via properties.addendum.dedupe on each feature.

{
  "type": "Feature",
  "geometry": { "type": "Point", "coordinates": [ 2.342865, 48.858705 ] },
  "properties": {
    "id": "101751119",
    "gid": "whosonfirst:locality:101751119",
    "layer": "locality",
    "source": "whosonfirst",
    "source_id": "101751119",
    "country_code": "FR",
    "name": "Paris",
    "accuracy": "centroid",
    "country": "France",
    "country_gid": "whosonfirst:country:85633147",
    "country_a": "FRA",
    "macroregion": "Île-De-France",
    "macroregion_gid": "whosonfirst:macroregion:404227465",
    "macroregion_a": "IF",
    "region": "Paris",
    "region_gid": "whosonfirst:region:85683497",
    "region_a": "VP",
    "localadmin": "Paris",
    "localadmin_gid": "whosonfirst:localadmin:1159322569",
    "locality": "Paris",
    "locality_gid": "whosonfirst:locality:101751119",
    "continent": "Europe",
    "continent_gid": "whosonfirst:continent:102191581",
    "label": "Paris, France",
    "addendum": {
      "dedupe": [
        {
          "gid": "geonames:macrocounty:2988506",
          "source": "geonames",
          "layer": "macrocounty",
          "id": "2988506"
        },
        {
          "gid": "whosonfirst:region:85683497",
          "source": "whosonfirst",
          "layer": "region",
          "id": "85683497"
        },
        {
          "gid": "geonames:locality:6455259",
          "source": "geonames",
          "layer": "locality",
          "id": "6455259"
        },
        {
          "gid": "geonames:region:2968815",
          "source": "geonames",
          "layer": "region",
          "id": "2968815"
        },
        {
          "gid": "geonames:locality:2988507",
          "source": "geonames",
          "layer": "locality",
          "id": "2988507"
        }
      ]
    }
  },
  "bbox": [ 2.224225, 48.815607, 2.469769, 48.902008 ]
}
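
To read these hidden objects on the client side, something like the following could work. It is a minimal sketch that relies only on the properties.addendum.dedupe shape shown above; the helper names hiddenDuplicates and countBySource are hypothetical, and the sample feature is a trimmed copy of the Paris response.

```javascript
// Trimmed copy of the Paris feature shown above (two dedupe entries kept).
const parisFeature = {
  type: 'Feature',
  properties: {
    gid: 'whosonfirst:locality:101751119',
    label: 'Paris, France',
    addendum: {
      dedupe: [
        { gid: 'geonames:macrocounty:2988506', source: 'geonames', layer: 'macrocounty', id: '2988506' },
        { gid: 'whosonfirst:region:85683497', source: 'whosonfirst', layer: 'region', id: '85683497' },
      ],
    },
  },
};

// Return the hidden duplicates of a feature, or an empty array when the
// feature has no addendum.dedupe entry.
function hiddenDuplicates(feature) {
  const dedupe = feature?.properties?.addendum?.dedupe;
  return Array.isArray(dedupe) ? dedupe : [];
}

// Count the hidden duplicates per data source.
function countBySource(feature) {
  const counts = {};
  for (const { source } of hiddenDuplicates(feature)) {
    counts[source] = (counts[source] ?? 0) + 1;
  }
  return counts;
}

const hiddenGids = hiddenDuplicates(parisFeature).map((entry) => entry.gid);
console.log(hiddenGids);
console.log(countBySource(parisFeature));
```

Each gid collected this way can be used in a follow-up request, for example to retrieve the geometry of the hidden element.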

Find the result in your language

We had some limitations on autocomplete searches in foreign languages. The typical example was the search for Parijs, Frankrijk in Dutch, which only worked when writing Parijs without the country. We found a solution that removes this limitation, so you can now run autocomplete searches in your language.

Differentiate identical names

This happens regularly in France, where several cities share the same name and the only way to tell them apart is to know which region (or Département) they are located in. However, the Département is not an element that is generally displayed in addresses. So we created new rules to avoid these situations, without you needing to change your integration.

First example: search for the city named Bagneux.

https://api.jawg.io/places/v1/search?text=Bagneux&access-token=...
Old version
0) Bagneux, France
1) Bagneux, France
2) Bagneux, France
3) Bagneux, France
4) Bagneux, France
5) Bagneux, France
6) Bagneux, France
New version
0) Bagneux, Hauts-de-Seine, France
1) Bagneux, Marne, France
2) Bagneux, Allier, France
3) Bagneux, Indre, France
4) Bagneux, Meurthe-et-Moselle, France
5) Bagneux, Aisne, France
6) Bagneux, Deux-Sèvres, France

Second example: search for Starbucks in New York.

https://api.jawg.io/places/v1/search?text=Starbucks, new york&access-token=...
Old version
0) Starbucks, New York, NY, USA
1) Starbucks, New York, NY, USA
2) Starbucks, New York, NY, USA
3) Starbucks, New York, NY, USA
4) Starbucks, New York, NY, USA
5) Starbucks, New York, NY, USA
6) Starbucks, New York, NY, USA
7) Starbucks, New York, NY, USA
8) Starbucks, New York, NY, USA
9) Starbucks, New York, NY, USA
New version
0) Starbucks, 5th Avenue, New York, NY, USA
1) Starbucks, Broadway, New York, NY, USA
2) Starbucks, 6th Avenue, New York, NY, USA
3) Starbucks, 3rd Avenue, New York, NY, USA
4) Starbucks, West 181st Street, New York, NY, USA
5) Starbucks, 7th Avenue, New York, NY, USA
6) Starbucks, East 93rd Street, New York, NY, USA
7) Starbucks, 1st Avenue, New York, NY, USA
8) Starbucks, East 90th Street, New York, NY, USA
9) Starbucks, West 145th Street, New York, NY, USA
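
Numbered lists like the ones above can be produced from a search response by reading the label of each feature. This is a minimal sketch that assumes the response is a GeoJSON FeatureCollection whose features carry properties.label, as in the Paris example earlier. The helper name formatLabels is hypothetical, and the sample object is hand-written from the labels in this article, not actual API output.

```javascript
// Hand-written sample mimicking a search response (assumed shape).
const sampleResponse = {
  type: 'FeatureCollection',
  features: [
    { type: 'Feature', properties: { label: 'Bagneux, Hauts-de-Seine, France' } },
    { type: 'Feature', properties: { label: 'Bagneux, Marne, France' } },
    { type: 'Feature', properties: { label: 'Bagneux, Allier, France' } },
  ],
};

// Turn a response into numbered label lines, skipping features without a label.
function formatLabels(featureCollection) {
  const features = Array.isArray(featureCollection?.features)
    ? featureCollection.features
    : [];
  return features
    .map((feature) => feature?.properties?.label)
    .filter((label) => typeof label === 'string')
    .map((label, index) => `${index}) ${label}`);
}

// True when every label in the list is distinct.
function labelsAreUnique(featureCollection) {
  const lines = formatLabels(featureCollection).map((line) => line.replace(/^\d+\) /, ''));
  return new Set(lines).size === lines.length;
}

const labelLines = formatLabels(sampleResponse);
console.log(labelLines.join('\n'));
console.log(labelsAreUnique(sampleResponse));
```

A check such as labelsAreUnique can help confirm that a list of suggestions no longer shows indistinguishable entries to the user.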

Introducing Jawg Places JS

We are excited to release our new library Jawg Places JS, a fast and easy way to turn any HTML input into a search bar with autocomplete. This new library can also be used as a plugin for Leaflet, MapLibre GL JS and Mapbox GL JS.

As you can see, we've made quite a lot of updates in recent months! But we're always looking to improve our services, and we rely on our users for their input.

So if you have any feedback or special requests, please do not hesitate to share them with us.