Natural Language Processing is All Talk

I’ve talked about Natural Language Processing (NLP) before and how it is beginning to change the BIM/GIS space. But NLP is just part of the whole solution to change how analysis is run. I look at this as three parts:

  1. Natural Language Processing
  2. Curated Datasets
  3. Dynamic Computation

NLP is understanding ontologies more than anything else. When I ask how “big” something is, what do I mean by this? Let’s abstract this away a bit.

How big is Jupiter?

One could look at this a couple of ways. What is the mass of Jupiter? What is the diameter of Jupiter? What is the volume of Jupiter? Being able to figure out the intent of the question is critical to having everything else work. We all remember Siri and Alexa when they first started. They were pretty good at figuring out the weather, but once you got out of those canned queries all bets were off. It is the same with using NLP with BIM or GIS. How long is something? Easy! Show me all mixed-use commercial zoned space near my project? Hard. Do we know what mixed-use commercial zoning is? Do we know where my project is? The second question is hard because we need to know more about the ontology of our domain. How do we learn about our domain? We need lots of data to teach the NLP, and then we run that data through a Machine Learning (ML) tool such as Amazon Comprehend to figure out its context and structure it in a way that lets the NLP understand our intents.
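
The ambiguity of “big” can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ONTOLOGY` table and the `candidate_properties` function are hypothetical names invented here, and a real system would learn these mappings from data rather than hard-code them.

```python
# Hypothetical mini-ontology: maps a vague word to the measurable
# properties it could refer to. The entries are illustrative only.
ONTOLOGY = {
    "big": ["diameter", "mass", "volume"],
    "long": ["length"],
}


def candidate_properties(question: str) -> list[str]:
    """Return the measurable properties a question might be asking about."""
    words = question.lower().strip("?").split()
    found = []
    for word in words:
        for prop in ONTOLOGY.get(word, []):
            if prop not in found:
                found.append(prop)
    return found


jupiter = candidate_properties("How big is Jupiter?")
zoning = candidate_properties(
    "Show me all mixed-use commercial zoned space near my project?"
)
```

Here `jupiter` holds three candidate readings that still need to be narrowed down, while `zoning` comes back empty because this toy ontology knows nothing about zoning or about what “near” means. That gap is the domain knowledge the paragraph above is asking for.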

As discussed above, curated data to figure out ontology is important, but it’s also important to help users run analysis without having to know what data they need. Imagine using Siri, but having to provide your own weather service to find out the current temperature. While I have many friends who would love to do this, most people just don’t care. Keep it simple and tell me how warm it is. Same with this knowledge engine we’re talking about. I want to know the zoning for New York City? It should be available and ready to use. Not only that, it should be curated so it is normalized across geographies. Asking a question in New York or Boston (while there are unique rules in every city) shouldn’t be difficult. Having this data isn’t as sexy as the NLP, but it sure as heck makes that NLP so much better and smarter. Plus, who wants to worry about whether they have the latest zoning for a city? It should always be available and on demand.

Lastly, once we understand the context of the natural language query and have data to analyze, we need to run the algorithms on the question. This is what we typically think of as GIS. Rather than manually running that buffer and identity, we use AI/ML to figure out the intent of the user using the ontology and grab the data for the analysis from the curated data repository. This used to be something very special: you needed some monolithic tool such as ArcGIS or MapInfo to accomplish the dynamic computation. But today these algorithms are open and available to anyone. Natural language lets us figure out what the user is asking and then run the correct analysis, even if they call it something different from what a GIS person might.

The “Alexa-like” natural language demos where the computer talks to users are fun but, much like the AR examples we see these days, not really useful in real-world use. Who wants their computer talking to them in an open office environment? But giving users who don’t know anything about structured GIS analysis the ability to perform complex GIS analysis is the game changer. It isn’t about how many seats of some GIS program are on everyone’s desk but how easily these NLP/AI/ML systems can be integrated into existing workflows or websites. That’s where I see 2019 going: GIS everywhere.
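
To make the three parts a little more concrete, here is a minimal sketch of the “zoned space near my project” question once the intent has already been resolved. Everything in it is an assumption for illustration: the `Parcel` records, the zoning labels, the coordinates, and the 500 meter meaning of “near” are made up, and a simple great-circle distance test stands in for a true polygon buffer.

```python
import math
from dataclasses import dataclass


@dataclass
class Parcel:
    name: str
    zoning: str
    lat: float
    lon: float


# Hypothetical curated dataset; zoning labels and coordinates are made up.
CURATED_PARCELS = [
    Parcel("A", "mixed-use commercial", 40.7130, -74.0060),
    Parcel("B", "residential", 40.7135, -74.0065),
    Parcel("C", "mixed-use commercial", 40.8000, -73.9500),
]

# Hypothetical ontology entry: what "near" means, in meters.
NEAR_METERS = 500.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def zoned_space_near(project_lat, project_lon, zoning, radius_m=NEAR_METERS):
    """Names of parcels with the given zoning within radius_m of the project."""
    return [
        p.name
        for p in CURATED_PARCELS
        if p.zoning == zoning
        and haversine_m(project_lat, project_lon, p.lat, p.lon) <= radius_m
    ]


nearby = zoned_space_near(40.7128, -74.0060, "mixed-use commercial")
```

With these made-up records, `nearby` contains only parcel A: parcel B is close but zoned residential, and parcel C has the right zoning but sits several kilometers away. The user never had to say “buffer” or supply a zoning layer.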


Underground Digital Twins

We all have used 3D maps. From Google Earth, to Google and Apple Maps, to Esri, Mapbox and others, we are very used to seeing 3D buildings rendered on our devices. But think of the iceberg analogy…

Below is a bigger deal than above…

Icebergs are so much bigger than they appear. This is the case with the built environment. Look out your window and you see a complex city. But what you don’t see is what is below. We know that underground assets are hit on average every 60 seconds in the United States, which costs over $1B in losses. What we can’t see is costing cities and developers money that could be better spent on making these cities sustainable.

But getting a handle on this issue is not easy. The ownership of these assets is often private, and those companies do not wish to share anything about what is underground for business or security reasons. Plus, even if sharing were something that interested people, there isn’t a good unified underground model to place these assets in (we have many of these available for above ground assets). But there seems to be some progress in this area. Writes Geoff Zeiss:

At the December Open Geospatial Consortium (OGC) Energy Summit at EPRI in Charlotte, Josh Lieberman of the OGC presented an overview of the progress of OGC’s underground information initiative, with the appropriate acronym MUDDI, which is intended to provide an open standards-based way to share information about the below ground.

The part that gets my attention is that the MUDDI model is intended to build on and be compatible with many existing reference models. This is a big deal because many of the stakeholders in underground assets have already invested time and money into supporting these. As Geoff writes:

MUDDI is not an attempt to replace existing standards, but to build on and augment existing standards to create a unified model supporting multiple perspectives.

I’m totally on board with this. Creating a new model that handles all these edge cases will only result in a model nobody wants. As we work toward integrating underground models into Digital Twin platforms, MUDDI will be a huge deal. It’s not ready yet by any means, but because it supports existing standards everyone can get involved immediately and start working on creating underground digital twins.


BIM vs. Digital Twin

The thing with BIM is that BIM models are VERY complicated. That’s just the nature of BIM. People talk about digital twins all the time, and BIM (as an extension of CAD) is probably one of the first representations of a digital twin. BIM, though, by its nature isn’t an “as-built.” It is just a picture of what the real world object should be, whereas a digital twin is a digital copy of an existing asset. Now the best way to start a digital twin is to import a BIM model, but there are some areas you need to be aware of before doing so.

  1. A BIM model might not be an as-built. As I said above, BIM is what something should be, not what it ends up being. During construction, changes are always made to the building, and in doing so, the BIM model ceases to be a digital twin. Just importing a BIM model without field verification can result in your digital twin not genuinely being a digital twin.

  2. What detail do you need in your digital twin? A BIM model might have millions of entities making up even a simple asset, such as a window frame that is unique and requires high accuracy. This is very important in the construction phase where even a millimeter off can cause problems, but for a digital twin, that detail is not needed. This is where BIM and digital twins diverge; the BIM model is the engineering representation of something, whereas a digital twin is just the digital replica. There is no reason why you couldn’t import such an elaborate window frame of course, but throughout a whole building or even a city, these extra details get lost in the level of detail (LOD). The key here is knowing what your LOD is and how you want to view it. There is much going on in the 3D space where you can use LOD to display the elaborate window frame above, yet still be performant where needed.

  3. Aftermarket features are generally part of a digital twin. BIM models are idealized in that they only show what was spec’d out. Digital twins need to show all those modifications that were made after the building was turned over to the owner. Doors removed, walls put up, windows boarded over. These things all need to be reflected in your digital twin. Just importing a BIM model that doesn’t address these changes means that when you go to link up your digital twin to IoT or other services, there is no one-to-one relationship. Preparation work on that BIM model before ingestion into a digital twin helps immeasurably.
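
The one-to-one check described in the list above can be sketched as a simple comparison of identifiers. This is a rough illustration, not how any particular platform does it: the `reconcile` function, the element IDs, and the sensor names are all hypothetical, and it assumes you have a field-verified list of elements to compare the BIM export against.

```python
def reconcile(bim_ids, field_ids, sensor_links):
    """Compare a BIM export against a field-verified element list.

    bim_ids:      set of element IDs present in the BIM model
    field_ids:    set of element IDs confirmed to exist on site
    sensor_links: dict mapping sensor name -> element ID it is attached to
    """
    return {
        # In the design model but not found on site (e.g. a removed door).
        "in_bim_only": sorted(bim_ids - field_ids),
        # Found on site but never modeled (e.g. a wall added later).
        "in_field_only": sorted(field_ids - bim_ids),
        # Sensors pointing at elements that do not exist on site.
        "orphaned_sensors": sorted(
            sensor for sensor, element in sensor_links.items()
            if element not in field_ids
        ),
    }


# Hypothetical example data.
report = reconcile(
    bim_ids={"door-101", "wall-12", "window-7"},
    field_ids={"wall-12", "window-7", "wall-99"},
    sensor_links={"temp-1": "window-7", "door-contact-3": "door-101"},
)
```

In this made-up case the report flags `door-101` as modeled but missing, `wall-99` as built but never modeled, and `door-contact-3` as a sensor with nothing to attach to. Those are the kinds of mismatches worth resolving before the model is ingested.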

It is easy to want to jump into creating digital twins of your buildings, but it is critical to make sure that before you do so you’ve reviewed your files to ensure that they are as-built and a twin of the real world asset.