Capturing As-Built Changes to Make Better Digital Twins

This post originally appeared on LinkedIn.
Augmented Reality view of Apple Park

Digital Twins are easy. All you have to do is create a 3D object. Some triangles and you’re done. A BIM model is practically a Digital Twin. The problem is that those twins are usually created from data that isn’t “as-built”. What you end up with is a digital object that ISN’T a twin. How can you connect your IoT and other assets to a 3D object that isn’t representative of the real world?

I talked a little bit last time about how to programmatically create digital twins from satellite and other imagery. Of course, a good constellation can make these twins very up to date and accurate, but it can miss the details needed for a good twin, and it sure as heck can’t get inside a building to update any changes there. What we’re looking for here is a feedback loop, from design to construction to digital twin.

There are a lot of companies that can help with this process, so I won’t go into detail there. What is needed is the acknowledgment that investment is required to keep those digital twins updated: not only is the building being delivered, but also an accurate BIM model that can be used as a digital twin. Construction firms usually don’t get the money to update these BIM models, so the models are used as a reference at the beginning, but change orders rarely get pushed back into the original BIM models provided by the architects. That said, there are many methods that can be used to close this loop.

Construction methods cause changes from the architectural plans

Companies such as Pixel8 that I talked about last week can use high-resolution imagery and drones to create a point cloud that can be used to verify not only that changes are being made to specification but also to flag where deviations have been made from the BIM model. This is big because humans can only see so much on a building, and with a large model it is virtually impossible for people to detect change. But using machine learning and point clouds, change detection is actually very simple and can highlight where accepted modifications have been made to the architectural drawings or where things have gone wrong.
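At its simplest, this kind of change detection is a nearest-neighbour comparison between the as-built scan and the design geometry. Here is a toy sketch in Python, assuming both point clouds are already registered in the same coordinate frame; the 20 cm tolerance is purely illustrative, not an industry value.

```python
import numpy as np

def find_deviations(design_pts, asbuilt_pts, tolerance=0.20):
    """Return as-built points farther than `tolerance` (metres) from any design point."""
    # brute-force nearest-neighbour distance; fine for a sketch,
    # real pipelines would use a spatial index over millions of points
    diffs = asbuilt_pts[:, None, :] - design_pts[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1).min(axis=1)
    return asbuilt_pts[dists > tolerance], dists

design = np.array([[0.0, 0, 0], [1, 0, 0], [2, 0, 0]])
scan = np.array([[0.05, 0, 0], [1.0, 0.5, 0]])  # second point deviates 0.5 m
flagged, dists = find_deviations(design, scan)
print(len(flagged))  # 1
```

The flagged points are then the candidates for review: either an accepted change order to push back into the BIM model, or a construction error.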

A lidar machine creating a digital twin.
Focusing on getting those changes into the original BIM models helps your digital twins

The key point here is that using ML to discover and update digital twins at scale is critically important, but just as important is the ability to use ML to update digital twins as they are built, rather than relying on something that came from paper space.


Photo by Patrick Schneider on Unsplash
Photo by Elmarie van Rooyen on Unsplash
Photo by Scott Blake on Unsplash


Scaling Digital Twins

This article originally appeared on LinkedIn.

Let’s face it, digital twins make sense and there is no arguing their purpose. In the urban landscape, though, it is very difficult to scale digital twins beyond a section of a city. At Cityzenith we attempted to overcome this need to have 3D buildings all over the world by using a third-party service that basically extruded OSM building footprints where they existed. You see this in the Microsoft Flight Simulator worlds that people have been sharing: it looks pretty good from a distance, but up close it becomes clear that building footprints are a horrible way to represent a digital twin of the built environment because they are so inaccurate. Not every building is a rectangle, and it becomes impossible to perform any analysis on them because they can be off by upwards of 300% from the real-world structure.
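To see why extruded footprints fall short, here is the technique in miniature: a 2D footprint plus a single guessed height yields a flat-roofed prism, which is exactly why such models can be wildly wrong for anything that isn’t a box. The footprint coordinates and height below are invented for illustration.

```python
def extrude_footprint(footprint, height):
    """footprint: list of (x, y) vertices; returns vertices and wall quads of a prism."""
    bottom = [(x, y, 0.0) for x, y in footprint]
    top = [(x, y, height) for x, y in footprint]
    n = len(footprint)
    # each wall is a quad spanning one footprint edge, bottom to top
    walls = [(i, (i + 1) % n, n + (i + 1) % n, n + i) for i in range(n)]
    return bottom + top, walls

verts, walls = extrude_footprint([(0, 0), (10, 0), (10, 20), (0, 20)], height=30.0)
print(len(verts), len(walls))  # 8 4
```

Every roofline, setback, and courtyard collapses into that single prism, so any analysis downstream inherits the error.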

Microsoft Flight Simulator created worldwide digital twins at a very rough scale.

How do you scale this problem of creating 3D buildings for EVERYWHERE? Even Google was unable to do this: they tried to get people to create accurate 3D buildings with SketchUp, but that failed, and they tossed the product over to Trimble, where it has gotten back to its roots in the AEC space. If Google can’t do it, who can?

Vricon, which was a joint venture between Maxar and Saab but was recently fully absorbed by Maxar, gives a picture of how this can be done: identifying buildings, extracting their shapes, draping imagery over them, and then continuing to monitor change over the years as additions, renovations, and even rooftop changes are identified. There is no other way I can see that we can have worldwide digital twins other than using satellite imagery.

Vricon is uniquely positioned to create on-demand Digital Twins worldwide.

Companies such as Pixel8 also play a part in this. I’ve already talked about how this can be accomplished on my blog; I encourage you to take a quick read. Combining satellite digital twins to cover the world with products such as Pixel8 creates the highly detailed ground truth that is needed in urban areas. In the end, you get an up-to-date, highly accurate 3D model that actually allows detailed analysis of impacts from new buildings or other disruptive changes in cities.

Hyper-accurate point clouds from imagery, hand-held or via drone.

But the only way to scale out a complete digital twin of the world is through satellite imagery. Maxar and others are already using ML to find buildings and discover whether they have changed over time. Coupled with the technology that Vricon brings inside Maxar, I can see them really jump-starting a service of worldwide digital twins. Imagine being able to bring accurate building models into your analysis or products that are not only hyper-accurate compared to extruded footprints but are updated regularly based on the satellite imagery collected.

That sounds like the perfect world, Digital Twins as a Service.


Open Environments and Digital Twins

The GIS world has no idea how hard it is to work with data in the digital twin/BIM world. Most GIS formats are open, or at worst readable enough to import into a closed system. But in the digital twin/BIM space there are too many closed data sets, which makes it very hard to work with the data. The hoops one must jump through to import a Revit model are legendary, and mostly come down to how you get your data into IFC without giving up all the intelligence. At Cityzenith, we were able to work with tons of open formats, but dealing with Revit and other closed formats was so difficult that it required a team in India to handle the conversions.

All of the above is maddening because if there is one thing a digital twin should do, it is talk with as many other systems as possible: IoT messages, GIS datasets, APIs galore, and good old-fashioned CAD systems. That’s why open data formats are best, those that are understood and can be extended in any way someone needs. One of the biggest formats that we worked with was glTF. It is widely supported these days, but it really isn’t a great format for BIM models or other digital twin layers because it is more of a visual format than a data storage model. Think of it as similar to a JPEG: great for final products, but not something you want to work with for your production data.

IFC, which I mentioned before, is basically an open BIM standard. IFC is actually a great format for BIM, but companies such as Autodesk don’t do a great job supporting it, so it becomes more of an interchange file, except where governments require its use. I also dislike the format because it is unwieldy, but it does a great job of interoperability and is well supported by many platforms.

IFC and glTF are great, but they harken back to older format structures; they don’t take advantage of modern cloud-based systems. I’ve been looking at DTDL (Digital Twins Definition Language) from Microsoft. What I like about DTDL is that it is based on JSON-LD, so many of those IoT services you are already working with can take advantage of it. Microsoft’s Digital Twins platform was slow to take off, but many companies, including Bentley Systems, are leveraging it to give their customers the cloud-based open platform they all want. Plus you can use services such as Azure Functions (a very underrated service, IMO) to work with your data once it is in there.
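For a flavor of what DTDL looks like, here is a minimal DTDL v2-style interface sketched as a plain Python dict. The `dtmi:example:Room;1` identifier and the Room interface are hypothetical; the authoritative shape of these documents is Microsoft’s DTDL specification.

```python
import json

# A minimal DTDL v2-style interface for an illustrative "Room" twin model.
room_interface = {
    "@context": "dtmi:dtdl:context;2",
    "@id": "dtmi:example:Room;1",  # hypothetical model id
    "@type": "Interface",
    "displayName": "Room",
    "contents": [
        {"@type": "Telemetry", "name": "temperature", "schema": "double"},
        {"@type": "Property", "name": "floorNumber", "schema": "integer"},
    ],
}

# Because it is just JSON-LD, it round-trips through any JSON tooling.
print(json.dumps(room_interface, indent=2).splitlines()[0])
```

That JSON-native quality is the whole appeal: the same document your IoT pipeline produces can describe your twin model, with no binary format in between.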

Azure Digital Twins

The magic of digital twins comes when you can connect messaging (IoT) services to your digital models. That’s the holy grail: the real world connected to the digital world. Sadly, most BIM and digital twin systems aren’t open enough and require custom conversion work or custom coding to enable even simple integration with SAP, Salesforce, or MAXIMO. That’s why these newer formats, based mostly on JSON, seem to fit the bill, and we will see exponential growth in their use.
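The integration itself can be conceptually tiny once both sides speak JSON: a message arrives from a device, a lookup maps the device to a twin, and the twin’s properties are updated. The message shape, id map, and in-memory twin store below are all invented for illustration; real platforms do this through their own ingestion APIs.

```python
# Toy in-memory twin store and device-to-twin mapping (both hypothetical).
twins = {"twin-lobby": {"temperature": None}}
device_to_twin = {"sensor-42": "twin-lobby"}

def apply_telemetry(message):
    """Route one IoT telemetry message onto the matching twin's properties."""
    twin_id = device_to_twin[message["deviceId"]]
    twins[twin_id].update(message["payload"])

apply_telemetry({"deviceId": "sensor-42", "payload": {"temperature": 21.5}})
print(twins["twin-lobby"]["temperature"])  # 21.5
```

The hard part in practice is not this routing step but the closed formats on either side of it, which is exactly the argument for JSON-based models.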


Natural Language Processing is All Talk

I’ve talked about Natural Language Processing (NLP) before and how it is beginning to change the BIM/GIS space. But NLP is just part of the whole solution to change how analysis is run. I look at this as three parts:

  1. Natural Language Processing
  2. Curated Datasets
  3. Dynamic Computation

NLP is about understanding ontologies more than anything else. When I ask how “big” something is, what do I mean by that? Let’s abstract this away a bit.

How big is Jupiter?

One could look at this a couple of ways. What is the mass of Jupiter? What is the diameter of Jupiter? What is the volume of Jupiter? Being able to figure out the intent of the question is critical to having everything else work. We all remember Siri and Alexa when they first started: they were pretty good at figuring out the weather, but once you got outside those canned queries, all bets were off. It is the same with using NLP with BIM or GIS. How long is something? Easy. Show me all mixed-use commercial zoned space near my project? Hard. Do we know what mixed-use commercial zoning is? Do we know where my project is? That’s because we need to know more about the ontology of our domain. How do we learn about our domain? We need lots of data to teach the NLP, and then we run it through a machine learning (ML) tool such as Amazon Comprehend to figure out the context of the data and structure it in a way that lets the NLP understand our intents.
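A toy sketch makes the ontology problem concrete: the same word “big” has to be mapped onto one of several measurable senses. The keyword table here is hand-built and purely illustrative; real systems learn these mappings from large amounts of labeled data.

```python
# Hand-built (illustrative) ontology mapping cue words to a sense of "big".
ONTOLOGY = {
    "big": {
        "mass": ["heavy", "weigh", "mass"],
        "diameter": ["wide", "across", "diameter"],
        "volume": ["volume", "hold", "capacity"],
    },
}

def resolve_intent(question, default="diameter"):
    """Pick the sense of 'big' a question implies, falling back to a default."""
    q = question.lower()
    for sense, cues in ONTOLOGY["big"].items():
        if any(cue in q for cue in cues):
            return sense
    return default  # an ambiguous "how big" gets the default sense

print(resolve_intent("How big is Jupiter?"))            # diameter (the default)
print(resolve_intent("How big, by mass, is Jupiter?"))  # mass
```

Swap “big” for “near” or “mixed-use commercial” and the same problem appears in GIS and BIM queries, only with far messier domain vocabularies.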

As discussed above, curated data for figuring out ontology is important, but it’s also important for helping users run analysis without having to understand what data they need. Imagine using Siri, but needing to provide your own weather service to find out the current temperature. While I have many friends who would love to do this, most people just don’t care: keep it simple and tell me how warm it is. Same with the knowledge engine we’re talking about. I want to know zoning for New York City? It should be available and ready to use. Not only that, it should be curated so it is normalized across geographies: asking a question in New York or Boston (while there are unique rules in every city) shouldn’t be difficult. Having this data isn’t as sexy as the NLP, but it sure as heck makes that NLP so much better and smarter. Plus, who wants to worry about whether they have the latest zoning for a city? It should always be available and on demand.

Lastly, once we understand the context of the natural language query and have data to analyze, we need to run the algorithms on the question. This is what we typically think of as GIS. Rather than manually running that buffer and identity, we use AI/ML to figure out the intent of the user via the ontology and grab the data for the analysis from the curated data repository. This used to be something very special: you needed a monolithic tool such as ArcGIS or MapInfo to accomplish the dynamic computation. But today these algorithms are open and available to anyone. Natural language lets us figure out what the user is asking and then run the correct analysis, even if they call it something different from what a GIS person might.
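The “buffer then select” operation behind a question like “show me mixed-use space near my project” is itself simple. Here it is reduced to its bare bones in pure Python, with point parcels and a circular buffer; the coordinates and zoning tags are invented, and a real system would use proper polygon geometry from an open library.

```python
import math

# Hypothetical parcel records; real ones would carry polygon geometry.
parcels = [
    {"id": "p1", "zoning": "mixed-use", "xy": (2.0, 1.0)},
    {"id": "p2", "zoning": "residential", "xy": (8.0, 8.0)},
    {"id": "p3", "zoning": "mixed-use", "xy": (9.0, 9.0)},
]

def near_project(project_xy, radius, zoning):
    """Parcels of the requested zoning inside a circular buffer around the project."""
    return [p["id"] for p in parcels
            if p["zoning"] == zoning
            and math.dist(p["xy"], project_xy) <= radius]

print(near_project((0.0, 0.0), radius=3.0, zoning="mixed-use"))  # ['p1']
```

The point is that none of this computation is special anymore; the value is in the NLP layer that decides to run it, and the curated data it runs against.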
The “Alexa-like” natural language demos where the computer talks to users are fun, but much like the AR examples we see these days, not really useful in real-world contexts. Who wants their computer talking to them in an open office environment? But giving users who don’t know anything about structured GIS analysis the ability to perform complex GIS analysis is the game changer. It isn’t about how many seats of some GIS program are on everyone’s desk but how easily these NLP/AI/ML systems can be integrated into existing workflows or websites. That’s where I see 2019 going: GIS everywhere.


Underground Digital Twins

We all have used 3D maps. From Google Earth, to Google and Apple Maps, to Esri, Mapbox and others, we are very used to seeing 3D buildings rendered on our devices. But think of the iceberg analogy…

Below is a bigger deal than above…

Icebergs are so much bigger than they appear, and the same is true of the built environment. Look out your window and you see a complex city, but what you don’t see is what lies below. Underground assets are hit on average every 60 seconds in the United States, costing over $1B in losses. What we can’t see is costing cities and developers money that could be better spent on making those cities sustainable.

But getting a handle on this issue is not easy. Ownership of these assets is often private, and those companies do not wish to share anything about what is underground, for business or security reasons. And even if sharing interested people, there isn’t a good unified underground model to place the assets in (we have many such models for above-ground assets). But there seems to be some progress in this area. Geoff Zeiss writes:

At the December Open Geospatial Consortium (OGC) Energy Summit at EPRI in Charlotte, Josh Lieberman of the OGC presented an overview of the progress of OGC’s underground information initiative, with the appropriate acronym MUDDI, which is intended to provide an open standards-based way to share information about the below ground.

The part that gets my attention is that the MUDDI model is intended to build on, and be compatible with, many existing reference models. This is a big deal because many of the stakeholders in underground assets have already invested time and money in supporting those models. As Geoff writes:

MUDDI is not an attempt to replace existing standards, but to build on and augment existing standards to create a unified model supporting multiple perspectives.

I’m totally on board with this. Creating a new model that handles all these edge cases will only result in a model nobody wants. As we work toward integrating underground models into Digital Twin platforms, MUDDI will be a huge deal. It’s not ready yet by any means, but because it supports existing standards, everyone can get involved immediately and start working on creating underground digital twins.


BIM vs. Digital Twin

The thing with BIM is that BIM models are VERY complicated. That’s just the nature of BIM. People talk about digital twins all the time, and BIM (as an extension of CAD) is probably one of the first representations of a digital twin. BIM, though, by its nature isn’t an “as-built.” It is just a picture of what the real-world object should be, whereas a digital twin is a digital copy of an existing asset. Now, the best way to start a digital twin is to import a BIM model, but there are some areas you need to be aware of before doing so.

  1. A BIM model might not be an as-built. As I said above, BIM is what something should be, not what it ends up being. During construction, changes are always made to the building, and in doing so, the BIM model ceases to be a digital twin. Just importing a BIM model without field verification can result in your digital twin not genuinely being a digital twin.

  2. What detail do you need in your digital twin? A BIM model might have millions of entities making up even a simple asset, such as a window frame that is unique and requires high accuracy. This is very important in the construction phase, where even a millimeter off can cause problems, but for a digital twin that detail is not needed. This is where BIM and digital twins diverge: the BIM model is the engineering representation of something, whereas a digital twin is just the digital replica. There is no reason why you couldn’t import such an elaborate window frame, of course, but throughout a whole building or even a city, these extra details get lost in the LOD. The key here is knowing what your LOD is and how you want to view it. There is much going on in the 3D space where you can use LOD to display the elaborate window frame above, yet still be performant where needed.
  3. Aftermarket features are generally part of a digital twin. BIM models are idealized in that they only show what was spec’d out. Digital twins need to show all those modifications that were made after the building was turned over to the owner. Doors removed, walls put up, windows boarded over. These things all need to be reflected in your digital twin. Just importing a BIM model that doesn’t address these changes means that when you go to link up your digital twin to IoT or other services, there is no one-to-one relationship. Preparation work of that BIM model before ingestion into a digital twin helps immeasurably.
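The LOD point above can be sketched as a simple distance-based switch: the elaborate window frame exists in the model, but only gets rendered when the viewer is close enough to care. The distance bands here are illustrative and not taken from any standard.

```python
def pick_lod(distance_m):
    """Choose a level of detail based on viewer distance (illustrative bands)."""
    if distance_m < 50:
        return "LOD3"  # full detail: window frames, mullions
    if distance_m < 500:
        return "LOD2"  # roof shape, major facade features
    return "LOD1"      # extruded shell only

print(pick_lod(10), pick_lod(200), pick_lod(2000))  # LOD3 LOD2 LOD1
```

At city scale, almost everything sits in the cheapest band, which is why importing every engineering detail into a twin buys you little.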

It is easy to want to jump straight into creating digital twins of your buildings, but before you do, it is critical to review your files to ensure that they are as-built and truly a twin of the real-world asset.


Dreamit UrbanTech and Cityzenith

We all get busy from time to time and the past year has been so busy that it feels like a blur for me. That’s a good thing though, so much good going on. At Cityzenith we’ve hired a great team to migrate our platform from cesium.js to Unity. More on this later in the post.

Now on to the news we were able to share last month: Cityzenith was selected for Dreamit UrbanTech, a startup accelerator founded in partnership with Strategic Property Partners (a joint venture by Jeff Vinik and Bill Gates). SPP is leading the $3B redevelopment of the Tampa Bay waterfront, one of the largest real estate projects in the United States and a rare opportunity to build a smart, connected city in an existing urban zone.

I’ll be in Tampa all next week meeting with many of the companies who are going to use Cityzenith to build the Tampa Bay Waterfront Project. We are excited to use the project to scale Cityzenith into a tool that can be part of the workflows for all architecture, engineering, and construction companies who want to integrate BIM, GIS, CAD, IoT, and web services and visualize them in a worldwide 3D tool that gives them the ability to plan for the future and the impacts of current development. Should be an amazing time.

In the coming weeks I will be diving into why Cesium is actually pretty damn awesome even though we didn’t select it for Cityzenith Smart World 2.0, why we’re really liking glTF 2.0 and the Mapbox Unity SDK, how we’re using AWS IoT and AWS Lambda for some great serverless file conversions, and what we’re diving into most deeply: City Information Modeling, or CIM.


BIM Database Long Tail

In the GIS world, the database part of GIS files is where the power is. I would wager the average GIS analyst spends more time editing, calculating, and transforming the GIS database than they do editing the points/lines/polygons. The first thing I do when working with GIS files is open the table to see what I have (or don’t have) for data.

One of the key aspects of BIM is the database. In the hands of an architect, the database takes a back seat, but tools such as Revit make sure that everything placed has detailed information about it stored in a database. It isn’t just Revit, though: IFC, CityGML, and other formats treat the database as an important part of a BIM model. But when we share BIM models, the focus is always on the exterior of the model and not the data behind it.

Aqua Tower, Chicago, IL inside Cityzenith SmartWorld

One thing I’ve focused on here at Cityzenith since I joined as CTO is pulling the power out of BIM models and exposing it to users. As someone who is used to complex GIS databases, I’m amazed at how much great data is locked in these BIM formats, unable to be used by planners, engineers, and citizens. I talked last week about adding a command line to Cityzenith so that users can get inside datasets, and getting access to BIM databases is no exception.

That’s why we’re going to expose BIM databases the same way we expose SQL Server, Esri ArcGIS, and other database formats. When you drag and drop BIM models with databases attached into Cityzenith, you will be prompted to transform them with our transformation engine. BIM has always been treated as a special format, locked up and kept only in the hands of special users. That’s going to change: we are going to break BIM out of its protected silo and expose the longest of long tails in the spatial world, the BIM database.
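The core of that kind of transformation is flattening deeply nested BIM element properties into the flat rows that GIS-style tables and query tools expect. Here is a minimal sketch; the element structure is a made-up miniature of the property sets that formats like IFC carry, not any real schema.

```python
def flatten(element, prefix=""):
    """Flatten nested dicts into dotted column names, e.g. psets.thermal.u_value."""
    rows = {}
    for key, value in element.items():
        name = key if not prefix else f"{prefix}.{key}"
        if isinstance(value, dict):
            rows.update(flatten(value, name))
        else:
            rows[name] = value
    return rows

# Hypothetical window element with nested property sets.
window = {"id": "W-101", "type": "IfcWindow",
          "psets": {"thermal": {"u_value": 1.4}, "fire": {"rating": "E30"}}}
print(flatten(window)["psets.thermal.u_value"])  # 1.4
```

Once every element becomes a row like this, the BIM database stops being special and starts looking like any other table you can filter, join, and map.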

I’ve always said Spatial isn’t Special and we can also say BIM isn’t Special.


BIM File Format Fun

If you ever thought it was difficult to work with GIS file formats, you haven’t explored the world of BIM. With Cityzenith I’m getting back into converting BIM, and it’s making me nostalgic for Esri’s File Geodatabase or LIDAR formats. One of our core features at Cityzenith is drag-and-drop BIM model import. We’re supporting COLLADA, FBX, IFC, OBJ, and CityGML for now, but it seems every time I talk with a user they want to import some very proprietary BIM format. I won’t even get into the issues with importing Autodesk’s Revit, but look how hard it is even for Safe FME.

The great thing about most of these 3D formats (beyond Revit) is that they are relatively open and there are many tools for reading and writing them. But the sheer number of formats means that you’ve got to plan for them in your software workflows. IFC and FBX do a great job of preserving a lot of the BIM data on export, while formats such as COLLADA basically drop everything except the structure. Most of the time this isn’t an issue, because BIM files contain so much data that really isn’t important for a planning and development tool such as Cityzenith, but we do want to grab much of it so our users can perform analysis on building information.

We also want to grab the floors of the building and basic structure beyond just the building shell. When we’re integrating IoT feeds into buildings, having the floors or rooms is critically important. IFC is both the hope and the failure of openBIM: in an attempt to be everything to everyone, you end up with a bloated format, but one that will address all your needs. Being able to programmatically pull out the floors and rooms of a BIM model requires an ontology for us to work with, and IFC sticks to one that we can work with.
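With a stable ontology, pulling out storeys and spaces reduces to filtering entities by type: IFC’s spatial hierarchy names these `IfcBuildingStorey` and `IfcSpace`. The in-memory model below is a toy stand-in for a real parsed IFC file, just to show the shape of the query.

```python
# Toy stand-in for a parsed IFC model: a flat list of typed entities.
model = [
    {"type": "IfcBuildingStorey", "name": "Level 1"},
    {"type": "IfcSpace", "name": "Room 101"},
    {"type": "IfcWall", "name": "W-1"},
    {"type": "IfcBuildingStorey", "name": "Level 2"},
]

def by_type(entities, ifc_type):
    """Names of all entities of a given IFC entity type."""
    return [e["name"] for e in entities if e["type"] == ifc_type]

print(by_type(model, "IfcBuildingStorey"))  # ['Level 1', 'Level 2']
```

That fixed vocabulary is exactly what makes the query repeatable across models from different authors, bloated format or not.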

But as any consultant will tell you, each user (client) has their own unique ontology that you have to work with. Safe Software has been dealing with this for years and has done an amazing job of handling the little idiosyncrasies that enter not only file formats but the models that humans create in them. It’s been really fun getting back into BIM and BIM file formats full time. BIM to GIS (or GIS to BIM) is hard, but that challenge, and making it simple and repeatable for all users, is going to be very exciting.