
Human and Robot Interaction for Dynamic Updating of Building Information Models


Project Team

Kincho Law, Max Ferguson

Research Overview

Observed Problem:

Having an accessible and constantly updated digital model of a facility or construction site could lead to significant improvements in safety, energy efficiency, and worker productivity. While recent advances in machine learning have made such a system feasible, the interactions between people and an automated system remain important for real-time modeling of a facility. It is unclear how human operators and automated systems can best collaborate when updating a 3D model or controlling machinery in the building space.

Primary Research Objective:

We will extend our Object R-CNN framework for dynamically updating building information models using computer vision. This framework allows people and automated systems to update the model of a facility. A web-based interface to observe and correct changes to the model will be developed so that an accurate digital model can be maintained. In addition, we will develop algorithms that allow humans and automated systems to jointly control mobile robots in the building space.

Potential Value to CIFE Members and Practice:

  • This research further advances the ability to automatically construct building information models from pointcloud data.

  • This research introduces the concept of dynamic building models, which enables building information modelling to be used for applications in robotics, automation and safety management.

Research provides relevant insights for:

Owners, designers, construction, operators

Industry and Academic Partners


Research Updates & Progress Reports

Research this quarter focused on developing a communication platform to allow human operators, mobile robots, and machine learning systems to communicate and collaborate using a 3D model of the building. In addition, we demonstrated that a mobile robot can use this platform to safely move around rooms in a large facility.

Communication Platform

For safe and efficient robot operation, it is essential that humans and mobile robots can freely communicate about building geometry, robot objectives, and potential hazards. However, in most cases, communication between human operators and mobile robots is still very limited. We have developed a high-performance communication platform that allows multiple agents to query and manipulate a shared representation of the building state. The platform supports many datatypes, including constructive solid geometry, polygon meshes, floor plans, pointclouds, timeseries, images, robot commands, and generic events. The utility of the platform is demonstrated through two example applications: a web-based user interface and an autonomous robot navigation system.

Safe Mobile Robot Navigation System

Many modern robot systems still lack the high-level navigation proficiencies that humans exhibit naturally, such as avoiding crowded spaces. We have developed a robot navigation system that is suited to complex and uncertain environments, such as modern facilities. A rigid-body simulator is used to simulate the movement of multiple mobile robots in a large-scale indoor space, and a reinforcement learning algorithm gradually learns a control policy for each mobile robot from the simulated behavior. This approach, referred to as Safe Neural Control (SNC), performs exceptionally well on several robot navigation tasks, learning human-like policies for navigation and collision avoidance. In addition, SNC learns many safe navigation behaviors, such as avoiding crowded spaces, reducing speed near walls, and avoiding stairs. Finally, by creating a real-time digital replica of the building, we demonstrate that SNC can also be used to control robots in a real facility.
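
The simulate-then-learn loop described above can be sketched with a toy policy-gradient (REINFORCE) example. The one-dimensional corridor environment, reward values, and all names below are illustrative stand-ins, not the project's actual simulator or the SNC implementation:

```python
import math
import random

# Toy corridor: states 0..10; the goal is at state 10, a fall hazard at state 0.
N_STATES, GOAL, HAZARD = 11, 10, 0
theta = [[0.0, 0.0] for _ in range(N_STATES)]  # per-state logits for actions {left, right}

def policy(s):
    """Softmax over the two action logits for state s."""
    a, b = theta[s]
    m = max(a, b)
    ea, eb = math.exp(a - m), math.exp(b - m)
    return [ea / (ea + eb), eb / (ea + eb)]

def run_episode():
    """Roll out one simulated episode from the middle of the corridor."""
    s, traj = 5, []
    for _ in range(50):
        p = policy(s)
        a = 0 if random.random() < p[0] else 1          # 0 -> step left, 1 -> step right
        s2 = s + (1 if a == 1 else -1)
        r = 1.0 if s2 == GOAL else (-1.0 if s2 == HAZARD else -0.01)
        traj.append((s, a, r))
        s = s2
        if s in (GOAL, HAZARD):
            break
    return traj

def train(episodes=2000, lr=0.1, gamma=0.95):
    for _ in range(episodes):
        traj, G = run_episode(), 0.0
        for s, a, r in reversed(traj):                  # REINFORCE: return-weighted
            G = r + gamma * G                           # log-probability ascent
            p = policy(s)
            for i in range(2):
                theta[s][i] += lr * G * ((1.0 if i == a else 0.0) - p[i])

random.seed(0)
train()
print(round(policy(5)[1], 3))  # probability of stepping toward the goal after training
```

After training, the learned policy strongly prefers stepping toward the goal and away from the hazard, mirroring (in miniature) how simulated experience produces a safe control policy.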

Detailed Research Overview & Progress Updates - 3/31/2020

There have been two key outcomes from the research project thus far: The first outcome is the completion of a communication platform, which allows human operators, mobile robots, and machine learning systems to communicate and collaborate using a 3D model of the building. The second outcome is a mobile robot control system that uses semantic-rich geometric information from the communication platform to improve mobile robot navigation. In addition, we demonstrated that a real mobile robot can use this platform to safely move around rooms in a large facility.

Digital model of test environment; TurtleBot in the test environment

Communication Platform for Human-Robot Interaction

For safe and efficient robot operation, it is essential that humans and mobile robots can freely communicate about building geometry, robot objectives, and potential hazards. However, in most cases, communication between human operators and mobile robots is still very limited. We have developed a high-performance communication platform that allows multiple agents to query and manipulate a shared representation of the building state. The platform supports many datatypes, including constructive solid geometry, polygon meshes, floor plans, pointclouds, and robot control messages. The platform also supports nonlinear coordinate frame transforms, allowing agents to communicate effectively regardless of the coordinate system they are using. Human operators can manipulate the 3D model using a web-based graphical user interface.
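
The coordinate-frame support described above can be illustrated with a minimal sketch: each agent registers a (possibly nonlinear) transform into a shared world frame, and the platform composes transforms on demand. The `FrameRegistry` class and its methods are hypothetical illustrations, not the platform's actual API:

```python
class FrameRegistry:
    """Toy registry letting agents exchange geometry across coordinate frames.
    Transforms are arbitrary callables, so nonlinear mappings are allowed."""

    def __init__(self):
        identity = lambda p: p
        self._to_world = {"world": identity}
        self._from_world = {"world": identity}

    def register(self, frame, to_world, from_world):
        self._to_world[frame] = to_world
        self._from_world[frame] = from_world

    def convert(self, point, src, dst):
        # Route every conversion through the shared world frame.
        return self._from_world[dst](self._to_world[src](point))

reg = FrameRegistry()
# A robot-local frame translated by (2, 3) relative to the world frame:
reg.register("robot1",
             lambda p: (p[0] + 2.0, p[1] + 3.0),
             lambda p: (p[0] - 2.0, p[1] - 3.0))
print(reg.convert((1.0, 1.0), "robot1", "world"))  # (3.0, 4.0)
```

Routing through a single shared frame means any pair of agents can interoperate after each registers only one transform, rather than one per peer.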

The communication platform was used to solve a collaborative mobile robot navigation scenario. In this scenario, each mobile robot publishes pointcloud data to the communication platform. The aggregated pointcloud data is used to identify obstacles that could potentially block the mobile robot trajectories. The mobile robots are immediately notified whenever a new object is detected, allowing them to replan their trajectory.
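
The notification scenario above can be sketched as a tiny in-memory publish/subscribe loop. The `Platform` class, topic name, and the naive distance-based obstacle test are all illustrative assumptions, not the real platform:

```python
from collections import defaultdict

class Platform:
    """Toy stand-in for the shared-state platform: robots publish pointclouds,
    and subscribers are notified when a new point blocks a planned path."""

    def __init__(self):
        self.subs = defaultdict(list)
        self.cloud = []                    # aggregated (x, y) pointcloud

    def subscribe(self, topic, callback):
        self.subs[topic].append(callback)

    def publish_pointcloud(self, points, robot_path):
        self.cloud.extend(points)
        # Naive check: any new point within 0.5 m (Manhattan) of a waypoint.
        blocking = [p for p in points
                    if any(abs(p[0] - w[0]) + abs(p[1] - w[1]) < 0.5
                           for w in robot_path)]
        if blocking:
            for cb in self.subs["obstacle_detected"]:
                cb(blocking)

events = []
platform = Platform()
platform.subscribe("obstacle_detected", events.append)  # a robot would replan here
platform.publish_pointcloud([(5.0, 5.0)], robot_path=[(5.2, 5.1), (6.0, 6.0)])
print(events)  # [[(5.0, 5.0)]]
```

In the real scenario the callback would trigger trajectory replanning; here it simply records the notification.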

A live version of the communication platform is publicly available, and the source code has also been released.

3D building model; occupancy grid map

Stochastic Neural Control: Using Spatial Knowledge to Make Better Navigation Decisions

Many modern robot systems still lack the high-level navigation proficiencies that humans exhibit naturally, such as avoiding fall hazards or crowded spaces. In addition, most modern robotic systems are unable to use semantic-rich geometry data (like BIM models) to inform their decisions. We have developed a robot navigation system, called Stochastic Neural Control (SNC), that is suited to complex and uncertain environments, such as modern facilities. SNC uses a stochastic policy gradient algorithm for local control and a modified probabilistic roadmap planner for global motion planning. In SNC, each mobile robot control decision is conditioned on observations from the robot sensors as well as pointcloud data, allowing the robot to operate safely within geometrically complex environments. SNC is tested on several challenging navigation tasks and learns advanced policies for navigation, collision avoidance, and fall prevention. Finally, we present a strategy for transferring SNC from a simulated environment to a real robot, and empirically show that it exhibits good navigation policies when controlling a real mobile robot.
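
The global-planning half of SNC uses a modified probabilistic roadmap (PRM) planner; a minimal, unmodified PRM can be sketched as follows. The square world, single disc obstacle, and all parameters are illustrative, not the project's configuration:

```python
import heapq
import math
import random

def collision_free(a, b, obstacle, radius, steps=20):
    """Check the straight segment a->b against one disc obstacle by dense sampling."""
    for i in range(steps + 1):
        t = i / steps
        x = a[0] + t * (b[0] - a[0])
        y = a[1] + t * (b[1] - a[1])
        if math.hypot(x - obstacle[0], y - obstacle[1]) < radius:
            return False
    return True

def prm_plan(start, goal, obstacle, radius, n_samples=300, connect_r=3.0, seed=1):
    """Sample a roadmap, connect nearby collision-free pairs, then run Dijkstra
    from start (node 0) to goal (node 1). Returns the path length, or None."""
    rng = random.Random(seed)
    nodes = [start, goal] + [(rng.uniform(0, 10), rng.uniform(0, 10))
                             for _ in range(n_samples)]
    edges = {i: [] for i in range(len(nodes))}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            d = math.dist(nodes[i], nodes[j])
            if d < connect_r and collision_free(nodes[i], nodes[j], obstacle, radius):
                edges[i].append((j, d))
                edges[j].append((i, d))
    dist, pq = {0: 0.0}, [(0.0, 0)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == 1:
            return d                       # reached the goal node
        if d > dist.get(u, math.inf):
            continue
        for v, w in edges[u]:
            if d + w < dist.get(v, math.inf):
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return None                            # roadmap is disconnected

length = prm_plan((0.5, 0.5), (9.5, 9.5), obstacle=(5.0, 5.0), radius=2.0)
print(length is not None)
```

Because the straight line between start and goal passes through the obstacle, any returned path length exceeds the straight-line distance of about 12.73.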

Simulated navigational trials; Physical robot trials

Detailed Research Overview & Progress Updates - 12/6/2019

Overview & Observed Problem

Automation in construction, maintenance, and facility management promises to bring significant productivity gains to a number of industries. During the past year, we have shown that mobile robots can automatically collect information from a work site environment using RGB-D cameras and lidar [1, 2], potentially enabling significant advances in automation, safety, and remote project management. Nevertheless, state-of-the-art computer vision algorithms are not yet able to handle the complexity and diversity of objects that may exist in a facility or a construction site. For this reason, we believe that a combination of human supervision and modern machine learning techniques is required to maintain an accurate dynamic model of the facility or construction site. Through human supervision, the capabilities of a machine learning system will continuously and progressively improve, eventually leading to a system that can handle the geometric complexity and diversity of a real construction site.

For robots to operate in dynamic and unstructured environments such as construction sites or medical facilities, they must be able to infer or obtain a spatiotemporal, semantic-rich model of the environment. Such a model may include static information about the facility, such as the positions of doors and walls. To facilitate safe and efficient automation, the model should also contain dynamic information about the environment, such as the positions of other robots in the facility. The positions of movable objects, like furniture or safety cones, might also be of interest. Our research will focus on how human supervision can be integrated with our existing computer vision algorithms, allowing continuous learning and improved fault tolerance.

Maintaining a real-time digital building model can be useful for managers and automated systems throughout the entire building lifetime. During construction, the model can be used to enable greater automation, as well as providing a real-time source of information for construction managers and related professionals. In the operations phase, the spatiotemporal information can be used to ensure that automated processes are operating in a safe and efficient manner. By incorporating human supervision, the capabilities of the automated data-capture system can evolve and improve with time, making it useful throughout the building lifecycle. Based on the above, real-time building information can influence the following aspects of a facility: (a) buildability, by facilitating automated construction processes; (b) operability, by enabling the safer operation and more efficient management of mobile robots in the facility; and (c) sustainability, by automating the collection of data for decision making related to improving productivity during construction and efficiency in facility operation. Additionally, real-time semantic information about the building can be used for controlling and inspecting productivity, quality, and safety, which alleviates risks and opens new pathways for owners, operators, designers, and builders.

Theoretical & Practical Points of Departure

The proposed research project builds on the success of our previous CIFE project, titled “A Framework for Updating Building Information Models with Mobile Robots”. In that project, we developed an algorithm for updating building information models using RGB-D images [1]. Objects such as safety cones and trash cans are automatically detected in RGB-D images and added to a geometric model of the environment. The dynamically generated model is exposed to human operators and automated systems through both a web-based user interface (UI) and an application programming interface (API). The system is also capable of extracting object attributes, such as the color, size, and material of each object. One notable limitation of the system is that it does not provide a way for a human operator to rectify incorrect object predictions or train the model to detect new objects. While this does not limit the quality of the research outcome, it does significantly limit the applicability of the system to real-world problems.

The concept of active learning systems is currently popular in the machine learning field [13], [14]. Researchers are interested in developing systems which continue to improve at a given task, either with or without human supervision. A similar approach has been applied by many major corporations to develop commercial systems which constantly improve, such as web search, product recommendations and advertising [15]. However, due to the increased technical complexity of computer vision and spatial modelling, there has been little investigation into the use of active learning in BIM.
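
The core of such an active-learning loop is choosing which predictions to send to a human for labeling; a common strategy is uncertainty sampling. The object names and confidence scores below are invented for illustration:

```python
def least_confident(predictions, k=2):
    """Return the k items whose top-class probability is lowest."""
    return sorted(predictions, key=predictions.get)[:k]

# Hypothetical per-object confidences from a vision model:
scores = {"chair": 0.95, "safety_cone": 0.55, "trash_can": 0.88, "sign": 0.60}
queue = least_confident(scores)
print(queue)  # ['safety_cone', 'sign'] -- routed to a human for labeling
```

Labeling only the least-confident predictions concentrates scarce human effort where the model is most likely to be wrong, which is what allows the system to keep improving over time.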

Research Methods & Work Plan

This project will build on the theoretical framework for dynamic building information models outlined in our previous project. The previous framework and accompanying implementation provided a method for updating building information models using streaming data from a mobile robot. We will introduce two new concepts to this framework: supervision and observation. Supervision is the idea that a human operator or an automated system observes changes to the geometric model and provides corrections to the underlying predictions. Observation is the idea that a person or an automated system watches the geometric model and uses the real-time data for the purpose of decision-making, automation, or improved safety.
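
The supervision concept can be sketched as a correction event that overrides a machine prediction in the shared model. The object schema, identifiers, and operator name below are hypothetical illustrations:

```python
# Shared model state with one machine-generated prediction (all values invented):
model = {"obj-17": {"label": "trash_can", "source": "vision", "confidence": 0.62}}

def apply_correction(model, object_id, corrected_label, operator):
    """Overwrite a prediction with an operator-supplied label."""
    model[object_id].update(label=corrected_label,
                            source=f"operator:{operator}",
                            confidence=1.0)   # human labels are treated as ground truth

apply_correction(model, "obj-17", "safety_cone", operator="operator1")
print(model["obj-17"]["label"])  # safety_cone
```

Recording the correction's source alongside the label also gives the learning system a stream of human-verified examples to retrain on.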

The proposed works can be organized into three main categories: (1) improving our computer vision algorithm for automatically identifying common construction site objects; (2) exploring how people and automated systems can interact with, modify, and utilize a dynamic geometric model; and (3) validating the proposed framework with field tests. Each category is now described in more detail:

  1. Extending the Object R-CNN algorithm to synchronize machine observations with a semantic-rich geometric model of the environment. In this stage, we will extend our Object R-CNN algorithm to update a geometric model based on observations from an RGB-D camera. In contrast to our previous approach, which only added objects to the model, we will now focus on updating, editing, and removing objects from the model. Depending on feedback from industry partners and the Technical Advisory Committee, we will focus on detecting common construction site objects, such as safety cones and signs, or common building objects such as furniture.
  2. Exploring how people and automated systems can interact with the real-time, updated geometric model. The computer vision algorithm will undoubtedly make errors when generating a semantic-rich model from point cloud data. We will explore how an operator can interact with the automatically generated model and, where necessary, correct misclassifications. Additionally, we will explore how automated systems can use the dynamic model for path planning and collision avoidance (for robotic applications).
  3. Validating the proposed framework using field tests. To validate the proposed framework, we will conduct a number of real-world tests. For demonstrative purposes, we will use our mobile robot to maintain and publish a dynamic model of the Y2E2 building for the latter half of the project. We will also explore how our algorithms and platform can be connected to existing data collection systems to provide real-time site information.
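
The add/update/remove behavior targeted in step 1 amounts to reconciling new detections against the existing model. A toy version follows, assuming every scan observes the whole space and using a simple nearest-match rule; both are simplifying assumptions for illustration, not Object R-CNN itself:

```python
import math

def reconcile(model, detections, match_dist=0.5):
    """model: {id: (label, (x, y))}; detections: [(label, (x, y))].
    Matched objects are updated in place, unmatched detections are added,
    and objects with no supporting detection are removed."""
    updated, matched = {}, set()
    next_id = max([0] + [int(i) for i in model]) + 1
    for label, pos in detections:
        hit = next((i for i, (l, p) in model.items()
                    if l == label and i not in matched
                    and math.dist(p, pos) < match_dist), None)
        if hit is not None:
            matched.add(hit)
            updated[hit] = (label, pos)           # update: refresh the object's pose
        else:
            updated[str(next_id)] = (label, pos)  # add: a new object enters the model
            next_id += 1
    return updated                                # remove: unobserved objects are dropped

model = {"1": ("cone", (2.0, 2.0)), "2": ("sign", (8.0, 1.0))}
detections = [("cone", (2.1, 2.0)), ("chair", (5.0, 5.0))]
print(reconcile(model, detections))
# {'1': ('cone', (2.1, 2.0)), '3': ('chair', (5.0, 5.0))} -- the unseen sign is removed
```

A real system would only remove objects inside the sensor's current field of view; dropping everything unobserved is the simplification that keeps this sketch short.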

Expected Contributions to Practice

The biggest anticipated outcome of this research is a novel and elegant dynamic building information framework that will better facilitate bidirectional communication between mobile robots and building information models. The main components of the framework are shown in Figure 2. This framework will be a conceptual extension of the traditional BIM ontology, designed to better support automation in the construction site and modern facility. By incorporating human supervision, we believe that the resulting framework and algorithmic implementations will be beneficial to a wide range of stakeholders, from AEC to robotics. Additionally, we expect that this contribution will be of interest to parties in both industry and academia. The framework will provide a platform for future research in real-time building information modelling. Looking further ahead, the framework represents a step towards increased automation in a modern facility or construction site.

If successful, the secondary outcome will be a computer vision algorithm capable of automatically updating a geometric model, based on streaming data from a mobile robot or fixed camera. An interface will be developed to allow a human operator to interact with the computer vision predictions and make corrections where necessary.

Figure 2. Flow of information from mobile robots to the building information model. Research will focus on how computer vision algorithms and human supervision can be used to maintain a detailed and dynamic model of the built environment.


The proposed work aims to contribute to productivity gains during the construction and operation phases of the building lifecycle, and hence will be of interest to stakeholders in construction, facility management, operations, and maintenance. This proposal is prepared partially based on our conversations with CIFE member companies (Hilti, Glodon, Bouygues Construction) as well as companies such as Einsite and Doxel. We will continue to seek expertise, feedback, guidance, and collaborations from these companies. We believe that the development of supervised computer vision algorithms will be highly beneficial to many existing and emerging AEC technology companies. We also expect that the research will be particularly applicable to stakeholders who are interested in using robots to automate construction, operations, or maintenance. We aim to showcase our algorithms and framework using data collected from the field through collaborations with CIFE members and other companies (such as Einsite).

Additionally, the research team has successfully demonstrated machine learning in the energy, manufacturing, and IoT domains and has received best research paper awards from ASCE, ASME, and IEEE conferences on the subject. Working with the National Institute of Standards and Technology (NIST), the research team has also been actively involved in developing standard representations of machine learning models. Specifically, the Gaussian Process Regression model has been accepted in PMML Version 4.3, supported by the Data Mining Group (DMG). The research team is currently working with NIST, Software AG, and IBM towards a standard representation for deep learning (CNN) models. Direct interactions with companies and standards organizations should provide the research team ample opportunities to work with a broad spectrum of industries.


  • M. Ferguson and K. H. Law, “Communication Abstractions for Semantic Robot Interaction”, In Preparation
  • M. Ferguson and K. H. Law, "Stochastic Neural Control: Using Raw Pointcloud Data and Building Information Models", Submitted to the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA, October 25–29, 2020.

The following publications are from the previous CIFE project:


  1. M. Ferguson and K. Law, “A 2D-3D Object Detection System for Updating Building Information Models with Mobile Robots,” in IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa Village, HI, USA, 2019.
  2. M. Ferguson, S. Jeong, and K. H. Law, “Worksite Object Characterization for Automatically Updating Building Information Models,” in ASCE International Conference on Computing in Civil Engineering (i3CE), Atlanta, Georgia, USA, 2019.
  3. J. Seo, S. Han, S. Lee, and H. Kim, “Computer vision techniques for construction safety and health monitoring,” Advanced Engineering Informatics, vol. 29, no. 2, pp. 239–251, 2015.
  4. Z. Zhu, M.-W. Park, C. Koch, M. Soltani, A. Hammad, and K. Davari, “Predicting movements of onsite workers and mobile equipment for enhancing construction site safety,” Automation in Construction, vol. 68, pp. 95–101, 2016.
  5. Y. Ham, K. K. Han, J. J. Lin, and M. Golparvar-Fard, “Visual monitoring of civil infrastructure systems via camera-equipped Unmanned Aerial Vehicles (UAVs): a review of related works,” Visualization in Engineering, vol. 4, no. 1, p. 1, 2016.
  6. H. Kim, H. Kim, Y. W. Hong, and H. Byun, “Detecting Construction Equipment Using a Region-Based Fully Convolutional Network and Transfer Learning,” Journal of Computing in Civil Engineering, vol. 32, no. 2, p. 04017082, 2017.
  7. R. B. Rusu, Z. C. Marton, N. Blodow, M. Dolha, and M. Beetz, “Towards 3D point cloud based object maps for household environments,” Robotics and Autonomous Systems, vol. 56, no. 11, pp. 927–941, 2008.
  8. R. B. Rusu, N. Blodow, Z. C. Marton, and M. Beetz, “Close-range scene segmentation and reconstruction of 3D point cloud maps for mobile manipulation in domestic environments,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2009, pp. 1–6.
  9. I. Armeni et al., “3D semantic parsing of large-scale indoor spaces,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 2016, pp. 1534–1543.
  10. G. Lee et al., “A BIM- and sensor-based tower crane navigation system for blind lifts,” Automation in Construction, vol. 26, pp. 1–10, 2012.
  11. S. Hwang and L. Y. Liu, “BIM for integration of automated real-time project control systems,” in Construction Research Congress 2010: Innovation for Reshaping Construction Practice, 2010, pp. 509–517.
  12. Autodesk, Autodesk Forge. 2016.
  13. C. Feng, M.-Y. Liu, C.-C. Kao, and T.-Y. Lee, “Deep active learning for civil infrastructure defect detection and classification,” in ASCE International Workshop on Computing in Civil Engineering, Seattle, WA, United States, 2017, pp. 298–306.
  14. K. Wang, D. Zhang, Y. Li, R. Zhang, and L. Lin, “Cost-effective active learning for deep image classification,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 27, no. 12, pp. 2591–2600, 2017.
  15. H. Wang, N. Wang, and D.-Y. Yeung, “Collaborative deep learning for recommender systems,” in Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015, pp. 1235–1244.

Original Research Proposal

Final Project Report

Technical Report TR246
