The growing prevalence of automated data-capture systems and semi-autonomous robots is likely to have a positive impact on efficiency and productivity in the construction and facility management sectors. However, for automated systems to work safely and efficiently in a complex environment such as a worksite, they must have access to, or be capable of generating, a semantically rich model of the environment. More specifically, these systems must understand what types of physical objects are near them and where those objects are positioned in 3D space. To enable safe automation, they must also operate correctly in both static and dynamic environments. For example, a construction robot operating in a narrow hallway may have to simultaneously infer the positions of static objects, such as worksite tools, and dynamic objects, such as people. This dissertation proposes a set of new computational methods for extracting structured spatial knowledge about an indoor environment from unstructured sensor data. It then describes how this spatial knowledge can be leveraged to overcome several challenging scenarios in mobile robot navigation, by training a reinforcement learning algorithm to condition mobile robot control decisions on a semantically rich model of the environment.
The dissertation first proposes a system for identifying common objects in a worksite and automatically adding them to an existing geometric model of the environment. The system is composed of two components: a novel 2D-3D object detection network designed to detect and localize worksite objects, and a multi-object Kalman filter tracking system used to filter out false-positive detections. This real-time computer vision system is referred to as Object R-CNN. Object R-CNN is validated using data collected with a purpose-built mobile robot, which captures images of the worksite with an RGB-D camera and uses Object R-CNN to insert newly placed objects into an existing geometric model of the worksite environment. Object R-CNN is subsequently extended to extract additional semantic information from the pointcloud data, such as the color and orientation of each object.
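The false-positive filtering idea can be illustrated with a minimal sketch: a constant-velocity Kalman track that only "confirms" an object after it has been supported by several consecutive detections. This is a generic single-track sketch, not the dissertation's actual multi-object implementation; the state layout, noise values, and `min_hits` threshold are illustrative assumptions.

```python
import numpy as np

class Track:
    """Constant-velocity Kalman track for one detected object (illustrative)."""

    def __init__(self, z, dt=0.1):
        # State [x, y, vx, vy], initialized from the first detection z = [x, y].
        self.x = np.array([z[0], z[1], 0.0, 0.0])
        self.P = np.eye(4) * 10.0                     # initial state uncertainty
        self.F = np.array([[1, 0, dt, 0],             # constant-velocity motion model
                           [0, 1, 0, dt],
                           [0, 0, 1,  0],
                           [0, 0, 0,  1]], dtype=float)
        self.H = np.array([[1, 0, 0, 0],              # we observe position only
                           [0, 1, 0, 0]], dtype=float)
        self.Q = np.eye(4) * 0.01                     # process noise (assumed)
        self.R = np.eye(2) * 0.5                      # measurement noise (assumed)
        self.hits = 1                                 # consecutive-frame support

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, z):
        y = z - self.H @ self.x                       # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)      # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        self.hits += 1

    def confirmed(self, min_hits=3):
        # A detection is accepted only after repeated support across frames,
        # which suppresses one-off false positives from the detector.
        return self.hits >= min_hits
```

Only confirmed tracks would be promoted into the geometric model, so a spurious single-frame detection never spawns an object.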
The second contribution of the dissertation is a communication platform that allows mobile robots and human operators to share geometric information about their shared environment. The dissertation describes how a real-time, semantically rich 3D model is used as a communication medium to better facilitate human-robot and robot-robot communication. The core contribution of this work is Timeframe, a high-performance communication platform that allows multiple agents to query and manipulate a shared representation of the building state.
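The query-and-notify pattern behind such a shared building-state store can be sketched as follows. This is a deliberately simplified, in-process stand-in for illustration only; the class and method names (`SharedModel`, `put`, `query`, `subscribe`) are hypothetical and do not describe Timeframe's actual API, which is a networked multi-agent platform.

```python
from collections import defaultdict

class SharedModel:
    """Minimal in-process sketch of a shared building-state store with
    query and change-notification (all names hypothetical)."""

    def __init__(self):
        self._objects = {}                      # object id -> attribute dict
        self._subscribers = defaultdict(list)   # event name -> callbacks

    def put(self, obj_id, **attrs):
        """Insert or update an object; notify subscribers on insertion."""
        created = obj_id not in self._objects
        self._objects.setdefault(obj_id, {}).update(attrs)
        if created:
            for cb in self._subscribers["object_added"]:
                cb(obj_id, self._objects[obj_id])

    def query(self, **filters):
        """Return ids of objects whose attributes match all filters."""
        return [oid for oid, a in self._objects.items()
                if all(a.get(k) == v for k, v in filters.items())]

    def subscribe(self, event, callback):
        self._subscribers[event].append(callback)
```

In the collaborative navigation scenario described below, each robot would subscribe to new-object events and trigger replanning from the callback, while a human operator queries and edits the same store through a GUI.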
The platform supports many data types, including constructive solid geometry, polygon meshes, floor plans, pointclouds, and robot control messages. It also supports nonlinear coordinate frame transformations, allowing agents to communicate effectively regardless of the coordinate system they use. Human operators can manipulate the 3D model through a web-based graphical user interface. The communication platform is used to solve a collaborative mobile robot navigation scenario in which each mobile robot publishes pointcloud data to the platform. The aggregated pointcloud data is used to identify obstacles that could block the mobile robot trajectories. The mobile robots are notified immediately whenever a new object is detected, allowing them to replan their trajectories. The platform also allows a human operator to supervise the mobile robot trajectories and the object detection process.

Finally, the dissertation explores how a semantically rich model of the built environment can be leveraged to improve the safety and efficiency of mobile robot navigation. It introduces a new method for mobile robot control that combines reinforcement learning with traditional search-based motion planning. Each control decision is conditioned on observations from the robot's sensors as well as pointcloud data, allowing the robot to operate safely within geometrically complex environments. This approach, referred to as Stochastic Neural Control (SNC), is tested on several challenging navigation tasks and learns effective policies for navigation, collision avoidance, and fall prevention.
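One way to picture a hybrid of search-based planning and a learned component is the sketch below: a search step generates and forward-simulates candidate velocity commands (as in DWA-style planners), unsafe candidates are pruned against the pointcloud, and the survivors are ranked by a score that adds a learned term to classical goal-progress and clearance terms. This is a generic illustration under stated assumptions, not the SNC algorithm itself; `policy_score` stands in for the trained component, and the model, weights, and thresholds are invented for the example.

```python
import numpy as np

def rollout(v, w, steps=10, dt=0.1):
    """Forward-simulate a unicycle model for a candidate (v, w) command."""
    x = y = th = 0.0
    pts = []
    for _ in range(steps):
        x += v * np.cos(th) * dt
        y += v * np.sin(th) * dt
        th += w * dt
        pts.append((x, y))
    return np.array(pts)

def min_clearance(traj, pointcloud):
    """Smallest distance from any trajectory point to any obstacle point."""
    d = np.linalg.norm(traj[:, None, :] - pointcloud[None, :, :], axis=-1)
    return d.min()

def select_command(pointcloud, goal, policy_score, candidates):
    """Rank search-generated candidates by a learned term plus
    goal-progress and clearance terms; return the best (v, w)."""
    best, best_score = None, -np.inf
    for v, w in candidates:
        traj = rollout(v, w)
        clearance = min_clearance(traj, pointcloud)
        if clearance < 0.2:                        # prune unsafe candidates outright
            continue
        progress = -np.linalg.norm(traj[-1] - goal)
        score = policy_score(v, w) + progress + 0.5 * clearance
        if score > best_score:
            best, best_score = (v, w), score
    return best
```

With an obstacle directly ahead, the straight-line candidate is pruned by the clearance check and a turning command is selected, which mirrors the kind of collision-avoidance behavior the learned policy reinforces.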
Three variants of the SNC system are evaluated against a conventional motion planning baseline based on the Dynamic Window Approach (DWA). SNC outperforms the DWA baseline and four similar RL navigation systems in many of the trials, learning safe navigation behaviors such as avoiding crowded spaces, avoiding stairs, reducing speed near walls, and operating safely in confined spaces. Finally, SNC is demonstrated on robots operating in a real-world environment.