
Architecting success: Critical pathways in the implementation of a unified namespace


A unified namespace (UNS) serves as the central nervous system of modern smart factories, and supports seamless data flow from the shop floor to the cloud. This article explores the critical challenges that organizations face during the implementation of a UNS across edge and cloud environments. The article also provides practical solutions and proven implementation patterns to handle these challenges.

Introduction

Currently, manufacturing businesses encounter several data integration challenges when they digitize their operations. Some questions that they have about data storage, system integration, and real-time processing include the following:

  • Should we replicate Manufacturing Execution System (MES) and Enterprise Resource Planning (ERP) data within a UNS?

  • How do data lakes fit within this framework?

  • What role do Historian systems play in modern architecture?

While these are fundamental questions, organizations encounter even more complex questions, such as the following:

  • How do you discover and catalog data across hundreds of disparate systems without creating an administrative nightmare?

  • What happens when data transformation requirements change faster than your integration team can adapt?

  • How do you break down entrenched data silos that have evolved over decades of independent system deployment?

  • How do you achieve meaningful interoperability in industrial environments where legacy systems speak different languages, operate on different time scales, and serve fundamentally different purposes?

These questions indicate a search for a single source of truth in an inherently fragmented industrial world.

Unified namespace

Industry experts, such as Walker Reynolds, have championed UNS as the holy grail of industrial data architecture. Reynolds describes UNS as "a single source of truth for all data and information in your business. It's a place where the current state of the business exists (where it lives). It's the hub through which the smart things in your business communicate with one another." This vision promises three compelling benefits: a centralized data repository, real-time operational visibility, and seamless system interoperability through event-driven architecture.

However, this concept isn't as revolutionary as it appears. Volkswagen implemented similar ideas over a decade ago with their Profinet system, called Volkswagen Audi Seat Skoda (VASS). With this system, they used comprehensive naming conventions to organize their entire manufacturing environment. This provided a contextualization and hierarchy model similar to that of a UNS.

Modern standards, such as Sparkplug, have enhanced Message Queuing Telemetry Transport (MQTT) networks with standardized topic namespaces. Also, OPC Unified Architecture’s (OPC UA's) Publish-Subscribe capabilities have promoted greater interaction between industrial protocols. These implementations show both the capabilities and limitations of trying to unify industrial data.

Implementation challenges

There are several challenges involved in creating a true UNS. When authoritative process states reside in specialized systems that aren’t designed for universal access, such as MES, it’s difficult to achieve a genuine single source of truth. The focus on "current state" often overlooks valuable historical data that’s crucial for trend analysis and predictive maintenance. Maintaining real-time state across large operations creates enormous data volumes and potential latency problems that can affect system performance. The complexity of implementing a true UNS across diverse systems might lead to oversimplification of critical data relationships.


Direct paths and selective event layer usage

A dual-layer architecture presents a more practical approach than conforming to an idealistic vision. This solution connects factory-level operations with enterprise-wide systems through two fundamental layers that operate at both local and enterprise levels. The event layer handles immediate data processing through local messaging solutions, such as EMQX MQTT brokers, NATS, or Zenoh, at the factory level. The enterprise event buses use cloud solutions, such as AWS IoT Core, for scalable messaging across the organization.

The persistence layer manages data storage and retrieval through carefully selected combinations of technologies. Local factory storage might combine InfluxDB for time series data, PostgreSQL for structured operational information, or process historians, such as Aveva PI, depending on specific use cases. At the enterprise level, organizations select from solutions, such as AWS IoT SiteWise, Amazon Athena, Amazon Redshift, or Amazon Simple Storage Service (Amazon S3) Tables with Apache Iceberg. This selection creates a seamless bridge between local and cloud operations, while maintaining both real-time operational capabilities and enterprise-wide analytics.


This architecture shows an important fact about data transmission: it’s not necessary that all data flow through the event layer. For improved performance and cost-efficiency, large volumes of historical or aggregated data are often replicated directly from local factory storage to cloud storage, and completely bypass the event layer. This direct replication approach reduces latency, minimizes network load, and reduces transmission costs. The approach also reserves the event layer for truly time-sensitive applications that require immediate access to current data states.

Importantly, this distributed data storage doesn't sacrifice the unified access experience that organizations need. Interactive query services, such as Athena, can provide a single point of entry that federates queries across multiple data systems, regardless of whether they’re stored in Amazon S3 Tables or traditional databases. This federated approach delivers unified interface benefits without forcing all data through a single pipeline. As a result, the approach maintains both performance and the logical unity that business users require.
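The routing decision described above can be sketched as a simple policy function. This is a minimal, illustrative sketch, not a prescribed implementation: the `Record` fields and the size threshold are assumptions chosen to show the idea that time-sensitive current-state updates take the event layer while bulk data is replicated directly.

```python
from dataclasses import dataclass

@dataclass
class Record:
    topic: str
    is_current_state: bool  # does a consumer need this value immediately?
    payload_bytes: int

# Hypothetical cutoff: bulk payloads above this size skip the broker
DIRECT_REPLICATION_THRESHOLD = 64 * 1024

def choose_path(record: Record) -> str:
    """Return 'event-layer' for time-sensitive data, 'direct' for bulk replication."""
    if record.is_current_state and record.payload_bytes < DIRECT_REPLICATION_THRESHOLD:
        return "event-layer"
    return "direct"
```

In practice, the same decision might also consider message priority or network conditions, but the core pattern stays the same: only data that consumers need now earns a place on the event layer.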

Smart transmission, processing, and storage strategies

Modern manufacturing environments generate data at frequencies that surpass practical transmission and storage capabilities. Programmable Logic Controllers (PLCs) that update values thousands of times per second create scenarios where capturing every change is both unnecessary and impractical. Smart data handling strategies are essential, such as aggregating high-frequency data before transmission, implementing intelligent sampling based on change significance, and computing meaningful metrics at the edge.
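Change-significance sampling can be sketched with a simple deadband filter: a value is transmitted only when it differs from the last transmitted value by more than a configured threshold. The deadband width and the example readings below are illustrative assumptions.

```python
class DeadbandSampler:
    """Emit a value only when it changes by more than a configured deadband.

    A minimal sketch of change-significance sampling for high-frequency
    PLC tags; the threshold is an assumption, tuned per tag in practice.
    """

    def __init__(self, deadband: float):
        self.deadband = deadband
        self.last_sent = None

    def sample(self, value: float) -> bool:
        """Return True if the change is significant enough to transmit."""
        if self.last_sent is None or abs(value - self.last_sent) >= self.deadband:
            self.last_sent = value
            return True
        return False

sampler = DeadbandSampler(deadband=0.5)
readings = [20.0, 20.1, 20.2, 20.7, 20.8, 21.3]
sent = [v for v in readings if sampler.sample(v)]
# Only significant changes are transmitted: [20.0, 20.7, 21.3]
```

Six raw readings collapse to three transmitted values; at PLC update rates of thousands of values per second, this kind of edge filtering is what keeps transmission and storage volumes manageable.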


Dynamic routing as part of smart data handling optimizes data replication and provides the capabilities for applications to operate seamlessly in the cloud or on the factory floor. However, the fundamental laws of physics, particularly the speed of light, still impose certain latency-related limitations between geographically distant locations.

Data formats for industrial systems

The quest for data format standardization presents another tension. While some advocate for strict adherence to OPC UA’s structured approach and others support pure JSON simplicity, the manufacturing landscape proves too diverse for any single format to be considered superior.

Rather than enforcing conformity, successful implementations accept format diversity through robust data transformation capabilities and data model registries. These registries serve as central repositories for schemas and formats, and promote dynamic interpretation and transformation throughout the system.
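A data model registry can be sketched as a lookup table of schemas that messages reference, so consumers can validate and normalize payloads from diverse source formats at runtime. The schema IDs, field names, and canonical shape below are illustrative assumptions, not a fixed specification.

```python
import json

# Hypothetical registry: two source formats for the same logical measurement,
# keyed by the schema id a message would carry in its header.
REGISTRY = {
    "temperature/v1": {"fields": {"value": float, "unit": str}},
    "temperature/v2": {"fields": {"reading": float, "unit": str}},
}

def normalize(schema_id: str, payload: str) -> dict:
    """Validate a payload against its registered schema and map it to one
    canonical form, regardless of which source format it arrived in."""
    schema = REGISTRY[schema_id]
    data = json.loads(payload)
    for field, ftype in schema["fields"].items():
        if not isinstance(data[field], ftype):
            raise TypeError(f"{field} is not a {ftype.__name__}")
    # Map format-specific field names onto the canonical shape
    value = data.get("value", data.get("reading"))
    return {"value": value, "unit": data["unit"]}
```

Because the transformation is driven by registry entries rather than hard-coded per integration, adding a new source format means registering one more schema instead of changing every consumer.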

Seamless deployment: Combining edge and cloud for agile manufacturing

The containerized application architecture complements these data layers by supporting seamless operation across edge and cloud environments. Kubernetes-based orchestration provides uniform deployment practices, while technologies such as Amazon Elastic Container Service (Amazon ECS) Anywhere or K3s provide lightweight container management at the edge. Amazon Elastic Kubernetes Service (Amazon EKS) hybrid nodes extend enterprise-scale orchestration from the cloud to edge nodes. This results in the creation of inherently portable applications that run unchanged, regardless of whether they’re deployed at the edge or in the cloud. This approach supports location-aware data routing, sophisticated caching mechanisms, and seamless failover between local and cloud data sources.


Reference architecture

The following image shows a reference architecture for UNS implementation with its main components:

(Reference architecture diagram)

The following are components of the architecture and their functionalities:

  • Protocol converter: The shop floor connectivity framework serves as a versatile protocol converter that bridges the gap between various data sources and the IoT ecosystem. It seamlessly transforms data from operational technology (OT) protocols, databases, and REST APIs into standardized MQTT messages, allowing smooth integration with the rest of the system.

  • MQTT edge broker: At the edge, EMQX's MQTT broker functions as a local message hub. This broker facilitates efficient communication between local system components, and makes sure that there's a swift and reliable data exchange within the edge environment.

  • Telemetry ingestion: AWS IoT SiteWise Edge handles telemetry ingestion by collecting industrial data through MQTT communication. To optimize transfer to the cloud, AWS IoT SiteWise Edge groups incoming messages into micro-batches to improve network usage and processing efficiency. During this process, it links MQTT topics to specific AWS IoT SiteWise assets and their properties. AWS IoT SiteWise Edge also optimizes storage usage by automatically directing data to appropriate storage tiers. Time-sensitive or frequently accessed data goes to hot storage for quick retrieval, and historical or less frequently accessed data goes to cold storage for cost efficiency. This process balances performance needs with cost considerations.

  • Real-time channel: AWS IoT Core acts as a high-speed data layer that forms the real-time channel of the architecture. It receives messages directly from the edge broker, allowing bidirectional communication with minimal latency. This component offers various levels of delivery guarantees and caters to different reliability requirements.

  • Message routing: AWS IoT rules support message routing through the speed layer to persistent storage. This makes sure that critical data is reliably captured and stored for future use or analysis.

  • Time series database: AWS IoT SiteWise provides a near real-time database solution coupled with an asset inventory system. This solution helps you handle telemetry data and provides quick access to recent data points and asset information.

  • Persistent storage: Amazon S3 serves as the system's long-term data repository. It offers robust and scalable storage capabilities to make sure that you can retain all data that the system ingests for extended periods. These capabilities facilitate historical analysis and compliance requirements.

  • Access layer: Athena functions as a query interface, providing access to historical data that's stored in Amazon S3. It supports complex analytics and machine learning activities so that users can process and analyze large volumes of data efficiently with SQL queries.

  • Applications: Applications can subscribe directly to MQTT topics for real-time data access. Or, they can use the Athena interface for batch operations on historical data, providing versatility in data consumption patterns.

  • Container orchestrator: Amazon EKS manages the deployment and orchestration of containerized components both in the cloud and at the edge. It provides a unified control plane for managing the entire container ecosystem across diverse environments.

  • Edge container nodes: Amazon EKS hybrid nodes are specialized nodes that manage containers that are deployed at the edge but are controlled from the cloud. This hybrid approach allows for distributed computing capabilities while maintaining centralized management, bridging the gap between cloud and edge environments.
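The micro-batching behavior described under telemetry ingestion can be sketched as a buffer that flushes on either a size or an age limit. This is an illustrative sketch of the pattern, not AWS IoT SiteWise Edge's actual implementation; the batch size and age limits are assumptions.

```python
import time

class MicroBatcher:
    """Group incoming telemetry messages into micro-batches before upload,
    flushing when a batch reaches a size limit or grows too old."""

    def __init__(self, max_size: int = 10, max_age_s: float = 5.0):
        self.max_size = max_size
        self.max_age_s = max_age_s
        self.batch = []
        self.opened_at = None
        self.flushed = []  # batches handed to the uploader

    def add(self, message: dict) -> None:
        if not self.batch:
            self.opened_at = time.monotonic()
        self.batch.append(message)
        if (len(self.batch) >= self.max_size
                or time.monotonic() - self.opened_at >= self.max_age_s):
            self.flush()

    def flush(self) -> None:
        """Hand the current batch to the uploader, if it is non-empty."""
        if self.batch:
            self.flushed.append(self.batch)
            self.batch = []

batcher = MicroBatcher(max_size=3)
for i in range(7):
    batcher.add({"seq": i, "value": i * 1.5})
batcher.flush()  # flush the partial tail batch
# Batches of sizes 3, 3, and 1
```

Sending three batches instead of seven individual messages is a small win at this scale; across thousands of tags, batching is what keeps per-message network and processing overhead from dominating.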

The common MQTT topic structure in a UNS follows a hierarchical organization based on the ISA95 model. It reflects the physical and logical layout of the manufacturing environment. Topics are organized in a tree-like structure with forward slashes (/) as delimiters, typically consisting of nine levels: /Plant/Area/[MessageType]/Line/Cell/Equipment/Module/Device/Var

  1. Plant level represents the physical location or plant.

  2. Area level defines specific zones or departments within the site.

  3. Message Type level defines the type of information published, such as data or registration.

  4. Line level identifies production lines or major process units.

  5. Cell level represents specific process cells or work cells.

  6. Equipment level identifies specific machines or equipment.

  7. Module level represents subcomponents or modules of equipment.

  8. Device level contains control parameters and variables.

  9. Variable level contains specific data points or measurements.

Each level in this hierarchy might include metadata and specific attributes that are relevant to that level to support efficient data organization and discovery.
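The nine-level topic structure described above can be parsed into named ISA95 levels with a few lines of code. This is a minimal sketch; the example topic values (plant, area, and equipment names) are illustrative assumptions.

```python
# Named levels matching the nine-level hierarchy described above
LEVELS = ["plant", "area", "message_type", "line", "cell",
          "equipment", "module", "device", "variable"]

def parse_topic(topic: str) -> dict:
    """Split a UNS topic into its nine named hierarchy levels."""
    parts = topic.strip("/").split("/")
    if len(parts) != len(LEVELS):
        raise ValueError(f"expected {len(LEVELS)} levels, got {len(parts)}")
    return dict(zip(LEVELS, parts))

topic = "/Stuttgart/BodyShop/Data/Line1/CellA/Robot2/Arm/Drive/Torque"
parsed = parse_topic(topic)
# parsed["area"] == "BodyShop", parsed["variable"] == "Torque"
```

Consumers can then filter or route on any level by name, and the same structure maps naturally onto MQTT subscription wildcards (for example, subscribing to one area or one piece of equipment).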

The following image is an example of the topic structure:

(Example topic structure diagram)

The message structure consists of two main components: the header and the list of messages. This design helps with efficient data organization and streamlined processing.

The header section serves as a message envelope and contains crucial metadata that facilitates message handling and validation. This comprehensive header allows for easy message filtering and routing based on various criteria, and enhances the system's overall efficiency in data processing. The header includes the following elements:

  • Metadata: Essential information about the message itself, such as timestamp, source, and message type.

  • Validity elements: Parameters that define the message's relevance period or expiration criteria.

  • Schema reference: A pointer to the schema used for the message section, allowing for quick validation and interpretation of the message content.

The message body that follows the header contains a list of individual messages that each represent a specific data point or event. Each message in this list is structured as follows:

  • Metadata: This subsection identifies the specific item or asset that the data pertains to. It includes unique identifiers, such as asset IDs, equipment tags, or other relevant classification information. It also contains a serial number or sequence identifier. This data is crucial for maintaining the correct order of messages, especially in scenarios where message delivery might be delayed or out of sequence.

  • Properties: This subsection includes the actual data values that are associated with the item or asset. It can contain one or more key-value pairs that each represent a specific attribute or measurement. For example, in an industrial setting, this might include temperature readings, pressure levels, operational states, or any other relevant parameters. The flexibility of this structure allows for the representation of simple single-value properties and complex multi-value datasets.
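The envelope described above (a header with metadata, validity information, and a schema reference, followed by a list of messages) can be sketched as follows. The field names, the source identifier, and the time-to-live value are illustrative assumptions, not a fixed wire format.

```python
import json
import time
import uuid

def build_message(schema_ref: str, items: list) -> str:
    """Assemble an envelope with a header and a body containing a list of
    individual messages, as described in the message structure above."""
    envelope = {
        "header": {
            "metadata": {
                "id": str(uuid.uuid4()),
                "timestamp": time.time(),
                "source": "line1/cellA",      # assumed source identifier
                "message_type": "data",
            },
            "validity": {"ttl_seconds": 60},  # discard if delivered too late
            "schema_ref": schema_ref,         # pointer into the model registry
        },
        "messages": items,
    }
    return json.dumps(envelope)

payload = build_message(
    "temperature/v1",
    [{"metadata": {"asset_id": "press-04", "sequence": 17},
      "properties": {"temperature_c": 21.5, "state": "running"}}],
)
```

Because routing-relevant information (source, message type, validity, schema reference) lives in the header, brokers and consumers can filter or discard messages without parsing the body, and the sequence identifier in each message's metadata lets consumers reorder late deliveries.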

Conclusion

The reality that emerges from practical implementation experience reveals an important insight. The UNS, as conventionally conceived, requires fundamental rethinking. The very notion of complete "unity" in industrial data often ignores the inherent complexity and diversity of manufacturing operations. Real manufacturing environments are messy, diverse, and constantly evolving. They resist total unification because different systems serve different purposes and operate under different constraints.

The most effective approach to namespace design isn't about achieving perfect uniformity. It's about creating a sophisticated orchestration of diverse systems, protocols, and data formats that work together while maintaining their individual strengths. Rather than forcing everything into a single rigid structure, successful implementation focuses on building flexible architecture patterns that can adapt to the common chaos in manufacturing environments. Therefore, it’s important to accept controlled diversity within a coherent framework, where different data sources can coexist and interoperate without losing their specialized capabilities.

The path forward involves recognizing that the true value comes not from reducing all differences, but from creating systems that can effectively manage and use those differences. The question isn't whether you're ready to implement a conventional UNS. It's whether you're prepared to build an architecture that works with the reality of manufacturing complexity rather than against it.


About the authors


Jan Metzner

Jan Metzner is a Principal Specialist Solutions Architect for Industrial Solutions in the Industry Specialists and Solutions team at AWS, focusing on manufacturing technology strategy. As the global technical lead for Industrial IoT at AWS, he architects scalable solutions on the AWS platform, specializing in IoT implementations and data-driven applications. With over 20 years of experience across startups and enterprises, Jan helps organizations transform their manufacturing operations through cloud technology. Before joining AWS, Jan developed technical solutions that focus on industrial automation and digital transformation.


Fabrizio Manfredi

Fabrizio is a Principal Cloud Architect for Industrial Solutions at AWS, where he specializes in transforming manufacturing operations through AWS technologies. With over two decades of expertise in the development of distributed systems and industrial automation, he leads strategic initiatives in smart manufacturing, focusing on connected factories, process optimization, and quality enhancement. Before joining AWS, Fabrizio developed mission-critical distributed systems across various industries, driving operational efficiency.