What Is a Data Contract?
A data contract is a specification that lets domain developers build data products against a well-defined interface. It guarantees interface compatibility and defines terms of service and a service-level agreement (SLA), covering how the data may be used and the quality it must meet. The goal is to make data usage and dependencies transparent. Adopting contracts requires a cultural shift, however: users need time to become familiar with them and to understand the importance of data ownership. Data contracts should also include information about schema, semantics, and lineage.
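As an illustration of the elements listed above, a contract can be written down as a small, checkable object. The sketch below is only one possible shape, not a standard: the class name `DataContract`, its fields, and the sample `orders` product are all hypothetical and chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class DataContract:
    """Hypothetical, minimal data contract for a single data product."""
    product: str
    owner: str                 # domain team that owns the data
    schema: dict               # column name -> expected Python type
    required: tuple            # columns that must be present and not None
    freshness_sla_hours: int   # SLA: maximum age of the newest record
    terms_of_use: str = "internal analytics only"

    def violations(self, record: dict) -> list:
        """Return a list of human-readable contract violations for one record."""
        problems = []
        # Quality rule: required columns must carry a value.
        for column in self.required:
            if record.get(column) is None:
                problems.append(f"missing required column: {column}")
        # Schema rule: only known columns, each with the agreed type.
        for column, value in record.items():
            if column not in self.schema:
                problems.append(f"unexpected column: {column}")
            elif value is not None and not isinstance(value, self.schema[column]):
                expected = self.schema[column].__name__
                problems.append(f"wrong type for {column}: expected {expected}")
        return problems


# Assumed example product; names and values are illustrative only.
orders_contract = DataContract(
    product="orders",
    owner="sales-domain",
    schema={"order_id": str, "amount": float, "currency": str},
    required=("order_id", "amount"),
    freshness_sla_hours=24,
)

good = {"order_id": "A-1", "amount": 19.99, "currency": "EUR"}
bad = {"order_id": None, "amount": "19.99", "discount": 0.1}

print(orders_contract.violations(good))
print(orders_contract.violations(bad))
```

A producer could run a check like this before publishing, and a consumer could run the same check on arrival, so both sides test against the same agreed specification.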
What Is Data Sharing?
Data sharing enables domain teams to connect and share data products without copying the underlying data. Avoiding copies reduces the proliferation of silos and keeps ownership with the domain owners. Datasets should be shared securely between producers and consumers under centralized data governance, which is best done through metadata linking.
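To make "sharing through metadata linking" concrete, the sketch below models a central catalog that hands consumers a reference to the producer's dataset rather than a copy. It is a toy model under stated assumptions: `DataProduct`, `SharingCatalog`, the team names, and the storage URI are hypothetical and do not correspond to any particular platform.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    name: str
    owner: str      # the producing domain keeps ownership
    location: str   # where the single physical copy lives (hypothetical URI)


@dataclass
class SharingCatalog:
    """Hypothetical central catalog that shares data products by reference."""
    products: dict = field(default_factory=dict)
    grants: dict = field(default_factory=dict)   # product name -> set of consumers

    def publish(self, product: DataProduct) -> None:
        # Only metadata is registered; the data itself is not moved.
        self.products[product.name] = product
        self.grants.setdefault(product.name, set())

    def grant(self, product_name: str, consumer: str) -> None:
        if product_name not in self.products:
            raise KeyError(f"unknown data product: {product_name}")
        self.grants[product_name].add(consumer)

    def resolve(self, product_name: str, consumer: str) -> DataProduct:
        """Return a link to the producer's product if the consumer has access."""
        if consumer not in self.grants.get(product_name, set()):
            raise PermissionError(f"{consumer} has no access to {product_name}")
        return self.products[product_name]


catalog = SharingCatalog()
orders = DataProduct("orders", owner="sales-domain", location="s3://sales/orders/")
catalog.publish(orders)
catalog.grant("orders", "marketing-domain")

linked = catalog.resolve("orders", "marketing-domain")
print(linked is orders)   # the consumer sees the producer's entry, not a copy
```

Because access is resolved through the catalog, governance stays central while the data stays with its owner: revoking a grant removes the link without touching the dataset.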
Don’t Forget Team Structure
Team and organizational structure is an important consideration for data mesh. Teams are typically organized around selected domains rather than within a single centralized team. Domain teams are responsible for all processes: data collection, transformation, cleaning, enrichment, and modeling. Within a domain, teams are organized vertically and consist of the roles required to deliver data, such as DataOps engineers, data engineers, data scientists, data analysts, and domain experts.
Knowledge Graphs and Data Mesh
The semantic and context-driven fundamentals of knowledge graphs make them ideal for supporting enterprise data mesh and data fabric-based development. Knowledge graphs can be leveraged to ensure data contracts are standardized, uniform, consistent, semantically correct, and aligned with datasets. They power data sharing platforms that connect data between users, systems, and applications in a consistent, unambiguous manner. This enables compliance with data contracts by ensuring that data types, schemas, entities, and their inter-relationships across data products are semantically valid.
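The idea of checking a data product against shared semantics can be sketched with a tiny in-memory graph of subject-predicate-object triples. This is a deliberately simplified stand-in for a real knowledge graph: the entities, predicates (`is_a`, `identifies`, `placed_by`), and field names are assumptions made up for this example.

```python
# Each triple is (subject, predicate, object). All names are illustrative.
GRAPH = {
    ("Customer", "is_a", "Entity"),
    ("Order", "is_a", "Entity"),
    ("Order", "placed_by", "Customer"),
    ("customer_id", "identifies", "Customer"),
    ("order_id", "identifies", "Order"),
}


def objects(graph, subject, predicate):
    """All objects linked to `subject` through `predicate`."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def unmapped_fields(graph, fields):
    """Fields of a data product that do not identify any known entity."""
    entities = {s for s, p, o in graph if p == "is_a" and o == "Entity"}
    return sorted(
        f for f in fields if not (objects(graph, f, "identifies") & entities)
    )


def relationship_is_declared(graph, subject, predicate, obj):
    """True if the graph states this relationship between two entities."""
    return (subject, predicate, obj) in graph


# A data product's fields are checked against the shared semantics.
product_fields = ["order_id", "customer_id", "cust_ref"]
print(unmapped_fields(GRAPH, product_fields))
print(relationship_is_declared(GRAPH, "Order", "placed_by", "Customer"))
```

In this toy setup, a field such as `cust_ref` is flagged because nothing in the graph says what it means, which is the kind of ambiguity a shared semantic layer is meant to surface.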
Domain-centric and enterprise data catalogs can use a knowledge graph to store semantics alongside metadata. Knowledge graphs help with automatic metadata extraction, generation and gatekeeping of data quality, and certification of data assets based on semantic rules and validation criteria. Combining knowledge graphs with data mesh can give rise to a semantic data mesh, in which data across the different domains of the mesh carries context and meaning. This facilitates semantic data discoverability, data interoperability, and integration for data augmentation and enrichment, and it provides explainability for AI and ML use cases.
Data mesh is first and foremost a shift in culture, processes, and people, and these are hard to change quickly in larger organizations. Unlike a data architecture, it cannot simply be adopted and implemented. Some organizations focus on specific aspects of data mesh or implement a simpler version of the architecture.
For organizations, data mesh is not a yes-or-no decision. Rather, it should be an exercise in identifying what prevents them from delivering value to the business in a timely, effective manner. The ability to connect, share, and access data effectively across the enterprise is a very likely answer, and it is one that a data mesh supported by knowledge graphs is set up to deliver.