Data mesh is a sociotechnical approach to building a decentralized data architecture through a domain-oriented, self-serve design (from a software development perspective). It borrows from Eric Evans’ theory of domain-driven design [1] and from Manuel Pais’ and Matthew Skelton’s theory of team topologies. [2] Data mesh is primarily concerned with the data itself and treats the data lake and the pipelines as secondary concerns. [3] Its main proposition is scaling analytical data through domain-oriented decentralization. [4] With data mesh, responsibility for analytical data shifts from a central data team to the domain teams, supported by a data platform team that provides a domain-agnostic data platform. [5] This reduces data disorder and isolated data silos, because a centralized system ensures that fundamental principles are shared consistently across the nodes of the data mesh and allows data to be shared across different areas. [6]
The term data mesh was first defined by Zhamak Dehghani in 2019 [7] while she was working as a principal consultant at the technology company Thoughtworks. [8] [9] She provided greater detail on its principles and logical architecture throughout 2020. The approach was predicted to be a “big contender” for companies in 2022. [10] [11] Data meshes have been implemented by companies such as Zalando, [12] Netflix, [13] Intuit, [14] VistaPrint, PayPal, [15] and others.
In 2022, Dehghani left Thoughtworks to found Nextdata Technologies to focus on decentralized data. [16]
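Data mesh is based on four core principles: domain ownership, data as a product, a self-serve data platform, and federated computational governance. [17]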
In addition to these principles, Dehghani writes that the data products created by each domain team should be discoverable, addressable, trustworthy, self-describing in their semantics and syntax, interoperable, secure, and governed by global standards and access controls. [19] In other words, the data should be treated as a product that is ready to use and reliable. [20]
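As an illustration only, some of these properties can be sketched as a small catalog of data product descriptors. This is a minimal sketch, not part of Dehghani's writing or of any data mesh product: the names `DataProduct` and `Catalog`, the `mesh://` address scheme, and the role-based access check are all hypothetical and chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class DataProduct:
    """Hypothetical descriptor for a data product owned by one domain team."""
    domain: str                      # owning domain team
    name: str                        # product name within that domain
    version: str
    schema: dict[str, str]           # column name -> type (self-describing syntax)
    description: str                 # plain-language meaning (self-describing semantics)
    allowed_roles: frozenset[str] = frozenset()  # roles permitted to read the product

    @property
    def address(self) -> str:
        # Addressable: one stable identifier per product version.
        # The "mesh://" scheme is made up for this sketch.
        return f"mesh://{self.domain}/{self.name}/v{self.version}"


class Catalog:
    """Hypothetical shared registry that makes data products discoverable."""

    def __init__(self) -> None:
        self._products: dict[str, DataProduct] = {}

    def register(self, product: DataProduct) -> None:
        # Global standard assumed here: every product must describe itself.
        if not product.schema or not product.description:
            raise ValueError("a data product needs a schema and a description")
        self._products[product.address] = product

    def search(self, term: str) -> list[DataProduct]:
        # Discoverable: match the term against names and descriptions.
        needle = term.lower()
        return [
            p for p in self._products.values()
            if needle in p.name.lower() or needle in p.description.lower()
        ]

    def can_read(self, address: str, role: str) -> bool:
        # Secure: access is decided by the roles declared on the product.
        product = self._products.get(address)
        return product is not None and role in product.allowed_roles
```

In this sketch a domain team would build a `DataProduct`, call `Catalog.register` to publish it, and consumers would use `Catalog.search` and `Catalog.can_read` to find it and check access. Trustworthiness and interoperability are not modelled here; they would need quality checks and shared formats that go beyond a descriptor.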
After its introduction in 2019, [7] multiple companies started to implement a data mesh [12] [14] [15] and to share their experiences. Challenges (C) and best practices (BP) for practitioners include:
Scott Hirleman started a data mesh community whose Slack channel has over 7,500 members. [25]