To help understand how the data value chain works we can begin with a focus on how data is used — the decision-flow of how people and systems use it.
This is independent of technologies, organisational boundaries and/or governance structures.
- Data is acquired or created by something (e.g. a satellite, a drone, a sensor) or someone (e.g. a researcher).
- It is then combined and transformed into a useful form. This is often underestimated in terms of the amount of effort required.
- It is then analysed using a combination of machines (e.g. algorithms, machine learning, AI) and people.
- It is then used by humans and/or machines to make decisions that have an impact on the issue that people are trying to solve.
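The four steps above can be sketched as a small pipeline. This is only an illustrative sketch: the `Reading` type, the function names, the sample values and the `threshold` are all assumptions made for the example, not part of any real system.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Reading:
    source: str   # hypothetical label for who or what produced the value
    value: float


def acquire() -> list[Reading]:
    # Step 1: data is acquired or created (these values are invented).
    return [
        Reading("sensor", 21.0),
        Reading("sensor", 25.0),
        Reading("survey", 22.0),
    ]


def transform(readings: list[Reading]) -> dict[str, list[float]]:
    # Step 2: combine and reshape into a useful form (grouped by source).
    grouped: dict[str, list[float]] = {}
    for reading in readings:
        grouped.setdefault(reading.source, []).append(reading.value)
    return grouped


def analyse(grouped: dict[str, list[float]]) -> dict[str, float]:
    # Step 3: analyse (here, a simple average per source).
    return {source: mean(values) for source, values in grouped.items()}


def decide(averages: dict[str, float], threshold: float = 22.5) -> dict[str, str]:
    # Step 4: turn the analysis into a decision (the threshold is an assumption).
    return {
        source: "act" if average >= threshold else "wait"
        for source, average in averages.items()
    }


decisions = decide(analyse(transform(acquire())))
print(decisions)
```

Each step only depends on the output of the previous one, which is why the flow can be described independently of the technology or organisation that carries out each step.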
This activity may happen wholly inside a single organisation. However, it is far more likely that a range of internal and external data is used to help inform outcomes.
Given the complexity of most of our systems, a ‘web of data’ approach is more effective, efficient, resilient and scalable than an approach that tries to ‘aggregate everything into one place’ (which many big-data lakes have attempted to do over the last decade).
A distributed approach to Shared Data is cheaper and more robust in the long term. It may require an independent governance process to help manage trust relationships across the ecosystem and to address data across the Data Spectrum.
An example — transport
To illustrate, transport infrastructure in London generates lots of data, which is created and gathered by Transport for London (TfL). A third party (TransportAPI) collects the data from TfL and from every other transport provider in the UK, combining and transforming it into ‘market ready’ forms. Third-party app developers (e.g. National Express) can then create consumer-facing apps that help users understand when a bus is arriving. The end user then makes a decision based on that insight.
All of this insight about how transport is being utilised can then be fed back into the whole system, enabling transport providers to optimise their infrastructure based on the end user’s needs.
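The transport flow, including the feedback step, might be sketched as follows. Everything here is hypothetical: the two provider feeds, their field names and the sample values are invented for illustration, and no real TfL or TransportAPI interface is used or implied.

```python
from collections import Counter

# Hypothetical provider feeds; field names and values are invented.
provider_a_feed = [
    {"stop": "A", "route": "10", "due_min": 4},
    {"stop": "A", "route": "12", "due_min": 9},
]
provider_b_feed = [
    {"stop": "B", "route": "7", "due_min": 2},
]


def aggregate(*feeds):
    # Aggregator role: merge provider feeds into one list, soonest arrival first.
    merged = [row for feed in feeds for row in feed]
    return sorted(merged, key=lambda row: row["due_min"])


def next_arrival(arrivals, stop):
    # App role: answer the end user's question for a single stop.
    for row in arrivals:
        if row["stop"] == stop:
            return row
    return None


arrivals = aggregate(provider_a_feed, provider_b_feed)

# End users ask about stops; the app answers each query.
queries = ["A", "A", "B"]
answers = [next_arrival(arrivals, stop) for stop in queries]

# Feedback: a count of queries per stop is one simple signal that could be
# returned to providers to show where demand is concentrated.
demand_by_stop = Counter(queries)
print(answers[0], demand_by_stop)
```

Note that the feeds stay with their providers and are only combined at the point of use, which mirrors the 'web of data' approach rather than a single aggregated store.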