Linked Data, Inference and Chinese Whispers.

Technology is simple, people are difficult. People create a piece of knowledge, like this one: Coronavirus disease (COVID-19) advice for the public, which is also time-sensitive. This piece of knowledge immediately starts spreading and transforming along the way. Knowledge is there to be spread, of course, but there are different ways of doing it. The way I just did it myself, by linking to the original piece of knowledge, does not give me a share of that spotlight. In search of a share of the spotlight, people start paraphrasing the original piece of information, picking out pieces, adding their own views and passing it on. This leads to a plethora of information pieces out there, with no possibility of backtracking to the original knowledge object.

What’s the mechanism for retrieving the ground truth, that initial knowledge object backed by empirical evidence? One answer is linked data. Instead of copying and passing on a piece of knowledge, we send a reference to it. This is why I am against sending files via mail: you never know which version of the file you are getting. If we instead share only pointers to knowledge objects, we can choose to always get the latest version. The knowledge object itself can evolve as well, as long as it keeps track of its changes and makes it possible to detect whether anyone has tampered with it.
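
To make the idea of a pointer to an evolving, tamper-evident knowledge object a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the KnowledgeObject class, its method names, and the example.org URI are hypothetical, and chaining SHA-256 hashes over versions is just one possible way to detect changes, not a description of any particular linked data platform.

```python
import hashlib
from dataclasses import dataclass, field


def _digest(previous_hash: str, content: str) -> str:
    """Hash a version together with the hash of the version before it."""
    return hashlib.sha256((previous_hash + "\n" + content).encode("utf-8")).hexdigest()


@dataclass
class KnowledgeObject:
    """Hypothetical knowledge object: shared by URI, never by copy."""
    uri: str
    versions: list = field(default_factory=list)  # list of (content, hash) pairs

    def publish(self, content: str) -> str:
        """Append a new version and chain its hash to the previous one."""
        previous_hash = self.versions[-1][1] if self.versions else ""
        new_hash = _digest(previous_hash, content)
        self.versions.append((content, new_hash))
        return new_hash

    def latest(self) -> str:
        """Whoever holds the pointer can always ask for the latest version."""
        return self.versions[-1][0]

    def is_intact(self) -> bool:
        """Recompute the hash chain; any edited version breaks it."""
        previous_hash = ""
        for content, stored_hash in self.versions:
            if _digest(previous_hash, content) != stored_hash:
                return False
            previous_hash = stored_hash
        return True


# Illustrative usage with placeholder content.
advice = KnowledgeObject("https://example.org/knowledge/advice")
advice.publish("version 1 of the advice")
advice.publish("version 2 of the advice")
pointer = advice.uri  # this is what we pass on, not the text itself
```

In this sketch, silently editing an old version makes is_intact() return False. Someone able to rewrite every stored hash could still hide a change, so a real system would also need to sign the hashes or anchor them somewhere the editor cannot reach.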

To complicate it further, people, including myself, love detecting patterns in pieces of information, combining knowledge objects and inferring new pieces of knowledge. We need to make sure we can back-track these chains of inference to the original facts and ground truth, in line with what Hans Rosling said in Factfulness. A tiny tweak in a piece of information along the chain of reasoning may lead to an incorrect decision at the end of the reasoning chain.
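
Back-tracking a chain of inference can be sketched in a few lines of Python, assuming that every inferred statement keeps references to the statements it was derived from. The Statement class, the ground_truths function and the placeholder statements are hypothetical and only illustrate the principle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """A piece of knowledge; an empty `sources` tuple marks a ground-truth fact."""
    text: str
    sources: tuple = ()  # the Statements this one was inferred from


def ground_truths(statement: Statement) -> set:
    """Walk the inference chain back to the original facts it rests on."""
    if not statement.sources:
        return {statement.text}
    found = set()
    for source in statement.sources:
        found |= ground_truths(source)
    return found


# Placeholder facts and inferences, purely illustrative.
fact_a = Statement("observation A")
fact_b = Statement("observation B")
pattern = Statement("pattern inferred from A and B", (fact_a, fact_b))
decision = Statement("decision based on the pattern", (pattern,))

origins = ground_truths(decision)  # the facts the decision ultimately depends on
```

Because each statement points at its sources instead of copying them, a conclusion that cannot be traced back to any ground-truth fact stands out immediately.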

The tiny tweaks may be intentional or unintentional. Either way, a minor variation of the ground truth or an error in the reasoning chain may lead to wrong decisions at the end of the reasoning process. When this process concerns the life and well-being of people, business-critical decision-making, or societal challenges, it needs to adhere to certain principles:

  • Data should never be copied. Send pointers to the data, not copies.
  • Traceability and explainability in decision-making need to be in place.
  • In the search for an optimal decision, don’t experiment on a live system without boundary conditions.
  • Back-tracking should be possible.
  • Mechanisms for resolving conflicts should be in place.
  • Mechanisms for detecting tweaks in data should be in place.
  • Mechanisms for reversing decisions should be in place.