Limits to Sharing

The vision of a global data network, as conceived by Tim Berners-Lee, has captivated the Semantic Web community for years. It's a bold dream: a web not just of text but of interconnected data! However, in the intervening years, the power law inherent in scale-free networks has taken the web's journey in an unexpected direction, creating massive 'gravity wells' that draw everything into them. Perhaps refusing to share data in the open was not selfish but, in fact, wise: before sharing, trust must be established.

Today, we're inching closer to the original dream of the Semantic Web. I believe some form of interconnected data future is inevitable. There are deep evolutionary forces at work here. As OpenAI's Ilya Sutskever reportedly remarked, "The models want to learn!"

How we shape the underlying scale-free data network is critical. There can be little doubt that if Tim Berners-Lee had known then what he knows now, he would have designed the web in such a way that we each held the keys to our personal data vaults. That horse seems to have bolted, but what lessons can we learn for the dream of the Semantic Web?

I think the key lies in organisations and how they approach data privacy as they enter the age of AI. Organisations everywhere are scrambling to deploy LLMs, but LLMs are vulnerable to adversarial attacks: they can be tricked into revealing things they shouldn't. Organisations must, therefore, enforce strict reuse policies around all datasets retrieved to augment LLM generation, and generated responses must strictly adhere to the organisation’s ontological 'world view.'

The Semantic Web already addresses these problems very neatly:

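⭕ Clear and nuanced privacy and reuse policies, expressed in ODRL (Open Digital Rights Language).
⭕ A Directed Acyclic Graph (DAG) of dataset lineage, described with DCAT (Data Catalog Vocabulary).
⭕ An ontological 'world view' defined with OWL (Web Ontology Language) and constrained with SHACL (Shapes Constraint Language).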
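
To make the idea a little more concrete, here is a minimal Python sketch of how reuse policies and lineage might combine before any dataset is handed to an LLM for retrieval. It is an illustration only, not real ODRL or DCAT: the `Policy` and `Dataset` classes, the purpose names and the example datasets are all hypothetical, and a real system would express them as RDF using the ODRL and DCAT vocabularies. The assumption is that a derived dataset can never be used more freely than the datasets it was derived from, and that lineage forms a DAG.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass(frozen=True)
class Policy:
    """Toy stand-in for an ODRL policy: the purposes a dataset may be used for."""
    permitted_purposes: FrozenSet[str]


@dataclass
class Dataset:
    """Toy stand-in for a DCAT dataset with lineage edges to its sources."""
    name: str
    policy: Policy
    derived_from: List["Dataset"] = field(default_factory=list)


def effective_purposes(dataset: Dataset) -> FrozenSet[str]:
    """Intersect a dataset's own permissions with those of every ancestor.

    Assumes the lineage graph is acyclic, so the recursion terminates.
    """
    allowed = dataset.policy.permitted_purposes
    for parent in dataset.derived_from:
        allowed = allowed & effective_purposes(parent)
    return allowed


def retrievable(datasets: List[Dataset], purpose: str) -> List[Dataset]:
    """Keep only the datasets whose effective policy permits this purpose."""
    return [d for d in datasets if purpose in effective_purposes(d)]


# Hypothetical datasets: a summary derived from HR records inherits their limits.
hr_records = Dataset("hr_records", Policy(frozenset({"internal_analytics"})))
product_docs = Dataset(
    "product_docs",
    Policy(frozenset({"internal_analytics", "customer_chatbot"})),
)
staff_summary = Dataset(
    "staff_summary",
    Policy(frozenset({"internal_analytics", "customer_chatbot"})),
    derived_from=[hr_records],
)

allowed = retrievable([hr_records, product_docs, staff_summary], "customer_chatbot")
print([d.name for d in allowed])  # ['product_docs']
```

Notice that `staff_summary` claims to be usable by the customer chatbot, yet it is filtered out because its source, `hr_records`, is not. The lineage graph, rather than the label on any single dataset, decides what the model is allowed to see.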

With a 'Semantic Data Mesh' in place, organisations can begin to share data safely, cooperating with other organisations. Globally, this would create a 'network of networks,' where each private network has its own discrete information boundary. This is a ubiquitous and powerful pattern.

It is only by first establishing trust that we can truly begin to share our data. So, ironically, the way to achieve a global network of our data is first to learn to protect it. The path to freely sharing data is paved with privacy!

⭕ Power Laws in Networks: https://www.knowledge-graph-guys.com/blog/power-laws
⭕ A shallow version of the semantic web is already here: https://w3techs.com/technologies/details/da-jsonld
⭕ Tim Berners-Lee: we need to re-decentralise the web: https://www.wired.com/story/tim-berners-lee-reclaim-the-web/
⭕ Are Aligned LLMs “Adversarially Aligned”? https://www.youtube.com/live/uqOfC3KSZFc
⭕ Semantic Data Mesh: https://www.knowledge-graph-guys.com/blog/the-semantic-layer
