Wolfram does a pretty good job parsing the information in its own databases, but those databases will never match what is available on the Web. Wolfram’s databases currently store only 10 terabytes of information, a tiny fraction of what is on the Web. (I will be posting my impressions of Wolfram’s search engine soon). Google Squared is an early attempt to take the messy data which exists on the Web and place it into simple tables. It is still very experimental and isn’t always on target, but you can see where this is going. Turning the Web into a giant database will crush any attempt to segregate the “best” information into a separate database so that it can be processed and searched more deeply.

I don’t think it’s a foregone conclusion that anything will be “crushed” – as the focus shifts from Web pages to data, for example, many traditional data management concerns still apply, e.g., “garbage in, garbage out”.

There is great synergy at the intersection of relations and resources, in any case, and in many respects we’re still in the early chapters of this story.

On a related note, see GigaOM on Sir Tim Berners-Lee and Linked Data – it looks like a simple vocabulary change – from “Semantic Web” themes to data-centric concepts and real-world examples – is going to make a profound difference (even if many of the underlying concepts stay the same).

