Keep watching the skies – tag clouds as predictors of emergent fads

Paul Raven @ 16-06-2009

Every day, I spend a couple of hours digging through my RSS subscriptions for interesting stories, some of which I use here at Futurismic and most of which I store away at del.icio.us as research material (you know, for those fiction pieces that I keep meaning to find time to write… ahem). [image by ottonassar]

I’m a big fan of tagging my links because it enables me to trawl through the stored pieces (mine, and other people’s as well) by context and related topics, but it turns out there’s a greater benefit – user folksonomies on social bookmarking sites can be used to track and predict emerging trends and fads using mass data analysis:

The researchers tracked different users and noted the submissions they made, as well as the tags used on those posts. Taking this data, they could see what tags were frequently used in correlation with one another. This created a “co-occurrence network,” which assigns weight to tags based on how often the tag was used and how many different users applied it.
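To make the idea of a co-occurrence network a bit more concrete, here is a minimal Python sketch. It is not the researchers' code: the sample posts, the user names and the `build_cooccurrence` helper are all invented for illustration, and because the excerpt doesn't say how usage counts and user counts are combined into a weight, the sketch simply records both numbers for each pair of tags.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical sample data: one (user, tags) entry per bookmarked post.
posts = [
    ("alice", ["python", "web", "tutorial"]),
    ("bob", ["python", "web"]),
    ("alice", ["python", "data"]),
    ("carol", ["web", "design"]),
]

def build_cooccurrence(posts):
    """Count, for every unordered tag pair, how many posts and how many users used both."""
    pair_counts = Counter()         # pair -> number of posts carrying both tags
    pair_users = defaultdict(set)   # pair -> set of users who applied both tags
    for user, tags in posts:
        # Sort the de-duplicated tags so each pair has one canonical ordering.
        for a, b in combinations(sorted(set(tags)), 2):
            pair_counts[(a, b)] += 1
            pair_users[(a, b)].add(user)
    return pair_counts, pair_users

pair_counts, pair_users = build_cooccurrence(posts)
for pair, count in pair_counts.most_common():
    print(pair, "posts:", count, "users:", len(pair_users[pair]))
```

Each tag pair becomes an edge in the network; the two counts are the raw ingredients from which some weighting scheme could be built.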

With this information, it was possible to conduct a random walk (stepping randomly from one tag to another) and note how tags that occur together can form an otherwise undetectable semantic chain. These tags, based on their association with one another, allowed the researchers to follow along as one popular trend gradually replaced its predecessor.
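A random walk over such a network is easy to sketch. The graph below is made up, and choosing the next tag in proportion to edge weight is an assumption on my part (the excerpt only says the steps are random), so treat this as an illustration rather than the study's method.

```python
import random

# Hypothetical weighted co-occurrence graph: graph[tag][neighbour] = weight.
graph = {
    "python": {"web": 2, "data": 1},
    "web": {"python": 2, "design": 1},
    "data": {"python": 1},
    "design": {"web": 1},
}

def random_walk(graph, start, steps, rng):
    """Take `steps` steps from `start`, picking each neighbour in proportion to edge weight."""
    path = [start]
    current = start
    for _ in range(steps):
        neighbours = list(graph[current])
        weights = [graph[current][n] for n in neighbours]
        current = rng.choices(neighbours, weights=weights, k=1)[0]
        path.append(current)
    return path

rng = random.Random(0)  # seeded so repeated runs give the same walk
walk = random_walk(graph, "python", 10, rng)
print(" -> ".join(walk))
```

Running many such walks and comparing which tags turn up together is the sort of bulk examination the excerpt describes.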

When comparing individual random walks with one another, researchers noted that tags that appear close together in a non-obvious semantic network were likely to be visited by the same user, and tags that were far apart were visited together less often. Although no individual user might be aware of following these obscure connections, they became obvious when the data was examined in bulk.

[…]

The applicability of Heaps’ law to Internet tags was noted in particular. Heaps’ law states that the number of distinct words used in a body of text grows sublinearly relative to the size of the text: bigger texts have more diverse vocabulary, but there are diminishing returns as things scale up. Likewise, the number of unique tags on del.icio.us and BibSonomy grows nearly linearly relative to the total number of tags; that is to say, our interests and the vocabulary used to describe them grow directly along with the Internet. It isn’t all just lolcats and musical parodies, even though it might seem so sometimes.
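Heaps’ law is usually written as V = K · n^β, where n is the total number of words (or tags), V is the number of distinct ones, and β sits between 0 and 1. The sketch below, using a made-up tag stream and hypothetical helper names, counts distinct tags as the stream grows and estimates β as the slope of a log-log least-squares fit. An exponent near 1 would mean almost linear growth; a smaller one means stronger diminishing returns.

```python
import math

def vocabulary_growth(tokens):
    """Entry i is the number of distinct tokens seen among the first i + 1 tokens."""
    seen = set()
    growth = []
    for tok in tokens:
        seen.add(tok)
        growth.append(len(seen))
    return growth

def fit_exponent(growth):
    """Least-squares slope of log(distinct count) against log(total count)."""
    xs = [math.log(n) for n in range(1, len(growth) + 1)]
    ys = [math.log(v) for v in growth]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Hypothetical tag stream, in the order the tags were applied.
stream = ["python", "web", "python", "data", "web", "design", "python", "web"]
growth = vocabulary_growth(stream)
print(growth)
print(round(fit_exponent(growth), 2))
```

Eight tags is far too few to say anything real, of course; the same two functions would need a full bookmarking history before the fitted exponent meant much.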

This fascinates me, because it confirms as a real phenomenon something that I always dismissed as a fallacy born of close involvement; scanning close to a thousand RSS feeds a day from a variety of sources and covering a variety of subjects gives me a sense of being able to observe trends bubbling up out of the web’s chaotic maelstrom. I get a real kick out of watching a story or meme moving from low-level niche sites into the wider world of the web, and seeing new obsessions gather popularity.

And talk about hindsight – if I’d thought about it, I’d have seen the economic collapse coming six months or more before it bit, and shifted all my investments somewhere safer. If I’d had any investments, that is…

Of course, this sort of trend analysis could probably be used for profit or surveillance purposes as well as the more abstract goals of research and cultural analysis, but if you haven’t realised that the internet is the ultimate double-edged sword by now… well, you’ve not been following along with my links, have you? 😉


One Response to “Keep watching the skies – tag clouds as predictors of emergent fads”

  1. Evil Rocks says:

    I screamed (as politely as permitted in an old Jewish family) at my investing kin to get the fuck out and get the fuck out now, citing the bubble-type growth in oil and the unreal behavior of the housing market, for months leading up to the actual bust.

    Do you know how hard it was to not dance all over the fancy silver for the entirety of Turkey Week? I drank myself into silence (Maker’s Mark at 40/750 is well within reach for the fam despite halving of long-term investment dollars, and my mother makes a killer Manhattan) and confined smartass investment suggestions to “gold” and “treasury bonds, provided Obama doesn’t try to inflate his way out of the mess, accidentally triggering runaway inflation”.

    Despite what I considered formidable restraint over the “told you so” dance, my uncle took me aside at one point and told me to stop it, that I’d made my point, and that unless I was going to actually contribute to the discussion I should ask for another cocktail.

    They still don’t think I’m serious about runaway inflation. Lawyers.