'Slashtags' for citizen editors

November 09, 2009

Updated Nov 16, 2009: @chrismessina created a wiki for the Twitter syntax http://microsyntax.pbworks.com/Slashtags

The NYT reported today on how the #fthood hashtag has failed:

Until lately, the main way to make sense of an urgent outpouring of tweets on a particular subject was to use text searches: look for the phrase 'Fort Hood,' for example, or maybe an agreed-upon label, '#fthood,' within tweets. Yet during events like the shootings on Thursday at Fort Hood that left 13 people dead, this method is useless. Hundreds of 'relevant' tweets pop up every minute, most repeating the same news reports over and over again or expressing concern from far away.

From "Refining the Twitter Explosion" on nyt.com

I believe that there is an enormous potential to do citizen journalism better on the web, and that we need the leadership of people who are willing to help clean up the mess. Unlike some people, I do not think that the poor citizen journalism around #fthood is an indictment of citizen journalism — rather I would say it points to the absence of citizen editors.

In the Vote Report and Swift parlance, these are "Sweepers," the custodians working to clean the stream, validate claims, and generally insert some professionalism.

Taken to their logical next step, you can see the emergence of volunteer "citizen editors," who appreciate journalistic rigor and take time to bring signal to the noise in dozens of different ways.

Recently around Meedan we have been talking a lot about using Delicious and Twitter tagging to more effectively manage our content across our many networks, and to bring more meaningful conversations to our users.

This is the power of tags: they are impossible to contain in a single network.

By relying on Delicious and other social bookmarking systems, we've been able to build our editorial backchannel into numerous social platforms. Rather than being stuck with the limitations of some CMS and having to copy everything out to our social network, we can use the social network and then bring it into our own domains.

That's always a smart approach for nonprofits, because it builds your conversation in a meaningful and searchable way. Metadata value (real usable value!) accumulates like interest in your bank account. And citizen editors are the people who are trying to make this system provide even more of a return, because fundamentally we want more people to care, understand and take action.

Twitter Lists Taken Seriously

So we've been looking into some of the existing pseudo-standards like the #hashtag, and looking for ways to improve our journalistic rigor. George recently posted about using the new Twitter Lists feature to curate groups of sources for our Iran Twitter feed:

Rather than treating our Twitter list as a gizmo, with shoddy maintenance and dubious output, what if we put some rigor into it by beginning with Journalism 101?

George, our lead editor, knows this stuff all too well:

What is the reported location of the Twitter Stream? Is the Twitter Stream using Farsi or a local language? How long has the Twitter Stream account been up and running?

(And oh yes there are many more criteria.)

I think these are good, basic questions that some organizations may not be answering, and their lists are thus quantifiably worse, in the sense that they are less reliable, less meaningful, and probably noisier. So we can see that by following basic journalistic standards, your attention data becomes more valuable. Garbage in, garbage out, or, more positively, the system can be improved.

For nonprofits, which typically do not have a microgram of energy to spare, these kinds of tricks can be really helpful.

#hashtags and /slashtags

A great example of this type of "attention data enhancement" is the #hashtag, which clarifies the context of a short statement on Twitter with a globally recognizable tagging syntax. (I'll spare us the debate around hashtags, but suffice it to say, they can be done better.)

Chris Messina, one of the biggest advocates of #hashtags and other microsyntax, has just described a few extra bits of attribution using the "slasher." (I think we could just call it a "slashtag.")

'Pointers' are short words with different intentions. A group of pointers should typically be prefixed by ONE slasher character. You can daisy-chain multiple pointer phrases together, padded on both sides with one whitespace character. There should be NO space following the slasher. Hashtags should be appended to the very end of a tweet, except when they are part of the content of the message itself and indicate some proper name or abbreviation. Normal words that would be part of the content of a tweet anyway SHOULD NOT be hashed.

From "New microsyntax for Twitter: three pointers and the slasher"

In particular, I think using /by is a great way to attribute an article or direct quote.

Using /by gives a very specific meaning to the username that follows it. It's intuitive enough that I don't think it even needs to be explained; you can just read it:


Not beautiful, but very clear.

This is useful for when you need to be more precise — say, if you wanted to use your attention data in another application.
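As a rough sketch of what "using your attention data in another application" might look like, here is a small Python parser for the slasher syntax quoted above. The function name `parse_tweet`, the set of pointer words, and the sample tweet are all my own assumptions for illustration; only /by is discussed in this post, and this is not an official implementation of the microsyntax.

```python
# Pointer words recognised by this sketch. This set is an assumption:
# only /by is discussed in the post itself.
POINTER_WORDS = {"by", "via", "cc", "for"}


def parse_tweet(tweet):
    """Split a tweet into message text, slashtag pointers, and trailing hashtags."""
    words = tweet.split()

    # Hashtags are expected at the very end of the tweet, so peel them off first.
    hashtags = []
    while words and words[-1].startswith("#"):
        hashtags.insert(0, words.pop())

    # The slasher is the first word made of "/" plus a known pointer word
    # (no space after the slash, as the quoted rules require).
    message, tail = words, []
    for i, word in enumerate(words):
        if word.startswith("/") and word[1:] in POINTER_WORDS:
            message, tail = words[:i], words[i:]
            break

    # After the slasher, pointer words can be daisy-chained without
    # repeating the slash; everything between them belongs to the pointer.
    pointers = {}
    current = None
    for word in tail:
        name = word.lstrip("/")
        if name in POINTER_WORDS:
            current = name
            pointers.setdefault(current, [])
        elif current is not None:
            pointers[current].append(word)

    return {"message": " ".join(message), "pointers": pointers, "hashtags": hashtags}


# Hypothetical tweet, for illustration only.
example = parse_tweet(
    "Slashtags for citizen editors http://example.com "
    "/by @unthinkingly via @chrismessina #nptech"
)
```

Note that the plain word "for" in the message is left alone, because a pointer group only starts at a word that begins with the slash.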

For us at Meedan, this is the direction we are headed, fast. We are working on developing a clear and simple standard for using tags on the Delicious network. This standard will be something that our editorial team (and anyone who cares to participate) can use to route information to our hand-curated database. You don't have to leave the comfort of your own Twitter client or use any fancy tools, just the simple, clear standards that we are figuring out.

We are already making great use of social bookmarks at Meedan as an editorial backchannel. For example, you can see all of Meedan's Iraq sources on Delicious, from our lead editor:


And everything that the Meedan user unthinkingly (me) has tagged as being generically "for meedan" (using an informal tag "for_meedan").


Because George also uses this tag, we can get a nice community of practice working together. This page shows the shared pool:


So, as you can see, we are using underscores, which is a common tagging convention because it looks like a space. We're not so happy with this: it's simply not expressive enough.

(Even though you can do a lot with a single little shared tag like #nptech.)

A more robust tagging system, which I believe would be very compelling if it were well designed, would extend some of this syntax. The question is: how to extend the syntax without making it overwhelming?

Setting some goals

I think that any tag needs to follow a standard that meets several criteria:

  1. It should read naturally when spoken out loud (no dots, equals signs, or weird abbreviations)
  2. It should be as cross-network as possible (for now the syntax should not break on Twitter or Delicious) [1]
  3. It should rely on aliases instead of strict taxonomies (tag first, fix it later)
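To make those goals a little more concrete, here is a minimal Python sketch of two of them: a check for characters that don't read naturally, and an alias lookup for "tag first, fix it later." The names `reads_naturally`, `canonical`, and the contents of `ALIASES` are hypothetical; a real alias table would be whatever the editors agree on after the fact.

```python
# Characters ruled out by the first goal (dots and equals signs).
FORBIDDEN_CHARACTERS = {".", "="}

# Hypothetical alias table: each key is a variant someone might type,
# each value is the canonical tag settled on later.
ALIASES = {
    "for_meedan": "/for/meedan",
    "formeedan": "/for/meedan",
}


def reads_naturally(tag):
    """Return True if the tag has no dots, equals signs, or whitespace."""
    if any(ch in FORBIDDEN_CHARACTERS for ch in tag):
        return False
    # Whitespace is also rejected here, since Delicious tags cannot contain spaces.
    return not any(ch.isspace() for ch in tag)


def canonical(tag, aliases=ALIASES):
    """Map a tag to its canonical form, leaving unknown tags untouched."""
    key = tag.lower()
    return aliases.get(key, key)
```

The point of the alias lookup is that nobody has to get the tag right at the moment of tagging: `canonical("For_Meedan")` and `canonical("formeedan")` both end up in the same place.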

So what I'm talking about is extending the tag that George used to curate Iraqi newspapers, iraq_newspaper to something like this:


which I think has several advantages.

  1. It splits up the parts of the tag in ways that make the taxonomy immediately clearer: Iraq is nested "inside" a type of source.
  2. It works on Twitter
  3. It works on Delicious
  4. It is still very short (adds only one character over the underscore)

On Delicious, spaces are not allowed, so I have started using two slashes. So where previously I might have tagged the article with a kind of meaningless tag:


but now I can tag it


Which is still a pretty meaningless tag, but is at least prefixed meaningfully to mean "this content is by this person" as per Chris's helpful article above.

Also I can improve the previous technique of using the for_meedan shorthand





Which has the benefit of being equally readable, while obeying a more general rule of syntax.
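For anyone who wants to migrate old tags mechanically, here is a small Python sketch. It assumes the Delicious form is /for/meedan (two slashes, no spaces), and the helper names `to_slashtag` and `split_slashtag` are made up for this example.

```python
def to_slashtag(underscore_tag):
    """Turn an underscore tag such as 'for_meedan' into '/for/meedan'."""
    parts = [part for part in underscore_tag.split("_") if part]
    return "/" + "/".join(parts)


def split_slashtag(tag):
    """Return the segments of a slashtag, outermost first."""
    return [part for part in tag.split("/") if part]


converted = to_slashtag("for_meedan")      # '/for/meedan'
segments = split_slashtag("/for/meedan")   # ['for', 'meedan']
```

This is a purely mechanical conversion: it won't reorder the parts of a tag, and it would wrongly split a username that itself contains an underscore, so some hand-fixing would still be needed.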

Machine tags are not what we want, we are not machines

By far the most complete standard that is being used to solve these problems is the machine tag. This tag uses a colon and an equals sign to indicate a much more specific (though not necessarily accurate) structure. The history is from the geo community, mostly for this:


These namespaced key-value pairs are admirably used as the output of some web apps, but are quite intimidating for human input.

Common opinion seems to be that they are too "dorky" to be usable at this point, considering especially that any good taxonomy is constantly in slight flux. (Though Flickr has made great use of them to kick off custom actions in their UI.)
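To show why machine tags suit programs better than people, here is a minimal Python sketch that takes one apart. It assumes the usual namespace:predicate=value shape (a colon, then an equals sign); the sample tag `geo:lat=57.64911` and the function name `parse_machine_tag` are illustrations of mine, not something taken from a spec.

```python
import re

# namespace:predicate=value, where namespace and predicate contain
# neither ":" nor "=". The value may contain anything.
MACHINE_TAG = re.compile(
    r"^(?P<namespace>[^:=]+):(?P<predicate>[^:=]+)=(?P<value>.*)$"
)


def parse_machine_tag(tag):
    """Return the three parts of a machine tag as a dict, or None if it is not one."""
    match = MACHINE_TAG.match(tag)
    if match is None:
        return None
    return match.groupdict()


parts = parse_machine_tag("geo:lat=57.64911")
```

Trivial for a machine to read, but not something most people would want to type into a bookmarking form.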

Similarly, what might be called a "double tag" is an interesting simplification down to a context-less key-value pair:


In fact this is what comprises almost all of the tags in OSM, one of the most ambitious tagging innovations on the web. (I have said before that tagging is the secret sauce that makes a crazy project like OSM work.)

Finding a balance

Replace the equals sign in that last example, and you have slashtags, which I think are much better at communicating that "color" is a parent of the "red" value:


In this way, this "slashtag" or "slasher" approach, extended with a tiny bit of folksonomic convention, could really strike the right balance between editorial simplicity and powerful machine-readability.
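Here is a small Python sketch of that machine-readability, assuming a double tag looks like color=red. It swaps the equals sign for a slash and then folds a pile of slashtags into a nested structure; the function names and sample tags are hypothetical.

```python
def double_tag_to_slashtag(tag):
    """Replace the first '=' in a double tag with a slash; leave other tags alone."""
    key, separator, value = tag.partition("=")
    if not separator:
        return tag
    return f"{key}/{value}"


def build_tree(slashtags):
    """Fold slashtags into nested dicts, so each parent contains its children."""
    tree = {}
    for tag in slashtags:
        node = tree
        for segment in tag.split("/"):
            if not segment:
                continue  # skip the empty piece before a leading slash
            node = node.setdefault(segment, {})
    return tree


tree = build_tree([
    double_tag_to_slashtag("color=red"),
    "color/blue",
    "/for/meedan",
])
```

The result nests "red" and "blue" under "color" and "meedan" under "for", which is the parent-and-child reading the slash suggests to a human anyway.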

Finding better editorial tools for realtime crises

I think that a better-defined tagging approach could really help make sense of critical, breaking news.

A wiki about Hurricane Ida, for example, is probably not the right way to manage news about a critical event:

[http://farm3.static.flickr.com/2712/4089097932_350f83174c.jpg Ida]

MediaWiki makes me groan just looking at it. I'd much rather help update that information by tagging links into Delicious, and knowing that someone is listening on the other end. This would motivate me to learn the emergent standards, follow a loose taxonomy, and generally try to be more articulate.

If we could react in realtime to create a more sophisticated picture of the news by expressing ourselves more clearly in the tagging interaction, I think we could ultimately make great strides in improving citizen journalism (even if all the idiots keep on tweeting, which, naturally, they will.)

This is why the usability of a citizen editor tagging scheme is so critical: it needs to be flexible enough (to handle hurricanes) but maintain a low barrier to participation (to cultivate citizen editors). The tagging approach has already proven itself in many trivial domains; now we need to step it up using our journalistic standards and our shared interest in making sense of the news, particularly crises.

We are early in this strange distributed crisis data management effort, but I think that some of the ideas proposed by Chris Messina, and the experiments of the OSM community, go a really long way in this regard. In particular, nestability and readability seem like great virtues of this tagging system. Overall the "slash" is a widely understood metaphor, used by all major operating systems to indicate traversing "down" or "up" a taxonomy.

I'm going to transition some of my tagging habits accordingly, and see where it ends up!

I would love to know what you think. Stop by the contact page or find me at @unthinkingly on Twitter and let me know.

[1] Notice how it breaks on gnolia.com and breaks on flickr.com, although it appears that Flickr preserves the slashes in the background and just doesn't display them on output.

[2] On Twitter there is a bit of a variation required if we are to follow existing patterns: 1.) I can omit the space, so I will, and 2.) you need to prefix a user's name with the @ sign, like /for @meedan. I think this is still quite readable, but the difference between networks might need to be cleared up. We could in fact collapse the Twitter tags to /for/meedan (i.e., identical to the Delicious tag), but this would probably break some automation in Twitter clients that are expecting the @ prefix.