From Nieman Lab:
Ethan Zuckerman of the MIT Center for Civic Media taught a class this semester tailor-made for Nieman Lab readers: “News in the Age of Participatory Media.” The hook: What happens if you treat journalism as an engineering problem, bringing together the efforts of journalists and computer scientists?
The course’s final class last week featured a lot of bright students presenting their final projects, which were supposed to be new tools, techniques, or technologies for reporting the news. (They were in various stages of completion.) I’ll be breaking out a few of the good ideas in future posts, but here are some of the ones that stood out to me.
Modernizing the hyperlink
The <a> tag hasn’t changed much since Tim Berners-Lee proposed it 20 years ago. Hyperlinks are the fiber of the web. But Neha Narula, a Ph.D. student in computer science at MIT, finds herself frustrated with writers who abuse them. Blog posts littered with too many links lead to “cognitive overload,” she says. “As I explored this topic a little more,” she said, “I found what I was annoyed with was not linking too much but not linking well.” If Google is mentioned in copy, does Google have to be linked to the Google home page? Does the same link need to appear multiple times in one story?
Narula proposed the use of microformats and the little-known rev attribute to attach semantic meaning to links, allowing browsers to handle different kinds of links differently. (rev is supposed to represent a reverse link. All major browsers, when faced with a rev attribute now, just ignore it. It’s like a cousin to rel.)
For example, a link to a citation (dictionary definition, Wikipedia article) would get rev="bib", for bibliography. A link marked that way might be presented not in the body copy, but at the bottom of the post, in the form of a tidy bibliography.
She also proposes rev="reaction", which would clearly call out the original post an article is responding to; and rev="object" for links to people and companies, which would facilitate an index for all of the proper nouns in a piece.
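To make the idea concrete, here is a minimal sketch of how a tool might sort links by their rev value and pull the rev="bib" ones into a bibliography. It is not Narula's implementation: the class and function names, the "plain" bucket for unmarked links, and the sample markup are all assumptions made for illustration. It uses only Python's standard-library HTML parser.

```python
from collections import defaultdict
from html.parser import HTMLParser


class RevLinkCollector(HTMLParser):
    """Group the links in an HTML fragment by the value of their rev attribute."""

    def __init__(self):
        super().__init__()
        self.links_by_rev = defaultdict(list)  # rev value -> [(href, link text)]
        self._current = None  # (rev, href, text parts) while inside an <a> tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attr = dict(attrs)
            # Assumption: links with no rev attribute are filed under "plain".
            self._current = (attr.get("rev") or "plain", attr.get("href") or "", [])

    def handle_data(self, data):
        if self._current is not None:
            self._current[2].append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            rev, href, parts = self._current
            self.links_by_rev[rev].append((href, "".join(parts).strip()))
            self._current = None


def collect_links(html):
    """Return a dict mapping each rev value to its list of (href, text) pairs."""
    collector = RevLinkCollector()
    collector.feed(html)
    collector.close()
    return dict(collector.links_by_rev)


def bibliography(html):
    """Format the rev="bib" links as numbered bibliography lines."""
    bib_links = collect_links(html).get("bib", [])
    return [f"[{i}] {text}: {href}" for i, (href, text) in enumerate(bib_links, 1)]


# Hypothetical markup, invented for this example.
SAMPLE = (
    '<p>A <a rev="bib" href="https://example.org/hyperlink">hyperlink</a> '
    'can point to <a rev="object" href="https://example.com/">Example Corp</a> '
    'or to <a href="https://example.net/">anything else</a>.</p>'
)

for line in bibliography(SAMPLE):
    print(line)
```

The same grouping would serve the other proposed values: the "object" bucket is the raw material for an index of proper nouns, and a "reaction" bucket would identify the post being answered.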
Oh, and the biggest crowd pleaser was a feature you may love or hate: a button that toggles off all links in a document for distraction-free (or, er, context-free) reading. (Try it on this article!)
Others have proposed approaches to adding metadata to links, from nofollow to syndication-source to standout to FOAF. Zuckerman suggested Narula create WordPress and Drupal plugins to encourage adoption. Getting the rest of the web on board would be a tall order.
Searching for correlations in a haystack
Eugene Wu, a graduate student of computer science at MIT, demonstrated a suite of tools called DBTruck that makes data comparison a snap. Enter the URL of a CSV file, JSON data, or an HTML table and DBTruck will clean up the data and import it to a local database. Normally you might go to a web page like this, select and copy the table, paste it into an Excel spreadsheet, then spend 15 minutes trying to fix the misplaced cells and formatting issues. DBTruck is automated and fast.
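For a sense of what the clean-and-import step involves, here is a rough sketch using Python's standard csv and sqlite3 modules. It is not DBTruck's code, and the article does not describe how DBTruck works internally. The function names, the cleanup rules (normalized column names, padded short rows, skipped blank rows), and the sample data are assumptions chosen for illustration.

```python
import csv
import io
import re
import sqlite3


def clean_name(name, position):
    """Turn a raw header cell into a simple lowercase column name."""
    cleaned = re.sub(r"\W+", "_", name.strip().lower()).strip("_")
    # Assumption: an empty header cell gets a positional placeholder name.
    return cleaned or f"col_{position}"


def load_csv(conn, table, csv_text):
    """Create `table` from CSV text and return the number of rows inserted.

    Every column is stored as TEXT. Duplicate header names are not handled.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    columns = [clean_name(name, i) for i, name in enumerate(header)]
    width = len(columns)

    cleaned = []
    for row in body:
        if not any(cell.strip() for cell in row):
            continue  # skip completely blank rows
        # Pad short rows and trim long ones so each matches the header width.
        padded = (row + [""] * width)[:width]
        cleaned.append([cell.strip() for cell in padded])

    column_sql = ", ".join(f'"{c}" TEXT' for c in columns)
    conn.execute(f'CREATE TABLE "{table}" ({column_sql})')
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', cleaned)
    return len(cleaned)


# Made-up sample data with the usual problems: stray spaces, a blank
# line, and a row that is missing a cell.
RAW = "Town , Rate (%)\nTown A, 4.2\n\nTown B\n"

conn = sqlite3.connect(":memory:")
print(load_csv(conn, "rates", RAW))
print(conn.execute('SELECT town, rate FROM rates').fetchall())
```

Fetching the file from a URL, parsing JSON or HTML tables, and guessing column types would all sit on top of a step like this.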
The program allows you to geocode any field that contains address information, whether that field is “Cambridge, MA” or “Cambridge, Mass.” or “1 Francis Ave, Cambridge.” Humans have come up with many ways to represent physical locations, but geographic coordinates are unambiguous instructions for computers to map a location. When you’re dealing with disorganized datasets, getting consistency is key.
Wu’s tool then lets you plot arbitrary comparisons between datasets. To test the program he plugged in all kinds of datasets, just for fun. Is there a correlation between addresses of Massachusetts lottery winners and Taco Bell locations? (No.) Suicide rates and unemployment rates in New York state? (No.) Suddenly he stumbled upon a connection that made sense: In New York state communities, high teen pregnancy rates correlated strongly with low birth weights. There’s a potential story there that Wu might not have otherwise set out to write. Zuckerman advised Wu to team up with The Boston Globe to run more arbitrary comparisons and discover what local stories might be hidden in the numbers. (It also seems like a dandy add-on to the PANDA Project, which is building a platform for in-house newsroom databases.)
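The article doesn't say which statistic Wu's tool uses to score a comparison. One common choice is the Pearson correlation coefficient, sketched below in plain Python as an assumption rather than a description of DBTruck. The function name is hypothetical. A result near 1 or -1 indicates a strong linear relationship and a result near 0 indicates little or none.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys):
        raise ValueError("series must have the same length")
    if len(xs) < 2:
        raise ValueError("need at least two paired values")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Sums of products of deviations from the mean.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)

    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / math.sqrt(spread_x * spread_y)


# Toy numbers, not real data: the second series rises in step with the first.
print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))
```

Run across many arbitrary pairs of datasets, a score like this will occasionally come out high by chance, which is one reason a hit is a lead for reporting and not yet a finding.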
Continue reading the rest of the story on Nieman Lab