The Twitter Collection & Analysis Toolkit (TCAT): More Than Pretty Pictures

How a new Twitter modeling tool can make data more tangible.

TCAT Ebola Graph

By Brian Ward

Meet the Twitter Collection and Analysis Toolkit (TCAT), a powerful new modeling tool for Twitter data. It requires no programming or software knowledge and can process and archive tweets by the tens of millions for its graphs. How do you get your hands on this new tool? At the moment, the best way is to enroll at Boston University.

Jacob Groshek

“My involvement in [TCAT] began when I went to a workshop in Amsterdam last January,” said Jacob Groshek. “That’s where they built the tool itself. What we’ve done is to take that tool and install it and develop it a little bit further.”

Groshek is an assistant professor of Emerging Media at BU and oversaw the school’s acquisition and installation of TCAT last winter. Since then, he and his team at Betweetness Labs have been working with the software to discover the extent of its abilities.

“That’s not to say we can solve everything, but in the full-scale model it could be really, really powerful,” Groshek said. “I think we’re just at the tip of the iceberg.”

So what can TCAT do? You can use it to find all the hashtags related to a common subject. For example, if you type in “The Walking Dead,” it brings up hashtags such as #WalkingDead, #zombies, or #AMC. You can measure the popularity and frequency of various hashtags and create minute-by-minute timelines of Twitter activity, so you can discover when one hashtag overshadowed another, almost down to the minute. You can gauge the visibility of users by the number of mentions they receive, pull stats on each user, such as who follows them and where those followers are from, and build a web graph showing how they are all interconnected.

“I think it’s useful as a tool to find users, to engage users,” Groshek said. “We can target specific groups and clusters of users that are talking with each other about a topic in a certain way.”

Groshek says the TCAT tools he uses most often are the co-mention graph and the co-hashtag graph, which together show the most visible people within a group of hashtags and how they connect with others. He says the first application he can think of for this software is journalism, though that is not its only use.
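For readers curious about what a co-hashtag graph measures, the idea can be sketched in a few lines of Python. This is only an illustration and not TCAT's own code: the function name `co_hashtag_edges` and the sample tweets are made up for the example, and the hashtag pattern is a simplification. It counts how often each pair of hashtags appears in the same tweet, which is the raw material for the connections drawn in such a graph.

```python
import re
from collections import Counter
from itertools import combinations

# Simplified hashtag pattern (an assumption; real tokenization is more involved).
HASHTAG = re.compile(r"#(\w+)")


def co_hashtag_edges(tweets):
    """Count how often each pair of hashtags appears in the same tweet."""
    edges = Counter()
    for text in tweets:
        # Lowercase and de-duplicate the tags, then sort so each pair has one form.
        tags = sorted({tag.lower() for tag in HASHTAG.findall(text)})
        edges.update(combinations(tags, 2))
    return edges


# Made-up sample tweets, for illustration only.
sample_tweets = [
    "Can't wait for tonight #WalkingDead #AMC",
    "So many #zombies in the new #WalkingDead episode",
    "#WalkingDead marathon on #AMC all weekend",
]

edges = co_hashtag_edges(sample_tweets)
for (first, second), weight in edges.most_common():
    print(f"#{first} and #{second} appeared together {weight} time(s)")
```

Each counted pair would become a line between two hashtags in the graph, with heavier lines for pairs that show up together more often.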

“There’s no limit to what this thing can be applied to. Journalism to me is kind of an obvious one. It can help journalists find sources, it can help them grow their stories, and it can monitor their results,” Groshek said. “I think a lot of this overlays with what public relations practitioners are interested in doing, ‘how are brands being discussed,’ and we can answer those questions right away. And after that, what users are talking about our brand and how do we engage them in a way to reinforce or redirect what’s being said. At the end you can monitor how successful it’s been.”

While TCAT is not on the market yet, Groshek says he sometimes takes on projects that others bring to him, or lets them test the program themselves.

“It would be great if [it] could turn into [a] more marketable product,” he said. “We’re not quite there yet, [but] this is something I want to push out and make available.” ∞
