Changes at Twapperkeeper

Feb 25, 2011

One of the great ways of studying Twitter activity around a hash tag post-event has been via the fantastic Twapperkeeper service, which allowed you to archive and export tagged tweets, together with their metadata, in a structured way. This information was also available through an API, which fed services like Eduserv’s Summarizr to provide a quick and easy way of assessing the size and reach of your event via Twitter.

Notice the past tense there.

What has changed?

Twitter has instructed Twapperkeeper to remove its API and export functions in order to comply with their Terms of Service regarding syndication of content. This means that the only way to view a hash tag archive will be via the website. There will be no way to export and download the raw data for analysis. There will also be no API available to support tools built on top of Twapperkeeper, such as Summarizr.

The Terms of Service in question say:

You will not attempt or encourage others to… sell, rent, lease, sublicense, redistribute, or syndicate the Twitter API or Twitter Content to any third party for such party to develop additional products or services without prior written approval from Twitter.

Speaking to All Twitter, Twapperkeeper founder John O’Brien says:

“What I’m seeing is, anybody who is ‘syndicating’ content, i.e. allowing it to be downloaded and exported in any structured way, is running afoul of the terms of service. If it’s in HTML, it’s fine. The minute it became structured it became a problem.”


What does this mean for amplified events?

This change will make it harder to get detailed data about event-related Twitter traffic in order to conduct close analysis. Whilst it might feel like Twitter has been a staple part of the event experience in some quarters for a long time, we are still really in the early stages when it comes to probing the data around an event hash tag and using this either for the purposes of post-event curation or to inform future strategy. I have, as yet, seen few event organisers show an interest in anything beyond the more superficial statistical analysis offered by Summarizr. This in itself had a real value in assessing the impact of an event and making the case for further resources to support the use of Twitter at future events. However, as a professional event amplifier, I am keen to see deeper analysis of audience engagement with events via Twitter to help identify ways of supporting both the live and remote audiences more effectively, and to provide more useful curation of conferences for future reference. Twapperkeeper was really the only tool that provided the functionality to support this.


There are still interesting tools popping up all the time which can help provide some analysis of an event Twitter stream, without syndicating the original data. In particular, I am curious to investigate Twitter Sentiment in more detail, as this claims to provide a level of automatic sentiment analysis around a particular search term. There are also a number of visualisation tools around which can give you some interesting angles on your event’s Twitter data. However, these will almost certainly be based upon Twitter’s own search API, which means that they will be time sensitive, as search results only reach back for a limited period after the event. For hand-crafted, event-specific research questions, examined outside the immediate context of the event, access to the raw hash tag data in a structured form is still crucial. Whilst Twapperkeeper archives will still be accessible in HTML at the website, extracting data to conduct such research will now be more difficult.

One option would be to use the JISC-funded Your Twapperkeeper, which allows you to run your own installation of Twapperkeeper on your own server. Provided you are doing this for personal use, it should not violate the syndication part of Twitter’s ToS. However, John O’Brien does advise proceeding in this direction at your own risk. To reduce that risk, it might be appropriate to associate your Twapperkeeper installation with an event-specific account or with a backup Twitter account, so that any potential dispute does not impact on a more highly valued account. It goes without saying that you should also make sure that you are not violating any of Twitter’s other ToS through any of your subsequent activities with the data.

The second option would be to scrape the data from the HTML page created when you view your Twapperkeeper archive online. This may take some time to set up, and will require access to someone with the necessary skills. I cannot find any reference on Twapperkeeper to suggest that this would violate the terms of use of the site, but I would recommend exploring the legal implications of this approach before taking this route.
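To give a sense of what such scraping might involve, here is a minimal sketch in Python using only the standard library. I do not know how the Twapperkeeper archive pages are actually marked up, so the "tweet" class name and the sample HTML below are purely hypothetical: you would need to inspect the real page and substitute whatever structure it uses. The sketch works on a saved copy of the page and does not fetch anything itself.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class TweetExtractor(HTMLParser):
    """Collect the text of every element carrying a given CSS class.

    The class name is an assumption, not Twapperkeeper's real markup.
    """

    def __init__(self, tweet_class="tweet"):
        super().__init__()
        self.tweet_class = tweet_class
        self.tweets = []
        self._depth = 0   # nesting depth inside the current tweet element
        self._parts = []  # text fragments of the current tweet

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            if self._depth:
                self._parts.append(" ")  # treat <br> and friends as a space
            return
        if self._depth:
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.tweet_class in classes:
            self._depth = 1
            self._parts = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            # Collapse runs of whitespace into single spaces.
            self.tweets.append(" ".join("".join(self._parts).split()))

    def handle_data(self, data):
        if self._depth:
            self._parts.append(data)


def extract_tweets(html, tweet_class="tweet"):
    """Return the text of each hypothetical tweet element in the page."""
    parser = TweetExtractor(tweet_class)
    parser.feed(html)
    parser.close()
    return parser.tweets


# Invented sample standing in for a saved archive page.
SAMPLE_HTML = """
<div class="tweet"><b>@alice</b>: Great keynote at #myevent<br>More soon</div>
<div class="sidebar">Not a tweet</div>
<div class="tweet"><b>@bob</b>: Slides are up &amp; online #myevent</div>
"""

if __name__ == "__main__":
    for tweet in extract_tweets(SAMPLE_HTML):
        print(tweet)
```

A scraper like this is tied to the page layout, so any redesign of the site would break it. That fragility is part of the cost of this option.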

The third option would be to use a tool like CoverItLive to collect your own record of event tweets. CoverItLive offers an RSS feed of all updates, so if you set up your live session to pull in all tagged tweets, you could use the resulting RSS data for analysis. This will last longer than the Twitter search RSS feed, as shown by comparing the #iwmw10 Twitter RSS feed to the IWMW10 CoverItLive RSS feed, seven months after the event. The RSS data is already structured for you, making this a much more viable workaround. However, I do wonder whether this type of functionality also contravenes Twitter’s ToS and how strictly they are likely to enforce this issue with services, such as CoverItLive, which combine content from multiple sources to form their feeds. Twapperkeeper itself provides an RSS Permalink with each archive, but it is not yet clear whether this will be affected by the changes, as this could also be viewed as a means of syndicating structured Twitter content.
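As a rough illustration of why structured RSS data is easier to work with, the following Python sketch turns a feed into CSV rows using only the standard library. It assumes the feed is plain RSS 2.0 with the standard title, link and pubDate elements inside each item. I have not checked what the CoverItLive feed actually contains, and the sample feed and example.com links below are invented for the purpose.

```python
import csv
import io
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Invented sample standing in for a saved copy of an event feed.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example event feed</title>
    <item>
      <title>@alice: Great keynote at #myevent</title>
      <link>http://example.com/1</link>
      <pubDate>Mon, 12 Jul 2010 10:15:00 +0000</pubDate>
    </item>
    <item>
      <title>@bob: Slides are up #myevent</title>
      <link>http://example.com/2</link>
      <pubDate>Mon, 12 Jul 2010 10:20:30 +0000</pubDate>
    </item>
  </channel>
</rss>
"""

FIELDS = ["published", "title", "link"]


def parse_feed(xml_text):
    """Return one dict per RSS item, with the date converted to ISO 8601."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.iter("item"):
        published = (item.findtext("pubDate") or "").strip()
        rows.append({
            # RSS 2.0 dates use the RFC 822 style that email.utils understands.
            "published": parsedate_to_datetime(published).isoformat()
            if published else "",
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
        })
    return rows


def to_csv(rows):
    """Serialise the rows as CSV text, ready for a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(parse_feed(SAMPLE_RSS)))
```

Once the updates are in rows like these, counting tweets per contributor or plotting activity over the course of a session becomes a spreadsheet job rather than a programming one.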

Other Restrictions

Twapperkeeper has already limited its service to two free archives per account, with a premium subscription model for people who need to create more. The subscription helps to cover the costs associated with archiving the high volume of tweets currently handled by the service. This, in itself, demonstrates the perceived value in such a service, together with the impressive 1 billion+ tweets held in its archives.

The Bigger Picture

This move has been seen by many as another step towards Twitter cracking down on third party apps. This means that developers and users alike need to be more aware of Twitter’s ToS and tread carefully when developing a new tool based on the Twitter API, or choosing a tool as the basis of important work. This applies to event amplification, where a palette of such tools may be employed within an amplification strategy, as well as to researchers, brand managers and marketeers. Any tool could be withdrawn or changed without notice, making it vital to conduct a clear risk assessment and identify a suitable backup plan for each aspect of the strategy.

Moving Forward

Plugging the gap left by the withdrawal of this functionality will involve some creativity and will depend very much on what the event organiser wants to find out by studying their event’s Twitter data. In many cases, a well-chosen combination of visualisation tools will provide the evidence required, so my first task will be to review some of these in order to select the best ones for my clients. In other cases, access to the raw data will be the only way to satisfy the organiser’s aims, so Your Twapperkeeper, a CoverItLive RSS feed or data scraping from Twapperkeeper itself may be the only ways of providing this level of detailed information. Unless, of course, Twitter brings out something new or adjusts its terms of service.
