TwitterSearch

TwitterSearch was developed as part of a project about social media at the Carl von Linde-Akademie which is part of the Technische Universität München. Thus, it is a data collecting toolkit and not implementing the whole Twitter API but the Search API and the User Timeline API.

The library is fully accessible through the official repository at github and maintained by Christian Koepp.

It’s using the REST API in version 1.1 only. In its recent version it directly uses IDs of tweets to navigate throughout the available tweets instead of pages, which is way more comfortable to use and more convenient for really getting all possible tweets. Also, TwitterSearch is build to be highly flexible in its usage making it usable even within exotic use-cases. Details about non-default use-cases can be found in the Advanced usage sections within the class articles.

All classes and functionality is tested against the latest Python 2 and Python 3 versions automatically. The current state of all branches is visible through Travis CI. Additionally, you should note that with version 1.0 and upwards PEP-8 compatibility is enforced. Checks are done by running the pep8 toolkit.

The history of changes can be either accessed by using the official github repository or by looking at summary outlined as in the CHANGELOG.rst file within the package.

Warning

If you’re upgrading from a version < 1.0.0 be aware that the API changed! To support PEP-8 completely, former methods named someMethod() are now accessible as some_method(). Apart from this issue, four other API changes were introduced with version 1.0.0:

  • simplified proxy functionality (no usage of dicts but plain strings as only HTTPS proxies can be supported)
  • simplified geo-code parameter (TwitterSearchOrder.set_geocode(...,metric=True) renamed to set_geocode(...,imperial_metric=True))
  • simplified TwitterSearch.get_statistics() from dict to tuple style ({'queries':<int>, 'tweets':<int>} to (<int>,<int>))
  • additional feature: timelines of users can now be accessed using the new class TwitterUserOrder

In total those changes can be done quickly without browsing this documentation. If you are not able to do those changes just keep using the versions < 1.0.0. Those will stay available through pypi and therefore will be installable in the future using the common installation methods.

Table of contents

Basic usage

In most cases you probably just like to iterate through all available tweets as easy as possible. And there it is, a very minimal example to do exactly this:

from TwitterSearch import *

try:
    tso = TwitterSearchOrder()
    tso.set_keywords(['#Hashtag1', '#Hashtag2'])

    ts = TwitterSearch(
            consumer_key = 'aaabbb',
            consumer_secret = 'cccddd',
            access_token = '111222',
            access_token_secret = '333444'
        )

    for tweet in ts.search_tweets_iterable(tso):
        print('@%s tweeted: %s' % (tweet['user']['screen_name'], tweet['text']))

    except TwitterSearchException as e: # take care of all those ugly errors if there are some
        print(e)

If you’re into the access of a timeline of a certain user, you can do this by using the same pattern:

from TwitterSearch import *

try:
    # create a TwitterUserOrder for user named 'NeinQuarterly'
    tuo = TwitterUserOrder('NeinQuarterly') # is equal to TwitterUserOrder(458966079)

    # it's about time to create TwitterSearch object again
    ts = TwitterSearch(
        consumer_key = 'aaabbb',
        consumer_secret = 'cccddd',
        access_token = '111222',
        access_token_secret = '333444'
    )

    # start asking Twitter about the timeline
    for tweet in ts.search_tweets_iterable(tuo):
        print('@%s tweeted: %s' % (tweet['user']['screen_name'], tweet['text']))

except TwitterSearchException as e: # catch all those ugly errors
    print(e)

Please note those code snippes are already working examples executable in both, Python2 and Python3.

Accessible information

The creator of this library doesn’t like to hide any informations from you. Therefore the data you’ll receive is quite a lot. A typical tweet, as mentioned in the section above, consists of a huge dict.

You may ask the question “But what does this field exactly mean?”. Well, that’s where the job of TwitterSearch ends and the Twitter documentation joins the fun.

An example of how such a tweet looks like is the following dict:

       {'contributors': None,
'coordinates': None,
'created_at': 'Tue Jul 02 11:43:18 +0000 2013',
'entities': {'hashtags': [],
             'media': [{'display_url': 'pic.twitter.com/dJLxVZaSW9',
                        'expanded_url': 'http://twitter.com/EarthBeauties/status/351897277473882113/photo/1',
                        'id': 351897277478076417,
                        'id_str': '351897277478076417',
                        'indices': [78, 100],
                        'media_url': 'http://pbs.twimg.com/media/BOIwtZ2CAAEFRXU.jpg',
                        'media_url_https': 'https://pbs.twimg.com/media/BOIwtZ2CAAEFRXU.jpg',
                        'sizes': {'large': {'h': 375,
                                            'resize': 'fit',
                                            'w': 600},
                                  'medium': {'h': 375,
                                             'resize': 'fit',
                                             'w': 600},
                                  'small': {'h': 213,
                                            'resize': 'fit',
                                            'w': 340},
                                  'thumb': {'h': 150,
                                            'resize': 'crop',
                                            'w': 150}},
                        'source_status_id': 351897277473882113,
                        'source_status_id_str': '351897277473882113',
                        'type': 'photo',
                        'url': 'http://t.co/dJLxVZaSW9'}],
             'symbols': [],
             'urls': [],
             'user_mentions': [{'id': 786796010,
                                'id_str': '786796010',
                                'indices': [33, 47],
                                'name': u'Earth Pictures\u2122',
                                'screen_name': 'EarthBeauties'}]},
'favorite_count': 0,
'favorited': False,
'geo': None,
'id': 352029711347617792,
'id_str': '352029711347617792',
'in_reply_to_screen_name': 'EarthBeauties',
'in_reply_to_status_id': 351897277473882113,
'in_reply_to_status_id_str': '351897277473882113',
'in_reply_to_user_id': 786796010,
'in_reply_to_user_id_str': '786796010',
'lang': 'in',
'metadata': {'iso_language_code': 'in', 'result_type': 'recent'},
'place': None,
'possibly_sensitive': False,
'retweet_count': 0,
'retweeted': False,
'source': 'web',
'text': 'mau dong dibangunin rmh kekgini "@EarthBeauties: Hohenzollern Castle, Germany http://t.co/dJLxVZaSW9',
'truncated': False,
'user': {'contributors_enabled': False,
         'created_at': 'Sun Mar 18 04:22:51 +0000 2012',
         'default_profile': False,
         'default_profile_image': False,
         'description': u"girl non-smoking alcohol-free \u2022 @PLAYMAKERKIDSHC \u2022 DSFF \u2022 15\u221e \u2022 NotWild''",
         'entities': {'description': {'urls': []},
                      'url': {'urls': [{'display_url': 'instagram.com/giwaang',
                                        'expanded_url': 'http://instagram.com/giwaang',
                                        'indices': [0, 22],
                                        'url': 'http://t.co/vCyfkrdTwa'}]}},
         'favourites_count': 1,
         'follow_request_sent': False,
         'followers_count': 661,
         'following': False,
         'friends_count': 176,
         'geo_enabled': False,
         'id': 528140042,
         'id_str': '528140042',
         'is_translator': False,
         'lang': 'id',
         'listed_count': 1,
         'location': u"SwiekeCity\u2022PinkBabyRoom's",
         'name': 'EarStud',
         'notifications': False,
         'profile_background_color': 'BADFCD',
         'profile_background_image_url': 'http://a0.twimg.com/profile_background_images/872889954/b7439a65d39bdff360c934bd6f33c3b7.jpeg',
         'profile_background_image_url_https': 'https://si0.twimg.com/profile_background_images/872889954/b7439a65d39bdff360c934bd6f33c3b7.jpeg',
         'profile_background_tile': True,
         'profile_banner_url': 'https://pbs.twimg.com/profile_banners/528140042/1369624796',
         'profile_image_url': 'http://a0.twimg.com/profile_images/378800000047155611/7581e79882f1c9f1bbe4b706a023e2c9_normal.jpeg',
         'profile_image_url_https': 'https://si0.twimg.com/profile_images/378800000047155611/7581e79882f1c9f1bbe4b706a023e2c9_normal.jpeg',
         'profile_link_color': 'FF0000',
         'profile_sidebar_border_color': '000000',
         'profile_sidebar_fill_color': '252429',
         'profile_text_color': '666666',
         'profile_use_background_image': True,
         'protected': False,
         'screen_name': 'giwaang',
         'statuses_count': 10199,
         'time_zone': None,
         'url': 'http://t.co/vCyfkrdTwa',
         'utc_offset': None,
         'verified': False}}

Architecture

TwitterSearch consists of four classes: TwitterSearch, TwitterSearchOrder, TwitterUserOrder and TwitterSearchException.

To not repeat certain code-fragments the class TwitterOrder is also available. However, this class is rarely used directly and only contains few basic methods.

TwitterSearch

TwitterSearch package

Submodules

TwitterSearch.TwitterOrder module

class TwitterSearch.TwitterOrder.TwitterOrder

Bases: object

Basic interface class to inherit from. Methods raising NotImplementedError exceptions need to be implemented by all children

arguments = {}
create_search_url()

Generates an url-encoded query string from stored key-values tuples. Has to be implemented within child classes

Raises:NotImplementedError
set_count(cnt)

Sets ‘count’ parameter used to define the number of tweets to return per page. Maximum and default value is 100

Parameters:cnt – Integer containing the number of tweets per page within a range of 1 to 100
Raises:TwitterSearchException
set_include_entities(include)

Sets ‘include entities’ parameter to either include or exclude the entities node within the results

Parameters:include – Boolean to trigger the ‘include entities’ parameter
Raises:TwitterSearchException
set_max_id(twid)

Sets ‘max_id’ parameter used to return only results with an ID less than (that is, older than) or equal to the specified ID

Parameters:twid – A valid tweet ID in either long (Py2k) or integer (Py2k + Py3k) format
Raises:TwitterSearchException
set_search_url(url)

Reads given query string and stores key-value tuples. Has to be implemented within child classes

Parameters:url – A string containing the twitter API endpoint URL
Raises:NotImplementedError
set_since_id(twid)

Sets ‘since_id’ parameter used to return only results with an ID greater than (that is, more recent than) the specified ID

Parameters:twid – A valid tweet ID in either long (Py2k) or integer (Py2k + Py3k) format
Raises:TwitterSearchException

TwitterSearch.TwitterSearch module

class TwitterSearch.TwitterSearch.TwitterSearch(consumer_key, consumer_secret, access_token, access_token_secret, **attr)

Bases: object

This class contains the actual functionality of this library. It is responsible for correctly transmitting your data to the Twitter API (v1.1 only) and returning the results to your program afterwards. It is configured using an implementation of TwitterOrder along with valid Twitter credentials. Currently two different implementations are usable: TwitterUserOrder for retrieving the timeline of a certain user and TwitterSearchOrder for accessing the Twitter Search API.

The methods next(), __next__() and __iter__() are used during the iteration process. For more information about those methods please consult the official Python documentation.

authenticate(verify=True)

Creates an authenticated and internal oauth2 handler needed for queries to Twitter and verifies credentials if needed. If verify is true, it also checks if the user credentials are valid. The default value is True

Parameters:verify – boolean variable to directly check. Default value is True
check_http_status(http_status)

Checks if given HTTP status code is within the list at TwitterSearch.exceptions and raises a TwitterSearchException if this is the case. Example usage: checkHTTPStatus(200) and checkHTTPStatus(401)

Parameters:http_status – Integer value of the HTTP status of the last query. Invalid statuses will raise an exception.
Raises:TwitterSearchException
exceptions = {420: 'Enhance Your Calm: You are being rate limited', 502: 'Bad Gateway: Twitter is down or being upgraded', 422: 'Unprocessable Entity: Image unable to be processed', 429: ('Too Many Requests: Request cannot be served ', "due to the application's rate limit having ", 'been exhausted for the resource'), 400: 'Bad Request: The request was invalid', 401: ('Unauthorized: Authentication credentials ', ' were missing or incorrect'), 403: ('Forbidden: The request is understood, but', 'it has been refused or access is not allowed'), 500: 'Internal Server Error: Something is broken', 406: 'Not Acceptable: Invalid format is specified in the request', 503: ('Service Unavailable: The Twitter servers ', 'are up, but overloaded with requests'), 504: ("Gateway timeout: The request couldn't ", 'be serviced due to some failure within our stack'), 404: ('Not Found: The URI requested is invalid or', 'the resource requested does not exists'), 410: 'Gone: This resource is gone'}
get_amount_of_tweets()

Returns current amount of tweets available within this instance

Returns:The amount of tweets currently available
Raises:TwitterSearchException
get_metadata()

Returns all available meta data collected during last query. See Advanced usage for example

Returns:Available meta information about the last query in form of a dict
Raises:TwitterSearchException
get_minimal_id()

Returns the minimal tweet ID of the current response

Returns:minimal tweet identification number
Raises:TwitterSearchException
get_proxy()

Returns the current proxy url or None if no proxy is set

Returns:A string containing the current HTTPS proxy (e.g. my.proxy.com:8080) or None is no proxy is used
get_statistics()

Returns dict with statistical information about amount of queries and received tweets. Returns statistical values about the number of queries and the sum of all tweets received by this very instance of TwitterSearch. Example usage: print("Queries done: %i. Tweets received: %i" % ts.get_statistics())

Returns:A tuple with queries and tweets keys containing integers. E.g. (1,100) which stands for one query that contained one hundred tweets.
get_tweets()

Returns all available data from last query. See Advanced usage for example

Returns:All tweets found using the last query as a dict
Raises:TwitterSearchException
next()

Python2 comparability method. Simply returns self.__next__()

Returns:the __next__() method of this class
search_next_results()

Triggers the search for more results using the Twitter API. Raises exception if no further results can be found. See Advanced usage for example

Returns:True if there are more results available within the Twitter Search API
Raises:TwitterSearchException
search_tweets(order)

Creates an query string through a given TwitterSearchOrder instance and takes care that it is send to the Twitter API. This method queries the Twitter API without iterating or reloading of further results and returns response. See Advanced usage for example

Parameters:order – A TwitterOrder instance. Can be either TwitterSearchOrder or TwitterUserOrder
Returns:Unmodified response as dict.
Raises:TwitterSearchException
search_tweets_iterable(order)

Returns itself and queries the Twitter API. Is called when using an instance of this class as iterable. See Basic usage for examples

Parameters:order – An instance of TwitterOrder class (e.g. TwitterSearchOrder or TwitterUserOrder)
Returns:Itself using self keyword

Queries the Twitter API with a given query string and stores the results internally. Also validates returned HTTP status code and throws an exception in case of invalid HTTP states. Example usage sendSearch('?q=One+Two&count=100')

Parameters:url – A string of the URL to send the query to
Raises:TwitterSearchException
set_proxy(proxy)

Sets a HTTPS proxy to query the Twitter API

Parameters:proxy – A string of containing a HTTPS proxy e.g. set_proxy("my.proxy.com:8080").
Raises:TwitterSearchException
set_supported_languages(order)

Loads currently supported languages from Twitter API and sets them in a given TwitterSearchOrder instance. See Advanced usage for example

Parameters:order – A TwitterOrder instance. Can be either TwitterSearchOrder or TwitterUserOrder

TwitterSearch.TwitterSearchException module

exception TwitterSearch.TwitterSearchException.TwitterSearchException(code, msg=None)

Bases: Exception

This class is all about exceptions (surprise, surprise!). All exception based directly on TwitterSearch will consist of a code and a message describing the reason of the exception shortly.

TwitterSearch.TwitterSearchOrder module

class TwitterSearch.TwitterSearchOrder.TwitterSearchOrder

Bases: TwitterSearch.TwitterOrder.TwitterOrder

This class is for configurating all available arguments of the Twitter Search API (v1.1). It also creates valid query strings which can be used in other environments identical to the syntax of the Twitter Search API.

add_keyword(word)

Adds a given string or list to the current keyword list

Parameters:word – String or list of at least 2 character long keyword(s)
Raises:TwitterSearchException
create_search_url()

Generates (urlencoded) query string from stored key-values tuples

Returns:A string containing all arguments in a url-encoded format
iso_6391 = ['aa', 'ab', 'ae', 'af', 'ak', 'am', 'an', 'ar', 'as', 'av', 'ay', 'az', 'ba', 'be', 'bg', 'bh', 'bi', 'bm', 'bn', 'bo', 'br', 'bs', 'ca', 'ce', 'ch', 'co', 'cr', 'cs', 'cu', 'cv', 'cy', 'da', 'de', 'dv', 'dz', 'ee', 'el', 'en', 'eo', 'es', 'et', 'eu', 'fa', 'ff', 'fi', 'fj', 'fo', 'fr', 'fy', 'ga', 'gd', 'gl', 'gn', 'gu', 'gv', 'ha', 'he', 'hi', 'ho', 'hr', 'ht', 'hu', 'hy', 'hz', 'ia', 'id', 'ie', 'ig', 'ii', 'ik', 'io', 'is', 'it', 'iu', 'ja', 'jv', 'ka', 'kg', 'ki', 'kj', 'kk', 'kl', 'km', 'kn', 'ko', 'kr', 'ks', 'kv', 'kw', 'ky', 'la', 'lb', 'lg', 'li', 'ln', 'lo', 'lt', 'lu', 'lv', 'mg', 'mh', 'mi', 'mk', 'ml', 'mn', 'mr', 'ms', 'mt', 'my', 'na', 'nb', 'nd', 'ne', 'ng', 'nl', 'nn', 'no', 'nr', 'nv', 'ny', 'oc', 'oj', 'om', 'or', 'os', 'pa', 'pi', 'pl', 'ps', 'pt', 'qu', 'rm', 'rn', 'ro', 'ru', 'rw', 'sa', 'sc', 'sd', 'se', 'sg', 'si', 'sk', 'sl', 'sm', 'sn', 'so', 'sq', 'sr', 'ss', 'st', 'su', 'sv', 'sw', 'ta', 'te', 'tg', 'th', 'ti', 'tk', 'tl', 'tn', 'to', 'tr', 'ts', 'tt', 'tw', 'ty', 'ug', 'uk', 'ur', 'uz', 've', 'vi', 'vo', 'wa', 'wo', 'xh', 'yi', 'yo', 'za', 'zh', 'zu']
set_callback(func)

Sets ‘callback’ parameter. If supplied, the response will use the JSONP format with a callback of the given name

Parameters:func – A string containing the name of the callback function
Raises:TwitterSearchException
set_geocode(latitude, longitude, radius, imperial_metric=True)

Sets geolocation parameters to return only tweets by users located within a given radius of the given latitude/longitude. The location is preferentially taking from the Geotagging API, but will fall back to their Twitter profile.

Parameters:
  • latitude – A integer or long describing the latitude
  • longitude – A integer or long describing the longitude
  • radius – A integer or long describing the radius
  • imperial_metric – Whether the radius is given in metric (kilometers) or imperial (miles) system. Default is True which relates to usage of the imperial kilometer metric
Raises:

TwitterSearchException

set_keywords(words)

Sets a given list as the new keyword list

Parameters:words – A list of at least 2 character long new keywords
Raises:TwitterSearchException
set_language(lang)

Sets ‘lang’ parameter used to only fetch tweets within a certain language

Parameters:lang – A 2-letter language code string (ISO 6391 compatible)
Raises:TwitterSearchException
set_locale(lang)

Sets ‘locale’ parameter to specify the language of the query you are sending (only ja is currently effective)

Parameters:lang – A 2-letter language code string (ISO 6391 compatible)
Raises:TwitterSearchException
set_result_type(result_type)

Sets ‘result_type’ parameter to specify what type of search results you would prefer to receive. The current default is “mixed.” Valid values include: - mixed: Include both popular and real time results - recent: return only the most recent results - popular: return only the most popular results :param result_type: A string containing one of the three valid result types

Raises:TwitterSearchException
set_search_url(url)

Reads given query string and stores key-value tuples

Parameters:url – A string containing a valid URL to parse arguments from
set_until(date)

Sets ‘until’ parameter used to return only tweets generated before the given date

Parameters:date – A datetime instance
Raises:TwitterSearchException

TwitterSearch.TwitterUserOrder module

class TwitterSearch.TwitterUserOrder.TwitterUserOrder(user)

Bases: TwitterSearch.TwitterOrder.TwitterOrder

This class configures all arguments available of the user_timeline endpoint of the Twitter API (version 1.1 only). It also creates a valid query string out of the current configuration.

create_search_url()

Generates (urlencoded) query string from stored key-values tuples

Returns:A string containing all arguments in a url-encoded format
set_contributor_details(contdetails)

Sets ‘contributor_details’ parameter used to enhance the contributors element of the status response to include the screen_name of the contributor. By default only the user_id of the contributor is included

Parameters:contdetails – Boolean triggering the usage of the parameter
Raises:TwitterSearchException
set_exclude_replies(exclude)

Sets ‘exclude_replies’ parameter used to prevent replies from appearing in the returned timeline

Parameters:exclude – Boolean triggering the usage of the parameter
Raises:TwitterSearchException
set_include_rts(rts)

Sets ‘include_rts’ parameter. When set to False, the timeline will strip any native retweets from the returned timeline

Parameters:rts – Boolean triggering the usage of the parameter
Raises:TwitterSearchException
set_search_url(url)

Reads given query string and stores key-value tuples

Parameters:url – A string containing a valid URL to parse arguments from
set_trim_user(trim)

Sets ‘trim_user’ parameter. When set to True, each tweet returned in a timeline will include a user object including only the status authors numerical ID

Parameters:trim – Boolean triggering the usage of the parameter
Raises:TwitterSearchException

TwitterSearch.utils module

Module contents

Advanced usage: The TwitterSearch class

This is the main class of this library where all the action takes place. There are many ways to use it and the most common ones are explained in this section.

Constructor of TwitterSearch

The constructor needed to set your credentials for the Twitter API. The parameters are __init__( consumer_key, consumer_secret, access_token, access_token_secret, verify=True).

If you’re new to Python take a look at the following example:

ts1 = TwitterSearch(
    consumer_key = 'aaabbb',
    consumer_secret = 'cccddd',
    access_token = '111222',
    access_token_secret = '333444'
)

# equals

ts2 = TwitterSearch('aaabbb', 'cccddd', '111222', '333444', verify=True, proxy=None)

Authentication and verification

Please be aware that there is no further check whether or not your credentials are valid if you set verify=False in the constructor. If you’re skipping the verification process of TwitterSearch you can avoid some traffic and one query. Note that this validation query is part of the rate-limiting as done by Twitter. If you are sure your credentials are correct you can disable this feature.

But be aware that you’re only saving one request at all by avoiding the automatic verification process. Due to the fact that json doesn’t consume much traffic at all, this may only be a way for very conservative developers or some exotic scenarios.

Proxy usage

To use a HTTPS proxy at initialization of the TwitterSearch class, an addition argument named proxy='some.proxy:888' can be used. Otherwise the authentication will fail if the client has no direct access to the Twitter API.

Returned tweets

This library is trying to not hide anything from your eyes except the complexity of its functions. Due to this you’re able to get all the information available (which can be quite a lot).

Example output with only one tweet included:

{'search_metadata': {'completed_in': 0.08,
                 'count': 1,
                 'max_id': 352072665667878913,
                 'max_id_str': '352072665667878913',
                 'next_results': '?max_id=352072665667878912&q=Germany%20castle&count=1&include_entities=1',
                 'query': 'Germany+castle',
                 'refresh_url': '?since_id=352072665667878913&q=Germany%20castle&include_entities=1',
                 'since_id': 0,
                 'since_id_str': '0'},
                 'statuses': [
                 {'contributors': None,
           'coordinates': None,
           'created_at': 'Tue Jul 02 14:33:59 +0000 2013',
           'entities': {'hashtags': [],
                        'media': [{'display_url': 'pic.twitter.com/Oz77FLEong',
                                   'expanded_url': 'http://twitter.com/ThatsEarth/status/351839174887870464/photo/1',
                                   'id': 351839174896259072,
                                   'id_str': '351839174896259072',
                                   'indices': [117, 139],
                                   'media_url': 'http://pbs.twimg.com/media/BOH73Y3CEAAldKU.jpg',
                                   'media_url_https': 'https://pbs.twimg.com/media/BOH73Y3CEAAldKU.jpg',
                                   'sizes': {'large': {'h': 639,
                                                       'resize': 'fit',
                                                       'w': 960},
                                             'medium': {'h': 399,
                                                        'resize': 'fit',
                                                        'w': 600},
                                             'small': {'h': 226,
                                                       'resize': 'fit',
                                                       'w': 340},
                                             'thumb': {'h': 150,
                                                       'resize': 'crop',
                                                       'w': 150}},
                                   'source_status_id': 351839174887870464,
                                   'source_status_id_str': '351839174887870464',
                                   'type': 'photo',
                                   'url': 'http://t.co/Oz77FLEong'}],
                        'symbols': [],
                        'urls': [],
                        'user_mentions': [{'id': 118504288,
                                           'id_str': '118504288',
                                           'indices': [0, 11],
                                           'name': 'Josh Dallas',
                                           'screen_name': 'joshdallas'},
                                          {'id': 298250825,
                                           'id_str': '298250825',
                                           'indices': [12, 25],
                                           'name': 'Ginnifer Goodwin',
                                           'screen_name': 'ginnygoodwin'},
                                          {'id': 1201661238,
                                           'id_str': '1201661238',
                                           'indices': [49, 60],
                                           'name': 'Earth Pics',
                                           'screen_name': 'ThatsEarth'}]},
           'favorite_count': 0,
           'favorited': False,
           'geo': None,
           'id': 352072665667878913,
           'id_str': '352072665667878913',
           'in_reply_to_screen_name': 'joshdallas',
           'in_reply_to_status_id': None,
           'in_reply_to_status_id_str': None,
           'in_reply_to_user_id': 118504288,
           'in_reply_to_user_id_str': '118504288',
           'lang': 'en',
           'metadata': {'iso_language_code': 'en',
                        'result_type': 'recent'},
           'place': None,
           'possibly_sensitive': False,
           'retweet_count': 0,
           'retweeted': False,
           'source': '<a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>',
           'text': '@joshdallas @ginnygoodwin home during wintertime"@ThatsEarth: Hohenzollern Castle floating above the Clouds,Germany. http://t.co/Oz77FLEong"',
           'truncated': False,
           'user': {'contributors_enabled': False,
                    'created_at': 'Fri Aug 14 09:15:27 +0000 2009',
                    'default_profile': False,
                    'default_profile_image': False,
                    'description': 'Scorpio. 23. MBA Graduate.',
                    'entities': {'description': {'urls': []},
                                 'url': {'urls': [{'display_url': 'fanfiction.net/u/4764512/',
                                                   'expanded_url': 'http://www.fanfiction.net/u/4764512/',
                                                   'indices': [0,
                                                               22],
                                                   'url': 'http://t.co/sEKQ1M85H2'}]}},
                    'favourites_count': 114,
                    'follow_request_sent': False,
                    'followers_count': 300,
                    'following': False,
                    'friends_count': 229,
                    'geo_enabled': False,
                    'id': 65599486,
                    'id_str': '65599486',
                    'is_translator': False,
                    'lang': 'en',
                    'listed_count': 0,
                    'location': 'Kuwait',
                    'name': 'Amal Behbehani',
                    'notifications': False,
                    'profile_background_color': 'DBE9ED',
                    'profile_background_image_url': 'http://a0.twimg.com/profile_background_images/317569734/tumblr_lqc4ttwuJm1qclkveo1_500.jpg',
                    'profile_background_image_url_https': 'https://si0.twimg.com/profile_background_images/317569734/tumblr_lqc4ttwuJm1qclkveo1_500.jpg',
                    'profile_background_tile': True,
                    'profile_banner_url': 'https://pbs.twimg.com/profile_banners/65599486/1372576102',
                    'profile_image_url': 'http://a0.twimg.com/profile_images/3763288269/57c274f19592f6d190957d8eb86c64f1_normal.png',
                    'profile_image_url_https': 'https://si0.twimg.com/profile_images/3763288269/57c274f19592f6d190957d8eb86c64f1_normal.png',
                    'profile_link_color': 'CC3366',
                    'profile_sidebar_border_color': 'DBE9ED',
                    'profile_sidebar_fill_color': 'E6F6F9',
                    'profile_text_color': '333333',
                    'profile_use_background_image': True,
                    'protected': False,
                    'screen_name': 'TigeyGirl',
                    'statuses_count': 18891,
                    'time_zone': 'Santiago',
                    'url': 'http://t.co/sEKQ1M85H2',
                    'utc_offset': -14400,
                    'verified': False}}]}

Have a look at the entities documented by Twitter to figure out what a specific key-value tuple does exactly mean.

Access meta data

An output of the available meta data from the query to the Twitter API is stored in a dict structure. You can access it by calling get_metadata() which will return all meta information about the last query.

Example:

{
'content-length': '467129',
'x-rate-limit-reset': '1372773784',
'x-rate-limit-remaining': '170',
'x-xss-protection': '1; mode=block',
'cache-control': 'no-cache, no-store, must-revalidate, pre-check=0, post-check=0',
'status': '200',
'transfer-encoding': 'chunked',
'set-cookie': 'lang=de, guest_id=v1%!xxx; Domain=.twitter.com; Path=/; Expires=Thu, 01-Jul-2013 14:02:32 UTC',
'expires': 'Tue, 31 Mar 1981 05:00:00 GMT',
'x-access-level': 'read',
'last-modified': 'Tue, 01 Jul 2013 14:02:32 GMT',
'-content-encoding': 'gzip',
'pragma': 'no-cache',
'date': 'Tue, 01 Jul 2013 14:02:32 GMT',
'x-rate-limit-limit': '180',
'content-location': u'https://api.twitter.com/1.1/search/tweets.json?count=100&oauth_body_hash=xxx&oauth_nonce=xxx&oauth_timestamp=xxx&oauth_consumer_key=xxx&oauth_signature_method=HMAC-SHA1&q=Germany+castle&oauth_version=1.0&oauth_token=xxx&oauth_signature=xxx',
'x-transaction': 'xxx',
'strict-transport-security': 'max-age=631138519',
'server': 'tfe',
'x-frame-options': 'SAMEORIGIN',
'content-type': 'application/json;charset=utf-8'
}

Be careful about those data as it contains sensible data as you can see in get_metadata()['content-location']. Do NOT save or output those information to insecure environments!

If you are interested in the amount of queries that this library did automatically on your behalf you can access those information easy by calling get_statistcs(). A trivial example use-case could be to print out those informations as part of a debugging or logging facility: print("Queries done: %i. Tweets received: %i" % ts.get_statistics())

TwitterSearch without automatic iteration

It is also perfectly possible to use TwitterSearch without any automatic iteration and to query the Twitter API all by yourself. For example you may like to implement the suggest max_id procedure of Twitter to access the API directly and don’t trust the library to do this automatically on its own. Just assume that we would like to implement this feature independently again. A possible solution of this could look like:

from TwitterSearch import *

try:
    tso = TwitterSearchOrder()
    tso.set_keywords(['Germany', 'castle'])

    ts = TwitterSearch('aaabbb', 'cccddd', '111222', '333444')

    # init variables needed in loop
    todo = True
    next_max_id = 0

    # let's start the action
    while(todo):

        # first query the Twitter API
        response = ts.search_tweets(tso)

        # print rate limiting status
        print( "Current rate-limiting status: %i" % ts.get_metadata()['x-rate-limit-reset'])

        # check if there are statuses returned and whether we still have work to do
        todo = not len(response['content']['statuses']) == 0

        # check all tweets according to their ID
        for tweet in response['content']['statuses']:
            tweet_id = tweet['id']
            print("Seen tweet with ID %i" % tweet_id)

            # current ID is lower than current next_max_id?
            if (tweet_id < next_max_id) or (next_max_id == 0):
                next_max_id = tweet_id
                next_max_id -= 1 # decrement to avoid seeing this tweet again

        # set lowest ID as MaxID
        tso.set_max_id(next_max_id)

except TwitterSearchException as e:
    print(e)

On-the-fly loading of supported languages

As you may have figured out some languages are not supported by Twitter and those that are may change over time. This is why Twitter does provide an endpoint to load all currently supported languages. You may query it to gather current information about the languages in Twitter.

from TwitterSearch import *

try:
    tso = TwitterSearchOrder()
    ts = TwitterSearch('aaabbb', 'cccddd', '111222', '333444')

    # load  currently supported languages by Twitter and store them in a TwitterSearchOrder object
    ts.set_supported_languages(tso)

    # try to set German (see ISO 639-1) as language
    ts.set_language('de')
    print('German seems to be officially supported by Twitter. Yay!')

except TwitterSearchException as e:

    # if we get an 1002 code it means that 'de' is not supported (see TwitterSearchException)
    if e.code == 1002:
        print('Oh no - German is not supported :(')
    print(e)

Advanced usage: The TwitterSearchException class

You can also print an exception like a string which will result in Error <TwitterSearchException.code>: <TwitterSearchException.message>. For those new to Python, this can be easily done like this:

except TwitterSearchException as e:
    print(e)

List of exceptions

There are two different kinds of exceptions. Those based on the HTTP status of the query to the Twitter API and those based on misconfiguration of the library, for example by setting odd parameters or trying to access tweets without querying the API before.

Library based exceptions

All exceptions based on issues within TwitterSearch do have TwitterSearchException.code >= 1000

Code Message
1000 Neither a list nor a string
1001 Not a list object
1002 No ISO 6391-1 language code
1003 No valid result type
1004 Invalid number
1005 Invalid unit
1006 Invalid callback string
1007 Not a date object
1008 Invalid boolean
1009 Invalid string
1010 Not a valid TwitterSearchOrder object
1011 No more results available
1012 No meta data available
1013 No tweets available
1014 No results available
1015 No keywords given
1016 Invalid dict

HTTP based exceptions

Exceptions based on the HTTP status response of the Twitter API are TwitterSearchException.code < 1000. Note that the code attribute is exactly the HTTP status value returned to TwitterSearch. All those exceptions are raised in TwitterSearch only.

Code Message
400 Bad Request: The request was invalid
401 Unauthorized: Authentication credentials were missing or incorrect
403 Forbidden: The request is understood, but it has been refused or access is not allowed
404 Not Found: The URI requested is invalid or the resource requested does not exists
406 Not Acceptable: Invalid format is specified in the request
410 Gone: This resource is gone
420 Enhance Your Calm: You are being rate limited
422 Unprocessable Entity: Image unable to be processed
429 Too Many Requests: Request cannot be served due to the application’s rate limit having been exhausted for the resource
500 Internal Server Error: Something is broken
502 Bad Gateway: Twitter is down or being upgraded
503 Service Unavailable: The Twitter servers are up, but overloaded with requests
504 Gateway timeout: The request couldn’t be serviced due to some failure within our stack

Advanced exception usage

Maybe there is an odd reason why you don’t want TwitterSearch to raise an exception when a 404 HTTP status is returned by Twitter. Additional you’d like to raise an exception when a 200 HTTP status is returned. Maybe you would like to test your firewall by doing complex HTTP queries. Okay, don’t ask me about use-cases, let’s just assume there is some strange reason to do so.

Since TwitterSearch is designed to be used in academic and highly individual scenarios it is perfectly possible to do such crazy stuff without much trouble.

from TwitterSearch import *

tso = TwitterSearchOrder()
tso.set_keywords(['strange', 'use-case'])
tso.set_include_entities(False)

ts = TwitterSearch(
    consumer_key = 'onetwothree',
    consumer_secret = 'fourfivesix',
    access_token = 'foo',
    access_token_secret = 'bar'
)

# add a HTTP status based exception based on status 200
ts.exceptions.update({200 : 'It worked - damn it!' })

# delete exception based on HTTP status 400
del ts.exceptions[400]

try:
    ts.authenticate()
    for tweet in ts.search_tweets_iterable(tso):
        print("Seen tweed with ID %i" % tweet['id'])

except TwitterSearchException as e:
    if e.code < 1000:
        print("HTTP status based exception: %i - %s" % (e.code, e.message))
    else:
      print("Regular exception: %i - %s" % (e.code, e.message))

If your credentials are correct you will receive the output HTTP status based exception: 200 - It worked - damn it!

Advanced usage: The TwitterSearchOrder class

This class mainly acts as a plain container for configuration all parameters currently available by the Twitter Search API. There are several parameters which can easily be set and modified by methods in TwitterSearchOrder.

The only parameter with a default value is count with 100. This is because it is the maximum of tweets returned by this very Twitter API endpoint. In most cases you’d like to reduce traffic and the amount of queries, so it makes sense to set the biggest possible value by default. Please note that this endpoint has a different maximum size than the one used in TwitterUserOrder.

Be aware that some parameters can be ignored by Twitter. For example currently not every language is detectable by the Search API. TwitterSearch is only responsible for transmitting values according to the Twitter documentation.

API Parameter Type Modifying methods Example  
q *required* add_keyword(<string>), set_keywords(<list>) add_keyword('#Hashtag'), set_keywords(['foo','bar'])
geocode optional set_geocode(latitude<float>, longitude<float>, radius<int,long>, imperial_metric=<True,False> set_geocode(52.5233,13.4127,10,imperial_metric=True)
lang optional set_language(<ISO-6391-string>) set_language('en')
locale optional set_locale(<ISO-6391-string>) set_locale('ja')
result_type optional set_result_type(<mixed,recent,popular>) set_result_type('recent')
count optional set_count(<int>[Range: 1-100]) set_count(42)
until optional set_until(<datetime.date>) set_until(datetime.date(2012, 12, 24))
since_id optional set_since_id(<int,long>[Range: >0]) set_since_id(250075927172759552)
max_id optional set_max_id(<int,long>[Range: >0]) set_max_id(249292149810667520)
include_entities optional set_include_etities(<bool,int>) set_include_etities(True), set_include_etities(1)
callback optional set_callback(<string>) set_callback('myMethod')

If you’re not familiar with the meaning of the parameters, please have a look at the Twitter Search API documentation. Most parameter are self-describing anyway.

Advanced usage examples

You may want to use TwitterSearchOrder for just generating a valid Twitter Search API query string containing all your arguments without knowing too much details about the Twitter API? No problem at all as there is the method TwitterSearchOrder.createSearchURL(). It creates and returns an valid Twitter Search API query string. Afterwards the last created string is also available through TwitterSearchOrder.url.

from TwitterSearch import TwitterSearchOrder, TwitterSearchException

try:
    tso = TwitterSearchOrder()
    tso.set_language('nl')
    tso.set_locale('ja')
    tso.set_keywords(['One','Two'])
    tso.add_keyword('myKeyword')

    print(tso.create_search_url())

except TwitterSearchException as e:
      print(e)

You’ll receive ?q=One+Two+myKeyword&count=100&lang=nl&locale=ja as result. Now you are free to use this string for manually querying Twitter (or any other API using the same parameter as Twitter does).

Maybe you would like to create another TwitterSearchOrder instance with a slightly different URL.

from TwitterSearch import TwitterSearchOrder, TwitterSearchException

try:
    tso = TwitterSearchOrder()
    tso.set_language('nl')
    tso.set_locale('ja')
    tso.set_keywords(['One','Two'])
    tso.add_keyword('myKeyword')

    querystr = tso.createSearchURL()

    # create a new TwitterSearchOrder based on the old query string and work with it
    tso2 = TwitterSearchOrder()
    tso2.set_search_url(querystr + '&result_type=mixed&include_entities=true')
    tso2.set_locale('en')
    print(tso2.create_search_url())

except TwitterSearchException as e:
   print(e)

This piece of code will finally result in an output of ?q=One+Two+myKeyword&count=100&lang=nl&locale=en&result_type=mixed&include_entities=true.

Please be aware that the sense of arguments given by set_search_url() is not checked. Due to this it is perfectly valid to to stuff like set_search_url('q=Not+my+department&count=1731&locale=Canada&foo=bar'). When manually setting the string, the leading ? sign is optional.

Such stuff doesn’t make much sense when querying Twitter. However, there may be cases when you’re using TwitterSearch is some exotic context where this behavior is needed to avoid the regular checks of the TwitterSearchOrder methods.

Be aware that if you’re using set_search_url() all previous configured parameters are lost.

Advanced usage: The TwitterUserOrder class

This class mainly acts as a plain container for configuration all parameters currently available by the Twitter Search API. There are several parameters which can easily be set and modified by methods in TwitterSearchOrder.

The only parameter with a default value is count with 200. This is because it is the maximum of tweets returned by this very Twitter API endpoint. In most cases you’d like to reduce traffic and the amount of queries, so it makes sense to set the biggest possible value by default. Please note that this endpoint has a different maximum size than the one used in TwitterSearchOrder.

API Parameter Type Modifying methods Example
user_id optional constructor (either screen-name or ID of user required) TwitterUserOrder("some_username")
screen_name optional constructor (either screen-name or ID of user required) TwitterUserOrder(123457890)
count optional set_count(<int>[Range: 1-200]) set_count(42)
until optional set_until(<datetime.date>) set_until(datetime.date(2012, 12, 24))
since_id optional set_since_id(<int,long>[Range: >0]) set_since_id(250075927172759552)
max_id optional set_max_id(<int,long>[Range: >0]) set_max_id(249292149810667520)
trim_user optional set_trim_user(<bool>) set_trim_user(True)
exclude_replies optional set_exclude_replies(<bool>) set_exclude_replies(False)
contributor_details optional set_contributor_details(<bool>) set_contributor_details(True)
include_rts optional set_include_rts(<bool>) set_include_rts(True)

If you’re not familiar with the meaning of the parameters, please have a look at the Twitter User Timeline API documentation. Most parameter are self-describing anyway. Only special use-cases may require those detailed configuration values to be set, so don’t worry if you don’t touch one of those advanced methods in your code.

Advanced usage examples

You may want to use TwitterUserOrder for just generating a valid Twitter Search API query string containing all your arguments without knowing too much details about the Twitter API? No problem at all as there is the method TwitterUserOrder.createSearchURL(). It creates and returns an valid Twitter Search API query string. Afterwards the last created string is also available through TwitterSearchOrder.url.

from TwitterSearch import TwitterUserOrder, TwitterSearchException

try:
    tuo = TwitterUserOrder("some_user")
    tuo.set_trim_user(True)
    tuo.set_exclude_replies(False)
    tuo.set_include_rts(True)

    print(tuo.create_search_url())

except TwitterSearchException as e:
      print(e)

You’ll receive ?trim_user=true&exclude_replies=false&include_rts=true as result. Now you are free to use this string for manually querying Twitter (or any other API using the same parameter as Twitter does).

Maybe you would like to create a new instance of TwitterUserOrder with the same configuration but for a different user. This is one way to do exactly this:

from TwitterSearch import TwitterSearchOrder, TwitterSearchException

try:
    tuo = TwitterUserOrder("some_user")
    tuo.set_trim_user(True)
    tuo.set_exclude_replies(False)
    tuo.set_include_rts(True)

    querystr = tuo.createSearchURL()

    # create a new TwitterSearchOrder based on the old query string and work with it
    tuo2 = TwitterUserOrder("some_other_user")
    tuo2.set_search_url(querystr)

    print(tso2.create_search_url())

except TwitterSearchException as e:
   print(e)

This piece of code will also result in an output of ?trim_user=true&exclude_replies=false&include_rts=true.

Please be aware that the sense of arguments given by set_search_url() is not checked. Due to this it is perfectly valid to to stuff like set_search_url('?trim_user=true&exclude_replies=false&include_rts=true&count=1337&foo=bar'). When manually setting the string, the leading ? sign is optional. Due to this you can force TwitterSearch to request custom queries. But be aware that those non-compatible queries are likely to fail. Use such techniques with caution as it doesn’t make much sense when querying Twitter. However, there may be cases when you’re using TwitterSearch is some exotic context where this behavior is needed to avoid the regular checks of the TwitterUserOrder methods.

Also note that if you’re using set_search_url() all previous configured parameters are lost and overridden.

Indices and tables

Contribution

Feel free to open issues, submit code or fork.