ITI0011:Twitter homework

From: Kursused
Revision as of 6 October 2014, 07:07, by Ago


This is the English version of the second homework. The Estonian version is here: ITI0011:Säuts.

General

Some details may still change - overall requirements are fixed

Deadline: 21 or 23 October (depending on your practice time)
Defending your homework one week before the deadline (or earlier) will give you +1 point.

General requirements:

  • Proper exception handling: stack traces must never appear while your program is running.
  • Every class, field, and method should be documented with Javadoc-style comments. Failure to comply with this requirement will lead to the loss of 1 point.
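
Both requirements can be illustrated together. The following is only a minimal sketch: the class name CountParser, the method parseCount and the default of 15 tweets are assumptions made for this example, not part of the assignment.

```java
/**
 * Hypothetical helper that turns user input into a tweet count.
 */
final class CountParser {

    /** Count used when the input is unusable (an assumed default). */
    static final int DEFAULT_COUNT = 15;

    /**
     * Parses the requested number of tweets.
     *
     * @param text raw user input, for example "40"; may be null
     * @return the parsed count, or DEFAULT_COUNT if the input is not a positive integer
     */
    static int parseCount(String text) {
        try {
            int count = Integer.parseInt(text == null ? "" : text.trim());
            if (count > 0) {
                return count;
            }
            System.err.println("The count must be positive, using " + DEFAULT_COUNT + ".");
        } catch (NumberFormatException e) {
            // Tell the user what went wrong instead of printing a stack trace.
            System.err.println("'" + text + "' is not a number, using " + DEFAULT_COUNT + ".");
        }
        return DEFAULT_COUNT;
    }
}
```

The exception is caught where the bad input is detected, and the user sees a one-line message rather than a stack trace.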

The goal of the homework is to create an interactive program capable of handling tweets using the Twitter API. The basic part gives you 5 points, and each piece of additional functionality gives you +1 point, up to a maximum of 11 points this time. The basic part is the absolute minimum that must be fulfilled in order to pass the assessment of this homework.

The required functionality:

  • Program accepts command-line arguments and everything is controllable from the arguments (example: java Twitter -location Tallinn -count 40 -sort date desc)
  • If the program is executed without arguments, an interactive command line is started, where the user can write commands ("> query Tallinn" or "> sort date desc")
  • The program has a proper manual (if a command is written incorrectly or a parameter is missing, the help text should be shown; also "java Twitter --help", for example, should print out the manual).
  • The program accepts a location, finds the latest public tweets for that location, and outputs them.
  • The number of tweets requested can be changed.
  • Downloaded tweets can be sorted and searched for.

The requirements are described in detail below. The list above is a general overview of the program (there are some more features which need to be implemented).
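
One possible way to read arguments such as "-location Tallinn -count 40 -sort date desc" is to group each option with the values that follow it. The sketch below is only an illustration: the class name ArgumentParser and the choice of a map of lists are assumptions, and you are free to structure your program differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical parser that groups command-line options with their values. */
final class ArgumentParser {

    /**
     * Groups the arguments by option name.
     *
     * @param args arguments as passed to main
     * @return option name (without leading dashes) mapped to its values
     * @throws IllegalArgumentException if a value appears before any option
     */
    static Map<String, List<String>> parse(String[] args) {
        Map<String, List<String>> options = new LinkedHashMap<String, List<String>>();
        List<String> current = null;
        for (String arg : args) {
            if (arg.startsWith("-")) {
                // A new option starts; the following tokens are its values.
                current = new ArrayList<String>();
                options.put(arg.replaceFirst("^-+", ""), current);
            } else if (current != null) {
                current.add(arg);
            } else {
                throw new IllegalArgumentException("Unexpected value before any option: " + arg);
            }
        }
        return options;
    }
}
```

With the example arguments, the key "sort" maps to the values "date" and "desc", and "--help" produces the key "help" with no values. The caller can catch the IllegalArgumentException and print the manual.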

Extra task: testable code (1p)

This homework was designed to give students as much flexibility and room for creativity as possible. At the same time, this freedom makes it harder to test your program with automated tests. Here we offer you an opportunity to earn an additional bonus point (the maximum, counting all bonus points, is still 11) if you make your code testable. For this task we provide a set of predefined interfaces which you have to implement in your solution. Interfaces let us describe formally which methods must be present in your solution, which arguments these methods must take, and what they should return. Once these interfaces are implemented, it becomes possible to call the methods with various input parameters and test them in a semi-automated way.

Attention! If you have already finished your solution (or at least some part of it) and wish to do this extra task, you may need to re-implement a major part of your solution. Thus, if you aim to do this extra part as well, you are expected to plan the structure of your program beforehand to comply with these requirements. As this extra task requires some extra effort on your part, we offer 1 bonus point for completing it.

More information: ITI0011:Twitter testable code.

Main part - 5p

The program makes a request to the Twitter API to search for public tweets. The latest public tweets are downloaded for the location that the user specifies on the command line. The tweets are then presented to the user.

Note that you just download the tweets from the Twitter API (nothing special needs to be done to get the latest tweets, as this is the default behavior). Also, when presenting tweets to the user, just print them out in the same order you receive them.

Required functionality:

  • Read the location from the command line (e.g. "Tallinn")
  • For the location, find the geographical coordinates (latitude and longitude) using the OpenStreetMap API
  • Use the bounding box information to calculate an appropriate radius.
  • Send the coordinates and the radius to the Twitter API
  • Read the response from the Twitter API into objects.
  • Print out the tweets.


Location coordinates

You can use the OpenStreetMap community tool Nominatim (see the Nominatim wiki). Given a location name, it returns information about that location (coordinates, bounding box, and more).

Example request: http://nominatim.openstreetmap.org/search?q=Tallinn&format=xml

You will get a response (only partially shown):

<searchresults timestamp="Sat, 13 Sep 14 21:47:21 +0000" 
attribution="Data © OpenStreetMap contributors, ODbL 1.0. http://www.openstreetmap.org/copyright" 
querystring="Tallinn" polygon="false" 
exclude_place_ids="98174326,11438224,6000303521,6919504,6893196,86869124,15103978,5983246058" 
more_url="http://nominatim.openstreetmap.org/search?format=xml&exclude_place_ids=98174326,11438224,6000303521,6919504,6893196,86869124,15103978,5983246058&accept-language=en-US,en;q=0.5&q=Tallinn">
<place place_id="98174326" osm_type="relation" osm_id="2164745" place_rank="16" 
boundingbox="59.351806640625,59.5915794372559,24.5501689910889,24.9262847900391" lat="59.4372155" 
lon="24.7453688" display_name="Tallinn, Harju maakond, Estonia" class="place" type="city" 
importance="0.7819722223575" icon="http://nominatim.openstreetmap.org/images/mapicons/poi_place_city.p.20.png"/>
...
</searchresults>

In the results you will see several places named Tallinn (or whose name includes Tallinn). For this homework, the first result is the one you need, so you have to read the first "place" element. For the Twitter query you need a location (latitude and longitude) and a radius. From the location search result, use the attributes "lat" and "lon" as the center of the Twitter search. For the radius, use the "boundingbox" attribute, which gives you the bounding box around the location. In the given example, you should look for lat="59.4372155", lon="24.7453688", boundingbox="59.351806640625,59.5915794372559,24.5501689910889,24.9262847900391".
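
Reading these attributes can be done with the XML parser that ships with Java. The following is a minimal sketch, not a required design: the classes Place and NominatimParser are hypothetical names, and the bounding box order (south, north, west, east) is read off the example response above.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical holder for the values read from the first "place" element. */
final class Place {
    /** Latitude of the center. */
    final double lat;
    /** Longitude of the center. */
    final double lon;
    /** Bounding box in the order south, north, west, east. */
    final double[] box;

    Place(double lat, double lon, double[] box) {
        this.lat = lat;
        this.lon = lon;
        this.box = box;
    }
}

/** Hypothetical parser for the XML returned by the location search. */
final class NominatimParser {

    /**
     * Reads the first "place" element of the response.
     *
     * @param xml the response body
     * @return center coordinates and bounding box of the first place
     * @throws IllegalArgumentException if the response cannot be used
     */
    static Place parseFirstPlace(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList places = doc.getElementsByTagName("place");
            if (places.getLength() == 0) {
                throw new IllegalArgumentException("The response contains no place.");
            }
            Element first = (Element) places.item(0);
            String[] parts = first.getAttribute("boundingbox").split(",");
            if (parts.length != 4) {
                throw new IllegalArgumentException("Unexpected bounding box format.");
            }
            double[] box = new double[4];
            for (int i = 0; i < 4; i++) {
                box[i] = Double.parseDouble(parts[i]);
            }
            return new Place(Double.parseDouble(first.getAttribute("lat")),
                    Double.parseDouble(first.getAttribute("lon")), box);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not read the location response.", e);
        }
    }
}
```

All failures surface as IllegalArgumentException (NumberFormatException is a subclass of it), so the caller can catch a single exception type and show a readable message.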

For the radius, you can simply compute the distances spanned by the latitude and longitude ranges of the bounding box. Beware that the same longitude difference (say 0.1 degrees) corresponds to different distances at different locations on Earth. For this homework, the calculated radius does not need to be very accurate, but it should still vary with the size of the city (New York > .. > Tallinn > Haapsalu). Don't spend too much time on the radius calculation: computing it with 1 m accuracy won't give you any extra points.
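
As a rough illustration, the sketch below takes half of the bounding box diagonal as the radius. This is just one possible choice, not the required method. It assumes that one degree of latitude is about 111 km and that one degree of longitude shrinks with the cosine of the latitude; the class name RadiusEstimator is hypothetical, and bounding boxes that cross the 180° meridian are not handled.

```java
/** Hypothetical helper that estimates a search radius from a bounding box. */
final class RadiusEstimator {

    /** Approximate length of one degree of latitude in kilometers. */
    static final double KM_PER_DEGREE = 111.0;

    /**
     * Estimates the radius as half of the bounding box diagonal.
     *
     * @param south minimum latitude in degrees
     * @param north maximum latitude in degrees
     * @param west  minimum longitude in degrees
     * @param east  maximum longitude in degrees
     * @return approximate radius in kilometers
     */
    static double radiusKm(double south, double north, double west, double east) {
        double midLatitude = Math.toRadians((south + north) / 2.0);
        double heightKm = (north - south) * KM_PER_DEGREE;
        // A degree of longitude gets shorter towards the poles.
        double widthKm = (east - west) * KM_PER_DEGREE * Math.cos(midLatitude);
        return Math.sqrt(heightKm * heightKm + widthKm * widthKm) / 2.0;
    }
}
```

For the Tallinn bounding box from the example this gives a radius of roughly 17 km. The same 0.1 degree box gives a smaller radius at 60° latitude than at the equator.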

In short, you need to make a query, read the response, and translate the response into center coordinates and a radius.
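
When building the request, remember that location names may contain spaces or non-ASCII letters ("New York", "Jõgeva"), so the name should be URL-encoded. A minimal sketch follows, with the hypothetical class name NominatimUrl. Downloading the response is left out here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper that builds the location search request. */
final class NominatimUrl {

    /**
     * Builds the search URL for a location name.
     *
     * @param location name typed by the user, for example "New York"
     * @return request URL that asks for an XML response
     */
    static String searchUrl(String location) {
        try {
            return "http://nominatim.openstreetmap.org/search?q="
                    + URLEncoder.encode(location, "UTF-8") + "&format=xml";
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException("UTF-8 is not supported.", e);
        }
    }
}
```

For "Tallinn" this produces exactly the example request shown earlier. For "New York" the space is encoded as "+".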

Note: You could use some other service to get city coordinates (for example Google Maps API).

Twitter API

The Twitter API (https://dev.twitter.com/docs/api/1.1) allows a program to make automatic queries to the social network Twitter. For this homework, you don't need to be an active Twitter user, but you still need an account to make public queries.

To use the Twitter API, you need to have a Twitter account and register an app (application) under that account. Once you have a Twitter account, you can manage your applications here: https://apps.twitter.com/ (the same link is on the page dev.twitter.com, in the "TOOLS" list at the bottom of the page). On the application page, create a new application. When creating an app, you can provide whatever web page link you want (for example the course web page).

After creating an application, you will see it on the application page. If you open your application (from the list) and open the "API keys" tab, you will see the "API key" and "API secret". You need those values to make queries to the Twitter API.

From the API, we will be using the search query, which is described here: https://dev.twitter.com/rest/public/search and https://dev.twitter.com/rest/reference/get/search/tweets

An example query to get Tallinn tweets within 1km radius: https://api.twitter.com/1.1/search/tweets.json?q=&geocode=59.4372155,24.7453688,1km&result_type=recent

The given link does not work in a browser, because the request is not authenticated.
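
The geocode parameter in the example query has the form "latitude,longitude,radius". If you build it yourself, format the numbers independently of the system locale, because some locales (Estonian among them) print decimals with a comma. The sketch below assumes a whole number of kilometers with a minimum of 1 km, as in the example query; the class name GeocodeParameter is hypothetical.

```java
import java.util.Locale;

/** Hypothetical helper that formats the geocode parameter of a search query. */
final class GeocodeParameter {

    /**
     * Formats the center and radius as "lat,lon,Nkm".
     *
     * @param lat      latitude of the center in degrees
     * @param lon      longitude of the center in degrees
     * @param radiusKm radius in kilometers, rounded to a whole number (at least 1)
     * @return value for the geocode parameter
     */
    static String format(double lat, double lon, double radiusKm) {
        long wholeKm = Math.max(1L, Math.round(radiusKm));
        // Locale.ROOT guarantees a decimal point regardless of system settings.
        return String.format(Locale.ROOT, "%.7f,%.7f,%dkm", lat, lon, wholeKm);
    }
}
```

With the Tallinn coordinates and a radius of 1 km, this yields the value used in the example query above: 59.4372155,24.7453688,1km.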

Some information about Twitter authentication can be seen here: https://dev.twitter.com/oauth

Doing all this authentication and connection handling manually is a lot of work. It is recommended to use a library which does most of the work for you. We recommend using http://twitter4j.org/ , which makes authentication and queries easier. If you want, you can use some other library.

To use twitter4j you need the file twitter4j-core-4.0.2.jar. If you download the zip file, the jar file is located in the folder "lib". At the time of writing this assignment, 4.0.2 was the latest version. If the version is newer, the file name changes accordingly. You need to add this jar file to your project: project properties > Java Build Path > Libraries > Add External JARs, and locate the file mentioned above.

One way to configure twitter4j is the following:

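		import twitter4j.TwitterException;
		import twitter4j.TwitterFactory;
		import twitter4j.auth.OAuth2Token;
		import twitter4j.conf.ConfigurationBuilder;

		// Application-only authentication needs just the API key and API secret.
		ConfigurationBuilder cb = new ConfigurationBuilder();
		cb.setDebugEnabled(true)
		  .setApplicationOnlyAuthEnabled(true)
		  .setOAuthConsumerKey(TWITTER_CUSTOMER_KEY)
		  .setOAuthConsumerSecret(TWITTER_CUSTOMER_SECRET);
		
		TwitterFactory tf = new TwitterFactory(cb.build());
		twitter4j.Twitter twitter = tf.getInstance();
		
		try {
			// Obtain the token used for application-only requests.
			OAuth2Token token = twitter.getOAuth2Token();
		} catch (TwitterException e) {
			// Report the problem without printing a stack trace.
			System.err.println("Twitter authentication failed: " + e.getMessage());
		}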

In the code snippet above, TWITTER_CUSTOMER_KEY and TWITTER_CUSTOMER_SECRET are constants whose values are taken from the Twitter application web page (the API key and API secret, respectively). Of course, you can use a configuration file or some other method to set the API keys (you can read more about that on the library's web page).

How to actually make the query is something you need to find out yourself, but the library's web page has a lot of examples. I also recommend checking the project's GitHub page, where you can find the source code with tests. The tests show different ways of using the library.