Difference between revisions of "ITI0011:Twitter homework"

 
(5 intermediate revisions by the same user not shown)
Line 4: Line 4:
  
 
== General ==
 
== General ==
 
<div style="color: red;">Some details may still change - overall requirements are fixed</div>
 
  
 
Deadline: '''21. or 23. October (depending on your practice time)'''<br>
 
Deadline: '''21. or 23. October (depending on your practice time)'''<br>
 
Defending your homework one week before (or earlier) will give you +1 point.
 
Defending your homework one week before (or earlier) will give you +1 point.
  
The goal of the homework is to create an interactive program which gets public tweets from Twitter API. The main part gives you 5 points. Additional functionality will give you additional points. Up to 11 points this time.
+
General requirements:
 +
* Proper exception handling: stack traces must never appear while your program is running.
 +
* Every class, field, and method must be documented with Javadoc-style comments. Failure to comply with this requirement will lead to the loss of 1 point.
 +
 
 +
The goal of the homework is to create an interactive program capable of handling tweets using the Twitter API. The basic part gives you 5 points, and each piece of additional functionality gives you +1 point, up to a maximum of 11 points this time. The basic part is the absolute minimum that must be fulfilled in order to pass the assessment of this homework.
  
 
The required functionality:
 
The required functionality:
Line 21: Line 23:
  
 
Requirements are written in detail below. The list above is a general overview of the program (there are some more functionality features which need to be implemented).
 
Requirements are written in detail below. The list above is a general overview of the program (there are some more functionality features which need to be implemented).
 +
 +
== Extra task: testable code (1p) ==
 +
 +
This assignment was designed to give students as much flexibility and room for creativity as possible. That same freedom limits the possibilities for testing your program with automated tests. Here we offer an opportunity to earn an additional bonus point (the maximum, counting all bonus points, is still 11) if you make your code testable.
 +
To accomplish this task, we provide a set of predefined interfaces that you have to implement in your solution. Interfaces let us describe formally which methods must be present in your solution, which arguments they take, and what they return. Once these interfaces are implemented, the methods can be called with various input parameters and tested in a semi-automated way.
 +
 +
Attention! If you have already finished your solution (or at least part of it) and wish to do this extra task, you may need to re-implement a major part of it. If you aim to complete this extra task, plan the structure of your program beforehand so that it complies with these requirements. Because the task requires extra effort, we offer 1 bonus point for it.
 +
 +
More information: [[ITI0011:Twitter testable code]].
  
 
== Main part - 5p ==
 
== Main part - 5p ==
Line 86: Line 97:
 
Doing all this authentication and connection manually is a lot of work. It is recommended to use a library which does most of the work for you. We recommend to use http://twitter4j.org/ . This helps you to do authentication and queries more easily. If you want, you can use some other library.
 
Doing all this authentication and connection manually is a lot of work. It is recommended to use a library which does most of the work for you. We recommend to use http://twitter4j.org/ . This helps you to do authentication and queries more easily. If you want, you can use some other library.
  
To use twitter4j you need to get the file twitter4j-core-4.0.2.jar. If you download the zip-file, the jar file is located under the folder "lib". When writing this assignment, 4.0.2 was the latest version. If the version is newer, then the file name changes accordingly. You need to add this jar-file into your project: Project Properties > Java Build Path > Libraries > Add External JARs... and locate the named file.
+
To use twitter4j you need to get the file twitter4j-core-4.0.2.jar. If you download the zip-file, the jar file is located under the folder "lib". When writing this assignment, 4.0.2 was the latest version. If the version is newer, then the file name changes accordingly. You need to add this jar-file into your project: Project Properties > Java Build Path > Libraries > Add External JARs... and browse to the jar-file.
  
 
To configure the twitter4j, you need the following code:
 
To configure the twitter4j, you need the following code:
  
One way to configure it is as follows:
 
 
<pre>
 
<pre>
 
ConfigurationBuilder cb = new ConfigurationBuilder();
 
ConfigurationBuilder cb = new ConfigurationBuilder();
Line 114: Line 124:
 
How to actually make the query, you need to find out yourself, but the web page has a lot of examples. I recommend also to check the project github page where you can find the source code with tests. If you look at the tests, you can find different usages of the library.
 
How to actually make the query, you need to find out yourself, but the web page has a lot of examples. I recommend also to check the project github page where you can find the source code with tests. If you look at the tests, you can find different usages of the library.
  
<!--
+
If you have a Twitter account, you could use that account to create an app. The general goal is to get public tweets, you could instead get the tweets of your friends. This does not give you extra points. Note that if you want to make a request for your friends' tweets, you need to add your account keys. More information can be found here: http://twitter4j.org/en/configuration.html . If you don't provide accessToken information (as in the example above), you can only get public tweets (which is OK for this homework).
 
 
  
If you already have a Twitter account, you can create the application under that account. Although the general goal of the assignment is to query public tweets, you may build an application that queries tweets related to your own user (that is, tweets of the people you follow and/or your friends). Again, this generally gives no extra points. Note that if you query your friends' tweets, you have to add your own keys in addition to the application key. Examples can be found here: http://twitter4j.org/en/configuration.html . All the examples there cover exactly this situation, where queries are made on behalf of your own user. If you do not add your user information, only public queries can be made.
 
  
== Extra task: buffering of place names (2p) ==
+
== Extra task: location buffering (2p) ==
  
The functionality of this extra task is:
+
The functionality required to be implemented in this task includes the following:
* results of location coordinate queries are saved to a file so that they need not be requested again next time (local ''cache'')
+
* Results of location queries are buffered in a file (local cache) so that the web service does not need to be queried when the result is already in the cache.
* new locations can be added to this file by hand (for example "kodu", "TTÜ", etc.)
+
* One should be able to edit the cache file manually (using some sort of an editor) and populate it with some additional entries (e.g. “home”, “TUT”, etc.)
  
For buffering, use the file "kohad.csv", which must be accessible to your program (preferably located in the same directory as the class being run) and must consist of rows in CSV form:
+
You are expected to use the file named “kohad.csv” as the local cache. This file must reside in a path accessible by your program (e.g. project classpath) and should contain CSV-formatted rows of text:
  
 
     ametlik_nimi,latitude,longitude,raadius_km,alternatiivnenimi_1,..,alternatiivnenimi_N
 
     ametlik_nimi,latitude,longitude,raadius_km,alternatiivnenimi_1,..,alternatiivnenimi_N
  
or of rows where the coordinates and the radius are missing:
+
Alternatively, it may contain rows where some of the fields (e.g. coordinates or radius) are empty:
  
 
     ametlik_nimi,,,,alternatiivnenimi_1,...,alternatiivnenimi_N
 
     ametlik_nimi,,,,alternatiivnenimi_1,...,alternatiivnenimi_N
  
where
+
where:
* ametlik_nimi is the place name that is looked up through the API
+
* ametlik_nimi is the official name of the location of interest
* alternatiivnenimi_1, ..., alternatiivnenimi_N are names that the user can enter. For example, the user can define "kodu" here. When such a query is made, the location ametlik_nimi is actually searched for (or the existing coordinates and radius are used, if they are present)
+
* alternatiivnenimi_1, ..., alternatiivnenimi_N are names which the end user might wish to enter. For instance, a user might wish to give the alternative name “home” to a certain location. The search is still done using the official name of the location (the very first field)
* latitude, longitude and raadius_km have their standard meaning. They may be missing.
+
* latitude, longitude and raadius_km have their usual meaning and may be absent
 
 
Your program uses the kohad.csv file as follows:
 
* Before turning to the API, the program checks whether the place name is present in this file
 
** If yes, and the coordinates and radius are given, it uses the coordinates and radius for the Twitter query. No location query is made
 
** If yes, but the coordinates are missing, then it
 
*** makes a new location API query by the official place name, finds the coordinates and computes the radius
 
*** writes the obtained coordinates and radius into the corresponding row. Next time, the same query no longer needs the location API.
 
** If no, a new query is made and the results are buffered in the file (a new row is written, containing ametlik_nimi, the coordinates and the radius).
 
  
Examples:
+
The program uses the local cache file in the following way:
 +
* Before querying the API, the program first tries to find the location of interest in the local cache file.
 +
** If the location was found and the position and radius data are present, this data is used in the query to the Twitter API. No geolocation query is made.
 +
** If the location was found, but the position and radius data are not present, then:
 +
*** The program queries the geolocation API for the location of interest, extracts the coordinates from the response, and computes the radius from it.
 +
*** The program fills in the corresponding entry in the local cache file with the data obtained in the previous step. The next time the same query is made, it no longer triggers a query to the geolocation API.
 +
** If the location was not found, the program is expected to make a new location query and to store the result in the local cache.
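The lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the class name CacheEntry and its methods are hypothetical, names are compared case-insensitively, and malformed numbers are not handled. Writing the updated rows back to kohad.csv and calling the geolocation API are left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of one kohad.csv row; all names here are illustrative. */
final class CacheEntry {
    final String officialName;
    final Double latitude;   // null when the field is empty
    final Double longitude;  // null when the field is empty
    final Double radiusKm;   // null when the field is empty
    final List<String> aliases = new ArrayList<>();

    private CacheEntry(String officialName, Double latitude, Double longitude, Double radiusKm) {
        this.officialName = officialName;
        this.latitude = latitude;
        this.longitude = longitude;
        this.radiusKm = radiusKm;
    }

    /** Parses one row: ametlik_nimi,latitude,longitude,raadius_km,alias_1,..,alias_N */
    static CacheEntry parse(String line) {
        // The limit -1 keeps empty fields, so "name,,,,alias" still yields five fields.
        String[] fields = line.split(",", -1);
        CacheEntry entry = new CacheEntry(fields[0].trim(),
                number(fields, 1), number(fields, 2), number(fields, 3));
        for (int i = 4; i < fields.length; i++) {
            if (!fields[i].trim().isEmpty()) {
                entry.aliases.add(fields[i].trim());
            }
        }
        return entry;
    }

    /** Returns the numeric field at the index, or null when it is missing or empty. */
    private static Double number(String[] fields, int index) {
        if (index >= fields.length || fields[index].trim().isEmpty()) {
            return null;
        }
        return Double.valueOf(fields[index].trim());
    }

    /** True when no geolocation query is needed for this row. */
    boolean hasCoordinates() {
        return latitude != null && longitude != null && radiusKm != null;
    }

    /** True when the query equals the official name or one of the aliases (ignoring case). */
    boolean matches(String query) {
        if (officialName.equalsIgnoreCase(query)) {
            return true;
        }
        for (String alias : aliases) {
            if (alias.equalsIgnoreCase(query)) {
                return true;
            }
        }
        return false;
    }

    /** Returns the first row matching the query, or null on a cache miss. */
    static CacheEntry lookup(List<String> lines, String query) {
        for (String line : lines) {
            if (!line.trim().isEmpty() && parse(line).matches(query)) {
                return parse(line);
            }
        }
        return null;
    }
}
```

A caller would query the geolocation API only when lookup returns null or when hasCoordinates() is false, and then rewrite the file with the new values.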
  
 +
Examples:
 
1) kohad.csv:
 
1) kohad.csv:
  
    tallinn,59.4,24.5,10,kodu,ttü
+
tallinn,59.4,24.5,10,home,ttü
  
If the search "tallinn" is made, the coordinates 59.4 and 24.5 are used, radius = 10 km. The same happens for the queries "ttü" and "kodu". The kohad.csv file does not change.
+
Querying "Tallinn" uses the coordinates 59.4 and 24.5 and a radius of 10 km for the Twitter API. The same happens if "home" or "ttü" is queried. The kohad.csv file does not change.
  
 +
2) kohad.csv
  
2) kohad.csv:
+
pärnu,,,,grandma,summerhouse
  
    pärnu,,,,vanaema,suvila
+
Querying "pärnu", "grandma" or "summerhouse" uses "pärnu" for the location search. The coordinates are queried and the radius is calculated. As a result, the same row in the cache file is filled in with the coordinates and the radius, for example:
  
If the search "pärnu", "vanaema" or "suvila" is made, "pärnu" is looked up with the location API, the coordinates are read out and the radius is calculated. The result is the same row in the file:
+
pärnu,58.3,24.5,5,grandma,summerhouse
  
    pärnu,58.3,24.5,5,vanaema,suvila
+
3) kohad.csv
  
3) kohad.csv:
+
tallinn,59.4,24.5,10,home,ttü
  
    tallinn,59.4,24.5,10,kodu,ttü
+
Querying "pärnu" triggers a location search to get the coordinates, and a radius is calculated. As a result, a new row is added to the cache file:
  
If the search "pärnu" is made, a location search for "pärnu" is performed and the coordinates and radius are saved to the file. The resulting file is:
+
tallinn,59.4,24.5,10,home,ttü
 +
pärnu,58.3,24.5,5
  
    tallinn,59.4,24.5,10,kodu,ttü
+
The solution to this extra task, provided that it is done entirely and correctly, gives you 2 points.
    pärnu,58.3,24.5,5
 
  
This extra task, fully implemented, gives 2 points.
+
== Extra task: Sorting (1p) ==
  
== Extra task: sorting (1p) ==
+
Tweets are presented in a sorted order. Sorting shall be done by one of the specified criteria: author, tweet creation date, tweet itself. It should be possible to sort items in ascending, as well as in descending order.
 +
Examples of program invocation parameters:
  
Tweets are output sorted by the given field. Sorting is possible by the following fields: author, posting date, post (that is, the tweet itself).
 
 
Sorting must work in both ascending and descending order.
 
 
For example, the program is run as:
 
 
* java Twitter Tallinn -sort author
 
* java Twitter Tallinn -sort author
 
* java Twitter Tallinn -sort date
 
* java Twitter Tallinn -sort date
 
* java Twitter Tallinn -sort date desc
 
* java Twitter Tallinn -sort date desc
* java Twitter Tallinn -sort content
+
* java Twitter Tallinn -sort content  
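One possible way to map the sort field and direction from such parameters onto a comparator is sketched below. The Tweet class and the TweetSorter helper are hypothetical names used only for illustration, and case-insensitive ordering of author and content is an assumption, not a requirement of the task.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Date;
import java.util.List;

/** Hypothetical minimal tweet model; a real program would fill it from the API response. */
final class Tweet {
    final String author;
    final Date created;
    final String text;

    Tweet(String author, Date created, String text) {
        this.author = author;
        this.created = created;
        this.text = text;
    }
}

/** Hypothetical helper that sorts tweets by "author", "date" or "content". */
final class TweetSorter {

    /** Builds the comparator for the given field; unknown fields are rejected. */
    static Comparator<Tweet> comparatorFor(String field, boolean descending) {
        Comparator<Tweet> comparator;
        switch (field) {
            case "author":
                comparator = Comparator.comparing((Tweet t) -> t.author, String.CASE_INSENSITIVE_ORDER);
                break;
            case "date":
                comparator = Comparator.comparing((Tweet t) -> t.created);
                break;
            case "content":
                comparator = Comparator.comparing((Tweet t) -> t.text, String.CASE_INSENSITIVE_ORDER);
                break;
            default:
                throw new IllegalArgumentException("Unknown sort field: " + field);
        }
        return descending ? comparator.reversed() : comparator;
    }

    /** Returns a sorted copy and leaves the downloaded list untouched. */
    static List<Tweet> sorted(List<Tweet> tweets, String field, boolean descending) {
        List<Tweet> copy = new ArrayList<>(tweets);
        copy.sort(comparatorFor(field, descending));
        return copy;
    }
}
```

Rejecting an unknown field with an exception gives the caller one place to catch the problem and show the help text instead of a stack trace.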
 +
 
 +
The solution to this task will give you 1 point, provided that it was done correctly and entirely.
 +
 
  
This extra task, fully implemented, gives 1 point.
+
== Extra task: filtering (1p) ==
  
== Extra task: filtering (1p) ==
+
In addition to the location, the user can specify a search keyword and the number of tweets to display. Pass the number of tweets along with the Twitter query, then filter the returned tweets and display only those that match the search keyword.
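The keyword step can be sketched on plain tweet texts as follows. The TweetFilter name is hypothetical, and matching case-insensitively on a substring is an assumption; the task itself does not say how strict the match has to be.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper that applies the search keyword to already downloaded tweets. */
final class TweetFilter {

    /** Keeps only the texts that contain the keyword, ignoring case. */
    static List<String> containing(List<String> texts, String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        List<String> result = new ArrayList<>();
        for (String text : texts) {
            if (text.toLowerCase(Locale.ROOT).contains(needle)) {
                result.add(text);
            }
        }
        return result;
    }
}
```

Because the filter runs after the download, fewer tweets than the requested count may be displayed.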
  
In addition to the place name, the user can give a search keyword and the number of tweets to output. The number of tweets must be passed along when making the query. The search keyword, however, must be applied to the results received.
+
The solution to this task gives you 1 point.  
  
If both limiting the number of tweets and applying the search keyword are implemented, you get 1 point.
+
== Extra task: interactive shell mode and commands to control the execution flow of the program (1p) ==
  
== Extra task: interactive and command-line control (1p) ==
+
This task aims to let the end user launch the program either in the so-called batch mode (by specifying parameters on the command line) or in the interactive shell mode (where the user types commands in a shell-like environment). In interactive mode the user enters a command, the program executes it immediately, and afterwards the user is presented with a prompt waiting for the next command.
  
The idea of this extra task is to allow the program to be run both from the command line:
+
In batch mode, the program parses the command-line arguments, extracts the parameter values, executes its task, and terminates. If no arguments were specified on the command line, the program launches the interactive shell and executes user commands one by one.
  
 +
Example execution from command line:
  
    java Twitter Tallinn -count 50 -sort date desc -search tere
+
java Twitter Tallinn -count 50 -sort date desc -search tere
  
and through an interactive environment where commands can be passed on. An interactive environment means that on startup the program asks the user for input, that is, a command. The command is executed, after which the user can enter a new command (as in a command prompt or terminal). For example:
+
Example of an interactive session:
  
 
     > setcount 50
 
     > setcount 50
Line 212: Line 220:
 
     > print
 
     > print
  
If the program is started with command-line arguments (the first example), it performs the necessary steps and terminates. If no arguments were given on the command line, the interactive environment starts.
+
The program must recognize at least the following commands:
 
+
* Querying the Twitter API (e.g. “query”). It should be possible to pass the number of tweets to display along with the command. Alternatively, the number of tweets may be set with a separate command (e.g. “setcount 50”, if the corresponding extra task has been done), which remains valid for all subsequent queries until it is reset or unset.
The program must know the following commands (in the interactive environment they should be separate commands):
+
* [Only in interactive mode] Displaying results (e.g. “print”). It is acceptable if your program displays the results immediately after the query. However, with respect to other operations it is convenient to implement “print” as a standalone command.
* making a query ("query" in the example). The number of tweets can be passed along with the query (if that extra task is done; "query Pärnu 10" in the example). The limit may also be set separately ("setcount 50" in the example; it then stays in force for subsequent queries).
+
* Sorting (e.g. “sort date desc”), if the corresponding extra task has been done.
* Interactive only: printing the results ("print" in the example). You may also print the results right after the query. For other operations, however, it is convenient if print is separate. When run from the command line, the results are always printed (after the other operations).
+
* Searching (e.g. “search hello”), if the corresponding extra task has been done.
* sorting ("sort date desc" in the example), if the corresponding extra task is done
+
* Context help (e.g. “help”). If the user enters an invalidly formatted argument or a non-existent command, the program is expected to display a context-based help message.
* searching ("search tere" in the example), if the corresponding extra task is done
 
* getting help (for example "help"). If a wrong argument (command-line launch) or command (interactive launch) is entered, the help text should also be shown.
 
 
 
An additional requirement for this extra task is that arguments received from the command line and commands entered interactively are executed in the same way. That is, in both cases they are converted into the same form and then executed. The idea is that your program knows how to operate with different commands. How the program learns those commands from the user does not matter for execution. If you later want to add the possibility of sending commands over a web API, you only need to add receiving the commands and converting them to the common form (as for the command line and the interactive environment); the rest already works.
 
 
 
Although this extra task may seem very large for 1 point, it is actually not very complicated. You have to implement one of the two anyway: either the interactive or the command-line variant. Otherwise you cannot get the other bonus points either. If you take into account, right while creating your program, the possibility of running commands from two different places, no additional development is needed later. If you have finished command-line execution together with the extra tasks and only then start this extra task, implementing it may be somewhat harder.
 
 
 
This extra task gives 1 point.
 
 
 
== Extra task: testable code (1p) ==
 
  
This assignment is not described very strictly, which leaves students freedom to show creativity. However, such freedom hampers testing. Here you are offered an additional bonus point (so 11 points in total if you complete all extras) if you make your code testable. For this we provide several interfaces that you must implement in your program. The interfaces let us describe which methods with which arguments must exist in your code, so we can call and test them.
+
Additional requirement: your program is expected to perform the same tasks in the same way, regardless of how the behavior is triggered (in the interactive shell or through command-line arguments). In both cases the input should be converted to a common form and then executed. The idea is to learn to design your program so that its operational parameters can be specified in various ways without affecting the execution flow. This may prove helpful if some day you wish to add the possibility of processing commands over a web-based API.
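A minimal sketch of such a common form is shown below. The Command and CommandParser classes are hypothetical, and so is the mapping of the flags -count and -location onto the interactive commands setcount and query. The sketch only converts input; the order in which the resulting commands are executed (for example, applying setcount before the query in batch mode) is left to the executor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical common form shared by batch mode and the interactive shell. */
final class Command {
    final String name;
    final List<String> args;

    Command(String name, List<String> args) {
        this.name = name;
        this.args = args;
    }
}

/** Hypothetical converter from both input sources into Command objects. */
final class CommandParser {

    /** Batch mode: "Tallinn -count 50 -sort date desc" becomes query, setcount and sort. */
    static List<Command> fromArguments(String[] argv) {
        List<Command> commands = new ArrayList<>();
        String name = "query";                 // bare leading words name the location
        List<String> args = new ArrayList<>();
        for (String token : argv) {
            if (token.startsWith("-")) {
                addIfUseful(commands, name, args);
                name = normalize(token.replaceFirst("^-+", ""));
                args = new ArrayList<>();
            } else {
                args.add(token);
            }
        }
        addIfUseful(commands, name, args);
        return commands;
    }

    /** Interactive mode: one typed line such as "sort date desc" becomes one command. */
    static Command fromLine(String line) {
        String[] parts = line.trim().split("\\s+");
        return new Command(normalize(parts[0]),
                new ArrayList<>(Arrays.asList(parts).subList(1, parts.length)));
    }

    /** Maps flag names onto the command names used in the shell (assumed mapping). */
    private static String normalize(String name) {
        switch (name) {
            case "count":
                return "setcount";
            case "location":
                return "query";
            default:
                return name;
        }
    }

    /** Skips the implicit leading "query" when no location word preceded the first flag. */
    private static void addIfUseful(List<Command> commands, String name, List<String> args) {
        if (!(name.equals("query") && args.isEmpty())) {
            commands.add(new Command(name, args));
        }
    }
}
```

With this split, a single executor can run the commands without knowing whether they came from the command line or from the prompt.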
  
'''Attention''' If you have already finished your code and want to do this extra task, it may mean that you have to rewrite most of the code. So, if you are going for this extra task, you should structure your code accordingly from the start. Because making the code testable requires a fair amount of additional work (which you would otherwise not have to do), a bonus point is offered for it.
+
Although this extra task may seem complex and demanding, it is not as difficult as it looks. Either way you have to implement one of the two operational modes: the interactive one or the batch one. Otherwise you will not be able to claim the other bonus points. If you consider both modes of operation (and the possibility of acquiring commands from two different sources and then executing them) during the initial planning, no additional development is needed later. We therefore strongly advise you to think through the design of your program before you start writing code.
  
Read more here: [[ITI0011:Säuts lisaülesanne]]
+
This extra task gives you 1 point.
-->
 

Latest revision: 6 October 2014, 08:03

Back to course web page

This is the English version of the second homework. The Estonian version is here: ITI0011:Säuts.

General

Deadline: 21. or 23. October (depending on your practice time)
Defending your homework one week before (or earlier) will give you +1 point.

General requirements:

  • Proper exception handling: stack traces must never appear while your program is running.
  • Every class, field, and method must be documented with Javadoc-style comments. Failure to comply with this requirement will lead to the loss of 1 point.

The goal of the homework is to create an interactive program capable of handling tweets using the Twitter API. The basic part gives you 5 points, and each piece of additional functionality gives you +1 point, up to a maximum of 11 points this time. The basic part is the absolute minimum that must be fulfilled in order to pass the assessment of this homework.

The required functionality:

  • Program accepts command-line arguments and everything is controllable from the arguments (example: java Twitter -location Tallinn -count 40 -sort date desc)
  • If the program is executed without arguments, an interactive command line is started, where the user can write commands ("> query Tallinn" or "> sort date desc")
  • The program has a proper manual (if a command is written incorrectly or some parameters are missing, the help text should be shown; also "java Twitter --help", for example, should print out the manual).
  • The program accepts a location, finds the latest public tweets for that location, and prints them.
  • The number of tweets requested can be changed.
  • Downloaded tweets can be sorted and searched for.

Requirements are written in detail below. The list above is a general overview of the program (there are some more functionality features which need to be implemented).

Extra task: testable code (1p)

This assignment was designed to give students as much flexibility and room for creativity as possible. That same freedom limits the possibilities for testing your program with automated tests. Here we offer an opportunity to earn an additional bonus point (the maximum, counting all bonus points, is still 11) if you make your code testable. To accomplish this task, we provide a set of predefined interfaces that you have to implement in your solution. Interfaces let us describe formally which methods must be present in your solution, which arguments they take, and what they return. Once these interfaces are implemented, the methods can be called with various input parameters and tested in a semi-automated way.

Attention! If you have already finished your solution (or at least part of it) and wish to do this extra task, you may need to re-implement a major part of it. If you aim to complete this extra task, plan the structure of your program beforehand so that it complies with these requirements. Because the task requires extra effort, we offer 1 bonus point for it.

More information: ITI0011:Twitter testable code.

Main part - 5p

The program makes a request to the Twitter API to search public tweets. The latest public tweets for the location given by the user on the command line are downloaded and presented to the user.

Note that you only download the tweets from the Twitter API (nothing special needs to be done to get the latest tweets, as this is the default behavior). Also, when presenting tweets to the user, just print them out in the order in which you receive them.

Required functionality:

  • Read the location from command line (ex. "Tallinn")
  • For the location, find the geographical coordinates (latitude and longitude) using OpenStreetMap API
  • Use the bounding box information to calculate appropriate radius.
  • Send the coordinates and the radius to Twitter API
  • Read out the response from Twitter API into objects.
  • Print out the tweets.


Location coordinates

You can use OpenStreetMap community tool named Nominatim (Nominatim wiki). Given a location name it will return information about this location (coordinates, bounding box and other stuff).

Example request: http://nominatim.openstreetmap.org/search?q=Tallinn&format=xml

You will get a response (only partially shown):

<searchresults timestamp="Sat, 13 Sep 14 21:47:21 +0000" 
attribution="Data © OpenStreetMap contributors, ODbL 1.0. http://www.openstreetmap.org/copyright" 
querystring="Tallinn" polygon="false" 
exclude_place_ids="98174326,11438224,6000303521,6919504,6893196,86869124,15103978,5983246058" 
more_url="http://nominatim.openstreetmap.org/search?format=xml&
exclude_place_ids=98174326,11438224,6000303521,6919504,6893196,86869124,15103978,5983246058&accept-language=en-
US,en;q=0.5&q=Tallinn">
<place place_id="98174326" osm_type="relation" osm_id="2164745" place_rank="16" 
boundingbox="59.351806640625,59.5915794372559,24.5501689910889,24.9262847900391" lat="59.4372155" 
lon="24.7453688" display_name="Tallinn, Harju maakond, Estonia" class="place" type="city" 
importance="0.7819722223575" icon="http://nominatim.openstreetmap.org/images/mapicons/poi_place_city.p.20.png"/>
...
</searchresults>

In the results you will see several places with the name Tallinn (or with a name that includes Tallinn). For this homework, the first result is the one you need, so you have to read the first "place" element. For the Twitter query you need a location (latitude and longitude) and a radius. From the location search result, use the attributes "lat" and "lon" as the center of the Twitter search. For the radius, use the "boundingbox" attribute, which gives you the bounding box around the location. In the given example, you should look for lat="59.4372155", lon="24.7453688", boundingbox="59.351806640625,59.5915794372559,24.5501689910889,24.9262847900391".
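Reading the first "place" element could be sketched with the standard Java DOM parser as follows. The class name NominatimParser and the layout of the returned array are hypothetical; the sketch assumes the response has already been downloaded into a string and that boundingbox lists minimum latitude, maximum latitude, minimum longitude, maximum longitude, as in the example above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper that reads the first place element of a Nominatim XML response. */
final class NominatimParser {

    /**
     * Returns {lat, lon, south, north, west, east} for the first place element,
     * or null when the response contains no place at all.
     */
    static double[] firstPlace(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList places = doc.getElementsByTagName("place");
            if (places.getLength() == 0) {
                return null;
            }
            Element place = (Element) places.item(0);
            String[] box = place.getAttribute("boundingbox").split(",");
            return new double[] {
                Double.parseDouble(place.getAttribute("lat")),
                Double.parseDouble(place.getAttribute("lon")),
                Double.parseDouble(box[0]),
                Double.parseDouble(box[1]),
                Double.parseDouble(box[2]),
                Double.parseDouble(box[3])
            };
        } catch (Exception e) {
            // Wrap parser and number-format problems so that the caller can
            // show a readable message instead of a stack trace.
            throw new IllegalArgumentException("Unreadable location response: " + e.getMessage(), e);
        }
    }
}
```

The caller is expected to catch IllegalArgumentException and to treat a null result as "location not found".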

For the radius, you could simply use the distance between the corners of the bounding box. Beware that a longitude difference of 0.1 degrees corresponds to a different distance at different locations on Earth. For this homework, the calculated radius does not need to be very accurate, but it should still vary with the size of the city (New York > .. > Tallinn > Haapsalu). Don't spend too much time on the radius calculation: calculating it with 1 m accuracy won't give you any extra points.
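One rough way to turn a bounding box into a radius is to take half of the great-circle distance between its opposite corners. This is only one possible choice, not a method prescribed by the assignment; the RadiusEstimator name is hypothetical and the Earth is treated as a sphere with a mean radius of 6371 km.

```java
/** Hypothetical helper that estimates a search radius from a bounding box. */
final class RadiusEstimator {
    private static final double EARTH_RADIUS_KM = 6371.0;

    /** Great-circle (haversine) distance in km between two points given in degrees. */
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    /** Half of the bounding-box diagonal, used as a rough radius in km. */
    static double radiusKm(double south, double north, double west, double east) {
        return distanceKm(south, west, north, east) / 2;
    }
}
```

Because the haversine formula scales the longitude difference by the cosine of the latitude, the same box measured in degrees yields a smaller radius near the poles than near the equator.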

In short, you need to make a query, read the response, and translate the response into center coordinates and a radius.

Note: You could use some other service to get city coordinates (for example Google Maps API).

Twitter API

The Twitter API (https://dev.twitter.com/docs/api/1.1) allows a program to make automatic queries to the social network Twitter. For this homework, you don't need to be an active Twitter user, but you still need an account to make public queries.

To use the Twitter API, you need to have a Twitter account and register an app (application) under that account. Once you have a Twitter account, you can see your applications here: https://apps.twitter.com/ (the same link is on the page dev.twitter.com, in the "TOOLS" list at the bottom of the page). On the application page, you should create a new application. When creating an app, you can provide whatever web page link you want (for example, the course web page).

After creating an application, you will see it on the application page. If you open your application (from the list) and open "API keys" tab, you will see "API key" and "API secret". You need those values to make queries to Twitter API.

From the API, we will use the search query, which is described here: https://dev.twitter.com/rest/public/search and https://dev.twitter.com/rest/reference/get/search/tweets

An example query to get Tallinn tweets within 1km radius: https://api.twitter.com/1.1/search/tweets.json?q=&geocode=59.4372155,24.7453688,1km&result_type=recent

The given link does not work in the browser, because you are not authenticated properly.

Some information about Twitter authentication can be seen here: https://dev.twitter.com/oauth

Doing all this authentication and connection handling manually is a lot of work. It is recommended to use a library which does most of the work for you. We recommend using http://twitter4j.org/ , which makes authentication and queries easier. If you want, you can use some other library.

To use twitter4j you need to get the file twitter4j-core-4.0.2.jar. If you download the zip-file, the jar file is located under the folder "lib". When writing this assignment, 4.0.2 was the latest version. If the version is newer, then the file name changes accordingly. You need to add this jar-file into your project: Project Properties > Java Build Path > Libraries > Add External JARs... and browse to the jar-file.

To configure twitter4j, you need the following code:

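		// Imports needed at the top of the file:
		// import twitter4j.TwitterException;
		// import twitter4j.TwitterFactory;
		// import twitter4j.auth.OAuth2Token;
		// import twitter4j.conf.ConfigurationBuilder;

		// Application-only authentication: only the API key and secret are needed.
		ConfigurationBuilder cb = new ConfigurationBuilder();
		cb.setDebugEnabled(true)
		.setApplicationOnlyAuthEnabled(true);
		cb.setOAuthConsumerKey(TWITTER_CUSTOMER_KEY)
		  .setOAuthConsumerSecret(TWITTER_CUSTOMER_SECRET);
		
		TwitterFactory tf = new TwitterFactory(cb.build());
		twitter4j.Twitter twitter = tf.getInstance();
		
		OAuth2Token token;
		try {
			// Request the token that authorizes the following queries.
			token = twitter.getOAuth2Token();
		} catch (TwitterException e1) {
			// Report the problem without printing a stack trace.
			System.err.println("Could not authenticate with Twitter: " + e1.getMessage());
		}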

In the code snippet above, TWITTER_CUSTOMER_KEY and TWITTER_CUSTOMER_SECRET are constants whose values are taken from the Twitter application web page (the API key and the API secret, respectively). Of course, you can use a configuration file or some other method to set the API keys (you can read more about that on the library's web page).

You need to find out yourself how to actually make the query, but the library's web page has a lot of examples. I also recommend checking the project's GitHub page, where you can find the source code with tests. The tests show different usages of the library.

If you already have a Twitter account, you can use that account to create the app. The general goal is to get public tweets, but you could instead get the tweets of your friends. This does not give you extra points. Note that if you want to request your friends' tweets, you need to add your account keys. More information can be found here: http://twitter4j.org/en/configuration.html . If you don't provide accessToken information (as in the example above), you can only get public tweets (which is OK for this homework).


Extra task: location buffering (2p)

The functionality required to be implemented in this task includes the following:

  • Results of location queries are buffered in a file (local cache) so that there is no need to query the web service, in case the result is present in the cache.
  • One should be able to edit the cache file manually (using some sort of editor) and populate it with additional entries (e.g. “home”, “TUT”, etc.)

You are expected to use the file named “kohad.csv” as the local cache. This file must reside in a path accessible by your program (e.g. project classpath) and should contain CSV-formatted rows of text:

   ametlik_nimi,latitude,longitude,raadius_km,alternatiivnenimi_1,...,alternatiivnenimi_N

Alternatively, it may contain rows where some of the fields (e.g. coordinates or radius) are empty:

   ametlik_nimi,,,,alternatiivnenimi_1,...,alternatiivnenimi_N

where:

  • ametlik_nimi is the official name of the location of interest
  • alternatiivnenimi_1 ... alternatiivnenimi_N are alternative names that the end user might wish to enter. For instance, a user might give the alternative name “home” to a certain location. The search itself is still done using the official name of the location (the very first field)
  • latitude, longitude and raadius_km are the coordinates and the search radius in kilometres; they may be absent
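
A row in this format can be split into its fields in a few lines. The sketch below is one possible approach, not a required design: the class CachedLocation and its members are hypothetical, names are assumed to contain no commas or quotes, and matching is assumed to be case-insensitive.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical value class for one row of kohad.csv. */
class CachedLocation {
    final String officialName;
    final Double latitude;   // null when the field is empty
    final Double longitude;  // null when the field is empty
    final Double radiusKm;   // null when the field is empty
    final List<String> alternativeNames = new ArrayList<>();

    CachedLocation(String officialName, Double latitude, Double longitude, Double radiusKm) {
        this.officialName = officialName;
        this.latitude = latitude;
        this.longitude = longitude;
        this.radiusKm = radiusKm;
    }

    /** Parses one row; a malformed number raises NumberFormatException. */
    static CachedLocation parse(String row) {
        // The limit -1 keeps empty fields, so ",,," still yields separate columns.
        String[] f = row.split(",", -1);
        CachedLocation loc = new CachedLocation(f[0].trim(),
                number(f, 1), number(f, 2), number(f, 3));
        for (int i = 4; i < f.length; i++) {
            if (!f[i].trim().isEmpty()) {
                loc.alternativeNames.add(f[i].trim());
            }
        }
        return loc;
    }

    private static Double number(String[] fields, int index) {
        if (index >= fields.length || fields[index].trim().isEmpty()) {
            return null;
        }
        return Double.valueOf(fields[index].trim());
    }

    /** True when coordinates and radius are all present. */
    boolean hasPosition() {
        return latitude != null && longitude != null && radiusKm != null;
    }

    /** True when the query equals the official name or one of the alternative names. */
    boolean matches(String query) {
        if (officialName.equalsIgnoreCase(query)) {
            return true;
        }
        for (String alt : alternativeNames) {
            if (alt.equalsIgnoreCase(query)) {
                return true;
            }
        }
        return false;
    }
}
```

Keeping the numeric fields as nullable Double values makes it easy to tell a row with an empty position from a row with real coordinates.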

The program uses the local cache file in the following way:

  • Before querying the API the program first tries to locate the location of interest in the local cache file.
    • If the location was found and the position and radius data is present, this data is used in a query to Twitter API. No attempt to determine the geolocation is undertaken.
    • If the location was found, but the position and radius data is not present, then:
      • The program queries the geolocation API to get the position of the location of interest, extracts the coordinates from the response and computes the radius from it.
      • The program populates the corresponding entry in the local cache file with the data obtained in the previous step. The next time the same query is launched, it will not trigger a query to the geolocation API.
    • If the location was not found, then the program is expected to make a new query regarding the location and to cache the data regarding the query in the local cache.
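
The three cases above can be sketched as follows. This is only an illustration under several assumptions: Geocoder is a hypothetical stand-in for the geolocation service, LocationCache keeps the rows in memory (reading and writing kohad.csv is left out), every row is assumed to have at least four fields, and names are matched case-insensitively.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the geolocation service: returns {lat, lon, radiusKm}. */
interface Geocoder {
    double[] locate(String officialName);
}

/** Sketch of the lookup order; rows are held in memory as arrays of CSV fields. */
class LocationCache {
    private final List<String[]> rows = new ArrayList<>();
    private final Geocoder geocoder;

    LocationCache(List<String> csvLines, Geocoder geocoder) {
        for (String line : csvLines) {
            rows.add(line.split(",", -1)); // -1 keeps empty fields
        }
        this.geocoder = geocoder;
    }

    /** Returns {lat, lon, radiusKm} for an official or alternative name. */
    double[] resolve(String name) {
        for (String[] f : rows) {
            if (!matches(f, name)) {
                continue;
            }
            if (!f[1].isEmpty() && !f[2].isEmpty() && !f[3].isEmpty()) {
                // Case 1: complete entry, no geolocation query is made.
                return new double[] {Double.parseDouble(f[1]),
                        Double.parseDouble(f[2]), Double.parseDouble(f[3])};
            }
            // Case 2: entry without position; query by official name and fill the row.
            double[] pos = geocoder.locate(f[0]);
            for (int i = 0; i < 3; i++) {
                f[i + 1] = String.valueOf(pos[i]);
            }
            return pos;
        }
        // Case 3: unknown location; query it and append a new row.
        double[] pos = geocoder.locate(name);
        rows.add(new String[] {name, String.valueOf(pos[0]),
                String.valueOf(pos[1]), String.valueOf(pos[2])});
        return pos;
    }

    private static boolean matches(String[] fields, String name) {
        if (fields[0].equalsIgnoreCase(name)) {
            return true;
        }
        for (int i = 4; i < fields.length; i++) {
            if (fields[i].equalsIgnoreCase(name)) {
                return true;
            }
        }
        return false;
    }

    /** The rows as CSV lines, ready to be written back to the cache file. */
    List<String> toCsvLines() {
        List<String> lines = new ArrayList<>();
        for (String[] f : rows) {
            lines.add(String.join(",", f));
        }
        return lines;
    }
}
```

Passing the geolocation service in through an interface means the cache logic can be tried out with a fake service that returns fixed coordinates.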

Examples: 1) kohad.csv:

tallinn,59.4,24.5,10,home,ttü

When "Tallinn" is queried, the coordinates 59.4 and 24.5 and a radius of 10 km are used for the Twitter API. The same happens if "home" is queried.

2) kohad.csv:

pärnu,,,,grandma,summerhouse

When "pärnu", "grandma" or "summerhouse" is queried, "pärnu" is used for the location search. The coordinates are queried and the radius is calculated. As a result, the same row in the cache file should be filled in with the coordinates and the radius, for example:

pärnu,58.3,24.5,5,grandma,summerhouse

3) kohad.csv:

tallinn,59.4,24.5,10,home,ttü

When "pärnu" is queried, a location search is done to get the coordinates, and the radius is calculated. As a result, a new row is added to the cache file:

tallinn,59.4,24.5,10,home,ttü
pärnu,58.3,24.5,5

The solution to this extra task, provided that it is done entirely and correctly, gives you 2 points.

Extra task: Sorting (1p)

Tweets are presented in sorted order. Sorting is done by one of the following criteria: author, tweet creation date, or tweet content. It should be possible to sort in ascending as well as descending order. Examples of program invocation parameters:

  • java Twitter Tallinn -sort author
  • java Twitter Tallinn -sort date
  • java Twitter Tallinn -sort date desc
  • java Twitter Tallinn -sort content

The solution to this task will give you 1 point, provided that it was done correctly and entirely.
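
One way to support the sort criteria and both directions is to pick a Comparator by name and reverse it when "desc" is given. The sketch below assumes a hypothetical Tweet class with author, creation date and text; a real program would take these values from the library's own tweet object. Comparing author and content case-insensitively is also an assumption.

```java
import java.util.Comparator;
import java.util.Date;

/** Hypothetical minimal tweet holder used only for this illustration. */
class Tweet {
    final String author;
    final Date createdAt;
    final String text;

    Tweet(String author, Date createdAt, String text) {
        this.author = author;
        this.createdAt = createdAt;
        this.text = text;
    }
}

class TweetSorter {
    /** field is "author", "date" or "content"; descending reverses the order. */
    static Comparator<Tweet> comparatorFor(String field, boolean descending) {
        Comparator<Tweet> comparator;
        switch (field) {
            case "author":
                comparator = Comparator.comparing((Tweet t) -> t.author,
                        String.CASE_INSENSITIVE_ORDER);
                break;
            case "date":
                comparator = Comparator.comparing((Tweet t) -> t.createdAt);
                break;
            case "content":
                comparator = Comparator.comparing((Tweet t) -> t.text,
                        String.CASE_INSENSITIVE_ORDER);
                break;
            default:
                // Lets the caller show a help message instead of a stack trace.
                throw new IllegalArgumentException("Unknown sort field: " + field);
        }
        return descending ? comparator.reversed() : comparator;
    }
}
```

A list of tweets can then be sorted with tweets.sort(TweetSorter.comparatorFor("date", true)).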


Extra task: filtering (1p)

In addition to the location, it is possible to specify a search keyword and the size of the output (the number of tweets to display). You should pass the number of tweets along with your query, then filter its results and display only the ones matching the search keyword.

The solution to this task gives you 1 point.
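
The filtering step can be sketched as below. The class and method names are hypothetical, the input list stands for the tweet texts already returned by a query that was limited to the requested count, and case-insensitive matching is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class TweetFilter {
    /** Keeps the texts that contain the keyword, preserving their order. */
    static List<String> byKeyword(List<String> texts, String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        List<String> matching = new ArrayList<>();
        for (String text : texts) {
            if (text.toLowerCase(Locale.ROOT).contains(needle)) {
                matching.add(text);
            }
        }
        return matching;
    }
}
```

Because the count is applied to the query and the keyword is applied afterwards, the filtered list may contain fewer tweets than the requested count.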

Extra task: interactive shell mode and commands to control the execution flow of the program (1p)

This task aims to let the end user launch the program either in the so-called batch mode (by specifying parameters on the command line) or in the interactive shell mode (where the user types commands in an interactive shell-like environment). In interactive mode, the user enters a control command and the program executes it immediately. Afterwards the end user is presented with a prompt waiting for the next command.

In case the program was launched in the batch mode, it parses command line arguments, extracts the parameter values, executes its task and terminates. If no arguments were specified on the command line the program launches the interactive shell and executes user commands one by one.

Example execution from command line:

java Twitter Tallinn -count 50 -sort date desc -search tere

Example of an interactive session:

   > setcount 50
   > query Tallinn
   > print
   > query Pärnu 10
   > sort date desc
   > print
   > search tere
   > print

The program must recognize at least the following commands:

  • Querying the Twitter API (e.g. “query”). It should be possible to pass the number of tweets to display along with the command. Alternatively, you may set the number of tweets to display with a separate command (e.g. “setcount 50”, in case the corresponding extra task has been accomplished), which remains valid for all subsequent queries until it is reset or unset.
  • [Only in interactive mode] Displaying results (e.g. “print”). It is acceptable if your program displays the results immediately after the query has been launched. However, with respect to other operations it would be convenient if you consider implementing the “print” functionality as a standalone command.
  • Sorting (e.g. “sort date desc”) in case the corresponding extra task has been accomplished.
  • Searching (e.g. “search hello”), in case the corresponding extra task has been accomplished.
  • Context help (e.g. “help”). If the end user enters an invalid argument or a non-existent command, your program is expected to display a context-based help message.

Additional requirement: your program is expected to accomplish the same tasks in the same way, regardless of how the behavior is triggered (in the interactive shell or through command line arguments). In both cases the input should be converted to a common form and then executed immediately. The idea is to learn to design your program so that its operational parameters can be specified in various ways, and the way in which they are specified does not affect the program's execution flow. This may come in handy if some day you wish to process commands over a web-based API.
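
One possible shape for such a common form is a list of simple command objects that both input paths produce. The sketch below is an assumption, not the required design: Command and CommandParser are hypothetical names, and mapping the option -count to the command setcount is just one way to line the two modes up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical common form: a command name plus its arguments. */
class Command {
    final String name;
    final List<String> args;

    Command(String name, List<String> args) {
        this.name = name;
        this.args = args;
    }

    @Override
    public String toString() {
        return (name + " " + String.join(" ", args)).trim();
    }
}

class CommandParser {
    /** Interactive mode: one typed line, e.g. "sort date desc". */
    static Command fromLine(String line) {
        String[] parts = line.trim().split("\\s+");
        return new Command(parts[0], Arrays.asList(parts).subList(1, parts.length));
    }

    /** Batch mode: e.g. Tallinn -count 50 -sort date desc -search tere */
    static List<Command> fromArgs(String[] argv) {
        List<String> location = new ArrayList<>();
        Map<String, List<String>> options = new LinkedHashMap<>();
        List<String> current = location;
        for (String arg : argv) {
            if (arg.startsWith("-")) {
                // A new option starts; the following words are its values.
                current = new ArrayList<>();
                options.put(arg.substring(1), current);
            } else {
                current.add(arg);
            }
        }
        // Emit the commands in the order an interactive user would type them.
        List<Command> commands = new ArrayList<>();
        if (options.containsKey("count")) {
            commands.add(new Command("setcount", options.get("count")));
        }
        commands.add(new Command("query", location));
        if (options.containsKey("sort")) {
            commands.add(new Command("sort", options.get("sort")));
        }
        if (options.containsKey("search")) {
            commands.add(new Command("search", options.get("search")));
        }
        commands.add(new Command("print", Collections.<String>emptyList()));
        return commands;
    }
}
```

With this split, a single executor that understands Command objects serves both modes, and a third input source would only need its own parser.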

Although this extra task may seem complex and demanding, it is not as difficult as it looks. Either way, you have to implement one of the two suggested operational modes, the interactive one or the batch one; otherwise you will not be able to claim your bonus points. If you consider both modes of operation during the initial planning (that is, the possibility of acquiring commands from two different sources and then executing them), the task becomes much easier. We therefore strongly advise you to think the design of your program through before you start writing code.

This extra task gives you 1 point.