Data Mining and network analysis IDN0110 2016

Spring 2015/2016

IDN0110: Data Mining and network analysis

Taught by: Sven Nõmm

EAP: 6.0

Time and place: NB! The times and places of the lectures on even weeks have been changed!

 Lectures: Mondays     17:15-18:45  ICT-312
 Labs:     Tuesdays    14:00-15:30  ICT-401


Consultations and Examinations (Preliminary).

Place: ICT-405

Tuesday 24.05 Consultation 14:00 - 15:30

Tuesday 31.05 Examination 1 14:00 - 15:30

Friday 10.06 Examination 2 14:00 - 15:30

Tuesday 14.06 Make up Examination 14:00 - 15:30


Consultation: by appointment only, Thursdays 17:30-18:30. Additional information: sven.nomm@ttu.ee

Overview

The course aims to provide knowledge of the theory behind different data mining methods and to develop practical skills in applying those methods. It is built around four "super problems" of data mining:

  • Clustering
  • Classification
  • Association pattern mining
  • Outlier analysis

Main topics of the course:

  • Data types and Data Preparation
  • Similarity and Distances, Association Pattern Mining
  • Cluster Analysis, Classification, Outlier Analysis
  • Data Streams, Text Data, Time Series, Discrete Sequences
  • Spatial Data, Graph Data, Web Data, Social Network Analysis
  • Privacy-Preserving Data Mining

Evaluation

  • 2x mandatory closed book tests. Each test gives 10% of the final grade.
  • 4x mandatory home assignments (computational assignment + short write-up). Together they give 30% of the final grade (computed on the basis of the three best results).
  • Final exam (gives 50% of the final grade): written report on an assigned topic + discussion with the lecturer.

Exam prerequisites: both closed book tests are accepted (graded as 51 or higher), all 4 home assignments are accepted (graded as 51 or higher).

  • score ≥ 91 -- grade 5 (excellent)
  • 81 ≤ score ≤ 90 -- grade 4 (very good)
  • 71 ≤ score ≤ 80 -- grade 3 (good)
  • 61 ≤ score ≤ 70 -- grade 2 (satisfactory)
  • 51 ≤ score ≤ 60 -- grade 1 (acceptable)

score ≤ 50 -- the student has failed to pass
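
The weighting described above can be sketched in a few lines of Python. This is only an illustration, not the official grading procedure: the function names are made up for this example, and it assumes that all scores are on a 0-100 scale, that the three best home assignments are averaged, and that grade boundaries are inclusive at 51, 61, 71, 81 and 91.

```python
def exam_allowed(tests, assignments):
    """Check the exam prerequisites: every test and assignment graded 51 or higher."""
    return all(s >= 51 for s in tests) and all(s >= 51 for s in assignments)


def final_score(tests, assignments, exam):
    """Weighted final score on a 0-100 scale (illustrative sketch only)."""
    if len(tests) != 2 or len(assignments) != 4:
        raise ValueError("expected 2 test scores and 4 assignment scores")
    # Assumption: the three best assignment results are averaged.
    best_three = sorted(assignments, reverse=True)[:3]
    homework = sum(best_three) / 3
    # 10% per test, 30% for home assignments, 50% for the final exam.
    return 0.10 * tests[0] + 0.10 * tests[1] + 0.30 * homework + 0.50 * exam


def grade(score):
    """Map a 0-100 score to a grade from 5 (excellent) down to 0 (failed)."""
    # Assumption: lower boundaries are inclusive.
    for threshold, mark in ((91, 5), (81, 4), (71, 3), (61, 2), (51, 1)):
        if score >= threshold:
            return mark
    return 0  # failed to pass


if __name__ == "__main__":
    tests = [80, 90]
    assignments = [60, 70, 80, 90]
    if exam_allowed(tests, assignments):
        score = final_score(tests, assignments, exam=75)
        print(score, grade(score))
```

In this example the weakest assignment (60) is dropped, so only the results 90, 80 and 70 count towards the home assignment share.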

Lectures

Lecture slides, files, links and other necessary information will appear here before each lecture or practice.

Lecture 1: Introduction and Data Preparation

Slides

Practice 1 (due to software problems, the lab will be repeated on 9.02.2016)

Lab 1 manual Exercise 1 Exercise 2 Data file for the exercise 1 Data file for the exercise 2

Lecture 2: Distance and Similarity Part I

Slides

Practice 2

Example 1 Example 2 Data

Lecture 3: Distance and Similarity Part I

Slides

NB! The Moodle environment for the course has been activated.

If you need the code to enroll, please contact the teacher by e-mail. I will continue to upload lecture slides here. All other resources, including home assignments, will be available through Moodle only!

Lecture 4: Distance and Similarity Part I

Slides

Lecture 5: Cluster Analysis

Slides

Home Assignment 1

NB! Home Assignment 1 is available in the Moodle environment of this course! To access it, you need a login and password for ained.ttu.ee and must enroll yourself in the course!

Lecture 6: Cluster Analysis

Slides I Slides II

Lecture 7: Outlier Analysis

Slides

Lecture 8: Outlier Analysis

Slides

Lecture 9: Closed Book Test

07.04.2016

There will be no consultation today.

Lecture 10: Mining Data Streams

Slides

Lecture 11: Mining Time Series

Slides

April 25th: The lecture is cancelled! Please accept my apology. The practice on Tuesday the 26th will take place according to the schedule.