Difference between revisions of "Data Mining (ITI8730)"

Source: Kursused
 
(33 intermediate revisions by the same user not shown)
Line 1: Line 1:
  
Information for perspective students:
+
<span style="color:red"> Information for perspective students:</span>
  
Up-to-date information about the course will be added to this page by 31.08.2023.
+
<span style="color:red"> Lecture schedule and slides content are tentative.  Please follow the course page in TalTech Moodle for up to date information and lecture content!!!</span>
Below you can see slides from the previous year. TYesting procedures will change!  
 
  
The course is open to students with valid TalTech UniID!
+
<span style="color:red"> The course is open to students with valid TalTech UniID!
The course targets M.Sc. curricula students.  It is expected that the students are familiar with the Calculus, Linear algebra, Probability, Statistics and possess basic to intermediate knowledge of at least one programming language.  
+
The course targets M.Sc. curricula students.  It is expected that the students are familiar with the Calculus, Linear algebra, Probability, Statistics and possess basic to intermediate knowledge of at least one programming language. This course is not recommended for students of B.Sc. curricula.
 +
</span>
  
This course is not recommended for students of B.Sc. curricula.  
+
<span style="color:red">
 +
Code to join course page  in Moodle and MS Teams will be provided to the students via ÕIS e-mail on Monday September the 4th.  
 +
</span>
  
 +
<span style="color:red">
 +
Those planning to use their own computers please install "R" and "R-studio".
 +
</span>
  
  
 
+
Fall 2023
 
 
 
 
 
 
Fall 2022/2023
 
  
 
ITI8730: Data Mining and network analysis
 
ITI8730: Data Mining and network analysis
Line 23: Line 24:
  
 
Taught by: Sven Nõmm
 
Taught by: Sven Nõmm
 +
 +
Teaching assistants Ilja Matjas, Rajesh Kalakoti
  
 
EAP: 6.0
 
EAP: 6.0
 
   
 
   
Lectures:  Tuesdays  
+
Lectures:  Tuesdays 12:15 - 13:45 ICT-A1
 
                        
 
                        
Labs (practices):     
+
Labs (practices):    Thursdays 14:00 - 15:30 ICT-404
  
 
Link to join MS Teams  
 
Link to join MS Teams  
 
  
 
Consultation: '''by appointment only''' Please do not hesitate to ask for appointment!!!
 
Consultation: '''by appointment only''' Please do not hesitate to ask for appointment!!!
Line 51: Line 53:
 
* Cluster Analysis, Classification, Outlier analysis
 
* Cluster Analysis, Classification, Outlier analysis
 
* Data streams, Text Data, Time Series, Discrete Sequences,
 
* Data streams, Text Data, Time Series, Discrete Sequences,
* Spatial Data, Graph Data, Web Data, Social Network Analysis
+
* Graph Data, Social Network Analysis
  
 
==Evaluation==
 
==Evaluation==
*2x mandatory open book tests. Each test gives 10% of the final grade. One make-up attempt for each test.
+
*2x mandatory closed book tests. Each test gives 10% of the final grade. One make-up attempt for each test.
 
*3x mandatory home assignments (Computational assignment +short write up.) Each assignment gives 10% of the final grade. Late (after deadline) assignments are accepted with penalty of 10% for each day except Saturdays and Sundays.
 
*3x mandatory home assignments (Computational assignment +short write up.) Each assignment gives 10% of the final grade. Late (after deadline) assignments are accepted with penalty of 10% for each day except Saturdays and Sundays.
 
*final exam (gives 50 % of the final grade): Written report on assigned topic + discussion with lecturer.
 
*final exam (gives 50 % of the final grade): Written report on assigned topic + discussion with lecturer.
Line 61: Line 63:
 
Home assignments, code examples, data files and useful links will be distributed by means of Moodle environment. Course enrollment  process in Moodle TBA.
 
Home assignments, code examples, data files and useful links will be distributed by means of Moodle environment. Course enrollment  process in Moodle TBA.
  
=Lectures =
+
=Lectures and Time line =
== Week 1  30.08.22 Distance function ==
+
== 05.09.23 Distance function ==
[[Media:Lecture1_DM2022_Introduction_distance_function.pdf ‎|Slides]]
+
[[Media:Lecture_01_DM2023_Introduction_distance_functions.pdf ‎|Slides]]
 
 
 
 
== Week 2  06.09.22 Cluster analysis I ==
 
[[Media:Lecture_02_DM2022_Cluster_analysis_I.pdf ‎|Slides]]
 
 
 
 
 
== Week 3  13.09.22 Cluster analysis II ==
 
[[Media:Lecture_02_DM2022_Cluster_analysis_II.pdf ‎|Slides]]
 
 
 
== Week 4  20.09.22 Cluster analysis III ==
 
[[Media:Lecture_04_DM_2022_Cluster_analysis_EM.pdf ‎|Slides]]
 
 
 
 
 
== Week 5  27.09.22 Outlier analysis ==
 
[[Media:Lecture_05_DM2022_Anomaly_and_Outlier_Analysis.pdf ‎|Slides]]
 
 
 
 
 
== Week 6  4.10.22 Classification I ==
 
[[Media:Lecture_06_DM2022_Classification.pdf ‎|Slides]]
 
  
== Week 7  11.10.22 Classification II ==
+
== 12.09.23 Cluster analysis I ==
[[Media:Lecture_07_Classification_2_DM_2022.pdf ‎|Slides]]
+
[[Media:Lecture_02_DM2023_Cluster_analysis_I.pdf ‎|Slides]]
  
== Week 8  18.10.22 Regression ==
+
== 19.09.23 Cluster analysis II ==
[[Media:Lecture_08_Data_preparation_regression_DM_2022.pdf ‎|Slides]]
+
[[Media:Lecture_03_DM2023_Cluster_analysis_II.pdf ‎|Slides]]
  
 +
[[Media:Practice_03_DM_2023_Cluster_analysis_EM_algorithm.pdf ‎|Slides (Practice)]]
  
== Week 9  25.10.22 Association Pattern mining ==
+
== 26.09.23 Anomaly and outlier analysis ==
[[Media:Lecture_09_DM2022_Association_Pattern_Mining.pdf ‎|Slides]]
+
[[Media:Lecture_04_DM2023_Anomaly_and_Outlier_Analysis.pdf ‎|Slides]]
  
== Week 9  27.10.22 Open book test I ==
+
== 03.10.23 Classification I ==
 +
[[Media:Lecture_05_DM2023_Classification_I.pdf ‎|Slides]]
  
== Week 10 01.11.22 Distance and Similarity II ==
+
== 10.10.23 Classification II ==
[[Media:Lecture_10_DM2022_Similarity_and_Distance_2.pdf ‎|Slides]]
+
[[Media:Lecture_06_Classification_II_DM_2023.pdf ‎|Slides]]
  
 +
== 17.10.23 Regression analysis ==
 +
[[Media:Lecture_07_DM2023_Regression_analysis_and_data_preparation.pdf ‎|Slides]]
  
== Week 11  08.11.22 Mining the Time series ==
+
== 24.10.23 Association Pattern mining ==
[[Media:Lecture_11_DM2022_Mining_TimeSeries.pdf ‎|Slides]]
+
[[Media:Lecture_08_DM2023_Association_Pattern_Mining.pdf ‎|Slides]]
  
== Week 11  08.11.22 Mining data streams ==
+
== 31.10.23 Closed Book Test I ==
[[Media:Lecture_11_DM2022_Mining_Data_Streams.pdf ‎|Slides]]
 
  
== Week 12  15.11.22 Text data mining ==
+
== 07.11.23 Distance and Similarity II  ==
[[Media:Lecture_12_DM2022_TextDataMining.pdf ‎|Slides]]
+
[[Media:Lecture_09_DM2023_Similarity_and_Distance_2.pdf ‎|Slides]]
  
== Week 13  22.11.22 Graph data mining and Social analysis ==
+
== 14.11.23 Mining the Time series ==
[[Media:Lecture_13_DM2022_Mining_Data_Graph_Data.pdf ‎|Slides]]
+
[[Media:Lecture_10_DM2023_Mining_Time_Series.pdf ‎|Slides]]
  
[[Media:Lecture_14_DM2022_Social_Network_analysis.pdf ‎|Slides]]
+
== 21.11.23 Mining data streams ==
 +
[[Media:Lecture_11_DM2023_Mining_Data_Streams.pdf ‎|Slides]]
  
Home assignment 3 will be published on 24.11.2022
+
== 28.11.23 Text data mining ==
 +
[[Media:Lecture_12_DM2023_Text_Data_Mining.pdf ‎|Slides]]
  
Make up test 1 26.12.22
+
== 05.12.23 Graph data mining and Social analysis ==
 +
[[Media:Lecture_13_DM2023_Mining_Data_Graph_Data.pdf ‎|Slides]]
  
== Week 14  29.11.22 Privacy preserving data mining==
+
[[Media:Lecture_13_DM2023_Social_Network_analysis.pdf ‎|Slides]]
[[Media:Lecture_15_DM2022_Privacy_preserving_data_mining.pdf ‎|Slides]]
 
  
== Week 15  06.12.22 and 08.12.22 Online practices devoted to text data mining and graph data mining ==
+
== 12.12.23 Privacy preserving data mining==
Note that practice material is available only in TalTech  Moodle environment!!!
+
[[Media:Lecture_14_DM2023_Privacy_preserving_data_mining.pdf ‎|Slides]]
  
== Week 16  13.12.22 Open book test 2 ==
+
== 19.12.23 Closed Book Test II ==
Make-up test if necessary will be given on 15.12.2022
 

Latest revision: 31 August 2023, 08:15

Information for prospective students:

The lecture schedule and slide content are tentative. Please follow the course page in TalTech Moodle for up-to-date information and lecture content!

The course is open to students with a valid TalTech UniID! The course targets M.Sc. curricula students. Students are expected to be familiar with calculus, linear algebra, probability and statistics, and to have basic to intermediate knowledge of at least one programming language. This course is not recommended for students of B.Sc. curricula.

The code to join the course page in Moodle and MS Teams will be sent to students via ÕIS e-mail on Monday, September 4th.

Those planning to use their own computers: please install "R" and "RStudio".


Fall 2023

ITI8730: Data Mining and network analysis

The old code for this course is IDN0110.

Taught by: Sven Nõmm

Teaching assistants: Ilja Matjas, Rajesh Kalakoti

EAP: 6.0

Lectures: Tuesdays 12:15 - 13:45 ICT-A1

Labs (practices): Thursdays 14:00 - 15:30 ICT-404

Link to join MS Teams

Consultation: by appointment only. Please do not hesitate to ask for an appointment! For communication, please use the following e-mail: sven.nomm@taltech.ee

Prerequisites to join the course

Students are expected to be familiar with the foundations of calculus, linear algebra, probability theory and statistics, and to know at least one programming language.

Overview

The course aims to provide knowledge of the theory behind different data mining methods and to develop practical skills in applying them. It is built around four "super problems" of data mining:

  • Clustering
  • Classification
  • Association pattern mining
  • Outlier analysis

Main topics of the course:

  • Data types and Data Preparation
  • Similarity and Distances, Association Pattern Mining
  • Cluster Analysis, Classification, Outlier Analysis
  • Data Streams, Text Data, Time Series, Discrete Sequences
  • Graph Data, Social Network Analysis

Evaluation

  • 2x mandatory closed-book tests. Each test gives 10% of the final grade. One make-up attempt is allowed for each test.
  • 3x mandatory home assignments (computational assignment + short write-up). Each assignment gives 10% of the final grade. Late (after deadline) assignments are accepted with a penalty of 10% for each day, except Saturdays and Sundays.
  • Final exam (gives 50% of the final grade): written report on an assigned topic + discussion with the lecturer.
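
The late-submission rule above can be illustrated with a minimal sketch. It assumes that scores are on a 0-100 scale, that "10% for each day" means 10 percentage points deducted per counted day, and that every calendar day after the deadline counts except Saturdays and Sundays. The page does not specify these details, so the interpretation, the function names and the dates below are all hypothetical.

```python
from datetime import date, timedelta

# Assumed interpretation: 10 percentage points per counted late day.
PENALTY_PER_DAY = 10


def late_weekdays(deadline: date, submitted: date) -> int:
    """Count days after the deadline, up to and including the
    submission day, skipping Saturdays and Sundays."""
    days = 0
    current = deadline
    while current < submitted:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            days += 1
    return days


def penalized_score(raw_score: float, deadline: date, submitted: date) -> float:
    """Apply the per-day penalty; the score never drops below zero."""
    penalty = PENALTY_PER_DAY * late_weekdays(deadline, submitted)
    return max(0.0, raw_score - penalty)


# Hypothetical example: deadline on Friday, submission on the following Monday.
# Saturday and Sunday are skipped, so only Monday counts as a late day.
print(late_weekdays(date(2023, 10, 6), date(2023, 10, 9)))
print(penalized_score(80, date(2023, 10, 6), date(2023, 10, 9)))
```

The actual penalty calculation is decided by the lecturer; this sketch only shows one way the weekend exemption could be counted.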

Exam prerequisites: both closed-book tests are accepted (graded 51 or higher) and all 3 home assignments are accepted (graded 51 or higher).
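
As a rough illustration of how the weights and the exam prerequisites fit together, the sketch below combines the component scores into a final grade. It assumes every component is graded on a 0-100 scale (suggested by the threshold of 51, but not stated explicitly); the function names and the sample scores are hypothetical.

```python
# Weights in percent, taken from the evaluation rules: 2 x 10 + 3 x 10 + 50 = 100.
TEST_WEIGHT = 10
ASSIGNMENT_WEIGHT = 10
EXAM_WEIGHT = 50
ACCEPT_THRESHOLD = 51  # a test or assignment is accepted at 51 or higher


def exam_allowed(tests: list[float], assignments: list[float]) -> bool:
    """Both tests and all three assignments must be accepted."""
    return (
        len(tests) == 2
        and len(assignments) == 3
        and all(score >= ACCEPT_THRESHOLD for score in tests + assignments)
    )


def final_grade(tests: list[float], assignments: list[float], exam: float) -> float:
    """Weighted sum of all components, on the assumed 0-100 scale."""
    weighted = (
        TEST_WEIGHT * sum(tests)
        + ASSIGNMENT_WEIGHT * sum(assignments)
        + EXAM_WEIGHT * exam
    )
    return weighted / 100


# Hypothetical scores for one student.
tests = [80, 70]
assignments = [90, 60, 75]
if exam_allowed(tests, assignments):
    print(final_grade(tests, assignments, exam=85))
```

With these sample scores the tests contribute 15 points, the assignments 22.5 and the exam 42.5.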

Home assignments, code examples, data files and useful links will be distributed via the Moodle environment. The course enrollment process in Moodle is TBA.

Lectures and Timeline

05.09.23 Distance function

Slides

12.09.23 Cluster analysis I

Slides

19.09.23 Cluster analysis II

Slides

Slides (Practice)

26.09.23 Anomaly and outlier analysis

Slides

03.10.23 Classification I

Slides

10.10.23 Classification II

Slides

17.10.23 Regression analysis

Slides

24.10.23 Association Pattern mining

Slides

31.10.23 Closed Book Test I

07.11.23 Distance and Similarity II

Slides

14.11.23 Mining the Time series

Slides

21.11.23 Mining data streams

Slides

28.11.23 Text data mining

Slides

05.12.23 Graph data mining and Social analysis

Slides

Slides

12.12.23 Privacy preserving data mining

Slides

19.12.23 Closed Book Test II