Course: Data mining with R and KNIME

Course title Data mining with R and KNIME
Course code KI/0197
Organizational form of instruction Seminar
Level of course Bachelor
Year of study not specified
Semester Winter and summer
Number of ECTS credits 2
Language of instruction Czech
Status of course Compulsory-optional
Form of instruction Face-to-face
Work placements This is not an internship
Recommended optional programme components None
Lecturer(s)
  • Babichev Sergii, prof. DSc.
Course content
1. Introduction to data mining; representation of data in R and Python (packages "pandas" and "numpy"). Reading data from a "txt" file and saving the results of data processing to appropriate files in different software.
2. Methods of data visualization and normalization. The use of these methods in different software.
3. Regression analysis methods. Linear and non-linear regression models. Simple and multiple regression models. Evaluation of regression model quality.
4. Logistic regression. The use of regression analysis to handle missing values in the studied data.
5. Basics of cluster analysis. Hierarchical clustering methods. Construction of a dendrogram and formation of clusters. Creation of hierarchical clustering models in KNIME using functions and plugins of R, Python and WEKA.
6. Iterative and density-based clustering algorithms: k-means, C-means and DBSCAN. The use of these algorithms in different software, followed by the creation of models in KNIME.
7. Basics of data classification. Comparative analysis of data clustering and classification. Data classification based on Bayesian methods and decision tree methods. Implementation of classification methods and algorithms in different software, with creation of models in KNIME.
8. Fuzzy logic methods. The use of these methods for data classification and prediction. Creation of fuzzy logic models in different software.
9. Basics of neural networks and their implementation in models of data classification and prediction.
10. Practical application of data mining methods to the analysis and processing of various data. Preparation of data processing results.
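As an illustration of topics 2 and 3 only, the following is a minimal sketch of min-max normalization, a simple (one-predictor) linear regression fitted by ordinary least squares, and the coefficient of determination as a quality measure. It is not taken from the course materials: the function names are hypothetical, and plain Python is used instead of "pandas" or "numpy" so that the sketch has no dependencies.

```python
def min_max_normalize(values):
    """Rescale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # assumes xs are not all identical
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination of the fitted line on (xs, ys)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)  # assumes ys are not all identical
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    xs = [1, 2, 3, 4]
    ys = [3, 5, 7, 9]
    slope, intercept = fit_simple_linear_regression(xs, ys)
    print(min_max_normalize(xs))
    print(slope, intercept, r_squared(xs, ys, slope, intercept))
```

In practice the same steps would more likely be done with library routines in R or Python; the sketch only makes the underlying arithmetic explicit.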

Learning activities and teaching methods
unspecified
Learning outcomes
The purpose of this course is to study information processing and the extraction of useful knowledge, and to gain practical experience with both, through the combined use of R, Python, WEKA and KNIME. During the course, students will become acquainted with the practical implementation of methods and algorithms for data preprocessing (filtering, missing-value handling, normalization, etc.), regression analysis, clustering and classification of the investigated data, and fuzzy logic. Simulation of the data processing will be performed in KNIME using functions and plugins of R, Python and WEKA.
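To give a concrete sense of the clustering methods mentioned above, here is a minimal sketch of the k-means iteration (assign each point to its nearest centroid, then move each centroid to the mean of its points). It is an illustrative assumption, not the course's implementation: the function names are hypothetical, the initial centroids are passed in explicitly to keep the result deterministic, and only the Python standard library is used.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, initial_centroids, max_iter=100):
    """Cluster points around the given starting centroids.

    Returns (labels, centroids), where labels[i] is the index of the
    centroid assigned to points[i].
    """
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                # Keep a centroid in place if no point was assigned to it.
                new_centroids.append(old)
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (10, 10), (10, 11)]
    print(k_means(data, [(0, 0), (10, 10)]))
```

The outcome of k-means depends on the starting centroids, which is why tools typically run it several times with different initializations.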

Prerequisites
unspecified

Assessment methods and criteria
unspecified
Practical test
Recommended literature


Study plans that include the course
Faculty Study plan (Version) Category of Branch/Specialization Recommended year of study Recommended semester
Faculty: Faculty of Science, Study plan (Version): - (A14), Category: Informatics courses - Recommended year of study: -, Recommended semester: -
Faculty: Faculty of Science, Study plan (Version): Information Systems (A14), Category: Informatics courses - Recommended year of study: -, Recommended semester: -