Course: Hybrid Data Science Models in R and Python

Course title Hybrid Data Science Models in R and Python
Course code KI/0208
Organizational form of instruction Seminar
Level of course Bachelor
Year of study not specified
Semester Winter and summer
Number of ECTS credits 2
Language of instruction Czech
Status of course Compulsory-optional
Form of instruction Face-to-face
Work placements This is not an internship
Recommended optional programme components None
Course availability The course is available to visiting students
Lecturer(s)
  • Babichev Sergii, prof. DSc.
Course content
1. Data analysis based on data visualization in R (graphics and ggplot2 packages)
2. Data analysis based on data visualization in Python (matplotlib and plotly modules)
3. Processing of missing values in R based on different types of regression; choice of the optimal model
4. Methods of data preprocessing in R and Python (normalization, standardization); criteria for evaluating the quality of data processing; comparison of models and choice of the optimal method
5. Hybrid signal filtering model based on the joint use of wavelet analysis and Huang's empirical mode decomposition method
6. Practical implementation of hybrid signal filtering models in R and Python
7. Clustering model optimization methods based on the joint use of internal and external clustering quality evaluation criteria
8. Optimization of clustering algorithm parameters using Harrington's desirability function and fuzzy logic in R and Python
9. Optimization of a data sorting model using different data sorting criteria
10. Hybrid data dimension reduction models based on the joint use of clustering and sorting models
11. Practical implementation of hybrid data dimension reduction models in R and Python for big data processing (gene expression)
12. Development of a gene expression-based patient prognosis model using data mining and machine learning methods
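As an illustration of the preprocessing topic in item 4, the following is a minimal sketch of min-max normalization and z-score standardization in Python. It is not taken from the course materials; the function names and the sample values are hypothetical, and the population standard deviation is an assumed choice.

```python
from statistics import fmean, pstdev


def min_max_normalize(values):
    """Rescale values linearly so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant input: no spread to rescale, return zeros by convention.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift and scale values to zero mean and unit (population) standard deviation."""
    mean = fmean(values)
    sd = pstdev(values)
    if sd == 0:
        # Constant input: every value equals the mean, return zeros by convention.
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


# Hypothetical sample data, used only for illustration.
sample = [2.0, 4.0, 6.0, 8.0]
normalized = min_max_normalize(sample)
standardized = standardize(sample)
print(normalized)
print(standardized)
```

Normalization bounds the values to a fixed range, whereas standardization centers them and expresses each value in units of standard deviation; which one suits a given model better is the kind of comparison item 4 refers to.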

Learning activities and teaching methods
unspecified
Learning outcomes
The aim of the course is to acquire knowledge and skills in information processing. Students are introduced to the practical implementation of data analysis and preprocessing methods (visualization, missing value processing based on different types of regression models, normalization, and filtering) and to hybrid models built with data mining and machine learning methods for processing different types of data. The course uses functions and modules from R and Python.

Prerequisites
unspecified

Assessment methods and criteria
unspecified
Completion of the assigned tasks
Recommended literature


Study plans that include the course
Faculty Study plan (Version) Category of Branch/Specialization Recommended year of study Recommended semester
Faculty: Faculty of Science Study plan (Version): Information Sciences (double subject) (A14) Category: Informatics courses - Recommended year of study:-, Recommended semester: -
Faculty: Faculty of Science Study plan (Version): - (A14) Category: Informatics courses - Recommended year of study:-, Recommended semester: -
Faculty: Faculty of Science Study plan (Version): Information Systems (A14) Category: Informatics courses - Recommended year of study:-, Recommended semester: -