CP7003 DATA ANALYSIS AND BUSINESS INTELLIGENCE - ANNA UNIVERSITY 1ST SEM CSE SYLLABUS REG-2013
ANNA UNIVERSITY, CHENNAI
REGULATIONS - 2013
M.E. COMPUTER SCIENCE AND ENGINEERING
CP7003 DATA ANALYSIS AND BUSINESS INTELLIGENCE

OBJECTIVES:
To understand linear regression models
To understand logistic regression models
To understand generalized linear models
To understand simulation using regression models
To understand causal inference
To understand multilevel regression
To understand data collection and model understanding

UNIT I LINEAR REGRESSION
Introduction to data analysis – Statistical processes – statistical models – statistical inference – review of random variables and probability distributions – linear regression – one predictor – multiple predictors – prediction and validation – linear transformations – centering and standardizing – correlation – logarithmic transformations – other transformations – building regression models – fitting a series of regressions

UNIT II LOGISTIC AND GENERALIZED LINEAR MODELS
Logistic regression – logistic regression coefficients – latent-data formulation – building a logistic regression model – logistic regression with interactions – evaluating, checking, and comparing fitted logistic regressions – identifiability and separation – Poisson regression – logistic-binomial model – Probit regression – multinomial regression – robust regression using t model – building complex generalized linear models – constructive choice models

UNIT III SIMULATION AND CAUSAL INFERENCE
Simulation of probability models – summarizing linear regressions – simulation of non-linear predictions – predictive simulation for generalized linear models – fake-data simulation – simulating and comparing to actual data – predictive simulation to check the fit of a time-series model – causal inference – randomized experiments – observational studies – causal inference using advanced models – matching – instrumental variables

UNIT IV MULTILEVEL REGRESSION
Multilevel structures – clustered data – multilevel linear models – partial pooling – group-level predictors – model building and statistical significance – varying intercepts and slopes – scaled inverse-Wishart distribution – non-nested models – multi-level logistic regression – multi-level generalized linear models

UNIT V DATA COLLECTION AND MODEL UNDERSTANDING
Design of data collection – classical power calculations – multilevel power calculations – power calculation using fake-data simulation – understanding and summarizing fitted models – uncertainty and variability – variances – R² and explained variance – multiple comparisons and statistical significance – analysis of variance – ANOVA and multilevel linear and general linear models – missing data imputation

TOTAL: 45 PERIODS

OUTCOMES:
Upon completion of the course, the students will be able to
Build and apply linear regression models
Build and apply logistic regression models
Build and apply generalized linear models
Perform simulation using regression models
Perform causal inference from data
Build and apply multilevel regression models
Perform data collection and variance analysis

REFERENCES:
1. Andrew Gelman and Jennifer Hill, "Data Analysis Using Regression and Multilevel/Hierarchical Models", Cambridge University Press, 2006.
2. Philipp K. Janert, "Data Analysis with Open Source Tools", O'Reilly, 2010.
3. Wes McKinney, "Python for Data Analysis", O'Reilly, 2012.
4. Devinderjit Sivia and John Skilling, "Data Analysis: A Bayesian Tutorial", Second Edition, Oxford University Press, 2006.
5. Robert Nisbet, John Elder, and Gary Miner, "Handbook of Statistical Analysis and Data Mining Applications", Academic Press, 2009.
6. Michael Minelli, Michelle Chambers, and Ambiga Dhiraj, "Big Data, Big Analytics: Emerging Business Intelligence and Analytic Trends for Today's Businesses", Wiley, 2013.
7. John Maindonald and W. John Braun, "Data Analysis and Graphics Using R: An Example-based Approach", Third Edition, Cambridge University Press, 2010.
8. David Ruppert, "Statistics and Data Analysis for Financial Engineering", Springer, 2011.