Coursework

Programming for Data Analytics: Programming for data analysis with an emphasis on the analysis of large datasets using Python. The core ideas of programming (flow control, input and output, data structures such as arrays, lists, trees, and hash tables, iteration and recursion, classes and object-oriented programming), taught through writing code to deal with Big Data generated by social media sites such as Twitter. Using Python for effective data analysis, including vector computation and mathematics with NumPy, statistical computation with SciPy, working with tabular data with Pandas, and implementing analytics algorithms using Python.
Data Warehousing and Data Mining: The main concepts, components, and various architectures of the Data Warehouse. Advanced data analysis and optimization of Data Warehouse design. Data Warehousing and OLAP tools. Applying data mining algorithms to retrieve highly specialized information or knowledge about the data stored in the Data Warehouse.
Big Data Analytics: The principles underlying Big Data analytics and its applications in different domains with a state-of-the-art Big Data platform. A combination of essential business and technical skills related to Big Data analytics. Business aspects emphasized include (a) understanding the scope and role of Big Data in today’s organizations, (b) representative example scenarios and case studies of industry-specific applications highlighting Big Data issues (volume, variety, velocity, and veracity), (c) when to consider a Big Data solution, (d) the integration of Big Data initiatives as part of the overall business strategy to achieve “return to data” and competitive differentiation, and information governance issues.
Technical aspects emphasized include (a) the life cycle of a Big Data analytics solution with multiple entry points, (b) essential components of a Big Data solution and technology platform, (c) key features of Hadoop and related technologies (e.g., MapReduce, HDFS, NoSQL), (d) performing analytics with predictive models, text analytics, and streaming data, and (e) data visualization and communication of analytical findings.
Modern Applied Statistics II: Data mining techniques for multivariate data, including principal component analysis, multidimensional scaling, and cluster analysis; supervised learning methods and pattern recognition; and an overview of statistical prediction analysis relevant to business intelligence and analytics.
Modern Applied Statistics I: Logistic regression, GLM, density estimation, recursive partitioning, generalized additive models and spline models. Variable selection and survival analysis. Quantile regression, longitudinal data analysis and mixed models. Multiple comparison, false discovery rate, and simultaneous inference. Meta-analysis and large-scale inference.
Statistical Programming: Statistical programming languages including descriptive and visual analytics in R and SAS, and programming fundamentals in R and SAS including logic, loops, macros, and functions.
R Programming: Visualization and modeling in R.
C++ Programming: Problem solving, algorithm design, standards of program style, debugging and testing, control structures and data structures of the C++ language. Elementary data structures and basic algorithms that include sorting and searching. Topics include more advanced treatment of functions, data types such as arrays and structures, and files.
Computer Intensive Statistics: Numerical stability, matrix decompositions for linear models, methods for generating pseudo-random variates, iterative estimation procedures (Fisher scoring and EM algorithm), bootstrapping, scatterplot smoothers, Monte Carlo techniques including Monte Carlo integration and Markov chain Monte Carlo.
Multivariate Analysis: The multivariate normal, Hotelling’s T^2, multivariate general linear model, discriminant analysis, covariance matrix tests, canonical correlation, and principal component analysis.
Regression Analysis: Theory and application of regression models including linear, nonlinear, and generalized linear models. Topics include model specification, point and interval estimators, exact and asymptotic sampling distributions, tests of general linear hypotheses, prediction, influence, multicollinearity, assessment of model fit, and model selection.
Experimental Design: Methods of constructing and analyzing designs for experimental investigations; analysis of designs with unequal subclass numbers; concepts of blocking, randomization, and replication; confounding in factorial experiments; incomplete block designs; response surface methodology.
Theory of Sampling: Simple random, systematic, stratified random, one- and two-stage cluster sampling; introduction to variable probability sampling and estimation of population size.
Probability Theory & Mathematical Statistics II: An advanced treatment of estimation of parameters, confidence intervals, hypothesis testing, likelihood ratio test, sufficient statistics, and related topics.
Probability Theory & Mathematical Statistics I: An advanced treatment of random variables, expectation, special distributions (normal, binomial, exponential, etc.), moment generating functions, law of large numbers, central limit theorem, introduction to Bayesian probability, and related topics.
Advanced Probability Theory: Measure spaces, extension theorem and construction of Lebesgue-Stieltjes measures on Euclidean spaces, Lebesgue integration and the basic convergence theorems, Lp-spaces, absolute continuity of measures and the Radon-Nikodym theorem, absolute continuity of functions on R and the fundamental theorem of Lebesgue integration, product spaces and Fubini-Tonelli theorems, convolutions, Fourier series and transforms; probability spaces; Kolmogorov’s existence theorem for stochastic processes; expectation; Jensen’s inequality and applications; independence; Borel-Cantelli lemmas; weak and strong laws of large numbers and applications; renewal theory.
Functions of a Single Complex Variable: Theory of analytic functions, integration, topology of the extended complex plane, singularities and residue theory, maximum principle.
Theory of Interest: Measurement of interest; solution of interest problems; basic and general annuities; yield rates; amortization schedules and sinking funds; bonds; yield curves; duration and immunization; stochastic approaches; derivatives and their use in managing risk.
Advanced Stochastic Processes: Weak convergence. Random walks and Brownian motion. Martingales. Stochastic integration and Ito’s formula. Stochastic differential equations and applications.
Advanced Abstract Algebra I: Algebraic systems and their morphisms, including groups, rings, modules, and fields.
Stochastic Processes: Markov chains on discrete spaces in discrete and continuous time (random walks, Poisson processes, birth and death processes) and their long-term behavior. Optional topics may include branching processes, renewal theory, and an introduction to Brownian motion.
Advanced Real Analysis II: Metric spaces, topological spaces, compactness, abstract theory of measure and integral, differentiation of measures, Banach spaces, introduction to functional analysis, Fourier transform, distribution theory.
Advanced Real Analysis I: Lebesgue measure and Lebesgue integral, one-variable differentiation theory, product integration, Lp spaces.
Advanced Linear Algebra: Advanced topics in linear algebra including canonical forms; unitary, normal, Hermitian and positive-definite matrices; variational characterizations of eigenvalues; and applications to other branches of mathematics.
Statistical Computing Applications: Modern statistical computing. Data management: spreadsheets, verifying data accuracy, transferring data between systems. Data and graphical analysis with microcomputer statistical software packages. Macro programming. Algorithmic programming concepts and applications. Simulation. Software reliability.
Computer Processing of Statistical Data: Structure, content and programming aspects of a modern statistical package. Advanced techniques in the use of a statistical software system for data analysis. Introduction to graphical methods in statistics and a macro programming language. Currently SAS is the software system used.
Applied Time Series: Methods for analyzing data collected over time; review of multiple regression analysis. Elementary forecasting methods: moving averages and exponential smoothing. Autoregressive-moving average (Box-Jenkins) models: identification, estimation, diagnostic checking, and forecasting. Transfer function models and intervention analysis.
Applied Probability Models: Probabilistic models in the biological, engineering, and physical sciences. Markov chains; Poisson, birth-and-death, renewal, branching and queueing processes; applications to bioinformatics and other quantitative problems.
Survey Sampling Techniques: Concepts of sample surveys and the survey process; methods of designing sample surveys, including simple random, stratified, and multistage sampling designs; methods of analyzing sample surveys, including ratio, regression, domain estimation and nonresponse.
Statistical Design and the Analysis of Experiments: The role of statistics in research and the principles of experimental design. Experimental units, randomization, replication, blocking, subdividing and repeatedly measuring experimental units; factorial treatment designs and confounding; extensions of the analysis of variance to cover general crossed and nested classifications and models that include both classificatory and continuous factors. Determining sample size.
Statistical Methods I: Methods of analyzing and interpreting experimental and survey data. Statistical concepts and models; estimation; hypothesis tests with continuous and discrete data; simple and multiple linear regression and correlation; introduction to analysis of variance and blocking.
Real Analysis II: Sequences and series of functions of a real variable, uniform convergence, power series and Taylor series, Fourier series, topology of n-dimensional space, implicit function theorem, calculus of the plane and 3-dimensional space. Additional topics may include metric spaces or Stieltjes or Lebesgue integration.
Real Analysis I: A careful development of calculus of functions of a real variable: limits, continuity, differentiation, integration, series.
Number Theory: Divisibility, integer representations, primes and divisors, linear Diophantine equations, congruences, and multiplicative functions. Applications to cryptography.
Theory of Probability and Statistics II: Transformations of random variables; sampling distributions; confidence intervals and hypothesis testing; theory of estimation and hypothesis tests; linear model theory; enumerative data; use of the R statistical package for simulation and data analysis.
Theory of Probability and Statistics I: Probability; distribution functions and their properties; classical discrete and continuous distribution functions; multivariate probability distributions and their properties; moment generating functions; simulation of random variables and use of the R statistical package.
Theory of Linear Algebra: Systems of linear equations, determinants, vector spaces, inner product spaces, linear transformations, eigenvalues and eigenvectors.
Matrices and Linear Algebra: Systems of linear equations, determinants, vector spaces, linear transformations, orthogonality, least-squares methods, eigenvalues and eigenvectors.
Combinatorics: Permutations, combinations, binomial coefficients, inclusion-exclusion principle, recurrence relations, generating functions. Additional topics selected from probability, random walks, and Markov chains.
Abstract Algebra II: Theory of rings and fields. Introduction to Galois theory.
Abstract Algebra I: Theory of groups. Homomorphisms. Quotient groups. Introduction to rings.
Elementary Differential Equations: Solution methods for ordinary differential equations. First order equations, linear equations, constant coefficient equations. Eigenvalue methods for systems of first order linear equations. Introduction to stability and phase plane analysis.
Calculus III: Analytic geometry and vectors, differential calculus of functions of several variables, multiple integrals, vector calculus.
Calculus II: Integral calculus, applications of the integral, infinite series.
Calculus I: Differential calculus, applications of the derivative, introduction to integral calculus.
