Unit I Introduction : Data Science and Big Data
Introduction to Data Science and Big Data, Defining Data Science and Big Data, Big Data examples, Data Explosion : Data Volume, Data Variety, Data Velocity and Veracity. Big Data infrastructure and challenges. Big Data Processing Architectures : Data Warehouse, Re-Engineering the Data Warehouse, shared everything and shared nothing architecture, Big Data learning approaches. Data Science - The Big Picture : Relation between AI, Statistical Learning, Machine Learning, Data Mining and Big Data Analytics. (Chapter - 1)

Unit II Mathematical Foundation of Big Data
Probability : Random Variables and Joint Probability, Conditional Probability and concept of Markov chains, Tail bounds, Markov chains and random walks, Pair-wise independence and universal hashing, Approximate counting, Approximate median. Data Streaming Models and Statistical Methods : Flajolet-Martin algorithm, Distance Sampling and Random Projections, Bloom filters, Mode, Variance, Standard deviation, Correlation analysis and Analysis of Variance. (Chapter - 2)

Unit III Big Data Processing
Big Data Analytics - Ecosystem and Technologies, Introduction to Google File System, Hadoop Architecture, Hadoop Storage : HDFS, Common Hadoop Shell commands, Anatomy of File Write and Read, NameNode, Secondary NameNode, and DataNode, Hadoop MapReduce paradigm, MapReduce tasks, Job, Task trackers - Cluster Setup - SSH & Hadoop Configuration, Introduction to NoSQL, Textual ETL processing. (Chapter - 3)