Syllabus
Big Data Analytics - [CCS334]

UNIT I UNDERSTANDING BIG DATA
Introduction to big data - convergence of key trends - unstructured data - industry examples of big data - web analytics - big data applications - big data technologies - introduction to Hadoop - open source technologies - cloud and big data - mobile business intelligence - crowdsourcing analytics - inter and trans firewall analytics. (Chapter - 1)

UNIT II NOSQL DATA MANAGEMENT
Introduction to NoSQL - aggregate data models - key-value and document data models - relationships - graph databases - schemaless databases - materialized views - distribution models - master-slave replication - consistency - Cassandra - Cassandra data model - Cassandra examples - Cassandra clients. (Chapter - 2)

UNIT III BASICS OF HADOOP
Data format - analyzing data with Hadoop - scaling out - Hadoop streaming - Hadoop pipes - design of Hadoop distributed file system (HDFS) - HDFS concepts - Java interface - data flow - Hadoop I/O - data integrity - compression - serialization - Avro - file-based data structures - Cassandra - Hadoop integration. (Chapter - 3)

UNIT IV MAPREDUCE APPLICATIONS
MapReduce workflows - unit tests with MRUnit - test data and local tests - anatomy of MapReduce job run - classic MapReduce - YARN - failures in classic MapReduce and YARN - job scheduling - shuffle and sort - task execution - MapReduce types - input formats - output formats. (Chapter - 4)

UNIT V HADOOP RELATED TOOLS
HBase - data model and implementations - HBase clients - HBase examples - praxis. Pig - Grunt - Pig data model - Pig Latin - developing and testing Pig Latin scripts. Hive - data types and file formats - HiveQL data definition - HiveQL data manipulation - HiveQL queries. (Chapter - 5)