Syllabus
Data Science and Big Data Analytics - (310251)
Credit : 03    Examination Scheme : End-Sem (TH) : 70 Marks

Unit III Big Data Analytics Life Cycle
Introduction to Big Data, Sources of Big Data, Data Analytic Lifecycle : Introduction, Phase 1 : Discovery, Phase 2 : Data Preparation, Phase 3 : Model Planning, Phase 4 : Model Building, Phase 5 : Communicate Results, Phase 6 : Operationalize. (Chapter - 3)

Unit IV Predictive Big Data Analytics with Python
Introduction, Essential Python Libraries, Basic examples. Data Preprocessing : Removing Duplicates, Transformation of Data using function or mapping, Replacing values, Handling Missing Data. Analytics Types : Predictive, Descriptive and Prescriptive. Association Rules : Apriori Algorithm, FP growth. Regression : Linear Regression, Logistic Regression. Classification : Naïve Bayes, Decision Trees. Introduction to Scikit-learn, Installations, Dataset, matplotlib, Filling missing values, Regression and Classification using Scikit-learn. (Chapter - 4)

Unit V Big Data Analytics and Model Evaluation
Clustering Algorithms : K-Means, Hierarchical Clustering, Time-series analysis. Introduction to Text Analysis : Text-preprocessing, Bag of words, TF-IDF and topics. Need and Introduction to social network analysis, Introduction to business analysis. Model Evaluation and Selection : Metrics for Evaluating Classifier Performance, Holdout Method and Random Subsampling, Parameter Tuning and Optimization, Result Interpretation, Clustering and Time-series analysis using Scikit-learn, sklearn.metrics, Confusion matrix, AUC-ROC Curves, Elbow plot. (Chapter - 5)

Unit VI Data Visualization and Hadoop
Introduction to Data Visualization, Challenges to Big data visualization, Types of data visualization, Data Visualization Techniques, Visualizing Big Data, Tools used in Data Visualization, Hadoop ecosystem, MapReduce, Pig, Hive, Analytical techniques used in Big data visualization.
Data Visualization using Python : Line plot, Scatter plot, Histogram, Density plot, Box plot. (Chapter - 6)

Syllabus
Data Science - (317529)
Credit : 03    Examination Scheme : End-Semester (TH) : 70 Marks

Unit III Data Analytics Life Cycle
Introduction, Data Analytic Lifecycle : Introduction, Phase 1 : Discovery, Phase 2 : Data Preparation, Phase 3 : Model Planning, Phase 4 : Model Building, Phase 5 : Communicate Results, Phase 6 : Operationalize. (Chapter - 3)

Unit IV Predictive Data Analytics with Python
Introduction, Essential Python Libraries, Basic examples. Data Preprocessing : Removing Duplicates, Transformation of Data using function or mapping, Replacing values, Handling Missing Data. Analytics Types : Predictive, Descriptive and Prescriptive. Association Rules : Apriori Algorithm, FP growth. Regression : Linear Regression, Logistic Regression. Classification : Naïve Bayes, Decision Trees. Introduction to Scikit-learn, Installations, Dataset, matplotlib, Filling missing values, Regression and Classification using Scikit-learn. (Chapter - 4)

Unit V Data Analytics and Model Evaluation
Clustering Algorithms : K-Means, Hierarchical Clustering, Time-series analysis. Introduction to Text Analysis : Text-preprocessing, Bag of words, TF-IDF and topics. Need and Introduction to social network analysis, Introduction to business analysis. Model Evaluation and Selection : Metrics for Evaluating Classifier Performance, Holdout Method and Random Subsampling, Parameter Tuning and Optimization, Result Interpretation, Clustering and Time-series analysis using Scikit-learn, sklearn.metrics, Confusion matrix, AUC-ROC Curves, Elbow plot. (Chapter - 5)

Unit VI Data Visualization and Hadoop
Introduction to Data Visualization, Types of data visualization, Data Visualization Techniques, Tools used in Data Visualization, Challenges to Big data visualization, Visualizing Big Data, Analytical techniques used in Big data visualization, Hadoop ecosystem, MapReduce, Pig, Hive.
Data Visualization using Python : Line plot, Scatter plot, Histogram, Density plot, Box plot. (Chapter - 6)