Syllabus
Information Storage and Retrieval (414441)

Credit Scheme : 03 Credits
Examination Scheme : Mid-Semester : 30 Marks, End-Semester : 70 Marks

Unit I Introduction to Information Retrieval
Basic Concepts of IR, Data Retrieval & Information Retrieval, Text mining and IR relation, IR system block diagram. Automatic Text Analysis : Luhn's ideas, Conflation Algorithm, Indexing and Index Term Weighting, Probabilistic Indexing, Automatic Classification. Measures of Association, Different Matching Coefficients, Cluster Hypothesis. Clustering Techniques : Rocchio's Algorithm, Single pass algorithm, Single Link algorithm.
(Chapter - 1)

Unit II Indexing and Searching Techniques
Indexing : Inverted file, Suffix trees & suffix arrays, Signature Files, Scatter storage or hash addressing. Searching Techniques : Boolean Search, Sequential search, Serial search, Cluster-based retrieval, Query languages, Types of queries, Pattern matching, Structural queries. IR Models : Basic concepts, Boolean Model, Vector Model, Probabilistic Model.
(Chapter - 2)

Unit III Evaluation and Visualization of Information Retrieval System
Performance evaluation : Precision and recall, MRR, F-Score, NDCG, User-oriented measures. Visualization in Information System : Starting points, Query Specification, Document context, User relevance judgment, Interface support for search process.
(Chapter - 3)

Unit IV Distributed and Multimedia IR
Distributed IR : Introduction, Collection Partitioning, Source Selection, Query Processing. Multimedia IR : Introduction, Data Modeling, Query Language, Background-Spatial Access Method, A Generic Multimedia Indexing Approach, One Dimensional Time Series, Two-Dimensional Color Images, Automatic Feature Extraction, Trends and Research Issues.
(Chapter - 4)

Unit V Web Searching
Introduction, Challenges, Web Characteristics. Search Engines : Centralized Architecture, Distributed Architecture, User Interfaces, Ranking, Crawling the web, Indices, Browsing, Meta-searchers, Searching using Hyperlinks, Trends and Research Issues. Introduction to Web Scraping : Python for web scraping, Request, HTML parsing, Beautiful Soup.
(Chapter - 5)

Unit VI Advanced Information Retrieval
XML Retrieval : Basic XML concepts, Challenges in XML retrieval, Vector space model for XML retrieval, Evaluation of XML retrieval, Text-Centric vs. Data-Centric XML retrieval. Recommendation system : Collaborative Filtering and Content Based Recommendation of Documents and Products. Introduction to Semantic Web.
(Chapter - 6)