cs454-winter-2015

Search Engines: Information Retrieval in Practice

Please refer to the syllabus for more information about the course.

Homework 1

The focus of this homework assignment is crawling. Write a program that crawls the Internet starting from a given seed URL. Your crawler will need guidance to direct the crawl.
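As a rough illustration of crawling from a seed URL, here is a minimal breadth-first sketch using only the Python standard library. The guidance parameters shown (`max_depth`, `same_host_only`) and the injectable `fetch_page` function are assumptions made for this example, not requirements of the assignment; a real crawler would also need to respect `robots.txt`, rate limits, and store pages on disk.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url):
    """Download a page and return its body as text (needs network access)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(seed, max_depth=1, same_host_only=True, fetch_page=fetch):
    """Breadth-first crawl starting at seed; returns {url: html}.

    max_depth and same_host_only are hypothetical "guidance" knobs.
    """
    seed_host = urlparse(seed).netloc
    pages = {}
    seen = {seed}
    queue = deque([(seed, 0)])
    while queue:
        url, depth = queue.popleft()
        try:
            html = fetch_page(url)
        except Exception:
            continue  # skip pages that fail to download
        pages[url] = html
        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" suffixes.
            link, _ = urldefrag(urljoin(url, href))
            if urlparse(link).scheme not in ("http", "https"):
                continue
            if same_host_only and urlparse(link).netloc != seed_host:
                continue
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return pages
```

Passing a custom `fetch_page` makes it possible to try the crawler against canned pages before pointing it at a live site.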

Homework 2

Once the information has been crawled and stored locally, metadata must be extracted from it. This assignment therefore focuses on extraction. Write a program that extracts information from the data crawled in assignment 1.
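One possible shape for the extraction step is sketched below. It pulls the title, named `<meta>` tags, and outgoing links from a stored HTML page. The choice of fields and the `extract_metadata` name are assumptions for illustration; the assignment may call for different or additional metadata.

```python
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Collects the title, named <meta> tags, and link targets of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content") is not None:
            # Meta names are case-insensitive, so normalize them.
            self.meta[attrs["name"].lower()] = attrs["content"]
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_metadata(html):
    """Return a dict of metadata extracted from one HTML document."""
    parser = MetadataParser()
    parser.feed(html)
    return {
        "title": parser.title.strip(),
        "meta": parser.meta,
        "links": parser.links,
    }
```

Running `extract_metadata` over every page saved by the crawler would yield one metadata record per URL, ready to be stored for the later assignments.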

Homework 3

The process of indexing and ranking involves taking explicitly and implicitly obtained metadata and storing it in a "database" for faster recall. Most importantly, we are concerned with how to organize information so that the "intent" of the users is correctly met. Write a program that ranks the data collected in assignment 1 using the metadata extracted in assignment 2.
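To make the idea of indexing and ranking concrete, here is a small sketch that builds an inverted index and scores documents with a TF-IDF style weight. TF-IDF is used here only as a familiar example of a ranking function; the assignment does not prescribe it, and the function names and the smoothed `log(1 + N/df)` weighting are assumptions.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """docs: {doc_id: text}. Returns {term: {doc_id: term_frequency}}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index


def rank(query, index, num_docs):
    """Score documents for a query; returns [(doc_id, score)], best first."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in any document
        # Smoothed inverse document frequency: rarer terms weigh more.
        idf = math.log(1 + num_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

The inverted index plays the role of the "database": a query only touches the postings of its own terms rather than scanning every document.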

Homework 4

Building a "Query Interface" means providing a simple mechanism to search and present the hierarchical information retrieved. Write a command-line interface and a web-enabled interface based on the indexed and ranked information from assignment 3.
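A command-line query interface could look roughly like the sketch below, built on `argparse`. The in-memory `SAMPLE_INDEX`, its `(url, score)` layout, and the `-n/--limit` flag are hypothetical stand-ins; a real interface would load the index produced in assignment 3.

```python
import argparse

# Hypothetical ranked index: term -> list of (url, score) pairs.
SAMPLE_INDEX = {
    "crawler": [("http://example.test/a", 0.9), ("http://example.test/b", 0.4)],
    "index": [("http://example.test/b", 0.7)],
}


def search(index, query, limit=10):
    """Sum the per-term scores for each URL and return the top results."""
    totals = {}
    for term in query.lower().split():
        for url, score in index.get(term, []):
            totals[url] = totals.get(url, 0.0) + score
    # Highest score first; ties are broken alphabetically by URL.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


def main(argv=None, index=SAMPLE_INDEX):
    """Parse command-line arguments, run the search, and print the results."""
    parser = argparse.ArgumentParser(description="Search the local index.")
    parser.add_argument("query", nargs="+", help="search terms")
    parser.add_argument("-n", "--limit", type=int, default=10,
                        help="maximum number of results to show")
    args = parser.parse_args(argv)
    results = search(index, " ".join(args.query), args.limit)
    for position, (url, score) in enumerate(results, start=1):
        print(f"{position}. {url} ({score:.2f})")
    return results
```

A script entry point would call `main()` with no arguments so that `argparse` reads from `sys.argv`. Keeping `search` separate from argument parsing means a web-enabled interface could reuse the same function.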

Final Project

Your final project consists of putting together a complete information retrieval system. This search engine combines all of your previous programming assignments into a suite of applications. Note that a search engine system is not necessarily a single program.