This is an offline Wikipedia search engine that uses TF-IDF scoring to retrieve the top results from a given Wikipedia dump.

Abhinandan11/wiki-search-engine

wiki-search-engine

  1. Install PyStemmer

    https://github.com/snowballstem/pystemmer

  2. Create an inverted index of a dump

    python wiki_indexer.py <wiki_dump_file_name> <output_file_name>

  3. Search the indexed dump

    python query.py
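
To illustrate the idea behind steps 2 and 3, here is a minimal sketch of an inverted index with TF-IDF ranking. It is not the code in `wiki_indexer.py` or `query.py`: the function names (`tokenize`, `build_index`, `search`), the in-memory dictionary index, the `(1 + log tf) * log(N / df)` weighting, and the toy documents are all assumptions chosen for illustration, and stemming with PyStemmer is left out so that the example runs with the standard library alone.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on non-alphanumerics (stemming omitted in this sketch)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to {doc_id: term frequency}. `docs` is {doc_id: text}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return index


def search(query, index, num_docs, top_k=10):
    """Rank documents by the sum of TF-IDF weights of the query terms."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term)
        if not postings:
            continue  # term does not occur in any document
        # Assumed weighting: log-scaled tf times log(N / df).
        idf = math.log(num_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += (1 + math.log(tf)) * idf
    # Highest score first; ties broken by doc_id for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Hypothetical toy corpus standing in for pages of a Wikipedia dump.
docs = {
    "Python": "Python is a programming language",
    "Snake": "A python is a large snake",
    "Java": "Java is a programming language",
}
index = build_index(docs)
results = search("python snake", index, num_docs=len(docs))
for title, score in results:
    print(f"{title}\t{score:.3f}")
```

In this toy corpus the "Snake" page ranks first for the query `python snake`, because `snake` occurs in only one document and therefore carries a higher IDF than `python`. A real dump is far too large for an in-memory dictionary, which is presumably why the indexer writes its index to `<output_file_name>`.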
