Document clustering using PCA from scratch using numpy and scipy.

sethuiyer/Document-Clusterer

Document-Clusterer

A simple document clusterer that applies singular value decomposition (SVD) to a corpus of CNN stories.

cleaning.py: Processes the directory of CNN stories and produces a JSON file of the corpus
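As a rough illustration of what a cleaning step like this might look like, here is a minimal sketch. It is not the code in cleaning.py. The function names `clean_text` and `build_corpus`, the `*.story` file pattern, and the output layout (file name mapped to cleaned text) are all assumptions made for the example.

```python
import json
import re
from pathlib import Path


def clean_text(text):
    """Lowercase the text and keep only letters and single spaces (assumed rules)."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # replace digits and punctuation with spaces
    return " ".join(text.split())          # collapse runs of whitespace


def build_corpus(story_dir, out_path):
    """Read every *.story file in story_dir and write {file name: cleaned text} as JSON.

    The *.story extension and the JSON layout are hypothetical choices.
    """
    corpus = {}
    for path in sorted(Path(story_dir).glob("*.story")):
        corpus[path.name] = clean_text(path.read_text(encoding="utf-8"))
    Path(out_path).write_text(json.dumps(corpus, indent=2), encoding="utf-8")
    return corpus
```

Calling `build_corpus("cnn/stories", "corpus.json")` would then produce one JSON file that a later step can load with `json.load`; both paths here are placeholders.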

model.py: Main program that performs the clustering
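The general idea of PCA via SVD for document clustering can be sketched with numpy alone. This is an illustrative example under stated assumptions, not the contents of model.py: the toy documents, the helper names `term_count_matrix` and `pca_scores`, the raw word-count features, and the rule of splitting documents by the sign of the first principal component are all hypothetical.

```python
import numpy as np


def term_count_matrix(docs):
    """Build a (documents x vocabulary) matrix of raw word counts."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: j for j, w in enumerate(vocab)}
    X = np.zeros((len(docs), len(vocab)))
    for i, d in enumerate(docs):
        for w in d.lower().split():
            X[i, index[w]] += 1
    return X, vocab


def pca_scores(X, k):
    """Project rows of X onto the top-k principal components using SVD."""
    Xc = X - X.mean(axis=0)                          # center each term column
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * S[:k]                          # coordinates in component space


# Toy stand-in for the corpus (hypothetical documents).
docs = [
    "stocks market shares trading",
    "market stocks investors shares",
    "football match goal team",
    "team goal football league",
]

X, vocab = term_count_matrix(docs)
scores = pca_scores(X, k=2)

# Simplest possible grouping: split on the sign of the first component.
# Which group gets label 0 or 1 is arbitrary, since SVD signs are not unique.
labels = (scores[:, 0] > 0).astype(int)
print(labels)
```

On this toy input the two finance documents land in one group and the two football documents in the other. On a real corpus, the projected scores would more typically be passed to a clustering routine such as k-means rather than thresholded by sign.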

#TODO Write a blog post explaining the approach
