princelySid edited this page Nov 18, 2014 · 12 revisions

Welcome to UmatiCodebase

This code is published to make the Umati research project reproducible. It is arranged in four sections, which emerged as separate pillars of the project. These pillars are:

  • Collection
  • Tagger
  • Analysis
  • Utilities

Collection

Data collection is divided into two sections, Facebook and Twitter. Facebook collection is built in Python on the Graph API v2.0, while Twitter collection is built in R using the streaming and search API v2.0.

Tagger

The tagger is used to label datasets by setting a question and its corresponding answers. It provides an interface for sampling data and labelling it continuously. The tagger is built in Python.
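
As a rough illustration of the question-and-answers idea, the sketch below models a labelling task and draws random batches of items that have not been labelled yet. It is not the tagger's actual code: the names `LabelingTask` and `sample_unlabeled`, the in-memory storage, and the sample posts are all assumptions made for this example.

```python
import random
from dataclasses import dataclass, field


@dataclass
class LabelingTask:
    """Hypothetical model: one question plus the answers an annotator may pick."""
    question: str
    answers: list
    labels: dict = field(default_factory=dict)  # item index -> chosen answer

    def record(self, index, answer):
        # Reject anything that is not one of the predefined answers.
        if answer not in self.answers:
            raise ValueError(f"{answer!r} is not one of {self.answers}")
        self.labels[index] = answer


def sample_unlabeled(items, task, k, seed=None):
    """Return up to k random indices of items that have no label yet."""
    remaining = [i for i in range(len(items)) if i not in task.labels]
    rng = random.Random(seed)
    return rng.sample(remaining, min(k, len(remaining)))


# Illustrative data only.
posts = ["first post", "second post", "third post", "fourth post"]
task = LabelingTask("Is this post relevant?", ["yes", "no", "unsure"])

# Label in batches; items already labelled are never sampled again.
for index in sample_unlabeled(posts, task, k=2, seed=0):
    task.record(index, "unsure")
```

Because each batch is drawn only from unlabelled items, repeated calls let an annotator work through the dataset continuously without seeing the same item twice.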

Analysis

The analysis code set is a mix of Python and R scripts for performing association mining, classification and building predictive models. Several techniques are provided to make the analysis versatile.
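
To make the association mining step concrete, here is a minimal sketch that derives pairwise rules from sets of labels using support and confidence. It only illustrates the general technique and is not taken from the project's scripts: the function name `pair_rules`, the thresholds, and the toy label sets are assumptions.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return (lhs, rhs, support, confidence) for item pairs that pass both thresholds."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of transactions containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]  # estimate of P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


# Toy label sets, one per document (illustrative only).
labelled = [{"x", "y"}, {"x", "y", "z"}, {"x"}, {"y", "z"}]
for lhs, rhs, support, confidence in pair_rules(labelled):
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

Restricting the search to pairs keeps the example short; mining larger itemsets would normally use an algorithm such as Apriori.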

Utilities

The Utilities code subset provides support for data collection and manipulation. It includes R code for tracking clusters on Twitter and sending an alert during increased activity within the cluster.
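
The alerting idea can be sketched as a comparison of each new activity count against a rolling baseline. The project's implementation is in R and may work differently; the Python function below, its name `detect_spikes`, the window size, and the threshold are all assumptions chosen for illustration.

```python
from collections import deque
from statistics import mean, pstdev


def detect_spikes(counts, window=5, threshold=3.0):
    """Return indices where a count exceeds the rolling mean by threshold * spread."""
    history = deque(maxlen=window)
    alerts = []
    for i, count in enumerate(counts):
        if len(history) == window:
            baseline = mean(history)
            # Floor the spread at 1.0 so a perfectly flat history
            # does not turn every small change into an alert.
            spread = max(pstdev(history), 1.0)
            if count > baseline + threshold * spread:
                alerts.append(i)
        history.append(count)
    return alerts


# Tweets per interval for one cluster (made-up numbers).
activity = [10, 12, 11, 9, 10, 11, 60, 12]
for i in detect_spikes(activity):
    print(f"Alert: interval {i} had {activity[i]} tweets")
```

In a real deployment the `print` call would be replaced by whatever notification channel is in use, such as email or SMS.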
