chrisorwa edited this page Oct 23, 2014 · 12 revisions

Welcome to UmatiCodebase

This code is published to make the Umati research project reproducible. It is arranged in four sections, which emerged as separate pillars of the project. These pillars are:

  • Collection
  • Tagger
  • Analysis
  • Utilities

Collection

Data collection is divided into two sections, Facebook and Twitter. Facebook collection is written in Python against the Graph API v2.0, while Twitter collection is written in R using the streaming and search API v2.0.

Tagger

The tagger is used to label datasets by setting a question and its corresponding answers. It provides an interface for sampling data and labelling it continuously. The code is written in Python.
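
The question-and-answers workflow described above might be sketched roughly as follows. This is an illustrative sketch only, not the project's tagger: the `LabelingTask` class, its `sample` and `label` methods, and the example question are all hypothetical names introduced here.

```python
import random
from dataclasses import dataclass, field


@dataclass
class LabelingTask:
    """Hypothetical container: one question plus its allowed answers."""
    question: str
    answers: list
    labels: dict = field(default_factory=dict)  # item index -> chosen answer

    def sample(self, items, k, seed=None):
        """Draw up to k indices of items that have not been labelled yet."""
        unlabelled = [i for i in range(len(items)) if i not in self.labels]
        rng = random.Random(seed)
        return rng.sample(unlabelled, min(k, len(unlabelled)))

    def label(self, index, answer):
        """Record an answer, rejecting anything outside the allowed set."""
        if answer not in self.answers:
            raise ValueError(f"unknown answer: {answer!r}")
        self.labels[index] = answer


# Assumed example data; a real run would load collected posts instead.
posts = ["post a", "post b", "post c", "post d"]
task = LabelingTask("Is this post relevant?", ["yes", "no", "unsure"])

batch = task.sample(posts, k=2, seed=0)
for i in batch:
    task.label(i, "no")

# Later samples only draw from items that are still unlabelled.
remaining = task.sample(posts, k=10, seed=0)
```

Tracking labels by item index keeps sampling and labelling independent, so labelling can continue across sessions without showing the same item twice.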

Analysis

The analysis code set is a mix of Python and R scripts for association mining, classification, and building predictive models.
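
As a rough illustration of what association mining involves, the sketch below computes support and confidence for rules between pairs of items. It is a simplified stand-in, not the method used in the project's scripts: the `pair_rules` function, its thresholds, and the sample transactions are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return (lhs, rhs, support, confidence) for item pairs above both thresholds."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # ignore duplicates within a transaction
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of transactions containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]  # estimate of P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


# Assumed toy data: each inner list is the set of tags on one record.
transactions = [["a", "b"], ["a", "b"], ["a", "c"], ["b"]]
rules = pair_rules(transactions)
```

Here only the pair `a`, `b` reaches the support threshold, so rules are produced in both directions for that pair and none for `a`, `c`.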

Utilities

The Utilities code subset provides support for data collection and manipulation.
