
Mapping Natural Disaster Locations from Social Media Content

Overview

This project explored a workflow for conditioning, classifying, geolocating, and mapping unstructured social media data in order to discover clusters of locations mentioned in natural disaster posts. Posts were collected from Twitter with the snscrape Python library, and the text of each post was then processed and conditioned.
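As a rough sketch of that collection step (the query terms, date window, and result cap below are illustrative assumptions, not the project's exact values):

```python
# Illustrative sketch of tweet collection with snscrape. The query, dates,
# and limit are assumptions; `.content` was the tweet-text attribute in
# 2021-era snscrape (newer releases renamed it to `.rawContent`).
import snscrape.modules.twitter as sntwitter
import pandas as pd

query = "flood OR wildfire OR earthquake since:2021-06-01 until:2021-08-01"
rows = []
for i, tweet in enumerate(sntwitter.TwitterSearchScraper(query).get_items()):
    if i >= 1000:  # cap the scrape for the example
        break
    rows.append({"id": tweet.id, "date": tweet.date, "text": tweet.content})

pd.DataFrame(rows).to_csv("raw_tweets.csv", index=False)
```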

A binary classification recurrent neural network was trained in TensorFlow with Keras on human-labeled tweets aggregated in the CrisisBenchmark dataset, created by the Crisis NLP project at the Qatar Computing Research Institute.
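A minimal sketch of that kind of classifier in TensorFlow/Keras; the specific layers and dimensions below are assumptions, not the project's tuned architecture:

```python
# Sketch of a binary "informative vs. not informative" tweet classifier.
# Layer choices and sizes here are illustrative assumptions.
import tensorflow as tf

VOCAB_SIZE = 20000   # assumed vocabulary size
EMBED_DIM = 100      # matches a 100-dimensional GloVe Twitter embedding
MAX_LEN = 50         # assumed maximum tweet length in tokens

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM, input_length=MAX_LEN),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # informative or not
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```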

The text of the training data was vectorized using a pre-trained text embedding built from tweets with the Global Vectors for Word Representation (GloVe) methodology, available from the GloVe project at Stanford University (Pennington, Socher, and Manning 2014). Hyperparameters for the model were tuned with the Hyperband algorithm, and the final model was evaluated with 5-fold cross-validation. The trained model and embedding were then used to classify tweets scraped from June and July 2021 by natural disaster informativeness.
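A hedged sketch of wiring GloVe vectors into the embedding layer and tuning with Keras Tuner's Hyperband implementation (the file name, placeholder word index, and search ranges are assumptions):

```python
# Sketch: load pre-trained GloVe Twitter vectors into an embedding matrix
# and tune hyperparameters with Hyperband via Keras Tuner. Paths, the toy
# word index, and search ranges are assumptions for illustration.
import numpy as np
import tensorflow as tf
import keras_tuner as kt

EMBED_DIM, MAX_LEN = 100, 50
word_index = {"flood": 1, "storm": 2}  # in practice, from the fitted tokenizer

# Parse the GloVe text file into {word: vector}.
embeddings = {}
with open("glove.twitter.27B.100d.txt", encoding="utf-8") as f:
    for line in f:
        parts = line.split()
        embeddings[parts[0]] = np.asarray(parts[1:], dtype="float32")

# Build the weight matrix row-aligned to the tokenizer's word indices.
embedding_matrix = np.zeros((len(word_index) + 1, EMBED_DIM))
for word, idx in word_index.items():
    vec = embeddings.get(word)
    if vec is not None:
        embedding_matrix[idx] = vec

def build_model(hp):
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(
            embedding_matrix.shape[0], EMBED_DIM, input_length=MAX_LEN,
            weights=[embedding_matrix], trainable=False),
        tf.keras.layers.Bidirectional(
            tf.keras.layers.LSTM(hp.Int("units", 32, 128, step=32))),
        tf.keras.layers.Dropout(hp.Float("dropout", 0.2, 0.5, step=0.1)),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

tuner = kt.Hyperband(build_model, objective="val_accuracy", max_epochs=20)
```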

The resulting informative tweets were then parsed for locations and georeferenced using the Mordecai Python library. Finally, the informative posts with locations mentioned in the text were displayed in an interactive R Shiny web application that lets end users map and explore locations by filtering the data by geography, date and time, and research interest; discover trending topics in their selected data through a word cloud; and export the data for further research and visualization by analysts.
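A minimal sketch of the Mordecai geoparsing step; it assumes the GeoNames Elasticsearch index (see the Docker and Elastic links below) is already running locally:

```python
# Sketch of georeferencing tweet text with Mordecai. Requires the GeoNames
# Elasticsearch index to be running locally before Geoparser() is created.
from mordecai import Geoparser

geo = Geoparser()
result = geo.geoparse("Flooding reported downtown in Detmold, Germany")
# Each hit pairs the matched place name with resolved coordinates, e.g.:
# [{'word': 'Detmold', 'country_predicted': 'DEU',
#   'geo': {'lat': ..., 'lon': ..., 'place_name': 'Detmold', ...}}]
```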

Scrape, Clean, Predict and Geoparse

Clone this repository to your local machine. It contains several very large files, such as the trained model and the tweets scraped and georeferenced from June and July 2021, so the initial clone will take several minutes.


Other helpful links

Docker (https://www.docker.com): used to run an Elasticsearch service over a GeoNames index for the Mordecai geoparser.

Elasticsearch (https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html): enables Mordecai to search the GeoNames index inside a Docker container.
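As a quick sanity check that the index is reachable before geoparsing, something like the following can help (the port and index name follow Mordecai's documented defaults; this helper is not part of the repository):

```python
# Confirm the local Elasticsearch GeoNames index is up before running
# Mordecai. Port 9200 and the "geonames" index are Mordecai's defaults.
import requests

resp = requests.get("http://localhost:9200/geonames/_count", timeout=5)
resp.raise_for_status()
print("GeoNames documents indexed:", resp.json()["count"])
```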

Jupyter notebooks used for testing and data wrangling

Exploration of variables: TextDataEDA.ipynb

Text preprocessing: TextConditioningandMachineLearning.ipynb