A social unrest and violent atmosphere can force a vast number of people to flee their country. While governments and international aid organizations need migration data to inform their decisions, the availability of this data is often delayed due to the tediousness to collect and publish this data. Recent studies recognized the increasing usage of social networking platforms amongst refugees to seek help and express their hardship during their journeys. This paper investigates the feasibility of accurately extracting and identifying tweets from Syria refugees. A robust framework has been developed to find, retrieve, clean and classify tweets from Syria. This includes the development of a Random Forest classifier, which automatically determines which tweets are from Syria refugees. Testing the classifier with samples of historical Twitter data produced promising result of 81% correct classification rate. This preliminary study demonstrates the potential that refugees’ messages can be accurately identified and extracted from social media data mixed with many unwanted messages, and this enables further works for studying refugee issues and predicting their migration patterns.
|Title of host publication||2018 International Conference on Computational Science and Computational Intelligence|
|Place of Publication||Las Vegas|
|Publication status||Published - Dec 2018|