Identifying Tweets from Syria Refugees Using a Random Forest Classifier

Smarti Reel, Kam Cheung Patrick Wong, Belinda Wu, Soraya Kouadri Mostefaoui, Haiming Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution



Social unrest and violence can force vast numbers of people to flee their country. Governments and international aid organizations need migration data to inform their decisions, but this data is often delayed because collecting and publishing it is laborious. Recent studies have recognized the increasing use of social networking platforms among refugees to seek help and express their hardship during their journeys. This paper investigates the feasibility of accurately extracting and identifying tweets from Syrian refugees. A robust framework has been developed to find, retrieve, clean and classify tweets from Syria. It includes a Random Forest classifier that automatically determines which tweets are from Syrian refugees. Testing the classifier on samples of historical Twitter data produced a promising correct classification rate of 81%. This preliminary study demonstrates that refugees’ messages can potentially be identified and extracted accurately from social media data mixed with many unwanted messages, which enables further work on studying refugee issues and predicting migration patterns.
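The clean-and-classify stages described in the abstract might look roughly like the following minimal sketch. It is not the authors' implementation: the use of scikit-learn, the TF-IDF features, the cleaning rules, the hyperparameters, the helper names `clean_tweet` and `build_classifier`, and the toy tweets are all assumptions made for illustration.

```python
import re

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Hypothetical cleaning rules; the abstract does not list the ones actually used.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
SPACE_RE = re.compile(r"\s+")


def clean_tweet(text: str) -> str:
    """Strip URLs, @mentions and the '#' of hashtags, then lowercase."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", " ")
    return SPACE_RE.sub(" ", text).strip().lower()


def build_classifier(seed: int = 0):
    """TF-IDF features feeding a Random Forest (illustrative settings only)."""
    return make_pipeline(
        TfidfVectorizer(preprocessor=clean_tweet),
        RandomForestClassifier(n_estimators=100, random_state=seed),
    )


# Made-up toy examples, NOT data from the study. Label 1 = written by a refugee.
TOY_TWEETS = [
    "We left Aleppo last week and are waiting at the border, please help",
    "My family has been walking for days, no food and no shelter",
    "Breaking: new report on the refugee crisis https://example.org/report",
    "Donate to support aid agencies working in the region #Syria",
]
TOY_LABELS = [1, 1, 0, 0]

model = build_classifier().fit(TOY_TWEETS, TOY_LABELS)
predictions = model.predict(["We are stuck at the border with no shelter"])
```

With realistic data, the classifier would be trained on a labelled set and scored on held-out tweets; a figure comparable to a correct classification rate could then be computed with `sklearn.metrics.accuracy_score`.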
Original language: English
Title of host publication: 2018 International Conference on Computational Science and Computational Intelligence
Place of Publication: Las Vegas
ISBN (Print): 978-1-7281-1360-9
Publication status: Published - Dec 2018

