TY - GEN
T1 - Identifying Tweets from Syria Refugees Using a Random Forest Classifier
AU - Reel, Smarti
AU - Wong, Kam Cheung Patrick
AU - Wu, Belinda
AU - Mostefaoui, Soraya Kouadri
AU - Liu, Haiming
PY - 2018/12
Y1 - 2018/12
N2 - Social unrest and a violent atmosphere can force vast numbers of people to flee their country. While governments and international aid organizations need migration data to inform their decisions, this data is often delayed because collecting and publishing it is tedious. Recent studies have recognized the increasing use of social networking platforms among refugees to seek help and express their hardship during their journeys. This paper investigates the feasibility of accurately extracting and identifying tweets from Syria refugees. A framework has been developed to find, retrieve, clean and classify tweets from Syria. This includes the development of a Random Forest classifier, which automatically determines which tweets are from Syria refugees. Testing the classifier with samples of historical Twitter data produced a promising result: an 81% correct classification rate. This preliminary study demonstrates that refugees’ messages can potentially be identified and extracted accurately from social media data mixed with many unwanted messages, which enables further work on studying refugee issues and predicting their migration patterns.
UR - https://www.researchgate.net/publication/330450320_Identifying_Tweets_from_Syria_Refugees_Using_a_Random_Forest_Classifier
UR - http://oro.open.ac.uk/58359/
M3 - Conference contribution
SN - 978-1-7281-1360-9
BT - 2018 International Conference on Computational Science and Computational Intelligence
PB - IEEE
CY - Las Vegas
ER -