TY - GEN
T1 - Exploring the roles of social media data to identify the locations and severity of road traffic accidents
AU - Salam, Sayeed
AU - Islam, Md Shihabul
AU - Ahmed, Fawaz
AU - Khan, Latifur
AU - Kim, Dohyeong
AU - Allo, Nicholas
AU - Nwariaku, Ohwofiemu
N1 - Funding Information:
This study was supported by Award no. 1R21TW010991-01A1 (Reducing the Burden of Road Traffic-Associated Mortality using Mobile Technology) from the National Institutes of Health. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - People tend to use social media to share information about nearby events, including traffic accidents. Reporting a traffic accident over the phone can initiate medical aid; however, such reports often fail to correctly specify the severity, location, and assessment of the overall situation. Social media information (e.g., tweets and posts) can be mined to extract supporting information that improves reporting accuracy and reduces the response time of first responders. In this paper, we developed a framework that continuously analyzes and extracts relevant accident reports, and we tested it using data from four cities in the U.S. and Nigeria. In this framework, we collected tweets from the Twitter API, identified whether they were accident-related, clustered tweets discussing the same accident, and performed a severity analysis based on the summary of the tweets. We then geolocated the accidents for which a location was mentioned (i.e., direct geo-coding) or provided an approximate location by estimating the user's location from the Twitter feed (i.e., indirect geo-coding). We also used a semantic role labeling approach for severity detection and present its accuracy with respect to annotated data. Empirical testing revealed that city-level locations were identified for 71-97% of the accidents and geo-coordinates were obtained for 33-83% of the accidents, varying across the study sites and geolocation methods. Our framework demonstrates that, on average, in 9-11% of cases social media publishes accident-related information earlier than the actual police reports. We also discuss our approach of using distributed big data frameworks to process the large number of tweets generated in a streaming manner.
AB - People tend to use social media to share information about nearby events, including traffic accidents. Reporting a traffic accident over the phone can initiate medical aid; however, such reports often fail to correctly specify the severity, location, and assessment of the overall situation. Social media information (e.g., tweets and posts) can be mined to extract supporting information that improves reporting accuracy and reduces the response time of first responders. In this paper, we developed a framework that continuously analyzes and extracts relevant accident reports, and we tested it using data from four cities in the U.S. and Nigeria. In this framework, we collected tweets from the Twitter API, identified whether they were accident-related, clustered tweets discussing the same accident, and performed a severity analysis based on the summary of the tweets. We then geolocated the accidents for which a location was mentioned (i.e., direct geo-coding) or provided an approximate location by estimating the user's location from the Twitter feed (i.e., indirect geo-coding). We also used a semantic role labeling approach for severity detection and present its accuracy with respect to annotated data. Empirical testing revealed that city-level locations were identified for 71-97% of the accidents and geo-coordinates were obtained for 33-83% of the accidents, varying across the study sites and geolocation methods. Our framework demonstrates that, on average, in 9-11% of cases social media publishes accident-related information earlier than the actual police reports. We also discuss our approach of using distributed big data frameworks to process the large number of tweets generated in a streaming manner.
KW - Accident
KW - BERT
KW - Clustering
KW - Semantic Role Labeling
KW - Summarization
KW - Tweet processing
KW - Visualization and API
UR - http://www.scopus.com/inward/record.url?scp=85127634512&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85127634512&partnerID=8YFLogxK
U2 - 10.1109/AIKE52691.2021.00016
DO - 10.1109/AIKE52691.2021.00016
M3 - Conference contribution
AN - SCOPUS:85127634512
T3 - Proceedings - 2021 IEEE 4th International Conference on Artificial Intelligence and Knowledge Engineering, AIKE 2021
SP - 62
EP - 71
BT - Proceedings - 2021 IEEE 4th International Conference on Artificial Intelligence and Knowledge Engineering, AIKE 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 4th IEEE International Conference on Artificial Intelligence and Knowledge Engineering, AIKE 2021
Y2 - 1 December 2021 through 3 December 2021
ER -