SlideShare una empresa de Scribd logo
1 de 19
Descargar para leer sin conexión
Real-time Natural Language Processing for
Crowdsourced Road Traffic Alerts
C.D. Athuraliya, M.K.H. Gunasekara,
Srinath Perera, Sriskandarajah Suhothayan
http://bit.ly/1NwBXTv
● Introduction
● Background
● Solution & Methodology
● Results & Conclusion
Overview
2
Introduction
● Success of modern day enterprises and businesses is highly relied
on how they process massive amounts of data
● “Drowning in data yet starving for knowledge”
● With the emergence of social media, public has gained the
potential to generate massive amounts of data
● But we are still in a struggle to extract useful information out of this
data
3
Introduction
● Road traffic has become a major issue, mainly in developing
countries
● Directly affects country’s economy and development due to the
waste of resources – Fuel, time
● Using technology to find solutions – Proven to be success stories in
number of cases
● This study was focused on one such solution emerged with the use
of social media
● Twitter – Popular for dynamic content publishing
○ Users publish on different topics such as current affairs, news, politics
and personal interests via 140 character messages called tweets
4
Background
● Road.lk – A website that provides localized traffic alerts from a
Twitter feed
● Experiencing road traffic or have information on road traffic? Tweet
about it!
● All users, follow @road_lk receive traffic alerts nearly in real-time
● Identified as a potential source to extract information on road traffic
in real-time
● Reliability maintained by higher number of publishers
5
Background – @road_lk Feed
6
Background
● Potential is significant to a country like Sri Lanka – Due to the
unavailability of high tech traffic monitoring systems
● Several limitations,
○ Connectivity requirement
○ Unavailability of proper alert mechanism except Twitter feed or
road.lk website
● Notable limitation – Users use natural language to post traffic
updates
● A format can make processing tweets more straightforward but it
can reduce the flexibility of sharing updates
7
Solution & Methodology
● A prototype solution was implemented by combining NLP and CEP
tools
● Accommodates three use cases,
○ Real-time road traffic feed and geo location map
○ Traffic search within an area
○ Traffic alert subscription
● Developed an architecture for a these use cases
● Multiple tools were utilized to retrieve, process and present
information
8
Solution & Methodology – Architecture
9
Solution & Methodology – Feed
● Feed Retrieval – Access Twitter via its API
● Existing feed for model training dataset generation
○ REST API, Twitter4J
● Real-time feed stream for alert generation
○ Streaming API, WSO2 Enterprise Service Bus Twitter
connector
10
Solution & Methodology – NLP
● @road_lk Twitter feed
○ Reliable data source to generate real-time traffic alerts
○ Constrained by natural language representation
● Transform this data into a machine readable representation – Can
use the full potential of this source for a better solution
● Proposed a NLP model to address this problem
● Extracted two entities from a tweet – location and traffic level
● Before extracting these two entities,
○ A tweet needed to be classified – Traffic alert or not?
○ Cleaning, preprocessing
11
Solution & Methodology – NLP
● NLP tasks required to classify and extract,
○ Tweet categorization
○ Location extraction
○ Traffic level extraction
● First task – Document categorization task
● Latter two – Name entity recognition (NER) tasks
● Apache OpenNLP toolkit was used
● Custom tokenizer for street names and city names
● Traffic level NER task – Predefined set of words selected to tag
● Had to consider factors – Spelling mistakes, informal language,
abbreviations 12
Solution & Methodology – CEP
● Another important property of this data source – Required to
process the Twitter feed in real-time
● Our approach was complex event processing (CEP)
● CEP is a field, concerned in processing data from multiple sources
in real-time
● Used WSO2 Complex Event Processor as the CEP tool to analyse
and process Twitter feed input stream
● Siddhi Query Language (SiddhiQL) is at the core of WSO2 CEP
● Designed to process event streams and identify complex event
occurrences
13
Solution & Methodology – Siddhi Queries
from classifiedStream#transform.nlp:getEntities(convertedText,4,true,"/_system/governance/en-location.bin")
select * insert into templocationStream;
from classifiedStream#transform.nlp:getEntities(convertedText,1,false,"/_system/governance/en-trafficlevel.bin")
select * insert into temptrafficlevelStream;
from S1=classifiedStream, S2=temptrafficlevelStream, S3=templocationStream
select S1.createdAt as time, S2.nameElement1 as trafficLevel, S3.nameElement1 as location1, S3.nameElement2 as
location2, S3.nameElement3 as location3, S3.nameElement4 as location4
insert into locationsStream;
from uiFeedStream#window.time(120 min) as trafficFeed join SearchEventStream as request
on (trafficFeed.latitude < request.latitude + 0.018 and trafficFeed.latitude > request.latitude - 0.018 and
trafficFeed.longitude < request.longitude + 0.027 and trafficFeed.longitude > request.longitude - 0.027)
select trafficFeed.formattedAddress, trafficFeed.latitude, trafficFeed.longitude, trafficFeed.level, trafficFeed.time
insert into searchResult;
14
Solution & Methodology – CEP
● Siddhi queries define how to process and combine existing event
streams to create new event streams
● SiddhiQL was extended with extensions for,
○ Tweet categorization
○ Name entity recognition
○ Geocoding
● Geocoding extension converts the locations into geo coordinates
● Searching functionality used a time-based Siddhi window
○ To retrieve traffic in nearby geo area within a predefined time
period
15
Results & Conclusion
● Implemented a web based interface to demonstrate the
functionalities
● Users can interact with this interface and make use of the use
cases
● Accuracy measures of NLP through OpenNLP evaluation APIs
● A solution to extract useful information from a crowdsourced social
networking service
● By utilizing a NLP/CEP combined approach
16
Results & Conclusion – Web UI
17
Results & Conclusion
● Results demonstrate the potential of such model
● To tackle an application of real-time natural language processing
task
● This model can be extended to tackle any real-time unstructured
data stream
● Transforming human readable data into machine readable format
enables deep processing of data to generate useful information and
insights
○ Trend analysis
○ Pattern detection and prediction
18
Thank you.

Más contenido relacionado

Similar a Real-time Natural Language Processing for Crowdsourced Road Traffic Alerts

Local Weather Information and GNOME Shell Extension
Local Weather Information and GNOME Shell ExtensionLocal Weather Information and GNOME Shell Extension
Local Weather Information and GNOME Shell Extension
Sammy Fung
 
Agile Gurugram 2023 | Observability for Modern Applications. How does it help...
Agile Gurugram 2023 | Observability for Modern Applications. How does it help...Agile Gurugram 2023 | Observability for Modern Applications. How does it help...
Agile Gurugram 2023 | Observability for Modern Applications. How does it help...
AgileNetwork
 
Open data for development
Open data for developmentOpen data for development
Open data for development
mlepage
 

Similar a Real-time Natural Language Processing for Crowdsourced Road Traffic Alerts (20)

Presentation
PresentationPresentation
Presentation
 
Extracting Insights from Data at Twitter
Extracting Insights from Data at TwitterExtracting Insights from Data at Twitter
Extracting Insights from Data at Twitter
 
Geospatial data platform at Uber
Geospatial data platform at UberGeospatial data platform at Uber
Geospatial data platform at Uber
 
A Platform Approach to Digital Transformation
A Platform Approach to Digital TransformationA Platform Approach to Digital Transformation
A Platform Approach to Digital Transformation
 
Making DMPs actionable and public
Making DMPs actionable and publicMaking DMPs actionable and public
Making DMPs actionable and public
 
Local Weather Information and GNOME Shell Extension
Local Weather Information and GNOME Shell ExtensionLocal Weather Information and GNOME Shell Extension
Local Weather Information and GNOME Shell Extension
 
Shaik Niyas Ahamed M Resume
Shaik Niyas Ahamed M ResumeShaik Niyas Ahamed M Resume
Shaik Niyas Ahamed M Resume
 
Uber Geo spatial data platform at DataWorks Summit
Uber Geo spatial data platform at DataWorks SummitUber Geo spatial data platform at DataWorks Summit
Uber Geo spatial data platform at DataWorks Summit
 
DAMG7245-Fall23-FinalProjectProposal.pdf
DAMG7245-Fall23-FinalProjectProposal.pdfDAMG7245-Fall23-FinalProjectProposal.pdf
DAMG7245-Fall23-FinalProjectProposal.pdf
 
Putting Spatial Information in Customer Hands - Wayne Fry - Dept Natural Reso...
Putting Spatial Information in Customer Hands - Wayne Fry - Dept Natural Reso...Putting Spatial Information in Customer Hands - Wayne Fry - Dept Natural Reso...
Putting Spatial Information in Customer Hands - Wayne Fry - Dept Natural Reso...
 
Agile Gurugram 2023 | Observability for Modern Applications. How does it help...
Agile Gurugram 2023 | Observability for Modern Applications. How does it help...Agile Gurugram 2023 | Observability for Modern Applications. How does it help...
Agile Gurugram 2023 | Observability for Modern Applications. How does it help...
 
Open data for development
Open data for developmentOpen data for development
Open data for development
 
City of Amsterdam: High velocity development
City of Amsterdam: High velocity developmentCity of Amsterdam: High velocity development
City of Amsterdam: High velocity development
 
The tripscore Linked Data client: calculating specific summaries over large t...
The tripscore Linked Data client: calculating specific summaries over large t...The tripscore Linked Data client: calculating specific summaries over large t...
The tripscore Linked Data client: calculating specific summaries over large t...
 
SFSCON23 - Martin Rabanser - Real-time aeroplane tracking and the Open Data Hub
SFSCON23 - Martin Rabanser - Real-time aeroplane tracking and the Open Data HubSFSCON23 - Martin Rabanser - Real-time aeroplane tracking and the Open Data Hub
SFSCON23 - Martin Rabanser - Real-time aeroplane tracking and the Open Data Hub
 
Android development
Android developmentAndroid development
Android development
 
#twbconf 2017: Digital transformation in London - Natalie Taylor, Mayor of Lo...
#twbconf 2017: Digital transformation in London - Natalie Taylor, Mayor of Lo...#twbconf 2017: Digital transformation in London - Natalie Taylor, Mayor of Lo...
#twbconf 2017: Digital transformation in London - Natalie Taylor, Mayor of Lo...
 
Engage 2020-nerd-for-move-on-from-x pages
Engage 2020-nerd-for-move-on-from-x pagesEngage 2020-nerd-for-move-on-from-x pages
Engage 2020-nerd-for-move-on-from-x pages
 
Visualizing CDR Data
Visualizing CDR DataVisualizing CDR Data
Visualizing CDR Data
 
Enterprise Monitoring and Auditing in Denodo
Enterprise Monitoring and Auditing in DenodoEnterprise Monitoring and Auditing in Denodo
Enterprise Monitoring and Auditing in Denodo
 

Último

➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
amitlee9823
 
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men 🔝Thrissur🔝 Escor...
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men  🔝Thrissur🔝   Escor...➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men  🔝Thrissur🔝   Escor...
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men 🔝Thrissur🔝 Escor...
amitlee9823
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night StandCall Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
only4webmaster01
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
amitlee9823
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
amitlee9823
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
amitlee9823
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
amitlee9823
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
amitlee9823
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
amitlee9823
 

Último (20)

Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
 
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
 
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men 🔝Thrissur🔝 Escor...
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men  🔝Thrissur🔝   Escor...➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men  🔝Thrissur🔝   Escor...
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men 🔝Thrissur🔝 Escor...
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 
Detecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachDetecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning Approach
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night StandCall Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
 

Real-time Natural Language Processing for Crowdsourced Road Traffic Alerts

  • 1. Real-time Natural Language Processing for Crowdsourced Road Traffic Alerts C.D. Athuraliya, M.K.H. Gunasekara, Srinath Perera, Sriskandarajah Suhothayan http://bit.ly/1NwBXTv
  • 2. ● Introduction ● Background ● Solution & Methodology ● Results & Conclusion Overview 2
  • 3. Introduction ● Success of modern day enterprises and businesses is highly relied on how they process massive amounts of data ● “Drowning in data yet starving for knowledge” ● With the emergence of social media, public has gained the potential to generate massive amounts of data ● But we are still in a struggle to extract useful information out of this data 3
  • 4. Introduction ● Road traffic has become a major issue, mainly in developing countries ● Directly affects country’s economy and development due to the waste of resources – Fuel, time ● Using technology to find solutions – Proven to be success stories in number of cases ● This study was focused on one such solution emerged with the use of social media ● Twitter – Popular for dynamic content publishing ○ Users publish on different topics such as current affairs, news, politics and personal interests via 140 character messages called tweets 4
  • 5. Background ● Road.lk – A website that provides localized traffic alerts from a Twitter feed ● Experiencing road traffic or have information on road traffic? Tweet about it! ● All users, follow @road_lk receive traffic alerts nearly in real-time ● Identified as a potential source to extract information on road traffic in real-time ● Reliability maintained by higher number of publishers 5
  • 7. Background ● Potential is significant to a country like Sri Lanka – Due to the unavailability of high tech traffic monitoring systems ● Several limitations, ○ Connectivity requirement ○ Unavailability of proper alert mechanism except Twitter feed or road.lk website ● Notable limitation – Users use natural language to post traffic updates ● A format can make processing tweets more straightforward but it can reduce the flexibility of sharing updates 7
  • 8. Solution & Methodology ● A prototype solution was implemented by combining NLP and CEP tools ● Accommodates three use cases, ○ Real-time road traffic feed and geo location map ○ Traffic search within an area ○ Traffic alert subscription ● Developed an architecture for a these use cases ● Multiple tools were utilized to retrieve, process and present information 8
  • 9. Solution & Methodology – Architecture 9
  • 10. Solution & Methodology – Feed ● Feed Retrieval – Access Twitter via its API ● Existing feed for model training dataset generation ○ REST API, Twitter4J ● Real-time feed stream for alert generation ○ Streaming API, WSO2 Enterprise Service Bus Twitter connector 10
  • 11. Solution & Methodology – NLP ● @road_lk Twitter feed ○ Reliable data source to generate real-time traffic alerts ○ Constrained by natural language representation ● Transform this data into a machine readable representation – Can use the full potential of this source for a better solution ● Proposed a NLP model to address this problem ● Extracted two entities from a tweet – location and traffic level ● Before extracting these two entities, ○ A tweet needed to be classified – Traffic alert or not? ○ Cleaning, preprocessing 11
  • 12. Solution & Methodology – NLP ● NLP tasks required to classify and extract, ○ Tweet categorization ○ Location extraction ○ Traffic level extraction ● First task – Document categorization task ● Latter two – Name entity recognition (NER) tasks ● Apache OpenNLP toolkit was used ● Custom tokenizer for street names and city names ● Traffic level NER task – Predefined set of words selected to tag ● Had to consider factors – Spelling mistakes, informal language, abbreviations 12
  • 13. Solution & Methodology – CEP ● Another important property of this data source – Required to process the Twitter feed in real-time ● Our approach was complex event processing (CEP) ● CEP is a field, concerned in processing data from multiple sources in real-time ● Used WSO2 Complex Event Processor as the CEP tool to analyse and process Twitter feed input stream ● Siddhi Query Language (SiddhiQL) is at the core of WSO2 CEP ● Designed to process event streams and identify complex event occurrences 13
  • 14. Solution & Methodology – Siddhi Queries from classifiedStream#transform.nlp:getEntities(convertedText,4,true,"/_system/governance/en-location.bin") select * insert into templocationStream; from classifiedStream#transform.nlp:getEntities(convertedText,1,false,"/_system/governance/en-trafficlevel.bin") select * insert into temptrafficlevelStream; from S1=classifiedStream, S2=temptrafficlevelStream, S3=templocationStream select S1.createdAt as time, S2.nameElement1 as trafficLevel, S3.nameElement1 as location1, S3.nameElement2 as location2, S3.nameElement3 as location3, S3.nameElement4 as location4 insert into locationsStream; from uiFeedStream#window.time(120 min) as trafficFeed join SearchEventStream as request on (trafficFeed.latitude < request.latitude + 0.018 and trafficFeed.latitude > request.latitude - 0.018 and trafficFeed.longitude < request.longitude + 0.027 and trafficFeed.longitude > request.longitude - 0.027) select trafficFeed.formattedAddress, trafficFeed.latitude, trafficFeed.longitude, trafficFeed.level, trafficFeed.time insert into searchResult; 14
  • 15. Solution & Methodology – CEP ● Siddhi queries define how to process and combine existing event streams to create new event streams ● SiddhiQL was extended with extensions for, ○ Tweet categorization ○ Name entity recognition ○ Geocoding ● Geocoding extension converts the locations into geo coordinates ● Searching functionality used a time-based Siddhi window ○ To retrieve traffic in nearby geo area within a predefined time period 15
  • 16. Results & Conclusion ● Implemented a web based interface to demonstrate the functionalities ● Users can interact with this interface and make use of the use cases ● Accuracy measures of NLP through OpenNLP evaluation APIs ● A solution to extract useful information from a crowdsourced social networking service ● By utilizing a NLP/CEP combined approach 16
  • 17. Results & Conclusion – Web UI 17
  • 18. Results & Conclusion ● Results demonstrate the potential of such model ● To tackle an application of real-time natural language processing task ● This model can be extended to tackle any real-time unstructured data stream ● Transforming human readable data into machine readable format enables deep processing of data to generate useful information and insights ○ Trend analysis ○ Pattern detection and prediction 18