What if you could collect millions of reviews from travel websites, extract entities via AlchemyAPI, and train a model to predict search behaviour in upcoming months based on what users write about specific geographical areas and accommodations? Or build a recommendation engine for e-commerce platforms that takes into account not only the number of purchases but also SEO-specific factors such as keyword difficulty, number of external links and more, to find the right balance between internal linking and commercially interesting items? Classifying and structuring huge datasets of content can be time-consuming, so why not use a free, pre-trained machine learning API for topic detection to do this for you? In this session Jan Willem Bobbinck will introduce the concept of machine learning and share a few practical examples of how you can use it to optimize your SEO processes.
17. “A computer program is said to learn from
experience E with respect to some task T
and some performance measure P, if its
performance on T, as measured by P,
improves with experience E.” -Tom Mitchell,
Carnegie Mellon University
18. E: 50 years of data about housing prices in
Munich
T: Price prediction, to sell at the right price
P: The better the price predictions it gives, the
better future predictions will be
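The E/T/P framing above can be sketched in a few lines. This is a minimal illustration, not the method used in the talk: it assumes a single feature (the year), a plain least-squares line as the model, and synthetic prices whose trend and noise are invented. P is taken to be the mean absolute error on held-out years.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_abs_error(model, xs, ys):
    """Performance measure P: average absolute prediction error."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in zip(xs, ys)) / len(xs)

# Synthetic experience E: 50 yearly price points (EUR per square metre).
# The trend (3000 + 120 per year) and the noise are invented for illustration.
rng = random.Random(42)
years = list(range(50))
prices = [3000 + 120 * year + rng.uniform(-200, 200) for year in years]

# Hold out the last 10 years to measure P on data the model has not seen.
test_years, test_prices = years[40:], prices[40:]

# Task T: predict prices. Train on growing amounts of experience E and report P.
for n_train in (5, 20, 40):
    model = fit_line(years[:n_train], prices[:n_train])
    error = mean_abs_error(model, test_years, test_prices)
    print(f"trained on {n_train:2d} years -> mean abs error {error:8.1f}")
```

With real data the error will not fall monotonically with every extra year, but the general pattern of more experience giving a better P is what the definition describes.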
19. The goal of ML is never to make “perfect”
guesses, because ML deals in domains where
there is no such thing. The goal is to make
guesses that are good enough to be useful.
As George E. P. Box, British mathematician and professor
of statistics, put it: “all models are wrong, but
some are useful.”
22. You know what you are looking for
What do these datapoints have in common?
23. E: 50 years of data about housing prices
in Munich
T: Price prediction, to sell at the right price
P: The better the price predictions it gives, the
better future predictions will be
24. No rules taught. It took Google’s AI thousands of games to figure out that losing was probably bad
31. Best to start with:
• https://www.coursera.org/learn/machine-learning
by Andrew Ng (Baidu, former Google Brain)
• Tom Mitchell lectures:
http://www.cs.cmu.edu/~tom/10601_fall2012/lectures.shtml
• https://work.caltech.edu/telecourse.html Caltech
ML course
36. Mainly use pre-trained models:
– Spam classification of user generated content
(comments & reviews)
– Content classification
– Text extraction from pages
37. • Query classification
• Recommendation engines: internal linking
based on e-commerce, user behaviour
and SEO metrics.
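A recommendation score that mixes the three kinds of signals could look like the sketch below. It is only an assumption of how such an engine might be wired: the field names (`purchases`, `ctr`, `keyword_difficulty`), the weights and the sample pages are all hypothetical. Each metric is min-max normalised so that differently scaled numbers can be combined in one weighted sum.

```python
def min_max(values):
    """Scale a list of numbers to the range 0..1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def rank_link_targets(pages, weights):
    """Return (score, url) pairs, best internal-link target first."""
    names = list(weights)
    normalised = {name: min_max([page[name] for page in pages]) for name in names}
    scored = []
    for i, page in enumerate(pages):
        score = sum(weights[name] * normalised[name][i] for name in names)
        scored.append((round(score, 3), page["url"]))
    return sorted(scored, reverse=True)

# Hypothetical pages: purchases (e-commerce), ctr (user behaviour),
# keyword_difficulty (SEO metric, lower is easier to rank for).
pages = [
    {"url": "/hotel-a", "purchases": 120, "ctr": 0.04, "keyword_difficulty": 70},
    {"url": "/hotel-b", "purchases": 40, "ctr": 0.09, "keyword_difficulty": 30},
    {"url": "/hotel-c", "purchases": 10, "ctr": 0.02, "keyword_difficulty": 55},
]

# Assumed weights; a negative weight penalises hard-to-rank keywords.
weights = {"purchases": 0.4, "ctr": 0.3, "keyword_difficulty": -0.3}

ranking = rank_link_targets(pages, weights)
for score, url in ranking:
    print(f"{url}: {score}")
```

In practice the weights would be tuned or learned from data rather than set by hand.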
52. 1. Collect all hotel reviews
2. Check sentiment and main entities
3. Upload search volume and e-commerce
data per hotel
4. Update internal linking accordingly
54. 1. Collect all hotel reviews
2. Plot them against time
3. Extract upcoming entities and sentiments
4. Predict future search behaviour
5. Create landing pages for future targeting
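Steps 2 and 3 above can be approximated by counting entity mentions per month and flagging the ones that are growing. The sketch below assumes the entities have already been extracted by an external service; the review data, the `recent_months` window and the `min_growth` threshold are invented for illustration.

```python
from collections import Counter, defaultdict

def monthly_mentions(reviews):
    """Count in how many reviews each entity appears, per month."""
    counts = defaultdict(Counter)
    for month, entities in reviews:
        counts[month].update(set(entities))  # count each entity once per review
    return counts

def rising_entities(reviews, recent_months, min_growth=2.0):
    """Entities whose average monthly mentions grew by at least min_growth."""
    counts = monthly_mentions(reviews)
    months = sorted(counts)  # ISO "YYYY-MM" strings sort chronologically
    recent, earlier = months[-recent_months:], months[:-recent_months]
    entities = {entity for counter in counts.values() for entity in counter}
    rising = []
    for entity in sorted(entities):
        recent_avg = sum(counts[m][entity] for m in recent) / len(recent)
        earlier_avg = (
            sum(counts[m][entity] for m in earlier) / len(earlier) if earlier else 0.0
        )
        # +1 smoothing avoids division by zero for brand-new entities.
        growth = (recent_avg + 1) / (earlier_avg + 1)
        if growth >= min_growth:
            rising.append((entity, round(growth, 2)))
    return sorted(rising, key=lambda item: item[1], reverse=True)

# Hypothetical (month, extracted entities) pairs for one destination.
reviews = [
    ("2016-01", ["pool", "wifi"]),
    ("2016-01", ["wifi"]),
    ("2016-02", ["pool", "wifi"]),
    ("2016-03", ["cycling", "wifi"]),
    ("2016-04", ["cycling", "cycling route", "wifi"]),
    ("2016-04", ["cycling"]),
    ("2016-04", ["cycling"]),
]

print(rising_entities(reviews, recent_months=2))
```

Rising entities are candidates for new landing pages; an actual forecast of search volume would need a proper time-series model on top of these counts.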
71. • A list of links containing
– Content language
– Content topic
– Spam probability
– Content sentiment (if wanted)
– Prioritized on language relevancy
72. • 10,000+ keywords? Use an ML classifier
• Check for entities such as places to detect local intent
• Buying intent vs. informational intent
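A keyword classifier of this kind can be sketched with a small multinomial Naive Bayes model. This is an assumption about one workable approach, not the classifier used in the talk; the training keywords, the label names and the `KNOWN_PLACES` set are invented, and a real setup would need far more labelled examples.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesIntentClassifier:
    """Multinomial Naive Bayes over keyword tokens with add-one smoothing."""

    def fit(self, keywords, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        for keyword, label in zip(keywords, labels):
            self.word_counts[label].update(keyword.lower().split())
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, keyword):
        words = keyword.lower().split()
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            denominator = sum(counts.values()) + len(self.vocabulary)
            score = math.log(n_docs / total)  # log prior
            for word in words:
                score += math.log((counts[word] + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

def has_local_identifier(keyword, places):
    """True if any token of the keyword is a known place name."""
    return any(word in places for word in keyword.lower().split())

# Invented training examples.
training = [
    ("buy gas bottle", "transactional"),
    ("gas bbq price", "transactional"),
    ("order patio heater", "transactional"),
    ("cheap gas bottle", "transactional"),
    ("how to refill gas bottle", "informational"),
    ("what is propane", "informational"),
    ("how does patio heater work", "informational"),
    ("propane vs butane", "informational"),
]
classifier = NaiveBayesIntentClassifier().fit(*zip(*training))

KNOWN_PLACES = {"munich", "amsterdam"}  # hypothetical gazetteer

for keyword in ("buy patio heater", "how to use propane", "Campingaz Munich"):
    print(keyword, "|", classifier.predict(keyword),
          "| local:", has_local_identifier(keyword, KNOWN_PLACES))
```

The place lookup only handles single-word place names; multi-word places would need an entity extraction step instead.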
73.
Persona | Customer journey stage | Page type | Local identifier | Tag | Keyword
Leisure NL | Awareness | Product | Yes | Campingaz | Campingaz Munich
Leisure NL | Awareness | Informational | No | - | terrasverwarmer
Leisure NL | Awareness | Informational | No | - | terrasverwarming
Leisure NL | Awareness | Informational | No | BBQ | gasbarbecue
Leisure NL | Awareness | Informational | No | BBQ | gas bbq
Leisure NL | Consideration | Informational | No | Generic | gasfles
Leisure NL | Retention | Informational | No | Generic | gasfles vullen
Leisure NL | Retention | Informational | No | Branded | primagaz
Leisure NL | Consideration | Informational | No | Generic | gasfles kopen
B2B-industrie | Awareness | Informational | No | LNG | lng
Leisure NL | Consideration | Product | No | Generic | gasflessen
Leisure NL | Awareness | Informational | No | Generic | kookplaat gas
Energie | Awareness | Informational | No | Propaan | propaan
Leisure NL | Awareness | Informational | No | Butaan | butaan
76. "I liked the book you gave me yesterday, but
the rest of my day was terrible."
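This sentence carries two opposite sentiments, which a single score for the whole sentence hides. A minimal sketch, assuming a toy word lexicon and a naive split on "but" (both invented for illustration; real sentiment APIs are far more sophisticated):

```python
import re

# Toy sentiment lexicon; the words and scores are assumptions.
LEXICON = {"liked": 1, "enjoyed": 1, "great": 1,
           "terrible": -1, "bad": -1, "broken": -1}

def score(text):
    """Sum the lexicon scores of all words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)

def clause_scores(sentence):
    """Split on 'but' and score each clause separately."""
    parts = re.split(r",?\s*\bbut\b\s*", sentence)
    return [(part.strip(), score(part)) for part in parts if part.strip()]

sentence = ("I liked the book you gave me yesterday, but "
            "the rest of my day was terrible.")

print("whole sentence:", score(sentence))
for clause, value in clause_scores(sentence):
    print(f"{value:+d}  {clause}")
```

Scored as a whole, the positive and negative words cancel out; scored per clause, the two sentiments stay attached to what they describe.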
80. {
"summarized_data": "Mallorcan roads are well maintained, cyclist are really welcome and I really enjoyed it last year...",
"auto_gen_ranked_keywords": [
"flight", "madrid", "mallorca", "training", "food", "plane",
"delayed", "weather", "broken", "quest", "hot", "spirit",
"horror", "booked", "hour", "wifi", "trip", "situation", "airport",
"gate", "mallorcan", "lounge", "spend", "minute", "ve",
"cyclist", "rainy", "missed", "netherland", "enjoyed", "road" ]
}
83. Aw! Yes, said Miss Skinlin she hasn’t the
first heir to the female figure. The waves
dance bright and happy when I forgot to
learn, before which she told me to read and
study. My Uncle, with a commanding, What
are you better than Kintuck.
19th century American literature
http://blog.algorithmia.com/2015/12/nanogenmo-text-analysis-with-algorithmias/
84. 1. Input topic & Scrape current content
2. Create all N-grams
3. Create individual paragraphs
4. Randomly combine and create texts
5. Run through topic and sentiment classifiers to
evaluate
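One way to read steps 2 to 4 is as a Markov-chain text generator built from n-grams. The sketch below is that interpretation only, with an invented corpus standing in for scraped content; step 5 (scoring the output with topic and sentiment classifiers) is left out because it depends on an external service.

```python
import random
from collections import defaultdict

def build_ngram_model(text, n=2):
    """Map every n-gram (as a tuple of words) to the words that follow it."""
    words = text.split()
    model = defaultdict(list)
    for i in range(len(words) - n):
        model[tuple(words[i:i + n])].append(words[i + n])
    return model

def generate(model, length, seed=None):
    """Randomly chain n-grams into a text of `length` words."""
    rng = random.Random(seed)
    state = rng.choice(sorted(model))
    output = list(state)
    while len(output) < length:
        followers = model.get(state)
        if not followers:
            # Dead end (the last n-gram of the corpus): restart from a random n-gram.
            state = rng.choice(sorted(model))
            output.extend(state)
            continue
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output[:length])

# Invented stand-in for scraped content on one topic.
corpus = (
    "the roads on the island are quiet and the roads in the mountains are steep "
    "the climbs in the mountains are long and the views on the island are great"
)

model = build_ngram_model(corpus, n=2)
print(generate(model, length=12, seed=1))
```

Output of this kind is grammatical only by accident, which is why the evaluation step with classifiers matters.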
91. • Restructure website content based on a
set taxonomy of topics
• Extract texts from the top 30 results and define text
requirements (e.g. Searchmetrics module)
• Purchase prediction for new queries
95. • Use Google TensorFlow to identify image
contents
• Crawl topic-related content
• Generate automatic descriptions and paragraph
text
• Build an image library site including text, good for
SEO
https://databricks.com/blog/2016/01/25/deep-learning-with-spark-and-tensorflow.html
96. • From 2011: Google Prediction API
http://cloudacademy.com/blog/google-prediction-api/