SlideShare a Scribd company logo
1 of 106
Download to read offline
using python
web scraping
Paul Schreiber
paul.schreiber@fivethirtyeight.com
paulschreiber@gmail.com
@paulschreiber
☁
</>
Fetching pages
➜ urllib

➜ urllib2

➜ urllib (Python 3)

➜ requests
import'requests'
page'='requests.get('http://www.ire.org/')
Fetch one page
import'requests'
base_url'=''http://www.fishing.ocean/p/%s''
for'i'in'range(0,'10):'
''url'='base_url'%'i'
''page'='requests.get(url)
Fetch a set of results
import'requests'
page'='requests.get('http://www.ire.org/')'
with'open("index.html",'"wb")'as'html:'
''html.write(page.content)
Download a file
Parsing data
➜ Regular Expressions

➜ CSS Selectors

➜ XPath

➜ Object Hierarchy

➜ Object Searching
<html>'
''<head><title>Green'Eggs'and'Ham</title></head>'
''<body>'
'''<ol>'
''''''<li>Green'Eggs</li>'
''''''<li>Ham</li>'
'''</ol>'
''</body>'
</html>
import're'
item_re'='re.compile("<li[^>]*>([^<]+?)</
li>")'
item_re.findall(html)
Regular Expressions DON’T
DO THIS!
from'bs4'import'BeautifulSoup'
soup'='BeautifulSoup(html)'
[s.text'for's'in'soup.select("li")]
CSS Selectors
from'lxml'import'etree'
from'StringIO'import'*'
html'='StringIO(html)'
tree'='etree.parse(html)'
[s.text'for's'in'tree.xpath('/ol/li')]
XPath
import'requests'
from'bs4'import'BeautifulSoup'
page'='requests.get('http://www.ire.org/')'
soup'='BeautifulSoup(page.content)'
print'"The'title'is'"'+'soup.title
Object Hierarchy
from'bs4'import'BeautifulSoup'
soup'='BeautifulSoup(html)'
[s.text'for's'in'soup.find_all("li")]
Object Searching
import'csv'
with'open('shoes.csv',''wb')'as'csvfile:'
''''shoe_writer'='csv.writer(csvfile)'
''''for'line'in'shoe_list:'
''''''''shoe_writer.writerow(line)'
Write CSV
output'='open("shoes.txt",'"w")'
for'row'in'data:'
''''output.write("t".join(row)'+'"n")'
output.close()'
Write TSV
import'json'
with'open('shoes.json',''wb')'as'outfile:'
''''json.dump(my_json,'outfile)'
Write JSON
workon'web_scraping
WTA Rankings
EXAMPLE 1
WTA Rankings
➜ fetch page

➜ parse cells

➜ write to file
import'csv'
import'requests'
from'bs4'import'BeautifulSoup'
url'=''http://www.wtatennis.com/singlesZ
rankings''
page'='requests.get(url)'
soup'='BeautifulSoup(page.content)
WTA Rankings
soup.select("#myTable'td")
WTA Rankings
[s'for's'in'soup.select("#myTable'td")]
WTA Rankings
[s.get_text()'for's'in'
soup.select("#myTable'td")]
WTA Rankings
[s.get_text().strip()'for's'in'
soup.select("#myTable'td")]
WTA Rankings
cells'='[s.get_text().strip()'for's'in'
soup.select("#myTable'td")]
WTA Rankings
for'i'in'range(0,'3):'
''print'cells[i*7:i*7+7]
WTA Rankings
with'open('wta.csv',''wb')'as'csvfile:'
''wtawriter'='csv.writer(csvfile)'
''for'i'in'range(0,'3):'
''''wtawriter.writerow(cells[i*7:i*7+7])
WTA Rankings
NY Election
Boards
EXAMPLE 2
NY Election Boards
➜ list counties

➜ loop over counties

➜ fetch county pages

➜ parse county data

➜ write to file
import'requests'
from'bs4'import'BeautifulSoup'
url'=''http://www.elections.ny.gov/
CountyBoards.html''
page'='requests.get(url)'
soup'='BeautifulSoup(page.content)
NY Election Boards
soup.select("area")
NY Election Boards
counties'='soup.select("area")'
county_urls'='[u.get('href')'for'u'in'
counties]
NY Election Boards
counties'='soup.select("area")'
county_urls'='[u.get('href')'for'u'in'
counties]'
county_urls'='county_urls[1:]'
county_urls'='list(set(county_urls))
NY Election Boards
for'url'in'county_urls[0:3]:'
''''print'"Fetching'%s"'%'url'
''''page'='requests.get(url)'
''''soup'='BeautifulSoup(page.content)'
''''lines'='[s'for's'in'soup.select("th")
[0].strings]'
''''data.append(lines)
NY Election Boards
output'='open("boards.txt",'"w")'
for'row'in'data:'
''''output.write("t".join(row)'+'"n")'
output.close()
NY Election Boards
ACEC
Members
EXAMPLE 3
ACEC Members
➜ loop over pages

➜ fetch result table

➜ parse name, id, location

➜ write to file
import'requests'
import'json'
from'bs4'import'BeautifulSoup'
base_url'=''http://www.acec.ca/about_acec/
search_member_firms/
business_sector_search.html/search/
business/page/%s'
ACEC Members
url'='base_url'%'1'
page'='requests.get(url)'
soup'='BeautifulSoup(page.content)'
soup.find(id='resulttable')
ACEC Members
url'='base_url'%'1'
page'='requests.get(url)'
soup'='BeautifulSoup(page.content)'
table'='soup.find(id='resulttable')'
rows'='table.find_all('tr')
ACEC Members
url'='base_url'%'1'
page'='requests.get(url)'
soup'='BeautifulSoup(page.content)'
table'='soup.find(id='resulttable')'
rows'='table.find_all('tr')'
columns'='rows[0].find_all('td')
ACEC Members
columns'='rows[0].find_all('td')'
company_data'='{'
'''name':'columns[1].a.text,'
'''id':'columns[1].a['href'].split('/')
[Z1],'
'''location':'columns[2].text'
}
ACEC Members
start_page'='1'
end_page'='2'
result'='[]
ACEC Members
for'i'in'range(start_page,'end_page'+'1):'
''''url'='base_url'%'i'
''''print'"Fetching'%s"'%'url'
''''page'='requests.get(url)'
''''soup'='BeautifulSoup(page.content)'
''''table'='soup.find(id='resulttable')'
''''rows'='table.find_all('tr')
ACEC Members
''for'r'in'rows:'
''''columns'='r.find_all('td')'
''''company_data'='{'
'''''''name':'columns[1].a.text,'
'''''''id':'columns[1].a['href'].split('/')[Z1],'
'''''''location':'columns[2].text'
''''}'
''''result.append(company_data)'
ACEC Members
with'open('acec.json',''w')'as'outfile:'
''''json.dump(result,'outfile)
ACEC Members
</>
Python Tools
➜ lxml

➜ scrapy

➜ MechanicalSoup

➜ RoboBrowser

➜ pyQuery
Ruby Tools
➜ nokogiri

➜ Mechanize
Not coding? Scrape with:
➜ import.io

➜ Kimono

➜ copy & paste

➜ PDFTables

➜ Tabula
☁
page'='requests.get(url,'auth=('drseuss','
'hamsandwich'))
Basic Authentication
page'='requests.get(url,'verify=False)
Self-signed certificates
page'='requests.get(url,'verify='/etc/ssl/
certs.pem')
Specify Certificate Bundle
requests.exceptions.SSLError:5hostname5
'shrub.ca'5doesn't5match5either5of5
'www.arthurlaw.ca',5'arthurlaw.ca'5
$'pip'install'pyopenssl'
$'pip'install'ndgZhttpsclient'
$'pip'install'pyasn1
Server Name Indication (SNI)
UnicodeEncodeError:''ascii','u'Cornet,'
Alizxe9','12,'13,''ordinal'not'in'
range(128)''
Fix
myvar.encode("utfZ8")
Unicode
page'='requests.get(url)'
if'(page.status_code'>='400):'
''...'
else:'
''...
Server Errors
try:'
''''r'='requests.get(url)'
except'requests.exceptions.RequestException'as'e:'
''''print'e'
''''sys.exit(1)'
Exceptions
headers'='{'
'''''UserZAgent':''Mozilla/3000''
}'
response'='requests.get(url,'headers=headers)
Browser Disallowed
import'time'
for'i'in'range(0,'10):'
''url'='base_url'%'i'
''page'='requests.get(url)'
''time.sleep(1)
Rate Limiting/Slow Servers
requests.get("http://greeneggs.ham/",'params='
{'name':''sam',''verb':''are'})
Query String
requests.post("http://greeneggs.ham/",'data='
{'name':''sam',''verb':''are'})
POST a form
paritcipate
<strong>'
''<em>foo</strong>'
</em>
!
github.com/paulschreiber/nicar15
Many graphics from The Noun Project

Binoculars by Stephen West. Broken file by Maxi Koichi.

Broom by Anna Weiss. Chess by Matt Brooks.

Cube by Luis Rodrigues. Firewall by Yazmin Alanis.

Frown by Simple Icons. Lock by Edward Boatman.

Wrench by Tony Gines.

More Related Content

What's hot

Scrapy talk at DataPhilly
Scrapy talk at DataPhillyScrapy talk at DataPhilly
Scrapy talk at DataPhillyobdit
 
Assumptions: Check yo'self before you wreck yourself
Assumptions: Check yo'self before you wreck yourselfAssumptions: Check yo'self before you wreck yourself
Assumptions: Check yo'self before you wreck yourselfErin Shellman
 
Web scraping 1 2-3 with python + scrapy (Summer BarCampHK 2012 version)
Web scraping 1 2-3 with python + scrapy (Summer BarCampHK 2012 version)Web scraping 1 2-3 with python + scrapy (Summer BarCampHK 2012 version)
Web scraping 1 2-3 with python + scrapy (Summer BarCampHK 2012 version)Sammy Fung
 
Open Hack London - Introduction to YQL
Open Hack London - Introduction to YQLOpen Hack London - Introduction to YQL
Open Hack London - Introduction to YQLChristian Heilmann
 
Hd insight programming
Hd insight programmingHd insight programming
Hd insight programmingCasear Chu
 
Go Web Development
Go Web DevelopmentGo Web Development
Go Web DevelopmentCheng-Yi Yu
 
Django - 次の一歩 gumiStudy#3
Django - 次の一歩 gumiStudy#3Django - 次の一歩 gumiStudy#3
Django - 次の一歩 gumiStudy#3makoto tsuyuki
 
Cross Domain Web
Mashups with JQuery and Google App Engine
Cross Domain Web
Mashups with JQuery and Google App EngineCross Domain Web
Mashups with JQuery and Google App Engine
Cross Domain Web
Mashups with JQuery and Google App EngineAndy McKay
 
Undercover Pods / WP Functions
Undercover Pods / WP FunctionsUndercover Pods / WP Functions
Undercover Pods / WP Functionspodsframework
 
Grails 1.2 探検隊 -新たな聖杯をもとめて・・・-
Grails 1.2 探検隊 -新たな聖杯をもとめて・・・-Grails 1.2 探検隊 -新たな聖杯をもとめて・・・-
Grails 1.2 探検隊 -新たな聖杯をもとめて・・・-Tsuyoshi Yamamoto
 
RESTFUL SERVICES MADE EASY: THE EVE REST API FRAMEWORK - Nicola Iarocci - Co...
RESTFUL SERVICES MADE EASY: THE EVE REST API FRAMEWORK -  Nicola Iarocci - Co...RESTFUL SERVICES MADE EASY: THE EVE REST API FRAMEWORK -  Nicola Iarocci - Co...
RESTFUL SERVICES MADE EASY: THE EVE REST API FRAMEWORK - Nicola Iarocci - Co...Codemotion
 
Essential git fu for tech writers
Essential git fu for tech writersEssential git fu for tech writers
Essential git fu for tech writersGaurav Nelson
 
Introduction to the Pods JSON API
Introduction to the Pods JSON APIIntroduction to the Pods JSON API
Introduction to the Pods JSON APIpodsframework
 

What's hot (20)

Pydata-Python tools for webscraping
Pydata-Python tools for webscrapingPydata-Python tools for webscraping
Pydata-Python tools for webscraping
 
Scrapy talk at DataPhilly
Scrapy talk at DataPhillyScrapy talk at DataPhilly
Scrapy talk at DataPhilly
 
Selenium&amp;scrapy
Selenium&amp;scrapySelenium&amp;scrapy
Selenium&amp;scrapy
 
Assumptions: Check yo'self before you wreck yourself
Assumptions: Check yo'self before you wreck yourselfAssumptions: Check yo'self before you wreck yourself
Assumptions: Check yo'self before you wreck yourself
 
Webscraping with asyncio
Webscraping with asyncioWebscraping with asyncio
Webscraping with asyncio
 
Scrapy
ScrapyScrapy
Scrapy
 
Web scraping 1 2-3 with python + scrapy (Summer BarCampHK 2012 version)
Web scraping 1 2-3 with python + scrapy (Summer BarCampHK 2012 version)Web scraping 1 2-3 with python + scrapy (Summer BarCampHK 2012 version)
Web scraping 1 2-3 with python + scrapy (Summer BarCampHK 2012 version)
 
Django
DjangoDjango
Django
 
Routing @ Scuk.cz
Routing @ Scuk.czRouting @ Scuk.cz
Routing @ Scuk.cz
 
CouchDB Day NYC 2017: MapReduce Views
CouchDB Day NYC 2017: MapReduce ViewsCouchDB Day NYC 2017: MapReduce Views
CouchDB Day NYC 2017: MapReduce Views
 
Open Hack London - Introduction to YQL
Open Hack London - Introduction to YQLOpen Hack London - Introduction to YQL
Open Hack London - Introduction to YQL
 
Hd insight programming
Hd insight programmingHd insight programming
Hd insight programming
 
Go Web Development
Go Web DevelopmentGo Web Development
Go Web Development
 
Django - 次の一歩 gumiStudy#3
Django - 次の一歩 gumiStudy#3Django - 次の一歩 gumiStudy#3
Django - 次の一歩 gumiStudy#3
 
Cross Domain Web
Mashups with JQuery and Google App Engine
Cross Domain Web
Mashups with JQuery and Google App EngineCross Domain Web
Mashups with JQuery and Google App Engine
Cross Domain Web
Mashups with JQuery and Google App Engine
 
Undercover Pods / WP Functions
Undercover Pods / WP FunctionsUndercover Pods / WP Functions
Undercover Pods / WP Functions
 
Grails 1.2 探検隊 -新たな聖杯をもとめて・・・-
Grails 1.2 探検隊 -新たな聖杯をもとめて・・・-Grails 1.2 探検隊 -新たな聖杯をもとめて・・・-
Grails 1.2 探検隊 -新たな聖杯をもとめて・・・-
 
RESTFUL SERVICES MADE EASY: THE EVE REST API FRAMEWORK - Nicola Iarocci - Co...
RESTFUL SERVICES MADE EASY: THE EVE REST API FRAMEWORK -  Nicola Iarocci - Co...RESTFUL SERVICES MADE EASY: THE EVE REST API FRAMEWORK -  Nicola Iarocci - Co...
RESTFUL SERVICES MADE EASY: THE EVE REST API FRAMEWORK - Nicola Iarocci - Co...
 
Essential git fu for tech writers
Essential git fu for tech writersEssential git fu for tech writers
Essential git fu for tech writers
 
Introduction to the Pods JSON API
Introduction to the Pods JSON APIIntroduction to the Pods JSON API
Introduction to the Pods JSON API
 

Viewers also liked

No excuses user research
No excuses user researchNo excuses user research
No excuses user researchLily Dart
 
Some Advanced Remarketing Ideas
Some Advanced Remarketing IdeasSome Advanced Remarketing Ideas
Some Advanced Remarketing IdeasChris Thomas
 
The Science of Marketing Automation
The Science of Marketing AutomationThe Science of Marketing Automation
The Science of Marketing AutomationHubSpot
 
How to: Viral Marketing + Brand Storytelling
How to: Viral Marketing + Brand Storytelling How to: Viral Marketing + Brand Storytelling
How to: Viral Marketing + Brand Storytelling Elle Shelley
 
Stop Leaving Money on the Table! Optimizing your Site for Users and Revenue
Stop Leaving Money on the Table! Optimizing your Site for Users and RevenueStop Leaving Money on the Table! Optimizing your Site for Users and Revenue
Stop Leaving Money on the Table! Optimizing your Site for Users and RevenueJosh Patrice
 
How to Plug a Leaky Sales Funnel With Facebook Retargeting
How to Plug a Leaky Sales Funnel With Facebook RetargetingHow to Plug a Leaky Sales Funnel With Facebook Retargeting
How to Plug a Leaky Sales Funnel With Facebook RetargetingDigital Marketer
 
10 Ways You're Using AdWords Wrong and How to Correct Those Practices
10 Ways You're Using AdWords Wrong and How to Correct Those Practices 10 Ways You're Using AdWords Wrong and How to Correct Those Practices
10 Ways You're Using AdWords Wrong and How to Correct Those Practices Kissmetrics on SlideShare
 
Brenda Spoonemore - A biz dev playbook for startups: Why, when and how to do ...
Brenda Spoonemore - A biz dev playbook for startups: Why, when and how to do ...Brenda Spoonemore - A biz dev playbook for startups: Why, when and how to do ...
Brenda Spoonemore - A biz dev playbook for startups: Why, when and how to do ...GeekWire
 
10 Mobile Marketing Campaigns That Went Viral and Made Millions
10 Mobile Marketing Campaigns That Went Viral and Made Millions10 Mobile Marketing Campaigns That Went Viral and Made Millions
10 Mobile Marketing Campaigns That Went Viral and Made MillionsMark Fidelman
 
The Essentials of Community Building by Mack Fogelson
The Essentials of Community Building by Mack FogelsonThe Essentials of Community Building by Mack Fogelson
The Essentials of Community Building by Mack FogelsonMackenzie Fogelson
 
The Beginners Guide to Startup PR #startuppr
The Beginners Guide to Startup PR #startupprThe Beginners Guide to Startup PR #startuppr
The Beginners Guide to Startup PR #startupprOnboardly
 
Biz Dev 101 - An Interactive Workshop on How Deals Get Done
Biz Dev 101 - An Interactive Workshop on How Deals Get DoneBiz Dev 101 - An Interactive Workshop on How Deals Get Done
Biz Dev 101 - An Interactive Workshop on How Deals Get DoneScott Pollack
 
A Guide to User Research (for People Who Don't Like Talking to Other People)
A Guide to User Research (for People Who Don't Like Talking to Other People)A Guide to User Research (for People Who Don't Like Talking to Other People)
A Guide to User Research (for People Who Don't Like Talking to Other People)Stephanie Wills
 
The Science behind Viral marketing
The Science behind Viral marketingThe Science behind Viral marketing
The Science behind Viral marketingDavid Skok
 
Mastering Google Adwords In 30 Minutes
Mastering Google Adwords In 30 MinutesMastering Google Adwords In 30 Minutes
Mastering Google Adwords In 30 MinutesNik Cree
 
LinkedIn Ads Platform Master Class
LinkedIn Ads Platform Master ClassLinkedIn Ads Platform Master Class
LinkedIn Ads Platform Master ClassLinkedIn
 
Wireframes - a brief overview
Wireframes - a brief overviewWireframes - a brief overview
Wireframes - a brief overviewJenni Leder
 
Using Your Growth Model to Drive Smarter High Tempo Testing
Using Your Growth Model to Drive Smarter High Tempo TestingUsing Your Growth Model to Drive Smarter High Tempo Testing
Using Your Growth Model to Drive Smarter High Tempo TestingSean Ellis
 

Viewers also liked (20)

No excuses user research
No excuses user researchNo excuses user research
No excuses user research
 
Some Advanced Remarketing Ideas
Some Advanced Remarketing IdeasSome Advanced Remarketing Ideas
Some Advanced Remarketing Ideas
 
The Science of Marketing Automation
The Science of Marketing AutomationThe Science of Marketing Automation
The Science of Marketing Automation
 
How to: Viral Marketing + Brand Storytelling
How to: Viral Marketing + Brand Storytelling How to: Viral Marketing + Brand Storytelling
How to: Viral Marketing + Brand Storytelling
 
Intro to Mixpanel
Intro to MixpanelIntro to Mixpanel
Intro to Mixpanel
 
Stop Leaving Money on the Table! Optimizing your Site for Users and Revenue
Stop Leaving Money on the Table! Optimizing your Site for Users and RevenueStop Leaving Money on the Table! Optimizing your Site for Users and Revenue
Stop Leaving Money on the Table! Optimizing your Site for Users and Revenue
 
How to Plug a Leaky Sales Funnel With Facebook Retargeting
How to Plug a Leaky Sales Funnel With Facebook RetargetingHow to Plug a Leaky Sales Funnel With Facebook Retargeting
How to Plug a Leaky Sales Funnel With Facebook Retargeting
 
10 Ways You're Using AdWords Wrong and How to Correct Those Practices
10 Ways You're Using AdWords Wrong and How to Correct Those Practices 10 Ways You're Using AdWords Wrong and How to Correct Those Practices
10 Ways You're Using AdWords Wrong and How to Correct Those Practices
 
Brenda Spoonemore - A biz dev playbook for startups: Why, when and how to do ...
Brenda Spoonemore - A biz dev playbook for startups: Why, when and how to do ...Brenda Spoonemore - A biz dev playbook for startups: Why, when and how to do ...
Brenda Spoonemore - A biz dev playbook for startups: Why, when and how to do ...
 
10 Mobile Marketing Campaigns That Went Viral and Made Millions
10 Mobile Marketing Campaigns That Went Viral and Made Millions10 Mobile Marketing Campaigns That Went Viral and Made Millions
10 Mobile Marketing Campaigns That Went Viral and Made Millions
 
The Essentials of Community Building by Mack Fogelson
The Essentials of Community Building by Mack FogelsonThe Essentials of Community Building by Mack Fogelson
The Essentials of Community Building by Mack Fogelson
 
The Beginners Guide to Startup PR #startuppr
The Beginners Guide to Startup PR #startupprThe Beginners Guide to Startup PR #startuppr
The Beginners Guide to Startup PR #startuppr
 
HTML & CSS Masterclass
HTML & CSS MasterclassHTML & CSS Masterclass
HTML & CSS Masterclass
 
Biz Dev 101 - An Interactive Workshop on How Deals Get Done
Biz Dev 101 - An Interactive Workshop on How Deals Get DoneBiz Dev 101 - An Interactive Workshop on How Deals Get Done
Biz Dev 101 - An Interactive Workshop on How Deals Get Done
 
A Guide to User Research (for People Who Don't Like Talking to Other People)
A Guide to User Research (for People Who Don't Like Talking to Other People)A Guide to User Research (for People Who Don't Like Talking to Other People)
A Guide to User Research (for People Who Don't Like Talking to Other People)
 
The Science behind Viral marketing
The Science behind Viral marketingThe Science behind Viral marketing
The Science behind Viral marketing
 
Mastering Google Adwords In 30 Minutes
Mastering Google Adwords In 30 MinutesMastering Google Adwords In 30 Minutes
Mastering Google Adwords In 30 Minutes
 
LinkedIn Ads Platform Master Class
LinkedIn Ads Platform Master ClassLinkedIn Ads Platform Master Class
LinkedIn Ads Platform Master Class
 
Wireframes - a brief overview
Wireframes - a brief overviewWireframes - a brief overview
Wireframes - a brief overview
 
Using Your Growth Model to Drive Smarter High Tempo Testing
Using Your Growth Model to Drive Smarter High Tempo TestingUsing Your Growth Model to Drive Smarter High Tempo Testing
Using Your Growth Model to Drive Smarter High Tempo Testing
 

Similar to Web Scraping with Python

How I make a podcast website using serverless technology in 2023
How I make a podcast website using serverless technology in 2023How I make a podcast website using serverless technology in 2023
How I make a podcast website using serverless technology in 2023Shengyou Fan
 
Web Scraping In Ruby Utosc 2009.Key
Web Scraping In Ruby Utosc 2009.KeyWeb Scraping In Ruby Utosc 2009.Key
Web Scraping In Ruby Utosc 2009.Keyjtzemp
 
Web весна 2013 лекция 6
Web весна 2013 лекция 6Web весна 2013 лекция 6
Web весна 2013 лекция 6Technopark
 
Web осень 2012 лекция 6
Web осень 2012 лекция 6Web осень 2012 лекция 6
Web осень 2012 лекция 6Technopark
 
GDG İstanbul Şubat Etkinliği - Sunum
GDG İstanbul Şubat Etkinliği - SunumGDG İstanbul Şubat Etkinliği - Sunum
GDG İstanbul Şubat Etkinliği - SunumCüneyt Yeşilkaya
 
Codeigniter : Two Step View - Concept Implementation
Codeigniter : Two Step View - Concept ImplementationCodeigniter : Two Step View - Concept Implementation
Codeigniter : Two Step View - Concept ImplementationAbdul Malik Ikhsan
 
Let's read code: python-requests library
Let's read code: python-requests libraryLet's read code: python-requests library
Let's read code: python-requests librarySusan Tan
 
Mojolicious - A new hope
Mojolicious - A new hopeMojolicious - A new hope
Mojolicious - A new hopeMarcus Ramberg
 
Dev Jumpstart: Build Your First App with MongoDB
Dev Jumpstart: Build Your First App with MongoDBDev Jumpstart: Build Your First App with MongoDB
Dev Jumpstart: Build Your First App with MongoDBMongoDB
 
お題でGroovyプログラミング: Part A
お題でGroovyプログラミング: Part Aお題でGroovyプログラミング: Part A
お題でGroovyプログラミング: Part AKazuchika Sekiya
 
E3 appspresso hands on lab
E3 appspresso hands on labE3 appspresso hands on lab
E3 appspresso hands on labNAVER D2
 
E2 appspresso hands on lab
E2 appspresso hands on labE2 appspresso hands on lab
E2 appspresso hands on labNAVER D2
 
MongoDB + Java - Everything you need to know
MongoDB + Java - Everything you need to know MongoDB + Java - Everything you need to know
MongoDB + Java - Everything you need to know Norberto Leite
 
Mongo+java (1)
Mongo+java (1)Mongo+java (1)
Mongo+java (1)MongoDB
 
CodeIgniter PHP MVC Framework
CodeIgniter PHP MVC FrameworkCodeIgniter PHP MVC Framework
CodeIgniter PHP MVC FrameworkBo-Yi Wu
 
Great Developers Steal
Great Developers StealGreat Developers Steal
Great Developers StealBen Scofield
 
Effective iOS Network Programming Techniques
Effective iOS Network Programming TechniquesEffective iOS Network Programming Techniques
Effective iOS Network Programming TechniquesBen Scheirman
 
PesterSec: Using Pester & ScriptAnalyzer to Detect Obfuscated PowerShell
PesterSec: Using Pester & ScriptAnalyzer to Detect Obfuscated PowerShellPesterSec: Using Pester & ScriptAnalyzer to Detect Obfuscated PowerShell
PesterSec: Using Pester & ScriptAnalyzer to Detect Obfuscated PowerShellDaniel Bohannon
 

Similar to Web Scraping with Python (20)

How I make a podcast website using serverless technology in 2023
How I make a podcast website using serverless technology in 2023How I make a podcast website using serverless technology in 2023
How I make a podcast website using serverless technology in 2023
 
Web Scraping In Ruby Utosc 2009.Key
Web Scraping In Ruby Utosc 2009.KeyWeb Scraping In Ruby Utosc 2009.Key
Web Scraping In Ruby Utosc 2009.Key
 
Web весна 2013 лекция 6
Web весна 2013 лекция 6Web весна 2013 лекция 6
Web весна 2013 лекция 6
 
Web осень 2012 лекция 6
Web осень 2012 лекция 6Web осень 2012 лекция 6
Web осень 2012 лекция 6
 
ApacheCon 2005
ApacheCon 2005ApacheCon 2005
ApacheCon 2005
 
GDG İstanbul Şubat Etkinliği - Sunum
GDG İstanbul Şubat Etkinliği - SunumGDG İstanbul Şubat Etkinliği - Sunum
GDG İstanbul Şubat Etkinliği - Sunum
 
Codeigniter : Two Step View - Concept Implementation
Codeigniter : Two Step View - Concept ImplementationCodeigniter : Two Step View - Concept Implementation
Codeigniter : Two Step View - Concept Implementation
 
Let's read code: python-requests library
Let's read code: python-requests libraryLet's read code: python-requests library
Let's read code: python-requests library
 
Mojolicious - A new hope
Mojolicious - A new hopeMojolicious - A new hope
Mojolicious - A new hope
 
Dev Jumpstart: Build Your First App with MongoDB
Dev Jumpstart: Build Your First App with MongoDBDev Jumpstart: Build Your First App with MongoDB
Dev Jumpstart: Build Your First App with MongoDB
 
お題でGroovyプログラミング: Part A
お題でGroovyプログラミング: Part Aお題でGroovyプログラミング: Part A
お題でGroovyプログラミング: Part A
 
E3 appspresso hands on lab
E3 appspresso hands on labE3 appspresso hands on lab
E3 appspresso hands on lab
 
E2 appspresso hands on lab
E2 appspresso hands on labE2 appspresso hands on lab
E2 appspresso hands on lab
 
Web Scrapping Using Python
Web Scrapping Using PythonWeb Scrapping Using Python
Web Scrapping Using Python
 
MongoDB + Java - Everything you need to know
MongoDB + Java - Everything you need to know MongoDB + Java - Everything you need to know
MongoDB + Java - Everything you need to know
 
Mongo+java (1)
Mongo+java (1)Mongo+java (1)
Mongo+java (1)
 
CodeIgniter PHP MVC Framework
CodeIgniter PHP MVC FrameworkCodeIgniter PHP MVC Framework
CodeIgniter PHP MVC Framework
 
Great Developers Steal
Great Developers StealGreat Developers Steal
Great Developers Steal
 
Effective iOS Network Programming Techniques
Effective iOS Network Programming TechniquesEffective iOS Network Programming Techniques
Effective iOS Network Programming Techniques
 
PesterSec: Using Pester & ScriptAnalyzer to Detect Obfuscated PowerShell
PesterSec: Using Pester & ScriptAnalyzer to Detect Obfuscated PowerShellPesterSec: Using Pester & ScriptAnalyzer to Detect Obfuscated PowerShell
PesterSec: Using Pester & ScriptAnalyzer to Detect Obfuscated PowerShell
 

More from Paul Schreiber

Brooklyn Soloists: personal digital security
Brooklyn Soloists: personal digital securityBrooklyn Soloists: personal digital security
Brooklyn Soloists: personal digital securityPaul Schreiber
 
CreativeMornings FieldTrip: information security for creative folks
CreativeMornings FieldTrip: information security for creative folksCreativeMornings FieldTrip: information security for creative folks
CreativeMornings FieldTrip: information security for creative folksPaul Schreiber
 
WordCamp for Publishers: Security for Newsrooms
WordCamp for Publishers: Security for NewsroomsWordCamp for Publishers: Security for Newsrooms
WordCamp for Publishers: Security for NewsroomsPaul Schreiber
 
VIP Workshop: Effective Habits of Development Teams
VIP Workshop: Effective Habits of Development TeamsVIP Workshop: Effective Habits of Development Teams
VIP Workshop: Effective Habits of Development TeamsPaul Schreiber
 
WordPress NYC: Information Security
WordPress NYC: Information SecurityWordPress NYC: Information Security
WordPress NYC: Information SecurityPaul Schreiber
 
WPNYC: Moving your site to HTTPS
WPNYC: Moving your site to HTTPSWPNYC: Moving your site to HTTPS
WPNYC: Moving your site to HTTPSPaul Schreiber
 
NICAR delivering the news over HTTPS
NICAR delivering the news over HTTPSNICAR delivering the news over HTTPS
NICAR delivering the news over HTTPSPaul Schreiber
 
WordCamp US: Delivering the news over HTTPS
WordCamp US: Delivering the news over HTTPSWordCamp US: Delivering the news over HTTPS
WordCamp US: Delivering the news over HTTPSPaul Schreiber
 
BigWP: Delivering the news over HTTPS
BigWP: Delivering the news over HTTPSBigWP: Delivering the news over HTTPS
BigWP: Delivering the news over HTTPSPaul Schreiber
 
Delivering the news over HTTPS
Delivering the news over HTTPSDelivering the news over HTTPS
Delivering the news over HTTPSPaul Schreiber
 
D'oh! Avoid annoyances with Grunt.
D'oh! Avoid annoyances with Grunt.D'oh! Avoid annoyances with Grunt.
D'oh! Avoid annoyances with Grunt.Paul Schreiber
 
Getting to Consistency
Getting to ConsistencyGetting to Consistency
Getting to ConsistencyPaul Schreiber
 
EqualityCamp: Lessons learned from the Obama Campaign
EqualityCamp: Lessons learned from the Obama CampaignEqualityCamp: Lessons learned from the Obama Campaign
EqualityCamp: Lessons learned from the Obama CampaignPaul Schreiber
 

More from Paul Schreiber (18)

Brooklyn Soloists: personal digital security
Brooklyn Soloists: personal digital securityBrooklyn Soloists: personal digital security
Brooklyn Soloists: personal digital security
 
BigWP live blogs
BigWP live blogsBigWP live blogs
BigWP live blogs
 
CreativeMornings FieldTrip: information security for creative folks
CreativeMornings FieldTrip: information security for creative folksCreativeMornings FieldTrip: information security for creative folks
CreativeMornings FieldTrip: information security for creative folks
 
WordCamp for Publishers: Security for Newsrooms
WordCamp for Publishers: Security for NewsroomsWordCamp for Publishers: Security for Newsrooms
WordCamp for Publishers: Security for Newsrooms
 
VIP Workshop: Effective Habits of Development Teams
VIP Workshop: Effective Habits of Development TeamsVIP Workshop: Effective Habits of Development Teams
VIP Workshop: Effective Habits of Development Teams
 
BigWP Security Keys
BigWP Security KeysBigWP Security Keys
BigWP Security Keys
 
WordPress NYC: Information Security
WordPress NYC: Information SecurityWordPress NYC: Information Security
WordPress NYC: Information Security
 
WPNYC: Moving your site to HTTPS
WPNYC: Moving your site to HTTPSWPNYC: Moving your site to HTTPS
WPNYC: Moving your site to HTTPS
 
NICAR delivering the news over HTTPS
NICAR delivering the news over HTTPSNICAR delivering the news over HTTPS
NICAR delivering the news over HTTPS
 
WordCamp US: Delivering the news over HTTPS
WordCamp US: Delivering the news over HTTPSWordCamp US: Delivering the news over HTTPS
WordCamp US: Delivering the news over HTTPS
 
BigWP: Delivering the news over HTTPS
BigWP: Delivering the news over HTTPSBigWP: Delivering the news over HTTPS
BigWP: Delivering the news over HTTPS
 
Delivering the news over HTTPS
Delivering the news over HTTPSDelivering the news over HTTPS
Delivering the news over HTTPS
 
D'oh! Avoid annoyances with Grunt.
D'oh! Avoid annoyances with Grunt.D'oh! Avoid annoyances with Grunt.
D'oh! Avoid annoyances with Grunt.
 
Getting to Consistency
Getting to ConsistencyGetting to Consistency
Getting to Consistency
 
Junk Mail
Junk MailJunk Mail
Junk Mail
 
EqualityCamp: Lessons learned from the Obama Campaign
EqualityCamp: Lessons learned from the Obama CampaignEqualityCamp: Lessons learned from the Obama Campaign
EqualityCamp: Lessons learned from the Obama Campaign
 
Mac Productivity 101
Mac Productivity 101Mac Productivity 101
Mac Productivity 101
 
How NOT to rent a car
How NOT to rent a carHow NOT to rent a car
How NOT to rent a car
 

Recently uploaded

A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 

Recently uploaded (20)

A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 

Web Scraping with Python