2. TABLE OF CONTENTS
• Big Data.
• Definition & Characteristics.
• Companies That Dominate Big Data.
• Big Data and Other Technologies.
• Big Data and the UN.
• Big Data for Statistics.
• Big Data for Development.
• Big Data & Open Data.
• Big Data & the SDGs.
By Joud Khattab
3. QUESTIONS TO CONSIDER
• What is Big Data?
• What makes data “Big” Data?
• How can we manage very large amounts of data and extract value and
knowledge from them?
5. VOLUME
The size of data is subjective and depends on technology: today's big data may not seem
so big in a few years, when data analysis and computing technology improve.
[Diagram: the three Vs (Volume, Variety, Velocity)]
6. BIG DATA CHARACTERISTICS (3 V’S)
VOLUME
• Data volume is increasing exponentially:
• 90% of the world's data was created in the last two years.
• Growing by a factor of 55:
• 2009: 0.8 ZB.
• 2020: 44 ZB.
• One zettabyte (ZB) = 1 trillion gigabytes.
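The two figures on the volume slide can be checked with a few lines of arithmetic. This is a minimal sketch that assumes decimal (SI) units; the variable names are illustrative only.

```python
# Decimal (SI) storage units, expressed in bytes.
GB = 10**9
ZB = 10**21

# Figures quoted on the slide.
volume_2009_zb = 0.8
volume_2020_zb = 44

# Ratio between the two data volumes.
growth_factor = volume_2020_zb / volume_2009_zb

# How many gigabytes fit in one zettabyte.
gigabytes_per_zettabyte = ZB // GB

print(f"Growth factor 2009 -> 2020: {growth_factor:.0f}x")
print(f"1 ZB = {gigabytes_per_zettabyte:,} GB")
```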
7. BIG DATA CHARACTERISTICS (3 V’S)
VOLUME
• Every day, we create 2.5 exabytes of data, which is equivalent to:
• 530,000,000 million songs.
• 150,000,000 iPhones.
• 5 million laptops.
• 250,000 Libraries of Congress.
• 90 years of HD video.
• Scale: terabytes, then petabytes, then exabytes, then zettabytes.
• Example sources: 30 billion RFID tags today, 12+ TB of tweets, 25+ TB of log data,
and an unknown number of TB of other data.
8. VARIETY
This huge volume of data is generated in a variety of ways, such as
social media, cell phone GPS signals, digital media, and purchase transaction records.
9. BIG DATA CHARACTERISTICS (3 V’S)
VARIETY
• Various formats, types, and structures:
• Relational Data (Tables, Transactions, Legacy Data).
• Text Data (Web).
• Semi-structured Data (XML, JSON).
• Text, numerical, images, audio, video, sequences, time series, social media
data, multi-dim arrays, etc.
• A single application can generate or collect many types of data.
• To extract knowledge, all these types of data need to be linked together.
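The last point, linking different types of data together, can be illustrated with a small sketch. Everything here is hypothetical: the three sample sources, the `customer_id` key, and the `link_by_customer` helper are assumptions made for illustration, not part of the original deck.

```python
import csv
import io
import json

# Hypothetical relational data (a transactions table exported as CSV).
transactions_csv = """customer_id,amount
1,20.50
2,5.00
1,7.25
"""

# Hypothetical semi-structured data (JSON events from a mobile app).
events_json = (
    '[{"customer_id": 1, "event": "store_visit"},'
    ' {"customer_id": 2, "event": "app_open"}]'
)

# Hypothetical free-text data (customer reviews keyed by customer).
reviews = {1: "Great service, fast checkout.", 2: "App keeps crashing."}


def link_by_customer(transactions_csv, events_json, reviews):
    """Combine the three sources into one record per customer."""
    profiles = {}

    def profile(customer_id):
        # Create an empty profile the first time a customer is seen.
        return profiles.setdefault(
            customer_id, {"total_spent": 0.0, "events": [], "review": None}
        )

    for row in csv.DictReader(io.StringIO(transactions_csv)):
        profile(int(row["customer_id"]))["total_spent"] += float(row["amount"])

    for event in json.loads(events_json):
        profile(event["customer_id"])["events"].append(event["event"])

    for customer_id, text in reviews.items():
        profile(customer_id)["review"] = text

    return profiles


profiles = link_by_customer(transactions_csv, events_json, reviews)
for customer_id, record in sorted(profiles.items()):
    print(customer_id, record)
```

The shared key is what makes the link possible; real data sets often need record matching or cleaning before such a key exists.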
11. VELOCITY
Data is being generated fast and needs to be processed fast.
12. BIG DATA CHARACTERISTICS (3 V’S)
VELOCITY
• Late decisions mean missed opportunities.
• Examples:
• E-promotions: based on your current location, your purchase history, and what you
like, send promotions right now for the store next to you.
• Healthcare monitoring: sensors monitor your activities and body; any abnormal
measurements require an immediate reaction.
13. BIG DATA CHARACTERISTICS (3 V’S)
VELOCITY
DATA AT REST
• Late decisions mean missed opportunities.
• Examples:
• E-promotions: based on your current location, your purchase history, and what you
like, send promotions right now for the store next to you.
• Healthcare monitoring: sensors monitor your activities and body; any abnormal
measurements require an immediate reaction.
DATA IN MOTION
• Real-time and near-real-time analytics.
• Analyze data while it is being generated, keeping only the relevant information, to
avoid missing opportunities to improve business results.
• Examples:
• Learning why customers switch to competitors and their offers, in time to counter.
• Preventing fraud as it is occurring, and preventing more of it proactively.
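To make "data in motion" concrete, here is a minimal sketch of analyzing readings as they arrive, in the spirit of the healthcare monitoring example. The rolling-window z-score rule, the `detect_anomalies` name, the window size, the threshold, and the sample heart-rate values are all assumptions chosen for illustration.

```python
from collections import deque
from statistics import mean, stdev


def detect_anomalies(readings, window_size=5, threshold=3.0):
    """Yield (index, value) for readings far from the recent rolling mean."""
    window = deque(maxlen=window_size)
    for index, value in enumerate(readings):
        if len(window) == window_size:
            mu = mean(window)
            sigma = stdev(window)
            # Flag the reading immediately, while the stream is still flowing.
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield index, value
                continue  # keep the outlier out of the baseline window
        window.append(value)


# Hypothetical heart-rate stream with one abnormal measurement.
heart_rate_stream = [72, 74, 71, 73, 75, 72, 140, 74, 73]
alerts = list(detect_anomalies(heart_rate_stream))
print(alerts)
```

Only a small window of recent values is kept in memory, which is the point of processing data in motion rather than storing everything first.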
14. GENERATING/CONSUMING DATA MODEL HAS CHANGED
• Old Model:
• A few companies generate data; all others consume data.
• New Model:
• All of us generate data, and all of us consume data.
15. BUILDING BIG DATA SYSTEMS
It's simply about how to manage the storage and computation of data.
16. BUILDING BIG DATA SYSTEMS
HADOOP
• The Apache Hadoop project develops open-source software for reliable,
scalable, distributed computing.
• Hadoop has two main components:
• MapReduce (data processing; in Hadoop 1.x also cluster resource management, a role
that YARN took over in Hadoop 2).
• HDFS (redundant, reliable storage).
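The MapReduce processing model can be sketched in plain Python. This is not the Hadoop API: the `map_phase`, `shuffle_phase`, and `reduce_phase` functions are hypothetical names that simulate, on a single machine, the three steps a MapReduce job goes through when counting words.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(mapped_pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in mapped_pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the values for each key into one result."""
    return {key: sum(values) for key, values in grouped.items()}


# Two input splits; on a real cluster each would be mapped on a different node.
splits = ["big data is big", "data is the new oil"]
mapped = [pair for split in splits for pair in map_phase(split)]
word_counts = reduce_phase(shuffle_phase(mapped))
print(word_counts)
```

On a cluster, the map step runs on the nodes that already hold each block of the input in HDFS, and only the much smaller intermediate pairs travel over the network.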
17. BUILDING BIG DATA SYSTEMS
HADOOP
• Traditional approach: moving data to compute.
• Hadoop approach: moving compute to data.
18. CHALLENGES IN HANDLING BIG DATA
• The bottleneck is in technology:
• New architectures, algorithms, and techniques are needed.
• Also in technical skills:
• Experts in using the new technology and dealing with big data are needed.
• Only 11% of data is already used!
20. BIG DATA MARKET
The Big Data technology and services market is growing very fast:
about 6 times the growth rate of the overall ICT market.
• 2012: 6.2B.
• 2018: 48.3B.
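The two market figures imply an average yearly growth rate, which can be worked out with the standard compound annual growth rate formula. This is a rough sketch: the slide gives no currency, so the values are treated as plain numbers in billions, and the function name is illustrative.

```python
def compound_annual_growth_rate(start_value, end_value, years):
    """Return the constant yearly growth rate that links the two values."""
    return (end_value / start_value) ** (1 / years) - 1


# Market size figures quoted on the slide, in billions (currency not stated).
market_2012 = 6.2
market_2018 = 48.3

cagr = compound_annual_growth_rate(market_2012, market_2018, 2018 - 2012)
print(f"Implied compound annual growth rate: {cagr:.1%}")
```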
21. BIG DATA MARKET
Data science is the fourth paradigm of science, following theory, experiment,
and computational science.
[Figure: Industry Revolutions]
29. BIG DATA & UNITED NATIONS
• The UN uses Big Data as a transformative tool for official statistics.
• Potential to improve the accuracy and reduce the cost of official statistics.
• UN Global Working Group:
• Mandate: to "provide a strategic vision, direction, and a global program on big data
for official statistics, to promote practical use of sources of Big data for official
statistics, while finding solutions to their challenges, and to promote capacity
building and sharing of experiences in this respect."
31. BIG DATA VS NSO
What does big data mean for Official Statistics?
32. BIG DATA VS NSO
• Change of paradigm:
• From finite population sampling methodology to additional statistical modeling and
machine learning.
• From designers of data collection processes to designers of statistical products.
33. POTENTIAL OF BIG DATA
• Three ways in which big data can directly or indirectly benefit
macroeconomic and financial statistics and, ultimately, policymaking:
1. By answering new questions and producing new indicators.
2. By bridging time lags in the availability of official statistics and supporting the
timelier forecasting of existing indicators.
3. By providing an innovative data source in the production of official statistics.
35. BIG DATA FOR DEVELOPMENT
A concept that refers to the identification of sources of Big Data relevant to the policy and
planning of development programmes. It differs from both “traditional” development data
and what the private sector and mainstream media call Big Data.
36. BIG DATA FOR DEVELOPMENT
• If properly mined and analyzed, Big Data can improve the understanding of
human behavior and offer policymaking support for global development in
three main ways:
• Early Warning: early detection of anomalies can enable faster responses to
populations in times of crisis.
• Real-Time Awareness: a fine-grained representation of reality through Big Data can
inform the design and targeting of programs and policies.
• Real-Time Feedback: adjustments can be made possible by real-time monitoring of
the impact of policies and programs.
37. WHAT CAN WE USE BIG DATA FOR?
• Foster Decision Making and Accountability
• Where are the funds going?
• Is funding going to the right places?
• Monitoring & Evaluation
• What changes occurred over time?
• Did the intervention cause the change?
• What other factors might have led to the outcome?
38. BIG DATA & OPEN DATA
Open Data refers to data that is free from copyright and can be shared in the public
domain. That is not a defining characteristic of Big Data, which can be privately owned or
have varying levels of access control.
39. BIG DATA & OPEN DATA
• In the context of policy making, it is worth elaborating on the interface
between big data and the new phenomenon of “open data”.
• They are closely related but are not the same.
• Open data brings a perspective that can make big data more useful, more democratic, and
less threatening.
• While big data is defined by size, open data is defined by its use.
• All definitions of open data include two basic features:
• The data must be publicly available for anyone to use, and it must be licensed in a way
that allows for its reuse.
• Open data should also be relatively easy to use, although there are gradations of
"openness".
40. DATA PHILANTHROPY
• The public sector cannot fully exploit Big Data without leadership from
the private sector. With this in mind, the concept of “Data Philanthropy”
has emerged as a partnership by which private sector companies share
data for public benefit, taking the initiative to anonymize their data sets
and provide them to social innovators to mine for real-time insights,
patterns and trends.
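As a rough illustration of one preparation step before sharing, the sketch below replaces a direct identifier with a salted hash token. This is pseudonymization, which is weaker than full anonymization, since the remaining fields can still allow re-identification. The `pseudonymize` helper, the field names, and the sample call records are hypothetical.

```python
import hashlib
import secrets


def pseudonymize(records, id_field, salt):
    """Replace the direct identifier in each record with a salted hash token."""
    shared = []
    for record in records:
        digest = hashlib.sha256(salt + str(record[id_field]).encode("utf-8"))
        # Copy every field except the direct identifier.
        cleaned = {key: value for key, value in record.items() if key != id_field}
        cleaned["subject_token"] = digest.hexdigest()[:16]
        shared.append(cleaned)
    return shared


# The salt stays with the data owner and is never shared.
salt = secrets.token_bytes(16)

# Hypothetical call records held by a private company.
call_records = [
    {"phone_number": "555-0101", "cell_tower": "A", "hour": 9},
    {"phone_number": "555-0101", "cell_tower": "B", "hour": 18},
    {"phone_number": "555-0199", "cell_tower": "A", "hour": 9},
]
shared_records = pseudonymize(call_records, "phone_number", salt)
print(shared_records)
```

Records from the same subscriber keep the same token, so patterns and trends remain visible to researchers without exposing the phone number itself.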
42. CLASSIFICATION OF DATA
• Big data that is not open is not democratic:
• Section one of the diagram includes all kinds of big data that is kept from
the public – like the data that large retailers hold on their customers, or
national security data. This kind of big data gives an advantage to the
people who control it.
• Open data does not have to be big data to matter:
• Modest amounts of data, as shown in section four, can have a big impact
when it is made public.
• Data from local governments, for example, can help citizens participate in
local budgeting, choose healthcare, analyze the quality of local services, or
build apps that help people navigate public transport.
• Big, open data doesn't have to come from government:
• This is shown in section three. More and more scientists are sharing their
research in a new, collaborative research model. Other researchers are
using big data collected from social media – most of which is open to the
public – to analyze public opinion and market trends.
43. OPEN DATA PORTALS IN THE ARAB WORLD
• Open Governmental Data in the Arab World:
• Saudi Arabia, Bahrain, UAE, Oman, Tunisia, Algeria, Morocco, Jordan, Qatar.
• What are the differences between Arab and global portals?
44. BIG DATA & THE SDGS
How data science and analytics can contribute to sustainable development
45. WHY USE BIG DATA FOR THE SDGS?
• Scarcer financial resources.
• Need to target interventions where most needed.
• Greater demand for transparency and country ownership.
• Monitoring of progress.
• Need for an objective evidence base for decision-making.
47. BIG DATA, SDGS & UN
• Chaired by the World Bank and INEGI.
• 7 international agencies and companies:
• WEF, Orange, ODI, Data-Pop Alliance, NASA, PARIS21, Positium.
• 6 United Nations agencies:
• UNSD, UNECE, UNESCAP, ITU, Global Pulse, UN Department of Economic and
Social Affairs.
• 3 universities:
• University of Pennsylvania, MIT, Harvard.
• Colombia’s National Administrative Department of Statistics.
48. PLAN OF ACTIONS
• Survey to identify which of the 169 SDG targets could use Big Data.
• Propose Big Data-specific indicators related to the SDG targets
(which may differ from the current set of indicators based on traditional sources of
data).
• Make an inventory of past and ongoing research work on Big Data and identify
the work that could be used to measure progress on one or more SDG targets.
• Pilot research in 1-2 countries on calculating 2-3 SDG indicators using Big Data.
• Presentation at the Big Data Conference in the UAE.
• Write the report of the Working Group.
49. BIG DATA & SDGS
• Data are now recognized as central to achieving the 2030 sustainable
development agenda, because effective public policy requires quality data. This
is now being framed as Data for Development (D4D).
• The Data Revolution and Big Data can be thought of as new areas and new
sources where NSOs can play a big role by integrating them into the national
statistical system and mainstreaming them into official statistics, in order to
provide data support for a comprehensive monitoring process.
50. DATA IS THE NEW OIL
In its raw form, oil has little value.
Once processed and refined, it helps power the world.