4. WE FOCUS ON EVERY ASPECT OF DATA R&D
• Data capture and consumption
• Communications and networking
• Infrastructure
• Hardware and software
• Cybersecurity
• Data statistics, modelling and analytics
• Decision sciences
• Behavioural economics and cognitive sciences
5. Architecture & Analytics Platform
• Blockchain Research
• Designing Systems with Blockchain
• Trustworthiness of Blockchain
• Defining and Using Smart Contracts
• Process-oriented Dependability
• Process mining
• Log analysis
• Error Detection
• Data Analytics Platform
5 | Blockchain | Qinghua Lu
6. Let’s Start!
• Hype
• Elements of a blockchain
• Blockchain in Depth
• Software Architectural aspects of Blockchain
• Case Studies
7. Hype
• Gartner hype cycle
• Potential applications
• Cryptocurrencies and current landscape
12. Payment System
Traditional trusted environment Blockchain trustless environment
Bitcoin
13. Identity Management
Traditional trusted environment:
• Trusted identity provider
• Federated identity
Blockchain trustless environment:
• Identity maintained by the user
– Name, date-of-birth, email
– Social media profiles
• Attestations
– Verified by “trusted” participants (government, bank)
– Reputational attestation by other participants (social media)
14. Asset Registry and Provenance
Traditional trusted environment:
• Centralized database
Blockchain trustless environment:
• Register assets on blockchain
– Proof of existence
– Land, financial assets, digital assets...
– Associated key information verified by “trusted” third party
• Transactions provide the evidence of provenance
– Ownership transfer
15. Trading Market
Traditional trusted environment Blockchain trustless environment
• Register assets on blockchain
• Smart contracts enable negotiation and
escrow
• Rely on participants to resolve disputes
Slur.io
16. Other Potential Use Cases
• Financial Services
– Digital currency
– (International) payment
– Reconciliation
– Settlement
– Markets
– Trade finance
• Government Services
– Registry & Identity
– Grants & Social security
– Quota management
– Taxation
• Enterprise and Industry
– Supply chain
– IoT
– Digital rights and IP
– Data management
– Attestation
– Inter-divisional accounting
– Corporate Affairs
23. Blockchain Landscape
[Figure: blockchain landscape, up to July 2016]
• Infrastructure: permissionless public blockchain, permissioned public blockchain, private blockchain infrastructure
• Add-on: smart contracts, coloured coin, side-chains
• Blockchain tool provider: application development, Blockchain-as-a-Service
• Applications (blockchain solutions by application sector): supply chain, identity management, legal contract codification, IoT data sharing and analysis, financial clearing and settlement, asset management
24. Elements of a blockchain
• Contract
• Immutability
• Cryptography
25. Enough about cryptocurrencies: what about blockchains?
• Cryptocurrencies are based on blockchains - blockchains are the
underlying technology.
• A blockchain has three elements
1. A contract. Every blockchain is based on one or more contracts.
If more than one, the contracts should be logically connected.
2. An immutable history of valid transactions within the contract.
3. A cryptographic encoding of the contract.
26. Element 1: blockchain contract
• A contract in a blockchain is a specification of how
individuals or entities can interact with the
blockchain and their obligations and rewards from
this interaction.
• Contracts can be
• explicit or implicit
• hard coded into the platform or programmable
• Smart contracts are programs, written in a language the
platform provides, that specify the contract.
27. Bitcoin contract
• Hard coded
• Key concepts:
• Accounts – an individual may have one or more accounts
within the bitcoin network
• Spending – an individual owning an account can transfer
bitcoins to another account
• Mining – a new bitcoin can be created by finding a valid new
block (complicated process)
• Consensus validation. A transaction is not “accepted” until it
has been validated by participants in the network.
28. Ethereum
• The Ethereum platform supports a Turing-complete
programming language
• You write a “smart contract” in this programming
language that specifies the rules of your contract.
• There are other rules for using the platform that
govern costs and fees.
• The contract for Ethereum users is similar to
Bitcoin’s.
29. Hyperledger Fabric
• Open source platform supported by IBM
• Contracts are written in Chaincode
• “Chaincode is a program, written in Go, node.js, and eventually in
other programming languages such as Java, that implements a
prescribed interface. Chaincode runs in a secured Docker container
isolated from the endorsing peer process.”
-hyperledger-fabric.readthedocs.io/en/release-1.1/chaincode.htm
• Turing complete
• Plug-ins exist for different contract variants (described later)
Blockchain | Xiwei (Sherry) Xu29 |
30. Element 2: An immutable history
• Immutable public ledger
• Time stamped transactions
• Audit trail of what happened
• Distributed and replicated
31. How is immutability achieved?
• Immutability rests on two foundations
• Hash functions
• Difficulty of modifying blockchain
32. Hash Function
• Hash functions
• Takes any string as input
• Fixed-size output (for example, 256 bits)
• Efficiently computable
• Security properties
• Collision-resistant: it should be infeasible to find two different inputs x and y with H(x) = H(y)
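The hash function properties listed on this slide can be observed with a few lines of Python. This is a minimal sketch, assuming SHA-256 from the standard `hashlib` module as the hash function; the helper name `sha256_hex` is made up for illustration.

```python
import hashlib


def sha256_hex(message: str) -> str:
    """Return the SHA-256 digest of a text message as 64 hex characters."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


short_digest = sha256_hex("a")
long_digest = sha256_hex("a" * 10_000)
similar_digest = sha256_hex("b")

# Every digest is 256 bits (64 hex characters), whatever the input length.
print(len(short_digest), len(long_digest))

# A different input gives a different digest; finding two inputs with the
# same digest (a collision) is believed to be infeasible for SHA-256.
print(short_digest != similar_digest)
```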
33. Blockchain
[Figure: a chain of blocks starting from the genesis block. Each block holds its transactions and H( ) of the previous block.]
• Linked list with hash pointers
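A linked list with hash pointers can be sketched in a few lines. The block layout (`prev_hash`, `transactions`) and the function names below are assumptions made for illustration, not the format of any real blockchain. The point is only that changing an old block breaks the pointer stored in the block after it.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's contents, including its pointer to the previous block."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_block(chain: list, transactions: list) -> None:
    """Append a block that stores the hash of the current last block."""
    prev = block_hash(chain[-1]) if chain else "0" * 64  # genesis has no parent
    chain.append({"prev_hash": prev, "transactions": transactions})


def is_consistent(chain: list) -> bool:
    """Check that every block's prev_hash matches the block before it."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = []
append_block(chain, ["genesis"])
append_block(chain, ["alice pays bob 5"])
append_block(chain, ["bob pays carol 2"])
print(is_consistent(chain))  # True

chain[1]["transactions"] = ["alice pays bob 500"]  # tamper with history
print(is_consistent(chain))  # False
```

To hide the change, an attacker would have to recompute every later block as well, which is where the trust model on the following slides comes in.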
35. What prevents someone from changing the whole blockchain?
• It depends on the trust model (discussed in more
detail later)
• Consensus model. Each block must satisfy mining
constraints – computationally very difficult.
• Superuser model. Each block is signed by one of the special
users. Rules exist to prevent rogue special users from
modifying data.
36. Element 3: Cryptography
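[Figure: the contract is input to a key generator, which outputs a prover key and a verifier key. The prover (“I have complied with all elements of the contract under my control”) produces a compliance token. The verifier checks the compliance token and concludes that the prover has (or has not) complied.]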
37. Comments on previous slide
• Everything is public to users of the blockchain except the internals
of the key generator
• The figure might be slightly different when a different theoretical basis
is used. This figure is based on Quadratic Span Programs
• All details of the prover’s assertion, such as who/what/when, can be
kept secret.
• There may be elements of the contract outside of the prover’s
control. E.g. Bitcoin consensus process
38. Building blocks of cryptography with Zerocash example
• NP completeness
• Zero Knowledge proofs
• Quadratic Span Programs
• Very complicated – you don’t need to understand
cryptographic portions of blockchain to understand
uses of blockchain
39. NP completeness
• An NP-complete problem is one for which no polynomial-time
method of finding a solution is known.
• A proposed solution can, however, be verified in
polynomial time.
• The Boolean satisfiability problem is NP complete.
• Given a statement in Boolean logic, finding an assignment of
True or False to the variables in that statement that makes
the statement evaluate to True is computationally difficult.
• Verifying that a particular assignment evaluates to True is
computationally easy.
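The gap between finding and verifying can be seen on a tiny instance. This is a sketch only: the three-variable formula is a made-up example, and the search is plain brute force over all 2^n assignments, which is exactly what stops scaling as n grows.

```python
from itertools import product


def formula(a: bool, b: bool, c: bool) -> bool:
    """Hypothetical example: (a or b) and (not a or c) and (not b or not c)."""
    return (a or b) and ((not a) or c) and ((not b) or (not c))


def verify(assignment: tuple) -> bool:
    """Verifying is cheap: evaluate the formula once."""
    return formula(*assignment)


def find_satisfying(num_vars: int = 3):
    """Finding is expensive: try up to 2**num_vars assignments."""
    for assignment in product([False, True], repeat=num_vars):
        if verify(assignment):
            return assignment
    return None


solution = find_satisfying()
print(solution)          # (False, True, False)
print(verify(solution))  # True
```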
41. Start by choosing p and g
• p is a large prime
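• g is a base, chosen as a primitive root modulo p (its powers produce every nonzero value mod p)
• Computing y = g^x (mod p) is easy given g, x, and p
• But finding x when given y, g, and p (the discrete logarithm problem) is believed to be computationally infeasible for large p. It is not known to be NP-complete.
• So y, g, and p can all be made public without
revealing x.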
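The asymmetry between the two directions can be sketched with Python's built-in three-argument `pow`. The numbers are toy values chosen for illustration (p = 2^31 - 1 with base 7, far too small for real use), and `brute_force_log` is a hypothetical helper that simply tries exponents one by one.

```python
p = 2_147_483_647  # a prime (2**31 - 1); toy size, not secure
g = 7              # a primitive root modulo p
x = 123_456_789    # the secret exponent

# Easy direction: fast modular exponentiation.
y = pow(g, x, p)


def brute_force_log(y: int, g: int, p: int, limit: int):
    """Hard direction: try exponents 0, 1, 2, ... and give up after `limit`."""
    value = 1
    for candidate in range(limit):
        if value == y:
            return candidate
        value = (value * g) % p
    return None


# With a tiny modulus the search succeeds at once...
print(brute_force_log(pow(5, 13, 23), 5, 23, 23))  # 13
# ...but 1000 tries get nowhere near the secret exponent above.
print(brute_force_log(y, g, p, 1000))              # None
```

Better algorithms than brute force exist, but none is known that runs in polynomial time for well-chosen large p.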
42. Using p and g to implement Diffie-Hellman
• Alice (or Bob) selects p and g. These are publicly
shared.
• Alice and Bob each select a random number, r_A and r_B.
They keep them secret (even from each other)
• Alice computes m_A = g^{r_A} (mod p) and sends it to Bob
• Bob computes m_B = g^{r_B} (mod p) and sends it to Alice
• Bob computes s = (m_A)^{r_B} = (g^{r_A})^{r_B} (mod p), i.e. m_A raised to his
secret number. Alice does the similar computation s = (m_B)^{r_A} (mod p).
• Because exponents commute, (g^{r_A})^{r_B} = (g^{r_B})^{r_A}, they
both arrive at the same s, but an eavesdropper who sees only m_A and m_B
cannot feasibly derive s. Furthermore, r_A and r_B are kept secret.
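The exchange above can be traced step by step in Python. This is a sketch with toy parameters (the same illustrative p and g as an assumption, far too small for real use); the variable names mirror the slide's r_A, r_B, m_A, m_B and s.

```python
import secrets

p = 2_147_483_647  # public prime, toy size
g = 7              # public base

# Each side picks a secret exponent in [2, p - 2].
r_a = secrets.randbelow(p - 3) + 2  # Alice's secret
r_b = secrets.randbelow(p - 3) + 2  # Bob's secret

m_a = pow(g, r_a, p)  # Alice sends this to Bob
m_b = pow(g, r_b, p)  # Bob sends this to Alice

s_bob = pow(m_a, r_b, p)    # Bob raises Alice's message to his secret
s_alice = pow(m_b, r_a, p)  # Alice raises Bob's message to her secret

# Both sides now hold the same shared value.
print(s_alice == s_bob)  # True
```

Only p, g, m_a and m_b cross the network; recovering s from those values is the problem an eavesdropper would have to solve.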
43. Moving to Zero Knowledge (ZK) proofs
• Alice generates two keys using techniques similar to
Diffie Hellman. One she keeps secret and one she
shares with Bob.
• Alice claims knowledge of some fact and encodes
that knowledge using her secret key. She shares the
encoding with Bob.
• Bob uses the shared key to verify the truth of the
fact that Alice is claiming.
44. Security
• It is not possible for Bob to derive Alice’s fact from
their communication without either solving a problem believed
to be computationally hard, such as factoring large numbers
(which is not known to be NP-complete), or knowing
Alice’s secret key.
• That is, Alice can convince Bob that she knows the
fact in question without divulging the fact: a zero
knowledge proof.
• More importantly, there is an efficient way to
generate zero knowledge proofs for any NP-complete
problem.
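For a concrete feel of "convincing without divulging", here is a sketch of the Schnorr identification protocol, a standard textbook example that is swapped in here and is not the construction used on these slides. The fact Alice proves is that she knows the secret exponent x behind her public value y = g^x (mod p). The parameters p = 23, q = 11, g = 2 are toy assumptions, and the zero knowledge property holds for an honest verifier.

```python
import secrets

# Toy public parameters: g has prime order q modulo the prime p.
p, q, g = 23, 11, 2


def keygen():
    """Alice's secret x and the public value y = g^x (mod p)."""
    x = secrets.randbelow(q - 1) + 1
    return x, pow(g, x, p)


def commit():
    """Alice picks a fresh random r and sends the commitment t = g^r (mod p)."""
    r = secrets.randbelow(q)
    return r, pow(g, r, p)


def respond(x: int, r: int, c: int) -> int:
    """Alice answers Bob's challenge c without revealing x."""
    return (r + c * x) % q


def verify(y: int, t: int, c: int, s: int) -> bool:
    """Bob checks g^s == t * y^c (mod p) using public values only."""
    return pow(g, s, p) == (t * pow(y, c, p)) % p


x, y = keygen()
r, t = commit()
c = secrets.randbelow(q)  # Bob's random challenge
s = respond(x, r, c)
accepted = verify(y, t, c, s)
print(accepted)  # True
```

Bob sees only t, c and s. Because r is random and used once, s on its own does not reveal x.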
45. Quadratic Span programs
• Roles:
• Prover (Alice in prior examples).
• Verifier (Bob).
• Generating prover/verifier tokens
• Input a Boolean expression into the key generator. The output is
two publicly available tokens: one for the prover, one for
the verifier
• Proof/verification
• Prover uses prover token to assert a particular string is true
for the Boolean expression.
• Verifier uses verifier token to check prover’s assertion
46. Cheating
• Prover cannot cheat because then their assertion
cannot be verified.
• Verifier cannot cheat because verifier token is
public, verifier algorithm is public, and assertion is
public. Anyone can verify if assertion is true.
• Tokens are based on secret choices. These choices
must be kept secret.
47. How do quadratic span programs work?
• Key idea – for any Boolean expression, one can generate a
function that will determine whether an input satisfies that
expression.
• Function can work in linear time
• Function does not find satisfying input, only verifies if a given input
satisfies
• Function is derived by generating vectors that span possible
inputs and modeling each gate in Boolean circuit associated
with Boolean expression as a vector.
• Two different functions – one for input wires (for proving)
and one for internal wires (for verification)
48. Applying encryption
• The Boolean satisfiability problem is NP-complete, so there
exist zero knowledge proofs to determine validity. This
is a part of creating a quadratic span program.
• A key generator creates the tokens based on choice of
large primes. These primes must be kept secret but are
only used in the generation phase.
• Prover asserts particular string is true. Verifier uses
verifier token to verify but because quadratic span
proofs and verification are zero knowledge, no
information about the string is available to the Verifier
(or anyone else).
49. Zerocash
• Zerocash (Zcash) is a cryptocurrency (like Bitcoin)
such that transactions can be verified without
disclosing any information about the person
initiating the transaction or the recipient of the
transaction.
50. What is a transaction?
• A transaction describes the transfer of money from
one account to another. It has the following items:
• From account
• To account
• Amount of money to transfer from sender to recipient
• Verification that the from account has sufficient funds
• Verification that the person initiating the transaction owns
the from account
51. Pieces of zerocash
• Setup – create public tokens
• CreateAddress – open an account
• Mint – create a new zerocash coin
• Verify transaction – verify that a reported
transaction satisfies conditions
• Pour – spend coins
• Receive – add spent coins to your account
52. Zerocash properties
• Relies on the public tokens
• Relies on an immutable public ledger
• Keeps private (encrypted) all of the elements of the
transaction
• Account identifiers – from and to
• Owners of accounts
• Amount held in accounts
• Amount of transaction
53. Blockchain In-Depth
• Trust model
• Mining and distributed consensus
• Network considerations
• Variants
• Current reality
54. Trust model
• Some blockchains have no central authority
and rely on consensus mechanisms within
the network of users – e.g. Bitcoin,
Ethereum. A form of crowdsourcing
• Other blockchains have users with special
privileges – e.g. one option in Hyperledger
Fabric. We will call a user with special
privileges a “Superuser”
55. Who do you trust?
[Figure: trust arrangements between Organization 1 and Organization 2: a centralized trusted authority, and a blockchain network that is either superuser-based or crowdsourced.]
56. Mining Process in Bitcoin, Ethereum
• Receiving a new block
– End of one round of a competition is the beginning of the next round
– Remove the transactions of the new block from transaction pool
• Aggregation
– Aggregate remaining valid transactions
• Header construction
– Calculate the hash of the previous block
– Construct a Merkle tree to summarize all the included transactions
• Solving puzzle
– Proof-of-work
• Propagation
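The Merkle tree used in header construction can be sketched as repeated pairwise hashing. This is an illustration under stated assumptions: a single SHA-256 per node (Bitcoin applies SHA-256 twice), transactions represented as plain strings, and the last hash duplicated when a level has an odd number of nodes. The names `h` and `merkle_root` are made up for this sketch.

```python
import hashlib


def h(data: bytes) -> bytes:
    """One SHA-256 pass (an assumption; Bitcoin uses double SHA-256)."""
    return hashlib.sha256(data).digest()


def merkle_root(transactions: list) -> str:
    """Fold a list of transaction strings into a single root hash."""
    if not transactions:
        return h(b"").hex()
    level = [h(tx.encode("utf-8")) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last hash on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0].hex()


root = merkle_root(["alice pays bob 5", "bob pays carol 2", "carol pays dan 1"])
print(len(root))  # 64 hex characters summarize all included transactions
```

Changing any single transaction changes the root, so the block header only needs to carry the root to commit to the whole set.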
57. Proof-of-work
• Nodes compete for right to write blockchain
• Solve a hash puzzle
[Figure: a block containing a nonce, H( ) of the previous block, and its transactions. The 256-bit output space of the hash contains a small target space of values beginning 00000…]
• Puzzle: find a nonce such that H(nonce || H( ) of previous block || Tx || … || Tx) is very small, i.e. falls in the target space
• If the hash function is secure, the only way to succeed is to try enough
values until you get lucky
• Prob(winning next block) = fraction of global hash power the miner controls
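The hash puzzle can be tried out at a very low difficulty. This is a sketch, assuming SHA-256 and a simple string concatenation of nonce, previous hash and transactions; real block headers have a fixed binary layout, and `difficulty_bits` is a made-up parameter standing for the number of leading zero bits required.

```python
import hashlib


def mine(prev_hash: str, transactions: list, difficulty_bits: int):
    """Try nonces one by one until the block hash falls below the target."""
    target = 1 << (256 - difficulty_bits)  # hashes below this value win
    payload = prev_hash + "".join(transactions)
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{nonce}{payload}".encode("utf-8")).hexdigest()
        if int(digest, 16) < target:
            return nonce, digest
        nonce += 1


# 12 leading zero bits means about 4096 attempts on average.
nonce, digest = mine("0" * 64, ["alice pays bob 5"], difficulty_bits=12)
print(digest[:3])  # 000
```

Each extra bit of difficulty doubles the expected number of attempts, while checking a claimed nonce always costs a single hash.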
58. Select Random Node as Writer
• Sybil attack
• Reputation system is subverted by forging identities in a peer-to-peer network
• An adversary controls multiple nodes on a network
• Selecting nodes in proportion to a resource that no one can
monopolize
• Proof-of-work – In proportion to computing power
• Proof-of-stake – In proportion to ownership
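Selection in proportion to a resource can be simulated with weighted random choice. This is a sketch only: the node names and resource shares are invented, and `random.choices` stands in for whatever mechanism (hash power, stake) a real network uses.

```python
import random
from collections import Counter


def pick_writer(resources: dict, rng: random.Random) -> str:
    """Pick one node with probability proportional to its resource share."""
    nodes = list(resources)
    weights = [resources[node] for node in nodes]
    return rng.choices(nodes, weights=weights, k=1)[0]


resources = {"A": 50, "B": 30, "C": 20}  # hypothetical shares of the resource
rng = random.Random(0)
wins = Counter(pick_writer(resources, rng) for _ in range(10_000))
print(wins.most_common())  # roughly 50% / 30% / 20% of the rounds
```

Splitting node C into many fake identities would not help a Sybil attacker here, because the identities would still share the same total weight of 20.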
59. Alternatives
• Proof-of-work
• Waste of electricity
• Electricity consumption is close to that of Turkmenistan
• Proof-of-stake
• Ownership of a certain amount of currency and randomization
• Delegated proof-of-stake
• Proof-of-retrievability (Permacoin)
• In proportion to distributed storage of archival data
• BFT (Byzantine Fault Tolerance)
• Stronger consistency guarantee, lower latency, smaller number of nodes
60. Distributed Consensus
• Reliability in distributed systems
• Nodes may crash
• Nodes may be malicious
• Network is imperfect
• Definition
• All correct nodes decide on the same value
• This value must be proposed by a correct node
• Blockchain network
• Nodes have a sequence of blocks of transactions they’ve reached consensus on
• Each node has a set of outstanding transactions
61. Blockchain Consensus
• Each node collects new transactions into a block
• In each round, a random node gets to broadcast its block
• Other nodes accept the block only if all transactions in it are valid
• Other nodes implicitly accept/reject the block
• Extending it
• Ignoring it and extending chain from an earlier block
62. Transaction Life Cycle
[Figure: transaction life cycle. States: Creation, Authorization, Broadcast, Propagation, Pending, Included, Committed, Dropped, Rejected. Labels on the diagram: local transaction handling, gossip protocol, consensus protocol, “included Tx outdated”.]
63. Blockchain Network
• Network
– Gossip protocol for propagation
– Consensus protocol for agreement; depends on trust model
• Every node hosts a replica
– Good availability
– Efficient reading
64. Peer-to-Peer Network
• Propagation
• New transactions
• New blocks
• Eclipse Attack
• Isolated from the network
• Network is unreliable
• Security comes from blockchain data structure
• Security comes from consensus protocol
66. Public / Consortium / Private
• Anyone can use a public blockchain network
• Public Bitcoin or Public Ethereum
• Public networks have incentives for people to join and fees for use
• Better transparency and auditability, but poor performance
• Consortium blockchain is used across organizations
• Controlled by pre-authorized nodes
• Private blockchain is within a single organization
• Consortium/private instantiation of public blockchain
• Blockchain platform is (mostly) open source
• Network layer access control – firewall
67. Customizability
• Some blockchains have hard-coded transaction rules,
e.g. Bitcoin or other non-government-backed digital
currencies.
• Other blockchains allow you to set up your own
transaction rules, e.g. Ethereum has a Turing-complete
language to specify transaction rules.
• Programs in these languages are called “smart contracts”
68. On chain/off chain
• Some blockchains allow you to import/export
some data not on the chain for computational or
verification purposes.
• Such data is no longer protected by encryption or
guaranteed to be immutable although recording a
hash of exported/imported data can provide some
guarantees.
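Recording a hash of off-chain data can be sketched as follows. The dictionary `on_chain_record` is only a stand-in for data written to a blockchain, and the function names and document identifier are hypothetical.

```python
import hashlib

on_chain_record = {}  # stand-in for hashes stored on the blockchain


def fingerprint(data: bytes) -> str:
    """SHA-256 digest of the off-chain data."""
    return hashlib.sha256(data).hexdigest()


def register(doc_id: str, data: bytes) -> None:
    """Store only the hash on chain; the data itself stays off chain."""
    on_chain_record[doc_id] = fingerprint(data)


def verify_off_chain(doc_id: str, data: bytes) -> bool:
    """Check that off-chain data still matches the hash recorded on chain."""
    return on_chain_record.get(doc_id) == fingerprint(data)


register("invoice-1", b"total: 100")
print(verify_off_chain("invoice-1", b"total: 100"))  # True
print(verify_off_chain("invoice-1", b"total: 900"))  # False
```

The hash proves that the data has not changed since registration. It does not keep the data available or confidential, which remain the responsibility of the off-chain store.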
69. Incentive structure for public networks
• Creating a new block in Bitcoin is computationally
difficult. This is referred to as “mining”
• In public networks, some of the power of blockchain
is due to replication and having new blocks
validated.
• The incentive structure is intended both to
incentivize creation of new blocks and to replicate
existing blocks.
• If you are a successful miner, you are allocated new Bitcoins
• Collected fees are also allocated to encourage participation.
70. Permissionless / Permissioned
• Consortium/Private networks require permissions to access
• More suitable for regulated industries
– Know-Your-Customer (KYC)
• Other permissions
• Permission to initiate transactions
• Permission to mine
• Fine-grained permission: permission to create a particular asset
• Public networks can be used for private purposes
• E.g. executing an Ethereum smart contract to manage a supply chain.
• Public networks that are used for private purposes require permissions to
access the private portions.
71. What is real?
• Unauthorized termination of smart contract
• *7% of smart contracts can be terminated without authority (Public Ethereum)
• Transaction failure difficult to detect
• Mistakenly retry
• The Decentralized Autonomous Organization (DAO)
• Code issue leads to $60 Million Ether Theft during ICO (Initial Coin Offering)
• Hard fork of Ethereum
Immature Code!
*I. Weber, V. Gramoli, M. Staples et al., “On availability for blockchain-based systems”, SRDS’17, Hong Kong, China, September 2017.
72. Blockchain Myths
Myth | Reality
Solves every problem | A kind of database
Trustless | Can shift trust and spread trust
Secure | Focus is integrity, not confidentiality
Smart contracts are legal contracts | May help execute parts of some legal contracts
Immutable | Many only offer probabilistic immutability
Need to waste electricity | Emerging blockchains are more efficient
Are inherently unscalable | Emerging blockchains are more scalable
74. Blockchain qualities
• Blockchains are an architectural design choice
• Functionally, they are a kind of database and computational execution engine
• Using a blockchain impacts non-functional properties
• (+) Integrity, Non-repudiation
• (-) Modifiability
• (+) Availability
• (-) Confidentiality, Privacy
– Solution: Cryptography
• (- write / + read) Latency
• (-) Throughput
– Solutions: Increasing block size, Lightning network, Segregated witness
75. Blockchain in a Software System
[Figure: blockchain in a software system. Elements shown: applications, API, blockchain (tokens/currencies, smart contracts, shared data ledger for meta-data and small data), and a database for big data.]
76. Blockchain vs. Shared Database
Property | Blockchain | Shared Database
Operations | Insert (append only) | Create/Read/Update/Delete (CRUD)
Replication | Full replication on every peer | Master-slave; multi-master
Consensus | Majority of peers agree on the outcome of transactions; tolerant of the Byzantine Generals’ problem | Distributed transactions (2 Phase Commit, Paxos); synchronization
Validation | Global rules enforced on the whole blockchain | Local integrity constraints
79. Shared information
Actors / Goals
• Data Host: Maintains a database for some information
• Data Consumer: Uses the information provided by the data host
Functional Requirements
• The shared information could be manipulated internally by the data host
• The shared information could be manipulated externally by the data consumer
through an API
Non-functional Requirements
• Infrastructure reliability
• Efficient reading/writing
• Data integrity
80. Architecture of Current Practice
[Figure: architecture of current practice. Elements shown: data host department with a primary database, database replicas, a WriteAPI and a ReadAPI; data consumers from the private sector and from the public sector; internal databases.]
81. Architecture of Using Blockchain
[Figure: architecture using blockchain. Elements shown: a consortium blockchain across departments; Department A and Department B (each a data provider/consumer); Services A, B and C; data consumers from the private sector and from the public sector.]
82. Benefits and Risks
Benefits
• Reliability
- Geographical distribution
• Performance
- More efficient read from local replica
• Integrity
- Cryptographic hash
- Digital signature
• Immutability
- Log of all operations is immutable
Risks
• Performance
- Longer latency for writing
- Limited on-chain data size
• Immutability
- Accidentally wrong records are irreversible
• Operation cost
- Potential extra effort from current data consumer side
• Business Model
- Read is from local replica
83. Device management
Actors / Goals
• Hospital: device management and device usage tracking across hospitals
Functional Requirements
• Device management: Device registration, maintenance etc.
• Device usage tracking: Record which patient uses what device at what time
Non-functional Requirements
• Interoperability
• Data integrity
• Immutability
84. Architecture of Current Practice
[Figure: Hospital A, Hospital B and Hospital C each have their own device usage records and user interface.]
85. Architecture with Blockchain
[Figure: architecture with blockchain. Elements shown: a consortium blockchain across departments; Hospital A and Hospital B, each with a user interface; Devices A, B and C.]
86. Benefits and Risks
Benefits
• Performance
- More efficient read from local replica
• Immutability
- Log of who uses which device at what time is immutable
• Interoperability
- Share the same infrastructure
• Integrity
- Cryptographic hash
- Digital signature
Risks
• Performance
- Longer latency for writing
• Immutability
- Accidentally wrong records are irreversible
• Transparency
- Data are publicly accessible
- Leaking pricy data
87. Supply chain
• Interoperability: Coordinate information exchange across the many information systems
• Latency: Exchange of physical goods waits upon exchange of digital documentation
• Integrity: Information about goods and supply chain events cannot be falsified
• Confidentiality: Some information should be held commercial-in-confidence
• Scalability: Many processes in progress at any time across a large number of parties
88. Architecture of Current Practice
• A central aggregation server for an agreed portion of the supply chain
• Exchanged data: supply chain events
• Exchanged documents: bills of lading, booking confirmations, arrival notices, container
releases, terminal load lists, delivery orders, tax invoices...
89. Architecture with Blockchain as Data Storage
• Using a blockchain network for exchanging supply chain events.
• All non-event data is still exchanged in a point-to-point manner between participants
90. Architecture with Smart Contract
• Supply chain process design, implementation, and enforcement on blockchain
through the use of smart contracts
[Figure: Stakeholder A and Stakeholder B each connect to a consortium blockchain through an API and a trigger.]
91. Properties Analysis
Options compared: central event server; blockchain as data storage; blockchain as smart contract engine
• Scalability
– Central event server: the central event server is the bottleneck
– Blockchain: reading scalability is good; writing scalability is limited by public blockchain
• Interoperability
– Point-to-point integration; a new participant needs to integrate with all participants directly; the data format needs to be agreed upfront
– A new participant needs to integrate with the process; the integration burden for the remaining participants is reduced
• Latency
– Latency requirements on information transfer are usually on the order of minutes to hours
• Integrity
– Central event server: relies on the central event server to operate
– Blockchain: inherent feature of blockchain
• Confidentiality
– Central event server: central access control
– Blockchain: on-chain data encryption; re-identification for interaction volume; dummy transaction or new address for every process
93. Summary
• Blockchains are heavily hyped
• Three elements of a blockchain
1. Contract
2. Immutability
3. Cryptography
• Variants for different contexts
• Blockchains are a distributed data ledger (database)
• Blockchains are being tested in a variety of domains
94. More information
• Sources are at the end of this slide deck
• This slide deck is on slideshare
• lenbass@cmu.edu
• QUESTIONS/DISCUSSION