WHAT IS A BLOCKCHAIN ORACLE?
Oracles are applications that source, verify, and transmit external information (i.e., information stored off-chain) to smart contracts running on the blockchain. Besides “pulling” off-chain data and broadcasting it on Ethereum, oracles can also “push” information from the blockchain to external systems. An example of the latter is an oracle that unlocks a smart lock once the user sends the fee via an Ethereum transaction.
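The smart-lock example can be sketched in a few lines of Python. This is only an illustration of the "push" direction: the event format, the `FeePaid` event name, and the `SmartLock` class are hypothetical stand-ins, not part of any real oracle or lock API.

```python
from dataclasses import dataclass


@dataclass
class SmartLock:
    """Hypothetical off-chain device controlled by the oracle."""
    locked: bool = True

    def unlock(self) -> None:
        self.locked = False


def push_payments_to_lock(events, lock, fee_wei):
    """Scan on-chain events (assumed shape) and unlock once the fee is paid."""
    for event in events:
        if event["type"] == "FeePaid" and event["amount_wei"] >= fee_wei:
            lock.unlock()
            return True
    return False


# Assumed event log, as an oracle might read it from the chain.
events = [
    {"type": "Transfer", "amount_wei": 5},
    {"type": "FeePaid", "amount_wei": 1_000},
]
lock = SmartLock()
unlocked = push_payments_to_lock(events, lock, fee_wei=1_000)
```

A real output oracle would subscribe to contract events through an Ethereum client rather than iterate over a list, but the flow is the same: observe on-chain state, then act off-chain.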
Oracles act as a “bridge” connecting smart contracts on blockchains to off-chain data providers. Without oracles, smart contract applications would only be able to access on-chain data. An oracle provides a mechanism for triggering smart contract functions using off-chain data.
Oracles differ based on the source of data (one or multiple sources), trust models (centralized or decentralized), and system architecture (immediate-read, publish-subscribe, and request-response). We can also distinguish between oracles based on whether they retrieve external data for use by on-chain contracts (input oracles), send information from the blockchain to off-chain applications (output oracles), or perform computational tasks off-chain (computational oracles).
WHY DO SMART CONTRACTS NEED ORACLES?
Most developers see smart contracts as simply pieces of code running at specific addresses on the blockchain. However, a more general view of smart contracts is that they are self-executing software programs capable of enforcing agreements between parties once specific conditions are met—which explains the term, “smart contracts.”
But using smart contracts to enforce agreements between people isn't straightforward, given that Ethereum is deterministic. A deterministic system is one that always produces the same results given an initial state and a particular input—there is no randomness or variation in the process of computing outputs from inputs.
To achieve deterministic execution, blockchains limit nodes to reaching consensus on simple binary (true/false) questions using only data stored on the blockchain itself. Examples of such questions include:
“Did the account owner (identified by a public key) sign this transaction with the paired private key?”
“Does this account have enough funds to cover the transaction?”
“Is this transaction valid in the context of this smart contract?”
If blockchains received information from external sources (i.e., from the real world), determinism would be impossible to achieve, preventing nodes from agreeing on the validity of changes to the blockchain’s state. Take for example a smart contract that executes a transaction based on the current ETH-USD exchange rate obtained from a traditional price API. This figure would likely change frequently (not to mention that the API could get deprecated or hacked), meaning nodes executing the same contract code would arrive at different results.
For a public blockchain like Ethereum, with thousands of nodes around the world processing transactions, determinism is critical. With no central authority serving as a source of truth, nodes must arrive at the same state after applying the same transactions. If node A executed a smart contract’s code and got "3" as a result while node B got "7" after running the same transaction, consensus would break down, eliminating Ethereum’s value as a decentralized computing platform.
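The disagreement between nodes can be illustrated with a small simulation. This is a sketch under stated assumptions: `make_price_api` is a hypothetical stand-in for an external price API whose quote changes between calls, and the prices are invented for the example.

```python
import itertools


def make_price_api(quotes):
    """Hypothetical external API: returns a different quote on each call."""
    feed = itertools.cycle(quotes)
    return lambda: next(feed)


def settle_with_api(eth_amount, price_api):
    # Non-deterministic: the result depends on when the node calls the API.
    return eth_amount * price_api()


def settle_with_onchain_price(eth_amount, chain_state):
    # Deterministic: every node reads the same value stored on-chain.
    return eth_amount * chain_state["eth_usd"]


price_api = make_price_api([1800, 1805, 1795])
node_a = settle_with_api(2, price_api)  # node A sees the first quote
node_b = settle_with_api(2, price_api)  # node B sees the next quote
api_results_agree = node_a == node_b

chain_state = {"eth_usd": 1800}  # assumed value already recorded on-chain
onchain_results_agree = (
    settle_with_onchain_price(2, chain_state)
    == settle_with_onchain_price(2, chain_state)
)
```

Here `api_results_agree` is false because the two nodes saw different quotes, while `onchain_results_agree` is true because both read the same stored value.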
The scenario described earlier also highlights the problem with designing blockchains to pull information from external sources. Oracles, however, solve this problem by taking information from off-chain sources and storing it on the blockchain for smart contracts to consume. Since information stored on-chain is unalterable and publicly available, Ethereum nodes can safely use off-chain data to compute state changes without breaking consensus.
To do this, an oracle is typically made up of a smart contract running on-chain and some off-chain components. The on-chain contract receives requests for data from other smart contracts, which it passes to the off-chain component (called an oracle node). This oracle node can query data sources—using application programming interfaces (APIs), for example—and send transactions to store the requested data in the smart contract's storage.
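The request flow described above can be modelled roughly as follows. This is a toy, in-memory sketch rather than a real contract or node: the class names, method names, and the `fake_api` data source are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class OracleContract:
    """Toy model of the on-chain half: queues requests and stores answers."""
    pending: dict = field(default_factory=dict)  # request_id -> query
    storage: dict = field(default_factory=dict)  # request_id -> result
    next_id: int = 0

    def request_data(self, query: str) -> int:
        """Called by a consumer contract; returns an id for later lookup."""
        request_id = self.next_id
        self.next_id += 1
        self.pending[request_id] = query
        return request_id

    def fulfill(self, request_id: int, value) -> None:
        """Called by the oracle node's transaction to store the result."""
        if request_id not in self.pending:
            raise KeyError(f"unknown request {request_id}")
        del self.pending[request_id]
        self.storage[request_id] = value


class OracleNode:
    """Toy model of the off-chain half: answers pending requests."""

    def __init__(self, data_source):
        self.data_source = data_source  # callable standing in for an API

    def process(self, contract: OracleContract) -> None:
        for request_id, query in list(contract.pending.items()):
            contract.fulfill(request_id, self.data_source(query))


def fake_api(query: str) -> int:
    # Hypothetical data source with a made-up price.
    return {"ETH-USD": 1800}[query]


contract = OracleContract()
rid = contract.request_data("ETH-USD")
OracleNode(fake_api).process(contract)
answer = contract.storage[rid]
```

In a real deployment the node would detect requests by watching events emitted by the contract and would pay gas to submit the fulfilling transaction; both details are omitted here.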
Essentially, a blockchain oracle bridges the information gap between the blockchain and the external environment, creating “hybrid smart contracts”. A hybrid smart contract is one that functions based on a combination of on-chain contract code and off-chain infrastructure. Decentralized prediction markets, described in the introduction, are an excellent example of hybrid smart contracts. Other examples might include crop insurance smart contracts that pay out when a set of oracles determine that certain weather phenomena have taken place.
WHAT IS THE ORACLE PROBLEM?
It is easy to give smart contracts access to off-chain data by relying on an entity (or multiple entities) to introduce extrinsic information to the blockchain by storing it in the data payload of a transaction. But this brings up new problems:
How do we verify that the injected information was extracted from the correct source or hasn’t been tampered with?
How do we ensure that this data is always available and updated regularly?
The so-called “oracle problem” describes the issues that come with using blockchain oracles to send inputs to smart contracts. Data from an oracle must be correct, or smart contract execution will produce erroneous results. Trustlessness matters just as much: having to ‘trust’ oracle operators to reliably provide accurate information robs smart contracts of their most defining qualities.
Oracles differ in their approach to solving the oracle problem, and we explore these approaches later. While no oracle is perfect, an oracle’s merits should be measured based on how it handles the following challenges:
Correctness: An oracle should not cause smart contracts to trigger state changes based on invalid off-chain data. For this reason, an oracle must guarantee the authenticity and integrity of data: authenticity means the data was obtained from the correct source, while integrity means the data remained intact (i.e., it wasn’t altered) before being sent on-chain.
Availability: An oracle should not delay or prevent smart contracts from executing actions and triggering state changes. This quality requires that data from an oracle be available on request without interruption.
Incentive compatibility: An oracle should incentivize off-chain data providers to submit correct information to smart contracts. Incentive compatibility involves attributability and accountability. Attributability allows for correlating a piece of external information to its provider, while accountability bonds data providers to the information they give, such that they can be rewarded or penalized based on the quality of information provided.
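Attributability and accountability can be made concrete with a small sketch. The scheme below (median aggregation with a staked reward and penalty) is one illustrative design chosen for this example, not a mechanism the text prescribes; the provider names, tolerance, and reward and penalty amounts are all assumptions.

```python
from statistics import median


def settle_round(reports, stakes, tolerance=0.01, reward=1, penalty=5):
    """Aggregate reports and adjust each provider's stake.

    reports: provider -> reported value (each report is tied to its provider)
    stakes:  provider -> staked balance
    Returns (agreed_value, new_stakes).
    """
    agreed = median(reports.values())
    new_stakes = dict(stakes)
    for provider, value in reports.items():  # attributability
        if abs(value - agreed) <= tolerance * agreed:
            new_stakes[provider] += reward   # close to consensus: rewarded
        else:
            new_stakes[provider] -= penalty  # accountability: outlier penalized
    return agreed, new_stakes


reports = {"alice": 1800, "bob": 1802, "mallory": 2500}
stakes = {"alice": 100, "bob": 100, "mallory": 100}
agreed, stakes_after = settle_round(reports, stakes)
```

Using the median means a single outlier such as the `mallory` report cannot move the agreed value, and because every report is keyed by its provider, the penalty lands on the party that submitted it.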