Coin Metrics’ State of the Network: Issue 204
Breaking down blockchain addresses
State of the Network Foundations: Blockchain Addresses
By Alex R. Mead & Kyle Waters
In this week’s State of the Network we are kicking off a new series, entitled Foundations. This series will present various “technical” aspects of blockchain technology in an approachable way. In this issue, we begin with the concept of an address: first a basic introduction, then the specifics of both Ethereum and Bitcoin addresses. Public-Private Keys, EOAs, Smart-Contracts, and the ubiquitous phrase, “not your keys, not your crypto,” will all be explained. As usual, we will have on-chain data to highlight key concepts and offer important context.
Foundations is a new series within SOTN and we would love your feedback! All comments are welcome! Please share any feedback here.
The concept of an address is perhaps the most fundamental in the world of cryptocurrencies. Regardless of the blockchain, an address is the unit of “identity” on the blockchain (“on-chain”) and all interactions begin and end with an address. Whether executing a simple transfer of funds or implementing a complicated trade using a DeFi protocol, addresses are always critically involved. While there are subtle differences for addresses from chain to chain, they all enable the unique identification and grouping of assets on-chain.
Next, we’ll go over the specifics of Ethereum addresses: what the different types are, how they are made, and how they are used. Then we will explain Bitcoin addresses and the four most significant address types on that network.
But First, Who “Owns” an Address?
To clear up some confusion, a single address does not necessarily mean a single user. It could be a group of individuals behind an address or a single individual may have multiple addresses. An address could even be a smart-contract and not be “owned” by any single user. Addresses, however, are the atomic unit of identity within blockchains and that is how we will treat them here, regardless of who owns or operates them. But as we’ll soon learn, true ownership lies with the controller of an address’s private keys.
An Ethereum address is a unique sequence of 20 bytes. With 8 bits to a byte, there are 2^(8*20) = 2^160 possible Ethereum addresses. Although an address takes only 20 bytes, the Ethereum address space is far larger than that of IPv6 (2^128), and the odds of “guessing” the private key of an existing account are astronomically small. Typically, addresses are displayed as a string beginning with “0x” followed by 40 hexadecimal characters (0-9, A-F). Within the Ethereum system, each address has an associated balance of the native currency, Ether or ETH. These balances range from barely above 0 ETH to more than 19 million ETH at the largest address, the Beacon Chain Staking Contract. Beyond the largest address balance, however, ETH is distributed across millions of addresses as shown below. In this chart, we can see that nearly 100 million of the total outstanding 120 million ETH are held on addresses with a balance of 100 ETH or more.
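The address-space arithmetic can be checked with a few lines of Python. This is only a sanity check of the figures quoted in the text; the variable names are chosen here for illustration.

```python
ADDRESS_BYTES = 20
BITS_PER_BYTE = 8

# Number of distinct 20-byte values: 2^(8*20) = 2^160.
eth_address_space = 2 ** (ADDRESS_BYTES * BITS_PER_BYTE)

# IPv6 addresses are 128 bits long.
ipv6_address_space = 2 ** 128

# How many times larger the Ethereum space is (2^160 / 2^128 = 2^32).
ratio = eth_address_space // ipv6_address_space

# Each byte is written as two hexadecimal characters.
hex_characters = ADDRESS_BYTES * 2

print(f"Ethereum address space: {eth_address_space:.3e}")
print(f"IPv6 address space:     {ipv6_address_space:.3e}")
print(f"Ratio:                  {ratio}")
print(f"Hex characters shown:   {hex_characters}")
```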
Ethereum addresses come in two types: Externally Owned Accounts (EOAs) and Smart-Contracts. The difference is that EOAs have an associated private key, while Smart-Contracts have associated contract code. This means only an EOA can sign a transaction and initiate computation within the Ethereum system, while smart contracts must be “activated” by transactions originating from EOAs.
For example, 0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045 is a known personal EOA of Vitalik Buterin, while 0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48 is a smart contract address for the stablecoin USDC. Typically, for an EOA like Vitalik’s, the private keys are managed by a commercially available “wallet,” for example MetaMask. They could also be managed using several other methods, including Python scripts or a hardware wallet.
Where do Ethereum addresses come from?
Externally Owned Account (EOA) addresses and smart-contract addresses are derived in different ways, each involving a sequence of cryptographic operations. In fact, this is one of the key reasons crypto is called “crypto.”
The account generation process starts by generating a private key, which is a 256-bit (32-byte) random number. It is essential that this number be kept secret, as the private key alone provides complete control over an address. This number is the origin of the phrase, “not your keys, not your crypto”: if you lose this number you lose access to your address, and anyone else who obtains it gains access to your address, the identity tied to it, and all funds and collectibles held there.
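As a minimal sketch of this first step, the snippet below draws a private key from the operating system's cryptographically secure random source. One detail not spelled out above is assumed here: a valid key must fall between 1 and the order of the secp256k1 curve minus one, which excludes a tiny sliver of the 256-bit range. The function name is hypothetical.

```python
import secrets

# Order of the secp256k1 group (a public constant).
# Valid private keys are the integers 1 .. SECP256K1_N - 1.
SECP256K1_N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def generate_private_key() -> bytes:
    """Return a random 32-byte private key in the valid range."""
    # randbelow(n - 1) yields 0 .. n - 2, so adding 1 gives 1 .. n - 1.
    key_as_int = secrets.randbelow(SECP256K1_N - 1) + 1
    return key_as_int.to_bytes(32, "big")


private_key = generate_private_key()
# Only the length is printed; a real key should never be logged.
print(len(private_key) * 8, "bits")
```

The `secrets` module is used rather than `random` because the latter is not designed for cryptographic use.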
Next, that private key, which is really just a number, is used as the input to an elliptic curve operation: the generator point of the secp256k1 curve is multiplied by the private key. This is the same curve used by Ethereum’s signature scheme, the Elliptic Curve Digital Signature Algorithm, or ECDSA for short. The result is a point on the curve whose two coordinates together form another number, 512 bits (64 bytes) in length, referred to as the public key. These two numbers are collectively referred to as a Public-Private Key Pair.
Finally, this public key is hashed using the Keccak256 function which produces yet another 256-bit sequence (32 bytes). This sequence is then truncated to preserve the last 20 bytes, or 160 bits, and voilà, that is the address which is used by Ethereum. The figure below shows this process graphically.
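The full pipeline from private key to address can be sketched in pure Python (3.8 or later, for the modular inverse via `pow`). This is an illustrative, unoptimized sketch and not how production wallets are written: the public key is computed by multiplying the secp256k1 generator point by the private key, and Keccak-256 is implemented by hand because `hashlib.sha3_256` is the later NIST variant with different padding. All function names are chosen here for illustration.

```python
# secp256k1 domain parameters (public constants).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)
MASK64 = (1 << 64) - 1


def ec_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:  # point doubling (curve coefficient a = 0)
        slope = 3 * x1 * x1 * pow(2 * y1, -1, P)
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P)
    x3 = (slope * slope - x1 - x2) % P
    return (x3, (slope * (x1 - x3) - y1) % P)


def public_key(private_key: int) -> bytes:
    """Compute private_key * G by double-and-add; return 64 bytes (x || y)."""
    if not 1 <= private_key < N:
        raise ValueError("private key out of range")
    result, addend, k = None, G, private_key
    while k:
        if k & 1:
            result = ec_add(result, addend)
        addend = ec_add(addend, addend)
        k >>= 1
    return result[0].to_bytes(32, "big") + result[1].to_bytes(32, "big")


def _rotl(value: int, shift: int) -> int:
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def _keccak_f(lanes):
    """Keccak-f[1600] permutation on 25 64-bit lanes (index = x + 5*y)."""
    lfsr = 1
    for _ in range(24):
        # theta: mix each column with the parity of two neighbouring columns
        c = [lanes[x] ^ lanes[x + 5] ^ lanes[x + 10] ^ lanes[x + 15] ^ lanes[x + 20]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rotl(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [lane ^ d[i % 5] for i, lane in enumerate(lanes)]
        # rho and pi: rotate each lane and move it to a new position
        x, y, current = 1, 0, lanes[1]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x + 5 * y] = (lanes[x + 5 * y],
                                         _rotl(current, (t + 1) * (t + 2) // 2))
        # chi: non-linear step applied row by row
        for y in range(5):
            row = lanes[5 * y:5 * y + 5]
            for x in range(5):
                lanes[x + 5 * y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: XOR in the round constant, generated by a small LFSR
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0] ^= 1 << ((1 << j) - 1)
    return lanes


def keccak256(data: bytes) -> bytes:
    """Original Keccak-256 (padding byte 0x01), as used by Ethereum."""
    rate = 136  # bytes absorbed per permutation call (1088 bits)
    padded = bytearray(data) + bytes(rate - len(data) % rate)
    padded[len(data)] ^= 0x01
    padded[-1] ^= 0x80
    lanes = [0] * 25
    for offset in range(0, len(padded), rate):
        block = padded[offset:offset + rate]
        for i in range(rate // 8):
            lanes[i] ^= int.from_bytes(block[8 * i:8 * i + 8], "little")
        lanes = _keccak_f(lanes)
    return b"".join(lane.to_bytes(8, "little") for lane in lanes[:4])


def eoa_address(private_key: int) -> str:
    """Hash the 64-byte public key and keep the last 20 bytes."""
    return "0x" + keccak256(public_key(private_key))[-20:].hex()


# Demo with a trivially insecure key; never use such a key for real funds.
print(eoa_address(1))
```

The output is lowercase hexadecimal; the mixed-case form seen in wallets is a display convention layered on top of the same 20 bytes.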
This process can be accomplished in many ways; however, most users rely on widely available “wallet software” such as MetaMask or an open source library like web3.py to generate and manage their keys. Beyond the technical mechanics of creating and managing keys and addresses, EOAs are typically controlled by a person, a group of people, or professional custodians with access to the associated private key. This private key is used to sign transactions, the critical step in sending ETH to another EOA or to a Smart Contract, so it must be kept secret and guarded with a very high level of security. We’ll discuss this more below in the section on cold wallets.
Smart Contract Address
EOAs are the starting point for using the Ethereum blockchain, whether that’s sending Ether in a simple transfer transaction or interacting with a smart contract. However, EOAs also perform a special function: deploying smart contracts themselves. It is in this contract deployment process that the second Ethereum address type is created.
The process begins with compiled EVM-compatible bytecode placed in the data field of a transaction whose recipient (“to”) field is left empty, which signals contract creation. The Ethereum protocol itself then derives the new contract’s address from the deploying EOA’s address and its nonce (i.e. the total number of transactions that address has sent in its lifetime).
More specifically, the deployer’s address (20 bytes) and the nonce (an integer of at most 64 bits) are placed in a two-item list and serialized with Recursive Length Prefix (RLP) encoding. The resulting byte sequence is then hashed with Keccak256, as the public key was, and the last 20 bytes are used as the smart contract address.
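The serialization step can be sketched as follows. This is a minimal RLP encoder that handles only the two-item list needed here (a 20-byte address and a nonce), not the full RLP specification, and the function names and sample deployer address are hypothetical. The final step, hashing the output with Keccak-256 and keeping the last 20 bytes, is left out because Python’s standard library does not ship the original Keccak-256 (`hashlib.sha3_256` uses different padding).

```python
def rlp_encode_bytes(value: bytes) -> bytes:
    """RLP-encode a byte string of at most 55 bytes."""
    if len(value) == 1 and value[0] < 0x80:
        return value  # a single small byte encodes as itself
    if len(value) > 55:
        raise ValueError("long strings are outside the scope of this sketch")
    return bytes([0x80 + len(value)]) + value


def int_to_minimal_bytes(number: int) -> bytes:
    """Big-endian bytes with no leading zeros; 0 becomes the empty string."""
    return number.to_bytes((number.bit_length() + 7) // 8, "big")


def rlp_sender_and_nonce(sender: bytes, nonce: int) -> bytes:
    """RLP-encode the list [sender, nonce] used for contract addresses."""
    if len(sender) != 20:
        raise ValueError("sender must be a 20-byte address")
    if not 0 <= nonce < 2**64:
        raise ValueError("nonce must fit in 64 bits")
    payload = rlp_encode_bytes(sender) + rlp_encode_bytes(int_to_minimal_bytes(nonce))
    # The payload is 22 to 30 bytes, so the short-list prefix always applies.
    return bytes([0xC0 + len(payload)]) + payload


# Hypothetical deployer address, used only to show the byte layout.
deployer = bytes.fromhex("6ac7ea33f8831ea9dcc53393aaa88b25a785dbf0")
for nonce in (0, 1, 128):
    print(nonce, rlp_sender_and_nonce(deployer, nonce).hex())
```

Because the nonce changes with every transaction the deployer sends, the same EOA produces a different contract address each time it deploys.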
Smart Contract addresses are very important pieces of information, as they are the entry point for any transaction that interacts with the contract. For example, if one wants to swap on UniswapV2, the Factory Contract address, 0x5C69bEe701ef814a2B6a3EDD4B1652CB9cc5aA6f, will be very important.
Quantifying Addresses on Ethereum
While the total possible number of addresses on Ethereum is 2^160 as mentioned above, the chart below shows how many of those addresses are currently seen on-chain. In the figure, the addresses have been grouped by type, showing nearly 100 million EOAs with a non-zero ETH balance and just over 30 million smart contracts. While these are sizable numbers, they are insignificant compared to the total possible number, which is good news for Ethereum users, who can rest assured we are nowhere near running out of addresses.
Compared with Ethereum, Bitcoin addresses have a few more details that need to be ironed out. To begin, there are in fact four main address types: Legacy, Pay-to-Script-Hash, SegWit, and Taproot. Bitcoin also uses a different accounting method for tracking coins on-chain, called the unspent transaction output (UTXO) model, but we will focus on the different address types below. Interested readers can find more information about the distinction between UTXO and account-based blockchain models here in past research.
Legacy (P2PKH) Addresses
These addresses comprise those used with the original deployment of the Bitcoin system and can easily be identified as they start with a “1”. They are generated in a very similar fashion to Ethereum addresses and begin with the creation of a Public-Private Key pair.
The public key is then hashed, as with Ethereum, but using the SHA-256 algorithm. The resulting 256-bit sequence is then hashed again using the RACE Integrity Primitives Evaluation Message Digest (RIPEMD-160), which results in a 20-byte (160-bit) sequence. Stepping back, the process is very similar to Ethereum; however, when displayed as text, Bitcoin typically uses Base58Check encoding. A version byte is prepended to the 20-byte sequence (0x00 for Legacy addresses, which is why they start with a “1”), a 4-byte checksum is appended, and the result is written using 58 characters (1-9, A-H, J-N, P-Z, a-k, m-z) rather than the 16 used in hexadecimal. In other words, the same sequence of zeros and ones stored in computer memory can be printed to the screen using a set of 16 or 58 characters. Many other encodings exist, but this is the one chosen for Bitcoin.
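The text-encoding step can be sketched in Python. The sketch starts from an already computed 20-byte hash and assumes the standard Base58Check layout: one version byte, the payload, then a checksum taken from the first 4 bytes of a double SHA-256. The SHA-256 and RIPEMD-160 hashing of the public key is left out because RIPEMD-160 is not available in every Python build. The function name is chosen for illustration.

```python
import hashlib

# The 58 characters listed above: no 0, O, I, or l.
BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check(version: int, payload: bytes) -> str:
    """Encode version byte + payload + 4-byte checksum in Base58."""
    data = bytes([version]) + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum

    # Treat the bytes as one big integer and write it in base 58.
    number = int.from_bytes(raw, "big")
    encoded = ""
    while number:
        number, remainder = divmod(number, 58)
        encoded = BASE58_ALPHABET[remainder] + encoded

    # Each leading zero byte is written as the character "1".
    leading_zero_bytes = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zero_bytes + encoded


# A placeholder hash of 20 zero bytes, used only to show the output format.
placeholder_hash = bytes(20)
print(base58check(0x00, placeholder_hash))  # Legacy version byte
```

The checksum lets wallet software reject most mistyped addresses before any funds are sent.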
Pay-to-Script-Hash (P2SH) Addresses
Identified by a leading “3”, P2SH addresses commit to a script, which can be used for things like transfers that require multiple signatures. The full mechanics of Pay-to-Script-Hash on Bitcoin are beyond our scope, but simply put, a P2SH address is the Base58Check encoding described above applied to the hash of the script code. Thus, one can conceptualize a P2SH address as the encoding of a 20-byte hash of a script, analogous to the encoding of the 20-byte hash of a public key.
SegWit and Taproot Addresses
In an attempt to use block space more efficiently and reduce fees, SegWit introduced several changes in how addresses are constructed. The details of the exact algorithm start to become tedious, so we’ve opted to skip them here, but one can easily recognize these new addresses, as they begin with “bc1”. In addition to SegWit, Taproot addresses, which begin with “bc1p”, further improve transaction efficiency and also offer greater privacy.
The chart below shows the total outstanding Bitcoin balances aggregated by the size of each address’s balance. As with Ethereum, the vast majority of Bitcoin is held by large-value addresses. However, as we have pointed out in the past, it is important to closely examine the data to better understand this distribution.
Another good resource for Bitcoin address types is txstats.com, where addresses are grouped by type and plotted over time according to the number of transactions in which they have taken part.
More On Crypto Addresses
Beyond the mechanics of what addresses really are, there are several other concepts one should understand with regard to addresses, including hot and cold wallets, Vanity Addresses, and ENS.
Hot Wallet, Cold Wallet
Cryptocurrency assets “live” on the blockchains themselves, stored either as native Layer 1 currencies (e.g. ETH or BTC) or within smart contract balance ledgers (e.g. USDC, Dai, WETH). The blockchains themselves can be thought of as distributed databases stored in the nodes of the respective networks. This means users do not custody the cryptocurrencies themselves, but rather custody the private keys associated with the respective addresses on the various blockchains. As such, the truly private information that must be kept secure is the private keys corresponding to the addresses.
In the world of crypto, two main approaches are used to store these private keys: hot wallets and cold wallets. The difference is quite simple: hot wallets are private key storage systems connected to the Internet, while cold wallets are systems not connected to the Internet. Total disconnection from the Internet guards the private keys more securely, since a potential thief would need physical access to the device to get them. To summarize, it’s not the cryptocurrencies themselves that are stored in the wallets, but rather the private key associated with an address that maintains a balance on the blockchain. Private-key management is growing in sophistication, and an increasingly popular method of digital asset security is Multi-Party Computation (MPC), which involves breaking up private keys into encrypted shares divided across a number of participants.
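The idea of dividing a key into shares can be illustrated with simple additive secret sharing. This is only a toy sketch of the share-splitting concept and not an MPC protocol: production MPC wallets use threshold signing schemes designed so that the full key never needs to be reassembled in one place. The modulus and function names are assumptions made for illustration.

```python
import secrets

# Hypothetical modulus for the sketch; real schemes work modulo the curve order.
MODULUS = 2**256


def split_secret(secret: int, parties: int) -> list:
    """Split a secret into additive shares, one per party."""
    if parties < 2:
        raise ValueError("need at least two parties")
    # All but the last share are uniformly random numbers.
    shares = [secrets.randbelow(MODULUS) for _ in range(parties - 1)]
    # The last share is chosen so that all shares sum to the secret.
    shares.append((secret - sum(shares)) % MODULUS)
    return shares


def combine_shares(shares: list) -> int:
    """Recover the secret; every single share is required."""
    return sum(shares) % MODULUS


demo_secret = secrets.randbelow(MODULUS)
demo_shares = split_secret(demo_secret, parties=3)
print(combine_shares(demo_shares) == demo_secret)
```

Any subset of shares smaller than the full set is just a collection of random-looking numbers, which is why spreading them across participants reduces the risk of a single point of compromise.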
Vanity Addresses & ENS
Vanity Addresses are addresses just like any other; however, some part of the address has a meaning beyond the randomness expected. Like a vanity license plate you might see driving on the highway, the first several characters might spell out a word, phrase, or name in English, like the Legacy bitcoin address: 1googLemzFVj8ALj6mfBsbifRoD4miY36v. These addresses can be used to more easily identify a counterparty and can be used for branding and marketing purposes. The process to make these addresses is supported by several open source projects, for example Vanitygen. It should be noted, however, that the process can be very computationally intensive, as it requires a “guess-and-check” type algorithm. It is prudent to be wary of trusting online tools for vanity addresses, as they may retain a copy of your private key, and as they say, “Not your keys, not your crypto.” In fact, last year a vulnerability was exploited in the “Profanity” Ethereum vanity-address generator.
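A rough sense of why guess-and-check becomes expensive comes from counting the expected number of attempts. The sketch below assumes each character of an address is uniformly random and independent, which holds for hexadecimal Ethereum addresses when letter case is ignored and is only an approximation for Base58 Bitcoin addresses. The function name is chosen for illustration.

```python
def expected_attempts(prefix_length: int, alphabet_size: int) -> int:
    """Average number of random addresses tried before one matches a prefix.

    Each attempt matches with probability 1 / alphabet_size**prefix_length,
    so the expected number of attempts is the reciprocal of that probability.
    """
    return alphabet_size ** prefix_length


# Cost grows exponentially with every extra character requested.
for length in range(1, 9):
    hex_cost = expected_attempts(length, 16)
    base58_cost = expected_attempts(length, 58)
    print(f"{length} chars: ~{hex_cost:,} (hex)  ~{base58_cost:,} (Base58, approx.)")
```

Every attempt requires generating a fresh key pair and deriving its address, so each additional character multiplies the total work by 16 or roughly 58.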
As a formal alternative to generating easy-to-remember vanity addresses, Ethereum has an application called the Ethereum Name Service (ENS), which is a spin on the Domain Name System (DNS) that maps domain names to IP addresses on the traditional Internet. Much as with domain names, users can purchase custom names that end with “.eth”, and the smart contract creates a record of these labels which is publicly available on chain. Using the above example for Vitalik Buterin, one can see his ENS name vitalik.eth. ENS names, just like domain names, help make the Ethereum blockchain more user friendly, as human-readable names are easier to remember than a 20-byte sequence, adding convenience and security to the network.
While much more could be said about addresses, we’ve covered the basics for both Ethereum and Bitcoin. As mentioned above, this issue is part of the new Foundations series explaining technical topics related to crypto. Any feedback or thoughts are very welcome and can be submitted here.
Network Data Insights
Active addresses on Ethereum fell over the week to a daily average of 454K, a drop of 21% from the week prior. This drop coincided with a sharp rise in fees on the network, with the average gas price at 56 gwei over the past week. Bitcoin active addresses were relatively flat while stablecoin active addresses were mostly down from the week prior.
Coin Metrics Updates
This week’s updates from the Coin Metrics team:
Explore our entire catalog of data-driven research on our new insights page on the Coin Metrics website, which makes it easier to browse through previous State of the Network issues, as well as other original research.
For the best in-depth discussion of CM data and research, come check out our research community on the web3 social media platform gm.xyz.
As always, if you have any feedback or requests please let us know here.
Subscribe and Past Issues
Coin Metrics’ State of the Network is an unbiased, weekly view of the crypto market informed by our own network (on-chain) and market data.