Understanding SHA256 and Bitcoin mining
On January 3, 2009, the first batch of 50 Bitcoins was mined, most likely by Bitcoin's mysterious inventor, Satoshi Nakamoto. Since then, people have gone crazy about Bitcoin mining, and some have made good fortunes. Today, there are more than a million unique Bitcoin miners, who are criticized across the Internet for expending terawatt-hours of energy annually (by common estimates, roughly 0.5% of global electricity consumption) and for gulping up GPUs on launch day despite the silicon shortage in manufacturing industries. But why does Bitcoin mining take so much electricity and computational power? The reason is SHA256 (Secure Hash Algorithm, 256-bit), the cryptographic hash function at the heart of the Bitcoin protocol.
A hash function is a mathematical function that takes a piece of information of any size and turns it into a fixed-length string of letters and numbers. Hashing makes storing and finding information quicker, because hashes are usually shorter than the data they stand for and easier to look up. A hash also reveals nothing readable about the original data, although hashing is not encryption: there is no key that turns a hash back into its input. SHA256 takes an input (of up to 2⁶⁴ − 1 bits) and turns it into a string of 256 bits, or 64 hexadecimal digits.
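To make the fixed-length output concrete, here is a minimal sketch using Python's standard hashlib module. The helper name sha256_hex and the sample messages are only illustrative; the point is that the digest is always 64 hexadecimal digits, however long the input is.

```python
import hashlib


def sha256_hex(message: str) -> str:
    """Return the SHA256 digest of a text message as 64 hexadecimal digits."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


# Inputs of very different sizes all map to a digest of the same length.
for message in ["", "a", "a much longer message " * 1000]:
    digest = sha256_hex(message)
    print(len(message), len(digest), digest)
```

The middle column printed by the loop is 64 every time, which is 256 bits written as hexadecimal digits (4 bits per digit).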
A few things to remember before going into Bitcoin mining:
Bitcoin is nothing but a ledger (a set of transactions) that stores every transaction that has happened since Bitcoin was invented. The transactions are not kept in one continuous record, because that is not practical; instead they are grouped into blocks of about 1 MB each. These blocks are connected in much the same way as the nodes of a linked list: each block points back to the one before it. The Bitcoin protocol secures the transactions with cryptography, and SHA256 is a central part of that.
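The linked-list idea can be sketched in a few lines of Python. This is a toy model, not Bitcoin's real block format: the dictionary fields, the helper names (make_block, build_chain, is_valid) and the sample transactions are all assumptions made for illustration. Each block stores the hash of the block before it, so changing an old block breaks every link after it.

```python
import hashlib


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def make_block(number: int, transactions: str, previous_hash: str) -> dict:
    """Build a toy block whose hash covers its contents and the previous hash."""
    block = {"number": number, "transactions": transactions,
             "previous_hash": previous_hash}
    block["hash"] = sha256_hex(f"{number}{transactions}{previous_hash}")
    return block


def build_chain(batches: list) -> list:
    """Link the batches of transactions together, one block per batch."""
    chain = []
    previous_hash = "0" * 64  # placeholder "previous hash" for the first block
    for number, transactions in enumerate(batches, start=1):
        block = make_block(number, transactions, previous_hash)
        chain.append(block)
        previous_hash = block["hash"]
    return chain


def is_valid(chain: list) -> bool:
    """Recompute every hash and check that each block points at its predecessor."""
    previous_hash = "0" * 64
    for block in chain:
        expected = sha256_hex(
            f"{block['number']}{block['transactions']}{block['previous_hash']}")
        if block["previous_hash"] != previous_hash or block["hash"] != expected:
            return False
        previous_hash = block["hash"]
    return True


chain = build_chain(["Alice pays Bob 1", "Bob pays Carol 2", "Carol pays Dave 3"])
print(is_valid(chain))                        # the untouched chain checks out
chain[0]["transactions"] = "Alice pays Bob 100"
print(is_valid(chain))                        # tampering with block 1 is detected
```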
Let’s take a function Square(x) = 16, x ≥ 0. Here it is very easy to guess the value of x, which is 4. Now take a slightly more complex function, Sum(a, b) = 9, where a and b are whole numbers with a, b ≥ 0. Here guessing the values of a and b is a little harder because there are several possible combinations: (a, b) = (9,0), (1,8), (2,7), (3,6) … (0,9). Now for SHA256(x) = afdcc7e3c0759b5754a25b752cdd9f1f1e165d04f90a11ac8fa1b7b991c03767, it is almost impossible to guess x.
For example: “power is power” can be hashed and will be equal to: 885a47cfa1b6376d42f77a08e41da60732ce62c766a4f9b163ff1f974db169ca and if I capitalize the p in power, then the hash will be completely different: bc04fba325987a556c765c1f5ece2eca1e7f7ef3fb8bbe2d87bd4f45d81ebf9e. So even a small change in x produces a totally different hash, but every time we apply SHA256 to the same x it produces the same hash; in that sense it is deterministic. Going the other way, from a hash back to x, is extremely difficult. Mining relies on both properties: the only way to find an input with the required hash is to keep guessing, and anyone can check a correct guess by hashing it once. Whoever finds that input is rewarded with Bitcoins.
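Both properties are easy to check with a short sketch. The helper names are illustrative, and the exact input strings are an assumption (here the first p is the one capitalized), so the digests printed may not match the ones quoted above if the original input differed in spacing or punctuation.

```python
import hashlib


def sha256_hex(message: str) -> str:
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count how many of the 256 output bits differ between two digests."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


lower = sha256_hex("power is power")
upper = sha256_hex("Power is power")

# Deterministic: hashing the same input again gives the same digest.
print(lower == sha256_hex("power is power"))

# One changed letter gives an unrelated-looking digest.
print(lower)
print(upper)
print(differing_bits(lower, upper), "of 256 bits differ")
```

For a well-behaved hash function, about half of the output bits are expected to flip for any change in the input, however small.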
Let the mining begin:
In a blockchain, a block is not only transactions; it has other components such as a block number, a previous hash and a nonce value. The block number is the serial number of the block in the blockchain. The previous hash is the hash output of the previous block. The nonce is the interesting value, and it is what makes Bitcoin mining use so much electricity and computational power. The Bitcoin protocol requires that the first few digits of the hash value be zero. (More precisely, the hash must fall below a target value, which in practice means it starts with a run of zeros.) Suppose we say that the first 4 digits of the hash value should be zero; then the difficulty level for mining Bitcoins is 4. If we just combine the block number, transactions and previous hash into a string and hash it, the result will probably not have 4 zeros at the beginning.
So we introduce a new value called the nonce. The nonce is just a number which starts from 1 and goes up to a very large number (1 trillion in our case). We add the nonce to the other components of the block and hash again, trying one nonce after another until the hash has the required number of zeros at the beginning. Therefore, Bitcoin mining is nothing but the process of guessing a nonce that generates a hash beginning with the required number of zeros. Once such a hash is found, the block is verified and locked, and the miner is rewarded with Bitcoins. Then we move to the next block.
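The guessing process described above can be sketched as a single loop. This is a simplified model, not the real protocol: Bitcoin actually applies SHA256 twice to a binary block header and compares the result with a numeric target. The function names, the way the fields are joined into a string and the sample transaction are all assumptions made for illustration.

```python
import hashlib


def block_hash(block_number: int, transactions: str,
               previous_hash: str, nonce: int) -> str:
    """Hash the block's components joined into one string (toy format)."""
    text = f"{block_number}{transactions}{previous_hash}{nonce}"
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def mine(block_number: int, transactions: str, previous_hash: str,
         difficulty: int, max_nonce: int = 1_000_000_000_000):
    """Try nonces 1, 2, 3, ... until the hash starts with `difficulty` zeros."""
    prefix = "0" * difficulty
    for nonce in range(1, max_nonce + 1):
        digest = block_hash(block_number, transactions, previous_hash, nonce)
        if digest.startswith(prefix):
            return nonce, digest
    raise ValueError(f"no nonce up to {max_nonce} meets difficulty {difficulty}")


# Mine one toy block at difficulty 4 (about 65,000 guesses on average).
nonce, digest = mine(1, "Alice pays Bob 2 BTC", "0" * 64, difficulty=4)
print("nonce:", nonce)
print("hash: ", digest)
```

Finding the nonce takes many guesses, but checking it takes a single call to block_hash with the winning nonce. That asymmetry is what lets every other participant verify a mined block cheaply.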
The time taken to discover a correct nonce grows exponentially with the difficulty level: each additional leading zero makes a matching hash 16 times rarer (a hexadecimal digit has 16 possible values), so on average 16 times as many nonces must be tried before one works.
We can see in the above table that as we increase the difficulty level, the time taken to guess a correct nonce increases exponentially. For difficulty = 8, my computer ran the code for more than 2 hours and still could not find a correct nonce.
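The growth can also be estimated without running the miner. Assuming each hexadecimal digit of a hash is equally likely to take any of its 16 values, a hash starts with d zeros with probability 1/16^d, so about 16^d guesses are needed on average. The sketch below tabulates that figure; the hash rate of one million hashes per second is a made-up number for illustration, not a measurement from the author's computer.

```python
def expected_attempts(difficulty: int) -> int:
    """Average number of guesses needed for `difficulty` leading hex zeros."""
    return 16 ** difficulty


def estimated_seconds(difficulty: int, hashes_per_second: float) -> float:
    """Average search time at a given (assumed) hash rate."""
    return expected_attempts(difficulty) / hashes_per_second


ASSUMED_HASH_RATE = 1_000_000  # hypothetical: one million hashes per second

for difficulty in range(1, 9):
    attempts = expected_attempts(difficulty)
    seconds = estimated_seconds(difficulty, ASSUMED_HASH_RATE)
    print(f"difficulty {difficulty}: ~{attempts:,} attempts, ~{seconds:,.2f} s")
```

These are averages: a lucky run can finish much sooner and an unlucky one can take several times longer.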
But wait! Does that mean I just got rewarded with Bitcoins for making correct guesses up to difficulty level 7? No, because those blocks were mined a very long time ago, and the current difficulty level is 19, which means we need to guess a nonce value that gives a hash with 19 zeros at the beginning. You can imagine how long it would take my computer to make that guess; it could take years. That is why miners use tons of GPUs (which use a lot of electricity) to mine Bitcoins: it is like a race, and whoever finds a valid hash for the block first gets the reward of 6.25 Bitcoins today. In the beginning, the difficulty level was lower and the reward was 50 Bitcoins per block. Roughly every 4 years (every 210,000 blocks), the number of Bitcoins rewarded per block is halved. The next Bitcoin halving is expected in 2024, after which miners will get only 3.125 Bitcoins per block, and this process will continue until all of the 21 million Bitcoins are mined.
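The halving schedule also explains where the 21 million figure comes from. The sketch below assumes a halving every 210,000 blocks and counts rewards in satoshis (the smallest unit, one hundred-millionth of a Bitcoin), rounding down at each halving; the function names are illustrative.

```python
SATOSHI_PER_BTC = 100_000_000
INITIAL_REWARD = 50 * SATOSHI_PER_BTC   # 50 Bitcoins per block at the start
BLOCKS_PER_HALVING = 210_000            # roughly four years of blocks


def block_reward(halvings: int) -> int:
    """Reward per block, in satoshis, after the given number of halvings."""
    return INITIAL_REWARD >> halvings   # integer halving, rounded down


def total_supply() -> int:
    """Add up every block reward until the reward rounds down to zero."""
    total = 0
    halvings = 0
    while block_reward(halvings) > 0:
        total += BLOCKS_PER_HALVING * block_reward(halvings)
        halvings += 1
    return total


for halvings in range(5):
    print(halvings, "halvings:", block_reward(halvings) / SATOSHI_PER_BTC, "BTC per block")

print("total supply:", total_supply() / SATOSHI_PER_BTC, "BTC")
```

Because each era pays out half as much as the one before, the total is a geometric series that comes out just under 21 million Bitcoins.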
Conclusion
Algorithm-wise, Bitcoin mining is not as difficult as it sounds. In fact, it is just a for loop that runs a very large number of times. The crux of it is guessing a correct nonce before anyone else does, and for that we need very powerful processors, which are costly and use a lot of electricity. You can find the Python code for this article here.