OP_RETURN is well known as the simplest mechanism for creating an unspendable output and carrying data within a Bitcoin transaction. It has a checkered history: it was the root cause of one of the worst bugs in Bitcoin’s history and arguably also a political football that led to the beginning of the great altcoin diaspora. It is a far more important piece of the Bitcoin puzzle than is commonly understood, though. Its current usage is a subset of its potential and it’s also kind of wrong. Let’s take a look at the history.
The original OP_RETURN
In the beginning the execution of scripts worked slightly differently than it does now: in some respects more primitive, in others more in line with the original intent. There are actually two scripts involved in authorizing the spending of a UTXO. The locking script (also known as scriptPubKey) is designated at the time a new bitcoin output is created. The unlocking script (also known as scriptSig) is provided at the time of spending the output (to create yet another new output). First the unlocking script is executed, then the resulting stack(s) are fed as input to the locking script. Think of the locking script as a puzzle and the unlocking script as the solution to the puzzle. If you can solve the puzzle you are authorized to spend the Bitcoin.
In the original Bitcoin this was achieved with the help of another op code, OP_CODESEPARATOR. This op code limits the scope of what a subsequent OP_CHECKSIG operation will cover during a signature check, i.e. which parts of the transaction the signature signs. This is necessary because usually there is a signature in the unlocking script and an OP_CHECKSIG in the locking script. And the signature can’t sign itself without creating a paradox and probably ending the universe as we know it. What would happen is that the two scripts were taken and turned into a single script, a bit like this:
Concatenate( unlockingScript | OP_CODESEPARATOR | lockingScript )
The single script was then executed.
I can haz all your Bitcoinz
There was a glaring bug in this arrangement. The ‘puzzle’ is deemed solved if, at the end of script execution, 1) the script hasn’t failed with an error and 2) the top item remaining on the stack can be interpreted as a valid Boolean ‘true’. OP_RETURN didn’t change this. It simply exited the script early, and the above check was then done to determine whether the output could be spent. But it neglected to distinguish between the locking script, which is set in stone in the UTXO at the (earlier) time the output is created, and the unlocking script, which anyone can create in an attempt to spend the coin. Consider the following locking script:
<public_key> OP_CHECKSIG
This is the original pay-to-public-key script template. It *should* only be able to be unlocked by providing a valid signature, which you should only be able to provide if you have the private key that corresponds to the public key. In the original version of Bitcoin what’s executed looks like this:
Concatenate( <signature> | OP_CODESEPARATOR | <public_key> OP_CHECKSIG )
This would work fine as long as the signature is valid but you could also do this:
Concatenate( OP_TRUE OP_RETURN | OP_CODESEPARATOR | <public_key> OP_CHECKSIG )
This pushes true onto the stack then exits the script. The signature check is never performed and the script is considered valid because the top stack item is a Boolean ‘true’. In fact it doesn’t matter what the locking script is. This behavior could be exploited to spend any bitcoin at all.
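To make the bug concrete, here is a minimal sketch of that single-pass evaluation in Python. It is a toy model, not the original Bitcoin code: scripts are plain lists of strings, the signature check is faked by comparing against a hypothetical `VALID_SIGNATURE` constant, and `run_original` is a name made up for this illustration.

```python
VALID_SIGNATURE = "sig_by_owner"  # hypothetical stand-in for a real signature


def run_original(script):
    """Toy single-pass evaluation of a concatenated script (pre-fix rules)."""
    stack = []
    for op in script:
        if op == "OP_TRUE":
            stack.append(True)
        elif op == "OP_CODESEPARATOR":
            continue  # affects what a signature covers; no stack effect here
        elif op == "OP_RETURN":
            break  # original behavior: exit early, then do the normal check
        elif op == "OP_CHECKSIG":
            if len(stack) < 2:
                return False  # not enough items: script error
            stack.pop()  # public key (ignored by this fake check)
            signature = stack.pop()
            stack.append(signature == VALID_SIGNATURE)
        else:
            stack.append(op)  # anything else is treated as a data push
    # Final check: stack must be non-empty and its top item must be true.
    return bool(stack) and bool(stack[-1])


locking = ["<public_key>", "OP_CHECKSIG"]
honest = [VALID_SIGNATURE, "OP_CODESEPARATOR"] + locking
forged = ["bad_sig", "OP_CODESEPARATOR"] + locking
exploit = ["OP_TRUE", "OP_RETURN", "OP_CODESEPARATOR"] + locking

print(run_original(honest))   # True: valid signature
print(run_original(forged))   # False: signature check fails
print(run_original(exploit))  # True: OP_CHECKSIG is never reached
```

Note that the exploit never touches the locking script at all, which is why the contents of the locking script make no difference.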
Satoshi noticed this and fixed it. But he did it in a rush. The fix was to make OP_RETURN always exit and FAIL the script. The reason for this likely lies in the way the code was structured at the time. A better fix would have been to disallow OP_RETURN in the unlocking script, or to ensure it always exits and fails if found in the unlocking script but behaves normally if found in the locking script. The problem is that by concatenating the scripts there was no simple way for the script interpreter to know which part it was in. OP_CODESEPARATOR is a valid op code that can be used anywhere in locking or unlocking scripts. It can even be used multiple times. So encountering that op code isn’t a definite marker of the transition from unlocking to locking script. There are other ways it could have been done but they introduce complexity, and I suspect Satoshi was in a bit of a hurry to roll out a fix before this vulnerability got exploited.
The data carrier
Roll forward in time and the way the script interpreter was called changed a bit. First let’s point out that there is not just one stack in script, there are three: the main stack, the alt stack and the “if” stack, which is used to determine whether conditional branches in code are executed or not. In the original implementation, because it was a single execution of the engine, all three of the resultant stacks from the unlocking script were passed on to the locking script. These days both scripts are executed separately, with just the main stack contents from the unlocking script being passed on to the locking script. This is a significant change in semantics. The function of OP_CODESEPARATOR is now subsumed by the semantics of the script execution engine, so it has fallen into disuse, although that is not to say it isn’t useful in other ways, but that’s a whole other story.
The now nerfed OP_RETURN was noted for one of its potential use cases, which is to make an output provably unspendable. Its behaviour is now hard coded: if it is found and executed, the script fails no matter what. This was considered a useful property because miners will never need such an output to validate a future transaction, so they can optionally prune it. So it was a useful way of embedding data in the blockchain in a way that proved its existence at a point in time but without imposing a long term storage burden on miners. We still use it that way today. It is not the only means of storing data in the blockchain and arguably not even the best. But it is now a common use case on the BSV blockchain.
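The hand-off between the two scripts can be sketched in a few lines of Python. This is a toy model under stated assumptions: only the main stack and the alt stack are modelled (the “if” stack is left out), every token other than the two alt stack op codes is treated as a data push, and `eval_script` and `verify` are names made up for this illustration.

```python
def eval_script(script, stack):
    """Run one script against `stack`; the alt stack is local to this call."""
    alt_stack = []
    for op in script:
        if op == "OP_TOALTSTACK":
            if not stack:
                return False  # nothing to move: script error
            alt_stack.append(stack.pop())
        elif op == "OP_FROMALTSTACK":
            if not alt_stack:
                return False  # nothing to move back: script error
            stack.append(alt_stack.pop())
        else:
            stack.append(op)  # treat everything else as a data push
    return True


def verify(unlocking, locking):
    """Two separate executions; only the main stack crosses the boundary."""
    stack = []
    if not eval_script(unlocking, stack):
        return False
    # The alt stack built up by the unlocking script is discarded here.
    if not eval_script(locking, stack):
        return False
    return bool(stack) and bool(stack[-1])


# An item parked on the alt stack by the unlocking script is not visible
# to the locking script, so this fails.
print(verify([1, "OP_TOALTSTACK"], ["OP_FROMALTSTACK"]))  # False
# An item left on the main stack is handed over as usual.
print(verify([1], []))  # True
```

Under the original single-execution model the first example would have succeeded, because the alt stack would have survived into the locking script.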
To fix it we have to break it
That’s right. The way we are all using OP_RETURN right now is going to get broken if we fix OP_RETURN. But don’t worry too much, it’s an easy fix and we can start right now. Almost all usages of OP_RETURN look a bit like this:
OP_RETURN <data>
And most have a value of zero, so nothing is at risk. But this form of OP_RETURN script will become spendable (by anyone) when the original functionality is restored. This applies to new outputs only, not to existing OP_RETURN outputs (please see part 3 of this series for an explanation of UTXO height based activation). Consider the above script with an unlocking script that ends with OP_TRUE:
OP_TRUE | OP_RETURN <data>
As you can see, anyone could spend this. Whether that matters or not is a point for others to debate, but it is not the behavior we currently assume and expect.
The call to action
If you’re an app developer making use of OP_RETURN there is something very simple you can do right now that will ensure your app maintains consistent behavior both now and after the Genesis upgrade. Instead of starting your script with OP_RETURN, start it with OP_FALSE OP_RETURN. A locking script that contains this will always fail when it hits this sequence of op codes, both now and after the Genesis upgrade. Here’s an example of how the same technique we showed above would look with this op code pattern:
OP_TRUE | OP_FALSE OP_RETURN
As we can see, the top stack item is the false value pushed by OP_FALSE, so when we encounter OP_RETURN and exit the script we will look at the top stack item, see that it is not a Boolean ‘true’, and the script will be deemed to have failed, just like an old school OP_RETURN transaction.
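The difference between the two patterns under the two sets of rules can be checked with a small Python sketch. It is a toy model, not node code: the unlocking script is represented only by the main stack it leaves behind, and `locking_script_passes` with its `restored` flag is a name invented for this illustration. With `restored=False` OP_RETURN always fails the script; with `restored=True` it exits early and the normal top-of-stack check decides.

```python
def locking_script_passes(stack_from_unlocking, locking, restored):
    """Toy evaluation of a locking script under current or restored rules."""
    stack = list(stack_from_unlocking)
    for op in locking:
        if op == "OP_RETURN":
            if not restored:
                return False  # current rule: OP_RETURN always fails
            break  # restored rule: exit early, then apply the final check
        if op == "OP_FALSE":
            stack.append(False)
        elif op == "OP_TRUE":
            stack.append(True)
        else:
            stack.append(op)  # anything else is treated as a data push
    return bool(stack) and bool(stack[-1])


after_op_true = [True]  # the main stack left by an unlocking script of OP_TRUE
plain = ["OP_RETURN", "<data>"]
prefixed = ["OP_FALSE", "OP_RETURN", "<data>"]

for name, script in (("OP_RETURN", plain), ("OP_FALSE OP_RETURN", prefixed)):
    now = locking_script_passes(after_op_true, script, restored=False)
    later = locking_script_passes(after_op_true, script, restored=True)
    print(f"{name}: spendable now={now}, after restore={later}")
```

Only the plain OP_RETURN form changes behavior between the two rule sets; the OP_FALSE OP_RETURN form is unspendable in both.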
BSV v0.2.1, released on 12th July 2019, contains a change to recognize this op code pattern as a standard transaction. So once the majority of nodes are upgraded to this version there should be no functional difference to your app. Additionally, this version sets the data carrier size default to 100 KB, so transmission of data carrier transactions will become a lot easier and more reliable.
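For app developers the change amounts to one extra byte at the front of the locking script. The Python sketch below builds the raw script bytes for the OP_FALSE OP_RETURN pattern followed by one or more data pushes. The op code byte values are the standard Bitcoin script values; the function names `push_data` and `data_carrier_script` are made up for this illustration, and the sketch only builds the script, not a full transaction.

```python
OP_FALSE = 0x00
OP_RETURN = 0x6A
OP_PUSHDATA1 = 0x4C
OP_PUSHDATA2 = 0x4D
OP_PUSHDATA4 = 0x4E


def push_data(data: bytes) -> bytes:
    """Encode a data push using the shortest push op code for its length."""
    n = len(data)
    if n < OP_PUSHDATA1:  # 0..75 bytes: the length byte is the op code
        return bytes([n]) + data
    if n <= 0xFF:
        return bytes([OP_PUSHDATA1, n]) + data
    if n <= 0xFFFF:
        return bytes([OP_PUSHDATA2]) + n.to_bytes(2, "little") + data
    return bytes([OP_PUSHDATA4]) + n.to_bytes(4, "little") + data


def data_carrier_script(*chunks: bytes) -> bytes:
    """Build a locking script of the form OP_FALSE OP_RETURN <data>..."""
    script = bytes([OP_FALSE, OP_RETURN])
    for chunk in chunks:
        script += push_data(chunk)
    return script


print(data_carrier_script(b"hello").hex())  # 006a0568656c6c6f
```

If your existing code emits `6a` as the first byte of the script, emitting `00 6a` instead is the whole change.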
But what about my old OP_RETURN outputs?
I mentioned earlier that this won’t be a problem for existing outputs. I have written a separate article to explain UTXO height based activation, as it is a mechanism we will be using for several changes required in the Genesis upgrade. You can read it HERE, but the TL;DR is that no, your old OP_RETURN transactions will behave the same way they always did.
Get on it
The above section was titled ‘call to action’ and that’s exactly what this article is. You can change your code right now because OP_FALSE OP_RETURN currently behaves identically to a simple OP_RETURN on its own. The sooner your app supports this the sooner you can forget about it. The behavior of this script type won’t change after the upgrade. So please update your software ASAP. We are now 7 months out from the return to Genesis. There is a lot of work to do both in the BSV blockchain and in the wallet and app ecosystem. We can knock this one off now so we can focus on the other stuff later.
Steve Shadders