First off, this is far less than a theory. More likely I’m misunderstanding something, and looked at more closely the increasing run of 0’s isn’t actually a problem at all. But it seems to me that at some point the increasing likelihood of a collision must become statistically significant, and must somehow factor into the value of attempting an attack based on finding a “missing link” type block…

I still don’t fully understand the guts of the current reference (mainline) bitcoin client, and I definitely don’t fully understand the crypto and stats behind it. But one thing seems sort of obvious. Difficulty works out to a required number of 0 bits in the resulting hash for a block. Strictly speaking, the hash, read as a 256-bit number, has to come in at or below a target value. The higher the difficulty, the lower the target, and the more 0 bits at the front of the hash. It effectively sets a mask, where the leading bits MUST be 0 and (roughly) anything after them is wildcard.
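To make the mask idea concrete, here is a minimal sketch of the check, not the reference client's code. The function names are mine, and for simplicity the digest is read as a big-endian integer, whereas the real client treats the hash bytes as a little-endian number.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice, as Bitcoin does for block hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def leading_zero_bits(digest: bytes) -> int:
    """Count the 0 bits at the front of a 32-byte digest."""
    value = int.from_bytes(digest, "big")  # simplification: big-endian read
    return 256 - value.bit_length()


def meets_target(digest: bytes, target: int) -> bool:
    """A hash is acceptable when, read as a number, it is at or below the target."""
    return int.from_bytes(digest, "big") <= target


# A target that forces at least 8 leading zero bits (illustrative value).
toy_target = (1 << (256 - 8)) - 1
digest = double_sha256(b"toy block header")
print(leading_zero_bits(digest), meets_target(digest, toy_target))
```

Note that the comparison is "at or below", so a hash with more leading zeroes than required passes the same check.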

As the network speeds up, difficulty is adjusted. This isn’t quite a sliding window: the reference client recalculates it every 2016 blocks, based on how long those blocks took to find. This adjustment is to try to keep the block creation time as close as possible to a desired interval (ten minutes). The increased difficulty has the property that you’re less and less likely to find the right “salt” (the nonce) to add to get a hash that starts with that many zeroes.
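A toy version of that search shows how quickly it gets harder. This is a sketch under my own assumptions: `mine` and `expected_tries` are made-up names, the "block" is just some bytes plus an 8-byte counter, and the digest is read big-endian for simplicity.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def mine(prefix: bytes, zero_bits: int, max_tries: int = 5_000_000):
    """Try counters until the hash has at least `zero_bits` leading zero bits."""
    target = 1 << (256 - zero_bits)  # valid hashes are strictly below this
    for nonce in range(max_tries):
        digest = double_sha256(prefix + nonce.to_bytes(8, "little"))
        if int.from_bytes(digest, "big") < target:
            return nonce, digest
    return None  # gave up without finding one


def expected_tries(zero_bits: int) -> int:
    """Average number of attempts if hashes behave like uniform random numbers."""
    return 1 << zero_bits


result = mine(b"toy block", 12)
print(result is not None, expected_tries(12), expected_tries(80))
```

Each extra required zero bit doubles the expected work, which is why the real network needs so much hardware to find a block.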

All fine and good. Sure…

But there’s another side to that string of required zeroes. It reduces the number of possible unique values for the block hash. Right now, as we understand it, the SHA-256 algorithm used (which is actually applied twice, one pass over the output of the other) shouldn’t allow for any sort of reverse attack, i.e. finding good data from a hash…

However, at some point the difficulty ramps up enough that the shrinking number of possible valid hashes starts to increase the chance of a collision. At SOME point it becomes possible that the “correct” hash collides with an existing block hash… I don’t think there’s anything in the network that rejects a hash with a longer run of zeroes than required (higher difficulty in theory), only one with a shorter run.
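For a rough sense of scale, the usual birthday-style estimate can be worked out in a few lines. This is only a back-of-envelope sketch: it assumes valid hashes are spread uniformly over the reduced space, uses the simple "number of pairs divided by size of space" upper bound, and the block count and zero-bit figures are illustrative numbers I picked, not measurements.

```python
import math


def log2_collision_probability(num_blocks: int, zero_bits: int) -> float:
    """log2 of an upper bound on the chance that any two block hashes collide.

    With `zero_bits` forced to 0 there are 2**(256 - zero_bits) possible
    hashes; the bound is (number of pairs of blocks) / (number of hashes).
    """
    space_bits = 256 - zero_bits
    pairs = num_blocks * (num_blocks - 1) // 2
    return math.log2(pairs) - space_bits


# Illustrative assumptions: one million blocks, 80 leading zero bits.
print(log2_collision_probability(1_000_000, 80))
```

With those made-up inputs the bound comes out on the order of 2^-137, so under these assumptions the space is shrinking but is still enormous compared to the number of blocks. It says nothing about a structural weakness in SHA-256 itself, which is a separate question.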

A reverse attack is complicated quite a bit by the fact that part of the data needed for the next block comes from the accepted hash of the previous block. So, in theory, you can’t really “work ahead” because you do not know what the network might accept as the next block.
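The chaining can be sketched with a toy model. This is not the real block header layout: `block_hash` and `build_chain` are names I made up, and each "block" here is just the previous hash glued onto some payload bytes.

```python
import hashlib


def block_hash(prev_hash: bytes, payload: bytes) -> bytes:
    """Toy block hash: double SHA-256 over the previous hash plus the payload."""
    data = prev_hash + payload
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def build_chain(payloads, genesis_prev: bytes = bytes(32)):
    """Hash each payload in order, feeding every hash into the next block."""
    hashes = []
    prev = genesis_prev
    for payload in payloads:
        prev = block_hash(prev, payload)
        hashes.append(prev)
    return hashes


original = build_chain([b"a", b"b", b"c"])
altered = build_chain([b"a", b"X", b"c"])
# The first block matches; everything from the changed block onward differs.
print([x == y for x, y in zip(original, altered)])
```

Because every hash depends on the one before it, a chain built ahead of time only fits if the block it hangs off of has exactly the hash it was built on.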

But what if? What if you did know, or could know? Part of this is the theory behind “51%” attacks and the like. If you can assemble 51% of the network hashrate under the control of a single entity, in theory, you can basically run the whole network… If a hole is found wherein you can begin to link a “pre-built” chain of blocks together, because you know the previous blocks and all you need to do is solve for that “missing link” block, you might not need 51% of the network once the difficulty gets very high? This sort of assumes a possible weakness in the SHA-256 algorithm, or something else that allows one to bridge the gap with a single block, or maybe a small chain, onto a much larger following chain.

Just the simple fact that as difficulty goes up, the number of possibilities for a valid block hash goes down… is interesting. And it takes a bigger mind than mine, and more time and math knowledge than I have, to think about it more deeply. Maybe nothing, maybe something?

Just me whistling in the dark, I reckon.