Browsed by
Author: ShlomiZeltsinger

Bitpy – compiled!

Bitpy – compiled!

You can now download and run a compiled version of Bitpy (Bitpy0.0.1.exe) from mega.nz.

Just download it from this link.

Or you can copy and paste the link in your browser: https://mega.nz/#!kEMF2bpS!qkXPZBh4BPtaoA9bHz9WIeInE9VaU5QF9EY4XgB35sw

 

Please note that the code and program are intended for educational purposes only. Don’t use with any sensitive device or as a real live Bitcoin client.

Transaction part one – The misconceptions about the block chain

Transaction part one – The misconceptions about the block chain

Three levels of abstractification.

The are three level to understand the way the Bitcoin block chain works.

The first level
One of the most simple Bitcoin block chain abstractification

In the first level, we got those who just heard about Bitcoin for the first time. People usually thinks that the coins are just associated with the Bitcoin address. Whenever Alice sends Bob coins, she just place a statement (or transaction) in the block chain specifying that the X coins that were associated with Alice address, should now be associated with Bob address. This is of course wrong since this representation contains only 3 thing: the sender address, the receiver address and the amount of coins to be transferred. But this level of abstraction is usually the first thought most people have when first presented with the idea of the block chain.

The second level
Points to previous transacions

In the second level, people starts to understand that each transaction also contains the origin of each coin. So if Alice wants to send 5 BTC to bob, she first needs to show where she got these 5 BTC from, That is, she also needs to provide a way to point to older translations on the block chain, transactions which proves that Alice is indeed the rightful owner of these five BTC. So now what we have starts to look more like a chain. It’s no longer just a table containing the address and the current BTC balance,  rather the block chain needs to contain a list of all previous transactions. This depiction is more accurate, but still it isn’t complete – introducing the third level.

The third level
Points at previous transaction and set the condition for claiming the coins.

In the third level, people learns that a transaction is more then just a simple statement in the block chain. Transaction contains more then just information about the origin of the coins (the input), the address of the receivers (the outputs) and the amount of coins to be transferred. Rather, the transaction also contains a riddle in it, and only by solving this riddle, can the coins be claimed and transferred. So when Alice send 5 BTC to Bob, she doesn’t only pointing at the transaction from which Alice got the coins, it also solves the riddle specified in that transaction to prove that Alice is allowed to claim those coins, but that’s not the end of it, in the transaction that Alice publish on the block chain she’s also inserting another riddle, a riddle that only Bob can answer. So if Bob would like to use this coins in the future, he’ll have to solve that riddle which was provided to him by Alice. And the process goes on and on and on.

I love riddles! Or, how to solve the riddle and prove that I’m allowed to claim the coins?

A quick reminder. Signing messages and key pair.

In the previous post we’ve talked about mathematical trapdoors and hashing functions. We’ve saw how we can use these types of function so sign a message with a private key and how these signed message can be verified with the corresponding public key.

F("My name is Alice", Alice's_private_key) = signed message.

Verify(signed message, "My name is Alice", Alice's_public key) = true.

Change any component in the Verify statement, and we’ll receive false. This way we can easily prove that the message “My name is Alice” was indeed signed by Alice and that no one tempered with that message.

Pay attention! We cannot sign a message with the public key, it simply wont work! the public key can only be used to verify a message, not to sign it!

So, Alice can now publish a message in the block chain that contains the input (origin of the coins) and the output (the address of the receiver). This is the original message, public for anyone to see and check as he or she pleases. Alice can also now specify the following rules in that public message:

  1. This message is the original message.
  2. Within this message I’ve included Bob’s_public_key (In hashed format – We’ll get there in a second).
  3. Take your public key and hash it. If your hashed public key matches the hashed public key from step 2, you may move to the next step.
  4.  Take this original message, and sign it with your own private key.
    F(This message, Bob’s_private_key) = signed message.
  5. The signed message should be verified only against the public key that is specified in this message.
    Verify(signed message, This message, Bob’s_public_key) = true.
  6. If the verification indeed yield true, you may claim the coins in this transaction

 Getting Bob’s public key

When Alice specify Bob as the recipient of a transaction, she does so by inserting Bob’s hashed public key. The hashed public key (as we already saw in the post dealing with keys) can be deducted from Bob’s Bitcoin address.

The public key is hashed and then added to the final Bitcoin address.
The public key is hashed and then added to the final Bitcoin address.

So basically, knowing Bob’s Bitcoin address is equal to knowing Bob’s hashed public key. All we need to do is to convert the address back from base 58, and remember that out of the 25 bytes of the binary address, we need to omit the first byte and the last 4 bytes.

Scrips – the beginning

As we’ve already saw many times in the past, Bitcoin message is no more then a string of bytes that follow a predefined order. That means that in order to specify conditions and rules like the one Alice is using in her transaction message, we also need all parties to agree on a predefined field that will contain the conditions, and we also need to agree on a way to translate the bytes in that field into a set of rules, or instructions. For this purpose, a scripting language was created for Bitcoin, this langue allocate predefined operation to a predefined byte. For example, the byte 0x8b means “plus 1”, the byte 0x8c means “minus 1”, the byte 0xa9 means “hash the public key. First with SHA-256 and then with RIPEMD-160”, and so on. There’re many more of these operators (also called OP-codes) and these OP-codes helps us to specify the rules that needs to be fulfilled. These rules are transparent and are publicly visible on the block chain. Whenever a node verify a transaction, the nodes follows these rules and make sure that they eventually do yield a true statement.

We’ll talk more about these OP-codes and scrips in the next sections, when we’ll also see a real example of valid transaction script.

 

Keys, addresses and hashing

Keys, addresses and hashing

A key pair is one of the greatest tools that are used in Bitcoin, but it might be a little unintuitive at first. Don’t worry, you’ll get it!

There’s also a short video I made a few months ago that describes the basics of keys. It doesn’t completely corresponds to our current project, but it might provide you with another point of reference. You can watch it over here – Bitcoin python tutorial for beginners – keys and address.

 

One way function

The name “one way function” is quite self explanatory. These function are very easy to solve in one way but almost impossible to invert. Giving function f, and the input x, I can easily calculate the result y.

f(x) = y <- easy to solve

But given the result y, and the function f, It will be almost impossible to find x

f(?) = y <- almost impossible to guess.

The Bitcoin protocol define the use of some of these one way functions (SHA256, ripmed, ECDSA and murmurhash. More functions are being tested and might be used in the future).  Each one of these function has its own place in the protocol. Some functions will be used more then once and/or will be combined with another function to achieve even a grater level of security. For example, signing a message (usually a transaction message) will be done using the SHA256 function, SHA256("hello!"), finding the checksum of the message payload will be done using the SHA256 function twice SHA256(SHA256(message)).

  • Some people have hard time to accept the concept of “hard to guess”, they feel it’s too ambiguous. Well, technically an extremely powerful computer might be able to iterate through all possible results until it will find the right one (this is called brute force), but in practice, it will take a very – very long time. Trying to brute force the result of a SHA256 function on a 32 bytes message will take about 10^65 years. The age of the universe is only 1.4*10^9 year. I think it’s good enough security.
It’s easy to get the result y of function f for a giving x. But almost impossible to tell what the original input was.

 

Mathematical trapdoor and key pair

Mathematical trapdoor is a special type of one way function. The main difference is that in mathematical trapdoor we may also use few extra pieces of information called keys. Bitcoin uses the mathematical trapdoor function ECDSA or Elliptic Curve Digital Signature Algorithm, to produce two keys, or a key pair -A private key, and a public key – Both keys will always come in pairs! there cannot be a public key that matches two different private keys and vice versa!

The private key is used to solve (sign) the function f for message x. The result is the signed message y.

f(private_key, x) = y <- easy to solve

Now I have two messages. The original message x, and the signed message y. I want to prove that I’m the one who signed the original message x, that I’m the owner of the private key. But I don’t want to give my own private key. Anyone who have my private key will be able to sign in my name on other messages as well. So I’m using the public key. The public key can only be used to prove the solution of the function, but it cannot be used to sign messages

f(public_key, x) = y <- easy to prove

f(public_key, x) = null <- I can't sign message x with the public key. only with the private key

  • Pay attention that when we’re using the public key we’re just proving the equation, not solving it.

Here’s a simple numeric example I found on the wikipedia page on mathematical trapdoor:

An example of a simple mathematical trapdoor is “6895601 is the product of two prime numbers. What are those numbers?” A typical solution would be to try dividing 6895601 by several prime numbers until finding the answer. However, if one is told that 1931 is one of the numbers, one can find the answer by entering “6895601 ÷ 1931” into any calculator. This example is not a sturdy trapdoor function – modern computers can guess all of the possible answers within a second – but this sample problem could be improved by using the product of two much larger primes.

 

Let’s see an example:

Step one – create a key pair:

Create key pair using the ECDSA function and some random numbers
Create key pair using the ECDSA function and some random numbers

Step two – sign a message with the private key:

using the private key and the ECDSA algorithm
using the private key and the ECDSA algorithm

Step three – send the original message alongside the encrypted message and the public key

3 items are needed to validate the message. The public key, the original message and the encrypted message
3 items are needed to validate the message. The public key, the original message and the encrypted message

The code

In our project we’ve defined the Key class under Bitpay/Utils/KeyUtils/keys.py. This class contains all the necesery steps that are required in order to generate a private key, trnsform that private key to a public key and then create a Bitcoin address out of that public key.

step one – create (or receive) the private key

The first thing that we’re going to do is to create our private key. The private key is defined as  a random 32 bytes uint. Our class begins with a simple check. If the user initialize the Key class with an already existing private key, that private key will be saved into self.private_key. Otherwise, we’re using the urandom function in the os module to create a random 32 bytes long number.

def __init__(self, private_key=0):
    if private_key == 0:
        self.private_key = os.urandom(32)
        self.printable_pk = str(binascii.hexlify(self.private_key), "ascii")
    else:
        self.printable_pk = private_key
        self.private_key = binascii.unhexlify(private_key.encode('ascii'))

You might’ve noticed that we’ve also created a printable_pk variable. This variable will store the private key in hexadecimals. This way it is easier to store, copy and/or print the private key.

 

Step two – Use the private key to initialize the signing function

After we got our private key it’s time to use it initialize our ECDSA function. This step is similar to declaring our function f with the private key pr_k.

self.sk = f(pr_k, )

We’re defining the variable self.sk (for Signing Key) and use SigningKey.from_string from the ECDSA module with two arguments, the first one is our self.private key, and the second one is the curve (We haven’t talked about the curve yet, But it represent the mathematical part of our function. This is too advance mathematics so we won’t go into it in this project. But for now we just need to know that the Bitcoin protocol requires us to use the ECDSA function with the mathematical curve SECP256k1)

self.sk = ecdsa.SigningKey.from_string(self.private_key, curve = ecdsa.SECP256k1)

 

Step three – Use the initialized function (self.sk) to get the public key

Now that we got our signing key, we can use it in order to create our public key.

self.vk = self.sk.verifying_key

We’re defining a new variable called self.vk which will hold the verifying key, or the public key that can be sent alongside the signed message and the original message. This key will be used to verify that the message was indeed signed by the owner of that public key. And since every public key matches only one specific private key, it also proves that the one who signed the message also possess the corresponding private key.

 

Step four – Formatting the public key.

The variable self.vk holds the public key that will be used to verify our signed messages. But the Bitcoin protocol requires that we’ll represent this public key in couple of different formats.

The following chart from the Bitcoin wiki site shows the way the public key should be formatted:

Converting the public key to Bitcoin address
Converting the public key to Bitcoin address

 

The first line is the real public key, or in our case the verification key self.vk

this is the real public key - but we can't send it like this. We need to do dome formatting
this is the real public key – but we can’t send it like this. We need to do dome formatting

 

The second line tells us that we need to inser the byte 0x04 at the beginning of our public key

self.public_key =  b"04" + binascii.hexlify(self.vk.to_string())

We’re using the function to_string in order to display the variable self.vk  as a string. Then we convert it to hexadecimals so it will be easier to append the byte 0x04.

This is the public key in Bitcoin terminology. Usually, When looking for the public key in signed transactions, that's what it will look like
This is the public key in Bitcoin terminology. Usually, When looking for the public key in signed transactions, that’s what it will look like

 

The third line tells us to hash the public key twice. once using the SHA256 function, and then again using the ripemd160 function.

ripemd160 = hashlib.new('ripemd160') # <-initializing the ripemd160 function 
ripemd160.update(hashlib.sha256(binascii.unhexlify(self.public_key)).digest())
First hashing the public key using the SHA256 function. Then the result is hashed with the ripemd160 function
First hashing the public key using the SHA256 function. Then the result is hashed with the ripemd160 function

The forth line tells us to add another byte at the beginning of the hashed key.

We're working with the main network so we'll add the byte 0x00
We’re working with the main network so we’ll add the byte 0x00

This is the network ID byte which is used to prevent us from using keys and addresses that were generated in the test network, in the main network (and vice versa). In our example we’re using the main network, so the byte we’ll add will be 0x00.

self.hashed_public_key = b”00″ + binascii.hexlify(ripemd160.digest())

In Bitcoin terminology, the result is the hashed public key. This format is used mostly when creating a transactions.

 

The fifth (and sixth) line tells us to take our hashed public key and hash it again, twice, using the SHA256 function. The first 4 bytes of the result will be the checksum.

self.checksum = binascii.hexlify(hashlib.sha256(hashlib.sha256(binascii.unhexlify(self.hashed_public_key)).digest()).digest()[:4])
The checksum is the first 4 bytes
The checksum is the first 4 bytes

 

The seventh line creates the Bitcoin address in its binary form by appending the hashed public key with the checksum. This is a valid Bitcoin address, but it still need to go through one more process before it can be used with most Bitcoin wallets.

self.binary_addr = binascii.unhexlify(self.hashed_public_key + self.checksum)

 

The last line Finally we’ve reached the end point. There’s only one more thing we need to do before we can get the standard Bitcoin address and that is to convert the binary code of the address into a base58 string. The idea behind this conversion is quite simple. In order to reduce human errors, it was decided that some characters will be omitted from the standard Bitcoin address. characters like capital O, the number 0, lower case l and upper case I, as well as many more characters were omitted.

The final address represented in base 58.
The final address represented in base 58.
self.addr = base58.b58encode(self.binary_addr)
  • You might need to install the base58 module using the command pip install base58.

 

User interface

We’ve also added a tab to our graphical user interface which might help. You can use it to see the public key, hashed public key and Bitcoin address or any given private address.

The user interface for the keys can be found in the second tab
The user interface for the keys can be found in the second tab
Connection part three – Receiving messages

Connection part three – Receiving messages

In the previous posts, all that we’ve done was to construct and send messages to another node on the network. In this post, we’ll see what happens to incoming messages.

First stop – The ReceiverManager:

class ReceiverManager(Thread):
    def __init__(self, sock):
        Thread.__init__(self)
        self.sendingQueue = Utils.globals.sendingQueue
        self.sock = sock
        self.ping = ""

        self.outfile = open("data_received_from_node.txt", 'w')

    def run(self):
        while True:
            try:

                # get only the header's message
                header = self.sock.recv(24)

                if len(header) <= 0:
                    raise Exception("Node disconnected (received 0bit length message)")

                headerStream = BytesIO(header)
                parsedHeader = HeaderParser(headerStream)

                # get the payload
                payload = self.recvall(parsedHeader.payload_size)
                payloadStream = BytesIO(payload)

                self.manager(parsedHeader, payloadStream)

            except Exception as e:
                print(e)
                break

        print("Exit receiver Thread")

The receivermanager always runs in the background, checking our Thread for any incoming packets. Once it receives a packet, it will immediately cut its first 24 bytes.

header = self.sock.recv(24)

The first 24 bytes are the header. If you remember from this post, every Bitcoin message will starts with header, and the header is always exactly 24 bytes long.

 

The first 24 bytes are the header. The rest is the payload
The first 24 bytes are the header. The rest is the payload.

This header is now parsed as a string of bytes and passed to the HeaderParser class in Bitpy/Network/HeaderParser.py

headerStream = BytesIO(header)
parsedHeader = HeaderParser(headerStream)

 

Second stop – The HeaderParser class:

The HeaderParser class takes the first 24 bytes as a long string of bytes, and then it reads them in the same order that we’ve seen before.

Size (Bytes) Name Data type Description
4 Start string char[4]  The network identifier
12 Command name char[12]  The name of the command.
4 Payload size uint32 Len(payload)
4 Checksum char[4]  SHA256(SHA256(payload))[:4]

First 4 bytes for the Start string (or Magic number), another 12 bytes for Command name, the next 4 bytes are the Payload size and the last 4 bytes are the checksum.

4 bytes for starting string. 12 for command name. 4 for payload size and 4 for checksum
4 bytes for starting string. 12 for command name. 4 for payload size and 4 for checksum
class HeaderParser:
    def __init__(self, header):  # Packets is a stream

        self.magic = read_hexa(header.read(4))
        self.command = header.read(12)
        self.payload_size = read_uint32(header.read(4))
        self.checksum = read_hexa(header.read(4))

        self.header_size = 4 + 12 + 4 + 4

    def to_string(self):
        display = "\n-------------HEADER-------------"
        display += "\nMagic:\t %s" % self.magic
        display += "\nCommand name	:\t %s" % self.command
        display += "\nPayload size	:\t %s" % self.payload_size
        display += "\nChecksum	:\t\t %s" % self.checksum
        display += "\nheader Size:\t\t %s" % self.header_size
        display += "\n"
        return display

We’ve also defined the to_string function which basically makes it easier to print a human readable version of the message header.

You might’ve noticed that currently our code just accept the checksum field from the received message without checking it. This is of course a security flaw in our code. The checksum filed is there to help us verify the authenticity of the message. That is one of the ways we can make sure that no one tempered or changed the message on its way from the sender node to our node. But for the time being we’ll assume that the message is indeed authentic and we’ll accept the checksum as is.

 

Third stop – Back to the ReceiverManager:

Now that we have our header, it’s time to get the payload. The size of the payload was defined in the header of the message. We need to cut that amount of bytes from our incoming packets, just as we cut the first 24 bytes of the header. There’s however one extra step in our code. Instead of using the built in sock.recv function (as we did for the header) we’ve decided to implement our own recevall function. The rational was that since we have no way to predetermine the size of the payload, and since the built in sock.recv can’t handle large packets of unknown size, it would be wiser to break the payload into smaller parts and append them together. This has nothing to do with the Bitcoin protocol, it’s only our way to make sure that the code will properly handle large messages.

def recvall(self, length):
    parts = []

    while length > 0:
        part = self.sock.recv(length)
        if not part:
            raise EOFError('socket closed with %d bytes left in this part'.format(length))

        length -= len(part)
        parts.append(part)

    return b''.join(parts)

So now, after we’ve cut the required amount of bytes that represents the payload of our message, and we have both our header (which was already parsed) and our payload (yet to be parsed), we’ll pass them both to the receivermanager manager function.

 

Forth stop – Manager:

    
def manager(self, parsedHeader, payloadStream):

    command = parsedHeader.command.decode("utf-8")
    message = {"timestamp": time.time(), "command": command, "header": parsedHeader.to_string(), "payload": ""}


    if command.startswith('ping'):
        ping = Ping.DecodePing(payloadStream)

        pong = Pong.EncodePong(ping.nonce)
        packet = PacketCreator(pong)
        self.sendingQueue.put(packet.forge_packet())

        message["payload"] = str(ping.nonce)
        self.display(message)

    elif command.startswith('inv'):
        inv = Inv.DecodeInv(payloadStream)
        message["payload"] = inv.get_decoded_info()
        self.display(message)

    elif command.startswith('addr'):
        addr = Addr.DecodeAddr(payloadStream)
        message["payload"] = addr.get_decoded_info()
        self.display(message)

    elif command.startswith('pong'):
        pong = Pong.DecodePong(payloadStream)
        message["payload"] = pong.get_decoded_info()
        self.display(message)

    elif command.startswith('version'):
        version = Version.DecodeVersion(payloadStream)
        message["payload"] = version.get_decoded_info()
        self.display(message)

The manager function does a very simple thing. It checks the command of the message (the command is part of the header) and then it sends the message payload to be parsed by the corresponding functions. For example. If the manager sees that the command is «pong», it will use the decodepong method in Bitpay/Packets/control_messages/pong.py to extract the desire fields out of it. (You can read more about «pong», «ping» and «verack» messages in this post.).

 

Divergence

We have our pared message, both its header and payload. And now we need to decide what to do with them. For some messages this might be the end of the line. There’s nothing more we can do with them. Some might require us to act. «ping» message should be answered by a «pong» message, transactions should be checked and relayed (We’ll talk about transactions in later posts), «version» messages should be acknowledged by sending back a «verack» message.

A major part of learning the Bitcoin protocol is learning how each and every message should be dealt with. Which fields of information it contains and what is the meaning of this information. We’ve already talked about some of the messages in previous posts  (see here for «ping», «pong» and «verack» messages, and here for «version» message.) and as our project will have more features implemented, so we’ll discuss other type of messages and how to deal with them.

User interface

User interface

The best way to make our code really useful is to add user interface to it. There is a lot of this that can only be taught and understood by looking at the code itself, bu there are also many things that the average user can learn about the Bitcoin protocol that can be explained in a more “human friendly” way. An that is way Alexis and I have decided to add a Graphical User Interface (GUI) to our project in hope that in the future we can create a full Bitcoin graphical environment.

We’ve added a new folder to hold our GUI files, the UI folder. This folder contains 3 implementations. The first one uses the Tkinter module, the second uses the pyQt module and the third is a simple command line interface (And this is why we’re calling this folder UI and not GUI). We still haven’t decided on the final implementation bu we believe that we’ll eventually go with only the pyQt implementation, basically because this one looks the best!

 

This is what it looks like at the moment:

Bitcoin graphical environment
The user interface of bitpy.

This is our basic design, At the bottom there’s a list of all the messages you can send: «version», «verack» and «ping»   (More messages will be added soon). The left side keeps a record of all of the incoming messages, in the order in which they were received. Once we choose one of these messages, we can see at the right panel a display of the parsed message. The header and the payload.

Remember! this is just our first draft, but we’re quite proud of it. Little by little our project shows more and more potential.

 

installing dependencies

Because we’re using pyQt5, you might need to install the pyQt5 module on your machine. The easiest way to do so will be to use the commend:

pip install pyqt

 

 

 

Pivot number one – Python 2.7 to 3.5

Pivot number one – Python 2.7 to 3.5

Why did we migrated from Python 2.7 to Python 3.5?

After many discussions and long conversations Alexis and I decided that we should use Python 3. I can’t really point on the exact reason that made us agreed on this change, but I guess the parse_ip bug (As described over here) was the real catalysis. We’ve spent so much time fixing something that worked just fine in python 3. We just snapped, and decided to bite our lips and make the change.

So how will your code base be effected?

Not much actually. We’ve tried our best to keep the changes in the code as minimal as possible. But nevertheless, there’re changes. We’ll add a sub section to each individual post that was published prior to the Python 3.5 migration. In these sub sections we’ll do our best to cover all of the changes in the code. We’re also highly suggesting that you’ll view the full change log in github.

  • Pay attention! The github change log displays all of the changes in the code, included many changes that are nothing more than a draft and changes that aren’t necessarily related to Python 3 migration.

 

Here’s a list of most of the changes that were made in the code during this Python 3.5 migration:

  1. Every print function is now requires parentheses print "hello world" --> print ("hello world").
  2. We’ve fixed the data type – parse_ip bug (described over here).
  3. We’ve added a new data type set of functions called read_hexa and to_hexa which are used for decoding and encoding hexadecimals. (don’t forget, we’re working mostly with strings of bytes, so hexadecimal decoder and encoder are highly important).
  4. we’ve added a try call in Bitpy/Packets/PacketCreator.py .

We also suggest that you’ll read this great article on the general changes between Python 2.x and Python 3.x.

 

Messages part three – Ping Pong and VerAck

Messages part three – Ping Pong and VerAck

VerAck

When establishing connection, we first need to send a version message to the node we wish to connect to. But keeping our connection alive will require the use of 3 more messages: ping, pong and VerAck.

The VerAck message is the most simple type of message, it’s basically an empty message, it has no payload, only a header. Alexis and I have decided that any type of message will have its own file to construct and parse its payload, even if that message doesn’t have any payload. It helps us to maintain the structure of our project.

Bitpy/Packets/control_messages/verack.py:

class EncodeVerack:
    def __init__(self):
        self.command_name = "verack"

    # Verack messages have an empty payload
    def forge(self):
        return ""


# No need this class because, like GetAddr, there is no payload
class DecodeVerack:
    def __init__(self, payload=""):
        pass

As you can see, the function forge will return an empty string and the class DecodeVerAck won’t do anything.

Pay attention that even though the VerAck payload is empty, this message still needs to have a header. The header will contain 4 bytes for the Magic number (or starting string), 12 bytes for the command name (VerAck), 4 bytes for the size of the payload (which in this case will be zero) and another 4 bytes for the checksum (which in this case will always be the same).

 

 

Ping and Pong

One node can always send a ping request to the other. That is how peers on the network can check if the connection is still alive. Ping request is a nothing more than a message, just like any other Bitcoin message, that contains one field:

Size (Bytes) Name Data type Description
8 nonce uint_64 A random number

Alice sends Bob a ping message. This ping message will contain 8 bytes of random number. Bob will receive this ping message from Alice, and will acknowledge her request by sending her a pong message, this pong message will contain the same random number that Alice sent to Bob.

Let’s have a look at the code implementation

Bitpy/Packets/control_messages/ping.py:

import random
from Utils.dataTypes import *


class EncodePing:
    def __init__(self):
        self.command_name = "ping"
        self.nonce = to_uint64(random.getrandbits(64))

    def forge(self):
        return self.nonce


class DecodePing:
    def __init__(self, payload):
        self.nonce = payload.read(8)

When we want to construct a ping message we’ll use the class EncodePing which will set the field nonce to be 8 bytes long random number (64 bit = 8 bytes). When we’ll receive a ping message, we’ll use the class DecodePing with the payload of the incoming message to find out the nonce of that ping message (We’re only reading the nonce in the ping message, we haven’t used it yet). We’ll use this nonce when we’ll construct our returning (pong) message.

 

Bitpy/Packets/control_messages/pong.py:

from Utils.dataTypes import *

class EncodePong:
    def __init__(self, ping_received):
        self.command_name = "pong"

        self.nonce = ping_received

    def forge(self):
        return self.nonce

class DecodedPong:
    def __init__(self, pong_received):
        self.command_name = "pong"
        self.nonce = read_uint64(pong_received.read(8))

    def get_decoded_info(self):
        return "\npong   :\t\t %s" % self.nonce

The class EncodePong will take the payload of the incoming ping message and will assign it to be the nonce of the pong message (the message that we’ll send back). The DecodedPong class will receive the payload of the pong and will assign it to the self variable nonce. We can use the get_decoded_info function to display this nonce number. Because the pong is just a returning message, there’s nothing else we need to do with it.

Messages part two – Payloads and version message

Messages part two – Payloads and version message

In the previous posts we’ve talked a little bit about messages. We know that a message is nothing more than a string of bytes, it has an header and a body (payload), and it must maintain its predefined format. We’ve seen the format of the header, but every message body (payload) will contain different information, according to the message type. We’ll start by constructing the “version” message.

The version message is used when trying to establish a connection with the remote node. Alice will send the version message to Bob, and only after Bob have approved this version message, and replay with his own version message, only then the connection between the two nodes can be established. No other message will be accepted before both nodes have exchanged this version message. So it’s no surprise that we choose to construct this message first, since this is the first message that we’ll send, and the first one we’ll receive.

 

The fields that are required in our version message (From the developer reference):

 

Size (Bytes) Name Data type Description
4 version int32 What is the latest version of the protocol that the transmitting node (our node) understands. In this example this number is 70012
8 services uint64
Not full node 0x00
Full node 0x01
8 timestamp int64 Current timestamp
8 addr_recv services uint64 What type of services OUR receiving node can support?
16 addr_recv IP address char The IP address of OUR receiving node
2 addr_recv port uint16 The port of OUR receiving node
8 addr_trans services uint64 What type of services OUR transmitting node can support?
16 addr_trans IP address char The IP address of OUR transmitting node
2 addr_trans port uint16 The port of OUR transmitting node
8 nonce uint64 A random number that helps the receiving node to detect and index our connection
Varies user_agent bytes CompactSize This field varies in size, but it tells the other node what should be the size of the next field
Varies user_agent string This field is used to display the name of our node, like licence plates. We can call our node whatever we want, “core”, “classic”, “my_cool_bitcoin_thingy”.
4 start_height int32 The highest block that the transmitting node knows of.
1 relay bool
True The transmitting node can relay messages to the rest of the network
False The transmitting node can’t relay messages to the rest of the network
  • Pay attention that in the “version” message, when asking for both the receiving and the transmitting  services, IP address and ports, we’re asked about our own machine, our own node. In our case both incoming and outgoing messages will be dealt in a similar manner, but some implementations might include more advanced routing.

Code implementation

Alexis and I decided that every message will have it’s own file in which the payload of the message will be both created (For messages that our node will send) and parsed (For incoming messages).

import random
import time
from io import BytesIO
from Utils.config import version_number, latest_known_block
from Utils.dataTypes import *


class EncodeVersion:
    def __init__(self):
        self.command_name = "version"

        self.version = to_int32(version_number)
        self.services = to_uint64(0)
        self.timestamp = to_int64(time.time())

        self.addr_recv_services = to_uint64(0)
        self.addr_recv_ip = to_big_endian_16char("127.0.0.1")
        self.addr_recv_port = to_big_endian_uint16(8333)

        self.addr_trans_services = to_uint64(0)
        self.addr_trans_ip = to_big_endian_16char("127.0.0.1")
        self.addr_trans_port = to_big_endian_uint16(8333)

        self.nonce = to_uint64(random.getrandbits(64))
        self.user_agent_bytes = to_uchar(0)
        self.starting_height = to_int32(latest_known_block)
        self.relay = to_bool(False)

    def forge(self):
        return self.version + self.services + self.timestamp + \
               self.addr_recv_services + self.addr_recv_ip + self.addr_recv_port + \
               self.addr_trans_services + self.addr_trans_ip + self.addr_trans_port + \
               self.nonce + self.user_agent_bytes + self.starting_height + \
               self.relay


class DecodedVersion:
    def __init__(self, payload):
        self.version = read_int32(payload.read(4))
        self.services = read_uint64(payload.read(8))
        self.timestamp = read_int64(payload.read(8))

        self.addr_recv_services = read_uint64(payload.read(8))
        self.addr_recv_ip = parse_ip(payload.read(16))
        self.addr_recv_port = read_big_endian_uint16(payload.read(2))

        self.addr_trans_services = read_uint64(payload.read(8))
        self.addr_trans_ip = parse_ip(payload.read(16))
        self.addr_trans_port = read_big_endian_uint16(payload.read(2))

        self.nonce = read_uint64(payload.read(8))

        self.user_agent_bytes = read_compactSize_uint(BytesIO(payload.read(1)))
        self.user_agent = read_char(payload.read(self.user_agent_bytes), self.user_agent_bytes)

        self.starting_height = read_int32(payload.read(4))
        self.relay = read_bool(payload.read(1))

    def get_decoded_info(self):
        display = "\n-----Version-----"
        display += "\nversion                :\t\t %s" % self.version
        display += "\nservices  	         :\t\t %s" % self.services
        display += "\ntimestamp              :\t\t %s" % self.timestamp

        display += "\naddr_recv_services	 :\t\t %s" % self.addr_recv_services
        display += "\naddr_recv_ip           :\t\t %s" % self.addr_recv_ip
        display += "\naddr_recv_port         :\t\t %s" % self.addr_recv_port

        display += "\naddr_trans_services  	:\t\t %s" % self.addr_trans_services
        display += "\naddr_trans_ip         :\t\t %s" % self.addr_trans_ip
        display += "\naddr_trans_port	    :\t\t %s" % self.addr_trans_port

        display += "\nnonce                 :\t\t %s" % self.nonce

        display += "\nuser_agent_bytes  	:\t\t %s" % self.user_agent_bytes
        display += "\nuser_agent            :\t\t %s" % self.user_agent
        display += "\nstarting_height	    :\t\t %s" % self.starting_height
        display += "\nrelay	                :\t\t %s" % self.relay

        return display

Because this code is used both for incoming and outgoing messages, it has both the class EncodeVersion, which is used to build the payload of the version message, and the class DecodedVersion which is used to parse the payload of any incoming version message.

The function forge will just append and return all the fields in the right order – this is the finale payload.

 

EncodeVersion

Because we haven’t established connection yet, we first need to create the payload of our version message using the EncodeVersion class. The class won’t take any argument (except for self) and will just assign every field with the right value and the right data type.

The variables version_number and last_known_block are imported from Bitpy/Utils/config.py and are set to:

version_number = 70012
latest_known_block = 416419  # june 2016

Our node is not a full node so the services will be set to 0x00. For that reason we’ll also set our relay field to be False.

In our own node, both incoming and outgoing messages will be dealt by the same machine so both receiving and transmitting machines are the same:

self.addr_recv_services = to_uint64(0)
self.addr_recv_ip = to_big_endian_16char("127.0.0.1")
self.addr_recv_port = to_big_endian_uint16(8333)

self.addr_trans_services = to_uint64(0)
self.addr_trans_ip = to_big_endian_16char("127.0.0.1")
self.addr_trans_port = to_big_endian_uint16(8333)

 

We’re using the function random.getrandbits(64) in order to populate or nonce field with 8 bytes long random number.

We’re also not adding any vanity name to our node at the time so we’re setting the user_agent_bytes to be 0. That means that there’s no user_agent_bytes field.

 

 

DecodedVersion

This class is quite straightforward, it receives the payload of the incoming message from Bitpy/Manager/ReceiverManager.py.

It uses the builtin function read  and our data types functions to assign each field with the proper value, for example the first 4 bytes are the version number in uint32 format, the next 8 bytes are the services field in uint64 format and so on.

The only thing that is really unique is the user_agent field:

self.user_agent_bytes = read_compactSize_uint(BytesIO(payload.read(1)))
self.user_agent = read_char(payload.read(self.user_agent_bytes), self.user_agent_bytes)

This is the first example of the varying CompactSize data type in use. The field user_agent_bytes doesn’t have a fixed size. The Bitcoin protocol defines the variable data type CompactSize to deal with such fields (you can read more about this data type in the data types section). We’re using the function BytesIOin in order to send this argument as a string of bytes to theread_CompactSize_unit function and receives back the Uint that matches the size of the next field, the, the user_agent field. Then we’re using the data type function read_char which requires two arguments. The first is the string of bytes itself (payload.read(self.user_agent_bytes)) and the second is the size of the total string (self.user_agent_bytes).

Once we’ve finished parsing out version message we can use the get_decode_info function in order to display the information about the remote node (currently, we aren’t doing anything with this information except to dump it as a text file).

—–Version—–
version : 70012
services : 5
timestamp : 1467293151
addr_recv_services : 1
addr_recv_ip : ��^�V�
addr_recv_port : 30373
addr_trans_services : 5
addr_trans_ip : ��
addr_trans_port : 8333
nonce : 1755461931592560680
user_agent_bytes : 16
user_agent : /Classic:0.12.0/
starting_height : 418653
relay : True

 

Edit (4-Jul-2016): Python 2.5 to 3.5 migration

Please read the general notes about the transition from Python 2.5 to 3.5 over here. And the complete github change log for the migration over here.

The code for the <Version> message remind fairly untouched, only few adjustments were required:
class EncodeVersion:
///
self.timestamp = to_int64(int(time.time()))
///
self.addr_recv_ip = to_big_endian_16char(b“127.0.0.1”)
///
self.addr_trans_ip = to_big_endian_16char(b“127.0.0.1”)
///
class DecodedVersion:
///
self.user_agent = read_chars(payload.read(self.user_agent_bytes), self.user_agent_bytes)
///

 

 

Messages part one – constructing the message and headers

Messages part one – constructing the message and headers

Communication in the Bitcoin network is done via messages.  A message is no more than just a string of bytes.

This is an example of a simple “ping” message:

f9beb4d970696e670000000000000000080000001b3cb220309941550a2ffd5c # The full message. Header + Payload

The Bitcoin protocol describes how each message should be packet. It is highly important to make sure that you construct the message in the right format. The machine on the other side will expect to receive a very specific format and it won’t be able to process any message that won’t fit this format.

Every message is made of two main components:

  1. Header (not to be confused with block headers! we’ll talk about them in a later post!)
  2. Payload

The payload is the body of the message – The message itself. Many types of messages are defined in the Bitcoin protocol, and we’ll talk about each message later on, but for now we should take a look at the header and that’s because the header is the one thing ALL of the messages have in common.

Each header is made of four components:

Size (Bytes) Name Data type Description
4 Start string char[4]  The network identifier
12 Command name char[12]  The name of the command.
4 Payload size uint32 Len(payload)
4 Checksum char[4]  SHA256(SHA256(payload))[:4]

Starting-string

The first 4 bytes of the message are the Starting-string (or Magic number).

f9beb4d9 # 4 bytes starting string (Magic number)

This number tells the receiving machine which network I’m using. In this tutorial we’re using the real main-network (Caution! any mistake made in the real main network might cost you real Bitcoins!). The Magic number of the main-network is 0xf9beb4d9. We can look at the “ping” message example at the top and see for yourself that the first four bytes in the message are indeed f9beb4d9. The receiving machine will first check this Magic number to make sure that it is receiving messages from the network it’s currently ruining, and only then it will start processing the rest of the message.

 

Command name

The next 12 bytes are the Command name.

70696e670000000000000000 # 12 bytes command name

Each header should contain the name of the command, or type of message that is contained in the payload (body) of the message. In our case, the message is a “ping” message. The receiving machine will use this information to know how to parse and treat this message. In our example the receiving machine will answer with a “pong” message (but only after it will complete validating the message). Pay attention that the command field should be exactly 12 bytes long. That means that if the command name is shorter than 12 bytes, it needs to be padded with nulls. We can see that the first 4 bytes makes the word “ping” in hexadecimals  (70-P, 69-I, 6e-N, 67-G) and another 8 bytes are nulls (00). In our code implementation (see below) we’re using the function command_padding to pad our command name to be 12 bytes.

def command_padding(self, command):  # The message command should be padded to be 12 bytes long.

command += (12 – len(command)) * “\00” return command

 

Size

Another 4 bytes will contain the size of the payload.

08000000 # 4 bytes size of the payload

This field is very straightforward. We just need to insert the size of the payload (body) of our message. In our case it’s simply 8 bytes long. (Once again, we need to make sure it will be exactly 4 bytes longs. luckily we’ve predefined this data type in our Bitpy/Utils/dataTypes.py file under to_uint32(v). So we don’t have to manually insert the extra 3 null bytes).

 

Checksum

The last part of the header is the checksum.

1b3cb220 # 4 bytes checksum

It might seem somewhat strange at the beginning, but all that we need to do is to take the payload of our message, perform the cryptographic function SHA256 twice on that message, and then append the last 4 bytes of the result to our header. SHA256(SHA256(payload))[:4]

def get_checksum(self):
   check = hashlib.sha256(hashlib.sha256(self.payload).digest()).digest()[:4]
   return check

 

 

Payload

And now all that we left with is the payload – the body of the message . Each message will be parsed differently.

309941550a2ffd5c # The payload, the body of our "ping" message.

 

The code implementation

In our code we need to deal with both incoming and outgoing messages. So we’ve split our code to two files:

  1. Bitpy/Packets/HeaderParser.py – to parse the headers of the incoming messages (And after the header was properly parsed, to use the 12 bytes command properly to determine what other steps are required).
  2.  Bitpy/Packets/PacketCreator.py – to build the header of our outgoing message and to pre-fixed it to the payload of the message.

Bitpy/Packets/HeaderParser.py for incoming messages

class HeaderParser:
    def __init__(self, block):  # Packets is a stream

        self.magic = read_hexa(block.read(4))
        self.command = block.read(12)
        self.payload_size = read_uint32(block.read(4))
        self.checksum = hash_to_string(block.read(4))

        self.header_size = 4 + 12 + 4 + 4

    def to_string(self):
        display = "\n-------------HEADER-------------"
        display += "\nMagic:\t %s" % self.magic
        display += "\nCommand name	:\t %s" % self.command
        display += "\nPayload size	:\t %s" % self.payload_size
        display += "\nChecksum	:\t\t %s" % self.checksum
        display += "\nheader Size:\t\t %s" % self.header_size
        display += "\n"
        return display

Bitpy/Packets/PacketCreator.py for outgoing messages

class PacketCreator:
    def __init__(self, payload):
        self.payload = payload.forge()  # The message payload forged

        # create the header
        self.magic = to_hexa(
            "F9BEB4D9")  # The Magic number of the Main network -> This message will be accepted by the main network
        self.command = self.command_padding(payload.command_name)
        self.length = to_uint32(len(self.payload))
        self.checksum = self.get_checksum()

    def command_padding(self, command):  # The message command should be padded to be 12 bytes long.
        command += (12 - len(command)) * "\00"
        return command

    def get_checksum(self):
        check = hashlib.sha256(hashlib.sha256(self.payload).digest()).digest()[:4]
        return check

    def forge_header(self):
        return self.magic + self.command + self.length + self.checksum

    def forge_packet(self):
        return self.forge_header() + self.payload

 

Edit (4-Jul-2016): Python 2.5 to 3.5 migration

Please read the general notes about the transition from Python 2.5 to 3.5 over here. And the complete github change log for the migration over here.

The command_padding function was rewrite in order to take full advantage of Python 3.5 capacity and we’re now using the built in function ljust.

def command_padding(self, cmd):
    command = str(cmd)
    command = command.ljust(12, '\00')
    return str.encode(command)