
Scripts and stacks


A personal note: many things happened in the past two months that required my full attention. I hope to resume a steady flow of posts in the coming days.


In the last post we talked about one of the biggest bitcoin misconceptions: the idea that a transaction actually moves coins from one wallet to another. The truth is that transactions are nothing more than statements. Each statement points to a previous statement (which in turn points to an even older statement, and so on), and usually it also specifies the amount of coins that the current owner wishes to transfer. The statement also contains a riddle, or an equation that needs to be proved, and in most cases proving it requires the private_key that is associated with the recipient's bitcoin address.


Pay attention: even though Bob is required to use his own private_key to prove that he can indeed solve this problem, the private_key itself is never revealed to anyone.


Now let's look for a second at this transaction message. We've already learned how to create a bitcoin message (see this section about Version message and this one about headers). We just need to make sure that all of the fields are filled in accordance with the protocol rules, just like filling in a form. You can find a complete list of the fields that need to be filled in the bitcoin developers documentation.



Most of the fields are quite straightforward. I might still create another post in the future with detailed instructions on how to fill all the fields, but this isn't really the topic of this post. This post deals with one of bitcoin's more fascinating aspects: the riddle that Alice places in her statement, the riddle that only Bob can solve. The script.


(You just can't wait to create your own transaction? You're more than welcome to watch my videos on creating a bitcoin transaction.)

Scripts, what is it?

Script is a computer language. In more detail, it's a set of predefined words that are agreed upon. Every node that follows the rules specified in the bitcoin protocol knows how to read, interpret and implement these words. Because bitcoin messages are basically nothing more than a string of bytes, these words are not written in plain English but are translated to OP_CODEs. That way, we can send our message as a string of bytes, and the receiving node will know that these bytes represent instructions. (Important note: the receiving node will treat these bytes as instructions only if they appear inside one of the script fields.)

Here are a selected few:

Word | Opcode | Hex | Input | Output | Description
OP_1ADD | 139 | 0x8b | in | out | 1 is added to the input.
OP_1SUB | 140 | 0x8c | in | out | 1 is subtracted from the input.
N/A | 1-75 | 0x01-0x4b | (special) | data | The next opcode bytes is data to be pushed onto the stack.
OP_MIN | 163 | 0xa3 | a b | out | Returns the smaller of a and b.
OP_SHA256 | 168 | 0xa8 | in | hash | The input is hashed using SHA-256.
OP_EQUAL | 135 | 0x87 | x1 x2 | True / false | Returns 1 if the inputs are exactly equal, 0 otherwise.

The original list included around 200 of these words, but currently most nodes support only a few dozen of them. Using these few words we can create many "riddles", or state many conditions for claiming the coins in our transaction message.

For example, I can add the following string of bytes as my script (0x51 and 0x52 are the opcodes OP_1 and OP_2, which push the integers 1 and 2 onto the stack):

0x51 0x8b 0x52 0x87

<1> <OP_1ADD> <2> <OP_EQUAL>
  1. It will take the number 1.
  2. Use the OP_CODE OP_1ADD to add 1 to it -> The output of this OP_CODE will be 2.
  3. It will take the number 2.
  4. Use the OP_CODE OP_EQUAL to check whether the result is equal to 2 -> The output of this OP_CODE will be True.

A word of caution though: most nodes not only refuse to accept most of these OP_CODEs, they also refuse to accept most non-standard scripts, mainly because they want users to use standard transactions. Many nodes will not only refuse to accept a transaction with a non-standard script, they'll also refuse to relay it to other nodes.



You might've already noticed that this script language can only be written as a list of operations. Unlike high level languages (such as Python), a script is always executed in a predefined order, from left to right. This type of structure is called a stack, because we're stacking variables and data on top of each other. Using a stack also means that every operation works on the items that were stacked most recently.

In our previous example, the integer 1 was the first item in our stack. Then came the operation OP_1ADD, which took that item as its input, added 1 to it, and gave the output 2. Next, the integer 2 is pushed, so the computed 2 is now stacked BELOW the new integer 2.

<1> <OP_1ADD> <2> <OP_EQUAL>

<2> <2> <OP_EQUAL>

The node recognizes the next item as the integer 2 and pushes it, so it moves on to the next item in our script: the OP_CODE OP_EQUAL. This operation takes as its input the two items that are directly below it and compares them. If both are equal, it returns True.
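The walk-through above can be sketched as a tiny interpreter. This is only an illustration, not bitpy's code and not a full Script engine: the function name run_script is hypothetical, only four opcodes are handled, stack items are plain Python integers, and the integers 1 and 2 are assumed to be pushed with the opcodes OP_1 (0x51) and OP_2 (0x52).

```python
# Opcode values taken from the Bitcoin Script word list
OP_1 = 0x51       # push the integer 1
OP_2 = 0x52       # push the integer 2
OP_1ADD = 0x8b    # add 1 to the top item
OP_EQUAL = 0x87   # compare the two top items


def run_script(script: bytes) -> list:
    """Execute a script byte by byte and return the final stack (hypothetical helper)."""
    stack = []
    for opcode in script:
        if opcode in (OP_1, OP_2):
            # OP_1 and OP_2 encode their value as opcode - 0x50
            stack.append(opcode - 0x50)
        elif opcode == OP_1ADD:
            stack.append(stack.pop() + 1)
        elif opcode == OP_EQUAL:
            a, b = stack.pop(), stack.pop()
            stack.append(1 if a == b else 0)
        else:
            raise ValueError("opcode 0x%02x is not supported in this sketch" % opcode)
    return stack


# <1> <OP_1ADD> <2> <OP_EQUAL>
print(run_script(bytes([OP_1, OP_1ADD, OP_2, OP_EQUAL])))  # [1]
```

The single 1 left on the stack is the "True" that OP_EQUAL returns in the example.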



This example code can't be used with a standard bitcoin transaction, it's only meant to give you a general feel for how scripts work.

You can find an example of a real transaction over here:



Give it a try with bitpy

One of bitpy's newest features is the ability to create stacks and see them in action in real time. Mind you, only a few OP_CODEs are currently implemented, but it might still give you a feel for how stacks work.

Example of stack using bitpy


Simple stack architecture with python

Stack architecture can easily be implemented using arrays. After all, a stack is nothing more than an array of objects (variables, operations, results etc.).

In our bitpy project, under Utils/OpCodes/ I've created a Stack class. In its most basic form, this class only creates an empty array upon initialization, followed by just 2 methods.

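class Stack():

    def __init__(self):
        self.items = []

    def push(self, item):
        # add the new item on top of the stack
        self.items.append(item)

    def pop(self):
        # remove the topmost item and return it
        elm = self.items.pop()
        return elm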
  1. push(item) appends a new item to the array.
  2. pop() removes the topmost item from the array and returns it.

This should be enough to create a very basic stack class. Still, I've added a few more methods.

class Stack():

    def __init__(self):
        self.items = []

    def isEmpty(self):
        return self.items == []

    def push(self, item):
        # add the new item on top of the stack
        self.items.append(item)

    def pop(self):
        elm = self.items.pop()
        return elm

    def size(self):
        return len(self.items)

    def printStack(self):
        display = ""
        for items in self.items:
            items = str(items)
            if len(items) > 5:
                # long items are cut to their first 5 characters
                display += " " + "<"+ items[:5] + "..." + ">"
            else:
                display += " " + "<" + items + ">"
        return display

    def clear(self):
        # drop every item by starting over with an empty array
        self.items = []

The isEmpty method will check if our stack array is empty.

The size method will give us the size of the array.

The printStack method provides us with a visual representation of our array. Pay attention that I've limited the size of each item to only 5 characters so that items such as hashed messages, bitcoin addresses, keys etc. won't take up the whole screen.

The clear method will remove all items from our array.

Using these methods we can easily start implementing more advanced OP_CODEs on our stack array.

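def OP_DUP(self):
    # duplicate the topmost item: take it off and put it back twice
    elm = self.pop()
    self.push(elm)
    self.push(elm)

def OP_HASH160(self): #saved as string!
    # body not shown in this listing: the top item is hashed with SHA256 and then with ripemd160
    pass

def OP_EQUAL(self):
    elm1 = self.pop()
    elm2 = self.pop()

    # 1 if the inputs are exactly equal, 0 otherwise
    if elm1 == elm2:
        self.push(1)
    else:
        self.push(0)

def OP_VERIFY(self):
    top = self.pop()
    # assumption: the result is reported as True/False, the original branch is not shown
    if top == 1:
        return True
    return False

def OP_RETURN(self, input):
    # body not shown in this listing
    pass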
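The method bodies are only partly visible in the listing above. As a rough, self-contained sketch of how opcodes can be layered on top of push and pop, here is a small stand-in class. The class name MiniStack, the method names, and the choice to raise an error in op_verify are all assumptions made for illustration, not the bitpy implementation.

```python
class MiniStack:
    """A hypothetical stand-in for the Stack class, with three opcodes on top."""

    def __init__(self):
        self.items = []

    def push(self, item):
        self.items.append(item)

    def pop(self):
        return self.items.pop()

    def op_dup(self):
        # duplicate the topmost item
        top = self.pop()
        self.push(top)
        self.push(top)

    def op_equal(self):
        # replace the two topmost items with 1 (equal) or 0 (different)
        self.push(1 if self.pop() == self.pop() else 0)

    def op_verify(self):
        # consume the top item and fail unless it is 1 (assumed failure mode)
        if self.pop() != 1:
            raise ValueError("script failed: top item is not true")


stack = MiniStack()
stack.push("abc")
stack.op_dup()       # stack: abc abc
stack.op_equal()     # stack: 1
stack.op_verify()    # stack is now empty and no error was raised
print(stack.items)   # []
```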


Keys, addresses and hashing


A key pair is one of the greatest tools that are used in Bitcoin, but it might be a little unintuitive at first. Don’t worry, you’ll get it!

There's also a short video I made a few months ago that describes the basics of keys. It doesn't completely correspond to our current project, but it might provide you with another point of reference. You can watch it over here – Bitcoin python tutorial for beginners – keys and address.


One way function

The name "one way function" is quite self explanatory. These functions are very easy to compute in one direction but almost impossible to invert. Given a function f and an input x, I can easily calculate the result y.

f(x) = y <- easy to solve

But given the result y and the function f, it will be almost impossible to find x.

f(?) = y <- almost impossible to guess.

The Bitcoin protocol defines the use of several of these one way functions (SHA256, RIPEMD160, ECDSA and murmurhash. More functions are being tested and might be used in the future). Each one of these functions has its own place in the protocol. Some functions are used more than once and/or are combined with another function to achieve an even greater level of security. For example, hashing a message (usually a transaction message) before it is signed is done using the SHA256 function, SHA256("hello!"), while finding the checksum of the message payload is done by using the SHA256 function twice, SHA256(SHA256(message)).

  • Some people have a hard time accepting the concept of "hard to guess", they feel it's too ambiguous. Well, technically an extremely powerful computer might be able to iterate through all possible inputs until it finds the right one (this is called brute force), but in practice it will take a very, very long time. Trying to brute force the result of a SHA256 function on a 32 bytes message will take about 10^65 years. The age of the universe is only about 1.4*10^10 years. I think it's good enough security.
It's easy to get the result y of function f for a given x. But almost impossible to tell what the original input was.
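To get a feel for the "easy" direction, here is a minimal sketch using Python's standard hashlib module. The helper name double_sha256 is just a label chosen for this example.

```python
import hashlib

# The easy direction: computing y = f(x) takes a fraction of a second
digest = hashlib.sha256(b"hello!").hexdigest()
print(digest)

# Changing a single character of the input gives a completely different result,
# which is why the output offers no hint about the original message
other_digest = hashlib.sha256(b"hello?").hexdigest()
print(other_digest)


def double_sha256(data: bytes) -> bytes:
    """Apply SHA256 twice, as in SHA256(SHA256(message))."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


# The checksum of a message is the first 4 bytes of the double hash.
# For an empty payload this gives the well-known value 5df6e0e2.
checksum = double_sha256(b"")[:4]
print(checksum.hex())
```

There is no function in hashlib (or anywhere else) that goes from the digest back to "hello!"; the only option is to keep guessing inputs.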


Mathematical trapdoor and key pair

A mathematical trapdoor is a special type of one way function. The main difference is that with a mathematical trapdoor we may also use a few extra pieces of information called keys. Bitcoin uses the mathematical trapdoor function ECDSA, or Elliptic Curve Digital Signature Algorithm, to produce two keys, or a key pair: a private key and a public key. Both keys always come in pairs! There cannot be a public key that matches two different private keys and vice versa!

The private key is used to solve (sign) the function f for message x. The result is the signed message y.

f(private_key, x) = y <- easy to solve

Now I have two messages: the original message x, and the signed message y. I want to prove that I'm the one who signed the original message x, that I'm the owner of the private key. But I don't want to give away my own private key. Anyone who has my private key will be able to sign other messages in my name as well. So I'm using the public key. The public key can only be used to prove the solution of the function, but it cannot be used to sign messages.

f(public_key, x) = y <- easy to prove

f(public_key, x) = null <- I can't sign message x with the public key. only with the private key

  • Pay attention that when we’re using the public key we’re just proving the equation, not solving it.

Here’s a simple numeric example I found on the wikipedia page on mathematical trapdoor:

An example of a simple mathematical trapdoor is “6895601 is the product of two prime numbers. What are those numbers?” A typical solution would be to try dividing 6895601 by several prime numbers until finding the answer. However, if one is told that 1931 is one of the numbers, one can find the answer by entering “6895601 ÷ 1931” into any calculator. This example is not a sturdy trapdoor function – modern computers can guess all of the possible answers within a second – but this sample problem could be improved by using the product of two much larger primes.
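The quoted example is small enough to try out. The sketch below compares the slow route (guessing divisors one by one) with the fast route (already knowing one of the numbers); the function name factor_by_trial_division is made up for this illustration.

```python
def factor_by_trial_division(n: int):
    """Find two factors of n by trying every candidate divisor in turn."""
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return divisor, n // divisor
        divisor += 1
    return None  # no divisor found, n is prime


n = 6895601

# Without the "key": almost two thousand guesses before the first divisor is found
print(factor_by_trial_division(n))   # (1931, 3571)

# With the "key" 1931: a single division
print(n // 1931)                     # 3571
```

With a number this small both routes finish instantly, which is exactly the weakness the quote points out. The gap only becomes meaningful with much larger primes.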


Let’s see an example:

Step one – create a key pair:

Create key pair using the ECDSA function and some random numbers

Step two – sign a message with the private key:

using the private key and the ECDSA algorithm

Step three – send the original message alongside the signed message and the public key

3 items are needed to validate the message: the public key, the original message and the signed message
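The three steps can also be followed in code. The bitpy project relies on the third-party ecdsa module for this; the sketch below instead uses only the standard library (Python 3.8 or newer) so that it runs anywhere. It is a teaching toy under stated assumptions: the function names are made up, the message is hashed once with SHA256, and nothing here is hardened for use with real keys.

```python
import hashlib
import secrets

# Public parameters of the secp256k1 curve (y^2 = x^3 + 7 over the field of size P)
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % P == 0:
        return None
    if a == b:
        slope = 3 * a[0] * a[0] * pow(2 * a[1], -1, P) % P
    else:
        slope = (b[1] - a[1]) * pow((b[0] - a[0]) % P, -1, P) % P
    x = (slope * slope - a[0] - b[0]) % P
    y = (slope * (a[0] - x) - a[1]) % P
    return (x, y)


def scalar_mult(k, point):
    """Compute k * point with the double-and-add method."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def message_hash(message):
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def sign(private_key, message):
    z = message_hash(message)
    while True:
        k = secrets.randbelow(N - 1) + 1   # fresh random nonce in [1, N-1]
        r = scalar_mult(k, G)[0] % N
        s = pow(k, -1, N) * (z + r * private_key) % N
        if r and s:
            return r, s


def verify(public_key, message, signature):
    r, s = signature
    if not (1 <= r < N and 1 <= s < N):
        return False
    w = pow(s, -1, N)
    z = message_hash(message)
    point = point_add(scalar_mult(z * w % N, G), scalar_mult(r * w % N, public_key))
    return point is not None and point[0] % N == r


# Step one: create a key pair
private_key = secrets.randbelow(N - 1) + 1
public_key = scalar_mult(private_key, G)

# Step two: sign a message with the private key
signature = sign(private_key, b"hello!")

# Step three: anyone holding the public key, the message and the signature can check it
print(verify(public_key, b"hello!", signature))
print(verify(public_key, b"hello?", signature))
```

Note that verify never receives the private key: the public key is enough to check the signature, but there is no way to call sign with it.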

The code

In our project we've defined the Key class under Bitpy/Utils/KeyUtils/. This class contains all the necessary steps that are required in order to generate a private key, transform that private key into a public key and then create a Bitcoin address out of that public key.

Step one – create (or receive) the private key

The first thing that we're going to do is to create our private key. The private key is defined as a random 32 bytes long unsigned integer. Our class begins with a simple check. If the user initializes the Key class with an already existing private key, that private key will be saved into self.private_key. Otherwise, we're using the urandom function in the os module to create a random 32 bytes long number.

def __init__(self, private_key=0):
    if private_key == 0:
        # no key was supplied: generate 32 random bytes
        self.private_key = os.urandom(32)
        self.printable_pk = str(binascii.hexlify(self.private_key), "ascii")
    else:
        # an existing key was supplied as a hexadecimal string
        self.printable_pk = private_key
        self.private_key = binascii.unhexlify(private_key.encode('ascii'))

You might’ve noticed that we’ve also created a printable_pk variable. This variable will store the private key in hexadecimals. This way it is easier to store, copy and/or print the private key.
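As a small self-contained illustration of this step, the sketch below generates a key, prints it in hexadecimal and restores it again. The function name make_private_key and the length check are additions made for this example and are not part of the Key class.

```python
import binascii
import os


def make_private_key(hex_key=None):
    """Return (raw_bytes, printable_hex); a hypothetical helper mirroring the constructor."""
    if hex_key is None:
        raw = os.urandom(32)                               # 32 random bytes
    else:
        raw = binascii.unhexlify(hex_key.encode("ascii"))  # restore an existing key
    if len(raw) != 32:
        raise ValueError("a private key must be exactly 32 bytes long")
    return raw, binascii.hexlify(raw).decode("ascii")


raw, printable = make_private_key()
print(printable)                       # 64 hexadecimal characters

# The printable form is enough to get the very same key back
restored, _ = make_private_key(printable)
print(restored == raw)                 # True
```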


Step two – Use the private key to initialize the signing function

After we got our private key it's time to use it to initialize our ECDSA function. This step is similar to declaring our function f with the private key pr_k: self.sk = f(pr_k, )

We're defining the variable self.sk (for Signing Key) and use SigningKey.from_string from the ecdsa module with two arguments. The first one is our self.private_key, and the second one is the curve. (We haven't talked about the curve yet, but it represents the mathematical part of our function. This is rather advanced mathematics, so we won't go into it in this project. For now we just need to know that the Bitcoin protocol requires us to use the ECDSA function with the mathematical curve SECP256k1.)

self.sk = ecdsa.SigningKey.from_string(self.private_key, curve = ecdsa.SECP256k1)


Step three – Use the initialized function (self.sk) to get the public key

Now that we got our signing key, we can use it in order to create our public key.

self.vk = self.sk.get_verifying_key()

We're defining a new variable called self.vk which will hold the verifying key, or the public key that can be sent alongside the signed message and the original message. This key will be used to verify that the message was indeed signed by the owner of that public key. And since every public key matches only one specific private key, it also proves that the one who signed the message also possesses the corresponding private key.


Step four – Formatting the public key.

The variable self.vk holds the public key that will be used to verify our signed messages. But the Bitcoin protocol requires that we represent this public key in a couple of different formats.

The following chart from the Bitcoin wiki site shows the way the public key should be formatted:

Converting the public key to Bitcoin address


The first line is the real public key, or in our case the verification key self.vk

This is the real public key, but we can't send it like this. We need to do some formatting


The second line tells us that we need to insert the byte 0x04 at the beginning of our public key

self.public_key =  b"04" + binascii.hexlify(self.vk.to_string())

We’re using the function to_string in order to display the variable self.vk  as a string. Then we convert it to hexadecimals so it will be easier to append the byte 0x04.

This is the public key in Bitcoin terminology. Usually, When looking for the public key in signed transactions, that's what it will look like


The third line tells us to hash the public key twice: once using the SHA256 function, and then again using the ripemd160 function.

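ripemd160 = hashlib.new('ripemd160') # <- initializing the ripemd160 function
ripemd160.update(hashlib.sha256(binascii.unhexlify(self.public_key)).digest()) # <- SHA256 first, then ripemd160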
First hashing the public key using the SHA256 function. Then the result is hashed with the ripemd160 function

The fourth line tells us to add another byte at the beginning of the hashed key.

We're working with the main network so we'll add the byte 0x00

This is the network ID byte which is used to prevent us from using keys and addresses that were generated in the test network, in the main network (and vice versa). In our example we’re using the main network, so the byte we’ll add will be 0x00.

self.hashed_public_key = b"00" + binascii.hexlify(ripemd160.digest())

In Bitcoin terminology, the result is the hashed public key. This format is used mostly when creating transactions.


The fifth (and sixth) line tells us to take our hashed public key and hash it again, twice, using the SHA256 function. The first 4 bytes of the result will be the checksum.

self.checksum = binascii.hexlify(hashlib.sha256(hashlib.sha256(binascii.unhexlify(self.hashed_public_key)).digest()).digest()[:4])
The checksum is the first 4 bytes


The seventh line creates the Bitcoin address in its binary form by appending the checksum to the hashed public key. This is a valid Bitcoin address, but it still needs to go through one more process before it can be used with most Bitcoin wallets.

self.binary_addr = binascii.unhexlify(self.hashed_public_key + self.checksum)


The last line: finally we've reached the end point. There's only one more thing we need to do before we get the standard Bitcoin address, and that is to convert the binary form of the address into a base58 string. The idea behind this conversion is quite simple. In order to reduce human errors, it was decided that some characters would be omitted from the standard Bitcoin address: the capital O, the number 0, the lower case l and the upper case I, which are easy to confuse with one another, as well as all the non-alphanumeric characters.

The final address represented in base 58.
self.addr = base58.b58encode(self.binary_addr)
  • You might need to install the base58 module using the command pip install base58.
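If you'd rather see what the conversion does than install a module, the last three lines of the chart (network byte, checksum, base58) can be sketched with the standard library alone. The function names below are made up for this example, and the input is assumed to be the 20 bytes that come out of the ripemd160 step.

```python
import hashlib

# The 58 characters that are allowed in an address (no 0, O, I or l)
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> str:
    """Write the bytes as one big number in base 58."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded
    # every leading zero byte is written as the character "1"
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def address_from_hash160(hash160: bytes, network_id: bytes = b"\x00") -> str:
    """Network byte + hashed key + 4 checksum bytes, encoded in base58."""
    payload = network_id + hash160
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    return base58_encode(payload + checksum)


# A hashed public key made of 20 zero bytes gives a well-known address
print(address_from_hash160(bytes(20)))   # 1111111111111111111114oLvT2
```

The leading "1" of every main network address is simply the 0x00 network byte written in base58.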


User interface

We've also added a tab to our graphical user interface which might help. You can use it to see the public key, hashed public key and Bitcoin address of any given private key.

The user interface for the keys can be found in the second tab
Connection part three – Receiving messages


In the previous posts, all that we’ve done was to construct and send messages to another node on the network. In this post, we’ll see what happens to incoming messages.

First stop – The ReceiverManager:

class ReceiverManager(Thread):
    def __init__(self, sock):
        Thread.__init__(self)
        self.sendingQueue = Utils.globals.sendingQueue
        self.sock = sock

        self.outfile = open("data_received_from_node.txt", 'w')

    def run(self):
        while True:
            try:
                # get only the header's message
                header = self.sock.recv(24)

                if len(header) <= 0:
                    raise Exception("Node disconnected (received 0bit length message)")

                headerStream = BytesIO(header)
                parsedHeader = HeaderParser(headerStream)

                # get the payload
                payload = self.recvall(parsedHeader.payload_size)
                payloadStream = BytesIO(payload)

                self.manager(parsedHeader, payloadStream)

            except Exception as e:
                # stop listening once the connection fails
                print(e)
                break

        print("Exit receiver Thread")

The ReceiverManager always runs in the background as its own Thread, checking our socket for any incoming packets. Once data arrives, it immediately reads the first 24 bytes.

header = self.sock.recv(24)

The first 24 bytes are the header. If you remember from this post, every Bitcoin message starts with a header, and the header is always exactly 24 bytes long.


The first 24 bytes are the header. The rest is the payload

This header is now wrapped in a stream of bytes and passed to the HeaderParser class in Bitpy/Network/

headerStream = BytesIO(header)
parsedHeader = HeaderParser(headerStream)


Second stop – The HeaderParser class:

The HeaderParser class takes the first 24 bytes as a long string of bytes, and then it reads them in the same order that we’ve seen before.

Size (Bytes) | Name | Data type | Description
4 | Start string | char[4] | The network identifier
12 | Command name | char[12] | The name of the command.
4 | Payload size | uint32 | Len(payload)
4 | Checksum | char[4] | SHA256(SHA256(payload))[:4]

First 4 bytes for the Start string (or Magic number), another 12 bytes for Command name, the next 4 bytes are the Payload size and the last 4 bytes are the checksum.

4 bytes for starting string. 12 for command name. 4 for payload size and 4 for checksum
class HeaderParser:
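    def __init__(self, header):  # Packets is a stream

        # read the four fields in order: 4 + 12 + 4 + 4 bytes
        self.magic = read_hexa(header.read(4))
        self.command = header.read(12)
        self.payload_size = read_uint32(header.read(4))
        self.checksum = read_hexa(header.read(4))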

        self.header_size = 4 + 12 + 4 + 4

    def to_string(self):
        display = "\n-------------HEADER-------------"
        display += "\nMagic:\t %s" % self.magic
        display += "\nCommand name	:\t %s" % self.command
        display += "\nPayload size	:\t %s" % self.payload_size
        display += "\nChecksum	:\t\t %s" % self.checksum
        display += "\nheader Size:\t\t %s" % self.header_size
        display += "\n"
        return display

We’ve also defined the to_string function which basically makes it easier to print a human readable version of the message header.
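For readers who want to try the header layout without the rest of the project, the same 24 bytes can be split with Python's standard struct module. This is an alternative sketch, not the HeaderParser class itself; the function name parse_header and the dictionary it returns are choices made for this example.

```python
import struct
from io import BytesIO


def parse_header(stream):
    """Read the 24-byte header from a stream (hypothetical helper)."""
    # "<" = little-endian without padding: 4 bytes, 12 bytes, uint32, 4 bytes
    magic, command, payload_size, checksum = struct.unpack("<4s12sI4s", stream.read(24))
    return {
        "magic": magic.hex(),
        "command": command.rstrip(b"\x00").decode("ascii"),  # drop the null padding
        "payload_size": payload_size,
        "checksum": checksum.hex(),
    }


# The header of a «verack» message on the main network (its payload is empty)
raw = bytes.fromhex("f9beb4d9" "76657261636b000000000000" "00000000" "5df6e0e2")
print(parse_header(BytesIO(raw)))
```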

You might've noticed that currently our code just accepts the checksum field from the received message without checking it. This is of course a flaw in our code. The checksum field is there to help us verify the integrity of the message. That is one of the ways we can detect that the message was corrupted or changed on its way from the sender node to our node. But for the time being we'll assume that the message is indeed intact and we'll accept the checksum as is.
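Closing this gap would not take much. As a minimal sketch (the function name checksum_is_valid is hypothetical and this check is not part of the code above), the receiver can recompute the checksum from the payload and compare it with the 4 bytes taken from the header:

```python
import hashlib


def checksum_is_valid(payload: bytes, checksum: bytes) -> bool:
    """Compare the header checksum with SHA256(SHA256(payload))[:4]."""
    expected = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    return expected == checksum


# 5df6e0e2 is the checksum of an empty payload
print(checksum_is_valid(b"", bytes.fromhex("5df6e0e2")))          # True
print(checksum_is_valid(b"changed", bytes.fromhex("5df6e0e2")))   # False
```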


Third stop – Back to the ReceiverManager:

Now that we have our header, it's time to get the payload. The size of the payload was defined in the header of the message. We need to read that amount of bytes from our incoming packets, just as we read the first 24 bytes of the header. There's however one extra step in our code. Instead of using the built in sock.recv function (as we did for the header) we've decided to implement our own recvall function. The rationale is that the payload can be large, and a single call to the built in sock.recv may return fewer bytes than we asked for, so it is wiser to read the payload in smaller parts and append them together. This has nothing to do with the Bitcoin protocol, it's only our way to make sure that the code will properly handle large messages.

def recvall(self, length):
    parts = []

    while length > 0:
        part = self.sock.recv(length)
        if not part:
            raise EOFError('socket closed with %d bytes left in this part' % length)

        # keep the part and count down the bytes that are still missing
        parts.append(part)
        length -= len(part)

    return b''.join(parts)
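The loop can be tried without a Bitcoin node by connecting two local sockets to each other. The sketch below is a standalone version of the same idea; the function name recv_exact is made up, and socket.socketpair from the standard library stands in for the network connection.

```python
import socket


def recv_exact(sock, length):
    """Keep reading until exactly `length` bytes have arrived."""
    parts = []
    while length > 0:
        part = sock.recv(length)
        if not part:
            raise EOFError("socket closed with %d bytes left to read" % length)
        parts.append(part)
        length -= len(part)
    return b"".join(parts)


# Two connected sockets on the local machine
sender, receiver = socket.socketpair()

# The data is sent in two pieces, but the receiver asks for all 8 bytes at once
sender.sendall(b"abc")
sender.sendall(b"defgh")
print(recv_exact(receiver, 8))   # b'abcdefgh'

sender.close()
receiver.close()
```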

So now, after we’ve cut the required amount of bytes that represents the payload of our message, and we have both our header (which was already parsed) and our payload (yet to be parsed), we’ll pass them both to the receivermanager manager function.


Fourth stop – Manager:

def manager(self, parsedHeader, payloadStream):

    command = parsedHeader.command.decode("utf-8")
    message = {"timestamp": time.time(), "command": command, "header": parsedHeader.to_string(), "payload": ""}

    if command.startswith('ping'):
        ping = Ping.DecodePing(payloadStream)

        pong = Pong.EncodePong(ping.nonce)
        packet = PacketCreator(pong)

        message["payload"] = str(ping.nonce)

    elif command.startswith('inv'):
        inv = Inv.DecodeInv(payloadStream)
        message["payload"] = inv.get_decoded_info()

    elif command.startswith('addr'):
        addr = Addr.DecodeAddr(payloadStream)
        message["payload"] = addr.get_decoded_info()

    elif command.startswith('pong'):
        pong = Pong.DecodePong(payloadStream)
        message["payload"] = pong.get_decoded_info()

    elif command.startswith('version'):
        version = Version.DecodeVersion(payloadStream)
        message["payload"] = version.get_decoded_info()

The manager function does a very simple thing. It checks the command of the message (the command is part of the header) and then it sends the message payload to be parsed by the corresponding function. For example, if the manager sees that the command is «pong», it will use the DecodePong class in Bitpy/Packets/control_messages/ to extract the desired fields out of it. (You can read more about «pong», «ping» and «verack» messages in this post.)
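The same routing idea can also be written as a lookup table instead of a chain of elif statements. The sketch below is a simplified alternative, not the bitpy manager: the handler functions and the HANDLERS dictionary are invented for the example, and the only protocol facts it relies on are that the command field is padded with null bytes to 12 bytes and that a «ping» payload is an 8 byte nonce.

```python
def handle_ping(payload: bytes) -> str:
    # a «ping» payload is an 8-byte nonce, stored as a little-endian uint64
    return "ping nonce %d" % int.from_bytes(payload[:8], "little")


def handle_verack(payload: bytes) -> str:
    return "verack (no payload)"


# command name -> function that knows how to parse that payload
HANDLERS = {"ping": handle_ping, "verack": handle_verack}


def dispatch(raw_command: bytes, payload: bytes) -> str:
    # strip the null padding of the 12-byte command field before the lookup
    command = raw_command.rstrip(b"\x00").decode("ascii")
    handler = HANDLERS.get(command)
    if handler is None:
        return "unhandled command: %s" % command
    return handler(payload)


print(dispatch(b"ping\x00\x00\x00\x00\x00\x00\x00\x00", (42).to_bytes(8, "little")))
print(dispatch(b"getaddr\x00\x00\x00\x00\x00", b""))
```

Stripping the padding first avoids a small trap of startswith: a command such as «addrv2» would also match a check for 'addr'.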



We have our parsed message, both its header and payload. And now we need to decide what to do with them. For some messages this might be the end of the line. There's nothing more we can do with them. Some might require us to act. A «ping» message should be answered by a «pong» message, transactions should be checked and relayed (we'll talk about transactions in later posts), and «version» messages should be acknowledged by sending back a «verack» message.

A major part of learning the Bitcoin protocol is learning how each and every message should be dealt with: which fields of information it contains and what this information means. We've already talked about some of the messages in previous posts (see here for «ping», «pong» and «verack» messages, and here for the «version» message), and as our project gets more features implemented, we'll discuss other types of messages and how to deal with them.

User interface


The best way to make our code really useful is to add a user interface to it. There are a lot of things that can only be taught and understood by looking at the code itself, but there are also many things that the average user can learn about the Bitcoin protocol that can be explained in a more "human friendly" way. And that is why Alexis and I have decided to add a Graphical User Interface (GUI) to our project, in the hope that in the future we can create a full Bitcoin graphical environment.

We've added a new folder to hold our GUI files, the UI folder. This folder contains 3 implementations. The first one uses the Tkinter module, the second uses the pyQt module and the third is a simple command line interface (and this is why we're calling this folder UI and not GUI). We still haven't decided on the final implementation, but we believe that we'll eventually go with only the pyQt implementation, basically because this one looks the best!


This is what it looks like at the moment:

Bitcoin graphical environment
The user interface of bitpy.

This is our basic design. At the bottom there's a list of all the messages you can send: «version», «verack» and «ping» (more messages will be added soon). The left side keeps a record of all of the incoming messages, in the order in which they were received. Once we choose one of these messages, the right panel displays the parsed message: the header and the payload.

Remember! this is just our first draft, but we’re quite proud of it. Little by little our project shows more and more potential.


Installing dependencies

Because we're using pyQt5, you might need to install the pyQt5 module on your machine. The easiest way to do so is to use the command:

pip install pyqt5




Messages part two – Payloads and version message


In the previous posts we've talked a little bit about messages. We know that a message is nothing more than a string of bytes, that it has a header and a body (payload), and that it must maintain its predefined format. We've seen the format of the header, but every message body (payload) will contain different information, according to the message type. We'll start by constructing the "version" message.

The version message is used when trying to establish a connection with the remote node. Alice sends the version message to Bob, and only after Bob has approved this version message and replied with his own version message can the connection between the two nodes be established. No other message will be accepted before both nodes have exchanged this version message. So it's no surprise that we chose to construct this message first, since this is the first message that we'll send, and the first one we'll receive.


The fields that are required in our version message (From the developer reference):


Size (Bytes) | Name | Data type | Description
4 | version | int32 | What is the latest version of the protocol that the transmitting node (our node) understands. In this example this number is 70012
8 | services | uint64 | Not full node: 0x00. Full node: 0x01
8 | timestamp | int64 | Current timestamp
8 | addr_recv services | uint64 | What type of services OUR receiving node can support?
16 | addr_recv IP address | char | The IP address of OUR receiving node
2 | addr_recv port | uint16 | The port of OUR receiving node
8 | addr_trans services | uint64 | What type of services OUR transmitting node can support?
16 | addr_trans IP address | char | The IP address of OUR transmitting node
2 | addr_trans port | uint16 | The port of OUR transmitting node
8 | nonce | uint64 | A random number that helps the receiving node to detect and index our connection
Varies | user_agent bytes | CompactSize | This field varies in size, but it tells the other node what should be the size of the next field
Varies | user_agent | string | This field is used to display the name of our node, like licence plates. We can call our node whatever we want, "core", "classic", "my_cool_bitcoin_thingy".
4 | start_height | int32 | The highest block that the transmitting node knows of.
1 | relay | bool | True: the transmitting node can relay messages to the rest of the network. False: the transmitting node can't relay messages to the rest of the network
  • Pay attention that in the "version" message, when asked for both the receiving and the transmitting services, IP addresses and ports, we're asked about our own machine, our own node. In our case both incoming and outgoing messages are dealt with in a similar manner, but some implementations might include more advanced routing.
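Before looking at the project code, here is a compact sketch of how the fields in the table line up as bytes, using only the standard struct module. It is not the bitpy implementation: the function names, the IP address 127.0.0.1 and the start height 0 are placeholder assumptions, and the user_agent is left empty so that its CompactSize is the single byte 0x00.

```python
import random
import socket
import struct
import time


def net_addr(services: int, ip: str, port: int) -> bytes:
    """8 bytes services + 16 bytes IP address + 2 bytes port (the port is big-endian)."""
    ipv6_mapped = b"\x00" * 10 + b"\xff\xff" + socket.inet_aton(ip)  # IPv4 written as IPv6
    return struct.pack("<Q", services) + ipv6_mapped + struct.pack(">H", port)


def version_payload(ip="127.0.0.1", port=8333, start_height=0) -> bytes:
    return b"".join([
        struct.pack("<i", 70012),                  # version
        struct.pack("<Q", 0),                      # services: not a full node
        struct.pack("<q", int(time.time())),       # timestamp
        net_addr(0, ip, port),                     # addr_recv
        net_addr(0, ip, port),                     # addr_trans
        struct.pack("<Q", random.getrandbits(64)), # nonce
        b"\x00",                                   # user_agent bytes = 0, so no user_agent follows
        struct.pack("<i", start_height),           # start_height
        struct.pack("<?", False),                  # relay
    ])


payload = version_payload()
print(len(payload))   # 86 = 4 + 8 + 8 + 26 + 26 + 8 + 1 + 4 + 1
```

Adding up the sizes from the table is a quick sanity check: with an empty user_agent the payload is always 86 bytes long.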

Code implementation

Alexis and I decided that every message will have its own file in which the payload of the message will be both created (for messages that our node will send) and parsed (for incoming messages).

import random
import time
from io import BytesIO
from Utils.config import version_number, latest_known_block
from Utils.dataTypes import *

class EncodeVersion:
    def __init__(self):
        self.command_name = "version"

        self.version = to_int32(version_number) = to_uint64(0)
        self.timestamp = to_int64(time.time())

        self.addr_recv_services = to_uint64(0)
        self.addr_recv_ip = to_big_endian_16char("")
        self.addr_recv_port = to_big_endian_uint16(8333)

        self.addr_trans_services = to_uint64(0)
        self.addr_trans_ip = to_big_endian_16char("")
        self.addr_trans_port = to_big_endian_uint16(8333)

        self.nonce = to_uint64(random.getrandbits(64))
        self.user_agent_bytes = to_uchar(0)
        self.starting_height = to_int32(latest_known_block)
        self.relay = to_bool(False)

    def forge(self):
        return self.version + self.services + self.timestamp + \
               self.addr_recv_services + self.addr_recv_ip + self.addr_recv_port + \
               self.addr_trans_services + self.addr_trans_ip + self.addr_trans_port + \
               self.nonce + self.user_agent_bytes + self.starting_height + \
               self.relay

class DecodedVersion:
    def __init__(self, payload):
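        # Each read() consumes the number of bytes listed in the field table
        self.version = read_int32(payload.read(4))
        self.services = read_uint64(payload.read(8))
        self.timestamp = read_int64(payload.read(8))

        self.addr_recv_services = read_uint64(payload.read(8))
        self.addr_recv_ip = parse_ip(payload.read(16))
        self.addr_recv_port = read_big_endian_uint16(payload.read(2))

        self.addr_trans_services = read_uint64(payload.read(8))
        self.addr_trans_ip = parse_ip(payload.read(16))
        self.addr_trans_port = read_big_endian_uint16(payload.read(2))

        self.nonce = read_uint64(payload.read(8))

        self.user_agent_bytes = read_compactSize_uint(BytesIO(payload.read(1)))
        self.user_agent = read_char(payload.read(self.user_agent_bytes), self.user_agent_bytes)

        self.starting_height = read_int32(payload.read(4))
        self.relay = read_bool(payload.read(1))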

    def get_decoded_info(self):
        display = "\n-----Version-----"
        display += "\nversion                :\t\t %s" % self.version
        display += "\nservices               :\t\t %s" % self.services
        display += "\ntimestamp              :\t\t %s" % self.timestamp

        display += "\naddr_recv_services	 :\t\t %s" % self.addr_recv_services
        display += "\naddr_recv_ip           :\t\t %s" % self.addr_recv_ip
        display += "\naddr_recv_port         :\t\t %s" % self.addr_recv_port

        display += "\naddr_trans_services  	:\t\t %s" % self.addr_trans_services
        display += "\naddr_trans_ip         :\t\t %s" % self.addr_trans_ip
        display += "\naddr_trans_port	    :\t\t %s" % self.addr_trans_port

        display += "\nnonce                 :\t\t %s" % self.nonce

        display += "\nuser_agent_bytes  	:\t\t %s" % self.user_agent_bytes
        display += "\nuser_agent            :\t\t %s" % self.user_agent
        display += "\nstarting_height	    :\t\t %s" % self.starting_height
        display += "\nrelay	                :\t\t %s" % self.relay

        return display

Because this code is used both for incoming and outgoing messages, it has both the class EncodeVersion, which is used to build the payload of the version message, and the class DecodedVersion which is used to parse the payload of any incoming version message.

The function forge will just append and return all the fields in the right order. This is the final payload.



Because we haven’t established connection yet, we first need to create the payload of our version message using the EncodeVersion class. The class won’t take any argument (except for self) and will just assign every field with the right value and the right data type.

The variables version_number and latest_known_block are imported from Bitpy/Utils/ and are set to:

version_number = 70012
latest_known_block = 416419  # june 2016

Our node is not a full node so the services will be set to 0x00. For that reason we’ll also set our relay field to be False.

In our own node, both incoming and outgoing messages are handled by the same machine, so the receiving and transmitting addresses are the same:

self.addr_recv_services = to_uint64(0)
self.addr_recv_ip = to_big_endian_16char("")
self.addr_recv_port = to_big_endian_uint16(8333)

self.addr_trans_services = to_uint64(0)
self.addr_trans_ip = to_big_endian_16char("")
self.addr_trans_port = to_big_endian_uint16(8333)


We’re using the function random.getrandbits(64) to populate our nonce field with an 8-byte random number.

We’re also not adding any vanity name to our node at this time, so we’re setting user_agent_bytes to 0. That means that there’s no user_agent field.
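To make the field layout concrete, here is a minimal sketch that packs the same fields with the struct module directly instead of our Utils helpers. The function name build_version_payload and the use of 16 zero bytes for the empty IP fields are assumptions made for this illustration; it is not the Bitpy code itself.

```python
import random
import struct
import time

def build_version_payload(version=70012, start_height=416419):
    """Hypothetical helper: pack a minimal version payload (no user agent)."""
    # services (uint64) + 16-byte IP + big-endian port = 26 bytes per address
    net_addr = struct.pack("<Q", 0) + bytes(16) + struct.pack(">H", 8333)
    return b"".join([
        struct.pack("<i", version),                 # version, int32
        struct.pack("<Q", 0),                       # services, uint64
        struct.pack("<q", int(time.time())),        # timestamp, int64
        net_addr,                                   # addr_recv
        net_addr,                                   # addr_trans
        struct.pack("<Q", random.getrandbits(64)),  # nonce, uint64
        b"\x00",                                    # user_agent_bytes = 0
        struct.pack("<i", start_height),            # start_height, int32
        b"\x00",                                    # relay = False
    ])

payload = build_version_payload()
print(len(payload))  # 4 + 8 + 8 + 26 + 26 + 8 + 1 + 4 + 1 bytes
```

With an empty user agent the sizes in the table add up to 86 bytes, which is a handy sanity check before sending anything.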




The DecodedVersion class is quite straightforward: it receives the payload of the incoming message from Bitpy/Manager/

It uses the stream’s built-in read method and our data type functions to assign each field its proper value. For example, the first 4 bytes are the version number in int32 format, the next 8 bytes are the services field in uint64 format, and so on.

The only thing that is really unique is the user_agent field:

self.user_agent_bytes = read_compactSize_uint(BytesIO(payload.read(1)))
self.user_agent = read_char(payload.read(self.user_agent_bytes), self.user_agent_bytes)

This is the first example of the varying CompactSize data type in use. The field user_agent_bytes doesn’t have a fixed size. The Bitcoin protocol defines the variable-length data type CompactSize to deal with such fields (you can read more about this data type in the data types section). We’re using BytesIO to pass this argument as a stream of bytes to the read_compactSize_uint function, which returns the uint that matches the size of the next field, the user_agent field. Then we’re using the data type function read_char, which requires two arguments: the first is the string of bytes itself and the second is the total size of the string (self.user_agent_bytes).
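The same two-step read (first the size, then that many bytes) can be sketched in a few lines of Python 3. The helper name read_user_agent is hypothetical, and this sketch only handles the single-byte form of CompactSize, which is enough for typical user agent strings.

```python
from io import BytesIO

def read_user_agent(stream):
    """Hypothetical helper: read a length-prefixed string from a byte stream."""
    length = stream.read(1)[0]  # the CompactSize prefix (single-byte form)
    if length >= 0xFD:
        raise ValueError("multi-byte CompactSize prefixes are not handled in this sketch")
    return stream.read(length).decode("ascii")

# 0x10 = 16, the length of the user agent that follows
stream = BytesIO(b"\x10/Classic:0.12.0/" + b"\xff\xff")
print(read_user_agent(stream))
```

The bytes after the string stay in the stream, ready for the next field (starting_height in the version message).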

Once we’ve finished parsing our version message we can use the get_decoded_info function to display the information about the remote node (currently, we aren’t doing anything with this information except dumping it to a text file).

version : 70012
services : 5
timestamp : 1467293151
addr_recv_services : 1
addr_recv_ip : ��^�V�
addr_recv_port : 30373
addr_trans_services : 5
addr_trans_ip : ��
addr_trans_port : 8333
nonce : 1755461931592560680
user_agent_bytes : 16
user_agent : /Classic:0.12.0/
starting_height : 418653
relay : True


Edit (4-Jul-2016): Python 2.5 to 3.5 migration

Please read the general notes about the transition from Python 2.5 to 3.5 over here. And the complete github change log for the migration over here.

The code for the <Version> message remained fairly untouched; only a few adjustments were required:

class EncodeVersion:
    self.timestamp = to_int64(int(time.time()))
    self.addr_recv_ip = to_big_endian_16char(b"")
    self.addr_trans_ip = to_big_endian_16char(b"")

class DecodedVersion:
    self.user_agent = read_chars(payload.read(self.user_agent_bytes), self.user_agent_bytes)
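A likely reason for the int(...) wrapper around time.time() (this is our reading of the change, not something the change log states): in Python 3 the struct module refuses to pack a float into an integer format. A quick sketch using plain struct calls rather than our to_int64 helper:

```python
import struct
import time

now = time.time()  # a float, e.g. 1467293151.73

try:
    struct.pack("<q", now)  # int64 format with a float argument
    float_accepted = True
except struct.error:
    float_accepted = False

packed = struct.pack("<q", int(now))  # truncating to int first works
print(float_accepted, len(packed))
```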



Messages part one – constructing the message and headers


Communication in the Bitcoin network is done via messages.  A message is no more than just a string of bytes.

This is an example of a simple “ping” message:

f9beb4d970696e670000000000000000080000001b3cb220309941550a2ffd5c # The full message. Header + Payload

The Bitcoin protocol describes how each message should be packed. It is highly important to construct the message in the right format. The machine on the other side expects a very specific format and won’t be able to process any message that doesn’t fit it.

Every message is made of two main components:

  1. Header (not to be confused with block headers! we’ll talk about them in a later post!)
  2. Payload

The payload is the body of the message – The message itself. Many types of messages are defined in the Bitcoin protocol, and we’ll talk about each message later on, but for now we should take a look at the header and that’s because the header is the one thing ALL of the messages have in common.

Each header is made of four components:

Size (Bytes) | Name         | Data type | Description
4            | Start string | char[4]   | The network identifier
12           | Command name | char[12]  | The name of the command
4            | Payload size | uint32    | len(payload)
4            | Checksum     | char[4]   | SHA256(SHA256(payload))[:4]


The first 4 bytes of the message are the Starting-string (or Magic number).

f9beb4d9 # 4 bytes starting string (Magic number)

This number tells the receiving machine which network I’m using. In this tutorial we’re using the real main network (Caution! Any mistake made on the real main network might cost you real bitcoins!). The Magic number of the main network is 0xf9beb4d9. You can look at the “ping” message example at the top and see for yourself that the first four bytes in the message are indeed f9beb4d9. The receiving machine will first check this Magic number to make sure that it is receiving messages from the network it’s currently running on, and only then will it start processing the rest of the message.


Command name

The next 12 bytes are the Command name.

70696e670000000000000000 # 12 bytes command name

Each header should contain the name of the command, or type of message, that is contained in the payload (body) of the message. In our case, the message is a “ping” message. The receiving machine will use this information to know how to parse and treat this message. In our example the receiving machine will answer with a “pong” message (but only after it has finished validating the message). Pay attention that the command field should be exactly 12 bytes long. That means that if the command name is shorter than 12 bytes, it needs to be padded with nulls. We can see that the first 4 bytes make the word “ping” in hexadecimal (70-p, 69-i, 6e-n, 67-g) and the other 8 bytes are nulls (00). In our code implementation (see below) we’re using the function command_padding to pad our command name to 12 bytes.

def command_padding(self, command):  # The message command should be padded to be 12 bytes long.
    command += (12 - len(command)) * "\00"
    return command



Another 4 bytes will contain the size of the payload.

08000000 # 4 bytes size of the payload

This field is very straightforward. We just need to insert the size of the payload (body) of our message. In our case it’s simply 8 bytes long. (Once again, we need to make sure the field is exactly 4 bytes long. Luckily we’ve predefined this data type in our Bitpy/Utils/ file under to_uint32(v), so we don’t have to manually insert the extra 3 null bytes.)



The last part of the header is the checksum.

1b3cb220 # 4 bytes checksum

It might seem somewhat strange at the beginning, but all that we need to do is take the payload of our message, apply the cryptographic hash function SHA256 to it twice, and then append the first 4 bytes of the result to our header: SHA256(SHA256(payload))[:4]

def get_checksum(self):
   check = hashlib.sha256(hashlib.sha256(self.payload).digest()).digest()[:4]
   return check
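The receiving side runs the same calculation to validate a message. A minimal standalone sketch follows; the function name checksum_matches is an assumption for this illustration and is not part of Bitpy.

```python
import hashlib

def checksum_matches(payload, checksum):
    """Recompute SHA256(SHA256(payload))[:4] and compare it with the header field."""
    expected = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    return expected == checksum

# A message with an empty payload always carries the checksum 5df6e0e2
print(checksum_matches(b"", bytes.fromhex("5df6e0e2")))
print(checksum_matches(b"", bytes.fromhex("00000000")))
```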




And now all that we’re left with is the payload, the body of the message. Each message type is parsed differently.

309941550a2ffd5c # The payload, the body of our "ping" message.
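Putting the pieces together, the example “ping” message can be split into its five parts with plain slicing. This is only an illustrative Python 3 sketch using the field sizes from the header table; it does not use the Bitpy helper functions.

```python
import struct

raw = bytes.fromhex(
    "f9beb4d9" "70696e670000000000000000" "08000000" "1b3cb220" "309941550a2ffd5c"
)

magic = raw[0:4]                                   # 4 bytes starting string
command = raw[4:16].rstrip(b"\x00")                # 12 bytes, null padding removed
payload_size = struct.unpack("<I", raw[16:20])[0]  # 4 bytes, little-endian uint32
checksum = raw[20:24]                              # 4 bytes checksum
payload = raw[24:24 + payload_size]                # the rest is the payload

print(magic.hex(), command, payload_size, checksum.hex(), payload.hex())
```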


The code implementation

In our code we need to deal with both incoming and outgoing messages. So we’ve split our code into two files:

  1. Bitpy/Packets/ – to parse the headers of the incoming messages (and, after the header was properly parsed, to use the 12-byte command to determine what other steps are required).
  2. Bitpy/Packets/ – to build the header of our outgoing message and to prefix it to the payload of the message.

Bitpy/Packets/ for incoming messages

class HeaderParser:
    def __init__(self, block):  # Packets is a stream

        self.magic = read_hexa(block.read(4))
        self.command = block.read(12)
        self.payload_size = read_uint32(block.read(4))
        self.checksum = hash_to_string(block.read(4))

        self.header_size = 4 + 12 + 4 + 4

    def to_string(self):
        display = "\n-------------HEADER-------------"
        display += "\nMagic:\t %s" % self.magic
        display += "\nCommand name	:\t %s" % self.command
        display += "\nPayload size	:\t %s" % self.payload_size
        display += "\nChecksum	:\t\t %s" % self.checksum
        display += "\nheader Size:\t\t %s" % self.header_size
        display += "\n"
        return display

Bitpy/Packets/ for outgoing messages

class PacketCreator:
    def __init__(self, payload):
        self.payload = payload.forge()  # The message payload forged

        # create the header
        self.magic = to_hexa(
            "F9BEB4D9")  # The Magic number of the Main network -> This message will be accepted by the main network
        self.command = self.command_padding(payload.command_name)
        self.length = to_uint32(len(self.payload))
        self.checksum = self.get_checksum()

    def command_padding(self, command):  # The message command should be padded to be 12 bytes long.
        command += (12 - len(command)) * "\00"
        return command

    def get_checksum(self):
        check = hashlib.sha256(hashlib.sha256(self.payload).digest()).digest()[:4]
        return check

    def forge_header(self):
        return self.magic + self.command + self.length + self.checksum

    def forge_packet(self):
        return self.forge_header() + self.payload
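For readers who want to try the header logic outside the project, here is a compact Python 3 sketch of the same steps using only the standard library. The name build_message is hypothetical; the real code goes through PacketCreator and our data type helpers.

```python
import hashlib
import struct

MAINNET_MAGIC = bytes.fromhex("f9beb4d9")

def build_message(command, payload=b""):
    """Hypothetical helper: prefix a 24-byte header to a payload."""
    if len(command) > 12:
        raise ValueError("command name longer than 12 bytes")
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    header = (MAINNET_MAGIC
              + command.ljust(12, b"\x00")       # pad the command with nulls
              + struct.pack("<I", len(payload))  # payload size, uint32
              + checksum)
    return header + payload

# "verack" has no payload, so the whole message is just the header
print(build_message(b"verack").hex())
```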


Edit (4-Jul-2016): Python 2.5 to 3.5 migration

Please read the general notes about the transition from Python 2.5 to 3.5 over here. And the complete github change log for the migration over here.

The command_padding function was rewritten to take full advantage of Python 3.5’s capabilities, and we’re now using the built-in string method ljust.

def command_padding(self, cmd):
    command = str(cmd)
    command = command.ljust(12, '\00')
    return str.encode(command)


Connection part one – Finding a node and packets routing


The first thing we need our code to do is to connect to the Bitcoin network. This is a relatively straightforward process: we just need to find one node in the network and establish a connection with that node. A list of a few of the active nodes can be easily found online. We’ve randomly picked one node from such a list.

We’re using the socket module to establish our connection using this simple code:

import socket
import sys

HOST = ""  # The IP address of the node we picked
PORT = 8333

"""
    We will use this file to connect to one node
    But in the future we will connect to more than one
"""

def connect():
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

    try:
        sock.connect((HOST, PORT))
    except Exception as e:
        print(e)

    return sock


We’ve also created a Main file at the root of our folder structure (right under Bitpy/). It will initialize the connection code upon startup and will route our incoming and outgoing packets to ReceiverManager and SenderManager respectively. The queue module helps us make sure that the packets are processed in the right order. We’ll see later what each file does, but for now, what is important to understand is:

  1. We’re connecting to another node on the network.
  2. We’ve found the address of this node on a public list.
  3. The connection code is stored at Network/
  4. We’ve created a Main file under our root directory (Bitpy/) that will initialize the connection to the node, and will route our incoming and outgoing packets to one of the two queue files, one for outgoing packets and one for incoming packets. Both files can be found under Manager/.
  5. The user manually specifies which packet (message) they want to send using the core_manager. We’ll talk about it later on, when we deal with the user interface.


So now we should have a look at our Receiver/Sender Managers, but our ReceiverManager is a bit too complex for this stage, so we’ll talk about it later, once we’re ready to talk about parsing incoming messages. For now, we’ll only have a look at our SenderManager.

The first thing we did was to use the threading module. This module allows us to keep our connection asynchronous, which means that we can receive and send messages at the same time. Apart from the threading import, this file contains only one class, SenderManager. Once this class is instantiated, it runs in its own thread, it can use our sock object (declared in the connection code) to talk to the remote node, and it receives the packets queue from the Main file.


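from threading import Thread

class SenderManager(Thread):

    def __init__(self, sock, queue):
        Thread.__init__(self)  # Initialize the underlying thread before using it
        self.sock = sock
        self.queue = queue

    def run(self):
        while True:
            if not self.queue.empty():
                order = self.queue.get()
                # Assumption: the queue holds forged packets, ready to be sent as-is
                self.sock.send(order)

        print("Exit sender Thread")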


So the Main file gets from the user a list of packets (messages) to send. (The user chooses the packets with the core_manager.) The packets are stored in a queue, and a SenderManager object is then created. It gets access to the sock object, the thread, and the queue, and then simply sends the packets in their order, as specified in the queue, one by one, to the IP address and port of the sock, while keeping the connection asynchronous.
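The queue-plus-thread idea can be tried without any network at all. The following Python 3 sketch is not the Bitpy SenderManager: the names SenderThread, FakeSocket and send_all are hypothetical, a blocking queue.get() replaces the polling loop, and None is used as an assumed shutdown signal.

```python
import queue
import threading

class SenderThread(threading.Thread):
    """Hypothetical sender: drains a queue and writes each packet to a socket-like object."""

    def __init__(self, sock, packets):
        super().__init__(daemon=True)
        self.sock = sock
        self.packets = packets

    def run(self):
        while True:
            packet = self.packets.get()  # blocks until a packet is available
            if packet is None:           # None is our (assumed) shutdown signal
                break
            self.sock.sendall(packet)

class FakeSocket:
    """Stand-in for a real socket so the sketch runs offline."""

    def __init__(self):
        self.sent = []

    def sendall(self, data):
        self.sent.append(data)

def send_all(sock, packets_to_send):
    q = queue.Queue()
    worker = SenderThread(sock, q)
    worker.start()
    for packet in packets_to_send:
        q.put(packet)
    q.put(None)    # ask the worker to stop once the queue is drained
    worker.join()

fake = FakeSocket()
send_all(fake, [b"first", b"second"])
print(fake.sent)
```

Because a queue is first in, first out, the packets reach the socket in the order the user queued them.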


Before we can start sending and receiving messages, we first need to learn about messages.



Data types


When reading the Bitcoin developer reference, it becomes immediately clear that the Bitcoin protocol requires the user to work only with specific Bitcoin data types. You can’t just insert numbers as int and expect it to work. Each field of every packet that is sent or received by our node needs to be properly formatted.

Let’s have a look at the version message documentation in the Bitcoin developer reference.
We can see that the first field should contain the protocol version number (currently 70012). But we can’t just send the number as-is: it’s specifically stated that the number should be 4 bytes, of type int32. The same goes for any variable: any piece of information that is either received or sent has to be formatted in the predefined manner specified in the Bitcoin protocol documentation. Luckily, Python has the struct module, which allows us to easily pack our data types. In our example we want to pack the number “70012” into a 4-byte int (remember, 32 bits is 4 bytes) with the variable name “version”.
So using the struct module in our code should look like this:

import struct

version = struct.pack("i", 70012)

The i in the code represents a 4-byte integer. For a complete list of format characters and their meaning, have a look at the table in the struct module documentation.
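One detail worth spelling out: a bare "i" uses the machine’s native byte order, while Bitcoin messages encode these integers as little-endian. On common x86 machines the two coincide, but an explicit "<" prefix makes the intent visible. A small Python 3 sketch (the "<" prefix is our suggestion here, not what the snippet above uses):

```python
import struct

packed = struct.pack("<i", 70012)      # "<" = little-endian, "i" = 4-byte signed int
print(packed.hex())                    # 7c110100
print(struct.unpack("<i", packed)[0])  # 70012
print(struct.calcsize("<i"))           # 4
```

70012 is 0x0001117c, so the least significant byte 7c comes first on the wire.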

This is quite a simple process: just look at the Bitcoin documentation to find out how each variable should be formatted, and then head to the struct module documentation to find the corresponding character. But once done again and again for each and every variable, it will surely cause our code to get out of control, and errors are a sure thing. So Alexis suggested that we predefine all of the required data types in one file. Now, instead of using the previous code for our version variable, we can just use the predefined function to_int32(v):

import struct

def to_int32(v):
    return struct.pack("i", v)

version = to_int32(70012)

We’ve also added a read_int32, which allows us to easily get back our variable.

import struct

def to_int32(v):
    return struct.pack("i", v)

version = to_int32(70012) # The number 70012 is now packed.

print(version) # Unreadable

def read_int32(v):
    return struct.unpack("i", v)[0]

print(read_int32(version)) # The number 70012 is readable again

Most of the data types were easy to define, but the Bitcoin protocol has one special data type which is called compactSize_uint.
In this data type, every number higher than 252 gets a one-byte prefix that indicates how many bytes the number itself takes. This data type is mostly used to describe the size of variables of changing length.

import struct

def to_compactSize_uint(v):
    if v < 0xfd:
        return struct.pack("<B", v)
    elif v <= 0xffff:
        return b"\xfd" + struct.pack("<H", v)
    elif v <= 0xffffffff:
        return b"\xfe" + struct.pack("<I", v)
    else:
        return b"\xff" + struct.pack("<Q", v)

def read_compactSize_uint(s):  # s is a stream of bytes

    # Read an unsigned char to get the format
    size = ord(s.read(1))

    # Return the value
    if size < 0xFD:
        return size
    if size == 0xFD:
        return read_uint16(s.read(2))
    if size == 0xFE:
        return read_uint32(s.read(4))
    if size == 0xFF:
        return read_uint64(s.read(8))
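To see the prefixes in action, here is a self-contained Python 3 round trip. The names encode_compact_size and decode_compact_size are made up for this sketch, which uses struct directly instead of our read_uint16/32/64 helpers.

```python
import struct
from io import BytesIO

def encode_compact_size(n):
    if n < 0xFD:
        return struct.pack("<B", n)             # the value itself, 1 byte
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)   # prefix + 2 bytes
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)   # prefix + 4 bytes
    return b"\xff" + struct.pack("<Q", n)       # prefix + 8 bytes

def decode_compact_size(stream):
    first = stream.read(1)[0]
    if first < 0xFD:
        return first
    fmt, size = {0xFD: ("<H", 2), 0xFE: ("<I", 4), 0xFF: ("<Q", 8)}[first]
    return struct.unpack(fmt, stream.read(size))[0]

for n in (16, 252, 253, 65535, 65536, 2**32):
    encoded = encode_compact_size(n)
    assert decode_compact_size(BytesIO(encoded)) == n
    print(n, encoded.hex())
```

Note how 252 still fits in a single byte (fc), while 253 already needs the fd prefix and becomes fdfd00.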

The parse_ip bug

We’ve also tried to build a parse_ip function to properly display IP addresses. Unfortunately, we came across a bug when using Windows. You can read more about our attempts to deal with the bug on our Trello board.

Edit (4-Jul-2016): Python 2.5 to 3.5 migration

Please read the general notes about the transition from Python 2.5 to 3.5 over here. And the complete github change log for the migration over here.

Most of the data type functions have remained unchanged, with the following exceptions:

The functions that dealt with reading and writing characters were replaced by two functions: to_chars and read_chars.

def to_chars(v, length=-1):
    if length == -1:
        length = len(v)
    return struct.pack(">%ss" % length, v)

def read_chars(v, length=-1):
    if length == -1:
        length = len(v)
    return struct.unpack(">%ss" % length, v)[0]

These new functions can accept a specific size (length). If no length is given, they calculate the size of the string automatically. This allows us to deal with strings of various sizes.

The parse_ip function was fixed and replaced by the following code:

def parse_ip(ip):
    IPV4_COMPAT = b"\x00" * 10 + b"\xff" * 2

    # IPv4
    if ip[0:12] == IPV4_COMPAT:
        ip = read_hexa(ip[12:])  # we remove the first 10 "\x00" and 2 "\xff" bytes, and convert the remaining bytes to hexa
        ip = "%i.%i.%i.%i" % (int(ip[0:2], 16), int(ip[2:4], 16), int(ip[4:6], 16), int(ip[6:8], 16))

    # IPv6
        # TODO

    return ip
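For the IPv6 branch that is still marked as TODO, one possible direction (an alternative sketch, not what Bitpy currently does) is the standard library’s ipaddress module, which understands both plain IPv6 and IPv4-mapped addresses. The function name format_ip is hypothetical.

```python
import ipaddress

def format_ip(raw16):
    """Hypothetical helper: render a 16-byte address field as text."""
    addr = ipaddress.IPv6Address(raw16)  # accepts 16 packed bytes
    if addr.ipv4_mapped is not None:     # ::ffff:a.b.c.d carries an IPv4 address
        return str(addr.ipv4_mapped)
    return str(addr)

ipv4_field = bytes(10) + b"\xff\xff" + bytes([192, 0, 2, 1])
ipv6_field = bytes.fromhex("20010db8" + "00" * 11 + "01")

print(format_ip(ipv4_field))
print(format_ip(ipv6_field))
```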

We’ve also added two more functions for encoding and decoding hexadecimals:

def to_hexa(v):
    return bytes.fromhex(v)

def read_hexa(v):
    return v.hex()


Baby steps. Folders structure


Our first major decision was how to structure our project. After some considerations we’ve decided to use the following structure.


Bitpy file structure
Manager /
Network /
Packets /
	control_messages /
	data_messages /
Utils /

The Manager folder will contain the UI of the project and the code that deals with both incoming and outgoing packets.

The Network folder will contain all the files that are required to establish and maintain connection with the network.

The Packets folder is where we actually start to see the effect of the Bitcoin protocol on our design. The Bitcoin protocol sends and receives packets of data. There are 2 types of data packets: control messages and data messages.

(This is just a short introduction; we’ll look into each and every message in the future.)

control messages are used to send meta type information such as: What’s your version number? (Version), What’s your address? (getAddr), Ping and Pong, etc. This information is used by the Bitcoin protocol, but doesn’t contain any real information on blocks, transactions, signatures etc. Basically, nothing that will affect the Bitcoin blockchain: no transaction will be recorded, no messages will be signed, no block will be added, etc.

data_messages are used to interact with the Bitcoin blockchain. Either we only ask for some information from the blockchain, such as: send me specific blocks or block information (GetBlocks, GetData, Inv), show me the Mempool (Mempool) etc., or we’re trying to interact with the Bitcoin blockchain by sending transactions (Tx) or blocks (Block).

It is important to note that the Bitcoin protocol requires that every message will be constructed in the same way (mostly it means it will have a specific header and will contain only specific data types. Again, we’ll look into it with more details soon). That is why we’ve also included HeaderParser and PacketCreator in this Packets folder.

The Utils folder will contain an assortment of tools that might be required in our project.