Browsed by
Category: Python implementation

Scripts and stacks

Scripts and stacks

Personal note, Many things happened in past two months the required my full attention. I hope to resume a steady flow of posts in coming days.

Review

In the last post we’ve talked about one the biggest bitcoin misconception – The idea that transaction actually moves coins from one wallet to another. The truth is that transactions are nothing more that statements. These statements always points to a previous statement (that in turn point to an even older statement and so on), and usually these statement also specify an amount of coins that the current owner is wishing to transfer. The statement also contains a riddle, or an equation that needs to be proofed, and mostly, the key to proof this equation will require the use of the private_key that is associated with the recipient bitcoin address.

level3

Pay attention, even though Bob will be required to use his own private_key to proof that he indeed can solve this problem, the private_key still won’t be available to any one.

 

Now let’s look for a second at this transaction message. We’ve already learned how to create a bitcoin message (see this section about Version message and this one about headers). We just need to make sure that all of the fields are filled in accordance the protocol rules. Just like filling a form. You can find a complete list of the fields that needs to be filled in the bitcoin developers documentation.

 

 

Most of the fields are quite straight foreword. I might still create another post in the future with detail instructions on how to fill all the fields, but this isn’t really the topic of this post. This post deals with one of bitcoin more fascinating aspects – The riddle that Alice place in her statement. The riddle that only Bob can solve -The script.

 

(You just can’t wait to create your own transaction? you’re more than welcome to watch my videos on creating bitcoin transaction)

Scripts, what is it?

Scripts is a computer language. In more detail, it’s a set of predefined words that are agreed upon. Every node that follows the rules specified in the bitcoin protocol will know how to read, interpret and implement these words. Because bitcoin messages are basically nothing more then a string on bytes, these words are not written in plain English, rather are translated to OP_CODEs. That way, we can send our message as a string of bytes, and the receiving node will know that these bytes represent some instructions. (Important note, The receiving node will only treat this bytes as instructions only if they appear inside one of the script field.)

Here’re selected few:

Word Opcode Hex Input Output Description
OP_1ADD 139 0x8b in out  1 is added to the input.
OP_1SUB 140 0x8c in out 1 is subtracted from the input.
N/A 1-75 0x01-0x4b (special) data The next opcode bytes is data to be pushed onto the stack
OP_MIN 163 0xa3 a b out  Returns the smaller of a and b.
 OP_SHA256  168  0xa8  in  hash The input is hashed using SHA-256
OP_EQUAL 135 0x87 x1 x2 True / false Returns 1 if the inputs are exactly equal, 0 otherwise.

The original list included around 200 of these words, but currently most nodes will only support few dozes of these words. Using these few words we can create many “riddles” or state many conditions to claim the coins in our transaction message.

For example I can add the following string of bytes as my script.

0x01 0x8b 0x87 0x02 0x87

<1> <OP_1ADD> <2> <OP_EQUAL>
  1. It will take the number 1.
  2. Use the OP_CODE OP_1ADD to add 1 to it -> The output of this OP_CODE will be 2.
  3. Use the OP_CODE OP_EQUAL to make sure if the result is equal to 2. -> The output of this OP_CODE will be True.

A word of caution though, most nodes not only refuse to accept most of these OP_CODEs, they will even refuse to accept most non-standard  scripts, mainly because they want users to use standard transactions. Many nodes will not only refuse to accept a transaction with a non standard script, they’ll also refuse to transmit these transactions to other nodes.

 

Stacks

You might’ve already noticed that this script language can only be written as a list of operations. Unlike other high level languages (such as python for example) Scripts can only be used in a predefined order. This type of structure is called stack, because we’re stacking variables and data on top of each other. But not only we’re stacking them, using the stack structure also means that they’ll be processed in accordance to the order in which they were stacked.

In our previous example, the integer 1 was the first item in our stack. Then came the operation OP_1ADD which took that item as its input, processed this item by adding 1 to it, and than giving the output 2. Now the number 2 is stacked BELLOW the integer 2.

<1> <OP_1ADD> <2> <OP_EQUAL>

<2> <2> <OP_EQUAL>

The node recognize the OP_CODE <0x02> as the integer 2, so it moves on to the next item in our stack – the OP_CODE OP_EQUAL. This operation input is the two items that are directly bellow it and compere the two. If both are equal, it will return True.

<True>

 

This example code can’t be used with a standard bitcoin transaction, it’s only meant to give you a general feel on how scripts works.

You can find an example of a real transaction over here:

 

 

Give it a try with bitpy

One of bitpy newest feature is the ability to create stacks and see them in action in real time. Mind you, only few OP_CODES are currently implemented, but it might still give you a feel on how stacks works.

Example of stack using bitpy
Example of stack using bitpy

 

Simple stack architecture with python

Stack architecture can easily be implemented using arrays. After all, it’s nothing than an array of objects (variables, operations, results etc’).

In our bitpy project, under Utils/OpCodes/Codes.py I’ve created a stack class. In its most basic form, this class will only create and empty array upon initialization, followed by  2 methods only.

class Stack():

    def __init__(self):
        self.items = []

    def push(self, item):
        self.items.append(item)

    def pop(self):
        elm = self.items.pop()
        return elm
  1. push(item) append new item to the array
  2. pop(item) remove the topmost item in my array.

This should be enough to create a very basic stack class. Still, I’ve added few more methods.

class Stack():

    def __init__(self):
        self.items = []

    def isEmpty(self):
        return self.items == []

    def push(self, item):
        self.items.append(item)

    def pop(self):
        elm = self.items.pop()
        return elm

    def size(self):
        return len(self.items)

    def printStack(self):
        display = ""
        for items in self.items:
            items = str(items)
            if len(items) > 5:
                display += " " + "<"+ items[:5] + "..." + ">"
            else:
                display += " " + "<" + items + ">"
        return display

    def clear(self):
        self.items.clear()

The isEmpty method will check if our stack array is empty.

The size method will give us the size of the array.

The printStack will provide us with a visual representation of our array. Pay attention that I’ve limited the size of each item to only 5 characters so that items such as hashed messages, bitcoin addresses, keys etc’ won’t take the all screen.

The clear method will remove all items from our array.

Using this methods we can easily start implementing more advanced OP_CODE to our stack array.

def OP_DUP(self):
    elm = self.pop()
    self.items.append(elm)
    self.items.append(elm)

def OP_HASH160(self): #saved as string!
    self.push(Utils.keyUtils.keys.generate_hashed_public_key_string(self.pop()))

def OP_EQUAL(self):
    elm1 = self.pop()
    elm2 = self.pop()

    if elm1 == elm2:
        self.push(1)
    else:
        self.push(0)

def OP_VERIFY(self):
    top = self.pop()
    if top == 1:
        self.push(1)
    else:
        self.push(0)

def OP_RETURN(self, input):
    self.push(input)

 

Transaction part one – The misconceptions about the block chain

Transaction part one – The misconceptions about the block chain

Three levels of abstractification.

The are three level to understand the way the Bitcoin block chain works.

The first level
One of the most simple Bitcoin block chain abstractification

In the first level, we got those who just heard about Bitcoin for the first time. People usually thinks that the coins are just associated with the Bitcoin address. Whenever Alice sends Bob coins, she just place a statement (or transaction) in the block chain specifying that the X coins that were associated with Alice address, should now be associated with Bob address. This is of course wrong since this representation contains only 3 thing: the sender address, the receiver address and the amount of coins to be transferred. But this level of abstraction is usually the first thought most people have when first presented with the idea of the block chain.

The second level
Points to previous transacions

In the second level, people starts to understand that each transaction also contains the origin of each coin. So if Alice wants to send 5 BTC to bob, she first needs to show where she got these 5 BTC from, That is, she also needs to provide a way to point to older translations on the block chain, transactions which proves that Alice is indeed the rightful owner of these five BTC. So now what we have starts to look more like a chain. It’s no longer just a table containing the address and the current BTC balance,  rather the block chain needs to contain a list of all previous transactions. This depiction is more accurate, but still it isn’t complete – introducing the third level.

The third level
Points at previous transaction and set the condition for claiming the coins.

In the third level, people learns that a transaction is more then just a simple statement in the block chain. Transaction contains more then just information about the origin of the coins (the input), the address of the receivers (the outputs) and the amount of coins to be transferred. Rather, the transaction also contains a riddle in it, and only by solving this riddle, can the coins be claimed and transferred. So when Alice send 5 BTC to Bob, she doesn’t only pointing at the transaction from which Alice got the coins, it also solves the riddle specified in that transaction to prove that Alice is allowed to claim those coins, but that’s not the end of it, in the transaction that Alice publish on the block chain she’s also inserting another riddle, a riddle that only Bob can answer. So if Bob would like to use this coins in the future, he’ll have to solve that riddle which was provided to him by Alice. And the process goes on and on and on.

I love riddles! Or, how to solve the riddle and prove that I’m allowed to claim the coins?

A quick reminder. Signing messages and key pair.

In the previous post we’ve talked about mathematical trapdoors and hashing functions. We’ve saw how we can use these types of function so sign a message with a private key and how these signed message can be verified with the corresponding public key.

F("My name is Alice", Alice's_private_key) = signed message.

Verify(signed message, "My name is Alice", Alice's_public key) = true.

Change any component in the Verify statement, and we’ll receive false. This way we can easily prove that the message “My name is Alice” was indeed signed by Alice and that no one tempered with that message.

Pay attention! We cannot sign a message with the public key, it simply wont work! the public key can only be used to verify a message, not to sign it!

So, Alice can now publish a message in the block chain that contains the input (origin of the coins) and the output (the address of the receiver). This is the original message, public for anyone to see and check as he or she pleases. Alice can also now specify the following rules in that public message:

  1. This message is the original message.
  2. Within this message I’ve included Bob’s_public_key (In hashed format – We’ll get there in a second).
  3. Take your public key and hash it. If your hashed public key matches the hashed public key from step 2, you may move to the next step.
  4.  Take this original message, and sign it with your own private key.
    F(This message, Bob’s_private_key) = signed message.
  5. The signed message should be verified only against the public key that is specified in this message.
    Verify(signed message, This message, Bob’s_public_key) = true.
  6. If the verification indeed yield true, you may claim the coins in this transaction

 Getting Bob’s public key

When Alice specify Bob as the recipient of a transaction, she does so by inserting Bob’s hashed public key. The hashed public key (as we already saw in the post dealing with keys) can be deducted from Bob’s Bitcoin address.

The public key is hashed and then added to the final Bitcoin address.
The public key is hashed and then added to the final Bitcoin address.

So basically, knowing Bob’s Bitcoin address is equal to knowing Bob’s hashed public key. All we need to do is to convert the address back from base 58, and remember that out of the 25 bytes of the binary address, we need to omit the first byte and the last 4 bytes.

Scrips – the beginning

As we’ve already saw many times in the past, Bitcoin message is no more then a string of bytes that follow a predefined order. That means that in order to specify conditions and rules like the one Alice is using in her transaction message, we also need all parties to agree on a predefined field that will contain the conditions, and we also need to agree on a way to translate the bytes in that field into a set of rules, or instructions. For this purpose, a scripting language was created for Bitcoin, this langue allocate predefined operation to a predefined byte. For example, the byte 0x8b means “plus 1”, the byte 0x8c means “minus 1”, the byte 0xa9 means “hash the public key. First with SHA-256 and then with RIPEMD-160”, and so on. There’re many more of these operators (also called OP-codes) and these OP-codes helps us to specify the rules that needs to be fulfilled. These rules are transparent and are publicly visible on the block chain. Whenever a node verify a transaction, the nodes follows these rules and make sure that they eventually do yield a true statement.

We’ll talk more about these OP-codes and scrips in the next sections, when we’ll also see a real example of valid transaction script.

 

Keys, addresses and hashing

Keys, addresses and hashing

A key pair is one of the greatest tools that are used in Bitcoin, but it might be a little unintuitive at first. Don’t worry, you’ll get it!

There’s also a short video I made a few months ago that describes the basics of keys. It doesn’t completely corresponds to our current project, but it might provide you with another point of reference. You can watch it over here – Bitcoin python tutorial for beginners – keys and address.

 

One way function

The name “one way function” is quite self explanatory. These function are very easy to solve in one way but almost impossible to invert. Giving function f, and the input x, I can easily calculate the result y.

f(x) = y <- easy to solve

But given the result y, and the function f, It will be almost impossible to find x

f(?) = y <- almost impossible to guess.

The Bitcoin protocol define the use of some of these one way functions (SHA256, ripmed, ECDSA and murmurhash. More functions are being tested and might be used in the future).  Each one of these function has its own place in the protocol. Some functions will be used more then once and/or will be combined with another function to achieve even a grater level of security. For example, signing a message (usually a transaction message) will be done using the SHA256 function, SHA256("hello!"), finding the checksum of the message payload will be done using the SHA256 function twice SHA256(SHA256(message)).

  • Some people have hard time to accept the concept of “hard to guess”, they feel it’s too ambiguous. Well, technically an extremely powerful computer might be able to iterate through all possible results until it will find the right one (this is called brute force), but in practice, it will take a very – very long time. Trying to brute force the result of a SHA256 function on a 32 bytes message will take about 10^65 years. The age of the universe is only 1.4*10^9 year. I think it’s good enough security.
It’s easy to get the result y of function f for a giving x. But almost impossible to tell what the original input was.

 

Mathematical trapdoor and key pair

Mathematical trapdoor is a special type of one way function. The main difference is that in mathematical trapdoor we may also use few extra pieces of information called keys. Bitcoin uses the mathematical trapdoor function ECDSA or Elliptic Curve Digital Signature Algorithm, to produce two keys, or a key pair -A private key, and a public key – Both keys will always come in pairs! there cannot be a public key that matches two different private keys and vice versa!

The private key is used to solve (sign) the function f for message x. The result is the signed message y.

f(private_key, x) = y <- easy to solve

Now I have two messages. The original message x, and the signed message y. I want to prove that I’m the one who signed the original message x, that I’m the owner of the private key. But I don’t want to give my own private key. Anyone who have my private key will be able to sign in my name on other messages as well. So I’m using the public key. The public key can only be used to prove the solution of the function, but it cannot be used to sign messages

f(public_key, x) = y <- easy to prove

f(public_key, x) = null <- I can't sign message x with the public key. only with the private key

  • Pay attention that when we’re using the public key we’re just proving the equation, not solving it.

Here’s a simple numeric example I found on the wikipedia page on mathematical trapdoor:

An example of a simple mathematical trapdoor is “6895601 is the product of two prime numbers. What are those numbers?” A typical solution would be to try dividing 6895601 by several prime numbers until finding the answer. However, if one is told that 1931 is one of the numbers, one can find the answer by entering “6895601 ÷ 1931” into any calculator. This example is not a sturdy trapdoor function – modern computers can guess all of the possible answers within a second – but this sample problem could be improved by using the product of two much larger primes.

 

Let’s see an example:

Step one – create a key pair:

Create key pair using the ECDSA function and some random numbers
Create key pair using the ECDSA function and some random numbers

Step two – sign a message with the private key:

using the private key and the ECDSA algorithm
using the private key and the ECDSA algorithm

Step three – send the original message alongside the encrypted message and the public key

3 items are needed to validate the message. The public key, the original message and the encrypted message
3 items are needed to validate the message. The public key, the original message and the encrypted message

The code

In our project we’ve defined the Key class under Bitpay/Utils/KeyUtils/keys.py. This class contains all the necesery steps that are required in order to generate a private key, trnsform that private key to a public key and then create a Bitcoin address out of that public key.

step one – create (or receive) the private key

The first thing that we’re going to do is to create our private key. The private key is defined as  a random 32 bytes uint. Our class begins with a simple check. If the user initialize the Key class with an already existing private key, that private key will be saved into self.private_key. Otherwise, we’re using the urandom function in the os module to create a random 32 bytes long number.

def __init__(self, private_key=0):
    if private_key == 0:
        self.private_key = os.urandom(32)
        self.printable_pk = str(binascii.hexlify(self.private_key), "ascii")
    else:
        self.printable_pk = private_key
        self.private_key = binascii.unhexlify(private_key.encode('ascii'))

You might’ve noticed that we’ve also created a printable_pk variable. This variable will store the private key in hexadecimals. This way it is easier to store, copy and/or print the private key.

 

Step two – Use the private key to initialize the signing function

After we got our private key it’s time to use it initialize our ECDSA function. This step is similar to declaring our function f with the private key pr_k.

self.sk = f(pr_k, )

We’re defining the variable self.sk (for Signing Key) and use SigningKey.from_string from the ECDSA module with two arguments, the first one is our self.private key, and the second one is the curve (We haven’t talked about the curve yet, But it represent the mathematical part of our function. This is too advance mathematics so we won’t go into it in this project. But for now we just need to know that the Bitcoin protocol requires us to use the ECDSA function with the mathematical curve SECP256k1)

self.sk = ecdsa.SigningKey.from_string(self.private_key, curve = ecdsa.SECP256k1)

 

Step three – Use the initialized function (self.sk) to get the public key

Now that we got our signing key, we can use it in order to create our public key.

self.vk = self.sk.verifying_key

We’re defining a new variable called self.vk which will hold the verifying key, or the public key that can be sent alongside the signed message and the original message. This key will be used to verify that the message was indeed signed by the owner of that public key. And since every public key matches only one specific private key, it also proves that the one who signed the message also possess the corresponding private key.

 

Step four – Formatting the public key.

The variable self.vk holds the public key that will be used to verify our signed messages. But the Bitcoin protocol requires that we’ll represent this public key in couple of different formats.

The following chart from the Bitcoin wiki site shows the way the public key should be formatted:

Converting the public key to Bitcoin address
Converting the public key to Bitcoin address

 

The first line is the real public key, or in our case the verification key self.vk

this is the real public key - but we can't send it like this. We need to do dome formatting
this is the real public key – but we can’t send it like this. We need to do dome formatting

 

The second line tells us that we need to inser the byte 0x04 at the beginning of our public key

self.public_key =  b"04" + binascii.hexlify(self.vk.to_string())

We’re using the function to_string in order to display the variable self.vk  as a string. Then we convert it to hexadecimals so it will be easier to append the byte 0x04.

This is the public key in Bitcoin terminology. Usually, When looking for the public key in signed transactions, that's what it will look like
This is the public key in Bitcoin terminology. Usually, When looking for the public key in signed transactions, that’s what it will look like

 

The third line tells us to hash the public key twice. once using the SHA256 function, and then again using the ripemd160 function.

ripemd160 = hashlib.new('ripemd160') # <-initializing the ripemd160 function 
ripemd160.update(hashlib.sha256(binascii.unhexlify(self.public_key)).digest())
First hashing the public key using the SHA256 function. Then the result is hashed with the ripemd160 function
First hashing the public key using the SHA256 function. Then the result is hashed with the ripemd160 function

The forth line tells us to add another byte at the beginning of the hashed key.

We're working with the main network so we'll add the byte 0x00
We’re working with the main network so we’ll add the byte 0x00

This is the network ID byte which is used to prevent us from using keys and addresses that were generated in the test network, in the main network (and vice versa). In our example we’re using the main network, so the byte we’ll add will be 0x00.

self.hashed_public_key = b”00″ + binascii.hexlify(ripemd160.digest())

In Bitcoin terminology, the result is the hashed public key. This format is used mostly when creating a transactions.

 

The fifth (and sixth) line tells us to take our hashed public key and hash it again, twice, using the SHA256 function. The first 4 bytes of the result will be the checksum.

self.checksum = binascii.hexlify(hashlib.sha256(hashlib.sha256(binascii.unhexlify(self.hashed_public_key)).digest()).digest()[:4])
The checksum is the first 4 bytes
The checksum is the first 4 bytes

 

The seventh line creates the Bitcoin address in its binary form by appending the hashed public key with the checksum. This is a valid Bitcoin address, but it still need to go through one more process before it can be used with most Bitcoin wallets.

self.binary_addr = binascii.unhexlify(self.hashed_public_key + self.checksum)

 

The last line Finally we’ve reached the end point. There’s only one more thing we need to do before we can get the standard Bitcoin address and that is to convert the binary code of the address into a base58 string. The idea behind this conversion is quite simple. In order to reduce human errors, it was decided that some characters will be omitted from the standard Bitcoin address. characters like capital O, the number 0, lower case l and upper case I, as well as many more characters were omitted.

The final address represented in base 58.
The final address represented in base 58.
self.addr = base58.b58encode(self.binary_addr)
  • You might need to install the base58 module using the command pip install base58.

 

User interface

We’ve also added a tab to our graphical user interface which might help. You can use it to see the public key, hashed public key and Bitcoin address or any given private address.

The user interface for the keys can be found in the second tab
The user interface for the keys can be found in the second tab
Connection part three – Receiving messages

Connection part three – Receiving messages

In the previous posts, all that we’ve done was to construct and send messages to another node on the network. In this post, we’ll see what happens to incoming messages.

First stop – The ReceiverManager:

class ReceiverManager(Thread):
    def __init__(self, sock):
        Thread.__init__(self)
        self.sendingQueue = Utils.globals.sendingQueue
        self.sock = sock
        self.ping = ""

        self.outfile = open("data_received_from_node.txt", 'w')

    def run(self):
        while True:
            try:

                # get only the header's message
                header = self.sock.recv(24)

                if len(header) <= 0:
                    raise Exception("Node disconnected (received 0bit length message)")

                headerStream = BytesIO(header)
                parsedHeader = HeaderParser(headerStream)

                # get the payload
                payload = self.recvall(parsedHeader.payload_size)
                payloadStream = BytesIO(payload)

                self.manager(parsedHeader, payloadStream)

            except Exception as e:
                print(e)
                break

        print("Exit receiver Thread")

The receivermanager always runs in the background, checking our Thread for any incoming packets. Once it receives a packet, it will immediately cut its first 24 bytes.

header = self.sock.recv(24)

The first 24 bytes are the header. If you remember from this post, every Bitcoin message will starts with header, and the header is always exactly 24 bytes long.

 

The first 24 bytes are the header. The rest is the payload
The first 24 bytes are the header. The rest is the payload.

This header is now parsed as a string of bytes and passed to the HeaderParser class in Bitpy/Network/HeaderParser.py

headerStream = BytesIO(header)
parsedHeader = HeaderParser(headerStream)

 

Second stop – The HeaderParser class:

The HeaderParser class takes the first 24 bytes as a long string of bytes, and then it reads them in the same order that we’ve seen before.

Size (Bytes) Name Data type Description
4 Start string char[4]  The network identifier
12 Command name char[12]  The name of the command.
4 Payload size uint32 Len(payload)
4 Checksum char[4]  SHA256(SHA256(payload))[:4]

First 4 bytes for the Start string (or Magic number), another 12 bytes for Command name, the next 4 bytes are the Payload size and the last 4 bytes are the checksum.

4 bytes for starting string. 12 for command name. 4 for payload size and 4 for checksum
4 bytes for starting string. 12 for command name. 4 for payload size and 4 for checksum
class HeaderParser:
    def __init__(self, header):  # Packets is a stream

        self.magic = read_hexa(header.read(4))
        self.command = header.read(12)
        self.payload_size = read_uint32(header.read(4))
        self.checksum = read_hexa(header.read(4))

        self.header_size = 4 + 12 + 4 + 4

    def to_string(self):
        display = "\n-------------HEADER-------------"
        display += "\nMagic:\t %s" % self.magic
        display += "\nCommand name	:\t %s" % self.command
        display += "\nPayload size	:\t %s" % self.payload_size
        display += "\nChecksum	:\t\t %s" % self.checksum
        display += "\nheader Size:\t\t %s" % self.header_size
        display += "\n"
        return display

We’ve also defined the to_string function which basically makes it easier to print a human readable version of the message header.

You might’ve noticed that currently our code just accept the checksum field from the received message without checking it. This is of course a security flaw in our code. The checksum filed is there to help us verify the authenticity of the message. That is one of the ways we can make sure that no one tempered or changed the message on its way from the sender node to our node. But for the time being we’ll assume that the message is indeed authentic and we’ll accept the checksum as is.

 

Third stop – Back to the ReceiverManager:

Now that we have our header, it’s time to get the payload. The size of the payload was defined in the header of the message. We need to cut that amount of bytes from our incoming packets, just as we cut the first 24 bytes of the header. There’s however one extra step in our code. Instead of using the built in sock.recv function (as we did for the header) we’ve decided to implement our own recevall function. The rational was that since we have no way to predetermine the size of the payload, and since the built in sock.recv can’t handle large packets of unknown size, it would be wiser to break the payload into smaller parts and append them together. This has nothing to do with the Bitcoin protocol, it’s only our way to make sure that the code will properly handle large messages.

def recvall(self, length):
    parts = []

    while length > 0:
        part = self.sock.recv(length)
        if not part:
            raise EOFError('socket closed with %d bytes left in this part'.format(length))

        length -= len(part)
        parts.append(part)

    return b''.join(parts)

So now, after we’ve cut the required amount of bytes that represents the payload of our message, and we have both our header (which was already parsed) and our payload (yet to be parsed), we’ll pass them both to the receivermanager manager function.

 

Forth stop – Manager:

    
def manager(self, parsedHeader, payloadStream):

    command = parsedHeader.command.decode("utf-8")
    message = {"timestamp": time.time(), "command": command, "header": parsedHeader.to_string(), "payload": ""}


    if command.startswith('ping'):
        ping = Ping.DecodePing(payloadStream)

        pong = Pong.EncodePong(ping.nonce)
        packet = PacketCreator(pong)
        self.sendingQueue.put(packet.forge_packet())

        message["payload"] = str(ping.nonce)
        self.display(message)

    elif command.startswith('inv'):
        inv = Inv.DecodeInv(payloadStream)
        message["payload"] = inv.get_decoded_info()
        self.display(message)

    elif command.startswith('addr'):
        addr = Addr.DecodeAddr(payloadStream)
        message["payload"] = addr.get_decoded_info()
        self.display(message)

    elif command.startswith('pong'):
        pong = Pong.DecodePong(payloadStream)
        message["payload"] = pong.get_decoded_info()
        self.display(message)

    elif command.startswith('version'):
        version = Version.DecodeVersion(payloadStream)
        message["payload"] = version.get_decoded_info()
        self.display(message)

The manager function does a very simple thing. It checks the command of the message (the command is part of the header) and then it sends the message payload to be parsed by the corresponding functions. For example. If the manager sees that the command is «pong», it will use the decodepong method in Bitpay/Packets/control_messages/pong.py to extract the desire fields out of it. (You can read more about «pong», «ping» and «verack» messages in this post.).

 

Divergence

We have our pared message, both its header and payload. And now we need to decide what to do with them. For some messages this might be the end of the line. There’s nothing more we can do with them. Some might require us to act. «ping» message should be answered by a «pong» message, transactions should be checked and relayed (We’ll talk about transactions in later posts), «version» messages should be acknowledged by sending back a «verack» message.

A major part of learning the Bitcoin protocol is learning how each and every message should be dealt with. Which fields of information it contains and what is the meaning of this information. We’ve already talked about some of the messages in previous posts  (see here for «ping», «pong» and «verack» messages, and here for «version» message.) and as our project will have more features implemented, so we’ll discuss other type of messages and how to deal with them.

User interface

User interface

The best way to make our code really useful is to add user interface to it. There is a lot of this that can only be taught and understood by looking at the code itself, bu there are also many things that the average user can learn about the Bitcoin protocol that can be explained in a more “human friendly” way. An that is way Alexis and I have decided to add a Graphical User Interface (GUI) to our project in hope that in the future we can create a full Bitcoin graphical environment.

We’ve added a new folder to hold our GUI files, the UI folder. This folder contains 3 implementations. The first one uses the Tkinter module, the second uses the pyQt module and the third is a simple command line interface (And this is why we’re calling this folder UI and not GUI). We still haven’t decided on the final implementation bu we believe that we’ll eventually go with only the pyQt implementation, basically because this one looks the best!

 

This is what it looks like at the moment:

Bitcoin graphical environment
The user interface of bitpy.

This is our basic design, At the bottom there’s a list of all the messages you can send: «version», «verack» and «ping»   (More messages will be added soon). The left side keeps a record of all of the incoming messages, in the order in which they were received. Once we choose one of these messages, we can see at the right panel a display of the parsed message. The header and the payload.

Remember! this is just our first draft, but we’re quite proud of it. Little by little our project shows more and more potential.

 

installing dependencies

Because we’re using pyQt5, you might need to install the pyQt5 module on your machine. The easiest way to do so will be to use the commend:

pip install pyqt

 

 

 

Messages part two – Payloads and version message

Messages part two – Payloads and version message

In the previous posts we’ve talked a little bit about messages. We know that a message is nothing more than a string of bytes, it has an header and a body (payload), and it must maintain its predefined format. We’ve seen the format of the header, but every message body (payload) will contain different information, according to the message type. We’ll start by constructing the “version” message.

The version message is used when trying to establish a connection with the remote node. Alice will send the version message to Bob, and only after Bob have approved this version message, and replay with his own version message, only then the connection between the two nodes can be established. No other message will be accepted before both nodes have exchanged this version message. So it’s no surprise that we choose to construct this message first, since this is the first message that we’ll send, and the first one we’ll receive.

 

The fields that are required in our version message (From the developer reference):

 

Size (Bytes) Name Data type Description
4 version int32 What is the latest version of the protocol that the transmitting node (our node) understands. In this example this number is 70012
8 services uint64
Not full node 0x00
Full node 0x01
8 timestamp int64 Current timestamp
8 addr_recv services uint64 What type of services OUR receiving node can support?
16 addr_recv IP address char The IP address of OUR receiving node
2 addr_recv port uint16 The port of OUR receiving node
8 addr_trans services uint64 What type of services OUR transmitting node can support?
16 addr_trans IP address char The IP address of OUR transmitting node
2 addr_trans port uint16 The port of OUR transmitting node
8 nonce uint64 A random number that helps the receiving node to detect and index our connection
Varies user_agent bytes CompactSize This field varies in size, but it tells the other node what should be the size of the next field
Varies user_agent string This field is used to display the name of our node, like licence plates. We can call our node whatever we want, “core”, “classic”, “my_cool_bitcoin_thingy”.
4 start_height int32 The highest block that the transmitting node knows of.
1 relay bool
True The transmitting node can relay messages to the rest of the network
False The transmitting node can’t relay messages to the rest of the network
  • Pay attention that in the “version” message, when asking for both the receiving and the transmitting  services, IP address and ports, we’re asked about our own machine, our own node. In our case both incoming and outgoing messages will be dealt in a similar manner, but some implementations might include more advanced routing.

Code implementation

Alexis and I decided that every message will have it’s own file in which the payload of the message will be both created (For messages that our node will send) and parsed (For incoming messages).

import random
import time
from io import BytesIO
from Utils.config import version_number, latest_known_block
from Utils.dataTypes import *


class EncodeVersion:
    def __init__(self):
        self.command_name = "version"

        self.version = to_int32(version_number)
        self.services = to_uint64(0)
        self.timestamp = to_int64(time.time())

        self.addr_recv_services = to_uint64(0)
        self.addr_recv_ip = to_big_endian_16char("127.0.0.1")
        self.addr_recv_port = to_big_endian_uint16(8333)

        self.addr_trans_services = to_uint64(0)
        self.addr_trans_ip = to_big_endian_16char("127.0.0.1")
        self.addr_trans_port = to_big_endian_uint16(8333)

        self.nonce = to_uint64(random.getrandbits(64))
        self.user_agent_bytes = to_uchar(0)
        self.starting_height = to_int32(latest_known_block)
        self.relay = to_bool(False)

    def forge(self):
        return self.version + self.services + self.timestamp + \
               self.addr_recv_services + self.addr_recv_ip + self.addr_recv_port + \
               self.addr_trans_services + self.addr_trans_ip + self.addr_trans_port + \
               self.nonce + self.user_agent_bytes + self.starting_height + \
               self.relay


class DecodedVersion:
    def __init__(self, payload):
        self.version = read_int32(payload.read(4))
        self.services = read_uint64(payload.read(8))
        self.timestamp = read_int64(payload.read(8))

        self.addr_recv_services = read_uint64(payload.read(8))
        self.addr_recv_ip = parse_ip(payload.read(16))
        self.addr_recv_port = read_big_endian_uint16(payload.read(2))

        self.addr_trans_services = read_uint64(payload.read(8))
        self.addr_trans_ip = parse_ip(payload.read(16))
        self.addr_trans_port = read_big_endian_uint16(payload.read(2))

        self.nonce = read_uint64(payload.read(8))

        self.user_agent_bytes = read_compactSize_uint(BytesIO(payload.read(1)))
        self.user_agent = read_char(payload.read(self.user_agent_bytes), self.user_agent_bytes)

        self.starting_height = read_int32(payload.read(4))
        self.relay = read_bool(payload.read(1))

    def get_decoded_info(self):
        display = "\n-----Version-----"
        display += "\nversion                :\t\t %s" % self.version
        display += "\nservices  	         :\t\t %s" % self.services
        display += "\ntimestamp              :\t\t %s" % self.timestamp

        display += "\naddr_recv_services	 :\t\t %s" % self.addr_recv_services
        display += "\naddr_recv_ip           :\t\t %s" % self.addr_recv_ip
        display += "\naddr_recv_port         :\t\t %s" % self.addr_recv_port

        display += "\naddr_trans_services  	:\t\t %s" % self.addr_trans_services
        display += "\naddr_trans_ip         :\t\t %s" % self.addr_trans_ip
        display += "\naddr_trans_port	    :\t\t %s" % self.addr_trans_port

        display += "\nnonce                 :\t\t %s" % self.nonce

        display += "\nuser_agent_bytes  	:\t\t %s" % self.user_agent_bytes
        display += "\nuser_agent            :\t\t %s" % self.user_agent
        display += "\nstarting_height	    :\t\t %s" % self.starting_height
        display += "\nrelay	                :\t\t %s" % self.relay

        return display

Because this code is used both for incoming and outgoing messages, it has both the class EncodeVersion, which is used to build the payload of the version message, and the class DecodedVersion which is used to parse the payload of any incoming version message.

The function forge will just append and return all the fields in the right order – this is the finale payload.

 

EncodeVersion

Because we haven’t established connection yet, we first need to create the payload of our version message using the EncodeVersion class. The class won’t take any argument (except for self) and will just assign every field with the right value and the right data type.

The variables version_number and last_known_block are imported from Bitpy/Utils/config.py and are set to:

version_number = 70012
latest_known_block = 416419  # june 2016

Our node is not a full node so the services will be set to 0x00. For that reason we’ll also set our relay field to be False.

In our own node, both incoming and outgoing messages will be dealt by the same machine so both receiving and transmitting machines are the same:

self.addr_recv_services = to_uint64(0)
self.addr_recv_ip = to_big_endian_16char("127.0.0.1")
self.addr_recv_port = to_big_endian_uint16(8333)

self.addr_trans_services = to_uint64(0)
self.addr_trans_ip = to_big_endian_16char("127.0.0.1")
self.addr_trans_port = to_big_endian_uint16(8333)

 

We’re using the function random.getrandbits(64) in order to populate or nonce field with 8 bytes long random number.

We’re also not adding any vanity name to our node at the time so we’re setting the user_agent_bytes to be 0. That means that there’s no user_agent_bytes field.

 

 

DecodedVersion

This class is quite straightforward, it receives the payload of the incoming message from Bitpy/Manager/ReceiverManager.py.

It uses the builtin function read  and our data types functions to assign each field with the proper value, for example the first 4 bytes are the version number in uint32 format, the next 8 bytes are the services field in uint64 format and so on.

The only thing that is really unique is the user_agent field:

self.user_agent_bytes = read_compactSize_uint(BytesIO(payload.read(1)))
self.user_agent = read_char(payload.read(self.user_agent_bytes), self.user_agent_bytes)

This is the first example of the varying CompactSize data type in use. The field user_agent_bytes doesn’t have a fixed size. The Bitcoin protocol defines the variable data type CompactSize to deal with such fields (you can read more about this data type in the data types section). We’re using the function BytesIOin in order to send this argument as a string of bytes to theread_CompactSize_unit function and receives back the Uint that matches the size of the next field, the, the user_agent field. Then we’re using the data type function read_char which requires two arguments. The first is the string of bytes itself (payload.read(self.user_agent_bytes)) and the second is the size of the total string (self.user_agent_bytes).

Once we’ve finished parsing out version message we can use the get_decode_info function in order to display the information about the remote node (currently, we aren’t doing anything with this information except to dump it as a text file).

—–Version—–
version : 70012
services : 5
timestamp : 1467293151
addr_recv_services : 1
addr_recv_ip : ��^�V�
addr_recv_port : 30373
addr_trans_services : 5
addr_trans_ip : ��
addr_trans_port : 8333
nonce : 1755461931592560680
user_agent_bytes : 16
user_agent : /Classic:0.12.0/
starting_height : 418653
relay : True

 

Edit (4-Jul-2016): Python 2.5 to 3.5 migration

Please read the general notes about the transition from Python 2.5 to 3.5 over here. And the complete github change log for the migration over here.

The code for the <Version> message remind fairly untouched, only few adjustments were required:
class EncodeVersion:
///
self.timestamp = to_int64(int(time.time()))
///
self.addr_recv_ip = to_big_endian_16char(b“127.0.0.1”)
///
self.addr_trans_ip = to_big_endian_16char(b“127.0.0.1”)
///
class DecodedVersion:
///
self.user_agent = read_chars(payload.read(self.user_agent_bytes), self.user_agent_bytes)
///

 

 

Messages part one – constructing the message and headers

Messages part one – constructing the message and headers

Communication in the Bitcoin network is done via messages.  A message is no more than just a string of bytes.

This is an example of a simple “ping” message:

f9beb4d970696e670000000000000000080000001b3cb220309941550a2ffd5c # The full message. Header + Payload

The Bitcoin protocol describes how each message should be packet. It is highly important to make sure that you construct the message in the right format. The machine on the other side will expect to receive a very specific format and it won’t be able to process any message that won’t fit this format.

Every message is made of two main components:

  1. Header (not to be confused with block headers! we’ll talk about them in a later post!)
  2. Payload

The payload is the body of the message – The message itself. Many types of messages are defined in the Bitcoin protocol, and we’ll talk about each message later on, but for now we should take a look at the header and that’s because the header is the one thing ALL of the messages have in common.

Each header is made of four components:

Size (Bytes) Name Data type Description
4 Start string char[4]  The network identifier
12 Command name char[12]  The name of the command.
4 Payload size uint32 Len(payload)
4 Checksum char[4]  SHA256(SHA256(payload))[:4]

Starting-string

The first 4 bytes of the message are the Starting-string (or Magic number).

f9beb4d9 # 4 bytes starting string (Magic number)

This number tells the receiving machine which network I’m using. In this tutorial we’re using the real main-network (Caution! any mistake made in the real main network might cost you real Bitcoins!). The Magic number of the main-network is 0xf9beb4d9. We can look at the “ping” message example at the top and see for yourself that the first four bytes in the message are indeed f9beb4d9. The receiving machine will first check this Magic number to make sure that it is receiving messages from the network it’s currently ruining, and only then it will start processing the rest of the message.

 

Command name

The next 12 bytes are the Command name.

70696e670000000000000000 # 12 bytes command name

Each header should contain the name of the command, or type of message that is contained in the payload (body) of the message. In our case, the message is a “ping” message. The receiving machine will use this information to know how to parse and treat this message. In our example the receiving machine will answer with a “pong” message (but only after it will complete validating the message). Pay attention that the command field should be exactly 12 bytes long. That means that if the command name is shorter than 12 bytes, it needs to be padded with nulls. We can see that the first 4 bytes makes the word “ping” in hexadecimals  (70-P, 69-I, 6e-N, 67-G) and another 8 bytes are nulls (00). In our code implementation (see below) we’re using the function command_padding to pad our command name to be 12 bytes.

def command_padding(self, command):  # The message command should be padded to be 12 bytes long.

command += (12 – len(command)) * “\00” return command

 

Size

Another 4 bytes will contain the size of the payload.

08000000 # 4 bytes size of the payload

This field is very straightforward. We just need to insert the size of the payload (body) of our message. In our case it’s simply 8 bytes long. (Once again, we need to make sure it will be exactly 4 bytes longs. luckily we’ve predefined this data type in our Bitpy/Utils/dataTypes.py file under to_uint32(v). So we don’t have to manually insert the extra 3 null bytes).

 

Checksum

The last part of the header is the checksum.

1b3cb220 # 4 bytes checksum

It might seem somewhat strange at the beginning, but all that we need to do is to take the payload of our message, perform the cryptographic function SHA256 twice on that message, and then append the last 4 bytes of the result to our header. SHA256(SHA256(payload))[:4]

def get_checksum(self):
   check = hashlib.sha256(hashlib.sha256(self.payload).digest()).digest()[:4]
   return check

 

 

Payload

And now all that we left with is the payload – the body of the message . Each message will be parsed differently.

309941550a2ffd5c # The payload, the body of our "ping" message.

 

The code implementation

In our code we need to deal with both incoming and outgoing messages. So we’ve split our code to two files:

  1. Bitpy/Packets/HeaderParser.py – to parse the headers of the incoming messages (And after the header was properly parsed, to use the 12 bytes command properly to determine what other steps are required).
  2.  Bitpy/Packets/PacketCreator.py – to build the header of our outgoing message and to pre-fixed it to the payload of the message.

Bitpy/Packets/HeaderParser.py for incoming messages

class HeaderParser:
    def __init__(self, block):  # Packets is a stream

        self.magic = read_hexa(block.read(4))
        self.command = block.read(12)
        self.payload_size = read_uint32(block.read(4))
        self.checksum = hash_to_string(block.read(4))

        self.header_size = 4 + 12 + 4 + 4

    def to_string(self):
        display = "\n-------------HEADER-------------"
        display += "\nMagic:\t %s" % self.magic
        display += "\nCommand name	:\t %s" % self.command
        display += "\nPayload size	:\t %s" % self.payload_size
        display += "\nChecksum	:\t\t %s" % self.checksum
        display += "\nheader Size:\t\t %s" % self.header_size
        display += "\n"
        return display

Bitpy/Packets/PacketCreator.py for outgoing messages

class PacketCreator:
    def __init__(self, payload):
        self.payload = payload.forge()  # The message payload forged

        # create the header
        self.magic = to_hexa(
            "F9BEB4D9")  # The Magic number of the Main network -> This message will be accepted by the main network
        self.command = self.command_padding(payload.command_name)
        self.length = to_uint32(len(self.payload))
        self.checksum = self.get_checksum()

    def command_padding(self, command):  # The message command should be padded to be 12 bytes long.
        command += (12 - len(command)) * "\00"
        return command

    def get_checksum(self):
        check = hashlib.sha256(hashlib.sha256(self.payload).digest()).digest()[:4]
        return check

    def forge_header(self):
        return self.magic + self.command + self.length + self.checksum

    def forge_packet(self):
        return self.forge_header() + self.payload

 

Edit (4-Jul-2016): Python 2.5 to 3.5 migration

Please read the general notes about the transition from Python 2.5 to 3.5 over here. And the complete github change log for the migration over here.

The command_padding function was rewrite in order to take full advantage of Python 3.5 capacity and we’re now using the built in function ljust.

def command_padding(self, cmd):
    command = str(cmd)
    command = command.ljust(12, '\00')
    return str.encode(command)

 

Connection part one – Finding a node and packets routing

Connection part one – Finding a node and packets routing

The first thing we need our code to do is to connect to the Bitcoin network. this is relatively straightforward process, we just need to find one node in the network and establish connection  with that node. A list of few of the active nodes can be easily found online. We’ve randomly picked one node from this list on blockchain.info .

We’re using the socket module to establish our connection using this simple code:

import socket
import sys

HOST = "66.90.137.89"
PORT = 8333

"""
    We will use this file to connect to one node
    But in the future we will connect to more than one
"""

def connect():
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

    try:
        sock.connect((HOST, PORT ))
    except Exception as e:
        print e
        sys.exit(0)

    return sock

 

we’ve also created main.py at the root of our folder structure (right under Bitpy/). This will initialize the connection code upon startup and will route our incoming and outgoing packets to ReceiverManager  and SenderManager respectively.  The queue module helps us to make sure that the packets are being processed in the right order. We’ll later  see what each file does, but for now, what is important to understand is:

  1. We’re connecting to another node on the network.
  2. We’ve found the address of this node on a public list at blockchain.info.
  3. The connection code is stored at Network/connection.py
  4. We’ve created a Main file under our root directory (Bitpy/Main.py) that will initialize the connection to the node, and will route our incoming and outgoing packets to one of the two queues  files SenderManager.py (for outgoing packets) and ReceiverManager.py (for incoming packets). Both files can be found under Manager/.
  5. The user manually specify which packet (message) he wants to send using the core_manager. We’ll talk about it later on when we’ll be dealing with the user interface.

 

So now we should have a look at our Receiver/Sender Managers, but our ReceiverManager is a bit too complex for this stage, so we’ll talk about it later, once we’re ready to talk about parsing incoming messages. For now, we’ll only have a look at our SenderManager.

The first thing we did was to use the threading module. This module allows us to keep our connection asynchronous, that means that we can receive and send messages at the same time. Apart from this threading module this file contains only one more class – SendingManager. Once this class is defined, it will have access to our thread, it will be able to use or sock object (declared in connection.py) to connect to the remote node and it will also receive the packets queue from the main.py file.

 

from threading import Thread


class SenderManager(Thread):

    def __init__(self,sock, queue):
        Thread.__init__(self)
        self.sock = sock
        self.queue = queue

    def run(self):
        while True:
            if not self.queue.empty():
                order = self.queue.get()
                self.sock.sendall(order)

        print "Exit sender Thread"

 

So the main.py file gets a list of packets (messages) from the user which he wishes to send. (The user creates the packets in the core_manager.py file). The packets are stored in a queue, and a SenderManager object is then created. It gets access to the sock object, the thread, and the queue , then it will simply send the packets in their order, as specified in the queue, one by one, to the ip address and port of the sock, while making sure that the connection remains asynchronous.

 

Before we can start sending and receiving messages, we first need to learn about messages.

 

 

Data types

Data types

Date types

When reading the Bitcoin developer reference, it becomes immediately clear that the Bitcoin protocol requires the user to work with only specific Bitcoin data types. You can’t just insert numbers as int and expect it to work. Each field, of every packet that is send or received by our node needs to be properly formatted.

Let’s have a look at the version message documentation in the Bitcoin developer reference.
We can see that the first field should contain the protocol version number (currently 70012). But we can’t just send the number as-is, it’s specifically stated that the number should be 4 Bytes, int_32 type. And we can also see that any variable, any piece of information that is either received or send will be formatted in the predefined manner that was specified in the Bitcoin protocol documentation. Luckily, Python have the struct module that allows us to easily predefine our data type. In our example we want to pack the number “70012” into a 4 bytes int (remember, 32 bits is 4 bytes) with the variable name “version”.
So using the struct module in our code should look like this:

import struct

version = struct.pack("i", 70012)

The i in the code represents 4 bytes integer. For a complete lists of characters and their meaning, have a look at the following table in the struct module documentation.

This is quite a simple process, just look at the Bitcoin documentation to find out how each variable should be parsed, and then head to the struct module documentation to find the corresponding character. But once done again and again for each an every variable, it will surely cause our code to get out of control and errors are a sure thing. So Alexis suggested that we’ll predefine all of the data types that are required in one file. Now, instead of using the previous code for our version variable, we can just use the predefined function to_int32(v):

import struct

def to_int32(v):
    return struct.pack("i", v)

version = to_int32(70012)

We’ve also added a read_int32, which allows us to easily get back our variable.

import struct

def to_int32(v):
    return struct.pack("i", v)

version = to_int32(70012) # The number 70012 is now packed.

print version # Unreadable


def read_int32(v):
    return struct.unpack("i", v)[0]


print read_int32(version) # The number 70012 is readable again
 

Most of the data types were easy to define, but the Bitcoin protocol has one special type of data type which is called compactSize_uint.
In this data type, every number higher than 252 will have a prefix that will indicate the length of the number. This type of data type is mostly used for variables of changing length.

import struct

def to_compactSize_uint(v):
    if 0xfd > v:
        return struct.pack("<B", v)
     elif 0xffff > v:
        return "FD".decode("hex") + struct.pack("<H", v)
     elif 0xffffffff > v:
        return "FE".decode("hex") + struct.pack("<I", v)
    else:
        return "FF".decode("hex") + struct.pack("<Q", v)



def read_compactSize_uint(s):  # S is a stream of bytes

    # Read an unsigned char to get the format
    size = ord(s.read(1))

    # Return the value
    if size < 0xFD:
        return size
    if size == 0xFD:
        return read_uint16(s.read(2))
    if size == 0xFE:
        return read_uint32(s.read(4))
    if size == 0xFF:
        return read_uint64(s.read(8))

The parse_ip bug

We’ve also tried to built a parse_ip function to properly displaying IP addresses. But unfortunately we’ve came across when using Windows. You can read more about our attempts to deal with the bug at our trello board

Edit (4-Jul-2016): Python 2.5 to 3.5 migration

Please read the general notes about the transition from Python 2.5 to 3.5 over here. And the complete github change log for the migration over here.

Most of the data types function have remained unchanged. With the exceptions of:

The functions that dealt with reading and writing charterers were replaced by two function: to_chars and read_chars.

def to_chars(v, length=-1):
    if length == -1:
        length = len(v)
return struct.pack(">%ss" % length, v)

def read_chars(v, length= -1):
     if length == -1:
         length = len(v)
         return struct.unpack(">%ss" % length, v)[0]

These new functions can accept a specific variable size (length) If now length is inserted, it will calculate the size of the string automatically. This allows us to deals with strings of varies sizes.

The parse_ip function was fixed and replaced by the following code:

def parse_ip(ip):
    IPV4_COMPAT = b"\x00" * 10 + b"\xff" * 2

    # IPv4
    if ip[0:12] == IPV4_COMPAT:
        ip = read_hexa(ip[12:])# we remove the first 10 "\x00" an 2 "\xff , and convert bytes to hexa
        ip = "%i.%i.%i.%i" % (int(ip[0:2], 16), int(ip[2:4], 16), int(ip[4:6], 16), int(ip[6:8], 16))

    # IPv6
    else:
        # TODO
        pass

    return ip

We’ve also added a two more functions for encoding and decoding hexadecimals:

def to_hexa(v):
    return bytes.fromhex(v)

def read_hexa(v):
    return v.hex()