How to be blockchain compatible. part one

Tl;dr: Don’t try to turn your existing organization into a “decentralized, blockchain-based” one. Instead, look at the cases in which your organization might implement blockchain-based solutions, and start making the necessary in-house changes to allow future integration.

Seeing how new technologies are changing our world, and changing it fast, makes many entrepreneurs twist and itch. It’s in the blood of most of them to try to incorporate a new technology as soon as possible. However, most fail to properly understand what the blockchain is (it’s more than just a chain of blocks) and how it can affect their organization. Many want to completely redesign their business model and organizational architecture in order to migrate entirely to the blockchain.
I find this approach counterproductive (and in most cases, outright ludicrous). Instead, I try to help them consider the needs of their business, map their current application and its architecture, and educate themselves on what the blockchain really is and what it isn’t. Then we can look at ways in which existing organizations might make their systems more compatible with current blockchains. That way these organizations both go through an internal (and valuable!) process of learning how to work with the blockchain and create a list of cases in which the blockchain might be beneficial to them.

In the next few posts, I’ll expand on it and try to give a general review of how a business owner might start remodeling their systems to be blockchain compatible.

The Database. Part one.

Many consider the blockchain’s storage features to be similar to those of legacy database systems. However, the blockchain differs from such architectures in many aspects. In this article, I’ll try to give a general review of what legacy databases and the blockchain have in common and, of course, of what separates them from each other.

  1. Writing and reading are far more expensive than in legacy systems
  2. The data structure is append-only, which means that data cannot be deleted or updated
  3. Most current languages cannot work with the blockchain
  4. A substantial amount of the information stored on the blockchain is irrelevant for most users

Study case

Let’s consider the following case. A social network app wants to migrate its database onto the blockchain. That social app probably contains a multitude of data entries. It might be using a relational database (most commonly SQL) or a non-relational database (NoSQL).

If our app uses a relational database, it might look something like this:

Table – users:

User id | User name         | User email                  | Articles published by user (article ids)
1       | Shlomi Zeltsinger | Shlomi.zeltsinger@gmail.com | [1, 5]
2       | John Doe          | NoOne@gmail.com             | [2, 3, 4, 6]

Table – articles:

Article id | Created by user id | Body               | Title
1          | 1                  | Story number one   | Good title
2          | 2                  | Story number two   | Better title
3          | 2                  | Story number three | Best title
4          | 2                  | Story number four  | Excellent title
5          | 1                  | Story number five  | Stolen title
6          | 2                  | Story number six   | Bad title

In the case of NoSQL database, we will have the following structure:
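The structure itself is not shown here, so the following is only a hypothetical sketch of what a document-style layout could look like in Python, assuming each user document embeds its own articles. The field names are assumptions made for illustration.

```python
import json

# Hypothetical document-style layout: each user embeds their own articles.
users = [
    {
        "user_id": 1,
        "user_name": "Shlomi Zeltsinger",
        "user_email": "Shlomi.zeltsinger@gmail.com",
        "articles": [
            {"article_id": 1, "body": "Story number one", "title": "Good title"},
            {"article_id": 5, "body": "Story number five", "title": "Stolen title"},
        ],
    },
    {
        "user_id": 2,
        "user_name": "John Doe",
        "user_email": "NoOne@gmail.com",
        "articles": [
            {"article_id": 2, "body": "Story number two", "title": "Better title"},
            {"article_id": 3, "body": "Story number three", "title": "Best title"},
            {"article_id": 4, "body": "Story number four", "title": "Excellent title"},
            {"article_id": 6, "body": "Story number six", "title": "Bad title"},
        ],
    },
]


def article_count(user):
    """Number of articles embedded in a single user document."""
    return len(user["articles"])


# Print the first user document as JSON, the way a document store might hold it.
print(json.dumps(users[0], indent=2))
```

With this layout there is no separate articles table to keep in sync: the list of articles lives inside the user object.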

Whenever our social app wants to update its legacy database, it uses a TRANSACTION.

START TRANSACTION;
      INSERT INTO articles VALUES (
                  7,                     -- Article id
                  1,                     -- User id
                  'story number seven',  -- Body
                  'title number seven'   -- Title
      );
COMMIT;

Transactions in this sense have the following four characteristics:

Atomicity − ensures that all operations within the work unit are completed successfully. Otherwise, the transaction is aborted at the point of failure and all the previous operations are rolled back to their former state.

Consistency − ensures that the database properly changes states upon a successfully committed transaction.

Isolation − enables transactions to operate independently of and transparent to each other.

Durability − ensures that the result or effect of a committed transaction persists in case of a system failure.

(from https://www.tutorialspoint.com/sql/sql-transactions.htm)
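Atomicity can be seen in a small sketch that uses Python’s built-in sqlite3 module as a stand-in for the app’s legacy database. The table layout and the helper names (add_articles, article_count) are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles ("
    "article_id INTEGER PRIMARY KEY, user_id INTEGER, body TEXT, title TEXT)"
)
conn.execute("INSERT INTO articles VALUES (1, 1, 'Story number one', 'Good title')")
conn.commit()


def add_articles(conn, rows):
    """Insert all rows in one transaction; on any failure, none are kept."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.executemany("INSERT INTO articles VALUES (?, ?, ?, ?)", rows)
        return True
    except sqlite3.IntegrityError:
        return False


def article_count(conn):
    return conn.execute("SELECT COUNT(*) FROM articles").fetchone()[0]


# The second row reuses article id 1, so the whole transaction is rolled back,
# including the perfectly valid article number seven.
first_try = add_articles(conn, [
    (7, 1, "story number seven", "title number seven"),
    (1, 2, "duplicate id", "rejected"),
])
count_after_failure = article_count(conn)

# Submitting only the valid row succeeds.
second_try = add_articles(conn, [(7, 1, "story number seven", "title number seven")])
count_after_success = article_count(conn)
```

Either every operation in the work unit is applied, or none of them is.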

Now let’s see what a transaction means in the context of the blockchain. First of all, it does possess the characteristics described above. Each transaction is atomic: either it was valid and accepted into a block, or it wasn’t. There’s no gray area. Each transaction is also isolated from the others, at least in systems like the current Bitcoin and Ethereum blockchains (some advanced research is being done on new architectures that might change this property a little). Also, once a transaction has been accepted into the blockchain, its effect is durable: even if my own personal server goes down, the result of that action is still available. When it comes to consistency, transactions that are transmitted to the Bitcoin/Ethereum network are verified using various tools (scripting language, cryptography, the EVM) that force a consistent result across the network.

So, if all four characteristics are the same, then what’s the difference?

Expensive

Some of you might have looked at the above example, the one in which we’re adding article number seven, and asked yourselves “what is going to happen to the users table”? We specifically asked to update the articles table, but now the users table should also reflect the change (user id 1 now has 3 articles: [1, 5, 7]). To update this field, we’ll either add an UPDATE command to our TRANSACTION code:

START TRANSACTION;
      UPDATE users
      SET articles = '[1, 5, 7]'  -- list of article ids, stored here as text
      WHERE user_id = 1;
COMMIT;

Or transmit this UPDATE command as a second transaction. In both cases, we’re adding extra work for our database, both in computational cycles and in physical storage space (now we need to store a third article id in our database). But in the case of Bitcoin and Ethereum, we’re using the blockchain to store the information, and we’re asking the other nodes on the network to validate our transactions constantly. This has a much higher price than doing so locally.

How high? Well, in the case of Ethereum, every 256-bit word that is stored requires 20,000 gas units (a 256-bit word, i.e. 32 bytes, is the minimum unit of storage). At a current gas price of 0.00000005 Ether per gas unit, that amounts to 20,000 * 0.00000005 = 0.001 Ether per 32-byte word.

One kilobyte holds about 31 such words (1,000 / 32 ≈ 31), so storing one kilobyte will cost you 0.001 Ether * 31 = 0.031 Ether.

At the current market price of 350 USD per Ether, that means that one kilobyte of data costs about 10.85 USD. That’s 10,850,000 USD for one gigabyte.
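The arithmetic above can be reproduced with a short sketch. The gas cost, gas price and exchange rate are the figures quoted in this post, not live values. The function names are hypothetical, and the sketch ignores every other transaction cost.

```python
from fractions import Fraction

GAS_PER_WORD = 20_000                       # gas to store one 256-bit word
WORD_BYTES = 32                             # 256 bits
GAS_PRICE_ETH = Fraction(5, 100_000_000)    # 0.00000005 Ether per gas unit
ETH_PRICE_USD = 350                         # market price quoted in the post


def storage_cost_eth(num_bytes):
    """Ether needed to store num_bytes, rounded up to whole 32-byte words."""
    words = -(-num_bytes // WORD_BYTES)  # ceiling division
    return words * GAS_PER_WORD * GAS_PRICE_ETH


def storage_cost_usd(num_bytes):
    return storage_cost_eth(num_bytes) * ETH_PRICE_USD


print(float(storage_cost_eth(32)))      # one word
print(float(storage_cost_usd(1024)))    # one kilobyte (1,024 bytes = 32 words)
```

Because this sketch rounds up to whole words and uses 1,024 bytes per kilobyte, it counts 32 words rather than 31, so its result comes out slightly higher than the estimate above.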

Bitcoin prices aren’t much more competitive. What is worse, a standard Bitcoin transaction can carry only a few dozen bytes of added information (the standard OP_RETURN data limit started at 40 bytes and was later raised to 80), so our articles will be very short.

This raises the questions:

  • What type of information should be stored on the blockchain? Would it be more cost-effective to reconstruct some of it on demand? (For example, whenever someone looks for the number of articles that were published by user number 1, the server consults its own local copy of the blockchain and counts only the articles that were written by that user.)
  • Can we index our data in such a way that two modifications can be made efficiently at the same time?

Variables state

The truth is that the information that is stored on the blockchain doesn’t look at all like the table shown above. The blockchain maintains a list of its valid transactions (in the case of both Bitcoin and Ethereum) and in the case of Ethereum only, a tree of states in which indexed variables are stored.

TX 1 Add user (1, “Shlomi Zeltsinger”, Shlomi.zeltsinger@gmail.com)
TX 2 Add article (1, 1, “story number one”, “Good title”)
TX 3 Add user (2, “John Doe”, NoOne@gmail.com)
TX 4 Add article (2, 2, “story number two”, “Better title”)
TX 5 Add article (3, 2, “story number three”, “Best title”)

The fact that the blockchain is a list of transactions and not a real schema-based (tables/objects) database has some interesting implications. First, to turn this list of transactions into something of value to us, we’ll need to parse the blockchain and reconstruct our database from scratch.

Someone would need to sit with a pen and paper, read through the list of the transactions and write down the desired information.

For example, if we’re looking for all the articles that were created by John Doe, that someone will first have to look for a transaction that begins with “Add user” and contains the name “John Doe” (TX 3). Once the right transaction is found, that someone will write down John Doe’s user id (2) and then look for all the transactions that begin with “Add article”, counting only the ones whose user id matches John Doe’s (2). Tedious work, no doubt.
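The pen-and-paper procedure can be sketched in a few lines of Python. The tuple layout of each transaction and the function name are assumptions made for illustration. A real blockchain stores raw transactions, not tidy tuples like these.

```python
# Each entry mimics one transaction from the list above:
# ("add_user", user_id, name, email) or ("add_article", article_id, user_id, body, title)
TX_LOG = [
    ("add_user", 1, "Shlomi Zeltsinger", "Shlomi.zeltsinger@gmail.com"),
    ("add_article", 1, 1, "story number one", "Good title"),
    ("add_user", 2, "John Doe", "NoOne@gmail.com"),
    ("add_article", 2, 2, "story number two", "Better title"),
    ("add_article", 3, 2, "story number three", "Best title"),
]


def articles_by_user_name(log, name):
    """Scan the whole log twice: once for the user id, once for the articles."""
    user_id = None
    for tx in log:
        if tx[0] == "add_user" and tx[2] == name:
            user_id = tx[1]
            break
    if user_id is None:
        return []
    return [tx[1] for tx in log if tx[0] == "add_article" and tx[2] == user_id]


print(articles_by_user_name(TX_LOG, "John Doe"))  # article ids 2 and 3
```

Every query walks the full list, which is exactly why reading from a plain transaction log gets expensive as the log grows.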

In the case of Ethereum we’re given an extra tool to help us a little: instead of looking at one transaction after the other, we can look at the “state tree” (sometimes “state trie”) of each block. In Ethereum, once we’ve paid the high storage fee, each variable is indexed. Whenever a change is made to that variable, a new state is added to the “state tree”. That means that by knowing the index of a variable, I can look up its current value directly.

This raises the questions:

  • Can we construct our transactions in a manner that will make it easier to read through?
  • Is it possible to “point” from one transaction to the other?

Table – users:

User id | User name         | User email                  | Articles published by user
TX 1    | Shlomi Zeltsinger | Shlomi.zeltsinger@gmail.com | [TX 2, TX 7]
TX 3    | John Doe          | NoOne@gmail.com             | [TX 4, TX 5, TX 6, TX 8]

Table – articles:

Article id | Created by user id | Body               | Title
TX 2       | TX 1               | Story number one   | Good title
TX 4       | TX 3               | Story number two   | Better title
TX 5       | TX 3               | Story number three | Best title
TX 6       | TX 3               | Story number four  | Excellent title
TX 7       | TX 1               | Story number five  | Stolen title
TX 8       | TX 3               | Story number six   | Bad title

(Maybe instead of a “User id” field to identify the user, we’d do better to point to the transaction in which the user was created?)

  • If we’re storing the information on the Ethereum network, perhaps it will be easier to treat it as objects (NoSQL) rather than as information that should be stored in relational tables?
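The idea of pointing from one transaction to another can be sketched as follows, with each article carrying the id of the transaction that created its author. The dictionary layout and the function names are assumptions made for illustration.

```python
# Transaction ids double as record ids; articles point at the user's creating TX.
TXS = {
    "TX 1": {"kind": "add_user", "name": "Shlomi Zeltsinger"},
    "TX 2": {"kind": "add_article", "user_tx": "TX 1", "title": "Good title"},
    "TX 3": {"kind": "add_user", "name": "John Doe"},
    "TX 4": {"kind": "add_article", "user_tx": "TX 3", "title": "Better title"},
    "TX 5": {"kind": "add_article", "user_tx": "TX 3", "title": "Best title"},
    "TX 6": {"kind": "add_article", "user_tx": "TX 3", "title": "Excellent title"},
    "TX 7": {"kind": "add_article", "user_tx": "TX 1", "title": "Stolen title"},
    "TX 8": {"kind": "add_article", "user_tx": "TX 3", "title": "Bad title"},
}


def articles_pointing_to(txs, user_tx):
    """Ids of all article transactions that point at the given user transaction."""
    return [
        tx_id
        for tx_id, tx in txs.items()
        if tx["kind"] == "add_article" and tx["user_tx"] == user_tx
    ]


def author_of(txs, article_tx):
    """Follow the pointer from an article back to the user who created it."""
    return txs[txs[article_tx]["user_tx"]]["name"]


print(articles_pointing_to(TXS, "TX 3"))
print(author_of(TXS, "TX 7"))
```

Following a pointer from an article to its author is a direct lookup, while the reverse direction still needs a scan unless we maintain an index ourselves.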

Append only

We just saw that both Ethereum and Bitcoin maintain a list of ALL of their transactions and that we can reconstruct our database by following each transaction, one after the other. As we know, once transactions make it onto the blockchain, they stay there. That means that I have no way to change (UPDATE) information that was already stored. However, by properly constructing my transaction, I can add a note for future users recording a distinct change. For example, I can create a transaction for changing the user’s email address:

TX 1 Add user (“Shlomi Zeltsinger”, Shlomi.zeltsinger@gmail.com)
TX 2 Add article ( TX 1, “story number one”, “Good title”)
TX 3 Add user (“John Doe”, NoOne@gmail.com)
TX 4 Add article (TX 3, “story number two”, “Better title”)
TX 5 Add article (TX 3, “story number three”, “Best title”)

...

TX 500 Update user (TX 1, email: newMail@gmail.com )

Pay attention that now, if I want to find all the articles that were created by the user with the email newMail@gmail.com, just iterating through the “Add user” transactions simply won’t do. I will also be required to look for the UPDATE transactions.

This is a great example of a case in which the Ethereum state tree is far superior, because once the user is indexed, I can check his or her email by checking the state of the proper variable.
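The difference between the two approaches can be sketched by replaying an append-only log into a key-value state. A plain Python dictionary stands in for the state tree here. This is an assumption made for illustration and not how Ethereum actually implements its state trie, and the transaction layout is hypothetical as well.

```python
# Append-only log: nothing is ever edited, changes arrive as new transactions.
LOG = [
    ("TX 1", ("add_user", "Shlomi Zeltsinger", "Shlomi.zeltsinger@gmail.com")),
    ("TX 3", ("add_user", "John Doe", "NoOne@gmail.com")),
    ("TX 500", ("update_user", "TX 1", "email", "newMail@gmail.com")),
]


def apply_tx(state, tx_id, tx):
    """Apply one transaction to the current state (the log itself is untouched)."""
    if tx[0] == "add_user":
        _, name, email = tx
        state[tx_id] = {"name": name, "email": email}
    elif tx[0] == "update_user":
        _, user_tx, field, value = tx
        state[user_tx] = {**state[user_tx], field: value}
    return state


def replay(log):
    """Rebuild the latest state by applying every transaction in order."""
    state = {}
    for tx_id, tx in log:
        apply_tx(state, tx_id, tx)
    return state


state = replay(LOG)
print(state["TX 1"]["email"])  # the email after TX 500 was applied
```

Once the state has been built, finding the user’s current email is a single lookup by key, with no need to search the log for later UPDATE transactions.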

This raises the questions:

  • What can we do when a node isn’t fully synced? There might be some changes to the data we hold that we’re not yet aware of.
  • If Shlomi Zeltsinger adds a new article, should this article point to TX 1 or to the newer TX 500?
  • Is all the data that is stored on the blockchain relevant for us? How can we quickly identify the pieces of information that are relevant?

The blockchain probably isn’t the place to store your data. Yet.

As you can see, there are some major issues that need to be considered and solved before you can move your data to the blockchain. My recommendation for those who wish to harness its power is to first try to mimic some of its architectural principles in-house. Prepare your database in terms of both design and structure, and make sure that the architectural differences (and the reasoning behind them) are clear to you and your team. That way you’ll be in a much better position to start slowly integrating some of your business with either Bitcoin or Ethereum (or both) in a way that is truly advantageous.

In the next parts, I’ll be talking about some specific code and architectural tricks that will help us implement some of the principles mentioned above.

Scripts and stacks

Personal note: many things happened in the past two months that required my full attention. I hope to resume a steady flow of posts in the coming days.

Review

In the last post we talked about one of the biggest bitcoin misconceptions: the idea that a transaction actually moves coins from one wallet to another. The truth is that transactions are nothing more than statements. These statements always point to a previous statement (which in turn points to an even older statement, and so on), and usually they also specify an amount of coins that the current owner wishes to transfer. The statement also contains a riddle, or an equation that needs to be solved, and in most cases solving it requires the use of the private_key that is associated with the recipient’s bitcoin address.

Pay attention: even though Bob will be required to use his own private_key to prove that he can indeed solve this problem, the private_key still won’t be revealed to anyone.

 

Now let’s look for a second at this transaction message. We’ve already learned how to create a bitcoin message (see this section about Version message and this one about headers). We just need to make sure that all of the fields are filled in accordance with the protocol rules, just like filling in a form. You can find a complete list of the fields that need to be filled in the bitcoin developers documentation.

 

 

Most of the fields are quite straightforward. I might still create another post in the future with detailed instructions on how to fill in all the fields, but this isn’t really the topic of this post. This post deals with one of bitcoin’s more fascinating aspects: the riddle that Alice places in her statement, the riddle that only Bob can solve. The script.

 

(You just can’t wait to create your own transaction? You’re more than welcome to watch my videos on creating bitcoin transactions.)

Scripts, what is it?

Script is a computer language. In more detail, it’s a set of predefined words that are agreed upon. Every node that follows the rules specified in the bitcoin protocol knows how to read, interpret and implement these words. Because bitcoin messages are basically nothing more than a string of bytes, these words are not written in plain English; rather, they are translated to OP_CODEs. That way, we can send our message as a string of bytes, and the receiving node will know that these bytes represent instructions. (Important note: the receiving node will treat these bytes as instructions only if they appear inside one of the script fields.)

Here are a select few:

Word      | Opcode | Hex       | Input     | Output       | Description
OP_1ADD   | 139    | 0x8b      | in        | out          | 1 is added to the input.
OP_1SUB   | 140    | 0x8c      | in        | out          | 1 is subtracted from the input.
N/A       | 1-75   | 0x01-0x4b | (special) | data         | The next N bytes (N being the opcode value) are data to be pushed onto the stack.
OP_MIN    | 163    | 0xa3      | a b       | out          | Returns the smaller of a and b.
OP_SHA256 | 168    | 0xa8      | in        | hash         | The input is hashed using SHA-256.
OP_EQUAL  | 135    | 0x87      | x1 x2     | True / false | Returns 1 if the inputs are exactly equal, 0 otherwise.

The original list included around 200 of these words, but currently most nodes support only a few dozen of them. Using these few words, we can create many “riddles”, that is, state many conditions for claiming the coins in our transaction message.

For example, I can add the following string of bytes as my script (each 0x01 is a push opcode, followed by the single data byte that it pushes):

0x01 0x01 0x8b 0x01 0x02 0x87

<1> <OP_1ADD> <2> <OP_EQUAL>

  1. It will take the number 1.
  2. Use the OP_CODE OP_1ADD to add 1 to it -> The output of this OP_CODE will be 2.
  3. It will take the number 2.
  4. Use the OP_CODE OP_EQUAL to check whether the result is equal to 2 -> The output of this OP_CODE will be True.
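To see how a node might turn a string of bytes into the words listed in the table, here is a minimal parsing sketch. It assumes the byte encoding 0x01 0x01 0x8b 0x01 0x02 0x87 for <1> <OP_1ADD> <2> <OP_EQUAL>, where each 0x01 push opcode is followed by one data byte. It only knows the handful of opcodes from the table, and the names OPCODE_NAMES and parse_script are hypothetical.

```python
# Only the words from the table above; a real node knows many more.
OPCODE_NAMES = {
    0x8b: "OP_1ADD",
    0x8c: "OP_1SUB",
    0xa3: "OP_MIN",
    0xa8: "OP_SHA256",
    0x87: "OP_EQUAL",
}


def parse_script(raw):
    """Split raw script bytes into data pushes (bytes) and opcode names (str)."""
    tokens = []
    i = 0
    while i < len(raw):
        op = raw[i]
        i += 1
        if 0x01 <= op <= 0x4b:
            # Push opcode: the next `op` bytes are data, not instructions.
            data = raw[i:i + op]
            if len(data) != op:
                raise ValueError("push runs past the end of the script")
            tokens.append(data)
            i += op
        elif op in OPCODE_NAMES:
            tokens.append(OPCODE_NAMES[op])
        else:
            raise ValueError(f"opcode 0x{op:02x} is not covered by this sketch")
    return tokens


tokens = parse_script(bytes.fromhex("01018b010287"))
print(tokens)
```

The parser has to consume pushed data bytes as data; otherwise a data byte such as 0x87 inside a push would be mistaken for OP_EQUAL.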

A word of caution though: most nodes not only refuse to accept most of these OP_CODEs, they also refuse to accept most non-standard scripts, mainly because they want users to use standard transactions. Many nodes will refuse not only to accept a transaction with a non-standard script, but also to relay such transactions to other nodes.

 

Stacks

You might’ve already noticed that this script language can only be written as a list of operations. Unlike high-level languages (such as Python), a script has no loops or jumps: its items are simply processed one after the other, in the order in which they were written. The structure it works on is called a stack, because we’re stacking variables and data on top of each other. Each operation takes its input from the top of the stack and puts its output back on top, so the last item stacked is the first one to be used.

In our previous example, the integer 1 was the first item in our stack. Then came the operation OP_1ADD, which took that item as its input, added 1 to it, and gave the output 2. Next, the integer 2 is stacked on top of that result, so the result 2 now sits BELOW the integer 2.

<1> <OP_1ADD> <2> <OP_EQUAL>

<2> <2> <OP_EQUAL>

The node reads the next item as the integer 2 and stacks it, then moves on to the last item in our script, the OP_CODE OP_EQUAL. This operation takes as input the two items directly below it and compares them. If they are equal, it returns True.

<True>
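The same walkthrough can be sketched in Python with a plain list used as the stack. The run_script function is hypothetical, supports only the two OP_CODEs from this example, and records the stack after every step.

```python
def run_script(tokens):
    """Process tokens left to right; return the final stack and a step-by-step trace."""
    stack = []
    trace = []
    for token in tokens:
        if token == "OP_1ADD":
            stack.append(stack.pop() + 1)
        elif token == "OP_EQUAL":
            top = stack.pop()
            below = stack.pop()
            stack.append(1 if top == below else 0)
        else:
            stack.append(token)  # plain data is simply stacked
        trace.append(list(stack))
    return stack, trace


final_stack, trace = run_script([1, "OP_1ADD", 2, "OP_EQUAL"])
for step in trace:
    print(step)
```

The trace goes from [1] to [2], then [2, 2], and finally [1], where 1 stands for True.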

 

This example script can’t be used in a standard bitcoin transaction; it’s only meant to give you a general feel for how scripts work.

You can find an example of a real transaction over here:

 

 

Give it a try with bitpy

One of bitpy’s newest features is the ability to create stacks and see them in action in real time. Mind you, only a few OP_CODEs are currently implemented, but it might still give you a feel for how stacks work.

Example of stack using bitpy

 

Simple stack architecture with python

Stack architecture can easily be implemented using arrays. After all, it’s nothing more than an array of objects (variables, operations, results etc.).

In our bitpy project, under Utils/OpCodes/Codes.py, I’ve created a Stack class. In its most basic form, this class only creates an empty array upon initialization and has just 2 methods.

class Stack():

    def __init__(self):
        self.items = []

    def push(self, item):
        self.items.append(item)

    def pop(self):
        elm = self.items.pop()
        return elm

  1. push(item) appends a new item to the array.
  2. pop() removes the topmost item from the array and returns it.

This should be enough to create a very basic stack class. Still, I’ve added a few more methods.

class Stack():

    def __init__(self):
        self.items = []

    def isEmpty(self):
        return self.items == []

    def push(self, item):
        self.items.append(item)

    def pop(self):
        elm = self.items.pop()
        return elm

    def size(self):
        return len(self.items)

    def printStack(self):
        display = ""
        for items in self.items:
            items = str(items)
            if len(items) > 5:
                display += " " + "<"+ items[:5] + "..." + ">"
            else:
                display += " " + "<" + items + ">"
        return display

    def clear(self):
        self.items.clear()

The isEmpty method will check if our stack array is empty.

The size method will give us the size of the array.

The printStack method will provide us with a visual representation of our array. Pay attention that I’ve limited the size of each item to only 5 characters so that items such as hashed messages, bitcoin addresses, keys etc. won’t take up the whole screen.

The clear method will remove all items from our array.

Using these methods, we can easily start adding more advanced OP_CODEs to our stack class.

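# The following are methods of the Stack class shown above.

def OP_DUP(self):
    # Duplicate the topmost item.
    elm = self.pop()
    self.items.append(elm)
    self.items.append(elm)

def OP_HASH160(self): #saved as string!
    # Replace the topmost item with its hash, using bitpy's key utilities.
    self.push(Utils.keyUtils.keys.generate_hashed_public_key_string(self.pop()))

def OP_EQUAL(self):
    # Pop the two topmost items; push 1 if they are equal, 0 otherwise.
    elm1 = self.pop()
    elm2 = self.pop()

    if elm1 == elm2:
        self.push(1)
    else:
        self.push(0)

def OP_VERIFY(self):
    # Simplified: push 1 if the topmost item is 1, 0 otherwise.
    # (In the bitcoin protocol, OP_VERIFY marks the transaction as invalid
    # when the top item is not true.)
    top = self.pop()
    if top == 1:
        self.push(1)
    else:
        self.push(0)

def OP_RETURN(self, input):
    # Simplified: just push the given data.
    # (In the bitcoin protocol, OP_RETURN marks the transaction as invalid.)
    self.push(input)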
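The same push/pop pattern is enough to sketch a simple hash “riddle” of the form <candidate> <OP_SHA256> <expected hash> <OP_EQUAL>. This is not bitpy code and not a standard transaction script. The function names are hypothetical, and the hashing uses Python’s standard hashlib module.

```python
import hashlib


def sha256(data):
    return hashlib.sha256(data).digest()


def run_hash_riddle(candidate, expected_hash):
    """Return True if hashing the candidate reproduces the expected hash."""
    stack = []
    stack.append(candidate)                 # <candidate>
    stack.append(sha256(stack.pop()))       # OP_SHA256: replace top with its hash
    stack.append(expected_hash)             # <expected hash>
    top = stack.pop()                       # OP_EQUAL: compare the two top items
    below = stack.pop()
    stack.append(1 if top == below else 0)
    return stack.pop() == 1


secret = b"only Bob knows this"
riddle = sha256(secret)  # the part Alice would publish in her statement

print(run_hash_riddle(secret, riddle))
print(run_hash_riddle(b"a wrong guess", riddle))
```

Only someone who knows the original data can make the script end with True, which is the basic shape of the riddles described above.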