Seven Habits To Create Reliable Software - Resources Are Not Free

This is the third part of a collection of seven blog posts about how to write reliable software. When a new developer joins the company we go over this list, and so I’ve decided to organize my thoughts on the subject a bit and share them with a wider audience.

At Komfo, we rely extensively on external services - Facebook, Twitter, Instagram, YouTube, Google+, LinkedIn, Zendesk, and an external CDN to store image and video files. In 24 hours, we make between 1 and 2 million requests to Facebook alone. These calls consume lots of resources that are not cheap, and we closely monitor our consumption.

Since it’s very hard (and undesirable) to test against real services, we simulate most of the external behavior of the 3rd party web services to make our tests fast and reliable. We’re migrating from internal stubs (within the code itself) to external ones (more on this in another blog post). For that purpose, we are writing a simulator that acts as a standalone replacement for all HTTP and HTTPS traffic.

Recently, as part of the development of this simulator, I came across the following piece of PHP code in our product:


public function get_image_result($page_id, $image_path) {
        // First HTTP request: downloads the image contents
        $content = file_get_contents($image_path);
        // Second HTTP request: downloads the same image again, just to read its size and MIME type
        $image_info = getimagesize($image_path);

        $this->doSomeStuff($content, $image_info['mime']);
}

This method is supposed to get the contents of an image stored somewhere in the cloud (file_get_contents()), then get its MIME type (getimagesize()), and finally call some other method. The problem is that this code makes two HTTP requests where one is sufficient: the first request already gives us all the data we need, so there is no reason to make the second one.

This was bad already, but the worst part is that the official PHP documentation encourages the same wasteful behavior:


$size = getimagesize($filename);
$fp = fopen($filename, "rb");
if ($size && $fp) {
    header("Content-type: {$size['mime']}");
    fpassthru($fp);
    exit;
} else {
    // error
}

With all the shortcomings of PHP, you’d think that at least the documentation would be better. Alas.
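For what it’s worth, the documentation snippet could follow the single-read approach we ended up with. Here is a minimal sketch (fine for small files, since the whole file is held in memory instead of being streamed with fpassthru()):


$content = file_get_contents($filename);
if ($content !== false) {
    // Derive the MIME type from the bytes already read, instead of opening the file again
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    header("Content-type: " . $finfo->buffer($content));
    echo $content;
    exit;
} else {
    // error
}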

Anyway, we ended up refactoring the above method to this:


public function get_image_result($page_id, $image_path) {
        // Single HTTP request: downloads the image contents
        $content = file_get_contents($image_path);

        // Derive the MIME type from the bytes we already have - no second request
        $finfo = new finfo(FILEINFO_MIME);
        $mime = $finfo->buffer($content);

        $this->doSomeStuff($content, $mime);
}

It uses the data we already have to deduce the MIME type, making only one HTTP request.
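As an aside, when file_get_contents() fetches an HTTP URL, PHP also populates the local $http_response_header variable with the response headers, so the Content-Type from that single request could be reused directly (assuming the server sets it correctly). A sketch:


$content = file_get_contents($image_path);

// $http_response_header is created by PHP in this scope after an HTTP wrapper read
$mime = null;
foreach ($http_response_header as $header) {
    if (stripos($header, 'Content-Type:') === 0) {
        $mime = trim(substr($header, strlen('Content-Type:')));
        break;
    }
}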

Now, you’d think that I’m nitpicking, but consider the scale at which we operate. 50% fewer external requests means faster responses and less to pay for bandwidth, CPU, memory, and disk. Also, this request is used to get the contents of images and videos. While image sizes are between 1 and 4 MB, the maximum video size we support is 1 GB. Think about the savings when we no longer download a video a second time just to get its MIME type.

Here is another example of what we caught while developing the simulator:


public function getUserAccount() {
        try {
            $current_user = $this->_api->getCurrentUser();
        }
        // catch invalid access token
        catch (\Instagram\Core\ApiAuthException $e){
            $this->_setTokenAsInvalid( $e );
        }
}

This method calls Instagram and gets the current user information. When we make a single comment on any Instagram post, this method is called three times!

  • The first time to get information about the user that will make the comment (to insert later in the DB).
  • The second time to set the username in the comment before we make the POST request to Instagram.
  • The third time, after the POST to Instagram, we make a GET request to make sure that the comment is really there and to get some additional info about it.

Needless to say, the user information does not change in a matter of seconds (all three requests may take less than two seconds). There is absolutely no practical need to call Instagram three times for the user information within such a short period. This is how the method was refactored:


protected $currentUser;

public function getUserAccount() {
        try {
            if ($this->currentUser) {
                // Reuse the user info fetched on a previous call
                $current_user = $this->currentUser;
            }
            else {
                // First call: fetch from Instagram and memoize for later calls
                $current_user = $this->_api->getCurrentUser();
                $this->currentUser = $current_user;
            }
        }
        // catch invalid access token
        catch (\Instagram\Core\ApiAuthException $e) {
            $this->_setTokenAsInvalid( $e );
        }
}

The first time this method is called, it gets the user information from Instagram and saves it for later use. On every subsequent call, it checks whether the information is already in the ‘cache’ (the $currentUser object property) and, if so, uses it. We ended up needing only a third of the original requests. Again, it may not seem like much, but because of the scale at which we operate, and because of the throttling limits imposed by some of the social networks, it makes a lot of sense.
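As a side note, on PHP 7.4 or newer the same memoization pattern can be written more compactly with the null coalescing assignment operator. A minimal sketch, with the error handling omitted for brevity:


public function getUserAccount() {
        // Call Instagram only on the first invocation; reuse the cached value afterwards
        return $this->currentUser ??= $this->_api->getCurrentUser();
}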

Another example of wasted resources was in a piece of unused code. A method tries to update the contracts_activation table upon contract cancellation.


public function cancelActivationPlan($id, $object) {
        $success = $this->db->AutoExecute('contracts_activation', $object, 'UPDATE', "id={$id}");
        if (!$success) {
                return 0;
        }
        else {
                $activation_plan = $this->db->getRow('SELECT cd_id FROM contracts_activation WHERE id = ? ', array( (int)$id ));
        }
        return $id;
}

If the update fails, a ‘0’ is returned (signaling failure with 0 and success with the $id is not a good design at all, but I want to point out something else). If the update succeeds, the method selects a row from the database before returning $id. However, the result stored in $activation_plan is never used! The whole else clause should be removed, saving one database read operation on every call.
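With the dead read removed, the method shrinks to something like this:


public function cancelActivationPlan($id, $object) {
        $success = $this->db->AutoExecute('contracts_activation', $object, 'UPDATE', "id={$id}");
        if (!$success) {
                return 0;
        }
        return $id;
}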

These were just a few examples of how resources are constantly being wasted. Remember that resources are not free, no matter how cheap the hardware gets. When you have some data, use it to the full extent. Save the results for any subsequent operations. Use object properties or a caching mechanism to store them.

Here is a non-exhaustive list of operations that can needlessly waste resources:

  • CPU and memory
  • Writing or reading from the file system (I/O)
  • Writing or reading from the database (I/O)
  • Calls to external systems (network bandwidth)
  • Sending unneeded data over the wire (CPU, memory and bandwidth on mobile devices)
  • Saving more data than you actually need (HDD size)
  • Doing too much between the start and the end of a DB transaction (e.g. doing a network operation while a table is locked; see the sketch after this list).
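To illustrate that last point, here is a sketch using PDO and hypothetical names ($pdo, $imageUrl, $postId, and a posts table): do the slow network call first, so locks are held only for the actual database work.


// Bad: downloading inside the transaction would hold locks for the
// whole duration of the network call.
// $pdo->beginTransaction();
// $imageData = file_get_contents($imageUrl); // network I/O while rows are locked!

// Better: fetch the external data first, keep the transaction short.
$imageData = file_get_contents($imageUrl);

$pdo->beginTransaction();
$stmt = $pdo->prepare('UPDATE posts SET image = ? WHERE id = ?');
$stmt->execute(array($imageData, $postId));
$pdo->commit();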

In my experience, junior developers are mostly unaware of resource usage. The same goes for developers whose only experience is with outsourcing projects (because they are not the ones who will operate the software once it’s delivered). Working only with high-level languages and frameworks that abstract away the underlying mechanics may also be part of the cause.

The operations folks are usually the ones who bear the brunt. That’s why practicing DevOps is really important. So is carrying a pager, or receiving SMS alerts from production. Another helpful tip is to give developers access to the same metrics the operations folks have.

Think about how much money you can save when you don’t have to buy more or faster hardware, or pay for a bigger pipe to the Internet. It all comes down to systems thinking.

The rest of the related posts can be found here: