Fatvat

The Dark Path

2017-01-11T22:56:00.004-08:00

Over the last few months I’ve dabbled in doing my accounts using a spreadsheet (Google Sheets and Excel). The similarities are so stark that I wonder if this isn’t a new trend in managing accounts. If so, it is a dark path.

Both tools have integrated some functional characteristics. For example, they both update automatically to reflect changes in values. This is a good thing, in general.

My problem is that both tools have doubled down on automation. Both seem to be intent on forcing me to write references to every needed cell in the spreadsheet!

Now I don’t want you to think that I’m opposed to automation. I’m not. I use pen and paper, I use tools. I have a slight preference for pen and paper, but I'm using a spreadsheet too.

It’s not the fact that spreadsheets automate that has me concerned. Rather, it is the depth.

Previously, I'd created tables in Word. I can structure it so that it's correct; but I can also violate many of the "rules" whenever I need or want to. Word underlines some bits in green/red and throws up a few roadblocks; but not so many as to be obstructionist.

Google Sheets and Excel, on the other hand, are completely inflexible when it comes to their rules. For example, in Google Sheets if I sum up a column then by God everything in that column and all the dependent references have to be adorned by being a "number". There is no way, in this tool, to silently ignore me mistaking a string for a number!

Now, perhaps you think this is a good thing. Perhaps you think that there have been a lot of bugs in systems that have resulted from un-coerced numbers. Perhaps you think that if you aren’t escorted, step by step, through the dependent cells it would be risky and error prone. And, of course, you would be right about that.

The question is: whose job is it to manage the "numbers"? The tool? Or the pen and paper?

These so called "spreadsheets" are like the little Dutch boy sticking his fingers in the dike. Every time there’s a new kind of bug, we add a feature to prevent that kind of bug. And so these tools accumulate more and more fingers in holes in dikes. The problem is, eventually you run out of fingers and toes!

This is the wrong path!

Ask yourself why we are trying to plug defects with tools. The answer ought to be obvious. We are trying to plug these defects because these defects happen too often.

Now, ask yourself why these defects happen too often. If your answer is that our tools don’t prevent them, then I strongly suggest that you quit your job and never think about using a spreadsheet again; because errors are never the fault of our tools. Defects are the fault of users. It is users who create defects – not spreadsheets.

And what is it that programmers are supposed to do to prevent defects? I’ll give you one guess. Here are some hints. It’s a verb. It starts with a “T”. Yeah. You got it. TEST!

You test every number is indeed a number. You test that your formulas refer to actual elements; not empty cells. You test that you've recalculated everything!

Why are these spreadsheets adopting all these features?

We now have spreadsheets that force us to adorn every function, all the way up the dependent cells, with number. We now have spreadsheets that are so constraining, and so over-specified, that you have to specify all the elements that they refer to!

All these constraints that these spreadsheets are imposing presume that the user has perfect knowledge of the system before the system is written. They presume that you know which number is a number. They presume you know not to mix different units. They presume you know which input should link to which output. They presume you know what units a result will come back in.

And because of all this presumption, they punish you when you are wrong.

And how do you avoid being punished? There are two ways. One that works; and one that doesn’t. The one that doesn’t work is to design everything up front before starting. The one that does avoid the punishment is to override all the safeties.

And so, you write everything on paper and you leave these so called "spreadsheets" alone.

Why did the nuclear plant at Chernobyl catch fire, melt down, destroy a small city, and leave a large area uninhabitable? They overrode all the safeties. So don’t depend on safeties to prevent catastrophes. Instead, you’d better get used to writing lots and lots of tests, no matter what spreadsheet you are using!

Of course, this is just an early morning, unfunny parody of an article by Bob Martin, The Dark Path.

I strongly disagree with the sentiment expressed in the article. Types are a tool that helps you write code safely. Tests are a tool that helps you write code safely. Neither replaces the other.

To suggest that we should abandon static-typing is wrong. If I change something returning type A to returning type B I can see how that might mean I have to change a lot of my code base. That doesn't mean static typing is bad, it could mean anything!

Maybe the design sucks, why does so much of the code know about type A?
Maybe it's a good thing - you didn't know everything upfront, an assumption has changed, so you should change the code?
Maybe you want to experiment? We should find a way to defer type errors to runtime?
Maybe we should invest in tooling to make this problem more tractable? (Jackpot!?)

We, as software engineers, should be actively looking to advance the state of the art. We should be building tools to support our ways of working, not rallying against those that do.

Radical Focus - OKRs

2017-01-11T01:10:00.000-08:00

Radical Focus explains the OKR process with a business narrative (a startup building the StarBucks of Coffee). The story shows how focusing on OKRs supports the team taking the tough decisions (e.g. stopping promising work if it doesn't support an objective) and spending their limited runway of activity on the right tasks.

It's a short read with some good points, but having read the book and watched "The Executioner's Tale" (https://vimeo.com/86392023), I'd pick the video next time!

Key takeaways

OKRs are great for setting goals, BUT without a system to achieve them you are as likely to fail as with anything else.
A mission keeps you on the rails - the OKRs provide focus and milestones.
Set only one OKR for the company - it's about focus
Timescales should be about 3 months - too long and it's too far away to have impact, and too short and it's not bold enough
Objectives are inspirational not metrics
Repeat the message. The goal needs to be at the front of everyone's mind and tied to all activities. - "When you are tired of saying it, people are starting to hear it" (Jeff Weiner, CEO of LinkedIn)
Heuristic for KRs - one usage metric, a revenue metric and a satisfaction metric.
A good key result should be a bit scary - a 50/50 confidence you can make it is about right.
Use health metrics to identify areas to protect as you meet the goals (what can't you screw up?)
Use the four-square template (http://www.tightship.io/assets/weekly-meeting.jpg) to keep focus
Reinforce the message at the beginning and end of the week (Monday discuss/challenge, Friday demonstrate/celebrate)

The 3X Approach

2017-01-03T08:58:00.003-08:00

I watched a webinar recently about Kent Beck's characterization of product development as a triathlon and thought I'd summarize the notes here!

Kent presents product development as a three-phased approach (eXplore, eXpand, eXtract). This really closely models the product life cycle (from HBR).

Kent's insight is that you should act differently depending on the phase you are in.

In the Explore phase (stage #1 above) you can't predict the future. You've no idea how long finding market fit is going to take. It's a high risk activity. The main driver of success is the rate at which you can run experiments and learn quickly. At this stage software quality is irrelevant - the half-life of the code is short. You want a cross-functional, loosely co-ordinated team to deliver this phase.

The next stage is Expand - this is rapid growth, equivalent to the B round of venture capital. You have validated the market, you know what to do and how to do it. Time to scale, develop features and get it in the hands of users and build those feedback cycles.

Finally, you are at Extract. This is where economies of scale shine. Work here is predictable - adding a feature will result in a known amount of revenue. You can estimate things well because you've done it plenty of times before. Quality is really important - cutting corners now will cost you because you'll be stuck with it forever.

Organizations can get tuned to a particular way of thinking and that can constrain. For example, it's easiest to get tuned into the last phase Extract (see Innovator's Dilemma)

Beck states that this is one of the things XP got wrong - there is no one-size-fits-all methodology. The 3X model says it's all about where you are on the S-curve.

Design Pattern Haikus

2016-08-31T06:57:00.000-07:00

Singletons

a singleton is

a global variable

sounds a bit better

A singleton is just a fancy name for a global variable. I like to think of a singleton as a way of warping to another point in space/time, changing the space time continuum and then heading back. You've seen the sci-fi films where this happens, right? It has similar effects on your code, making it difficult to understand what the hell is happening.
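
To make the comparison concrete, here is a minimal sketch in Python. The `Config` class and the two helper functions are hypothetical names invented for illustration; the point is only that the "one instance" behaves exactly like a module-level global.

```python
class Config:
    """A classic singleton: every call to Config() returns the same object."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.settings = {}
        return cls._instance


def enable_debug():
    # Nothing in this signature says it touches shared state...
    Config().settings["debug"] = True


def is_debug():
    # ...yet the answer here depends on whether enable_debug() ran first.
    return Config().settings.get("debug", False)
```

Replace `Config()` with a plain global dictionary and the behaviour is identical, which is really the whole haiku.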

Interpreter

interpreter is

domain specific language

that is all it is

Well, there's not much more to say is there? You want your code to be readable in the language of your domain. One way to do this is to make the code look more like the language of your domain. A domain-specific language is one way to accomplish this.

Visitor

multiple dispatch

that is what you really wanted

visitor will do

You've got a function that wants to do different things depending on the types of the arguments. You've got a language that only allows you to do one different thing via polymorphism. You don't want type-switching cos someone told you that was bad.

Assuming you've got those prerequisites then go for the visitor pattern. Alternatively, look at multiple dispatch and see there's no problem there at all really.
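
As a rough sketch of the difference, the Python below solves the same toy problem twice. Python has no built-in multiple dispatch, so the second half hand-rolls a small lookup table keyed on both argument types; `Asteroid`, `Ship`, `when` and `collide` are all hypothetical names made up for this example.

```python
# Visitor-style double dispatch: two single dispatches chained together.
class Asteroid:
    def collide_with(self, other):
        # First dispatch picks this method from type(self);
        # the second dispatch happens on type(other).
        return other.hit_by_asteroid(self)

    def hit_by_asteroid(self, asteroid):
        return "asteroids bounce"

    def hit_by_ship(self, ship):
        return "ship destroyed"


class Ship:
    def collide_with(self, other):
        return other.hit_by_ship(self)

    def hit_by_asteroid(self, asteroid):
        return "ship destroyed"

    def hit_by_ship(self, ship):
        return "ships dock"


# Hand-rolled multiple dispatch: one table keyed on both argument types.
# Note it matches exact types only and ignores inheritance.
_rules = {}


def when(type_a, type_b):
    def register(fn):
        _rules[(type_a, type_b)] = fn
        return fn
    return register


def collide(a, b):
    return _rules[(type(a), type(b))](a, b)


@when(Asteroid, Asteroid)
def asteroid_asteroid(a, b):
    return "asteroids bounce"


@when(Asteroid, Ship)
def asteroid_ship(a, b):
    return "ship destroyed"


@when(Ship, Asteroid)
def ship_asteroid(a, b):
    return "ship destroyed"


@when(Ship, Ship)
def ship_ship(a, b):
    return "ships dock"
```

With the visitor version, adding a new type means touching every existing class; with the table, it means registering a few more functions.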

Strategy

functions compose well

classes do not compose at all

love strategy pattern

You want to break apart a set of work into small discrete components (let's call them "objects"). The strategy pattern allows you to plug these together to solve a problem.

Or... You want to break apart an algorithm into discrete functions (let's call them functions). Functions glue together.
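
Here is a small Python sketch of both halves of that argument. The discount example and every name in it (`Checkout`, `ten_percent_off`, `compose` and so on) are hypothetical, and prices are in whole pence to keep the arithmetic exact.

```python
# Strategy as objects: an interface plus one class per behaviour.
class DiscountStrategy:
    def apply(self, total):
        raise NotImplementedError


class TenPercentOff(DiscountStrategy):
    def apply(self, total):
        return total * 90 // 100


class Checkout:
    def __init__(self, strategy):
        self.strategy = strategy

    def total(self, prices):
        return self.strategy.apply(sum(prices))


# Strategy as functions: the "interface" is just "takes a total, returns a total".
def ten_percent_off(total):
    return total * 90 // 100


def five_pounds_off(total):
    return max(total - 500, 0)


def compose(*discounts):
    """Glue several discount functions into one, applied left to right."""
    def combined(total):
        for discount in discounts:
            total = discount(total)
        return total
    return combined


def checkout_total(prices, discount=lambda total: total):
    return discount(sum(prices))
```

Stacking two discounts in the function version is just `compose(ten_percent_off, five_pounds_off)`; the class version would need yet another class to do the same job.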

Professionalism in Software Engineering

2016-05-11T22:54:00.004-07:00

Yesterday, I attended a talk by Bob Martin on Professionalism in Code hosted by Software East at Redgate. I also spent considerable time beforehand working out whether it'd be unfair to introduce Bob as the "Donald Trump of Computer Programming". Sanity prevailed and I didn't. But I just wrote it up there, didn't I? D'oh.

Is the software engineering industry professional? As software engineering professionals we should have skill and good judgement. Does the software engineering industry have that?

Bob gave the famous examples [PDF] of Knight Capital and Volkswagen, but there are many more (Therac-25, the F35 and The Chaos Report [PDF]). You could argue that the software industry is in meltdown and developer incompetence / poor judgement has cost the industry billions. I think you'd have a pretty convincing case! Or would you? There was no mention of the other side - the tremendous advances our haphazard industry has brought to almost the whole planet (the Internet, mobile phones, communication).

Bob painted the nightmare scenario - regulation. Imagine some time from now, some bug somewhere (a missing ; even) results in a number of deaths. The nuclear reactor blows, the self-driving cars go mad on a leap year and start running people over, the planes turn upside down when crossing the equator (etc). The natural result of this is the Government blames us (software developers) and starts to put some regulations in place. Again, this feels believable-ish.

But why hasn't it happened yet? Well, safety-critical systems are pretty regulated. See this lump [PDF trigger warning] from the FAA about how they do things. I'm not going to argue it's perfect, but it's demonstrably good enough to stop null pointers making things fall out the sky regularly.

So what should we do to prevent this threat of regulation? Well, we should:

Not ship shit!
Give reasonable estimates!
QA should find no bugs
Software should get better, not worse
Invest 20 hours per week in personal development

Most of this stuff is easy to agree with. Of course we shouldn't ship shit - are you insane? Of course we should strive to write bug free code. If you want to explore these ideas more, Bob's book (The Clean Coder) covers the topics in much greater detail. It was a great talk and got the audience thinking.

Are these the right things to make the industry professional? Maybe. Maybe as an industry we should look at other areas:

Should we write code in unsafe languages like C for safety critical systems?
Should we put JavaScript on a plane?
Is that shared mutable state acceptable in my car?
Languages with null are a billion dollar mistake - outlaw them!
Should we use test-driven development as justification for using weak languages?

There's a huge space for software engineers to explore about building professional quality software. We've not explored much yet. I'm certainly no historian, but I'd imagine professions like Medicine, Law and Engineering took more than 60 or so years to establish what good looked like.

The deliciously chaotic world of software engineering is going to continue for a while yet!

Azure Platform Services

2016-04-04T09:23:00.000-07:00

Last time around we compiled a large compendium of links detailing Azure Infrastructure services. In this post, we'll compile an even larger set of links detailing Azure's Platform Services (PaaS rather than IaaS).

The breadth of services that Azure offers is pretty overwhelming, so take a deep breath :)

Cloud Compute

Azure Cloud Services allows you to create a compute service (ASP.NET, Python, node and PHP are all supported). This might be a worker role (e.g. a background task) or a web role (e.g. simple requests to display data). Cloud Services gives you the ability to scale these horizontally as needed.

Azure Batch runs your large scale parallel tasks and big batch processing jobs. Azure scales these as needed.

Azure Service Fabric is an orchestration layer for micro-service based deployments. It was born out of internal use at Microsoft, where it was used to develop Azure itself.

Azure RemoteApp is a bridging technology (similar to Citrix type things) allowing you to access your application anywhere.

Web and Mobile

Web App Service allows you to deploy web applications in languages like C#, node.js and Python. It's now part of the more general Azure App Service.

API App Service allows you to deploy secured API services and generate appropriate clients to access them.

API Management Service lets you "take any API and publish a service in minutes". More specifically you get monitoring, RESTful and JSON-ful support and the ability to combine multiple back-ends into a single API endpoint.

Mobile App Service gives you an API specifically for mobiles, including support for off-line sync.

Logic App Service lets you integrate business processes and workflows visually. Its goal is to make it easy for you to join your data from on-premise to cloud-based workflows.

Notification Hubs provide scalable push-notifications to all major platforms (including iOS and Android).

Data

SQL Database provides a fully managed PaaS version of SQL Server with advanced features such as an index advisor (monitoring your workload to see access patterns that would benefit from an index).

SQL Data Warehouse is a data warehouse that can scale to huge volumes of data (pricing is based separately on Compute and Storage use).

Redis Cache is the PaaS version of Redis, an in-memory data structure store.

DocumentDB is a store for JSON documents. As of Build 2016, it was announced that there is a MongoDB compatibility layer (see here).

Azure Search provides a fully managed search service (with Lucene query compatibility).

Azure Table Storage provides you with a key-value store aimed at large schema-less documents.

Analytics and IoT

HDInsight is a managed Apache Hadoop (map/reduce), Spark, R, HBase and Storm service made "easy".

Azure Machine Learning is a set of machine-learning APIs, allowing you to apply advanced analytics to a wide range of data sources (pictures, people, text etc.). As of Build 2016, this seems to be in the process of being rebadged "cognitive services".

Azure Stream Analytics gives you the ability to do real time processing of streaming data from huge numbers of sources.

Azure Data Factory is a set of data orchestration APIs, allowing you to mangle data from different sources together (with tools for data lineage, ETL and so on).

Azure Event Hubs is a scalable pub-sub service for aggregating events from many sources.

Mobile Engagement is a set of APIs for monitoring and understanding app usage on mobile devices.

Azure Infrastructure Services

2016-04-01T02:13:00.003-07:00

Azure “is a growing collection of integrated cloud services for moving faster, achieving more, and saving money”. Well, that’s the marketing lingo, but what are the actual services available?

The diagram above, from here, is the best example I've seen of capturing everything that Azure is and the services that it offers.

Let's start with the infrastructure services.

Infrastructure services (IaaS) abstract away physical machines to services that can be molded via code rather than plugging in cables.

Under the banner of Compute, there are a couple of services. Azure Virtual Machines let you deploy images in any way. They aren't just limited to Windows; support includes Linux, Oracle, IBM and SAP. Azure Container Service allows you to deploy containers to Azure. This is heavily open-source friendly and allows you to use Apache Mesos or Docker Swarm to orchestrate.

There are many options for file storage as a service. Azure Blob Storage provides a service for storing large amounts of unstructured data that can be accessed via http(s). A blob account has multiple containers (think of these as folders or organizational units) and each container can hold lumps of data (blocks, append-only, page blobs). Azure Files provides fully managed file shares using the standard SMB protocol. This allows you to migrate file share-based applications to the cloud with no changes. Finally in the storage offerings there are a variety of low-latency and high throughput storages referred to as premium storage. These are essentially pre-configured virtual machines with optimized technology (e.g. SSD) for storage. Options include hosting SQL Server, Oracle, MySQL, Redis and MongoDB.

There's a whole raft of networking services. Azure Virtual Network provides an environment to run your machines and applications where you can control the subnets, access control policies and more. Azure Load Balancer does exactly what it says on the tin - it's a Layer 4 load balancer that allows you to distribute incoming traffic. Azure DNS is another Ronseal service! ExpressRoute lets you create private connections between your data center and Azure data centers (giving you up to 10 Gbps). Traffic Manager is similar to load balancing, but with more flexibility around failover, A/B testing and combining Azure / on-prem systems. Azure VPN Gateways is another virtual network manager, and Application Gateway is an application level load balancer.

Confused yet?

Evolutionary Design Reading List

2016-01-18T03:26:00.003-08:00

Evolving a shared library with an API in flux is a tough problem, but there’s plenty of principles, practices and patterns around this.

Parallel Change - Parallel change, also known as expand and contract, is a pattern to implement backward-incompatible changes to an interface in a safe manner by breaking the change into three distinct phases: expand, migrate, and contract.

Postel’s Principle - Be conservative in what you send; be liberal in what you accept.

Refactoring Module Dependencies - Some patterns for refactoring module dependencies

Package Management Principles - Principles of packages (REP, CCP, CRP, ADP, SDP, SAP). Think of these as a higher-level version of SOLID.

Strangler Application - A metaphor describing growing a new system around the edges of old.

Asset Capture - A strategy for migrating between a strangler application and back again.

On the Criteria to be used in Decomposing Systems into Modules - Parnas’ classic paper on modular systems (referenced by Tim in his recent talk).

Escape Integration Test Syrup - Talk from Agile on the Beach about testing and rapidly changing dependencies.

Semantic Versioning - For completeness!
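
Of the entries above, Parallel Change is the easiest to show in a few lines. The sketch below is in Python and the `Account` class is entirely hypothetical; it shows a library halfway through the change, with the old method still working but nudging callers towards the new one.

```python
import warnings


class Account:
    """A library class midway through a parallel change (deposit -> deposit_pence)."""

    def __init__(self):
        self.balance_pence = 0

    # Expand: the new API is added alongside the old one.
    def deposit_pence(self, pence):
        self.balance_pence += pence

    # Migrate: the old API keeps working, delegates to the new one,
    # and warns callers so they can move over at their own pace.
    def deposit(self, pounds):
        warnings.warn(
            "deposit() is deprecated; use deposit_pence()",
            DeprecationWarning,
            stacklevel=2,
        )
        self.deposit_pence(round(pounds * 100))

    # Contract: once no callers of deposit() remain, delete it in a later release.
```

The contract step is the one that tends to get forgotten, so it's worth tying the removal to a specific (major) version bump.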

The (very) Basics of R with the Game of Life

2015-11-09T22:56:00.002-08:00

R is a programming language for statistical computing and graphics. It's also the language of choice amongst pirates. Arrr! R is increasingly important for big data analysis, and both Oracle and Microsoft have recently announced support for database analytics using R.

So, how do you get started with R? Well, for the rest of this I'm going to assume that you already know how to program in a { } language like Java / C# and I'm going to cover the minimum amount possible to do something vaguely useful. The first step is to download the environment. You can get this from here. Once you've got something downloaded and installed you should be able to bring up a terminal and start R. I really like the built in demos. Bring up a list of them with demo() and type demo(graphics) to get an idea of the capabilities of R.

These are the boring syntax bits:

R is a case sensitive language
Comments start with # and run to the end of the line
Functions are called with parentheses e.g. f(x,y,z)

The "standard library" of R is called the R Base Package. When you bring up R, you bring up a workspace. A workspace is just what is in scope at any one time. You can examine the workspace by calling the ls function.

# Initially my workspace is empty
> ls()
character(0)

# Now I set a value and lo-and-behold, it's in my workspace
> x = "banana"
> ls()
[1] "x"

# I can save my workspace with
> save.image(file="~/foo.RData");

# I can load my workspace with
> load("~/foo.RData");

# I can remove elements from the workspace with rm
> rm(x)
> ls()
character(0)

# I can nuke my workspace with rm(list=ls())
> x = 'banana'
> rm(list=ls())
> ls()
character(0)

We've seen above that R supports string data, but it also supports vectors, lists, arrays, matrices, tables and data frames. To define a vector you use the c function. For example:

> x = c(1,2,3,4,5)
> x
[1] 1 2 3 4 5

> length(x)
[1] 5

Remember everything in a vector must be of the same type. Elements are coerced to the same type, so c(1,'1',TRUE) results in a vector of string types. Indexing into vectors starts at 1 (not zero). You can use Python style list selection:

> x = c(1,2,3,4,5,6,7,8,9,10)
> x[7:10] # select 7 thru 10
[1] 7 8 9 10
> x[-(1:3)] # - does exclusion
[1] 4 5 6 7 8 9 10

To define a list, you use, ahem, list. Items in list are named components (see the rules of variable naming).

> y = list(name="Fred", lastname="Bloggs", age=21)
> y
$name
[1] "Fred"
$lastname
[1] "Bloggs"
$age
[1] 21
> y$name # access the name property
[1] "Fred"

Finally, let's look at matrices. You construct them with matrix and pass in a vector to construct from, together with the size.

> m = matrix( c(1,2,3,4,5,6,7,8,9), nrow=3, ncol=3)
> m
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

OK, that should be enough boring information out of the way to let me write a function for the Game of Life. All I want to do is take a matrix in, apply the update rules, and return a new one. How hard can that be? You write R files with the extension ".R" and bring them into your workspace with the source function. Here's an embarrassingly poor go at the Game of Life (note I've only spent 5 minutes with the language, so if you've got any improvements to suggest or more idiomatic ways of doing the same thing, they are gratefully received!).

Testing this at the REPL with a simple pattern.

    > blinker = matrix(0,5,5)
    > blinker[3,2:4]=1
    > blinker
         [,1] [,2] [,3] [,4] [,5]
    [1,]    0    0    0    0    0
    [2,]    0    0    0    0    0
    [3,]    0    1    1    1    0
    [4,]    0    0    0    0    0
    [5,]    0    0    0    0    0
    > nextStep(blinker)
         [,1] [,2] [,3] [,4] [,5]
    [1,]    0    0    0    0    0
    [2,]    0    0    1    0    0
    [3,]    0    0    1    0    0
    [4,]    0    0    1    0    0
    [5,]    0    0    0    0    0

Huzzah! Next steps are probably to write some unit tests [PDF] around it, but learning how to install packages can wait till another day!

Mindless TDD

2015-10-17T10:17:00.000-07:00

In this post, I look at a common misinterpretation of TDD I often see at coding dojos/katas.

A quick refresher - TDD is simple, but not easy.

Write a failing test
Write the minimum code to make the test pass
Refactor

You have to think at each step. This is often overlooked, and TDD is portrayed as a series of mindless steps (if that were true, developers probably wouldn't get paid so well!).

Let's take the Bowling Kata as an example of how easy it is to fall into mindless steps. What's the right test to do first?

[Test]
public void PinExists() {
    Assert.That(new Pin().IsStanding, Is.True);
}

We're bound to need a Pin class right? And we should definitely check whether it's standing up or falling down. We continue in this vein, and create a Pin that can be knocked down and stood up. Everything proceeds swimmingly. 15 - 20 minutes have elapsed and we have a unit tested equivalent of bool that's no use to anyone.

I've seen similar anti-patterns in the Game of Life kata. We write some tests for a cell, and the rules (three live neighbours means alive and so on). We use this to drive a Cell class and we add methods to change the state depending on the number of neighbours. Some time passes, and then we realize we actually want a grid of these objects, and they change state based on their neighbours' state and we're in a bit of a pickle. Our first guess at an implementation has given us a big problem.

If we somehow manage to solve the problem from this state, we end up with a load of tests that are coupled to the implementation. Worse, because we've ended up creating lots of classes, we start prematurely applying SOLID, breaking down things into even more waffly collections of useless objects with no coherent basis. Unsurprisingly, it's difficult to see the value in test-driven development when practiced like this.

So what's the common problem in both these cases?

Uncle Bob has described this behaviour: slide 9 of the Bowling Kata PPT describes a similar problem, but attributes it to over-design and suggests TDD as the solution. I agree, but I think some people pervert TDD to mean test-driven development of my supposed solution, rather than TDD of the problem itself.

The common problem is simple. Not starting with the end in mind!

If we'd have started the Bowling Kata from the outside-in, our first test might have simply bowled 10 gutter balls and verified we return a zero. We could already ship this to (really) terrible bowlers and it'd work!

Maybe next we could ensure that if we didn't bowl any spares/strikes it'd sum the scores up. Again, now we can ship this to a wider audience. Next up, let's solve spares, then strikes and at each stage we can ship!
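
As a sketch of where that sequence of tests might lead, here it is in Python rather than C#. The `score` function and the test names are hypothetical (this is one possible end state, not the official kata solution), and the tests are listed in the order they would have been written, each one shippable.

```python
def score(rolls):
    """Score a complete game of ten-pin bowling from a flat list of rolls."""
    total = 0
    i = 0  # index of the first roll of the current frame
    for _ in range(10):
        if rolls[i] == 10:  # strike: 10 plus the next two rolls
            total += 10 + rolls[i + 1] + rolls[i + 2]
            i += 1
        elif rolls[i] + rolls[i + 1] == 10:  # spare: 10 plus the next roll
            total += 10 + rolls[i + 2]
            i += 2
        else:  # open frame: just the pins knocked down
            total += rolls[i] + rolls[i + 1]
            i += 2
    return total


def test_gutter_game():
    assert score([0] * 20) == 0


def test_open_frames_are_summed():
    assert score([1] * 20) == 20


def test_spare_adds_next_roll():
    assert score([5, 5, 3] + [0] * 17) == 16


def test_strike_adds_next_two_rolls():
    assert score([10, 3, 4] + [0] * 16) == 24


def test_perfect_game():
    assert score([10] * 12) == 300
```

Notice there is no Pin, no Frame and no Lane anywhere; none of the tests ever asked for one.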

Each time around the TDD loop we should have solved more of the problem and be closer to fully solving it. TDD should be continuous delivery; if the first test isn't solving the problem for a simple case it's probably not the right test.

Similarly for the Game of Life, instead of starting from a supposed solution of a cell class, what happens if your first test is just evolving a grid full of dead cells? What happens if we add the rules one at a time? You can ship every test once you've added the boilerplate of the "null" case.
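
A minimal sketch of that approach, again in Python with hypothetical names. It assumes a fixed-size grid of 0s and 1s where everything beyond the edge counts as dead; the first test is the "null" case and the second checks a blinker.

```python
def next_step(grid):
    """Return the next generation of a grid of 0 (dead) and 1 (alive) cells."""
    rows, cols = len(grid), len(grid[0])

    def live_neighbours(r, c):
        # Count live cells in the surrounding 3x3 block, clipped to the grid.
        return sum(
            grid[rr][cc]
            for rr in range(max(r - 1, 0), min(r + 2, rows))
            for cc in range(max(c - 1, 0), min(c + 2, cols))
            if (rr, cc) != (r, c)
        )

    def rule(alive, n):
        if alive and n in (2, 3):  # survival
            return 1
        if not alive and n == 3:   # birth
            return 1
        return 0                   # under/over-population, or stays dead

    return [
        [rule(grid[r][c], live_neighbours(r, c)) for c in range(cols)]
        for r in range(rows)
    ]


def test_dead_grid_stays_dead():
    dead = [[0] * 3 for _ in range(3)]
    assert next_step(dead) == dead


def test_blinker_flips():
    horizontal = [[0, 0, 0], [1, 1, 1], [0, 0, 0]]
    vertical = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
    assert next_step(horizontal) == vertical
    assert next_step(vertical) == horizontal
```

There's no Cell class here either; the grid of numbers was all the tests ever demanded.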

TDD isn't about testing your possible implementation on the way to solving the problem, it's about writing relevant tests first and driving the implementation from that. Start from the problem!

TDD done right is vicious - it's a series of surgical strikes (tests) aimed at getting you to solve the problem with the minimum amount of code possible.

Anatomy of a class.

2015-03-31T08:31:00.000-07:00

Do you ever view a class and get filled with a sense of dread? I did today, so I thought a good old-fashioned rant was in order.

I opened up a class today and was greeted with this. First off, don't worry, I made the Wibble up. Secondly, if wibble was the first thing you noticed we're probably in trouble.

public sealed class MultiWibbledEntitiesDataPresenterFactory<TDetectionContext, TWibbledEntity, TProvider> : BasicFactory<Unit, IDataPresenter<MultiWibbledEntitiesContext<TDetectionContext, TWibbledEntity>>>
    where TWibbledEntity : WibbledEntity
    where TProvider : IProvider<TWibbledEntity>
{
    private readonly IUtcDateTimeProvider m_UtcDateTimeProvider;
    private readonly ILocalDateTimeProvider m_LocalDateTimeProvider;
    private readonly IWibbledEntityDetector<TDetectionContext> m_WibbledEntityDetector;
    private readonly IFactory<TWibbledEntity, TProvider> m_ProviderFactory;
    private readonly IFactory<TWibbledEntity, IDataPresenter<SingleWibbledEntityContext<TWibbledEntity, TProvider>>> m_RawWibbledEntityDataPresenterFactory;
    private readonly Func<IUtcDateTimeProvider, string, IStatusLogger> m_StatusLoggerBuilder;

    public MultiWibbledEntitiesDataPresenterFactory(
        IUtcDateTimeProvider utcDateTimeProvider,
        ILocalDateTimeProvider localDateTimeProvider,
        IWibbledEntityDetector<TDetectionContext> wibbledEntityDetector,
        IFactory<TWibbledEntity, TProvider> providerFactory,
        IFactory<TWibbledEntity, IDataPresenter<SingleWibbledEntityContext<TWibbledEntity, TProvider>>> rawWibbledEntityDataPresenterFactory,
        Func<IUtcDateTimeProvider, string, IStatusLogger> statusLoggerBuilder
    )
    {
        m_UtcDateTimeProvider = utcDateTimeProvider;
        m_LocalDateTimeProvider = localDateTimeProvider;
        m_WibbledEntityDetector = wibbledEntityDetector;
        m_ProviderFactory = providerFactory;
        m_RawWibbledEntityDataPresenterFactory = rawWibbledEntityDataPresenterFactory;
        m_StatusLoggerBuilder = statusLoggerBuilder;
    }

    protected override IDataPresenter<MultiWibbledEntitiesContext<TDetectionContext, TWibbledEntity>> ConstructItem(Unit key)
    {
        return new MultiWibbledEntitiesDataPresenter<TDetectionContext, TWibbledEntity, TProvider>(
            m_UtcDateTimeProvider,
            m_LocalDateTimeProvider,
            m_WibbledEntityDetector,
            m_ProviderFactory,
            m_RawWibbledEntityDataPresenterFactory,
            m_StatusLoggerBuilder
        );
    }
}

OK, you've read through that. You've probably died a little inside. What did you learn? Well, this is a MultiWibbledEntitiesDataPresenterFactory.

What the actual fuck?

A multi wibbled entities data presenter factory.

Spacing it out doesn't help much either. It's a factory that makes data presenters for multi wibbled things. OK, that starts to make some sense. I guess I'd use this class if ever I needed to make a multi-wibbled-entities-data-presenter-factory.

Let's say that's the case. How do I construct one of these factory things? I need a couple of time providers (UTC and local time, just in case), a detector (no idea what that is), two more factories and a function called "statusLoggerBuilder". And this is just to create an object (albeit a rather complicated multi-wibbled data presenter object).

What can you do with the class? Not a lot, there aren't any public methods other than the constructor, and that's pretty boring. So, in order to make any progress, you'll have to explore a few more classes. You'll need to visit the "BasicFactory". You'll need to look at Unit, IDataPresenter and a few more parameterized classes. In order to work out what this does, I've got to read all these files.

What's with all the generics? Does this tell me the original developer was a template meta-programming C++ person? Why all the complexity?

How many files do I need to open in order to understand this class?

What problem is this class solving? The code doesn't tell me this, there aren't any comments and there aren't any tests. The only way for me to understand this code is to navigate all the code's friends and work out what each of them does.

But on the plus side, I can create one and test it, so it must be good right?

The Diamond Square Algorithm

2015-01-19T11:00:00.000-08:00

Ever wondered how to generate a landscape? I've been fascinated by these since the days of Vista Pro on my trusty Amiga.

The diamond-square algorithm is a method for generating heightmaps. It's a great algorithm because it's amazingly simple and produces something very visual (similar to the emergent behaviour exhibited by the flocking algorithm). My kind of algorithm! In this post, I'll try to explain the implementation using Haskell and generate some pretty pictures.

As the paper [PDF] states, previous modelling techniques for graphics were based on the idea that you can simply describe a landscape as some set of deterministic functions. Bezier and B-spline patches used higher-order polynomials to describe objects and this approach was good for rendering artificial objects. Natural objects, such as terrain, don't have regular patterns so an approach like splines doesn't work.

This algorithm was innovative because it used a stochastic approach. Given some simple rules (and some randomness!) the algorithm generates a "natural" looking landscape. The paper describes models for 1D, 2D and 3D surfaces. We'll just use the simplest possible example, rendering a height map.

We start with a square with each corner given a randomly assigned height. If the area of this square is 1, then we’re done. Easy. We’ll call the corners, TL, TR, BL and BR (representing top left, top right, bottom left and bottom right).

If the square is too big, then we recursively divide it into smaller squares.

We assign each new square a height based on the average of the points surrounding it. Note there’s nothing stochastic about this approach yet, it’s purely deterministic.

We can model this with Haskell pretty clearly. We start off by defining a simple type to represent a Square.

type Point = (Int,Int)

data Square = Square
  { position :: Point
  , size     :: Int
  , tl       :: Double -- Height of top left
  , tr       :: Double -- Height of top right
  , bl       :: Double -- Height of bottom left
  , br       :: Double -- Height of bottom right
  } deriving (Show,Eq)

Now all we have to do is write a little function to divide things into four. Firstly let’s capture the pattern that dividing stops when the size of the square is one.

isUnit :: Square -> Bool
isUnit sq = size sq == 1

allSubSquares :: (Square -> [Square]) -> Square -> [Square]
allSubSquares f sq
  | isUnit sq = [sq]
  | otherwise = concatMap (allSubSquares f) (f sq)

The allSubSquares function now simply calls our splitting function repeatedly until things are reduced to the tiniest possible size.
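To see the recursion in isolation, here's a minimal sketch that swaps Square for a cut-down stand-in holding only a size. The names Sq, allSubSq, quarter and unitCount are made up for this illustration and aren't part of the project. The splitter just produces four children of half the size, and the sketch assumes the starting size is a positive power of two.

```haskell
-- Cut-down stand-in for the post's Square: only the size matters here.
newtype Sq = Sq { sqSize :: Int } deriving (Show, Eq)

isUnitSq :: Sq -> Bool
isUnitSq sq = sqSize sq == 1

-- Same shape as allSubSquares: keep splitting until every piece is a unit.
allSubSq :: (Sq -> [Sq]) -> Sq -> [Sq]
allSubSq f sq
  | isUnitSq sq = [sq]
  | otherwise   = concatMap (allSubSq f) (f sq)

-- Hypothetical splitter: four children, each half the parent's size.
quarter :: Sq -> [Sq]
quarter (Sq n) = replicate 4 (Sq (n `div` 2))

-- How many unit squares a square of size n ends up as.
unitCount :: Int -> Int
unitCount n = length (allSubSq quarter (Sq n))
```

With these definitions, unitCount 4 evaluates to 16: one square of size 4 becomes four of size 2, and each of those becomes four unit squares.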

What does our split function look like? Well, all it has to do is calculate the new squares as the picture defines above. It looks a little like this:

divide :: Double -> Square -> [Square]
divide eps parent =
  [ sq                    { tr = at, br = ah, bl = al } -- top left unchanged
  , (move sq (half,0))    { tl = at, bl = ah, br = ar } -- top right unchanged
  , (move sq (0,half))    { tr = ah, br = ab, tl = al } -- bottom left unchanged
  , (move sq (half,half)) { tl = ah, bl = ab, tr = ar } -- bottom right unchanged
  ]
  where
    half = size parent `div` 2
    sq = parent { size = half }
    at = averageTopHeight parent
    ah = averageHeight eps parent -- height of middle
    ab = averageBottomHeight parent
    ar = averageRightHeight parent
    al = averageLeftHeight parent

OK, this isn’t very exciting (and I’ve left out the boilerplate). But we have something now, it’s deterministic, but it creates cool results.

Woo. I used JuicyPixels to render the image. I really wish I’d found this library a long time ago, it’s fabulously simple to use and all I needed to do was use the sexy generateImage function.

So how do we actually generate something that looks vaguely natural?

The answer is randomness. Tightly controlled. Let’s look at our original square divider and make one really small change.

I’ll save you the trouble of finding it, it’s that pesky “e” we’ve added to the middle. What is e?

Well, it’s the stochastic approach. It’s a random number that’s assigned to displace the midpoint. When the square is big, the displacement is big. When the square is small, the displacement is small. In fact we simply define e as a random number in [-0.5, 0.5] scaled by the size of the square.
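As a rough sketch of that rule (and not the project's actual averageHeight), the midpoint height can be written as the plain average of the four corners plus a scaled random term. The names displacement and midpointHeight are hypothetical, and r is assumed to be a random number that the caller has already drawn from [-0.5, 0.5].

```haskell
-- r is assumed to be drawn from [-0.5, 0.5] by the caller.
-- The displacement shrinks as the squares get smaller.
displacement :: Double -> Int -> Double
displacement r sz = r * fromIntegral sz

-- Average of the four corner heights, nudged by the displacement.
midpointHeight :: Double -> Int -> (Double, Double, Double, Double) -> Double
midpointHeight r sz (a, b, c, d) = (a + b + c + d) / 4 + displacement r sz
```

For a flat square of size 8, midpointHeight 0.5 8 (0, 0, 0, 0) gives 4.0, while the same call with size 2 gives 1.0, so big squares get big bumps and small squares get small ones.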

What happens when we add this displacement is kind of cool. We now get a random surface that smooths itself out and almost looks natural.

I think this is pretty neat. It’s a smooth landscape that could easily look natural. We can do even better by giving a bit of color. I’ve done this using a simple color map as described on Stackoverflow.

Using a map generated from similar parameters, we get a much prettier colour. If you squint a bit, you can imagine it could be a natural scene, right?

All the code for this is available on my GitHub profile, the diamond-square project. Some fun extensions to this would be to create some animations, or actually render it in 3D with OpenGL.

Book review: Lead with Respect - A Novel of Lean Practice

2015-01-13T14:00:00.000-08:00

As I said in a previous post, I'm a sucker for a business novel. Lead with Respect is another business novel by father and son team Michael and Freddy Balle.

My goal in reading this was to get an idea of how lean management might apply to software development.

The story starts with Jane, the CEO of Southcape software, who is working on some software for a famed Lean company called Nexplas. As you might expect, things aren't going well. The software doesn't do what's required and milestones aren't being met. This sets the stage for the sensei/student relationship between Jane (CEO of a software company) and Andrew (VP of a manufacturing company). Throughout the book, Andrew imparts knowledge to Jane (and hopefully the reader too).

The core theme of the book is, as the title suggests, respect. Respect, in the lean sense, is much wider than the dictionary definition of respect. In Lean, respect means:

Engage everybody all the time in problem solving, together, by making every effort to understand each other's point of view.
Guarantee the quality, productivity and flexibility as we try to cut nonsatisfaction and nonvalue-added work.
Share success and reward involvement and initiative which makes our respect promise credible and sustains our long-term growth. Customer satisfaction simply can't happen without employee satisfaction.

"Lead with Respect" rails against a preconceived notion of Lean as grinding people into the ground, cutting costs and working them till they drop. I've never had this view of lean, but I can see how it would make excellent FUD for competing philosophies.

So what does a Lean manager actually do? Put simply:

Our job as managers is to create conditions where people can be successful at their job. And what that comes down to is working together to solve the problems we face.

This all sounds easy, right?

What are problems? The book ties problems down to continuous improvement through the familiar equation that a job is the sum of work and the continuous improvement that must be a part of employment. We should accept that continuous improvement is a part of the job. I don't think this is a difficult case to argue. In software engineering (and other knowledge work) if you aren't constantly learning, then you are falling behind. This differs

Lead with Respect argues that our job is to support people in this journey of continuous improvement. Continuous improvement is about change; change is scary! We should work with people to break larger challenges into smaller, every day steps. The link is made (again) with kaizen and standards, namely that you can't have continuous improvement without some standards.

To improve performance we have to improve processes. To improve processes we have to improve individual's competence and their ability to work with others.

In the book, Jane improves her team's performance using things that are familiar to most software engineers who've got any experience of agile: pair programming, test-driven development and listening to customers. None of this is surprising. Towards the end of the book, another tool for conversations is introduced in the form of A3 Problem Solving. This sounds like process nonsense, but the book does a good job of explaining that it's about the scientific method. By following a structured approach it provides a way to have structured conversations, which in turn makes it easier for others to understand the problem and potentially coach people to a solution. This is something that Mike Rother explores in his book, Toyota Kata, which is yet another item on my ever-growing reading list.

Was this book a good read? Well, it was enjoyable enough, the characters were believable at least. I'm not sure I got as much out of it as the earlier book and some of the discussions about software felt a bit unrealistic. The key themes definitely seem transferable to any discipline.

Book Review: The Lean Manager - A novel of lean transformation

2015-01-05T13:00:00.000-08:00

I've recently taken on a different role at work and as part of that I've tried to force myself to read as many books on management topics as possible.

After reading The Goal and The Phoenix Project, I realized that I'm a sucker for a business novel. After a bit of searching, I settled on the book by Freddy Balle and Michael Balle entitled The Lean Manager: A novel of lean transformation.

The book's setting is an automotive parts plant in France that is under threat of closure. Andy, the book's protagonist, agrees a deal with the CEO, Phil (also his mentor), that if the plant becomes competitive it will not close. It's a familiar setting from other books; I suppose the concept is that the best catalyst for change is adversity.

What did this book teach me?

Standardized work and kaizen are two sides of the same coin.

For a long time, I've resisted the idea that standardizing any work to do with programming is a good thing. The best teams I've worked in have always had implicit coding standards created by being a closely knit team unafraid to voice concerns when standards (even if they are only in the heads of a few) weren't met. The idea of explicitly setting coding standards (and I'm not talking about tabs vs. spaces, more in the style of 101 Coding Guidelines for C++) has always been an anathema for me. I think the reasons for this are simple; I used to think standards implied something external to the team influencing how they work.

Standardized work is about agreeing how the work should be done best, to better see the problems. Kaizen is about encouraging operators and frontline supervisors to solve all the problems that appear as gaps to the standard.

I realize this is talking about manufacturing, not software engineering, but the idea of defining a standard and viewing a gap to the standard as a problem is a powerful one. As a stupid example, let's say you define automated acceptance testing as standard for all new features, but fail to meet it. Why x 5? What can you learn from this that changes the way you develop software? By stating a standard and holding yourself to account you see the problems and force a conversation about it. Standardized work encourages problem solving (kaizen) by acting as a tool that allows you to have the right conversations.

Another key theme from the book is the idea of "go and see". The best way to learn is to go and see. This applies everywhere. Go and see (Genchi Genbutsu) teams, go and see customers. Go to the place where the work happens and magic will happen. Again, this sounds like a very simple thing (management by wandering around) but it's deceptively powerful when adopted as a deliberate technique (or at least, it is in the book!).

Visual Work Management is another tool in the Lean toolbox. Part of Go and See is being able to immediately recognize problems. We already have something like this in the software engineering industry with build status monitoring (Siren of Shame!). What else could we visualize? The advantage of the automotive industry is that the takt time is often short (if customers are demanding 10K units a week, the takt time is in minutes). In Software Engineering, our sprints are often weeks. It's difficult to know if things are going off the rails. Perhaps some elephant carpaccio is in order?

Does go and see translate to software engineering? Definitely for some parts, namely visiting customers and understanding their requirements (customers want holes, not drills). Does this apply at other times, such as when teams are writing code? I suspect it does; the only way to understand why teams are flying or struggling is to actually see them in action.

The last big theme from the book was that developing people is just as important as developing the product. The idea is simply that once *everyone* is contributing to product improvement and innovation then you've built yourself a significant advantage that is almost impossible to copy.

All in all, The Lean Manager was an enjoyable read. I'm not sure how many of the themes adapt perfectly to software engineering, but it's definitely food for thought!

How much time should you spend fixing bugs in legacy code?

2014-12-01T00:36:00.001-08:00

There's a huge amount written about dealing with greenfield code. You start with practices such as test-driven development, walking skeletons and thin vertical stripes of functionality. Legacy code is much harder. Given hundreds of thousands of lines of poorly structured code, where'd you start? Working Effectively with Legacy Code gives some great pointers; put seams in, get the tests in place and TDD the new feature work. I'm interested in the next level up, how do you balance feature work against bug fixing?

I've got an interesting problem. We've got a clump of legacy software that product management tell me needs new features, but we also know from support that the number of bugs is a worry. From my point of view as a development manager I want data that allows me to make the right decision and that requires evidence and understanding of the scale and scope of the problem.

It is impossible to find any domain in which humans outperformed crude extrapolation algorithms, less still sophisticated ones (Expert Political Judgement: How good is it? How can we know? [via How to Measure Anything: Finding the Value of Intangibles in Business])

I'd like to move from a faith-based to a science-based approach to balancing new features over bug count.

One field that provides some inspiration is population estimation. Given a small sample size, how do you estimate the total population?

Mark and recapture is a common method for population estimation. Capture 100 animals, tag them and release them. Repeat the process. The proportion of tagged animals in the second sample should be roughly the same as the proportion of tagged animals in the whole population. If we had no morals whatsoever, we could release an update to 1000 users and sample the number of bugs. We could then release the same update to another 1000 users and see how many bugs we see again.

This isn't a great way to do things, but it does give us a simple formula. If we use the same notation as Wikipedia, then

N is the total number of bugs
K is the number of bugs found by the first group
n is the number of bugs found by the second group
k is the number of bugs seen for a second time

This gives us a simple formula that we could use (N = Kn / k). For bugs in released products, it's even simpler. Since we can tally bugs against each other automatically, we can estimate N without doing anything too amoral. We can use the data from the latest release to arbitrarily divide the users in half, calculate how many bugs each side finds and count the number of duplicates.
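The arithmetic is small enough to sketch in a few lines. The function name estimateTotalBugs and the example numbers below are invented for illustration. The function returns Nothing when the two groups share no bugs, because the formula divides by k.

```haskell
-- Estimate of the total number of bugs, N = K * n / k.
-- bigK: bugs found by the first group
-- n:    bugs found by the second group
-- k:    bugs found by both groups (the duplicates)
-- Returns Nothing when there is no overlap, since the estimate is undefined.
estimateTotalBugs :: Int -> Int -> Int -> Maybe Double
estimateTotalBugs bigK n k
  | k <= 0    = Nothing
  | otherwise = Just (fromIntegral (bigK * n) / fromIntegral k)
```

For example, if the first half of the users report 40 distinct bugs, the second half report 30 and 12 are common to both, estimateTotalBugs 40 30 12 gives Just 100.0.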

After a bit of searching around, I found that this isn't a very novel application of the idea. "How many errors are left to find?" talks about this, but from the perspective of software testing (this seems to have generated some controversy in the response, "Another silly quantitative model").

There are a lot of caveats with model-based approaches like this (what exactly is a bug anyway?), but it's better than nothing.

The Goal - The Match Bowl Experiment

2014-09-23T23:38:00.003-07:00

Recently I've been re-reading The Goal by Eliyahu Goldratt. It's a great little book about manufacturing plants and how to manage them. It introduces the Theory of Constraints and that's relevant for all software developers for an understanding of why our development processes are structured the way they are.

Throughout the book it uses games and metaphors to illustrate faulty thinking about interconnected processes. In this post, I'd like to introduce Goldratt's dice game.

In the dice game there are a number of stations (representing part of a business process). The stations are arranged in a line with the output from one station becoming the input of the next. This arrangement represents a production line. In order to move items through the production line players take it in turn to roll a dice. The number rolled is the maximum you can move to the next station. For example, if you roll a six, but only have three items in your station, then you can only move three to the next station.

Let's imagine a really simple system with 8 stations that starts with 100 units in the left hand bowl with the aim of producing 100 units in the right hand bowl. Each bowl has the same capacity; it'll produce between 1-6 units each work step.

Based on the rules above, what's the flow of work going to look like through the system?

You might expect the flow to be smooth; each workstation has about 0 items at any one time because as soon as they are produced they move onto the next station. The reality is somewhat different.

The movie below shows the system processing a set of 100 units. The bottleneck (that is the work centre with the most items in it) is highlighted in yellow.

What's really interesting is the chaotic nature of the bottleneck. Random fluctuations mean that the bottleneck can appear anywhere. Balancing capacity across each item is clearly not the right answer.
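The rules are easy to simulate. The sketch below is not the code behind the movie; it's a minimal version under a few stated assumptions: every station rolls against the stock it held at the start of the round, the bowls are a plain list with the finished goods in the last position, and the dice come from a tiny home-made generator (the hypothetical nextSeed and rollsFrom) so that nothing outside base is needed.

```haskell
-- One round of the game. Each bowl except the last moves
-- min(roll, stock) items to its right-hand neighbour.
-- Assumes at least one roll per sending bowl.
step :: [Int] -> [Int] -> [Int]
step rolls stock = zipWith3 (\s out inc -> s - out + inc) stock outs (0 : outs)
  where
    moved = zipWith min rolls (take (length stock - 1) stock)
    outs  = moved ++ [0]            -- the final bowl never sends anything on

-- Home-made linear congruential generator (an assumption, not a library API).
nextSeed :: Int -> Int
nextSeed s = (s * 1103515245 + 12345) `mod` 2147483648

-- Infinite stream of rolls between 1 and 6.
rollsFrom :: Int -> [Int]
rollsFrom seed = map toRoll (drop 1 (iterate nextSeed seed))
  where
    toRoll s = (s `div` 65536) `mod` 6 + 1

-- Every state of the line, round by round, until all items
-- have reached the final bowl.
simulate :: Int -> [Int] -> [[Int]]
simulate seed start = go (rollsFrom seed) start
  where
    n = length start - 1
    go rs stock
      | sum (take n stock) == 0 = [stock]
      | otherwise               = stock : go (drop n rs) (step (take n rs) stock)
```

Here step [6, 6] [3, 0, 0] gives [0, 3, 0], matching the example of rolling a six with only three items to hand. Calling simulate 42 (100 : replicate 7 0) returns the states round by round; the units always add up to 100 and everything ends in the last bowl, but where the pile-ups appear in between depends on the rolls.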

All the diagrams in this page were built using the excellent Diagrams library for Haskell.

Dynamic Time Warping

2014-07-26T04:56:00.001-07:00

Dynamic Time Warping is nothing to do with the Rocky Horror show. It's a dynamic programming algorithm for aligning sequences of data that vary in terms of speed or time. A typical application is aligning fragments of speech for the purposes of speaker recognition.

In this post, we'll look at how simple the algorithm is and visualize some of the output you can get from aligning sequences. The complete code is on github and any flames, comments and critiques are most appreciated. You'll need the Haskell platform installed and a cabal install of the Codec.BMP package if you want to generate some images.

Given two vectors of some symbol a representing a time series (e.g. they both represent the output f(x) = y where x is some time, and y is an output signal), the algorithm produces as output an array describing the cost. The output array gives the "alignment factor" of the two sequences at different points in time.

We can find the best alignment path by simply walking back through the matrix from the top right, to the bottom left and taking the minimum choice at each turn. Using this we can visualize the best matching path for two exactly matching sequences. That's dead simple to code up:
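As a minimal sketch (list-based, and not necessarily how the code on github does it), the cumulative cost matrix can be built row by row: each cell is the local cost plus the cheapest of its left, upper and diagonal neighbours. The names nextRow, dtwMatrix, dtwCost and absDiff are chosen for illustration only, and the rows here start from the first element of each sequence, so the orientation may differ from the images.

```haskell
-- Build one row of the cumulative cost matrix from the row before it.
nextRow :: (a -> a -> Double) -> [a] -> [Double] -> a -> [Double]
nextRow cost ys prev x = case (ys, prev) of
    (y : ys', p : ps) ->
      let first = cost x y + p        -- first column can only come from above
      in  first : go first p ps ys'
    _ -> []
  where
    -- left: cell just computed; diag: previous row, previous column;
    -- up: previous row, same column.
    go left diag (up : ups) (y : rest) =
      let v = cost x y + minimum [left, diag, up]
      in  v : go v up ups rest
    go _ _ _ _ = []

-- Cumulative cost matrix, one row per element of the first sequence.
dtwMatrix :: (a -> a -> Double) -> [a] -> [a] -> [[Double]]
dtwMatrix _ [] _ = []
dtwMatrix cost (x : xs) ys = scanl (nextRow cost ys) firstRow xs
  where
    firstRow = scanl1 (+) (map (cost x) ys)  -- first row can only come from the left

-- Total alignment cost: the last cell of the matrix, if there is one.
dtwCost :: (a -> a -> Double) -> [a] -> [a] -> Maybe Double
dtwCost cost xs ys = case dtwMatrix cost xs ys of
  []   -> Nothing
  rows -> case last rows of
    []  -> Nothing
    row -> Just (last row)

-- Absolute difference, one possible cost function.
absDiff :: Double -> Double -> Double
absDiff x y = abs (x - y)
```

Matching a signal against itself, dtwMatrix absDiff [1,2,3] [1,2,3] gives [[0.0,1.0,3.0],[1.0,0.0,1.0],[3.0,1.0,0.0]]: the zeros run along the diagonal, which is the matching path you'd expect.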

Let's look at what happens if we try to match the signal against itself and highlight the matching path in white.

The colour demonstrates how well the signals match. Blue highlights the best match (i.e. least cost) and hotter colours (such as red) highlight the worst cost. This pattern matches simple intuition. Since the sequences are exactly aligned, we'd expect a path from the top right to the bottom left, and that's what we get.

What happens if we try to match two completely random signals of integers? First off, let's try with the measure of the cost function being the absolute difference between the values (e.g. the cost function passed in is simply cost x y = abs (x - y)).

Cool patterns. Does this make sense? I think it does. The best match is at the beginning, before the sequences have diverged. As time goes on the match always gets worse because the cumulative absolute difference between the sequences is continuously increasing (albeit randomly).

What if we try to match a sequence against its opposite? Let's visualize that:

That looks odd. What's the intuition describing the image here? The best match of these two signals occurs in the middle (since they are opposite), which feels like it explains the center structure. By the time we reach the end of the signal (the top right) we've got the worst possible match and hence the brightest colour.

This implementation of the algorithm isn't all that practical. It's an O(N^2) algorithm and thus isn't suitable for signals with a high number of samples. However, it's fun to play with!

If you want to find out more about an efficient implementation of dynamic time warping then Fast DTW is a great place to start. As someone who enjoys reading papers, it's fantastic to see the code behind it, quoting from the link:

FastDTW is implemented in Java. If the JVM heap size is not large enough for the cost matrix to fit into memory, the implementation will automatically switch to an on-disk cost matrix. Alternate approaches evaluated in the papers listed below are also implemented: Sakoe-Chiba Band, Abstraction, Piecewise Dynamic Time Warping (PDTW). This is the original/official implementation used in the experiments described in the papers below.

The Stable Marriage Problem

2014-07-14T00:27:00.000-07:00

It's been far too long since I wrote posts with any real code in, so in an attempt to get back into good habits I'm going to try to write a few more posts and read up a bit more about some algorithms and the history behind them.

The Stable Marriage Problem was originally described by David Gale and Lloyd Shapley in their 1962 paper, "College Admissions and the Stability of Marriage". They describe the problem as follows:

A certain community consists of n men and n women. Each person ranks those of the opposite sex in accordance with his or her preferences for a marriage partner. We seek a satisfactory way of marrying off all members of the community. We call a set of marriages unstable if under it there are a man and a woman who are not married to each other, but prefer each other to their actual mates.

Gale and Shapley show that for any pattern of preferences it's possible to find a stable set of marriages.

On its own, this doesn't sound very interesting. However, bringing together resources is an important economic principle. This work formed part of the puzzle of Cooperative Game Theory, and Shapley was jointly awarded the Nobel Prize for economics in 2012.

So how does the algorithm for Stable Marriages work?

Let's start by defining the problem. Given two lists of preferences, find the match such that there is no unstable match (that is two pairs that would cooperatively trade partners to make each other better off). The only constraint on the types is that they are equatable. This isn't the ideal representation (to put it mildly) in a strongly typed language (it doesn't enforce any invariants about the structure of the lists), but it's probably the simplest representation for explaining the algorithm.

stableMatch :: (Eq m, Eq w) => [(m,[w])] -> [(w,[m])] -> [(m,w)]

The algorithm continues whilst there are any unmarried men. If there are no unmarried men, then the algorithm terminates.

  stableMatch :: (Eq m, Eq w) => [(m,[w])] -> [(w,[m])] -> [(m,w)]
  stableMatch ms ws = stableMatch' []
    where       
      stableMatch' ps = case unmarried ms ps of
        Just unmarriedMan  -> stableMatch' (findMatch unmarriedMan ws ps)
        Nothing            -> ps

  unmarried :: Eq m => [(m,[w])] -> [(m,w)] -> Maybe (m,[w])
  unmarried ms ps = find (\(m,_) -> m `notElem` engagedMen) ms
    where
      engagedMen = map fst ps

If there is at least one unmarried man, then we need to find a match. We do this by proposing to each of his preferences in turn. If his first preference is not engaged, then we propose. Otherwise, if his potential partner is already engaged and would prefer him then this violates the stable marriage principle, so we break up the engagement and re-engage with our first choice.

  findMatch :: (Eq m,Eq w) => (m,[w]) -> [(w,[m])] -> [(m,w)] -> [(m,w)]
  findMatch (m,w:rest) ws ps = case isEngaged w ps of
      
    -- w is already engaged to m' - is there a better match?
    Just m' -> if prefers (getPrefs ws w) m m'
               then engage (breakup m' ps) m w
               else findMatch (m,rest) ws ps
                      
    -- can match with first choice
    Nothing -> engage ps m w
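The listing above leans on a handful of helpers that aren't shown. Here's a self-contained sketch with one plausible set of definitions: the bodies of isEngaged, getPrefs, prefers, engage and breakup are guesses made for illustration and may well differ from the full code, as are the example preference lists and the explicit error case for an exhausted preference list.

```haskell
import Data.List (elemIndex, find)
import Data.Maybe (fromMaybe)

stableMatch :: (Eq m, Eq w) => [(m, [w])] -> [(w, [m])] -> [(m, w)]
stableMatch ms ws = go []
  where
    go ps = case unmarried ms ps of
      Just man -> go (findMatch man ws ps)
      Nothing  -> ps

unmarried :: Eq m => [(m, [w])] -> [(m, w)] -> Maybe (m, [w])
unmarried ms ps = find (\(m, _) -> m `notElem` map fst ps) ms

findMatch :: (Eq m, Eq w) => (m, [w]) -> [(w, [m])] -> [(m, w)] -> [(m, w)]
findMatch (_, []) _ _ = error "ran out of preferences"  -- needs complete lists
findMatch (m, w : rest) ws ps = case isEngaged w ps of
  Just m' -> if prefers (getPrefs ws w) m m'
             then engage (breakup m' ps) m w
             else findMatch (m, rest) ws ps
  Nothing -> engage ps m w

-- Everything below is an assumed implementation of the missing helpers.

-- Who, if anyone, is w currently engaged to?
isEngaged :: Eq w => w -> [(m, w)] -> Maybe m
isEngaged w ps = fst <$> find ((== w) . snd) ps

-- The preference list for w (empty if she isn't listed).
getPrefs :: Eq w => [(w, [m])] -> w -> [m]
getPrefs ws w = fromMaybe [] (lookup w ws)

-- True when a appears before b in the preference list.
prefers :: Eq m => [m] -> m -> m -> Bool
prefers prefs a b = case (elemIndex a prefs, elemIndex b prefs) of
  (Just i, Just j)  -> i < j
  (Just _, Nothing) -> True
  _                 -> False

engage :: [(m, w)] -> m -> w -> [(m, w)]
engage ps m w = (m, w) : ps

breakup :: Eq m => m -> [(m, w)] -> [(m, w)]
breakup m = filter ((/= m) . fst)

-- Made-up example data: both men want X first, and X would rather have B.
exampleMen :: [(String, [String])]
exampleMen = [("A", ["X", "Y"]), ("B", ["X", "Y"])]

exampleWomen :: [(String, [String])]
exampleWomen = [("X", ["B", "A"]), ("Y", ["A", "B"])]
```

With this data, stableMatch exampleMen exampleWomen pairs A with Y and B with X: B bumps A from X because X ranks B higher, and A then settles for Y.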

You can see the full code at Stable Marriage Problem. As always flames, comments and critiques gratefully received.

Getting the most out of Extract Class

2014-06-12T01:54:00.001-07:00

Resharper is a wonderful tool. I can't imagine working in the horribleness of legacy code without it.

Every so often you come across a little workflow that makes slicing and dicing code easier. For example, before you could "Move Instance Method" you could "Make Static", "Move Method" and "Make instance method". Knowing you could do this made tearing code apart easier.

Recently I've been using "Extract class" a million and one times to deal with one of those 10K line long classes that no-one ever admits to having. The classes in question look like this:

class DoesEverythingAndThenSome {

       private ThingToDoWithA1 m_ThingToDoWithA1;
       private ThingToDoWithA2 m_ThingToDoWithA2;
       private ThingToDoWithA3 m_ThingToDoWithA3;

       private ThingToDoWithB1 m_ThingToDoWithB1;
       private ThingToDoWithB2 m_ThingToDoWithB2;
       private ThingToDoWithB3 m_ThingToDoWithB3;

       // repeat for thousands of other "things"

       public void DO_ALL_THE_THINGS_WITH_A () {

       }
   
       public void DO_ALL_THE_THINGS_WITH_B () {

       }

       // thousands of lines of random shit
       public void example_of_random_shit () {
          if (incrediblyComplicatedCondition()) {
             foreach (var apples in bananas) {
               m_ThingToDoWithA1.SomethingImportant();
             }
          } else if (auberginesAreLumpy()) {
             m_ThingToDoWithB1.SomethingElse();
          }
       }
    }

Obviously I want to extract out the responsibilities to do with A and B into separate classes. Using "Extract Class" directly doesn't work because I can't pull out the references in example_of_random_shit without making properties public and introducing back references from the extracted class to the parent class.

The simple refactoring is to just extract out each line to do with each field into a single method.

class DoesEverythingAndThenSome {

       /* the rest is the same as above */

       public void SomethingToDoWithA() {
           m_ThingToDoWithA1.SomethingImportant();
       }
 
       public void SomethingToDoWithB() {
           m_ThingToDoWithB1.SomethingElse();
       }

       // thousands of lines of random shit
       public void example_of_random_shit () {
          if (incrediblyComplicatedCondition()) {
             foreach (var apples in bananas) {
                 SomethingToDoWithA();
             }
          } else if (auberginesAreLumpy()) {
             SomethingToDoWithB();
          }
       }
    }

Once I've completed this simple refactoring, "Extract Class" can now do the heavy lifting and I can move all the fields, and all the functions across in a single refactoring. What was previously hard (unpicking the back references) is now incredibly simple and I end up with a mechanical transformation to get data and functions in the right place. Extract class will now give me:

class ThingsToDoWithA {
       private ThingToDoWithA1 m_ThingToDoWithA1;
       private ThingToDoWithA2 m_ThingToDoWithA2;
       private ThingToDoWithA3 m_ThingToDoWithA3;

       // snip constructor

       public void DO_ALL_THE_THINGS_WITH_A () {

       }

       public void SomethingToDoWithA() {
           m_ThingToDoWithA1.SomethingImportant();
       }
    }

    class DoesEverythingAndThenSome {
       
       private ThingsToDoWithA m_ThingsToDoWithA;

       private ThingToDoWithB1 m_ThingToDoWithB1;
       private ThingToDoWithB2 m_ThingToDoWithB2;
       private ThingToDoWithB3 m_ThingToDoWithB3;

       // snip constructor

       // repeat for thousands of other "things"
   
       public void DO_ALL_THE_THINGS_WITH_B () {

       }

       // thousands of lines of random shit
       public void example_of_random_shit () {
          if (incrediblyComplicatedCondition()) {
             foreach (var apples in bananas) {
               m_ThingsToDoWithA.SomethingToDoWithA();
             }
          } else if (auberginesAreLumpy()) {
             SomethingToDoWithB();
          }
       }
    }

Repeat this mechanically for all the other responsibilities. Then the hard work begins of actually working out sensible names...

How does it feel to give a terrible conference talk?

2014-04-11T04:05:00.003-07:00

Have you been to a conference and sat through an awful presentation and wondered just how the hell someone got there? Me too!

Recently I attended the ACCU conference in Bristol and got to experience what it feels like to deliver something that went down like a lead balloon. One evening many moons ago, I thought I'd send in a proposal. By some small miracle I got accepted and was all set to run a 90 minute introduction to Haskell.

I'd already run through the workshop once at a local user group. The material isn't amazing, but I was confident in delivering it and thought it offered people a chance to get a taste of Haskell and programming with functions.

Then the problems started. It's ACCU. It's full of clever people, therefore I should level-up the material and assume more knowledge. Right? I should make it more hands-on, more interactive and better in every way.

I prepared hard. I updated the slides. I added more and more. I wrote notes, I dug out references and I was confident it would kick-ass.

And then the day arrived.

90 minutes seems like a long time. It isn't. I spent a good 15 minutes ensuring that everyone could run "hello world". Very rapidly 90 minutes became 60 minutes.

Then my cleverness got the better of me. The Curry-Howard isomorphism is fascinating, but perhaps it's not the best subject matter within the first 30 minutes of any presentation. Trying to explain it under pressure with questions from an audience eager to learn makes it even worse. I probably lost another 20 minutes trying and failing to explain that const :: a -> b -> a only has one valid implementation in Haskell. And what the hell are the poor attendees going to do with this information? GAH!

And so it continued. On to writing some code. I'd wanted to make it easier to compose higher order functions to produce results, so I'd made the initial data structures in the exercises a bit more complicated than those I'd shown in the example slides. Big mistake. This made it much harder for people to grok the syntax; I'd shown simple syntax but not given enough direction. 30 minutes rapidly disappeared and I'm now *way* behind schedule.

At this point, I'd already realized the situation was going Pete Tong. But what'd you do? You can't just down tools and walk out the room (well, I suppose you could, but that'd be worse), so you just have to knuckle down and carry on. And carry on I did, through more examples (well over-egged) and then onto the Universality of Fold (brain, what the hell are you thinking?!).

With 5 minutes left, there's plenty of time to throw a demo of QuickCheck in, right? But then, I realized I'm in an Emacs buffer. How'd I increase the font-size so people can read it? GNARGH!! It's over to Notepad and bump the fonts up in that. "Should have used vi!" went the audience. ARGH!

And then the buzzer sounds (well, not really, but it's time to go). Bring things to a halt and escape to a corner of the building. I can't imagine that was particularly fun for the participants. A few people kept up (hurrah!) and there were a couple of positive things said, but I knew it'd gone wrong and boy, that doesn't feel good.

So, at least now I know how it feels (bad, very bad) and I also learnt an important lesson. Keep the message simple! Focus on the single takeaway you want participants to have. I wanted people to leave knowing that Haskell isn't impenetrable and looking at how far you can get just by reading type signatures. However, I lost this in a noise of other random related things and tried (and failed) to communicate a million and one other features.

KISS!

Agile - What Next?

2014-04-09T00:18:00.002-07:00

I'm at ACCU at the moment, and instead of preparing my talk on Haskell for Thursday, I'm writing up my notes from Bob Martin's talk on agile yesterday.

Agile was originally founded by a bunch of programmers over a decade ago. The aim (from Kent Beck) was to devise a system that eliminated the trust divide between programmers and managers (them and us). Transparency was the aim of the game. Programmers would record velocity using story points. Managers would track number of story points per sprint and produce burn-down charts. Everyone is happy.

Unfortunately, burndown and velocity charts track only one part of software development: features. There's a hidden part of software development that isn't captured by these charts: the ability to change. If there's one thing for certain in software development it's that people will change their mind and features will need to adapt. It's no good having your software with the correct features today, if it can't have the correct features tomorrow. Arguably, a code base's ability to respond to change is the primary responsibility of the developers.

In the original light-weight process, XP, this was kept in check by Ron Jeffries' concentric circles.

This, again, is part of transparency and trust. At the inner-level, TDD, pair-programming and simple design keep the software honest. A suite of tests gives transparency on the system functionality. Moving further out we reinforce these practices with collective ownership (transparency again, no siloed development). And so on, and so forth.

Fast-forward a decade or so, and where are we now? Agile is the domain of the manager. There are no developers at agile conferences any more, it's all about the secondary value of a software product (shipping features) rather than the primary ability (reacting to change).

The XP Practices have been forgotten. Scrum empowers teams to take ownership of their practices and opt out of ones that don't work. Of course, it's easier (in the short run!) to forget about TDD, simple design and refactoring. However, in the long run productivity grinds to a halt (see Design Stamina Hypothesis).

Bob argues (The Corruption of Agile) that agile doesn't exist without the practices that support it. I agree; most agile teams aren't agile in their ability to react to change. Martin Fowler has a term for it, "Flaccid Scrum", where we adopt the project management side of it, but not the underlying practices for ensuring that the code base becomes malleable and responsive to change.

With all this in mind, the trust issues have reemerged. Dropping the velocity (number of story points per sprint) is a bad idea, so developers have rebelled. Let's just make the stories smaller. The points counted are the same, but the size of the stories is much smaller. Teams are wading through custard, developing features just as slowly as ever.

The thrust against this has come in the form of "software craftsmanship". This is trying to reimagine the circles from the inside out, but it's failed. It's failed because it doesn't attempt to bridge the divide between the managers and the coders. It might help the engineers to "do the right thing" more often, but it doesn't show transparency.

And the talk ended there, no answers for the future and a little depressing. I've definitely seen the scenarios Bob describes, but what's the solution? It's probably not "kill all the project managers" as someone suggested. I'd love to make the "ability to change" a tangible concept that teams can explore and understand. It's not an easily measured property, but I think taking data-driven decisions about code is part of the answer. Project managers need options to meet business constraints. Sometimes it's OK to go quick and dirty, to spike a feature that may not live longer than a week, but you have to accept that recovering from that burst of activity has a remedial cost, and understand what that cost is.

Right, now to finish off a few slides for this Haskell thing.

The First International Conference on Software Archaeology

2014-02-01T13:22:00.001-08:00

I recently attended The First International Conference on Software Archaeology, much more memorably shortened to #ticosa.

It was a slightly strange conference, in that it was never particularly clear what software archaeology was, but that was a good thing as it gave a great variety of talks encompassing everything from metrics, to tools for understanding, to philosophical thoughts on the architecture of information.

Process Echoes in Code

Michael Feathers opened the proceedings with a question, what's the real point of version control systems? The most common answer is that VCS systems help you roll back to previous revisions should something go wrong, or support multiple different product lines. The truth is this doesn't really happen. If your team deploys something to production that goes wrong, then I imagine you'll revert the deploy (not the VCS) and simply deploy again. The real purpose of source control is providing change logging. By looking at those changes we can see the traces of the way we work that are indelibly written in the version control system.

Michael demonstrated a tool (delta-flora) to explore the traces left in the source code. The tool was a simple Ruby program that mapped the Git commit history (SHA1, files changed, author, code diff) into Method event objects (methods added, changed and deleted). This is a simple transformation, but one that seems to yield a vast amount of useful information.
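To make the idea concrete for myself, here is a minimal Haskell sketch of that kind of transformation. It is not how delta-flora works internally (that was Ruby, and I haven't reproduced it); the Snapshot type, the event names and methodEvents are all my own assumptions, and a real tool would need to parse the source to find the methods in the first place.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical model: a snapshot maps a method name to its body text.
type Snapshot = Map.Map String String

-- Event names are assumptions, not taken from the tool itself.
data MethodEvent
  = Added String
  | Changed String
  | Deleted String
  deriving (Eq, Show)

-- Compare the snapshot before a commit with the snapshot after it.
methodEvents :: Snapshot -> Snapshot -> [MethodEvent]
methodEvents before after =
  [Added name | name <- Map.keys (Map.difference after before)]
    ++ [Changed name | name <- Map.keys changed]
    ++ [Deleted name | name <- Map.keys (Map.difference before after)]
  where
    -- Methods present in both snapshots whose bodies differ.
    changed = Map.filter id (Map.intersectionWith (/=) before after)
```

Run over every pair of consecutive commits, this gives a stream of events per method that can then be grouped by class, author or date.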

Exploring the temporal correlation of class changes seems like an incredibly useful way of identifying an area of related objects. I'm working on a large, badly understood code-base. We're already finding that adding features requires touching multiple files. By mining information from the past, maybe we can make more educated decisions in the future?
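A rough sketch of what mining that correlation could look like, assuming each commit has already been reduced to the list of files it touched (the Commit type and coChangeCounts are names I've made up for illustration):

```haskell
import Data.List (nub, sort)
import qualified Data.Map.Strict as Map

-- Hypothetical model: a commit is just the list of files it touched.
type Commit = [FilePath]

-- Count how often each unordered pair of files changed in the same commit.
coChangeCounts :: [Commit] -> Map.Map (FilePath, FilePath) Int
coChangeCounts commits =
  Map.fromListWith (+)
    [ ((a, b), 1)
    | files <- commits
    , let sorted = sort (nub files)       -- ignore duplicates, fix the pair order
    , (i, a) <- zip [0 :: Int ..] sorted
    , b <- drop (i + 1) sorted            -- only pair a with files after it
    ]
```

Pairs with a high count are candidates for "these things really belong together", which is exactly the sort of hint I'd want before touching an unfamiliar area.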

Another area Michael mentioned that sent my synapses firing was analysing classes by closure date. Even if you have a huge code-base, identifying the closed classes (those that haven't changed) helps reduce the surface area you have to understand. One particular graph he showed (graphing the set of active classes against the open classes) was particularly interesting.

I'd love to plot this on a real code-base, but my understanding is that whilst you've got open classes, chances are you haven't finished a feature and the code-base is in an unstable phase. Looking forward to trying this one out.

Are you a lost raider in the code of doom?

Daniel Brolund followed with a quick overview of the Mikado Method. The Mikado Method provides a pragmatic way of dealing with a big ball of mud. We've probably all experienced the "shockwave" refactoring (or refucktoring?) where we've attempted to make a change, only to find that change requires another change, then another and before you know it you have a change set with 500 files in and little or no confidence that anything works.

The Mikado Method helps you tackle problems like this by recognizing that doing things wrong and reverting is not a no-op. You've gained knowledge. Briefly the method seems to consist of trying the simplest possible thing, using the compiler and more to find pre-requisites (e.g. If only that class was in a separate package...). By repeatedly finding the dependent refactorings you can arrange a safe set of refactorings to tackle larger problems.

I completely agree with this approach. Big bang refactorings on branches are no longer (if they ever were!) acceptable ways to work. Successful refactoring keeps you compiling and keeps you working in the smallest possible batch size. I liked the observation that the pre-requisites form a graph; before I've worked in pairs where we've kept a stack of refactorings (the Yak stack?) but it's an interesting observation that sometimes it's a graph.
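If the pre-requisites form a graph, then a safe order to do the work in is just a topological sort of it. Here's a small sketch using Data.Graph from the containers package; the refactoring names are invented and safeOrder is my own helper, not part of the Mikado Method itself. It assumes the graph has no cycles.

```haskell
import Data.Graph (graphFromEdges, topSort)

-- Each entry is (change, its pre-requisites). All names are made up.
mikadoGraph :: [(String, [String])]
mikadoGraph =
  [ ("Replace database layer", ["Extract repository interface", "Move Customer to own package"])
  , ("Extract repository interface", ["Move Customer to own package"])
  , ("Move Customer to own package", [])
  ]

-- Order the changes so every pre-requisite comes before the change that needs it.
-- topSort puts a node before the nodes it points at, so we reverse the result.
safeOrder :: [(String, [String])] -> [String]
safeOrder deps = reverse [name | v <- topSort graph, let (name, _, _) = fromVertex v]
  where
    (graph, fromVertex, _) =
      graphFromEdges [(name, name, prereqs) | (name, prereqs) <- deps]
```

For the example graph, safeOrder mikadoGraph starts with "Move Customer to own package" and finishes with "Replace database layer", which is the leaves-first order the method encourages.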

How much should I refactor?

Matt Wynne gave a great metaphor for keeping code clean. If you imagine that software engineers are chefs and their output is meals, then the code base is the kitchen. What does your kitchen look like?

Matt had an exemplar code base (Cucumber rewrite), created as greenfield code, test-first, small-team, small commits and no commercial pressures. By analysing commits, a rough and ready guess was that 75% of commits were pure refactoring.

In answer to the question, how much should I refactor? The answer is simple.

More than you currently do.

Code Metrics

Keith Braithwaite gave us a talk about metrics and in particular the dangers of not knowing what you are doing.

He gave some examples from earlier analysis that (allegedly) demonstrated that TDD exhibited bigger methods than test last. This doesn't fit our intuition and indeed analysing the results showed that they based the results on the mean. If we plot method length distribution, we'd find it's not a normal distribution but a power-law distribution. Doing a more statistically sound analysis actually gives the opposite results.

The moral of the story for me was that reducing a data set without knowing what you are doing is very dangerous!
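A tiny illustration of why the mean misleads on a skewed distribution. The numbers are invented by me, not taken from Keith's talk: most methods are short and one is enormous.

```haskell
import Data.List (sort)

mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

-- Median of a non-empty list.
median :: [Double] -> Double
median xs
  | even n = (sorted !! (mid - 1) + sorted !! mid) / 2
  | otherwise = sorted !! mid
  where
    sorted = sort xs
    n = length xs
    mid = n `div` 2

-- Made-up method lengths: mostly short, with one very long method.
methodLengths :: [Double]
methodLengths = [2, 3, 3, 4, 5, 5, 6, 200]
```

Here mean methodLengths is 28.5 while median methodLengths is 4.5. One outlier drags the mean far away from what a typical method looks like, so comparing two code bases by mean method length says very little.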

Visualizing Project History

Dmitry Kandalov showed us an amazing analysis of a number of open source projects by mining the version control history (see here). This was the highlight of the conference for me, seeing interactive history of real code bases. Neat!

I really enjoyed seeing the way Scala and Clojure have evolved. Scala has progressively added more complexity and more code. Clojure however, has stabilised. Draw from that what you will!

Tools for Software Business Intelligence

Stephane Ducasse gave us an overview of some of the tools he used for software business intelligence. There was a call to action that we need dedicated tools for understanding code bases and I couldn't agree more with that. There were many interesting links:

Understanding Historical Design Decisions

Stuart Curran gave a presentation on "Understanding Historical Design Decisions". Stuart's perspective was very different as he comes from an information architecture / design background and didn't consider himself a programmer.

Some books to add to my ever-growing reading list:

Confronting Complexity

Robert Smallshire gave a talk on Confronting Complexity and returned us back to metrics (see also notes from Software Architect 2013).

We started by analysing how to calculate cyclomatic complexity. One interesting observation was that cyclomatic complexity gives us a minimum bound on the number of tests we need to get code coverage. If you follow this through, then if you add a conditional statement once every fifth line then every five lines of code you write demands another test. Ouch.
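As a back-of-the-envelope version of that calculation: cyclomatic complexity for a single routine is one plus the number of decision points. The sketch below is deliberately crude and is my own approximation, not what was shown in the talk. It counts whitespace-separated branching tokens rather than parsing the code, so something like "if(" with no space would be missed, and the token list is an assumption for a C-like language.

```haskell
-- Tokens treated as decision points (an assumed list for C-like code).
decisionTokens :: [String]
decisionTokens = ["if", "for", "while", "case", "&&", "||", "catch"]

-- Rough estimate: one plus the number of branching tokens in the source.
cyclomaticEstimate :: String -> Int
cyclomaticEstimate source =
  1 + length (filter (`elem` decisionTokens) (words source))
```

For example, cyclomaticEstimate "if (a && b) { x(); } else if (c) { y(); }" gives 4: two ifs and one && on top of the single straight-through path.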

We looked at a simpler proxy for code complexity, Whitespace Integrated over Lines of Text (WILT). This is a really simple measure and incredibly quick to calculate so it lends itself to visualizing code data quickly.
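It's simple enough that a first approximation fits in a few lines. This is my reading of the idea (sum the leading whitespace of every non-blank line); the exact definition used in the talk may differ, for instance in how tabs are treated or whether the result is normalised, so the tabWidth parameter is an assumption.

```haskell
import Data.Char (isSpace)

-- Width of the leading whitespace of one line, counting a tab as tabWidth columns.
indentWidth :: Int -> String -> Int
indentWidth tabWidth = sum . map width . takeWhile isSpace
  where
    width '\t' = tabWidth
    width _ = 1

-- Sum the indentation over all non-blank lines of a file.
wilt :: Int -> String -> Int
wilt tabWidth =
  sum . map (indentWidth tabWidth) . filter (not . all isSpace) . lines
```

Deeply nested code is deeply indented code, so a file whose wilt value is large relative to its line count is a reasonable place to start looking.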

There was a really good quote attributed to Rob Galanakis (technical director at Eve Online):

How many if statements does it take to add a feature?

Again, this comes back to one of the recurring themes of the conference, Bertrand Meyer's open-closed principle. One of my takeaways from this was to pay much more attention to OCP!

Rob mentioned that Refactoring Reduces Complexity and gave the example of "Replace switch with polymorphism". I'd agree with this for the most part, but there are exceptions. Rename for example preserves code complexity, but increases code comprehensibility: the two don't always align. It'd be interesting to hook in a plugin to refactoring tools to calculate WILT before and after refactorings and report on the cumulative benefits.

Rob finished off by presenting an alternative model-driven approach to software engineering. The visualizations were neat and helped show the range of possibilities. That immediately seems like an improvement over other models such as COCOMO. Interestingly, going back to COCOMO shows that developer half-life isn't considered in the model, nor is complexity of the code produced (I guess the assumption is that complexity of the product => complexity of the code?).

Lightning Talks

Finally, we ended up with a set of lightning talks. Nat Pryce gave a quick demo of using neo4j to analyse a heap dump. Graph databases are cool!

Ivan Moore gave a few opinions on how you can protect your software for archaeologists from the future.

Ship your source with your product
Put your documentation in source control
Put your dependencies in source control (reminded me of nuget package restore considered harmful)
Make sure you put instructions to build the product in source control (chef!)

There was a presentation towards the end that showed how adding sound to a running program (initially for the purposes of accessibility) produced some interesting effects. I've done this kind of thing before (creating animations for log files). Sometimes you can just rely on your brain to find the interesting things when you present it in another way.

Conclusion

TICOSA was a great conference. There was a good line up of speakers and lots of interesting content to muse over. What would I like to see next year? I'd really like to hear more war stories. I'd love to hear stories of archaeological digs. I'd especially love to hear about restorations. My general impression is that very few code bases start a restore process and come out better at the end (usually you hear about the big rewrite and sometimes those fail too), but I'd love to hear otherwise!

I'm looking forward to getting back to work on Monday and scraping through the commit logs to see what I can uncover!

The Trouble with Scrum

2013-11-01T02:18:00.000-07:00

Scrum. An iterative and incremental agile software development framework. It’s full of buzzwords. It frees us from the tyranny of waterfall development (not to say that ever existed anywhere anyway). It’s based on the premise that the customer doesn't know what they want; we iterate quickly and deliver every sprint. We communicate with the customer often, inspect and adapt, and we build what the customer wants.

Sorted. We fired the silver-bullet and scored a direct hit.

Well no. Some things don't fit into our nicely delineated sprints. Where does user experience fit into a two-week sprint? Where does architecture fit? Where does overall product quality fit in?

Stories that we can’t estimate are known as epics. According to Scrum terminology, there’s nothing intrinsically wrong with epics, as long as they aren't high priority (!). We can’t directly work on epics. We can't put a story like "Fix User Experience" on the board. Project managers would go insane, blood would be spilt.

So what do we do? Well, if your experience is anything like mine then we try to break a story up into smaller bits. Perhaps we break “Great User Experience” into pithy little tasks such as "Move Button" and "Improve Dialog". Maybe the great architecture revisiting is broken into a small prod into the right direction. "Extract class for FooBar" or "Break X and Y Dependency".

Do these small tasks make sure that we get the best user experience? Do these tasks make sure that we've got an architecture to support the needs of the code over the next few months?

Of course not.

How do we make sure that cross-cutting concerns like user experience, quality and architecture are given adequate attention in an iterative development environment? I’m not sure I have the answer (I'm not sure that anyone does), but I have a suggestion.

A clearly communicated vision.

It doesn't matter whether it's user experience, product quality or software architecture. A clearly communicated vision gives you a tool for making the right decisions as you build software.

Am I suggesting the dreaded "Big Design Up Front"? No! It doesn't need all the minutiae, just enough to navigate in the right direction. You might say, Just Enough Design Upfront.

Not all names are created equal

2013-10-30T01:24:00.000-07:00

I think everyone agrees that naming things is one of the hardest things you can do? Books like Clean Code devote whole chapters to naming. Names should convey meaning so that the next person reading the code has an easier job understanding what it does. After all, we read code far more than we write it. It's definitely OK to spend some time arguing about the right name. It's important.

So that's it. Names are important. Job done? Of course not! There's more to the story than that.

At Agile Cambridge 2013, I attended a session (Unpicking the Haystack) where the source code was only available from decompiled byte code (some sad story involving not using version control, not backing up and all the things that no-one ever does). Our task was to recover what the original program actually did. When we're looking at decompiled code almost all the naming information has gone. By the time you've gone source code to binary to source code you've lost variable names. Unsurprisingly, trying to decipher what the code does with local variables such as a1 to a999 is very hard.

With variable names gone, we have to look for other clues for programmer intent. So what else is there? Well, it certainly helps that public methods aren't lost. In this respect, method names are more important to get right than variable names. The naming is stickier. But something far more important gives us even more clues about this mystery code base.

Enter types. Decompilation reveals the names of public types. Names of types can show much more information than the variable name. For example, string s reveals little, whereas URL s reveals much more. If we're disciplined followers of domain-driven design then our types align with the problem we are solving. I'd say that types sit right at the top of the most-important-things-to-name-correctly hierarchy.
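Here's a small sketch of the same point in Haskell. The Url type, parseUrl and isSecure are hypothetical names of my own, and the "parsing" is only a prefix check to keep the example short.

```haskell
import Data.List (isPrefixOf)

-- Wrapping the string gives the value a name that survives wherever it travels.
newtype Url = Url String
  deriving (Eq, Show)

-- Hypothetical smart constructor: only strings that look like web addresses get in.
parseUrl :: String -> Maybe Url
parseUrl s
  | any (`isPrefixOf` s) ["http://", "https://"] = Just (Url s)
  | otherwise = Nothing

-- The one-letter variable name costs nothing here: the type already says what s holds.
isSecure :: Url -> Bool
isSecure (Url s) = "https://" `isPrefixOf` s
```

Even if every local name were mangled to a1 to a999, a signature like Url -> Bool would still tell the reader most of what they need.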

In this view of decompiled code, some names are more important than others. Parameter names and local variables are least important, whereas type names are the most important (with methods a close second).

Coming at names from decompiled source is certainly a weird way to do it, but this seems to fit with Bob Martin's definition of name length.

The length of a variable name should be proportional to its scope. The length of a function or class name is the inverse.
— Uncle Bob Martin (@unclebobmartin) February 20, 2011

I'd like to try to reinforce the view that types are by far the most important thing to get right. Crisply named abstractions matter more than almost anything else. To explore this area, we'll look at a strongly-typed static language, Haskell, and explore just enough syntax to understand its types. But first...

What is a type? A type is a label that describes properties of all objects that are instances of this type. If you see string in C#, you know you are getting an immutable set of characters with certain methods available. If you see an AbstractSingletonFactoryVisitorBean then you know you've got problems. I'm kidding.

Anyway, back to sensible types. Types describe program behaviour. Don't believe me? Let's begin our detour into Haskell:



-- Whenever you see "::" read it as "is of type"
-- A name starting with a capital letter is a type
-- add5 takes an Int and returns an Int
add5 :: Int -> Int
add5 x = x + 5

-- Parameters are separated by ->
-- For the purposes of this, let's just say the last one is the return
-- type and the rest are the arguments
-- add takes two Ints and returns an Int
add :: Int -> Int -> Int
add x y = x + y

-- Generics are represented with lower case parameters
-- middle takes three generic parameters (a, b, c) and returns a b
middle :: a -> b -> c -> b
middle x y z = y

Let's look at that last one again. middle :: a -> b -> c -> b. From the name we might guess that it returns the middle argument (e.g. middle 1 2 3 returns 2). Is there any other definition of what the function could do? In Haskell, there's no such thing as type-casting. If all I know is that something could be any type, there aren't many options. I can't add anything to it. I can't convert it to a string. In fact, I can't do anything with it other than return it. The types don't let me. Types constrain the implementation choices to a more sensible subset.

Do the names matter? We know that the argument x has type a. Is there any more descriptive name? Probably not, from the type we have no idea what properties hold for the types so a long descriptive name is just wasting space. For all we know, the argument could be a function. Or it could be a monad. What are you going to call it?

Is the method name important? It's definitely nice to have a good name, but is it essential? If I gave you quux :: a -> b -> a I'm betting you could tell me what it does?

In fact, armed with just a little knowledge about types you can start to infer what functions do without even needing to see their definition. Here's a few random functions with really poor names; what do they do?



bananaFactory :: a -> a



-- (a,b) is a tuple of two elements of type a and type b

spannerBlender :: (a,b) -> a 



-- (a -> b) is a function taking anything of type a and returning type b

-- [a] is a list of items of type a

omgWTF :: (a -> b) -> [a] -> [b]



-- "Num a =>" says a must be an instance of the Num typeclass
-- think of this as specifying an interface
-- boing takes two numbers of the same type and returns a number
boing :: (Num a) => a -> a -> a



-- m is a type constructor that takes an argument of any type a

mindBlown :: (a -> b) -> m a -> m b

Armed with this basic knowledge of reading Haskell type signatures, you're now equipped to use Hoogle. You can search for the type signatures given above (a -> a, (a,b) -> a, (a -> b) -> [a] -> [b] and (a -> b) -> m a -> m b) and get a good idea of what these functions do.
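For the impatient, here are the definitions I'd guess at from the types alone. Two caveats, both my own: the type of omgWTF doesn't force a single implementation (returning an empty list would also type-check), so map is just the obvious candidate, and mindBlown only becomes implementable once you add a Functor constraint, which I've assumed below. boing is left out because its type fits addition, multiplication and plenty of other things.

```haskell
-- Nothing is known about a, so the only thing to do is hand it back (Prelude: id).
bananaFactory :: a -> a
bananaFactory x = x

-- Only the first component has the right type to return (Prelude: fst).
spannerBlender :: (a, b) -> a
spannerBlender (x, _) = x

-- The obvious candidate: apply the function to every element (Prelude: map).
omgWTF :: (a -> b) -> [a] -> [b]
omgWTF = map

-- Assumption: m is a Functor. With that constraint this is fmap.
mindBlown :: Functor m => (a -> b) -> m a -> m b
mindBlown = fmap

-- The earlier example: the second argument cannot be used (Prelude: const).
quux :: a -> b -> a
quux x _ = x
```

The names on the left are terrible and it barely matters; the signatures did the talking.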

So that's why I think long variable names are less common in functional programming. It's because the languages are terser (Uncle Bob's rule still applies) and because the type signature gives you the power of reasoning, not the variable names.

@kevfromireland @hairybreeches variable names matter much less in FP because types. #controversialopinion
— Jeff Foster (@fffej) October 25, 2013

Names are important; but not all names are equally important.

Software Architect 2013 Day #2

2013-10-10T15:26:00.000-07:00

What's wrong with current software architecture methods and 10 principles for improvement

Tom Gilb showed us a heck of a lot of slides and tried to convince us that we must take architecture seriously. I don't disagree with this, our industry could definitely do with a bit more rigour. Tom was very forthright in his views, and I appreciated his candour.

The system should be scalable, easy customizable and have a great user interface

That's a typical "design constraint" that we're probably all guilty of saying. This is nothing more than architectural poetry (putting it politely) or complete and utter bullshit. In order to take architecture seriously we should measure. Architecture is responsible for the values of the system. We should know these values and be able to measure them. If a given architecture isn't living up to these values, we should replace it with something that does. Architecture exists solely to satisfy the requirements.

Real architecture has multi-dimensional objectives and clear constraints, and estimates the effects of changes. Pseudo-architecture has no dedication to objectives and constraints, no ideas of the effects and no sense of the relationship between the architecture and the requirements.

If we're going to take architecture seriously, then we need to start treating it as engineering. We must understand the relationship between our architecture and the requirements of the system. We must demonstrate that our architecture works.

And then the wheels came off.

I don't work with huge systems, but I can clearly see that understanding the relationship between an architecture and the requirements is a good thing. Unfortunately, Tom presented examples from a domain that was unfamiliar to me (300 million dollar projects). In the examples, incredibly accurate percentages were shown (302%). At that point, I lost the thread. Estimates are just that, and if experience has taught me anything, it's that estimates have HUGE error bars. I didn't really see how all that planning up front led to a more measurable design. I've got a copy of Tom's book, Competitive Engineering so hopefully I can fill in the blanks.

Building on SOLID foundations

Nat Pryce and Steve Freeman gave a thought-provoking presentation entitled "Building on SOLID foundations" which explored the gap between low-level detail and high-level abstractions.

At the lowest level we have guidelines for clean code, such as the SOLID principles. At this level, it's all about the objects, but not about how they collaborate and become assembled into a functioning system. Even with SOLID principles applied, macro level problems occur (somehow all related to food metaphors), colourfully referred to as "ravioli code". Individual blocks are well organized, but as a whole it still looks like a mess. "Filo code" is code that's got so many layers you can't tell what's going on. "Spaghetti and Meatballs" code is an application with a good core, but the communication glue surrounding it is a huge mess.

At the highest level we have principles such as Conway's Law, Postel's Robustness Principle, CAP, end-to-end principle and REST.

But what's in the middle?

In the middle there are some patterns, such as Cockburn's Hexagonal Architecture, that help us structure systems as an inner domain language surrounded by specific adapters converting that data to the needs of the client. The question remains though: what are the principles between low and high level design?

Nat and Steve assert that compositionality is the principle for the middle. We should adopt a functional type approach and build a series of functions operating on immutable data in a stateful context. That sounds complicated, so what does code written in this style look like? Hamcrest gives us some examples, where by using simple combinators (functions that combine data) you can build up complicated expressions from simple operations (see the examples).
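To give a flavour of the combinator style without reproducing Hamcrest's actual Java API, here is a minimal Haskell sketch. The names echo Hamcrest's matchers but every definition below is my own, and a matcher here is nothing more than a predicate (real Hamcrest matchers also describe themselves for error messages).

```haskell
-- A matcher is just a predicate over some value.
type Matcher a = a -> Bool

equalTo :: Eq a => a -> Matcher a
equalTo = (==)

greaterThan :: Ord a => a -> Matcher a
greaterThan bound x = x > bound

-- Combinators: build bigger matchers out of smaller ones.
allOf :: [Matcher a] -> Matcher a
allOf matchers x = all ($ x) matchers

anyOf :: [Matcher a] -> Matcher a
anyOf matchers x = any ($ x) matchers

not' :: Matcher a -> Matcher a
not' m = not . m

everyItem :: Matcher a -> Matcher [a]
everyItem = all

-- A more complicated expression assembled purely from the pieces above:
-- every element is greater than 0 and not greater than 9.
smallPositives :: Matcher [Int]
smallPositives = everyItem (allOf [greaterThan 0, not' (greaterThan 9)])
```

smallPositives [1, 5, 9] holds, while smallPositives [0, 3] and smallPositives [10] do not. Nothing is mutated anywhere, so each piece can be understood on its own and the whole reads almost like the sentence describing it.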

Having done a fair bit of Haskell I found it really easy to agree with this point of view. When there's no mutable state you can reason about code locally (and not checking for mutation). Local reasoning means that I can understand the code without jumping around. This is a hugely important part of a well-designed system.

I was slightly concerned to hear this style of programming described as Modern Java. I hope it's not, because using Java like this feels like putting lipstick on a pig. One of the things I value in Haskell is that composition is a first class citizen. Partial application, function composition and first class functions mean that gluing simple code together to make something powerful is incredibly easy. I hope we're at that awkward point in language evolution where we're stretching our current languages to do things they don't want to do. Maybe this is finally the time when a functional language hits mainstream? (Maybe it's Clojure or Scala.)

We tried adopting this style of programming at Dynamic Aspects when building domain/j [PDF]. It was fantastic fun, and I really love Java's static imports for making the code lovely and terse (finding out $ is an operator also helps). Something about it feels dirty though. Haven't quite put my finger on what that was then, and hopefully with lambdas in Java 8 it's more natural.

So what is the bit in the middle? The bit in the middle is the language that describes your domain. Naming is everything and you should do whatever you can to make it easiest to understand. Eschewing mutable state and using functional programming to compose multiple simple operators seems to work!

Agile Architecture - Part 2

Allen Holub gave a presentation on agile techniques for design. Allen examined the fragile base class in some depth, before recapping CRC cards (not used enough!). Allen is a good presenter, so it was great to have a recap and have a few more examples to stick in my brain!

Leading Technical change

Nate closed out the day by giving a presentation on Leading Technical Change. It was well presented and focused on two things. How do you keep up with technology and how do you engage your organization to move to different technologies?

Nate presented some really disturbing statistics about how much time Americans (and presumably people in other countries) waste on TV. Apparently the average American watches 151 hours of TV a month! Wow.

Nate introduced the audience to the idea of the technology radar which allows you to keep track of technology that is hot for yourself or your organization. We're trying to build one at Red Gate. We've also experimented with skills maps too, and you can see an example from a software engineering point of view here (love to know what you think?).

Introducing change is hard, and Nate presented the same sort of ideas that Roy presented the previous day. Change makes things worse in the beginning, but better in the end. Having the courage to stick out the dip is a hard thing! (image from here)

I have to admit, I didn't take many notes from this talk because I was enjoying it instead :) It was well presented and engaging with the audience. In summary, change is hard and it's all about the people. I think deep down I always knew that (people are way more complicated than code) but it was great to hear it presented so well!