You're not going to like this.
18 Nov · Fri 2005
Tom Evslin points out that APIs will make or break most of the new web applications. And he's dead right. In fact, APIs are where all the really interesting stuff happens, because you get people doing things with your service that you never dreamed of.
Now we're talking specifically about web service APIs here. Because it's the network effects that make the magic. Metcalfe's Law, Reed's Law, even Godwin's Law. Take your pick. So these days, any sensible web services entrepreneur is going to want to get an API on the boil sooner rather than later.
Now I've been doing some work with the "big" APIs, like Google, Amazon and friends. I'm building a set of examples for my new product (due out shortly). One thing I will say is, go REST young man! SOAP interfaces are the suck. They raise the bar to entry far too much. Heck, do you realise that most people not only do not understand XML Namespaces, but are actually afraid of them? Too many nooks and crannies. Too much to go wrong. Amazon, for example, seems to change their namespace on a regular basis – it's date based. What a pain.
The other thing about APIs is that they are very very hard to design and maintain. I've run into this issue quite a bit with my Java Components. It is basically impossible to fix a broken API after the fact, but version 1.0 of anything useful still has far too much complexity to get it all right the first time. Actually, I took an approach to this problem based on some ideas by David Bau, which has worked out quite well. All the callback interfaces in my components use an intermediate abstract class to provide a DMZ for future changes. I've already used this successfully to fix some problems, without any users having to recompile.
So how can you apply this to web services, where there is no inheritance model? The most important thing is to ignore what you don't understand. HTML is the classic example of this. The second thing is that you must support until death-do-us-part all your declared objects, methods, values and properties. You just gotta. Think real hard about them before you publish them. It's very difficult to get right, but that's programming for you – not as easy as it looks. And finally, don't reinvent the square wheel. You absolutely, positively should look at what other people are doing, how it's worked for them, and what you can do better. Build a Toyota, not a Trabant.
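To make "ignore what you don't understand" concrete, here is a minimal sketch of a client that skips fields it has never heard of. Everything in it is made up for illustration: the key=value response format, the field names and the parseResponse helper are assumptions, not any real service's API.

```perl
use strict;
use warnings;

# Fields this (hypothetical) client understands.
my %known = map { $_ => 1 } qw(id title price);

# parseResponse is a made-up helper: it reads "key=value" lines and
# silently skips any key it does not recognise.
sub parseResponse {
    my ($text) = @_;
    my %item;
    for my $line ( split /\n/, $text ) {
        my ( $key, $value ) = $line =~ /^(\w+)=(.*)$/ or next;
        next unless $known{$key};
        $item{$key} = $value;
    }
    return \%item;
}

# Imagine version 2 of the service added "currency"; the old client still works.
my $item = parseResponse("id=42\ntitle=Widget\nprice=9.99\ncurrency=EUR\n");
print join( ',', map { "$_=$item->{$_}" } sort keys %$item ), "\n";
```

The point of the design is that the publisher can add fields later without breaking anybody, as long as the declared ones never change meaning.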
I predict (OOH, dangerous sir!) that a market will develop in helper applications and third party API providers for websites and services. In fact, you can appize (you heard it here first: [æp'aiz]) any site, if you really want to – just look at all the blogification (OK, I'll stop now) going on. So, start your motors gentlemen! We've just got ourselves a new business model.
17 Nov · Thu 2005
Business ideas are worthless. Trouble is, when you get one, you can never believe otherwise. They usually strike around bedtime, and before you know it, you've designed a full-scale enterprise architecture to support the concept by 1AM, prepared an initial marketing plan by 2AM, and by 3AM you're wondering about how to structure refunds.
Great. Any idea can work, you just have to execute. I had this idea a while back that you could sell "words". You'd have some sort of attention-getting site, and then punters could buy the right to be linked from a certain word. Yeah, I know, RealNames and all that. Still, this would be a "Web 2.0" web service thing with an API and so forth. Great.
In the cold light of day you end up discarding most of these brainwaves. Still, you wonder, maybe some of them could work.
So I was delighted to come across AllYourWords. A deadringer! I jumped in and bought some words right away, just to say thanks for going with the idea. So it would have worked!
Business ideas are worthless because if it's just occurred to you, somebody else has probably already implemented it (if that breaks your heart, just remember, Google wasn't the first search engine).
16 Nov · Wed 2005
So you want to start your own software company? You've got a great product idea and it's a simple case of programming to get the first version together. Then you have to put up a website, and a payment system. Oh and there's a load of documentation to write. Oh and support. And don't forget regression testing, a formal release procedure and backwards compatibility, and after all that, marketing and promotion and managing ads and...
Right, still want to start a software company? The truth is, if you have never built an actual product, that is, a coherent entire solution to a business problem, you are vastly underestimating the level of effort required to create one.
Most developers work on software projects. Custom software projects for in-house use, or custom software projects for direct clients. For this sort of work, you can provide the client (internal or external) with so much direct support that you don't need all the collaterals that go along with a software product. And someone else probably did the selling. And you'll have students in to do the testing, and so forth. In fact, if your company has any clue, they will make sure that you can focus on your job: writing code, rather than doing any of that other stuff. And it's mostly a good idea to organise things this way.
So if this describes your situation, and you still want to create your own company with your own products, how do you go about it? Well, write a business plan, get venture funding and sit back. Oh no wait, sorry, parallel universe.
You have to learn how to build a product and you have to learn how to run a business. Learning how to build a product is easier than you think. Start an open-source project and run it as a virtual product. Believe you me, even free doesn't sell by itself. And you'll need to get the whole software product production thing down before you go off trying to do the business stuff.
Next, get yourself in front of paying customers. In your day job, try to move into one of those "client-facing" roles. Push outside your personal comfort-zone here - you're gonna need to do that big time when you want to make your own money. There's nothing like dealing with customers on the front line to change your perceptions. You really should work as a freelance contractor as well, for a while at least. When you have to put your own bread on the table, you learn a lot really fast.
And then comes the most difficult part. You've got to let go of all the safety nets, quit your job, fire your clients, and start trying to sell. It's very easy to get stuck at this point, with a half-developed product that will never see the light of day. Because now you have to stop thinking like a programmer and start thinking like a business owner. Keep cutting features, for a start. Remember, every feature has to be documented and tested and supported.
Product development takes ten times more effort than custom development, probably more. Be prepared.
By the way, if all this sounds like too much hard work, you can always start a web service.
15 Nov · Tue 2005
I've been writing about a Perl sudoku solver. Sudoku is a great example of a problem worthy of study. It looks easy from the outside, but when you have a go at solving it, it gets tricky very quickly. The comments on this series have included suggestions to go recursive. Well, that's definitely a better approach than the one I am documenting here - but the nice thing is that I know that for sure because I tried it the simple way first. Less is not always more. Anyway, this series is a retrospective, so if I may beg your indulgence, we shall continue to stick our heads in the sand for a little while longer.
So where were we? Well, at this point it may help to actually name some of the processes we have applied to solve sudoku puzzles. This page on solving sudoku provides us with the necessary nomenclature. If you review that page, you'll see that what is referred to as excluding and reducing in the Perl code is more commonly known as removing singles and hidden singles respectively.
Once I got this far I was a bit stuck. I refused to allow myself to look up anything on the web (yeah, a bit pointless, considering my original aim was to win €150!). Eventually I realised that if you have two cells, with just two number possibilities in them, then no other cells in the same vertical, horizontal or home square can have those two numbers. If other cells did have, say, one of the numbers, then that would solve one of the double number cells, leaving the other inconsistent with the other solved cell. Ok, an example: say we have 12 and 12 and 123 in the top left corner, with all other relevant cells solved. Then we can exclude 1 and 2 from the 3rd cell, giving 12 and 12 and 3. This approach is known as naked pairs.
Actually if you think about it for a moment, this technique also applies to sets of three cells with three numbers, four cells with four numbers, and so on. I have only coded it up for pairs however, and that has solved all the sudokus that I have tried.
Before I show you the code, let's talk economics. You see, I had to stop doing the Irish Independent Super Sudoku. Firstly, and most importantly, entering a load of hex digits into square boxes every weekend gets boring very quickly. Secondly, I wasn't winning anything and it was starting to look like I might actually lose money. In order to enter the competition you have to buy the paper, €1.50, and then you have to post off the entry, another €0.48. Do that every week and after 76 weeks, you'll hit €150.48 (I was already 10 weeks down). Plus it's boring filling out the squares. Did I mention that already?
Right, here's how we do naked pairs. By the way, Perl was really great for hacking out this solution, but it is still pretty verbose. I'm considering porting this to Javascript and some other scripting languages, just for the craic.
sub excludePair {
my @cell = @{shift()};
my $prow = shift;
my $pcol = shift;
my $valbin = $cell[$prow][$pcol];
if( 1 == keys(%{$valbin}) ) {
return;
}
my $cmpstr = join('',sort(keys(%{$valbin})));
my @vals = keys(%{$valbin});
my $valsmatch = 0;
for( my $c = 0; $c < ($sqsize*$sqsize); $c++ ) {
if( $c != $pcol ) {
my $testbin = $cell[$prow][$c];
if( 2 == keys(%{$testbin}) ) {
my $teststr = join('',sort(keys(%{$testbin})));
if( $cmpstr eq $teststr ) {
$valsmatch = 1;
}
}
}
}
if( $valsmatch ) {
for( my $c = 0; $c < ($sqsize*$sqsize); $c++ ) {
if( $c != $pcol ) {
my $testbin = $cell[$prow][$c];
if( 2 <= keys(%{$testbin}) ) {
my $teststr = join('',sort(keys(%{$testbin})));
if( $cmpstr ne $teststr ) {
for (@vals) {
delete(${$testbin}{$_});
}
}
}
}
}
}
$valsmatch = 0;
for( my $r = 0; $r < ($sqsize*$sqsize); $r++ ) {
if( $r != $prow ) {
my $testbin = $cell[$r][$pcol];
if( 2 == keys(%{$testbin}) ) {
my $teststr = join('',sort(keys(%{$testbin})));
if( $cmpstr eq $teststr ) {
$valsmatch = 1;
}
}
}
}
if( $valsmatch ) {
for( my $r = 0; $r < ($sqsize*$sqsize); $r++ ) {
if( $r != $prow ) {
my $testbin = $cell[$r][$pcol];
if( 2 <= keys(%{$testbin}) ) {
my $teststr = join('',sort(keys(%{$testbin})));
if( $cmpstr ne $teststr ) {
for (@vals) {
delete(${$testbin}{$_});
}
}
}
}
}
}
}
Well that's a candidate for The Daily WTF anyway. Notice that I was too lazy to actually code up the home square part. Oh well.
This subroutine assumes that it is only called on cells with two numbers left. It searches for another cell that matches the current one, and if found, removes the two numbers of these two cells from all the other relevant cells. The first two for loops walk along the row (they vary the column index $c), and the second two walk down the column (they vary the row index $r).
The expression join('',sort(keys(%{$testbin}))) gets the keys of the cell, that is, the numbers in the cell, sorts them and concatenates them together to form a canonical comparison string. Probably not the most efficient way of checking each cell for two numbers.
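As a small illustration of that comparison string, here is a sketch that wraps the expression in a helper. The name binKey is my own invention for this example and is not in the downloadable code.

```perl
use strict;
use warnings;

# binKey is a hypothetical helper wrapping the join/sort/keys expression.
sub binKey {
    my ($bin) = @_;
    return join( '', sort keys %{$bin} );
}

# Two bins built in different orders still produce the same key.
my $bin_a = { 2 => 2, 1 => 1 };
my $bin_b = { 1 => 1, 2 => 2 };
my $bin_c = { 1 => 1, 2 => 2, 3 => 3 };

print binKey($bin_a), "\n";    # prints "12"
print "a and b match\n"  if binKey($bin_a) eq binKey($bin_b);
print "a and c differ\n" if binKey($bin_a) ne binKey($bin_c);
```

One caveat: joining with an empty string is only unambiguous when every value is a single character. With multi-character values, the bins {1, 12} and {11, 2} would both give "112", so a separator such as ',' would be the safer choice there.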
Ok, I guess I'll let you have the whole damn thing now. Download the Perl code at your peril. It's public domain and the usual disclaimers apply (not guilty for "thermonucular" destruction of your computer, etc.).
I am quite interested in seeing how far one can push this type of heuristics-based approach. So the plan now is to write up some sort of test harness and generate a few unsolvable sudokus. I would like to keep adding heuristics as described on the sudoku hints page above and see how far we get. I may then compare it to a proper solution, using real algorithms, but for now, Good Bye, Good Night and Good Luck!
14 Nov · Mon 2005
Sure I suppose I'd better give this Google Analytics thing a go. If only for the fancy graphics. I already use awstats and that's OK as far as it goes.
The Google stats system is pretty flash, but surprisingly slow. I wonder is that because it was not developed by Google (it was originally called Urchin and Google bought it). Since it's free there's basically no downside to trying it out.
So what will Google buy next – the million dollar home page? That's about the only business model left on this planet…
11 Nov · Fri 2005
Thanks to Adam Stiles for this one.
I fell off my chair the first time.
11 Nov · Fri 2005
I know brackets are parentheses, but bear with me, we're on the other side of the pond here.
So what is the story with the
functionName (param1, param2, ...)
style? Darlings, you are disassociating context!
Huh? Well look, one of the basics of design, of style, is to put things in context. To put those things that are associated with one another, together. That way the eye places them together for the brain, aiding understanding and bringing clarity to the mind.
In languages of the C heritage, you often have variables standing alone:
int count = 0; String foo = "bar";
A name standing by itself is a visual clue that what you are dealing with is a variable.
What about methods? Where's the visual clue? It's the brackets! A name attached to brackets is a method name. It goes into the cerebral cortex straight through the eyes when you scan the code, short circuiting the hard, "thinking" part of understanding code.
functionName( param1, param2, ... )
puts it all in perspective.
People complain about code being hard to read. Well shucks, you're making it hard for yourself! Get with the program and start applying the principles of visual design to your code.
10 Nov · Thu 2005
I'm a big fan of BigPictureSmallOffice, mostly because I don't work in one anymore! Apparently the IT department managed to entirely bork the company's database, including the list of outstanding orders. Yee Haa!
Hey, I've been there. One month into starting my first job I deleted a live customer website. Now, we were transferring it to a new server and there was a new design, so no permanent harm done. But the old site was still live (or rather, recently deceased after my act of vandalism), and due to the vagaries of DNS updates, still needed to be live for at least a week. Oh Joy. Let's just say that the client's ISP still had the old IP, and the client was having a hard time believing that other people could see their new site even though they couldn't.
Never make the same mistake twice. Well I have never deleted a live website since. You might think the lesson I learned was: always check twice. Oh no. The lesson I learned was: if you mess up a client and your boss has to handle the angry phone call because you were late in that morning, you are in deep deep doodaa.
09 Nov · Wed 2005
My recent post about documentation writing and technical writers was a little off the cuff and quite easy to misinterpret. I suppose the phrase "throw some technical writers at it" didn't help either.
To clarify: I think technical writers are a very valuable part of the software development process. I have worked with some really great ones and I know that collaboration can be good fun and produce excellent results. But it has always been for user level documentation, not API level documentation. The problem I see with API docs is that you nearly have to write the documentation to describe the functionality.
So after thinking about this for a while, and thinking about what makes O'Reilly books so good, I think that the approach to take for API docs is one of a more editorial nature. The developer should write the actual text, given that this is the most efficient way to get the information down, and given the enormous benefit of generating a feedback loop in the developer's mind. But then the technical writer can step in to act as mentor and editor – to enhance the writing skills of the developer and to provide those things that only a professional can: style, grammar and textual flow.
So to repeat: I do not see the technical writer as someone who simply takes a set of bullet points and a demo application and mechanically hacks out a lot of repetitive pages. I do understand the value of technical writing. But my question was: how can we apply this skill to API documentation, which is so close to the code? In the end, it must be through even greater collaboration – the technical writer must be part of the team, not off in a separate department. And the technical writer is not a writer in this role, they are an editor and an educator.
08 Nov · Tue 2005
With the sudoku program I hacked together in the last post, I was able to solve the Irish Independent's Super Sudoku for three weeks running. But then, disaster! It turns out that sudoku puzzles can be pretty fiendish - they are still solvable with a very small set of given numbers.
So my initial logic, just continuously removing numbers that could not possibly be in a cell, did not solve all sudokus. Time for a rethink. After staring at the unsolved output for a while, it occurred to me that sometimes you end up with a set of cells in a vertical, horizontal or home square that are all unsolved, but where only one cell contains a certain number. So let's say the top three cells on the left are 123, 234, 234, with all others in the top row solved. Well then the top left cell must be 1. There are no other possibilities, it's the only place for that number. Now we're suckin' diesel.
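Here is that 123, 234, 234 example worked through as a minimal standalone sketch. It is not the code from my solver: hiddenSingles is a hypothetical helper that works on a plain list of bins, and the solved cells are left out because they cannot hold any of these numbers.

```perl
use strict;
use warnings;

# Candidate bins for the three unsolved cells: 123, 234, 234.
my @row = (
    { 1 => 1, 2 => 2, 3 => 3 },
    { 2 => 2, 3 => 3, 4 => 4 },
    { 2 => 2, 3 => 3, 4 => 4 },
);

# hiddenSingles (hypothetical): for each number that appears in exactly
# one bin, collapse that bin to just that number.
sub hiddenSingles {
    my ($cells) = @_;
    my %where;    # number => list of cell indexes that still allow it
    for my $i ( 0 .. $#{$cells} ) {
        push @{ $where{$_} }, $i for keys %{ $cells->[$i] };
    }
    for my $val ( sort keys %where ) {
        my @idx = @{ $where{$val} };
        next unless 1 == @idx;
        $cells->[ $idx[0] ] = { $val => $val };
    }
}

hiddenSingles( \@row );
print join( ' ', map { join( '', sort keys %$_ ) } @row ), "\n";    # prints "1 234 234"
```

Only the number 1 has a single home, so only the first cell collapses; 2, 3 and 4 each still have more than one possible cell.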
For some reason, I think of this new operation as "reducing" the cell possibilities, so henceforth to the code:
sub reduceHome {
my @cell = @{shift()};
my $prow = shift;
my $pcol = shift;
my $valbin = $cell[$prow][$pcol];
if( 1 == keys(%{$valbin}) ) {
return;
}
my $foundval;
for my $val (keys(%{$valbin})) {
my $found = 0;
for( my $r = $sqsize*floor($prow / $sqsize);
$r < $sqsize*floor(($prow+$sqsize) / $sqsize);
$r++ ) {
for( my $c = $sqsize*floor($pcol / $sqsize);
$c < $sqsize*floor(($pcol+$sqsize) / $sqsize);
$c++ ) {
if( $prow != $r || $pcol != $c ) {
my $testbin = $cell[$r][$c];
if( "" ne ${$testbin}{$val} ) {
$found = 1;
}
}
}
}
if( !$found ) {
$foundval = $val;
last;
}
}
if( "" ne $foundval ) {
for my $val (keys(%{$valbin})) {
if( $val ne $foundval ) {
delete(${$valbin}{ $val });
}
}
}
}
Yikes. Well what does this monstrosity do? Again I plead guilty to all charges of hacking and offer this code as final proof that Perl is meant to be written, not read. So this just runs through all the cells in the home square of a given cell: $prow, $pcol. The ugly nested for loop in the middle does that by keeping us within the bounds of the home square. So if we're cell 2,2 in a 3x3 sudoku, the home square is defined as all the cells in the range 0,0 - 2,2 (the loops stop just short of index 3).
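The bounds arithmetic is easier to see in isolation. This is a minimal sketch using a hypothetical homeBounds helper (not in the original program) that applies the same floor calculation to one axis and returns the inclusive first and last index.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# homeBounds (hypothetical): inclusive first and last index of the
# home square along one axis, for a square of side $sqsize.
sub homeBounds {
    my ( $pos, $sqsize ) = @_;
    my $first = $sqsize * floor( $pos / $sqsize );
    return ( $first, $first + $sqsize - 1 );
}

for my $pos ( 0, 2, 3, 8 ) {
    my ( $lo, $hi ) = homeBounds( $pos, 3 );
    print "index $pos -> $lo..$hi\n";
}
```

For a 3x3 sudoku this reports 0..2 for indexes 0 and 2, 3..5 for index 3 and 6..8 for index 8. Note that floor comes from the POSIX module, so the subroutine above needs that import too.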
Now, as we run through the cells, if we come across a value $foundval, that is only in one cell, then we can delete all the other numbers from that cell. That's what the last for loop does in a rather pointless way - why not just create a new map with just one entry? I don't know, I was cut-n-pasting I guess.
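For what it's worth, the "new map with just one entry" version could look something like this sketch. The variable names match the subroutine above, but the values are made up for the example.

```perl
use strict;
use warnings;

# The cell's bin before reduction, and the number that can only live here.
my $valbin   = { 1 => 1, 2 => 2, 3 => 3 };
my $foundval = 1;

# Overwrite the contents of the existing hash in place, so every other
# reference to this bin (for example from the grid array) sees the change.
%{$valbin} = ( $foundval => $foundval );

print join( ',', sort keys %{$valbin} ), "\n";    # prints "1"
```

Assigning to %{$valbin} rather than to $valbin matters here: pointing $valbin at a brand new hash would only change the local variable and leave the grid's bin untouched.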
This operation is also performed horizontally and vertically, so our main loop now does all this:
excludeHome(\@cell,$rowI,$colI);
excludeVertical(\@cell,$rowI,$colI);
excludeHorizontal(\@cell,$rowI,$colI);
reduceHome(\@cell,$rowI,$colI);
reduceVertical(\@cell,$rowI,$colI);
reduceHorizontal(\@cell,$rowI,$colI);
I'd say that's efficient alright. So did this help? You bet! I got another two weeks out of this baby. But, yeah, you guessed, still not enough. Even this code won't solve all sudokus. There's another trick that I came up with though, and we'll look at that the next time.
This post is part of a series on a Perl Sudoku solver.
07 Nov · Mon 2005
I've been writing a lot (and I mean a whole lot) of JavaDoc lately. It's all for the new Ricebridge product, to be announced Real Soon Now.
One of the things that I like to do with my components is to document pretty much every method. This really makes things easier for users as you can grab usage information out of context just by reading one method description. So it's a big part of what makes Ricebridge components special – pretty good docs.
The reason most JavaDoc documentation is not very good is that it is a real pain to write. Technical writing in general is very hard. It's one of those things that "disappears" when it's good. You only notice bad documentation. Sun is quite good at it. But in fairness a lot of their documentation is still fairly sparse. And don't get me started on most open source Java components (even my own!). There's just not enough payback to really put the hours in.
But for paying customers, you just gotta do it. It's part of what people pay for. I certainly expect proper and complete documentation with anything I buy. I am often disappointed, but when the documentation is good it really makes a big impression on me. So that's why I have to document pretty much everything when it comes to Ricebridge components.
Right now, I can do this myself. It's time consuming and hard work, but it still makes sense to do it myself. However, I am beginning to wonder how the documentation process can be optimised. I'm sorry, but you can't just throw technical writers at it. First of all, it's not usage documentation in the traditional sense. We're talking hard-core techie stuff here. Secondly, you really have to be a developer to have a feeling for the pain points. Thirdly, forcing developers to write creates a nice feedback loop that actually increases code quality. But still, we do need to actually produce something that describes all the salient information about a given method, in good writing. I'm thinking text that is of the O'Reilly standard.
So developers can't write well (as a rule), but they know the facts. And technical writers can write, but a deep understanding of the API is something they will struggle with. Where do we go from here? I don't know. I know there are some folks out there who can mix these disciplines. I know that a lot of technical writers would say that they are up to the job. But I want to know how it can be done with ordinary people in a small company. That's useful. Star performers are not a real solution. How does the average developer/technical writer produce great API documentation?
06 Nov · Sun 2005
Roller is working out pretty well. No road blocks so far for anything that I wanted to do.
However, my approach is probably not the easiest way to do things. I have set up the main layout of the site in the _decorator.vm template. I edit this directly in my own theme and then reselect the theme to apply the changes. You don't actually have to reselect the theme each time – you can use the theme preview window to see your latest changes by just hitting F5.
I have my own theme just to see what maintaining a theme involves. The trouble is that you need to be able to edit the theme files directly on the server. No bother to me as I just use Emacs for that. But you do have to login to the server and use it from there.
The ideal setup is to edit locally and deploy remotely. To this end I have imported my theme files into a subversion repository and I will set up a local installation of Roller on my development machine. This should make things a little easier.
The only outstanding issue that I have at the moment is that my hacked atom.vm template does not format paragraphs properly, which is a bit naff – all the text just flows together. I will have to fix this as I do intend to post the full text of each entry in my feeds.
One word of warning, turn off the Textile Formatter plugin when you are posting source code, as it really arses it up.
04 Nov · Fri 2005
Take some time out to confuse your eyes: stare at the black + in the center.
Hey, I got all the pink dots to disappear! More like this at: Visual Illusions · Optische Täuschungen.
Yeah I know, now I have a headache too…
03 Nov · Thu 2005
Solving Sudoku algorithmically is actually a bit harder than it looks - as a previous commenter pointed out. But I'm with Richard Feynman on this one: If you want to learn something, make a mess of it yourself first, then read up on the solution.
Let's restrict ourselves to 3x3 sudoku for the purposes of this series, with the numbers 1-9. My Perl program works the same way for 4x4 grids, so we're not losing anything. OK, each cell in the grid, and there are 9x9 of them, can contain the numbers 1-9. Now we are given certain cells, so we can immediately remove all other numbers in the given cells.
The given cells in turn mean that unknown cells can't contain given numbers. If we are given 1 in the top left corner, then the top left square cannot contain a 1 in any other cell, so we can remove 1 from the list of possibilities for the other unknown cells.
So this suggests a very simple approach - for each "solved" cell (a cell with only one possible number left), check all cells in the same horizontal, and all cells in the same vertical, and all cells in the same home square. If the number occurs in the list of possible numbers for a cell that we are checking, remove it. In this way we can reduce the list of possibilities in each unknown cell.
How does this help us? Well if we keep repeating this operation, we eventually reduce the list of possibilities in some cells to one number, and that means we have solved those cells. These cells are then used to solve further cells and so on. At first glance you'd think that would be it. I sure did! In fact this simple algorithm did solve a number of 4x4 sudokus all by itself (I was not drawn out of the hat on those ones, and did not win anything...).
But it can't solve all sudokus. Sometimes the set of initial given numbers does not give enough information to exclude all possibilities, and further loops of the algorithm fail to produce any changes in the sudoku data structure. In keeping with this simple minded approach, see if you can think up anything to reduce the list of possibilities in each cell further. We'll talk about that in the next post.
Now for some code. The data structure that I used to store the sudoku puzzle is a two-dimensional array of associative arrays. The associative arrays don't actually associate anything - they are just bins for the list of possibilities. Initially they contain the pairs { 1=>1, 2=>2, 3=>3, ... }. This setup makes it easy to test if a given number is in a cell or not.
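To show the shape of that data structure, here is a minimal sketch of how such a grid might be built. The helper names newGrid and setGiven are assumptions made for this example; the real program may set things up differently.

```perl
use strict;
use warnings;

# Assumption: $sqsize is the side of a home square (3 for a classic grid).
my $sqsize = 3;
my $n      = $sqsize * $sqsize;

# newGrid (hypothetical): an $n x $n grid where every cell is a hash
# holding all the possibilities as { 1=>1, 2=>2, ... }.
sub newGrid {
    my ($size) = @_;
    my @cell;
    for my $r ( 0 .. $size - 1 ) {
        for my $c ( 0 .. $size - 1 ) {
            $cell[$r][$c] = { map { $_ => $_ } 1 .. $size };
        }
    }
    return \@cell;
}

# setGiven (hypothetical): fix a given number by replacing the bin
# with a single entry.
sub setGiven {
    my ( $cell, $r, $c, $val ) = @_;
    $cell->[$r][$c] = { $val => $val };
}

my $cell = newGrid($n);
setGiven( $cell, 0, 0, 1 );

# The membership test is a plain hash lookup.
print "cell 0,1 can still be 5\n" if exists $cell->[0][1]{5};
print "cell 0,0 is solved\n"      if 1 == keys %{ $cell->[0][0] };
```

A cell counts as solved as soon as its bin is down to one key, which is the same test the subroutines in this series use.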
Here's the main loop. This just keeps running the number exclusion logic until no more changes are seen:
sub solve {
my @cell = @{shift()};
my $pass = 0;
my $changes = 1;
while( $changes ) {
$pass++;
$changes = 0;
my $rowI = 0;
for my $row ( @cell ) {
my $colI = 0;
for my $col (@{$row}) {
my $origsize = keys(%{$col});
if( 1 < $origsize ) {
excludeHome(\@cell,$rowI,$colI);
excludeVertical(\@cell,$rowI,$colI);
excludeHorizontal(\@cell,$rowI,$colI);
}
my $size = keys(%{$col});
$changes = $changes
|| ($origsize > $size);
$colI++;
}
$rowI++;
}
}
print "passes:$pass\n";
}
I have made no attempt to prettify this code - this is exactly how I hacked it up. Since it uses Perl references, here's a quick decompile: \@cell = reference to entire sudoku, @{$row} = list of cells for given row, %{$col} = number bin for a given cell.
Let's look at what happens inside the exclude subroutines. We'll use excludeVertical as an example.
sub excludeVertical {
my @cell = @{shift()};
my $prow = shift;
my $pcol = shift;
my $valbin = $cell[$prow][$pcol];
if( 1 == keys(%{$valbin}) ) {
return;
}
for( my $r = 0; $r < ($sqsize*$sqsize); $r++ ) {
if( $r != $prow ) {
my $testbin = $cell[$r][$pcol];
if( 1 == keys(%{$testbin}) ) {
my $remove = (keys(%{$testbin}))[0];
if( 1 < keys(%{$valbin}) ) {
delete(${$valbin}{ $remove });
}
}
}
}
}
We do a quick check to make sure that there's still more than one number possibility (if there is only one number left, then this cell is solved), and then we run through all the cells in the same vertical as our current cell (indicated by $prow, $pcol). $sqsize is 3 for a 3x3 grid, and 4 for a 4x4 grid, and so on. We also make sure to ignore the current cell ( if( $r != $prow ) ).
This subroutine inverts the informal algorithm described above. It searches for all cells in the vertical that contain only one number, and removes all such numbers from the current cell. This is equivalent to removing the current cell's number from all the others when the current cell contains only one number. The keys(%{$testbin}) expression tells you how many numbers are still in the bin to be tested. We delete the solved number from the current bin using the Perl delete function, which has the rather strange idiom of taking the whole hash element expression, ${$valbin}{ $remove }, rather than the hash and the key as separate arguments. Oh well, it's Perl.
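If the delete idiom looks odd, this tiny standalone sketch shows it on a plain hash. The bin contents are made up for the example.

```perl
use strict;
use warnings;

# A number bin for one cell, in the same { n => n } shape as the solver's bins.
my %bin = ( 1 => 1, 2 => 2, 3 => 3 );

# delete takes the hash element expression, not a bare key,
# and returns the value that was stored under that key.
my $removed = delete $bin{2};

# Deleting a key that is not there is harmless and returns undef.
my $missing = delete $bin{9};

print "removed: $removed\n";                          # prints "removed: 2"
print "left: ", join( ',', sort keys %bin ), "\n";    # prints "left: 1,3"
print "missing is undef\n" unless defined $missing;
```

That second property is why the exclude subroutines can call delete without first checking whether the number is still in the bin.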
I know, I know, the code is ugly and messy, but I was trying to win €150 as quickly as possible before I got better and had to go back to real work.
This post is part of a series on a Perl Sudoku solver.






