Node.js – How to Write a For Loop With Callbacks

Let’s say you have 10 files that you need to upload to your web server. 10 very large files. You need to write an upload script because it needs to be an automated process that happens every day.

You’ve decided you’re going to use Node.js for the script because, hey, it’s cool.

Let’s also say you have a magical upload function that can do the upload:

upload('myfile.ext', function(err){
  if( err ) {
    console.log('yeah, that upload didn\'t work: '+err)
  }
})

This upload function uses the callback pattern you’d expect from Node. You give it the name of the file you want to upload and then it goes off and does its thing. After a while, when it is finished, it calls you back, by calling your callback function. It passes in one argument: an err object. The Node convention is that the first parameter to a callback is an object that describes any errors that happened. If this object is null, then everything was OK. If not, then the object contains a description of the error. This could be a string, or a more complex object.

I’ll write a post on the innards of that upload function – coming soon!

Right, now that you have your magical upload function, let’s get back to writing a for loop.

Are you a refugee from Javaland? Here’s the way you were thinking of doing it:


var filenames = [...]

try {
  for( var i = 0; i < filenames.length; i++ ) {
    upload( filenames[i], function(err) {
      if( err ) throw err
    })
  }
}
catch( err ) {
  console.log('error: '+err)
}

Here’s what you think will happen:

1. upload each file in turn, one after the other
2. if there’s an error, halt the entire process, and throw it to the calling code

Here’s what you just did:

1. Started shoving all 10 files at your web server all at once
2. If there is an error, good luck catching it outside that for loop – it’s gone to the great Event Loop in the sky

Node is asynchronous. The upload function will return before it even starts the upload. It will return back to your for loop. And your for loop will move on to the next file. And the next one. Is your website a little unresponsive? How about your net connection? Things might be a little slow when you push all those files up at the same time.

So you can’t use for loops any more! What’s a coder to do? Bite the bullet and recurse. It’s the only way to get back to what you actually want to do. You have to wait for the callback. When it is called, only then do you move on to the next file. That means you need to call another function inside your callback. And this function needs to start uploading the next file. So you need to create a recursive function that does this. It turns out there’s a nice little recursive pattern that you can use for this particular case:

var filenames = [...]

function uploader(i) {
  if( i < filenames.length ) {
    upload( filenames[i], function(err) {
      if( err ) {
        console.log('error: '+err)
      }
      else {
        uploader(i+1)
      }
    })
  }
}
uploader(0)
Do you see the pattern?

function repeater(i) {
  if( i < length ) {
     asyncwork( function(){
       repeater( i + 1 )
     })
  }
}
repeater(0)
You can translate this back into a traditional for( var i = 0; i < length; i++ ) loop quite easily:

- repeater(0) is var i = 0,
- if( i < length ) is i < length, and
- repeater( i + 1 ) is i++
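This pattern generalizes nicely. Here’s a sketch of a reusable helper that walks any array through any callback-style async function, one item at a time. The names forEachSeries, worker and done are my own inventions, not from any library:

function forEachSeries(items, worker, done) {
  function step(i) {
    if( i < items.length ) {
      worker( items[i], function(err) {
        if( err ) {
          done(err)       // halt the process and report the error
        }
        else {
          step(i+1)       // only now move on to the next item
        }
      })
    }
    else {
      done(null)          // all items processed, no errors
    }
  }
  step(0)
}

forEachSeries(filenames, upload, function(err) {
  if( err ) {
    console.log('error: '+err)
  }
})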

When it comes to Node, the traditional way of doing things can mean you lose control of your code. Use recursion to get control back.





The JavaScript Disruption

The mainstream programming language for the next ten years will be JavaScript. Once considered a toy language useful only for checking form fields on web pages, JavaScript will come to dominate software development. Why this language and why now?

What is JavaScript? It is the language that web designers use to build web pages. It is not the language that software engineers use to build the business logic for those same web sites. JavaScript is small, and runs on the client, the web browser. It’s easy to write unmaintainable spaghetti code in JavaScript. And yet, for all these flaws, JavaScript is the world’s most misunderstood language. Douglas Crockford, a senior engineer at Yahoo, is almost singlehandedly responsible for rehabilitating the language. In a few short, seminal online essays published shortly after the turn of the century, Crockford explains that JavaScript is really LISP, the language of artificial intelligence. JavaScript borrows heavily from LISP, and is not really object-oriented at all. This curious design was well suited to a simple implementation running in a web browser. As an unintended consequence, these same mutations make JavaScript the perfect language for building cloud computing services.

Here is the prediction then: within ten years, every major cloud service will be implemented in JavaScript. Even the Microsoft ones. JavaScript will be the essential item in every senior software engineer’s skill set. Not only will it be the premier language for corporate systems, JavaScript will also dominate mobile devices. Not just phones, but tablets, and whatever enters that device category. All the while, JavaScript will continue to be the one and only language for developing complex interactive websites, completely drowning out old stalwarts such as Flash, even for games. For the first time in history, a truly homogeneous programming language infrastructure will develop, with the same toolkits and libraries used from the top to the bottom of the technology stack. JavaScript everywhere.

How can such a prediction be made? How can one make it so confidently? Because it has all happened before, and it will happen again. Right now, we are at a technology inflexion point: yet another paradigm shift is upon us, and the JavaScript wave is starting to break. We have seen this before. Every ten years or so, the programming world is shaken by a new language, and the vast majority of developers, and the corporations they work for, move en masse to the new playground. Let’s take a look at the two technology shifts that have preceded this one, so that we can better understand what is happening right now.

Prior to Java in the first decade of this century, the C++ language was dominant in the final decade of the last. What drove the adoption of C++? What drove the subsequent adoption of Java? And what is driving the current adoption of JavaScript? In each case, cultural, technological and conceptual movements coalesced into a tipping point that caused a sudden and very fast historical change. Such tipping points are difficult to predict. No such prediction is made here – the shift to JavaScript is not to come, it has already begun. These tipping points are driven by the chaotic feedback channels at the heart of any emerging technology. One need only look at the early years of the motor vehicle: steam, electric and oil-powered vehicles all competed for dominance in similar historical waves.

What drove C++? It was the emergence of the object-oriented programming paradigm, the emergence of the PC and Microsoft Windows, and support from academic institutions. With hindsight such large-scale trends are easy to identify. The same can be done for Java. In that case, it was the idea of the software virtual machine, the introduction of garbage collection (a language feature, lacking in C++, that offers far higher programmer productivity), and the first wave of internet mania. Java, backed by Sun Microsystems, became the language of the internet, and many large corporate networked systems today run on Java. Microsoft can be included in the “Java” wave, in the sense that Microsoft’s proprietary competitive offering, C#, is really Java with the bad bits taken out.

Despite the easily recognizable nature of these two prior waves, one feature that both share is that neither wave led to a true monoculture. The C++ wave was splintered by operating systems, the Java wave by competing virtual machine languages such as C#. Nonetheless, the key drivers, the key elements of each paradigm shift, created a decade-long island of stability in the technology storm.

What is happening today? What are the key changes? Cloud computing is one. For the first time, corporations are moving their sensitive data and operations outside of the building. They are placing mission critical systems into the “cloud”. Cloud computing is now an abused term. It means everything and nothing. But one thing that it does mean is that computing capacity is now metered by usage. Technology challenges are no longer solved by sinking capital into big iron servers. Instead, operating expenses dominate, driving the need for highly efficient solutions. The momentum for green energy only exacerbates this trend. Needless to say, Java/C# are not up to the job. We shall see shortly that JavaScript is uniquely placed to benefit from the move to cloud computing.

Mobile computing represents the other side of the coin. The increasing capabilities of mobile devices drive a virtuous circle of cloud-based support services leading to better devices that access more of the cloud, leading to ever more cloud services. The problem with mobile devices is severe platform fragmentation. Many different platforms, technologies and form factors vie for dominance, without a clear leader in all categories. The cost of supporting more than one or two platforms is prohibitive. And yet there is a quick and easy solution: the new HTML5 standard for websites. This standard offers a range of new features such as offline apps and video and audio capabilities that give mobile websites almost the same abilities as native device applications. As HTML5 adoption grows, more and more mobile applications will be developed using HTML5, and of course, made interactive using JavaScript, the language of websites.

While it is clear that the ubiquity of HTML5 will drive JavaScript on the client, it is less clear why JavaScript will also be driven by the emergence of cloud computing. To see this, we have to understand something of the way in which network services are built, and the challenges that the cloud brings to traditional approaches. This challenge is made concrete by what is known as the C10K problem, first posed by Dan Kegel in 1999.

The C10K problem is this: how can you service 10,000 concurrent clients on one machine? The idea is that you have 10,000 web browsers, or 10,000 mobile phones, all asking the same single machine to provide a bank balance or process an e-commerce transaction. That’s quite a heavy load. Java solves this by using threads, which are a way to simulate parallel processing on a single physical machine. Threads have been the workhorse of high capacity web servers for the last ten years, and a technique known as “thread pooling” is considered to be industry best practice. But threads are not well suited to high capacity servers. Each thread consumes memory and processing power, and there’s only so much of that to go round. Further, threads introduce complex programming problems, including a particularly nasty one known as “deadlock”. Deadlock happens when two threads wait for each other. They are both jammed and cannot move forward, like Dr. Seuss’s South-going Zax and North-going Zax. When this happens, the client is caught in the middle and waits, forever. The website, or cloud service, is effectively down.

There is a solution to this problem – event-based programming. Unlike threads, events are lightweight constructs. Instead of assigning resources in advance, the system triggers code to execute only when there is data available. This is much more efficient. It is a different style of programming, one that has not been quite as fashionable as threads. The event-based approach is well suited to the cost structure of cloud computing – it is resource efficient, and enables one to build C10K-capable systems on cheap commodity hardware.

Threads also lead to a style of programming that is known as synchronous blocking code. For example, when a thread has to get data from a database, it hangs around (blocks) waiting for the data to be returned. If multiple database queries have to run to build a web page (to get the user’s cart, and then the product details, and finally the current special offers), then these have to happen one after the other, in other words in a synchronous fashion. You can see that this leads to a lot of threads alive at the same time in one machine, which eventually runs out of resources.

The event based model is different. In this case, the code does not wait for the database. Instead it asks to be notified when the database responds, hence it is known as non-blocking code. Multiple activities do not need to wait on each other, so the code can be asynchronous, and not one step after another (synchronous). This leads to highly efficient code that can meet the C10K challenge.
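To make the contrast concrete, here is a small sketch of the event-based style in Node. The database is simulated with setTimeout (queryDatabase is a made-up stand-in, not a real driver API); the point is that nothing blocks while the “query” runs:

function queryDatabase(sql, callback) {
  // pretend the result arrives 100ms later; no thread sits waiting
  setTimeout(function() {
    callback(null, [{id: 1, item: 'book'}])   // error-first callback, Node convention
  }, 100)
}

queryDatabase('SELECT * FROM cart', function(err, rows) {
  // this runs only when the data is ready
  console.log('cart contains', rows.length, 'items')
})

console.log('the event loop is free to serve other clients in the meantime')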

JavaScript is uniquely suited to event-based programming because it was designed to handle events. Originally these events were mouse clicks, but now they can be database results. There is no difference at an architectural level inside the “event loop”, the place where events are doled out. As a result of its early design choices to solve a seemingly unrelated problem, JavaScript as a language turns out to be perfectly designed for building efficient cloud services.

The one missing piece of the JavaScript puzzle is a high performance implementation. Java overcame its early sloth, and was progressively optimized by Sun. JavaScript needed a serious corporate sponsor to really get the final raw performance boost that it needed. Google has stepped up. Google needed fast JavaScript so that its services like Gmail and Google Calendar would work well and be fast for end-users. To do this, Google developed the V8 JavaScript engine, which compiles JavaScript into highly optimized machine code on the fly. Google open-sourced the V8 engine, and it was adapted by the open source community for cloud computing. The cloud computing version of V8 is known as Node.js, a high performance JavaScript environment for the server.

All the pieces are now in place. The industry momentum from cloud and mobile computing. The conceptual movement towards event-based systems, and the cultural movement towards accepting JavaScript as a serious language. All these drive towards a tipping point that has begun to accelerate: JavaScript is the language of the next wave.





Node.js – Dealing with submitted HTTP request data when you have to make a database call first

Node’s asynchronous events are fantastic, but they can have a sting in the tail. Here’s a solution to something that you’ll probably run into at some point.

If you have a HTTP endpoint that accepts JSON, XML, or even a streaming upload, you normally read the data in using the data and end events on the request object:

var bodyarr = []
request.on('data', function(chunk){
  bodyarr.push(chunk);
})
request.on('end', function(){
  console.log( bodyarr.join('') )
})

This works in most situations. But when you start building out your app, adding in production features like user authentication, then you run into trouble.

Let’s say you’re using connect, and you write a little middleware function to do user authentication. Don’t worry if you are not familiar with connect – it’s not essential to this example. Your authentication middleware function gets called before your data handler, to make sure that the user is allowed to make the request and send you data. If the user is logged in, all is well, and your data handler gets called. If the user is not logged in, you send back a 401 Unauthorized.

Here’s the catch: your authentication function needs to talk to the database to get the user’s details. Or load them from memcache. Or from some other external system. (Don’t tell me you’re still using sessions in this day and age!)

So here’s what happens. Node will happily start accepting inbound data on the HTTP request before you’ve had a chance to bind your handler functions to the data and end events. Your event setup code only gets called after the authentication middleware has finished its thing. This is just the way that Node’s asynchronous event loop works. In this scenario, by the time Node gets to your data handler, the data is long gone, and you’ll stall waiting for events that never come. If your response handler depends on that end event, it will never get called, and Node will never send a HTTP response. Bad.

Here’s the rule of thumb: you need to attach your handlers to the HTTP request events before you make any asynchronous calls. Then you cache the data until you’re ready to deal with it.

Luckily for you, I’ve written a little StreamBuffer object to do the dirty work. Here’s how you use it. In that authentication function, or maybe before it, attach the request events:

new StreamBuffer(request)

This adds a special streambuffer property to the request object. Once you reach your handler set up code, just attach your handlers like this:

request.streambuffer.ondata(function(chunk) {
  // your funky stuff with data
})
request.streambuffer.onend(function() {
  // all done!
})

In the meantime, you can make as many asynchronous calls as you like, and your data will be waiting for you when you get to it.

Here’s the code for the StreamBuffer itself. (Also as a Node.js StreamBuffer github gist).


function StreamBuffer(req) {
  var self = this

  var buffer = []
  var ended = false
  var ondata = null
  var onend = null

  self.ondata = function(f) {
    for( var i = 0; i < buffer.length; i++ ) {
      f(buffer[i])
    }
    ondata = f
  }

  self.onend = function(f) {
    onend = f
    if( ended ) {
      onend()
    }
  }

  req.on('data', function(chunk) {
    if( ondata ) {
      ondata(chunk)
    }
    else {
      buffer.push(chunk)
    }
  })

  req.on('end', function() {
    ended = true
    if( onend ) {
      onend()
    }
  })

  req.streambuffer = self
}

This originally came up when I was trying to solve the problem discussed in this question in the Node mailing list.





The Six Key Mobile App Metrics you Need to be Tracking.

Mobile applications are not web sites, and traditional web analytics are not appropriate for mobile applications. What you need is insight that will make your app more effective. You will not find this insight by tracking downloads and installs, phone platforms and versions, screen sizes, new users per day, frequency of use, or any of the traditional metrics. Many of these have been dragged over, kicking and screaming, from the world of web analytics. Yes, these numbers will give you surface measures of the effectiveness of your app. Yes, they are important to know. Yes, you can use them to make pretty charts. But they are all output measures. They measure the results of your app design, interaction model and service level. They do not tell you what to change to achieve your business goals.

To gain real insight into your app and its users, insight that you can use to make your app more effective, you need to measure inputs. There are six key input metrics, and we cover them in this article. Funnel analysis tells you why users are failing to complete your desired user actions, such as in-app purchases, or ad clicks. Measuring social sharing tells you what aspects of your app are capturing the hearts and minds of your users. Correlating demographic data with user behaviour will tell you why your user base does what it does. Tracking time and location, together, gives you insights into the contexts in which your app is used. Mobile app design naturally tends toward deeply hierarchical interfaces – how optimised is yours? Finally, the real business opportunity may be something you never even thought of, so capturing the emergent behaviours of your user base is critical. Let’s take a closer look at each of these metrics, and then take a look at how you can get this data with today’s services.

Funnel analysis allows you to determine the parts of your application that are preventing your users from reaching your business goals. Let’s take a simple unit converter app as an example. The canonical unit converter app lets you convert between kilograms and pounds, or inches and centimeters, and so on. Let’s say one of your business goals is to get your users to sign up to a mailing list from within the app. If you look at the user journey this requires, you might have a call-to-action button on the main screen, followed by a form to capture the email, followed by an acknowledgment page telling people to check their email accounts to verify their subscription. Funnel analysis breaks this user journey down into discrete steps: the tap on the button, typing in the email address, submitting the email address, reading the acknowledgment page. You need to know the percentage of users you are losing at each stage. Probably more than 50%. Understanding this activity funnel to your desired business goal is critical to building an effective app. Perhaps the next version should drop the call-to-action button, or use better copy text. Use funnel analysis to measure this.

Social media are a key element in the promotion of your app. When you leverage these media, you need to track the viral spread of your app. This is more than simply counting the number of tweets or Facebook likes. You need to understand the structure of the social network you are attempting to permeate. You need to find the highly interconnected individuals, those whose recommendations are actively followed by their friends and acquaintances. In any social network there are always a small set of key individuals who know everybody. You need to identify these people and engage with them. This might be as simple as special promotions, or even making them employees! Your mobile analytics solution should be telling you who these people are.

Do you understand the demographic constitution of your users, and can you correlate these demographics with user behaviour? This is the classic diapers and beer effect. A major UK supermarket chain found, through mining their purchase data, that increased beer purchases were correlated with increased diaper purchases. Cross-referencing this with the demographic data they had collected via a loyalty card scheme, the supermarket chain was able to figure out that parents with new babies were staying at home having a homemade meal and a beer, rather than going out to restaurants. This allowed for far more effective targeted advertising. Demographic data are more difficult to capture in the mobile app space, but carriers such as Sprint are now beginning to offer this information.

Location is an important element of the mobile user experience, and many mobile analytics services will offer location analysis. However this is not enough. Again, simply counting the number of users in various geographies does not tell you very much. It validates a business goal, but does not give you insight. You actually need to track the temporal dimension as well. Time and space must be analyzed together. Take our unit converter app. Usage of the app on Sunday afternoons within a DIY store differs from at-home usage at mealtime during the week. In the first case you might like to show ads for power tools, in the second, ads for food products. Mobile analytics offerings have yet to reach this level of capability, so you may need to consider custom solutions for this type of analysis.

Mobile application interfaces are very hierarchical in nature. This means that there are lots of screens with small amounts of information that the user has to navigate through. There simply isn’t enough screen space to show much information at once. As a result, the careful design of the screen hierarchy is critical to effective use of the app. If a particular function, such as in-app purchases, is too deeply buried, you will not achieve your goals for the app. Therefore it is very important to measure the navigation pathways within the app. Berkeley University in California determined the layout of their campus walkways by not laying any paths at first. After the students had trampled the lawns for a year, they then built the pathways where the students had walked. This is what you need to do. (Actually, the Berkeley story is an urban legend, but it’s still a great one.)

The final metric is something that requires a certain open-mindedness. It can be measured using some heavy mathematics, but it can also be noticed intuitively. When you put a product on the market, it may well be the case that your customers start using it in weird and wonderful ways that you never imagined. Hashtags (#thesethings) on Twitter are a good example. Twitter did not invent them, but noticed that their users had come up with this interesting convention for marking content themes. They embraced this emergent behavior and were handed a core product feature on a plate. Of all the metrics in this article, this one, emergent behaviour, is the most precious. It could turn you into the next Facebook (relationship status? What a feature!), or you could kill the golden goose without even knowing it by ignoring your users (Iridium satellite phones, anyone?). Detecting emergent behavior is both an art and a science – keep your eyes open.

First published in GoMoNews Nov 2010.





Debug PhoneGap Mobile Apps Five Times Faster

PhoneGap is a fantastic open source project. It lets you build native mobile apps for iPhone, Android and others using only HTML, CSS and JavaScript. It’s a real pleasure to work with. It makes developing mobile apps a lot faster.

Still, you might find that your debug cycle is still too slow. After all, you still have to deploy your app to your phone for proper testing, and this can chew up precious time. The faster you can wash, rinse and repeat, the faster you can debug, and the faster you can deliver.

One way to speed things up is to use Safari on your desktop. There’s an even faster technique, but we’ll get to that in a minute. Using a WebKit-based desktop browser like Safari means that your development cycle is almost as fast as building a static website. Edit, Save, Reload. Just point Safari at the www/index.html file in your PhoneGap project and away you go.

Well almost.

Desktop browsers don’t offer exactly the same API, nor do they work in exactly the same way. Some mobile functions, like beeping or vibrating the phone, are not really testable. The biggest issue though is that desktop browsers are too fast. Don’t forget that your runtime target is a mobile version of WebKit, such as Mobile Safari. Another issue is that touch gestures are tricky to handle, and have to be simulated with click events. It is worth it though for the fast development turnaround for certain kinds of functionality.

The obvious next step is to compile up your app in Xcode and deploy to the simulator. Again, this works pretty well, but even the simulator has differences from the actual device, and again, it is just too fast. So what else can you do?

Why not install your native app as a web app? Sounds weird I know. The whole point of using PhoneGap is so that your apps can be native! But, if you install your app as a web app, guess what? No more installs! You just reload the app directly on your device every time you make a change.

Setting this up requires a little configuration. You need to run a web server to serve up the files in the www folder of the PhoneGap project. nginx is a good choice – here’s a simple configuration snippet:

[gist id=”604802″]
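As a rough sketch, a config along these lines serves the www folder as static files. The paths and app name here are placeholders, not necessarily the gist’s exact contents:

server {
  listen 80;

  location /myapp/ {
    alias /path/to/my-phonegap-project/www/;
  }
}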

You can then point your browser at http://your-ip/myapp/index.html and there’s your app! Do this using Mobile Safari on your device, hit the + button and select “Add to Home Screen” to install as a web app, and away you go.

The big advantage of this approach is that you can test your app pretty much as it will appear and behave. You can even access the Mobile Safari debug log. Just remember to use the special meta tags to get rid of the browser chrome.

[gist id=”604807″]
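If you haven’t met them before, these are the standard iOS web app meta tags in question; a sketch of the usual set, not necessarily the gist’s exact contents:

<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=no">
<meta name="apple-mobile-web-app-capable" content="yes">
<meta name="apple-mobile-web-app-status-bar-style" content="black">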

One further advantage is that the API environment will now be slightly closer to the full PhoneGap mobile API. Of course, you won’t be able to do things that can only be done using PhoneGap, but this gets you quite far along the road.

One final trick. Do the same thing in the desktop iPhone simulator and speed up your testing there as well!





The Difference Between Alchemy and Chemistry

Paris. It is the 8th of May, 1794. Antoine Lavoisier, a partner in the despised Ferme générale, stands before the guillotine. As a senior partner of the Ferme générale, a tax collection agency for Louis XVI, Lavoisier is one of many wealthy aristocrats beheaded during the French revolution. Later, Joseph-Louis Lagrange, the esteemed Italian mathematician (and without whom today no satellite would make it into orbit), would write: “It took them only an instant to cut off his head, but France may not produce another such head in a century.”

Why would Lagrange care for a wealthy tax-collector? This tax-collector, the chemist Antoine Lavoisier, put to death the most embarrassing of the pseudo-sciences: alchemy. He did this not by great experiments (although he did some of those), nor by great denouncements (he left that to Robert Boyle’s The Sceptical Chymist), nor by great popularity (not many rushed to save him from the guillotine). Alchemy was ultimately defeated by the creation of a common language for naming chemical elements. We still use much of this language today when we talk of sulfates or oxides. Published in 1787, Lavoisier’s Méthode de nomenclature chimique describes an organised, systematic method for naming chemical compounds, both those already known and, importantly, those yet to be discovered.

Why does this matter? And why does this matter more than Lavoisier’s other work (funded by all that tax collecting)? The establishment of a common language and a common standard for chemistry allowed this new science to separate itself from the medieval confidence trick that is alchemy. The core cultural aspects of the practice of alchemy are secrecy, obfuscation, indirection, and mysticism. To read a given alchemical text, and to then attempt to reproduce the activities (calling them experiments is too kind) described, was often impossible, even for experienced alchemists. The language of alchemy is one of multiple dialects, private jokes, and over-the-top jargon. The pinnacle of alchemical exposition is the wonderful illustrated manuscript Mutus Liber. Published in France in 1677, this book consists of nothing more than a series of mystical illustrations, presented without explanation. These illustrations describe, in considerable detail, the process whereby one can obtain Gold from Mercury. I have taken some time to divine the meaning of this great work. I alone can finally reveal the mystical secrets contained within its fifteen sublime pages. It now seems clear that it is a fairly straightforward introduction to the photoneutron process (briefly: Mercury-198 + 6.8 MeV gamma ray → Mercury-197 + 1 neutron, then Mercury-197 → Gold-197 + 1 positron). Sadly, schematics for a nuclear research reactor were not included.

How powerful, ultimately, was Lavoisier’s new chemical language? Powerful enough to convince Richard Kirwan, proponent of the phlogiston theory of fire (this was a magical substance released via burning), to renounce his views and accept those of Lavoisier. Of course, the science and the experiments did the grunt work. But the conversion of Kirwan has more to do with open communication and open data than lab work. Lavoisier would not have been able to read Kirwan’s 1787 Essay on Phlogiston and the Constitution of Acids, written in English, were it not for the remarkable scientific partnership that he formed with his wife, Marie-Anne Pierrette Paulze, one of the great unsung heroes of modern chemistry. Not only did Marie-Anne, proficient in Latin and English, translate Kirwan’s book, she also translated much of Lavoisier’s copious correspondence. This was possible only because she herself was a subject matter expert, working closely with Lavoisier as a co-researcher. Her documentation of their work, particularly in the form of engravings, is an important part of our shared scientific heritage.

Open communication between scientific collaborators led to open communication between scientific rivals. In 1791 Kirwan pronounced himself a convert to Lavoisier’s Oxygen theory of combustion (carefully established by weighing the reaction components before and after burning to establish that fire does not create or destroy matter, only converts it to another form). How was this possible? Lavoisier and Kirwan communicated using a common chemical language. By freeing themselves from the obtuseness of alchemy, they were able to communicate directly and openly. Kirwan could always be certain, using the Méthode de nomenclature chimique (an English translation was available as early as 1788), that he and Lavoisier were talking about the same chemicals.

The development of an open standard led to an explosion of research progress in the nascent field of chemistry. This is simply another instance of Metcalfe’s law: the value of a network (of scientists) grows with the square of the number of interlinked nodes (scientists who can communicate with other scientists). There was no sort of mathematical alchemy in the 1700s – mathematics already had a common set of concepts. Joseph Priestley, discoverer of Oxygen, writing nine short years after Lavoisier’s book, shows us the power of this growth: “There have been few, if any, revolutions in science so great, so sudden, and so general, as the prevalence of what is now usually termed the new system of chemistry…”. Open standards enable open data, and both enable rapid scientific progress.





Do Something Practical With CSV Files!

Want to be able to export and import tables from your database using a web interface? You've come to the right place!

I've just finished a new tutorial for our CSV Manager product: Uploading and Downloading CSV Files from a Website Database. It's one of those classic CSV use cases — a simple solution to a tricky problem.

Basically, you can outsource complex data editing tasks to Excel. This means you don't have to write such a complex back office application for your customers. And everybody's happy!





Spark Lines Without the Spark

Sparklines are one of those great ideas that you just know is “right” the moment you see it. Edward Tufte invented them, and let me tell you, he knows his stuff.

Want to make some yourself? Check out Joe Gregorio's Sparkline creator.

So what's this rant about? Well given that sparklines are such a great little idea, such a compact, non-intrusive way to present information, you'd imagine it would be hard to get them wrong. And that's exactly what Der Spiegel has managed to do.

Take a look at this article about the current market meltdown. Look at all those lovely sparklines! Each one right beside the market index referred to. Lovely.

Oh wait. They're all the bloody same! Huh? Why go to the bother of inserting a little graphic beside each market index, in the text, and not making it a sparkline? Imagine how much more readable and understandable the text would be if these little graphics were real sparklines! Way to go. What a waste. If I were the online editor of Der Spiegel I would really jump on this and sort it out. What a difference it would make.





Some Volatile Patterns

I've always regarded Java's volatile variables as voodoo variables. In fact, I've been scared off by many articles telling you how terribly dangerous they are. In cases like these I tend to retreat to the safety of a few good patterns.

Except, I could never find any good patterns for using volatile. Luckily, Brian Goetz has just written an article solving this problem! Go check out Managing volatility.

The patterns are:

1. status flags
2. one-time safe publication
3. independent observations
4. the "volatile bean" pattern
5. the cheap read-write lock trick

And hey, it's Brian "Mr. Concurrency" Goetz, so this stuff has to be good!






How to Beat Nasty Interview Programming Tasks

Shane Bell does a write-up of an interview he went through. Apparently the company just dumped a programming exercise on him and left him with a pencil and paper for an hour. Nasty!

While the basic idea of a “real” programming test at interview is great, asking someone to do it with a pencil is just plain daft! This is a perfect example of cargo-culting. They know they should get people to program in an interview, they know they should ask a “tough” question. But then they invalidate the whole thing by testing “pencil-based-programming-acuity”! Whatcha building guys? A Babbage engine? Um, you know, how difficult is it, if you are going to the trouble of all this testing, to set up a locked down machine with no internet access?

Anyway, Shane runs through the exercise and his solution. He does pretty well. He also asks if there's a better solution.

Yes, Virginia, there is a Santa Claus!

And he lives at MIT OpenCourseWare. Specifically, the AI search lectures. Fantastic stuff.

Looking at the problem they gave Shane, finding a path through a maze from top-right to bottom-left, it looks like you could throw an A* search at it and do pretty well. Add some iterative-deepening if you're feeling fancy and want to handle big mazes. Basically, you score each candidate square by the distance travelled so far plus its straight-line distance to the goal square at the bottom-left, and move to the square with the best score. If you get stuck in a cul-de-sac, backtrack out of it (Shane does use backtracking).
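For the curious, here's a compact sketch of that idea in JavaScript. The maze encoding and function names are my own invention (0 is an open square, 1 is a wall), and the heuristic is the straight-line distance described above:

function astar(maze, start, goal) {
  function h(p) {   // heuristic: straight-line distance to the goal
    return Math.sqrt( Math.pow(p[0]-goal[0],2) + Math.pow(p[1]-goal[1],2) )
  }
  var open = [ {pos: start, g: 0, f: h(start), path: [start]} ]
  var closed = {}

  while( open.length > 0 ) {
    open.sort(function(a,b){ return a.f - b.f })   // cheapest estimate first
    var cur = open.shift()
    if( closed[cur.pos] ) continue                 // already expanded
    closed[cur.pos] = true

    if( cur.pos[0] === goal[0] && cur.pos[1] === goal[1] ) return cur.path

    var moves = [[0,1],[0,-1],[1,0],[-1,0]]
    for( var i = 0; i < moves.length; i++ ) {
      var r = cur.pos[0]+moves[i][0], c = cur.pos[1]+moves[i][1]
      if( r < 0 || r >= maze.length || c < 0 || c >= maze[0].length ) continue
      if( maze[r][c] === 1 || closed[[r,c]] ) continue
      var g = cur.g + 1                            // cost travelled so far
      open.push({pos: [r,c], g: g, f: g + h([r,c]), path: cur.path.concat([[r,c]])})
    }
  }
  return null   // no path exists
}

var maze = [
  [0,0,1],
  [1,0,1],
  [1,0,0]
]
console.log( astar(maze, [0,0], [2,2]) )
// [ [0,0], [0,1], [1,1], [2,1], [2,2] ]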

So how do you beat these nasty interviews? Know your search algorithms! Most of these "puzzles" can be solved with some sort of search. I'll bet you anything the guys who set this question were either a) clueless, so a good algorithm will really impress them, or b) not clueless and actually looking for a proper algorithm like A*. Either way you win!



