Do Something Practical With CSV Files!

Want to be able to export and import tables from your database using a web interface? You've come to the right place!

I've just finished a new tutorial for our CSV Manager product: Uploading and Downloading CSV Files from a Website Database. It's one of those classic CSV use cases — a simple solution to a tricky problem.

Basically, you can outsource complex data editing tasks to Excel. This means you don't have to write such a complex back office application for your customers. And everybody's happy!




Posted in Java

Spark Lines Without the Spark

Sparklines are one of those great ideas that you just know is “right” the moment you see it. Edward Tufte invented them, and let me tell you, he knows his stuff.

Here's an example: [sparkline image]. Want to make some yourself? Check out Joe Gregorio's Sparkline creator.

So what's this rant about? Well, given that sparklines are such a great little idea, such a compact, non-intrusive way to present information, you'd imagine it would be hard to get them wrong. And that's exactly what Der Spiegel has managed to do.

Take a look at this article about the current market meltdown. Look at all those lovely sparklines! Each one right beside the market index referred to. Lovely.

Oh wait. They're all the bloody same! Huh? Why go to the bother of inserting a little graphic beside each market index, in the text, and then not make it a sparkline? Imagine how much more readable and understandable the text would be if these little graphics were real sparklines! Way to go. What a waste. If I were the online editor of Der Spiegel I would really jump on this and sort it out. What a difference it would make.




Posted in Rant

Some Volatile Patterns

I've always regarded Java's volatile variables as voodoo variables. In fact, I've been scared off by very many articles telling you how terribly dangerous they are. In cases like these I tend to retreat to the safety of a few good patterns.

Except, I could never find any good patterns for using volatile. Luckily, Brian Goetz has just written an article solving this problem! Go check out Managing volatility.

The patterns are:

And hey, it's Brian Mr. Concurrency Goetz, so this stuff has to be good!
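
To give a flavour of the kind of thing involved, here's a minimal sketch of a volatile status flag, probably the best-known safe use of volatile. The class, field and method names are made up for illustration, and the busy loop is just a stand-in for real work.

```java
class StoppableWorker implements Runnable {
    // volatile guarantees that a write from one thread is seen by the other
    private volatile boolean shutdownRequested = false;
    private long iterations = 0;

    // Called from any thread to ask the worker to stop
    public void shutdown() {
        shutdownRequested = true;
    }

    @Override
    public void run() {
        // The flag is re-read on every pass because it is volatile
        while (!shutdownRequested) {
            iterations++; // stand-in for real work
        }
    }

    public static void main(String[] args) throws InterruptedException {
        StoppableWorker worker = new StoppableWorker();
        Thread thread = new Thread(worker);
        thread.start();
        Thread.sleep(50);
        worker.shutdown();
        thread.join(1000);
        System.out.println("worker stopped: " + !thread.isAlive());
    }
}
```

The flag only ever flips one way and nothing else depends on it, which is what keeps this simple. Anything like a counter shared between threads needs more than volatile.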





Posted in Java

How to Beat Nasty Interview Programming Tasks

Shane Bell does a write-up of an interview he went through. Apparently the company just dumped a programming exercise on him and left him with a pencil and paper for an hour. Nasty!

While the basic idea of a “real” programming test at interview is great, asking someone to do it with a pencil is just plain daft! This is a perfect example of cargo-culting. They know they should get people to program in an interview, they know they should ask a “tough” question. But then they invalidate the whole thing by testing “pencil-based-programming-acuity”! Whatcha building guys? A Babbage engine? Um, you know, how difficult is it, if you are going to the trouble of all this testing, to set up a locked down machine with no internet access?

Anyway, Shane runs through the exercise and his solution. He does pretty well. He also asks if there's a better solution.

Yes, Virginia, there is a Santa Claus!

And he lives at MIT OpenCourseWare. Specifically, the AI search lectures. Fantastic stuff.

Looking at the problem they gave Shane, finding a path through a maze from top-right to bottom-left, it looks like you could throw an A* search at it and do pretty well. Add some iterative-deepening if you're feeling fancy and want to handle big mazes. Basically, you try to predict the best direction by calculating your current straight-line distance from the goal square, and choosing the next square as the one that gets you closest (A* proper also adds in the number of steps you've already taken, so it finds the shortest path and not just any path). If you get stuck in a cul-de-sac, backtrack out of it (Shane does use backtracking).

So how do you beat these nasty interviews? Know your search algorithms! Most of these “puzzles” can be solved with some sort of search. I'll bet you anything the guys who set this question were either a.) clueless, so a good algorithm will really impress them, or b.) not clueless and actually looking for a proper algorithm like A*. Either way you win!
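
Here's roughly what an A* search over a grid maze could look like. This is a minimal sketch, not Shane's solution and not necessarily what the interviewers wanted. It assumes the maze is a rectangular grid of strings with '#' for walls, that you can only move up, down, left and right, and it swaps in Manhattan distance for straight-line distance as the estimate (it never overestimates on a grid like this). All the names are made up.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

class MazeAStar {
    // Returns the number of steps on a shortest path, or -1 if the goal is unreachable.
    // '#' is a wall; any other character is open floor.
    static int shortestPath(String[] maze, int startRow, int startCol,
                            int goalRow, int goalCol) {
        int rows = maze.length, cols = maze[0].length();
        int[][] best = new int[rows][cols]; // cheapest known cost to reach each square
        for (int[] row : best) Arrays.fill(row, Integer.MAX_VALUE);
        best[startRow][startCol] = 0;

        // Queue entries are {f, g, row, col}, ordered by f = g + estimate
        PriorityQueue<int[]> open =
            new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        open.add(new int[] {
            estimate(startRow, startCol, goalRow, goalCol), 0, startRow, startCol});

        int[][] moves = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};
        while (!open.isEmpty()) {
            int[] cur = open.poll();
            int g = cur[1], r = cur[2], c = cur[3];
            if (r == goalRow && c == goalCol) return g;
            if (g > best[r][c]) continue; // stale entry, a cheaper route was found
            for (int[] m : moves) {
                int nr = r + m[0], nc = c + m[1];
                if (nr < 0 || nr >= rows || nc < 0 || nc >= cols) continue;
                if (maze[nr].charAt(nc) == '#') continue;
                int ng = g + 1;
                if (ng < best[nr][nc]) {
                    best[nr][nc] = ng;
                    open.add(new int[] {
                        ng + estimate(nr, nc, goalRow, goalCol), ng, nr, nc});
                }
            }
        }
        return -1; // ran out of squares to try
    }

    // Manhattan distance to the goal
    static int estimate(int r, int c, int goalRow, int goalCol) {
        return Math.abs(r - goalRow) + Math.abs(c - goalCol);
    }

    public static void main(String[] args) {
        String[] maze = {
            "...#.",
            ".#.#.",
            ".#...",
            ".####",
            ".....",
        };
        // Top-right (row 0, col 4) to bottom-left (row 4, col 0)
        System.out.println(shortestPath(maze, 0, 4, 4, 0));
    }
}
```

There's no explicit backtracking here. When a dead end runs out of neighbours, the queue simply hands back the next most promising square.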




Posted in General

Level 3!

Well you might have thought that I had given up on the touch typing. I've been trying to learn to touch-type for the last two years. It's all going tragically slowly. But, I can tell you that I am in fact touch-typing this very blog post — my first touch-typed blog post ever!

I'm not really there yet: my touch-typing is still slower than my “natural” typing. But I have, finally, cracked the notorious level three on the learn2type.com site. If you've read my previous post about learn2type, you'll remember that level three is this dreadfully unbalanced drill that jumps from the home row keys straight into all the punctuation. It's a real killer for your enthusiasm. You have to be pretty dedicated to beat it. It took me over a year. While the learn2type site is pretty OK so far as learning to type goes, it does have some serious flaws. And I have not seen any updates in over a year. Still, I would recommend it overall. The performance graphs in particular are very cool, providing good feedback on your progress.

My meta-strategy for learning to touch-type remains the same as before — try all the online tutoring sites one at a time until I have good speed and accuracy. Stay tuned…




Posted in General

Boxes and Lines, Boxes and Lines…

Charles Miller posted a great comment on his blog that absolutely cracks me up:

Pretty much any computing problem, given a sufficient level of abstraction, can be reduced to a diagram of boxes joined together with lines. At this level your solution will look startlingly simple, and you'll be able to sell it to someone.

So true, so true.




Posted in General

How to Create a Comment Archive Using CSV to Generate HTML

I've been trying to find a workable way to manage my comments for quite some time. By which I mean, the comments that I make on other people's blogs. You need to be able to go back and see if the conversation has progressed. It's also nice just to have a record of what you said and when you said it.

I was using CoComment for a while. This is a service that tracks comments on blogs. It's pretty cool. Trouble is, it only works for the main blogging engines, and you have to install a plugin. I removed all plugins from my Firefox recently because it was acting up, and I'm not keen on reinstalling just at the moment. In any case, the CoComment plugin tended to slow down non-blog sites (looking for comment forms I suppose).

So I've decided on a simpler solution: just have a page on my blog where all my comments are listed in reverse chronological order, with a link back to the relevant blog entry. I can skim through the first few to see if recent conversations have anything new. As for the old conversations, well, I guess I won't know if there are more comments. But that's “good enough” for the time being. The easiest way to build this page is cut-and-paste. Come up with a bit of HTML and copy it for each new entry. Yeah, it has to be done by hand, but hey! The archive of comments is interesting enough to be worth recording.

Here's the comments archive, so you can see what I mean.

Well, you're right, cut-and-paste is such a bad smell. It's better to have your data in a manageable format. So Ricebridge to the rescue! You can put the data into a CSV file and generate the HTML (or rather XHTML) from it. For example, here's a record of some comments:

Date,Blog,Link,Comment
2007-04-10,mariosalexandrou.com, 
  http://www.mariosalexandrou.com/blog/?p=291&c=y, 
  "Hey! I did all that already! Where's my six figures? love it :) "
2007-04-04,Tyner Blain, 
  http://tynerblain.com/blog/2007/04/03/ba-profit-center, 
  "It’s amazing how naming something almost completely defines it."
...

It's just a CSV file. Easy to update by hand. Whenever you make a new comment, throw in the details (date, blog title, link and comment text) at the top of the CSV file.

So then how do we turn this into HTML? Well, here's the HTML I'm producing from this CSV file:

<div class="commentbox">
  <div class="comment">
    <b>
      <span>2007-04-10</span>
      <a href="http://www.mariosalexandrou.com/blog/?p=291&amp;c=y">mariosalexandrou.com</a>
    </b>
    <p>Hey! I did all that already! Where's my six figures? love it :) </p>
  </div>
  <div class="comment">
    <b>
      <span>2007-04-04</span>
      <a href="http://tynerblain.com/blog/2007/04/03/ba-profit-center">Tyner Blain</a>
    </b>
    <p>It's amazing how naming something almost completely defines it.</p>
  </div>
</div>

It's a nice little microformat of sorts, I suppose.

To produce this, you need to take the CSV columns and place them into the right positions in the XML format. We're generating XHTML, which is just XML, which is just well-behaved HTML, so this is all cool and froody.

Using XML Manager, you can define a set of XPath expressions to handle this. And here they are:

each row     -> /div/div
'commentbox' -> /div/@class
'comment'    -> @class
Date         -> b/span
Blog         -> b/a
Link         -> b/a/@href
Comment      -> p

This creates a main <div class="commentbox"> containing a set of <div class="comment"> elements, one for each comment. The CSV columns all go into subelements of the comment div.

And here's the code to tie it all together:

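import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
// (plus the Ricebridge imports for CsvManager, XmlManager and RecordSpec)

// Load the CSV, skipping the header line and any empty lines
CsvManager csvman = new CsvManager();
csvman.getCsvSpec().setStartLine(2);
csvman.getCsvSpec().setIgnoreEmptyLines(true);
List in = csvman.load("data/comment.csv");

// One record per /div/div; the field paths follow the mapping above
RecordSpec rs = new RecordSpec("/div/div", 
    new String[] { "/div/@class", "@class", 
      "b/span","b/a","b/a/@href","p"});

// Put the two fixed class values in front of the four CSV columns
List out = new ArrayList();
for( Iterator cI = in.iterator(); cI.hasNext(); ) {
  String[] inrow = (String[]) cI.next();
  String[] outrow = new String[] {"commentbox","comment",
    inrow[0],inrow[1],inrow[2],inrow[3]};
  out.add(outrow);
}

// Write the records out as XHTML
XmlManager xmlman = new XmlManager(rs);
xmlman.save("data/comment.htm",out);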

You just load up the CSV, and spit it out again as XML… “there's nothing to it, really…”

And then all you do is dynamically include this file on your web page, and you're done!
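
If you're curious what that mapping boils down to, here's a rough sketch of the same row-to-div transformation in plain Java, with no Ricebridge libraries involved. It's only an illustration under a few assumptions: the rows are already parsed into {date, blog, link, comment} arrays, and the class and method names are made up. It is not how XML Manager works internally.

```java
import java.util.List;

class CommentArchiveSketch {

    // Escape the characters that would break XHTML text or attribute values
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Render rows of {date, blog, link, comment} as the commentbox markup
    static String render(List<String[]> rows) {
        StringBuilder sb = new StringBuilder("<div class=\"commentbox\">\n");
        for (String[] r : rows) {
            sb.append("  <div class=\"comment\">\n")
              .append("    <b>\n")
              .append("      <span>").append(escape(r[0])).append("</span>\n")
              .append("      <a href=\"").append(escape(r[2])).append("\">")
              .append(escape(r[1])).append("</a>\n")
              .append("    </b>\n")
              .append("    <p>").append(escape(r[3])).append("</p>\n")
              .append("  </div>\n");
        }
        return sb.append("</div>\n").toString();
    }

    public static void main(String[] args) {
        List<String[]> rows = List.of(
            new String[] {"2007-04-10", "mariosalexandrou.com",
                "http://www.mariosalexandrou.com/blog/?p=291&c=y",
                "Hey! I did all that already! Where's my six figures? love it :) "},
            new String[] {"2007-04-04", "Tyner Blain",
                "http://tynerblain.com/blog/2007/04/03/ba-profit-center",
                "It's amazing how naming something almost completely defines it."});
        System.out.print(render(rows));
    }
}
```

The escaping is the bit that's easy to forget when you do this by hand: a raw & in a link is enough to make the XHTML invalid.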





Posted in Java

Ruby: A Quick Cleaner for Your AdWords Invoices

If you advertise on Google then you probably have to save and print your monthly AdWords invoices. Especially if you want to get the VAT back! In fact, I do this with pretty much all my online services. I need to have copies of invoices for my own records and for end-of-year accounts. It's a pain in the ass, frankly. You have to go to each site, call up the invoice page, launch the printable version (if they have one), save the page and print it. Don't know about you, but what a waste of precious coding time!

Somebody should write a web service to consolidate online service payments. Yeah, sure, the offline payments world has been done. You can go to sites like BillPay.ie and sort it all out. Where's the same solution for my online accounts? Isn't it ironic that the most connected, most virtual services, are the ones I'm putting the most physical labour into?

And what really bugs me is when I save an HTML page, Firefox saves all the ancillary files in a [name]_files folder. Normally this is what you want I suppose, but for a series of monthly invoices it's not the right thing at all. You end up with loads of folders all containing the same set of CSS, JavaScript, and image files (by the way, can we all agree to call them folders, and not directories? Saves on the typing, you know…). Which is annoying, because you do actually need all those extra files to make the invoices look pretty if you ever want to print them out again. So really what you want is for all the extra files to live in one folder, say, files, and for all the downloaded HTML pages to refer to this folder.

And the other reason for having them all in the same folder is to support version control. Call me paranoid, but I put everything into Subversion. So handy. And multiple files that are all really the same is a completely pointless state of affairs when you add Subversion into the mix. Oy vey!

Up to now I've half-solved this problem whenever it bugged me enough by recording a quick Emacs macro and flying through the files. But you can only record the same macro so many times in your life without cracking up. Time for automation: “Why program by hand in five days what you can spend five years of your life automating?”

Well, let's do it in Ruby and then we can all go home after lunch. Here, for your use, should you have this exact same problem, is a little Ruby script to fix the links in the downloaded HTML pages, and copy the ancillary files into a files folder (folder, yeah?). Notice that I don't delete the old files and folders. Trust me, never delete files in a hacked-together five-minute script, you will very much regret it. Delete stuff by hand.

require 'fileutils'

# Saved invoices are named YYYYMMDD.htm
date_file_paths = Dir.entries('.').select { |f| 
  f =~ /\A\d{8}\.htm\z/ }
date_names = date_file_paths.map { |f| 
  f.match(/(\d{8})/)[1] }

max = 0
date_names.each { |n| 
  # Point every reference to YYYYMMDD_files at the shared files folder
  content = []
  File.foreach("#{n}.htm") { |line|
    content << line.gsub(/#{n}_files/,'files')
  }
  File.open("#{n}.htm","w") { |f|
    f.print content.join
  }
  max = n.to_i if n.to_i > max
}

# Copy the ancillary files of the most recent invoice into files
FileUtils.mkdir_p('files')
FileUtils.cp( Dir.entries("#{max}_files").select{ |n| 
  n != '.' && n != '..' }.map { |n| 
    "#{max}_files/#{n}" }, 'files' )

Oh yeah, one more thing, I save the files using the naming convention YYYYMMDD.htm (you might have guessed, I suppose).





Posted in Web

Beat ya to it Joel!

Ah, imitation, the sincerest form of flattery :)

Now Joel, just because I posted my demand curve doesn't mean you have to post yours.

Ah, only kidding.

Actually it's good to see some data from a more established business. I definitely made some pricing mistakes at the start. This type of data should give anyone starting a micro-ISV the confidence to go for higher margins.

Don't be scared. People will pay.




Posted in Business

Squidoo?

I thought I would give Seth's Squidoo thing a go.

Of course, what other subject to create content on, but CSV files! :)

Let's see if it generates any traffic…




Posted in General