Ruby: A Quick Cleaner for Your Adwords Invoices

If you advertise on Google then you probably have to save and print your monthly AdWords invoices. Especially if you want to get the VAT back! In fact, I do this with pretty much all my online services. I need to have copies of invoices for my own records and for end-of-year accounts. It's a pain in the ass, frankly. You have to go to each site, call up the invoice page, launch the printable version (if they have one), save the page and print it. Don't know about you, but what a waste of precious coding time!

Somebody should write a web service to consolidate online service payments. Yeah, sure, the offline payments world has been done. You can go to sites like BillPay.ie and sort it all out. Where's the same solution for my online accounts? Isn't it ironic that the most connected, most virtual services, are the ones I'm putting the most physical labour into?

And what really bugs me is that when I save an HTML page, Firefox saves all the ancillary files in a [name]_files folder. Normally this is what you want, I suppose, but for a series of monthly invoices it's not the right thing at all. You end up with loads of folders all containing the same set of CSS, JavaScript, and image files (by the way, can we all agree to call them folders, and not directories? Saves on the typing, you know…). Which is annoying, because you do actually need all those extra files to make the invoices look pretty if you ever want to print them out again. So really what you want is for all the extra files to live in one folder, say, files, and for all the downloaded HTML pages to refer to this folder.

And the other reason for having them all in the same folder is to support version control. Call me paranoid, but I put everything into Subversion. So handy. And multiple files that are all really the same is a completely pointless state of affairs when you add Subversion into the mix. Oy vey!

Up to now I've half-solved this problem whenever it bugged me enough by recording a quick Emacs macro and flying through the files. But you can only record the same macro so many times in your life without cracking up. Time for automation: “Why program by hand in five days what you can spend five years of your life automating?”

Well, let's do it in Ruby and then we can all go home after lunch. Here, for your use, should you have this exact same problem, is a little Ruby script to fix the links in the downloaded HTML pages, and copy the ancillary files into a files folder (folder, yeah?). Notice that I don't delete the old files and folders. Trust me, never delete files in a hacked-together five-minute script: you will very much regret it. Delete stuff by hand.

require 'fileutils'

# Invoices are saved as YYYYMMDD.htm; pick those out of the current folder.
date_file_paths = Dir.entries('.').select { |f|
  f =~ /\A\d{8}\.htm\z/ }
date_names = date_file_paths.map { |f|
  f.match(/(\d{8})/)[1] }

max = 0
date_names.each { |n|
  # Point every link at the shared files folder.
  content = []
  File.foreach("#{n}.htm") { |line|
    content << line.gsub(/#{n}_files/,'files')
  }
  File.open("#{n}.htm","w") { |f|
    f.print content.join
  }
  max = n.to_i if n.to_i > max
}

# Copy the ancillary files of the latest invoice into files.
FileUtils.mkdir_p('files')
FileUtils.cp( Dir.entries("#{max}_files").select{ |n|
  n != '.' && n != '..' }.map { |n|
    "#{max}_files/#{n}" }, 'files' )

Oh yeah, one more thing, I save the files using the naming convention YYYYMMDD.htm (you might have guessed, I suppose).
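
To see what the script does to a single page without touching any files, here is a minimal sketch of the two key steps: picking out the YYYYMMDD.htm names, and rewriting a link. The folder listing and the HTML line are made-up examples, not real AdWords output.

```ruby
# Hypothetical folder listing; only YYYYMMDD.htm names count as invoices.
entries = ["20060131.htm", "20060228.htm", "notes.txt", "20060131_files"]
invoices = entries.select { |f| f =~ /\A\d{8}\.htm\z/ }

# Hypothetical line from a saved invoice page.
name = "20060131"
line = %(<link rel="stylesheet" href="#{name}_files/invoice.css">)

# Same substitution the script applies to every line of the page.
fixed = line.gsub("#{name}_files", "files")

puts invoices.inspect
puts fixed
```

Every page ends up pointing at the one shared files folder, which is what keeps Subversion from tracking a dozen copies of the same stylesheet.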





Posted in Web

Beat ya to it Joel!

Ah, imitation, the sincerest form of flattery :)

Now Joel, just because I posted my demand curve doesn't mean you have to post yours.

Ah, only kidding.

Actually it's good to see some data from a more established business. I definitely made some pricing mistakes at the start. This type of data should give anyone starting a micro-ISV the confidence to go for higher margins.

Don't be scared. People will pay.




Posted in Business

Squidoo?

I thought I would give Seth's Squidoo thing a go.

Of course, what other subject to create content on, but CSV files! :)

Let's see if it generates any traffic…




Posted in General

New Title 2.0

I've decided to change the title of my blog. I haven't been happy with the old one for a while.

It was a play on the film How to Succeed in Business Without Really Trying, but I don't think it ever really worked.

I've been working on some new directions for the business, with new products due for release later this year, so the 2.0 moniker seems right. Now I just need to check with Tim.




Posted in General

Update: Firefox crashing with too many fonts

Yep, you read right. Too many fonts and bye-bye Firefox.

And by “too many” I mean one more than the number of standard Windows fonts…

You know, it really sucks. Firefox is so cool, but this is just not on. I was using Opera before I found this fix and you know what? Opera kicks ass. It's a damn fine browser and it hardly ever crashes these days. Back in the day I never used Opera because it was always bloody crashing. Must be Firefox's turn eh?

This “Interwebnet” thing will never catch on…




Posted in General

Firefox and Thunderbird Crashing Randomly

I am not a happy camper.

Firefox and Thunderbird on my Windows 2000 box are crashing completely at random. It seems like some sites and emails are toxic! Thunderbird even dies sometimes when scrolling.

Being of the software mindset I have poked and prodded at this problem a bit, but I'm stuck. Here's the killer: both work perfectly on my XP laptop. Killer huh?

And yes, I've done all the usual stuff like reinstalling, new profiles, compacting, yada yada, etc. etc.

Like I said, not a happy camper at all, at all.




Posted in General

Race Condition in Ruby GServer

I've been playing around with Ruby lately, trying to write a UDP server. Of course, the way to write servers in Ruby is apparently (I'm still a bit new at this Ruby malarkey) to subclass GServer.

And what a fine piece of code GServer is! Really does the job and shows just how easy it is to do stuff in Ruby. If you want a TCP server.

But I needed a UDP server. So after much messing about I came up with the following half-baked solution. Please feel free to tear it to pieces. I should remind you however that this code is WOMM certified.

# with apologies to John W. Small

require "socket"
require "thread"

class UServer

  DEFAULT_HOST = "127.0.0.1"
  DEFAULT_TRANSPORT = "TCP";

  def serve(io,content)
  end

  @@services = {}   # Hash of opened ports, i.e. services
  @@servicesMutex = Mutex.new

  def UServer.stop(port, 
                   host = DEFAULT_HOST, 
                   trans = DEFAULT_TRANSPORT)
    @@servicesMutex.synchronize {
      @@services[host][port][trans].stop
    }
  end

  def UServer.in_service?(port, 
                          host = DEFAULT_HOST, 
                          trans = DEFAULT_TRANSPORT)
    @@services.has_key?(host) and 
      @@services[host].has_key?(port) and
      @@services[host][port].has_key?(trans)
  end

  def stop
    @connectionsMutex.synchronize  {
      if @serverThread
        @serverThread.raise "stop"
      end
    }
  end

  def stopped?
    nil == @serverThread
  end

  def shutdown
    @shutdown = true
  end

  def connections
    @connections.size
  end

  def join
    @serverThread.join if @serverThread
  end

  attr_reader :port, :host, :trans, :maxConnections
  attr_accessor :stdlog, :audit, :debug

  def connecting(client)
    addr = client.peeraddr
    log("#{self.class.to_s} #{@host}:#{@port} #{@trans} client:#{addr[1]} " +
        "#{addr[2]}<#{addr[3]}> connect")
    true
  end

  def disconnecting(clientPort)
    log("#{self.class.to_s} #{@host}:#{@port} #{@trans} " +
      "client:#{clientPort} disconnect")
  end

  protected :connecting, :disconnecting

  def starting()
    log("#{self.class.to_s} #{@host}:#{@port} #{@trans} start")
  end

  def stopping()
    log("#{self.class.to_s} #{@host}:#{@port} #{@trans} stop")
  end

  protected :starting, :stopping


  def error(detail)
    log(detail.backtrace.join("\n"))
  end

  def log(msg)
    if @stdlog
      @stdlog.puts("[#{Time.new.ctime}] %s" % msg)
      @stdlog.flush
    end
  end

  protected :error, :log

  def initialize(port, host = DEFAULT_HOST, trans = DEFAULT_TRANSPORT, maxConnections = 4,
    stdlog = $stderr, audit = false, debug = false)
    @serverThread = nil
    @port = port
    @host = host
    @trans = trans
    @maxConnections = maxConnections
    @connections = []
    @connectionsMutex = Mutex.new
    @connectionsCV = ConditionVariable.new
    @stdlog = stdlog
    @audit = audit
    @debug = debug
  end


  def start(maxConnections = -1)
    raise "running" if !stopped?
    @shutdown = false
    @maxConnections = maxConnections if maxConnections > 0
    @@servicesMutex.synchronize  {
      if UServer.in_service?(@port,@host,@trans)
        raise "Port already in use: #{host}:#{@port} #{@trans}!"
      end

      if "TCP" == @trans
        @server = ContentTCPServer.new(@host,@port)
      else
        @server = ContentUDPServer.new(@host,@port)
      end

      @@services[@host] = {} unless @@services.has_key?(@host)
      @@services[@host][@port] = {} unless @@services[@host].has_key?(@port)
      @@services[@host][@port][@trans] = self;
    }
    @serverThread = Thread.new {
      begin
        starting if @audit
        while !@shutdown
          @connectionsMutex.synchronize  {
          puts "start @con.size=#{@connections.size}"
             while @connections.size >= @maxConnections
               @connectionsCV.wait(@connectionsMutex)
             end
          }

          client, port, close, content = @server.accept

          Thread.new(client,port,close,content)  { 
            |myClient, myPort, myClose, myContent|

            @connectionsMutex.synchronize {
              @connections << Thread.current
            }
            begin
              serve(myClient,myContent) if !@audit or connecting(myClient)
              puts "finished serve"
            rescue => detail
              error(detail) if @debug
            ensure
              begin
                if myClose
                  myClient.close
                end
              rescue
              end

              @connectionsMutex.synchronize {
                @connections.delete(Thread.current)
                @connectionsCV.signal
              }

              disconnecting(myPort) if @audit
            end
          }
        end
      rescue => detail
        error(detail) if @debug
      ensure
        begin
          @server.close
        rescue
        end
        if @shutdown
          @connectionsMutex.synchronize  {
             while @connections.size > 0
               @connectionsCV.wait(@connectionsMutex)
             end
          }
        else
          @connections.each { |c| c.raise "stop" }
        end
        @serverThread = nil
        @@servicesMutex.synchronize  {
          @@services[@host][@port].delete(@trans)
        }
        stopping if @audit
      end
    }
    self
  end

end


class ContentTCPServer

  def initialize( host, port )
    @server = TCPServer.new(host,port)
  end
  
  def accept
    client = @server.accept
    return [client,client.peeraddr[1],true,client.gets(nil)]
  end  

  def close 
    @server.close
  end

end


class ContentUDPServer

  def initialize( host, port )
    puts "init"

    @socket = UDPSocket.new
    puts "new s: #{@socket} on #{host}:#{port}"

    @socket.bind(host, port)
    puts "bound: #{@socket}"
  end
  
  def accept
    puts "accept"

    packet = @socket.recvfrom(1024)
    return [@socket, 0, false, packet[0]]
  end  

  def close 
    @socket.close
  end

end

Lovely! Who says Java can't be translated into Ruby? I even used duck typing!
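
If the UDP side looks odd, a stripped-down sketch may help. With UDP there is no per-client socket to accept: recvfrom hands back the packet plus the sender's address, which is why ContentUDPServer#accept returns the shared socket and a close flag of false. The snippet below is a standalone illustration, not part of UServer; binding to port 0 (let the OS pick a free port) is an assumption made so the demo can run anywhere.

```ruby
require "socket"

# Bind to port 0 so the OS picks a free port (demo assumption).
server = UDPSocket.new
server.bind("127.0.0.1", 0)
port = server.addr[1]

# A throwaway client sends one datagram to the server.
client = UDPSocket.new
client.send("hello", 0, "127.0.0.1", port)

# recvfrom returns [data, sender_address]; sender_address holds
# the address family, port, host name and IP of the sender.
data, sender = server.recvfrom(1024)

# Reply on the same shared socket, addressed back to the sender.
server.send(data.upcase, 0, sender[3], sender[1])

reply, _ = client.recvfrom(1024)
puts reply

client.close
server.close
```

Because the one socket serves every packet, nothing should close it per request, unlike the TCP case where each accepted client gets its own connection.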

Anyway, see if you can spot the significant difference in thread handling between this and the original GServer.

Well, OK, here it is:

GServer says:

@connections << Thread.new(client)  { |myClient|
  ...
}

I say:

Thread.new(client,port,close,content)  { |myClient, myPort, myClose, myContent|
  @connectionsMutex.synchronize {
    @connections << Thread.current
  }
  ...
}

What happened was that UDP packets were handled so quickly that the thread was (mostly) never placed in the @connections list until after it was (supposedly) removed from the list by

@connectionsMutex.synchronize {
  @connections.delete(Thread.current)
  @connectionsCV.signal
}

at the end of the request thread. Seems like a nasty race condition to me. My code just makes sure that the thread is placed into the @connections list before nasty stuff can happen.
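
Here is a small sketch of the ordering problem as I understand it. It is not GServer code: the join call stands in for a handler that finishes before the parent thread gets around to appending it, which forces the unlucky ordering every time instead of leaving it to chance.

```ruby
mutex = Mutex.new

# Parent-registers style: the parent appends the thread after Thread.new.
connections = []
t = Thread.new {
  # The handler finishes and removes itself, but it is not in the list yet,
  # so this delete does nothing.
  mutex.synchronize { connections.delete(Thread.current) }
}
t.join             # simulate a very fast handler (e.g. one UDP packet)
connections << t   # the parent appends a thread that has already finished
stale = connections.size

# Self-registering style: the thread adds itself before doing any work.
connections2 = []
t2 = Thread.new {
  mutex.synchronize { connections2 << Thread.current }
  begin
    # serve the request here
  ensure
    mutex.synchronize { connections2.delete(Thread.current) }
  end
}
t2.join
clean = connections2.size

puts "parent-registers leaves #{stale} stale entry, self-registering leaves #{clean}"
```

In the first style the finished thread stays in the list forever, so the connection count creeps up until the server thinks it is full. In the second style the add and the delete both happen inside the request thread, so they cannot get out of order.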

Am I right? You tell me…




Posted in General

Demand Curve for Java Software Components

I'm going to give some insider information. When I was starting up my components business I found it nearly impossible to get any data on what my sales would look like. So now that I've had a couple of years to collect some data on my own products, I'm going to let you have a little look at it.

The most critical decision you face for a new product is how to price it. There's all sorts of stuff written about this. And all sorts of names for the various strategies: skimming, penetration, cost-plus, etc. Go read Joel for a nice introduction.

Anyway, what I'm going to give you is my demand curve, and my revenue curve. Hmm. I don't know if it's called a revenue curve, but it seems like a good name. It's the one where instead of showing price, like in the demand curve, you show total revenue. Basically, price by units sold. Both curves show values per month. Right, on with the show:

My Demand Curve

My Revenue Curve

Ah. Numbers, eh? Actual revenue numbers? ROFL mate!

Nope, sorry, can't give you actual revenue numbers or units sold. Confidential business information and all that.

Still, if you're thinking of getting into the software components business, these curves will give you something to chew on. Let me explain them a bit more.

First, the demand curve does show the product price. That's public knowledge. I have collected my data from three pricing periods. I started with $97 and stayed with that price for the first 16 months. I then moved to $47 for a further 10 months. And I've been on $170 for the last two months. This gives a nice bit of data to work with. We can plot three points on the demand and revenue curves and interpolate between them to get some idea of the shape of the curves.

Second, the product I'm focusing on is actually the single developer versions of CSV Manager and XML Manager. The single developer version is the entry level version of these components and the biggest seller. Both products are also sufficiently similar in style and function that we can acceptably merge their data sets. These curves are representative of the majority of Ricebridge sales.

Third, the scales are linear and they are not zero-based. The curves give you information about the relative values between the three prices.
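
If you want to try this with your own numbers, here is a rough sketch of the arithmetic. The three prices are the ones above; the monthly unit counts are completely made up (my real ones stay confidential), so the output says nothing about Ricebridge sales. It computes revenue as price times units, fits a parabola through the three points, and finds where that parabola turns.

```ruby
# Prices are the three from the post; the monthly unit counts are
# HYPOTHETICAL placeholders, not real sales data.
prices = [47.0, 97.0, 170.0]
units  = [10.0, 8.0, 5.0]

# Revenue per month = price * units sold per month.
revenue = prices.zip(units).map { |price, sold| price * sold }

# Fit a parabola through the three (price, revenue) points using
# Newton's divided differences.
x0, x1, x2 = prices
y0, y1, y2 = revenue
d01 = (y1 - y0) / (x1 - x0)
d12 = (y2 - y1) / (x2 - x1)
a   = (d12 - d01) / (x2 - x0)   # coefficient of x^2

# p(x)  = y0 + d01*(x - x0) + a*(x - x0)*(x - x1)
# p'(x) = d01 + a*(2x - x0 - x1), which is zero at the turning point.
# The turning point is a maximum only when a is negative.
peak_price = (x0 + x1 - d01 / a) / 2.0

puts "revenue per month: #{revenue.inspect}"
puts "interpolated peak near $#{peak_price.round}" if a < 0
```

Three points is the bare minimum for a curve like this, so treat the peak as a hint about where to look rather than a price to set.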

So what are the curves telling you? Well first, software components ain't no Giffen good. Bummer dude. The demand curve is pretty normal. Push up the price and no-one buys. Flog 'em cheap and you'll get a load of buyers.

More interesting is the revenue curve. This is what actually helps you decide what price is going to make the most money. For software you can pretty much ignore unit costs. The main cost to you of selling an extra unit is just your payment processing fees and they increase linearly. Looking at my curve you can see that there's a sweet spot between the $97 and $170 prices.

So should I reprice at $120? Would you?

Well I'm not going to. There's not enough data for the $170 price yet. I have a feeling it will hit sales harder than it has to date. I think I've just been lucky so far. That pushes the maximum point of the curve closer to $97. Remember, don't panic. Get the data.

You can tell that the $47 price point was a disaster. I stuck with it for too long. The data does not lie. It may have generated higher sales volumes, but overall revenue was significantly down. But hey, I had to know.

Of course, the problem with this entire analysis is that software products change. New versions get released. CSV Manager 1.2 came out last November. It has new features so you get more for a given price. Let's just ignore that inconvenient factoid; it sorta messes up my lovely graphs!

To return to my pricing decision, I think the only way is up, actually. Given that the revenue difference between $170 and $97 is not that big, and given that new versions will be better value for money, I think on the whole that my price is pretty much about right for the moment. Yes, I am sacrificing volume. It's probably a skimming strategy. Then again, a price penetration strategy against open source (most of the alternative components) would be pretty nuts.

So there you go, if you're thinking of entering the software components market, now you have a bit more to go on. If you're already in the market, you might want to post something similar…




Posted in Business

StelsXML XML JDBC Driver Launched!

I'd like to announce the release of a really cool new product by J-Stels Software: StelsXML.

“StelsXML is a JDBC type 4 driver that allows you to perform SQL queries and other JDBC operations on XML files. With the StelsXML JDBC driver, you can easily access data contained in your XML documents by using standard SQL syntax and XPath expressions. The driver is completely platform-independent.”

StelsXML uses Ricebridge XML Manager as its underlying XML engine. As you can imagine we're pretty happy to have been chosen for the job and we can heartily recommend the final product. Yet another way to break the impedance mismatch between Java and XML.

So go check StelsXML out!




Posted in General

CSV Manager 1.2.1 Released!

This is a new release of Ricebridge CSV Manager. New features include support for Java Beans, a pull/push streaming API for loading and saving CSV, and a simplified set of load and save methods. The example code has been expanded and now includes line-by-line explanations. It's all detailed on the What's New page.

We're also introducing a new pricing scheme. A single developer XML Manager license (worth $170) is included FREE with every CSV Manager license. PLUS, we include a free SIX MONTH email support package (worth $1500) with each purchase. And if you're an independent contractor, we've introduced a new option just for you: claim a 50% discount when you link to us from your site or blog!

This version is fully backwards compatible with CSV Manager 1.1 so you won't need to change your existing code.

Existing Ricebridge customers are invited to download the free upgrade from their user accounts.




Posted in General