Jabberwocky

This site is work in progress ...

Transparency

September 17th 2011

Two weeks ago, we watched some of the Strata preview from the office. Strata is a conference about the business of big data which takes place every year.

Michael Nelson's story: after Wikileaks broke some highly sensitive stories, not only governments but also organizations decided to clamp down on every possible leak and information source. Michael Nelson argued that this is a bad approach: a much better one is to strive for maximum transparency. The Brazilian company SEMCO, which strives for total transparency, was one of his examples.

He posited that in most organizations there is a kind of Pareto principle of information sensitivity: a large majority of data could be disclosed without any damage, and only a few percent are of strategic importance.

Even better, disclosing that non-sensitive information creates a sense of trust and familiarity in the customers. Being open is simply good PR.

I do agree with those points, especially the last one: as examples, see the blog posts by Rand Fishkin, and Peldi's blog. Both company founders talk about milestones, decisions, joys and disappointments in an earnest way. Their sharing creates a personal connection. The reader feels like cheering for their successes and lamenting their losses, which does no end of good to their brand.

A more cynical part of me also senses that if you look transparent, people won't look that closely because they think they know all there is to know. Also, one could be drowning out relevant information in noise, as it were. Not to mention that any displayed information can be given the spin it requires.

On the other hand, I feel that a lot of organizations would still be reluctant to go that far: unfortunately, companies can be dysfunctional enough that they wouldn't like to wash their dirty laundry in public.

In fact, transparency might be very beneficial for these organizations: public shaming is a strong motivator for change. It is something they might want to come to terms with, because in the age of porous boundaries, disclosure might happen whether they want it or not.

Interlude

August 23rd 2011

I've been slack on the blogging front for several months now - been fairly busy:

I'm slowly regrouping and have a little bit more time to settle here in London. Time to start blogging again - plenty of material to do so.

Getting cosy with MRI Ruby

April 28th 2011

Have you ever wondered what is going on in the entrails of MRI Ruby? I certainly have. I started out studying engineering because I wanted to know how everything works. As it turned out, every fact just generated more questions, but that won't keep me from trying.

A little bit of poring through the source code (ruby 1.9.2) yielded some tools which can provide helpful insights for this research.

You're probably already aware that a Ruby program is executed like so:

  • parsing: the program is tokenized and parsed into an Abstract Syntax Tree (AST). This AST is basically the source code reduced to a tree form, which looks vaguely lispy in construction.
  • compilation: this tree is compiled to a set of instructions for the Ruby virtual machine.
  • execution: the instructions are then executed by the Ruby virtual machine.
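
The three stages above can also be poked at from inside Ruby itself. This is only a rough sketch, assuming an MRI build that ships the Ripper library and the RubyVM::InstructionSequence class. The variable names are mine, and the S-expression Ripper returns is a convenient stand-in for the parse result, not the interpreter's internal node tree.

```ruby
require "ripper"

source = "answer = 42"

# Parsing: Ripper turns the source into a nested array (S-expression).
# Assumption: this approximates, but is not identical to, MRI's internal AST.
tree = Ripper.sexp(source)

# Compilation: MRI compiles the source into a VM instruction sequence.
iseq = RubyVM::InstructionSequence.compile(source)

# Execution: the VM runs the compiled instructions and returns the last value.
result = iseq.eval

puts tree.inspect
puts iseq.class
puts result
```

The tree printed here is the lispy structure mentioned in the first bullet, just in Ruby array form.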

Here are some tools, built into Ruby, to see what happens to a bit of code.

Parsing

ruby --dump parsetree code.rb

Say code.rb contains just one statement:

answer = 42

The corresponding AST yielded by this option:

      # @ NODE_SCOPE (line: 1)
      # +- nd_tbl: :answer
      # +- nd_args:
      # |   (null node)
      # +- nd_body:
      #     @ NODE_DASGN_CURR (line: 1)
      #     +- nd_vid: :answer
      #     +- nd_value:
      #         @ NODE_LIT (line: 1)
      #         +- nd_lit: 42
      

This gives us an insight into what constitutes an AST node for Ruby.

Compilation

There are two ways to see the instructions produced by compilation.

The first one is a dump parameter like the one above:

ruby --dump insns code.rb

outputs:

== disasm: @test>====================
      local table (size: 2, argc: 0 [opts: 0, rest: -1, post: 0, block: -1] s1)
      [ 2] answer     
      0000 trace            1                                               (   1)
      0002 putobject        42
      0004 dup              
      0005 setdynamic       answer, 0
      0008 leave
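
A similar listing can be requested without leaving Ruby. This is a minimal sketch, assuming an MRI interpreter where RubyVM::InstructionSequence is available. The exact instruction names and layout vary between Ruby versions, so the text printed will not necessarily match the dump above line for line.

```ruby
# Compile a one-line program and ask the VM for a human-readable listing.
iseq = RubyVM::InstructionSequence.compile("answer = 42")

# disasm returns the listing as a String; nothing is executed here.
listing = iseq.disasm

puts listing
```

This is handy for comparing small snippets side by side, since no temporary file is needed.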
      

The second way to get information is somewhat more tricky: in this case a C macro needs to be 'turned on' for it to work. There is a hint in the comments of compile.c:

       * debug level:
       *  0: no debug output
       *  1: show node type
       *  2: show node important parameters
       *  ...
       *  5: show other parameters
       * 10: show every AST array
      

There are two ways to set CPDEBUG: you can add the parameter -DCPDEBUG=5 to the compilation flags in the makefile, or you can change the default value directly in compile.c:

      #ifndef CPDEBUG
      #define CPDEBUG 5
      #endif
      

In both cases, Ruby has to be recompiled, and the resulting binary will produce more output than usual. Running the same minimal program code.rb yields a whole lot of output about the generated instructions, maybe more than is useful, depending on how deep you want to go.

Execution in the VM

If you want to see output at the execution level: pointers, heaps, frame pointers, there's another debug constant you can activate. In vm.c, change the value of PROCDEBUG to a non-zero value:

      #define PROCDEBUG 1
      

Apparently, this causes segmentation faults out of the box, but by commenting out some stuff I was able to get it compiled. This causes an output like so:

      ---
      envptr: 0x100482f48
      orphan: 0x100887358
      inheap: 0x0
      lfp:    0x100482f48
      dfp:    0x100482f48
      :1:in `require': cannot load such file -- rubygems.rb (LoadError)
        from :1:in `'
      

OK, still not very enlightening. Waiting for ruby-core's feedback on that one.


By running these few commands on extremely simple programs, it seems to me that we can get a better insight into how it works.

Note: setting tabstop=8 (in vim, or tab = 8 spaces in other editors) gives you the proper indentation for the Ruby source code. I had to find out.