Academic publishers behave more like libraries (hosting knowledge, curating collections).
All the _actual work_ (intellectual/experimental, writing, proofreading, peer review, typesetting) is done on a voluntary basis by mostly tax-funded academics. Therefore publishers should

0. die in a fire if unwilling to change, become tax-funded public institutions otherwise
1. provide free universal access to all publications.
2. OR (the two scenarios are mutually exclusive) start _paying their suppliers_, like everyone else.

There, I said it.

You know why this doesn’t happen? Because academia is an ego- and jealousy-driven enterprise, and branding one’s work under prestigious logos is the only tangible* metric of success most academics can aspire to. We are nothing but neurotic shaved monkeys, deal with it.

edit: I’d like to deconstruct what I wrote above: is any of it true? And if that’s the case, does it necessarily have to be so? I.e. can this be turned into a positive statement: what drives academia (I’m referring to its research aspect only; let’s leave education aside for the time being), and why? To drive the human spirit forward by expanding knowledge and insight into the workings of the tangible (or intangible? here’s looking at you, theoreticians) world. To form the people who do so into heralds of positive change.

What do paywalled journals have to do with this? Why do we accept being reduced to currency, by an unfair economic lock-in mechanism? (This is what makes us neurotic, I think …)

* “impact factors” are b.s. numbers invented by pointy-haired management to rank clouds and solar flares by prettiness. Research is “invaluable” in the sense that 0. money goes in 1. nothing comes out, i.e. any given publication has measure zero in terms of immediate usefulness. A single paper is NOT worth $25 of taxes or of someone’s attention; it only has worth in the context of all the others (at best**) and of all human knowledge in an extended sense.

What’s the origin, the source of “prestigiousness” for a journal? It’s a sort of self-fulfilling prophecy in which one’s work gains value purely by proximity to other “prestigious things”, think of it as a halo.
Sure, publishers contend that the curation process is expensive, but I’m pretty confident they have huge operating margins. Need to see the numbers though.
edit: the citation graph is what counts, in a very literal sense. One way to make sense of this growing amount of literature is to keep track of the “hubs”, i.e. the highly connected-to nodes: the “most influential” works are recognized by this. This holds under “fairness” assumptions which might not always be met; i.e. excluding or including a citation has many psychological hooks that I don’t dare to fully expand on here. The most obvious nonlinearity in the citation graph, though, is the author being aware (or not) of a certain work.
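
To make the “hubs” idea concrete, here is a minimal sketch that ranks works by how many times they are cited (their in-degree in the citation graph). The toy data and the function name citation_hubs are made up for illustration; a real analysis would need an actual citation database, and raw in-degree ignores all the “fairness” caveats above.

```python
from collections import Counter


def citation_hubs(citations, top=3):
    """Rank works by in-degree, i.e. by how many other works cite them.

    `citations` is an iterable of (citing_work, cited_work) pairs.
    """
    in_degree = Counter(cited for _citing, cited in citations)
    return in_degree.most_common(top)


# Hypothetical toy data: each pair reads "first work cites second work".
toy_citations = [
    ("B", "A"), ("C", "A"), ("D", "A"),
    ("D", "B"), ("E", "B"),
    ("E", "C"),
]

print(citation_hubs(toy_citations, top=2))
```

Here “A” comes out as the main hub simply because three other works point at it.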

Truth be said, an increasing number of for-profit publishers are graciously giving an “open access” option to authors. At a charge, of course. Between 1K and 1.5K€ per article. Do you recognize this pattern? We’re getting s+++++d big time and have to say thank you as well !

(**) OTOH, there is such a thing as a b.s. publication, with 0 value, period. The price tag of any given paper only explains this vast, semi-invisible, mass of b.s. clogging hard-drives everywhere.

So let’s stop burying our research behind paywalls, break the addiction chain, do some actual good and open-source everything.

Instead, we’re stuck in a pusher-junkie situation, in which the substance is peer recognition, “visibility”. Immeasurable at best. Don’t you hate this state of things? Well, I do.

If you’re wondering what’s the cost of storage and infrastructure, Google rates are close to 2 dollar CENTS per GB per month. A color pdf with plenty of data inside is say 0.5 MB.
Say we define a “relevance lifetime” of a publication, let’s say 10 years (wild guess; it can be 1 month for biology, 50 years for civil engineering).
The hosting cost of a single pdf under these assumptions becomes 0.5/1024 GB × 0.02 USD per GB per month × 120 months = 0.001171875 USD, ABOUT ONE TENTH OF A CENT.
So I’d dare say those publication prices are more to justify Springer’s golden armchairs and infant’s blood fountains than actual data hosting.
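
For the skeptics, the back-of-the-envelope estimate can be redone in a few lines. This is only a sketch under the assumptions stated above (0.5 MB per pdf, 2 cents per GB per month, 10 years of “relevance lifetime”); the function name hosting_cost_usd is mine, and the rates are the rough figures quoted in this post, not an actual price list.

```python
def hosting_cost_usd(size_mb, usd_per_gb_month, lifetime_years):
    """Storage cost of one file over its whole 'relevance lifetime'."""
    size_gb = size_mb / 1024          # MB -> GB
    months = lifetime_years * 12      # years -> billing months
    return size_gb * usd_per_gb_month * months


# Assumptions from the text: 0.5 MB pdf, 0.02 USD/GB/month, 10 years.
cost = hosting_cost_usd(size_mb=0.5, usd_per_gb_month=0.02, lifetime_years=10)
print(f"{cost:.9f} USD per paper")
```

Changing the lifetime to 50 years (civil engineering, say) only multiplies the result by five, so the conclusion does not really depend on the wild guess.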

OTOH, you can never be sure about the future relevance of an article. A “relevance lifetime” could be a loose assumption, and we should never disregard or delete a paper just because of its age. However, it becomes increasingly “better known” (on average), so any market value we attach to it should decrease.

Coordinating peer review has a price, too. I.e. calling those _volunteers_ and making them work faster. Automated reminders. The end.

I’ve completely avoided the problem of interpretation and information context so far. Value is subjective, but we have to deal with very objective monetary cost.
To a non-specialist, a Nature paper is worth exactly 0, apart from the pretty-picture “ooh!” value.
To a specialist instead, what lies inside is not pure information, because of the interpretative “line noise” introduced by natural language.
Raw numerical data too has a context-dependent value; let that sink in. No two persons share the same “universe of discourse”, the “set of all possible messages” introduced by C. Shannon. So how do we quantify the value of this? By the average number of “a-ha” moments over all readership?

But I’m digressing. Academic publishers are a legalised scam, and we should stop feeding them.


Dear blog, I’ve got the hots for Haskell.
Its terse syntax is intimidating at first, but as you keep exploring, you start seeing its beauty and even possible ways to write a Hello World program.
But let’s start from the easy bits: this language makes it trivial to compose functions, and in a sense specialize the bare language to represent more closely the semantics of the computational problem at hand.

One of the most basic higher order functions is map, which takes a unary function f and a list l and returns a new list, made of mapping f onto every element of l:

map         :: (a -> b) -> [a] -> [b]
map f []     = []
map f (x:xs) = f x : map f xs

Silly examples!

> map (+1) [1,2,3]
> map (\c -> if c=='o' then 'i' else c ) "potatoes"

Similarly, the right ‘fold’ operation (called ‘reduce’ in other languages) foldr takes a binary function f, a starting object v and a list (x:xs), and combines the elements of the list with f recursively, from the right, starting from v:

foldr :: (t -> t1 -> t1) -> t1 -> [t] -> t1
foldr f v [] = v
foldr f v (x:xs) = f x (foldr f v xs)

Binary functions? We’re not restricted to arithmetic operations! If we consider f in the example above as the functional composition operator (.), and the starting element v being the identity function id, the Haskell typechecker will infer that foldr (.) id just requires a list of a -> a functions (i.e. that share the same type signature as id, in this case) and an element to start with!
So here’s compose :

compose :: [a -> a] -> a -> a
compose = foldr (.) id

This idea of representing n-ary functions as daisy-chains of unary elements is called “currying”, and is crucial for reasoning about programs, as we can see in the above example.
Functional composition is implemented as follows, where “\u -> .. body .. ” is the idiom for anonymous (lambda) functions of a variable u:

(.) :: (b -> c) -> (a -> b) -> a -> c
(.) f g = \x -> f (g x)

Another thing of beauty is unfold, which serves to build data structures up from a prototype element.
In this case, it’s specialized to lists, so the neutral element is the empty list [] and the structure constructor is the list “cons” operation (:), which prepends an element to a list.

unfold :: (a -> Bool) -> (a -> b) -> (a -> a) -> a -> [b]
unfold p h t x
  | p x       = []
  | otherwise = h x : unfold p h t (t x)

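See the type signature? We need a function from a domain to Booleans (a “decision” function p, which tells us when to stop), two functions h and t, and a seed x: h turns the current seed into the next list element (the head), while t turns it into the seed for the recursive call (which builds the tail).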

Unfold can be generalized to arbitrary recursive structures, such as trees, heaps etc.

As a quick example, a function that takes a positive integer and returns its digits as a list can be very compactly formulated in terms of unfold (however, leading zeroes are discarded):

toDigits :: Integer -> [Integer]
toDigits = reverse . unfold (==0) (`mod` 10) (`div` 10)

> toDigits 29387456

Next up: Monoids! Functors! Monads! “Hello World” ! and other abstract nonsense.

While virtualenv (VE) is a very valuable tool, one quickly realizes that there might be a need for some usability tweaks.

Namely, activating and deactivating a VE should be quick and intuitive, much in the same way as any other shell commands are.

Enter (*drumroll*) virtualenvwrapper. This tool allows you to create, activate, deactivate and remove VEs with a single command each.

  • mkvirtualenv
  • workon : if you switch between independent Python installations, workon  lets you see the available VEs and switch between them, rather than deactivating one VE and activating the next one.
  • rmvirtualenv

Very handy.

After installation of VEW, we need to set up a couple of environment variables in our .bashrc or .profile file, and then we’re good to go.

Physically, the VEs created with VEW all reside in a single folder, which should be hidden from regular usage by e.g. giving it a dotted name ( e.g. ~/.virtualenvs ). This effectively hides the VE engine room details from sight, so developers can better focus on the task at hand (or, as someone says, this tool “reduces cognitive load”).

You can also find a convincing screencast here.

So go ahead and try it !

Hello there Internets! So you’re starting up with python for data analysis and all that, yes?

Here I outline the installation steps and requirements for configuring a python library installation using virtualenv and pip that can be used for scientific applications (number crunching functionality i.e. linear algebra, statistics .. along with quick plotting of data etc.).

Python tends to have somewhat obscure policies for library visibility, which can be intimidating to a beginner. Virtualenv addresses these concerns and lets you maintain self-contained python installations, thus simplifying maintenance. It amounts to a number of hacks (with a number of caveats described here), but I find it to be very effective nonetheless, if you really need Python libraries in your project. In particular, it saved me from Python Package Hell, and I hope it will streamline your workflow as well.

I do not assume much knowledge on the part of the reader, however you are welcome to ask for clarifications in the comments and I’ll reply ASAP. In this tutorial we address UNIX-like operating systems (e.g. Linux distributions, OSX etc.). The tags delimited by angular brackets, <> are free for the user to customize.

1) virtualenv : First thing to install. (If you have already installed it, skip to point 2).

Do NOT use the system Python installation: it leads to all sorts of inconsistencies. Either

  • pip install virtualenv

OR “clone” (make a local copy) the github repository

2) create the virtualenv in a given directory (in this example the current directory, represented by . in UNIX systems):

  • virtualenv .

This will copy a number of commands (e.g. python, pip) and configuration files, and set up environment variables, within the <venv> directory.

Alternatively, the virtualenv can be made to use system-wide installed packages with a flag. This option might lead to inconsistencies. Use at own risk:

  • virtualenv --system-site-packages .

3) Activate the virtualenv, which means parsing the activate script:

  • source <venv>/bin/activate

As a result of this step, the shell prompt should change and display (<venv>) 

4) Test the virtualenv, by verifying that pip and python refer to the newly-created local commands:

  • which pip
  • which python

should point to a /bin directory contained within the current virtualenv.
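
The same check can be done from inside the interpreter. The snippet below is a small sketch, not part of virtualenv itself: the helper name in_virtualenv is mine, and it relies on the sys.real_prefix attribute that older virtualenv releases set and on the sys.base_prefix attribute available in Python 3.3 and later.

```python
import sys


def in_virtualenv():
    """Return True if the running interpreter belongs to a virtualenv."""
    # Older virtualenv releases set sys.real_prefix on the interpreter.
    if hasattr(sys, "real_prefix"):
        return True
    # On Python 3.3+ an active environment makes sys.prefix differ
    # from sys.base_prefix; fall back to "no difference" if it is absent.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print(sys.executable)    # path of the interpreter actually in use
print(in_virtualenv())
```

If the path printed first does not live under your virtualenv directory, the activation in step 3 probably did not take effect in the current shell.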

When you are done using the virtualenv, don’t forget to deactivate it. If necessary, rm -rf <venv> will delete the virtualenv, i.e. all the packages installed within it etc. Think twice before doing this.

5) Install all the things!

From now on, all install commands use pip, i.e. have the form pip install <package> , e.g. pip install scipy :

scipy (depends on numpy, which gets installed along with it; both are fundamental)

pandas (various helper functions for numerical data structures)

scikit-learn (machine learning libraries, can be handy)

matplotlib (plotting functions, upon which much of python plotting is built)

pyreadline for tab completion of commands


ipython, esp. with browser-based notebooks. The install syntax will be

  • pip install "ipython[notebook]"

bokeh (pretty plots)

ggplot for those who need R-style plots. The requirements for ggplot are

  • matplotlib, pandas, numpy, scipy, statsmodels and patsy
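
Once everything is installed, a quick sanity check is to ask Python which of the packages it can actually find. This is just a sketch: missing_packages is a name I made up, and note that some import names differ from the pip names (scikit-learn is imported as sklearn, ipython as IPython).

```python
import importlib.util


def missing_packages(module_names):
    """Return the top-level modules that cannot be found in this environment."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Import names of the packages listed above (not the pip names).
wanted = ["numpy", "scipy", "pandas", "sklearn", "matplotlib", "IPython", "bokeh"]
print(missing_packages(wanted))
```

An empty list means that every package is visible from the interpreter you are running, which should be the one inside the virtualenv.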

6) Develop your scientific Python applications with this powerful array of technologies

7) Once you’re ready to distribute your application to third parties, freeze its dependencies using pip. This is another hack, but hey, we’re in a hurry to do science right? The following two statements represent the situation in which one needs to install the dependencies on a second computer, account or virtualenv.

  • pip freeze > requirements.txt
  • pip install -r requirements.txt

That’s it for now; comment if anything is unclear, or if you find errors, or would like to suggest alternative/improved recipes.


Visual poetry

December 21, 2013

“It is said that paradise with virgins is delightful, I find only the juice of the grape enchanting! Take this penny and let go of a promised treasure, because the war drum sound is exhilarating only from a distance.” — Omar Khayyam (1048-1131), Iranian polymath and poet

The above is an example of a Ruba’i, a traditional Persian form of quatrain poetry. I find it beautiful on so many levels.

The size of what can be known

December 15, 2013

The Planck length is estimated at 1.616199(97) \times 10^{-35} meters, whereas the radius of the observable Universe (comoving distance to the Cosmic Microwave Background) is 46.6 \times 10^{9} light years, i.e. 4.41 \times 10^{26} meters. 

Both represent the metric limits of what we can perceive, regardless of the observation technique: the Planck length corresponds to the smallest measurable distance, whereas the observable radius of the Universe corresponds to the most ancient observable radiation (the CMB is the redshifted light released at recombination, roughly 380,000 years after the Big Bang, when the Universe first became transparent).
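
To get a feeling for the span between these two limits, here is a small sketch of the arithmetic. The Planck length and the radius in light years are the figures quoted above; the conversion factor of about 9.4607 \times 10^{15} meters per light year is a rounded value that I am assuming here.

```python
import math

PLANCK_LENGTH_M = 1.616199e-35   # meters, as quoted above
LIGHT_YEAR_M = 9.4607e15         # assumption: meters per light year, rounded
RADIUS_LY = 46.6e9               # light years, as quoted above

radius_m = RADIUS_LY * LIGHT_YEAR_M
ratio = radius_m / PLANCK_LENGTH_M
orders = math.log10(ratio)

print(f"radius = {radius_m:.3e} m")
print(f"ratio  = {ratio:.3e}, i.e. about {orders:.1f} orders of magnitude")
```

So everything we can measure, in principle, fits within a bit more than 61 orders of magnitude of length.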

The existence of universes whose Planck length is larger than the observable size of ours (or, whose universe is bounded by our Planck length) is not provable. A fractal nesting of turtles.

  • In case you manage to break your installation so badly that it won’t move much past the bootloader (say, when the contents of  /etc/init/ are read .. runlevel 2?), you may want to modify the boot command line in order to gain shell access (choose it and press e), by appending init=/bin/bash at the end of the line starting with ‘linux’.
  • Your HD might be mounted in read-only mode at this stage, so you might want to remount it in read-write mode, like so: first note down the device name (e.g. /dev/sda3 ), and then call mount with the appropriate remount options: mount -o remount,rw /dev/sda3 .
  • You might need to know a couple of vi commands, in order to edit the relevant configuration files (pray that you know what you’re doing). For example, i enters ‘insertion’ mode, x deletes a character, ESC returns vi to command mode, at which point you can either close without saving with :q! or save and close with :wq
  • Ubuntu 12 waits until the network interfaces listed in /etc/network/interfaces are brought up (see man ifup). One can override this setting by replacing start on (filesystem and static-network-up) or failsafe-boot with: start on (filesystem) or failsafe-boot in /etc/init/rc-sysinit.conf .
  • In general, having a proven working operating system on another partition/neighboring PC helps a lot. Happy sysadministration!
  1. Computer program power consumption
    A programming language that “minimizes” power consumption through minimal interconnect usage (e.g. memory calls).
  2. Food sourcing power consumption
    Farmland supply to cities: how to optimize land usage? What part of the produce can be made local, e.g. grown at the consumer’s, or turned to hydroponics and its cultivation brought within the city itself?

Both these problems require a grammar of solutions, rather than a single instance, due to the diversity of the operating/boundary conditions that are encountered.
As such, I don’t think that a “proof of correctness” for either can be hoped for, but perhaps a number of heuristic checks might prove the point.
The former is addressed by a single technology, whereas the latter requires a diverse array of strategies.

General considerations

  • Area and land usage
    Arbitrary rearrangement of the resources is not trivial: CPUs are designed with CAD tools that favor periodicity and reuse, and farmland restricts supply due to physiological productivity/rest cycles.
  • Time and flow
    Time plays a part as well: the edges in these supply nets do not handle a constant flow. In the first case, storage is regulated by registers, queues and stacks, whereas in the second, the flowing entities are subject to seasonal variation, degrade with time etc.

This framework is intentionally generic in order to highlight similarities, and it is of course a work in progress.
Both these problems in fact have broad political implications, which leaves plenty of space for many juicy discussions. Looking forward.


  1. An article from the NYT: A Balance Between the Factory and the Local Farm (Feb. 2010) highlights both the high costs of local (i.e. small-scale) green production, citing The $64 Tomato, and the related climatic issues (e.g. cultivation on terrain located in the snow belt).
    The article closes with “Localism is difficult to scale up enough to feed a whole country in any season. But on the other extreme are the mammoth food factories in the United States. Here, frequent E. coli and salmonella bacteria outbreaks […] may be a case of a manufacturing system that has grown too fast or too large to be managed well.
    Somewhere, there is a happy medium.” — an optimum, if you will.

Side questions

  • Why do large-scale economics “work better”? i.e. have a larger monetary efficiency, which drives down the prices for the end user? More effective supply chain, waste minimization, minimization of downtime …

Extensions and interfaces

October 16, 2013

I would like to gather here data and interpretation regarding artificial extensions to human capability (the broadest definition of “technology”): are we witnessing a transition from “technology-as-screwdriver” to “technology-as-cognition-extension”? More precisely, exactly how advanced must a technology be before one no longer realizes one is using it?
This abstracts one step beyond A.C.Clarke’s “Third Law”: technology and magic will, at that point, be reduced to commonplace human experience, and therefore become indistinguishable from it.
It’s a rather bold statement, and I’m no starry-eyed singularitarian. Let’s start with a simple analysis by restricting to present-day tangible R&D results, and leave end-of-history predictions to fortune tellers.

Large scale: Behavioral trait clustering
October 29, 2012 : “We show that easily accessible digital records of behavior, Facebook Likes, can be used to automatically and accurately predict a range of highly sensitive personal attributes including: sexual orientation, ethnicity, religious and political views, personality traits, intelligence, happiness, use of addictive substances, parental separation, age, and gender. The analysis presented is based on a dataset of over 58,000 volunteers who provided their Facebook Likes, detailed demographic profiles, and the results of several psychometric tests. […]”

Personal scale: Distributed training for machine learning
October 11, 2013 : “Qualcomm also envisions alternatives to app stores, which he called “experience stores,” allowing users to download expertise into their consumer products”. Have a look at the original EETimes article.
While neural networks aren’t exactly news, the idea of “sharing” training across devices seems intriguing. I wonder whether this concept is of broader applicability.

Personal scale: Human-computer interfaces
This is where human-machine interaction, on a personal (affordable, household) scale started: computer mice, hypertext introduced in 1968 during Douglas Engelbart’s “mother of all demos” (official title “A Research Center for Augmenting Human Intellect”):

… and this is (a small selection of) where we are now:

Technical Illusions CastAR, an Augmented Reality “platform” composed of glasses with integrated projector, wand/joystick, backreflective mat (AR) or VR glass add-ons.
Video here
Still raking funds from the Kickstarter community, but apparently it’s going well. I’m a bit concerned about all that hardware one has to deploy, especially the “mat”. Apart from showers, only one application I can think of benefits from mats.

Thalmic Myo.
Video here
This one is an interesting concept: it’s an armband device that integrates accelerometer and bearing sensors with a readout of the muscles’ electrical activity, so muscular twitching such as finger contraction can be correlated with movement of the limb as a whole, allowing for very expressive interaction. It has been available for pre-order for a few months now and will sell for 149 USD from the beginning of 2014, and I’m seriously considering getting one.

Leap Motion, and extensions thereof, e.g. the DexType “keyboard” software, see below.
Video here
The Leap Motion simply processes optical range information (possibly using “structured light” like the Microsoft Kinect), so a number of artifacts in the gesture recognition are to be “engineered against”. However, offering an open SDK was a winning move: there are tens of applications and games in various stages of development being offered on the Leap store.

Possible implications
Adaptive communication: i.e. terminals that are aware of user patterns and sync accordingly, “sync” meaning information display based on remote context (e.g. remote user busy or focused on other). Attention economics brokerage.
Are we heading towards higher-order communication, i.e. in which one won’t communicate with a machine one character at a time but through symbols, sign language, ideograms?
Next level: J.Lanier’s “postsymbolic” communication in cuttlefish; the “body” of a user (intended in an extended sense, i.e. with hardware enhancements) becomes a signaling device in its own right (e.g. flashing, changing shape/“state”, radiating information etc.)

In fact, I think it’s only natural that machine interfaces are to be evolved in order to effectively disappear; the only question is when this transition will occur.

Coffee + graphs

Good morning!

Today I kick off with some more Algorithms basic theory. Tim Roughgarden, the instructor, has a precise yet captivating delivery style (not too informal, not too nerdy), which makes this worthwhile.
In general, I noticed the same pattern during all of my studies: the instructor has a large responsibility in keeping the students’ heads up, in both senses.
The presentation of even the most formal of subjects has to be enriched by “refreshing” elements. For example, my Calculus 1, 2, 3 AND 4 teacher, the never-enough-celebrated G. C. Barozzi, had a way of embedding anecdotes and biographical facts about mathematicians. This both put the subject in historical context, making it feel less abstract, and relieved the audience from continued focus. Seems too nerdy still? You’ve found the wrong blog then! Ha.

I can agree that some subjects might feel a bit “dogmatic”, as formal proof of all their aspects would require a disproportionate effort from a large part of the audience. In this case, it is up to the lecturer to select the more illuminating material beforehand : Longum iter est per praecepta, breve et efficax per exempla (Seneca)

So, contraction algorithms for the MINCUT problem for graphs: given a graph G = \{V, E\}, with nodes V and edges E, which is the smallest set of edges \tilde{E} that divides G in two otherwise disjoint subsets of nodes V_1 and V_2 (i.e. the smallest cutset)?

Random contraction (Karger’s algorithm)
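Sequentially pick an edge uniformly at random and contract it, i.e. merge its two end nodes V(E_{k}) = V_i, V_j \rightarrow V_i (“edge contraction”), rerouting the edges initially connected to V_i and V_j to V_i and discarding the resulting self-loops.
The cutset that remains when there are only two nodes left is a candidate solution of MINCUT: it is a valid cut, but not necessarily the smallest one, so the run is repeated many times and the smallest cutset found is kept.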
A single run of Karger’s algorithm (adapted from Wikipedia)

If N_E = |E| is the edge set size, there are at most N_E! possible contraction sequences. Only a subset of these yield the true MINCUT (i.e. the outcome is path-dependent).
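
To make the contraction procedure less abstract, here is a minimal Python sketch of it. It is my own toy rendition, not code from the course: the names karger_cut and min_cut, the edge-list representation and the number of trials are all choices made for illustration, and the relabeling step is deliberately naive rather than efficient.

```python
import random


def karger_cut(edges, rng):
    """One run of random contraction on a connected, undirected multigraph.

    `edges` is a list of (u, v) pairs; returns the size of the cut found.
    """
    # Every node starts as its own super-node; label[node] names its super-node.
    label = {node: node for edge in edges for node in edge}
    groups = len(label)
    # Only edges joining two different super-nodes can still be contracted.
    live = [(u, v) for u, v in edges if u != v]
    while groups > 2:
        u, v = rng.choice(live)            # uniformly random remaining edge
        keep, drop = label[u], label[v]
        for node in list(label):           # merge super-node `drop` into `keep`
            if label[node] == drop:
                label[node] = keep
        groups -= 1
        # Edges that now connect a super-node to itself are self-loops: discard.
        live = [(a, b) for a, b in live if label[a] != label[b]]
    return len(live)


def min_cut(edges, trials=200, seed=0):
    """Repeat the randomized run and keep the smallest cut seen."""
    rng = random.Random(seed)
    return min(karger_cut(edges, rng) for _ in range(trials))


# Toy graph: two triangles joined by the single edge (2, 3).
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(min_cut(toy_edges))
```

On this toy graph the minimum cut is the single bridge between the two triangles. A single run can easily miss it by contracting that very edge, which is exactly the path dependence mentioned above, and this is why the smallest result over many runs is kept.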