
## Notes from Data Science Summit: ML in Production

I’m (Berian, that is, rather than Luke) attending the SF Data Science Summit today and tomorrow. I’m taking some rough notes as I go and want to publish them in digestible bits. One of the speakers I most enjoyed today was Carlos Guestrin (@guestrin), who gave a keynote and then a little 25-minute appendix later in the day. Here’s what I wrote down.

## Robot cars of the future

The Times of New York has an article on AI-driven cars* developed by Google, whose Bigendian presence has the effect of making the technology seem simultaneously much more immediate and slightly ludicrous. The capability to engineer and drive these vehicles has, of course, been with us for some time now, and

[…] robot drivers react faster than humans, have 360-degree perception and do not get distracted, sleepy or intoxicated, the engineers argue.

Indeed. Minor vehicle accidents are, I claim, caused by one or more drivers acting in a manner that is unanticipated by the drivers of other cars. While it seems clear that the software is currently capable of dealing with the unanticipated behaviour of other, human, drivers, I’m not sure how the software is supposed to control for the fact that the robotic car will itself be driving in a manner that humans will have difficulty anticipating.

Two possibilities: modifying the software to more faithfully emulate the mischievous and erratic driving behavior of humans, to which we are all (self-)accustomed; or, attempting to forge ahead with the hope of re-adjusting driving expectations to a higher standard. The latter sounds like a good bet in the long run, and like social engineering too.

But then, driving is a social act, as well as a mechanical one.

* The article itself seems to be having some display issues (at least in Chrome), so make liberal use of the wonderful arc90 experiment, Readability.

## Correct use of chatroulette

By now, ChatRoulette has reached even the most corneriest corners of the Internet, garnering mentions in academic blogs, friends’ Twitter feeds and blandly expository New York Times articles. I’m sure I’m not alone in being disappointed at the ease with which the open-ended potential of real-time video connections to random strangers has been channelled into mundane middlebrow expression; also, gratuitous display of genitalia. (On the other hand, those folks are probably having a blast, and that’s not without merit.)

However, here is an inkling of a better world (some viewer discretion advised):

Update: Hello! Ben Folds follows up by Chat Rouletting Merton-style during a gig in Charlotte. The next time someone is all ‘Hmm… what is art these days?’ in your vicinity, do the right thing and point them at this.

Amusingly, I think Merton’s lyrics are funnier, though having the crowd in the background really blows it away. I mean, imagine if this catches on. Imagine if every time you log in to Chat Roulette you have to prepare yourself for appearing in the middle of a concert.

## The loss of wisdom; some computations

• I must post five days per week (I choose which five), except when traveling, no matter what I have done.
• I must write only about research; no committees, no refereeing, no teaching, no excuses.

Well, the first rule sounds good, & I also undertake not to blog about committees, refereeing, teaching or excuses.

So, here is what happened today: I had a wisdom tooth taken out. It was impacted, which means that it was at an angle of somewhat less than the usual pi/2 radians relative to the gum-line and so obstructing access to its root, as though it knew it would one day be the target of this sort of operation. To get through the tooth itself, they used a technique that is new to me, called piezosurgery, which pulverises the tooth using ultrasonic rage rather than the traditional drill. The purpose of this is to reduce the physical trauma to the gum and thus reduce the swelling.

The whole thing took about 30 minutes from the application of local anaesthetic (which was, unsurprisingly, the most painful part) to me forking over what I feel is a reasonable remuneration in Danish crowns given the services rendered. And true to their word the swelling has been much less dramatic than is usually advertised in association with the removal of wisdom teeth. I will apply some more cold pressure to it, in the form of my tub of chocolate mousse from Irma*, and take non-prescription painkillers and see how things are in the morning.

As I am working from home today, to shield my coworkers from what I had presumed would be a visage worthy of Picasso, after cooking myself a simple lunch I updated to the latest release of Matlab (R2010a, 64-bit for Intel Mac), refreshed my default path locations, etc., and made some computations with some simple mock density fields corresponding to one of the fields of the WiggleZ survey, provided for me by Chris Blake. We are currently deciding between two methodological approaches and the computations I ran today are aimed at putting this decision on a quantitative footing. We call them ‘Method 1’ and ‘Method 2’ and, several times, I have forgotten which of these methods is which; I think this is a good way to do science. Once we have decided which label produces the better outcome, we can examine what method that label actually corresponds to.

* A Danish supermarket, not to be confused with a young woman, perhaps the eponymous Irma of the supermarket’s logo, who brought me mousse out of sympathy, or for some other reason.

## Annals of superior Matlab scripting

The 2009/10 Konditori van Gogh award for excellence in Impressionist Scripting

## The Civilised World

Joshua Gans informs us that, at long last, an Australian ISP will be offering broadband without a monthly bandwidth cap. Here is something interesting that he says regarding why it hasn’t happened already:

I had suspected an adverse selection issue whereby the small percent of very high users would migrate to the first mover causing them disproportionately large costs.

The cost is \$100/month on a two-year contract. But if it works (and the PR for this company will be good if nothing else), I can imagine other ISPs offering the same service before long.

## Comment: the future of journal publishing

Peter Coles has just written a post on what he terms the ‘academic journal racket’, and rather than add a lengthy comment, I’ll write something here.

The rational argument for electronic editing and publishing is certainly made very strongly in his post. I would like to hear a scientist at a later stage of their career than me expound a little more on how the inertia of our current system should be overcome. The arXiv++ path, though undoubtedly fraught with complications, certainly seems to have a lot of potential, but how could the community of scientists be galvanised around it? Would Peter consider discussing this with the arXiv administrators directly? At least one cosmology blogger has participated in the shaping of the arXiv previously. Online petitions are very twee, but if there is an earnest desire for change among those making decisions about journal subscriptions, perhaps a consensus can be quickly reached and acknowledged.

Two further comments: the arXiv would need to be further decentralised for these ideas to be tenable. If it became the Hauptbahnhof for astrophysics papers, Cornell University Library would feel obliged to ask for contributions to the maintenance costs, putting everyone roughly back where we started. The merit of universities having their own preprint servers, which Peter correctly derides as being pointless compartmentalisation, is that there is no mechanism for some party to feel aggrieved over the burden they bear. In this vein, I contend that a cheap, effective and rigorous publication process for astrophysics papers can be achieved with a highly distributed network of continually updating arXiv mirrors, all acting as entry points for papers that are then directed to editors on the basis of subject, assigned to referees, and revised and updated through much the same process that exists at present.

Achieving this would require senior academics to stop their departments’ current academic journal subscriptions, to wrest some control of the arXiv from Cornell and to design and implement a functional editorial system. I don’t have much more to say about the lattermost here, though I don’t believe that blog-style comments or wiki-style modification are sensible at the present time: it’s too easy to act in a rash and unrestrained manner through those media.

Lastly, and quite topically for Luke and myself, acknowledging that we can get by just with electronic copies of papers should lead some universities to acknowledge that the archaic ritual of thesis binding can be done away with. The cost is high and is often borne entirely by the student. Having a bound copy of the work for oneself is ‘nice’, though it amounts to vanity publishing (not that this bothers me). University libraries, however, can get along just fine with electronic copies; distributing them through the arXiv is an increasingly common practice.

## Against impact

Because I spend insufficiently little time feeling as though I’m still an undergraduate, may I politely agitate for others to follow the advice of Chris Bertram:

Those of you working in higher education in the UK already know about the barbarous proposal to make future support for research depend on a government assessment of its “impact” – in other words whether there’s a tangible payoff in terms of economic growth or social policy.

My colleague James Ladyman has launched a petition on the No.10 website to tell Gordon Brown what we think of the idea. If you’re British, even if you don’t live in the UK any more, pop over and sign it.

Something similar has been mentioned by Andy, just in reference to the UK’s main science council. To pass broader comment, I am uncertain of the efficacy of these petitions. I hypothesise that the UK government acts only on those that are i) highly supported; and ii) of purely symbolic value. For instance, the recent public apology issued posthumously to Alan Turing is both terrifically sensible and completely uncourageous. I am unaware of an initiative requiring the expenditure of political capital that has seen fruition through these petitions, but I acknowledge that, if my hypothesis stems from cynicism, it could be that I have selectively ignored counterexamples. On the one hand, I hope that’s wrong, but on the other, it would be nice if it were true.

## Insta-Reich

Tonematrix by André Michelle is a function defined on the two-dimensional vector space over the field $\mathbb{Z}_{17}$. And yet, it’s so much more; though self-explanatory, it might save you three seconds to know that the axes correspond to time and pitch, with the latter staggered non-linearly to prevent dissonance. It seems possible that Tokyo / Vermont Counterpoint could be produced directly from this tool if the latter were extended further in time.
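To make the "staggered non-linearly" remark concrete, here is a minimal Python sketch of how a grid row could be mapped to a pitch. It assumes a major pentatonic scale and a 220 Hz base note; both are my guesses for illustration, not something I know about how Tonematrix is actually implemented, and the names `PENTATONIC`, `row_to_frequency` and `notes_at_step` are hypothetical.

```python
# Semitone offsets of a major pentatonic scale. This is an ASSUMPTION:
# the post only says the pitch axis is staggered non-linearly.
PENTATONIC = [0, 2, 4, 7, 9]


def row_to_frequency(row, base_hz=220.0):
    """Map a grid row (0 = lowest) to a frequency in Hz.

    Rows step through the pentatonic scale and wrap into the next
    octave every five rows, so adjacent rows are never a semitone apart.
    """
    octave, degree = divmod(row, len(PENTATONIC))
    semitones = 12 * octave + PENTATONIC[degree]
    return base_hz * 2 ** (semitones / 12)


def notes_at_step(grid, step):
    """Return the frequencies of all active cells in one time column.

    `grid` is a list of rows, each a list of 0/1 flags indexed by time step.
    """
    return [row_to_frequency(r) for r, line in enumerate(grid) if line[step]]
```

Skipping the notes that would clash is what keeps any random scribble on the grid sounding tolerable: every pair of rows lands on an interval from the same five-note scale.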

A more interesting challenge would be to create a devolution from the audio to tonematrix form. I claim such reverse engineering, followed by a Fourier transform to draw out the particular periodicities in frequency and time, can be used to generate an infinite family of pleasant sounding meanderings of arbitrary length. I would also like to consider the possibility of randomly placing and removing structures on the map at different positions: most of the early random and fractal music was interesting but sounded very much like garbage, whereas by forcing tonality on the output, tonematrix minimises the potential for something awful.
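As a rough illustration of the Fourier step only (not the audio-to-grid reverse engineering), here is a small Python sketch that takes the on/off pattern of a single grid row along the time axis and reports its strongest period. It uses a plain discrete Fourier transform from the standard library; the function names are hypothetical, and handling the pitch axis or a full two-dimensional transform is left out.

```python
import cmath


def dft_magnitudes(signal):
    """Magnitudes of the discrete Fourier transform of a real sequence."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n)
    ]


def dominant_period(signal, tol=1e-9):
    """Period (in time steps) of the strongest non-constant component.

    Returns None when the row has no periodic structure at all
    (for example, all cells off or all cells on).
    """
    n = len(signal)
    mags = dft_magnitudes(signal)
    half = range(1, n // 2 + 1)          # skip k = 0, the constant term
    peak = max(mags[k] for k in half)
    if peak < tol:
        return None
    # On ties, prefer the lowest frequency, i.e. the longest period.
    for k in half:
        if mags[k] >= peak * (1 - tol):
            return n / k


# A row that is on for two steps and off for two, repeated four times.
row = [1, 1, 0, 0] * 4
print(dominant_period(row))
```

Once the dominant periods of each row are known, one could in principle tile them out to any length, which is the sense in which the family of meanderings is unbounded.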