
Archive for December, 2008

For some reason, the ICC Test ‘Championship’ (read ‘Perpetual League Table’) is occupying my thoughts today. Having now disregarded a malicious desire to construct a (non-trivial) ranking system that maintains Australian superiority indefinitely, I instead offer the following cluster of thoughts:

  • Can it really be that the ICC points system fails to incorporate runs scored or wickets taken during a test? Intuitively, a team that loses narrowly deserves more respect than a team that plays a lot of what Bill Lawry calls ‘poor cricket’, and I would like to see this tackled directly. Of course there is great natural variation in both runs and wickets on account of ground conditions and locale, but surely this is removed by taking ratios (for runs) or differences (for wickets) between the teams. I imagine this being done on an innings-by-innings basis (a rough sketch of what I have in mind appears after this list).
  • There is this Belgian chap who has done some work on rankings incorporating batting and bowling averages. The page seems light on explication but the iterative correction (“The batting averages have been corrected by taking into account the bowling average of the opposition, and vice versa.”) looks pretty good. There seem to be some free parameters though: “The Combined Rating is equal to the difference between batting and bowling average, multiplied by 20.” Unless the 20 is for total number of wickets or something.
  • I am only interested in team rankings, or rather ratings; individual ratings have long been a subject of study, not least (though incompletely) by Brendon in his pathbreaking article on Bayesian Estimation and Cricket. To what degree, however, can a team rating be successfully constituted of the individual ratings of its members? Conversely, given trustworthy team and individual ratings, can the importance of a team ‘functioning as a unit’ be quantified by a resulting discrepancy? In fact, I am willing to claim that if such a ‘constructive’ team rating algorithm could be implemented, it would perform more accurately (i.e. with respect to predictions of individual encounters) than the top-down methods practiced by virtually everyone.
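
By way of illustration only: this is just one guess at how the ratios-and-differences idea from the first bullet might look in code. The function name, the pairing of innings, and the 0.1 weight on the wicket term are all arbitrary choices of mine, and none of this reflects the actual ICC points system.

```python
# Hypothetical sketch of a margin-aware match score: compare the two sides
# innings by innings, using run ratios and wicket differences so that ground
# conditions largely cancel. The 0.1 weight is entirely arbitrary.
def match_quality_score(innings):
    """innings: list of (team_a_runs, team_a_wkts_lost, team_b_runs, team_b_wkts_lost)
    tuples, one per paired innings. Returns a signed score; positive favours team A."""
    score = 0.0
    for a_runs, a_wkts_lost, b_runs, b_wkts_lost in innings:
        run_ratio = a_runs / max(b_runs, 1)           # >1 means A out-scored B
        wicket_diff = b_wkts_lost - a_wkts_lost       # >0 means A's bowlers did better
        score += (run_ratio - 1.0) + 0.1 * wicket_diff
    return score

# e.g. a match with two completed innings per side:
print(match_quality_score([(350, 10, 280, 10), (220, 10, 295, 7)]))
```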

Update: I hadn’t realised until now that the last time Australia lost every match in a series at home was in the 19th Century, i.e. pre-relativity. Could we avoid that happening again now please? I hope all members of the team realise that we don’t like them because of their winning personalities, and with that in mind wish them a Happy New Year. (More charitably: Best of luck to whichever of Bollinger and Hilfenhaus makes his debut tomorrow.)

Update the second: Match not starting ’til tomorrow, obviously; I’d presumed it would be on a Friday, but of course they need more than two days of rest between tests and the Melbourne test has to start on the 26th irrespective of which day of the week that is. Also, Bollinger is in for Brett Lee and Andrew Macdonald (qui?) replaces Andrew Symonds, who is out filming commercials for Ford.

Update the third (07/01/09): Well, Australia has won a nail-biter. Congrats to Bollinger and Macdonald for turning in fairly solid performances, and especially to Peter Siddle for his Man of the Match performance. Though I only witnessed it through Cricinfo, Hayden’s dropped catch at the death is beyond the last straw: he should be dropped… from reality. As X. W. Halliwell remarked, that would be “an interesting selection decision.”

Read Full Post »

My title is taken from a similarly titled article by the physicist Ed Jaynes, whose work influenced me greatly. It refers to a controversial idea of epistemological probability theory: the method of maximum entropy, which was popularised and (arguably) invented by Jaynes. This principle states that, when choosing probabilities on a discrete hypothesis space, subject to constraints on the probabilities (e.g. a certain expectation value is specified), you should distribute the probability as uniformly as possible by the criterion of Shannon entropy.

It was soon realised that this is not, as Jaynes hoped, a method of assigning probabilities from scratch. With no constraints apart from normalisation, you get a uniform distribution, which is lurking in the background as a “prior” that is assumed by MaxEnt. The uniform distribution might be justified by another argument (e.g. invariance under relabelling of the hypotheses), but the point remains: Maximum Entropy updates probabilities from a previous distribution; it doesn’t generate them from scratch (I will use the term ‘ME’ to refer to MaxEnt applied in this updating fashion). This puts the principle on the same turf as another, far more widely accepted method for updating probabilities: Bayes’s theorem. The main difference seems to be that ME updates given constraints on the probabilities, and Bayes updates on new data.

Of course, when there are two methods that claim to do the same, or similar, things, disagreements can occur. There is a large, confusing literature on the relationship and possible conflicts between Bayes’s theorem and Maximum Entropy. I don’t recommend reading it. Actually, it gets worse: there are at least three different but vaguely related ideas that are called maximum entropy in the literature! The most common conflict is demonstrated in this short discussion by David MacKay. The problem posed is the classic one, along these lines (although MacKay presents it slightly differently): given that a biased die averaged 4.5 on a large number of tosses, assign probabilities for the next toss, x. This problem can seemingly be solved by Bayesian Inference, or by MaxEnt with a constraint on the expected value of x: E(x) = 4.5. These two approaches give different answers!
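
For concreteness, here is a small sketch (my own, not MacKay’s) of the MaxEnt half of that calculation: maximising the Shannon entropy of p(1), …, p(6) subject to E(x) = 4.5 gives an exponential-family distribution p(k) ∝ exp(λk), and the multiplier λ can be found numerically.

```python
# Minimal sketch of MaxEnt for the biased-die puzzle: maximise Shannon entropy
# over p(1..6) subject to sum(p) = 1 and E[x] = 4.5. The solution is an
# exponential family p(k) proportional to exp(lambda * k); we find lambda by
# bisection on the implied mean, which is increasing in lambda.
import math

faces = range(1, 7)
target_mean = 4.5

def mean_for(lam):
    weights = [math.exp(lam * k) for k in faces]
    z = sum(weights)
    return sum(k * w for k, w in zip(faces, weights)) / z

lo, hi = -10.0, 10.0
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if mean_for(mid) < target_mean:
        lo = mid
    else:
        hi = mid

lam = 0.5 * (lo + hi)
weights = [math.exp(lam * k) for k in faces]
z = sum(weights)
p = [w / z for w in weights]
print("lambda =", round(lam, 4))
print("MaxEnt p(x):", [round(pk, 4) for pk in p])   # skewed towards the high faces
```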

Given the success of Bayes, I was confused and frustrated that nobody could clearly explain this old MaxEnt business, and whether it was still worth studying. All of this was on my mind when I attended the ISBA conference earlier this year. So, aided by free champagne, I sought out some opinions. John Skilling, an elder statesman (sorry John!) of the MaxEnt crowd, seems to have all but given up on the idea. Iain Murray, a recent PhD graduate in machine learning, dismissed MaxEnt’s claim to fundamental status, saying that it was just a curious way of deriving the exponential families. He also reminded me that Radford Neal rejects MaxEnt. These are all people whose opinions I respect highly. But by the end of this story I end up disagreeing with them.

How did this come about? At ISBA, I tracked down the only person who mentioned maximum entropy on their poster – Adom Giffin, from the USA – and had a long discussion/debate, essentially boiling down to the same issues raised by MacKay in the previous link: MaxEnt and Bayes can both be used for this problem, and are quite capable of giving different answers. I was still confused, and after returning to Sydney I thought about it some more and looked up some articles by Giffin and his colleague, Ariel Caticha. These can be found here and here. After devouring these I came to agree with their compatibilist position. The rationale given for ME is quite simple: prior information is valuable and we shouldn’t arbitrarily discard it. Suppose we start with some probability distribution q(x), and then learn that, actually, our probabilities should satisfy some constraint that q(x) doesn’t satisfy. We need to choose a new distribution p(x) that satisfies the new constraint – but we also want to keep the valuable information contained in q(x). If you seek a general method for doing this that satisfies a few obvious axioms, ME is it – you choose your p(x) such that it is as close to q(x) as possible (i.e. maximum relative entropy, minimum Kullback-Leibler distance) while satisfying the new constraint.
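
Here is a minimal sketch of that ME update, under the assumption that the new information arrives as an expectation-value constraint E_p[f] = F on a discrete space (the function name me_update and the example prior are mine): the minimum-KL solution tilts the prior, p(x) ∝ q(x) exp(λ f(x)), with λ fixed by the constraint. With a uniform q this reduces to the dice calculation above.

```python
# Sketch of the ME (minimum relative entropy / minimum KL) update from a prior q
# under an expectation-value constraint E_p[f] = F. The constrained optimum is
# p(x) proportional to q(x) * exp(lambda * f(x)); lambda is found by bisection.
import math

def me_update(q, f, F, lo=-50.0, hi=50.0, iters=200):
    """Return the ME update of the discrete prior q under the constraint E[f] = F."""
    def tilt(lam):
        w = [qx * math.exp(lam * fx) for qx, fx in zip(q, f)]
        z = sum(w)
        return [wx / z for wx in w]
    def mean(lam):
        return sum(px * fx for px, fx in zip(tilt(lam), f))
    for _ in range(iters):            # mean(lam) is increasing in lam, so bisect
        mid = 0.5 * (lo + hi)
        if mean(mid) < F:
            lo = mid
        else:
            hi = mid
    return tilt(0.5 * (lo + hi))

# Example: start from a non-uniform prior over die faces and impose E[x] = 4.5.
q = [0.3, 0.2, 0.15, 0.15, 0.1, 0.1]
p = me_update(q, f=list(range(1, 7)), F=4.5)
print([round(px, 4) for px in p])
```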

This seems to present a philosophical problem: where do we get constraints on probabilities from, if not from data? It is easy to imagine updating your probabilities using ME if some deity (or exam question writer) provides a command “thou shalt have an expectation value of 4.5”, but in real research problems, information never comes in this form. An experimental average is not an expectation value. I emailed Ariel Caticha with a suggested analogy for understanding this situation, which he agreed with (apparently a rare phenomenon in this field). The analogy is that, in classical mechanics, all systems can be specified by a Hamiltonian, and the equations of motion are obtained by differentiating the Hamiltonian in various ways. But hang on a second – what about a damped pendulum? What about a forced pendulum? I remember studying those in physics, and they are not Hamiltonian systems! But we understand why. Our model was designed to include only the coordinates and momenta that we are interested in – the ones about the pendulum – and not those describing the rest of the universe; these are dubbed “external to the system”, and their effects summarised by the damping coefficient or the driving force f(t). However, our use of a model of this kind does not stop us from believing that energy is actually conserved, if only our model included these extra coordinates and momenta. It also doesn’t mean we can be arbitrary in choosing the damping coefficient and the driving force f(t) – these ought to be true summaries of the relevant information about the environment.

Similarly, in inference, one should write down every possibility imaginable, and delete the ones that are inconsistent with all of our experiences (data). This would correspond to coming up with a really big model and then using Bayes’s theorem given all the data you can think of. However, this is impossible in practice, so for pragmatic reasons we summarise some of the data by a constraint on the probabilities we should use on a smaller hypothesis space, in much the same way that, in physics, we reduce the whole rest of the universe to just a single damping coefficient, or a driving force term. That is where constraints on probabilities come from – summaries of relevant data that we deem to be “external to the system” of interest. Once we have them, we need to process them to update our probabilities, and ME is the right tool for this job.

The simplest way to reduce some data to a constraint on probabilities is if the statement “I got data D” is in your hypothesis space, as it is in the normal Bayesian setup. Applying the syllogism “I got data D ==> my probabilities should satisfy P(D)=1”, then applying ME, leads directly to the conventional Bayesian result – as demonstrated by Giffin and Caticha. Thus, Bayesian Inference isn’t about accumulating data in order to overwhelm the prior information, as it is often presented. It is just the opposite – we are really trying to preserve as much prior information as possible!
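
A toy numerical check of this claim (my own example, not one taken from Giffin and Caticha’s papers): on the joint space of hypotheses and data, impose the constraint P(D = D_obs) = 1 and perform the ME update. Minimising the KL divergence from the joint prior while pinning the data marginal leaves the conditional q(θ | D_obs) untouched, so the resulting marginal over hypotheses is exactly the Bayesian posterior.

```python
# Illustrative two-hypothesis example: the ME update under the constraint
# "P(D_obs) = 1" on the joint (hypothesis, data) space equals Bayes's theorem.
import itertools

thetas = ["fair coin", "biased coin"]
prior = {"fair coin": 0.5, "biased coin": 0.5}
likelihood = {                       # q(D | theta) for D = number of heads in 2 flips
    "fair coin":   {0: 0.25, 1: 0.50, 2: 0.25},
    "biased coin": {0: 0.04, 1: 0.32, 2: 0.64},
}
D_obs = 2

# Joint prior q(theta, D)
joint = {(t, d): prior[t] * likelihood[t][d]
         for t, d in itertools.product(thetas, likelihood["fair coin"])}

# ME update under P(D = D_obs) = 1: keep only the D_obs slice and renormalise --
# this is the minimum-KL distribution satisfying the constraint.
me_joint = {(t, d): (p if d == D_obs else 0.0) for (t, d), p in joint.items()}
z = sum(me_joint.values())
me_posterior = {t: sum(p for (tt, d), p in me_joint.items() if tt == t) / z
                for t in thetas}

# Ordinary Bayes for comparison
bayes = {t: prior[t] * likelihood[t][D_obs] for t in thetas}
zb = sum(bayes.values())
bayes = {t: v / zb for t, v in bayes.items()}

print("ME posterior:   ", me_posterior)
print("Bayes posterior:", bayes)     # identical
```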

This leaves one remaining loose end – that pesky biased die problem, or the analogous one discussed by MacKay. Which answer is correct? In my opinion, both are correct but deal with different prior states of knowledge (the ultimate Bayesian’s cop-out ;-)). If we actually knew that it was a repeated experiment, the Bayesian set-up of a “uniform prior over unknown probabilities”, and then conditioning on the observed mean, is correct. The hypothesis space here is the space of possible values for the 6 “true probabilities” of the die, combined (as a Cartesian product) with the space of possible sequences of rolls, {1,1,1,…,1}, {1,1,1,…,2}, …, {6,6,6,…,6}. Note that not all of these sequences of tosses are equally likely. If we condition on a 1 for the first toss, this raises our probability for the 2nd toss being a 1 as well. This is relevant prior information that should, and does, affect the result. This is the source of the disagreement between MaxEnt and Bayes.

If we didn’t know this whole setup about the die, merely that there were 6^N possibilities, with N large, this model would be inappropriate. We would have uniform probabilities for the sequence of tosses (this corresponds to the “poorly informed robot” from Jaynes’s book). In this scenario MaxEnt with E(x) = 4.5 completely agrees with Bayesian Inference. It is this case where MaxEnt is appropriate because we really do possess no information other than the specified average value.
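
A rough numerical check of this agreement, under my reading of the “poorly informed robot” setup: put a uniform distribution over all 6^N sequences, condition on the sequence average being exactly 4.5, and watch the marginal distribution for a single toss approach the MaxEnt answer as N grows.

```python
# P(x1 = k | sum = S) under a uniform prior over sequences is proportional to
# the number of (N-1)-toss sequences summing to S - k, counted by convolution.
import numpy as np

def counts_of_sums(n):
    """Number of n-toss sequences achieving each total (array index = total)."""
    c = np.zeros(1); c[0] = 1.0
    die = np.ones(7); die[0] = 0.0            # one way to roll each face 1..6
    for _ in range(n):
        c = np.convolve(c, die)
    return c

for N in (10, 50, 200):
    S = int(4.5 * N)                          # condition on an average of exactly 4.5
    rest = counts_of_sums(N - 1)
    marg = np.array([rest[S - k] for k in range(1, 7)], dtype=float)
    marg /= marg.sum()
    print(N, np.round(marg, 4))
# As N grows these approach the MaxEnt distribution computed earlier,
# roughly (0.054, 0.079, 0.114, 0.165, 0.240, 0.348).
```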

This concludes the narrative of my journey from confusion to some level of understanding of this issue. At the moment, I am working on some ideas related to ME that can help clear up some difficulties in conventional Bayesian Inference. In particular, there’s been a flare-up of controversy, basically over Lindley’s Paradox in cosmology, which I believe ME can go some way to resolving.

I’d like to leave you with a quote from neuroscientist V. S. Ramachandran, that gave me the confidence to reveal my heretical thoughts on this matter.

“I tell my students, when you go to these meetings, see what direction everyone is headed, so you can go in the opposite direction. Don’t polish the brass on the bandwagon.” – V. S. Ramachandran

Read Full Post »

Intermission: Toilet Humour

Am I the last person to become aware of the existence of the Australian Government’s National Public Toilet Map, located where else but at http://www.toiletmap.gov.au? I would have stopped short of the slavish adherence to website personalisation that led to the inclusion of a ‘My Toilets’ section, but, in all seriousness, the website is a very good idea. It is also hilarious. Moving swiftly on…

Read Full Post »

The Next Big Thing

That John Lanchester fellow has written yet another interesting article in the LRB, this time not about the economic crisis (or so it seems to me), but about the medium that this year

[f]rom the economic point of view… overtook music and video, combined, in the UK. The industries’ respective share of the take is forecast to be £4.64 billion and £4.46 billion. (For purposes of comparison, UK book publishers’ total turnover in 2007 was £4.1 billion.)

The industry is, of course, video games, and the article, similar in spirit to a less-well-written offering in Prospect Magazine six months back (to which I have now cancelled my subscription, and not to be confused with the in-house magazine of the Western Australian government’s Department of Industry and Resources), strikes exactly the right balance between outsider’s whimsy (for Lanchester is not what he would consider a gamer) and enthusiastic respect. Hence the marvellously deadpan description:

Also great fun is Super Mario Kart, a racing game, again silly, with a highly welcome low level of pick-up-and-play difficulty. In it, Donkey Kong (a large gorilla) can race Princess Peach (the multiply kidnapped sort of love object of the Mario series), an Italian plumber (the eponymous Mario), his evil twin, Wario, and a small green dinosaur called Yoshi, and so on, all of the vehicles being driven by various friends and family members, and comprehensible and playable by anyone over the age of about four.

There are plenty of rule-of-thumb dichotomies too:

There is no other medium that produces so pure a cultural segregation as video games, so clean-cut a division between the audience and the non-audience.

A common criticism of video games made by non-gamers is that they are pointless and escapist, but a more valid observation might be that the bulk of games are nowhere near escapist enough. A persuasive recent essay by the games theorist Steven Poole made the strong argument that the majority of games offer a model of play which is oppressively close to work.

Both useful insights, worth mentioning to anyone you know who has never played a computer game in their life and would like to understand how commentary on them has evolved from the glorious 1990s-era debates about children being brainwashed.

Read Full Post »

Via a gmail status message of O. X. Dive, a modular implementation of those Himalayas of engineering, the Great Ball Contraption:

There is (a non-lego) one in the Questacon at the Parliamentary Triangle and it was definitely one of my favourite exhibits as a young enthusiast of science and symmetry, along with (for different reasons, of course) the Tesla Coil and the Jacob’s Ladder. The music, performed by the Balanescu Quartet, is a cover of Kraftwerk’s Robots.

Read Full Post »

Via Sullivan, Keljeck (nice blog template) posts the full transcript of an interview between John Lofton and Allen Ginsberg, as printed in a 1990 issue of Harper’s Bazaar (though itself apparently a reprint!):

LOFTON: When you say you suppose this could have applied to you, does this mean you don’t know if you are mad?

GINSBERG: Well, who does? I mean everybody is a little mad.

LOFTON: But I’m asking you.

GINSBERG: You are perhaps taking this a little too literally. There are several kinds of madness: divine madness—

LOFTON: But I’m talking about this in the sense you spoke of in your 1949 poem “Bop Lyrics,” when you wrote: “I’m so lucky to be nutty.”

GINSBERG: You’re misinterpreting the way I’m using the word.

LOFTON: No. I’m asking you a question. I’m not interpreting anything.

GINSBERG: I’m afraid that your linguistic presupposition is that “nutty” as you define it means insanity rather than inspiration. You are interpreting, though you say you aren’t, by choosing one definition and excluding another. So I think you’ll have to admit you are interpreting.

LOFTON: Actually, I don’t admit that.

GINSBERG: You don’t want to admit nuttin’! But you want me to admit something. Come on. Come off it. Don’t be a prig.

It’s not too long (maybe three to five minutes of reading in full), but it’s really engaging because many of their remarks are so unctuous, and many others quite dazzling. I like, for instance, the way Ginsberg includes in the passage above the clause “though you say you aren’t”, as a way of acknowledging that he understands Lofton’s first attempt to rebuff the former’s charge of misinterpretation. It’s a good rhetorical device and worth remembering. There are, of course, contentful insights to be had throughout the interview, as well.

Read Full Post »

Every Single Christmas …

With Christmas less than a week away, it’s time to share with you my top five songs that get stuck in my head every Christmas. Readers are encouraged to provide their own list in the comments section.

5. Wonderful Christmas Time – Paul McCartney

I don’t know any of the verses to this song, but the chorus – “simply … having … a wonderful Christmas time” – can get stuck in a loop in my head for hours on end. Check out the video clip below – he’s playing the song in a pub, accompanied by piano and upright bass and the like. I’m not sure what’s making the futuristic laser-esque noise …

4. It’s The Most Wonderful Time Of The Year – Andy Williams

Once again, my grasp of the lyrics is a bit vague. It goes something like:

It’s the most wonderful time of the year
where the something and something and something and something
and be of good cheer
It’s the most wonderful time of the year

It gets stuck in my head because that first line is so uplifting. Here’s a clip from YouTube.

3. All I Want For Christmas Is You – Mariah Carey

So catchy. Every house needs a copy of Mariah Carey’s Christmas Album, right next to the Bing Crosby collection. Her version of “O Holy Night” is actually rather good, but it’s this one that gets played endlessly on radio.

I can’t embed this one. Follow this link

2. It Feels Like Christmas – Muppet’s Christmas Carol

At some point during December, some TV channel will broadcast and I will watch “A Muppet’s Christmas Carol”. I loved it in 1992 and I’m still amused by Sam the Eagle’s line: “It is the American Way … “. As usual, the songs are feel-good classics (remember Kermit singing “Rainbow Connection” … did your childhood resemble mine?). I can’t help singing along with the Ghost of Christmas Present:

1. White Christmas – The Drifters

This song is way out in front. I sing this year round. It sends my housemates crazy. The problem is that I attempt to sing all parts simultaneously, alternating between the lyrics and the backing line: “ba do ba do, da da … da da da”.

If you think you’ve heard this somewhere before, it’s in the movie “Home Alone” – Kevin sings along into a hairbrush before slapping aftershave onto his cheeks and screaming into the mirror.

Now it’s your turn … comments please!

Read Full Post »

Those who have spoken to me recently about such things will know of my advocacy of the iTunes movie rental service. Though their selection is limited, it is an excellent proof of concept for the general idea of how digital media should sensibly be ‘owned’. I know that the Internet is full of well-informed, passionate people with strong views on this and related issues, but I would just like to go on record noting that I believe it a quirk of history that we think of songs, albums, movies &c. as goods rather than services.

(more…)

Read Full Post »

Have just had a nice week working at Mount Stromlo Observatory and am looking forward to the coming week in Melbourne (generally) and Swinburne (particularly). Yesterday and the day before I was working on a small problem that grew out of a larger project (which I’m sure I’ll return to at a later date, just not now) involving a little geometry: when numerical simulations of large-scale cosmological structure are made, they inevitably result in galaxies located within a big cube; on the other hand, when we observe the Universe with telescopes what we get is a pointy wedge with the Earth at the tip. How to go from one to the other? N.B. this post has exciting mathematics puzzles!
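
Without giving away the puzzles, here is one naive way to do the cube-to-wedge step (my own sketch; the function name, the numbers, and the origin-observer choice are illustrative, not from the post): pick an observer position, convert Cartesian positions to distance and sky coordinates, and keep only the galaxies inside the survey footprint and radial range. Wedges deeper than the box would need the periodic cube replicated first.

```python
# Naive carving of an observational wedge out of a simulation cube: observer at
# the origin, positions converted to (r, RA, Dec), then cut on angles and depth.
import numpy as np

def cube_to_wedge(xyz, box_size, r_max, ra_range_deg, dec_range_deg):
    """xyz: (N, 3) galaxy positions in the cube; observer assumed at the origin.
    If r_max exceeds box_size, the cube would need periodic replication first."""
    pos = np.mod(xyz, box_size)                   # keep positions inside the box
    x, y, z = pos.T
    r = np.sqrt(x**2 + y**2 + z**2)
    ra = np.degrees(np.arctan2(y, x))             # azimuthal angle on the sky
    dec = np.degrees(np.arcsin(np.clip(z / np.maximum(r, 1e-12), -1, 1)))
    keep = ((r < r_max)
            & (ra > ra_range_deg[0]) & (ra < ra_range_deg[1])
            & (dec > dec_range_deg[0]) & (dec < dec_range_deg[1]))
    return pos[keep]

# Toy usage: random 'galaxies' in a 500 Mpc/h box, a 30 x 30 degree wedge to 400 Mpc/h.
rng = np.random.default_rng(0)
galaxies = rng.uniform(0, 500, size=(10000, 3))
wedge = cube_to_wedge(galaxies, box_size=500, r_max=400,
                      ra_range_deg=(10, 40), dec_range_deg=(10, 40))
print(wedge.shape)
```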

(more…)

Read Full Post »