Monday, April 20, 2009

Interferometry File Formats

When astronomers talk about software done right, they often hold up FITS as the gold standard. I'll admit, FITS has done more to live up to its name (Flexible Image Transport System) than many believed possible. But there can be too much of a good thing: sometimes too much emphasis is put on defining an end-all, be-all file format, when what we really need are good tools for converting between formats.

Data formats are a problem in radio astronomy software. Currently, there are at least three major formats (MIRIAD, UVFITS, and MeasurementSets), each linked to a major software package (MIRIAD, AIPS, and CASA), with rudimentary or non-existent tools for converting between them. Many have taken this state of affairs as a sign that multiple file formats are bad and that the community should settle on a single format. Since each format is intimately tied to a major software package, the battle over file formats has escalated into a war between software packages.

The mistake made here was blaming the file formats. File formats are not the problem. The problem is that the software for reading them has not been circulated in easily accessible modules. I am running into this problem firsthand as I try to make AIPY agnostic about file formats by wrapping them all into Python. Here's where I am:

The MIRIAD file format was actually easy to wrap, owing to MIRIAD having a well-developed programmer's API.
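For illustration, reading visibility records out of a MIRIAD dataset through the AIPY wrapper ends up looking something like this (a sketch from memory; the exact call signatures may differ between AIPY releases, and 'data.uv' is just a placeholder path):

import aipy

# Open a MIRIAD UV dataset and iterate over its visibility records
uv = aipy.miriad.UV('data.uv')
data_by_baseline = {}
for preamble, data in uv.all():
    uvw, t, (i, j) = preamble        # (u,v,w) coordinates, Julian date, antenna pair
    # data is a masked array of complex visibilities for this integration
    data_by_baseline.setdefault((i, j), []).append(data)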

MeasurementSets (with CASA) are giving me a lot more trouble. It seems that CASA, with all of its C++ objects that are passed between functions, is something of an "all or nothing" deal. If I want to read a MeasurementSet, I apparently need to wrap up the entirety of CASA. The failing here is code modularity.

UVFITS is giving me the opposite problem. UVFITS was devised as a FITS-conforming format for handling raw interferometric data. Unfortunately, interferometric data isn't in image form yet, so an extension of the FITS format (the binary table) was cooked up to accommodate it (Cotton et al. 1995). The result is a file format so general that it does not tell the programmer what the data actually mean.

File formats exist to support the needs of different applications. They've been created out of need, and should not be dismissed as unnecessary. I recommend to the radio astronomy software community that we embrace these file formats and work on modular code so that they are accessible from any software package.

Thursday, April 16, 2009

PAPER: 8 Station Results

Yesterday I submitted our paper on the 4- and 8-antenna deployments of PAPER to the Astronomical Journal & astro-ph. Although it does not place any meaningful constraints on cosmic reionization (when light from the first stars broke up the bulk of the hydrogen gas that had been sitting around since it originally formed), it nonetheless illustrates a first level of calibration and analysis towards that goal. This paper should have some impact in the community, as it shows that we are fairly well along in our experiment and it provides a first look at some of the astrophysical foregrounds that will interfere with detecting reionization. This paper will also double as the last chapter of my dissertation, so publishing it puts me in the endgame of my doctoral studies... This fall I'm going to take an NSF postdoc back at Berkeley. Basically, that just means I'll be doing the same research (along with some teaching), but with better pay.

Tuesday, April 14, 2009

More on MCMC in Python

Markov-Chain Monte Carlo (MCMC) seems to be a promising technique for the calibration/imaging problem that we are facing with our experiment, the Precision Array for Probing the Epoch of Reionization (PAPER). Yesterday, in addition to taking a crash course in MCMC, I also started playing with PyMC, which implements, among other things, MCMC using Metropolis-Hastings chains. A first shot at a simple fitter using PyMC went something like this:

import pymc, numpy as n, pylab as p
from pymc import deterministic

# Independent variable for the model
x = n.arange(-10., 10, .01)

# The "true" quadratic we are trying to recover
def actual_func(a, b, c): return a*x**2 + b*x + c

# Mock data: the true curve plus Gaussian noise of known width
sig = .1
tau = 1/sig**2                  # PyMC parameterizes Normals by precision tau = 1/sigma^2
noise = n.random.normal(0, scale=sig, size=x.size)
mdata = actual_func(1., 2., 3.)
mdata += noise

# Uniform priors on the fit parameters, bracketing the true values
a_fit = pymc.Uniform('a', 0., 2.)
b_fit = pymc.Uniform('b', 1., 3.)
c_fit = pymc.Uniform('c', 2., 4.)
# Let the noise level itself be a free parameter with a broad prior
tau_fit = pymc.Uniform('tau', tau/3, 3*tau)

# Deterministic node: the model prediction given the current parameter values
@deterministic
def func(a=a_fit, b=b_fit, c=c_fit): return actual_func(a, b, c)

# Likelihood: the measured data, normally distributed about the model
mdata_var = pymc.Normal('mdata', mu=func, tau=tau_fit,
    value=mdata, observed=True)

mdl = pymc.Model([a_fit, b_fit, c_fit, mdata_var, tau_fit])
mc = pymc.MCMC(mdl)
mc.sample(iter=1000, burn=250)

# Histogram the post-burn-in samples of 'a' to see its posterior distribution
a_hist, a_edges = n.histogram(a_fit.trace(), 40, (0, 2))
a_bins = (a_edges[:-1] + a_edges[1:]) / 2
p.plot(a_bins, a_hist)

p.show()


I chose this example over the ones that came with PyMC because it is much closer to the kind of calibration problems we will be trying to solve with PAPER. An interesting point that came up early on was the reliance of MCMC on an estimate of the noise level in the data. I remember this from the Maximum Entropy Method (MEM) deconvolution that I coded up in AIPY. An interesting technique, used here, is to include the noise level (tau_fit) as a variable that is fit for. This way you can specify a starting guess for the noise, along with a confidence interval, and not have to pull a magic number out of a hat. In this example, the fitter does a good job of accurately determining the noise level. I think what happens in this code is that the current guess for the noise level is used as part of the calculation that determines the next state to jump to in the Markov chain, and that new state may include a revision of the noise level. This could clearly become unstable for more complex systems, so I imagine some care must be exercised in leaving the noise as a free parameter.
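
As a quick sanity check (reusing the variables from the script above), the posterior samples of tau can be converted back to a noise level and eyeballed against the sig that generated the mock data:

# Posterior samples of tau, converted back to a noise level (sigma = 1/sqrt(tau))
sig_samples = 1 / n.sqrt(tau_fit.trace())
sig_hist, sig_edges = n.histogram(sig_samples, 40)
sig_bins = (sig_edges[:-1] + sig_edges[1:]) / 2
p.plot(sig_bins, sig_hist)
p.show()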

Monday, April 13, 2009

Image Deconvolution with Markov-Chain Monte Carlo

I've decided to morph this blog to be more about short updates pertaining to current ideas I'm thinking about, rather than the long-winded philosophical rants I've posted so far. Hopefully this might keep me more engaged as a blogger and maybe even help me keep better track of the things I'm working on. Starting up this blog again, by the way, is a shameless procrastination technique, since my dissertation is due in about 1 month, and all my writing energy should really be focused on that...

After attending an SKA Imaging Workshop in Socorro, NM a couple of weeks ago, I've developed an interest in Bayesian statistics and Markov-Chain Monte Carlo (MCMC) techniques as they pertain to interferometric imaging. Having never taken a stats course, I'm scrambling a little to absorb the vocabulary I need to understand papers written on the subject. Fortunately, in this era of Wikipedia, getting up to speed isn't that hard. After reading wiki articles on MCMC, Markov chains, and the Metropolis-Hastings algorithm, I dived into EVLA Memo 102, which talks about a first shot at using MCMC for image deconvolution.
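
To make sure I actually understood the Metropolis-Hastings idea, here is roughly the toy sampler I keep in my head (a bare-bones sketch with a symmetric Gaussian proposal, not tuned for any real problem):

import numpy as n

def metropolis(log_post, x0, n_steps=10000, step=0.1):
    # Bare-bones Metropolis sampler with a symmetric Gaussian proposal.
    #   log_post: function returning the log posterior at a parameter vector
    #   x0:       starting parameter vector
    x = n.array(x0, dtype=float)
    lp = log_post(x)
    samples = []
    for i in range(n_steps):
        prop = x + step * n.random.normal(size=x.shape)  # propose a jump
        lp_prop = log_post(prop)
        # Accept with probability min(1, posterior ratio)
        if n.log(n.random.rand()) < lp_prop - lp:
            x, lp = prop, lp_prop
        samples.append(x.copy())
    return n.array(samples)

As far as I can tell, PyMC is doing essentially this under the hood, with smarter proposal tuning and bookkeeping for the traces.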

The Maximum Entropy Method (MEM) is a classic deconvolution technique (and one I've already reimplemented for AIPY), but I'd like to go a bit further down this road. In the standard implementation (which I gleaned from reading Cornwell & Evans (1984) and Sault (1990)), the algorithm uses a steepest-descent minimization based on the assumption of a nearly diagonal pixel covariance matrix (i.e. the convolution kernel is approximately a single pixel). While this is an effective computation-saving assumption, I found that for the data I was working with it led to the fit diverging when I started imaging at finer resolutions.
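
For concreteness, the core update in that style of MEM deconvolver looks roughly like the following (my own schematic sketch, not the Cornwell & Evans code; the beam is assumed to have its peak at pixel [0, 0], and the diagonal approximation shows up in treating both the chi-squared gradient and the curvature pixel-by-pixel):

import numpy as n

def mem_step(model, dirty, beam, default, alpha, gain=0.1):
    # One steepest-descent update for a toy MEM deconvolver.
    #   model:   current model image (strictly positive)
    #   dirty:   dirty image
    #   beam:    dirty beam (PSF), stored with its peak at pixel [0, 0]
    #   default: default (prior) image
    #   alpha:   weight on the chi-squared term in J = H - alpha * chi^2
    conv = n.fft.ifft2(n.fft.fft2(model) * n.fft.fft2(beam)).real
    resid = dirty - conv                     # residual between dirty image and model
    grad_H = -n.log(model / default)         # gradient of the entropy term
    grad_chi2 = -2 * resid                   # single-pixel (diagonal) kernel approximation
    grad_J = grad_H - alpha * grad_chi2
    # Diagonal Hessian approximation: |d2H/dm2| = 1/m, so scale the step by the model
    step = gain * grad_J * model
    return n.maximum(model + step, 1e-8)     # keep the model positive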

I think MCMC, by not taking the steepest descent, might be able to employ the diagonality assumption more robustly. I also think it's high time that deconvolution algorithms make better use of priors. The spatial uniformity prior in MEM makes it powerful for deconvolving extended emission, while the brightest-pixel selection technique in CLEAN makes it effective for deconvolving point sources. There's no reason we can't build a prior explicitly for a deconvolution algorithm that tells it to prefer single strong point sources over many weaker point sources, but also tells it that when all else is equal, entropy should be maximized.
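
As a strawman for what such a prior might look like (entirely hypothetical; the weights and the square-root concentration term are just stand-ins for a real point-source preference):

import numpy as n

def log_prior(model, default, w_entropy=1.0, w_point=1.0):
    # Hypothetical log-prior for deconvolution combining two preferences:
    #  - an entropy term (relative to a default image) that, all else equal,
    #    favors smooth, spread-out emission, and
    #  - a concentration term: for a fixed total flux, sum(sqrt(m)) is smallest
    #    when the flux sits in a few strong pixels, so rewarding -sum(sqrt(m))
    #    prefers single strong point sources over many weak ones.
    entropy = -n.sum(model * n.log(model / default))
    concentration = -n.sum(n.sqrt(model))
    return w_entropy * entropy + w_point * concentration

A log-prior like this could be handed straight to a sampler like the Metropolis sketch above as the prior part of the log posterior.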