Posts

Showing posts from January, 2018

Maps

"We tell stories with maps about global warming, biodiversity; we can design more livable cities, track the spread of epidemics. That makes a difference." Jack Dangermond

"A map is not the territory it represents, but, if correct, it has a similar structure to the territory, which accounts for its usefulness." Alfred Korzybski

I wanted to plot sea surface temperature (SST). Actually, I wanted more than that. I wanted an easy way to plot global SST. I wanted a lazy way to make a nice plot. It turns out it's not hard, but it is more difficult than I naively expected.

Get Some Data

NOAA's National Centers for Environmental Information (NCEI) hosts public access archives for environmental data on Earth. Among the many data sets are high resolution satellite measurements of SST. The data set I'm interested in, daily OISST, has a spatial grid resolution of 0.25° and a temporal resolution of 1 day. It uses Advanced Very High Resolution Radiometer (AVHRR) infrared satellite SST da...

Changes

Turn and face the strange
Ch-ch-changes
David Bowie, Changes

Suppose you have some data that looks like this. It might be a time series or some other form of sequential data. It looks like there might be several different regions in the data. The range of counts seems to change somewhere around 25 and maybe again around 75 or 80. Assuming there is some underlying process generating the data, we would like to know where the parameters that control the process change and what their values are in each region. This is toy data generated by this function:

"""
generateSeq - generate a sequence of Poisson distributed data with changes in mean at specified points
arguments:
    length - the total sequence length
    cps - change point location
    lambdas - mean of data for each segment
returns:
    seq - a numpy array of values at each position in the sequence
Note: This function does no error checking. len(lambdas) must = len(cps) + 1
""...
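
The excerpt cuts off before the function body, so the following is only a minimal sketch of what a generator matching that docstring could look like, not the author's code. The name `generate_seq`, the `seed` parameter, and the length check are assumptions added for illustration (the original states that it does no error checking).

```python
import numpy as np


def generate_seq(length, cps, lambdas, seed=None):
    """Hypothetical sketch: Poisson counts whose mean changes at each change point.

    length  - total sequence length
    cps     - change point locations (indices where a new segment starts)
    lambdas - Poisson mean for each segment; needs len(cps) + 1 entries
    seed    - optional seed for reproducibility (an assumption, not in the original)
    """
    if len(lambdas) != len(cps) + 1:
        raise ValueError("len(lambdas) must equal len(cps) + 1")

    rng = np.random.default_rng(seed)

    # Segment boundaries: start of sequence, each change point, end of sequence.
    bounds = [0, *cps, length]

    # Draw each segment with its own mean, then join the segments in order.
    segments = [
        rng.poisson(lam, size=stop - start)
        for lam, start, stop in zip(lambdas, bounds[:-1], bounds[1:])
    ]
    return np.concatenate(segments)


# Example: three regions with changes at positions 25 and 75.
seq = generate_seq(100, cps=[25, 75], lambdas=[2.0, 8.0, 3.0], seed=0)
```

Building the boundaries list first keeps the segment lengths and the means aligned, which is the invariant the docstring's note about `len(lambdas)` is pointing at.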

Symbolic Regression

In a previous post, I looked at Gaussian process regression. Gaussian processes are powerful. They give you useful error bounds. However, the results often look like curve fitting, similar to fitting with splines, and I suspect that people have a difficult time interpreting the covariance kernel. Often what people want is a simple interpretation of the data. This is relatively easy if there appears to be a linear relationship among the data, or if the data can be transformed to be approximately linear. Just crank up your favorite linear or generalized linear modeling (GLM) software. The software will fit the model to the data and give you the appropriate coefficients. Since the models are simple, interpretation is easy (well, maybe relatively easy). If the relationships among the data are non-linear, there's software to fit non-linear models, assuming you can come up with an appropriate set of functions to fit the data. Suppose you don't have strong feeling ...