Statisticians use experience and heuristics to figure out which model provides the "best" data analysis. This process sometimes uses a priori knowledge of the data.
Question: model selection is affected by the amount of available data, which typically grows over time. How should that be handled? Answer (underwhelming): carefully apply model selection.
Additional detailed statistics questions.
Dogs, non-dogs and statistics: (Bayesian) searches in cosmology (Roberto Trotta, Imperial College London, Cosmology)
Imperial Centre for Inference and Cosmology, Bayesian work (as opposed to frequentist)
Background:
Cosmological concordance model: inflation, dark matter, dark energy
Model assumptions: Isotropy and homogeneity, Approx. Gaussianity of CMB fluctuations, Adiabaticity
Data set: WMAP7
The power spectrum contains the full statistical information IF fluctuations are Gaussian.
Baryon Acoustic Oscillations (BAO) - derived from WMAP data.
Yields a relation between redshift and the acoustic scale.
This yields constraints on total matter and dark energy.
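The sufficiency claim above can be pictured with a small sketch: for Gaussian fluctuations, all the statistical information lives in the power spectrum. Below, a periodogram estimate for white Gaussian noise (an illustrative stand-in for CMB maps; all names are mine, not from the talk):

```python
# Hedged sketch: for a Gaussian random field the power spectrum is a
# sufficient statistic. Estimate the 1-D spectrum of simulated Gaussian
# "fluctuations" via the periodogram.
import numpy as np

rng = np.random.default_rng(0)
n = 1024
signal = rng.normal(size=n)          # white Gaussian "fluctuations"

fourier = np.fft.rfft(signal)
power = (np.abs(fourier) ** 2) / n   # periodogram estimate of the spectrum

# For white Gaussian noise the spectrum is flat: mean power ~ sample variance
print(power.mean())
```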
Discussed ways to deviate from the vanilla model:
Start with Non-Gaussianity from inflation.
Bispectrum, wavelets, skewness, kurtosis, genus statistics, Minkowski functionals, needlets - higher-order statistics used to detect non-Gaussianity from inflation.
Search for non-Gaussianity, non-trivial topology (light from distant objects can reach us along multiple paths)
Searches - use the WMAP data, apply these different models, plot the results and compare with vanilla model.
Generic Departures from the LCDM
Search for deviations from the Concordance Model
Number of neutrinos species and statistical isotropy - unclear if these two are significant deviations from the model.
Principled Bayesian model selection:
Level 1: Select model M and prior P(θ|M) -> Parameter inference
Level 2: Compare several possible models
Level 3: Model averaging (none of the models is clearly the best)
Bayesian stats lets you get P(M|d) from P(d|M)
Bayesian evidence balances quality of fit vs extra model complexity.
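A hedged toy illustration of the evidence point (coin flips, not cosmology; all numbers are mine): the evidence P(d|M) integrates the likelihood over the prior, so the more flexible model pays an automatic Occam penalty.

```python
# Toy Bayesian model comparison. M0: fair coin (no free parameters).
# M1: unknown bias p with uniform prior. The evidence for M1 averages the
# likelihood over the prior, penalizing the extra parameter.
from math import factorial

n, k = 20, 12   # 12 heads in 20 flips (illustrative data, not from the talk)

# M0 (fair coin): likelihood of this particular sequence
evidence_m0 = 0.5 ** n

# M1 (uniform prior on p): the integral of p^k (1-p)^(n-k) dp is the
# Beta function B(k+1, n-k+1) = k! (n-k)! / (n+1)!
evidence_m1 = factorial(k) * factorial(n - k) / factorial(n + 1)

bayes_factor = evidence_m0 / evidence_m1   # B01 > 1 favors the simpler model
print(f"B01 = {bayes_factor:.2f}")
```

Here the data are close enough to fair that the simpler model wins despite M1's ability to fit better at its best-fit p.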
Jaynes - "there is no point in rejecting a model unless one has a better alternative"
Showed source detection using Bayesian reconstruction - 7/8 objects correctly identified. Mistake happens b/c two objects are very close.
Showed cluster detection from the Sunyaev-Zel'dovich effect in CMB maps using Bayesian model selection.
Mentioned MultiNest (Feroz and Hobson, 2007)
"Many anomalies/unexpected deviations go away with better data/modeling/insight. Is this evidence the community jumps too soon b/c of statistical flukes?"
Great question - What is the scientific conclusion when Bayesian and Frequentist approaches disagree?
Recent developments and current challenges in statistics for particle physics (Kyle Cranmer)
Center for Cosmology and Particle Physics
LHC
This talk attempts to explain what LHC particle physicists do, to see if statisticians can help out...
Showed Lagrangian of the Standard (Matter) Model
energy of configuration of matter fields
energy allows for predicting how configuration will evolve with time
Feynman diagrams / Quantum Field Theory
Method
QFT, Monte Carlo/Perturbation Theory, Feynman Diagrams
Simulate particle interactions
Run algorithms on sim data to detect particular particles
10^14 collisions, looking for a few interesting interactions (e.g. Higgs)
20 dimensions of 'nuisance parameters' over several different channels of data.
RooFit, RooStats - toolkit for determining expected values of some physical process.
Statistical Analysis
Primarily frequentist, some Bayesian but there is a general dislike for assigning prior probabilities to theoretical particles not yet observed.
"Bayesian probability allows prior knowledge and logic to be applied to uncertain statements. Theres another interpretation called frequency probability, which only draws conclusions from data and doesnt allow for logic and prior knowledge." (ML in Action)
Inverse problems in X-ray scattering (Stefano Marchesini, Lawrence Berkeley National Laboratory, Photon Science)
Complex likelihood functions: global fit in Supersymmetry
Bayesian v Frequentist statistics
The parametric bootstrap and particle physics (Luc Demortier, Rockefeller University, Particle Physics)
Started out as a nice, organized talk promising a beginning, a middle and an end. Lost it halfway through... Bataan death march.
Bootstrap provides a tractable numerical approach that approximates exact methods.
Bootstrap is a frequentist methodology.
Two basic ideas:
Plug-in (or substitution) principle
Re-sampling - generate toy data samples that are statistically equivalent to the observed data samples. Two possibilities:
Parametric re-sampling
Non-parametric re-sampling
Main uses:
Bias reduction
variance estimation
Confidence interval construction
Hypothesis testing
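The two basic ideas combine into something like the following sketch (a parametric bootstrap percentile interval for a Gaussian mean; all numbers are illustrative, not from the talk):

```python
# Hedged sketch of the parametric bootstrap: fit a model to the observed data
# (plug-in principle), then resample toy datasets from the fitted model to
# approximate the sampling distribution of an estimator.
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(loc=5.0, scale=2.0, size=200)   # "observed" sample

# Plug-in: estimate the model parameters from the data
mu_hat, sigma_hat = data.mean(), data.std(ddof=1)

# Parametric re-sampling: draw toy samples from N(mu_hat, sigma_hat)
boot_means = np.array([
    rng.normal(mu_hat, sigma_hat, size=data.size).mean()
    for _ in range(2000)
])

# Percentile confidence interval for the mean
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"95% CI for the mean: ({lo:.2f}, {hi:.2f})")
```

Non-parametric re-sampling would instead draw with replacement from `data` itself, skipping the fitted model.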
When can we be sure bootstrap estimator converges to the true values as the sample size increases?
Not when the parameter sits on a boundary of its range
Not for extremes (e.g. the sample maximum)
Gave a confidence interval example from particle physics.
Bootstrap Confidence Interval lessons
Use a pivot or an approx pivot whenever possible (A pivot is a function of both data and parameters, whose distribution does not depend on any unknowns.).
Test inversion seems to improve performance of bootstrap.
Two developments in tests for discovery: use of weighted Monte Carlo events and an improved measure (Glen Cowan, Royal Holloway, University of London, Particle Physics)
Super fast, hardcore statistics talk.
I could not follow the contours of the swamp, but I did learn a new word.
Panel Session The Development and Use of Public Databases: It's Complicated! (Ioannidis, Scargle, Cartaro, Skinner)
Effects of extraneous noise in Cryptotomography (reconstructions from random, unoriented tomograms) (Duane Loh (grad student), PULSE Institute - SLAC National Laboratory, Photon Science)
Statistics in nano-particle (3D) imaging using x-rays.
Photon counts from diffraction patterns - 100's of photons per particle imaged.
Particles can't be held, continual x-ray bombardment destroys particle, don't know orientation of the particle (hence the crypto part).
Signal averaging, phase retrieval on random, noisy tomograms.
How to proceed:
Look for fixed points. Replicate in 8 orientations and compare to data stream. Waved hands on stats used to determine compatibility between data and model.
Ab initio reconstruction - expectation maximization. Uh...
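The expectation-maximization step can be pictured with a minimal stand-in: EM for a two-component 1-D Gaussian mixture (fixed unit variances and equal weights are simplifying assumptions of mine; the reconstruction EM in the talk is far more involved).

```python
# Hedged EM sketch: alternate between computing component responsibilities
# (E-step) and updating the means as responsibility-weighted averages (M-step).
import numpy as np

rng = np.random.default_rng(5)
data = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 1, 300)])

mu = np.array([-1.0, 1.0])                 # initial guesses for the means
for _ in range(50):
    # E-step: responsibility of each component for each point
    logp = -0.5 * (data[:, None] - mu[None, :]) ** 2
    resp = np.exp(logp)
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: update the means as responsibility-weighted averages
    mu = (resp * data[:, None]).sum(axis=0) / resp.sum(axis=0)

print(np.sort(mu))                          # ≈ [-2, 3]
```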
Talked about how he might deal with extraneous noise (i.e. signal averaging). Rushed b/c running out of time.
Combine orientations and look for statistically improbable representative values.
Novice speaker.
Filtering femtosecond x-ray scattering data using Singular Value Decomposition (Mariano Trigo, SLAC, Photon Science)
Interested in imaging ultra-fast x-ray diffraction.
Self-Amplified Spontaneous Emission (SASE) - electron bunches emit synchrotron radiation that is then self-amplified, producing coherent x-ray pulses.
Scattering experiment that uses SVD to manipulate the data in matrix format to reduce dimensionality and obtain higher signal to noise.
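A minimal sketch of the SVD filtering idea, assuming time-resolved frames stacked as rows of a matrix and a low-rank underlying signal (shapes and names are illustrative, not from the talk):

```python
# Hedged sketch of SVD-based filtering: decompose the data matrix, keep only
# the leading singular component(s), and discard the rest as noise.
import numpy as np

rng = np.random.default_rng(2)
t = np.linspace(0, 1, 100)
q = np.linspace(0, 1, 80)
clean = np.outer(np.sin(2 * np.pi * t), np.cos(2 * np.pi * q))  # rank-1 signal
noisy = clean + 0.3 * rng.normal(size=clean.shape)

U, s, Vt = np.linalg.svd(noisy, full_matrices=False)

rank = 1                                   # keep only the dominant component
filtered = U[:, :rank] * s[:rank] @ Vt[:rank]

err_noisy = np.linalg.norm(noisy - clean)
err_filtered = np.linalg.norm(filtered - clean)
print(err_filtered < err_noisy)            # truncation raises signal-to-noise
```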
Machine Science: Distilling Natural Laws from Experimental Data, from nuclear physics to computational biology (Automating discovery of invariants) (Lipson, Hod - Cornell University, Statistical)
Machine looks at physical system and produces mathematical, symbolic model.
Example - double pendulum model produced Hamiltonian invariant.
Motivated by making better robots - learning from the environment with as few experiments as possible.
Started with Genetic Algorithms to create robots that move as fast as possible (simulated) that also worked in reality.
Can more complex robots learn using evolutionary techniques?
Also evolving control programs to make robots behave in certain ways.
Limited success with programs controlling robot movement. Robots moved but in lame ways.
"Simulator" -> evolve robots -> build it -> collect sensor data -> evolve simulator (co-evolution)
Emergent model - robot figures out what it is, i.e. how it is configured
Then figures out how to move
Ideal of self modeling applied to more abstract applications
candidate models -> candidate tests -> perturbations -> new candidate models, choosing tests that maximize disagreement between the models' predictions
Symbolic regression (e.g. what function describes a data set). Disadvantages: computationally expensive and overfits the data.
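A brute-force caricature of symbolic regression (the system described uses genetic programming over expression trees; this only illustrates the idea of scoring symbolic models against data, with a candidate set I made up):

```python
# Hedged sketch: score a handful of candidate symbolic expressions against
# noisy data and pick the one with the lowest mean squared error.
import numpy as np

rng = np.random.default_rng(4)
x = np.linspace(-2, 2, 50)
y = x**2 + 0.05 * rng.normal(size=x.size)      # hidden law: y = x^2

candidates = {
    "x": lambda x: x,
    "x^2": lambda x: x**2,
    "sin(x)": lambda x: np.sin(x),
    "exp(x)": lambda x: np.exp(x),
}
scores = {name: np.mean((f(x) - y) ** 2) for name, f in candidates.items()}
best = min(scores, key=scores.get)
print(f"best model: {best}")                   # recovers y = x^2
```

A genetic-programming version would mutate and recombine expression trees instead of enumerating a fixed list, which is where the cost and overfitting risks come from.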
Combined this with co-evolution using sub-sets of data to maximize disagreement between models.
Started to apply this to inferring equations from data (e.g. used bio system equations to produce data, fed data to system, system returns equations)
Then took high speed photos of flapping wing, derived representative parameters (e.g. lift, wing size), fed data to system and system reproduced several models, some known, some new.
Epic fail on the single-cell domain. Added a time-delay building block and removed sin/cos building blocks. Produced a more elegant model than the human-produced equations.
Looking For Invariants
We can fit models to data but what do models mean?
Pendulum example... energy is constant; the Hamiltonian makes it possible to predict system behavior. Can this be applied to the evolved models - what is constant in the system?
Started using ratio of derivatives to find non-trivial invariants.
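One hedged way to picture the invariant search (not the speaker's derivative-ratio method itself; a simplified stand-in): simulate a pendulum and score candidate expressions by how constant they stay along the trajectory.

```python
# Hedged sketch: for a frictionless pendulum with g/l = 1, the energy-like
# quantity omega**2 / 2 - cos(theta) is conserved; a non-invariant like
# omega**2 is not. Constancy along the trajectory separates them.
import numpy as np

def simulate_pendulum(theta0=1.0, omega0=0.0, dt=1e-3, steps=20000):
    """Integrate theta'' = -sin(theta) with a kick-drift-kick leapfrog."""
    theta, omega = theta0, omega0
    thetas, omegas = [], []
    for _ in range(steps):
        omega += -np.sin(theta) * dt / 2      # half kick
        theta += omega * dt                   # drift
        omega += -np.sin(theta) * dt / 2      # half kick
        thetas.append(theta)
        omegas.append(omega)
    return np.array(thetas), np.array(omegas)

theta, omega = simulate_pendulum()

candidates = {
    "energy": omega**2 / 2 - np.cos(theta),   # true invariant
    "omega^2": omega**2,                      # not invariant
}
for name, values in candidates.items():
    spread = values.std() / (abs(values.mean()) + 1e-12)
    print(f"{name}: relative spread {spread:.2e}")
```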
Published code: Eureqa - data mining tool: you give it data and it spits out models.
How does it handle noise?
Great talk.
Molecular structure from protein soup: assembling 3-D images of weakly scattering objects (Anton Barty)
multivariate correlations - clustering combined with a reduction of the statistical dimensionality
e.g. color - g-r, u-g, [3.6]-r parameter space (optical)
e.g. quasar selection in color parameter space, found by identifying outliers in clustering parameter spaces
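A hedged sketch of outlier-based selection in a 2-D color space (entirely mock data and made-up thresholds; real selections use more bands and careful calibration):

```python
# Flag outliers in a 2-D color space by their distance from the bulk of the
# points, mimicking "quasars as outliers from the stellar locus".
import numpy as np

rng = np.random.default_rng(3)
# Mock "stellar locus": a tight cluster in (g-r, u-g) color space
stars = rng.normal(loc=[0.5, 1.2], scale=0.1, size=(500, 2))
# Mock quasars: a few objects offset from the locus
quasars = rng.normal(loc=[0.0, 0.3], scale=0.1, size=(5, 2))
colors = np.vstack([stars, quasars])

center = np.median(colors, axis=0)                 # robust cluster center
dist = np.linalg.norm(colors - center, axis=1)
# Robust threshold: median distance plus 5x the median absolute deviation
cut = np.median(dist) + 5 * np.median(np.abs(dist - np.median(dist)))

outliers = np.flatnonzero(dist > cut)
print(f"{outliers.size} outlier candidates")
```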
Thoughts/Challenges
Data mining is statistics expressed as algorithms
Scalability issues
Maybe learn to live with an incomplete analysis
Clusters are seldom Gaussian - beware of the assumptions
Can we develop some entropy-like criterion to isolate portions or projections of hyper-dimensional parameter spaces where "something non-random may be going on?"
Need to account for heteroskedastic errors
Visualization in >>3 dimensions is a huge problem
Cosmic Cinematography (Astronomy in Time Domain)
Synoptic surveys + time domain
Catalina Real-Time Transient Survey (public data policy)
Classification should take humans out of the loop b/c data sizes are going to get huge.
Talked about classifier for variable stars
Currently likes 2D Light Curve Priors (Lead: B. Moghaddam)
Marvin Weinstein's dynamic clustering algorithm - finds clusters in very high-dimensional spaces.