Research
My full list of publications is available on ADS.
Galaxy / globular cluster scaling relations for low-mass galaxies
A remarkably tight scaling relation exists between galaxy masses and the number or combined mass of their globular cluster (GC) populations over many dex, pointing to a fundamental connection between the formation and evolution of galaxies and that of their most massive star clusters. This relation is least constrained for dwarf galaxies, and many of the lowest-mass galaxies lack GCs altogether. It is not well understood whether this is due to a fundamental difference in the formation and evolution of low-mass galaxies, or whether they are simply the natural low-mass end of the normal scaling relation, too small to form very massive star clusters. I study this open question using hurdle and zero-inflated count models to describe the GC populations of low-mass galaxies in the Local Group and nearby universe.
Comparing globular clusters to young clusters
Globular clusters are some of the oldest stellar systems in the universe. They are products of the denser environments of the early universe; star formation rarely occurs on such massive scales at z=0. We do not yet have a good understanding of the conditions of GC formation, and learning about them observationally is infeasible with current instruments (JWST is beginning to change this, but not on a large enough scale for a statistical sample). I compare the GC populations of nearby dwarf galaxies to the populations of young massive clusters (YMCs) of nearby star-forming dwarfs. These comparisons can contribute to our understanding of the differences in star cluster formation at different redshifts, providing insight into how, and perhaps even why, star formation on the largest scales is different today than it was in the early universe.
Statistical techniques
Bayesian methods
In most of my research, I use Bayesian techniques. I have developed Bayesian hierarchical models for my work on globular cluster populations, which offer flexibility and allow individual and population-level parameters to be estimated simultaneously. These models can also thoroughly incorporate uncertainties, and can be built to allow for subgroups (partial pooling and mixed models, e.g. different galaxy environments or types of galaxies). Through collaborations with statisticians working in other fields with observational data, I am adapting methods for handling biased and censored data to astronomical contexts.
I have also developed predictive model comparison techniques for Bayesian models. Bayesian statistics lacks the robust model comparison methods that frequentist statistics employs, but evaluating models is nonetheless crucial when using Bayesian methods. I advocate for predictive model evaluation and comparison tools. These compare data simulated from the fitted model, or summary quantities computed from that simulated data, to the observed data in order to reveal the strengths and weaknesses of the model.
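As a rough illustration of the idea, the sketch below runs a simple predictive check in Python. It is only a minimal example under stated assumptions, not the method used in my papers: it assumes a plain Poisson count model, uses the fraction of zeros as the comparison quantity, and the function names (`sample_poisson`, `zero_fraction`, `predictive_p_value`) and all input numbers are hypothetical placeholders.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) count with Knuth's method (fine for small rates)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def zero_fraction(counts):
    """Summary quantity: the fraction of entries that are exactly zero."""
    return sum(1 for c in counts if c == 0) / len(counts)


def predictive_p_value(observed, posterior_rates, statistic, rng):
    """Fraction of simulated datasets whose statistic is >= the observed one.

    Each posterior draw of the rate generates one replicated dataset of the
    same size as the observed data.
    """
    obs_stat = statistic(observed)
    exceed = 0
    for lam in posterior_rates:
        replicated = [sample_poisson(lam, rng) for _ in observed]
        if statistic(replicated) >= obs_stat:
            exceed += 1
    return exceed / len(posterior_rates)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy inputs for illustration only; these are not real measurements.
    toy_counts = [0, 0, 0, 0, 0, 0, 1, 2, 5, 9]
    # Stand-in for posterior draws of the Poisson rate from a fitted model.
    toy_rates = [1.5 + 0.01 * i for i in range(200)]
    print(predictive_p_value(toy_counts, toy_rates, zero_fraction, rng))
```

A value near 0 or 1 would suggest that the model rarely reproduces the chosen feature of the data (here, the number of zeros), which is the kind of weakness a predictive check is meant to expose.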
Zeros in data
Much of my research revolves around zeros in datasets, for example, the many low-mass galaxies that lack globular cluster populations. I develop and use statistical methods optimized for datasets that contain many zeros, namely hurdle models and zero-inflated count models. A hurdle model is a two-part model that simultaneously predicts the probability that a value will be zero or non-zero and the expected value if it is non-zero. A zero-inflated model is a type of count model that adds a separate zero-generating process to the base model, allowing for more zeros than would otherwise be predicted.
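The difference between the two model families can be made concrete with their probability mass functions. The sketch below assumes a Poisson base distribution purely for illustration (other count distributions, such as the negative binomial, are equally possible), and the function and parameter names are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Probability of observing count k under a Poisson(lam) base model."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def zero_inflated_poisson_pmf(k, lam, pi_zero):
    """Zero-inflated Poisson: with probability pi_zero the count is a
    'structural' zero; otherwise it comes from Poisson(lam), which can
    itself produce zeros."""
    base = (1.0 - pi_zero) * poisson_pmf(k, lam)
    return pi_zero + base if k == 0 else base


def hurdle_poisson_pmf(k, lam, p_zero):
    """Hurdle Poisson: all zeros come from the first part (probability
    p_zero); non-zero counts follow a Poisson(lam) truncated at zero."""
    if k == 0:
        return p_zero
    truncated = poisson_pmf(k, lam) / (1.0 - math.exp(-lam))
    return (1.0 - p_zero) * truncated
```

In the zero-inflated form, zeros arise from two sources (the extra zero-generating process and the base count model), so the probability of zero is always at least as large as under the base model alone. In the hurdle form, the zero probability is set entirely by the first part of the model and can be larger or smaller than the base model would predict.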
[undergraduate research] Characterizing the circumgalactic medium of the M31 system
The circumgalactic medium (CGM) is the diffuse, multiphase gas surrounding galaxies out to many virial radii. It is integral to large-scale feedback processes in galaxies, may be the site of a large portion of metals in star-forming galaxies, and could be part of the answer to the missing baryon problem. Despite its importance to galaxies and their evolution, it is poorly understood due to its diffuseness, and is unobservable in emission with conventional telescopes. Therefore, it is best studied through absorption, often using quasar sightlines. Project AMIGA (Absorption Maps in the Gas of Andromeda) is the first program to characterize absorption from gas in the CGM of a single galaxy, using 43 quasar sightlines within 2 virial radii of M31. In the collaboration, I worked to distinguish between the CGM components of M31’s satellite galaxies and that of the central galaxy, and to characterize the diffuse gas of dwarfs to learn about the effects of galaxy interactions on CGM gas.
If you'd like to get in touch, my email address is sam.berek[at]mail.utoronto.ca