Stan Core Tools & Infrastructure Engineer

Columbia University / Andrew Gelman / Stan

state-of-the-art platform for statistical modeling

Stan (http://mc-stan.org) is an open-source probabilistic programming language and Bayesian inference toolkit that data scientists and applied statisticians around the world, in many fields, use to specify statistical models and fit them to data. Stan emerged from Andrew Gelman's research group at Columbia, which focuses on state-of-the-art Bayesian inference methodology and consults on a variety of statistical problems in the social, biological, and physical sciences, engineering, sports, and business. Since its initial release, we’ve welcomed collaborators and contributors from across the globe, and we have about 30 active contributors at the time of writing.

We’re looking for a tools and infrastructure developer who is interested in science and open source, is a self-starter and intrinsically motivated to help out, wants a flexible lifestyle, and would like to learn Bayesian stats and data analysis. Ideally you’d come in knowing a decent amount about either systems administration or programming (or both) and we would teach you about statistics. We’re looking for help with the following problems initially:

  • Streamlining installation for our R and Python interfaces for scientists on Windows and Mac, perhaps creating installers for each platform
  • Maintaining and improving our continuous integration infrastructure (currently Jenkins, Travis, AWS)

We expect that you will need to spend a significant portion of your time on at least some of these problems at first. Beyond that, there is much work to be done on Stan as an ecosystem. For example, we could also use help with the following key underserved areas of the project:

  • Performance benchmarks/tests and profiling-based improvements
  • Various refactorings, mostly in C++
  • Pedagogical materials
  • GPU and distributed computation support
  • Bringing the PyStan interface up to feature parity with RStan and improving both of them (Python)
  • Higher order autodiff test framework and infrastructure (C++)

There are also many other ways to improve Stan that could use helping hands:

  • Stan 3.0 language and/or compiler rewrite
  • CloudStan
  • Graphical modeling language transpiler
  • New algorithm research implementations
  • Anything from the list here: https://github.com/stan-dev/stan/wiki/Longer-Term-To-Do-List or the road map here: https://github.com/stan-dev/stan/wiki/Stan-Road-Map

Links to some Stan team outputs and other Stan-related resources:

  • https://github.com/stan-dev
  • andrewgelman.com
  • http://mc-stan.org/users/documentation/
  • https://github.com/stan-dev/stancon_talks
  • A great introductory book on Bayesian analysis: http://xcelab.net/rm/statistical-rethinking/

To apply, email your details to Sean Talts at sean.talts@gmail.com; his contact info is also listed here: mc-stan.org/about/team/