
A very good entry point to contributing to scikit-learn

You can best view and apply for this job under: https://join.probabl.ai/jobs/3615272-open-source-software-engineer-documentation-ux

 

Overview
As the open source team at Probabl, we help maintain a range of libraries, such as scikit-learn, skrub, skops, and fairlearn. Almost all the projects we develop use ReST as their documentation source format, and use sphinx and sphinx-gallery to generate the HTML pages of the API docs, user guides, and examples.

We also work on creating interactive documentation using WASM-based technologies such as pyodide and jupyterlite. Occasionally we contribute upstream to these projects to further the effort of running Python code in the browser.

This role is about helping with all of the above: writing documentation, which requires a fair understanding of the libraries, and improving the UX, which might involve working on the website generation machinery. While working on the documentation, you will often end up working on the implementation itself as well.

Qualifications
There are a few areas which contribute to succeeding in this role:

Machine learning and technical writing: having an understanding of machine learning concepts and being good at technical writing helps with producing good documentation for the users of these libraries.
sphinx, HTML, CSS, JS: understanding how the websites are generated, and being able to tune and modify their look and feel, is very helpful in this role.
UX design: an affinity for graphical UX design, used to improve the documentation and usability of the libraries mentioned above, would be helpful. For instance, you might design a graphical representation of estimator.get_params(), in the same way that we have an HTML repr of the estimators (example output).
pyodide, jupyterlite, and other WASM-based frameworks: familiarity with these helps introduce interactivity to the documentation.
Beyond the general areas above, you need the following skills on a day-to-day basis:

Python: while this role is not heavy on the programming side to start with, it’s useful to be familiar with the concepts used in our projects in order to contribute to them. This includes object-oriented programming, writing tests (pytest), and understanding continuous integration (CI).
Collaborative development: development happens directly on the open source libraries, and you’d need to be comfortable iterating on your solutions with reviewers, some of whom do not work at Probabl. This requires strong communication skills.
git tools: git and GitHub (or similar collaborative platforms); previous experience is a big plus.
We don’t expect you to have all the above qualifications to start with. If you think you have competency in certain areas and are excited to learn the rest, we’re happy to hear from you.

Duties and Responsibilities
You will work on advancing the main goal explained above. This includes working on existing related issues, maintaining existing tools, and helping develop the relevant parts of the related libraries, with a focus on fixing and improving the documentation.

You will start by working on existing issues with our guidance, and as you grow in the role, you will be able to better prioritize and foresee future needs.

You will also grow into triaging related issues and engaging with and supporting the community in this area.

Way of Working and Culture
The open source team is distributed, but we’d like people who are starting out to be near one of our hubs (Paris and/or Saclay, France, and Berlin, Germany) so that we can have regular work-together days, typically about twice a week. You will be provided with access to an office or a coworking space.

Working hours are very flexible. You’re not required to be online outside working hours, but you can compensate for lost time if you need breaks during the day due to personal constraints.

We have regular on-site days where most of us meet, so some traveling once every month or two is required.

We have a no-silly-questions culture and you’re encouraged to ask all questions you might have. We understand the learning curve at the beginning for such a role can be quite steep.

We strive to create a diverse team whose members work and communicate well together and each bring a unique perspective to the table. If you feel you’re underrepresented in tech and machine learning, we especially want to hear from you.

You will be mostly working with Olivier Grisel, Guillaume Lemaitre and Adrin Jalali to get onboarded and start on your journey.

Your Application
Your cover letter should tell us which skills / past experiences of yours can contribute to this role. We would also like to hear about why you’re considering Probabl as a place to work.

Programming question:

Using sklearn.inspection.DecisionBoundaryDisplay, plot the decision boundaries of a sklearn.ensemble.HistGradientBoostingClassifier on a dataset created using sklearn.datasets.make_moons. Then explain what is happening in the notebook, assuming you’re teaching a beginner.

Put the resulting notebook in a GitHub repository, and leave the link to the notebook file in the “Answer to the programming question” field of the application form.

To apply for this job please visit join.probabl.ai.