About the Field Data team
As we scale to reach millions of people experiencing extreme poverty, we believe that data can transform cash delivery by:
- Improving recipient experience
- Increasing operational efficiency
- Mitigating risk
- Conveying our impact to donors
To realize this vision, we are building a best-in-class data team focused on cash delivery: operations, payments, and recipient safeguarding.
You will be the Field Data team’s first Data Scientist, reporting to the Senior Data Manager and working closely with the Senior Data Architect. This is an opportunity to join at the inception of the team and guide foundational decisions regarding vision, priorities, and best practices. We are a small (but growing) group with a startup mentality, applying engineering best practices and a product development approach to data.
Our data infrastructure is built on AWS and Databricks, with dashboards and visualizations in Tableau. We primarily use SQL, Python, and R.
About this role
You will deliver insights and decision-making tools that shape cash delivery by informing which projects we take on, how we design them, and how we implement them.
- Statistical analysis and predictive modeling. Own the process end to end: implement statistical methods and build, train, and deploy predictive models in code (e.g., Python, R) to solve problems in cash delivery. Current priorities:
  - Targeting. Implement and improve existing machine learning algorithms that predict poverty levels from call-detail records (CDR) and geospatial data. This work enables us to remotely identify eligible recipients at scale and has been highlighted in WIRED, with findings published in Nature by our research partners.
  - Risk prediction. Develop new risk prediction models that help us design programs and target recipient outreach to keep recipients and communities safe.
- Data quality assurance. Work with the Senior Data Architect to design processes that ensure key metrics, models, and analyses are built on high-quality data.
- Communication. Collaborate with leaders throughout the organization to define data product requirements, educate them about solutions, and make recommendations.
This role has a high degree of autonomy and the ability to shape our process from the ground up by determining best practices and informing the data product roadmap.
This role is fully remote, but your working hours must overlap with an East Africa time zone by at least 4 hours.
Reports to: Senior Manager, Data
You are an analytical problem solver who knows how to:
- Frame a testable question from ambiguous requirements
- Gather, clean, and manipulate the required data
- Deliver answers that directly inform important decisions
You deliver results that impact real-world outcomes by:
- Solving for user needs and constraints
- Understanding when to build complex models and when simpler solutions are more appropriate
- Communicating takeaways clearly
You are intellectually curious and humble. You think probabilistically and measure the uncertainty of your predictions.
You are a mid-level to senior Data Scientist with the following qualifications:
- 5+ years of experience working with data querying languages (e.g., SQL) and statistical programming languages (e.g., Python, R)
- 5+ years of experience as a Data Scientist or in a related position applying statistical and/or machine learning (ML) techniques, with a deep understanding of the key parameters that affect their performance
- Deep understanding of ML algorithms – ability to apply the appropriate methods, as well as benchmark and diagnose predictions to rapidly improve performance
- Proficiency in Spark, Dask, and/or other packages for distributed computing
- Ability to travel and deploy to international, on-site locations for up to 1 month at a time
Preferred but not required qualifications:
- Bachelor’s degree in a STEM or other quantitative field
Helpful qualifications (we do not expect you to have all of these):
- Advanced degree in a STEM or other quantitative field
- Experience analyzing call-detail records, geospatial (e.g., ArcGIS or Python geospatial stack), and/or social network data
- Experience developing an end-to-end machine learning pipeline with a scalable ML framework (TensorFlow, Torch, H2O, Spark MLlib, scikit-learn)
- Familiarity with cloud-based solutions (AWS, GCP, etc.) for storing and processing large, terabyte-scale datasets
- Familiarity with poverty measurement methodologies – e.g., geographic poverty, poverty proxies, common administrative data sources, household survey data
- Experience building dashboards and visualizations, preferably in Tableau
- Experience mentoring and developing teams of Data Scientists
Level and Compensation
At GiveDirectly, we strive to pay our employees generously and equitably. We use an accredited third-party salary aggregator to ensure that staff’s total compensation package (base compensation + bonus) falls within the 75th percentile of similar roles at similar organizations. We also have a no-negotiation policy to ensure we are paying staff equitably across roles.
- The United States base salary for this role is $130,000 with a 10% bonus index.
- The Kenya base salary for this role is $72,000 with a 10% bonus index.
- For non-US and non-Kenya candidates, the salary for this role will be adjusted based on cost of living and the local labor market. A benchmark for your location will be shared during the interview process.
For exceptionally experienced candidates, we are open to considering a Senior Data Scientist role for this posting. Both positions will start as individual contributors, with the potential to manage if the team grows.
To apply for this job, please visit boards.greenhouse.io.