Data Engineer, Data Acquisition Team


Applecart deploys proprietary technology to run smarter advertising campaigns. We work with some of the nation’s most prominent corporations, non-profit organizations and political candidates to activate and communicate with key target audiences at a scale and level of efficacy previously thought impossible.

Our core offering is a proprietary social graph that leverages publicly-available data to map real-world relationships between individuals at national scale. Our roots are in politics, where we have tested and honed our methods at every level, to give our clients a proven technological edge. We’re branching out beyond political campaigns to tackle new advertising challenges in which determining “who knows whom” provides decisive advantages for our clients.

Applecart’s political work has been featured by The Colbert Report, CNN, The Washington Post, The Associated Press, USA Today and The Huffington Post, among other prominent news outlets.

As a Data Engineer on our Data Acquisition team, you will be responsible for designing, building and maintaining data feeds that plug into our social graph. At Applecart, we live and die on the coverage and quality of our data. Your work will directly affect our clients by shaping election outcomes, increasing political and non-profit fundraising yields, and optimizing advertising spend and risk assessments.

 

Responsibilities:

  • Collaborate on estimates, database schemas, and architectural decisions for upcoming data initiatives.
  • Design, develop and deploy ETL pipelines and web scraping projects that grow the number of nodes, attributes and edges in the social graph
  • Work with large quantities of structured and unstructured data of varying integrity
  • Implement monitoring mechanisms for both streaming and batch jobs to ensure 99.999% uptime and to track data quality
  • Design and build the infrastructure to support the scheduling and execution of jobs that crawl, ingest, and structure terabytes of data
  • Interact cross-functionally with a wide variety of people and teams, including the graph team, modeling team and overseas contractors, in order to identify and implement data initiatives.
  • Promote strong engineering principles including writing DRY code, test-driven development (unit, functional, and end-to-end testing), code reviews, and continuous deployment

 

Basic Qualifications:

  • BS or MS degree in Computer Science, Math, Statistics or other technical field
  • 2-3+ years of applied software engineering experience (especially in startups, finance or adtech)
  • Python Expertise: classes & inheritance, map & filter functions, list comprehensions, generators, decorators, style guides, pylint, pytest
  • Scraping / ETL Expertise: XPath, exponential back-off, API querying, and popular parsing frameworks like BeautifulSoup, lxml or Scrapy
  • Experience wrangling messy data at scale using libraries like Pandas or technologies like Scala / PySpark to structure and clean datasets
  • Experience working in teams – productionizing and deploying code in a cloud environment (preferably AWS using services like RDS, S3, EC2, ECS, Data Pipeline)
  • Experience with various storage technologies such as S3, Redis, MySQL, PostgreSQL and Redshift and storage formats such as JSON, CSV, XML, HTML
  • Self-driven and results-oriented individual with the aptitude to tackle challenging problems while adhering to best practices

 

Preferred Qualifications:

  • Experience with parallel or distributed computing using technologies like EMR / Docker / AWS Lambda
  • Experience with workflow management platforms like Airflow or Luigi
  • Experience working with streaming data using technologies like AWS Kinesis, RabbitMQ or Kafka
  • Significant interest or background in politics, advertising or big data

To apply for this job please visit applecart.recruiterbox.com.