Data Platform Engineer

  • Company: Capital One
  • Location: Vienna, Virginia
  • Posted: August 20, 2016
  • Reference ID: R9400
Towers Crescent (12066), United States of America, Vienna, Virginia

Who We Are: Capital One is a technology company, a research laboratory, and a nationally recognized brand with over 65 million customers. We offer a broad spectrum of financial products and services to consumers, small businesses, and commercial clients - and data is at the center of everything we do. In December 2015, Capital One was named a Blue Ribbon Company by Fortune Magazine as one of only 25 companies in the world to make its top company lists four times in 2015 (Fortune’s 100 Best Companies to Work For, Global 500, Fortune 500, World’s Most Admired Companies). Capital One was also named a Top Workplace in the Greater Chicago area for 2015 by The Chicago Tribune, ranking 5th among large companies. Come learn more about the great opportunities we have to offer!

The Security and Technology Analytics team is on a mission to improve detection and prevention of cyber-attacks across all Capital One enterprise systems. We envision, create, deploy, and maintain critical security tools powered by streaming big data, state-of-the-art machine learning, micro-service architecture, and beautiful visualizations in the cloud. We are highly technical, with strong backgrounds in what we do.

As a member of the team, you’ll work to build and maintain a multi-PB production data lake running in the cloud. You will tune our cluster for optimal performance while balancing cost and resilience. You will efficiently leverage infrastructure provisioning tools such as Docker, Ansible, Mesos, Terraform, and CFT to automate deployment of cloud-based environments. You will build Big Data infrastructure and tenaciously keep it operational, both in the cloud and in Capital One datacenters, across platforms such as Hadoop and streaming. Responsibilities will include working with developers to build out CI/CD pipelines, enable self-service build tools, and create reusable deployment jobs.

Any given day you will:

  • Design end-to-end engineering solutions for business opportunities using existing on-premises or new cloud-based technology platforms
  • Tenaciously keep the Big Data infrastructure (Hadoop and peripheral tools) operational across various environments in datacenters & AWS Cloud.
  • Install and manage Cloudera, Hortonworks, and open source Hadoop distributions, along with related tools and tech stacks, in datacenter and AWS Cloud environments.
  • Deploy and manage AWS services such as EMR, Redshift, and DynamoDB based on applications’ needs
  • Lift and shift on-premises Big Data applications and tools to the AWS Cloud.
  • Work with the team to build the Big Data platform solutions for different business needs
  • Develop scripts and glue code to integrate multiple software components
  • Proactively monitor environments and drive troubleshooting and tuning
  • Demonstrate deep knowledge of OS, networking, Hadoop, and cloud technologies to troubleshoot issues
  • Evaluate and build compute frameworks for all technology tiers in the AWS Cloud.
  • Identify technical obstacles early and work closely with the team to find creative solutions, build prototypes, and develop deployment strategies and procedures
  • Build prototypes for open source technology solutions

Who you are:

  • You yearn to be part of cutting edge, high profile projects and are motivated by delivering world-class solutions on an aggressive schedule
  • You relish challenges, thrive even under pressure, are passionate about your craft, and are hyper-focused on delivering exceptional results
  • You love to learn new technologies and mentor junior engineers to raise the bar on your team
  • You have a passion for automating everything in order to release fast, frequently, and reliably.
  • You have experience deploying scalable, highly available, fault-tolerant platforms using AWS and third-party services (EMR, DynamoDB, Redshift, Cloudera, etc.)
  • Automator. You know how to script and automate anything and everything, from test to environment provisioning
  • You have a deep understanding of cloud technologies, their backbone structure, and the services offered, especially by AWS.
  • Curiosity. You ask why, you explore, you're not afraid to blurt out your crazy idea. You probably have a Bachelor's, Master's, or higher degree
  • Do-er. You have a bias toward action, you try things, and sometimes you fail. Expect to tell us what you’ve shipped and what’s flopped. We respect the hacker mentality
  • It would be awesome if you have a robust portfolio on GitHub and/or open source contributions you are proud to share


Basic Qualifications:

  • Bachelor’s Degree or Military Experience
  • At least 4 years of IT experience
  • At least 1 year of hands-on experience deploying and managing AWS-based applications
  • At least 3 years of experience providing enterprise Linux-based system administration
  • At least 2 years of experience working with code repositories and build tools such as Git, GitHub, Nexus, or Jenkins
  • At least 2 years of experience with automation tools such as Chef or Ansible
  • At least 1 year of experience with MySQL, Postgres, Elasticsearch, Docker, or Mesos
  • At least 1 year of experience in OO programming

Preferred Qualifications:

  • Bachelor’s degree in Computer Science or Information Technology
  • 3+ years of overall IT experience in an enterprise cloud environment 
  • 2+ years of experience administering and building big data technologies such as Hadoop, Spark, or EMR
  • 1+ year of experience in Metadata Hub (Ab Initio) or Data Quality (Informatica) or Collibra
  • 1+ year of experience with scripting in Python, Java, or R
  • 1+ year of experience with Docker, Mesos, Kubernetes, or ECS
  • 3+ years of experience designing, deploying, and administering enterprise-class Big Data clusters

Capital One will consider sponsoring a new qualified applicant for employment authorization for this position.