Data Platform Engineer

  • Company: Capital One
  • Location: Vienna, Virginia
  • Posted: January 11, 2017
  • Reference ID: R17289
Towers Crescent (12066), United States of America, Vienna, Virginia

Who We Are: Capital One is a technology company, a research laboratory, and a nationally recognized brand with over 65 million customers. We offer a broad spectrum of financial products and services to consumers, small businesses and commercial clients - and data is at the center of everything we do. In December 2015, Capital One was named a Blue Ribbon Company by Fortune Magazine as one of only 25 companies in the world to make their top company lists four times in 2015 (Fortune’s 100 Best Companies to Work For, Global 500, Fortune 500, World’s Most Admired Companies). Capital One was also named a Top Workplace in the Greater Chicago area for 2015 by The Chicago Tribune, ranking 5th among large companies. Come learn more about the great opportunities we have to offer!

The Security and Technology Analytics team is on a mission to improve detection and prevention of cyber-attacks across all Capital One enterprise systems. We envision, create, deploy, and maintain critical security tools powered by streaming big data, state-of-the-art machine learning, micro-service architecture, and beautiful visualizations in the cloud. We are highly technical with strong backgrounds in what we do.

As a member of the team, you’ll work to build and maintain a multi-PB production data lake running in the cloud. You will tune our cluster for optimal performance while balancing cost and resilience. You will efficiently leverage infrastructure provisioning tools such as Docker, Ansible, Mesos, Terraform, and CFT to automate deployment of cloud-based environments. You will build Big Data infrastructure in the cloud and in Capital One datacenters and tenaciously keep it operational across various platforms such as Hadoop and streaming. Responsibilities will include working with developers to build out CI/CD pipelines, enable self-service build tools, and create reusable deployment jobs.

On any given day you will:

Design end-to-end engineering solutions for business opportunities using existing on-premises or new cloud-based technology platforms

Tenaciously keep the Big Data infrastructure (Hadoop and peripheral tools) operational across various environments in datacenters & AWS Cloud.

Install and manage Cloudera, Hortonworks, and open source Hadoop, along with related tools and tech stacks, in datacenter and AWS Cloud environments.

Deploy and manage AWS services such as EMR, Redshift, and DynamoDB based on applications’ needs

Lift and shift on-premises Big Data applications and tools to AWS Cloud.

Work with the team to build Big Data platform solutions for different business needs

Develop scripts and glue code to integrate multiple software components

Proactively monitor environments and drive troubleshooting and tuning

Demonstrate deep knowledge of OS, networking, Hadoop, and cloud technologies to troubleshoot issues

Evaluate and build compute frameworks for all technology tiers in AWS Cloud.

Identify technical obstacles early and work closely with the team to find creative solutions, build prototypes, and develop deployment strategies and procedures

Build prototypes for open source technology solutions

Who you are:

You yearn to be part of cutting-edge, high-profile projects and are motivated by delivering world-class solutions on an aggressive schedule

You relish challenges, thrive even under pressure, are passionate about your craft, and are hyper-focused on delivering exceptional results

You love to learn new technologies and mentor junior engineers to raise the bar on your team

You have a passion for automating everything to release fast, frequently, and reliably.

You have experience deploying scalable, highly available, fault-tolerant platforms using AWS and third-party services (EMR, DynamoDB, Redshift, Cloudera, etc.)

Automator. You know how to script and automate anything and everything, from test to environment provisioning

You have a deep understanding of cloud technologies, their backbone structure, and the services offered, especially by AWS.

Curiosity. You ask why, you explore, you're not afraid to blurt out your crazy idea. You probably have a Bachelor’s, Master’s or higher degree

Do-er. You have a bias toward action, you try things, and sometimes you fail. Expect to tell us what you’ve shipped and what’s flopped. We respect the hacker mentality

It would be awesome if you have a robust portfolio on GitHub and / or open source contributions you are proud to share


Basic Qualifications:

Bachelor’s Degree or Military Experience

At least 4 years of IT experience

At least 1 year of hands-on experience deploying and managing AWS-based applications

At least 3 years of experience providing enterprise Linux-based system administration

At least 2 years of experience working with code repositories and Git, GitHub, Nexus, or Jenkins build tools

At least 2 years of experience with automation tools such as Chef or Ansible

At least 1 year of experience with MySQL, Postgres, Elasticsearch, Docker, or Mesos

At least 1 year of experience in object-oriented (OO) programming

Preferred Qualifications:

Bachelor’s degree in Computer Science or Information Technology

3+ years of overall IT experience in an enterprise cloud environment

2+ years of administration and build experience with big data technologies such as Hadoop, Spark, or EMR

1+ year of experience in Metadata Hub (Ab Initio) or Data Quality (Informatica) or Collibra

1+ year of experience with scripting, Python, Java, or R

1+ year of experience with Docker, Mesos, Kubernetes, or ECS

3+ years of experience in designing, deploying, and administering enterprise-class Big Data clusters

At this time, Capital One will not sponsor a new applicant for employment authorization for this position.
