Data Platform Engineer
Who We Are: Capital One is a technology company, a research laboratory, and a nationally recognized brand with over 65 million customers. We offer a broad spectrum of financial products and services to consumers, small businesses, and commercial clients - and data is at the center of everything we do. In December 2015, Capital One was named a Blue Ribbon Company by Fortune Magazine as one of only 25 companies in the world to make its top company lists four times in 2015 (Fortune’s 100 Best Companies to Work For, Global 500, Fortune 500, World’s Most Admired Companies). Capital One was also named a Top Workplace in the Greater Chicago area for 2015 by The Chicago Tribune, ranking 5th among large companies. Come learn more about the great opportunities we have to offer!
The Security and Technology Analytics team is on a mission to improve detection and prevention of cyber-attacks across Capital One's entire enterprise systems. We envision, create, deploy, and maintain critical security tools powered by streaming big data, state-of-the-art machine learning, microservice architecture, and beautiful visualizations in the cloud. We are a highly technical team with strong backgrounds in what we do.
As a member of the team, you’ll work to build and maintain a multi-PB production data lake running in the cloud. You will tune our clusters for optimal performance while balancing cost and resilience. You will efficiently leverage infrastructure provisioning tools such as Docker, Ansible, Mesos, Terraform, and CFT to automate deployment of cloud-based environments. You will build and tenaciously keep Big Data infrastructure operational in the cloud and in Capital One datacenters, across platforms such as Hadoop and streaming. Responsibilities will include working with developers to build out CI/CD pipelines, enable self-service build tools, and create reusable deployment jobs.
Any given day you will:
Design end-to-end engineering solutions for business opportunities using existing on-premises or new Cloud-based technology platforms
Tenaciously keep the Big Data infrastructure (Hadoop and peripheral tools) operational across various environments in datacenters & AWS Cloud.
Install and manage Cloudera, Hortonworks, and open-source Hadoop distributions, along with the associated tools and tech stack, in datacenter and AWS Cloud environments.
Deploy and manage applications built on AWS services such as EMR, Redshift, and DynamoDB
Lift and shift on-premises Big Data applications and tools to the AWS Cloud.
Work with the team to build the Big Data platform solutions for different business needs
Develop scripts and glue code to integrate multiple software components
Proactively monitor environments and drive troubleshooting and tuning
Demonstrate deep knowledge in OS, networking, Hadoop & Cloud technologies to troubleshoot issues
Evaluate and build compute frameworks for all tiers of the technology stack in the AWS Cloud.
Identify technical obstacles early and work closely with the team to find creative solutions, build prototypes, and develop deployment strategies and procedures
Build prototypes for open source technology solutions
Who you are:
You yearn to be part of cutting edge, high profile projects and are motivated by delivering world-class solutions on an aggressive schedule
You relish challenges, thrive even under pressure, are passionate about your craft, and are hyper-focused on delivering exceptional results
You love to learn new technologies and mentor junior engineers to raise the bar on your team
You are passionate about automating everything so you can release fast, frequently, and reliably.
You have experience deploying scalable, highly available, fault-tolerant platforms using AWS and third-party services (EMR, DynamoDB, Redshift, Cloudera, etc.)
Automator. You know how to script and automate anything and everything, from testing to environment provisioning
You have a deep understanding of Cloud technologies, their backbone architecture, and the services offered, especially by AWS.
Curiosity. You ask why, you explore, you're not afraid to blurt out your crazy idea. You probably have a Bachelor's, Master's, or higher degree
Do-er. You have a bias toward action, you try things, and sometimes you fail. Expect to tell us what you’ve shipped and what’s flopped. We respect the hacker mentality
It would be awesome if you have a robust portfolio on GitHub and/or open source contributions you are proud to share
Basic Qualifications:
Bachelor’s Degree or Military Experience
At least 4 years of IT experience
At least 1 year of hands-on experience deploying and managing AWS-based applications
At least 3 years of experience providing enterprise Linux based system administration
At least 2 years of experience working with code repositories and build tools such as Git, GitHub, Nexus, or Jenkins
At least 2 years of experience with automation tools such as Chef or Ansible
At least 1 year of experience with MySQL, Postgres, Elasticsearch, Docker, or Mesos
At least 1 year of experience in object-oriented (OO) programming
Preferred Qualifications:
Bachelor’s degree in Computer Science or Information Technology
3+ years of overall IT experience in an enterprise cloud environment
2+ years of experience with administration and build experience of big data technologies such as Hadoop, Spark, or EMR
1+ year of experience in Metadata Hub (Ab Initio) or Data Quality (Informatica) or Collibra
1+ year of experience with scripting languages such as Python, Java, or R
1+ year of experience with Docker, Mesos, Kubernetes, or ECS
3+ years of experience in designing, deploying, and administering enterprise-class Big Data clusters
At this time, Capital One will not sponsor a new applicant for employment authorization for this position.