Data Engineer, Apache Spark

  • Company: Capital One
  • Location: McLean, Virginia
  • Posted: August 03, 2016
  • Reference ID: R8124
McLean 1 (19050), United States of America, McLean, Virginia

Do you want to be at the forefront of the next BIG thing in data? Are you charged by the thought of being involved at the ground level of an enterprise-wide Big Data transformation? Do you enjoy solving complex business problems in a fast-paced, collaborative, and iterative programming environment using cutting-edge open source tools and technologies? If this excites you, then keep reading.

Capital One – a top 10 US bank on a quest to use technology to bring ingenuity, simplicity, and humanity to banking – needs software engineers to power its Big Data transformation. We are looking for bright, driven, and talented individuals to join our team of passionate and innovative software engineers. In this role, you’ll use your experience with Java, Fast Data, Streaming, and Big Data technologies to work side by side with product owners and Agile team members in building our next generation of Big Data capabilities.

The Job:

-Collaborating as part of a cross-functional Agile team to create and enhance software that enables state-of-the-art, next-generation Big Data & Fast Data applications

-Building efficient storage for structured and unstructured data

-Developing and deploying distributed computing Big Data applications using Open Source frameworks like Apache Spark, Apex, Flink, Storm, Akka and Kafka on AWS Cloud

-Utilizing programming languages like Java, Scala, and Python, and Open Source RDBMS and NoSQL databases like PostgreSQL and MongoDB

-Utilizing Hadoop modules such as YARN & MapReduce, and related Apache projects such as Hive, HBase, Pig, and Cassandra

-Developing data-enabling software utilizing open source frameworks or projects such as Spring, AngularJS, Solr, Drools, etc.

-Leveraging DevOps techniques and practices like Continuous Integration, Continuous Deployment, Test Automation, Build Automation and Test-Driven Development to enable the rapid delivery of working code utilizing tools like Jenkins, Maven, Nexus, Chef, Terraform, Ruby, Git and Docker

-Performing unit tests and conducting reviews with other team members to make sure your code is rigorously designed, elegantly coded, and effectively tuned for performance

Your interests:

-You geek out over obscure sports statistics. You ponder what drove your streaming music service’s algorithms to accurately predict that you’d like the song you’re listening to. Nate Silver asks you who’s going to win the next election. You love data.

-You get a thrill out of using large data sets, some of them slightly messy, to answer real-world questions

-You yearn to be a part of cutting-edge, high-profile projects and are motivated by delivering world-class solutions on an aggressive schedule

-You are passionate about finding refined solutions to complex coding challenges and helping the entire team meet its commitments

-You love learning new technologies and mentoring more junior developers

-Humor and fun are a natural part of your flow

#ilovedata #bigdata #transforminganalytics 

Basic Qualifications:

-Bachelor’s Degree or military experience

-At least 3 years of professional work experience coding in data management

Preferred Qualifications:

-2+ years of experience with the Hadoop Stack

-2+ years of Distributed Computing frameworks experience

-2+ years of experience with cloud computing (AWS a plus)

-2+ years of NoSQL implementation experience (MongoDB and Cassandra a plus)

-4+ years of Java or Scala development experience

-4+ years of scripting experience

-4+ years of experience with Relational Database Systems and SQL (PostgreSQL a plus)

-4+ years of ETL design, development and implementation experience

-4+ years of UNIX/Linux experience
