IT Manager - System Support Engineering

  • Company: Capital One
  • Location: Richmond, Virginia
  • Posted: January 10, 2017
  • Reference ID: R16954
West Creek 4 (12074), United States of America, Richmond, Virginia

Great, and always improving, production support strategies are an essential ingredient of our current and future success. We are actively seeking talented system support and engineering managers who can own support of products in production and drive reliability and performance at massive scale by mastering the full depth of the stack. As an IT Manager, you will have the opportunity to tackle complex problems of scale that are unique to tech companies while using your expertise in the delivery and support of mission-critical services. With a passion for DevOps and continuous improvement, you will be encouraged to empower support engineers to take their capabilities to the next level.


- Drive incident resolution through a systematic problem-solving approach, coupled with a strong sense of ownership

- Support production system deployments while ensuring SLAs are met

- Progress, protect, and provide for the software and systems behind all of Capital One’s external and internal customer-facing services with an ever-watchful eye on their availability, latency, performance, and capacity

- Collaborate with other tech leads and support teams to ensure integrated end-to-end availability, reliability, and performance

- Empower support teams to manage Continuous Integration/Continuous Delivery pipelines

- Influence resiliency and scalability in production environments in Amazon Web Services and other cloud platforms

- Equip systems with automated monitoring and alerting

- Lead a team to design, write and deliver technical and process automations to improve the availability, scalability, latency, and efficiency of Capital One’s services

- Solve problems relating to mission-critical services and build automation to prevent problem recurrence, with the goal of automating the response to all non-exceptional service conditions

- Influence and support new designs, architectures, standards and methods for large-scale distributed systems

- Engage in service capacity planning and demand forecasting, software performance analysis and system tuning

- Identify and remediate risk to critical and non-critical system KPIs

- Provide technical leadership with Unix systems internals and networking

- Identify and implement opportunities for automation of routine maintenance tasks and common issues

- Troubleshoot networking problems with in-depth knowledge of network theory, including networking protocols (TCP/IP, UDP, ICMP, etc.), MAC addresses, IP packets, DNS, OSI layers, and load balancing

Basic Qualifications:

- Bachelor’s Degree

- At least 1 year of experience leading production support teams, or at least 1 year of experience leading software development teams, or at least 1 year of experience in enterprise support and ITIL practices and principles

- At least 1 year of experience in supporting highly available production systems in a cloud environment such as AWS, Azure, or Bluemix.

Preferred Qualifications:

- 1+ years of experience with one of the following:

  • Linux, UNIX, OpenStack administration
  • Python, Ruby, Go, JavaScript development experience
  • NoSQL platforms

- 1+ years of experience with any of the following:

  • AWS cloud services configuration & administration
  • Chef, Ansible, Puppet or UDeploy
  • Jenkins, Codeship, or Travis experience
  • RESTful web/API services support and deployment
  • Splunk, BlueStripe, Zabbix monitoring / alerting

At this time, Capital One will not sponsor a new applicant for employment authorization for this position.
