Great, and always improving, production support strategies are an essential ingredient of our current and future success. We are actively seeking talented system support and engineering specialists who can own support of applications and systems in production and drive reliability and performance at massive scale by mastering the full depth of the stack. As an engineering-focused Production Support Specialist, you will have the opportunity to tackle complex problems of scale that are unique to tech companies while using your expertise in the delivery and support of mission-critical services. With a passion for DevOps and continuous improvement, you will be encouraged to raise the bar for other support specialists and help take their capabilities to the next level.
- Resolve incidents and support production system deployments while ensuring SLAs are met.
- Deliver on Time to Resolve and Time to Detect reduction efforts.
- Identify and contribute to long-term solutions and preventative techniques.
- Increase self-healing through closed-loop automation.
- Progress, protect, and provide for the applications and sub-systems behind all of Capital One’s external and internal customer-facing services with an ever-watchful eye on their availability, latency, performance, and capacity.
- Collaborate with other tech leads and support teams to ensure integrated end-to-end availability, reliability, and performance
- Support and deliver within Continuous Integration/Continuous Delivery pipelines
- Influence resiliency and scalability in production environments in Amazon Web Services and other cloud platforms
- Equip systems with automated monitoring and alerting
- Support the team and contribute to designing, writing, and delivering technical and process automations to improve the availability, scalability, latency, and efficiency of Capital One’s services.
- Solve problems relating to mission-critical services and build automation to prevent problem recurrence, with the goal of automating the response to all non-exceptional service conditions.
- Influence and support new designs, architectures, standards and methods for large-scale distributed systems.
- Engage in service capacity planning and demand forecasting, software performance analysis and system tuning.
- Identify and remediate risk to critical and non-critical system KPIs
- Bachelor’s Degree in Computer Science or related technical field, or equivalent practical experience
- At least 3 years of experience in enterprise production support or at least 3 years of experience in software development teams
- At least 3 years of support experience in one or more of the following:
- At least 1 year of experience with ITIL practices and principles
- Systematic problem-solving approach, coupled with a strong sense of ownership and drive
- At least 1 year of experience with enterprise monitoring tools such as Splunk, BlueStripe, Zabbix, HPOM, Diagnostics, Sitescope, BSM, AppResponse, CA-Unicenter
- Master’s Degree
- At least 1 year working in 1 or more of the following:
- Familiarity with hosting apps in the cloud
- Technical certification(s) in cloud or open-source technologies
- Familiarity with running web services at scale
- Understanding of Unix/Linux systems from kernel to shell and beyond, taking in system libraries, file systems, and client-server protocols along the way
At this time, Capital One will not sponsor a new applicant for employment authorization for this position.