Ncounter
Network Reliability Specialist
London | Hybrid
Ncounter is supporting a global financial services organisation as they expand a specialist team responsible for the reliability, resilience, and security of critical network infrastructure supporting international trading and investment platforms.
This is a highly hands-on engineering role, focused on keeping complex network environments stable, observable, secure, and highly available. Rather than spending your time producing network designs, you'll be deeply involved in the day-to-day engineering of network reliability: building automation, creating monitoring frameworks, improving operational processes, and preventing incidents before they occur.
Working across data centre, enterprise, and cloud environments, you will take ownership of the tooling, automation, and observability capabilities that allow the wider business to operate with confidence.
Key Responsibilities
• Build and enhance network observability, monitoring, and alerting frameworks across critical infrastructure
• Develop automation solutions to improve operational efficiency, configuration management, and incident response
• Proactively identify reliability risks, performance bottlenecks, and single points of failure across network environments
• Investigate complex incidents, perform root cause analysis, and implement permanent improvements
• Create meaningful telemetry, dashboards, metrics, and alerting strategies that provide actionable operational insight
• Improve network resilience through automation, testing, and continuous operational enhancement
• Work closely with infrastructure, platform, security, and engineering teams to strengthen service reliability
• Support network security initiatives, including hardening, secure design practices, access controls, and threat detection capabilities
What We're Looking For
• Strong networking fundamentals, including BGP, OSPF, multicast, routing, and switching
• Experience operating large-scale production networks where uptime and reliability are critical
• Hands-on experience with network automation using Python, Ansible, or similar technologies
• Strong knowledge of monitoring, observability, and alerting platforms
• Experience building operational tooling, automation frameworks, or reliability-focused engineering solutions
• Understanding of network security principles and secure infrastructure practices
• Experience with Arista and/or Cisco technologies
• Ability to troubleshoot complex infrastructure issues in high-pressure environments
This is an opportunity to take ownership of the reliability engineering function for critical global infrastructure. If you enjoy automating away manual effort, improving operational resilience, building meaningful monitoring capabilities, and solving complex network challenges, we'd be keen to arrange a confidential discussion.
To apply for this job, please visit www.reed.co.uk.
