Site Reliability Engineer

San Antonio, Texas

Job Details

  • Company: Team Cymru Inc
  • Address: 78253 San Antonio, Texas, United States
Team Cymru Inc

*This position is on-site in San Antonio, Texas*

*This position requires a TS/SCI security clearance and a full-scope polygraph*

Job Summary:

As a Site Reliability Engineer (SRE) at Team Cymru, you’ll be at the forefront of maintaining our user-facing services and production systems. In this role, you’ll blend the best of operational expertise and software craftsmanship, applying cutting-edge engineering principles, operational discipline, and innovative automation to both our environments and codebase.

As an SRE, you’ll focus on a range of systems including operating systems, storage subsystems, and networking. You’ll champion best practices for availability, reliability, and scalability, all while delving into algorithms and distributed systems.

Supervisory Responsibilities:

  • None.

Duties/Responsibilities:

  • Study product characteristics or customer requirements to determine validation objectives and standards.
  • Analyze validation test data to determine whether systems or processes have met validation criteria or to identify root causes of production problems.
  • Develop validation master plans, process flow diagrams, test cases, or standard operating procedures.
  • Prepare detailed reports or design statements, based on results of validation and qualification tests or reviews of procedures and protocols.
  • Conduct validation or qualification tests of new or existing processes, equipment, or software in accordance with internal protocols or external standards.
  • Communicate with regulatory agencies regarding compliance documentation or validation results.
  • Prepare, maintain, or review validation and compliance documentation, such as engineering change notices, schematics, or protocols.
  • Recommend resolution of identified deviations from established product or process standards.
  • Design validation study features, such as sampling, testing, or analytical methodologies.
  • Create, populate, or maintain databases for tracking validation activities, test results, or validated systems.
  • Install racked equipment, including labeling and cable management.
  • Resolve testing problems by modifying testing methods or revising test objectives and standards.
  • Conduct audits of validation or performance qualification processes to ensure compliance with internal or regulatory requirements.
  • Direct validation activities, such as protocol creation or testing.
  • Coordinate the implementation or scheduling of validation testing with affected departments and personnel.
  • Participate in internal or external training programs to maintain knowledge of validation principles, industry trends, or novel technologies.

Required Skills/Abilities:

  • General knowledge of four of the following technical expertise areas, with deep knowledge in one:
    • Chef (basic syntax, recipes, cookbooks) and Ansible (basic syntax, tasks, playbooks)
    • Terraform (basic syntax) and CI/CD (configuration, pipelines, jobs)
    • Cloud resources provisioning and configuration through CLI/API
    • Kubernetes basic understanding, CLI, service re-provisioning
    • Provisioning and setup of metrics in Prometheus, Thanos, and Grafana, including alerts and silences
    • Provisioning and setup of logs, and queries to answer general questions
    • Operating system (Linux) configuration, package management, startup and troubleshooting
    • Block and object storage configuration
    • Datacenter installation processes, including equipment management and cable management requirements
    • Networking: VPCs, proxies, and CDNs

Education and Experience:

  • High school diploma or equivalent.
  • At least two years of related experience.

Physical Requirements:

  • Prolonged periods of sitting at a desk and working on a computer.

Location:

  • San Antonio, TX


PI248452737


Published: 3 weeks ago
