Site Reliability Developer 2 (Reston, VA)

February 15, 2026

Job Description

3.8 out of 5 stars

United States

Full-time

Job details

Job type

  • Full-time

Shift and schedule

  • On call

Full job description

The IC2 Site Reliability Engineer in our OCI Sovereign Cloud team supports daily operations for a secure, large-scale OCI-based cloud environment powering mission-critical federal government workloads. This entry-level position focuses on maintaining and supporting existing infrastructure, implementing incremental improvements, and ensuring operational health and compliance. Working within a Linux-centric environment, you will leverage scripting and basic automation to manage deployments, perform fleet maintenance, and maintain system health under the supervision and guidance of senior engineers.

Key Responsibilities:

  • Perform routine operational tasks such as deployments, patching, fleet maintenance, and basic troubleshooting for cloud-based systems.
  • Tune team-specific alarms and thresholds, escalate incidents appropriately, and support the management of metrics, KPIs, and system health dashboards.
  • Participate in incident response by quickly triaging and escalating incidents, executing operational playbooks, and documenting issues for senior review. You will follow established procedures under supervision and contribute to root-cause analysis by gathering data and providing initial troubleshooting support.
  • Serve as a technical support point of contact, troubleshooting and resolving technical issues, assisting customers with environment setup and debugging, and providing timely communication and status updates to customers and internal teams.
  • Own, maintain, and improve runbooks to ensure consistency and clarity for operational processes.
  • Implement defined enhancements to existing tools, documentation, and monitoring solutions.
  • Collaborate closely with other team members and escalate complex issues for further investigation and resolution.
  • Participate in on-call rotations with support from senior engineers, ensuring continuity of coverage and timely response.
  • Ensure compliance with all security, operational, and documentation standards.

Minimum Qualifications:

  • U.S. citizenship; must possess and maintain a TS/SCI security clearance with polygraph.
  • Hands-on experience with Linux systems administration.
  • Scripting ability with Python or Bash.
  • Understanding of basic cloud concepts (networking, compute, identity, observability).
  • Strong problem-solving skills and willingness to learn complex systems.
  • Ability to work collaboratively with technical teams and communicate effectively.

Preferred Qualifications:

  • Exposure to Oracle Cloud Infrastructure (OCI) or other major cloud platforms.
  • Familiarity with Infrastructure-as-Code tools such as Terraform or Ansible.
  • Experience supporting production systems or participating in on-call rotations.
  • Understanding of security best practices within classified environments.

Why Join Us?

This is an opportunity to grow your career in a highly collaborative team supporting mission-critical systems in one of the world’s leading cloud environments. You will receive significant coaching and guidance while gaining hands-on experience with real-world enterprise operations.