Role: Site Reliability Engineer

Experience: 5 to 8 years

Location: Poland

Job Description:

We are looking for a dedicated Site Reliability Engineer (SRE) to join our infrastructure team. In this role, you will bridge the gap between development and operations, ensuring our systems are scalable, reliable, and efficient. You will spend your time automating away manual tasks through Shell scripting, contributing to our Python-based application stack, and managing complex deployments across a containerized environment.

CI/CD & Deploy: Spinnaker, GitHub Actions

Infrastructure: Pulumi

Version Control: Git, GitHub

EXPERIENCE / SKILL LEVEL

* Production on-call experience with incident response and systems troubleshooting

* Experience designing and operating containerized applications in production

* Proficiency in writing and releasing code (Python, Go, or similar)

* Experience with distributed services in large-scale Linux/Unix environments

SKILLS

Infrastructure as code: Terraform / Puppet

Container orchestration: Helm / K8s

DevOps: Spinnaker

Observability: Splunk / Prometheus

Coding: Python / Golang, etc.

Job Overview

We are seeking a highly skilled Site Reliability Engineer (SRE) to join our infrastructure team. The ideal candidate will work at the intersection of software development and operations, ensuring our production systems are scalable, reliable, and efficient.

You will be responsible for automating operational tasks, managing containerized environments, improving deployment pipelines, and maintaining system observability and reliability across large-scale infrastructure.

Key Responsibilities

Infrastructure Reliability

Ensure high availability, performance, and reliability of production systems.

Participate in incident response, troubleshooting, and root cause analysis.

Support on-call rotations to resolve production incidents.

Automation & Development

Automate operational workflows using Shell scripting (Bash/Zsh) and Python.

Develop tools and scripts to improve system reliability and operational efficiency.

Contribute to internal tooling and platform improvements.

Containerization & Orchestration

Deploy and manage containerized applications using Kubernetes (K8s) and Docker.

Manage Kubernetes deployments using Helm.

CI/CD & Release Engineering

Build and maintain CI/CD pipelines using Spinnaker and GitHub Actions.

Support continuous integration and continuous deployment practices.

Infrastructure as Code

Provision and manage infrastructure using Pulumi, Terraform, or Puppet.

Maintain infrastructure configuration and automation.

Monitoring & Observability

Implement monitoring and alerting using Prometheus and Splunk.

Analyze system metrics and logs to proactively prevent incidents.

Required Skills & Qualifications

Technical Skills

Strong Shell scripting (Bash/Zsh) skills

Proficiency in Python or Golang

Experience with Kubernetes and Docker

Experience managing CI/CD pipelines (Spinnaker, GitHub Actions)

Knowledge of Infrastructure as Code tools (Terraform, Pulumi, Puppet)

Experience with Git / GitHub version control

Infrastructure & Operations

Strong experience with Linux / Unix systems

Experience supporting distributed systems at scale

Knowledge of container orchestration (Helm, Kubernetes)

Monitoring & Observability

Experience with Prometheus or Splunk

Understanding of system monitoring, alerting, and incident response

Preferred Qualifications

Experience operating large-scale production systems

Experience with microservices architectures

Strong troubleshooting and debugging skills

Experience with cloud infrastructure environments

Soft Skills

Strong problem-solving ability

Excellent communication skills

Ability to work in a fast-paced production environment

Ability to handle on-call production responsibilities
