
Senior DevOps Engineer

  • On-site, Remote, Hybrid
    • Tallinn, Harjumaa, Estonia

Own CI/CD and infrastructure, improve reliability and automation for a high-load SaaS platform handling time-critical communication globally.

Job description

Stack: Linux, Kubernetes (Helm), PostgreSQL, MongoDB, CI/CD (GitLab CI), Cloud infrastructure (OVH, bare-metal, OpenStack, Cloudflare, AWS), IaC (Terraform, Ansible)

Company description

Textmagic AS is a publicly traded SaaS company listed on Nasdaq First North Tallinn. Our core product is a business messaging platform that enables companies to send A2P SMS and email and to build automated communication flows. Our customers use the platform for urgent alerts, compliant notifications, and other time-critical communication. Trusted by over 25,000 businesses worldwide, our system processes high-volume traffic while meeting strict uptime and performance requirements.

Our team of 40+ professionals is distributed across Estonia (headquarters), Romania, Ukraine, Serbia, and Montenegro. We work in a remote-friendly setup with high ownership and clear accountability. You will join a focused engineering team responsible for maintaining and scaling production systems used daily by thousands of businesses worldwide.

Role overview

We’re looking for a Senior DevOps Engineer to take strong ownership of the infrastructure behind our global SaaS messaging platform. This role is for someone who wants to shape how infrastructure is built, operated, and improved. You will be responsible for reliability, scalability, automation, and production stability across all environments.

You will work closely with engineering leadership and developers to improve system architecture, deployment processes, security, and operational standards. This is a high-impact role with real influence on technical decisions and how our infrastructure evolves as we grow.

Job requirements

Key responsibilities

  • CI/CD ownership: Design and own CI/CD pipelines (GitLab), improving build speed, deployment safety, and rollback processes across environments

  • Infrastructure ownership: Own and evolve our cloud and bare-metal infrastructure (OVH, Cloudflare, AWS, OpenStack), ensuring high availability, performance, and stability under load

  • Infrastructure as code: Lead infrastructure as code practices using Terraform and Ansible, enforcing version control, peer review, and consistency standards

  • Observability and monitoring: Improve system observability using monitoring, logging, tracing, and alerting tools (Grafana, Prometheus, Loki), and drive proactive reliability improvements

  • Infrastructure security: Strengthen infrastructure security, including DDoS mitigation, traffic filtering, and access control management

  • Incident management: Lead root cause analysis of production incidents and implement long-term reliability improvements

  • Automation: Design automation to reduce manual operational work and improve deployment and recovery processes

  • Database reliability: Ensure high availability and performance of production databases (PostgreSQL, MongoDB), including backup, recovery, and scaling strategies

  • Environment management: Ensure consistency and reliability across development, staging, and production environments

Expected qualifications

  • Linux expertise: Strong Linux system administration experience in high-availability production environments

  • Kubernetes production experience: Hands-on experience running Kubernetes in production, including scaling, upgrades, and troubleshooting

  • Systems architecture understanding: Solid understanding of containerization, virtualization, and infrastructure design trade-offs

  • Networking knowledge: Strong understanding of networking concepts (L2, L4, L7), debugging tools (tcpdump, ngrep), and traffic analysis

  • Production lifecycle experience: Experience operating and troubleshooting applications in high-availability production environments

  • CI/CD systems design: Experience designing and maintaining CI/CD systems and deployment workflows

  • Database operations: Strong experience managing PostgreSQL and MongoDB in production, including performance tuning and reliability

  • Infrastructure as code: Practical experience with Terraform and configuration management tools (Ansible or similar), following best practices

  • Monitoring and logging: Experience working with monitoring and log aggregation systems (Grafana, Prometheus, Loki, or similar)

  • Security awareness: Practical understanding of infrastructure security principles and production hardening

  • Communication skills: Fluent written English and fluent spoken Russian required

Nice to have

  • Messaging/telecom background: Experience with telecom or messaging systems (SMPP, Asterisk, Kamailio)

  • PostgreSQL high availability: Experience with PostgreSQL replication/clustering, backups, and failover (PITR, Patroni/repmgr or similar)

  • Kubernetes operations: Experience operating Kubernetes clusters in production (upgrades, autoscaling, networking, troubleshooting)

  • Scripting: Scripting skills in Bash, Python, or Go for automation and internal tooling

  • Security and traffic protection: Experience mitigating malicious traffic and managing DDoS protection (Cloudflare WAF/rate limiting, fail2ban)

  • Email deliverability basics: Familiarity with SPF, DKIM, DMARC, and how they affect sending reliability

  • SRE practices: Experience with SLOs/SLIs, alert quality, and incident postmortems

What we offer

  • Competitive compensation: Salary aligned with senior-level responsibility

  • Technical ownership: Real influence on infrastructure decisions and long-term architecture

  • Production impact: Direct responsibility for high-availability systems used globally

  • Lean structure: Small team, fast decisions, and minimal management overhead

  • Tooling freedom: Ability to improve processes, automation, and infrastructure standards

  • Remote flexibility: Work remotely or from our office — your choice

  • Professional growth: Support for relevant training and technical development

  • Professional culture: High ownership, clear accountability, and direct communication

Why join us?

At Textmagic, infrastructure reliability is critical to the product. Our customers rely on us for time-sensitive communication, which means uptime and performance are not optional.

As a Senior DevOps Engineer, you will work on production systems that operate under real load and strict reliability requirements. You will have the authority to improve how infrastructure is designed, deployed, and maintained, and your decisions will directly affect system stability and performance.

You will join a focused engineering team where ownership is expected and technical decisions matter.
