Why this role

Forge's core promise is safe sandboxed agents + real-time cost governance + auditable execution. You'll build the production foundations that make that promise true: secure sandbox runtimes, token→USD metering, end-to-end observability, and safe deploy/rollback.

What you'll own

- Ephemeral execution environments for agents/tools (containers / Firecracker / WASM), with CPU/memory/disk/network quotas, secrets brokering, and isolation hardening
- Cost governance infrastructure: accurate token→USD accounting, per-tenant budgets, anomaly detection, and enforcement hooks (throttle/downshift/queue)
- Release engineering: CI/CD, canary and blue-green deployments, rollbacks, feature flags, backups/DR, and incident response

Impact (first 90 days)

- 60 days: ephemeral sandbox in the cloud with quotas and network policies; $/task dashboards; alerting and runbooks
- 90 days: budget enforcement in production; canary releases and rollbacks; hardened IAM/secrets; measurable reliability and cost SLOs

Success metrics

- p95 latency and $/task visible for core workflows
- Less than 1% budget overruns, with automated detection and enforcement
- SLOs and alerts in place; MTTR improving with each incident

Requirements (must-have)

- 3-8+ years in platform/SRE/systems engineering, including ownership of production services
- Strong with IaC (Terraform) and cloud networking/security fundamentals (IAM, secrets, TLS)
- Comfortable with container orchestration (Kubernetes/EKS or equivalent)
- Proven experience implementing observability (OTel + metrics/logs/traces) and on-call/incident practices

Nice to have

- Experience with FinOps in usage-metered systems (LLMs, APIs, multi-tenant platforms)

Language skills

- Strong conversational English (C1/C2) for team interactions
- Professional proficiency in English for documentation, code reviews, and cross-functional collaboration

Soft skills

- Curiosity and eagerness to learn across the tech stack
- Strong problem-solving and debugging skills
- Ability to work collaboratively in distributed teams
- Good communication and time-management skills

Additional details

- Location: fully remote
- Salary: 10,000,000.00 - 13,500,000.00 COP
- Seniority level: Mid-Senior level
- Employment type: Full-time
- Job function: Engineering and Information Technology
- Industries: Software Development
Founding Platform/SRE Engineer - Sandbox, Cost & Observability
TIRESIA SOLUCIONES TECNOLÓGICAS S.A.S
Remote (work from home)
Posted 21 days ago