Location: Bogotá, Colombia, LatAm
Employment Type: Full time
Department: Engineering

About the Role
This is where you come in. Below, you’ll find what this role is all about: the impact you’ll drive, the challenges you’ll tackle, and what it takes to thrive at Addi. If you’re ready to be part of something big, keep reading.

What’s the mission you’ll drive
To architect and lead the evolution of Addi’s engineering platform, championing a culture of independent deployability and proactive reliability, directly enabling rapid product deployment and guaranteeing the availability, security, and scalability required to support our transformation into a leading financial platform in Latin America.

What you will do
Architect and provide the tooling that allows product squads to deploy without cross-team synchronization.
Transition 80% of core services to a self-service deployment model where "deployment trains" are replaced by independent, asynchronous releases.
Redesign the global networking architecture to move away from legacy monolithic ingress points toward a decoupled, multi-layered security perimeter.
Implement a "Security-by-Design" topology (e.g., Service Mesh or API Gateway isolation) that abstracts internal services from public entry points.
Deliver a standardized "Delivery-as-a-Product" capability that enables complex traffic shifting strategies for all teams.
Provide a unified interface for teams to manage Canary and Blue-Green deployments, including automated health-check gates that trigger rollbacks without manual intervention.
Build the telemetry "Golden Signals" pipeline and provide the libraries/tooling required for other teams to instrument their own services (OpenTelemetry).
100% of new services are "Observability-Ready" at launch, with automated SLO dashboards and alerting enabled via the platform.
Integrate AI-assisted development workflows into the platform (e.g., AI-driven IaC generation or LLM-based troubleshooting assistants for dev squads).
30% reduction in platform-related support tickets by providing AI-assisted self-service documentation and diagnostic tools.

What we’re looking for

Proven Expertise in Cloud & Infrastructure Fundamentals
3-5 years of full-time, relevant experience as a DevOps Software Engineer, Cloud Engineer, or Site Reliability Engineer (SRE).
Demonstrated mastery in designing and operating highly available, scalable, and self-healing systems on the cloud, aligning with the AWS Well-Architected Framework.
Expert proficiency in Linux system administration and scripting (Bash, Python) for large-scale automation, and experience administering relational/non-relational database systems.

Track Record of Success with IaC and CI/CD Optimization
Deep, professional experience defining, provisioning, and managing 100% of cloud infrastructure using modern IaC tools like Terraform or AWS CDK.
Proven ability to implement and optimize CI/CD pipelines (e.g., using GitHub Actions or Jenkins), resulting in a quantifiable reduction in Mean Time to Deployment (MTTD) and Change Failure Rate (CFR).
Proven ability to design and deliver internal developer platforms that enable product squads to deploy independently via Service Mesh, Feature Flags, and Contract Testing frameworks.
Expert proficiency in implementing complex deployment patterns (Canary, Blue/Green) at scale, automating traffic shifting and health-check gates to ensure zero-downtime releases.

Demonstrates Technical Ownership & SRE Mindset
Takes full responsibility for the reliability and performance of critical production systems, treating infrastructure as a product.
Not only uses observability tools (Prometheus, Grafana, OpenTelemetry) but builds the "Observability Pipeline" that enables other teams to instrument their own services and define their own SLOs/SLIs.
Proactively identifies and eliminates "toil" by automating repetitive tasks, then measuring and reporting on the efficiency gains.
Actively participates in on-call rotations, writes clear runbooks, and leads post-mortem analyses to prevent recurrence of incidents.

Has Solid Expertise in IaC Mastery & Security-First Approach
Expertly defines, provisions, and manages 100% of cloud resources (AWS) using Infrastructure as Code (e.g., Terraform, AWS CDK).
Deep expertise in redesigning cloud networking topologies (e.g., Transit Gateways, PrivateLink, Service Mesh) to decouple entry points and enforce a Zero-Trust security mindset.
Proven track record of abstracting internal services from public ingress points, reducing the blast radius of incidents through sophisticated routing and isolation strategies.
Contributes to architectural decisions (ADRs) by proposing scalable, cost-efficient, and security-hardened patterns for cloud resource consumption.

Possesses Strong Systematic Problem Solving Skills
Applies a structured, data-driven methodology to diagnose and resolve complex distributed system issues across the full stack.
Expertly leverages observability tools (logs, metrics, and traces) to identify root causes of failures, moving beyond symptoms to implement long-term fixes.
Effectively manages the resolution process for high-severity incidents, communicating clearly and concisely to technical and non-technical stakeholders.

Proven Ability to Drive DevOps Culture & Collaboration
Fosters a culture of shared responsibility by actively collaborating with development teams to simplify deployment, testing, and monitoring workflows.
Early adopter and implementer of AI-assisted development tools (e.g., LLM-driven IaC generation, automated log analysis) to accelerate the engineering feedback loop and reduce cognitive load.
Acts as a force multiplier by mentoring software engineers on cloud-native patterns, networking security, and self-service tools to improve overall engineering autonomy.
Communicates complex technical issues and proposed solutions effectively and concisely to both peers and leadership across distributed teams.

Why join us?
Work on a problem that truly matters – We are redefining how people shop, pay, and bank in Colombia, breaking down financial barriers and empowering millions. Your work will directly impact customers' lives by creating more accessible, seamless, and fair financial services.
Be part of something big from the ground up – This is your chance to help shape a company, influencing everything from our technology and strategy to our culture and values. You won’t just be an employee; you’ll be an owner.
Unparalleled growth opportunity – The market we’re tackling is massive, and we’re growing faster than almost any fintech lender at our stage. If you’re looking for a high-impact role in a company that’s scaling fast, this is it.
Join a world-class team – Work alongside top-tier talent from around the world, in an environment where excellence, ownership, and collaboration are at the core of everything we do. We care deeply about what we build and how we build it, and we want you to be a part of it.
Competitive compensation & meaningful ownership – We believe in rewarding our talent. You’ll receive a generous salary, equity in the company, and benefits that go beyond the basics to support your growth.