Great software doesn’t happen on its own. It takes great people. That just happens to be our Forte. With 25 years of matching top engineering talent with preeminent and innovative brands, we seek individuals who are inquisitive, resourceful, and dedicated to their craft, driven to help companies build exceptional software. If this sounds like you, read on.

Position Summary

We are looking for a Platform Operations Engineer to support and strengthen the operational foundation of our Backend organization while helping build the next‑generation infrastructure for our platform. This role sits at the intersection of Engineering, SRE, DBA, and NOC, acting as the first line of support for platform reliability and performance, and contributing hands‑on to the construction and evolution of our backend infrastructure.

You will debug complex issues, improve performance, deepen observability, and help architect and implement a more resilient, scalable platform as we modernize our systems. This is a high‑impact engineering role for someone comfortable working across boundaries, who understands real‑world system behavior, and who is motivated to improve both what exists today and what comes next.

Why This Role Matters

Our platform is entering a new phase of scale and modernization. This role is critical to bridging day‑to‑day operational realities with the future state of our infrastructure. You will shape and support the systems that power the entire platform, ensuring that what we build is not only functional but also resilient, observable, and ready to meet the demands of a rapidly growing business.

What You’ll Do

Operational Ownership
Serve as the operational proxy for the Backend team in cross‑functional discussions with SRE, DBA, Infrastructure, and NOC teams.
Participate in PRAC, CAB, and other operational forums, representing backend engineering needs and platform realities.
Act as first responder for backend‑impacting issues, partnering with SRE and DBA teams to drive quick resolution.

Infrastructure Development
Contribute directly to the design and buildout of new infrastructure and platform components.
Partner with SRE and Infrastructure teams to define infrastructure requirements, deployment patterns, scaling strategies, and reliability expectations.
Develop tooling, automation, and configuration patterns that support the new platform.
Assist in migrating services, workloads, or configurations as part of platform modernization.

Platform Stability & Performance
Debug production issues across logs, metrics, traces, configs, and data flows.
Manage API performance using JMeter and other load/performance tools, identify bottlenecks, and propose fixes.
Build and refine monitors, alerts, dashboards, and investigative workflows.
Implement tools and automation to reduce manual operational effort and improve visibility.
Extend observability across new and existing services.
Drive efficiency in deployments, debugging, and incident response.

Cross‑Functional Collaboration
Work closely with backend developers to understand new features and how to instrument, test, and monitor them.
Partner with DBAs on database behaviors, schema changes, and query performance.
Collaborate with SRE on reliability goals, capacity planning, and readiness criteria for new infrastructure components.
Proactively identify weaknesses in system design, monitoring, deployment, or configuration, and implement improvements.
Use time between incidents to strengthen resilience, optimize performance, and reduce future operational load.

Technology
Java, Spring Boot / Spring Framework
MySQL, Redis
Kafka or RabbitMQ
Git / GitHub
JMeter, BlazeMeter, Postman
Jira for task management

What You Bring
Experience working with large‑scale distributed backend systems and cloud‑native environments.
Strong AWS knowledge and exposure to infrastructure patterns such as autoscaling, distributed caching, service meshes, load balancing, and cloud deployments.
Ability to read and interpret Java or similar backend code during debugging.
Solid foundation in relational databases and performance best practices.
Experience building internal tools, automation, or operational workflows.
Ability to debug end‑to‑end system issues through logs, metrics, traces, and configuration analysis.
Strong communication skills with the ability to represent engineering in cross‑functional operational forums.
High personal ownership for platform performance, reliability, and operational excellence.

Useful Qualifications
Experience designing or migrating infrastructure as part of a modernization effort.
Familiarity with Terraform, CloudFormation, or other infrastructure‑as‑code tools.
Background in performance engineering, load testing, or capacity planning.
Experience in observability platforms such as Datadog, Grafana, Honeycomb, or similar.
Familiarity with streaming, media delivery, or high‑throughput API platforms.

Desired Characteristics
Passionate about software development, problem‑solving, and high‑quality engineering.
Collaborative. Works well in a team.
Excited to succeed and to help others succeed.
Enthusiastic about expanding one’s skillset, learning and leveraging new technologies.
Open to new approaches and new ideas.
Able to independently solve moderately complex issues in practical ways and knows when to solicit help.
Communicates effectively, both verbally and in writing, with peers and colleagues.

Join a team that invests in your well‑being, growth, and success!
Platform Operations Engineer (SRE) Colombia / Mexico
FORTE GROUP
Remote
Posted 12 days ago