LH01 - Staff Agentic AI Engineer
Application
Add your info [here]
About the Project
We are working with a rapidly growing engineering organization (100+ engineers) to build production-grade Agentic AI systems that power complex, real-world workflows across internal platforms and customer-facing products.
The platform is centered around LLM-driven agents orchestrated with LangGraph, integrated with MCP (Model Context Protocol) servers, backend services, and domain-specific tools. These agents operate across long-running workflows, multi-step reasoning, asynchronous execution, and real-time decision-making.
This is not a research role and not a prompt-only position. This role focuses on designing, building, and operating reliable agentic systems in production, with strong ownership over orchestration, backend integration, scalability, and correctness. This is a long-term, high-impact role with deep influence over how AI agents are designed, deployed, and evolved across the company.
Role and Responsibilities
- Design, build, and operate production-grade agentic AI systems using LangGraph as the core orchestration layer.
- Architect and implement multi-agent and multi-step workflows, including planning, execution, reflection, retries, and failure handling.
- Design and operate MCP Servers to expose tools, services, and domain capabilities to LLM agents in a structured, scalable, and secure way.
- Build and maintain backend services and APIs (Python, FastAPI) that power agent execution, tool invocation, and workflow state.
- Own agent context, memory, and state management, including:
  - Short-term and long-term memory
  - Workflow state persistence
  - Deterministic replay and debugging
- Integrate agents with internal systems and external APIs, including databases, SaaS tools, CRMs, data pipelines, and internal platforms.
- Design async and event-driven execution models for agents using Redis, queues, and background workers.
- Implement guardrails, validation layers, and safety mechanisms to ensure agents behave predictably in production.
- Build observability for agentic systems, including:
  - Traces of agent decisions and tool calls
  - Workflow timelines and state transitions
  - Failure modes and retry behavior
  - Cost, latency, and throughput metrics
- Collaborate closely with product, backend, and platform teams to define agent responsibilities, system boundaries, and escalation paths.
- Ensure security, access control, and compliance across agent tooling, MCP servers, and data flows.
- Continuously refine agent behavior based on real-world usage, failures, and evolving product needs.
What We’re Looking For
Must-Haves
- 4+ years of professional engineering experience, with significant exposure to AI-driven or distributed backend systems.
- Strong proficiency in Python, with production experience using FastAPI or similar backend frameworks.
- Hands-on experience building agentic AI systems using frameworks like LangGraph (or equivalent graph/state-based orchestration).
- Strong understanding of agent workflows, including planning, tool calling, memory, retries, and failure handling.
- Experience designing and operating MCP Servers or similar tool-exposure layers for LLMs.
- Solid backend fundamentals:
  - API design
  - Async processing
  - State management
  - Error handling and resilience
- Experience using Redis for:
  - Queues and background jobs
  - Session, workflow, or agent state caching
- Strong understanding of latency, scalability, reliability, and cost trade-offs in production AI systems.
- Experience deploying and operating systems on cloud platforms (AWS, GCP, or Azure), including containers, CI/CD, monitoring, and logging.
- Comfortable owning systems end to end, from architecture to implementation to production operations.
We are looking for people with real Agentic AI experience.
Why Join Us?
- Build real production agentic systems, not demos or experimental prototypes.
- Own the agent orchestration and backend layer that powers critical workflows.
- Work with a high-caliber engineering organization that values correctness, reliability, and maintainability.
- High autonomy and technical ownership — your architectural decisions will have lasting impact.
- Be at the forefront of Agentic AI in real-world systems, where engineering discipline matters as much as intelligence.
Application Instructions
Submit everything [here]:
- Your résumé/CV highlighting agentic systems, LangGraph, MCP servers, backend engineering, and production AI
- Links to GitHub repositories, architecture docs, or systems you’ve built involving LLM agents or orchestration
- Your availability and compensation expectations
- Any experience with regulated systems, large-scale platforms, or complex backend architectures (optional but valued)