Get to know us better
CodiLime is a software and network engineering industry expert and the first-choice service partner for top global networking hardware providers, software providers and telecoms. We create proofs-of-concept, help our clients build new products, nurture existing ones and provide services in production environments. Our clients include both tech startups and big players in various industries and geographic locations (US, Japan, Israel, Europe).
While no longer a startup - we have 250+ people on board and have been operating since 2011 - we’ve kept our people-oriented culture. Our values are simple:
- Act to deliver.
- Disrupt to grow.
- Team up to win.
The project and the team
We're building software for modern platforms and operating systems, supporting leading networking equipment manufacturers, cloud-native solutions, and infrastructure projects. AI/ML is increasingly at the heart of these initiatives, not as an add-on but as a core tool for addressing complex engineering and networking challenges. We are seeking an engineer with a strong software engineering background, solid AI/ML expertise, and experience in computer networks.
Your role
As a part of the team, you will be responsible for:
- Developing MCP-like tools that expose network device APIs and CLI commands with clear descriptions, structured inputs/outputs, validation logic, and error handling
- Managing tool metadata and supporting semantic search over available tools using a vector database
- Creating golden user queries, expected answers, and query variations for specific tools, intents, and network-operation scenarios
- Building automated tests to verify correct tool selection, tool parameterization, output structure, and end-to-end agent responses
- Designing evaluation workflows combining deterministic checks, human review, and LLM-as-a-judge techniques, for example using DeepEval or custom evaluation prompts
- Refining prompts, tool descriptions, schemas, and agent workflows while monitoring regressions when new tools or changes are introduced
- Developing production-quality Python code and tests using frameworks such as LangChain and LangGraph
- Collaborating with software engineers, network domain experts, and DevOps teams to deliver reliable, testable, and maintainable agentic workflows
Do we have a match?
As a Mid/Senior AI engineer with networking experience, you must meet the following criteria:
- AI and development expertise: Hands-on experience with LLM-driven workflows, agentic frameworks such as LangChain and LangGraph, and tool-calling patterns
- Agentic tool development: Experience designing structured tools with clear descriptions, input/output schemas, validation logic, and integration with external APIs or command-based systems
- Search, RAG, and prompting: Experience with semantic search, vector databases, RAG patterns, prompt engineering, and structured LLM outputs
- Testing and evaluation: Experience creating golden queries, automated tests, regression checks, and chatbot/agent response evaluations, including LLM-as-a-judge approaches
- Python engineering: Proven experience developing production-quality Python code, including automated tests and maintainable integration logic
- Networking expertise: CCNA certificate or equivalent knowledge. Understanding of networking platforms, device commands, and troubleshooting
- English (B2 level at minimum, but preferably C1 or C2)
Beyond the criteria above, we would appreciate the following nice-to-haves:
- Experience with AI-assisted coding tools such as Codex, GitHub Copilot, Cursor, or similar
- MCP and agent interoperability: Familiarity with Model Context Protocol, MCP server design, tool discovery, tool permissions/scopes, and emerging agent-to-agent communication patterns such as A2A
- Advanced agent architectures: Understanding of routing agents, supervisor/planner patterns, multi-agent workflows, guardrails, and architectures combining deterministic logic with LLM-based reasoning
- LLM evaluation tooling: Experience with frameworks and platforms such as DeepEval, LangSmith, OpenAI Evals, TruLens, BenchLLM, or similar tools for evaluating LLM and agent workflows
- AI/ML for infrastructure data: Practical knowledge of classification, clustering, anomaly detection, time-series analysis, or statistical methods applied to telemetry, syslog, events, alerts, or operational data
- Production deployment and operations: Experience deploying AI/LLM-based solutions in production environments, including Docker, Kubernetes, CI/CD, monitoring, MLOps, or cloud/hybrid infrastructure
- Interactive analysis and visualization: Experience building dashboards, notebooks, or lightweight applications for analysis and validation using tools such as Jupyter, Streamlit, Plotly, Altair, matplotlib, or similar
More reasons to join us
- Flexible working hours and approach to work: fully remote, in the office, or hybrid
- Professional growth supported by internal training sessions and a training budget
- Solid onboarding with a hands-on approach to give you an easy start
- A great atmosphere among professionals who are passionate about their work
- The ability to change the project you work on