Applied AI Engineer (RAG & Agents)

Job ID: 240901

Location: Cyprus / Remote (EU)

LoopSmart is a technology research and development company. We build advanced software, AI systems, and infrastructure that power our partners' products. We focus on creating tangible technology solutions that solve complex problems at scale.

We are seeking an Applied AI Engineer to lead the development of our retrieval-augmented generation (RAG) and agentic workflow capabilities. In this role, you will move beyond "demo-ware" to build systems that are robust, measurable, and ready for transfer. You will work backwards from the needs of future licensees (stability, observability, and cost-efficiency) to design architectures that solve real-world information retrieval and synthesis problems.

You will be responsible for the full lifecycle of these assets: from reading the latest literature and prototyping new retrieval strategies to documenting the operational trade-offs for the engineering teams that will eventually adopt your code.

Key Job Responsibilities

  • Design and implement modular RAG pipelines, selecting appropriate chunking, embedding, and reranking strategies for diverse data types.
  • Build and evaluate agentic workflows that can reliably use tools to answer complex queries, with a focus on safety and predictability.
  • Develop comprehensive evaluation suites (golden datasets, automated metrics, human-in-the-loop grading) to quantify system performance.
  • Instrument code for deep observability, ensuring that every decision made by the model is traceable and debuggable.
  • Write technical documentation and "transfer packages" that allow other engineers to understand, deploy, and maintain your work.
  • Collaborate with infrastructure engineers to optimize inference costs and latency.

A day in the life

Your day might start with reviewing the results of an overnight evaluation run and analyzing why a new reranking model improved precision but hurt latency. You discuss these trade-offs in a design review, deciding to expose a configuration knob for future licensees. Later, you spend a few hours coding a new tool for an agent that needs to query a SQL database, writing unit tests to ensure it handles schema errors gracefully. In the afternoon, you read a new paper on long-context models and write a one-pager proposing an experiment to see whether they could replace a complex chunking strategy. You wrap up by updating the "Usage Guide" for a library you shipped last week, clarifying how to handle rate limits.

About the team

You will join a small, high-density team of researchers and engineers who value shipping over hype. We operate like a lab: we form hypotheses, run experiments, and document results. We are not a feature factory; we are an asset factory. We value clear writing, intellectual honesty, and the discipline to finish what we start. We work asynchronously and respect deep work time.

Basic Qualifications

  • 3+ years of non-internship professional software development experience.
  • 2+ years of experience with Python or TypeScript in a production environment.
  • Experience building and deploying applications using LLMs (OpenAI, Anthropic, or open weights).
  • Experience with vector databases (e.g., Qdrant, Pinecone, pgvector) and search concepts.
  • Bachelor's degree in Computer Science or equivalent practical experience.

Preferred Qualifications

  • Master's degree or PhD in Computer Science, AI, or a related field.
  • Experience with evaluation frameworks (e.g., Ragas, TruLens) or building custom eval harnesses.
  • Familiarity with AWS infrastructure (Lambda, ECS, Bedrock).
  • Contributions to open-source AI/ML projects.
  • Strong written communication skills for technical documentation.