AI Agent Evaluation Analyst for Autonomous Agents (No coding required)

Remote Full-time
We’re hiring detail-oriented, analytical contributors to help test and improve autonomous AI agent evaluations. This is part-time, fully remote work with flexible hours, ideal for people who enjoy finding edge cases, questioning assumptions, and strengthening complex systems.

What you’ll do
• Review and refine agent evaluation tasks and scenarios for logic, completeness, and realism
• Identify inconsistencies, ambiguities, and missing assumptions
• Define gold-standard expected behaviors for agents
• Annotate reasoning paths, cause-effect relationships, and plausible alternatives
• Collaborate with QA, writers, and developers to suggest refinements and expand edge-case coverage
• Ensure autonomous agents are tested thoroughly and realistically

What we’re looking for
• Strong analytical thinking and excellent attention to detail
• Fluent written English with clear documentation skills
• Comfort reading structured formats such as JSON or YAML (no need to write code)
• Ability to reason about complex systems and spot what could break or be misinterpreted

Nice to have
• Prior exposure to QA/test-case thinking, logic puzzles, or evaluation frameworks
