The role involves designing the logic, personality, and execution of AI agents by refining complex prompt stacks and analyzing system logs. You will build Python-based evaluation harnesses to test agent behavior and ensure an intuitive user experience within a media-generation platform.