Infojobs CTO Unpacks Generative AI Integration: From Traditional ML to Holistic Design and Critical EVALs
Ana, CTO at Infojobs, recently offered a deep dive into the company’s journey and learnings in integrating Generative AI (GenAI) into its product suite. Infojobs has used “traditional machine learning” since 2017 to power features like job recommenders and CV parsing, and it has rapidly embraced GenAI on top of that foundation to unlock new product opportunities. While traditional ML development typically required specialized data science and ML engineering teams, GenAI’s pre-trained models have lowered the barrier to entry, enabling broader product teams to integrate predictive capabilities within their bounded contexts. Initial GenAI implementations included suggesting job offer descriptions and experience details. These evolved into more complex functionality: AI-driven selection question suggestions for employers, and natural language search for job seekers, which translates complex queries into precise search filters.
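To make the natural language search idea concrete, here is a minimal sketch of one way a workflow could turn a model's reply into search filters. It assumes the model is asked to return a JSON object, and it validates that object before anything reaches the search backend. The filter names (`keywords`, `city`, `remote`, `contract`, `min_salary`), the contract vocabulary, and the `parse_filters` helper are all hypothetical; the talk summary does not describe Infojobs' actual schema or implementation.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional

# Hypothetical vocabulary: the real Infojobs filters are not described in the source.
ALLOWED_CONTRACTS = {"permanent", "temporary", "freelance"}


@dataclass(frozen=True)
class JobFilters:
    keywords: str
    city: Optional[str] = None
    remote: Optional[bool] = None
    contract: Optional[str] = None
    min_salary: Optional[int] = None


def parse_filters(model_output: str) -> JobFilters:
    """Validate the JSON text a model returned and convert it to JobFilters.

    Raises ValueError on malformed JSON, unknown keys, or invalid values.
    """
    data = json.loads(model_output)  # JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    # Reject any key outside the known schema instead of silently ignoring it.
    unknown = set(data) - {f.name for f in fields(JobFilters)}
    if unknown:
        raise ValueError(f"unknown filter(s): {sorted(unknown)}")

    keywords = data.get("keywords")
    if not isinstance(keywords, str) or not keywords.strip():
        raise ValueError("keywords must be a non-empty string")

    city = data.get("city")
    if city is not None and not isinstance(city, str):
        raise ValueError("city must be a string")

    remote = data.get("remote")
    if remote is not None and not isinstance(remote, bool):
        raise ValueError("remote must be true or false")

    contract = data.get("contract")
    if contract is not None and (
        not isinstance(contract, str) or contract not in ALLOWED_CONTRACTS
    ):
        raise ValueError(f"unsupported contract type: {contract!r}")

    min_salary = data.get("min_salary")
    if min_salary is not None and (
        isinstance(min_salary, bool)  # bool is an int subclass, so exclude it
        or not isinstance(min_salary, int)
        or min_salary < 0
    ):
        raise ValueError("min_salary must be a non-negative integer")

    return JobFilters(keywords.strip(), city, remote, contract, min_salary)
```

In this sketch, a query such as "remote backend jobs in Madrid" would only reach the search backend if the model's reply parses into a valid `JobFilters` value. Anything else is rejected, and the caller can then retry or fall back to plain keyword search.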
However, Ana emphasized that GenAI integration is far from trivial, challenging the notion that it is merely a matter of crafting a good prompt and making an API call. Infojobs’ key learnings start with the need for a holistic design approach that ensures user value and trust, using transparent UIs and controlled human supervision to mitigate risks around bias, discrimination, and privacy. The company also advocates avoiding unnecessary complexity: while autonomous agents are intriguing, Infojobs often found that more controlled, workflow-based implementations offered better cost-efficiency, lower latency, and greater operational control for features like natural language search. Finally, Ana highlighted the paramount importance of evaluation (EVALs) methodologies. Because GenAI outputs are probabilistic and non-deterministic, traditional software testing falls short. Teams instead need statistical metrics to assess abstract qualities like relevance, tone, and the absence of discrimination, often using larger models to judge output quality.
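The statistical side of EVALs can be illustrated with a small sketch. Instead of asserting on a single output, it scores a sample of outputs with a judge and reports the pass rate together with a Wilson score confidence interval. The choice of the Wilson interval, the `evaluate` helper, and the toy `no_age_requirement` judge are assumptions made for illustration, not Infojobs' method. In practice the judge could be a call to a larger model, as the talk suggests.

```python
import math
from typing import Callable, Dict, Sequence, Tuple


def wilson_interval(passes: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a pass rate (z=1.96 gives roughly 95% confidence)."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= passes <= total:
        raise ValueError("passes must be between 0 and total")
    p = passes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def evaluate(outputs: Sequence[str], judge: Callable[[str], bool]) -> Dict[str, float]:
    """Score every output with the judge and summarise the results statistically."""
    verdicts = [bool(judge(text)) for text in outputs]
    passes = sum(verdicts)
    low, high = wilson_interval(passes, len(verdicts))
    return {"pass_rate": passes / len(verdicts), "ci_low": low, "ci_high": high}


def no_age_requirement(text: str) -> bool:
    # Toy stand-in for a judge (hypothetical): flags one discriminatory phrase.
    # A real setup might ask a larger model to grade the text instead.
    return "under 30" not in text.lower()


if __name__ == "__main__":
    sample = [
        "We are looking for a backend developer with Python experience.",
        "Candidates must be under 30 and full of energy.",
        "Join our data team as an analyst; remote work is possible.",
        "Seeking a nurse for night shifts in Barcelona.",
    ]
    print(evaluate(sample, no_age_requirement))
```

With only a handful of samples the interval is wide, which is the point of reporting it: a release gate can compare a quality threshold against the lower bound (`ci_low`) rather than the raw pass rate, so a lucky small sample does not pass by chance.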