AI Evaluation Breakthrough: Trismik Secures £2.2M to Revolutionize LLM Testing

September 29, 2025, 3:36 pm
Trismik, a Cambridge spin-out, has emerged from stealth with £2.2 million in Pre-Seed funding. The company applies psychometric methods, the same ones behind human IQ testing, to the evaluation of large language models. Its adaptive testing approach targets benchmark saturation and aims to deliver faster, more scalable, and statistically precise model assessments, which Trismik estimates could cut evaluation costs by up to 95 percent. The company addresses the need for AI trust, safety, and regulatory compliance, and plans to launch an enterprise solution in early 2026.

The rapid advancement of artificial intelligence presents a paradox: AI models grow increasingly powerful, while the methods used to evaluate them struggle to keep pace. Traditional benchmarks such as MMLU and GSM8K are saturating, with many leading models now scoring above 90 percent. Once scores cluster near the ceiling, these generic benchmarks offer diminishing returns and obscure real differences in model capability. This creates a hidden bottleneck for AI development. Teams face mounting pressure and need reliable, efficient evaluation. Without it, shipping AI with confidence becomes impossible.

Cambridge spin-out Trismik addresses this critical void. The company recently secured £2.2 million in Pre-Seed funding. This capital infusion propels its mission. Trismik emerged from stealth, introducing a science-backed approach. It aims to redefine how large language models (LLMs) are measured. Its platform promises a new era of precise, scalable AI evaluation.

Trismik's innovation draws from an unexpected domain: psychometrics, the science that underpins human IQ testing. The company applies Item Response Theory (IRT) and Computerized Adaptive Testing (CAT) to LLM evaluation. IRT models the probability of a correct answer as a function of the test-taker's ability and each question's difficulty. CAT uses that model to choose the next question based on the answers given so far, adjusting difficulty dynamically. Trismik mirrors this process for AI models: the system adapts questions in real time as it learns from a model's responses. This quickly isolates a model's actual proficiency and moves beyond the limitations of static, one-size-fits-all tests.
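To make the idea of adaptive testing concrete, here is a minimal Python sketch of a generic CAT loop built on the two-parameter logistic (2PL) IRT model. It is an illustration of the textbook technique, not Trismik's implementation: the function names, the item bank of (discrimination, difficulty) pairs, the grid-search ability estimate, and the simulated "model" are all assumptions made for this example.

```python
import math
import random


def p_correct(theta, a, b):
    """2PL IRT: probability of a correct answer for ability theta,
    item discrimination a, and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information a 2PL item provides at ability theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def estimate_theta(responses, grid):
    """Grid-search ability estimate (posterior mode, standard normal prior).
    responses is a list of (a, b, correct) tuples."""
    best_theta, best_score = grid[0], -math.inf
    for theta in grid:
        score = -0.5 * theta * theta  # log of the N(0, 1) prior, up to a constant
        for a, b, correct in responses:
            p = p_correct(theta, a, b)
            score += math.log(p if correct else 1.0 - p)
        if score > best_score:
            best_theta, best_score = theta, score
    return best_theta


def adaptive_test(answer_fn, item_bank, max_items=20):
    """Run a simple CAT: repeatedly ask the unused item that is most
    informative at the current ability estimate, then re-estimate.
    answer_fn(index) must return True if the item was answered correctly."""
    grid = [i / 10.0 for i in range(-40, 41)]  # candidate abilities in [-4, 4]
    theta = 0.0
    remaining = list(range(len(item_bank)))
    responses = []
    for _ in range(min(max_items, len(item_bank))):
        idx = max(remaining, key=lambda i: item_information(theta, *item_bank[i]))
        remaining.remove(idx)
        a, b = item_bank[idx]
        responses.append((a, b, bool(answer_fn(idx))))
        theta = estimate_theta(responses, grid)
    return theta, len(responses)


def demo():
    """Simulate a test-taker with a hypothetical true ability of 1.0."""
    rng = random.Random(0)
    bank = [(rng.uniform(0.8, 2.0), rng.uniform(-3.0, 3.0)) for _ in range(500)]
    true_theta = 1.0

    def simulated_model(i):
        a, b = bank[i]
        return rng.random() < p_correct(true_theta, a, b)

    theta_hat, used = adaptive_test(simulated_model, bank, max_items=40)
    print(f"estimated ability {theta_hat:.1f} using {used} of {len(bank)} items")


if __name__ == "__main__":
    demo()
```

The design choice that makes the test adaptive is the item-selection step: questions far too easy or far too hard for the current estimate carry little information, so the loop skips them and spends its budget near the test-taker's level.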

This adaptive testing paradigm yields substantial efficiency gains: it delivers near-identical accuracy rankings with dramatically fewer questions. In early results, adaptive tests matched traditional evaluation rankings, with Spearman correlations consistently exceeding 0.96, while requiring only 8.5 percent of test items. That is a 91.5 percent reduction in testing volume. Such efficiency has significant financial implications. Trismik estimates evaluation costs could fall by up to 95 percent. AI teams often spend six figures monthly on GPU compute for model assessment, so reducing this overhead frees resources, accelerates development cycles, and enables more frequent, thorough testing.
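For readers unfamiliar with the metric, the following Python sketch shows how a Spearman rank correlation between a full-benchmark ranking and an adaptive-test ranking could be computed, along with the item-reduction arithmetic. The scores for the seven models are invented placeholders chosen only to illustrate the calculation; they are not Trismik's results, and the helper names are hypothetical.

```python
import math


def ranks(values):
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def item_reduction(items_used, items_total):
    """Fraction of test items saved relative to the full benchmark."""
    return 1.0 - items_used / items_total


# Placeholder scores for seven hypothetical models (NOT real results).
full_benchmark = [0.91, 0.88, 0.84, 0.79, 0.75, 0.70, 0.62]
adaptive_test_scores = [0.90, 0.86, 0.87, 0.78, 0.74, 0.71, 0.60]

rho = spearman(full_benchmark, adaptive_test_scores)
print(f"Spearman rho = {rho:.3f}")  # one adjacent swap among 7 models: 0.964
print(f"item reduction = {item_reduction(85, 1000):.1%}")  # 8.5% used: 91.5%
```

In this toy case the two rankings differ by a single swap of adjacent models, which already brings the correlation down to about 0.964. A correlation above 0.96 across seven models therefore implies the adaptive ranking is almost identical to the full one.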

Trismik's leadership team brings unique expertise. Professor Nigel Collier is Chief Scientific Officer. He is a distinguished NLP researcher at Cambridge University. Collier has authored over 200 papers in NLP and AI. His current focus is on building measurable, explainable, and trustworthy AI systems. CEO Rebekka Mikkola is a seasoned repeat founder. She possesses extensive experience in enterprise AI sales. CTO Marco Basaldella, formerly an Amazon scientist, rounds out the leadership. This blend of academic rigor and industry acumen provides Trismik with a strong foundation.

The £2.2 million Pre-Seed funding round reflects strong investor belief. Twinpath Ventures led the investment. Cambridge Enterprise Ventures, Parkwalk Advisors, Fund F, Vento Ventures, and angel network Ventures Together also participated. This capital will accelerate the launch of Trismik's adaptive AI evaluation platform. It supports ongoing development and market expansion.

The timing for Trismik's solution is critical. The global AI landscape is evolving rapidly. New regulatory frameworks are on the horizon. The EU AI Act is a prime example. Sector-specific compliance regimes will follow. These mandates demand precise, transparent, and verifiable AI performance. AI development cycles continue to accelerate. Teams must deploy models faster. They must also ensure models are safe, aligned, and effective. Current generic benchmarks fail to meet these complex demands. They do not reflect proprietary data distributions. They falter on domain-specific tasks. Furthermore, traditional evaluations are static. They offer no mechanism to adapt as models improve or objectives shift. Trismik's dynamic, scientific approach addresses these multifaceted challenges directly. It builds trust at scale. It offers continuous, reliable validation.

Trismik is now rolling out its LLM evaluation platform. It targets AI builders seeking superior assessment tools. The product supports both classical and adaptive testing. Its capabilities span factuality, alignment, reasoning, safety, and domain knowledge. A lightweight interface facilitates rapid experimentation. Early access to the platform is currently available. Adaptive testing capabilities have already been validated across seven models and five benchmark datasets. The company plans to release further technical results and case studies later this year. Enterprise users will begin onboarding towards the end of 2025. A comprehensive enterprise solution is slated for launch in early 2026. Trismik envisions its platform evolving into a broader environment. It will support LLM experimentation end-to-end. This includes fine-tuning, prompt engineering, compliance tracking, and performance visualization. Trismik is set to fundamentally transform how AI capabilities are measured and trusted. It paves the way for a more robust, responsible AI future.