"TIKOS™ spots neural network weaknesses before they fail" >>>

Practice post: Tikos spots neural network weaknesses before they fail (the Iris dataset)

See how Tikos identifies model weakness on a toy example, and how you can apply the approach to any deep-learning architecture, including LLMs, for a wide range of use cases. The project is available on GitHub, specifically this Jupyter notebook.

A glance at this chart tells you the results on the left are notable. The box plots show the variance in causality traces generated by Tikos as different models classify flower types from the Iris dataset. So, what are we learning?

These models will likely be robust and reliable on unseen data for classifying species 1 & 2 but will be brittle when classifying species 0 and may well break down in production. Good to know.

This example demonstrates a sub-component of the Tikos Reasoning Platform: ‘Synapses Logger’ performs proprietary computations over input, weight, and activation information at the individual-neuron level. Its outputs (causality traces) are best suited to understanding correlations between internal model operations and model outputs. Other Tikos components address causality and explainability; they will be the focus of follow-on articles.
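
Since the Synapses Logger computations are proprietary, here is a minimal, hypothetical analogue of the idea for readers who want to experiment: capture per-neuron activations on a small Iris classifier with PyTorch forward hooks, reduce each sample to a toy trace statistic (an L2 norm, our assumption, not the Tikos computation), and box-plot its spread per class.

```python
# Illustrative sketch only: the model, the hook capture, and the "trace"
# statistic below are assumptions for demonstration, not Tikos internals.
import torch
import torch.nn as nn
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
X = torch.tensor(X, dtype=torch.float32)
y = torch.tensor(y)

model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 3))

captured = {}  # layer name -> activations from the most recent forward pass

def make_hook(name):
    def hook(module, inputs, output):
        captured[name] = output.detach()
    return hook

for name, module in model.named_modules():
    if isinstance(module, nn.Linear):
        module.register_forward_hook(make_hook(name))

# Short training loop so the recorded activations reflect a fitted model.
opt = torch.optim.Adam(model.parameters(), lr=0.05)
loss_fn = nn.CrossEntropyLoss()
for _ in range(200):
    opt.zero_grad()
    loss_fn(model(X), y).backward()
    opt.step()

with torch.no_grad():
    model(X)  # populates `captured` for all 150 samples

# Toy per-sample "trace" statistic: L2 norm of the hidden layer's activations.
trace = captured["0"].norm(dim=1)

plt.boxplot([trace[y == c].numpy() for c in (0, 1, 2)])
plt.xticks([1, 2, 3], ["species 0", "species 1", "species 2"])
plt.ylabel("toy trace statistic")
plt.title("Per-class spread of an internal trace statistic")
plt.show()
```

Read the result the same way as the chart above: a tight box for a class suggests consistent internal processing on that class, while a wide box suggests the kind of brittleness flagged for species 0.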

Because Tikos, and this approach, can be applied to any DL architecture where internals can be accessed, there are potential applications throughout the model lifecycle and for different stakeholders (see below).

This is timely because new regulations and expectations (the EU AI Act, ISO/IEC 42001, NIST AI RMF, and sectoral regulators such as the FCA and MHRA) require teams building AI-enabled systems and applications to improve their technical AI governance approaches, particularly around transparency and explainability.

How can this approach help?

  1. Model Development
    – Debug and analyse errors, identify exact internal elements and processing steps that lead to failures, rather than relying on trial/error.
    – Refine and optimise model architecture, identify and prune redundant elements that don’t contribute to model performance.
    – Improve feature engineering, identify which features are most influential or if the model is latching onto spurious correlations.
    – Compare models, understand why one model or architecture performs differently from another.
  2. Trust, Safety & Responsibility
    – Detect bias, identify where and how biases (gender, race, age, etc.) are encoded and propagated.
    – Identify and mitigate harmful outputs (especially for LLMs), understand the internal pathways that lead to harmful outputs; particularly helpful for internal circuit-breaking/refusal training.
    – Assess vulnerability, understand why a model is susceptible to adversarial attack by examining which internal features or computations could be (or are being) exploited.
    – Predict failures, identify potential failure modes in critical systems (autonomous vehicles, medical diagnosis, etc.) by understanding internal processes and how models might behave in unseen/edge cases.
    – Detect attribution (especially for LLMs), investigate if/how a model has memorised specific training data, and trace outputs back to source.
  3. Governance, Risk & Compliance (GRC)
    – Explain model outputs, generate primary explanations (not interpretations) for individual model decisions to satisfy GDPR or EU AI Act requirements.
    – Audit models/accountability, examine models’ internal workings, not just their inputs and outputs, and establish accountability for model behaviour by understanding underlying mechanisms.
    – Complete comprehensive risk assessments, identify a broader range of AI-related risks (operational, ethical, reputational, financial) by understanding how models function internally and where potential weaknesses or unintended consequences might arise.
    – Protect (model) IP, detect model theft or unauthorised replication by comparing internal ‘fingerprints’/unique mechanistic properties.
  4. Security & Operations
    – Tighten cyber-security — data poisoning, observe and track for anomalous patterns triggered by specific, malicious inputs.
    – Tighten cyber-security — evasion/inversion attacks, explore how attackers might craft inputs to evade detection or infer sensitive training data by analysing a model’s internal sensitivity to input manipulations.
    – Monitor data and concept drift, changes in internal activation distributions, feature importance, or decision pathways can signal that the model is starting to drift (a minimal sketch follows this list).
    – Confidently respond to incidents, when a model fails, analyse causality traces to complete root cause analysis at the individual output level.
    – Retrain/fine-tune models efficiently, understand which parts of a model have degraded or need updating, enabling more targeted, efficient retraining or fine-tuning (rather than retraining from scratch).
  5. LLM Applications (open weight models)
    – Debug hallucinations, investigate the internal mechanisms that lead to hallucinations (we will publish a case study on this shortly).
    – Advance alignment, help develop techniques to align LLM behaviour with human values or specific instructions (reducing bias, improving instruction following, etc.) by understanding how these concepts are represented and enforced internally.
    – Optimise prompt engineering, understand how different prompts can influence an LLM’s internal state and output, leading to more effective and reliable prompt designs.
    – Detect jailbreaks, examine internal mechanisms that allow LLMs to be tricked into bypassing safety guidelines – and develop more robust defences.
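
To make the drift-monitoring item above concrete, here is a hedged sketch, not a Tikos API: compare the distribution of a neuron's activations between a reference window captured at deployment and a live window, and raise an alarm when a two-sample Kolmogorov-Smirnov test rejects equality. The windows below are synthetic stand-ins for captured trace data.

```python
# Hedged sketch of activation-distribution drift monitoring; the windows are
# synthetic and the alpha threshold is an illustrative choice.
import numpy as np
from scipy.stats import ks_2samp

def drift_alarm(reference: np.ndarray, live: np.ndarray, alpha: float = 0.01) -> bool:
    """Return True if a neuron's live activation distribution has drifted.

    reference, live: 1-D arrays of one neuron's activations over a window.
    """
    result = ks_2samp(reference, live)
    return result.pvalue < alpha

# Usage: per-neuron check over two capture windows.
rng = np.random.default_rng(0)
ref_window = rng.normal(0.0, 1.0, size=5000)   # activations at deployment time
live_window = rng.normal(0.4, 1.0, size=5000)  # activations this week (shifted)
print(drift_alarm(ref_window, live_window))    # True: the distribution has moved
```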

Summary

One problem Tikos addresses is the “black box” nature of neural networks, which makes it hard to understand why these models make certain decisions and to predict when they might fall short of expected performance across accuracy, robustness, fairness, etc.

Tikos addresses this fundamental problem by providing a technical solution that offers deep internal visibility into model operations. Specifically, the platform generates ‘causality traces’ to identify weaknesses, explain outputs, and therefore help achieve compliance with increasing AI governance regulations and stakeholder expectations.

What is Tikos?

Corporate site
Technical documentation

Tikos is a technical solution that helps with model transparency and explainability and works with all model classes, developer frameworks, tooling, and deployment infrastructures. We adopt a regulations-first approach: Tikos is designed to solve the hard questions of AI compliance in regulated sectors.

The system creates two data assets for each ML/AI model at inference.

‘Cases’ are proprietary data structures (1), including input, activation, and weight-path information for deep neural networks (2), optimised through information minimisation and serialised for efficient case indexing, searching, retrieval, matching, and adaptation (3). This process delivers log-level monitoring and observability (4) for individual decision outputs from any model.
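
The Case format itself is proprietary, so the following is purely an illustrative assumption of what a minimal schema carrying the fields named above might look like, with naive rounding standing in for information minimisation and JSON for serialisation:

```python
# Hypothetical Case schema; the field names and minimise() helper are
# assumptions for illustration, not the proprietary Tikos structure.
import json
from dataclasses import dataclass, asdict

@dataclass
class Case:
    case_id: str
    inputs: list[float]       # model input for one decision
    activations: list[float]  # reduced per-neuron activation record
    weight_path: list[int]    # indices of the most influential units
    output: int               # the decision this Case describes

def minimise(values: list[float], precision: int = 3) -> list[float]:
    """Crude information minimisation: round away precision we don't need."""
    return [round(v, precision) for v in values]

case = Case(
    case_id="iris-0042",
    inputs=minimise([5.1, 3.5, 1.4, 0.2]),
    activations=minimise([0.8134, 0.0207, 0.6651]),
    weight_path=[3, 7, 1],
    output=0,
)
serialised = json.dumps(asdict(case))  # ready for indexing, search, retrieval
print(serialised)
```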

‘Context’ extends system capabilities from transparency to explainability. Model features are combined with relevant domain information and represented in a knowledge graph, or other datastore (5). Matched or adapted Cases relating to individual model output decisions are then explained using the Context (6).
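
Again as a hedged illustration only, the Context step might resemble the following: a small knowledge graph linking influential model features to domain concepts, used to turn a matched Case into a readable explanation. The graph contents and the explain() helper are assumptions, not the Tikos implementation.

```python
# Illustrative Context sketch using networkx; the relations and explain()
# helper are hypothetical examples of combining Cases with domain knowledge.
import networkx as nx

context = nx.DiGraph()
context.add_edge("petal_length", "species 0", relation="strongly separates")
context.add_edge("petal_width", "species 0", relation="strongly separates")
context.add_edge("sepal_width", "species 1", relation="weakly separates")

def explain(case_features: list[str], decision: str) -> str:
    """Combine a matched Case's influential features with domain Context."""
    facts = [
        f"{feature} {context.edges[feature, decision]['relation']} {decision}"
        for feature in case_features
        if context.has_edge(feature, decision)
    ]
    return f"Decision '{decision}' supported by: " + "; ".join(facts)

print(explain(["petal_length", "petal_width"], "species 0"))
```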
