People use hashes of biometric data embedded in identifiers (digital documents) so that the verifier can re-acquire the biometric and (hopefully in a privacy-preserving way) verify that the person standing in front of the camera is a) live and b) the same as the one in the document - this is so 20th century
Why not use a behaviour that actually relates to the verifier's domain requirements - say they want to check this person is allowed to drive, perhaps for renting a car - so they could measure the person's driving style, which could also be stored in the digital driving license shard of their identity. This could be very robust - it gradually improves in uniqueness - and it also stops people re-linking across different domains when the same features/behaviours are used for multiple roles (face being a common example) - and we can incorporate increasingly robust defenses against GenAI building deepfake people (via deepfake 3D faces or voices).
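To make that concrete, here is a minimal sketch of what domain-specific behavioural verification could look like: compare a freshly measured driving-style feature vector against the template stored in the driving license shard. The feature names, scales and threshold are all illustrative assumptions, not a real scheme.

```python
import numpy as np

# Illustrative per-feature scales (assumed population spreads) so the
# features are comparable before measuring distance:
# mean speed, braking harshness, steering smoothness, lane-keeping error, following gap.
FEATURE_SCALES = np.array([5.0, 0.2, 0.3, 0.1, 1.0])

def verify_driving_style(measured, template, threshold=1.5):
    """Accept if the scaled distance between the measured driving style and
    the stored template is below a threshold (all numbers illustrative)."""
    z = (np.asarray(measured) - np.asarray(template)) / FEATURE_SCALES
    return float(np.linalg.norm(z)) <= threshold

template = np.array([13.2, 0.31, 0.82, 0.12, 2.4])  # captured at onboarding
measured = np.array([12.8, 0.35, 0.79, 0.15, 2.2])  # measured on the rental test drive
print(verify_driving_style(measured, template))     # True for this example
```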
The verifier would then be an agentic AI which basically has two things: a measurement model (classic ML) and a physics model (of what people can do) - so now we have Ф-ML-Id...
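A minimal sketch of that two-component verifier, assuming we can treat each component as a callable scoring function: a measurement model that scores how well the observation matches the enrolled identity, and a physics model that scores whether the observed behaviour is physically plausible for a human at all. Class and method names are illustrative, not an existing API.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class PhiMLVerifier:
    measurement_model: Callable[[Any, Any], float]  # score of observation vs enrolled identity
    physics_model: Callable[[Any], float]            # plausibility under a model of what people can do
    match_threshold: float = 0.8
    plausibility_threshold: float = 0.5

    def verify(self, observation: Any, enrolled_template: Any) -> bool:
        # Reject outright if the physics model says no human could have produced
        # this observation (e.g. impossible facial dynamics) - a deepfake defence.
        if self.physics_model(observation) < self.plausibility_threshold:
            return False
        return self.measurement_model(observation, enrolled_template) >= self.match_threshold
```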
[lots of good background on Ф-M from the Turing's interest group]
The more boring application is in multimodal biometric systems that use face+voice+lip-sync on given phrases to work out who someone is and that they are alive - same thing - the physics model predicts face movement and voice audio given text input, then verifies against the previously onboarded voice & Ф-M 3D face model.
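A minimal sketch of that challenge-phrase check, assuming the onboarded profile contains (a) a physics-based 3D face model that can predict lip/jaw motion for a given phrase and (b) a voice model that can predict the expected audio features. The model outputs here are stand-in arrays and simple correlations, not real speech or face-animation code.

```python
import numpy as np

def correlation(a: np.ndarray, b: np.ndarray) -> float:
    a, b = a - a.mean(), b - b.mean()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def verify_phrase(predicted_visemes: np.ndarray, observed_visemes: np.ndarray,
                  predicted_audio: np.ndarray, observed_audio: np.ndarray,
                  threshold: float = 0.85) -> bool:
    """Accept only if both the lip motion and the audio track the physics
    model's prediction for the challenge phrase - a cross-modal consistency
    check that a naive deepfake is unlikely to satisfy."""
    lip_score = correlation(predicted_visemes, observed_visemes)
    audio_score = correlation(predicted_audio, observed_audio)
    return min(lip_score, audio_score) >= threshold
```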
In facial reconstruction (e.g. taking Oliver Cromwell's skull and building up a picture of what he actually looked like in person) there's a model of how flesh covers bone - this includes models of the way the stuff we are made of folds/hangs/ages etc - and these models are built from first principles. Unlike deep learning systems trained on lots of labelled data to classify faces and extract features (eyes, nose, mouth etc) purely based on statistics and luck, these physics models are correct because that's the way the universe works. You can build hybrid, so-called Ф-ML systems, which give you the benefits of both the statistical approach and the explainability of the physics model - the cool recent work also shows that the physics model lets you reduce the amount of data necessary for the statistical model, sometimes by 2 or 3 orders of magnitude, while retaining the accuracy and recall of the stats model.
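One common way to build such a hybrid, sketched below, is to make the training loss the usual data-fit term plus a physics-residual term that penalises predictions the first-principles model says are impossible. The physics term acts as a prior, which is what lets the statistical part get away with far less labelled data. The residual function here is a placeholder, not a real soft-tissue or ageing model.

```python
import numpy as np

def physics_residual(params: np.ndarray, x: np.ndarray) -> np.ndarray:
    # Placeholder constraint: predictions must stay non-negative
    # (a stand-in for "flesh thickness over bone cannot be negative").
    preds = x @ params
    return np.minimum(preds, 0.0)

def phi_ml_loss(params: np.ndarray, x: np.ndarray, y: np.ndarray,
                lam: float = 10.0) -> float:
    data_term = float(np.mean((x @ params - y) ** 2))                 # statistical fit to labelled data
    physics_term = float(np.mean(physics_residual(params, x) ** 2))   # penalty for physically impossible output
    return data_term + lam * physics_term
```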
In the world of biometric ID, application requirements (many use cases involve a human waiting for verification of some attribute) mean you want fast, accurate and efficient models that can run on affordable devices in tolerable time.
You also want to be future-proof against deepfakes, and against adversaries capable of complex system-wide attacks.
I claim that having this underlying real-world explanatory component, alongside its statistically acquired twin, will be more resilient, and might even let you cope with things like kids, ageing and other changes, as well as allowing you to verify attributes other than the standard biometrics of face, fingerprint, iris etc, in robust ways which provide better domain-specific data minimisation.