In many areas of AI research, we often talk about bias, hallucination, or mode collapse as if they were local failures—bugs in data, flaws in algorithms, or mistakes in optimization. But what if these phenomena are better understood as emergent properties of complex systems, rather than isolated errors?

Insights from physics—particularly the spin-locking effect—offer a useful lens for rethinking how structure, stability, and bias emerge from large-scale interactions, and why purely local fixes often fall short.

From Random Motion to Stable Structure

In statistical physics, spin-locking refers to a phenomenon where systems composed of many randomly interacting elements nevertheless exhibit stable, ordered macroscopic behavior.

Recent work on the Brownian spin-locking effect shows that even when particles undergo stochastic motion in disordered media, repeated interactions can produce persistent correlations between spin and motion direction [1]. What is striking is that this order is not explicitly imposed—it emerges through averaging, coupling, and repetition.

At the microscopic level, the system is noisy and unpredictable. At the macroscopic level, however, the noise cancels out, leaving behind a surprisingly simple and stable structure.
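
To see how little structure is needed for this, consider a toy simulation (a deliberately simplified stand-in, not the model analyzed in [1]): each particle carries a fixed spin and takes steps dominated by noise, plus a tiny spin-dependent drift. The particle count, step count, and coupling strength below are arbitrary illustrative choices.

    # Toy illustration only -- a simplified stand-in for the idea in [1],
    # not the model analyzed there. Each particle carries a fixed spin and
    # takes steps dominated by noise, with a tiny spin-dependent drift.
    import numpy as np

    rng = np.random.default_rng(0)
    n_particles, n_steps, coupling = 10_000, 1_000, 0.01     # illustrative values

    spins = rng.choice([-1, 1], size=n_particles)            # fixed internal state
    noise = rng.normal(0.0, 1.0, size=(n_steps, n_particles))
    steps = noise + coupling * spins                         # per step: noise std 1.0 vs drift 0.01
    displacement = steps.sum(axis=0)                         # accumulated over many steps

    for s in (-1, +1):
        print(f"spin {s:+d}: mean displacement ~ {displacement[spins == s].mean():+.1f}")
    # Prints roughly -10.0 and +10.0: a persistent spin-motion correlation
    # appears in the aggregate even though no single step reveals it.

The per-step signal-to-noise ratio here is about one percent, yet the aggregated statistic is stable and reproducible; that asymmetry between scales is the whole point.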

This idea should sound familiar to anyone working with large-scale AI systems.

AI Systems as Statistical Machines

Modern AI—especially large language models—operates in a regime that is deeply statistical. Individual token predictions are uncertain; single interactions are noisy. Yet over massive datasets and repeated training cycles, models converge toward highly stable behavioral patterns.

This stability is often celebrated as robustness—but it also explains why biases can be stubbornly persistent.

Recent analyses of post-training alignment show that diversity loss and mode collapse are not solely algorithmic issues. Instead, they are strongly driven by typicality bias in human preference data, where annotators systematically favor familiar, conventional outputs [2]. Over time, training objectives amplify these preferences, locking models into narrow response modes.

In other words, alignment procedures do not merely shape behavior—they lock it.
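
A similarly minimal sketch shows how this locking can arise from reweighting alone. Suppose, purely for illustration (this is not the method or data of [2]), that annotator preference leans toward whatever the model already treats as typical, so each optimization round effectively raises the policy's probabilities to a power greater than one and renormalizes.

    # Hypothetical dynamics, not the procedure or data of [2]: assume annotator
    # preference correlates with how typical a response already is, so each
    # round of preference-weighted updating sharpens the policy.
    import numpy as np

    rng = np.random.default_rng(0)
    p = rng.dirichlet(np.ones(10) * 50)    # 10 response modes, nearly uniform at the start
    beta = 0.3                             # illustrative strength of the typicality bias

    def entropy(q):
        return float(-(q * np.log(q)).sum())

    for round_ in range(21):
        if round_ % 5 == 0:
            print(f"round {round_:2d}: entropy = {entropy(p):.2f}, top mode = {p.max():.2f}")
        p = p ** (1.0 + beta)              # favor already-typical modes...
        p /= p.sum()                       # ...and renormalize
    # Entropy starts near ln(10) ~ 2.3 and falls toward 0 as the distribution
    # locks onto a single "safe" response mode.

Nothing here requires a flawed optimizer; a consistent, mild preference for the familiar is enough to erase diversity.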

Bias Is Not a Bug—It’s a System Property

This framing challenges a core misconception: bias is not something that appears only after model training, nor is it confined to datasets.

Bias can emerge at every stage of the Human–AI pipeline:

  • Use cases define which outcomes matter—and which failures are tolerated.
  • Design choices allocate control between humans and machines.
  • Data collection and labeling encode human values long before training begins.
  • Pre-training and post-training reinforce statistical regularities.
  • Evaluation metrics reward confidence over calibrated uncertainty.
  • Monitoring practices determine which deviations are noticed—and which are ignored.

Seen through the lens of spin-locking, these stages act as repeated interactions that gradually stabilize particular system behaviors, even when those behaviors are undesirable.

Hallucination and Overconfidence

Recent discussions around hallucination point to a deeper misalignment: standard training and evaluation pipelines reward guessing over acknowledging uncertainty [3]. Much like students graded on multiple-choice exams, language models learn that producing some answer scores better than admitting uncertainty.

This is not a flaw of intelligence—it is a consequence of what the system is optimized to do. From this perspective, hallucination resembles an emergent artifact of statistical pressure, not a defect in reasoning capacity.
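
The arithmetic behind that pressure is easy to write down. The sketch below assumes a generic grading scheme (an illustrative assumption, not any benchmark's actual rubric): one point for a correct answer, zero for a wrong answer or an abstention, with an optional penalty for confident errors.

    # Minimal sketch of the incentive described in [3]; the grading scheme and
    # the numbers are illustrative assumptions, not any benchmark's actual rules.
    def expected_score(p_correct: float, abstain: bool, wrong_penalty: float = 0.0) -> float:
        """Expected grade for one question: +1 if correct, -wrong_penalty if wrong, 0 if abstaining."""
        if abstain:
            return 0.0
        return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty

    p = 0.10  # the model is only 10% sure of its answer
    print(expected_score(p, abstain=False))                      #  0.1   -> guessing is rewarded
    print(expected_score(p, abstain=True))                       #  0.0   -> admitting uncertainty scores worse
    print(expected_score(p, abstain=False, wrong_penalty=0.25))  # -0.125 -> penalizing confident errors
                                                                 #           makes abstention rational

Under the default scheme, guessing dominates abstention at any nonzero confidence; only an explicit cost for being confidently wrong makes saying "I don't know" the rational choice.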

Human–AI Interaction Is a Socio-Technical System

This is why Human–AI Interaction cannot be reduced to UI design. It is a system-level problem spanning:

use case → design → data → model → test → monitoring

Human involvement that is too strong leads to inefficiency; involvement that is too weak leads to loss of control. The challenge is not whether humans should be in the loop, but how participation is structured over time.

As researchers in human-centered AI have repeatedly shown, trust, adoption, and productivity depend less on raw model performance and more on whether systems align with human judgment, workflows, and values [4].

Why This Matters for Research

Spin-locking teaches us a humbling lesson: once a system’s macroscopic behavior stabilizes, changing it becomes difficult—not because we lack better algorithms, but because the structure itself resists change.

For AI research, this suggests a shift in focus:

  • from optimizing isolated components
  • to understanding how long-term interaction patterns shape system behavior

True progress may come not from stronger models, but from rethinking how incentives, data, evaluation, and human participation interact over time.

Closing Thought

Order does not always signal understanding. Sometimes, it is simply what remains after uncertainty has been averaged away.

If we want AI systems that are not just stable, but meaningfully aligned, we must learn to see bias, confidence, and failure not as local anomalies—but as emergent properties of the systems we build.

References

[1] Zhang, X. et al. Brownian Spin-Locking Effect. arXiv:2412.00879, 2024. https://arxiv.org/abs/2412.00879

[2] Zhang, J., Yu, S., Chong, D., Sicilia, A., Tomz, M., Manning, C., Shi, W. Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity. arXiv, 2025.

[3] OpenAI. Why Language Models Hallucinate. OpenAI Research Blog, 2024.

[4] Geyer, W. Human-Centered and Responsible AI: Reflections from IBM Research. LinkedIn, 2024.