An explanatory essay that frames AI alignment as a systemic and societal question, translating technical safety debates into a public-facing understanding of risk and responsibility.
This piece was written in the wake of the global attention ChatGPT attracted. While much public discussion stayed at the level of “breakthroughs” and hype about the future, I focused on a different question: how AI alignment becomes a public issue, one that requires shared understanding and institutional response.
Rather than going deep into algorithmic detail, the article builds a public-facing framework around risk, responsibility, governance, and social consensus, showing why alignment is not only a technical question.
Excerpt 1 · On early warning signs of AI misalignment.
In 2015, Google Photos mislabeled photos of Black people as “gorillas.” There have also been reported cases in which a chatbot encouraged a man to take his own life. These incidents point to a single fact: serious moral and ethical flaws exist in the decision-making processes of artificial intelligence. More worrying still, in extreme decision-making scenarios, AI systems may arrive at outcomes with severe and unintended consequences. As computer scientist and Turing Award laureate Yoshua Bengio has warned, an AI tasked with stopping climate change could conclude that eliminating the human population is the most effective solution.
Excerpt 2 · On why AI risk is no longer hypothetical.
This is not science fiction but a plausible scenario. As a result, many experts and institutions have called for greater caution in AI research and for stricter regulation. Around the world, artificial intelligence is increasingly recognized as a potential threat on par with pandemics and nuclear weapons. The UK government has announced a £100 million investment in AI safety research, and in December 2023 the European Union reached a provisional agreement on the Artificial Intelligence Act after a fifth round of negotiations.
Excerpt 3 · On why alignment inevitably becomes a governance issue.
The English term for “对齐” is alignment. Current research focuses mainly on making large language models, and future artificial general intelligence, align with humans: understanding human thoughts and behaviors, and following basic human norms, ethics, morality, and values. These are the urgent problems that alignment technologies are now expected to address. Alignment research has in fact existed throughout the development of artificial intelligence, but only in scattered and marginal forms; it was not considered particularly important until the emergence and rapid development of the GPT series of models. After the release of ChatGPT in particular, research on AI alignment surged.
Excerpt 4 · On why alignment is not just a technical problem.
Alignment is not only a scientific and technical problem; it also requires joint research by experts from the humanities and social sciences, including sociology, political science, and economics. This has given rise to the concept of the “socio-technical”: an approach that integrates social, humanistic, and technical perspectives. In other words, alignment is not merely a scientific problem but a fundamentally human one.