National Taiwan University team unveils AI tool to block violent and fake content
TAIPEI (Taiwan News) — An electrical engineering team at National Taiwan University has built “Receler,” a tool that teaches AI to forget harmful concepts such as violence, deepfakes, and copyright infringement, CNA reported Wednesday.
The team, led by NTU Professor Wang Yu-chiang (王鈺強), developed a concept-erasing method to block AI from generating harmful content without retraining an entire model.
The project was funded by the National Science and Technology Council, and the results were unveiled at the 2024 European Conference on Computer Vision.
The developers describe Receler as a concept-erasing tool that helps AI forget or block concepts such as violence, nudity, or the imitation of art styles, without retraining the whole system. It removes only the targeted concepts while still allowing normal, creative images.
Wang said AI is useful but can easily cross legal lines. He cited ChatGPT generating Ghibli-style images and deepfake tools swapping the faces of politicians and celebrities, sometimes in sexual content, as examples of how AI can breach ethics and copyright.
He said keyword filters often fail to catch everything. Receler instead uses adversarial learning and cross-attention mechanisms to remove specific high-risk concepts from a model while preserving its creative ability.
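For readers with a programming background, the toy sketch below illustrates the general style of concept erasure described here: a cross-attention layer is fine-tuned so that a harmful text embedding produces the same output as a neutral prompt, while a preservation term keeps unrelated concepts intact. This is a minimal illustration under assumed names (ToyCrossAttention, DIM, the embeddings and loss terms are all stand-ins), not Receler's actual implementation, and it omits the adversarial-learning component the article mentions.

```python
# Toy illustration of concept erasure on a cross-attention layer.
# NOT Receler's code: all names, sizes, and losses are assumptions.
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)
DIM = 64  # toy embedding size (assumption)

class ToyCrossAttention(nn.Module):
    """Minimal single-head cross-attention between image and text features."""
    def __init__(self, dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, image_feats: torch.Tensor, text_emb: torch.Tensor) -> torch.Tensor:
        q = self.q(image_feats)                       # (batch, image tokens, dim)
        k = self.k(text_emb)                          # (batch, text tokens, dim)
        v = self.v(text_emb)
        attn = torch.softmax(q @ k.transpose(-2, -1) / DIM ** 0.5, dim=-1)
        return attn @ v

# A frozen copy keeps the original behaviour as a reference; the other copy is tuned.
original = ToyCrossAttention(DIM)
erased = copy.deepcopy(original)
for p in original.parameters():
    p.requires_grad_(False)

# Stand-ins for text-encoder outputs (4 prompt tokens each, purely illustrative).
harmful_emb = torch.randn(1, 4, DIM)   # e.g. embedding of a violent prompt
neutral_emb = torch.randn(1, 4, DIM)   # embedding of an empty / safe prompt
benign_emb = torch.randn(1, 4, DIM)    # an unrelated creative concept to keep

optimizer = torch.optim.Adam(erased.parameters(), lr=1e-3)

for step in range(200):
    image_feats = torch.randn(1, 16, DIM)  # random latent image tokens

    # Erasure loss: the tuned layer should respond to the harmful concept
    # the way the frozen layer responds to the neutral prompt.
    erase_loss = nn.functional.mse_loss(
        erased(image_feats, harmful_emb),
        original(image_feats, neutral_emb),
    )
    # Preservation loss: behaviour on benign concepts should not drift.
    keep_loss = nn.functional.mse_loss(
        erased(image_feats, benign_emb),
        original(image_feats, benign_emb),
    )
    loss = erase_loss + keep_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(f"final erase loss {erase_loss.item():.4f}, keep loss {keep_loss.item():.4f}")
```

The real method, as the article notes, additionally relies on adversarial learning to make the erasure harder to circumvent; that step is left out of the toy example for brevity.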
NSTC Director-General Hong Le-wen (洪樂文) said the new tool could prevent AI from producing harmful or illegal results. He added that the open-source release has already been widely used in models available online, showing a sizable impact on AI safety.