HK Businesswire

A new way to test how well AI systems classify text

by David Lee
13 August 2025
in Science

Is this movie review a rave or a pan? Is this news story about business or technology? Is this online chatbot conversation veering off into giving financial advice? Is this online medical information site giving out misinformation?

These kinds of automated conversations, whether they involve seeking a movie or restaurant review or getting information about your bank account or health records, are becoming increasingly prevalent. More than ever, such evaluations are being made by highly sophisticated algorithms, known as text classifiers, rather than by human beings. But how can we tell how accurate these classifications really are?

Now, a team at MIT’s Laboratory for Information and Decision Systems (LIDS) has come up with an innovative approach to not only measure how well these classifiers are doing their job, but then go one step further and show how to make them more accurate.

The new evaluation and remediation software was developed by Kalyan Veeramachaneni, a principal research scientist at LIDS, his students Lei Xu and Sarah Alnegheimish, and two others. The software package is being made freely available for download by anyone who wants to use it.

A standard method for testing these classification systems is to create what are known as synthetic examples: sentences that closely resemble ones that have already been classified. For example, researchers might take a sentence that has already been tagged by a classifier program as being a rave review, and see if changing a word or a few words while retaining the same meaning could fool the classifier into deeming it a pan. Or a sentence that was determined to be misinformation might get misclassified as accurate. This ability to fool the classifiers makes these adversarial examples.

People have tried various ways to find the vulnerabilities in these classifiers, Veeramachaneni says. But existing methods of finding these vulnerabilities have a hard time with this task and miss many examples that they should catch, he says.

Increasingly, companies are trying to use such evaluation tools in real time, monitoring the output of chatbots used for various purposes to try to make sure they are not putting out improper responses. For example, a bank might use a chatbot to respond to routine customer queries such as checking account balances or applying for a credit card, but it wants to ensure that its responses could never be interpreted as financial advice, which could expose the company to liability. “Before showing the chatbot’s response to the end user, they want to use the text classifier to detect whether it’s giving financial advice or not,” Veeramachaneni says. But then it’s important to test that classifier to see how reliable its evaluations are.

“These chatbots, or summarization engines or whatnot are being set up across the board,” he says, to deal with external customers and within an organization as well, for example providing information about HR issues. It’s important to put these text classifiers into the loop to detect things that they are not supposed to say, and filter those out before the output gets transmitted to the user.

That’s where the use of adversarial examples comes in: those sentences that have already been classified but then produce a different response when they are slightly modified while retaining the same meaning. How can people confirm that the meaning is the same? By using another large language model (LLM) that interprets and compares meanings. So, if the LLM says the two sentences mean the same thing, but the classifier labels them differently, “that is a sentence that is adversarial: it can fool the classifier,” Veeramachaneni says. And when the researchers examined these adversarial sentences, “we found that most of the time, this was just a one-word change,” although the people using LLMs to generate these alternate sentences often didn’t realize that.

Further investigation, using LLMs to analyze many thousands of examples, showed that certain specific words had an outsized influence in changing the classifications, and therefore the testing of a classifier’s accuracy could focus on this small subset of words that seem to make the most difference. They found that one-tenth of 1 percent of all the 30,000 words in the system’s vocabulary could account for almost half of all these reversals of classification, in some specific applications.

Lei Xu PhD ’23, a recent graduate from LIDS who performed much of the analysis as part of his thesis work, “used a lot of interesting estimation techniques to figure out what are the most powerful words that can change the overall classification, that can fool the classifier,” Veeramachaneni says. The goal is to make it possible to do much more narrowly targeted searches, rather than combing through all possible word substitutions, thus making the computational task of generating adversarial examples much more manageable. “He’s using large language models, interestingly enough, as a way to understand the power of a single word.”

Then, also using LLMs, he searches for other words that are closely related to these powerful words, and so on, allowing for an overall ranking of words according to their influence on the outcomes. Once these adversarial sentences have been found, they can be used in turn to retrain the classifier to take them into account, increasing the robustness of the classifier against those mistakes.

Making classifiers more accurate may not sound like a big deal if it’s just a matter of classifying news articles into categories, or deciding whether reviews of anything from movies to restaurants are positive or negative. But increasingly, classifiers are being used in settings where the outcomes really do matter, whether preventing the inadvertent release of sensitive medical, financial, or security information, or helping to guide important research, such as into properties of chemical compounds or the folding of proteins for biomedical applications, or in identifying and blocking hate speech or known misinformation.

As a result of this research, the team introduced a new metric, which they call p, which provides a measure of how robust a given classifier is against single-word attacks. And because of the importance of such misclassifications, the research team has made its products available as open access for anyone to use. The package consists of two components: SP-Attack, which generates adversarial sentences to test classifiers in any particular application, and SP-Defense, which aims to improve the robustness of the classifier by generating and using adversarial sentences to retrain the model.

In some tests, where competing methods of testing classifier outputs allowed a 66 percent success rate by adversarial attacks, this team’s system cut that attack success rate almost in half, to 33.7 percent. In other applications, the improvement was as little as a 2 percent difference, but even that can be quite important, Veeramachaneni says, since these systems are being used for so many billions of interactions that even a small percentage can affect millions of transactions.

The team’s results were published on July 7 in the journal Expert Systems in a paper by Xu, Veeramachaneni, and Alnegheimish of LIDS, along with Laure Berti-Equille at IRD in Marseille, France, and Alfredo Cuesta-Infante at the Universidad Rey Juan Carlos, in Spain.
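
For readers who want a concrete picture of a single-word attack, the idea described above can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the team's SP-Attack code: the keyword-counting classifier, the word lists, and the SYNONYMS table (which stands in for the LLM check that two sentences mean the same thing) are all hypothetical. The sketch swaps one word at a time, records which swaps flip the label, and reports the share of sentences that could be fooled.

```python
from collections import Counter

# Hypothetical toy classifier: counts positive vs. negative cue words.
POSITIVE = {"great", "wonderful", "superb"}
NEGATIVE = {"awful", "dull", "boring"}

def classify(sentence: str) -> str:
    words = sentence.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "rave" if score > 0 else "pan"

# Hypothetical stand-in for the LLM meaning check: each listed swap is
# simply assumed to preserve the meaning of the sentence.
SYNONYMS = {
    "great": ["terrific"],
    "wonderful": ["superb"],
    "dull": ["tedious"],
    "film": ["movie"],
}

def single_word_attacks(sentence: str):
    """Yield (replacement, new_sentence) for one-word swaps that flip the label."""
    words = sentence.lower().split()
    original = classify(sentence)
    for i, word in enumerate(words):
        for sub in SYNONYMS.get(word, []):
            candidate = " ".join(words[:i] + [sub] + words[i + 1:])
            if classify(candidate) != original:
                yield sub, candidate

def attack_report(sentences):
    """Return (share of sentences fooled, count of flips per replacement word)."""
    flips = Counter()
    fooled = 0
    for s in sentences:
        hits = list(single_word_attacks(s))
        if hits:
            fooled += 1
        flips.update(sub for sub, _ in hits)
    return fooled / len(sentences), flips

sentences = [
    "a great film",
    "a wonderful film",
    "a dull film",
    "a great and wonderful film",
    "great acting throughout",
]
rate, flips = attack_report(sentences)
print(f"attack success rate: {rate:.0%}")
print("most influential replacements:", flips.most_common())
```

In this toy setup a single replacement word accounts for every flip, a small-scale echo of the finding that a tiny share of the vocabulary can drive a large share of misclassifications. Retraining the classifier on the flipped sentences, the step the article attributes to SP-Defense, is not shown here.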

Tags: Science
© 2025 by HKBusinesswire.com
