HK Businesswire

3 Questions: How to help students recognize potential bias in their AI datasets

by David Lee
2 June 2025
in Science

Every year, thousands of students take courses that teach them how to deploy artificial intelligence models that can help doctors diagnose disease and determine appropriate treatments. However, many of these courses omit a key element: training students to detect flaws in the training data used to develop the models.

Leo Anthony Celi, a senior research scientist at MIT’s Institute for Medical Engineering and Science, a physician at Beth Israel Deaconess Medical Center, and an associate professor at Harvard Medical School, has documented these shortcomings in a new paper and hopes to persuade course developers to teach students to more thoroughly evaluate their data before incorporating it into their models. Many previous studies have found that models trained mostly on clinical data from white males don’t work well when applied to people from other groups. Here, Celi describes the impact of such bias and how educators might address it in their teaching about AI models.

Q: How does bias get into these datasets, and how can these shortcomings be addressed?

A: Any problems in the data will be baked into any modeling of the data. In the past we have described instruments and devices that don’t work well across individuals. As one example, we found that pulse oximeters overestimate oxygen levels for people of color, because there weren’t enough people of color enrolled in the clinical trials of the devices. We remind our students that medical devices and equipment are optimized on healthy young males. They were never optimized for an 80-year-old woman with heart failure, and yet we use them for those purposes. And the FDA does not require that a device work well on a population as diverse as the one we will be using it on. All they need is proof that it works on healthy subjects.

Additionally, the electronic health record system is in no shape to be used as the building blocks of AI.
Those records were not designed to be a learning system, and for that reason, you have to be really careful about using electronic health records. The electronic health record system is to be replaced, but that’s not going to happen anytime soon, so we need to be smarter. We need to be more creative about using the data that we have now, no matter how bad they are, in building algorithms.

One promising avenue that we are exploring is the development of a transformer model of numeric electronic health record data, including but not limited to laboratory test results. Modeling the underlying relationship between the laboratory tests, the vital signs and the treatments can mitigate the effect of missing data as a result of social determinants of health and provider implicit biases.

Q: Why is it important for courses in AI to cover the sources of potential bias? What did you find when you analyzed such courses’ content?

A: Our course at MIT started in 2016, and at some point we realized that we were encouraging people to race to build models that are overfitted to some statistical measure of model performance, when in fact the data that we’re using is rife with problems that people are not aware of. At that time, we were wondering: How common is this problem?

Our suspicion was that if you looked at the courses where the syllabus is available online, or the online courses, that none of them even bothers to tell the students that they should be paranoid about the data. And true enough, when we looked at the different online courses, it’s all about building the model. How do you build the model? How do you visualize the data? We found that of 11 courses we reviewed, only five included sections on bias in datasets, and only two contained any significant discussion of bias.

That said, we cannot discount the value of these courses.
I’ve heard lots of stories where people self-study based on these online courses, but at the same time, given how influential they are, how impactful they are, we need to really double down on requiring them to teach the right skillsets, as more and more people are drawn to this AI multiverse. It’s important for people to really equip themselves with the agency to be able to work with AI. We’re hoping that this paper will shine a spotlight on this huge gap in the way we teach AI now to our students.

Q: What kind of content should course developers be incorporating?

A: One, giving them a checklist of questions in the beginning. Where did this data come from? Who were the observers? Who were the doctors and nurses who collected the data? And then learn a little bit about the landscape of those institutions. If it’s an ICU database, they need to ask who makes it to the ICU, and who doesn’t make it to the ICU, because that already introduces a sampling selection bias. If all the minority patients don’t even get admitted to the ICU because they cannot reach the ICU in time, then the models are not going to work for them. Truly, to me, 50 percent of the course content should really be understanding the data, if not more, because the modeling itself is easy once you understand the data.

Since 2014, the MIT Critical Data consortium has been organizing datathons (data “hackathons”) around the world. At these gatherings, doctors, nurses, other health care workers, and data scientists get together to comb through databases and try to examine health and disease in the local context. Textbooks and journal papers present diseases based on observations and trials involving a narrow demographic typically from countries with resources for research. Our main objective now, what we want to teach them, is critical thinking skills.
And the main ingredient for critical thinking is bringing together people with different backgrounds.

You cannot teach critical thinking in a room full of CEOs or in a room full of doctors. The environment is just not there. When we have datathons, we don’t even have to teach them how to do critical thinking. As soon as you bring the right mix of people, and it’s not just coming from different backgrounds but from different generations, you don’t even have to tell them how to think critically. It just happens. The environment is right for that kind of thinking. So, we now tell our participants and our students, please, please do not start building any model unless you truly understand how the data came about, which patients made it into the database, what devices were used to measure, and are those devices consistently accurate across individuals?

When we have events around the world, we encourage them to look for data sets that are local, so that they are relevant. There’s resistance because they know that they will discover how bad their data sets are. We say that that’s fine. This is how you fix that. If you don’t know how bad they are, you’re going to continue collecting them in a very bad manner and they’re useless. You have to acknowledge that you’re not going to get it right the first time, and that’s perfectly fine. MIMIC (the Medical Information Mart for Intensive Care database built at Beth Israel Deaconess Medical Center) took a decade before we had a decent schema, and we only have a decent schema because people were telling us how bad MIMIC was.

We may not have the answers to all of these questions, but we can evoke something in people that helps them realize that there are so many problems in the data. I’m always thrilled to look at the blog posts from people who attended a datathon, who say that their world has changed.
Now they’re more excited about the field because they realize the immense potential, but also the immense risk of harm if they don’t do this correctly.
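
As a rough illustration of two of the questions Celi raises (which patients made it into the database, and whether a device reads consistently across individuals), the sketch below shows one way a student might check a dataset before modeling. It is not taken from Celi's paper or course. The function names, the field names (`group`, `device`, `reference`), the assumed population shares, and the records themselves are all hypothetical and made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def representation_gap(records, reference_shares, key="group"):
    """Dataset share of each group minus its share in a reference population.

    A negative value means the group is under-represented in the dataset.
    """
    if not records:
        raise ValueError("records must not be empty")
    counts = defaultdict(int)
    for row in records:
        counts[row[key]] += 1
    total = len(records)
    return {
        group: counts[group] / total - expected
        for group, expected in reference_shares.items()
    }


def mean_device_error(records, key="group"):
    """Mean of (device reading - reference measurement) for each group.

    A positive value means the device overestimates for that group.
    """
    errors = defaultdict(list)
    for row in records:
        errors[row[key]].append(row["device"] - row["reference"])
    return {group: mean(values) for group, values in errors.items()}


# Toy, made-up records for illustration only; not real patient data.
records = [
    {"group": "A", "device": 97, "reference": 97},
    {"group": "A", "device": 95, "reference": 96},
    {"group": "A", "device": 98, "reference": 97},
    {"group": "B", "device": 96, "reference": 93},
]
reference_shares = {"A": 0.5, "B": 0.5}  # assumed shares in the population served

gaps = representation_gap(records, reference_shares)
errors = mean_device_error(records)

for group in reference_shares:
    print(group, "representation gap:", gaps[group], "mean device error:", errors[group])
```

In this toy data, group B is both under-represented and measured with a positive device error, which is the kind of pattern worth flagging before any model is built. A real audit would need far more records per group, and the grouping variables would have to come from the local context.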

Tags: Science

© 2025 by HKBusinesswire.com
