
Avoiding the Shortcuts of Artificial Intelligence in Order to Make More Reliable Predictions

Taking a shortcut may allow your Uber driver to get you to your destination more quickly. If a machine learning model, on the other hand, takes a shortcut, it may fail in unexpected ways.

A shortcut solution in machine learning means the model bases its decisions on a single superficial characteristic of a dataset rather than learning the true nature of the data as a whole, which can result in inaccurate predictions. When learning to recognize cows in images, for example, a model may concentrate on the green grass that typically appears in the photographs rather than on the more complex shapes and patterns of the cows themselves.

This study, conducted by researchers at the Massachusetts Institute of Technology (MIT), examines the issue of shortcuts in a popular machine-learning method and suggests a solution that prevents shortcuts by forcing the model to make more data-driven decisions.

By removing the simpler features the model relies on, the researchers force it to consider more complex features of the data that it had previously neglected. They then ask the model to complete its task twice: once using the simpler features, and again using the complex features it has now learned to recognize. This decreases the model's proclivity for shortcut solutions while simultaneously improving its performance.

The findings could improve machine learning models used to detect disease in medical images, a setting where shortcuts may lead to incorrect diagnoses with serious consequences for patients.

"It is still unclear why deep networks make the decisions that they do, and more specifically, which parts of the data these networks choose to focus on when making a decision remains a mystery to researchers. If we gain a better understanding of how shortcuts work, we can go even further in answering some fundamental but very practical questions that are critical for those attempting to deploy these networks," says lead author Joshua Robinson, a PhD student in the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT.

Robinson's co-authors include Suvrit Sra, the Esther and Harold E. Edgerton Career Development Associate Professor in the Department of Electrical Engineering and Computer Science (EECS) and a member of the Institute for Data, Systems, and Society (IDSS) and the Laboratory for Information and Decision Systems; and Stefanie Jegelka, the X-Consortium Career Development Associate Professor in EECS and a member of CSAIL and IDSS. The findings will be presented at the Conference on Neural Information Processing Systems in December.

The researchers concentrated on contrastive learning, a form of self-supervised machine learning that has proven extremely effective. In self-supervised learning, unlike traditional supervised learning, a model is trained on unlabeled data, so the method can be applied successfully to a much wider range of data.

Self-supervised learning models acquire useful representations of data, which are then used as inputs for other tasks such as image classification. If the model takes shortcuts and omits information that is critical to those tasks, they will not have access to it.

Suppose a self-supervised learning model is trained to classify pneumonia in X-rays from a variety of hospitals. If it learns to make predictions based on which hospital a scan came from (because some hospitals have a higher prevalence of pneumonia cases than others), the model will perform poorly when given data from a new hospital.
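The hospital shortcut can be illustrated with a toy simulation (the data, rates, and function names below are hypothetical, chosen only to make the failure mode concrete):

```python
import random

# Hypothetical scans: each is a (hospital, has_pneumonia) pair. In the
# training data, hospital "A" has a high pneumonia rate and hospital "B"
# a low one, so a model that keys on the hospital alone looks accurate
# during training but fails on a new hospital where the correlation breaks.
random.seed(0)

def make_scans(hospital, pneumonia_rate, n=2000):
    return [(hospital, random.random() < pneumonia_rate) for _ in range(n)]

def shortcut_model(hospital):
    # Shortcut: predict pneumonia purely from the hospital identity.
    return hospital == "A"

def accuracy(scans):
    return sum(shortcut_model(h) == label for h, label in scans) / len(scans)

training_scans = make_scans("A", 0.9) + make_scans("B", 0.1)
new_hospital_scans = make_scans("C", 0.9)  # high prevalence, unseen hospital

# Accuracy looks high on the training hospitals but collapses on the new one.
```

The point is that the shortcut feature (hospital identity) is genuinely predictive on the training distribution, which is exactly why the model is tempted to rely on it.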

Contrastive learning models work by training an encoder to discriminate between pairs of similar and dissimilar inputs. Through this process, complex and rich data such as images are encoded in a way the model can work with.
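The pair-discrimination objective described here is commonly implemented as the InfoNCE loss. Below is a minimal NumPy sketch for a single anchor embedding (the function name and temperature value are illustrative, not taken from the paper):

```python
import numpy as np

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor embedding.

    Training pushes the anchor's similarity to its positive (a similar
    input) up, and its similarity to each negative (a dissimilar input)
    down. All arguments are embedding vectors produced by the encoder.
    """
    unit = lambda v: v / np.linalg.norm(v)
    anchor, positive = unit(anchor), unit(positive)
    negatives = np.stack([unit(n) for n in negatives])

    # Similarities, with the positive in position 0.
    logits = np.concatenate([[anchor @ positive], negatives @ anchor]) / temperature
    # Cross-entropy treating the positive as the "correct class".
    return float(-logits[0] + np.log(np.sum(np.exp(logits))))

# An encoder that separates similar from dissimilar pairs gets a low loss:
easy = info_nce_loss(np.array([1.0, 0.0]),    # anchor
                     np.array([0.9, 0.1]),    # positive close to anchor
                     [np.array([0.0, 1.0])])  # negative far from anchor

# If the embeddings confuse the pair roles, the loss is much higher:
hard = info_nce_loss(np.array([1.0, 0.0]),
                     np.array([0.0, 1.0]),
                     [np.array([0.9, 0.1])])
```

The encoder minimizes this loss over many such triplets, and any feature that cheaply separates positives from negatives (a shortcut) will do.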

The researchers evaluated contrastive learning encoders on a series of images and found them susceptible to shortcut solutions during training: the encoders frequently concentrate on the simplest features of an image to determine which pairs of inputs are similar and which are dissimilar. Ideally, says Jegelka, the encoder should take into account all of the data's useful characteristics when making a decision.

Consequently, the team made it more difficult to distinguish between similar and dissimilar pairs, and discovered that doing so changed which features the encoder considered when making a decision.

According to Jegelka, by increasing the difficulty of distinguishing between similar and dissimilar items, the system is compelled to learn more meaningful information from the data, because it cannot complete the task otherwise.

When the difficulty was increased further, however, a trade-off occurred: the encoder became more adept at focusing on some features of the data while becoming less adept at focusing on others. It almost appeared to forget the simpler features, Robinson observes.

To avoid this trade-off, the researchers had the encoder discriminate between the pairs twice: once as it did initially, using the simpler features, and again after those simpler features were removed. Completing the task both ways simultaneously improved the encoder's performance across the board.

Using a technique known as implicit feature modification, the researchers adaptively modify samples to remove the simpler features the encoder uses to distinguish between the pairs. The technique requires no human input, which Sra notes is critical because real-world datasets can contain hundreds of distinct features that combine in complex ways.
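A rough sketch of how such a modification can look (an assumed simplification, not the paper's exact formulation): in embedding space, shift each negative toward the anchor and the positive away from it, which erases whatever easy direction currently separates them, then average the contrastive loss over the original and perturbed pairs so that neither the simple nor the harder features are discarded:

```python
import numpy as np

def info_nce(anchor, positive, negatives, temperature=0.1):
    # Standard InfoNCE loss for one anchor (see earlier in the article).
    logits = np.concatenate([[anchor @ positive], negatives @ anchor]) / temperature
    return float(-logits[0] + np.log(np.sum(np.exp(logits))))

def ifm_style_loss(anchor, positive, negatives, eps=0.5, temperature=0.1):
    """Illustrative (assumed) version of implicit feature modification.

    Perturbing the negatives toward the anchor and the positive away
    from it removes the easiest separating feature; averaging with the
    unperturbed loss makes the encoder solve the task both ways at once.
    """
    harder_positive = positive - eps * anchor
    harder_negatives = negatives + eps * anchor
    return 0.5 * (info_nce(anchor, positive, negatives, temperature)
                  + info_nce(anchor, harder_positive, harder_negatives, temperature))

anchor = np.array([1.0, 0.0])
positive = np.array([0.9, 0.1])
negatives = np.stack([np.array([0.0, 1.0])])

plain = info_nce(anchor, positive, negatives)
harder = ifm_style_loss(anchor, positive, negatives)
```

With a unit-norm anchor the perturbed term is always at least as hard as the original, so the combined loss is strictly higher; in the actual method the perturbation is computed adaptively during training rather than with the fixed budget used here.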

The researchers tested the method using images of vehicles, implicitly modifying color, orientation, and vehicle type to make it harder for the encoder to distinguish between similar and dissimilar pairs of images. The encoder's accuracy improved simultaneously for all three features.

To determine whether the method would hold up to more complex data, the researchers also tested it on samples from a medical image database of chronic obstructive pulmonary disease (COPD). Once again, the method yielded simultaneous improvements across all of the features evaluated.

The researchers point out that while this work advances our understanding of the causes of shortcut solutions and how to address them, refining these methods and applying them to other types of self-supervised learning will be critical for future advances.

"This work addresses some of the most fundamental questions about deep learning systems, such as 'Why do they fail?' and 'Can we predict in advance when a model will fail?' There is still a great deal to learn about shortcut learning in its entirety," Robinson says.


