Analytical Review of Task Allocation Methods for Human and AI Model Collaboration


Abstract

In many practical scenarios, decision-making by an AI model alone is undesirable or even impossible, and the AI model is only one part of a complex decision-making process that also involves a human expert. Nevertheless, this fact is often overlooked when creating and training AI models: the model is trained to make decisions independently, which is not always optimal. The paper presents a review of methods that take the joint work of an AI model and a human expert into account when designing (in particular, training) AI systems. This corresponds more closely to how the model is applied in practice, increases the accuracy of decisions made by the “human – AI model” system, and allows other important parameters of the system (e.g., human workload) to be controlled explicitly. The review analyzes the current literature on this topic in three main areas: 1) scenarios of interaction between a human and an AI model, and formal problem statements for improving the efficiency of the “human – AI model” system; 2) methods for ensuring the efficient operation of the “human – AI model” system; 3) ways to assess the quality of human – AI model collaboration. Conclusions are drawn regarding the advantages, disadvantages, and conditions of applicability of the methods, and the main problems of existing approaches are identified. The review can be useful for a wide range of researchers and specialists involved in applying AI for decision support.
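To make the setting concrete, the following minimal sketch (not taken from the paper) illustrates the simplest allocation scheme considered in this literature: the AI model answers the cases where it is confident and defers low-confidence cases to a human expert, trading team accuracy against human workload. The synthetic data, the logistic-regression model, the simulated expert with 90% accuracy, and the threshold values are all hypothetical choices made purely for illustration.

# Minimal illustrative sketch: confidence-threshold deferral in a "human – AI model" team.
# Everything below (data, model, simulated expert, thresholds) is hypothetical.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic binary task; in practice this would be the real decision problem.
X, y = make_classification(n_samples=4000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
proba = model.predict_proba(X_test)
ai_pred = proba.argmax(axis=1)          # AI decision on each case
confidence = proba.max(axis=1)          # AI confidence in that decision

# Hypothetical expert: correct 90% of the time, independently of the input.
expert_pred = np.where(rng.random(len(y_test)) < 0.9, y_test, 1 - y_test)

for tau in (0.5, 0.7, 0.9):
    defer = confidence < tau                            # low-confidence cases go to the human
    team_pred = np.where(defer, expert_pred, ai_pred)   # combined team decision
    print(f"tau={tau:.1f}  team accuracy={np.mean(team_pred == y_test):.3f}  "
          f"human workload={defer.mean():.2f}")

Raising the threshold sends more cases to the expert, so the workload figure makes the accuracy-versus-effort trade-off explicit; the methods surveyed in the review go further and learn the allocation policy jointly with the model.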

About the authors

A. Ponomarev

St. Petersburg Federal Research Center of the Russian Academy of Sciences (SPC RAS)

Email: ponomarev@iias.spb.su
14th Line V.O., 39

A. Agafonov

St. Petersburg Federal Research Center of the Russian Academy of Sciences (SPC RAS)

Email: agafonov.a@spcras.ru
14th Line V.O., 39

References

  1. Wilder B., Horvitz E., Kamar E. Learning to Complement Humans // IJCAI’20: Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence. 2020. pp. 1526–1533.
  2. Madras D., Pitassi T., Zemel R. Predict Responsibly: Improving Fairness and Accuracy by Learning to Defer // Proceedings of the 32nd Conference on Neural Information Processing Systems (NeurIPS 2018). 2018. pp. 6150–6160.
  3. Chow C.K. On Optimum Recognition Error and Reject Tradeoff // IEEE Trans. Inf. Theory. 1970. vol. 16. no. 1. pp. 41–46.
  4. Cortes C., DeSalvo G., Mohri M. Learning with rejection // Algorithmic Learning Theory (ALT 2016). Lecture Notes in Computer Science. 2016. vol. 9925. pp. 67–82.
  5. Алексеев А., Носков Ф., Панов М. Непараметрическая регрессия с возможностью отказа от предсказания // ИТиС 2022. Институт проблем передачи информации им. А.А. Харкевича РАН (Москва), 2022. С. 215–226.
  6. Lyons J.B., Sycara K., Lewis M., Capiola A. Human–Autonomy Teaming: Definitions, Debates, and Directions // Frontiers in Psychology. 2021. vol. 12. doi: 10.3389/fpsyg.2021.589585.
  7. Shively R.J., Lachter J., Brandt S.L., Matessa M., Battiste V., Johnson W.W. Why Human-Autonomy Teaming? // Advances in Neuroergonomics and Cognitive Engineering (AHFE 2017). Cham: Springer, 2018. vol. 586. pp. 3–11.
  8. Кильдеева С., Катасёв А., Талипов Н. Модели и методы прогнозирования и распределения заданий по исполнителям в системах электронного документооборота // Вестник Технологического университета. 2021. Т. 24. № 1. С. 79–85.
  9. Hendrickx K., Perini L., Van der Plas D., Meert W., Davis J. Machine learning with a reject option: a survey // Machine Learning. 2024. vol. 113. no. 5. pp. 3073–3110.
  10. Leitão D., Saleiro P. Human-AI Collaboration in Decision-Making: Beyond Learning to Defer // Workshop on Human-Machine Collaboration and Teaming, ICML. 2022.
  11. Zahedi Z., Kambhampati S. Human-AI Symbiosis: A Survey of Current Approaches. arXiv preprint arXiv:2103.09990. 2021. doi: 10.48550/arXiv.2103.09990.
  12. Kitchenham B., Charters S. Guidelines for performing Systematic Literature Reviews in Software Engineering. Keele, Staffs: Kitchenham, 2007. 65 p.
  13. Snyder H. Literature review as a research methodology: An overview and guidelines // Journal of business research. 2019. vol. 104. pp. 333–339.
  14. Mozannar H., Sontag D. Consistent estimators for learning to defer to an expert // 37th International Conference on Machine Learning. 2020. pp. 7076–7087.
  15. Raghu M., Blumer K., Corrado G., Kleinberg J., Obermeyer Z., Mullainathan S. The Algorithmic Automation Problem: Prediction, Triage, and Human Effort. arXiv preprint arXiv:1903.12220. 2019.
  16. Ma S., Le Y., Wang X., Zheng C., Shi C., Yin M., Ma X. Who Should I Trust: AI or Myself? Leveraging Human and AI Correctness Likelihood to Promote Appropriate Trust in AI-Assisted Decision-Making // Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems. New York, USA: ACM, 2023. pp. 1–19. doi: 10.1145/3544548.3581058.
  17. Vodrahalli K., Gerstenberg T., Zou J. Uncalibrated Models Can Improve Human-AI Collaboration // Proceedings of the 36th Conference on Neural Information Processing Systems (NeurIPS 2022). 2022. vol. 35. pp. 4004–4016.
  18. Charusaie M.-A., Mozannar H., Sontag D., Samadi S. Sample Efficient Learning of Predictors that Complement Humans // Proceedings of the 39th International Conference on Machine Learning. 2022. pp. 2972–3005.
  19. Okati N., De A., Gomez-Rodriguez M. Differentiable Learning Under Triage // Advances in Neural Information Processing Systems. 2021. vol. 34. pp. 9140–9151.
  20. Verma R., Nalisnick E. Calibrated Learning to Defer with One-vs-All Classifiers // Proceedings of the 39th International Conference on Machine Learning. 2022. pp. 22184–22202.
  21. Gao R., Saar-Tsechansky M., De-Arteaga M., Han L., Sun W., Lee M.K., Lease M. Learning Complementary Policies for Human-AI Teams. arXiv preprint arXiv:2302.02944. 2023.
  22. Hemmer P., Schellhammer S., Vössing M., Jakubik J., Satzger G. Forming Effective Human-AI Teams: Building Machine Learning Models that Complement the Capabilities of Multiple Experts // Proceedings of the 31st International Joint Conference on Artificial Intelligence (IJCAI-22). 2022. pp. 2478–2484. doi: 10.24963/ijcai.2022/344.
  23. Steyvers M., Tejeda H., Kerrigan G., Smyth P. Bayesian modeling of human–AI complementarity // Proceedings of the National Academy of Sciences of the United States of America. 2022. vol. 119. no. 11. doi: 10.1073/pnas.2111547119.
  24. Lemmer S.J., Corso J.J. Evaluating and Improving Interactions with Hazy Oracles // Proceedings of the AAAI Conference on Artificial Intelligence. 2023. vol. 37. no. 5. pp. 6039–6047.
  25. Alves J.V., Leitão D., Jesus S., Sampaio M., Saleiro P., Figueiredo M., Bizarro P. FiFAR: A Fraud Detection Dataset for Learning to Defer. arXiv preprint arXiv:2312.13218. 2023.
  26. Straitouri E., Singla A., Balazadeh Meresht V., Gomez-Rodriguez M. Reinforcement Learning Under Algorithmic Triage. arXiv preprint arXiv:2109.11328. 2021.
  27. Verma R., Barrejón D., Nalisnick E. Learning to Defer to Multiple Experts: Consistent Surrogate Losses, Confidence Calibration, and Conformal Ensembles // Proceedings of The 26th International Conference on Artificial Intelligence and Statistics. 2023. pp. 11415–11434.
  28. De A., Okati N., Zarezade A., Gomez Rodriguez M. Classification Under Human Assistance // The 35th AAAI Conference on Artificial Intelligence (AAAI-21). 2021. vol. 35(7). pp. 5905–5913.
  29. Liu D.-X., Mu X., Qian C. Human Assisted Learning by Evolutionary Multi-Objective Optimization // Proceedings of the AAAI Conference on Artificial Intelligence. 2023. vol. 37. no. 10. pp. 12453–12461.
  30. Showalter S., Boyd A., Smyth P., Steyvers M. Bayesian Online Learning for Consensus Prediction // Proceedings of The 27th International Conference on Artificial Intelligence and Statistics. 2024. vol. 238. pp. 2539–2547.
  31. Keswani V., Lease M., Kenthapadi K. Towards Unbiased and Accurate Deferral to Multiple Experts // AIES 2021 – Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society. New York, USA: ACM, 2021. pp. 154–165.
  32. Mao A. et al. Two-Stage Learning to Defer with Multiple Experts // NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems. 2023. pp. 3578–3606.
  33. Mao A., Mohri M., Zhong Y. Principled Approaches for Learning to Defer with Multiple Experts // International Symposium on Artificial Intelligence and Mathematics (ISAIM 2024). 2024. pp. 107–135.
  34. Noti G., Chen Y. Learning When to Advise Human Decision Makers // Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence. California: International Joint Conferences on Artificial Intelligence Organization, 2023. pp. 3038–3048.
  35. De A., Koley P., Ganguly N., Gomez-Rodriguez M. Regression under human assistance // Proceedings of the 34th AAAI Conference on Artificial Intelligence. 2020. pp. 2611–2620.
  36. Kobayashi M., Wakabayashi K., Morishima A. Human+AI Crowd Task Assignment Considering Result Quality Requirements // Proceedings of the AAAI Conf. Hum. Comput. Crowdsourcing. 2021. vol. 9. pp. 97–107.
  37. Lai V., Carton S., Bhatnagar R., Liao Q.V., Zhang Y., Tan C. Human-AI Collaboration via Conditional Delegation: A Case Study of Content Moderation // Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems. 2022. pp. 1–18. doi: 10.1145/3491102.3501999.
  38. Gao R., Saar-Tsechansky M., De-Arteaga M., Han L., Lee M.K., Lease M. Human-AI Collaboration with Bandit Feedback // Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI 2021). 2021. pp. 1722–1728.
  39. Narasimhan H., Jitkrittum W., Menon A.K., Rawat A., Kumar S. Post-hoc Estimators for Learning to Defer to an Expert // Advances in Neural Information Processing Systems. 2022. vol. 35. pp. 29292–29304.
  40. Popat R., Ive J. Embracing the uncertainty in human–machine collaboration to support clinical decision-making for mental health conditions // Frontiers in Digital Health. 2023. vol. 5. doi: 10.3389/fdgth.2023.1188338.
  41. Zhang Z., Wells K., Carneiro G. Learning to Complement with Multiple Humans (LECOMH): Integrating Multi-rater and Noisy-Label Learning into Human-AI Collaboration. arXiv preprint arXiv:2311.13172. 2023.
  42. Straitouri E., Wang L., Okati N., Gomez Rodriguez M. Improving Expert Predictions with Conformal Prediction // Proceedings of the 40th International Conference on Machine Learning. 2023. pp. 32633–32653.
  43. Gao R., Yin M. Confounding-Robust Policy Improvement with Human-AI Teams. arXiv preprint arXiv:2310.08824. 2023.
  44. Kerrigan G., Smyth P., Steyvers M. Combining Human Predictions with Model Probabilities via Confusion Matrices and Calibration // Advances in Neural Information Processing Systems. 2021. vol. 34. pp. 4421–4434.
  45. Raman N., Yee M. Improving Learning-to-Defer Algorithms Through Fine-Tuning // 1st Workshop on Human and Machine Decisions (WHMD 2021) at NeurIPS. 2021. 6 p.
  46. Hemmer P., Westphal M., Schemmer M., Vetter S., Vossing M., Satzger G. Human-AI Collaboration: The Effect of AI Delegation on Human Task Performance and Task Satisfaction // Proceedings of the 28th International Conference on Intelligent User Interfaces. New York, NY, USA: ACM, 2023. pp. 453–463.
  47. Gupta S. et al. Take Expert Advice Judiciously: Combining Groupwise Calibrated Model Probabilities with Expert Predictions // ECAI 2023. Front. Artif. Intell. Appl. 2023. vol. 372. pp. 956–963.
  48. Babbar V., Bhatt U., Weller A. On the Utility of Prediction Sets in Human-AI Teams // Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence. California: International Joint Conferences on Artificial Intelligence Organization, 2022. pp. 2457–2463.
  49. Mozannar H., Satyanarayan A., Sontag D. Teaching Humans When To Defer to a Classifier via Exemplars // Proceedings of the 36th AAAI Conf. Artif. Intell (AAAI 2022). 2022. vol. 36(5). pp. 5323–5331.
  50. Singh S., Jain S., Jha S.S. On Subset Selection of Multiple Humans To Improve Human-AI Team Accuracy // Proceedings of the 22nd International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2023). 2023. pp. 317–325.
  51. Bansal G., Nushi B., Kamar E., Horvitz E., Weld D.S. Is the Most Accurate AI the Best Teammate? Optimizing AI for Teamwork // Proceedings of the AAAI Conference on Artificial Intelligence. 2021. vol. 35(13). pp. 11405–11414.
  52. Mozannar H., Lang H., Wei D., Sattigeri P., Das S., Sontag D. Who Should Predict? Exact Algorithms For Learning to Defer to Humans // Proceedings of the 26th International Conference on Artificial Intelligence and Statistics (PMLR 2023). 2023. vol. 206. pp. 10520–10545.
  53. Joshi S., Parbhoo S., Doshi-Velez F. Learning-to-defer for sequential medical decision-making under uncertainty // Transactions on Machine Learning Research. 2023.
  54. Cordella L.P., De Stefano C., Tortorella F., Vento M. A Method for Improving Classification Reliability of Multilayer Perceptrons // IEEE Trans. Neural Networks. 1995. vol. 6. pp. 1140–1147.
  55. De Stefano C., Sansone C., Vento M. To reject or not to reject: that is the question – an answer in case of neural classifiers // IEEE Transactions on Systems, Man, and Cybernetics, Part C. 2000. vol. 30. pp. 84–94.
  56. Gal Y., Ghahramani Z. Dropout as a Bayesian approximation: Representing model uncertainty in deep learning // Proceedings of the 33rd International Conference on International Conference on Machine Learning (ICML 2016). 2016. vol. 48. pp. 1050–1059.
  57. Geifman Y., El-Yaniv R. Selective classification for deep neural networks // Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems. 2017. pp. 4878–4887.
  58. Lakshminarayanan B., Pritzel A., Blundell C. Simple and scalable predictive uncertainty estimation using deep ensembles // Adv. Neural Inf. Process. Syst. 2017. vol. 30. pp. 6403–6414.
  59. Raghu M., Blumer K., Sayres R., Obermeyer Z., Kleinberg R., Mullainathan S., Kleinberg J. Direct Uncertainty Prediction with Applications to Healthcare. 2018. pp. 1–14.
  60. Platt J.C. Using analytic QP and sparseness to speed training of support vector machines // Advances in neural information processing systems. 1999. pp. 557–563.
  61. Cohn D., Atlas L., Ladner R. Improving Generalization with Active Learning // Mach. Learn. 1994. vol. 15. no. 2. pp. 201–221.
  62. Hemmer P., Thede L., Vössing M., Jakubik J., Kühl N. Learning to Defer with Limited Expert Predictions // Proceedings of the 37th AAAI Conf. Artif. Intell. (AAAI 2023). 2023. vol. 37. pp. 6002–6011.
  63. Goh H.W., Tkachenko U., Mueller J. CROWDLAB: Supervised learning to infer consensus labels and quality scores for data with multiple annotators // arXiv preprint arXiv:2210.06812. 2022.
  64. Xiao R., Dong Y., Wang H., Feng L., Wu R., Chen G., Zhao J. ProMix: Combating Label Noise via Maximizing Clean Sample Utility // Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI). 2023. pp. 4442–4450.
  65. Garg A., Nguyen C., Felix R., Do T.-T., Carneiro G. Instance-Dependent Noisy Label Learning via Graphical Modelling // Proceedings of the 2023 IEEE Winter Conf. Appl. Comput. Vision (WACV 2023). 2023. pp. 2287–2297.
  66. Peterson J., Battleday R., Griffiths T., Russakovsky O. Human uncertainty makes classification more robust // Proceedings of the IEEE Int. Conf. Comput. Vis. 2019. pp. 9616–9625. doi: 10.1109/ICCV.2019.00971.
  67. Lintott C.J., Schawinski K., Slosar A., Land K., Bamford S., Thomas D., Raddick D., Nichol R.C., Szalay A.S., Andreescu D., Murray P., Vandenberg J. Galaxy Zoo: Morphologies derived from visual inspection of galaxies from the Sloan Digital Sky Survey // Monthly Notices of the Royal Astronomical Society. 2008. vol. 389. no. 3. pp. 1179–1189.
  68. Kamar E., Hacker S., Horvitz E. Combining human and machine intelligence in large-scale crowdsourcing // Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2012). 2012. vol. 1. pp. 467–474.
  69. Majkowska A., Mittal S., Steiner D.F., Reicher J.J., McKinney S.M., Duggan G.E., Eswaran K., Cameron Chen P.-H., Liu Y., Raju Kalidindi S., Ding A., Corrado G.S., Tse D., Shetty S. Chest radiograph interpretation with deep learning models: Assessment with radiologist-adjudicated reference standards and population-adjusted evaluation // Radiology. 2020. vol. 294. no. 2. pp. 421–431.
  70. Wang X., Peng Y., Lu L., Lu Z., Bagheri M., Summers R. ChestX-ray8: Hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases // Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017. pp. 3462–3471.
  71. Salehi P., Chiou E., Mancenido M., Mosallanezhad A., Cohen M., Shah A. Decision Deferral in a Human-AI Joint Face-Matching Task: Effects on Human Performance and Trust // Proceedings of the Human Factors and Ergonomics Society. 2021. vol. 65. no. 1. pp. 638–642.
  72. Bondi E., Koster R., Sheahan H., Chadwick M., Bachrach Y., Cemgil T., Paquet U., Dvijotham K. Role of Human-AI Interaction in Selective Prediction // Proceedings of the 36th AAAI Conf. Artif. Intell. (AAAI 2022). 2022. vol. 36. pp. 5286–5294.
  73. Collins K., Barker M., Espinosa Zarlenga M., Raman N., Bhatt U., Jamnik M., Sucholutsky I., Weller A., Dvijotham K. Human Uncertainty in Concept-Based AI Systems // AIES 2023: Proc. of the AAAI/ACM Conf. on AI, Ethics, and Society. 2023. pp. 869–889.
  74. Donahue K., Gollapudi S., Kollias K. When Are Two Lists Better Than One?: Benefits and Harms in Joint Decision-Making // Proceedings of the AAAI Conf. Artif. Intell. 2024. vol. 38. no. 9. pp. 10030–10038.
  75. Spitzer P., Holstein J., Hemmer P., Vössing M., Kühl N., Martin D., Satzger G. On the Effect of Contextual Information on Human Delegation Behavior in Human-AI collaboration. arXiv preprint arXiv:2401.04729. 2024.

Supplementary files

1. JATS XML
