Analytical Review of Audiovisual Systems for Determining Personal Protective Equipment on a Person's Face

Abstract

Since 2019, all countries of the world have faced the rapid spread of the pandemic caused by the COVID-19 coronavirus infection, which the world community continues to fight to the present day. Despite the obvious effectiveness of personal respiratory protection equipment against coronavirus infection, many people neglect to wear protective face masks in public places. Therefore, to control and promptly identify violators of public health regulations, it is necessary to apply modern information technologies that can detect protective masks on people's faces using video and audio information. The article presents an analytical review of existing and developing intelligent information technologies for bimodal analysis of the voice and facial characteristics of a masked person. There are many studies on detecting masks in video images, and a significant number of corpora containing images of faces both with and without masks, collected by various methods, are publicly available. Research and development aimed at detecting personal respiratory protection equipment from the acoustic characteristics of human speech is still rather limited, since this direction began to develop only during the pandemic caused by the COVID-19 coronavirus infection. Existing systems help prevent the spread of coronavirus infection by recognizing the presence or absence of masks on the face, and they also assist in the remote diagnosis of COVID-19 by detecting the first symptoms of a viral infection from acoustic characteristics. However, a number of problems in the field of automatic diagnosis of COVID-19 and detection of masks on people's faces remain unresolved to date. First of all, this is the low accuracy of detecting masks and coronavirus infection, which does not allow automatic diagnosis to be performed without the involvement of experts (medical personnel). Many systems are unable to operate in real time, which makes it impossible to control and monitor the wearing of protective masks in public places. In addition, most existing systems cannot be embedded in a smartphone so that users could check for signs of coronavirus infection anywhere. Another major problem is the collection of data from patients infected with COVID-19, since many people do not agree to share confidential information.
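
As a purely illustrative aid to the bimodal setting surveyed here, the sketch below shows one way a late-fusion detector could combine a visual mask classifier with a simple acoustic descriptor of speech. It is not the method of any system reviewed in the article: the file paths, the two-class head on a pretrained MobileNetV2 backbone, the MFCC statistics, the dummy acoustic score, and the score-averaging fusion rule are all assumptions made for illustration.

```python
# Illustrative sketch of bimodal (video + audio) mask detection by late fusion.
# NOT the method of the reviewed systems; paths, labels, and the fusion rule
# are hypothetical assumptions.
import numpy as np
import torch
import torch.nn as nn
import librosa
from PIL import Image
from torchvision import models, transforms

# --- Visual branch: pretrained MobileNetV2 backbone with a 2-class head ------
# (the head is untrained here and would need fine-tuning on a mask corpus).
visual_model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.DEFAULT)
visual_model.classifier[1] = nn.Linear(visual_model.last_channel, 2)
visual_model.eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

def visual_mask_probability(face_crop_path: str) -> float:
    """Probability of 'mask' for a single face crop image (hypothetical path)."""
    image = Image.open(face_crop_path).convert("RGB")
    with torch.no_grad():
        logits = visual_model(preprocess(image).unsqueeze(0))
    return torch.softmax(logits, dim=1)[0, 1].item()

# --- Acoustic branch: MFCC statistics as a simple utterance-level descriptor --
def acoustic_features(wav_path: str) -> np.ndarray:
    """Mean and std of 13 MFCCs; a stand-in for richer paralinguistic features."""
    signal, sr = librosa.load(wav_path, sr=16000)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=13)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

def acoustic_mask_probability(wav_path: str) -> float:
    """Placeholder: a trained classifier (e.g., an SVM) would map features to a score."""
    features = acoustic_features(wav_path)
    return float(1.0 / (1.0 + np.exp(-features.mean())))  # dummy score in (0, 1)

# --- Late fusion: average the two modality scores -----------------------------
def bimodal_mask_probability(face_crop_path: str, wav_path: str) -> float:
    return 0.5 * (visual_mask_probability(face_crop_path)
                  + acoustic_mask_probability(wav_path))

# Example call with hypothetical inputs:
# print(bimodal_mask_probability("face_crop.jpg", "utterance.wav"))
```

In a real system, the visual head would be fine-tuned on a corpus of masked and unmasked face images, the acoustic score would come from a classifier trained on speech recorded with and without a mask (for example, using functionals over larger paralinguistic feature sets), and the fusion weights would be tuned on a validation set.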

About the authors

A. A. Dvoynikova

SPC RAS

Email: dvoynikova.a@iias.spb.su
14th Line V.O. 39

M. V. Markitantov

SPC RAS

Email: m.markitantov@yandex.ru
14th Line V.O. 39

E. V. Ryumina

SPC RAS

Email: ryumina.e@iias.spb.su
14th Line V.O. 39

D. A. Ryumin

SPC RAS

Email: ryumin.d@iias.spb.su
14th Line V.O. 39

A. A. Karpov

SPC RAS

Email: karpov@iias.spb.su
14th Line V.O. 39
