Faster R-CNN based food image retrieval and classification

doi:10.13878/j.cnki.jnuist.2017.06.007

Authors:

MEI Shuhuan (梅舒欢): College of Mathematics and Systems Science, Shandong University of Science and Technology, Qingdao 266590; Key Lab of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190
MIN Weiqing (闵巍庆): Key Lab of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190
LIU Linhu (刘林虎): Key Lab of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049
DUAN Hua (段华): College of Mathematics and Systems Science, Shandong University of Science and Technology, Qingdao 266590
JIANG Shuqiang (蒋树强): Key Lab of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190

Fund Project: National Natural Science Foundation of China (61532018, 61602437, 61672497, 61472229, 61202152); Beijing Municipal Science and Technology Program (D161100001816001); Natural Science Foundation of Shandong Province (ZR2017MF02); Shandong Province Science and Technology Development Program (2016ZDJS02A11, 2014GGX101035, 2014BSB01020)
Abstract:

Automatic understanding of food images has applications in many fields, such as food intake monitoring and food calorie estimation. Research on food-related tasks such as food image retrieval and classification has therefore become a hot topic in multimedia analysis and applications. Existing methods mainly extract visual features from the whole food image. Because of background interference in the images, these features lack robustness, which in turn degrades retrieval and classification performance. To address this problem, we propose a food image retrieval and classification method based on Faster R-CNN (Region-based Convolutional Neural Network). We first detect candidate food regions using Faster R-CNN, and then use a convolutional neural network (CNN) to extract visual features from the detected regions. Because background interference is reduced, the extracted features are more discriminative. Furthermore, we select annotated food images from the Visual Genome dataset to fine-tune Faster R-CNN so that food regions are detected accurately. We conduct experiments on two datasets: Food-101, with 101 classes and 10,641 food images, and Dish-233, with 233 dishes and 49,168 images. Extensive evaluation demonstrates that extracting visual features from food regions detected by Faster R-CNN effectively improves food image retrieval and classification performance.

Key words: food image; image retrieval; image classification; deep learning; Faster R-CNN; convolutional neural network
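
The pipeline summarized in the abstract (detect a food region, describe only that region, then rank by feature similarity) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and not the authors' implementation: the detections are hand-written stand-ins for Faster R-CNN output, an intensity histogram stands in for CNN features, and all function names are hypothetical.

```python
import math

def crop(image, box):
    """Crop a 2-D list image to box = (x0, y0, x1, y1), end-exclusive."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]

def food_region(image, detections):
    """Return the crop of the highest-scoring detection, or the whole
    image when the (hypothetical) detector found nothing."""
    if not detections:
        return image
    box, _score = max(detections, key=lambda d: d[1])
    return crop(image, box)

def histogram_feature(region, bins=4):
    """Placeholder for a CNN descriptor: an L2-normalized histogram of
    pixel intensities in the range 0..255."""
    counts = [0.0] * bins
    for row in region:
        for value in row:
            counts[min(value * bins // 256, bins - 1)] += 1.0
    norm = math.sqrt(sum(c * c for c in counts)) or 1.0
    return [c / norm for c in counts]

def cosine(a, b):
    """Cosine similarity of two vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))

def retrieve(query_feature, gallery):
    """Rank gallery names by descending similarity to the query.
    gallery is a list of (name, feature) pairs."""
    ranked = sorted(gallery, key=lambda g: cosine(query_feature, g[1]),
                    reverse=True)
    return [name for name, _feature in ranked]

def make_image(food, background, size=8):
    """Toy image: a 4x4 'food' patch at (2, 2) on a uniform background."""
    image = [[background] * size for _ in range(size)]
    for y in range(2, 6):
        for x in range(2, 6):
            image[y][x] = food
    return image

# Assumed detector output: (box, score) pairs; the best box covers the food.
detections = [((2, 2, 6, 6), 0.9), ((0, 0, 3, 3), 0.3)]

gallery_images = {"A": make_image(200, 10), "B": make_image(120, 160)}
query_image = make_image(200, 160)  # dish A served on B's background

region_gallery = [(n, histogram_feature(food_region(img, detections)))
                  for n, img in gallery_images.items()]
whole_gallery = [(n, histogram_feature(img))
                 for n, img in gallery_images.items()]

region_ranking = retrieve(
    histogram_feature(food_region(query_image, detections)), region_gallery)
whole_ranking = retrieve(histogram_feature(query_image), whole_gallery)

print(region_ranking, whole_ranking)  # ['A', 'B'] ['B', 'A']
```

In this toy setting the whole-image descriptor is dominated by the background and ranks the wrong dish first, while the region descriptor ranks the matching dish first, which is the effect the abstract attributes to background interference.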

Cite this article

MEI Shuhuan, MIN Weiqing, LIU Linhu, DUAN Hua, JIANG Shuqiang. Faster R-CNN based food image retrieval and classification[J]. Journal of Nanjing University of Information Science & Technology (Natural Science Edition), 2017, 9(6): 635-641


History

Received: 2017-07-28
Published online: 2017-11-25
