• Volume 9, Issue 6, 2017 Table of Contents
    • >Reviews
    • A survey of fine-grained image recognition based on user click data

      2017, 9(6):567-574. DOI: 10.13878/j.cnki.jnuist.2017.06.001 CSTR:

      Abstract: In recent years, fine-grained image recognition has become a hot topic in computer vision. Because the visual differences among image categories are subtle and the semantic gap is severe, traditional image recognition algorithms mostly perform unsatisfactorily on fine-grained images. To overcome these challenges, many researchers have turned to image recognition with user click data. This paper focuses on the three key modules of a fine-grained recognition system based on user click data: data pre-processing, feature extraction, and model construction. Existing algorithms for click-data-based image recognition are summarized, and the latest related progress is presented.

    • Research progress on development and construction of knowledge graph

      2017, 9(6):575-582. DOI: 10.13878/j.cnki.jnuist.2017.06.002 CSTR:

      Abstract: Knowledge graph technology has attracted wide attention and study in recent years. This paper introduces the construction methods and recent development of knowledge graphs in detail, and summarizes their interdisciplinary applications and future research directions. It details the key technologies of textual, visual, and multi-modal knowledge graphs, such as information extraction, knowledge fusion, and knowledge representation. As an important part of knowledge engineering, the knowledge graph, and especially the development of the multi-modal knowledge graph, is of great significance for efficient knowledge management, knowledge acquisition, and knowledge sharing in the era of big data.

    • Rumor detection on social media with multimodal feature fusion

      2017, 9(6):583-592. DOI: 10.13878/j.cnki.jnuist.2017.06.003 CSTR:

      Abstract: Social media, such as microblogs, has developed rapidly, accelerating information diffusion on the Internet. However, numerous false rumors fostered on social media spread widely across social networks and can have serious consequences. Automatically detecting rumors on social media has become a major concern in both research and industry. Focusing on the rumor detection task, this paper summarizes multimodal fusion approaches to the problem. Starting from the basic concepts, we give formal definitions of rumors and introduce the characteristics of social media. We divide the studies on rumor detection into two major parts: extracting effective multimodal features to identify rumors, and constructing robust models to detect rumors. For each research aspect, we give a detailed introduction based on existing studies. This paper can serve as basic guidance for building state-of-the-art rumor detection models and as a reference for future research.

    • A survey of image artistic stylization

      2017, 9(6):593-598. DOI: 10.13878/j.cnki.jnuist.2017.06.004 CSTR:

      Abstract: In recent years, image artistic style transfer has become a thriving research field. Scientific challenges and industrial needs have driven more and more activity in this field, so image artistic style transfer merits study. In this paper, we analyze the current state of image artistic style transfer, the characteristics of different style transfer methods, the shortcomings of current methods, and the development trend of image style transfer. Finally, we suggest directions for further style transfer research.

    • Review on multimedia social event analysis research

      2017, 9(6):599-612. DOI: 10.13878/j.cnki.jnuist.2017.06.005 CSTR:

      Abstract: In recent years, with the rapid development of the Internet, more and more social networking sites have appeared, allowing users to conveniently share their ideas, pictures, posts, and activities. When a popular event happens, it can spread very fast across different social media sites with substantial amounts of multimedia data, including images, videos, and texts. It is therefore important and necessary to study multimedia social event analysis in order to automatically track the evolutionary trend of social events over time. This paper surveys and summarizes major progress in multimedia social event analysis. We focus on four areas: (1) multimedia social event representation; (2) multimedia social event detection and tracking; (3) multimedia social event evolutionary analysis; and (4) multimedia social event topic-opinion analysis. Then, the development trend of multimedia social event analysis is highlighted. Finally, possible future research topics in multimedia social event analysis are discussed.

    • Recent advance in content-based image retrieval: A literature survey

      2017, 9(6):613-634. DOI: 10.13878/j.cnki.jnuist.2017.06.006 CSTR:

      Abstract: The explosive increase and ubiquitous accessibility of visual data on the Web have led to a boom in research on image search and retrieval. Because they ignore visual content as a ranking clue, methods that apply text search techniques to visual retrieval may suffer from inconsistency between the text words and the visual content. Content-based image retrieval (CBIR), which uses representations of visual content to identify relevant images, has attracted sustained attention over the past two decades. The problem is challenging due to the intention gap and the semantic gap. Numerous techniques have been developed for content-based image retrieval in the last decade. The purpose of this paper is to categorize and evaluate the algorithms proposed from 2003 to 2016. We conclude with several promising directions for future research.

    • >Research articles
    • Faster R-CNN based food image retrieval and classification

      2017, 9(6):635-641. DOI: 10.13878/j.cnki.jnuist.2017.06.007 CSTR:

      Abstract: Automatic understanding of food images has applications in various fields, such as food intake monitoring and food calorie estimation. Food-related tasks such as food image retrieval and classification have therefore recently become hot research topics in multimedia analysis and applications. Existing methods mainly extract visual features from the whole food image for further analysis. The extracted features lack robustness because of background interference in the images. To solve this problem, we propose a food retrieval and classification method based on Faster R-CNN (Region-based Convolutional Neural Network). We first detect food candidate regions using Faster R-CNN and then use a CNN to extract visual features from the detected food regions. Features extracted this way are more discriminative because background interference is reduced. Furthermore, we select annotated food images from the Visual Genome dataset to fine-tune Faster R-CNN to guarantee its performance. We conduct experiments on two datasets: Food-101, with 101 classes and 10 641 food images, and Dish-233, with 233 dishes and 49 168 images. The extensive evaluation demonstrates the effectiveness of the proposed Faster R-CNN based food visual feature extraction method in food image retrieval and classification.

    • Video description based on relationship feature embedding

      2017, 9(6):642-649. DOI: 10.13878/j.cnki.jnuist.2017.06.008 CSTR:

      Abstract: Video description has received increasing interest in computer vision. Generating video descriptions requires natural language processing techniques and the capacity to handle variable-length input (a sequence of video frames) and output (a sequence of description words). To this end, this paper draws on recent advances in machine translation and designs a two-layer LSTM (Long Short-Term Memory) model based on the encoder-decoder architecture. Since deep neural networks can learn appropriate representations of input data, we extract feature vectors of the video frames with a convolutional neural network (CNN) and take them as the input sequence of the LSTM model. Finally, we compare the influence of different feature extraction methods on the LSTM video description model. The results show that the proposed model can learn to transform a sequence of knowledge representations into natural language.

    • Demosaicing based on residual interpolation and convolutional neural networks

      2017, 9(6):650-655. DOI: 10.13878/j.cnki.jnuist.2017.06.009 CSTR:

      Abstract: To accurately restore texture on oblique edges and improve the overall resolution of the demosaiced image, a convolutional neural network demosaicing algorithm based on residual interpolation is proposed. The algorithm uses the information of the Bayer color filter array to calculate the gradients of diagonal edges, which are used to determine the edge directions. A corresponding interpolation formula is then proposed for each type of edge. We incorporate convolutional neural networks into our method to refine the interpolated images. To demonstrate the superiority of the proposed algorithm, several experiments were conducted on the IMAX dataset. The experimental results show that the proposed algorithm achieves better visual quality, higher PSNR, and shorter running time than commonly used Bayer demosaicing algorithms.
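
The PSNR figure mentioned in the abstract above is a standard image-quality metric. As a minimal sketch (not taken from the paper; the function name `psnr`, the flat pixel lists, and the 8-bit peak value of 255 are assumptions for illustration), it can be computed from the mean squared error between a reference image and a restored one:

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists.

    Hypothetical helper for illustration; pixels are assumed to be 8-bit
    intensities stored in flat sequences.
    """
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value means the restored image is closer to the reference, which is why demosaicing results are typically compared on this score.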

    • An online real-time multiple object tracker with multiple information integration

      2017, 9(6):656-660. DOI: 10.13878/j.cnki.jnuist.2017.06.010 CSTR:

      Abstract: Multiple object tracking (MOT) algorithms tend to fail when the target is occluded or in fast motion, and they cannot recover from drifting. To solve these problems, we first employ integrated information, comprising the target's appearance, shape, and motion, to enhance the representation of objects. With this integrated information, we can accurately calculate a similarity that is as high as possible between instances of the same target and as low as possible between different targets. Second, we propose a novel real-time single object tracker that combines discriminative correlation filters (DCF) with Kalman filters and is robust to occlusion and fast motion. Extensive experiments show that the proposed MOT algorithm can accurately track targets under occlusion or fast motion in real time.
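
The abstract above names the Kalman filter as one component of the tracker but does not give its formulation. The following is a generic textbook sketch, not the authors' tracker: a one-dimensional constant-velocity Kalman filter in which the names `kalman_step` and `track`, the noise values `q` and `r`, and the initial covariance are all assumptions chosen for illustration.

```python
def kalman_step(state, cov, z, dt=1.0, q=1e-4, r=0.1):
    """One predict/update cycle of a 1-D constant-velocity Kalman filter.

    state = (position, velocity); cov = 2x2 covariance as nested tuples;
    z = measured position. q and r are assumed process/measurement noise.
    """
    p, v = state
    (p00, p01), (p10, p11) = cov

    # Predict: x' = F x, P' = F P F^T + q*I, with F = [[1, dt], [0, 1]].
    p = p + v * dt
    n00 = p00 + dt * (p10 + p01) + dt * dt * p11 + q
    n01 = p01 + dt * p11
    n10 = p10 + dt * p11
    n11 = p11 + q

    # Update with a position-only measurement, H = [1, 0].
    s = n00 + r                  # innovation variance
    k0, k1 = n00 / s, n10 / s    # Kalman gain
    y = z - p                    # innovation
    p = p + k0 * y
    v = v + k1 * y
    cov = (((1 - k0) * n00, (1 - k0) * n01),
           (n10 - k1 * n00, n11 - k1 * n01))
    return (p, v), cov


def track(measurements):
    """Run the filter over a sequence of position measurements."""
    state, cov = (0.0, 0.0), ((10.0, 0.0), (0.0, 10.0))
    for z in measurements:
        state, cov = kalman_step(state, cov, z)
    return state
```

For example, `track([float(k) for k in range(1, 51)])` follows a target moving one unit per step and returns a position and velocity estimate. Because the prediction step carries the velocity forward, such a filter can keep extrapolating a target's position for a few frames when measurements are unreliable, which is the usual motivation for pairing it with an appearance-based tracker.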

    • Automatic generation of family music album based on multi-modal fusion

      2017, 9(6):661-668. DOI: 10.13878/j.cnki.jnuist.2017.06.011 CSTR:

      Abstract: With the development of big data and social networks, electronic albums and online services have become basic uses of computers and the Internet. In recent years especially, the number of electronic albums has exploded with the popularity of social networks, so improving the user experience of music albums has become particularly important. A photo album on a given topic usually carries some emotional information. This paper studies the automatic generation of family music albums based on multi-modal fusion, so that users can enjoy music with matching emotion while browsing album photos. According to the emotions in music and images, representative sentence-level features are selected for both, and LPP (Locality Preserving Projection) is employed to learn the relevance between music and images that share the same emotion. The image and music features are mapped into a latent space with stronger emotion classification ability to realize the automatic generation of music albums. In the experiments, the objective evaluation shows that the LPP method achieves higher precision than the pure CCA (Canonical Correlation Analysis) method; in the subjective evaluation, the proposed LPP method achieves a 72.06% satisfaction level, which is close to the result of manual recommendation (78.09%) and higher than the results of random recommendation and the pure CCA approach.

    • Edge guided dual-channel convolutional neural network for single image super resolution algorithm

      2017, 9(6):669-674. DOI: 10.13878/j.cnki.jnuist.2017.06.012 CSTR:

      Abstract: Although super-resolution (SR) reconstruction algorithms based on convolutional neural networks (CNNs) have achieved great success, they cannot reconstruct the high-frequency texture of an image well. As a result, there is obvious jitter along local edges of the high-resolution (HR) image. We present an edge guided dual-channel CNN SR reconstruction algorithm integrated with Morphological Component Analysis (MCA). The low-resolution (LR) image to be processed is decomposed by MCA into a texture part and a structure part; the texture part and the original LR image then form a dual-channel input to the modified network, which reconstructs the HR texture part. The reconstruction losses of both the HR image and the HR texture are used jointly for training. As a post-processing step, we perform histogram matching between the network output and the LR input to strengthen the visual effect, and apply iterative back-projection refinement to improve the PSNR. Experimental results show that this dual-channel method can restore the texture details of an image, especially for images with rich texture.

    • Short circuit current limiting measures in power grid system

      2017, 9(6):675-680. DOI: 10.13878/j.cnki.jnuist.2017.06.013 CSTR:

      Abstract: To address the problem of excessive short-circuit current in power grid systems, we analyze the application of various short-circuit current limiting measures, including switching system operating modes, improving the power grid structure, using high-impedance transformers, and adding fault current limiters. Based on simulation and comparison of these measures, this paper puts forward the most reasonable short-circuit current limiting measures according to the actual situation of the power grid.

Address: No. 219, Ningliu Road, Nanjing, Jiangsu Province

Postcode: 210044

Phone: 025-58731025