Automatic generation of family music album based on multi-modal fusion

Fund Project: National Natural Science Foundation of China (61401227); Beijing Natural Science Foundation (4152053)

Abstract:

With the growth of big data and social networks, electronic albums and online services have become basic applications of computers and the Internet. The popularity of social networks in recent years has driven explosive growth in the number of electronic albums, so improving the album user experience has become particularly important. An album with a given theme usually carries emotional information. This paper therefore studies the automatic generation of family music albums based on multi-modal fusion, so that users can listen to music accompanied by album photos whose emotion matches that of the music. To capture the emotion in music and images, sentence-level audio features and image features that express emotion are selected for each modality. To fuse these heterogeneous, cross-modal features, Locality Preserving Projection (LPP) is used to map the image features and music features into a latent feature space with stronger emotion classification ability, which enables automatic generation of the music album. In the objective evaluation, the LPP method achieves higher precision than the pure Canonical Correlation Analysis (CCA) method. In the subjective evaluation, LPP achieves a satisfaction rate of 72.06%, which is close to that of manual recommendation (78.09%) and clearly higher than those of random recommendation and the CCA method.
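The abstract does not spell out how LPP is applied across the two modalities, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that (a) each music clip and each image already has a zero-mean feature vector and an emotion label, (b) the LPP affinity graph simply connects samples that share an emotion label, and (c) the two modalities are placed in one joint space by zero-padding. The function names `lpp_fit`, `fit_cross_modal`, `recommend`, and `toy_features` are hypothetical, and the toy data stands in for real audio and image features.

```python
import numpy as np


def lpp_fit(Z, labels, n_components=1, reg=1e-6):
    """Supervised LPP; returns a (n_features, n_components) projection."""
    Z = np.asarray(Z, dtype=float)
    labels = np.asarray(labels)
    # Assumption: affinity is 1 for samples with the same emotion label.
    W = (labels[:, None] == labels[None, :]).astype(float)
    np.fill_diagonal(W, 0.0)
    D = np.diag(W.sum(axis=1))
    L = D - W  # graph Laplacian
    A = Z.T @ L @ Z
    B = Z.T @ D @ Z + reg * np.eye(Z.shape[1])  # small ridge keeps B positive definite
    # Solve A p = lambda * B p by whitening with the Cholesky factor of B.
    R_inv = np.linalg.inv(np.linalg.cholesky(B))
    M = R_inv @ A @ R_inv.T
    _, vecs = np.linalg.eigh((M + M.T) / 2.0)  # eigenvalues in ascending order
    # LPP keeps the directions with the smallest eigenvalues.
    return R_inv.T @ vecs[:, :n_components]


def fit_cross_modal(music, music_labels, images, image_labels, n_components=1):
    """Zero-pad both modalities into one joint space and fit a shared LPP."""
    music = np.asarray(music, dtype=float)
    images = np.asarray(images, dtype=float)
    d_m, d_i = music.shape[1], images.shape[1]
    Z = np.vstack([
        np.hstack([music, np.zeros((len(music), d_i))]),
        np.hstack([np.zeros((len(images), d_m)), images]),
    ])
    labels = np.concatenate([music_labels, image_labels])
    P = lpp_fit(Z, labels, n_components)
    return P[:d_m], P[d_m:]  # per-modality projection matrices


def recommend(music_clip, images, P_music, P_image, k=5):
    """Indices of the k images closest to the clip in the latent space."""
    query = music_clip @ P_music
    dists = np.linalg.norm(images @ P_image - query, axis=1)
    return np.argsort(dists)[:k]


def toy_features(rng, n_per_class, dim):
    """Hypothetical stand-in for real features: two emotion classes."""
    labels = np.repeat([0, 1], n_per_class)
    X = rng.normal(scale=0.3, size=(2 * n_per_class, dim))
    X[:, 0] += np.where(labels == 0, 2.0, -2.0)  # class-dependent shift
    return X, labels


rng = np.random.default_rng(0)
music_X, music_y = toy_features(rng, 20, 4)
image_X, image_y = toy_features(rng, 20, 3)
P_music, P_image = fit_cross_modal(music_X, music_y, image_X, image_y)

# Precision at k: share of recommended images whose emotion matches the clip.
top = recommend(music_X[0], image_X, P_music, P_image, k=5)
precision_at_5 = float(np.mean(image_y[top] == music_y[0]))
print(precision_at_5)
```

With two emotion classes a single latent dimension is enough for this toy case; real features would likely need more components, a sparser neighborhood graph, and held-out music clips when measuring precision.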

Cite this article

LIU Junfang, SHAO Xi. Automatic generation of family music album based on multi-modal fusion[J]. Journal of Nanjing University of Information Science & Technology (Natural Science Edition), 2017, 9(6): 661-668

History
  • Received: 2017-08-28
  • Published online: 2017-11-25
