Big Data: Conceptions, key technologies and application
Author: FANG Wei, ZHENG Yu, XIU Jiang
    Abstract:

    With the rapid development of the Internet of Things, cloud computing, and the mobile internet, Big Data has attracted growing attention. It brings great benefits but also raises hard questions about how to manage and use such data well. This paper reviews the main aspects of Big Data: its definition, data sources, key technologies, data processing tools, and applications. It discusses the relationship between Big Data and cloud computing, the Internet of Things, and mobile internet technology. It then analyzes the core technologies of Big Data and the Big Data solutions offered by industry, and discusses Big Data applications. Finally, it summarizes the general development trends of Big Data. The review helps readers understand the current state of Big Data development and provides a reference for making sound use of its key technologies.

Citation

FANG Wei, ZHENG Yu, XIU Jiang. Big Data: Conceptions, key technologies and application[J]. Journal of Nanjing University of Information Science & Technology, 2014, 6(5): 405-419

History
  • Received: September 01, 2014
  • Online: October 25, 2014