Started in January 1974 (Monthly)
Supervised and Sponsored by Chongqing Southwest Information Co., Ltd.
ISSN 1002-137X
CN 50-1075/TP
CODEN JKIEBK
Current Issue
Volume 43, Issue 6 (June 2016)
Research Progress on Error Propagation Model in Software System
WANG Xun and WANG Yi-chen
Computer Science. 2016, 43 (6): 1-9.  doi:10.11896/j.issn.1002-137X.2016.06.001
The complexity and uncertainty of complex software systems bring about complexity in software behavior, interaction, and error behavior. In research on the reliability, security, and other properties of complex software systems, error propagation has long been a focus that attracts wide attention from scholars at home and abroad. Firstly, the latest developments in error propagation research were summarized. Then the main research directions of error propagation were presented. Furthermore, this paper compared and analyzed the factors affecting error propagation, including system architectural characteristics and fault types. Finally, the challenges and future research directions of error propagation were discussed.
Advances in Vision-based Target Location Technology
ZHAO Xia, YUAN Jia-zheng and LIU Hong-zhe
Computer Science. 2016, 43 (6): 10-16.  doi:10.11896/j.issn.1002-137X.2016.06.002
Vision-based target location is a hotspot in the field of computer vision. This paper summarized the current state-of-the-art visual positioning technologies, emphatically introducing the research status, advantages, and disadvantages of monocular, binocular, and panoramic vision-based positioning technologies. Finally, the development trends of vision-based localization methods were described briefly. This paper can serve as a reference for research on visual positioning problems.
Managing Marine Data as Big Data:Uprising Challenges and Tentative Solutions
HUANG Dong-mei, ZHAO Dan-feng, WEI Li-fei, DU Yan-ling and WANG Zhen-hua
Computer Science. 2016, 43 (6): 17-23.  doi:10.11896/j.issn.1002-137X.2016.06.003
Big data have been continually drawing extensive interest in both academia and industry. The scale of marine data is increasing exponentially with the rapid development of ocean observation technologies and data acquisition methodologies. Until recently, most solutions have focused on generic big data, while an extensive study of marine data remains lacking, since the uniqueness of marine data brings new challenges for its management. This article first outlined the characteristics of marine data as well as the fundamental architecture of marine data management. It then analyzed the problems of data storage, data quality, and data security, together with corresponding tentative solutions, which provide significant evidence and references for future study in ocean science and engineering technology.
Research on Temporal Behavior of Microblog Users
ZHANG Jie-bin and QIN Hong
Computer Science. 2016, 43 (6): 24-27.  doi:10.11896/j.issn.1002-137X.2016.06.004
Human daily activities are widespread in all aspects of life, and human behavior is highly complex because of the variety of individual activities and individual differences. Using the posting-time data of microblog users, we studied users' temporal behavior. The results show that the statistical characteristics of inter-event times for individual users mainly follow three distributions: power-law, exponential, and bimodal. Furthermore, we proposed an individual behavior dynamics model based on a task queue to explain the inter-event time characteristics of microblog users.
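The abstract does not give the fitting procedure, so as a loose illustration of how such inter-event statistics can be examined, here is a minimal Python sketch (all names hypothetical; the log-log slope fit is only a crude power-law check, not the paper's method):

```python
# Hypothetical sketch: estimating an inter-event time distribution for one user.
import numpy as np

def interevent_times(timestamps):
    """Sorted posting timestamps (seconds) -> inter-event intervals."""
    t = np.sort(np.asarray(timestamps, dtype=float))
    return np.diff(t)

def powerlaw_exponent(intervals, bins=20):
    """Rough power-law exponent from the slope of a log-log histogram."""
    intervals = intervals[intervals > 0]
    edges = np.logspace(np.log10(intervals.min()), np.log10(intervals.max()), bins)
    hist, edges = np.histogram(intervals, bins=edges, density=True)
    centers = np.sqrt(edges[:-1] * edges[1:])   # geometric bin centers
    mask = hist > 0
    slope, _ = np.polyfit(np.log10(centers[mask]), np.log10(hist[mask]), 1)
    return -slope                               # exponent of P(tau) ~ tau^-alpha

# Toy usage with synthetic heavy-tailed intervals:
rng = np.random.default_rng(0)
taus = rng.pareto(1.5, 10_000) + 1.0
print(f"estimated exponent: {powerlaw_exponent(taus):.2f}")
```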
Symbolic ZBDD-based Judgment Method for Assembly Feasibility
PENG Rui, LI Feng-ying, CHANG Liang and MENG Yu
Computer Science. 2016, 43 (6): 28-31.  doi:10.11896/j.issn.1002-137X.2016.06.005
To enlarge the solution scale and improve the efficiency of assembly sequence planning and the level of assembly automation, the zero-suppressed binary decision diagram (ZBDD) was adopted to represent the assembly connection matrix and interference matrix. A novel ZBDD-based method was presented to judge the feasibility of assembly operations, so that feasible assembly operations of an assembly can be obtained efficiently. The experimental results demonstrate the validity and feasibility of the ZBDD-based assembly model and of the judgment method for feasible assembly operations.
SDN-based Multipath Routing Algorithm for Fat-tree Data Center Networks
NONG Huang-wu, HUANG Chuan-he and HUANG Xiao-peng
Computer Science. 2016, 43 (6): 32-34.  doi:10.11896/j.issn.1002-137X.2016.06.006
To increase bandwidth and improve fault tolerance, the fat-tree topology with multipath capability has been used in many data center networks (DCNs) in recent years. However, traditional routing protocols offer only limited support for multipath routing and cannot fully utilize the available bandwidth in such networks. This paper studied SDN-based multipath routing for fat-tree networks. Firstly, a linear programming problem was formulated and its NP-completeness was proved. Secondly, a practical solution that takes advantage of the emerging software-defined networking paradigm was proposed. Our algorithm relies on a central controller to collect the necessary network state information in order to make optimized routing decisions. Finally, the algorithm was implemented as an OpenFlow controller module and validated by simulation. Experimental results show that the algorithm outperforms the traditional multipath algorithm based on random assignment in both increasing throughput and reducing end-to-end delay.
MapReduce-based Skyline Query Processing Algorithm
CUI Wen-xiang, XIAO Ying-yuan, HAO Gang, WANG Hong-ya and DENG Hua-feng
Computer Science. 2016, 43 (6): 35-38.  doi:10.11896/j.issn.1002-137X.2016.06.007
Skyline query is a typical multi-objective optimization problem that is widely applied in multi-objective optimization, data mining, and other fields. Most existing Skyline query processing algorithms assume that the data set resides on a single server, and the query processing algorithm is designed as a serial algorithm for that server. With the rapid growth of data, especially in the context of big data, traditional serial Skyline algorithms based on a single machine are far from sufficient to meet users' needs. Based on the popular distributed parallel programming framework MapReduce, this paper studied parallel Skyline query algorithms suitable for large data sets. Targeting the factors that affect MapReduce performance, this paper improved the existing angle-based data partitioning strategy and proposed a Balanced Angular data partitioning strategy. Meanwhile, to reduce the computation of the Reduce phase, this paper proposed an early data filtering strategy in the Map phase. The experimental results show that the proposed Skyline query algorithm can improve system performance significantly.
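The core of any skyline computation is the dominance test; the following Python sketch shows that test and a naive per-partition (Map-side) skyline followed by a merge, under the assumption that smaller values are better on every dimension. It does not reproduce the paper's Balanced Angular partitioning or filtering strategy:

```python
# Minimal sketch of the skyline dominance test and a Map/Reduce-style merge.
def dominates(p, q):
    """p dominates q if p is no worse in every dimension and better in one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def local_skyline(points):
    """Naive O(n^2) skyline of one data partition; in the paper's setting this
    would run inside each Map task before the Reduce-side merge."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

parts = [[(1, 5), (2, 4), (3, 6)], [(2, 2), (4, 1), (5, 5)]]
# "Map" phase: local skylines; "Reduce" phase: skyline of the union of locals.
candidates = [p for part in parts for p in local_skyline(part)]
print(local_skyline(candidates))   # global skyline
```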
Ontology Storage Model Based on HBase
SONG Hua-zhu, DUAN Wen-jun and LIU Xiang
Computer Science. 2016, 43 (6): 39-43.  doi:10.11896/j.issn.1002-137X.2016.06.008
Ontology is a formal description of the important concepts within a specific domain. Reasonable storage of ontology data is an important prerequisite for its sharing capability, and this role is even more prominent in today's distributed systems. By analyzing current means of storing ontology data and combining the features of the Semantic Web and Hadoop, this paper proposed an ontology storage model based on HBase, called HBase-OntSM, which regards an ontology's triple data set as a graph and stores it in the database as records, and then presented a series of basic definitions and index definitions related to the graph. Finally, this paper took a segment of a Tibetan-culture ontology as an example to explain the ontology storage model and its storage procedures.
Mobile Cache Replacement Algorithm Based on Social Network
XING Qi-yuan, WANG Jing, YAN A-bin and HAN Yan-bo
Computer Science. 2016, 43 (6): 44-49.  doi:10.11896/j.issn.1002-137X.2016.06.009
In recent years, mobile applications have grown rapidly with the development of the Android and iOS platforms. Most applications on smartphones are user-centric, and their data are generated by users. When users want related data, requesting it from a server every time is not realistic, so a suitable cache technology is required. Traditional cache technologies pay much attention to access frequency or last access time, but do not take into account the relationships between data generators. In a mobile social network environment, data access is closely related to users' relationships, so this factor is well suited for use in cache technologies. In this paper, we proposed a user-relationship-based cache replacement algorithm that combines users' relationships with the classic LRU cache algorithm. Not only the access time of each data item, but also the closeness value between the data requester and the generator is taken into consideration. The experimental results show that our replacement strategy can improve the cache hit ratio in mobile social environments.
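To make the recency-plus-closeness idea concrete, here is a hedged Python sketch of an LRU variant whose eviction victim has the lowest combined score of recency and requester-generator closeness; the class name, scoring formula, and alpha weight are illustrative assumptions, not the paper's design:

```python
# Illustrative social-aware LRU cache; closeness maps (requester, generator)
# pairs to a value in [0, 1]. The combined score is a sketch, not the paper's.
import time

class SocialLRUCache:
    def __init__(self, capacity, closeness, alpha=0.5):
        self.capacity = capacity
        self.closeness = closeness      # dict: (requester, generator) -> [0, 1]
        self.alpha = alpha              # weight between recency and closeness
        self.store = {}                 # key -> (value, generator, last_access)

    def _score(self, requester, generator, last_access, now):
        recency = 1.0 / (1.0 + now - last_access)
        social = self.closeness.get((requester, generator), 0.0)
        return self.alpha * recency + (1 - self.alpha) * social

    def put(self, key, value, generator, requester):
        now = time.monotonic()
        if len(self.store) >= self.capacity and key not in self.store:
            # Evict the entry with the lowest recency/closeness score.
            victim = min(self.store,
                         key=lambda k: self._score(requester, self.store[k][1],
                                                   self.store[k][2], now))
            del self.store[victim]
        self.store[key] = (value, generator, now)

    def get(self, key):
        if key in self.store:
            value, gen, _ = self.store[key]
            self.store[key] = (value, gen, time.monotonic())  # refresh recency
            return value
        return None
```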
Research on Online Multi-task Load Balance Algorithm in Cloud Server Cluster
XU Ai-ping, WU Di, XU Wu-ping and CHEN Jun
Computer Science. 2016, 43 (6): 50-54.  doi:10.11896/j.issn.1002-137X.2016.06.010
In this paper we presented a load-balancing algorithm for heterogeneous cloud server clusters. The average hardware resource consumption of jobs running on each server is measured. The balancing server periodically receives the load status of each server in the cluster, and a load status vector of each server can be estimated from the latest load status report and other parameters. When a request is submitted to the cluster, the balancing server calculates the load status estimation vector of each server and then dispatches the request to the server with the minimal load status estimation value. Experimental results show that this dynamic load-balancing algorithm is reasonable and effective.
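The "dispatch to the minimal load status estimation" rule can be sketched in a few lines; the resource dimensions and job-profile weighting below are assumptions for illustration, not the paper's exact estimator:

```python
# Hypothetical sketch: score each server's reported load vector with the
# incoming job's resource-demand profile and pick the minimum.
import numpy as np

def dispatch(load_vectors, job_weights):
    """load_vectors: (n_servers, n_resources) utilizations in [0, 1];
    job_weights: average resource-demand profile of the incoming job."""
    loads = np.asarray(load_vectors)
    w = np.asarray(job_weights) / np.sum(job_weights)
    scores = loads @ w                  # weighted load estimate per server
    return int(np.argmin(scores))       # index of the least-loaded server

servers = [[0.7, 0.2, 0.5],             # CPU, memory, I/O utilization
           [0.3, 0.6, 0.4],
           [0.5, 0.5, 0.9]]
print(dispatch(servers, job_weights=[2.0, 1.0, 1.0]))  # -> 1
```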
Parallelization of Random Forest Algorithm Based on Discretization and Selection of Weak-correlation Feature Subspaces
CHEN Min-cheng, YUAN Jing-ling, WANG Xiao-yan and ZHU Sai
Computer Science. 2016, 43 (6): 55-58.  doi:10.11896/j.issn.1002-137X.2016.06.011
With the coming of the big data age, data are increasing exponentially at a dramatic rate, and traditional classification algorithms face great challenges. In order to improve the efficiency of classification, this paper proposed a parallel random forest algorithm based on discretization and the selection of weak-correlation feature subspaces. The algorithm discretizes continuous attributes in the data pretreatment phase. In the step of selecting feature subspaces for growing decision trees, we used a vector space model of attributes to calculate the correlation between attributes and then constructed the weak-correlation feature subspaces. The algorithm not only reduces the correlation among decision trees but also improves the classification performance of the random forest. We also designed and implemented a doubly parallel method for building the random forest model based on the MapReduce framework.
Research on Two-layered Path Planning System Based on Multi-agent Simulation
XIONG Mu-zhou and LI Yong
Computer Science. 2016, 43 (6): 59-64.  doi:10.11896/j.issn.1002-137X.2016.06.012
With the development of modeling and simulation of crowd movement, it has been widely applied in various applications for crowd movement estimation and safety evaluation. Crowd simulation has recently become an efficient tool for research on crowd movement features and patterns. As one of the most important components of a crowd simulation model, the path planning system describes the process by which a pedestrian decides on a route through the environment. In order to simulate this route-decision process, this paper proposed a two-layered path planning system: a high-level model produces a rough path, and a low-level model provides precise navigation along the route previously computed. The experimental results indicate that the proposed model is able to consider both static and dynamic environment issues, which leads to good path planning for the simulation model, and that it is also efficient in simulation execution.
Adaptive Image Retrieval Algorithm with Multi-feature of Center Block
GUO Jing-lei, LI Wei and JIN Cong
Computer Science. 2016, 43 (6): 65-67.  doi:10.11896/j.issn.1002-137X.2016.06.013
An adaptive image retrieval algorithm using multiple features of the center block was proposed to better retrieve image information. By retrieving the main color of the image, the improved method reduces the interference of background noise on the target object. In order to solve the difficult problem of weight setting, a differential evolution algorithm was applied to optimize the feature weights. Experimental results demonstrate that the proposed algorithm can reduce the interference of background noise when calculating image distances and achieves better retrieval accuracy and efficiency.
Mass Sensor Information Storage Infrastructure Based on Fusion Database
LEI Xing-bang and FANG Jun
Computer Science. 2016, 43 (6): 68-71.  doi:10.11896/j.issn.1002-137X.2016.06.014
In the Internet of Things, industrial control, and other systems, large-scale sensors generate large amounts of data all the time. A real-time database has an advantage in time-sensitive data processing, but has problems with storage capacity and scalability. By contrast, HBase has the advantages of high read/write performance, high scalability, and high reliability. Through the combination of a real-time database and HBase, we designed and implemented a sensor information storage architecture based on a fusion database. The architecture uses a multi-tenant mechanism to optimize HBase writes: the original sensor data are stored centrally, while the sensor metadata and historical data are stored separately, maintaining the query interface and data structure characteristics of the original real-time database. Our experiments verify that the system has high read/write performance and good scalability, effectively avoids region write hotspots, and achieves cluster load balancing.
Improved Image Similarity Algorithm
ZOU Cheng-ming, XUE Dong, GUO Shuang-shuang and ZHAO Guang-hui
Computer Science. 2016, 43 (6): 72-76.  doi:10.11896/j.issn.1002-137X.2016.06.015
Image similarity algorithms are of great significance in image recognition, image search engines, and other research areas. The traditional gray-level color histogram algorithm cannot accurately describe the distribution of each color in an image. To address this issue, an improved image similarity algorithm was proposed. It fuses the texture features of the image and extracts pixel feature information at various positions using the gray-level co-occurrence matrix. Experimental results show that fusing image texture features not only retains the efficient implementation of the gray-level histogram algorithm but also improves the accuracy of the algorithm. In practical application scenarios, the weights of the two kinds of features can be adjusted to further improve accuracy.
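As a rough sketch of fusing histogram and texture similarity with adjustable weights, in the spirit of this abstract: the gray-level co-occurrence matrix below is a minimal horizontal-neighbor version, and the weights and metrics are illustrative assumptions:

```python
# Toy fusion of gray-histogram intersection and a minimal GLCM texture distance.
import numpy as np

def gray_histogram(img, bins=16):
    h, _ = np.histogram(img, bins=bins, range=(0, 256), density=True)
    return h

def glcm_horizontal(img, levels=16):
    q = (img.astype(int) * levels) // 256          # quantize gray levels
    glcm = np.zeros((levels, levels))
    np.add.at(glcm, (q[:, :-1].ravel(), q[:, 1:].ravel()), 1)
    return glcm / glcm.sum()

def similarity(img_a, img_b, w_color=0.6, w_texture=0.4):
    hist_sim = np.minimum(gray_histogram(img_a), gray_histogram(img_b)).sum()
    tex_dist = np.abs(glcm_horizontal(img_a) - glcm_horizontal(img_b)).sum() / 2
    return w_color * hist_sim + w_texture * (1 - tex_dist)

rng = np.random.default_rng(1)
a = rng.integers(0, 256, (64, 64))
b = rng.integers(0, 256, (64, 64))
print(f"similarity: {similarity(a, b):.3f}")
```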
High Stability Low Delay Spanning Tree Algorithm for Application Layer Multicast
CUI Jian-qun, CHEN Ai-ling, XIA Zhen-chang and WU Li-bing
Computer Science. 2016, 43 (6): 77-81.  doi:10.11896/j.issn.1002-137X.2016.06.016
Since application layer multicast (ALM) relies on terminal hosts to forward multicast data, the failure or departure of any intermediate node causes system stability problems. Meanwhile, ALM has strict requirements on multicast tree delay. To improve stability and data transmission efficiency, this paper introduced a spanning tree problem model, SDMD, based on stability probability, degree constraints, and minimum diameter for ALM, according to the factors influencing ALM stability and delay. The SDMD problem was proved to be NP-hard, and an approximation algorithm TG-S was proposed to solve it. The simulation results demonstrate the advantages of the TG-S algorithm in average receiving delay, multicast tree delay, and accumulated interruption times.
Parallel Operational Transformation Algorithm in Multi-core
LI Ming-li, CAI Wei-wei, LV Xiao and HE Fa-zhi
Computer Science. 2016, 43 (6): 82-85.  doi:10.11896/j.issn.1002-137X.2016.06.017
Operational transformation is the algorithm of choice for real-time collaborative editing systems. As a concurrency control strategy, it not only supports unconstrained interaction but also maintains the intention consistency of distributed operations. However, as the number of executed operations increases, performance degrades, affecting the response time of operations. Exploiting multi-core hardware and multi-threading, this paper proposed the first parallel operational transformation algorithm, which can greatly reduce the time cost of integrating remote operations. The traditional sequential algorithm is modified so that the computation-dependent procedure can be parallelized. Extensive experiments show that the proposed algorithm significantly outperforms the traditional algorithm and provides decent response times even when the operation history is very large.
Implementation and Protocol Analysis of Embedded VoIP Voice Terminal Based on SIP
LIN Wang and TIAN Hong-xian
Computer Science. 2016, 43 (6): 86-90.  doi:10.11896/j.issn.1002-137X.2016.06.018
An embedded voice terminal system was implemented. The hardware of the terminal is a Feiling embedded development board OK6410, whose core board employs the S3C6410 processor with an ARM11 core. The software runs the embedded WinCE operating system and ports the LINPHONE code based on the SIP protocol. This paper analyzed the architecture of the entire VoIP voice system, including hardware and software, and then focused on the analysis of the LINPHONE workflow and the SIP protocols. Finally, testing shows that the system provides good voice communication quality.
Fingerprint-based Indoor Localization via Matrix Completion
SHA Chao-heng, XIAO Fu, CHEN Lei, SUN Li-juan and WANG Ru-chuan
Computer Science. 2016, 43 (6): 91-96.  doi:10.11896/j.issn.1002-137X.2016.06.019
In recent years, indoor localization techniques have attracted widespread attention from researchers. Existing fingerprint-based algorithms require sufficient fingerprint data and are apt to produce large localization errors under the interference of noise. To address this challenge, we proposed a robust indoor localization algorithm based on matrix completion, which utilizes the low-rank property of the fingerprint matrix to reconstruct the original fingerprint database from a small amount of RSSI fingerprint data. By introducing the L1-norm and Frobenius-norm to smooth outliers and enhance stability, the recovery of a noisy fingerprint database is formulated as a norm-regularized matrix completion problem, which can be solved effectively by the alternating direction method of multipliers and a variable splitting technique. Experimental results demonstrate that this algorithm can recover the complete fingerprint database from a small amount of fingerprint data and achieves higher localization accuracy than similar algorithms under various types of noise.
Cooperative Congestion Control Scheme Based on MPTCP in Heterogeneous Network
WANG Zhen-chao and YANG Xiao-long
Computer Science. 2016, 43 (6): 97-101.  doi:10.11896/j.issn.1002-137X.2016.06.020
A cooperative congestion control scheme based on MPTCP was proposed. In the congestion avoidance stage, the unconfirmed data packets in transit on each route in the heterogeneous network are predicted with a Markov model, and the sender then calculates the maximum amount of data that each route can carry. If the network congestion window is more than two times the minimum of these per-path capacities, the cooperative congestion control mechanism is started, under which the network congestion window is adjusted according to the additive-increase rule of the AIMD algorithm. If the sum of the carrying capacities of all paths is less than the network congestion window, the cooperative congestion control mechanism ends, and the traditional TCP slow start algorithm is used. To enhance bandwidth utilization in the slow start stage, this paper modified the bandwidth estimation algorithm of TCPW (TCP Westwood) to make the estimation of available bandwidth on each path more accurate and to improve the reasonableness of the slow start threshold setting. Simulation results demonstrate that the proposed scheme can increase the number of successfully transmitted data packets while guaranteeing load balance and fairness between single TCP flows and MPTCP flows.
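For readers unfamiliar with the AIMD rule the cooperative mechanism builds on, here is a toy Python sketch of the standard window update (constants illustrative; the paper's cooperative thresholding and Markov prediction are not modeled):

```python
# Toy AIMD congestion-window update: slow start below ssthresh, additive
# increase above it, multiplicative decrease on loss.
def aimd_step(cwnd, loss, ssthresh, alpha=1.0, beta=0.5):
    """One per-RTT congestion-window update; cwnd in segments."""
    if loss:
        ssthresh = max(cwnd * beta, 2.0)   # multiplicative decrease
        return ssthresh, ssthresh
    if cwnd < ssthresh:
        return cwnd * 2.0, ssthresh        # slow start: exponential growth
    return cwnd + alpha, ssthresh          # congestion avoidance: +alpha/RTT

cwnd, ssthresh = 1.0, 16.0
for rtt in range(12):
    cwnd, ssthresh = aimd_step(cwnd, loss=(rtt == 8), ssthresh=ssthresh)
    print(f"RTT {rtt:2d}: cwnd={cwnd:5.1f} ssthresh={ssthresh:5.1f}")
```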
Ultra-wideband Signal Detection Method Based on Hilbert-Huang and Wavelet Packet
LIU Xiao-wen, JIANG Lei, XU Hua and CHEN Xi
Computer Science. 2016, 43 (6): 102-105.  doi:10.11896/j.issn.1002-137X.2016.06.021
Since the Hilbert-Huang transform (HHT) is restricted by the signal-to-noise ratio (SNR) when detecting ultra-wideband (UWB) signals, this paper analyzed the stopping criteria for the sifting step of EMD and wavelet packet denoising, and proposed a method combining HHT with wavelet packets together with a new stopping criterion for sifting. The proposed method was used to detect UWB signals, and its performance was compared and analyzed using the root-mean-square error. Simulation results show that UWB signals under strong noise conditions can be accurately reconstructed by the new method, solving the problem that SNR strongly affects HHT.
IEEE 802.15.4 Real-time Bandwidth Allocation Algorithm Supporting Heterogeneous Data Communication
HU Xian-jun, CHEN Jian-xin, ZHOU Sheng-qiang and LI Yi-fan
Computer Science. 2016, 43 (6): 106-111.  doi:10.11896/j.issn.1002-137X.2016.06.022
The GTS allocation mechanism of the IEEE 802.15.4 protocol can support real-time and delay-constrained applications, and has been used in Internet of Things fields such as medical health, industrial control, and building automation. But in applications with high-speed real-time heterogeneous data transmission it still has limitations: it does not support real-time services required by more than seven devices, delay constraints shorter than the superframe length, or transmission applications with heterogeneous data of different periods. To overcome these limitations, this paper proposed a new IEEE 802.15.4 real-time bandwidth allocation algorithm supporting heterogeneous data communication, which adjusts the transmission time of some transfer tasks according to the data transmission information of tasks with different periods. Performance analysis shows that this algorithm strictly complies with delay constraints, meets the needs of heterogeneous data communication, improves bandwidth utilization, and thereby improves overall network performance.
Research on LMD in EEG Signal Analysis Based on Wavelet Packet
MA Xiao and ZHU Xiao-jun
Computer Science. 2016, 43 (6): 112-115.  doi:10.11896/j.issn.1002-137X.2016.06.023
The EEG signal is a very weak potential signal produced by brain nerve cell activity, and it is also a non-stationary, nonlinear electrical signal. EEG signals are susceptible to external noise interference during collection. In order to reduce the noise level in EEG signals and improve the efficiency of EEG signal decomposition, this paper proposed a local mean decomposition (LMD) method based on wavelet packets. The method uses wavelet packets to denoise the collected EEG signals as a pretreatment, and then analyzes them via local mean decomposition. The simulation results show that the improved LMD decomposition can effectively remove the high-frequency noise in the original signals and eliminate the influence of noise components on the decomposition process and results.
Segmented Address Assignment Policy and Routing for Wireless Sensor Mesh Networks
YUAN Li-yong, ZHU Yi-hua and QIU Shu-wei
Computer Science. 2016, 43 (6): 116-121.  doi:10.11896/j.issn.1002-137X.2016.06.024
As wireless sensor network devices have requirements such as low power, low cost, and small size, their communication capability, computing power, and memory space are extremely restricted. Therefore, a wireless sensor network routing algorithm must have the following characteristics: low storage overhead, low routing computation, no route discovery, etc. HiLow is a hierarchical routing protocol that complies with the aforementioned characteristics and has better routing performance than IEEE 802.15.5. However, owing to problems such as a low address utilization rate and applicability only to small-scale networks, HiLow cannot be applied in WSN scenarios such as environmental monitoring and animal protection, which require the deployment of a large number of sensor nodes. In this paper, we proposed a two-fragment address policy (TFA), in which the 16-bit address is divided into two fields: the significant field is used for address allocation of full-function devices, and the insignificant field is used for address allocation of reduced-function devices. TFA has a higher address utilization rate and a larger maximum routing-tree depth than HiLow, which means that TFA is suitable for larger-scale networks. We also analyzed the features of TFA that can be used to optimize routing, and proposed a mesh routing algorithm based on local link state and TFA. Simulations show that TFA-based mesh routing outperforms IEEE 802.15.5 in terms of memory usage and energy consumption.
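A two-field 16-bit address is easy to sketch with bit operations; the 10/6 bit split below is an assumption for illustration, since the abstract does not specify the field widths:

```python
# Hedged sketch of a two-field 16-bit address in the spirit of TFA: the high
# (significant) bits identify the full-function device (FFD), the low bits a
# reduced-function device (RFD) under it. The 10/6 split is an assumption.
FFD_BITS, RFD_BITS = 10, 6

def encode(ffd_id: int, rfd_id: int = 0) -> int:
    assert 0 <= ffd_id < (1 << FFD_BITS) and 0 <= rfd_id < (1 << RFD_BITS)
    return (ffd_id << RFD_BITS) | rfd_id

def decode(addr: int) -> tuple[int, int]:
    return addr >> RFD_BITS, addr & ((1 << RFD_BITS) - 1)

addr = encode(ffd_id=513, rfd_id=7)
print(f"address=0x{addr:04x} -> (FFD, RFD) fields: {decode(addr)}")
# Routing hint: two addresses share a parent FFD iff their high fields match.
```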
Application of ICA-R Algorithm in Weak Signal Extraction
GU Ling-ling and LIU Guo-qing
Computer Science. 2016, 43 (6): 122-126.  doi:10.11896/j.issn.1002-137X.2016.06.025
Independent component analysis (ICA) is an effective method for solving the blind source separation (BSS) problem. FastICA, which takes the central limit theorem as its starting point, uses a fixed-point iteration optimization algorithm and converges quickly and steadily. Owing to the limitations of the central limit theorem, FastICA no longer applies when extracting weak signals. This paper explored FastICA theoretically and experimentally and addressed this problem with new ideas. Building on FastICA, we use ICA with reference (ICA-R) to establish a proximity measure, combined with the idea of extrapolation in weighted-norm minimization, for the case where part of the power spectrum of the source signal is known. The targeted weak signal can then be extracted by integrating this measure into FastICA as a constraint. Experiments show that the proposed algorithm is effective for both simulated and real signals.
Bayesian Rule-based Background Model for Object Device-free Localization
QU Qiang, WU Xin-jie and CHEN Xue-bo
Computer Science. 2016, 43 (6): 127-130.  doi:10.11896/j.issn.1002-137X.2016.06.026
Device-free localization makes it possible to localize and track persons or other objects that do not carry any electronic device or tag. Aiming at the problem that the localization accuracy of the radio tomographic imaging (RTI) algorithm is not ideal in multipath environments, an improved algorithm based on a Bayesian background model was proposed. Firstly, a Bayesian background model used to eliminate redundant links is established by combining the skew-Laplace distribution with Bayesian theory. Then, the changes in received signal strength are weighted to reduce the interference of multipath effects on localization accuracy. Finally, the target location is corrected using the posterior mean estimate. The feasibility and availability of the localization algorithm are verified by experiments.
Anti-geometric-attack Watermarking Algorithm Based on Pseudo-Zernike Moments and Contourlet Transform
ZHU Dan-dan and LV Li-zhi
Computer Science. 2016, 43 (6): 131-134.  doi:10.11896/j.issn.1002-137X.2016.06.027
This paper presented a novel anti-geometric-attack digital image watermarking algorithm based on pseudo-Zernike moments and the Contourlet transform. It uses the Contourlet transform to extract the low-frequency sub-band of the image, and embeds the digital watermark into this sub-band by quantizing the magnitudes of selected pseudo-Zernike moments, according to the human visual system (HVS) and the correlation between the coefficients before and after watermark embedding. The watermark can be extracted without the aid of the original image, realizing true blind detection. According to the experimental results, the proposed watermarking algorithm is highly resistant to translation and resists rotation and scaling well. After JPEG compression, the digital watermark is not distorted.
Batch Verification Scheme Defensing Collusive Attack in VANET
LU Jie, SONG Xiang-mei, HAN Mou and ZHOU Cong-hua
Computer Science. 2016, 43 (6): 135-140.  doi:10.11896/j.issn.1002-137X.2016.06.028
Owing to the vulnerabilities and openness of wireless networks, vehicular ad-hoc networks (VANETs) are vulnerable to attacks such as bogus information attacks and message replay attacks. Message authentication is one of the effective techniques for solving these problems. Since a VANET consists of a huge number of fast-moving vehicles, messages need to be verified rapidly. Recently, b-SPECS+ has become the best batch verification scheme: it provides a software-based solution that satisfies the privacy requirement, incurs lower message overhead, and verifies messages in batches. However, it suffers from collusion attacks executed jointly by a roadside unit and an on-board unit. In this paper, we provided a scheme based on pseudonymous verification public keys that solves the collusion attack problem. Our solution achieves a higher message verification rate than b-SPECS+.
Security Interoperation Model of Cross-domain Network Resources
TANG Cheng-hua, ZHANG Xin, WANG Lu, WANG Yu and QIANG Bao-hua
Computer Science. 2016, 43 (6): 141-145.  doi:10.11896/j.issn.1002-137X.2016.06.029
Network resources need to be shared and interoperable under the control of security policies. Aiming at the security problem of resource interoperation among heterogeneous security domains, a security interoperation model for accessing cross-domain network resources based on the RBAC security policy was proposed. Firstly, the concept of an inter-domain role was introduced, and the requirements of cross-domain resource sharing were defined. Secondly, based on cross-domain operation criteria, a security interoperation model and an access algorithm for resources across heterogeneous domains were put forward. Finally, the model and algorithm were analyzed in the application environment of a real project. Results show that this method offers targeted and effective access control, and provides a feasible way to implement secure resource sharing and interoperation.
Audit Log Secure Storage System Based on Trusted Computing Platform
CHENG Mao-cai and XU Kai-yong
Computer Science. 2016, 43 (6): 146-151.  doi:10.11896/j.issn.1002-137X.2016.06.030
Aiming at the log security issues in computer audit systems, this paper proposed an audit log secure storage system that leverages the secure storage, key generation, and cryptographic operation functions provided by the TPM (Trusted Platform Module). The system ensures the security of log transfer and storage, optimizes the key storage management mechanism, and solves the key synchronization problem in the trusted computing platform's key management mechanism, thereby enhancing the platform's key management security as a whole. Finally, we analyzed the security of the log integrity authentication algorithm and the complexity of key usage. Experimental results show that this log storage system is safe and practical.
JPEG Color Image Self-adaptive Encryption Algorithm with Format Compatibility
LEI Zheng-qiao and XIAO Di
Computer Science. 2016, 43 (6): 152-155.  doi:10.11896/j.issn.1002-137X.2016.06.031
To ensure the secure application of images in special formats, it is necessary to study image encryption algorithms compatible with those formats in depth. By integrating JPEG compression with the idea of self-adaptive encryption, a self-adaptive encryption algorithm for JPEG color images was proposed. The DC coefficients and the first 16 AC coefficients in Zig-Zag order of JPEG compression are chosen to construct the corresponding matrices. Based on a chaotic random sequence, a self-adaptive encryption scheme is utilized to encrypt the coefficients, and the signs of the DC coefficients are also encrypted to avoid color information leakage. Experimental results and analyses verify that both the security and the performance of the proposed algorithm are good, its impact on compression is negligible, and it is compatible with the JPEG file format.
Software Failure Prediction Model Based on Improved Nonparametric Method
WANG Zong-hui, ZHOU Yong and ZHANG De-ping
Computer Science. 2016, 43 (6): 156-159.  doi:10.11896/j.issn.1002-137X.2016.06.032
Based on principal component analysis (PCA) and an improved Nadaraya-Watson (N-W) nonparametric estimation method (INW), a new software failure prediction model was presented. First of all, through principal component analysis of the nonparametric estimation training sample set, the number of inputs to the nonparametric method was reduced. Then the variance contribution ratios of PCA were used as the weights of the bandwidth matrix in the nonparametric estimation method, adjusting the impact of each input factor on the results to a different extent, and software failure prediction models were built. Finally, this paper gave an example analysis based on the real software failure data set Eclipse JDT. The results show that the failure prediction model based on the improved nonparametric method further improves prediction precision and stability: within the forecast range of the last ten steps, the average error of the predicted values is 16.2575, and the mean square error is 0.0726.
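The PCA-plus-Nadaraya-Watson pipeline can be sketched directly in NumPy; the Gaussian kernel, the toy data, and the specific bandwidth scaling by variance contribution are assumptions for illustration, not the paper's INW method:

```python
# Sketch: PCA-reduced inputs fed into a Nadaraya-Watson estimator whose
# per-dimension bandwidths are scaled by PCA variance contribution ratios.
import numpy as np

def pca(X, k):
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    ratio = (s[:k] ** 2) / np.sum(s ** 2)      # variance contribution ratios
    return Xc @ Vt[:k].T, ratio

def nw_predict(Z, y, z_query, bandwidths):
    """Nadaraya-Watson estimate: kernel-weighted average of training targets."""
    d = (Z - z_query) / bandwidths
    w = np.exp(-0.5 * np.sum(d ** 2, axis=1))  # Gaussian product kernel
    return np.sum(w * y) / np.sum(w)

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)   # toy failure metric
Z, ratio = pca(X, k=2)
h = 0.5 / ratio                 # wider bandwidth for weaker components (assumed)
print(f"prediction for sample 0: {nw_predict(Z, y, Z[0], h):.3f}, actual: {y[0]:.3f}")
```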
Towards Personal Cloud Service and Resource Mashup Based on Mobile Devices
WANG Huan-huan, PENG Xin and ZHAO Wen-yun
Computer Science. 2016, 43 (6): 160-166.  doi:10.11896/j.issn.1002-137X.2016.06.033
A personal cloud is a collection of smart phones, personal computers, and smart albums that are seamlessly accessible through the network in an ordinary user's surrounding environment. It benefits both personal data processing and sharing. However, how to provide users with convenient and efficient mashup of personal cloud services and resources according to different practical requirements is still a problem demanding a prompt solution. To address this problem, we proposed a framework for personal cloud service and resource mashup based on mobile devices, covering how to manage devices and how to define services and resources. We also explained how to build a mashup application through the framework. We developed an agent-based implementation of the framework and an Android client to help users complete mashups. Finally, we evaluated the effectiveness and usability of our mashup framework and Android client through a preliminary experimental study.
Activity Pattern Mining in Software Development
JI Cai-ying, DAI Fei, LI Tong and JIANG Xu-dong
Computer Science. 2016, 43 (6): 167-172.  doi:10.11896/j.issn.1002-137X.2016.06.034
Mining the actually enacted software process (rather than the experiential process) is a critical issue. Due to the characteristics of software process event data, i.e., numerous records, no explicit activities, and other unique features, existing business process mining methods cannot support them effectively. Firstly, the atomic activities in the event log were obtained. Then, activity patterns were mined from software project development instances by a folding method, the software process was obtained, and the software development activities in different periods were analyzed. Finally, this method was verified through a case study of open source projects.
Prolog Based Approach to Validate Time Constraints in Business Process
CHEN He-wen, ZHOU Yong and YAN Xue-feng
Computer Science. 2016, 43 (6): 173-178.  doi:10.11896/j.issn.1002-137X.2016.06.035
With the rapid development of Internet technology, the demand for business process modeling of complex systems is increasing. In order to verify the correctness of business process models with time constraints, this paper put forward a graph decomposition algorithm based on node switching rules, which transforms the business process model into an execution trace set and then transforms the execution trace set into Prolog; that is, the traces of nodes, gateways, and time constraints are all converted into Prolog facts. This paper also put forward an algorithm that transforms the business process model into the Prolog language and transforms the duration time pattern, cycle time pattern, and fixed time pattern into Prolog rules, supporting validation of the business process model under these three time patterns. Finally, a medical process instance with time constraints was verified.
Extraction Approach for Software Bug Report
LIN Tao, GAO Jian-hua, FU Xue, MA Yan and LIN Yan
Computer Science. 2016, 43 (6): 179-183.  doi:10.11896/j.issn.1002-137X.2016.06.036
Bug reports in software engineering are increasing rapidly, and developers are bewildered by the accumulation of large numbers of reports. Therefore, it is necessary to study the extraction of bug reports for tasks such as bug fixing and software reuse. This paper proposed a novel extraction approach. First, synonyms are merged into one specific word. Then a vector space model is set up, and text mining methods such as TF-IDF and information gain are used to collect word features from bug reports. Meanwhile, an algorithm for determining sentence complexity is used to choose long sentences. Finally, a Bayes classifier is introduced for bug report extraction. The approach increases the TPR and decreases the FPR. The experiment shows that bug report extraction based on text mining and a Bayes classifier is competitive in terms of AUC (0.71), F-score (0.80), and Kappa value (0.75).
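The TF-IDF-plus-Bayes-classifier core is a standard scikit-learn pipeline; the sentences and labels below are toy stand-ins, and the paper's synonym merging and sentence-complexity steps are omitted:

```python
# Minimal sketch: TF-IDF features plus a naive Bayes classifier to flag
# summary-worthy sentences from a bug report.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

sentences = [
    "Crash occurs when saving a file with a long unicode name",
    "Thanks for the quick reply",
    "NullPointerException thrown in the rendering thread on startup",
    "See also the duplicate report linked above",
]
labels = [1, 0, 1, 0]   # 1 = keep in extracted report, 0 = discard

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), MultinomialNB())
clf.fit(sentences, labels)
print(clf.predict(["Exception raised when the file name contains unicode"]))
```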
Performance Comparison of Join Operations on SIMFS and EXT4
ZHAO Li-wei, CHEN Xian-zhang and ZHUGE Qing-feng
Computer Science. 2016, 43 (6): 184-187.  doi:10.11896/j.issn.1002-137X.2016.06.037
Join is the most basic and expensive operation in relational databases, and it has a great impact on database performance. Since data tables are stored in a file system, the performance of the file system essentially determines the performance of the join operation. Tests of join across different file systems are of great significance for database research, but few such tests exist. First, the differences between the data access path of the new in-memory file system SIMFS (Sustainable In-Memory File System) and the I/O path of the disk-based file system EXT4 (fourth extended file system) were compared. Then experiments were designed to test the effect of the different file systems on join operations, with test parameters such as different data block and I/O block sizes set for SIMFS and EXT4 respectively. The experimental results show that join on SIMFS and on EXT4 differs markedly in performance optimization, the effect of block size, the bottleneck of performance improvement, hardware constraints, and so on. Based on the analysis of the experimental results, suggestions for optimizing join operations on in-memory file systems were proposed.
Shortest Path Searching Algorithm Based on Geographical Coordinates and Closed Attribute in Road Network
GU Ming-hao and XU Ming
Computer Science. 2016, 43 (6): 188-193.  doi:10.11896/j.issn.1002-137X.2016.06.038
Concerning the problem of finding the shortest path in large-scale city road networks, this paper proposed an algorithm based on the Edge Clustering Tree (ECT) and the Minimal Closed Lattice (MCL) to achieve fast searching. Firstly, the city road network is preprocessed, i.e., the definition of the MCL is used to classify the road network; then the ECT is used to store the MCLs. Finally, relying on the idea of the virtual path (the shortest distance between two points is the straight-line distance), combined with the planarity of MCLs in the road network, the ECT dramatically reduces visits to useless nodes, so the algorithm reduces time complexity and achieves fast searching. Theoretical analysis and simulation experiments demonstrate that the storage space of the ECT is 45.56% less than that of the PCDC algorithm and 24.35% less than that of TNR; in terms of storage, the ECT is also slightly better than SILC. Besides, the query efficiency of the MCL approach is 15.6% higher than that of SPB. The experimental results suggest that the MCL algorithm based on ECT storage can improve query efficiency.
Improved Topology-potential-based Opinion Leader Mining Algorithm
MAO Tian-ming, GUAN Peng and PI De-chang
Computer Science. 2016, 43 (6): 194-198.  doi:10.11896/j.issn.1002-137X.2016.06.039
Opinion leader mining has important theoretical guiding significance as well as practical application value in many fields such as public opinion monitoring, marketing, and information dissemination. As traditional opinion leader mining algorithms suffer from one-sided consideration of target attributes, subjective assessment, and lack of relevance, this paper proposed an improved topology-potential-based opinion leader mining algorithm. Combining the objective attributes and network structure of specific nodes, the algorithm uses data deviation to correct subjective weights and evaluates target nodes objectively, thereby mining opinion leaders. On realistic microblog data, the proposed algorithm is superior to traditional algorithms in accuracy and relevance, and can mine opinion leaders from different backgrounds.
Link Prediction in Networks with Node Attributes Based on Random Walks Algorithm
CHEN Yong-xiang and CHEN Ling
Computer Science. 2016, 43 (6): 199-203.  doi:10.11896/j.issn.1002-137X.2016.06.040
Link prediction is an important area in complex network analysis, widely used in various domains such as sociology, anthropology, information science, and computer science. A link prediction algorithm was proposed based on propagating link similarity scores by random walk in networks with node attributes. In the algorithm, each link in the network is assigned a transmission probability according to the similarity of the attributes of the nodes it connects. The attribute similarity of the nodes is also used as the initial value of their link similarity. The link similarity between nodes is then propagated by random walk through the network according to the transmission probabilities, and the final link similarity serves as the link prediction score. Our experimental results show that the proposed algorithm obtains more accurate results than other algorithms on networks with node attributes.
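As a rough matrix-form sketch of the idea, attribute similarity can seed both the initial scores and the per-edge transition weights, after which scores are propagated for a few walk steps; the cosine similarity, restart constant, and update rule are illustrative assumptions, not the paper's exact algorithm:

```python
# Illustrative attribute-seeded random-walk score propagation for link prediction.
import numpy as np

def attr_similarity(A):
    """Cosine similarity between node attribute vectors (rows of A)."""
    norm = np.linalg.norm(A, axis=1, keepdims=True)
    return (A @ A.T) / (norm * norm.T + 1e-12)

def predict_links(adj, attrs, steps=3, restart=0.3):
    S = attr_similarity(attrs)
    T = adj * S                                # transition weights on existing links
    T = T / (T.sum(axis=1, keepdims=True) + 1e-12)
    score = S.copy()                           # attribute similarity as seed score
    for _ in range(steps):
        score = restart * S + (1 - restart) * T @ score
    return score * (1 - adj)                   # keep scores of non-existing links only

adj = np.array([[0, 1, 1, 0], [1, 0, 0, 0], [1, 0, 0, 1], [0, 0, 1, 0]], float)
attrs = np.array([[1, 0, 1], [1, 0, 0], [0, 1, 1], [0, 1, 0]], float)
print(np.round(predict_links(adj, attrs), 3))
```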
Approach to Knowledge Reduction Based on Inconsistent Confidential Dominance Principle Relation
GOU Guang-lei and WANG Guo-yin
Computer Science. 2016, 43 (6): 204-207.  doi:10.11896/j.issn.1002-137X.2016.06.041
Rough set models based on the confidential dominance relation are used to deal with incomplete ordered decision systems (IODS), in which knowledge reduction is one of the most important problems. In order to discern two objects in an IODS, their decision preference should be taken into account. This paper proposed a knowledge reduction approach based on the inconsistent confidential dominance principle relation, under which two objects are discernible. Furthermore, the judgment theorems and the discernibility matrix are investigated, from which a new approach to knowledge reduction in ordered decision systems is obtained. An example illustrates the effectiveness of the new reduction.
Deep Random Forest for Churn Prediction
YANG Xiao-feng, YAN Jian-feng, LIU Xiao-sheng and YANG Lu
Computer Science. 2016, 43 (6): 208-213.  doi:10.11896/j.issn.1002-137X.2016.06.042
Churn prediction models help telecom operators identify potential off-network users. Most previous models adopt shallow machine learning algorithms such as logistic regression, decision trees, random forests, and neural networks. This paper proposed a novel deep random forest algorithm, a multi-layer random forest with layer-wise training. On telecom operators' real data, we confirmed that the proposed deep random forest performs better than previous shallow learning algorithms in churn prediction. Moreover, increasing the volume of training data further improves the performance of the deep random forest, which implies that big data make deep models advantageous over shallow models.
Document Vector Representation Based on Word2Vec
TANG Ming, ZHU Lei and ZOU Xian-chun
Computer Science. 2016, 43 (6): 214-217.  doi:10.11896/j.issn.1002-137X.2016.06.043
In text classification, it is difficult to represent a document efficiently with word2vec word vectors alone. At present, doc2vec, built on the combination of word2vec and clustering algorithms, can express document information well; however, this method rarely considers a single word's influence on the entire document. To solve this problem, in this paper the TF-IDF algorithm was used to calculate the weights of words in documents, and word2vec was combined with these weights to generate document vectors, which were then used for Chinese text classification. Experiments on the Sogou Lab Chinese corpus demonstrate the efficiency of the newly proposed algorithm.
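The TF-IDF-weighted word2vec document vector is straightforward to sketch with gensim and scikit-learn; the corpus below is a toy stand-in, and a real Chinese pipeline would add word segmentation first:

```python
# Sketch: train word vectors, weight each word's vector by its TF-IDF score,
# and average to obtain a document vector.
import numpy as np
from gensim.models import Word2Vec
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [["machine", "learning", "text", "classification"],
        ["deep", "learning", "text", "mining"],
        ["word", "vector", "document", "representation"]]

w2v = Word2Vec(docs, vector_size=50, min_count=1, seed=0)
tfidf = TfidfVectorizer(analyzer=lambda d: d)      # docs are pre-tokenized
weights = tfidf.fit_transform(docs)
vocab = tfidf.get_feature_names_out()

def doc_vector(i):
    row = weights[i].toarray().ravel()             # TF-IDF weights of doc i
    vecs = np.array([w2v.wv[w] for w in vocab])    # word vectors, vocab order
    return (row @ vecs) / (row.sum() + 1e-12)      # TF-IDF weighted mean

print(doc_vector(0)[:5])
```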
New Heuristic Algorithm for Attribute Reduction in Decision-theoretic Rough Set
CHANG Hong-yan and MENG Zu-qiang
Computer Science. 2016, 43 (6): 218-222.  doi:10.11896/j.issn.1002-137X.2016.06.044
Attribute reduction is one of the most important research topics in rough set theory. Scholars have proposed various definitions of attribute reduction in decision-theoretic rough sets, including the definition that keeps the positive decisions of all objects unchanged. Directed at this positive decision definition, in order to obtain the reduction set efficiently, a heuristic function, the decision importance degree, is designed. This heuristic function defines the decision importance degree of every attribute according to the size of the positive decision object set: the bigger the positive decision object set, the greater the importance. A heuristic attribute reduction algorithm based on the decision importance degree is then constructed. The advantage of this algorithm is that it determines the search direction by sorting attributes by decision importance degree, avoids computing attribute combinations, and can therefore reduce the amount of calculation and find a smaller reduction set. The experimental results show that the algorithm is effective and achieves a good reduction effect.
Research of Distributed Integration Algorithm on Concept Lattices
FAN Shu-yuan, WANG Li-ming, JIANG Qin and ZHANG Zhuo
Computer Science. 2016, 43 (6): 223-228.  doi:10.11896/j.issn.1002-137X.2016.06.045
With the advent of the big data era, the distributed storage and computing of massive data are increasingly important, and the distributed integration of concept lattices is particularly urgent. In order to solve the problem of the long construction time of concept lattices, this paper put forward a distributed integration algorithm for concept lattices. Integration is defined as follows: the concepts in each sub-concept lattice are sorted in decreasing order of intent size, and the sub-concept lattices are then integrated into a global concept lattice. Two integration strategies are used to construct the global concept lattice: in the first, additive lattice merging, the master node receives and integrates the sub-concept lattices coming from all child nodes; in the second, two-way merging, the sub-concept lattices from the child nodes are first integrated, and the master node then receives and integrates the resulting sub-concept lattices. The experiments show that the two distributed integration strategies have their own advantages and disadvantages, but both can effectively reduce the time for constructing concept lattices.
Big Data Clustering Algorithm Based on Chaotic Correlation Dimensions Feature Extraction
XIE Chuan
Computer Science. 2016, 43 (6): 229-232.  doi:10.11896/j.issn.1002-137X.2016.06.046
Big data clustering is a stochastic, nonlinear process with very high uncertainty. Because traditional methods require prior knowledge for learning, cannot adapt well to the real-time changes of big data, and thus cannot implement big data clustering effectively, we put forward a big data clustering method based on chaotic correlation dimension feature extraction. We analyzed the shortcomings of traditional methods and established a multidimensional state space vector and the chaotic trajectory by phase space reconstruction. Much of the geometric characteristic information of the original system is preserved, which provides an effective basis for analyzing the chaotic characteristics of the original system. The time delay at which the average mutual information reaches its first minimum is taken as the best time delay for reconstructing the phase space, and the false nearest neighbor algorithm is used to select the best embedding dimension. The extracted correlation dimension is used as the chaotic correlation feature for big data clustering, and the data are clustered based on this feature. The simulation results show that the proposed algorithm can effectively improve clustering efficiency and reduce energy consumption, and is an effective method for data clustering.
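Phase space reconstruction by time-delay embedding, the step this abstract builds on, is compact to sketch; the delay and embedding dimension below are fixed by hand rather than chosen by mutual information and false nearest neighbors, and the signal is a toy stand-in:

```python
# Sketch of time-delay embedding: map a scalar series x(t) to vectors
# [x(t), x(t+tau), ..., x(t+(dim-1)*tau)] that approximate the system's state.
import numpy as np

def delay_embed(x, dim, tau):
    n = len(x) - (dim - 1) * tau
    return np.column_stack([x[i * tau : i * tau + n] for i in range(dim)])

t = np.arange(0, 60, 0.05)
x = np.sin(t) + 0.5 * np.sin(2.3 * t)    # toy nonlinear signal
Y = delay_embed(x, dim=3, tau=10)
print(Y.shape)                            # (n_points, 3) reconstructed states
```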
Research on Algorithm Design for Prediction Market-based Contracts Trading Focusing on Collective Intelligence
YANG Xiao-xian and LV Bin
Computer Science. 2016, 43 (6): 233-239.  doi:10.11896/j.issn.1002-137X.2016.06.047
To sufficiently utilize the data, information, and knowledge we have, the key issue is how to converge dispersed and tacit resources. For this purpose, this paper first analyzed the essential features of prediction markets and then gave three key algorithms: a contract purchase algorithm, an inter-user contract exchange algorithm, and a contract settlement algorithm. The fluctuating contract price reveals the trend of future events; the changing price reflects the impact of all factors and the market's collective consensus on the event outcome, providing a reference for event prediction. Finally, the aggregation capability and effectiveness of our method are demonstrated by experiments and data analysis.
Improved TextRank-based Method for Automatic Summarization
YU Shan-shan, SU Jin-dian and LI Peng-fei
Computer Science. 2016, 43 (6): 240-247.  doi:10.11896/j.issn.1002-137X.2016.06.048
The canonical TextRank usually considers only the similarity between sentences in automatic summarization and neglects text structure and sentence context information. To overcome these disadvantages, we proposed an improved method based on TextRank, called iTextRank, that incorporates the structural information of Chinese texts. iTextRank takes important context and semantic information into consideration, including titles, paragraphs, special sentences, and sentence positions and lengths, when building the TextRank network diagram, computing sentence similarities, and adjusting node weights. We also applied iTextRank to the automatic summarization of Chinese texts and analyzed its time complexity. Finally, experiments were conducted, and the results show that iTextRank has a higher accuracy rate and a lower recall rate than the canonical TextRank.
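A bare-bones TextRank baseline is easy to express with networkx; here the iTextRank additions (title, position, and length weights) are only mimicked by an optional personalization bias, and the word-overlap similarity is a simplification of what the paper uses:

```python
# Minimal TextRank-style sentence ranking; node biases stand in (loosely)
# for iTextRank's structural weight adjustments.
import networkx as nx

def overlap(s1, s2):
    w1, w2 = set(s1.lower().split()), set(s2.lower().split())
    return len(w1 & w2) / (len(w1 | w2) or 1)

def rank_sentences(sents, bias=None):
    g = nx.Graph()
    g.add_nodes_from(range(len(sents)))
    for i in range(len(sents)):
        for j in range(i + 1, len(sents)):
            sim = overlap(sents[i], sents[j])
            if sim > 0:
                g.add_edge(i, j, weight=sim)
    scores = nx.pagerank(g, weight="weight", personalization=bias)
    return sorted(range(len(sents)), key=scores.get, reverse=True)

sents = ["the cat sat on the mat",
         "a cat was sitting on a mat",
         "stock prices rose sharply today"]
bias = {0: 0.5, 1: 0.25, 2: 0.25}   # e.g. boost sentences near the title
print(rank_sentences(sents, bias))
```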
Implicit Feedback Personalized Recommendation Model Fusing Context-aware and Social Network Process
YU Chun-hua, LIU Xue-jun and LI Bin
Computer Science. 2016, 43 (6): 248-253.  doi:10.11896/j.issn.1002-137X.2016.06.049
As a key solution to the problem of information overload, recommender systems can filter large amounts of information according to a user's preferences and provide personalized recommendations. This paper explored personalized recommendation based on implicit feedback and proposed a recommendation model, namely the implicit feedback recommendation model fusing context-aware and social network process (IFCSP), a novel context-aware recommender system incorporating processed social network information. The model handles contextual information by applying a decision tree algorithm to classify the original user-item-context selections so that selections with similar contexts are grouped. The implicit feedback recommendation model (IFRM) is then employed on the partitioned matrix to predict a user's preference for a non-selected item. To incorporate social network information, a regularization term is introduced into the IFRM objective function to infer a user's preference for an item by learning opinions from friends who are expected to share similar tastes. The study provides comparative experimental results on the typical Douban and MovieLens-1M data sets. The results show that the proposed approach outperforms state-of-the-art recommendation algorithms in terms of mean average precision (MAP) and mean percentage ranking (MPR).
Set Similarity Approximation Algorithm Based on Parity of Data Sketch
JIA Jian-wei and CHEN Ling
Computer Science. 2016, 43 (6): 254-256.  doi:10.11896/j.issn.1002-137X.2016.06.050
Abstract PDF(567KB) ( 669 )   
References | Related Articles | Metrics
Jaccard similarity is one of the most important measures in set similarity computation. When the Jaccard similarity of two sets is approximated with the b-bit hash function and several elements have similarity close to 1 with the input element, the b-bit hash function cannot differentiate these elements well. To improve the accuracy of the data sketch and the performance of applications based on set similarity, this paper proposed a set similarity approximation algorithm based on the parity of the data sketch. After obtaining the two permutation sets with the minwise hash function, we used two n-bit indicator vectors to record the parity of the elements of the permutation sets, and estimated the Jaccard similarity of the original sets from these two parity vectors. We derived the parity sketch under both Markov chain and Poisson distribution models, and verified their equivalence. Experiments on the Enron dataset show that the proposed parity sketch is more accurate than the b-bit hash function, and performs much better in both duplicate document detection and association rule mining.
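For background on the baseline both sketches compress, here is a minimal minwise-hashing estimate of Jaccard similarity. The paper's parity vectors themselves are not reproduced; the salted-hash simulation of random permutations below is an illustrative shortcut, not production hashing.

```python
import random

def minhash_signature(items, num_perm=128, seed=42):
    """Minwise-hash signature: the minimum hash value per 'permutation'."""
    rnd = random.Random(seed)
    # Simulate independent permutations with random salts (an illustrative
    # stand-in; real implementations use universal or tabulation hashing).
    salts = [rnd.getrandbits(64) for _ in range(num_perm)]
    return [min(hash((salt, x)) for x in items) for salt in salts]

def estimate_jaccard(sig_a, sig_b):
    """Fraction of matching signature slots estimates |A∩B| / |A∪B|."""
    agree = sum(a == b for a, b in zip(sig_a, sig_b))
    return agree / len(sig_a)

A, B = set(range(0, 80)), set(range(40, 120))   # true Jaccard = 40/120 = 1/3
print(estimate_jaccard(minhash_signature(A), minhash_signature(B)))
```

The b-bit variant keeps only b bits of each minimum; the paper's parity vectors compress the sketch differently, keeping one parity bit per position.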
Collaborative Filtering Recommendation Based on Random Walk Model in Trust Network
HE Ming, LIU Wei-shi and WEI Zheng
Computer Science. 2016, 43 (6): 257-262.  doi:10.11896/j.issn.1002-137X.2016.06.051
Abstract PDF(484KB) ( 857 )   
References | Related Articles | Metrics
Collaborative filtering is one of the most widely used techniques in recommender systems and has been applied successfully in many applications. However, it suffers from serious cold-start and data-sparsity problems, and such methods cannot indicate confidence in their recommendations. In this paper, we improved the random walk model by combining trust-based and item-based collaborative filtering for recommendation, introducing the trust factor as an important guide for recommendations. The random walk model considers not only the ratings of the target item but also those of similar items: the probability of using a similar item's rating instead of a rating of the target item itself increases with the length of the walk. Our framework contains both trust-based and item-based collaborative filtering as special cases. Empirical analysis on the Epinions dataset demonstrates that our method provides better recommendation results on the evaluation metrics than other algorithms.
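A minimal sketch of the walk's control flow follows, assuming a simple k/max_steps switching probability. The paper's actual transition and termination probabilities are not given in the abstract, so every probability, parameter and helper name here is a hypothetical stand-in.

```python
import random

def random_walk_predict(user, item, trust, ratings, similar, max_steps=6):
    """Walk along trust edges to predict `user`'s rating of `item`.

    At step k, if the current user rated the item, return that rating;
    otherwise, with probability k / max_steps (growing with walk length,
    as the abstract describes), fall back to a rating of a similar item.
    """
    current, rng = user, random.Random(0)
    for k in range(1, max_steps + 1):
        if (current, item) in ratings:
            return ratings[(current, item)]
        for sim_item in similar.get(item, []):
            if (current, sim_item) in ratings and rng.random() < k / max_steps:
                return ratings[(current, sim_item)]
        neighbours = trust.get(current, [])
        if not neighbours:
            break
        current = rng.choice(neighbours)
    return None  # walk failed; a real system would fall back to a baseline

trust = {"u1": ["u2"], "u2": ["u3"]}
ratings = {("u3", "i1"): 4.0}
print(random_walk_predict("u1", "i1", trust, ratings, similar={}))
```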
Research for Uncertain Data Clustering Algorithm:U-PAM and UM-PAM Algorithm
HE Yun-bin, ZHANG Zhi-chao, WAN Jing and LI Song
Computer Science. 2016, 43 (6): 263-269.  doi:10.11896/j.issn.1002-137X.2016.06.052
Abstract PDF(557KB) ( 600 )   
References | Related Articles | Metrics
The UK-means algorithm is very sensitive to outliers when dealing with uncertain data, and it requires the probability density or distribution function of the uncertain data in advance, which is often difficult to obtain in practice. To address these shortcomings of UK-means on uncertain measurement data, this paper first proposed a new algorithm, U-PAM, based on the PAM algorithm and intervals: it describes the uncertainty of measurement data with intervals derived from the standard deviation, so that clustering can be completed effectively. Secondly, since clustering massive data is often difficult, this paper proposed the UM-PAM algorithm, which uses sampling techniques to handle massive uncertain measurement data efficiently by first clustering the sampled data and then the whole dataset. Finally, the U-PAM algorithm is combined with the CH validity index to analyze the clustering result and determine the optimal number of clusters. Experimental results show that the proposed algorithms produce clearly effective clustering results.
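The following toy sketch shows the interval representation and a PAM-style medoid selection over it. The endpoint-based interval distance and the brute-force medoid search are illustrative assumptions; the paper's actual distance definition and the sampling step of UM-PAM are not reproduced.

```python
import itertools

def interval_distance(a, b):
    """Distance between two intervals (lo, hi): endpoint L1 distance.

    The paper represents an uncertain measurement as an interval built
    from its standard deviation; the specific metric it uses is not
    reproduced here, so this endpoint-based form is an assumption.
    """
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def pam_cost(medoids, data):
    """Total cost of assigning every interval to its nearest medoid."""
    return sum(min(interval_distance(x, m) for m in medoids) for x in data)

def pam(data, k=2):
    """Brute-force PAM: pick the k medoids minimizing total cost
    (fine for a toy example; real PAM swaps medoids iteratively)."""
    return min(itertools.combinations(data, k), key=lambda m: pam_cost(m, data))

# Each uncertain measurement x with std s becomes the interval (x-s, x+s).
data = [(0.9, 1.1), (1.0, 1.4), (4.8, 5.2), (5.1, 5.3)]
print(pam(data))
```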
Running Gait Planning and Control for Humanoid Robot Based on Energy Efficiency Optimization
YANG Liang, FU Yu, FU Gen-ping and DENG Chun-jian
Computer Science. 2016, 43 (6): 270-275.  doi:10.11896/j.issn.1002-137X.2016.06.053
Abstract PDF(482KB) ( 853 )   
References | Related Articles | Metrics
A novel parametric running gait optimization algorithm was proposed to address the high energy consumption of humanoid robots, which limits their practical application. After analyzing the impact of different gait parameters on horizontal and vertical stability, the gait planning problem is transformed into a multi-objective optimization problem, and expressions for stability and energy consumption are derived from a connected link model. To achieve an ideal running gait, a method based on an opposition-based learning genetic algorithm was proposed, which reduces energy consumption while maintaining a good stability margin on the pitch, roll and yaw axes. In view of the premature convergence and slow convergence of the traditional genetic algorithm, an effective policy of initializing the population based on domain knowledge was developed, and the population is updated by generating opposite individuals. To improve trajectory tracking performance, an adaptive controller was designed and a stability proof was provided. Simulation results demonstrate the validity of the method.
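The opposition-based initialization the abstract alludes to has a standard recipe, sketched below: generate a random population, form each individual's opposite within the parameter bounds, and keep the fitter half. The toy gait-parameter fitness function is a placeholder, not the paper's energy/stability objective.

```python
import numpy as np

def opposition_init(pop_size, lo, hi, fitness, rng=np.random.default_rng(1)):
    """Opposition-based initial population for a genetic algorithm.

    Generate a random population, form each individual's opposite
    (lo + hi - x per gene), then keep the best `pop_size` of the union --
    the standard opposition-based learning recipe.
    """
    pop = rng.uniform(lo, hi, size=(pop_size, len(lo)))
    opposite = lo + hi - pop
    union = np.vstack([pop, opposite])
    scores = np.apply_along_axis(fitness, 1, union)
    return union[np.argsort(scores)[:pop_size]]     # assuming minimization

# Toy gait-parameter fitness: penalize distance from a nominal gait.
lo, hi = np.array([0.0, 0.0]), np.array([1.0, 2.0])
best = opposition_init(8, lo, hi, fitness=lambda x: np.sum((x - 0.5) ** 2))
print(best)
```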
Piecewise Smooth Semi-supervised Support Vector Machine for Classification
FAN Xu-hui, ZHANG Jie and BAN Deng-ke
Computer Science. 2016, 43 (6): 276-279.  doi:10.11896/j.issn.1002-137X.2016.06.054
Abstract PDF(287KB) ( 576 )   
References | Related Articles | Metrics
To deal with the non-smooth and non-convex objective of the semi-supervised support vector machine, a piecewise function was proposed to approximate the non-convex and non-smooth objective function; the closeness of the piecewise approximation can be chosen according to the required accuracy. A new piecewise smooth semi-supervised support vector machine (PWSS3VM) model based on this piecewise function was constructed, the LDS algorithm was applied to solve the model, and its approximation of the symmetric hinge loss function was analyzed. Theoretical analysis and numerical experiments confirm that the PWSS3VM model has better classification performance and higher classification efficiency than previous smooth models.
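To make the smoothing idea concrete, here is a sketch of the non-smooth symmetric hinge loss on unlabeled points, max(0, 1-|t|), together with one possible piecewise smooth surrogate that replaces its kinks with quadratic caps of width `eps`. The paper's own piecewise function is not reproduced; this construction only illustrates trading approximation accuracy (via `eps`) for smoothness.

```python
import numpy as np

def symmetric_hinge(t):
    """Non-smooth, non-convex loss on unlabeled points: max(0, 1 - |t|)."""
    return np.maximum(0.0, 1.0 - np.abs(t))

def piecewise_smooth_hinge(t, eps=0.1):
    """Piecewise smooth surrogate: quadratic caps near the kinks at
    t = 0 and |t| = 1, linear elsewhere.  Values and first derivatives
    match at the joins, so the surrogate is C^1; shrinking eps tightens
    the approximation.  (An illustrative construction, not the paper's.)
    """
    a = np.abs(t)
    out = np.where(a <= eps,                      # smooth the peak at 0
                   1.0 - a**2 / (2*eps) - eps/2,
                   1.0 - a)
    out = np.where(np.abs(a - 1.0) <= eps,        # smooth the kink at |t|=1
                   (1.0 + eps - a)**2 / (4*eps),
                   out)
    return np.where(a >= 1.0 + eps, 0.0, out)

t = np.linspace(-2, 2, 9)
print(symmetric_hinge(t))
print(piecewise_smooth_hinge(t))
```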
Hellinger Distance Based Similarity Analysis for Categorical Variables in Mixture Dataset
ZHAO Liang, LIU Jian-Hui and WANG Xing
Computer Science. 2016, 43 (6): 280-282.  doi:10.11896/j.issn.1002-137X.2016.06.055
Abstract PDF(305KB) ( 989 )   
References | Related Articles | Metrics
Similarity analysis of categorical variables is an important part of data mining. Traditional methods neglect the differences between categorical values, are seriously affected by unbalanced datasets, and cannot be used on mixed datasets. To overcome these shortcomings, this paper proposed an algorithm that measures the similarity between categorical values based on the Hellinger distance. It accumulates, over the other attributes, the differences between the distributions induced in the subsets corresponding to each categorical value, and it is suitable for mixed datasets. Experiments that apply the derived similarity metric in a clustering algorithm on UCI datasets show significant improvements in accuracy, validity and stability.
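A minimal sketch of the core computation follows: for two values of a categorical attribute, compare the distributions they induce over another attribute using the Hellinger distance. The paper accumulates such differences over all remaining attributes; one attribute is shown here for brevity, and the toy data and helper names are illustrative.

```python
import math
from collections import Counter

def hellinger(p, q):
    """Hellinger distance between two discrete distributions (dicts)."""
    keys = set(p) | set(q)
    return math.sqrt(0.5 * sum((math.sqrt(p.get(k, 0.0)) -
                                math.sqrt(q.get(k, 0.0))) ** 2 for k in keys))

def conditional_dist(rows, value, attr_idx, value_idx=0):
    """Distribution of attribute `attr_idx` over the rows whose column
    `value_idx` equals `value` -- the per-value subset the method uses."""
    subset = [r[attr_idx] for r in rows if r[value_idx] == value]
    counts = Counter(subset)
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

# Toy dataset of (color, size): how similar are the values 'red' and
# 'blue'?  Compare the size distributions they induce.
rows = [("red", "S"), ("red", "S"), ("red", "M"),
        ("blue", "M"), ("blue", "L"), ("blue", "L")]
p = conditional_dist(rows, "red", attr_idx=1)
q = conditional_dist(rows, "blue", attr_idx=1)
print(hellinger(p, q))
```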
Model of Three-way Decision Theory Based on Intuitionistic Fuzzy Sets
XUE Zhan-ao, ZHU Tai-long, XUE Tian-yu, LIU Jie and WANG Nan
Computer Science. 2016, 43 (6): 283-288.  doi:10.11896/j.issn.1002-137X.2016.06.056
Abstract PDF(801KB) ( 650 )   
References | Related Articles | Metrics
Three-way decision theory is an important theoretical underpinning for dealing with uncertain decision problems. In recent years, the study of three-way decision theory has been one of the research hotspots for scholars both at home and abroad. The model of three-way decision theory was studied here on the basis of decision-theoretic rough sets and intuitionistic fuzzy set theory. The 2-state and 3-state models were constructed from intuitionistic fuzzy set theory and extended to a general model. The threshold values were reset using the hesitancy degree, and event objects were assessed through membership functions. Finally, the validity of the method is demonstrated by a case study of organochlorine insecticide contamination in sediments and the physicochemical characteristics of sediments of the Huaihe River.
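The three-way split itself can be sketched in a few lines. Below, an object with intuitionistic fuzzy membership `mu` and non-membership `nu` is accepted, rejected or deferred against thresholds alpha and beta; tightening the thresholds by half the hesitancy degree is an illustrative assumption standing in for the paper's hesitancy-based threshold resetting.

```python
def three_way_decide(mu, nu, alpha=0.7, beta=0.4):
    """Three-way decision on an intuitionistic fuzzy evaluation.

    `mu` is the membership degree, `nu` the non-membership degree
    (mu + nu <= 1); pi = 1 - mu - nu is the hesitancy.  Accept when
    membership clears alpha, reject when it falls below beta, defer
    otherwise.  The pi-based tightening below is an assumption, not
    the paper's exact threshold-resetting rule.
    """
    pi = 1.0 - mu - nu                      # hesitancy degree
    alpha_adj = alpha + pi / 2              # more hesitancy -> stricter accept
    beta_adj = beta - pi / 2                # ... and stricter reject
    if mu >= alpha_adj:
        return "accept"
    if mu <= beta_adj:
        return "reject"
    return "defer"                          # boundary region: delay decision

print(three_way_decide(0.85, 0.10))   # accept
print(three_way_decide(0.30, 0.60))   # reject
print(three_way_decide(0.55, 0.20))   # defer
```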
Method of Key Frames Extraction Based on Double-threshold Values Sliding Window Sub-shot Segmentation and Fully Connected Graph
ZHONG Xian, YANG Guang and LU Yan-sheng
Computer Science. 2016, 43 (6): 289-293.  doi:10.11896/j.issn.1002-137X.2016.06.057
Abstract PDF(927KB) ( 642 )   
References | Related Articles | Metrics
With the development of multimedia technology, multimedia information is becoming ever more common in our life and work, and efficiently retrieving useful information from massive amounts of video is an increasingly serious problem. To address it, this paper presented a key frame extraction method based on double-threshold sliding-window sub-shot segmentation and a fully connected graph. Firstly, a double-threshold shot segmentation method detects the abrupt (cut) boundaries and the gradual-transition boundaries of shots with a double-threshold sliding window, dividing the video into shots. Then a sliding-window sub-shot segmentation method slides a window over the frame sequence and further divides each shot according to the frame differences within the window. Finally, key frames are extracted from each sub-shot by regarding the sub-shot as a fully connected graph, where each vertex is a frame and each edge weight is a frame difference. The experimental results show that the proposed method achieves higher average accuracy with fewer key frames on average than the baselines, so it can extract the key frames of a video efficiently.
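A minimal sketch of the double-threshold idea on a sequence of frame differences is given below: one threshold catches cuts, while a sustained run between the two thresholds flags a gradual transition. The threshold values, the run length and the helper name are illustrative assumptions; the sub-shot windowing and graph-based key-frame selection are not reproduced.

```python
def segment_shots(frame_diffs, t_high=40.0, t_low=15.0, min_run=3):
    """Double-threshold shot boundary detection over frame differences.

    A difference above `t_high` marks an abrupt cut; a run of at least
    `min_run` consecutive differences between `t_low` and `t_high`
    marks a gradual transition.
    """
    boundaries, run = [], 0
    for i, d in enumerate(frame_diffs):
        if d >= t_high:
            boundaries.append(i)            # cut (abrupt boundary)
            run = 0
        elif d >= t_low:
            run += 1
            if run == min_run:
                boundaries.append(i)        # gradual-transition boundary
        else:
            run = 0
    return boundaries

diffs = [3, 5, 80, 4, 20, 22, 25, 6, 4]     # one cut, one gradual transition
print(segment_shots(diffs))                 # -> [2, 6]
```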
Vehicle Detection Research Based on Adaptive SILTP Algorithm
LI Fei, ZHANG Xiao-hong, ZHAO Chen-qiu and YAN Meng
Computer Science. 2016, 43 (6): 294-297.  doi:10.11896/j.issn.1002-137X.2016.06.058
Abstract PDF(856KB) ( 552 )   
References | Related Articles | Metrics
This paper presented an adaptive SILTP algorithm, based on the SILTP operator, to improve the efficiency of vehicle detection against complex backgrounds. Detection starts with a two-dimensional discrete wavelet transform of the image. Next, the texture information of vehicle images is extracted with the adaptive SILTP algorithm, a Gaussian mixture model is used for background modeling, and the texture information of each new image is used to update the background dynamically. Finally, moving vehicles are obtained by comparison with the background model. The results demonstrate that this detection algorithm achieves high detection efficiency under complex backgrounds and has strong adaptability.
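For reference, a sketch of the plain SILTP texture operator follows (a 4-neighbour variant for brevity; the usual operator uses 8 neighbours). Each neighbour is coded against a tolerance band around the centre pixel, which is what makes the pattern robust to intensity scaling; the paper's adaptive choice of the tolerance `tau` is not reproduced here.

```python
import numpy as np

def siltp(image, tau=0.05):
    """Scale-invariant local ternary pattern over a 4-neighbourhood.

    Each neighbour is coded 1 if it exceeds (1+tau)*center, 2 if it
    falls below (1-tau)*center, else 0; the 2-bit codes are concatenated
    into one integer per interior pixel.
    """
    img = image.astype(float)
    c = img[1:-1, 1:-1]
    code = np.zeros_like(c, dtype=np.int32)
    neighbours = [img[:-2, 1:-1], img[2:, 1:-1], img[1:-1, :-2], img[1:-1, 2:]]
    for n in neighbours:
        bits = np.where(n > (1 + tau) * c, 1, np.where(n < (1 - tau) * c, 2, 0))
        code = code * 4 + bits              # append 2 bits per neighbour
    return code

frame = np.array([[10, 10, 10, 10],
                  [10, 50, 52, 10],
                  [10, 10, 10, 10]])
print(siltp(frame))
```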
Coefficient-similarity-based Dictionary Learning Algorithm for Face Recognition
SHI Jing-lan, CHANG Kan, ZHANG Zhi-yong and QIN Tuan-fa
Computer Science. 2016, 43 (6): 298-302.  doi:10.11896/j.issn.1002-137X.2016.06.059
Abstract PDF(944KB) ( 591 )   
References | Related Articles | Metrics
Using a compact dictionary obtained by sparse learning can greatly improve the accuracy and speed of classification in sparse-representation-based face recognition. However, the traditional metaface learning (MFL) method does not take into account the similarity among training samples from the same person. To exploit this prior information and make the learned dictionary more discriminative, an algorithm called coefficient-similarity-based metaface learning (CS-MFL) was proposed. In CS-MFL, coefficient similarity is incorporated as a new constraint into the original objective function. To solve the new optimization problem, the two l2-norm-based terms are combined, turning the original problem into a typical l2-l1 problem. Experiments on different face databases show that the proposed CS-MFL algorithm achieves a higher recognition rate than MFL, demonstrating that the dictionary learned by CS-MFL is more efficient and discriminative than that of traditional MFL for face recognition.
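Since the abstract reduces the learning step to "a typical l2-l1 problem", here is a minimal ISTA (iterative soft-thresholding) sketch for that generic problem, min_x 0.5||y - Dx||^2 + lam||x||_1. The coefficient-similarity constraint of CS-MFL is folded into the l2 term in the paper and is not reproduced here.

```python
import numpy as np

def ista(D, y, lam=0.1, iters=200):
    """Solve min_x 0.5*||y - D x||^2 + lam*||x||_1 by ISTA:
    a gradient step on the smooth part, then soft-thresholding."""
    L = np.linalg.norm(D, 2) ** 2           # Lipschitz constant of gradient
    x = np.zeros(D.shape[1])
    for _ in range(iters):
        grad = D.T @ (D @ x - y)
        z = x - grad / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return x

rng = np.random.default_rng(0)
D = rng.normal(size=(20, 40))               # toy dictionary
x_true = np.zeros(40)
x_true[[3, 17]] = [1.0, -2.0]               # sparse ground truth
print(np.round(ista(D, D @ x_true), 2)[[3, 17]])   # near [1.0, -2.0]
```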
Forward and Unsupervised Convolutional Neural Network Based Face Representation Learning Method
ZHU Tao, REN Hai-jun and HONG Wei-jun
Computer Science. 2016, 43 (6): 303-307.  doi:10.11896/j.issn.1002-137X.2016.06.060
Abstract PDF(935KB) ( 556 )   
References | Related Articles | Metrics
Existing face representation learning methods based on deep convolutional neural networks demand massive labeled face datasets, and in real-world applications it is difficult to annotate the labels of face datasets precisely. In this paper, an unsupervised, forward convolutional neural network based face representation learning algorithm was proposed. By design, virtual labels of the training samples are obtained by K-means clustering, and the convolution kernels are then learnt by linear discriminant analysis. The network architecture is simple and effective and needs no back propagation during training, so training is much faster than for a supervised deep convolutional neural network. The experimental results demonstrate that the proposed method outperforms state-of-the-art unsupervised feature learning algorithms and local feature descriptors on both the real-world Labeled Faces in the Wild (LFW) dataset and the classical controlled FERET dataset.
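The clustering-then-LDA step can be sketched as below: random patches get virtual labels from K-means, and LDA projection directions, reshaped to patch size, serve as convolution kernels. The patch size, cluster count and sampling scheme are illustrative assumptions, and this is only a rough analogue of the paper's pipeline, not its architecture.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def learn_kernels(images, patch=5, n_clusters=8, n_kernels=4, seed=0):
    """Learn convolution kernels without labels: K-means assigns virtual
    labels to patches, then LDA projection directions (reshaped to patch
    size) serve as the kernels.  All sizes are illustrative choices."""
    rng = np.random.default_rng(seed)
    patches = []
    for img in images:
        for _ in range(50):                 # sample random patches per image
            r = rng.integers(0, img.shape[0] - patch)
            c = rng.integers(0, img.shape[1] - patch)
            patches.append(img[r:r+patch, c:c+patch].ravel())
    X = np.array(patches, dtype=float)
    X -= X.mean(axis=1, keepdims=True)      # remove patch mean (DC component)
    virtual = KMeans(n_clusters=n_clusters, n_init=10,
                     random_state=seed).fit_predict(X)
    lda = LinearDiscriminantAnalysis(n_components=n_kernels).fit(X, virtual)
    return lda.scalings_[:, :n_kernels].T.reshape(n_kernels, patch, patch)

faces = [np.random.default_rng(i).random((32, 32)) for i in range(4)]
print(learn_kernels(faces).shape)           # (4, 5, 5)
```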
Pedestrian Detection Based on PCA Dimension Reduction of Multi-feature Cascade
GAN Ling, ZOU Kuan-zhong and LIU Xiao
Computer Science. 2016, 43 (6): 308-311.  doi:10.11896/j.issn.1002-137X.2016.06.061
Abstract PDF(841KB) ( 618 )   
References | Related Articles | Metrics
In pedestrian detection, the histogram of oriented gradients (HOG) feature suffers from excessive redundant information and low detection speed. This paper therefore proposed a cascaded-feature pedestrian detection method based on PCA dimensionality reduction. Firstly, PCA is used to reduce the dimensionality of the HOG features; then the reduced HOG features, Gabor features and color features are cascaded as the features for pedestrian detection; finally, an SVM with a radial basis function (RBF) kernel performs the classification. Experiments on the INRIA pedestrian database show that this method not only increases classification speed but also improves detection accuracy.
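The pipeline reduces to a few lines with standard tooling; the sketch below uses random stand-in HOG vectors so it stays self-contained, and the retained PCA dimension is an illustrative choice. The paper additionally cascades Gabor and color features, which are omitted here.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Stand-in HOG vectors: in practice these come from a HOG extractor
# (e.g. skimage.feature.hog on 64x128 windows); random data here only
# keeps the sketch runnable.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3780))            # 3780-dim HOG of a 64x128 window
y = rng.integers(0, 2, size=200)            # 1 = pedestrian, 0 = background

# PCA shrinks the redundant HOG representation before the RBF-kernel SVM;
# the retained dimension (100) is an illustrative choice.
clf = make_pipeline(PCA(n_components=100), SVC(kernel="rbf"))
clf.fit(X[:150], y[:150])
print(clf.score(X[150:], y[150:]))          # held-out accuracy
```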
Kernel-based Supervised Neighborhood Projection Analysis Algorithm
ZHENG Jian-wei, KONG Chen-chen, WANG Wan-liang, QIU Hong and ZHANG Hang-ke
Computer Science. 2016, 43 (6): 312-315.  doi:10.11896/j.issn.1002-137X.2016.06.062
Abstract PDF(931KB) ( 576 )   
References | Related Articles | Metrics
A new algorithm called KSNPA, which exhibits a nonlinear form of discriminative elastic embedding (DEE), was proposed. KSNPA integrates class labels and a linear projection matrix into the final objective function, and uses a kernel function to handle nonlinear embedding. According to two different strategies for optimizing the objective function, the algorithm is divided into kernel-based supervised neighborhood projection analysis algorithm 1 (KSNPA1) and kernel-based supervised neighborhood projection analysis algorithm 2 (KSNPA2). Furthermore, a deliberately selected search direction, termed the Laplacian direction, is applied in KSNPA1 to achieve a faster convergence rate and lower computational complexity. Experimental results on several databases demonstrate that the proposed algorithm achieves powerful pattern-revealing capability for complex manifold data, and that it is more efficient and robust than DEE and related dimensionality reduction algorithms.
Research on Recognition Algorithm for Subject Web Pages Based on Tag Tree Adjacency Matrix
SONG Jun, YANG Xiao-fu, LI Yi-cai and WANG Jia-wei
Computer Science. 2016, 43 (6): 316-320.  doi:10.11896/j.issn.1002-137X.2016.06.063
Abstract PDF(641KB) ( 570 )   
References | Related Articles | Metrics
With the development of Web programming techniques, subject pages of the same type can present the same visual features using different HTML tags. As a result, existing Web structure similarity algorithms, which measure page structure similarity by matching HTML tag names, cannot accurately recognize same-type subject pages. We therefore proposed a recognition algorithm for same-type subject pages based on the tag-tree adjacency matrix. The algorithm constructs the adjacency matrix of a page's tag tree and recognizes same-type subject pages by computing the structural similarity between pages through their tag-tree adjacency matrices. The experimental results indicate that the optimal performance of the algorithm reaches 100% recall and 96% precision, and the average performance reaches 97% recall and 89% precision.
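The key point, that structure can match while tag names differ, is easy to demonstrate. The sketch below builds each page's tag-tree adjacency matrix and compares adjacency spectra; the eigenvalue-based similarity is an illustrative stand-in, since the paper's exact matrix comparison is not reproduced.

```python
import numpy as np
from html.parser import HTMLParser

class TagTree(HTMLParser):
    """Collect parent->child edges of an HTML tag tree, ignoring tag names."""
    def __init__(self):
        super().__init__()
        self.stack, self.edges, self.count = [], [], 0
    def handle_starttag(self, tag, attrs):
        node = self.count
        self.count += 1
        if self.stack:
            self.edges.append((self.stack[-1], node))
        self.stack.append(node)
    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

def adjacency(html):
    t = TagTree()
    t.feed(html)
    A = np.zeros((t.count, t.count))
    for p, c in t.edges:
        A[p, c] = A[c, p] = 1
    return A

def structure_similarity(a, b):
    """Compare two pages via the spectra of their adjacency matrices --
    an eigenvalue-based stand-in for the paper's similarity measure."""
    ea = np.sort(np.linalg.eigvalsh(adjacency(a)))[::-1]
    eb = np.sort(np.linalg.eigvalsh(adjacency(b)))[::-1]
    k = min(len(ea), len(eb))
    return 1.0 / (1.0 + np.linalg.norm(ea[:k] - eb[:k]))

p1 = "<html><body><div><p></p><p></p></div></body></html>"
p2 = "<html><body><section><p></p><p></p></section></body></html>"
print(structure_similarity(p1, p2))     # same shape, different tag names
```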
Image Synthesis Model Based on Bayesian Estimation Method of Posterior Maximum
YANG Lin, XU Hui-ying and WANG Yan-jie
Computer Science. 2016, 43 (6): 321-324.  doi:10.11896/j.issn.1002-137X.2016.06.064
Abstract PDF(862KB) ( 524 )   
References | Related Articles | Metrics
In image processing, a novel image often needs to be generated from a series of related input images, and most current research relies on heuristic rules in the image synthesis process. To improve the efficiency of image synthesis, this paper proposed a Bayesian image synthesis model. Starting from an ideal image synthesis model, we analyzed the errors of sensors and images; since the image error and the geometric error are correlated, we further analyzed their relationship. When performing posterior estimation on the given image data, the prior parameters of the model were obtained by minimizing an energy function. In optimizing the objective function, we applied a re-weighted iterative method based on related work. Experiments show that the proposed model performs better in image synthesis and rendering than related approaches.
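The re-weighted iterative device mentioned in the abstract is, in its generic form, iteratively re-weighted least squares. A minimal sketch follows for a robust, MAP-style linear estimate; the paper's actual prior terms, energy function and imaging model are not reproduced, and the approximately-L1 penalty and `delta` smoothing below are assumptions.

```python
import numpy as np

def irls(A, b, iters=20, delta=1e-4):
    """Iteratively re-weighted least squares for a robust estimate.

    Minimizes an approximately-L1 penalty on the residual r = A x - b
    by solving a re-weighted L2 problem at each step: small residuals
    get large weights, outliers get down-weighted.
    """
    x = np.linalg.lstsq(A, b, rcond=None)[0]
    for _ in range(iters):
        r = A @ x - b
        w = 1.0 / np.sqrt(r**2 + delta)     # down-weight large residuals
        W = np.diag(w)
        x = np.linalg.solve(A.T @ W @ A, A.T @ W @ b)
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(60, 3))
x_true = np.array([1.0, -2.0, 0.5])
b = A @ x_true
b[::10] += 20.0                             # gross outliers in a few entries
print(np.round(irls(A, b), 3))              # close to x_true despite outliers
```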