Started in January 1974 (Monthly)
Supervised and Sponsored by Chongqing Southwest Information Co., Ltd.
ISSN 1002-137X
CN 50-1075/TP
Current Issue
Volume 46 Issue 2, 15 February 2019
Big Data & Data Science
Big Data Analytics and Insights in Distribution Characteristics of Supply Chain Finance
LIU Ying
Computer Science. 2019, 46 (2): 1-10.  doi:10.11896/j.issn.1002-137X.2019.02.001
Abstract PDF(1490KB) ( 1594 )   
In the big data environment, the semi-structured, unstructured and massive nature of supply chain finance data makes the analysis methods relatively complicated. How to use the unique characteristics of large samples to improve classification performance is worth exploring for research on big data samples. Based on the distribution characteristics of supply chain financial data, this paper analyzed the main factors that affect the credit risk classification model, summarized the distribution characteristics of credit data identified by relevant research over the years, including imbalanced data, noise and outliers, and nonlinear multidimensional structure, and then discussed further solutions for mining knowledge from massive financial data, which provides an effective theoretical basis for the construction of credit risk models.
Quality-embedded Hypergraph Model for Big Data Product Manufacturing System and Decision for Production Lines
WANG Yang, CAI Shu-qin, ZOU Xin-wen, CHEN Zi-tong
Computer Science. 2019, 46 (2): 11-17.  doi:10.11896/j.issn.1002-137X.2019.02.002
Abstract PDF(1444KB) ( 883 )   
Due to the distinctive characteristics of the Big Data Product (BDP) in terms of raw materials, user requirements and processing techniques, research on BDP production systems still stays at the conceptual model stage. To solve this problem, this paper proposed the concept of BDP production lines, discussed the decision factors for production line selection based on the characteristics of production lines, and emphasized the action mechanism of quality as a key decision element in BDP production. Using hypergraph theory, this paper established a BDP production system model embedded with quality, a quality transfer function and a quality aggregation function, designed the decision processes for BDP production lines, and proposed decision modes for the BDP production line under supply-side-stable and demand-side-stable conditions. The results show that the proposed model and decision method can meet users' requirements for BDP quality.
BioPW+:Biological Pathway Data Visualization System Based on Linked Data
LIU Yuan, WANG Xin, GAN Ying, YANG Chao-zhou, LI Wei-xi
Computer Science. 2019, 46 (2): 18-23.  doi:10.11896/j.issn.1002-137X.2019.02.003
Abstract PDF(2035KB) ( 892 )   
Since the Linked Data project arose, abundant open linked data have been published on the Semantic Web, including plentiful biological pathway datasets. To make these data effectively usable by biologists, this paper conducted research on a heterogeneous biological pathway data visualization system based on Linked Data, proposed a biological pathway visualization model, and then designed visualization layout strategies. After that, this paper utilized dynamic mapping of identifiers to implement browsing of heterogeneous biological pathway data, and finally developed a biological pathway visualization system called BioPW+. First, BioPW+ retrieves the essential information about a biological pathway by means of Semantic Web technologies and SPARQL queries. Then, through the Open PHACTS platform, it acquires detailed information about the pathway. Finally, it illustrates the biological pathway on a Web page using force-directed and Sankey layouts, and furnishes various interactive functions. Unlike existing tools that only retrieve data from a single data source, BioPW+ is based on Linked Data and can elaborate biological pathways with their relevant biochemical information from multiple datasets, saving a large amount of time and improving data integrity.
Visualization Method for University Teachers’ Performance Data Based on Mixed Layout Strategy
DING Wei-long, XUE Li-li, CHEN Wan-jun, WU Fu-li
Computer Science. 2019, 46 (2): 24-29.  doi:10.11896/j.issn.1002-137X.2019.02.004
Abstract PDF(2884KB) ( 1058 )   
The performance data of college teachers play an important role in teacher evaluation, such as assessment, salary increases, job appointments and promotion of professional titles. Aiming at the complex hierarchical features and multi-dimensional attributes of such data, this paper proposed a hybrid visualization method named VPM (Venn Parallel Coordinates Mixing), based on circular nested graphs and parallel coordinates, to visualize the performance data of college teachers. The method uses a circular nested packing graph based on the D3 layout algorithm to represent the hierarchical structure, and then divides the circumference of the leaf nodes into different attribute axes, realizing visualization of multi-dimensional attributes through user interactions such as layout design, attribute mapping, attribute point connection, scaling and highlighting. The method is applied to teacher performance data, realizing visualization of the hierarchical structure of the college, the research institute and the teacher, and clearly displaying detailed information of each data item. The experimental results show that the proposed VPM method can effectively display teachers' performance data. The data and evaluation results are also in accordance with the actual situation, which can help administrators manage and evaluate teachers in a better way.
Improved ROUSTIDA Algorithm for Missing Data Imputation with Key Attribute in Repetitive Data
FAN Zhe-ning, YANG Qiu-hui, ZHAI Yu-peng, WAN Ying, WANG Shuai
Computer Science. 2019, 46 (2): 30-34.  doi:10.11896/j.issn.1002-137X.2019.02.005
Abstract PDF(1382KB) ( 1067 )   
With the rise of data analysis, the importance of data pre-processing, especially the imputation of missing data, has attracted more and more attention. Based on the ROUSTIDA algorithm, this paper proposed an improved algorithm, the Key&Rpt_RS algorithm. The Key&Rpt_RS algorithm inherits the advantages of the ROUSTIDA algorithm, considers the characteristic repeatability of the target data, and analyzes the influence of key attributes on the imputation effect. Finally, this paper conducted experiments based on alarm data from a communication network. The results show that the Key&Rpt_RS algorithm outperforms the traditional ROUSTIDA algorithm in terms of imputation effect for missing data.
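The abstract does not give the details of Key&Rpt_RS, but the core idea it describes — exploiting repetitive records and a key attribute to fill missing fields — can be sketched roughly as follows. The record layout, field names and the borrow-from-duplicates rule are illustrative assumptions, not the paper's algorithm.

```python
# Illustrative sketch (not the paper's Key&Rpt_RS algorithm): fill missing
# fields of a record from duplicate records sharing the same key attribute.
from collections import defaultdict

MISSING = None

def impute_by_key(records, key):
    """Fill missing fields using other records with the same key value."""
    by_key = defaultdict(list)
    for rec in records:
        by_key[rec[key]].append(rec)
    filled = []
    for rec in records:
        new = dict(rec)
        for field, value in new.items():
            if value is MISSING:
                # borrow the first observed value among records sharing the key
                for other in by_key[rec[key]]:
                    if other[field] is not MISSING:
                        new[field] = other[field]
                        break
        filled.append(new)
    return filled

# hypothetical communication-network alarm records
alarms = [
    {"id": "a1", "device": "sw-3", "severity": "major"},
    {"id": "a2", "device": "sw-3", "severity": MISSING},  # repeated alarm, severity lost
]
result = impute_by_key(alarms, key="device")
```

Here "device" plays the role of the key attribute: repeated alarms from the same device supply the missing severity value.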
Multi-keyword Streaming Parallel Retrieval Algorithm Based on Urban Security Knowledge Graph
GUAN Jian, WANG Jing-bin, BIAN Qian-hong
Computer Science. 2019, 46 (2): 35-41.  doi:10.11896/j.issn.1002-137X.2019.02.006
Abstract PDF(2395KB) ( 1181 )   
With the popularization of the smart city security concept in China and the deep application of big data in smart city security construction, higher requirements are placed on the response speed of keyword retrieval. Aiming at this problem, this paper proposed a streaming multi-keyword parallel retrieval algorithm based on the urban security knowledge graph (MKPRASKG). Based on the query keywords input by the user in real time, the algorithm constructs a query subgraph set over the entities of the knowledge graph through construction, pruning and fusion operations on associated class graphs. Then, guided by high-scoring query subgraphs under a scoring function, a parallel search is performed on the knowledge graph instance data, and finally the top-k query results are returned. Experimental results show that this algorithm has great advantages in terms of real-time search, response time, search effect and scalability.
Multilevel and Intelligent Rent-seeking and Matching Resource Strategy and Value Creation of Public Service Platform in Big Data Environment
BI Ya, YUAN Hui-qun, CHU Ye-ping, LIU Hui
Computer Science. 2019, 46 (2): 42-49.  doi:10.11896/j.issn.1002-137X.2019.02.007
Abstract PDF(1510KB) ( 648 )   
This paper studied the problem of resource rent-seeking and matching on a public service platform in the big data environment. In view of the unstructured features of big data, the semantic distance was redefined by considering the path distance, connection depth and breadth in the ontology tree, and a formal five-element description model based on semantic distance was proposed to eliminate the complexity of big data in underlying structure and type. In view of the large scale of big data, a strategy of classified intelligent resource rent-seeking and matching was proposed. First, coarse-grained filtering on Scategory and Sstatus, which have few and simple parameters, reduces the range of resource matching and speeds up the algorithm. Then, fine-grained matching on Sability and SQoS finally yields a resource ordering aggregate satisfying the requirements of the demand side. Experiments show that the computational efficiency of this method is significantly higher than that of the traditional multi-threading algorithm, and its precision and recall are also better than those of common resource rent-seeking and matching algorithms. Compared with existing resource matching algorithms, this method is effective and feasible. It can not only realize rapid rent-seeking and accurate search of resources on the public service platform, but also further enhance the value creation of resources in the big data environment.
Under-sampling Method for Unbalanced Data Based on Centroid Space
JIN Xu, WANG Lei, SUN Guo-zi, LI Hua-kang
Computer Science. 2019, 46 (2): 50-55.  doi:10.11896/j.issn.1002-137X.2019.02.008
Abstract PDF(1424KB) ( 972 )   
In view of the fact that the classification performance of current classification algorithms is not ideal on unbalanced datasets, this paper combined supervised and unsupervised learning and proposed an under-sampling method based on centroids, namely ICIKMDS. In practical applications, some data are not easily obtained, or different classes differ greatly in quantity, resulting in an uneven distribution of data, such as the disproportion of patients and healthy people in disease detection, or of fraudulent and normal users in credit card fraud. The proposed method solves this class imbalance problem well. In this method, the initial centroids are obtained by computing the Euclidean distance between samples, and then the k-means algorithm is used to cluster the majority-class sample set to make the imbalanced dataset more balanced in distribution, effectively improving the performance of classifiers. The proposed method makes the classification accuracy of the classifier much better than that of random under-sampling and the SMOTE algorithm on the minority class of the test set, while its accuracy on the whole test set differs little from that of other algorithms.
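The centroid-based under-sampling step the abstract describes can be sketched roughly as follows: cluster the majority class with k-means and keep only the centroids, so the two classes end up balanced. The initialization, distance metric and target class sizes here are simplified assumptions, not ICIKMDS itself.

```python
# Sketch of centroid-based under-sampling: reduce the majority class to as
# many k-means centroids as there are minority samples (2-D points).
import random

def kmeans(points, k, iters=20, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # simple random initialization
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: (p[0]-centers[c][0])**2 + (p[1]-centers[c][1])**2)
            clusters[i].append(p)
        for i, cl in enumerate(clusters):
            if cl:                           # keep old center if cluster empties
                centers[i] = (sum(p[0] for p in cl)/len(cl),
                              sum(p[1] for p in cl)/len(cl))
    return centers

def undersample(majority, minority):
    """Replace the majority class by len(minority) cluster centroids."""
    return kmeans(majority, k=len(minority)), minority

majority = [(x/10, 0.0) for x in range(100)]   # 100 majority samples
minority = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]  # 3 minority samples
sampled_majority, _ = undersample(majority, minority)
```

After under-sampling, both classes have three representatives, which is the balancing effect the abstract attributes to clustering the large-class sample set.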
Travel Route Recommendation Based on Knowledge Graph and Frequent Sequence Mining
SUN Wen-ping, CHANG Liang, BIN Chen-zhong, GU Tian-long, SUN Yan-peng
Computer Science. 2019, 46 (2): 56-61.  doi:10.11896/j.issn.1002-137X.2019.02.009
Abstract PDF(1936KB) ( 2200 )   
Big data provides a huge amount of information but brings the information overload problem to users, and this situation is even more serious in the tourism field. To solve the problem that tourists need to spend a lot of time and energy preparing a travel plan, this paper first proposed a method to construct a knowledge graph incorporating heterogeneous tourism data to extract knowledge in the tourism field. Secondly, this paper used the knowledge graph and travelogues to generate a travel route database, and proposed a frequent route sequence pattern mining algorithm which can generate massive candidate routes according to the tourist's type. Finally, a multi-dimensional route searching and ranking mechanism was designed to recommend personalized travel routes. The experimental results illustrate that the proposed method can incorporate various factors, i.e., the number of travel days, user type and attraction preferences, to help tourists design personalized travel routes and effectively improve the tour experience.
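A much-simplified version of the frequent-route idea can be sketched as follows: mine contiguous attraction subsequences that appear in at least a minimum number of travelogue routes. The attraction names and support threshold are hypothetical, and the paper's actual pattern-mining algorithm is more elaborate than this counting sketch.

```python
# Sketch: frequent contiguous route segments mined from travelogue routes.
from collections import Counter

def frequent_routes(routes, length=2, min_support=2):
    counts = Counter()
    for route in routes:
        seen = set()
        for i in range(len(route) - length + 1):
            seg = tuple(route[i:i+length])
            if seg not in seen:          # count each route at most once per segment
                counts[seg] += 1
                seen.add(seg)
    return [seg for seg, c in counts.items() if c >= min_support]

# hypothetical attraction sequences extracted from travelogues
travelogues = [
    ["ElephantHill", "SevenStarPark", "ReedFlute"],
    ["ElephantHill", "SevenStarPark", "TwoRivers"],
    ["ReedFlute", "TwoRivers"],
]
patterns = frequent_routes(travelogues)
```

Only the segment visited in two of the three travelogues survives the support threshold, which is the sense in which frequent patterns become candidate routes.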
Sparse Feature Selection Algorithm Based on Kernel Function
ZHANG Shan-wen, WEN Guo-qiu, ZHANG Le-yuan, LI Jia-ye
Computer Science. 2019, 46 (2): 62-67.  doi:10.11896/j.issn.1002-137X.2019.02.010
Abstract PDF(2262KB) ( 867 )   
In view of the fact that traditional feature selection algorithms cannot capture the relationships between features, a nonlinear feature selection method was proposed. By introducing a kernel function, the method projects the original dataset into a high-dimensional kernel space, and accounts for the relationships between sample features by performing operations in the kernel space. Due to the properties of the kernel function, even if the data are projected into an infinite-dimensional space through the Gaussian kernel, the computational complexity can be kept small. To address the limitation of the regularization factor, a double constraint with two norms not only improves the accuracy of the algorithm, but also keeps its variance at only 0.74, much smaller than that of similar algorithms, making it more stable. Six similar algorithms were compared on eight common datasets, with an SVM classifier used to test the effect. The results demonstrate that the proposed algorithm improves accuracy by a minimum of 1.84%, a maximum of 3.27%, and an average of 2.75%.
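The reason an infinite-dimensional Gaussian projection stays cheap is the kernel trick: all operations in the kernel space reduce to evaluating a kernel matrix of pairwise similarities, at finite cost. A minimal sketch (the bandwidth gamma and data are illustrative):

```python
# Gaussian (RBF) kernel matrix: K[i][j] is the inner product of samples i and j
# in the (implicit, infinite-dimensional) feature space.
import math

def rbf_kernel_matrix(X, gamma=1.0):
    n = len(X)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            K[i][j] = math.exp(-gamma * sq)  # k(x, y) = exp(-gamma * ||x - y||^2)
    return K

X = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
K = rbf_kernel_matrix(X)
```

The n-by-n matrix K is all the algorithm needs to work "in" the kernel space, which is why the complexity is controlled regardless of the feature space's dimension.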
Network & Communication
Self-adaptive EM Phase Noise Suppression Algorithm in F-OFDM System
CHEN Da-shuang, LI Ying-shan, WU Hong
Computer Science. 2019, 46 (2): 68-75.  doi:10.11896/j.issn.1002-137X.2019.02.011
Abstract PDF(3358KB) ( 1179 )   
Filtered orthogonal frequency-division multiplexing (F-OFDM) is a new technology for next-generation mobile communication systems. F-OFDM retains the advantages of OFDM, such as strong anti-interference ability, and adapts to flexible service configurations in the future. However, it is more sensitive to phase noise than OFDM, which causes sub-band common phase error (SCPE) and sub-band inter-carrier interference (SICI), seriously degrading system performance. This paper proposed a self-adaptive EM phase noise suppression algorithm (AEM-PNS) to reduce phase noise in F-OFDM. It includes two sub-algorithms, EM-SCPE and EM-SICI. AEM-PNS can automatically choose the suitable phase noise suppression sub-algorithm according to the phase noise instruction symbol (PNIS) and pilot instruction symbol (PIS) inserted in the symbol frame. Simulation results show that the proposed algorithm can track phase noise adaptively, effectively reduce the influence of phase noise, and simultaneously keep low complexity and high spectral efficiency.
Intra-domain Energy Efficiency Routing Scheme Based on Network Entropy
ZHANG Ju, GENG Hai-jun, LIU Jie-qi
Computer Science. 2019, 46 (2): 76-80.  doi:10.11896/j.issn.1002-137X.2019.02.012
Abstract PDF(2374KB) ( 726 )   
Reducing network energy consumption and building green networks have become key scientific problems in academic and industrial research. Existing energy efficiency schemes assume the traffic matrix is known, but real-time traffic data is not easy to obtain. Therefore, this paper studied how to reduce network energy consumption without knowing the real-time traffic matrix, and presented an intra-domain energy-efficient routing scheme based on network entropy. The scheme achieves energy efficiency by turning off links in the network. Firstly, the link criticality model and the network entropy model are proposed. Then, the importance of all links in the network is calculated according to link criticality. Finally, links are turned off in turn according to link importance and the network entropy model. The experimental results show that the proposed algorithm does not introduce a large path stretch while reducing the energy consumption of the network.
Data Forwarding Algorithm Based on Multidimensional Context Matching in Mobile Social Networks
XU Fang, DENG Min, XIONG Zeng-gang, YE Cong-huan, XU Ning
Computer Science. 2019, 46 (2): 81-87.  doi:10.11896/j.issn.1002-137X.2019.02.013
Abstract PDF(1575KB) ( 846 )   
Through studying the effect of various kinds of context information on mobility patterns in mobile social networks, this paper proposed a multidimensional context matching forwarding (MCMF) algorithm. In this algorithm, three contextual dimensions, namely physical adjacency, social similarity and social interactivity, are used to make routing decisions dynamically. Firstly, the message carrier obtains its current neighbor node set using physical adjacency matching. Then social similarity matching is used to select a relay candidate subset of the neighbor node set, and a discrete-time semi-Markov prediction model is used to determine the best relay node. Finally, an efficient data forwarding algorithm is designed. Simulation experiments based on real traces show that the proposed MCMF algorithm is more efficient in terms of maximizing the delivery ratio and minimizing the overhead ratio than three other state-of-the-art algorithms.
Network Coding TCP Protocol Based on Cross-layer Optimization in Wireless Vehicle Networks
CHEN Jie, XIE Xian-zhong, HUANG Qian, LI Jia
Computer Science. 2019, 46 (2): 88-94.  doi:10.11896/j.issn.1002-137X.2019.02.014
Abstract PDF(2115KB) ( 954 )   
Wireless vehicle networks (WVN) have important research and application value. At present, there is little research on network coding TCP protocols in wireless vehicle networks, and cross-layer optimization has not been paid serious attention. This paper first addressed the different causes of packet loss in wireless vehicle networks, and gave a cross-layer joint optimization method, VC-TCP/NC, based on random linear network coding, with different treatments applied for different causes of packet loss. Further, the sending strategy of the network coding layer sender was redesigned, and the advantages of VC-TCP/NC in delay and network throughput were illustrated through theoretical analysis. Finally, simulation results in different scenarios show that the performance of VC-TCP/NC is greatly improved compared with traditional TCP and TCP/NC.
Information Security
Android Malware Detection with Multi-dimensional Sensitive Features
XIE Nian-nian, ZENG Fan-ping, ZHOU Ming-song, QIN Xiao-xia, LV Cheng-cheng, CHEN Zhao
Computer Science. 2019, 46 (2): 95-101.  doi:10.11896/j.issn.1002-137X.2019.02.015
Abstract PDF(1336KB) ( 1028 )   
The behavior semantics of applications play a key role in Android malware detection. In order to distinguish the behavior semantics of applications, this paper presented suitable features and a method for Android malware detection. It first defined the generalized-sensitive API, emphasized considering whether the trigger point of a generalized-sensitive API is UI-related, and combined this with the really-used permissions. The approach abstracts the generalized-sensitive APIs and their trigger points as semantic features, extracts the really-used permissions as syntax features, and then leverages machine-learning-based classification to automatically detect whether an application is benign or malicious. Comparative experiments were conducted on 13226 samples. The experimental results show that the proposed approach costs little time, the feature set is reasonable, and it achieves good classification results. Through a comparison of several machine learning techniques, Random Forest was chosen as the classification method, and the results demonstrate that the accuracy reaches 96.5%, the AUC reaches 0.99, and the classification precision for malware reaches 98.8%.
Detection and Defense Method for Blackhole Attacks in Wireless Sensor Networks
WANG Jun, ZHU Zhi-wei, LIU Jun-jie
Computer Science. 2019, 46 (2): 102-108.  doi:10.11896/j.issn.1002-137X.2019.02.016
Abstract PDF(1969KB) ( 1108 )   
Wireless sensor networks (WSN) are widely used in military, industrial, medical and other fields. Many sensor networks are deployed in hostile and open environments, so they face a variety of threats. The blackhole attack is a typical routing attack in WSNs: a malicious node claims that it has enough residual energy and can reach the destination in one hop, or claims itself to be the destination node, so that it can attract other nodes to send data to it. It then discards the packets without forwarding them, which results in a "data hole". This paper proposed a new location detection method based on trapping: a non-existent destination node is used as bait to locate the blackhole nodes, and by verifying the identity and position information of the nodes, the malicious nodes can be removed. At the same time, the algorithm adds a pre-shared symmetric key and an HMAC (hash-based message authentication code) mechanism to prevent malicious nodes from joining the network. The results of simulations on the NS2 platform show that the proposed algorithm achieves a higher detection ratio than other algorithms.
Access Control Method Based on Feature Extraction
HUANG Mei-rong, OU Bo, HE Si-yuan
Computer Science. 2019, 46 (2): 109-114.  doi:10.11896/j.issn.1002-137X.2019.02.017
Abstract PDF(1459KB) ( 898 )   
Recently, fine-grained authorization control has become a hot topic in access control research. It can adjust the access strategy reasonably in a single fixed environment, so as to meet the safety requirements of a workflow. However, when migrated to a new scenario and confronted with authorizations not covered by the access policy, it may be difficult to give a correct judgement, and one can only rely on manual checking to confirm whether access is authorized. Manual checking is time-consuming and costs too much in big data environments. Therefore, it is imperative to introduce an automatic discrimination mechanism based on past experience. This paper attempted to give an automatic discrimination method for the role-based multilevel access control model, and described a general expression of access control by sampling the time and space of correct and incorrect authorizations. This allows the existing access control model to make the right judgements in new environments, thus reducing the workload of manual review. The experimental results show that the analysis mechanism achieves a high correct-judgement rate for user access requests.
Hierarchical Hybrid Authentication Model Based on Key Sharing
ZHAO Jiao-jiao, MA Wen-ping, LUO Wei, LIU Xiao-xue
Computer Science. 2019, 46 (2): 115-119.  doi:10.11896/j.issn.1002-137X.2019.02.018
Abstract PDF(1421KB) ( 717 )   
With the rapid development of the information age, the security of data access in cloud computing has become the issue of most concern to users. Identity authentication technology is an important means to ensure that participants communicate securely in an open network environment, and how to use it to protect the cloud environment has become a hot issue for many scholars. This paper proposed a public key infrastructure/identity-based encryption hybrid authentication model scheme, which establishes trust relationships between different cloud services through CA certificates issued by the Public Key Infrastructure (PKI), combines multiple clouds that use the Identity-Based Encryption (IBE) system, adopts a hierarchical identity encryption system, introduces shared key technology, and chooses a ring structure. The security of the scheme was analyzed to prove the feasibility of providing services based on the identity-based hybrid authentication model in the cloud environment. At the same time, a signcryption technology based on this model was designed to achieve in-cloud and cross-cloud authentication through public and private key pairs. Performance analysis shows that, at the cost of a slight increase in computation, the scheme ensures sufficient security, better satisfies the requirements of secure access for users belonging to different cloud domains, and effectively solves the problem of data access security in the cloud environment.
Clustering Algorithm in Differential Privacy Preserving
HU Chuang, YANG Geng, BAI Yun-lu
Computer Science. 2019, 46 (2): 120-126.  doi:10.11896/j.issn.1002-137X.2019.02.019
Abstract PDF(1916KB) ( 1172 )   
Data mining has made great progress in the research and application of big data, but the disclosure of sensitive information can bring users many threats and losses. Therefore, how to protect data privacy in clustering analysis has become a hot issue in data mining and data privacy protection. Traditional differentially private k-means is sensitive to the selection of initial centers and somewhat blind in the selection of the cluster number k, which reduces the availability of clustering results. To improve the availability of the clustering results of differentially private k-means, this paper presented a new DPk-means-up clustering algorithm based on differential privacy and carried out theoretical analysis and comparison experiments. Theoretical analysis shows that the algorithm satisfies ε-differential privacy and can be applied to datasets of different sizes and dimensions. In addition, experimental results indicate that, at the same level of privacy preservation, the proposed algorithm improves clustering availability compared with other differentially private k-means clustering methods.
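The standard mechanism behind differentially private k-means variants is to perturb the per-cluster sums and counts with Laplace noise before recomputing centroids. A minimal one-dimensional sketch (the epsilon split, sensitivity bound and data scaling are illustrative assumptions, not the DPk-means-up algorithm itself):

```python
# Laplace mechanism applied to one k-means centroid update (1-D data in [0, 1]).
import math
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) by inverse-CDF."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def private_centroid(cluster, epsilon, rng):
    """Noisy centroid: Laplace noise on the sum and on the count,
    splitting the privacy budget epsilon between the two queries."""
    sensitivity = 1.0                                    # assumes data in [0, 1]
    noisy_sum = sum(cluster) + laplace_noise(sensitivity / (epsilon / 2), rng)
    noisy_cnt = len(cluster) + laplace_noise(1.0 / (epsilon / 2), rng)
    return noisy_sum / max(noisy_cnt, 1.0)

rng = random.Random(42)
cluster = [0.2, 0.4, 0.6]
c = private_centroid(cluster, epsilon=1.0, rng=rng)
```

Smaller epsilon means larger noise scale and a less accurate centroid; that trade-off is the "privacy level versus availability" axis the abstract's experiments vary.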
Automatic Return-to-dl-resolve Exploit Generation Method Based on Symbolic Execution
FANG Hao, WU Li-fa, WU Zhi-yong
Computer Science. 2019, 46 (2): 127-132.  doi:10.11896/j.issn.1002-137X.2019.02.020
Abstract PDF(1439KB) ( 1163 )    Suppl. Info.
Return-to-dl-resolve is a general exploit technique for bypassing complicated protection mechanisms, but the efficiency of manually constructing its shellcode is very low. This paper studied the core concepts of ASLR, NX and Return-to-dl-resolve, and then set up a Return-to-dl-resolve model. The proposed model provides a symbolic execution environment for ELF binary programs and generates exploits by constraint solving. A control-flow hijacking exploit generation system named R2dlAEG was also implemented. The experimental results show that R2dlAEG generates exploits in acceptable time, and the exploits can bypass both NX and ASLR.
Privacy Preserving Scheme for SNS in Cloud Environment
LIU Sheng-jie, WANG Jing
Computer Science. 2019, 46 (2): 133-138.  doi:10.11896/j.issn.1002-137X.2019.02.021
Abstract PDF(1673KB) ( 836 )   
In reality, data stored on social networks are often outsourced to untrusted cloud service providers. Aiming at the problems of privacy and attribute updating in social networks, an attribute-based encryption scheme with hidden policy and attribute revocation in the cloud environment was proposed. The scheme reduces the computation of the client by decomposing the key generation process. Moreover, the policy is hidden by using composite-order bilinear groups, and a mechanism with a token tree and attribute trapdoors is used to achieve efficient and flexible attribute revocation. In addition, the scheme is proved to be secure under a standard assumption, so using this encryption in social network services to encrypt data stored on cloud servers is safe and feasible. Compared with other related works, this scheme protects the privacy of the access policy and gives better performance in computing and storage with access control functions.
Illegal Flow Analysis for Lattice Model of Information Flow
WANG Xue-jian, ZHAO Guo-lei, CHANG Chao-wen, WANG Rui-yun
Computer Science. 2019, 46 (2): 139-144.  doi:10.11896/j.issn.1002-137X.2019.02.022
Abstract PDF(1337KB) ( 724 )   
With the development of the Internet, the status of cyberspace has risen and the importance of information keeps increasing. To ensure information security, the control of illegal information flows is particularly important. This paper analyzed the security of information flows in a lattice model of information flow, and classified the information flows inside the model. Firstly, linear analysis is performed on the lattice model of information flow, yielding what is called a linear lattice model of information flow. Then, the Markov chain is introduced: using the state properties of the Markov chain, the change in probability of two states is used to quantitatively represent the subject and the object in the model. Further, the security state of each information flow is analyzed by comparing the probabilities of the recurrent state and the transient state corresponding to the subject and the object respectively; that is, when two recurrent states occur simultaneously during model detection, the security model is violated and an illegal information flow occurs. Due to the identity of the changes in probability, this method produces errors that affect its detection results. In order to overcome this shortcoming, this paper introduced the SPA language, described the linear information flow model in SPA, and used the formal non-interference method to compensate for the lack of probability identity in the Markov chain model. Finally, the hidden illegal information flows are detected, the security state of each erroneous information flow is judged, and it is concluded that an information flow that conforms to the security model but violates the security policy does not satisfy the non-interference property. This is of major significance for software design and hardware application.
Android Malware Detection Based on N-gram
ZHANG Zong-mei, GUI Sheng-lin, REN Fei
Computer Science. 2019, 46 (2): 145-151.  doi:10.11896/j.issn.1002-137X.2019.02.023
Abstract PDF(1828KB) ( 1104 )   
With the widespread use of the Android operating system, malicious applications are constantly emerging on the Android platform, and the means by which they evade existing detection tools are becoming increasingly complicated. In order to effectively analyze malicious behavior, more efficient detection technology is required. This paper presented and designed a static malware detection model based on N-gram technology. The model decompiles Android APK files by reverse engineering and uses N-gram technology to extract features from bytecode, avoiding the dependence on expert knowledge in traditional detection. At the same time, the model is combined with a deep belief network, which allows it to rapidly and accurately train on and detect application samples. 1267 malicious samples and 1200 benign samples were tested. The results show that the overall accuracy is up to 98.34%. Furthermore, the results of the model were compared with those of other machine learning algorithms and with the detection results of related work. The results show that the model has better accuracy and robustness.
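The N-gram feature extraction step amounts to sliding a length-n window over the decompiled bytecode stream and counting occurrences. A minimal sketch (the opcode names are placeholders, and real pipelines operate on full Dalvik opcode sequences):

```python
# N-gram frequency features from an opcode/bytecode sequence.
from collections import Counter

def ngram_features(opcodes, n=2):
    """Count all contiguous n-grams in the sequence."""
    grams = [tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]
    return Counter(grams)

# hypothetical opcode trace from a decompiled APK
trace = ["invoke", "move", "invoke", "move", "return"]
features = ngram_features(trace, n=2)
```

The resulting frequency vector is what a classifier (here, a deep belief network) consumes; no hand-crafted expert features are needed, which is the point the abstract makes.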
Software & Database Technology
Analysis on Behavior Characteristics of Developers in Github
LI Cun-yan, HONG Mei
Computer Science. 2019, 46 (2): 152-158.  doi:10.11896/j.issn.1002-137X.2019.02.024
Abstract PDF(5570KB) ( 1045 )   
References | Related Articles | Metrics
Analyzing the behavior characteristics of developers in an open source environment is one of the important issues in promoting the development of open source communities. This paper took data from the Github open source community as the research object, analyzed the factors influencing developer contribution on Github, explored the cooperative relationships between developers using visualization analysis technology, and further dissected the relationship between the regions developers belong to and their collaboration. The study yields phenomena and conclusions of important theoretical and practical value, revealing behavioral characteristics of developers from a new perspective.
Approach for Generating Class Integration Test Sequence Based on Dream Particle Swarm Optimization Algorithm
ZHANG Yue-ning, JIANG Shu-juan, ZHANG Yan-mei
Computer Science. 2019, 46 (2): 159-165.  doi:10.11896/j.issn.1002-137X.2019.02.025
Abstract PDF(1750KB) ( 727 )   
References | Related Articles | Metrics
Determining the class integration test sequence is an important topic in object-oriented software integration testing. A reasonable class integration test sequence can reduce the overall complexity of test stubs and thus reduce test cost. The particle swarm optimization algorithm, however, is prone to premature convergence. Therefore, a class integration test sequence determination method based on the dream particle swarm optimization algorithm was proposed in this paper. First, each sequence is taken as a particle in one-dimensional space. Then, every particle is considered to be a dreamer. Each iteration cycle is divided into two phases: day and night. In the daytime, particles move to new locations, and during the night they perturb the locations gained in the day phase according to their dreaming ability. In this way, a particle has the opportunity to search near its current location, so that the algorithm converges more slowly and avoids falling into a local optimum too early. The experimental results show that the proposed approach incurs a lower test cost in most cases.
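The day/night cycle above can be sketched for permutation-encoded particles. The move and dream operators below (one position swap toward the global best, then a random swap at night) are illustrative choices, not the operators of the original algorithm:

```python
import random

def dream_pso_step(sequences, fitness, dream_ability=0.3, rng=None):
    """One day/night cycle over permutation-encoded particles (test sequences)."""
    rng = rng or random.Random(0)
    best = min(sequences, key=fitness)
    new_sequences = []
    for seq in sequences:
        seq = seq[:]
        # day phase: move toward the global best by copying one of its positions
        i = rng.randrange(len(seq))
        j = seq.index(best[i])
        seq[i], seq[j] = seq[j], seq[i]
        # night phase: "dream", i.e. perturb the position found during the day,
        # so the swarm keeps exploring and avoids premature convergence
        if rng.random() < dream_ability:
            a, b = rng.randrange(len(seq)), rng.randrange(len(seq))
            seq[a], seq[b] = seq[b], seq[a]
        new_sequences.append(seq)
    return new_sequences
```

Both operators preserve the permutation property, so every particle remains a valid class integration order after each cycle.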
Study on Fractal Features of Software Networks
PAN Hao, ZHENG Wei, ZHANG Zi-feng, LU Chao-qun
Computer Science. 2019, 46 (2): 166-170.  doi:10.11896/j.issn.1002-137X.2019.02.026
Abstract PDF(1928KB) ( 671 )   
References | Related Articles | Metrics
With the development of Internet technology, the scale of software architectures keeps growing to meet new requirements, and software functions change the structure of the software network. The fractal structure of a software network reflects the self-similarity between its modules and the network as a whole, and allows the software architecture to be analyzed at the code level. This paper researched the fractal characteristics of software networks. First, the software networks are weighted by the complex relationships between classes. Furthermore, a network-centrality-based box algorithm is utilized to calculate the fractal dimension of the software network. Finally, two representative software frameworks, Spring and Struts2, are analyzed through experiments. The results show that both frameworks and their modules have fractal features, and that the fractal dimension of the software network increases with the complexity of the module. The fractal dimension of a software network with more comprehensive functions is larger than that of one with specialized functions. The fractal dimension also increases with the evolution of the software version as the software functions are gradually improved.
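Once a box-covering algorithm has produced the number of boxes N_B needed at each box size l_B, the fractal dimension is the negative slope of the log-log relation N_B ∝ l_B^(-d_B). A minimal least-squares sketch of that final fitting step (not the paper's centrality-based box covering itself):

```python
import math

def fractal_dimension(box_sizes, box_counts):
    """Least-squares slope of log N_B versus log l_B; a fractal network
    satisfies N_B ~ l_B ** (-d_B), so d_B is the negated slope."""
    xs = [math.log(l) for l in box_sizes]
    ys = [math.log(n) for n in box_counts]
    k = len(xs)
    mx, my = sum(xs) / k, sum(ys) / k
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    return -slope
```

For an exact power law such as N_B = 64 * l_B^(-2), the routine recovers d_B = 2.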
Geo-semantic Data Storage and Retrieval Mechanism Based on CAN
LU Hai-chuan, FU Hai-dong, LIU Yu
Computer Science. 2019, 46 (2): 171-177.  doi:10.11896/j.issn.1002-137X.2019.02.027
Abstract PDF(1871KB) ( 806 )   
References | Related Articles | Metrics
Semantic technology can search information more intelligently and accurately and assist researchers in making scientific decisions. This technology has therefore been introduced into geographic information processing, forming GeoSPARQL, a geographic query language based on RDF (Resource Description Framework). However, existing application platforms for geographic semantic information processing adopt centralized storage and retrieval services, which suffer from single points of failure and poor scalability. Although researchers have proposed a variety of methods that use peer-to-peer networks to improve the reliability and scalability of application systems, these methods do not consider the characteristics of geographic semantic data. In view of the above problems, this paper considered the features of geographic semantic data and optimized the storage of semantic data on a peer-to-peer network. This paper proposed a storage and retrieval scheme based on a content-addressable network (CAN), and improved the retrieval efficiency of semantic data by mapping triples onto the network according to their positions. The experimental results show that the proposed scheme has good scalability, and its query efficiency for topological relations is superior to existing schemes.
Artificial Intelligence
Transportability of Causal Information Across Different Granularities
YAO Ning, MIAO Duo-qian, ZHANG Zhi-fei
Computer Science. 2019, 46 (2): 178-186.  doi:10.11896/j.issn.1002-137X.2019.02.028
Abstract PDF(1331KB) ( 850 )   
References | Related Articles | Metrics
The knowledge we learn is grain-dependent, which leads to different explanations for a phenomenon at different granularities. Causality characterizes the essence of phenomena. These factors raise a problem that urgently needs to be solved in artificial intelligence: the relationship between causality and granularity, and the transportability of a causal effect from one granularity to a different granularity. For an information system gathered from observational data, the basic graphical structures required for the causal variables can be extracted directly from the data. According to these structures, the causal effects between variables can be computed. By adding new attributes to the system and merging multiple information systems, the granularity of the original system is changed, and the issue of whether the causal effect can be transported to the new system is then addressed in detail. The causal relationship from the original system cannot be transported to the new system if the new attribute acts on the effect variable; otherwise, transportability to the new system is feasible.
Multi-attribute Decision Making and Adaptive Genetic Algorithm for Solving QoS Optimization of Web Service Composition
LU Cheng-hua, KOU Ji-song
Computer Science. 2019, 46 (2): 187-195.  doi:10.11896/j.issn.1002-137X.2019.02.029
Abstract PDF(1575KB) ( 895 )   
References | Related Articles | Metrics
With the growth of service-oriented computing, research on Web service composition based on quality of service (QoS) has become an inevitable trend. Considering the multi-dimensional and mutually conflicting nature of QoS attributes, this paper transformed the QoS-based optimization of Web service composition into a multi-attribute decision making problem. The distances of each solution to the positive ideal solution (PIS) and the negative ideal solution (NIS) were combined by means of a compromise coefficient. Finally, a set of ranked Web services is provided to users for flexible choice. Traditional multi-attribute decision making methods cannot effectively handle the large-scale search space of Web service composition. Therefore, in order to better solve the NP-hard problem of Web service composition optimization, this paper developed an approach combining multi-attribute decision making and an adaptive genetic algorithm (MADMAGA). The experiments were conducted on a real and comprehensive QoS dataset. The experimental results indicate that the method can find the globally optimal solution in a short period of time, and the ranking of solutions is close to the true order. Moreover, the proposed method scales better to large instances of the Web service composition optimization problem.
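The distance-to-ideal ranking step can be sketched as follows, assuming (purely for illustration) that all QoS attributes are normalized benefit-type values; the compromise coefficient trades closeness to the PIS against distance from the NIS:

```python
def rank_by_compromise(solutions, compromise=0.5):
    """Rank QoS vectors by a compromise-weighted combination of distances to
    the positive ideal solution (PIS) and negative ideal solution (NIS).
    All attributes are assumed benefit-type (larger is better)."""
    pis = [max(col) for col in zip(*solutions)]  # attribute-wise best
    nis = [min(col) for col in zip(*solutions)]  # attribute-wise worst
    def score(s):
        d_pos = sum((a - b) ** 2 for a, b in zip(s, pis)) ** 0.5
        d_neg = sum((a - b) ** 2 for a, b in zip(s, nis)) ** 0.5
        # lower score = closer to PIS and farther from NIS
        return compromise * d_pos - (1 - compromise) * d_neg
    return sorted(solutions, key=score)
```

The returned list is the ranked set of composite services offered to the user for a flexible choice.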
Randomization of Classes Based Random Forest Algorithm
GUAN Xiao-qiang, PANG Ji-fang, LIANG Ji-ye
Computer Science. 2019, 46 (2): 196-201.  doi:10.11896/j.issn.1002-137X.2019.02.030
Abstract PDF(1495KB) ( 937 )   
References | Related Articles | Metrics
Random forest is a commonly used classification method in data mining and machine learning, which has become a research focus of scholars at home and abroad and has been widely applied to various practical problems. Traditional random forest methods do not consider the influence of the number of classes on classification performance and neglect the correlation between base classifiers and classes, limiting the performance of random forests on multi-class classification problems. To address this, combining the characteristics of multi-class classification, this paper proposed a randomization of classes based random forest algorithm (RCRF). From the perspective of classes, a randomization over classes is added on top of the two traditional randomizations of random forest, and base classifiers with different emphases are designed for different classes. Because different classifiers focus on different classes, the structures of the decision trees they generate differ, which not only guarantees the performance of each single base classifier but also further increases the diversity of the base classifiers. To verify the validity of the proposed algorithm, RCRF was compared with other algorithms on 21 data sets from the UCI repository. The experiments were carried out from two aspects: on the one hand, accuracy, F1-measure and the Kappa coefficient were used to verify the performance of the RCRF algorithm; on the other hand, the κ-error diagram was used to compare and analyze the algorithms from the perspective of diversity. Experimental results show that the proposed algorithm can effectively improve the overall performance of the ensemble model and has obvious advantages on multi-class classification problems.
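One way to picture the third randomization is to give each base classifier a random subset of classes to emphasize, alongside the usual bootstrap and feature sampling. The subset-size rule below is an arbitrary placeholder, not the paper's construction:

```python
import random

def class_subsets(classes, n_trees, rng=None):
    """Class-level randomization sketch: each of the n_trees base classifiers
    receives a random subset of classes to emphasize, so the resulting trees
    differ in structure and the ensemble gains diversity."""
    rng = rng or random.Random(0)
    k = max(2, len(classes) // 2)  # illustrative subset size
    return [sorted(rng.sample(classes, k)) for _ in range(n_trees)]
```

Each subset would then bias the corresponding tree's training (e.g. via class weights), while the full forest still votes over all classes.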
Method Based on Sub-group and Social Behavior for Narrowing Recommended List for Groups
MAO Yu-jia, LIU Xue-jun, XU Xin-yan, ZHANG Xin
Computer Science. 2019, 46 (2): 202-209.  doi:10.11896/j.issn.1002-137X.2019.02.031
Abstract PDF(1261KB) ( 870 )   
References | Related Articles | Metrics
Group recommender systems have drawn a lot of attention for their capability of targeting users' preferences to satisfy all users' requirements; however, this causes the recommended list to become too large, making it hard for group members to reach decisions. To address this issue, a method for narrowing the recommended list for groups called RMSGSB (a Recommendation Method based on Sub-group and Social Behavior) was proposed. To narrow the group scale and the number of group preferences, the method divides the target group into several sub-groups. To guarantee the fairness of the recommendation, the weight of each sub-group is calculated according to the tolerance and altruism exhibited in group members' social behavior. Experimental results on a real data set show that the proposed method outperforms other group recommendation methods.
Recommendation Algorithm Based on Jensen-Shannon Divergence
WANG Yong, WANG Yong-dong, DENG Jiang-zhou, ZHANG Pu
Computer Science. 2019, 46 (2): 210-214.  doi:10.11896/j.issn.1002-137X.2019.02.032
Abstract PDF(1840KB) ( 1060 )   
References | Related Articles | Metrics
To fully utilize all ratings and mitigate the problem of data sparsity, the Jensen-Shannon (JS) divergence from statistics was used to design a new similarity measure for items. In this similarity measure, the ratings for items are converted into densities of rating values, and the item similarity is then calculated according to these densities. Meanwhile, a factor for the number of ratings is also considered to further enhance the performance of the proposed JS-divergence-based similarity measure. Finally, a collaborative filtering recommendation algorithm is presented based on the JS-divergence-based item similarity. The test results on the MovieLens dataset show that the proposed algorithm performs well in terms of prediction error and recommendation precision, and therefore has high potential for application in recommender systems.
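The core of such a measure can be sketched directly from the definition of the JS divergence over two rating-value densities. This sketch omits the paper's additional rating-count factor and simply maps the divergence into a similarity in [0, 1]:

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence between two rating-value densities
    (lists of probabilities over the same rating scale)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    def kl(x, y):
        return sum(a * math.log(a / b) for a, b in zip(x, y) if a > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def item_similarity(p, q):
    """JSD with natural log is bounded by ln 2, so dividing by ln 2 and
    subtracting from 1 yields a similarity in [0, 1]."""
    return 1 - js_divergence(p, q) / math.log(2)
```

Identical densities give similarity 1; disjoint densities give similarity 0.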
Focused Annealing Crawler Algorithm for Rainstorm Disasters Based on Comprehensive Priority and Host Information
LIU Jing-fa, LI Fan, JIANG Sheng-yi
Computer Science. 2019, 46 (2): 215-222.  doi:10.11896/j.issn.1002-137X.2019.02.033
Abstract PDF(1476KB) ( 717 )   
References | Related Articles | Metrics
Nowadays, the Internet aggregates a lot of information related to rainstorm disasters, but the efficiency of manual search is low, which makes focused web crawlers very important. Building on the general web crawler, in order to improve the computational precision of the topic relevance of webpages and prevent topic drift, this paper proposed a comprehensive priority evaluation method for hyperlinks based on webpage content and link structure. The method combines four components: the topic relevance of the anchor text, the topic relevance of all webpages containing the link, the PR value, and the topic relevance of the webpage the link points to. At the same time, to prevent the search from falling into a local optimum, a new focused crawler algorithm combining memorized historical host information with the simulated annealing (SA) algorithm was designed for the first time. The experimental results of the focused crawler for rainstorm disasters show that the proposed algorithm outperforms the breadth-first search (BFS) strategy and the optimal priority search (OPS) strategy, and the crawling accuracy is significantly improved.
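The two ideas can be sketched together: a weighted combination of the four signals listed above, and a simulated-annealing acceptance rule for following links. The weights and temperature below are illustrative placeholders, not values from the paper:

```python
import math
import random

def comprehensive_priority(anchor_rel, parent_rel, pr_value, target_rel,
                           weights=(0.3, 0.3, 0.2, 0.2)):
    """Weighted combination of the four priority components; the weights are
    hypothetical, not those of the original work."""
    parts = (anchor_rel, parent_rel, pr_value, target_rel)
    return sum(w * p for w, p in zip(weights, parts))

def accept(candidate_priority, best_priority, temperature, rng=random.random):
    """SA acceptance rule: always follow a better link, and occasionally a
    worse one, so the crawl can escape locally relevant but dead-end hosts."""
    if candidate_priority >= best_priority:
        return True
    return rng() < math.exp((candidate_priority - best_priority) / temperature)
```

As the temperature is lowered over the crawl, the acceptance of worse links becomes rarer and the crawler behaves more greedily.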
Spatio-Temporal Integrated Forecasting Algorithm for Dam Deformation
MAO Ying-chi, CAO Hai, HE Jin-feng
Computer Science. 2019, 46 (2): 223-229.  doi:10.11896/j.issn.1002-137X.2019.02.034
Abstract PDF(2021KB) ( 1026 )   
References | Related Articles | Metrics
Analyzing the spatial-temporal evolution of dam deformation helps managers grasp the overall deformation of the dam. Existing predictive research on dam deformation falls into two categories: the first only makes time-series predictions for points where instruments are deployed, and the second uses spatial interpolation at the current moment to obtain the deformation value of points without instruments. Neither can use historical deformation time-series data to predict the deformation at uninstrumented points. To solve this problem, combining the advantages of the traditional spatio-temporal prediction model (STK) and neural network models such as BP and the Gated Recurrent Unit (GRU), this paper constructed a spatio-temporal sequence prediction algorithm named BP-STK-GRU. The main steps are as follows. Firstly, GRU optimizes the historical time series of individual measuring points. Secondly, BP fits the overall trend of the spatio-temporal data at the measuring points for the next moment. Thirdly, STK fits the stable part of the BP prediction results. Lastly, the spatial residual value and the overall BP spatial prediction are combined to obtain the deformation of uninstrumented points. The experimental results show that the method is effective, with good stability and accuracy in predicting the deformation value of unknown points.
Novel Soft Rough Set:Soft Rough Semigroups
LU Yi-yao, KONG Xiang-zhi
Computer Science. 2019, 46 (2): 230-235.  doi:10.11896/j.issn.1002-137X.2019.02.035
Abstract PDF(1249KB) ( 692 )   
References | Related Articles | Metrics
In order to further simplify methods for dealing with uncertainty, this paper studied a new type of soft rough set on semigroups (the MSR-semigroup). First of all, based on basic theories of soft sets, rough sets and semigroups, the relationships among them were studied in depth, and the operations and basic properties of soft rough sets were discussed. Then, to further understand the soft approximation space, the concepts of the lower and upper soft rough approximations were applied. At the same time, two special soft sets on semigroups, G-soft sets and GG-soft sets, were proposed, with their corresponding operations and examples given. Next, the roughness of the soft approximation space was discussed, the concept of MSR-semigroups was proposed, and the basic characteristics of the upper and lower MSR-semigroups were studied. Finally, a comparative analysis of an example shows the research value of soft rough semigroups.
New Three-way Decisions Model Based on Granularity Importance Degree
XUE Zhan-ao, HAN Dan-jie, LV Min-jie, ZHAO Li-ping
Computer Science. 2019, 46 (2): 236-241.  doi:10.11896/j.issn.1002-137X.2019.02.036
Abstract PDF(1244KB) ( 802 )   
References | Related Articles | Metrics
Granularity importance degree is an important topic in multi-granularity rough sets. Existing construction methods for granularity importance degree only consider the direct influence of a single granularity on decision-making, ignoring the combined effect of the other granularities. Combining the concept of multi-granularity approximation quality, this paper studied the construction of granularity importance degree, proposed a new method for calculating granularity importance degree among multiple granularities, and gave a granular structure reduction algorithm based on this method. Meanwhile, in order to reduce redundant decision information, by combining the reduction set with three-way decisions, this paper constructed a three-way decisions model based on granularity importance degree and gave the decision rules in detail. Finally, the results of an example show that the proposed granular structure reduction algorithm can obtain data with larger discrimination and narrows the range of the delay domain, making the final decision more reasonable.
Attribute Reduction for Sequential Three-way Decisions Under Dominance-Equivalence Relations
LI Yan, ZHANG Li, WANG Xue-jing, CHEN Jun-fen
Computer Science. 2019, 46 (2): 242-248.  doi:10.11896/j.issn.1002-137X.2019.02.037
Abstract PDF(1829KB) ( 791 )   
References | Related Articles | Metrics
Sequential three-way decision is an effective way to solve problems at multiple levels of granularity. The dominance-equivalence relation based rough set approach can be used to handle classification problems with preference-ordered conditional attributes, extract related information, approximate target concepts and finally form decision-making knowledge. However, the traditional dominance relation based rough set model is very time-consuming for knowledge reduction and extraction, and most current sequential three-way decision models are limited to information systems with symbolic attributes, which cannot process continuous and ordinal values effectively and cause a certain degree of information loss. Therefore, this paper applied the idea of sequential three-way decisions to dominance relation based rough set models, defined a new attribute reduction method based on sequential three-way decisions along with a corresponding attribute importance measure, and thereby accelerated the processing of information systems with ordinal attributes. The efficiency of knowledge reduction is improved through multiple granularity representations and their relationships. Several UCI data sets were selected for experiments. The results show that the proposed sequential three-way decision method based on dominance relations reduces time consumption noticeably while guaranteeing the quality of the attribute reduction.
Graphics ,Image & Pattern Recognition
Multi-layer Object Detection Algorithm Based on Multi-source Feature Late Fusion
SHENG Lei, WEI Zhi-hua, ZHANG Peng-yu
Computer Science. 2019, 46 (2): 249-254.  doi:10.11896/j.issn.1002-137X.2019.02.038
Abstract PDF(3647KB) ( 972 )   
References | Related Articles | Metrics
Object detection is a hot topic in computer vision and the foundation of video captioning. This paper proposed a multi-layer object detection algorithm based on late fusion of multi-source features, using multi-level decisions to divide the object detection task into two granularities. At the coarse level, the HOG feature is used to classify the images; according to the confidence scores of the classifier, test images are categorized into positive, negative and uncertain examples. At the fine level, this paper proposed a multi-source feature late fusion method to classify the examples in the uncertain region. Several comparative experiments were conducted on the same data set. Experimental results demonstrate that the proposed algorithm obtains excellent results on all evaluation metrics and achieves a better detection result than Faster-RCNN.
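The coarse-level routing can be sketched as a simple confidence triage; the thresholds below are illustrative, not the paper's values:

```python
def triage(confidence, low=0.3, high=0.7):
    """Coarse level: route a sample by the HOG classifier's confidence score.
    Only the uncertain band is forwarded to the fine-level multi-source
    late fusion stage; clear positives/negatives are decided immediately."""
    if confidence >= high:
        return "positive"
    if confidence <= low:
        return "negative"
    return "uncertain"
```

This keeps the expensive fused classifier off the easy cases, which is the point of the two-granularity design.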
Adaptive Adjustment Algorithm of Mobile Phone Screen Brightness Based on Visual Characteristics
LI Meng-jun, LI Wen-wei
Computer Science. 2019, 46 (2): 255-260.  doi:10.11896/j.issn.1002-137X.2019.02.039
Abstract PDF(2088KB) ( 3219 )   
References | Related Articles | Metrics
The development of artificial intelligence technology has led people to interact through smartphones; it is therefore important to set a suitable phone screen brightness to keep eyes healthy. In light of this problem, this paper optimized the adaptive adjustment algorithm for mobile phone screen brightness by incorporating human visual characteristics. Firstly, two measurement tools were designed to collect and analyze relevant data. Then, subjective evaluation was used to study smartphone brightness under different indoor and outdoor illumination conditions. The experimental results show that the ideal screen brightness is much lower than the brightness produced by existing automatic brightness adjustment. Finally, according to user experience and the characteristics of human vision, the corresponding algorithm was designed and evaluated.
Retrieving Signed Fuzzy Measure of Choquet Integral Based on Linear Discriminant Analysis
WANG Deng-gui, YANG Rong
Computer Science. 2019, 46 (2): 261-265.  doi:10.11896/j.issn.1002-137X.2019.02.040
Abstract PDF(1416KB) ( 683 )   
References | Related Articles | Metrics
For solving classification problems, the Choquet integral classifier plays an increasingly important role thanks to its nonlinearity and nonadditivity. In particular, for data classification and fusion, the Choquet integral has obvious advantages, because its signed fuzzy measure provides an effective representation of the interaction among the contributions of predictive attributes to objective attributes. However, an effective method to extract the signed fuzzy measure of the Choquet integral has been lacking. Currently, the most commonly used method is the genetic algorithm, but it is complex and time-consuming. Since the values of the signed fuzzy measure are critical parameters of the Choquet integral classifier, an efficient extraction method is necessary. Based on linear discriminant analysis, this paper proposed an extraction method for retrieving the values of the signed fuzzy measure of the Choquet integral, and derived an analytic expression for the signed measure values under the classification problem, so that the key parameters can be obtained quickly and efficiently. The method was tested and validated on artificial data sets and benchmark data sets. The experimental results show that it can effectively solve the optimization problem of the signed fuzzy measure in the Choquet integral classifier.
FPFH Feature Extraction Algorithm Based on Adaptive Neighborhood Selection
WU Fei, ZHAO Xin-can, ZHAN Peng-lei, GUAN Ling
Computer Science. 2019, 46 (2): 266-270.  doi:10.11896/j.issn.1002-137X.2019.02.041
Abstract PDF(2495KB) ( 1275 )   
References | Related Articles | Metrics
When using the FPFH feature of point clouds for 3D object recognition or registration, the FPFH feature descriptor is computed arbitrarily and inefficiently by subjectively adjusting the neighborhood radius, and the whole process cannot be completed automatically. This paper proposed an adaptive neighborhood-selection FPFH point cloud feature extraction algorithm to solve this problem. Firstly, the point cloud densities of many pairs of point clouds are estimated. Secondly, the neighborhood radii used to extract the FPFH features for SAC-IA are computed, and the radii and densities at which the registration performance is optimal are recorded; cubic spline interpolation is then used to fit a functional relationship between radius and density, forming the adaptive neighborhood-selection FPFH feature extraction algorithm. The experimental results show that the algorithm can adaptively choose an appropriate neighborhood radius according to the density of the point cloud, improving both FPFH feature matching performance and computing speed, which indicates that the proposed algorithm is of important guiding significance.
Shot Boundary Detection Method Based on HEVC Compressed Domain
ZHU Wei, SHANG Ming-jiang, RONG Yi, FENG Jie
Computer Science. 2019, 46 (2): 271-278.  doi:10.11896/j.issn.1002-137X.2019.02.042
Abstract PDF(2198KB) ( 750 )   
References | Related Articles | Metrics
Shot boundary detection is an important part of intelligent video retrieval. Existing detection methods mainly operate in the pixel domain, with low cut-detection accuracy and high computational complexity. To solve these problems, this paper used the encoding information obtained by parsing an HEVC stream and proposed a shot boundary detection method in the HEVC compressed domain. First, the number of PUs with each prediction mode is counted for every frame, and the motion vectors are filtered according to CU depth. Then, candidate cut frames are selected in two stages using the PU prediction modes, the motion vectors and the number of frame bits, and cut detection with an adaptive threshold is performed. After that, the video is segmented according to the cut frames, and the frame-bit counts are smoothed in the time domain. Finally, the PU prediction modes and the smoothed frame-bit counts are used to detect gradual shot transitions. The experimental results show that the proposed method achieves good shot boundary detection with lower computational complexity.
Interactive Likelihood Target Tracking Algorithm Based on Deep Learning
ZHANG Ming-yue, WANG Jing
Computer Science. 2019, 46 (2): 279-285.  doi:10.11896/j.issn.1002-137X.2019.02.043
Abstract PDF(7806KB) ( 830 )   
References | Related Articles | Metrics
Traditional video target tracking methods usually have low accuracy. This paper proposed an improved scheme based on a convolutional neural network and an interactive likelihood algorithm, and optimized the particle filter algorithm on the basis of deep learning. To address the deficient nonlinear fitting ability of principal component analysis (PCA), a kernel principal component analysis (KPCA) tracking algorithm was introduced to obtain a deeper characteristic expression of the target. Then a novel interactive likelihood (ILH) method was applied to image-based trackers, which can non-iteratively compute the sampling of areas belonging to different targets and thus reduces the requirement for data association. The performance of the presented algorithm was evaluated against several related algorithms on image datasets. The experimental results demonstrate the strong robustness and accuracy of the proposed algorithm.
Mismatch Elimination Algorithm Based on Point Line Feature Fusion
WEI Yu-hui, WANG Yong-jun, WANG Guo-dong, LIU Hong-min, WANG Jing
Computer Science. 2019, 46 (2): 286-293.  doi:10.11896/j.issn.1002-137X.2019.02.044
Abstract PDF(2708KB) ( 1532 )   
References | Related Articles | Metrics
Image feature matching plays an important role in computer vision. Descriptor-based feature point matching has achieved a series of successes, but because curves have varying lengths, inaccurately located endpoints, and neighborhoods containing much repetitive texture, feature curve matching remains a challenging topic, and many curve matching methods suffer from few matches and low matching accuracy. To improve the total number and accuracy of feature matches, this paper proposed a novel Point-Line feature Fusion (PLF) algorithm based on the positional relationship between feature points and feature curves. Firstly, it defines the distance from a point to a curve, and obtains matched points and curves from the images using point and curve descriptors respectively. Secondly, it determines the matched point pairs in the support regions of each pair of matched curves and eliminates curve mismatches according to the distance constraints between the matched points and the curve. Then it removes point mismatches according to the distance constraint between point and curve. Three combinations of points and curves were used in the experiments: points extracted by SIFT combined with curves described by the IOCD, IOMSD and GOCD curve descriptors respectively. The method is applicable to many kinds of point and curve descriptors; it is suitable not only for points and curves but also for points and lines. Experimental results show that the proposed algorithm effectively improves the total number and accuracy of feature matches, and also increases the accuracy of point matching under image rotation, viewpoint change, illumination change, JPEG compression, noise and image blur.
Interdiscipline & Frontier
Overlapping Protein Complexes Detection Algorithm Based on Assortativity in PPI Networks
WANG Jie, LIANG Ji-ye, ZHAO Xing-wang, ZHENG Wen-ping
Computer Science. 2019, 46 (2): 294-300.  doi:10.11896/j.issn.1002-137X.2019.02.045
Abstract PDF(1440KB) ( 694 )   
References | Related Articles | Metrics
Protein complexes play significant roles in biological processes. Detecting protein complexes from available protein-protein interaction (PPI) networks is one of the most challenging tasks of the post-genome era. Seed expansion is an effective clustering technique for detecting overlapping protein complexes in PPI networks. However, existing methods usually face two problems. One is that in seed selection they only consider the link density between direct neighbors of a node, which is not enough to indicate the importance of nodes within the local subgraphs formed by their neighborhoods. The other is that candidate nodes are assumed to be independent of each other, ignoring the impact of the order of candidate nodes on clustering during cluster expansion. To solve these problems, this paper proposed an overlapping protein complexes detection algorithm based on assortativity, which considers the second-order neighborhood of nodes during seed selection and adds multiple candidate nodes to clusters based on network assortativity during cluster expansion. In order to evaluate overlapping results, a new evaluation index named F-overlap was presented. Experimental results on PPI networks show that the proposed algorithm can effectively identify overlapping protein complexes.
Quasispecies Reconstruction Algorithm Based on Color Coding Technology
HUANG Dan, WU Jing-li
Computer Science. 2019, 46 (2): 301-305.  doi:10.11896/j.issn.1002-137X.2019.02.046
Abstract PDF(1246KB) ( 734 )   
References | Related Articles | Metrics
The reconstruction of viral quasispecies haplotypes helps in understanding the viral genetic structure,and is of great significance for vaccine preparation and antiviral therapy.In this paper,a weighted fragment conflict graph was constructed by introducing a fuzzy distance,and a viral quasispecies haplotype reconstruction algorithm named CWSS was proposed based on color coding technology.Firstly,the CWSS algorithm preprocesses the fragment conflict graph in accordance with a given threshold.Secondly,under the condition that adjacent vertices must receive different colors,all vertices of the graph are colored according to their sum of edge weights and saturation value.Finally,quasispecies haplotypes are obtained by assembling the fragments with the same color.The time complexity of the CWSS algorithm is O(m²n+mn).Simulated sequencing fragments were adopted to compare the reconstruction performance and quality of the CWSS algorithm and the Dsatur algorithm.The experimental results show that the CWSS algorithm can obtain more accurate quasispecies and higher reconstruction performance than the Dsatur algorithm.
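The coloring step can be sketched with a greedy scheme in the spirit described above. This is a simplified analogue, not the exact CWSS rule: vertices are picked in decreasing order of (saturation, weighted degree), and each receives the smallest color not used by its colored neighbors, so adjacent vertices always differ.

```python
def color_conflict_graph(adj, weights):
    """Greedy coloring of a weighted conflict graph.

    `adj` maps each vertex to its neighbor list; `weights` maps a
    frozenset edge {u, v} to its weight (default 1).  The ordering
    by saturation and weighted degree is a simplified analogue of
    the CWSS selection rule."""
    color = {}
    uncolored = set(adj)
    while uncolored:
        def key(v):
            # Saturation: number of distinct colors among colored neighbors.
            sat = len({color[n] for n in adj[v] if n in color})
            wdeg = sum(weights.get(frozenset((v, n)), 1) for n in adj[v])
            return (sat, wdeg)
        v = max(uncolored, key=key)
        used = {color[n] for n in adj[v] if n in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
        uncolored.remove(v)
    return color
```

Fragments sharing a color form a conflict-free set and can then be assembled into one candidate haplotype.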
Study on Performance Optimization of Reduction Algorithm Targeting GPU Computing Platform
ZHANG Yi-ran, CHEN Long, AN Xiang-zhe, YAN Shen-gen
Computer Science. 2019, 46 (2): 306-314.  doi:10.11896/j.issn.1002-137X.2019.02.047
Abstract PDF(1689KB) ( 1261 )   
References | Related Articles | Metrics
Reduction is widely applied in scientific computing and image processing,and is one of the basic algorithms of parallel computing,so accelerating it is significant.In order to fully exploit the general-purpose computing capability of GPUs on heterogeneous processing platforms,this paper proposed a multi-level reduction optimization algorithm comprising inner-thread reduction,inner-work-group reduction and inter-work-group reduction.Different from traditional reduction optimizations that put more emphasis on inner-work-group reduction,this paper showed that inner-thread reduction is the true bottleneck of the reduction algorithm.The experimental results demonstrate that,compared with the carefully optimized CPU version in the OpenCV library,the proposed reduction algorithm reaches 3.91~15.93 and 2.97~20.24 times speedup on AMD W8000 and NVIDIA Tesla K20M respectively under different data set sizes.On NVIDIA Tesla K20M,compared with the CUDA and OpenCL versions in the OpenCV library,the algorithm reaches 2.25~5.97 and 1.25~1.75 times speedup respectively,and compared with the OpenCL version of the OpenCV library on AMD W8000,it reaches 1.24~5.15 times speedup.This work not only achieves high reduction performance on GPU platforms,but also attains performance portability across different GPU computing platforms.
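The three-level scheme can be illustrated with a serial simulation. This sketch only mirrors the data flow of the levels named above (the group sizes and chunk sizes are illustrative, not the paper's tuned parameters): each "thread" first reduces a private chunk sequentially, each "work-group" then tree-reduces its threads' partial sums, and the per-group results are combined at the end.

```python
def multilevel_reduce(data, threads_per_group=4, items_per_thread=4):
    """Serial simulation of a multi-level sum reduction:
    inner-thread, then inner-work-group, then inter-work-group."""
    # Level 1: inner-thread reduction -- each thread sums its own chunk.
    chunk = items_per_thread
    partial = [sum(data[i:i + chunk]) for i in range(0, len(data), chunk)]
    # Level 2: inner-work-group tree reduction of the partial sums.
    group_sums = []
    for g in range(0, len(partial), threads_per_group):
        group = partial[g:g + threads_per_group]
        while len(group) > 1:
            half = (len(group) + 1) // 2
            group = [group[i] + (group[i + half] if i + half < len(group) else 0)
                     for i in range(half)]
        group_sums.append(group[0])
    # Level 3: inter-work-group reduction of the per-group results.
    return sum(group_sums)
```

On a real GPU, level 1 runs in registers, level 2 uses local/shared memory with barriers, and level 3 is typically a second kernel launch or an atomic combine.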
Social Team Formation Method Based on Fuzzy Multi-objective Evolution
JIN Ting, TAN Wen-an, SUN Yong, ZHAO Yao
Computer Science. 2019, 46 (2): 315-320.  doi:10.11896/j.issn.1002-137X.2019.02.048
Abstract PDF(2348KB) ( 627 )   
References | Related Articles | Metrics
Most existing team formation research in social networks adopts a 0-1 rule to measure expert skills.Aiming at the situation where people often use natural language to describe expert skills,this paper proposed a social team formation method based on fuzzy multi-objective evolution.The method focuses on finding appropriate individuals from an expert social network to form a team of a certain size,and optimizes the trade-off between communication cost and team performance under uncertainty.In this method,the precise parameters represented by the 0-1 rule are replaced by fuzzy linguistic variables to describe expert skills,and the concept of team performance is used to measure team capability.Because the standard SPEA2 algorithm converges slowly at the initial evolutionary stage,this paper introduced an AEL strategy to generate individuals with good characteristics.Considering the ambiguity of expert skills,this paper also proposed a fine-grained dominance judgment as the new rule for judging the dominance relationship between individuals.The simulation results show that the improved algorithm converges fast and obtains a good-quality approximate Pareto front (PF),which can be successfully applied to solve the team formation problem.
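The two objectives above can be related by the standard Pareto dominance test, which the paper's fine-grained fuzzy rule refines. As a baseline sketch (crisp values only, not the fuzzy variant), each candidate team is reduced to a (communication cost, team performance) pair:

```python
def dominates(team_a, team_b):
    """Standard Pareto dominance for two objectives:
    minimize communication cost, maximize team performance.
    Each team is a (cost, performance) tuple."""
    cost_a, perf_a = team_a
    cost_b, perf_b = team_b
    # a dominates b if it is no worse in both objectives
    # and strictly better in at least one.
    no_worse = cost_a <= cost_b and perf_a >= perf_b
    strictly_better = cost_a < cost_b or perf_a > perf_b
    return no_worse and strictly_better
```

Replacing the crisp comparisons with comparisons over fuzzy linguistic skill values is where the paper's fine-grained dominance judgment comes in.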
Analysis of Effective Low-frequency Behavioral Patterns Based on Petri Net Behavior Closeness
HAO Hui-jing, FANG Xian-wen, WANG Li-li, LIU Xiang-wei
Computer Science. 2019, 46 (2): 321-326.  doi:10.11896/j.issn.1002-137X.2019.02.049
Abstract PDF(1375KB) ( 835 )   
References | Related Articles | Metrics
Low-frequency behavioral pattern analysis is one of the important aspects of process management,and effectively distinguishing low-frequency logs from noise logs is crucial in business process mining.At present,most research directly filters out low-frequency behavior in the process model as noise,yet some low-frequency behaviors are valid for the model.This paper presented an effective low-frequency pattern analysis method based on the behavioral closeness of Petri nets.Firstly,a reasonable process model is established according to the given event log.Then,all low-frequency log sequences in the process model are found by iteratively expanding the initial patterns.On this basis,the behavioral distance vectors of the log and the model are calculated,and the behavioral closeness between log and model is used to find the effective low-frequency behavioral patterns.Finally,an example is given to verify the feasibility of the method.
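Once the behavioral distance vectors of log and model are available, their closeness can be scored. The cosine similarity below is one plausible realization of such a closeness measure, assumed here for illustration; the paper's exact definition of behavioral closeness may differ.

```python
import math

def behavioral_closeness(log_vector, model_vector):
    """Cosine similarity between the behavioral distance vector of an
    event log and that of the model -- one plausible closeness score,
    not necessarily the paper's exact definition."""
    dot = sum(a * b for a, b in zip(log_vector, model_vector))
    norm = math.sqrt(sum(a * a for a in log_vector)) * \
           math.sqrt(sum(b * b for b in model_vector))
    # Return 0 for degenerate (all-zero) vectors.
    return dot / norm if norm else 0.0
```

A low-frequency pattern whose closeness exceeds a threshold would be kept as valid behavior rather than discarded as noise.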
Study of Optimal Learning Law and Simplified Learning Law of Iterative Learning Control in Frequency Domain
LIU Jia-cun, ZHAO Gui-yan, MEI Qi-xiang
Computer Science. 2019, 46 (2): 327-332.  doi:10.11896/j.issn.1002-137X.2019.02.050
Abstract PDF(1939KB) ( 795 )   
References | Related Articles | Metrics
To obtain the optimal learning law of iterative learning control for LTI MIMO systems,together with a simplified learning law that is convenient for engineering implementation,this paper carried out studies in the frequency domain.Based on the transfer function matrix of the system,the time-domain error is converted to a frequency-domain error according to Parseval’s theorem,and the general convergence condition of the learning law is obtained by using the matrix properties of the Jordan canonical form.The optimal learning law,which requires only one iteration and has the fastest convergence rate,is obtained by analyzing the aforementioned conditions.Since higher-order derivatives are not conducive to eliminating noise,reducing the order of the derivative was discussed to obtain a simplified learning law.Simulation results show that the optimal and simplified learning laws are effective.
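The frequency-domain reasoning above can be illustrated with the textbook form of the convergence condition; the contraction bound and the inverse-model choice below are standard ILC results assumed here for illustration, not the paper's exact derivation.

```latex
% With the update U_{k+1}(j\omega) = U_k(j\omega) + L(j\omega)\,E_k(j\omega)
% applied to a plant G(j\omega), the error propagates across iterations as
E_{k+1}(j\omega) = \bigl( I - G(j\omega)\,L(j\omega) \bigr)\, E_k(j\omega),
% so a sufficient convergence condition is the contraction bound
\bigl\| I - G(j\omega)\,L(j\omega) \bigr\| < 1 \quad \text{for all } \omega.
% Choosing L(j\omega) = G(j\omega)^{-1} (when G is invertible) makes the
% factor vanish and drives the error to zero after a single iteration,
% matching the one-iteration optimal law described above.
```

The simplified law then corresponds to approximating this inverse by lower-order terms that avoid amplifying high-frequency noise.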