Computer Science

Similarity Joins on Massive Data Based on MapReduce Framework

PANG Jun, YU Ge, XU Jia and GU Yu

Computer Science. 2015, 42 (1): 1-5. doi:10.11896/j.issn.1002-137X.2015.01.001

Abstract

PDF(516KB) ( 636 )

References | Related Articles | Metrics

As a basic operation of large-scale data processing,similarity joins on large-scale data play an important role in document clustering,plagiarism detection,entity resolution and many other fields.On the other hand,the MapReduce programming model is widely applied to massive data processing because of its good scalability,fault tolerance and usability.Therefore,similarity joins on massive data based on MapReduce become one of the hot topics in the field of massive data processing.Firstly,big challenges of similarity join query introduced by rapid growth of data volume were generalized.Then,the definition of similarity joins on massive data was presented and similarity joins on massive data were classified according to three different standards.In addition,the current status of set,string and vector similarity joins on massive data were emphatically analyzed and compared from the aspects of efficiency,applicability and so on.Finally,the research focus and trend of this area were investigated and the promising solutions were suggested.

Survey on Clustering Protocols in Wireless Sensor Network

HAI Mo, ZHANG Yan-mei and ZHANG Yue-jin

Computer Science. 2015, 42 (1): 6-11. doi:10.11896/j.issn.1002-137X.2015.01.002

Abstract

PDF(574KB) ( 520 )

References | Related Articles | Metrics

Clustering in wireless sensor network is a process to divide the network into a number of clusters,which can extend the lifespan of network.Each cluster has a node called cluster head,which can be elected by the nodes in the same cluster or pre-assigned by the network designer.Firstly the classification characteristics of clustering protocols in wireless sensor network were given,and then current clustering protocols were classified by the selection methods of cluster head.After that current clustering protocols were compared from three aspects:basic characteristics,clustering attributes and selection methods of cluster head.Lastly the shortcomings of current research on clustering protocols were summarized and the key problems of clustering protocols which need to be solved in the future were pointed out.

Schedulability Analysis of Time Petri Net and its Application in FMS

ZHAI Zheng-li and DING Zhi-jun

Computer Science. 2015, 42 (1): 12-18. doi:10.11896/j.issn.1002-137X.2015.01.003

Abstract

PDF(492KB) ( 632 )

References | Related Articles | Metrics

In a real-time system,the process of verifying whether a schedule of task execution meets the imposed timing constraints is referred to as scheduling analysis.This paper presented an approach to analyze the scheduling of real time systems modeled in time Petri nets by separating timing properties from other behavioral properties.If a specific task execution is schedulable,we could calculate the time span of the task execution,otherwise pinpointed out non-schedulable transitions to help adjust timing constraints and correct design error.A technique for compositional timing analysis was also proposed to deal with complex task sequences by decomposing it into some simple sub-task sequences,which not only improves efficiency but also facilitates the discussion of reachability issue with regard to scheduling.Finally,sche-duling analysis of an assembly system in flexible manufacturing systems (FMS) was discussed.

Computational Services for Engineering Design Based on Parametric Models

ZHANG Ming-xi, ZUO Bing-quan, WU Zi-jun and HUANG Zheng-dong

Computer Science. 2015, 42 (1): 19-22. doi:10.11896/j.issn.1002-137X.2015.01.004

Abstract

PDF(921KB) ( 456 )

References | Related Articles | Metrics

In engineering design,designers need to learn a lot of simulation analysis software and must have special knowledge of other disciplines.To reduce the designer’s burden from modeling,we developed a network service platform that can help designers to complete the computation task for engineering design.On the Web-based service platform,designers can choose a parametric design model constructed by other professionals,customize the external function connections,input model parameters,and view the visualized simulation results of remote computation in servers.The platform has been applied to the modeling and simulation of engine thermal and dynamical processes.The results show that the service platform can do a lot of simulation work for designers in this field,greatly reducing the modeling workload and improving the design efficiency.Meanwhile,the problem that designers have difficulty for constructing the correct model due to lack of sufficient expertise can be avoided.

Indicator-effect Modeling and Evaluation for CPS System Based on Extended DPN

SONG Cui-ye, DU Cheng-lie and LI Gang

Computer Science. 2015, 42 (1): 23-27. doi:10.11896/j.issn.1002-137X.2015.01.005

Abstract

PDF(397KB) ( 491 )

References | Related Articles | Metrics

Cyber physical system (CPS) is composed of networked physical devices,in which computing entities are embedded to perceive everything and controll the running of the devices.Evaluating the integral effect of the physical and computing arguments,control indicators and related real-time properties is essential to the product development of a CPS system.In this paper,an intelligent vehicle was taken as the research object,and an extended differential Petri net (DPN) model was adopted to establish the integrated model of its cyber physical behavior and desired control effect,in which the key design elements are reflected.By executing of the model,the integrated control effect is judged directly from the model.This method provides an effective way to assess the integral properness,and combined design of key design indicators of a CPS system.

Dynamic Multi-priority Scheduling for Cyber-physical Systems

LIU Chun-yao and ZHANG Li-chen

Computer Science. 2015, 42 (1): 28-32. doi:10.11896/j.issn.1002-137X.2015.01.006

Abstract

PDF(379KB) ( 419 )

References | Related Articles | Metrics

The complexity and heterogeneity of Cyber-physical systems (CPS) bring big challenges to system desig-ners.Conventional task scheduling scheme will not satisfy performance requirements of CPS due to the diversity of tasks.A dynamic multi-priority scheduling scheme for a class of CPS based on large-scale sensor network was proposed.In the proposed scheme,tasks reaching each node in the system are divided into four categories:real time tasks to be sent from local node which have the highest preemptive priority,real time tasks to be transmitted from other nodes which have the second highest preemptive priority,non-real time tasks to be processed from controller which have the third highest non-preemptive priority,non-real time tasks to be transmitted from other nodes which have the lowest non-preemptive priority.A mixed preemptive and non-preemptive priority scheduling method was designed to reduce the average waiting time of tasks in each queue.A waiting-time threshold mechanism was added to ensure the fairness of tasks.We analyzed task transmission delay of our scheduling algorithm.Simulation results show that the proposed dynamic multi-priority scheduling scheme outperforms conventional priority queue scheduling scheme.

Performance Evaluation of Job Scheduling and InfiniBand Network Interconnection in High Performance Computing System Based on Stochastic Petri Nets

LI Zhi-jia, HU Xiang, JIAO Li and WANG Wei-feng

Computer Science. 2015, 42 (1): 33-37. doi:10.11896/j.issn.1002-137X.2015.01.007

Abstract

PDF(489KB) ( 676 )

References | Related Articles | Metrics

Model-based analysis technology plays an important role in the system research and design,because it has many advantages,such as simplicity,flexibility,scalability,and efficiency.Especially,stochastic Petri nets are widely used in performance evaluation.We proposed a method of synthetical performance evaluation of job scheduling and InfiniBand network interconnection based on stochastic Petri nets.The experiment results demonstrate that our method is feasible and comparatively accurate.

Surface Reconstruction Skill Based on Key Characteristics Control in Second-order Polyhedral Meshes

LIU He-dan and WANG Cheng-en

Computer Science. 2015, 42 (1): 38-43. doi:10.11896/j.issn.1002-137X.2015.01.008

Abstract

PDF(972KB) ( 446 )

References | Related Articles | Metrics

In order to solve the problem of accurate topologies and the cost of drawing and transmission during the surface reconstruction process in second-order multi-node polyhedral,the surface reconstruction skill based on key characteristics control was proposed.The method analyzes the characteristics of second-order multi-node polyhedral parameter interpolation functions,calculates accurate contour points in edge,extracts surface and mesh key points which can perform the geometric characteristics of surface in second-order multi-node polyhedral.Based on the logical relationship among the three types of interpolation critical point,it develops the surface triangulation rules which have accurate and unique topologies,and the split-style compress index structure of triangle to triangulation,image optimization and dra-wing.The experiments show that the method can describe the surface accurately,define an unique geometry topology within second-order tetrahedral mesh grid,has the adaptability for different accuracy requirements,and can reduce costs of computation,drawing and transmission dramatically.

Primary Investigation into Parallel Computing in Julia Language

GONG Qing-kui, ZHANG Chang-you, ZHANG Xian-yi and ZHANG Yun-quan

Computer Science. 2015, 42 (1): 44-46. doi:10.11896/j.issn.1002-137X.2015.01.009

Abstract

PDF(497KB) ( 749 )

References | Related Articles | Metrics

Julia language is a free developing scripting language under the MIT license.Its goal is to ease the difficulty of parallel programming.Based on the language mechanisms of Julia,we constructed a use case of computing the average running-time between every two bus stops.And then,we exampled the Julia programming framework and the code refining steps.Julia language supports both multi-cores/CPUs parallel programming modes.To full use all the computing resources,we developed some experiments on new policies about how to improve the computing performance.Experiments show that managing processors in parallel computing model consume working time,but with the increasing of problem size,this impact can be gradually ignored,and gaining nearly linear speedup.

Queueing Network Based Performance Model for Multi-tiered Web Application with Consideration of Performance Interference among Virtual Machines

YANG Lei, DAI Yu, ZHANG Bin and WANG Hao

Computer Science. 2015, 42 (1): 47-49. doi:10.11896/j.issn.1002-137X.2015.01.010

Abstract

PDF(218KB) ( 460 )

References | Related Articles | Metrics

Performance analysis for multi-tiered application is one of the important factors to achieve dynamic resource allocation and management,and insure the better performance of the application.Traditional performance models assume the availability of dedicated hardware for the application and neglect the effect caused by logic resource.With the deve-lopment of cloud computing,hardware resources can be virtualized as virtual resource to provide service which can support the performance insurance of the multi-tiered application effectively.Then,how to take the performance interfe-rence between virtual machines and the logic resource into consideration becomes a key issue when building the perfor-mance model for a multi-tiered application.For this,this paper built a performance model for multi-tiered application based on queuing network.This model extends the current performance model with a drop queue to support concurrent connection limit.With the consideration of the performance interference between virtual machines,this paper presented the parameter solving method of the proposed performance model.The experiments show the better performance of the performance model.

Method of Progressive Intelligent Backtracking Vector Code Adjustment and Optimization

ZHAO Bo, ZHAO Rong-cai, XU Jin-long and GAO Wei

Computer Science. 2015, 42 (1): 50-53. doi:10.11896/j.issn.1002-137X.2015.01.011

Abstract

PDF(682KB) ( 430 )

References | Related Articles | Metrics

In order to fully develop the computing ability of high-performance computer and relieve the pressure of designing and writing parallel programs for programmers,with the expansion of available software sets we designed and realized the vector program through the interactive interface.Using this method,we could optimize the generated vector code to improve the efficiency for the implementation of generated code.Our process is of great significance to express the ability of high-performance computing,enhance the availability and extend range of application.Furthermore,it can provide available supplementary means and tool support.The method of progressive intelligent backtracking vector code adjustment and optimization can automatically generate the parallel code after analysis and transformation of the serial code given by the user using the following methods such as serial code analysis,data dependence analysis and parallelization analysis,etc.The work of this article can greatly reduce the work of programmers by transforming the serial code into parallel code automatically with the analysis of parallelization in the serial code.

Design and Optimization of Storage System in HEP Computing Environment

CHENG Yao-dong, WANG Lu, HUANG Qiu-lan and CHEN Gang

Computer Science. 2015, 42 (1): 54-58. doi:10.11896/j.issn.1002-137X.2015.01.012

Abstract

PDF(705KB) ( 735 )

References | Related Articles | Metrics

High energy physics computing is a typical data-intensive application,and the performance of data access throughput is critical to the computing system.The performance of data access is closely related to the computing model of application.This paper firstly analyzed the typical high energy physics computing models,and then summarized the characteristics of data access.Based on these characteristics,some optimization measures were proposed,including opera-ting system I/O scheduling policy,distributed file system cache configuration and so on.Data access performance and CPU utilization are improved significantly after optimization.Metadata management,data reliability,scalability and othermanageability functions are also important for large-scale storage system.Considering some shortcomings of existing Lustre parallel file system,this paper finally proposed Gluster storage system as a new solution for high energy physics.After tuning of some key factors,such as data management and scalability,the system has been put into use.It is de-monstrated that the data access performance with better scalability and reliability can meet the needs of high-energy physi-cs computing.

Implementation and Exploring of Acceleration Efficiency of Parallel AES Algorithm on CUDA

FEI Xiong-wei, LI Ken-li, YANG Wang-dong and DU Jia-yi

Computer Science. 2015, 42 (1): 59-62. doi:10.11896/j.issn.1002-137X.2015.01.013

Abstract

PDF(386KB) ( 1462 )

References | Related Articles | Metrics

Many services of network applications need data encryption to provide secure communication especially for e-bank and e-business.Massive application servers are challenging for intense computation of unlimited encryption.CUDA (Compute Unified Device Architecture) is a platform for parallel and general-purpose computation on GPU(Graphics processing unit).It can improve the performance of encryption with existed graphics device resource in a low cost manner.A Parallel AES (Advanced Encryption Standard) algorithm is implemented on a Nvidia Geforce G210 graphics card and a serial AES algorithm is implemented on AMD Athlon 7850.This parallel AES algorithm avoids synchronization and communication between threads in a same block and improves performance of acceleration completely.The parallel AES algorithm’s performance model is explored in a enormous data(up to 32MB) environment.The parallel AES algorithm’s speedup is 2.66 to 3.34 times larger than Manavski’s AES-128 parallel algorithm.Acceleration performance of parallel AES algorithm was analyzed from the aspect of acceleration efficient for the first time.Speedup of parallel AES in a 16 cores GPU is up to 15.83 times accompanying a acceleration efficiency of 99.898%.

Automatic Load Modeling Algorithm Based on Real Time Measuring

LIU Xu, MO Ze-yao, AN Heng-bin, CAO Xiao-lin and ZHANG Ai-qing

Computer Science. 2015, 42 (1): 63-66. doi:10.11896/j.issn.1002-137X.2015.01.014

Abstract

PDF(878KB) ( 527 )

References | Related Articles | Metrics

Load imbalance can cause significant performance degradation in large-scale simulations.Accurate load mo-deling is fundamental for load balancing.An automatic load modeling algorithm based on real time measuring was presented.It eases the burden of the user from providing a load model.The algorithm possesses several good properties mathematically,and runs totally distributed with a sublinear computational complexity.Experiment result of molecular dynamics simulation on 2400 processes demonstrates its speed and efficiency.

Performance Optimization and Application of KVM in HEP Computing Environment

HUANG Qiu-lan, LI Sha, CHENG Yao-dong and CHEN Gang

Computer Science. 2015, 42 (1): 67-70. doi:10.11896/j.issn.1002-137X.2015.01.015

Abstract

PDF(594KB) ( 825 )

References | Related Articles | Metrics

High Energy physics computing is a high-performance computing application,which highly requires computing power,and CPU utilization directly affects the computational efficiency of high-energy physics.Virtualization technology has demonstrated a great advantage in achieving high utilization of resources and resource sharing.This paper gave performance testing and optimization of KVM.Firstly,we showed testing results and quantitative analysis of performance between KVM and physical machines in aspect of CPU,disk IO and network IO.Then,we demonstrated various factors in hardware level and kernel level that affect the performance of KVM through the architecture of KVM virtual machine,including EPT (Extended Page Tables) and CPU affinity which are optimized for KVM.Experiment results show the penalty of CPU performance for KVM can be decreased to about 3%.Finally,virtual cluster of high ener-gy computing was shown and the performance of virtual cluster can meet the needs of high energy physics computing.

Program Phase Analysis and Phase Detection Techniques

ZHANG Hai-bo, AN Hong, HE Song-tao, SUN Tao, WANG Tao, PENG Yi and CHENG Yi-chao

Computer Science. 2015, 42 (1): 71-74. doi:10.11896/j.issn.1002-137X.2015.01.016

Abstract

PDF(572KB) ( 505 )

References | Related Articles | Metrics

The rapid development of SMP and new proposed DHMP bring new challenges for program performance optimization.We raised two performance tuning problems and the solutions were given by phase analysis.The first problem is to find theperformance bottlenecks in each phase.We proposed a static phase analysis method,which finds performance bottlenecks in each phases by analyzing architecture features and its similar matrix.The second problem is to give the proper time to reconfigure for DHMP.We proposed dynamic phase detection algorithms,namely DPDA and HTPA.DPDA archives effective performance in a relative low software/hardware cost,and HTPD is the first phase detection algorithm using statistics theory.Our results show that comparing with BBV,DPDA and HTPD can avoid its limitation of offline,plus additional hardware in online algorithm and compiler’s effect,while they offer a comparable stability and correctness.Since DPDA and HTPD do not relay on additional hardware support,they can be implemented directly in mainstream processors and DHMP.

Performance Portability Evaluation for OpenACC on Intel Knights Corner and NVIDIA Kepler

WANG Yi-chao, QIN Qiang, SEE Simon and LIN Xin-hua

Computer Science. 2015, 42 (1): 75-78. doi:10.11896/j.issn.1002-137X.2015.01.017

Abstract

PDF(321KB) ( 718 )

References | Related Articles | Metrics

OpenACC is a programming standard designed to simplify heterogeneous parallel programming by using directives.Since OpenACC can generate OpenCL and CUDA code,meanwhile running OpenCL on Intel Knight Corner is supported by CAPS HMPP compiler,it is attractive to using OpenACC on hardwares with different underlying micro-architectures.This paper studied how realistic it is to use a single OpenACC source code for a set of hardwares with different underlying micro-architectures.Intel Knight Corner and Nvidia Kepler products are the targets in the exper- iment,since they have the latest architectures and similar peak performance.Meanwhile CAPS OpenACC compiler is used to compile EPCC OpenACC benchmark suite,Stream and MaxFlops of SHOC benchmarks to access the performance.To study the performance portability,roofline model and relative performance model were built by the data of experiments.It shows that at most 82% performance compared with peak performance on Kepler and Knight Corner is achieved by specific benchmarks,but as the rise of arithmetic intensity,the average performance is approximately 10%.And there is a big performance gap between Intel Knight Corner and Nvidia Kepler on several benchmarks.This study confirmed that performance portability of OpenACC is related to the arithmetic intensity and a big performance gap still exsits in specific benchmarks between different hardware platforms.

Hardware Implementation of Scalar Multiplication on Elliptic Curves over GF(2^m)

WU Gui-ming, ZHENG Fang, XIE Xiang-hui, WU Dong and YAN Xin-kai

Computer Science. 2015, 42 (1): 79-81. doi:10.11896/j.issn.1002-137X.2015.01.018

Abstract

PDF(299KB) ( 869 )

References | Related Articles | Metrics

A three-stage non-pipelined normal base multiplier was implemented based on an algorithm for GF(2^m) multiplication using Gaussian normal base proposed by Reyhani-Masoleh.On basis of the Gaussian normal base multiplier,we presented a high-performance hardware implementation for the López-Dahab algorithm of scalar multiplication over GF(2m).The Reyhani-Masoleh algorithm can reduce the computation complexity of the multiplication through exploiting the symmetry of the multiplication matrix,and the memory requirement of the López-Dahab algorithm can be reduced by using projective coordinate.Our architecture can benefit from the combination of the two algorithms,making its performance be equivalent to the best architecture to date.

Implementation of Depth First Search Parallel Algorithm on Cluster of GPUs

YU Ying, LI Ken-li and ZHENG Guang-yong

Computer Science. 2015, 42 (1): 82-85. doi:10.11896/j.issn.1002-137X.2015.01.019

Abstract

PDF(330KB) ( 834 )

References | Related Articles | Metrics

Straightforward implementation of depth first search algorithm for large graph on GPU cluster,may lead to load imbalance between threads and un-coalesced memory accesses,giving rise to the low performance of the algorithm.In order to achieve improvement of the performance in a single GPU and multi-GPUs environment,a series of effective operations were used to reschedule before processing the data.A novel strategy for mapping between threads and data was proposed,and by using the prefix sum and binary search operations, load balancing was achieved perfectly.In order to reduce the communication overhead,we performed pruning operation on the set of edges which needs to be exchanged at all branches of DFS.Experimental results show that the algorithm can achieve its best parallelism available on a single GPU and minimize communication overhead among GPUs.GPU cluster can effectively perform the distributed DFS on graphs which contain billions of nodes.

Multiprocessor Task Scheduling Method Based on Cuckoo Search Algorithm

YANG Hui-hua, ZHANG Xiao-feng, XIE Pu-mo and WEI Xiang-yuan

Computer Science. 2015, 42 (1): 86-89. doi:10.11896/j.issn.1002-137X.2015.01.020

Abstract

PDF(311KB) ( 490 )

References | Related Articles | Metrics

Multiprocessor system plays a key role in high performance computing.In order to improve the parallelism of the system,we proposed a new multiprocessor task scheduling algorithm,which is based on Cuckoo search(CS) algorithm.The algorithm takes the latest completion time of all tasks as the goal,and encodes based on task priority method to make continuous CS algorithm suitable for discrete task scheduling problem probably.The experimental results show that CS algorithm not only can obtain shortest task completion time,but also has the fastest solving speed.Its computation time is 60% lower than the widely used genetic algorithm and particle swarm optimization algorithm.

Design and Implementation of Parallel DSRC Compression Algorithm Based on Pthreads

ZHAN Ke, ZHANG Yun-quan, WANG Ting, ZHENG Jing-jing and ZHANG Peng

Computer Science. 2015, 42 (1): 90-91. doi:10.11896/j.issn.1002-137X.2015.01.021

Abstract

PDF(235KB) ( 525 )

References | Related Articles | Metrics

With the development of high throughput sequencing technology,large volumes of DNA data are being genera-ted.The FASTQ format is widely used to store DNA sequence.If the DNA sequence reads in FASTQ format can be compressed,the storage space will be saved efficiently.One of the DSRC advantages is the high compression ratio,therefore parallel DSRC algorithm will increase the efficiency of compressing the DNA sequence reads in FASTQ format.We implemented the parallel DSRC algorithm based on Pthreads,and the experimental results indicate that the para-llel DSRC algorithm gets 3.5 speedup when four threads are used.

Efficient Coordinator in Hybrid Cloud

WANG Zong-jiang, ZHENG Qiu-sheng and CAO Jian

Computer Science. 2015, 42 (1): 92-95. doi:10.11896/j.issn.1002-137X.2015.01.022

Abstract

PDF(655KB) ( 409 )

References | Related Articles | Metrics

Cloud computing provides four deployment models:public cloud,private cloud,community cloud and hybrid cloud.Generally,resources available in a private cloud are limited,thus the cloud users have to rent resources from public clouds.This requirement means that cloud users will incur extra costs.More and more enterprises choose the hybrid cloud to deploy their applications.In the hybrid cloud,in order to minimize the cost of using resources,it is also important to satisfy QoS for user.Therefore,this paper proposed resources allocation algorithm for hybrid cloud users who want to minimize the resource cost and ensure QoS satisfaction.The empirical results demonstrate that resources can be allocated by our algorithm in a way that satisfies the user QoS and keeps low operational costs.

B-RPL:Low Memory Cost RPL Routing Protocol

YANG Hong, ZHU Hong-song and SUN Li-min

Computer Science. 2015, 42 (1): 96-100. doi:10.11896/j.issn.1002-137X.2015.01.023

Abstract

PDF(663KB) ( 678 )

References | Related Articles | Metrics

The RPL routing protocol is widely used in low-power and lossy networks (LLNs).However,the storing mode of RPL is criticized for large memory consumption.In this paper,an advanced RPL was proposed,called B-RPL,which reduces the memory cost by making routing decision according to a set of destinations rather than the routing table used in the raw RPL.Further more,the B-RPL employs Bloom filter to manage the set of destinations,so that the memory consumption becomes extremely low.The B-RPL also contains several adaptive designs specially tailored for dynamic network change in LLNs.Experiments show that,comparing with RPL in storing mode,B-RPL saves 97.8% storage at the expense,and only increases 2.4% transmission overhead.

Mobile Coverage Algorithm Based on Dichotomy Coding for WSANs

DU Jing-lin, ZHENG Ruo-qin, XIE Li and LI Juan

Computer Science. 2015, 42 (1): 101-105. doi:10.11896/j.issn.1002-137X.2015.01.024

Abstract

PDF(407KB) ( 379 )

References | Related Articles | Metrics

In WSANs,the coverage of the target region can be enhanced by adjusting the location of remaining nodes to serve sensors better.A mobile coverage algorithm based on dichotomy coding was proposed.In each searching process,location of actor nodes near to the area of failure nodes can be adjusted in virtual to find the optimum position.This searching process is repeated until the coverage cannot be increased in order to reach the local optimum approximately.This algorithm improves the coverage rate of the remaining nodes and decreases the consumption of actors during mo-ving.Compared to other existing algorithms,this algorithm has better performance.

Research of Performance Optimization and Re-routing Selection Algorithm Based on Improved rLFA

WANG Ming-ming, MENG Xiang-ru, XU You and CUI Wen-yan

Computer Science. 2015, 42 (1): 106-109. doi:10.11896/j.issn.1002-137X.2015.01.025

Abstract

PDF(403KB) ( 449 )

References | Related Articles | Metrics

Aiming at improving network single link (node) failure fast recovery capability further,a method based on improved rLFA was proposed by utilizing Chaos PSO and considering both transmission and congestion cost.Firstly,a method based on rLFA was proposed by improving the creation of tunnels,which was integrated with the supplement links in order to achieve full coverage for failures.The weight factor was set to keep the pertinence of re-routing selection under the different traffic demands.The experimental result shows that the improved rLFA can improve network single failure fast recovery capability further and reduce the numbers of supplement links sharply.The re-routing algorithm can achieve the dynamic selection of re-route to adapt the different traffic demands,meanwhile,it can improve the transmission efficiency and balance the load in single failure network.

Research on Degrees of Freedom of Two PTP Networks Coexistence

ZHOU Lu-na, LIU Feng and ZENG Lian-sun

Computer Science. 2015, 42 (1): 110-112. doi:10.11896/j.issn.1002-137X.2015.01.026

Abstract

PDF(266KB) ( 443 )

References | Related Articles | Metrics

The degree of freedom (DoF) is derived from the concept of spatial multiplexing introduced by the using of multi-antenna system,reflecting the utilization efficiency of communication system to the resources of space.In cognitive radio networks,the coexistence of primary network and secondary network will interfere each other and reduce the DoF of system.This paper studied the DoF of two point to point(PTP) coexisting networks and the corresponding method for interference elimination.We obtained the DoF range of secondary network for message transmission when it is avai-lable to use the spare space resource of primary network in the same frequency.Meanwhile,the maximum DoF of se-condary network was given,when the space resource of primary network is fully exploited.In addition,a new cooperation approach was proposed to increase the degrees of freedom of the whole network by some ways to borrow the spare receive antennas from primary network.Analysis and comparison show that when M₁≤N₁ or N₁≥M₁₊M₂,cooperation at receivers can increase the degrees of freedom.

Research of Web Service Selection Trust Model on P2P Environment

CHEN Wei-dong, LI Min-qiang and ZHAO Qing-zhan

Computer Science. 2015, 42 (1): 113-118. doi:10.11896/j.issn.1002-137X.2015.01.027

Abstract

PDF(449KB) ( 409 )

References | Related Articles | Metrics

In order to adapt to the natural characteristics of Internet service resources “growth,autonomy”,the practice of self-organizing system P2P technology supply chain based on electronic commerce as representative of the shows great potential,but dynamic and open nature of self organizing systems makes it face serious problems of behavior trust.The feedback based Web service selection trust model in P2P environment was proposed,using the local reputation and global reputation as well as the current evaluation and history evaluation combined with factors such as time window designed model,credibility model and service reputation evaluation,service evaluation,the classification of ser-vice evaluation information using the K-means clustering algorithm,according to the different situation of using approp-riate penalty function.Finally,the experiment results show that the proposed trust model has a certain ability to resist the network environment of dishonesty.

Label Propagation Algorithm Based on Community Core for Community Detection

MA Jie-liang, HAN Lu, PAN Zhen-zhen and SONG Yan

Computer Science. 2015, 42 (1): 119-121. doi:10.11896/j.issn.1002-137X.2015.01.028

Abstract

PDF(584KB) ( 565 )

References | Related Articles | Metrics

Community detection in networks is a hot research topic currently.Among many community detection algorithms,label propagation algorithm is widely used for it is simple and rapid.But label propagation algorithm also has the problem of poor stability result.Therefore,we improved the initialization process of label propagation algorithm.We proposed a label propagation algorithm based on community core for community detection,and by calculating any two nodes’s k order common neighbor,we used the most sililar nodes and it’s k order neighbor nodes as the initial community core.According the above process,we got some tight structure as the initial label of label propagation,and assigned the initial community label to these structures.Experimental results in a real network show that the algorithm can improve the stability of the results.

Prediction Model of Network Traffic Based on EMD and RVM

BAI Jun, XIA Jing-bo and ZHAO Xiao-huan

Computer Science. 2015, 42 (1): 122-125. doi:10.11896/j.issn.1002-137X.2015.01.029

Abstract

PDF(308KB) ( 470 )

References | Related Articles | Metrics

A prediction model was proposed for self-similar network traffic based on EMD (empirical mode decomposition) and RVM(relevant vectors machine).Firstly,the network traffic in slipping window is decomposed into multiple IMF(intrinsic mode function) using EMD,and then RVM is applied to fit high frequent components while ARMA is used to structure prediction model for low frequent components,lastly,all the components’ forecasting result is composed.The experiment indicates that the proposed model can accuratly predict traffic time series’ amplitude and trend,and compared to other method,achieves higher prediction accuracy than that of other similar prediction methods.

DPA:A QoS Unicast Routing Algorithm in Dynamic Environment

YI Meng, CHEN Qing-kui, ZHANG Gang and ZHAO Hai-yan

Computer Science. 2015, 42 (1): 126-128. doi:10.11896/j.issn.1002-137X.2015.01.030

Abstract

PDF(331KB) ( 486 )

References | Related Articles | Metrics

In the current Internet routing under network environment,network parameter changes all the time,which easily causes the problem of routing expired,makes the provided QoS routing invalid.In order to solve these issues,this paper proposed a QoS unicast routing algorithm (DPA) suit for the dynamic change of network parameters.In the case of the path to changing costs over time,it can select the optimal routing node autocratically to solve the routing inaccuracies problem in multi-constrained QoS unicast routing.Experimental results show that the routing algorithm is better in adaptability and scalability,in the meantime,compared to traditional routing algorithms,DPA can provide better QoS routing in the route selection.

Network Security Emergency Response Based on CBR and Description Logic

JIANG Fei, GU Tian-long, XU Zhou-bo and CHANG Liang

Computer Science. 2015, 42 (1): 129-136. doi:10.11896/j.issn.1002-137X.2015.01.031

Abstract

PDF(853KB) ( 461 )

References | Related Articles | Metrics

Network security emergency response is the focus of information security policy for future.The current emergency response mainly depends on the incident response team and safety manager,which can effectively deal with part of security incidents,but not give the reasonable,fast,effective processing method for security incidents under specific environment.To solve this problem,the paper proposed an intelligent method based on case based reasoning and description logic for network security emergency response,to handle specific security incidents automatically.First,we used description logic to describe domain knowledge of network security emergency response,and then designed a good matching algorithm of similarity based on refinement operator and refinement graph,gave the realization process of the CBR in emergency response,and finally used the specific examples to validate the proposed method in this paper.The results show that the method has the characteristics of clear semantics,automatic classification of concept and good reasoning ability,and can get the current problem solution from past security incidents,and is capable of giving the handling method of security incidents under specific environment.

Elliptic Curve Based Light-weight Authentication and Key Agreement Scheme

GUO Song-hui, NIU Xiao-peng and WANG Yu-long

Computer Science. 2015, 42 (1): 137-141. doi:10.11896/j.issn.1002-137X.2015.01.032

Abstract

PDF(414KB) ( 873 )

References | Related Articles | Metrics

Certificateless public key cryptosystem has appealing features,namely it does not require the use of certificates and does not have a private key escrow problem,and it can to some extent solve the problem of time consuming and resource consuming of traditional public key cryptography.This paper proposed an elliptic curve based certificateless authentication and key agreement scheme,which includes a protocol and several core algorithms.This scheme can finish two party authentications in double communication without bilinear pairing computing,and greatly increase the efficiency of authentication by 30% compared with the formal protocols.The scheme makes the most of point addition of elliptic curve,increasing the computing speed,and it can complete the authentication and generate the shared key in 20ms without considering the network communication time consuming.The scheme also satisfies communication safety under the exposure of shared key,master key forward secrecy,perfect forward secrecy and key compromise impersonation resilience.The scheme is more suitable for the restricted computing resource of the communication environment,such as wireless sensors,Ad hoc networks,and so on.

Encrypted Session Detection Approach Based on Information Entropy

CHEN Li, ZHANG Li, BAN Xiao-fang and LIANG Jie

Computer Science. 2015, 42 (1): 142-143. doi:10.11896/j.issn.1002-137X.2015.01.033

Abstract

PDF(275KB) ( 2039 )

References | Related Articles | Metrics

Traditional protocol analysis algorithms detect the network encrypted session through the port.It cannot work when encrypted session uses unknown port or encrypted traffic appeares at known plaintext port.To this end,we put forward a detection approach of encrypted session based on information entropy.Firstly it reorganizes net flow according to the port,then calculates the entropy of each packet and statistical entropy value of the entire session,at last determines whether the value belongs to the normal distribution confidence interval,and identifies the encrypted session through character distribution uniformity.Experiments show that the approach does not need fingerprint database,and can achieve higher correct detection rate,real-time detection and processing.

RFID Data Cleaning Algorithm Based on Tag Velocity and Sliding Sub-window

GU Yun-hua, GAO Bao, ZHANG Jun-yong and DU Jie

Computer Science. 2015, 42 (1): 144-148. doi:10.11896/j.issn.1002-137X.2015.01.034

Abstract

PDF(403KB) ( 521 )

References | Related Articles | Metrics

To improve the accuracy of data cleaning under the circumstances of non-uniform RFID(Radio Frequency Identification) data stream,on the basis of the classical algorithm SMURF(statistical SMoothing for unreliable RFID data),an algorithm was presented based on tag velocity and sliding sub-window to clean RFID data.The method consi-ders the influence of tag velocity on the sliding window adjustment,adjusts the speed dynamically according to the label confidence δ,at the same time further divides the sliding window,and takes statistical sampling for tag data in sub window.The results of the statistical sampling are dealing with the data of entire sliding window together to speed up the detection of tag transition,so that the movement of tag can be determined more accurately.Experimental results show that the algorithm reduces the average error rate and frequency of the phenomenon of positive reading,thereby increasing the accuracy of the data.

One Strong Authentication Test Suitable for Analysis of Nested Encryption Protocols

SONG Wei-tao and HU Bin

Computer Science. 2015, 42 (1): 149-154. doi:10.11896/j.issn.1002-137X.2015.01.035

Abstract

PDF(595KB) ( 417 )

References | Related Articles | Metrics

Authentication test is a new type of analysis method of security protocols,which is proposed based on strand space model.It attracts a majority of scholars’ attention because of its simple and practical,but it doesn’t suitable for the analysis of nested encryption protocols.This greatly restricts the application of the method.Meanwhile,the existing improvement schemes are difficult to break through the limitation thoroughly on account of strand space’s poor ability of reflecting the internal relations of terms.By introducing the definitions of equivalence class,class elements,security encryption item,and security encryption package,etc.,this paper improved the strand space’s ability of depicting the internal relations of terms.Then it put forward a general authentication test scheme which can be applied to anylize the nested encryption scenarios of authentication test element in the protocols.Furthermore,we verified the correctness and effectiveness of the new method from two aspects:formal proof and instance analysis.

Android Malware Characterization Based on Static Analysis of Hierarchical API Usage

WEI Song-jie and YANG Ling

Computer Science. 2015, 42 (1): 155-158. doi:10.11896/j.issn.1002-137X.2015.01.036

Abstract

PDF(418KB) ( 1008 )

References | Related Articles | Metrics

Current static-analysis practice on Android application package (APK) mainly uses the features such as permissions,data flows,API calls,extracted from the manifest file and the code.Such features lack consideration on the APK code organizations and object hierarchy,and thus they may be ineffective in describing and predicting an APK’s application behaviors and maliciousness.This research work tried to design and implement a comprehensive API-usage characterization method for Android APK on different resolutions and hierarchies,namely packages,classes,and functions.A tree structure is used to contain such hierarchical API-usage information,and a comparison algorithm is designed for cross-tree similarity,which provides extra insights in classifying and differentiating Android malware of different types and code families.The variations in API-usage on different code layers imply code functionalities and application behaviors,and thus they can be used to improve current static-analysis based malware detection and signature generation.Realistic malware packet samples of various types and families were used to validate the proposed characterization method,and results were discussed for its strength and future improvement.

Chinese Wall Model Based on Dynamic Divided-set

JIANG Lu, HE Rong-yu and WEI Yan-fen

Computer Science. 2015, 42 (1): 159-163. doi:10.11896/j.issn.1002-137X.2015.01.037

Abstract

PDF(386KB) ( 520 )

References | Related Articles | Metrics

The Chinese wall model gives much constraint on write permission,while its access regions need to predetermined and divided statically.Its conflict of interest relation was defined by the object’s interest.A modified vision of Chinese wall was proposed to solve this problems.The divided-sets was defined and both the interest of subject and object were considered to analyze the system’s conflict of interest relation.In this model,objects and subjects can be divided into different access regions which can be extended dynamically.At last,the security of this model was proved.The application of the model was showed by a simple example.

Identification of Encrypted Bit Stream Based on Runs Test and Fast Fourier Transform

XING Meng, WU Yang, WANG Tao and LI Jin-dong

Computer Science. 2015, 42 (1): 164-169. doi:10.11896/j.issn.1002-137X.2015.01.038

Abstract

PDF(479KB) ( 510 )

References | Related Articles | Metrics

To obtain samples of encrypted data and plaintext in data link layer,an encrypted data identification scheme was provided based on the run test,meanwhile,and the fast Fourier transform was used to process the encrypted data and plaintext.Based on the principle of maximum difference,the characteristic point of the result of the fast Fourier transform was determined.Then the value of the characteristic point and the feature template were determined using the principle of normal distribution.Finally,the identification rate of the proposed scheme was verified,taking a wireless network data as the identification object.The experimental results demonstrate that the rate of the proposed scheme achieves 95% both for the encrypted data and the plaintext.

Research of Subjective Trust Evaluation Model and Decision-making

YANG Yu-li, PENG Xin-guang and WANG Zheng

Computer Science. 2015, 42 (1): 170-174. doi:10.11896/j.issn.1002-137X.2015.01.039

Abstract

PDF(418KB) ( 401 )

References | Related Articles | Metrics

Since traditional credit evaluation methods can not describe the business’s creditworthiness characteristics of timeliness and risk effectively in electronic commerce transactions,a new subjective trust evaluation method based on multi-attribute normal cloud model was presented.Firstly,the standard trust cloud including five levels is generated.Then inducting time attenuation factor,the reputation cloud and risk cloud are designed to depict the average level and change rate about the credit history information respectively.Finally,the comprehensive trust cloud is synthesized with the reputation cloud and risk cloud,and its trust level and trust scores are calculated.Simulation result shows the feasibility and effectiveness of the evaluation model,which can assist people to finish trust decision-making intuitively and efficiently.

Interlayer Model for Software Development Based on Collective Intelligence and its Architecture Implementation

HE Yan-xiang, YANG Jian-kang, BAO Hai-zhou, RAN Ya-luo, GUO Bo-bo and YANG Jian-xi

Computer Science. 2015, 42 (1): 175-179. doi:10.11896/j.issn.1002-137X.2015.01.040

Abstract

PDF(452KB) ( 441 )

References | Related Articles | Metrics

Reducing or avoiding duplication of effort is an important way to deal with the software crisis.For the purpose of avoiding duplicate effort and trying to deal with the software crisis,interlayer model which makes use of collective intelligence was proposed after necessary software reuse technologies were researched.Lemon framework the paper presented is one of implementations of interlayer model,which is practical and meanwhile it is also an ongoing project.

Policy Driven Exception Handling Framework for BPEL Processes

WANG Quan-yu, LV Guo-bin, YING Shi and ZHOU Feng

Computer Science. 2015, 42 (1): 180-186. doi:10.11896/j.issn.1002-137X.2015.01.041

Abstract

PDF(659KB) ( 426 )

References | Related Articles | Metrics

How to improve the efficiency of the development of BPEL processes exception handling is one of the key issues for policy-driven exception handling of BPEL processes to be sovled.Firstly,the paper analysized the exception handling mechanism of BPEL processes,creatively designed BPEH/PDL language,a new Policy Description language(PDL) for exception handling of BPEL processes,and then based on BPEH/PDL language,proposed the exception handling framework BPEH/F,a integrated BPEL processes exception handling framework.

Detection of Malware Code Based on Acquaintance Degree

DU Nan, HAN Lan-sheng, FU Cai, ZHANG Zhong-ke and LIU Ming

Computer Science. 2015, 42 (1): 187-192. doi:10.11896/j.issn.1002-137X.2015.01.042

Abstract

PDF(445KB) ( 506 )

References | Related Articles | Metrics

Signature recognition method can only identify the known malicious code,did not solve the problem of the discrimination of the malicious code.The current method based on behavior and heuristic scanning only pays attention to the single danger action point of malicious code,and has a high rate of false positives.The paper focused on the relationship between behaviors,described and tested behaviors and the relationship between behaviors by matrix,then gave a malicious code detection method based on acquaintance degree.Acquaintance degree is the familiarity degree of the system to under-test code.According to the size of the acquaintance degree,whether the under-test code is malicious code can be judged,the greater the acquaintance degree,the smaller the possibility of being malicious code.An algorithm of detecting malware behavior was given and its feasibility was justified through real example test.

Method for Generating Formal Interlocking Software Model Based on Scenario

DONG Yu and GAO Xue-juan

Computer Science. 2015, 42 (1): 193-195. doi:10.11896/j.issn.1002-137X.2015.01.043

Abstract

PDF(319KB) ( 534 )

References | Related Articles | Metrics

It is necessary to do analysis,verification and test of station interlocking system to ensure the safety of train running and people lives.Especially,the formalized model is the foundation of these works.This paper studied the methodtransforming the sequence diagram(SD) into the event deterministic finite automata(ETDFA) model which is based on UML semi-formal model of computer interlocking software,and used ETDFA as the formal model to describe the system.Firstly,a set of global variables associated with sequence diagram’s interaction were selected as the sate vector to analyze and eliminate the contradiction in pre-value and post-value of message in every scenario and the same message among multiple scenarios,and then based on the sequence event of each object,corresponding ETDFA model was genera-ted.At last,by composing all objects’ ETDFA model,system’s ETDFA formal model was got.The method improves safety software’s designing and development and provides technical support of software quality.

Research on Evolution Information Acquisition and Measurement of Component-based Software Based on Ontology Model

ZHONG Lin-hui and ZONG Hong-yan

Computer Science. 2015, 42 (1): 196-200. doi:10.11896/j.issn.1002-137X.2015.01.044

Abstract

PDF(495KB) ( 510 )

References | Related Articles | Metrics

Software evolution is important information reflecting the software change history.However,traditional software evolution information caption methods use the file or project as the basic unit to track the software change,which cannot effectively support the storage and retrieval of component-based software evolution information.This paper pre-sented the strategies of modeling the component-based software evolution information based on the ontology model,and used the Jena inference engine to acquire the software evolution information.This method can not only query the basic software evolution information directly,but also retrieve the software evolution information by defining the rules.In addition,this paper proposed a component-based software measurement model,which can be used to forecast evolution trend by analyzing the evolution properties of the component-based software.

Probabilistic Threshold Reverse Nearest Neighbor Queries for Indoor Moving Objects

WANG Li, QIN Xiao-lin and XU Jian-qiu

Computer Science. 2015, 42 (1): 201-205. doi:10.11896/j.issn.1002-137X.2015.01.045

Abstract

PDF(458KB) ( 390 )

References | Related Articles | Metrics

Indoor spaces are becoming increasingly large and complex,and more and more demand for initiating indoor spatial queries has emerged.Range queries and nearest neighbor queries specifically for indoor space have been proposed,while there are no studies on reverse nearest neighbor queries.Thus,this paper presented the formal definition of probabilistic threshold reverse nearest neighbor query (IPRNN) for indoor moving objects and a device reachable graph model.The query processing algorithm of IPRNN was proposed based on the graph model.The algorithm consists of four parts:model pruning,indoor distance pruning,probability pruning and probability calculation.The objects that are not likely to appear in the result set are built out through pruning strategy,thereby significantly reducing the search space and improving efficiency.

SPindex:Spatial Index Based on M-phase Points

CHEN Ying, CHEN Zhao-ying and YE Xiao-ping

Computer Science. 2015, 42 (1): 206-209. doi:10.11896/j.issn.1002-137X.2015.01.046

Abstract

PDF(363KB) ( 428 )

References | Related Articles | Metrics

Spatial data indexing plays a critical role in spatial data management,and its performance determines the efficiency of spatial database directly.By transforming spatial-temporal issues into pure spatial ones,spatial index can be widely used in spatial-temporal and moving object databases.Therefore it is meaningful and necessary to study the spatial data index.Most of the current spatial indexes are based on R-tree,however,in order to access massive spatial data quickly and efficiently,this paper proposed a novel spatial index SPindex based on the analysis of phrase-points.Firstly,we built a congruent relationship between the MBRs of spatial regions and phase points in phase plane.Secondly,though analyzing the positions of MBRs by the characteristics of the corresponding phase points,we proposed spatial data structure about phase points named MROB.Then a new spatial data indexing SPindex based on analysis of M-phase points was put forward.Finally,we designed experiments to test the performance of SPindex and results show that SPindex is feasible and efficient.

Sentiment Analysis of Words and Sentences with Uncertainty

LI Yang, MIAO Duo-qian and ZHANG Zhi-fei

Computer Science. 2015, 42 (1): 210-214. doi:10.11896/j.issn.1002-137X.2015.01.047

Abstract

PDF(632KB) ( 472 )

References | Related Articles | Metrics

It is significant to analyze emotional uncertain words and sentences in Chinese text sentiment analysis.Emotional uncertain words are generally words with rich meaning,which implies some evaluation in the expression.Emotional uncertain sentences usually have the same size of positive words and negative words,so the emotion is not ob-vious.In this paper,using uncertain word “好” as an example,we designed features for the uncertain sentences.Then using four different classification algorithms of supervised learning to do experiments,we got the conclusion:SVM can better deal with the emotional uncertain words and sentences.

Intelligent Mining for P-Information with Expanded Characteristic of Disjunctive Attribute

LU Ying, LI Bei-you and SHI Kai-quan

Computer Science. 2015, 42 (1): 215-219. doi:10.11896/j.issn.1002-137X.2015.01.048

Abstract

PDF(365KB) ( 378 )

References | Related Articles | Metrics

Inverse P-sets is a set pair composed of internal inverse packet set F and outer inverse packet set ,or() is inverse P-sets,which has the dynamic characteristic.Inverse P-sets is a novel model of studying another class of dynamic information.The attribtutes of the elements in the inverse P-sets satisfy disjunctive characteristic.Using the structure of internal inverse packet set,the expanded form and characteristic of disjunctive attribute,the intelligent mining of internal inverse P-information and intelligent mining theorems under the condition of disjunctive attribute expanded were presented,and the intelligent mining-discovery of complete information of satisfying internal inverse P-reasoning and non-complete information condition was also put forward.By the above theory results,the application of the intelligent mining for the information with the expanded characteristic of disjunctive attribute was given.

Learning Action Models Described in Action Language B by Combining ILP and ASP

LIU Zhen and ZHANG Zhi-zheng

Computer Science. 2015, 42 (1): 220-226. doi:10.11896/j.issn.1002-137X.2015.01.049

Abstract

PDF(533KB) ( 524 )

References | Related Articles | Metrics

Action model learning is beneficial to autonomous and automated systems.If an Agent can update its action model according to the changes occurred in the environment,it can be more adaptable to the world and operate more effectively.Simultaneously,action model learning can provide modeling dynamic domain with an initial rough model which is the foundation for further improvement and modification.We designed an algorithm used for learning action models described in language B by combining ILP and ASP.This algorithm can work on dynamic domains consisting of objects of different scale.In the experiments,we tested the learning algorithm through using classic planning cases and verified the soundness of the learning algorithm.

User Query Intention Oriented Hierarchical Sentence Similarity Computation

LI Jing-yu, ZHANG Yang-sen and CHEN Ruo-yu

Computer Science. 2015, 42 (1): 227-231. doi:10.11896/j.issn.1002-137X.2015.01.050

Abstract

PDF(399KB) ( 424 )

References | Related Articles | Metrics

In order to improve the accuracy of sentence similarity computation algorithm and further enhance its applicability in complex context,a hierarchical sentence similarity algorithm for user-oriented query intention was designed,integrating technologies such as edit distance,keyword and synonyms semantic method,and natural annotation.With thorough analyzing of the experimental data and its feature distribution,a multi-level optimization strategy was put forward.The experimental results confirm the algorithm in this paper is effective and achieves F-value of 0.6019.

Improved AP Algorithm:M-AP Clustering Algorithm

GAN Yue-song, CHEN Xiu-hong and CHEN Xiao-hui

Computer Science. 2015, 42 (1): 232-235. doi:10.11896/j.issn.1002-137X.2015.01.051

Abstract

PDF(386KB) ( 1051 )

References | Related Articles | Metrics

Affinity propagation(AP) clustering simultaneously considers all data points as potential exemplars.It takes similarity between pairs of data points as input measures,and clusters gradually during the message-passing procedure.But the result of AP clustering algorithm in the data set of complex structure(non-group)is not very good.Therefore,we proposed a new clustering algorithm by adding a merge process on the basis of AP clustering algorithm,called M-AP algorithm which can effectively solve this kind of problem.When the number of samples is very large, the problem of large sample can be effectively solved by using CVM compression algorithm.

Differential Private Synthesis Dataset Releasing Algorithm Based on Navie Bayes

CHEN Xuan, LIU Jian, FENG Xin-qi and ZHAO Xue-mei

Computer Science. 2015, 42 (1): 236-238. doi:10.11896/j.issn.1002-137X.2015.01.052

Abstract

PDF(230KB) ( 1245 )

References | Related Articles | Metrics

Non-interactive data releasing has been a hotspot in differential privacy preservation model.A synthesis dataset releasing algorithm based on navie bayes was proposed.This algorithm computes the joint distribution of the original dataset based on the hypothesis of conditional independences in navie bayes firstly,then employs exponential mechanism to generate the synthesis dataset.The experiment results show that the accuracy of classifiers trained by the synthesis dataset improves and tends to be stable with privacy budget increasing.

Web Spam Detection Based on Integrated Classifier with Bagging-SVM

TANG Shou-hong, ZHU Yan and YANG Fan

Computer Science. 2015, 42 (1): 239-243. doi:10.11896/j.issn.1002-137X.2015.01.053

Abstract

PDF(656KB) ( 434 )

References | Related Articles | Metrics

Web spam not only declines the quality of information retrieval,but also causes troubles to the security of Internet.This paper proposed a Bagging-based integration of SVM to detect Web spam.In preprocessing stage,a technique referring to K-means is introduced to solve the class-imbalance problem of dataset firstly,and then an optimal feature subset is culled by using CFS.Finally the optimal feature subset is discretized by the information entropy.In the stage of classifier training,several training datasets are obtained by Bagging and each training dataset is utilized to produce weak classifier respectively after SVM learning.In detection stage,test samples are voted by weak classifiers obtained before detemining their categories.Experimental results on the WEBSPAM-UK2006 reveal that the proposed method can achieve better results with less number of features.

Research on Dynamic Development Model of Information Product Quality Assessment

LIU Jing and ZHAO Song-zheng

Computer Science. 2015, 42 (1): 244-248. doi:10.11896/j.issn.1002-137X.2015.01.054

Abstract

PDF(446KB) ( 417 )

References | Related Articles | Metrics

Information product quality assessment is an important part of Total Data Quality Management (TDQM),lays foundation for potential quality improvement.In this paper,a novel dynamic development model of information product quality assessment that applies to the “Selection” relational algebra for the relational database was established by introducing time variable and analyzing the impact of information product timeliness based on the static propagation model of information product quality assessment.It includes developing the timeliness matrix at time t,proposing dynamic development mapping model of out-of-date dataset and developing a method for information product quality assessment based on the mapping model.The model dataset was used to verify the feasibility and validity of the model.

Research on Locality Rules of Description Logic SHJF for Module Reusing

XU De-zhi, LIAO Hui-huan and XU Lian-jun

Computer Science. 2015, 42 (1): 249-252. doi:10.11896/j.issn.1002-137X.2015.01.055

Abstract

PDF(299KB) ( 439 )

References | Related Articles | Metrics

The extraction of ontology module is an essential step in ontology reuse.Compared to the structural approaches based on ontology hierarchy used in traditional engineering and applications,the logic approaches can make full use of the semantic information that ontologies provide.The extracted modules are more integrated and consistent.Based on conservative extension and locality rules of SHOJQ-based ontologies proposed by Grua B C,this paper provided and proved a SEMLOC rule and a SYNLOC rule for description logic SHJF-based ontologies,which can provide theoretical basis for extracting reusable ontology module.

Adaptive Method of Predicting Arrival Time of Buses on Dynamic Traffic Information

XIE Ling, LI Pei-feng and ZHU Qiao-ming

Computer Science. 2015, 42 (1): 253-256. doi:10.11896/j.issn.1002-137X.2015.01.056

Abstract

PDF(401KB) ( 472 )

References | Related Articles | Metrics

Bus arrival time prediction is the foundation of realizing intelligent bus information service.Reliable prediction of bus arrival time is beneficial to improve the public transport service level,so that it attracts more and more city residents to use public transportation.In this paper,based on the massive historical data of a city bus system in real time,SVM (Support Vector Machine) was applied to establish a bus forecasting model on the static and dynamic information.And the speed of upstream,the latest speed of downstream,the latest travel time of downstream,time-of-day,traffic congestion,etc were introduced into our dynamic model.Besides,this paper put forward an adaptive prediction model to improve the efficiency of prediction based on a lot of volatility of bus arrival time historical data.The experimental results show that the adaptive model outperforms those existing static models.

Feature Extraction of Ballistic Midcourse Target Based on HRRP

ZHAO Zhen-chong, WANG Xiao-dan, BI Kai and XING Ya-qiong

Computer Science. 2015, 42 (1): 257-260. doi:10.11896/j.issn.1002-137X.2015.01.057

Abstract

PDF(274KB) ( 506 )

References | Related Articles | Metrics

This paper first researched the precessional features of midcourse target and its HRRP in different radar posture angles and presented a new projective length abstract method.Based on the peak of the scattering center,this method makes the HRRP present a oscillation phenomenon on this account to abstract scattering center accurately.On this basis,this paper analysed the projective length with precessional motion and presented a new feature abstraction method on precessional angle and the length of the target.Simulation shows that the two methods can work accurately.

Measuring Semantic Similarity between Words Using Web Search Engines

CHEN Hai-yan

Computer Science. 2015, 42 (1): 261-267. doi:10.11896/j.issn.1002-137X.2015.01.058

Abstract

PDF(541KB) ( 687 )

References | Related Articles | Metrics

Semantic similarity measures play important roles in many Web-related tasks such as Web browsing and querysuggestion.Because taxonomy-based methods cannot deal with continually emerging words,recently Web-based methods have been proposed to solve this problem.Because of the noise and redundancy hidden in the Web data,robustness and accuracy are still challenges.We proposed a method integrating page counts and snippets returned by Web search engines.Then,the semantic snippets and the number of search results were used to remove noise and redundancy in the Web snippets.After that,a method integrating page counts,semantics snippets and the number of already displayed search results was proposed.The proposed method does not need any human annotated knowledge,and can be applied Web-related tasks easily.A correlation coefficient of 0.851 against Rubenstein-Goodenough benchmark dataset shows that the proposed method outperforms the existing Web-based methods by a wide margin.Moreover,the proposed semantic similarity measure significantly improves the quality of query suggestion against some page counts based methods.

Method of Antenna Arrays Optimization Based on Bipolar Preferences Dominance

WANG Li-ping, LIN Si-ying and QIU Fei-yue

Computer Science. 2015, 42 (1): 268-271. doi:10.11896/j.issn.1002-137X.2015.01.059

Abstract

PDF(373KB) ( 420 )

References | Related Articles | Metrics

In this work,a bipolar preferences dominance based antenna arrays optimization (AAO) method was proposed to enhance the selection pressure of the multi-objective evolutionary algorithms (MOEAs) on the AAO problems with more than four objectives.Considering the positive preference and negative preference of the decision-makers in real-world problems,the TOPSIS method is employed to compare solutions,construct a more rigid non-domination relationship and induce the population to move towards the position with higher directional radiation pattern and lower null value.Besides,a HSDC method is applied to analyze the experimental results.In the experimental analysis,the proposed method was compared with four state-of-the-arts multi-objective optimization algorithms.The comparison results show the higher accuracy and operating efficiency of the proposed method.

Reserch on Model of Multi Filtering for Adaption about Network Sensitive Information

HU Chuan-zhi, CHENG Xian-yi and CAO Xiao-feng

Computer Science. 2015, 42 (1): 272-275. doi:10.11896/j.issn.1002-137X.2015.01.060

Abstract

References | Related Articles | Metrics

Filtering of network sensitive information is an important and complicated task.Aiming at the problems of detection time lag,low accuracy and poor adaptability,we proposed an adaptive multi filtering model of network sensitive information which is based on the research object of Chinese text media from the internet and uses the technologies of opinion mining,machine learning,high performance computing and natural language processing to recognize the sensitive information adaptively from the point of whole and semantic.The research on adaptive multi filtering model of network sensitive information will give some technical supports for the public opinion monitoring,intelligent business and developing systems of aid making decision application.

Knowledge Representation on Incomplete Formal Context

ZHI Hui-lai

Computer Science. 2015, 42 (1): 276-278. doi:10.11896/j.issn.1002-137X.2015.01.061

Abstract

PDF(229KB) ( 407 )

References | Related Articles | Metrics

Incomplete formal context contains uncertainty information,and thus knowledge representation on incomplete formal context and the one on complete formal context have distinction,and also have connection.In order to study their internal relationship,upper and lower approximate formal context,as well as upper and lower approximate concept lattice were defined respectively.Then a recognition method of rough concept was put forward.Moreover,the relationship between upper approximate concept lattice and lower approximate concept lattice was also studied.Conclusion shows that upper approximate concept lattice can be used as a knowledge representation tool for incomplete formal context.

Convergence of Warning Propagation Algorithm for Regular Structure Instances

WANG Xiao-feng, LI Qiang and DING Hong-sheng

Computer Science. 2015, 42 (1): 279-284. doi:10.11896/j.issn.1002-137X.2015.01.062

Abstract

PDF(469KB) ( 547 )

References | Related Articles | Metrics

Message propagation algorithms are very effective in finding satisfying assignments for 3-SAT instances,and they can make hard region become narrower.However,message propagation algorithms do not always converge for graphs with cycles.Unfortunately,rigorous theory proof of this phenomenon is still lacking.Warning propagation algorithm is the most basic message propagation algorithm,and we analysed convergence of the warning propagation algorithm.Lastly,experimental results show that convergence of the warning propagation algorithm is improved for (3,4)-SAT instances.

FCA Concept Similarity Computation Based on Bounded Transitive Similarity Graph

HUANG Hong-tao, WU Zhong-liang, WAN Qing-sheng and HUANG Shao-bin

Computer Science. 2015, 42 (1): 285-289. doi:10.11896/j.issn.1002-137X.2015.01.063

Abstract

PDF(383KB) ( 411 )

References | Related Articles | Metrics

It is necessary to construct the transitive closure of similarity relation in the case of computing similarity between FCA concepts by means of similarity graph.This method will lead to large scale similarity graph for complex problem,which may affect the efficiency of similarity evaluation.A bounded transitive similarity graph based FCA concept similarity computing method was proposed in order to reduce the size of similarity graph.This method can avoid constructing the transitive closure of similarity relation by adding a bound on transitive similarity relation,and the bounded transitive similarity graph obtained does not contain the transitive relation whose length beyonds the bound,and this omitted transitive relation is useless to distinguish different FCA concepts,which makes it possible to compress the scale of similarity graph.Then a dynamic transitive similarity computation method and a bipartite graph construction method using bounded transitive similarity graph were given.Experimental results show that this bounded transitive similarity graph based method improves the efficiency of FCA concept computation effectively without the loss of accuracy.

Anti-consistency Possibilistic C-means Clustering Algorithm

WEN Chuan-jun, WANG Qing-miao and ZHAN Yong-zhao

Computer Science. 2015, 42 (1): 290-292. doi:10.11896/j.issn.1002-137X.2015.01.064

Abstract

PDF(304KB) ( 739 )

References | Related Articles | Metrics

PCM classification judgment will fail while consistency question occurs.A new algorithm was proposed in this paper which is named as anti-consistency possibilistic C-means clustering(ACPCM).Anti-consistency function is composed of the reciprocal sum of distances between every two clustering centers,and ACPCM objective function is the sum of PCM objective function and anti-consistency function.PSO algorithm is used to estimate clustering centers and gradient method is utilized to solve fuzzy memberships.The effectiveness and anti-consistency of ACPCM were proved through theoretical analysis and simulation experiments.

Classification of Multi-instance Image Based on Sparse Representation

SONG Xiang-fa and JIAO Li-cheng

Computer Science. 2015, 42 (1): 293-296. doi:10.11896/j.issn.1002-137X.2015.01.065

Abstract

PDF(567KB) ( 611 )

References | Related Articles | Metrics

In order to effectively solve the problem of multi-instance image classification,a novel classification method of multi-instance image was proposed which is based on sparse representation.The whole image is regarded as a bag and each region as an instance of that bag.The image bag feature is computed based on instance embedded strategy.Next,the test image bag feature is regarded as sparse linear combination of training image bag feature set and the sparse solution can be obtained by ₁ optimization method.Finally,sparse coefficients are utilized to predict the label of the test ima-ge.Experimental results on the Corel image data show that the proposed method is superior to the state-of-art methods in terms of classification accuracy.

Combination of Nearest Neighbor with Semantic Distance for Image Annotation

WU Wei, GAO Guang-lai and NIE Jian-yun

Computer Science. 2015, 42 (1): 297-302. doi:10.11896/j.issn.1002-137X.2015.01.066

Abstract

PDF(498KB) ( 421 )

References | Related Articles | Metrics

Most of the nearest neighbor (NN) based image annotation or classification methods do not achieve desired performances.The main reason is that much valuable information is lost when extracting visual features from image.A novel nearest neighbor method was proposed.Firstly,we obtained a new image semantic distance learned by distance metric learning (DML) using image class information,and then multiple clustering centers were formed based on this learned semantic distance.Finally,we constructed our NN model by calculating the distances between the image and these clusters.Our model can minimize the semantic gap for intra-class variations and inter-class similarities.Experimental results on image annotation task of ImageCLEF2012 confirm that our method is efficient and competitive compared with the traditional and state of the art classifiers.

Adaptive Decision-based Unsymmetric Trimmed Median Filter

HUANG Yan, LEI Tao, FAN Yang-yu and LU Xi-pan

Computer Science. 2015, 42 (1): 303-307. doi:10.11896/j.issn.1002-137X.2015.01.067

Abstract

PDF(932KB) ( 427 )

References | Related Articles | Metrics

The decision-based unsymmetric trimmed median filter,which is efficient in restoring images contaminated by high-level pepper-salt noise,is however only effective to certain type of images due to its size-fixed filter window and the replacement of centre pixel using mean value of pixels in the window.Aiming at those drawbacks,this paper proposed an adaptive decision-based unsymmetric trimmed median filter.An estimation of homochromatic area is presented to solve the filter problem of original algorithm in homochromatic area.The noise restrain performance of MDBUTMF is improved by adopting adaptive size of filter window.Therefore,the proposed method is more robust and practical than original algorithm.It shows higher PSNR and also lower MAE and NCD in simulation comparing with some classic vector median filters and high-level noise restrain algorithms, restrains pepper-salt noise and retains image hue and details effectively in the meantime.

Color Image Difference Prediction Based on Image Difference Measure

YANG Dan

Computer Science. 2015, 42 (1): 308-311. doi:10.11896/j.issn.1002-137X.2015.01.068

Abstract

PDF(1180KB) ( 412 )

References | Related Articles | Metrics

The feature extraction of image prediction technology is a hot and difficult problem in the image processing field.This paper proposed a image difference prediction algorithm containing the fusion image normalization characteristic.Color difference measurement method based on image feature makes full use of the color information of image,and the color information is converted to a color space.And then the image is normalized to a range of specific perspective to extract the image difference feature (IDF) information.Finally,this paper did a lot of experiments,and the experimental results show that the proposed method greatly improves difference prediction performance of color image.At the same time,we preformed the multi-scale analysis for brightness distortion caused by gamut mapping.The results show that the image differential feature based on brightness extracted from different scale images has higher correlation between scale than the general image distortion.

Adaptive Double-threshold Detection Method of Airport Surface Moving Target

WU Min, WU Hong-gang, YAO Hui, WANG Kai and JIANG Li

Computer Science. 2015, 42 (1): 312-316. doi:10.11896/j.issn.1002-137X.2015.01.069

Abstract

PDF(661KB) ( 432 )

References | Related Articles | Metrics

To solve the accurate detection of air surface moving targets effectively under complicated environment,an adaptive double-threshold detection method of airport surface moving target was proposed.Firstly,Gaussian mixture model is adopted to extract the background image,and then foreground object of difference image is partitioned off by means of the two threshold value.Namely,low-threshold is used for coarse segmentation to detect moving object,which is obvious.In addition,high-threshold is used for fine segmentation to remove noise target and false target based on coarse segmentation.Finally,airport surface moving target is detected and segmented accurately.The experimental results of complicated background show that this method can avoid the noise,has robustness for the slower moving target and is effective for detecting.