Started in January 1974 (Monthly)
Supervised and Sponsored by Chongqing Southwest Information Co., Ltd.
ISSN 1002-137X
CN 50-1075/TP
CODEN JKIEBK
Current Issue
Volume 45 Issue 1, 15 January 2018
  
Data Science Studies: State-of-the-art and Trends
CHAO Le-men, XING Chun-xiao and ZHANG Yong
Computer Science. 2018, 45 (1): 1-13.  doi:10.11896/j.issn.1002-137X.2018.01.001
The coming of the big data era has given rise to a novel discipline called data science. First, the differences between domain-general data science and domain-specific data science were proposed, based on an in-depth discussion of its basic concept, brief history, scientific roles and body of knowledge. Secondly, the top ten challenges faced by data science were identified by describing the debates on paradoxical topics, including the shift of thinking pattern (knowledge pattern or data pattern), perspectives on data (positive or negative), implementation of intelligence (via AI or via big data), bottlenecks of data product development (computing-intensive or data-intensive), data preparation (data preprocessing or data wrangling), quality of service (performance of services or user experience), data analysis (explanatory or predictive), evaluation of algorithms (by complexity or by scalability), research paradigm (third paradigm or fourth paradigm), as well as the main motivation of education (to cultivate data engineers or data scientists). Then, the top ten trends in data science studies were proposed: to value predictive models and correlation analysis, to pay more attention to model integration and meta-analysis, to embrace the data-first, model-later-or-never paradigm, to be led by realism and ensure data consistency, to support multiple copies and data locality, to accept the coexistence of varieties in implementation technologies and integrated applications, to be dominated by simple computing and pragmatism, to develop data products and embedded applications of data science, to embrace the Pro-Am and metadata, and to cultivate data scientists and corresponding curriculums or majors. Finally, some suggestions on how to conduct further studies were also proposed.
Review of Wireless Sensor Network Routing Method for Environment Perception
DONG Hai-jun, WEI Su-yuan, LIU Xing-cheng, QI Xiao-gang, LIU Li-fang and FAN Ying-sheng
Computer Science. 2018, 45 (1): 14-23.  doi:10.11896/j.issn.1002-137X.2018.01.002
Routing transmission and data aggregation are both important in wireless sensor networks, which are widely used. Due to the diversity of such networks, there is no universal routing algorithm or data aggregation scheme, so it is necessary to summarize both of them. This paper surveyed routing methods and data aggregation in wireless sensor networks. Typical wireless sensor network routing methods were introduced first, and then different data aggregation and routing methods were described for multiple classes of sensors. Next, methods of data collection and routing in one-dimensional sensor networks were presented. Finally, future applications and research trends were discussed.
Personalized Affective Video Content Analysis: A Review
ZHANG Li-gang and ZHANG Jiu-long
Computer Science. 2018, 45 (1): 24-28.  doi:10.11896/j.issn.1002-137X.2018.01.003
Personalized affective video content analysis is an emerging research field which aims to provide personalized video recommendations to an individual viewer tailored to his/her personal preferences or interests. However, there is still no review of recent progress on the approaches in this field. This paper presented a review of state-of-the-art approaches towards building automatic systems for personalized affective video content analysis from three perspectives: audio-visual features in video content (e.g., light, color and motion), physiological response signals from viewers (e.g., facial expression, body gesture and pose), and personalized recommendation techniques. It discussed the advantages and disadvantages of existing approaches, and highlighted several challenges and issues that may need to be overcome in future work.
Funding Analysis of 2017 National Natural Science Foundation of China Projects in Computer Science
MENG Zhi-xin, XING Xing, CHU Han-ting and JIA Zhi-chun
Computer Science. 2018, 45 (1): 29-33.  doi:10.11896/j.issn.1002-137X.2018.01.004
This paper statistically analyzed the funding of 2017 National Natural Science Foundation of China projects in computer science, covering general projects, youth projects, regional projects, key projects, overseas and Hong Kong/Macao projects, and outstanding youth projects. It summarized the research characteristics of the funded key projects and provides a reference for researchers in this field applying for Natural Science Foundation projects.
Ensemble Method Against Evasion Attack with Different Strength of Attack
LIU Xiao-qin, WANG Jie-ting, QIAN Yu-hua and WANG Xiao-yue
Computer Science. 2018, 45 (1): 34-38.  doi:10.11896/j.issn.1002-137X.2018.01.005
Driven by illegal purposes, attackers often exploit the vulnerability of classifiers to make malicious samples evade detection in adversarial learning. At present, adversarial learning has been widely used in computer network intrusion detection, spam filtering, biometric identification and other fields. Many researchers only apply existing ensemble methods in adversarial learning and prove that multiple classifiers are more robust than a single classifier. However, prior information about the attacker has a great influence on the robustness of the classifier in adversarial learning. Based on this observation, by simulating attacks of different strength in the learning process and increasing the weight of misclassified samples, the robustness of multiple classifiers can be improved while maintaining accuracy. The experimental results show that the ensemble algorithm against evasion attacks of different strength is more robust than Bagging. Finally, the convergence of the algorithm and the influence of its parameters were analyzed.
Hybrid Feature Selection Method of Chinese Emotional Characteristics Based on Lasso Algorithm
LI Yan, WEI Zhi-hua and XU Kai
Computer Science. 2018, 45 (1): 39-46.  doi:10.11896/j.issn.1002-137X.2018.01.006
An important issue in Chinese sentiment analysis is emotional tendency classification. Sentiment feature selection is the premise and foundation of machine-learning-based emotional tendency classification, rejecting irrelevant and redundant features to reduce the dimension of the feature set. This paper proposed a hybrid sentiment feature selection method combining the Lasso algorithm with filtering feature selection methods. First, Lasso-type penalized methods are used to filter the original feature set and generate an emotional classification feature subset with lower redundancy. Secondly, filtering algorithms such as CHI, MI and IG are introduced to evaluate the dependency weight between each candidate feature word and the text category, and candidate words with lower correlation are rejected according to the evaluation result. Finally, the proposed algorithm is compared with DF, MI, IG and CHI for various numbers of feature words using an SVM classifier with a Gaussian kernel. It turns out that the proposed algorithm is more effective and efficient on a blog short-text corpus. Moreover, it can improve the effect of feature selection with DF, MI, IG and CHI to some extent when the feature subset dimension is smaller than the sample size. Comparing recognition rate and recall ratio, Lasso-MI is clearly better than MI as well as other filtering methods.
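For intuition, the two-stage pipeline sketched in the abstract (an L1-penalized pass followed by CHI/MI/IG filtering and a Gaussian-kernel SVM) might look like the following; this is a minimal illustration with scikit-learn stand-ins and made-up data, not the authors' implementation, and `hybrid_select`, the alpha and the k values are hypothetical.

```python
# Hedged sketch of a Lasso-then-filter hybrid feature selection.
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import Lasso
from sklearn.svm import SVC

def hybrid_select(X, y, lasso_alpha=0.001, k_final=500):
    lasso = Lasso(alpha=lasso_alpha).fit(X, y)   # stage 1: L1 zeroes out redundancy
    kept = np.flatnonzero(lasso.coef_)
    if kept.size == 0:                           # alpha too aggressive: keep all
        kept = np.arange(X.shape[1])
    k = min(k_final, kept.size)                  # stage 2: filter ranking (chi2 here;
    scorer = SelectKBest(chi2, k=k).fit(X[:, kept], y)   # MI/IG are analogous)
    return kept[scorer.get_support(indices=True)]

rng = np.random.default_rng(0)                   # random stand-in for a TF-IDF matrix
X, y = rng.random((200, 2000)), rng.integers(0, 2, 200)
idx = hybrid_select(X, y)
clf = SVC(kernel="rbf").fit(X[:, idx], y)        # Gaussian-kernel SVM, as in the paper
```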
Multi-attribute Group Decision Making Method for Interval-valued Intuitionistic Uncertain Language with Completely Unknown Experts’ Weights
PANG Ji-fang and SONG Peng
Computer Science. 2018, 45 (1): 47-54.  doi:10.11896/j.issn.1002-137X.2018.01.007
For multi-attribute group decision making problems with completely unknown experts' weights in which the attribute values take the form of interval-valued intuitionistic uncertain linguistic variables, this paper investigated a group decision analysis method based on hybrid weight information and the decision maker's risk attitude. On the basis of defining the difference degree between two interval-valued intuitionistic uncertain linguistic variables, two kinds of experts' weights were calculated by using the closeness degree in evaluation and the consistency degree in ranking respectively. Then, based on the equilibrium degree, the objective comprehensive weights of experts were obtained. By fusing the objective comprehensive weights of experts and the similarity-based weights of individual comprehensive evaluation values, a hybrid weighted aggregation method was proposed to obtain the group comprehensive evaluation values. Furthermore, by defining the expected value and accuracy function with a risk attitude factor, the comparison and ranking of the alternatives were completed. Finally, an illustrative example was given to prove the effectiveness and rationality of the method.
Preference Clustering Based on Nystrm Sampling and Convex-NMF
YANG Mei-jiao and LIU Jing-lei
Computer Science. 2018, 45 (1): 55-61.  doi:10.11896/j.issn.1002-137X.2018.01.008
Large-scale sparse graph data, such as collaborative graphs and Laplacian matrices, appear frequently in practice. Non-negative matrix factorization (NMF) has become a useful tool in data mining, information retrieval and signal processing, and how to achieve data clustering on large-scale data is an important issue. This paper used a two-stage method to realize data clustering. First of all, the Nyström approximate sampling method is used: an initial profile of the data is obtained from the large dataset, yielding a user-user or movie-movie similarity matrix, which reduces the original high-dimensional space to a low-dimensional subspace. Then convex non-negative matrix factorization of the low-dimensional similarity matrix is used to get the cluster centers and the indicator. The cluster centers represent the features of movies or users, and the indicator represents the weights of those features. The advantage of the two-stage preference clustering method is that the approximation of the initial data contour and the convex non-negative matrix factorization have better robustness and noise resistance. On the other hand, the data of the subspace are derived from the real matrix, which makes the resulting preference clusters well interpretable. This paper utilized the Nyström method to solve the problem that large-scale data cannot be stored in memory, saving memory and improving operation efficiency. Lastly, a test on a movie dataset containing 100,000 ratings shows the effectiveness of the clustering algorithm.
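The two-stage idea can be illustrated with a minimal sketch in which ordinary KMeans stands in for the paper's Convex-NMF step; the toy similarity matrix and the sample size m are assumptions, not the authors' setup.

```python
# Hedged sketch: Nystrom column sampling shrinks a large similarity matrix,
# then clustering runs in the low-dimensional subspace.
import numpy as np
from sklearn.cluster import KMeans

def nystrom_embed(S, m=64, seed=0):
    """Approximate an n x n similarity matrix S by an n x k embedding."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(S.shape[0], size=m, replace=False)
    C = S[:, idx]                       # n x m sampled columns
    W = C[idx, :]                       # m x m core block
    vals, vecs = np.linalg.eigh(W)      # S ~= C W^+ C^T
    keep = vals > 1e-8
    return C @ vecs[:, keep] / np.sqrt(vals[keep])

rng = np.random.default_rng(1)
X = rng.random((2000, 20))
S = X @ X.T                             # toy user-user similarity (PSD)
labels = KMeans(n_clusters=5, n_init=10).fit_predict(nystrom_embed(S))
```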
Three-way Clustering Analysis Based on Dynamic Neighborhood
WANG Ping-xin, LIU Qiang, YANG Xi-bei and MI Ju-sheng
Computer Science. 2018, 45 (1): 62-66.  doi:10.11896/j.issn.1002-137X.2018.01.009
Most existing clustering methods are two-way clustering, based on the assumption that a cluster must be represented by a set with a crisp boundary. However, assigning uncertain points to a cluster reduces the accuracy of the method. Three-way clustering is an overlapping clustering which describes each cluster by a core region and a fringe region. This paper presented a strategy for converting a two-way cluster into a three-way cluster using the neighborhoods of the samples. In the proposed method, a two-way cluster is shrunk according to whether the neighborhood of a sample is contained in the cluster, and it is stretched according to whether the neighborhood of a sample intersects with the cluster. The shrunk result is called the core region, and the difference between the stretched result and the shrunk result is regarded as the fringe region. Experiments with the proposed method on UCI data sets show that this strategy is effective in improving the structure and F1 values of clustering results.
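A minimal sketch of the shrink/stretch construction is given below, using k-nearest-neighbor neighborhoods; the neighborhood definition (kNN with k=4) and the toy data are assumptions, since the abstract does not fix them.

```python
# Hedged sketch: derive core and fringe regions of one two-way cluster.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def three_way(X, cluster_idx, k=4):
    _, nn = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
    members, core, stretched = set(cluster_idx), set(), set()
    for i in range(len(X)):
        neigh = set(nn[i])              # neighborhood (includes i itself)
        if i in members and neigh <= members:
            core.add(i)                 # shrink: neighborhood fully inside
        if neigh & members:
            stretched.add(i)            # stretch: neighborhood touches cluster
    return core, stretched - core       # (core region, fringe region)

X = np.random.default_rng(0).random((60, 2))
core, fringe = three_way(X, cluster_idx=range(30))
```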
Entity Hyponymy Acquisition and Organization Combining Word Embedding and Bootstrapping in Special Domain
MA Xiao-jun, GUO Jian-yi, XIAN Yan-tuan, MAO Cun-li, YAN Xin and YU Zheng-tao
Computer Science. 2018, 45 (1): 67-72.  doi:10.11896/j.issn.1002-137X.2018.01.010
The semantic relation of entity hyponymy is important for building domain knowledge graphs. The organization of hierarchical relations is not considered in traditional methods of extracting hyponymy. A method for extracting and organizing entity hyponymy in a specific field was proposed in this paper, combining word embedding and bootstrapping. Firstly, a tourism corpus is selected as the seed corpus, and the hyponymy patterns included in the seed corpus are clustered based on word-embedding similarity. Thus, high-confidence patterns are filtered out and used to identify hyponymy in the unlabeled corpus. After that, high-confidence relation instances are obtained and added to the seed sets, and the next iteration is performed until all relation instances are obtained. Finally, mapping learning methods are applied to construct the hierarchical relations of domain entities based on the characteristics of domain entity hierarchical relations and the vector offsets of entity hyponymy pairs. The experimental results show that the proposed method improves the F-value by 10% compared with the traditional method.
Attribute Reduction of Partially-known Formal Concept Lattices for Incomplete Contexts
WANG Zhen and WEI Ling
Computer Science. 2018, 45 (1): 73-78.  doi:10.11896/j.issn.1002-137X.2018.01.011
The partially-known formal concept, which was proposed recently, lays the foundation of data analysis for incomplete contexts and also motivates the study of attribute reduction. This paper firstly proposed four kinds of attribute reduction: partially-known formal concept lattice reduction, meet(join)-irreducible element preserving reduction, and partially-known object formal concept preserving reduction. It then discussed the relationships among these kinds of reduction. Finally, it presented approaches to finding these reductions by discernibility matrices and discernibility functions.
Serial Probabilistic Rough Set Approximations
MA Jian-min, YAO Hong-juan and PAN Xiao-chen
Computer Science. 2018, 45 (1): 79-83.  doi:10.11896/j.issn.1002-137X.2018.01.012
The classical probabilistic rough set model was proposed based on an equivalence relation and a conditional probability. However, uncertainty in the knowledge base makes it difficult to satisfy the equivalence relation between any two objects. This paper considered a serial binary relation instead of an equivalence relation, making the conditional probability meaningful. Then serial probabilistic rough set approximations were introduced based on a serial relation. Properties of the serial probabilistic rough lower and upper approximations were discussed when the target concepts are variable. Furthermore, by adjusting the two thresholds, the corresponding serial probabilistic rough lower and upper approximations were also investigated.
Rough Entropy Based Algorithm for Attribute Reduction in Concept Lattice
LI Mei-zheng, LI Lei-jun, MI Ju-sheng and XIE Bin
Computer Science. 2018, 45 (1): 84-89.  doi:10.11896/j.issn.1002-137X.2018.01.013
Attribute reduction is one of the crucial issues in the theoretical study of concept lattices. In this paper, rough entropy was introduced to conduct a kind of attribute reduction. Firstly, rough entropy in a formal context was defined via the whole set of concept extents, and the properties of rough entropy were analyzed. Secondly, a rough-entropy-based attribute reduction of a formal context was given, and the relationship between the rough-entropy-based reduct and the concept-lattice-based reduct was revealed. Based on this, a heuristic algorithm based on attribute significance was proposed to compute a rough-entropy-based reduct, and numerical experiments were conducted to show the efficiency of the proposed methods.
Three-way Granular Recommendation Algorithm Based on Collaborative Filtering
YE Xiao-qing, LIU Dun and LIANG De-cui
Computer Science. 2018, 45 (1): 90-96.  doi:10.11896/j.issn.1002-137X.2018.01.014
To decrease recommendation cost and solve the single-rating problem of traditional collaborative filtering algorithms, this paper proposed a three-way granular recommendation algorithm based on collaborative filtering. On the basis of collaborative filtering, the algorithm considers the influence of items' characteristics on users' ratings and constructs a user-item granulation rating matrix by granulating the user-item rating matrix according to item characteristics, which is used to measure users' preferences. At the same time, the algorithm considers both misclassification cost and teacher cost during the recommendation process, and constructs a three-way recommendation based on users' real rating preferences. Experimental results show that, compared with the traditional collaborative filtering algorithm, the proposed algorithm not only improves the quality of recommendation, but also decreases the recommendation cost.
Algorithm for Ordering Points to Identify Clustering Structure Based on Spark
QU Yuan, DENG Wei-bin, HU Feng, ZHANG Qi-long and WANG Hong
Computer Science. 2018, 45 (1): 97-102.  doi:10.11896/j.issn.1002-137X.2018.01.015
Ordering Points To Identify the Clustering Structure (OPTICS) is a hierarchical density-based data clustering algorithm, which can derive the intrinsic clustering structure of a dataset in a visual way and extract the basic clustering information by cluster ordering. However, due to its high time and space complexity, it cannot adapt well to the large datasets of modern society. The development of cloud computing and parallel computing provides a way to address the complexity of the OPTICS algorithm. This paper proposed a parallel OPTICS algorithm based on the Spark in-memory computing platform. The experimental results show that it can greatly reduce the time and space consumption of the OPTICS algorithm.
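The Spark parallelization itself cannot be reproduced from the abstract; the sketch below only shows the single-machine OPTICS output (the cluster ordering and reachability distances) that such a parallel version would reproduce, using scikit-learn on synthetic data.

```python
# Hedged sketch of plain OPTICS; the parallel Spark variant is not shown.
import numpy as np
from sklearn.cluster import OPTICS

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (100, 2)), rng.normal(3, 0.3, (100, 2))])
opt = OPTICS(min_samples=5).fit(X)
order = opt.ordering_                     # cluster ordering of the points
reach = opt.reachability_[order]          # valleys in this plot are clusters
```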
Pattern Matching with Weak-wildcard in Application of Time Series Analysis
TAN Chao-dong, MIN Fan, WU Xiao and LI Xin-lun
Computer Science. 2018, 45 (1): 103-107.  doi:10.11896/j.issn.1002-137X.2018.01.016
This paper proposed a pattern matching method based on weak-wildcards to obtain accurate and flexible matching, which helps locate critical time points and assists users' decisions. First, a nominal sequence was obtained by coding the time series. Second, the concepts of weak-wildcards and gaps with special semantics were defined. Third, an efficient pattern matching algorithm was designed. In time series analysis, patterns reflect the trend of data change and indicate the occurrence of events. Traditional exact pattern matching is greatly affected by noise and has low matching flexibility; adding weak-wildcards balances accuracy and flexibility. Experiments were undertaken on oil production and stock transaction data. Results show that, compared to the exact pattern matching method, the proposed method matches users' expectations better.
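One way to realize such matching over a coded series is to compile the pattern to a regular expression, as in the hedged sketch below; the symbol alphabet (U/D/F for up/down/flat), the gap bounds and `compile_pattern` are illustrative assumptions, not the paper's algorithm.

```python
# Hedged sketch: weak-wildcards (restricted symbol sets) and bounded gaps
# over a nominal sequence, compiled to a regex.
import re

def compile_pattern(tokens):
    """tokens: literals like 'U', sets like {'U','F'}, or ('gap', lo, hi)."""
    parts = []
    for t in tokens:
        if isinstance(t, tuple) and t[0] == "gap":
            parts.append(".{%d,%d}" % (t[1], t[2]))     # bounded gap
        elif isinstance(t, set):
            parts.append("[%s]" % "".join(sorted(t)))   # weak-wildcard
        else:
            parts.append(re.escape(t))                  # exact symbol
    return re.compile("".join(parts))

seq = "UUFDUUDFUU"                       # coded series: Up / Down / Flat
pat = compile_pattern(["U", {"U", "F"}, ("gap", 0, 2), "D"])
print([m.start() for m in re.finditer(pat, seq)])
```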
Optimization Algorithm of Multiply Lie Group Covering Learning Algorithm
WU Lu-hui, LI Fan-zhang and ZHANG Li
Computer Science. 2018, 45 (1): 108-112.  doi:10.11896/j.issn.1002-137X.2018.01.017
In a previous study, a multiply Lie group kernel covering learning algorithm was proposed to reduce path intersections and improve classification correctness for multi-connected spaces. However, the performance of the kernel learning algorithm depends on the choice of kernel function. In this paper, the original Lie group samples are mapped to a target Lie group space by a Lie group homomorphic mapping; in the target space, the degree of path association between different singly-connected spaces is minimized and the degree of path association within the same singly-connected space is maximized, in order to reduce the path crossing problem.
Optimal Scale Selection in Multi-scale Decision Systems under Environment of Objects Updating
TIE Wen-yan, FAN Min and LI Jin-hai
Computer Science. 2018, 45 (1): 113-117.  doi:10.11896/j.issn.1002-137X.2018.01.018
Multi-scale decision systems are an important kind of relational database, and optimal scale selection is one of the main research topics in the study of multi-scale decision systems. This paper discussed the optimal scale selection of multi-scale decision systems in an environment of object updating. First, the notions of the multi-scale information system and the multi-scale decision system were presented. Then, a generalized decision function was defined and used to introduce consistency and the optimal scale in multi-scale decision systems. At last, the optimal scale transformation mechanism of different consistent multi-scale decision systems was investigated under the environment of object updating.
State Reduction Algorithm for Completely Specified Sequential Logic Circuit Based on Equivalence Relation
SHANG Ao, PEI Xiao-peng, LV Ying-chun and CHEN Ze-hua
Computer Science. 2018, 45 (1): 118-121.  doi:10.11896/j.issn.1002-137X.2018.01.019
State reduction of a completely specified sequential logic circuit refers to finding and combining the equivalent states in the logic circuit. The reduction can simplify the circuit, improve its safety and decrease its hardware cost at the same time. The key point of state reduction is to find the maximal state equivalence classes. In this paper, an equivalence-relation-based algorithm was proposed by introducing granular computing. By defining the output matrix and sub-state matrix and marking the initial states in the matrices, the initial state mark matrix and the sub-state mark matrix were obtained. Then, the state transition system matrix was constructed, and the initial states in the state table were continuously partitioned from coarser granularity to finer granularity until the maximal state equivalence classes were obtained. The experimental results and complexity analysis show the accuracy and efficiency of the proposed algorithm.
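The matrix-based procedure cannot be reconstructed from the abstract alone, but the maximal state equivalence classes it targets can be illustrated with classical partition refinement, which also works from coarser to finer granularity; the four-state machine below is made up.

```python
# Hedged sketch: iterative partition refinement (Moore-style) for finding
# maximal state equivalence classes of a completely specified machine.
def minimize(states, inputs, out, nxt):
    """out[s][a]: output symbol, nxt[s][a]: next state -> equivalence classes."""
    # Initial partition: states with identical output rows.
    block = {s: tuple(out[s][a] for a in inputs) for s in states}
    while True:
        # Refine: also distinguish states by the blocks of their successors.
        new = {s: (block[s],) + tuple(block[nxt[s][a]] for a in inputs)
               for s in states}
        if len(set(new.values())) == len(set(block.values())):
            break                        # no further split: maximal classes
        block = new
    classes = {}
    for s in states:
        classes.setdefault(block[s], []).append(s)
    return list(classes.values())

states, inputs = "ABCD", "01"
out = {"A": {"0": 0, "1": 1}, "B": {"0": 0, "1": 1},
       "C": {"0": 0, "1": 0}, "D": {"0": 0, "1": 0}}
nxt = {"A": {"0": "C", "1": "A"}, "B": {"0": "D", "1": "B"},
       "C": {"0": "A", "1": "C"}, "D": {"0": "B", "1": "D"}}
print(minimize(states, inputs, out, nxt))   # A~B and C~D merge
```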
Study on Tourism Demand Forecasting Based on Improved Grey Model
LI Yao, CAO Han and MA Jing
Computer Science. 2018, 45 (1): 122-127.  doi:10.11896/j.issn.1002-137X.2018.01.020
Aiming at tourism demand forecasting problems in Hainan Province, this paper proposed a novel dynamic optimal-input-subset fuzzy grey-Markov prediction model based on the traditional grey-Markov model. The model first determines the optimal number of input subsets through the input subset method according to the mean absolute percentage error of the GM(1,1) model's predictions, then calculates the membership vector and takes it as the weight vector of the Markov transfer matrix so that the forecast value can be revised through fuzzy set theory. An equal-dimension increasing dynamic grey prediction model was created to account for the passage of time, enabling prediction of tourism demand. The number of tourists received by hotels in Hainan Province is taken as an example to show that the model can effectively improve forecasting accuracy.
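As a reference point, the plain GM(1,1) model at the core of any grey-Markov approach can be sketched as follows; the fuzzy/Markov correction and the equal-dimension rolling update from the paper are omitted, and the tourist counts are invented.

```python
# Hedged sketch of the standard GM(1,1) grey forecasting model.
import numpy as np

def gm11_forecast(x0, horizon=3):
    x0 = np.asarray(x0, float)
    x1 = np.cumsum(x0)                         # accumulated generating series
    z1 = 0.5 * (x1[1:] + x1[:-1])              # background values
    B = np.column_stack([-z1, np.ones_like(z1)])
    a, b = np.linalg.lstsq(B, x0[1:], rcond=None)[0]   # develop/grey coeffs
    k = np.arange(len(x0) + horizon)
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a  # time response
    x0_hat = np.empty_like(x1_hat)
    x0_hat[0] = x0[0]
    x0_hat[1:] = np.diff(x1_hat)               # restore by differencing
    return x0_hat

tourists = [318, 352, 401, 447, 503, 561]      # made-up annual counts
print(gm11_forecast(tourists, horizon=2)[-2:]) # next two predicted values
```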
Study on Text Segmentation Based on Domain Ontology
LIU Yao, SHUAI Yuan-hua, GONG Xing-wei and HUANG Yi
Computer Science. 2018, 45 (1): 128-132.  doi:10.11896/j.issn.1002-137X.2018.01.021
Text segmentation plays an important role in information retrieval, abstract generation, question-answering systems, information extraction and so on. This paper put forward a new text segmentation method based on domain ontology after analyzing and summarizing existing methods at home and abroad. The method first uses initial concepts to automatically obtain a structured semantic concept set, which is then used to attach semantic labels to paragraphs in the text based on the frequency of occurrence, position and relationship of concepts and properties. Paragraphs with the same semantic annotation information are grouped into one semantic paragraph, which helps discover sub-topic information and realize topic segmentation of texts. The experimental results show that the precision, recall and F-measure of this method reach 85%, 90% and 88% respectively, which is better than most existing methods and satisfies real application needs.
Weighted Attribute Reduction Based on Fuzzy Rough Sets
FAN Xing-qi, LI Xue-feng, ZHAO Su-yun, CHEN Hong and LI Cui-ping
Computer Science. 2018, 45 (1): 133-139.  doi:10.11896/j.issn.1002-137X.2018.01.022
Existing classical reduction algorithms have high time consumption, especially on large-scale datasets. To handle this problem, this paper introduced weights into the concept of attribute reduction, where a weight is a measure of attribute significance. By building an optimization problem over the weights, it is found that the attribute dependency degree is exactly the optimal solution for the weights. As a result, this paper proposed a reduction algorithm based on ranked weights, which significantly accelerates attribute reduction. Numerical experiments demonstrate that the proposed algorithm is suitable for large-scale datasets, especially datasets with high dimension.
Improvement of Monte Carlo Tree Search Algorithm in Two-person Game Problem
JI Hui and DING Ze-jun
Computer Science. 2018, 45 (1): 140-143.  doi:10.11896/j.issn.1002-137X.2018.01.023
Monte Carlo tree search (MCTS) is a heuristic search algorithm for some kinds of decision processes, most notably those employed in game play. In the decision process of a very complex game, like computer Go, the basic Monte Carlo tree search method converges very slowly due to its large computation cost. This article indicated that MCTS cannot converge to the best strategy of two-person complete-information games. Therefore, this article proposed a new search algorithm which combines MCTS with the min-max algorithm in order to avoid failures caused by the randomness of the Monte Carlo method. To further improve the computational efficiency of MCTS in complex two-person games, this article also employed some progressive pruning strategies. An experimental test shows that the new algorithm significantly improves the accuracy and efficiency of MCTS.
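Two ingredients of such a hybrid, the UCB1 selection rule of MCTS and a depth-limited min-max search that can replace pure random rollouts, are sketched below; the game interface (moves/apply_move/evaluate) is hypothetical and this is not necessarily the authors' exact combination.

```python
# Hedged sketch: UCB1 child selection plus a generic depth-limited minimax.
import math

def ucb1_child(children, total_visits, c=1.4):
    """children: list of (move, wins, visits); unvisited nodes come first."""
    def score(ch):
        _, w, n = ch
        if n == 0:
            return float("inf")
        return w / n + c * math.sqrt(math.log(total_visits) / n)
    return max(children, key=score)

def minimax(state, depth, maximizing, moves, apply_move, evaluate):
    ms = moves(state)
    if depth == 0 or not ms:
        return evaluate(state)            # leaf: static evaluation
    vals = (minimax(apply_move(state, m), depth - 1, not maximizing,
                    moves, apply_move, evaluate) for m in ms)
    return max(vals) if maximizing else min(vals)
```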
Chinese Text Summarization Based on Classification
PANG Chao and YIN Chuan-huan
Computer Science. 2018, 45 (1): 144-147.  doi:10.11896/j.issn.1002-137X.2018.01.024
Automatic text summarization is an important task in natural language processing. According to the implementation approach, it can be classified into extractive summarization and abstractive summarization. An abstractive summary consists of ideas or concepts which are taken from the original document but are re-interpreted and presented in a different form, and may not appear verbatim in the original document. This paper proposed an abstractive model with a classifier. The model combines an encoder-decoder structure based on recurrent neural networks with a classifier to use supervised information more sufficiently and obtain more abstract features. Moreover, the encoder-decoder structure and classifier can easily be trained end-to-end and scale to a large amount of training data. The model obtains good performance on text summarization and text classification.
Partially Consistent Reduction in Intuitionistic Fuzzy Ordered Decision Information Systems with Preference Measure
LIN Bing-yan, XU Wei-hua and YANG Qian
Computer Science. 2018, 45 (1): 148-151.  doi:10.11896/j.issn.1002-137X.2018.01.025
In real life, the attribute values of many information systems are intuitionistic fuzzy numbers due to different needs. In view of this phenomenon, an intuitionistic fuzzy order relation was established on the basis of the weighted score function, and an inconsistent intuitionistic fuzzy ordered decision information system with preference measure was proposed. Furthermore, the partial consistent function was introduced into this complex system, and a method of computing partially consistent reductions via the partially consistent discernibility matrix was studied. Finally, the feasibility and effectiveness of the proposed method were verified by a case study.
Face Age Classification Method Based on Ensemble Learning of Convolutional Neural Networks
MA Wen-juan and DONG Hong-bin
Computer Science. 2018, 45 (1): 152-156.  doi:10.11896/j.issn.1002-137X.2018.01.026
Face age estimation has attracted much attention due to its potential applications in human-computer interaction and safety control. This paper focused on the face age classification task and proposed an age classification algorithm based on an ensemble of convolutional neural networks. Firstly, two convolutional neural networks which take face images as input are trained, and deep global features are extracted mainly by the convolutional neural networks. In order to further supply the local features of face images, especially texture information, the extracted LBP features are taken as input to another network. Finally, in order to combine the global and local features of face images, the three networks are ensembled, generating good results on a widely used age estimation dataset.
Two-level Stacking Algorithm Framework for Building User Portrait
LI Heng-chao, LIN Hong-fei, YANG Liang, XU Bo, WEI Xiao-cong, ZHANG Shao-wu and Gulziya ANIWAR
Computer Science. 2018, 45 (1): 157-161.  doi:10.11896/j.issn.1002-137X.2018.01.027
A user portrait is a kind of tagged user model constructed from the user's social attributes, lifestyle, consumer behavior, etc. The core work of building user portraits is to "tag" the user. Based on the user's query-word history, this paper proposed a two-level stacking algorithm framework for predicting a user's multi-dimensional labels. For the first-level models, a variety of models are built for each tag prediction subtask. An SVM model with trigram features is used to capture differences in users' word habits, the doc2vec shallow neural network model is used to extract semantic relation information of the query words, and a convolutional neural network model is used to extract deep semantic association information between query words. Experiments show that doc2vec has relatively good predictive accuracy on short-text tasks such as user queries. For the second-level models, the XGBoost tree model and the stacking method are used to extract the association information between the user's label attributes, further improving the average prediction accuracy by 2%. In the 2016 big data competition "Sogou User Portrait Mining for Precision Marketing" organized by the China Computer Federation, this two-level stacking framework won the championship among 894 teams.
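A minimal two-level stack can be illustrated with scikit-learn, using out-of-fold level-1 predictions as level-2 features; the models below (LinearSVC, logistic regression, gradient boosting) are stand-ins for the paper's SVM/doc2vec/CNN first level and XGBoost second level, and the data are synthetic.

```python
# Hedged sketch of two-level stacking with out-of-fold meta-features.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=500, n_features=40, random_state=0)
level1 = [LinearSVC(), LogisticRegression(max_iter=1000)]
# Out-of-fold predictions avoid leaking labels into the level-2 features.
meta = np.column_stack([
    cross_val_predict(m, X, y, cv=5, method="decision_function")
    for m in level1])
level2 = GradientBoostingClassifier().fit(meta, y)   # second-level learner
print("stacked accuracy:", level2.score(meta, y))
```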
Study on Detection Method of Pulmonary Nodules with Multiple Input Convolution Neural Network
ZHAO Peng-fei, ZHAO Juan-juan, QIANG Yan, WANG Feng-zhi and ZHAO Wen-ting
Computer Science. 2018, 45 (1): 162-166.  doi:10.11896/j.issn.1002-137X.2018.01.028
The detection of pulmonary nodules in traditional computer-aided diagnosis systems is complex, the detection results depend on the performance of each step preceding classification, and the false positive rate is high. To address these problems, this paper presented an end-to-end detection method for pulmonary nodules based on convolutional neural networks. First, a large amount of labeled pulmonary nodule data is fed into the newly constructed multi-input convolutional neural network for training, realizing supervised learning from raw data to semantic labels. Then, a fast edge detection method and a two-dimensional Gaussian probability density function are used to construct candidate region templates, and the candidate regions obtained from the CT sequence are used as input data for the multi-input convolutional neural network. Finally, a diagnostic threshold is used to annotate suspected pulmonary nodule regions, which are emphatically checked in the next frame. Extensive experimental results on the LIDC-IDRI dataset show that the proposed method achieves a high detection rate for small nodules in lung CT images, and the candidate region template with key monitoring can slightly reduce the false positive rate of small nodule detection.
Vietnamese Combinational Ambiguity Disambiguation Based on Weighted Voting Method of Multiple Classifiers
LI Jia, GUO Jian-yi, LIU Yan-chao, YU Zheng-tao, XIAN Yan-tuan and NGUYỄN Qing’e
Computer Science. 2018, 45 (1): 167-172.  doi:10.11896/j.issn.1002-137X.2018.01.029
Disambiguation of combinational ambiguity is one of the key issues in word segmentation and directly affects segmentation accuracy. In order to solve the impact of combinational ambiguity on Vietnamese word segmentation, combining the features of Vietnamese combinational words, this paper proposed a Vietnamese combinational ambiguity disambiguation method based on ensemble learning. The method first selects Vietnamese combinational ambiguous words manually, constructs a Vietnamese combinational ambiguity library, matches Vietnamese text against a Vietnamese combinational-word dictionary, and extracts Vietnamese combinational ambiguities. Secondly, using three kinds of classifiers incorporating Vietnamese word frequency features and context information, it constructs three classifier models and obtains their results. Finally, it calculates the classifier weights through a threshold to determine the final classification of the Vietnamese combinational ambiguity. Experiments show that the proposed method achieves an accuracy of 83.32%, an improvement of 5.81% compared with a single classifier.
Multi-label-specific Feature Selection Method Based on Neighborhood Rough Set
SUN Lin, PAN Jun-fang, ZHANG Xiao-yu, WANG Wei and XU Jiu-cheng
Computer Science. 2018, 45 (1): 173-178.  doi:10.11896/j.issn.1002-137X.2018.01.030
Dimensionality reduction is a significant and challenging task in multi-label learning, and feature selection is an effective technique for reducing the dimension of feature vectors. In this paper, a multi-label-specific feature selection method based on neighborhood rough set theory was proposed. This method theoretically ensures a strong correlation between the obtained label-specific features and the corresponding labels, so reduction efficiency can be improved. Firstly, a rough set reduction algorithm is applied to remove redundant attributes, and label-specific features are obtained while keeping the classification ability unchanged. Then, the concepts of neighborhood accuracy and neighborhood roughness are introduced, the calculation of dependence and attribute significance based on neighborhood rough sets is redefined, and the related properties of this model are discussed. Finally, a multi-label-specific feature selection model based on neighborhood rough sets is presented, and the corresponding feature selection algorithm for multi-label classification tasks is designed. Experimental results on public datasets demonstrate the effectiveness of the proposed method.
Multi-view Ensemble Framework for Constructing User Profile
FEI Peng, LIN Hong-fei, YANG Liang, XU Bo and Gulziya ANIWAR
Computer Science. 2018, 45 (1): 179-182.  doi:10.11896/j.issn.1002-137X.2018.01.031
State Grid users who are sensitive to electric charges often react strongly to electric quantity, electricity price, electric charge, payment, arrearage and other electrical services. Rapidly locating charge-sensitive users plays an important role in reducing the customer complaint rate, enhancing customer satisfaction, and establishing a good service image for power supply enterprises. Based on grid user data, this paper presented a multi-view ensemble framework for constructing user profiles, which can quickly and accurately identify charge-sensitive users. First of all, this paper analyzed the grid users and used two channels to model users with different characteristics respectively. Secondly, it presented a variety of feature extraction methods to construct a multi-source user feature system. Finally, in order to make full use of the multi-source features, it proposed a multi-view ensemble model based on double XGBoost. This framework obtained an F1 score of 0.90379 (first place) in the "User Profile" contest of the 2016 CCF Big Data and Computational Intelligence Contest, validating the effectiveness of the method.
Intra-domain Routing Protection Algorithm Based on Critical Nodes
GENG Hai-jun, SHI Xin-gang, WANG Zhi-liang, YIN Xia and YIN Shao-ping
Computer Science. 2018, 45 (1): 183-187.  doi:10.11896/j.issn.1002-137X.2018.01.032
With the expansion of the Internet, a large number of real-time applications are deployed on the Internet, which places greater demands on network delay. However, currently deployed intra-domain routing protocols cannot meet the delay requirements of real-time applications, so improving Internet routing availability has become an urgent problem. Academia and industry employ routing protection schemes to respond quickly to network failures and improve Internet routing availability. Existing routing protection schemes do not consider the importance of nodes in the network, yet in real networks the importance of different nodes is not the same. To solve this problem, an intra-domain routing protection algorithm based on critical nodes (RPBCN) was proposed. Firstly, an Internet routing availability model is built, which can quantitatively measure Internet routing availability. Then, a node criticality model is established, which can quantitatively measure the importance of nodes in the network. Finally, RPBCN is proposed based on these two models. The experimental results show that RPBCN greatly improves Internet routing availability with low computation overhead, providing an efficient solution for ISPs to the Internet routing availability problem.
MIMA: A Multi-identification Check-in Data Matching Algorithm Based on Spatial and Temporal Relations
ZHANG Chen, LI Zhi, ZHU Hong-song and SUN Li-min
Computer Science. 2018, 45 (1): 188-195.  doi:10.11896/j.issn.1002-137X.2018.01.033
An intelligent product often has a tag which represents its uniqueness, such as the serial number of a bus card or the MAC address of a Wi-Fi device. Check-in data, which represent people's discrete trajectories, consist of the tag, the time and the location at which the product is used. Researchers have conducted many studies on single kinds of check-in data. However, a single kind of check-in data is sparse, so the adaptability and performance of such studies are limited. This paper studied a new problem over multiple kinds of check-in data and proposed an algorithm called MIMA, based on multiple kinds of check-in data, to enrich check-in data and improve performance. Firstly, MIMA builds a signed network by calculating the positive and negative values between tags based on the temporal and spatial relations of the multiple kinds of check-in data produced by one individual. Then the FEC (Finding and Extracting Communities from signed social networks) community detection algorithm is improved, by deleting a cut criterion and considering weight density, to adapt to the specialty of check-in data signed networks, thereby partitioning the multiple tags that belong to one individual. The effectiveness and efficiency of the proposed algorithm are demonstrated through a set of experiments involving both real and simulated situations.
Big Data Dynamic Migration Method Based on Global Load Balancing in Cloud Environment
ZHANG Yong, ZHANG Jie-hui and LIU Bin
Computer Science. 2018, 45 (1): 196-199.  doi:10.11896/j.issn.1002-137X.2018.01.034
In cloud environments, data load balancing is slow and data skew occurs. In order to reduce the cost of data migration, a big data dynamic migration method based on global load balancing in the cloud environment was proposed. First, a load balancing model was constructed, the data migration cost under load balancing was computed, and a minimum-cost data migration model was given. The cost of data transfer was calculated, and the data load utilization ratio of virtual machines was evaluated so that data on overloaded servers can be transferred to underloaded data servers. The simulation results show that the proposed method improves the speed and efficiency of data load balancing, reduces the cost of data migration, and improves resource utilization.
IP Geolocation Method Based on Neighbor Sequence
GUO Li-xuan, ZHUO Zi-han, HE Yue-ying, LI Qiang and LI Zhou-jun
Computer Science. 2018, 45 (1): 200-204.  doi:10.11896/j.issn.1002-137X.2018.01.035
IP geolocation aims to accurately determine the physical location of a given IP address, usually based on measurement technology or data analysis. Existing data-analysis approaches give little consideration to the relationships between IP addresses. Taking the aggregation of IP addresses into account, this paper proposed an IP geolocation approach based on neighbor sequences. First, the approach calculates the neighbor sequence of an IP address and converts it to the corresponding sequence of latitudes and longitudes; it then builds a model on the sequence and solves it. The approach was experimentally verified using an IP address location library and mobile traffic data with GPS information as original data. Results show that neighbor sequences can determine the physical location of an IP address with a mean error between 20 km and 30 km, achieving county-level geolocation. This approach provides a new solution and a new idea for the IP geolocation problem, and it can be combined with other measurement-based or data-analysis-based approaches to obtain better results.
RESSP: An FPGA-based REconfigurable SDN Switching Architecture
HE Lu-bei, LI Jun-nan, YANG Xiang-rui and SUN Zhi-gang
Computer Science. 2018, 45 (1): 205-210.  doi:10.11896/j.issn.1002-137X.2018.01.036
SDN, which uses a forwarding-control separation architecture and a centralized management control mechanism, can effectively meet the control demands of different networks at different granularities. When SDN teaching and innovation experiments are carried out by researchers in universities, a data plane is needed that can be observed and reprogrammed to support principle demonstration and independent research. However, the internal processing of traditional ASIC switches is opaque and their lookup architecture is fixed, while the processing speed of software switches is low, so neither can fully support data plane research. At present, designing programmable data planes with FPGAs provides a feasible path to meet the diverse needs of different research scenarios. Although academia and industry have made some preliminary attempts at FPGA-based SDN switch design, FPGA-based reconfigurable switch architectures and design methods still lack in-depth study, and fine-grained module-level reconfigurable SDN processing is difficult to achieve. Therefore, existing work is hard to reuse and cannot provide technical support for SDN data plane design. This paper proposed an FPGA-based reconfigurable SDN switching architecture, namely RESSP. RESSP decomposes packet processing into multiple modules which can be dynamically loaded. For switches in specific application scenarios, a corresponding packet processing pipeline can be designed by using the FPGA to add, remove or replace RESSP's modules. Based on the RESSP structure, this paper implemented a prototype SDN switch, MiniSwitch, and its management software. MiniSwitch verifies that RESSP can quickly reconstruct the corresponding SDN data plane for different scenarios and meet the diverse processing requirements of SDN switches in different application scenarios.
Multipath Routing Algorithm in Software Defined Networking Based on Multipath Broadcast Tree
QIN Kuang-yu, HUANG Chuan-he, LIU Ke-wei, SHI Jiao-li and CHEN Xi
Computer Science. 2018, 45 (1): 211-215.  doi:10.11896/j.issn.1002-137X.2018.01.037
Shortest-path-based single-path routing is used in traditional networks, but it cannot use the capacity of all links effectively. Software defined networking (SDN) provides a centralized control plane to implement precise control of routing. To solve the multipath routing problem in SDN, a multipath broadcast tree structure and a multipath selection algorithm were proposed in this paper. The algorithm allocates probabilities to paths according to their available bandwidths and latencies: a path with larger bandwidth and lower latency is given higher priority. The simulation results show that the algorithm can make routing decisions fast while significantly reducing transmission delay and increasing throughput.
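Proportional splitting by bandwidth and latency might look like the sketch below; the abstract does not give the exact weighting, so the bandwidth^alpha / latency^beta form and the sample paths are assumptions rather than the paper's formula.

```python
# Hedged sketch: allocate per-path forwarding probabilities so that paths
# with larger available bandwidth and lower latency get higher priority.
def path_weights(paths, alpha=1.0, beta=1.0):
    """paths: list of (name, avail_bw_mbps, latency_ms) -> {name: probability}."""
    raw = {n: (bw ** alpha) / (lat ** beta) for n, bw, lat in paths}
    total = sum(raw.values())
    return {n: w / total for n, w in raw.items()}

print(path_weights([("p1", 100, 5), ("p2", 40, 2), ("p3", 10, 20)]))
```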
Community Detection Method Based on Multi-layer Node Similarity
ZHANG Hu, WU Yong-ke, YANG Zhi-zhuo and LIU Quan-ming
Computer Science. 2018, 45 (1): 216-222.  doi:10.11896/j.issn.1002-137X.2018.01.038
Community detection is an important research topic in complex networks, and the agglomerative method based on node similarity is a typical community detection method. Aiming at the shortcomings of existing methods for calculating node similarity, this paper proposed a novel method based on multi-layer node similarity, which can not only calculate the similarity between nodes more efficiently, but also solve the problem of merging nodes when node similarities are equal. Furthermore, this paper constructed a community detection model based on the improved node similarity calculation and a measure of connection tightness between groups, and conducted community detection experiments on real-world networks. Compared with the experimental results of the GN algorithm, the Fast Newman algorithm and the improved label propagation algorithm, the proposed model can find the members of each community more accurately.
Dynamic Grid Based Sparse Target Counting and Localization Algorithm Using Compressive Sensing
YANG Si-xing, GUO Yan, LIU Jie and SUN Bao-ming
Computer Science. 2018, 45 (1): 223-227.  doi:10.11896/j.issn.1002-137X.2018.01.039
In localization algorithms based on compressive sensing (CS) in wireless sensor networks (WSN), the localization area is generally divided into a number of grids and the targets are assumed to be located at the grid points; the locations of the targets can then be obtained by an l1-minimization algorithm. In practice, the assumption that targets lie exactly on grid points rarely holds, which makes the location vector non-sparse and may lead to dictionary mismatch and error. As a result, a novel localization framework based on a dynamic grid was proposed. This approach adaptively adjusts the grid so that the targets lie exactly on grid points. The proposed algorithm is solved by iterating between updating the dictionary and recovering the location vector. At the same time, the algorithm performs both sparse target counting and localization. Simulation results show that the proposed approach has advantages in both target counting and localization accuracy compared with traditional CS-based algorithms.
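The fixed-grid baseline that the dynamic-grid method improves on can be sketched as l1 recovery via linear programming; the measurement model, the problem sizes and the detection threshold below are synthetic, and the paper's dictionary-update iteration is omitted.

```python
# Hedged sketch: recover a sparse grid-occupancy vector from compressed
# measurements by l1 minimization (min ||x||_1 s.t. Ax = y).
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, m, k = 100, 30, 3                     # grid points, measurements, targets
A = rng.normal(size=(m, n))              # measurement dictionary
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = 1
y = A @ x_true
# Split x = u - v with u, v >= 0, then minimize sum(u) + sum(v).
c = np.ones(2 * n)
res = linprog(c, A_eq=np.hstack([A, -A]), b_eq=y, bounds=[(0, None)] * (2 * n))
x_hat = res.x[:n] - res.x[n:]
print("recovered targets:", np.flatnonzero(x_hat > 0.5))
```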
Distributed Subgradient Optimization Algorithm for Multi-agent Switched Networks
LI Jia-di and LI De-quan
Computer Science. 2018, 45 (1): 228-232.  doi:10.11896/j.issn.1002-137X.2018.01.040
This paper studied the distributed subgradient algorithm for multi-agent optimization problems over switched networks. Using a non-quadratic Lyapunov function method, we proved that the convergence of the proposed distributed optimization algorithm is still guaranteed under the condition that the directed switched network is periodically strongly connected and the corresponding adjacency matrix is stochastic rather than doubly stochastic. Finally, a simulation example was given to demonstrate the effectiveness of the proposed optimization algorithm.
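A single iteration of such a scheme, consensus mixing with a row-stochastic (not doubly stochastic) weight matrix followed by a subgradient step, can be sketched as follows; the three-agent network and the objectives f_i(x) = |x - c_i| are toy assumptions, and switching between topologies is not shown.

```python
# Hedged sketch of one distributed subgradient step per agent.
import numpy as np

def step(x, W, c, lr):
    mixed = W @ x                                   # consensus mixing
    sub = np.sign(mixed - c)                        # subgradient of |x - c_i|
    return mixed - lr * sub

W = np.array([[0.5, 0.5, 0.0],                      # row-stochastic, but its
              [0.3, 0.4, 0.3],                      # columns do not sum to 1
              [0.0, 0.5, 0.5]])
c = np.array([1.0, 3.0, 8.0])                       # each agent's private optimum
x = np.zeros(3)
for t in range(1, 2001):
    x = step(x, W, c, lr=1.0 / t)                   # diminishing step size
print(x)                                            # agents agree on a common value
```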
Anomaly Detection Method of ICS Based on Behavior Model
SONG Zhan-wei, ZHOU Rui-kang, LAI Ying-xu, FAN Ke-feng, YAO Xiang-zhen, LI Lin and LI Wei
Computer Science. 2018, 45 (1): 233-239.  doi:10.11896/j.issn.1002-137X.2018.01.041
At present, ICS network security has become a key problem in the field of information security. Detecting attacks such as behavior data tampering and control program tampering is a difficult problem in ICS network security. Therefore, this paper proposed an anomaly detection method based on a behavior model. The method extracts behavior data sequences from industrial control network traffic, then constructs a normal behavior model according to the control process and the controlled process of the ICS. Finally, it determines whether an exception occurs by comparing the behavior data extracted in real time with the behavior data predicted by the model. The experimental analysis shows that it can effectively detect behavior data tampering attacks, control program tampering attacks and so on.
Construction Method of ROP Frame Based on Multipath Dispatcher
PENG Jian-shan, ZHOU Chuan-tao, WANG Qing-xian and DING Da-zhao
Computer Science. 2018, 45 (1): 240-244.  doi:10.11896/j.issn.1002-137X.2018.01.042
ROP is a popular technique used to exploit software vulnerabilities, and it keeps evolving to counter ROP defense technologies. kBouncer and ROPecker are state-of-the-art ROP defense tools; they are effective in detecting traditional ROP and JOP, and they can trace indirect jump instructions by detecting ROP characteristics with the LBR register. The bypassing method proposed by Nicholas has the disadvantage that available ROP gadgets are hard to find. This paper proposed a novel method to organize ROP gadgets: an ROP frame was constructed to execute traditional gadgets in loops via a multipath dispatcher. Using this ROP frame, attackers can use plenty of traditional gadgets to execute a complete and efficient ROP chain. The test results show that this method is easy to implement and able to perform complex functions. More importantly, the proposed ROP frame can bypass ROPecker and kBouncer because its detectable characteristics are small enough.
Study on Evaluation Method of Network Security Situation under Multi-stage Large-scale Network Attack
TANG Zan-yu and LIU Hong
Computer Science. 2018, 45 (1): 245-248.  doi:10.11896/j.issn.1002-137X.2018.01.043
Traditional network security situation assessment methods often suffer from large evaluation bias. In order to accurately analyze the network security situation, a network security situation assessment method for multi-stage large-scale network attacks was proposed. Firstly, based on the characteristics of multiple data sources under multi-stage large-scale network attacks, a network security situation assessment model was established based on information fusion. Next, the stages of a large-scale network attack were identified, and the success probability of the attack and the implementation probability of each attack stage were calculated. Finally, three indexes in CVSS were used for network security situation assessment. The example analysis shows that the proposed method is well suited for practical application, and its evaluation results are accurate and effective.
Test Case Selection Technique Based on Semi-supervised Clustering Method
CHENG Xue-mei, YANG Qiu-hui, ZHAI Yu-peng and CHEN Wei
Computer Science. 2018, 45 (1): 249-254.  doi:10.11896/j.issn.1002-137X.2018.01.044
The purpose of regression testing is to ensure that no new faults are introduced into software after modifications. The number of regression test cases increases as software evolves, so test case selection techniques are used to control costs. In recent years, cluster analysis techniques have been applied to the test selection problem. We proposed a novel method called discriminative semi-supervised K-means clustering (DSKM), which introduces semi-supervised clustering technology. DSKM mines hidden pairwise constraint information from the test execution history. By taking advantage of a large number of unlabeled samples and a small number of labeled samples, DSKM optimizes the clustering results and further optimizes the test case selection results. Experiments show that DSKM is more effective than the Constrained-KMeans algorithm and the SSKM method.
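The constrained-clustering core can be illustrated with a COP-KMeans-style assignment that respects must-link/cannot-link pairs; DSKM's discriminative component is not reproduced here, and the constraint pairs and data are invented.

```python
# Hedged sketch: k-means assignment that honors pairwise constraints.
import numpy as np

def violates(i, c, labels, must, cannot):
    for a, b in must:
        j = b if a == i else a if b == i else None
        if j is not None and labels[j] not in (-1, c):
            return True                  # must-link partner is elsewhere
    for a, b in cannot:
        j = b if a == i else a if b == i else None
        if j is not None and labels[j] == c:
            return True                  # cannot-link partner is here
    return False

def cop_kmeans(X, k, must=(), cannot=(), iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)].copy()
    labels = np.full(len(X), -1)
    for _ in range(iters):
        for i in range(len(X)):
            order = np.argsort(((centers - X[i]) ** 2).sum(axis=1))
            ok = [c for c in order if not violates(i, c, labels, must, cannot)]
            labels[i] = ok[0] if ok else order[0]   # fall back if infeasible
        for c in range(k):
            if (labels == c).any():
                centers[c] = X[labels == c].mean(axis=0)
    return labels

X = np.random.default_rng(1).random((50, 2))
print(cop_kmeans(X, 3, must=[(0, 1)], cannot=[(0, 2)])[:5])
```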
Symbolic ZBDD-based Generation Algorithm for Combinatorial Testing
HUANG Yu-yao, LI Feng-ying, CHANG Liang and MENG Yu
Computer Science. 2018, 45 (1): 255-260.  doi:10.11896/j.issn.1002-137X.2018.01.045
Combinatorial testing is an effective method for system testing, which can test a system with fewer test cases while guaranteeing the error detection rate. However, the problem of constructing a minimal set of test cases is NP-complete, and many algorithms only obtain suboptimal solutions. This paper presented a method of generating test cases based on the symbolic zero-suppressed binary decision diagram (ZBDD). The method first uses the ZBDD structure to build a compact symbolic representation of the system under test. Then, using implicit ZBDD operations combined with a greedy strategy, the method covers more combinations per test case and shrinks the set of uncovered combinations, generating a smaller set of test cases. The method supports 2-wise to 4-wise coverage. Experiments show that this method is feasible and has a small node cost.
Application of Probabilistic Model Checking in Dynamic Power Management
DU Yi, HE Yang and HONG Mei
Computer Science. 2018, 45 (1): 261-266.  doi:10.11896/j.issn.1002-137X.2018.01.046
Abstract PDF(4088KB) ( 629 )   
References | Related Articles | Metrics
Making a trade-off between the energy consumption and the performance of embedded devices has become a hot topic. Dynamic power management (DPM) is an efficient way to reduce a device's energy consumption while guaranteeing its performance, and the key to DPM is the DPM strategy. Based on probabilistic model checking, a method to generate and verify DPM strategies was proposed. The target system is modeled as stochastic multi-player games (SMGs), goals are expressed as rPATL properties, and the probabilistic model checking tool PRISM-games is used to synthesize strategies for dynamic energy management. Furthermore, PRISM is used to verify the synthesized strategies. The experimental results show that the method is feasible and efficient.
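Strategy synthesis itself requires PRISM-games; as rough intuition for the trade-off such strategies optimize, the toy simulation below evaluates a fixed timeout policy on a hypothetical two-state device model. All parameters are invented and the model is far simpler than the paper's SMGs:

```python
import random

def average_power(timeout, steps=100_000, p_request=0.1,
                  busy=2.0, idle=1.0, sleep=0.1, wake_cost=5.0):
    energy, idle_time, asleep = 0.0, 0, False
    for _ in range(steps):
        if random.random() < p_request:       # a service request arrives
            if asleep:
                energy += wake_cost           # pay the wake-up penalty
                asleep = False
            energy += busy
            idle_time = 0
        else:
            idle_time += 1
            if idle_time >= timeout:          # the policy under evaluation
                asleep = True
            energy += sleep if asleep else idle
    return energy / steps

for t in (1, 5, 20):
    print(t, round(average_power(t), 3))      # energy vs. wake-latency trade-off
```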
PPQ:Finding Combinatorial Skyline Based on Partition
DONG Lei-gang, LIU Guo-hua and CUI Xiao-wei
Computer Science. 2018, 45 (1): 267-272.  doi:10.11896/j.issn.1002-137X.2018.01.047
Abstract PDF(8732KB) ( 676 )   
References | Related Articles | Metrics
Combinatorial skyline (C-skyline) computation, which returns outstanding combinations, is increasingly useful in multi-criteria decision making. The existing algorithm uses a recursive method that incurs redundant computation and an unsatisfactory pruning rate. Therefore, a new algorithm, PPQ (Partition-Prune-Query), was proposed. The concept of the dominant region was introduced, and the whole data region is divided into sub-spaces. Most useless combinations are then pruned by the pruning strategies, so the result can be returned quickly. The experiments demonstrate the correctness and efficiency of the proposed algorithm.
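For intuition, a naive combinatorial skyline over aggregated k-item combinations can be written directly; PPQ's contribution is precisely to avoid this enumeration via dominant-region partitioning and pruning. A sketch, assuming smaller-is-better attributes and sum aggregation:

```python
from itertools import combinations

def dominates(a, b):                 # smaller is better on every attribute
    return all(x <= y for x, y in zip(a, b)) and a != b

def c_skyline(points, k):
    # aggregate each k-combination into a single attribute vector (by sum)
    combos = [tuple(map(sum, zip(*c))) for c in combinations(points, k)]
    return [c for c in combos if not any(dominates(d, c) for d in combos)]

print(c_skyline([(1, 5), (2, 2), (4, 1), (3, 3)], 2))
# -> [(3, 7), (6, 3), (5, 5)]: the non-dominated 2-combinations
```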
Storage Location Assignment Optimization Method Based on Elite Multi-strategy
ZHANG Gui-jun, YAO Jun, ZHOU Xiao-gen and WANG Wen
Computer Science. 2018, 45 (1): 273-279.  doi:10.11896/j.issn.1002-137X.2018.01.048
Abstract PDF(5841KB) ( 878 )   
References | Related Articles | Metrics
To address the storage location assignment problem in intelligent stereoscopic warehouses, a storage location assignment optimization method using an elite multi-strategy was proposed. Firstly, considering the weight of goods and the frequency and time of their entry and exit, a storage location assignment optimization model was constructed on the principles of a low shelf center of gravity and of placing goods with a high entry/exit frequency close to the entry/exit point. Then, an elite multi-strategy-based differential evolution algorithm was designed to solve the model. In this approach, information extracted from elite individuals guides the mutation, and the mutation strategies for different search stages are selected according to the variation of the crowding degree of the elite individuals; thus high-quality individuals are generated and the convergence speed is improved. Finally, the performance of the proposed algorithm was verified on ten classical benchmark functions, and the optimal storage location assignment scheme for a company's finished product warehouse was obtained with the proposed method.
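The paper's exact strategy-switching rule (driven by the elite crowding degree) is not reproducible from the abstract; the sketch below only shows what one elite-guided DE mutation step might look like, with the elite fraction and scale factor chosen arbitrarily:

```python
import numpy as np

def elite_mutation(pop, fitness, F=0.5, elite_frac=0.2):
    """One elite-guided DE mutation step (minimization)."""
    elites = pop[np.argsort(fitness)[:max(1, int(elite_frac * len(pop)))]]
    guide = elites.mean(axis=0)        # information extracted from the elites
    mutants = np.empty_like(pop)
    for i in range(len(pop)):
        r1, r2 = np.random.choice(len(pop), 2, replace=False)
        # pull toward the elite centroid, perturbed by a random difference
        mutants[i] = pop[i] + F * (guide - pop[i]) + F * (pop[r1] - pop[r2])
    return mutants

pop = np.random.uniform(-5, 5, size=(20, 3))
fit = (pop ** 2).sum(axis=1)           # sphere function as a toy objective
print(elite_mutation(pop, fit).shape)  # (20, 3)
```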
Transfer Prediction Learning Based on Hybrid of SDA and SVR
REN Jun, HU Xiao-feng and LI Ning
Computer Science. 2018, 45 (1): 280-284.  doi:10.11896/j.issn.1002-137X.2018.01.049
Abstract PDF(7745KB) ( 671 )   
References | Related Articles | Metrics
To improve prediction accuracy on small samples in the big data era, this paper introduced a novel hybrid model based on the stacked denoising auto-encoder (SDA) and support vector regression (SVR). The hybrid model is pretrained with a large amount of source domain data and then fine-tuned with a small amount of target domain data. The method exploits the SDA's ability to autonomously extract common features from related but different domain data. By transferring this prior knowledge, the hybrid model provides relatively accurate predictions on high-dimensional, noisy small samples. Experimental results on extensive datasets demonstrate the effectiveness of the proposed model.
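As a rough single-layer stand-in for the pipeline (a real SDA stacks several nonlinear layers and fine-tunes end-to-end), one can pretrain a tied-weight denoising auto-encoder on abundant source data and fit an SVR on the encodings of the few labeled target samples. The data here are synthetic:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X_src = rng.normal(size=(1000, 20))          # plentiful source-domain data
X_tgt = rng.normal(size=(30, 20))            # scarce target-domain data
y_tgt = X_tgt[:, :5].sum(axis=1) + 0.1 * rng.normal(size=30)

W = rng.normal(scale=0.1, size=(20, 8))      # tied encoder/decoder weights
for _ in range(200):                         # denoising pretraining on source
    noisy = X_src + 0.3 * rng.normal(size=X_src.shape)
    H = np.tanh(noisy @ W)                   # encode the corrupted input
    E = H @ W.T - X_src                      # reconstruction error
    grad = noisy.T @ ((E @ W) * (1 - H ** 2)) + E.T @ H
    W -= 0.01 * grad / len(X_src)            # plain gradient descent

svr = SVR(kernel="rbf").fit(np.tanh(X_tgt @ W), y_tgt)   # fit on encodings
print(round(svr.score(np.tanh(X_tgt @ W), y_tgt), 3))
```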
Improved Chicken Swarm Optimization Algorithm and Its Application in DTI-FA Image Registration
ZHENG Wei, JIANG Chen-jiao, LIU Shuai-qi and ZHAO Jie
Computer Science. 2018, 45 (1): 285-291.  doi:10.11896/j.issn.1002-137X.2018.01.050
Abstract PDF(5525KB) ( 660 )   
References | Related Articles | Metrics
The chicken swarm optimization algorithm (CSO) is a new swarm intelligence optimization algorithm that is simple and scales well. However, because hens have poor optimization ability, CSO easily falls into local extrema. To address this, a chaotic improved chicken swarm optimization algorithm (CICSO) was proposed. In this algorithm, chicken positions are initialized by exploiting the ergodicity of a chaotic map, and the hen's position update formula is changed to follow the rooster with the globally best fitness value. In addition, a learning factor is introduced to avoid local optima. Finally, CICSO was applied to DTI-FA image registration. Simulation results show that the improved algorithm avoids falling into local extrema and improves both the convergence precision and the registration accuracy in DTI-FA image registration.
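The two modifications can be sketched in isolation; full CSO bookkeeping (role assignment, periodic reshuffling, the learning factor) is omitted and the coefficients are illustrative:

```python
import numpy as np

def chaotic_init(n, dim, low, high, mu=4.0):
    z = np.full(dim, 0.37)             # logistic-map seed, avoiding fixed points
    pop = np.empty((n, dim))
    for i in range(n):
        z = mu * z * (1 - z)           # ergodic logistic map over (0, 1)
        pop[i] = low + z * (high - low)
    return pop

def hen_update(hen, global_best_rooster, group_rooster, f1=0.6, f2=0.4):
    r = np.random.rand(*hen.shape)
    # learn from the globally best rooster instead of only the group's rooster
    return (hen + f1 * r * (global_best_rooster - hen)
                + f2 * r * (group_rooster - hen))

print(chaotic_init(5, 2, -1.0, 1.0))
```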
Density Clustering Algorithm Based on Laplacian Centrality
YANG Xu-hua, ZHU Qin-peng and TONG Chang-fei
Computer Science. 2018, 45 (1): 292-296.  doi:10.11896/j.issn.1002-137X.2018.01.051
Abstract PDF(4591KB) ( 838 )   
References | Related Articles | Metrics
As an important tool of data mining, clustering analysis measures the similarity between data and classifies them into categories; it is widely applied in pattern recognition, economics, biology and so on. In this paper, a new clustering algorithm was proposed. Firstly, the dataset to be classified is converted into a weighted complete graph: each data point is a node, and the distance between two data points is the weight of the edge between them. Secondly, the local importance of each node in the network is evaluated by its Laplacian centrality. A cluster center has higher Laplacian centrality than its surrounding neighbor nodes and is relatively far from any node of higher Laplacian centrality. The algorithm is truly parameter-free and classifies a dataset automatically without any prior parameters. The new algorithm was compared with 9 well-known clustering algorithms on 6 datasets, and the experimental results show that it achieves good clustering performance.
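Laplacian centrality has a convenient closed form, since the Laplacian energy equals trace(L^2) = sum_i d_i^2 + sum_{i != j} w_ij^2, so the node importance used by the algorithm can be sketched directly. Here W is the symmetric weight matrix of the complete graph; how the paper maps distances to weights is not specified in the abstract, so the mapping below is an assumption:

```python
import numpy as np

def laplacian_energy(W):
    d = W.sum(axis=0)
    return float((d ** 2).sum() + (W ** 2).sum())     # trace(L^2)

def laplacian_centrality(W):
    """Relative drop in Laplacian energy when each node is removed."""
    full = laplacian_energy(W)
    drop = lambda v: np.delete(np.delete(W, v, 0), v, 1)
    return np.array([(full - laplacian_energy(drop(v))) / full
                     for v in range(len(W))])

pts = np.random.rand(5, 2)
D = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
W = np.exp(-D)                  # one possible distance-to-weight mapping
np.fill_diagonal(W, 0)
print(laplacian_centrality(W))
```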
Log-polar Feature Guided Iterative Closest Point Algorithm
ZHOU Shi-hao and ZHANG Yun
Computer Science. 2018, 45 (1): 297-306.  doi:10.11896/j.issn.1002-137X.2018.01.052
Abstract PDF(3125KB) ( 648 )   
References | Related Articles | Metrics
Images acquired under lighting variations, rotation or optical zoom, physical changes of the scene, widely different viewpoints, or different modalities can change substantially in appearance and shape. Even state-of-the-art technology such as the generalized dual-bootstrap iterative closest point (GDB-ICP) method still has difficulty registering such challenging images. The reason is that GDB-ICP uses scale-invariant blob points (SIFT keypoints) to drive the iterative closest point (ICP) method, but SIFT keypoints cannot be reliably extracted from images with large appearance changes. To handle this issue, this paper proposed a novel log-polar feature guided iterative closest point (LPF-ICP) algorithm for image registration. The experimental evaluation shows that LPF-ICP reliably extracts log-polar feature points and successfully registers all 22 image pairs in the Rensselaer dataset, while GDB-ICP succeeds on only 19 of them, verifying the effectiveness of the proposed method.
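The log-polar idea itself is easy to illustrate: sampling a patch on a log-polar grid turns scale and rotation of the scene into translations of the descriptor. A minimal sketch of the sampling step only; the paper's actual descriptor and the ICP driving loop are not reproduced:

```python
import numpy as np

def log_polar_patch(img, cx, cy, n_r=16, n_theta=32):
    h, w = img.shape
    r_max = min(cx, cy, w - 1 - cx, h - 1 - cy)
    rs = np.exp(np.linspace(0, np.log(r_max), n_r))       # log-spaced radii
    ts = np.linspace(0, 2 * np.pi, n_theta, endpoint=False)
    xs = (cx + rs[:, None] * np.cos(ts)).round().astype(int)
    ys = (cy + rs[:, None] * np.sin(ts)).round().astype(int)
    # scale/rotation of the scene shift this grid along its two axes
    return img[ys.clip(0, h - 1), xs.clip(0, w - 1)]

img = np.random.rand(64, 64)
print(log_polar_patch(img, 32, 32).shape)   # (16, 32)
```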
Selective Ensemble Learning Human Activity Recognition Model Based on Diversity Measurement Cluster
WANG Zhong-min, ZHANG Shuang and HE Yan
Computer Science. 2018, 45 (1): 307-312.  doi:10.11896/j.issn.1002-137X.2018.01.053
Abstract PDF(6044KB) ( 697 )   
References | Related Articles | Metrics
To improve the accuracy of mobile-phone-based human activity recognition, and to optimize both the generalization performance of the multi-classifier ensemble system and the diversity of individual classifiers, an activity recognition model based on selective ensemble learning with diversity-measure-increment affinity propagation clustering (DMI-AP) was proposed. Firstly, all training samples are bootstrapped and base classifiers are trained; the model keeps the base classifiers whose accuracy exceeds the average. The selected classifiers are then clustered: the double-fault diversity increment values are obtained by computing the double-fault diversity measure between base classifiers, and these values are grouped into k clusters by the affinity propagation clustering algorithm. The center classifier of each cluster forms the multi-classifier system. Finally, the classifier outputs are fused by averaging. The experimental results show that the DMI-AP model increases the diversity of individual classifiers and reduces the classifier search space; compared with the traditional Bagging, Adaboost and RF methods, the recognition accuracy of the proposed model is improved by 8.11%.
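The double-fault measure between two classifiers is standard: the fraction of samples that both misclassify, with lower values indicating more diversity. A minimal sketch of the pairwise values that feed the affinity propagation step:

```python
import numpy as np

def double_fault(pred_a, pred_b, y):
    """Fraction of samples both classifiers get wrong (lower = more diverse)."""
    return float(((pred_a != y) & (pred_b != y)).mean())

y      = np.array([0, 1, 1, 0, 1])
pred_a = np.array([0, 1, 0, 0, 0])       # wrong on samples 2 and 4
pred_b = np.array([0, 0, 0, 0, 1])       # wrong on samples 1 and 2
print(double_fault(pred_a, pred_b, y))   # 0.2: both err only on sample 2
```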
Multi-images Mosaic Algorithm Based on Improved Phase Correlation and Feature Point Registration
LI Dan, XIAO Li-qing, TIAN Jun and SUN Jin-ping
Computer Science. 2018, 45 (1): 313-319.  doi:10.11896/j.issn.1002-137X.2018.01.054
Abstract PDF(2074KB) ( 663 )   
References | Related Articles | Metrics
Image mosaic results suffer from differences in camera exposure, scale change, rotation, ambient noise, and light interference, and manually sorting images has a high error rate and poor efficiency. In this paper, a new multi-image mosaic algorithm based on improved phase correlation and feature point registration was proposed. Firstly, an improved phase correlation algorithm based on the log-polar transformation is used to compute the scaling, rotation, and translation parameters, and sorting rules are derived from the peak magnitude of the impulse-function energy. Then, Harris corner points are extracted at the overlapping positions, the matching point pairs are purified by an improved RANSAC algorithm, and the transformation matrix is optimized to complete the mosaic. Finally, to smooth the visible joints, the images are decomposed into low-frequency and high-frequency sub-bands by the NSCT transform, and the new fusion strategy makes the joints appear smooth and natural. The experiments confirm that the model parameters estimated by the new algorithm are accurate, and the mosaic is highly robust to complex environments and unordered image sequences.
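Basic phase correlation, the building block the algorithm improves upon, recovers a pure translation as the impulse peak of the normalized cross-power spectrum; the peak height is the kind of energy value the sorting rules use. A minimal sketch for translation only (the log-polar resampling that additionally handles scale and rotation is omitted):

```python
import numpy as np

def phase_correlate(a, b):
    """Translation of a relative to b, plus the impulse peak height."""
    cross = np.fft.fft2(a) * np.conj(np.fft.fft2(b))
    corr = np.abs(np.fft.ifft2(cross / (np.abs(cross) + 1e-9)))
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    return dy, dx, corr.max()            # peak height drives the sorting rules

a = np.random.rand(64, 64)
b = np.roll(a, (3, 5), axis=(0, 1))      # shift a down by 3, right by 5
print(phase_correlate(b, a)[:2])         # -> (3, 5)
```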