Started in January 1974 (Monthly)
Supervised and Sponsored by Chongqing Southwest Information Co., Ltd.
ISSN 1002-137X
CN 50-1075/TP
CODEN JKIEBK
    Content of Big Data & Data Mining in our journal
    Citywide Crowd Flows Prediction Based on Spatio-Temporal Recurrent Convolutional Networks
    GUO Sheng-nan, LIN You-fang, JIN Wen-wei, WAN Huai-yu
    Computer Science    2019, 46 (6A): 385-391.  
    Accurately forecasting the crowd flows in urban areas can provide effective decision-making support for traffic management and citizens' travel. The crowd flows in each urban region have strong correlations in both the temporal and the spatial dimensions, and these complex factors bring great challenges to accurate prediction. A novel neural network structure named attention-based spatio-temporal recurrent convolutional networks (ASTRCNs) was proposed, which can simultaneously model the various factors that affect crowd flows. ASTRCNs consists of three components, which respectively capture the short-term dependencies, the daily periodicity and the weekly patterns of the crowd flows. Experimental results on a real crowd-flow dataset from Beijing demonstrate that the proposed ASTRCNs outperforms both classical time-series methods and existing deep-learning-based prediction methods.
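    An illustrative sketch (not the paper's network) of the three-view idea behind ASTRCN-style models: the next value of a region's flow series is predicted by fusing the most recent steps, the same time slot on previous days, and the same slot on previous weeks. The fixed fusion weights below are placeholders for the attention scores the network would learn.

```python
def component_views(series, steps_per_day=24, steps_per_week=168, k=3):
    """Last k values of the recent, daily-periodic and weekly-periodic views,
    relative to the next (to-be-predicted) step."""
    recent = series[-k:]
    daily = series[-steps_per_day::-steps_per_day][:k]     # same slot, previous days
    weekly = series[-steps_per_week::-steps_per_week][:k]  # same slot, previous weeks
    return recent, daily, weekly

def fuse_prediction(series, weights=(0.6, 0.25, 0.15)):
    """Weighted fusion of each view's mean; empty views contribute nothing."""
    views = component_views(series)
    return sum(w * (sum(v) / len(v)) for w, v in zip(weights, views) if v)
```

With hourly data, a daily periodic pattern dominates the daily view while the recent view tracks the last few hours; the learned attention in the paper replaces the hand-set weights here.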
    Fault Prediction of Power Metering Equipment Based on GBDT
    LIU Jin-shuo, LIU Bi-wei, ZHANG Mi, LIU Qing
    Computer Science    2019, 46 (6A): 392-396.  
    Fault risk prediction for power metering equipment can reduce the losses that equipment faults cause across the national grid. Firstly, data preprocessing and feature selection are carried out. Secondly, GBDT-based prediction of fault categories, fault subclasses and equipment life cycle is designed. Finally, the validity and advancement of the designed model are verified. The data used in the experiment were provided by the China Electric Power Research Institute. The experimental results show that, over the six fault types, the proposed algorithm achieves a prediction accuracy of 90.56%, a recall rate of 92.95% and an F1 value of 91.71%. Compared with regression, BP neural network, AdaBoost and decision tree algorithms, the gradient boosting decision tree performs best under tuned parameters.
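    A toy sketch of the gradient boosting idea the classifier rests on, using one-feature regression stumps on synthetic data; this illustrates the GBDT mechanism only and is not the paper's implementation.

```python
def fit_stump(xs, residuals):
    """Pick the threshold split minimizing squared error on the residuals."""
    best = None
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    return best[1:]  # (threshold, left_value, right_value)

def fit_gbdt(xs, ys, rounds=50, lr=0.1):
    """Each round fits a stump to the current residuals (squared-error gradient)."""
    base = sum(ys) / len(ys)
    stumps, preds = [], [base] * len(ys)
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lv, rv = fit_stump(xs, residuals)
        stumps.append((t, lv, rv))
        preds = [p + lr * (lv if x <= t else rv) for x, p in zip(xs, preds)]
    return base, lr, stumps

def predict_gbdt(model, x):
    base, lr, stumps = model
    return base + sum(lr * (lv if x <= t else rv) for t, lv, rv in stumps)
```

Each boosting round shrinks the residual by a factor controlled by the learning rate, which is why many shallow learners outperform one deep one on tabular fault data.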
    MetaStruct-CF: A Meta Structure Based Collaborative Filtering Algorithm in Heterogeneous Information Networks
    WANG Xu, PANG Wei, WANG Zhe
    Computer Science    2019, 46 (6A): 397-401.  
    In recent years, heterogeneous information networks (HINs) have received much attention because they contain rich semantic information. Previous works have demonstrated that the rich relationship information in HINs can effectively improve recommendation performance. As an important tool for mining relationship information in HINs, the meta-path has been widely used in many algorithms. However, because of its simple linear structure, a meta-path may not be able to express complex relationship information. To address this issue, this paper proposed a new recommendation algorithm, MetaStruct-CF, which applies meta structures to capture accurate relationship information among data objects. Different from existing methods, the proposed algorithm combines multiple relationships to effectively utilize the information in HINs. Extensive experiments on two real-world datasets show that this algorithm achieves better recommendation performance than several popular or state-of-the-art methods.
    Distributed Spatial Keyword Query Processing Algorithm with Relational Attributes
    XU Zhe, LIU Liang, QIN Xiao-lin, QIN Wei-meng
    Computer Science    2019, 46 (6A): 402-406.  
    The rapid growth of the mobile internet and the internet of things generates a large amount of spatial-textual object data with relational attributes. Search engines for webpage text can efficiently store and index textual data, but they support only textual keyword queries; mixed data that includes geographic location information, textual information and relational attributes cannot be processed. Existing query-processing techniques for spatial keywords do not consider relational attributes as filter conditions, and they are based on stand-alone implementations that cannot meet query performance requirements. To solve these problems, this paper proposed a baseline algorithm named BADKLRQ (Baseline Algorithm of Distributed Keywords and Location-aware with Relational Attributes Query), which maps the relational-attribute, spatial and keyword attributes into text data and indexes the converted data with a text index. A query request with relational attributes, spatial constraints and keywords is likewise converted into a set of text keywords in the mapped space and evaluated against the converted text data. An improved algorithm, MGDKLRQ, refines the conversion of spatial attributes into text keywords. Experiments show that, in terms of index time and query time, BADKLRQ improves on existing algorithms by 10% to 15%, and MGDKLRQ by 20% to 30%.
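    A minimal sketch of the mapping idea described in the abstract: spatial cells, keywords and relational attributes all become plain text tokens, so a single inverted index can answer the mixed query. The token formats and grid-cell scheme below are illustrative assumptions, not the paper's encoding.

```python
def tokens(lat, lon, keywords, attrs, cell=0.01):
    """Encode one object (or query) as a set of text tokens."""
    t = {"g:%d:%d" % (int(lat / cell), int(lon / cell))}  # grid-cell token
    t |= {"k:" + k for k in keywords}                     # keyword tokens
    t |= {"a:%s=%s" % kv for kv in attrs.items()}         # relational-attribute tokens
    return t

class InvertedIndex:
    def __init__(self):
        self.postings = {}

    def add(self, obj_id, toks):
        for tok in toks:
            self.postings.setdefault(tok, set()).add(obj_id)

    def query(self, toks):
        # an object matches only if it carries every query token
        sets = [self.postings.get(tok, set()) for tok in toks]
        return set.intersection(*sets) if sets else set()
```

Because all three attribute kinds share one token space, the distributed version can shard this index exactly like an ordinary text index.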
    Linear Twin Support Vector Machine Based on Data Distribution Characteristics
    SONG Rui-yang, MENG Hua, LONG Zhi-guo
    Computer Science    2019, 46 (6A): 407-411.  
    The Twin Support Vector Machine (TWSVM) has been successfully applied in many fields. However, the standard TWSVM model is not robust when dealing with classification problems involving data distribution characteristics; especially when the uncertainty in the data fluctuates widely, the standard model, which does not consider distribution characteristics, no longer achieves satisfactory classification accuracy. Therefore, a weighted linear twin support vector machine model based on data distribution characteristics was proposed in this paper. The new model, denoted TWSVM-U, further considers the influence of data distribution characteristics on the locations of the classification hyperplanes, and quantitatively constructs distance weights according to the data dispersion along the normal vector directions of the classification hyperplanes. TWSVM-U is a generalization of TWSVM: when the training samples have no distribution characteristics, the TWSVM-U model degenerates to the standard TWSVM model. Experiments with 10-fold cross validation show that the TWSVM-U model performs better than SVM and TWSVM on classification problems with large data fluctuation ranges.
    Common Issues and Case Analysis of System Data Migration
    LU Ye-shan
    Computer Science    2019, 46 (6A): 412-416.  
    With the development of society and the rapid change of technical frameworks, replacing old production systems with new ones has become a trend, and such replacement inevitably involves data docking between the old and the new system. In the system construction of an organization in a city, the project needed to migrate all business data of the old system to the new system. Because the table spaces, table structures and table fields of the two systems are inconsistent, how to migrate the data so as to ensure data consistency and integrity, guarantee that no data are lost before and after migration, and prevent dirty data from migrating and affecting the operation of the new system became a top priority in the project. To solve this problem, this paper designed a data migration process based on ETL tools and obtained a complete migration pipeline by combining the steps in series, thereby realizing the data docking between the old and new systems. This paper elaborates the following data migration problems and their solutions: 1) common errors and solutions in the data flow; 2) migration problems caused by inconsistent data types; 3) inconsistent field lengths in the target database; 4) re-running the migration when the original data change after migration has completed. On this basis, the paper briefly analyzes and summarizes the problems in the data migration process and the countermeasures for solving them.
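    A hedged sketch of one ETL-style check corresponding to problems 2) and 3) above: validate each row against the target table's types and field lengths before loading, and divert bad rows instead of letting one dirty record abort the whole migration. The schema and field names are hypothetical.

```python
TARGET_SCHEMA = {            # hypothetical target-table definition
    "user_id": (int, None),
    "name": (str, 20),       # e.g. VARCHAR(20): reject longer values
    "balance": (float, None),
}

def validate_row(row):
    """Return a list of problems; an empty list means the row is loadable."""
    errors = []
    for field, (ftype, maxlen) in TARGET_SCHEMA.items():
        value = row.get(field)
        if not isinstance(value, ftype):
            errors.append("%s: expected %s" % (field, ftype.__name__))
        elif maxlen is not None and len(value) > maxlen:
            errors.append("%s: longer than %d" % (field, maxlen))
    return errors

def migrate(rows):
    """Split rows into loadable and rejected, mimicking an ETL error branch."""
    loaded, rejected = [], []
    for row in rows:
        (rejected if validate_row(row) else loaded).append(row)
    return loaded, rejected
```

In a real ETL tool the rejected branch would be written to an error table for later cleansing and re-migration, matching the article's "dirty data must not enter the new system" requirement.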
    Temporal Text Data Stream Feature Trend Model and Algorithm
    MENG Zhi-qing, XU Wei-wei
    Computer Science    2019, 46 (6A): 417-422.  
    Today, e-commerce and social networking platforms generate large text data streams. Quickly extracting the features of a text data stream in order to discover trends is very important for guiding enterprise operations. For example, clothing enterprises must perceive fashion information as quickly and accurately as possible, because fashion trends are of vital importance to design, production and operation. Taking the text data stream of online goods as the research object and combining it with the real-time stream of online sales text, this paper defined a feature trend model for temporal text data streams, and then proposed a real-time mining algorithm for discovering feature trends in text data streams. The algorithm was applied to clothing sales descriptions to extract popular features; it can obtain effective fashion trends and provide decision support for enterprises in formulating production plans and selecting marketing strategies. Experimental results on real sales data from an e-commerce platform show that the algorithm achieves good accuracy at high speed. Therefore, the proposed algorithm has both theoretical and practical significance.
    Linear Discriminant Analysis of High-dimensional Data Using Random Matrix Theory
    LIU Peng, YE Bin
    Computer Science    2019, 46 (6A): 423-426.  
    Linear discriminant analysis (LDA) is an important theoretical and analytic tool for many machine learning and data mining tasks. As a parametric classification method, it performs well in many applications. However, LDA is impractical for the high-dimensional datasets that are now routinely generated throughout modern society. A primary reason for its inefficiency on high-dimensional data is that the sample covariance matrix is no longer a good estimator of the population covariance matrix when the dimension of the feature vector is close to, or even larger than, the sample size. Therefore, this paper proposed a regularization method for high-dimensional data classification based on random matrix theory. Firstly, a consistent estimate of the high-dimensional covariance matrix is obtained through rotation-invariant estimation and eigenvalue truncation. Secondly, the estimated covariance matrix is used to calculate the discriminant function value. Numerical experiments on artificial datasets, as well as real-world datasets such as microarray data, demonstrate that the proposed discriminant analysis method has wider applicability and yields higher accuracy than existing competitors.
    Educational Administration Data Mining of Association Rules Based on Domain Association Redundancy
    LU Xin-yun, WANG Xing-fen
    Computer Science    2019, 46 (6A): 427-430.  
    Because of the periodicity of teaching and changes in the teaching environment, educational administration data in colleges and universities have time-series characteristics and contain much association redundancy, which makes it difficult to discover efficient and interesting association rules. Although sequential pattern mining algorithms can mine time-series frequent itemsets, they cannot eliminate the association redundancy in educational administration data, so the utility and novelty of their mining results do not meet the requirements. Therefore, this paper proposed FUI_DK, an association rule mining algorithm based on domain association redundancy in the educational field. The FUI_DK algorithm generates frequent candidate itemsets with a sequential pattern mining algorithm, adds utility and interestingness measures on top of the support and confidence of classical association rule algorithms to obtain high-utility interesting itemsets, and sorts the qualifying association rules by support, confidence and utility, finally yielding rules with high utility and high interestingness. Experimental comparisons and analysis of the mining results were carried out on the educational administration data of a university. The results show that the FUI_DK algorithm has better time performance in mining university educational administration data, and its elimination rate of association rules already known in the field reaches 43%, which helps colleges and universities carry out time-saving and effective educational data mining.
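    A sketch of the three rule measures the FUI_DK description relies on: support and confidence from classical association mining, plus a simple utility score. The utility definition here (mean per-item weight) is an illustrative stand-in for the paper's measure.

```python
def support(transactions, itemset):
    """Fraction of transactions containing every item of the itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """P(consequent | antecedent) estimated from the transactions."""
    return support(transactions, antecedent | consequent) / support(transactions, antecedent)

def utility(itemset, weights):
    """Mean weight of the items; a placeholder utility measure."""
    return sum(weights.get(i, 0) for i in itemset) / len(itemset)
```

Ranking candidate rules by (support, confidence, utility) tuples then reproduces the sorting step described in the abstract.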
    Research on Population Prediction Based on Grey Prediction and Radial Basis Function Network
    XU Li-li, LI Hong, LI Jin
    Computer Science    2019, 46 (6A): 431-435.  
    For problems of economic growth and social stability, it is extremely important to predict population accurately. Therefore, this paper used the historical total population of Shandong Province to construct a grey prediction model and a radial basis function network model, each fitting the total population over the 20 years from 1995 to 2014. To overcome the limitations of the single models, the standard deviation method was used to redistribute weights over their forecast results, and a combination model was built on this basis. The results show that the accuracy of the combined forecasting model is higher than that of either the grey model or the radial basis function network model, and the combined model was used to make a short-term forecast of the total population from 2015 to 2025.
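    A small sketch of the standard-deviation weighting step mentioned above, under the common convention (an assumption here) that each single model's weight is proportional to the inverse of the standard deviation of its historical prediction errors.

```python
import statistics

def stdev_weights(errors_by_model):
    """Weight each model by 1/std of its past errors, normalized to sum to 1."""
    inv = {name: 1.0 / statistics.pstdev(errs) for name, errs in errors_by_model.items()}
    total = sum(inv.values())
    return {name: v / total for name, v in inv.items()}

def combine(forecasts, weights):
    """Weighted combination of the single models' forecasts."""
    return sum(weights[name] * f for name, f in forecasts.items())
```

The steadier model (smaller error spread) dominates the combination, which is why the combined forecast can beat both single models.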
    Vertical Analysis Based on Fault Data of Running Smart Meter
    LIU Zi-yi, LIU Qing, WANG Chong, WANG Ji-meng, WANG Yue, LIU Jin-shuo, YIN Ze-hao
    Computer Science    2019, 46 (6A): 436-438.  
    As smart meters are the main tool for electricity measurement and economic settlement, their failure rate is directly related to the national economy and people's livelihood. This paper devised a vertical analysis model for the fault data of running smart meters, which can analyze the operating failure rate data of smart meters from different manufacturers and batches. The model first cleans the useless data, then performs linear regression analysis on the basic data items to obtain each batch's failure rate and its rate of change, and finally clusters these values to evaluate the stability of each factory's quality. The method and the model's results can assess the quality of a batch of smart meters and help estimate factory quality.
    Research on Naive Bayes Ensemble Method Based on Kmeans++ Clustering
    ZHONG Xi, SUN Xiang-e
    Computer Science    2019, 46 (6A): 439-441.  
    Naive Bayes is widely applied because of its simplicity, high computational efficiency, high accuracy and solid theoretical foundation. Since diversity is a key condition for ensemble learning, this paper studied a method for improving the ensemble diversity of naive Bayes classifiers based on kmeans++ clustering, so as to improve the generalization performance of naive Bayes. Firstly, a number of naive Bayes classifier models are trained on a training sample set. To increase the diversity among the base classifiers, the kmeans++ algorithm is used to cluster the base classifiers' prediction results on the validation set. Finally, the base classifier with the best generalization performance is selected from each cluster for ensemble learning, and the final result is obtained by simple voting. UCI standard datasets are used to verify the algorithm, and its generalization performance is greatly improved.
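    A sketch of how the base classifiers' 0/1 prediction vectors on the validation set can be clustered for diversity. For determinism this uses the farthest-first simplification of kmeans++ seeding (a deliberate substitution for the randomized D²-sampling of real kmeans++), with Hamming distance between prediction vectors.

```python
def hamming(a, b):
    """Number of validation examples on which two classifiers disagree."""
    return sum(x != y for x, y in zip(a, b))

def seed_centres(vectors, k):
    """Deterministic farthest-first variant of kmeans++ seeding."""
    centres = [vectors[0]]
    while len(centres) < k:
        # pick the prediction vector farthest from all chosen centres
        far = max(vectors, key=lambda v: min(hamming(v, c) for c in centres))
        centres.append(far)
    return centres
```

Grouping classifiers around such spread-out centres, then keeping the best one per group, is exactly the "diverse subset" selection the abstract describes.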
    Persona Based Social User Modeling Using KD-Tree
    WAN Jia-shan, CHEN Lei, WU Jin-hua, GAO Chao
    Computer Science    2019, 46 (6A): 442-445.  
    Traditional information push services take little account of the specific needs of social network users in particular situations, so their recommendations are poorly targeted and their conversion rates are low. In response to these problems, this paper proposed an intelligent push method based on user personas. By analyzing user data from intelligent learning platforms, a KNN clustering algorithm implemented with a KD-Tree is used to analyze user preferences and behavior characteristics and then divide users into categories. First, through cluster-centre analysis, each type of user is abstracted into a highly refined short text that forms a representative label. Second, based on the label weights of individual users and their different service demands, the user personas are modeled a second time for refinement. Finally, recommendations are made with a collaborative filtering algorithm. User personas enhance the usability and value of user data; in addition, they free analysts from large volumes of data and support fine-grained classification and thus more accurate recommendations.
    Research on Sales Forecast of Prophet-LSTM Combination Model
    GE Na, SUN Lian-ying, SHI Xiao-da, ZHAO Ping
    Computer Science    2019, 46 (6A): 446-451.  
    Predicting the short-term or long-term changes in the sales volume of a product has important reference value for enterprises formulating marketing strategies and optimizing their industrial layout. After deeply analyzing the characteristics of the Prophet additive model and the LSTM neural network, this paper built a Prophet-LSTM combination model for forecasting sales based on the time-series data of a company's product sales, and designed and implemented comparison experiments against the standalone Prophet and LSTM models and two typical time-series prediction models. The experimental results show that the Prophet-LSTM combination model has stronger applicability and higher accuracy in the time-series analysis of sales volume, providing an important scientific basis for the company to respond to changes in market demand.
    Clustering Method Based on Hypergraph Markov Relaxation
    GUO Peng, LI Ren-fa, HU Hui
    Computer Science    2019, 46 (6A): 452-456.  
    How to embed high-dimensional spatio-temporal features into a low-dimensional semantic bag-of-words is a typical clustering problem in the internet of vehicles. Spectral clustering algorithms have recently attracted attention because of their simple computation and globally optimal solutions; however, research on determining the number of clusters remains relatively scarce. The traditional eigengap heuristic works well when the clusters in the data are very well pronounced, but the noisier or more overlapping the clusters are, the less effective it is. This paper proposed a clustering method based on hypergraph Markov relaxation (HS-MR). The basic idea is to describe the hypergraph formally with a Markov process and start a random walk on it. During the relaxation of the hypergraph Markov chain, a meaningful geometric distribution of the dataset is found through the t-th power of the random transition matrix P and diffusion mapping. Then, an objective function based on mutual information is proposed to automatically converge on the number of clusters. Experimental results show that the algorithm outperforms both simple-graph spectral clustering and hypergraph spectral clustering in accuracy.
    Density Peak Clustering Algorithm Based on Grid Data Center
    LI Xiao-guang, SHAO Chao
    Computer Science    2019, 46 (6A): 457-460.  
    A density peak clustering algorithm based on grid data centers was proposed, which reduces the computational complexity of the clustering process by meshing the dataset. Firstly, the data space is divided into grids of the same size; the density value of each grid is the number of data objects it contains plus the decayed counts of the data objects in its adjacent grids, and the distance value of each grid is defined as the nearest distance from its data center to the data center of another grid with higher density. Then, the cluster-center grids are found, since these grids always have both a high density value and a large distance value. Finally, a density-based division approach completes the clustering. Simulation experiments performed on UCI and artificial datasets show that the algorithm can effectively cluster large-scale data in a short time with high clustering accuracy.
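    A sketch of the per-grid statistics described above: each cell's density is its own point count plus decayed counts from its eight neighbours, and its "distance" is the gap to the nearest denser cell (the densest cell gets the maximum gap). The decay factor 0.5 is an illustrative choice.

```python
def grid_stats(counts, decay=0.5):
    """counts: {(i, j): points in cell}; returns {(i, j): (density, distance)}."""
    density = {}
    for (i, j), n in counts.items():
        nbr = sum(counts.get((i + di, j + dj), 0)
                  for di in (-1, 0, 1) for dj in (-1, 0, 1)
                  if (di, dj) != (0, 0))
        density[(i, j)] = n + decay * nbr

    def dist(a, b):
        return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

    stats = {}
    for cell, d in density.items():
        higher = [c for c in density if density[c] > d]
        if higher:
            delta = min(dist(cell, c) for c in higher)
        else:  # densest cell: assign the largest distance so it stands out
            others = [c for c in density if c != cell]
            delta = max(dist(cell, c) for c in others) if others else 0.0
        stats[cell] = (d, delta)
    return stats
```

Cells scoring high on both values are the cluster centres; every other cell is then assigned to the same cluster as its nearest denser neighbour.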
    Personalized Learning Resource Recommendation Method Based on Three-dimensional Feature Cooperative Domination
    LI Hao-jun, ZHANG Zheng, ZHANG Peng-wei
    Computer Science    2019, 46 (6A): 461-467.  
    Personalized recommendation is becoming an important form of service in the information era and an effective way to alleviate knowledge disorientation and improve learning efficiency. To meet learners' personalized needs for online learning resources, personalized recommendation technology is increasingly important. Therefore, this paper proposed a personalized learning resource recommendation method based on three-dimensional feature cooperative domination (TPLRM). Firstly, a recommendation model based on three-dimensional feature cooperative domination is constructed, the resource recommendation feature parameters are improved, and a fitness function is built. Secondly, a binary particle swarm optimization algorithm with fuzzy control based on a Gaussian membership function (FCBPSO) is used to solve the model. Finally, an evaluation target system is established. Five groups of comparative experiments verify that the TPLRM method has better recommendation performance.
    Hybrid Recommendation Algorithm Based on SVD Filling
    LIU Qing-qing, LUO Yong-long, WANG Yi-fei, ZHENG Xiao-yao, CHEN Wen
    Computer Science    2019, 46 (6A): 468-472.  
    With the development of internet technology, the problem of information overload is becoming increasingly serious, and recommendation systems are an effective means of alleviating it. Focusing on the low recommendation efficiency caused by sparse data and cold start in collaborative filtering, this paper proposed a hybrid recommendation algorithm based on SVD filling. Firstly, singular value decomposition is used to factorize the user-item score matrix, and the sparse matrix is filled by stochastic gradient descent. Secondly, time weights are added to optimize the user similarity in the user matrix, and Jaccard coefficients are added to optimize the item similarity in the item matrix. Then, item-based and user-based collaborative filtering are combined to calculate prediction scores and select the optimal items. Finally, the proposed algorithm is compared with existing algorithms on the Movielens and Jester datasets, and the experimental results verify its effectiveness.
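    A sketch of the two similarity tweaks named above: an exponential time weight that favours recent ratings in the user-user similarity, and a Jaccard factor that discounts item pairs rated by few common users. The exponential form and decay rate are illustrative assumptions; the paper only states that time weights and Jaccard coefficients are used.

```python
import math

def time_weight(rating_day, now_day, decay=0.01):
    """Recent ratings get weight near 1; old ratings decay exponentially."""
    return math.exp(-decay * (now_day - rating_day))

def jaccard(users_a, users_b):
    """Overlap of the user sets that rated item a and item b."""
    inter = len(users_a & users_b)
    union = len(users_a | users_b)
    return inter / union if union else 0.0
```

Multiplying a base similarity (e.g. cosine over latent factors) by these factors shifts recommendations toward recent behaviour and well-supported item pairs.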
    Online Learning Nonnegative Matrix Factorization
    HE Xiao-wen, HU Yi-fei, WANG Hai-ping, CHEN Mo
    Computer Science    2019, 46 (6A): 473-477.  
    This paper proposed a new online form of nonnegative matrix factorization, namely online learning nonnegative matrix factorization (OLNMF). The OLNMF algorithm uses an incremental form of the non-smooth model and adopts an "amnesic average" method to control the weights of new and old samples, improving computational efficiency and reducing computational complexity. The algorithm can handle large datasets with real-time updates and extracts a sparser basis matrix. Compared with INMF, ONMFO and Lp-INMF, experiments on face databases show that the proposed method achieves better sparsity, and an SVM classifier based on OLNMF achieves better classification accuracy on an EEG database.
    Method of Short Text Classification Based on Frequent Item Feature Extension
    JIN Yi-fan, FU Ying-xun, MA Li
    Computer Science    2019, 46 (6A): 478-481.  
    Short texts are characterized by high feature dimensionality and sparsity, so traditional classification methods are not effective on them. To solve this problem, a short text classification method based on frequent item feature extension, called STCFIFE, was proposed. First, frequent itemsets in a background corpus are mined with the FP-growth algorithm, and the extended feature weights are calculated in combination with contextual association features. Then the new features are added to the feature space of the original short text, and an SVM (Support Vector Machine) classifier is trained on this basis for classification. The experimental results show that, compared with the traditional SVM algorithm and the LDA+KNN algorithm, STCFIFE can effectively alleviate the feature deficiency and high-dimensional sparsity problems of short texts, improving the F1 value by 2%~10% and thus the classification effect.
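    A simplified sketch of the extension step: find terms that frequently co-occur with a short text's words in a background corpus and append them as extra features. Plain pair counting stands in here for the FP-growth mining the paper actually uses.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(corpus, min_count=2):
    """Frequent word pairs (sorted tuples) across the background corpus."""
    pairs = Counter()
    for doc in corpus:
        for a, b in combinations(sorted(set(doc.lower().split())), 2):
            pairs[(a, b)] += 1
    return {p for p, c in pairs.items() if c >= min_count}

def extend(short_text, frequent_pairs):
    """Add each frequent partner of a present word as an extra feature."""
    words = set(short_text.lower().split())
    extra = set()
    for a, b in frequent_pairs:
        if a in words and b not in words:
            extra.add(b)
        elif b in words and a not in words:
            extra.add(a)
    return words | extra
```

The enriched feature set is what gives the downstream SVM something to separate when the original short text has only two or three informative tokens.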
    Boundary Distance Algorithm for Determining Sliding Window Size
    PENG Cheng, HE Jing, CHI Hao
    Computer Science    2019, 46 (6A): 482-487.  
    Because the original measurement data collected by most equipment carry a large amount of information at high density, existing time-series sliding-window dimension reduction methods, which determine the window size from empirical values, cannot fully retain the important information points of the data and have high computational complexity. To this end, the influence of the sliding window on time-series similarity techniques in practical applications was discussed, and an algorithm for determining the initial size of the sliding window was proposed. Upper and lower boundary curves with a higher degree of fit are constructed, and a trend weighting is introduced into the LB_Hust distance calculation, which reduces the difficulty of mathematical modeling and improves the efficiency of equipment data similarity classification and state evaluation.
    Matrix Factorization Recommendation Algorithm Based on Adaptive Weighted Samples
    SHI Xiao-ling, CHEN Zhi, YANG Li-gong, SHEN Wei
    Computer Science    2019, 46 (6A): 488-492.  
    Missing value estimation for sparse matrices is necessary basic research and is particularly important in practical applications such as recommendation systems. Among the many methods for this problem, one of the most effective is Matrix Factorization (MF). However, the traditional MF algorithm has a limitation: it directly fits the elements of the sparse matrix by regression, without taking into account that individual samples differ in how hard they are to fit and should be treated accordingly. Addressing this limitation, this paper proposed a matrix factorization recommendation algorithm based on adaptively weighted samples (AWS-MF). Building on the traditional MF algorithm, the proposed method exploits the differences among training samples and gives each sample its own weight. To improve the performance and robustness of the model, the intermediate results are combined in the final step to obtain the final predictions. Comprehensive experiments on real-world datasets demonstrate that the proposed AWS-MF algorithm adaptively re-weights samples according to the differences among them, and that treating samples individually leads to promising performance compared with the baseline methods.
    Research on Recommendation Application Based on Seq2seq Model
    CHEN Jun-hang, XU Xiao-ping, YANG Heng-hong
    Computer Science    2019, 46 (6A): 493-496.  
    Enormous amounts of information surround us daily, which requires recommender systems to filter out the pure gold. Traditional recommender systems are static and lack research on the long- and short-term dependencies in the data. Considering the outstanding performance of recurrent neural networks in handling sequence data, a recommender system based on the seq2seq model was built. The recommendation process can be viewed as sequence translation or answer generation: the model uses past user-item interaction sequences to learn the inherent frequent patterns and then predicts other users' actions on items. Two datasets commonly used for recommender system testing were used in the experiments, with results measured by BLEU. The results show that the method can perform sequence recommendation. The model needs only the interaction data between users and items and dispenses with the rating matrix, thus avoiding the sparsity problem.
    Bus Short-term Dynamic Dispatch Algorithm Based on Real-time GPS
    ZHANG Shu-yu, DONG Da, XIE Bing, LIU Kai-gui
    Computer Science    2019, 46 (6A): 497-501.  
    This paper analyzed the limitations of traditional static bus dispatching. Using the real-time GPS data of buses in operation, and analyzing the bus operation mechanism under heavy traffic jams and sudden increases in passenger flow, this paper presented a new short-term dynamic bus dispatching algorithm based on a neural network. Simulations on bus lines in Guiyang show that the proposed algorithm can effectively overcome the shortcomings of traditional static dispatching and reduce the interference of human factors in manual scheduling, realizing automated and intelligent bus dispatching.
    User Interest Recommendation Model Based on Context Awareness
    LI Jian-jun, HOU Yue, YANG Yu
    Computer Science    2019, 46 (6A): 502-506.  
    Abstract253)      PDF(pc) (2142KB)(1502)       Save
    With the development and popularization of electronic commerce and the Internet,user-oriented personalized recommendation is receiving more and more attention.Traditional user interest models only consider the user’s own behavior towards items,while ignoring the user’s context at the time.To address this problem,this paper proposed a user interest model based on context awareness.It combines the user’s browsing behavior with contextual factors,deeply mines the user’s interest in items from both aspects,and clarifies the user’s attitude towards items,so that users can be clustered accurately and recommendations can be made to target users based on the clustering results.Experimental results show that the recommendation accuracy of this model is higher than that of other traditional recommendation algorithms.It can better mine users’ interests,adapt to changes in those interests,alleviate the information-overload problem that leaves users unable to choose,and improve user satisfaction.It is therefore worthwhile to mine users’ hidden information from multiple perspectives so as to provide better personalized recommendations.
    Reference | Related Articles | Metrics
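    One way the combination of browsing behavior and contextual factors could be realized, before clustering, is a weighted blend of two similarities. This is a minimal sketch under assumptions of our own: the feature layout, the blend weight `alpha`, and the field names are illustrative, not taken from the paper.

    ```python
    import math

    def cosine(a, b):
        """Cosine similarity between two equal-length numeric vectors."""
        num = sum(x * y for x, y in zip(a, b))
        den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return num / den if den else 0.0

    def user_similarity(u, v, alpha=0.7):
        """Blend similarity over behavior vectors (item ratings/clicks)
        with similarity over context vectors (time slot, device,
        location one-hots); alpha weights the two aspects."""
        return alpha * cosine(u["behavior"], v["behavior"]) + \
               (1 - alpha) * cosine(u["context"], v["context"])
    ```

    A clustering step (e.g. k-means over the blended similarity) would then group users, and recommendations for a target user would be drawn from the same cluster.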
    Decision Making of Course Selection Oriented by Knowledge Recommendation Service
    ZHANG Wei-guo
    Computer Science    2019, 46 (6A): 507-510.  
    Abstract260)      PDF(pc) (1813KB)(597)       Save
    Facing the rapid development of the Internet and the massive information resources on the Web,it is urgent to enable users to quickly find the information they want;hence course selection oriented to knowledge recommendation service arises.It is a core issue in personalized recommendation research.Based on the Apriori algorithm for association rules,this method uses the traditional collaborative filtering recommendation algorithm to improve Apriori.Combining students’ majors,hobbies and academic records,it constructs a course recommendation system model and analyzes the personalized recommendation algorithm based on this model.Through data mining in the students’ academic record database,it guides students to choose more suitable courses and helps them learn efficiently and develop their personal strengths.
    Reference | Related Articles | Metrics
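    As a sketch of the Apriori side of this method (the collaborative-filtering improvement is not reproduced here), classic level-wise frequent itemset mining over course-selection records might look like this; the course names and support threshold are made up for illustration.

    ```python
    def apriori(transactions, min_support):
        """Level-wise frequent itemset mining (Apriori): candidates of
        size k are unions of frequent sets of size k-1, and candidates
        below the support threshold are pruned at each level."""
        n = len(transactions)
        freq = {}
        k_sets = [frozenset([i]) for i in {i for t in transactions for i in t}]
        while k_sets:
            counts = {c: sum(1 for t in transactions if c <= t) for c in k_sets}
            level = {c: m for c, m in counts.items() if m / n >= min_support}
            freq.update(level)
            prev = list(level)
            k = len(prev[0]) + 1 if prev else 0
            k_sets = list({a | b for a in prev for b in prev if len(a | b) == k})
        return freq

    # toy course-selection records (one set of courses per student)
    records = [{"math", "ai"}, {"math", "ai", "db"}, {"math", "db"}]
    frequent = apriori(records, min_support=0.6)
    ```

    Association rules such as "students who took math also took ai" are then read off the frequent itemsets and matched against a student's major, hobbies and academic record.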
    Personalized Recommendation Algorithm Based on PageRank and Spectral Method
    CHANG Jia-wei, DAI Mu-hong
    Computer Science    2018, 45 (11A): 398-401.  
    Abstract223)      PDF(pc) (1797KB)(1324)       Save
    The traditional PageRank recommendation algorithm scales poorly.To solve this problem,a personalized recommendation algorithm based on PageRank and a spectral method was proposed.The number of iterations is controlled by bounding the number of nodes in the PageRank algorithm,and a threshold is used to trim the nodes participating in the iteration,yielding the candidate node set.Spectral clustering is then used to rank the candidate nodes:the candidate node adjacency matrix is normalized,and the eigenvalues and eigenvectors of the matrix are used to evaluate the distance between candidate nodes and target nodes in the graph.Finally,a recommendation list is produced.Experimental results show that the proposed algorithm improves processing efficiency while maintaining recommendation quality.
    Reference | Related Articles | Metrics
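    The candidate-generation half of such an approach can be sketched as personalized PageRank, i.e. power iteration with restarts at the target user's node; the spectral ranking step is omitted here. The restart interpretation, damping factor and toy graph are our own assumptions, not details from the paper.

    ```python
    def personalized_pagerank(adj, source, damping=0.85, iters=50):
        """Power iteration with restart at `source`. `adj` maps each node
        to its out-neighbour list; the returned ranking supplies the
        candidate node set for a later re-ranking step."""
        rank = {n: 1.0 if n == source else 0.0 for n in adj}
        for _ in range(iters):
            nxt = {n: (1 - damping) if n == source else 0.0 for n in adj}
            for n, out in adj.items():
                if not out:
                    continue  # dangling node: its mass is dropped in this sketch
                share = damping * rank[n] / len(out)
                for m in out:
                    nxt[m] += share
            rank = nxt
        return sorted(rank, key=rank.get, reverse=True)

    graph = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
    ranking = personalized_pagerank(graph, "a")
    ```

    Keeping only the top-scoring nodes (the threshold trimming described above) bounds the work done in later iterations and in the spectral step.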
    Service Recommendation Method Based on Social Network Trust Relationships
    WANG Jia-lei, GUO Yao, LIU Zhi-hong
    Computer Science    2018, 45 (11A): 402-408.  
    Abstract258)      PDF(pc) (2442KB)(1399)       Save
    With the advent of service computing,many different electronic services have emerged.Users often have to find what they need among a huge number of services,which is a formidable task;an efficient recommendation algorithm is therefore necessary.Traditional collaborative recommendation systems suffer from cold start,data sparsity and poor real-time performance,which lead to poor recommendations when rating data are scarce.To obtain better results,this paper introduced trust transfer in social networks and used it to build a trust transfer model that derives trust between users.In addition,the similarity between users is calculated from the rating data.Based on users’ trust and preference similarity,and according to the characteristics of social networks,trust and preference are dynamically combined into a comprehensive recommendation weight,which replaces the traditional similarity measure in user-based collaborative filtering.The method was verified on the Epinions data set and further improves the recommendation effect.
    Reference | Related Articles | Metrics
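    A minimal sketch of the two ingredients: trust transferred along social paths, and a weighted combination with rating similarity. Multiplying edge trust along the strongest path is a common trust-transfer assumption, and the blend weight `beta` is illustrative; neither is necessarily the paper's exact formulation.

    ```python
    def transferred_trust(trust, src, dst, max_hops=3):
        """Propagate trust from src to dst along the strongest path of at
        most max_hops edges, multiplying edge trust values.
        `trust` maps user -> {neighbour: direct trust in [0, 1]}."""
        best, frontier = 0.0, [(src, 1.0)]
        for _ in range(max_hops):
            nxt = []
            for node, t in frontier:
                for nb, w in trust.get(node, {}).items():
                    v = t * w
                    if nb == dst:
                        best = max(best, v)
                    else:
                        nxt.append((nb, v))
            frontier = nxt
        return best

    def recommend_weight(trust_val, similarity, beta=0.5):
        """Comprehensive recommendation weight: a dynamic combination of
        transferred trust and rating-based similarity."""
        return beta * trust_val + (1 - beta) * similarity
    ```

    The combined weight then takes the place of plain similarity when selecting neighbours in user-based collaborative filtering, so that users with little rating overlap can still be reached through trusted social links.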
    Association Rule Mining Algorithm Based on Hadoop
    DING Yong, ZHU Chang-shui, WU Yu-yan
    Computer Science    2018, 45 (11A): 409-411.  
    Abstract270)      PDF(pc) (2936KB)(808)       Save
    The traditional parallel association rule algorithm defines a MapReduce job for each iteration to generate and count the candidate set,but repeatedly starting MapReduce jobs incurs a large performance overhead.This paper defined a parallel association rule mining algorithm (PST-Apriori).The algorithm adopts a partition strategy:each distributed computing node maintains a prefix shared tree (PST),and the candidate itemsets generated from each transaction T are compressed into the PST.A breadth-first traversal then takes the 〈key,value〉 pair of each node as input to the map function,and the MapReduce framework automatically groups the pairs by key.Finally,the reduce function aggregates the results of multiple tasks to obtain the frequent itemsets satisfying the minimum support threshold.The algorithm uses only two MapReduce jobs,and the PST is sorted by key to facilitate the shuffle operation on the mapper side,which improves efficiency.
    Reference | Related Articles | Metrics
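    The map/shuffle/reduce flow underlying such counting can be simulated in plain Python. This sketch omits the prefix shared tree and the Hadoop runtime and only shows the 〈key,value〉 contract: mappers emit candidate itemsets with count 1, the shuffle groups by key, and the reducer sums. All names and the toy data are illustrative.

    ```python
    from collections import defaultdict
    from itertools import combinations

    def map_phase(transaction, k):
        """Mapper: emit <candidate k-itemset, 1> for every k-subset of a
        transaction (what the PST would hold in compressed form)."""
        return [(frozenset(c), 1) for c in combinations(sorted(transaction), k)]

    def reduce_phase(pairs):
        """Shuffle groups the emitted pairs by key; the reducer then sums
        the counts for each candidate itemset."""
        grouped = defaultdict(int)
        for key, value in pairs:
            grouped[key] += value
        return dict(grouped)

    transactions = [{"a", "b", "c"}, {"a", "b"}, {"b", "c"}]
    emitted = [p for t in transactions for p in map_phase(t, 2)]
    counts = reduce_phase(emitted)
    ```

    Filtering `counts` against the minimum support threshold yields the frequent itemsets; sorting emitted pairs by key, as the PST does, is what makes the shuffle cheap.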
    Collaborative Filtering Algorithm Based on User’s Preference for Items and Attributes
    WANG Yun-chao, LIU Zhen
    Computer Science    2018, 45 (11A): 412-416.  
    Abstract301)      PDF(pc) (2154KB)(702)       Save
    Collaborative filtering is one of the most successful and useful technologies in recommendation systems.Cosine similarity and the Pearson correlation coefficient are the two most widely used traditional measures for calculating similarity in collaborative filtering.To reduce their error,an improved collaborative filtering recommendation algorithm was proposed that addresses the disadvantages of these two traditional similarity measures.Two parameters were introduced:one accounts for users’ rating habits,and the other measures the difference between the items users choose.Since a user’s preference is related to item attributes,a further parameter was designed to measure this preference.The new algorithm combines the improved traditional measures with the user’s preference for attributes.Experiments on the MovieLens dataset show that the proposed algorithm achieves lower mean absolute error (MAE) and root mean square error (RMSE),and performs better than the traditional algorithms thanks to the two parameters.
    Reference | Related Articles | Metrics
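    The general shape of the two ideas (offsetting rating habits, and an attribute-preference signal) can be sketched as follows. Mean-centering is one common way to correct for strict versus generous raters, not necessarily the paper's exact parameters, and the attribute helper is an illustrative assumption.

    ```python
    import math

    def adjusted_pearson(ra, rb):
        """Pearson correlation over co-rated items, with each user's mean
        rating (over all their ratings) subtracted to offset individual
        rating habits. ra, rb map item -> rating."""
        common = set(ra) & set(rb)
        if len(common) < 2:
            return 0.0
        ma = sum(ra.values()) / len(ra)
        mb = sum(rb.values()) / len(rb)
        num = sum((ra[i] - ma) * (rb[i] - mb) for i in common)
        da = math.sqrt(sum((ra[i] - ma) ** 2 for i in common))
        db = math.sqrt(sum((rb[i] - mb) ** 2 for i in common))
        return num / (da * db) if da and db else 0.0

    def attribute_preference(user_items, item_attrs):
        """User's preference distribution over item attributes (e.g. movie
        genres), from the attributes of items they interacted with."""
        pref = {}
        for it in user_items:
            for a in item_attrs.get(it, []):
                pref[a] = pref.get(a, 0) + 1
        total = sum(pref.values())
        return {a: n / total for a, n in pref.items()} if total else {}
    ```

    A combined score would weight `adjusted_pearson` between users with the overlap of their attribute-preference distributions before selecting neighbours.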