List of accepted submissions

  • Open access
  • 94 Reads
Adaptive Exploration in Stochastic Multi-armed Bandit Problem

The multi-armed bandit (MAB) problem is a classic instance of the exploration-versus-exploitation dilemma in reinforcement learning. As an archetypal MAB problem, the stochastic multi-armed bandit (SMAB) problem forms the basis of many newer MAB problems. To address the weak theoretical analysis and the reliance on a single source of information in existing SMAB methods, this paper presents "the Chosen Number of Arm with Minimal Value" (CNAMV), a method that balances exploration and exploitation adaptively. Theoretically, an upper bound on CNAMV's regret is proved, where regret is the loss incurred because the globally optimal policy is not followed at all times. Experimental results show that CNAMV yields greater reward and smaller regret, with higher efficiency, than commonly used methods such as ε-greedy, softmax, and UCB1. Therefore, CNAMV can serve as an effective SMAB method.
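Since the abstract benchmarks against UCB1, a minimal sketch of that baseline may help fix the notions of reward and regret; CNAMV itself is not reproduced because its update rule is not given here, and the Bernoulli arm means and horizon below are illustrative assumptions.

    import math
    import random

    def ucb1(arm_means, horizon=10000):
        """Minimal UCB1 baseline on Bernoulli arms (illustrative only)."""
        n_arms = len(arm_means)
        counts = [0] * n_arms          # pulls per arm
        values = [0.0] * n_arms        # empirical mean reward per arm
        total_reward = 0.0
        for t in range(1, horizon + 1):
            if t <= n_arms:            # play each arm once to initialise
                arm = t - 1
            else:                      # otherwise pick the arm with the highest upper confidence bound
                arm = max(range(n_arms),
                          key=lambda a: values[a] + math.sqrt(2 * math.log(t) / counts[a]))
            reward = 1.0 if random.random() < arm_means[arm] else 0.0
            counts[arm] += 1
            values[arm] += (reward - values[arm]) / counts[arm]
            total_reward += reward
        regret = horizon * max(arm_means) - total_reward   # loss vs. always playing the best arm
        return total_reward, regret

    print(ucb1([0.3, 0.5, 0.7]))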

  • Open access
  • 133 Reads
Study on Optimal Control Strategy of Automatic Transmission Based on Policy Search

An automatic transmission shifts gears automatically according to engine power output and environmental conditions. Reducing shift jerk and improving shift quality remain a challenge. A reinforcement learning policy search algorithm for the automatic transmission shift process is proposed. First, the algorithm learns a preliminary strategy from a fixed set of environments. Second, the agent interacts with the environment and learns the optimal control strategy online. Finally, to verify the performance of the algorithm, a simulation study of the shift process under different conditions is carried out. The simulation results demonstrate that shift jerk can be significantly reduced by applying the optimal control strategy.
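The abstract does not specify the controller, states, or reward model, so the following is only a generic policy-search skeleton: a tabular softmax policy over discrete shift actions updated with a REINFORCE-style gradient. The toy environment step and the reward (a crude stand-in for negative shift jerk) are assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    n_states, n_actions = 4, 3                 # e.g. discretised speed/throttle states, shift actions
    theta = np.zeros((n_states, n_actions))    # policy parameters

    def policy(state):
        """Softmax policy over actions for one state."""
        logits = theta[state]
        p = np.exp(logits - logits.max())
        return p / p.sum()

    def env_step(state, action):
        """Placeholder environment: reward is higher when the action matches a hidden target."""
        target = state % n_actions
        return 1.0 - 0.5 * abs(action - target)   # stand-in for low shift jerk

    alpha = 0.1                                # learning rate
    for episode in range(2000):
        state = int(rng.integers(n_states))
        p = policy(state)
        action = int(rng.choice(n_actions, p=p))
        reward = env_step(state, action)
        # REINFORCE gradient for a one-step episode: grad log pi(a|s) * reward
        grad = -p
        grad[action] += 1.0
        theta[state] += alpha * reward * grad

    print(policy(0))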

  • Open access
  • 131 Reads
Building Domain-Specific Sentiment Lexicon by Sentiment Seed Expansion

Sentiment word extraction is one of the most important subtasks of sentiment analysis. However, most sentiment analysis tasks rely on manually extracted emotion words, which requires considerable manual intervention. To overcome this problem, this paper presents a framework that automatically expands a domain-specific sentiment lexicon by extracting sentiment seeds from a large domain corpus of user comments in the automotive field. We use word2vec to learn word embeddings and label a small amount of data as positive or negative to form the training set. We extract domain-specific sentiment seed embeddings from the training set by calculating the sentiment score of each word. Afterwards, these sentiment seeds are expanded into a large-scale automotive-specific sentiment lexicon using synonyms and the k-means algorithm, without any manual annotation. Experimental results show that our approach obtains a large number of new domain-specific sentiment words in the automotive field, and that our lexicon outperforms a universal sentiment lexicon on sentiment classification of user comments.
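A rough sketch of the expansion step described above, assuming a tokenised comment corpus; the toy corpus, the hand-labelled seeds, the neighbour and cluster counts are all placeholders, and the seed-scoring and synonym-lookup parts of the pipeline are omitted.

    from gensim.models import Word2Vec
    from sklearn.cluster import KMeans

    # toy tokenised corpus standing in for the automotive comment corpus
    corpus = [["engine", "smooth", "quiet", "love"],
              ["gearbox", "jerky", "noisy", "hate"],
              ["seats", "comfortable", "spacious", "good"]]

    model = Word2Vec(sentences=corpus, vector_size=50, window=3, min_count=1, epochs=50)

    seeds = {"love": 1, "good": 1, "hate": -1, "jerky": -1}   # seed polarities (placeholder)

    # expand each seed with its nearest neighbours in the embedding space
    expanded = {}
    for word, polarity in seeds.items():
        for neighbour, _ in model.wv.most_similar(word, topn=3):
            expanded.setdefault(neighbour, polarity)

    # cluster all candidate words; clusters dominated by one polarity extend the lexicon
    words = list(model.wv.index_to_key)
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(model.wv[words])
    print(expanded, dict(zip(words, labels)))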

  • Open access
  • 121 Reads
Video Description with Spatio-temporal Feature and Knowledge Transferring

Describing open-domain video with natural language is a major challenge in computer vision. In this paper, we investigate how to use temporal information and learn linguistic knowledge for video description. Traditional convolutional neural networks (CNNs) learn powerful spatial features from videos but ignore the underlying temporal information. To address this, we extract SIFT flow features to capture temporal information. The sequence generators in recent work are trained solely on text from video description datasets, so the generated sequences tend to show linguistic irregularities associated with a restricted language model and a small vocabulary. To address this, we transfer knowledge from large text corpora and employ word2vec as the word representation. The experimental results demonstrate that our model outperforms related work.
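The knowledge-transfer step, initialising the generator's word representation from word2vec vectors trained on a large external corpus, can be sketched roughly as below; the vector file path, the tiny caption vocabulary, and the fallback initialisation are assumptions, and the caption generator itself is not shown.

    import numpy as np
    from gensim.models import KeyedVectors

    # pretrained vectors from a large external corpus (file path is a placeholder)
    kv = KeyedVectors.load_word2vec_format("GoogleNews-vectors-negative300.bin", binary=True)

    vocab = ["a", "man", "is", "riding", "horse", "<unk>"]   # caption vocabulary (illustrative)
    dim = kv.vector_size
    embedding = np.zeros((len(vocab), dim), dtype=np.float32)
    for i, word in enumerate(vocab):
        if word in kv:                      # copy the pretrained vector when available
            embedding[i] = kv[word]
        else:                               # otherwise fall back to a small random vector
            embedding[i] = np.random.uniform(-0.05, 0.05, dim)

    # `embedding` would then initialise the word-embedding layer of the sequence generator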

  • Open access
  • 134 Reads
Fusing Augmented Spatio-temporal Features for Action Recognition

Visual features are vitally important for action recognition in videos. However, traditional features fail to recognize actions effectively for two reasons: on one hand, spatial features are not powerful enough to capture the appearance of complex video actions; on the other hand, important temporal details are often lost during pooling and encoding. In this paper, we present a new architecture that fuses multiple augmented spatio-temporal features. To strengthen the spatial features, we apply cropping and horizontal flipping to the original frame images. We then feed these processed images into a deep two-stream network to produce robust spatial representations. To obtain powerful temporal features, we employ the Fourier temporal pyramid (FTP) to capture three levels of video context: short-term, medium-range, and global-range. Finally, we fuse these augmented spatio-temporal features using canonical correlation analysis (CCA), which is able to capture the correlation between the features. Experimental results on the UCF101 dataset show that our method achieves excellent performance for action recognition.
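As an illustration of the CCA fusion stage only, assuming the augmented spatial and FTP-pooled temporal descriptors have already been extracted; the feature dimensions, sample count, and number of components below are made up.

    import numpy as np
    from sklearn.cross_decomposition import CCA

    rng = np.random.default_rng(0)
    n_videos = 200
    spatial = rng.normal(size=(n_videos, 128))    # augmented spatial features (placeholder)
    temporal = rng.normal(size=(n_videos, 64))    # FTP-pooled temporal features (placeholder)

    cca = CCA(n_components=32)
    spatial_c, temporal_c = cca.fit_transform(spatial, temporal)

    # fuse the correlated projections into one video-level descriptor
    fused = np.concatenate([spatial_c, temporal_c], axis=1)
    print(fused.shape)    # (200, 64)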

  • Open access
  • 73 Reads
Attention-based CNNs for Aspect-level Sentiment Classification

Extracting the different emotions expressed towards different aspects in user comments is a fundamental task of sentiment analysis. For example, in "I like apple, but hate banana.", the polarity for the aspect apple is positive, while that for banana is negative. Aspect-level sentiment classification has therefore attracted considerable attention in recent years. In this paper, we present a new framework for aspect-level sentiment classification based on attention-based convolutional neural networks. The attention mechanism focuses on different aspects within a sentence and extracts their respective polarities. Experimental results on the SemEval 2014 dataset show that our model achieves state-of-the-art performance on aspect-level sentiment classification.
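A bare-bones view of dot-product attention between an aspect vector and per-word features such as a CNN would produce; the dimensions and random features are placeholders, and the convolutional layers and classifier of the actual model are omitted.

    import numpy as np

    def aspect_attention(word_feats, aspect_vec):
        """Weight word features by their relevance to the given aspect."""
        scores = word_feats @ aspect_vec                 # one score per word
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                         # softmax over words
        return weights, weights @ word_feats             # attended sentence vector

    rng = np.random.default_rng(0)
    word_feats = rng.normal(size=(8, 100))   # per-word features, e.g. "I like apple but hate banana ."
    apple_vec = rng.normal(size=100)         # embedding of aspect "apple" (placeholder)
    weights, sent_vec = aspect_attention(word_feats, apple_vec)
    print(weights.round(2))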

  • Open access
  • 115 Reads
Descriptors Based on Continuous Indicator Fields for 3D-QSAR Studies

CIF descriptors are based on the concept of Continuous Indicator Fields (CIF) [1], a particular case of Continuous Molecular Fields [2,3]. Each CIF descriptor is defined by an isotropic Gaussian function centered at a specific point in physical space. The positions of these points can be chosen by applying hierarchical cluster analysis to the Cartesian coordinates of all atoms in all molecules of the aligned training set. The value of a CIF descriptor for a molecule is equal to the overlap integral between this function and the sum of analogous Gaussian functions centered on all atoms of the molecule. The resulting matrix of CIF descriptors can be used to build 3D QSAR models.
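The abstract does not spell out the Gaussian width or normalisation, and the project itself provides R scripts; the numpy fragment below is only an illustration of the overlap-integral idea, assuming unnormalised isotropic Gaussians exp(-alpha*||r - c||^2) with a single common width, for which the pairwise overlap has the closed form (pi/(2*alpha))^(3/2) * exp(-alpha*||c1 - c2||^2 / 2).

    import numpy as np

    def cif_descriptors(atom_coords, grid_points, alpha=0.5):
        """CIF-style descriptors for one molecule (sketch under the stated assumptions).

        atom_coords : (n_atoms, 3) Cartesian coordinates of the molecule's atoms
        grid_points : (n_desc, 3) cluster centres defining the descriptors
        alpha       : common width of the unnormalised Gaussians exp(-alpha*r^2)
        """
        # closed-form overlap of two equal-width isotropic Gaussians
        diff = grid_points[:, None, :] - atom_coords[None, :, :]      # (n_desc, n_atoms, 3)
        dist2 = np.sum(diff ** 2, axis=-1)
        prefactor = (np.pi / (2.0 * alpha)) ** 1.5
        overlaps = prefactor * np.exp(-alpha * dist2 / 2.0)
        return overlaps.sum(axis=1)          # one value per descriptor point

    mol = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]])     # toy two-atom molecule
    grid = np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0]])    # toy descriptor centres
    print(cif_descriptors(mol, grid))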

There are several advantages of using CIF descriptors over the original methodology of building CIF 3D QSAR models [1]. Firstly, CIF descriptors can be computed efficiently for large data sets. Secondly, any machine learning method, whether regression or classification, linear or non-linear, can be applied to build 3D QSAR models. Thirdly, CIF descriptors can be aggregated to form 3D analogs of fragment descriptors, which can be used to interpret 3D QSAR models from a structural viewpoint.

CIF descriptors are implemented in R scripts and are available as part of the Continuous Molecular Fields project [4]. They were used in conjunction with Support Vector Machines and several other machine learning methods to build 3D QSAR models for several benchmark data sets.


References

  1. Sitnikov G.V.; Zhokhova N.I.; Ustynyuk Yu.A.; Varnek A.; Baskin I.I. J. Comput. Aided Mol. Des. 2015, 29, 233.
  2. Baskin I.I.; Zhokhova N.I. J. Comput. Aided Mol. Des. 2013, 27, 427.
  3. Baskin I.I.; Zhokhova N.I. Challenges and Advances in Computational Chemistry and Physics, 2014, Springer, 17, 433.
  4. http://sites.google.com/sites/conmolfields/
  • Open access
  • 104 Reads
The Use of Energy-Based Neural Networks for Similarity-Based Virtual Screening

For the first time, energy-based neural networks (EBNNs) were applied to build structure-activity models. Hopfield networks (HNs) and restricted Boltzmann machines (RBMs) were used to build one-class classification models for similarity-based virtual screening. ROC AUC scores and 1% enrichment rates were compared for 20 targets taken from the DUD repository. Five different scores were used to assess the similarity between each tested compound and the training set of active compounds: the mean and the maximum values of Tanimoto coefficients, the energy for HNs, and the free energy and the reconstruction error for RBMs. The latter score was shown to provide superior predictive performance. Additional advantages of EBNNs for similarity-based virtual screening over state-of-the-art similarity searching based on Tanimoto coefficients are the computational efficiency and scalability of the prediction procedures, the ability to implicitly reweight structural features and account for their interactions, and their "creativity" and compatibility with modern deep learning and artificial intelligence techniques.
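A rough sketch of one of the scores mentioned above, the RBM reconstruction error used for one-class screening; the binary fingerprints are random placeholders, and the mean-field reconstruction below is one common choice rather than necessarily the exact variant used by the authors.

    import numpy as np
    from scipy.special import expit
    from sklearn.neural_network import BernoulliRBM

    rng = np.random.default_rng(0)
    actives = (rng.random((100, 256)) < 0.1).astype(float)   # fingerprints of known actives (toy)
    screen = (rng.random((20, 256)) < 0.1).astype(float)     # compounds to screen (toy)

    rbm = BernoulliRBM(n_components=32, learning_rate=0.05, n_iter=50, random_state=0)
    rbm.fit(actives)                                          # train only on actives (one-class)

    def reconstruction_error(v):
        """Mean-field reconstruction error: low error = similar to the training actives."""
        h = expit(v @ rbm.components_.T + rbm.intercept_hidden_)     # P(h|v)
        v_rec = expit(h @ rbm.components_ + rbm.intercept_visible_)  # P(v|h)
        return np.mean((v - v_rec) ** 2, axis=1)

    scores = -reconstruction_error(screen)   # higher score = more likely active
    print(scores.round(4))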

  • Open access
  • 108 Reads
Person Re-identification by Null Space Marginal Fisher Analysis

To better describe a pedestrian's appearance, the feature representations used in person re-identification are usually of high dimension, typically in the thousands or higher. However, this incurs the typical Small Sample Size (SSS) problem, i.e., the number of training samples in most re-identification datasets is much smaller than the feature dimension. Although dimension reduction techniques or metric regularization could be applied to alleviate this problem, they may result in a loss of discriminative power.

In this work, we propose to overcome the SSS problem by embedding the training samples into a discriminative null space based on Marginal Fisher Analysis (MFA). In this null space, the within-class distribution of images of the same pedestrian shrinks to a single point, yielding an extreme value of the Fisher criterion. We theoretically analyze the subspace in which the discriminant vectors lie and derive a closed-form solution. Furthermore, we extend the proposed method to the nonlinear domain via the kernel trick. Experiments on the VIPeR, PRID450S and 3DPeS benchmark datasets show that our method achieves 56.30%, 76.80% and 66.88% rank-1 matching rates respectively, outperforming the state-of-the-art results by 2.74%, 15.38% and 9.59%.
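To make the null-space idea concrete, here is a small sketch of a generic null-space discriminant embedding in the SSS regime: directions are taken from the null space of the within-class scatter, where same-identity samples collapse to a point, and are then ordered by the scatter they retain. The data, label layout, and the use of total scatter (rather than the marginal, graph-based scatter matrices of MFA) are simplifying assumptions, so this is not the exact method of the paper.

    import numpy as np

    def null_space_embedding(X, labels, eps=1e-10):
        """Sketch of a null-space discriminant embedding (NFST-like, not the exact MFA variant).

        X      : (n_samples, d) feature matrix with n_samples << d (the SSS regime)
        labels : (n_samples,) identity labels
        """
        classes = np.unique(labels)
        Xw = np.vstack([X[labels == c] - X[labels == c].mean(axis=0) for c in classes])
        # null space of the within-class scatter = right singular vectors with ~zero singular value
        _, s, vt = np.linalg.svd(Xw, full_matrices=True)
        rank = int(np.sum(s > eps * s.max()))
        N = vt[rank:].T                                   # (d, d - rank) basis of the null space
        # within that null space, keep the directions carrying the most (total) scatter
        Xt = (X - X.mean(axis=0)) @ N
        _, _, vt2 = np.linalg.svd(Xt, full_matrices=False)
        return N @ vt2.T                                  # final projection directions

    rng = np.random.default_rng(0)
    X = rng.normal(size=(40, 500))                        # 40 images, 500-d features (toy)
    labels = np.repeat(np.arange(10), 4)                  # 10 pedestrians, 4 images each
    Z = X @ null_space_embedding(X, labels)
    # same-identity distances collapse to ~0, cross-identity distances do not
    print(np.linalg.norm(Z[0] - Z[1]), np.linalg.norm(Z[0] - Z[4]))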

  • Open access
  • 100 Reads
Trajectory-pooled Spatial-temporal Structure of Deep Convolutional Neural Networks for Video Event Recognition

Video event recognition based on content features faces great challenges in surveillance videos due to complex scenes and blurred actions. To alleviate these challenges, we propose a spatial-temporal structure of deep convolutional neural networks for video event recognition. To take advantage of spatial-temporal information, we fine-tune a two-stream network and then fuse the spatial and temporal features at a convolution layer using a conv fusion method to enforce the consistency of the spatial-temporal structure. Based on the two-stream network and this spatial-temporal layer, we obtain a triple-channel structure. Trajectory pooling is applied to the fused convolution layer to form the spatial-temporal channel. At the same time, trajectory pooling is conducted on one spatial convolution layer and one temporal convolution layer to form the other two channels: the spatial channel and the temporal channel. To combine the merits of deep and hand-crafted features, we apply trajectory-constrained pooling to HOG and HOF features; the trajectory-pooled HOG and HOF features are concatenated to the spatial channel and the temporal channel, respectively. A fusion method over the three channels is designed to obtain the final recognition result. Experiments on two surveillance video datasets, VIRAT 1.0 and VIRAT 2.0, which involve a suite of challenging events such as a person loading an object into a vehicle or opening a vehicle trunk, show that the proposed method achieves superior performance compared with other methods on these event benchmarks.
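The conv fusion step, stacking spatial and temporal feature maps along the channel axis and learning how to combine them with a convolution, can be sketched as below; the channel count, map size, and 1x1 kernel are assumptions, and the trajectory pooling and triple-channel fusion stages are not shown.

    import torch
    import torch.nn as nn

    class ConvFusion(nn.Module):
        """Fuse spatial and temporal feature maps at a convolution layer (sketch)."""
        def __init__(self, channels=512):
            super().__init__()
            # a 1x1 convolution learns how to weight the stacked spatial/temporal responses
            self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

        def forward(self, spatial_map, temporal_map):
            stacked = torch.cat([spatial_map, temporal_map], dim=1)   # stack along channels
            return self.fuse(stacked)

    spatial_map = torch.randn(1, 512, 14, 14)    # conv-layer map from the spatial stream (toy)
    temporal_map = torch.randn(1, 512, 14, 14)   # matching map from the temporal stream (toy)
    fused = ConvFusion()(spatial_map, temporal_map)
    print(fused.shape)    # torch.Size([1, 512, 14, 14])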
