The emergence of real-time auction in online advertising has drawn considerable attention to modeling the market competition, i.e., bid landscape forecasting. The problem is formulated as forecasting the probability distribution of the market price for each ad auction. To deal with the censorship issue caused by the second-price auction mechanism, many researchers have devoted their efforts to bid landscape forecasting by incorporating survival analysis from the medical research field. However, most existing solutions focus either on counting-based statistics of segmented sample clusters, or on learning a parameterized model based on heuristic assumptions about the distribution form.
Moreover, they fail to consider the sequential patterns of the features over the price space. In order to capture more sophisticated yet flexible patterns at a fine-grained level of the data, we propose a Deep Landscape Forecasting (DLF) model which combines deep learning for probability distribution forecasting and survival analysis for censorship handling.
Specifically, we utilize a recurrent neural network to flexibly model the conditional winning probability with respect to each bid price. We then conduct bid landscape forecasting through the probability chain rule with strict mathematical derivations. Finally, in an end-to-end manner, we optimize the model by minimizing two negative likelihood losses with comprehensive motivations. Without any specific assumption about the distribution form of the bid landscape, our model shows great advantages over previous works in fitting various sophisticated market price distributions.
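The chain-rule computation described above can be made concrete with a small sketch. The conditional probabilities here are hypothetical stand-ins for the RNN outputs; only the chain rule itself is illustrated:

```python
# Given conditional winning probabilities h[l] (the chance the bid wins at
# price level l, given that it lost at all lower levels), the market price
# distribution and the winning curve follow from the probability chain rule.

def landscape_from_conditionals(h):
    """Return (price_pmf, win_cdf) from conditional win probabilities h."""
    pmf, survive = [], 1.0        # survive = P(market price >= current level)
    for hl in h:
        pmf.append(survive * hl)  # P(price == l) = prod_{l'<l}(1 - h_l') * h_l
        survive *= (1.0 - hl)
    cdf, total = [], 0.0          # winning prob. at bid b = CDF up to b
    for p in pmf:
        total += p
        cdf.append(total)
    return pmf, cdf

h = [0.1, 0.2, 0.3, 0.4]          # hypothetical conditionals per price level
pmf, cdf = landscape_from_conditionals(h)
```

The censorship handling comes from training the conditionals on both observed (won) and censored (lost) auctions; the chain rule above is what turns them into a full landscape.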
In experiments on two large-scale real-world datasets, our model significantly outperforms state-of-the-art solutions under various metrics. Predicting when and where events will occur in cities, such as taxi pick-ups, crimes, and vehicle collisions, is a challenging and important problem with many applications in fields such as urban planning, transportation optimization and location-based marketing.
Though many point processes have been proposed to model events in a continuous spatio-temporal space, none of them allow for the consideration of the rich contextual factors that affect event occurrence, such as weather, social activities, geographical characteristics, and traffic. In this paper, we propose DMPP (Deep Mixture Point Processes), a point process model for predicting spatio-temporal events with the use of rich contextual information; a key advance is its incorporation of the heterogeneous and high-dimensional context available in image and text data.
Specifically, we design the intensity of our point process model as a mixture of kernels, where the mixture weights are modeled by a deep neural network. This formulation allows us to automatically learn the complex nonlinear effects of the contextual factors on event occurrence. At the same time, this formulation makes analytical integration over the intensity, which is required for point process estimation, tractable. We use real-world data sets from different domains to demonstrate that DMPP has better predictive performance than existing methods.
Online prediction has become one of the most essential tasks in many real-world applications. Two main characteristics of typical online prediction tasks include tabular input space and online data generation. Specifically, tabular input space indicates the existence of both sparse categorical features and dense numerical ones, while online data generation implies continuous task-generated data with potentially dynamic distribution.
Consequently, effective learning with tabular input space as well as fast adaptation to online data generation become two vital challenges for obtaining an online prediction model. However, neither of the two widely used model families fully meets both challenges: Gradient Boosting Decision Tree (GBDT) can hardly be adapted to dynamic online data generation and tends to be ineffective when facing sparse categorical features, while Neural Network (NN) can hardly achieve satisfactory performance when facing dense numerical features.
Powered by its two complementary components, DeepGBM can leverage both categorical and numerical features while retaining the capacity for efficient online updates. Comprehensive experiments on a variety of publicly available datasets demonstrate that DeepGBM can outperform other well-recognized baselines in various online prediction tasks.
In recent years, to mitigate the problem of fake news, computational detection of fake news has been studied, producing some promising early results. While important, however, we argue that a critical missing piece of this line of study is the explainability of such detection, i.e., why a particular piece of news is detected as fake. In this paper, therefore, we study the explainable detection of fake news. We develop a sentence-comment co-attention sub-network that exploits both news contents and user comments to jointly capture explainable top-k check-worthy sentences and user comments for fake news detection. We conduct extensive experiments on real-world datasets and demonstrate that the proposed method significantly outperforms seven state-of-the-art fake news detection methods.
Graph data widely exist in many high-impact applications. Inspired by the success of deep learning in grid-structured data, graph neural network models have been proposed to learn powerful node-level or graph-level representation. However, most of the existing graph neural networks suffer from the following limitations: (1) there is limited analysis of graph convolution properties, such as seed-oriented, degree-aware and order-free; (2) the node's degree-specific graph structure is not explicitly expressed in graph convolution for distinguishing structure-aware node neighborhoods; (3) the theoretical explanation of graph-level pooling schemes is unclear.
To address these problems, we propose a generic degree-specific graph neural network named DEMO-Net, motivated by the Weisfeiler-Lehman graph isomorphism test that recursively identifies 1-hop neighborhood structures. In order to explicitly capture the graph topology integrated with node attributes, we argue that graph convolution should have three properties: seed-oriented, degree-aware, and order-free.
To this end, we propose multi-task graph convolution where each task represents node representation learning for nodes with a specific degree value, thus preserving the degree-specific graph structure. In particular, we design two multi-task learning methods: degree-specific weight functions and hashing functions for graph convolution. The experimental results on several node and graph classification benchmark data sets demonstrate the effectiveness and efficiency of the proposed DEMO-Net over state-of-the-art graph neural network models. Partial label learning is an emerging weakly-supervised learning framework in which each training example is associated with multiple candidate labels, among which only one is valid.
Dimensionality reduction serves as an effective way to improve the generalization ability of a learning system, but partial label dimensionality reduction is challenging due to the unknown ground-truth labeling information. In this paper, the first attempt towards partial label dimensionality reduction is investigated by endowing the popular linear discriminant analysis (LDA) technique with the ability to deal with partial label training examples. Specifically, a novel learning procedure named DELIN is proposed, which alternates between LDA dimensionality reduction and candidate label disambiguation based on estimated labeling confidences over candidate labels.
On one hand, the projection matrix of LDA is optimized by utilizing disambiguation-guided labeling confidences. On the other hand, the labeling confidences are disambiguated by resorting to kNN aggregation in the LDA-induced feature space. Extensive experiments on synthetic as well as real-world partial label data sets clearly validate the effectiveness of DELIN in improving the generalization ability of state-of-the-art partial label learning algorithms.
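The disambiguation half of the alternation lends itself to a compact sketch. The features, candidate sets, and the simple kNN rule below are illustrative assumptions, not the DELIN implementation; the point is only how confidences can be re-estimated from neighbors and masked to each example's candidate set:

```python
import numpy as np

X = np.array([[0.0], [0.1], [5.0], [5.1]])   # 1-D stand-in for LDA features
candidates = np.array([[1, 1, 0],            # candidate label masks per example
                       [1, 0, 0],
                       [0, 1, 1],
                       [0, 0, 1]])
conf = candidates / candidates.sum(axis=1, keepdims=True)   # uniform start

def disambiguate(X, conf, candidates, k=1):
    """One kNN-aggregation pass over labeling confidences."""
    new_conf = np.zeros_like(conf)
    for i in range(len(X)):
        d = np.abs(X - X[i]).ravel()
        d[i] = np.inf                         # exclude the example itself
        nbrs = np.argsort(d)[:k]
        agg = conf[nbrs].mean(axis=0) * candidates[i]   # mask to candidates
        s = agg.sum()
        new_conf[i] = agg / s if s > 0 else conf[i]
    return new_conf

conf2 = disambiguate(X, conf, candidates)
```

After one pass, each example's confidence concentrates on the candidate label supported by its neighborhood, which is the signal the next LDA step then exploits.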
Scientific computational models are crucial for analyzing and understanding complex real-life systems that are otherwise difficult for experimentation. However, the complex behavior and the vast input-output space of these models often make them opaque, slowing the discovery of novel phenomena. In this work, we present HINT (Hessian INTerestingness), a new algorithm that can automatically and systematically explore black-box models and highlight local nonlinear interactions in the input-output space of the model. This tool aims to facilitate the discovery of interesting model behaviors that are unknown to the researchers.
Using this simple yet powerful tool, we were able to correctly rank all pairwise interactions in known benchmark models, and to do so faster and with greater accuracy than state-of-the-art methods. We further applied HINT to existing computational neuroscience models, and were able to reproduce important scientific discoveries that were published years after the creation of those models. Finally, we ran HINT on two real-world models in neuroscience and earth science and found new behaviors of the models that were of value to domain experts.
Online learning algorithms update models with one sample per iteration, making them efficient for processing large-scale datasets and useful for detecting malicious events, such as disease outbreaks and traffic congestion, on the fly. However, existing algorithms for graph-structured models focus on the offline setting and the least-squares loss, and are thus inapplicable to the online setting, while methods designed for the online setting cannot be directly applied to complex, usually non-convex, graph-structured sparsity models.
To address these limitations, in this paper we propose a new algorithm for graph-structured sparsity-constrained problems in the online setting, which we call GraphDA. The key step in GraphDA is to project both the averaged gradient in the dual space and the primal variables in the primal space onto lower-dimensional subspaces, thus capturing the graph-structured sparsity effectively. Furthermore, the objective functions considered here are general convex functions, so that different losses can be handled in online learning settings.
To the best of our knowledge, GraphDA is the first online learning algorithm for graph-structure constrained optimization problems. To validate our method, we conduct extensive experiments on both benchmark graphs and real-world graph datasets. The results show that, compared to baseline methods, GraphDA not only improves classification performance, but also captures graph-structured features more effectively, and hence offers stronger interpretability. Sequential recommendation and information dissemination are two traditional problems in sequential information retrieval.
The common goal of the two problems is to predict future user-item interactions based on past observed interactions. The difference is that the former deals with users' histories of clicked items, while the latter focuses on items' histories of infected users. In this paper, we take a fresh view and propose dual sequential prediction models that unify these two thinking paradigms.
The user-centered model takes a user's historical sequence of interactions as input, captures the user's dynamic states, and approximates the conditional probability of the next interaction for a given item based on the user's past clicking logs. By contrast, the item-centered model leverages an item's history, captures the item's dynamic states, and approximates the conditional probability of the next interaction for a given user based on the item's past infection records. To take advantage of the dual information, we design a new training mechanism which lets the two models play a game with each other, using the predicted score from the opponent as a feedback signal to guide the training.
We show that the dual models can better distinguish false negative samples from true negative samples compared with single sequential recommendation or information dissemination models. Experiments on four real-world datasets demonstrate the superiority of the proposed model over some strong baselines, as well as the effectiveness of the dual training mechanism between the two models.
Given a large, semi-infinite collection of co-evolving data sequences, we present an intuitive model, namely OrbitMap, which provides a good summary of time-series evolution in streams. We also propose a scalable and effective algorithm for fitting and forecasting time-series data streams. Our method is designed as a dynamic, interactive and flexible system, and is based on latent non-linear differential equations.
Our proposed method has the following advantages: (a) It is effective: it captures important time-evolving patterns in data streams and enables real-time, long-range forecasting; (b) It is general: our model is general and practical and can be applied to various types of time-evolving data streams; (c) It is scalable: our algorithm does not depend on data size, and thus is applicable to very large sequences.
Extensive experiments on real datasets demonstrate that OrbitMap makes accurate long-range forecasts and consistently outperforms the best existing state-of-the-art methods in terms of accuracy and execution speed. Many real-world problems are time-evolving in nature, such as the progression of diseases, the cascading process when a post is broadcast in a social network, or the changing of climates.
The observational data characterizing these complex problems are usually only available at discrete time stamps; this makes existing research on these problems mostly based on cross-sectional analysis. In this paper, we model these time-evolving phenomena by a dynamic system, treating the data sets observed at different time stamps as probability distribution functions generated by such a dynamic system.
We propose a theorem that builds a mathematical relationship between a dynamical system modeled by differential equations and the distribution function (or survival function) of the cross-sectional states of this system. We then develop a survival analysis framework to learn the differential equations of a dynamical system from its cross-sectional states. With such a framework, we are able to capture the continuous-time dynamics of an evolutionary system.
We validate our framework on both synthetic and real-world data sets. The experimental results show that our framework is able to discover and capture the generative dynamics of various data distributions accurately. Our study can potentially facilitate scientific discoveries of the unknown dynamics of complex systems in the real world. Network community detection is a hot research topic in network analysis. Although many methods have been proposed for community detection, most of them only take into consideration the lower-order structure of the network at the level of individual nodes and edges.
Thus, they fail to capture the higher-order characteristics at the level of small dense subgraph patterns, e.g., motifs. Recently, some higher-order methods have been developed, but they typically focus on the motif-based hypergraph, which is assumed to be a connected graph. However, such an assumption cannot be ensured in some real-world networks.
In particular, the hypergraph may become fragmented. That is, it may consist of a large number of connected components and isolated nodes, despite the fact that the original network is a connected graph. Therefore, the existing higher-order methods suffer seriously from this fragmentation issue, since in these approaches, nodes without connections in the hypergraph cannot be grouped together even if they belong to the same community. To address this fragmentation issue, we propose an Edge enhancement approach for Motif-aware community detection (EdMot).
The main idea is as follows. Firstly, a motif-based hypergraph is constructed and the top K largest connected components in the hypergraph are partitioned into modules. Afterwards, the connectivity structure within each module is strengthened by constructing an edge set to derive a clique from each module.
Based on the new edge set, the original connectivity structure of the input network is enhanced to generate a rewired network, whereby the motif-based higher-order structure is leveraged and the hypergraph fragmentation issue is well addressed. Finally, the rewired network is partitioned to obtain the higher-order community structure. Extensive experiments have been conducted on eight real-world datasets and the results show the effectiveness of the proposed method in improving the community detection performance of state-of-the-art methods.
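The rewiring step above can be sketched in a few lines. The edge set and module assignments below are hypothetical; the sketch only shows how each module becomes a clique whose edges are added to the original network:

```python
from itertools import combinations

# A toy network and toy modules (as if found on the motif-based hypergraph).
original_edges = {(0, 1), (1, 2), (2, 3), (3, 4)}
modules = [[0, 1, 2], [3, 4]]

def rewire(edges, modules):
    """Union the original edges with a clique over each module."""
    enhanced = set(edges)
    for module in modules:
        enhanced |= {tuple(sorted(p)) for p in combinations(module, 2)}
    return enhanced

rewired = rewire(original_edges, modules)   # gains the edge (0, 2)
```

The rewired network then goes to an ordinary partitioning method, so fragmented hypergraph components no longer prevent their nodes from being grouped together.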
With the increasing availability of moving-object tracking data, use of this data for route search and recommendation is increasingly important. To this end, we propose a novel parallel split-and-combine approach to enable route search by locations (RSL-Psc). The resulting functionality targets a broad range of applications, including route planning and recommendation, ridesharing, and location-based services in general. To enable efficient and effective RSL-Psc computation on massive route data, we develop novel search space pruning techniques and enable use of the parallel processing capabilities of modern processors.
In each sub-task, we use network expansion and exploit spatial similarity bounds for pruning. The algorithms split candidate routes into sub-routes and combine them to construct new routes. The sub-tasks are independent and are performed in parallel. Extensive experiments with real data offer insight into the performance of the algorithms, indicating that the RSL-Psc problem can be answered with high-quality results and that the two algorithms achieve high efficiency and scalability. With the proliferation of commercial tracking systems, sports data is being generated at an unprecedented speed and the interest in sports play retrieval has grown dramatically as well.
However, it is challenging to design an effective, efficient and robust similarity measure for sports play retrieval. To this end, we propose a deep learning approach to learn the representations of sports plays, called play2vec, which is robust against noise and takes only linear time to compute the similarity between two sports plays.
We conduct experiments on real-world soccer match data, and the results show that our solution performs more effectively and efficiently than the state-of-the-art methods. Express systems are widely deployed in many major cities. Couriers in an express system load parcels at a transit station and deliver them to customers.
Meanwhile, they also try to serve pick-up requests that arrive stochastically in real time during the delivery process. While they have brought much convenience and promoted the development of e-commerce, express systems face challenges in courier management in order to complete the massive number of tasks per day. Considering this problem, we propose a reinforcement learning based framework to learn a courier management policy. Firstly, we divide the city into independent regions, in each of which a constant number of couriers deliver parcels and serve requests cooperatively.
BDSB guarantees that each courier departs from the transit station with an almost even delivery and expected request-service burden, giving a reasonable initialization for the subsequent online management. As pick-up requests arrive in real time, a Contextual Cooperative Reinforcement Learning (CCRL) model is proposed to guide where each courier should deliver and serve in each short period. Formulated in a multi-agent way, CCRL focuses on the cooperation among couriers while also considering the system context.
Experiments on real-world data from Beijing are conducted to confirm the superior performance of our model. Analysis of large-scale sequential data has been one of the most crucial tasks in areas such as bioinformatics, text, and audio mining. Existing string kernels, however, either (i) rely on local features of short substructures in the string, which hardly capture long discriminative patterns, (ii) sum over too many substructures, such as all possible subsequences, which leads to diagonal dominance of the kernel matrix, or (iii) rely on non-positive-definite similarity measures derived from the edit distance.
Furthermore, while there have been works addressing the computational challenge with respect to the length of the string, most of them still incur quadratic complexity in the number of training samples when used in a kernel-based classifier. In this paper, we present a new class of global string kernels that aims to (i) discover global properties hidden in the strings through global alignments, (ii) maintain positive-definiteness of the kernel without introducing a diagonally dominant kernel matrix, and (iii) have a training cost linear with respect to not only the length of the strings but also the number of training string samples.
To this end, the proposed kernels are explicitly defined through a series of different random feature maps, each corresponding to a distribution of random strings. We show that kernels defined this way are always positive-definite, and exhibit computational benefits, as they always produce Random String Embeddings (RSE) that can be directly used in any linear classification model. Our extensive experiments on nine benchmark datasets corroborate that RSE achieves better or comparable accuracy compared with state-of-the-art baselines, especially on strings of longer lengths.
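To make the embedding idea concrete, here is a minimal sketch in which plain edit distance to a handful of random strings stands in for the paper's alignment-based feature maps (the alphabet, string lengths, and distance choice are illustrative assumptions, not RSE itself):

```python
import random

def edit_distance(a, b):
    """Standard Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,           # deletion
                           cur[-1] + 1,           # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

rng = random.Random(0)
# A small set of random anchor strings; each defines one feature dimension.
anchors = ["".join(rng.choice("ACGT") for _ in range(4)) for _ in range(8)]

def embed(s):
    """Fixed-length vector of distances to the random anchors."""
    return [edit_distance(s, a) for a in anchors]

v = embed("ACGTACGT")
```

The fixed-length vector can be fed to any linear classifier, and the cost grows linearly in the number of input strings, which is the computational benefit claimed above.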
In addition, we empirically show that RSE scales linearly with both the number and the length of the strings. We also design an ego-centric algorithm, MC-EGO, for heuristically computing a near-maximum clique in near-linear time. We conduct extensive empirical studies on large real graphs and demonstrate the efficiency and effectiveness of our techniques. Personalized Route Recommendation (PRR) aims to generate user-specific route suggestions in response to users' route queries.
Early studies cast the PRR task as a pathfinding problem on graphs and adopt adapted search algorithms by integrating heuristic strategies. Although these methods are effective to some extent, they require setting the cost functions with heuristics. In addition, it is difficult to utilize useful context information in the search procedure. To address these issues, our model consists of two components.
First, we employ an attention-based Recurrent Neural Network (RNN) to model the cost from the source to the candidate location by incorporating useful context information. Instead of learning a single cost value, the RNN component is able to learn a time-varying vectorized representation of the moving state of a user. Second, we propose to use a value network for estimating the cost from a candidate location to the destination. For capturing structural characteristics, the value network is built on top of improved graph attention networks by incorporating the moving state of a user and other context information.
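Structurally, combining an accumulated cost with a learned estimate of the remaining cost is a best-first search. The following toy sketch uses hand-written stand-ins (a small graph, coordinates, and a Manhattan-distance heuristic) in place of the RNN and value-network components:

```python
import heapq

graph = {"A": {"B": 1, "C": 4}, "B": {"C": 1, "D": 5}, "C": {"D": 1}, "D": {}}
coords = {"A": (0, 0), "B": (1, 0), "C": (1, 1), "D": (2, 1)}

def value_estimate(node, goal):
    """Stand-in for the value network: Manhattan distance to the goal."""
    (x1, y1), (x2, y2) = coords[node], coords[goal]
    return abs(x1 - x2) + abs(y1 - y2)

def search(start, goal):
    """Best-first search ordered by cost-so-far + estimated remaining cost."""
    frontier = [(value_estimate(start, goal), 0, start, [start])]
    seen = set()
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        if node in seen:
            continue
        seen.add(node)
        for nbr, w in graph[node].items():
            heapq.heappush(frontier, (g + w + value_estimate(nbr, goal),
                                      g + w, nbr, path + [nbr]))
    return None, float("inf")

path, cost = search("A", "D")
```

In the paper's setting, the learned components replace both the edge costs and the heuristic, which is where context information enters the search.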
The two components are integrated in a principled way to derive a more accurate cost for a candidate location. Extensive experimental results on three real-world datasets have shown the effectiveness and robustness of the proposed model. Collaborative filtering (CF) has become one of the most popular and widely used methods in recommender systems, but its performance degrades sharply for users with rare interaction data.
Most existing hybrid CF methods try to incorporate side information, such as review texts, to alleviate the data sparsity problem. However, the process of exploiting and integrating side information is computationally expensive. Existing hybrid recommendation methods treat each user equally and ignore the fact that pure CF methods have already achieved both effective and efficient recommendation performance for active users with sufficient interaction records, and that the improvement brought by side information to these active users is negligible.
Therefore, they are not cost-effective solutions. One cost-effective idea for bypassing this dilemma is to generate sufficient "real" interaction data for the inactive users with the help of side information, after which a pure CF method can be performed on the augmented dataset effectively. However, there are three major challenges in implementing this idea.
Firstly, how to ensure the correctness of the generated interaction data. Secondly, how to combine the data augmentation process and the recommendation process into a unified model trained end-to-end. Thirdly, how to make the solution generalizable to various kinds of side information and recommendation tasks. In light of these challenges, we propose a generic and effective CF model called AugCF that supports a wide variety of recommendation tasks. AugCF is based on Conditional Generative Adversarial Nets that additionally take the class (like or dislike) as a feature when generating new interaction data, which can be a sufficiently real augmentation to the original dataset.
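Generating a discrete class such as like/dislike inside an adversarially trained model requires a differentiable sampling step. A minimal sketch of standard Gumbel-Softmax sampling, written independently of any particular model, looks like this:

```python
import numpy as np

def gumbel_softmax(logits, tau, rng):
    """Draw a soft one-hot sample; tau -> 0 sharpens it toward argmax."""
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))  # Gumbel(0,1) noise
    y = (logits + g) / tau
    y = np.exp(y - y.max())          # numerically stable softmax
    return y / y.sum()

rng = np.random.default_rng(0)
sample = gumbel_softmax(np.array([2.0, 0.5, -1.0]), tau=0.5, rng=rng)
```

Because the sample is a smooth function of the logits, gradients can flow through the discrete choice, which is what makes end-to-end adversarial training of a generator over discrete interactions feasible.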
Also, AugCF adopts a novel discriminator loss and a Gumbel-Softmax approximation to enable end-to-end training. Finally, extensive experiments are conducted on two large-scale recommendation datasets, and the results show the superiority of our proposed model. We present a novel method named Latent Semantic Imputation (LSI) to transfer external knowledge into a semantic space to enhance word embeddings. The method uses graph theory to extract the latent manifold structure of the entities in the affinity space, and leverages non-negative least squares with standard simplex constraints and the power iteration method to derive spectral embeddings.
It provides an effective and efficient approach to combining entity representations defined in different Euclidean spaces. Specifically, our approach generates and imputes reliable embedding vectors for low-frequency words in the semantic space and benefits downstream language tasks that depend on word embedding. We conduct comprehensive experiments on a carefully designed classification problem and language modeling and demonstrate the superiority of the enhanced embedding via LSI over several well-known benchmark embeddings. We also confirm the consistency of the results under different parameter settings of our method.
Reinforcement learning aims to search for the best policy model for decision making, and has been shown to be powerful for sequential recommendation. Training the policy by reinforcement learning, however, requires interaction with an environment. In many real-world applications, policy training in the real environment can incur an unbearable cost due to the exploration involved. Reconstructing the environment from past data is thus an appealing way to unleash the power of reinforcement learning in these applications.
The reconstruction of the environment is, basically, to extract the causal effect model from the data. However, real-world applications are often too complex to offer fully observable environment information. Therefore, quite possibly there are unobserved confounding variables lying behind the data. Such hidden confounders can obstruct an effective reconstruction of the environment.
In this paper, by treating the hidden confounder as a hidden policy, we propose a deconfounded multi-agent environment reconstruction (DEMER) approach in order to learn the environment together with the hidden confounder. DEMER adopts a multi-agent generative adversarial imitation learning framework.
It proposes to introduce a confounder-embedded policy and to use a compatible discriminator for training the policies. We first use an artificial driver program recommendation environment, abstracted from the real application, to verify and analyze the effectiveness of DEMER. The experimental results show that DEMER can effectively reconstruct the hidden confounder, and thus build a better environment model. DEMER also derives a recommendation policy with significantly improved performance in the test phase of the real application. Influenza leads to regular losses of lives annually and requires careful monitoring and control by health organizations.
Annual influenza forecasts help policymakers implement effective countermeasures to control both seasonal and pandemic outbreaks. We propose EpiDeep, a novel deep neural network approach for epidemic forecasting that learns meaningful representations of incidence curves in a continuous feature space and accurately predicts future incidence, peak intensity, peak time, and onset of the upcoming season. We present extensive experiments on forecasting ILI (influenza-like illness) in the United States, leveraging multiple metrics to quantify success. Our results demonstrate that EpiDeep is successful at learning meaningful embeddings and, more importantly, that these embeddings evolve as the season progresses.
Exploratory analysis over network data is often limited by the ability to efficiently calculate graph statistics, which can provide a model-free understanding of the macroscopic properties of a network. We introduce a framework for estimating the graphlet count, i.e., the number of occurrences of a small subgraph motif in a large network. For massive graphs, where accessing the whole graph is not possible, the only viable algorithms are those that make a limited number of vertex neighborhood queries.
We introduce a Monte Carlo sampling technique for graphlet counts, called Lifting, which can simultaneously sample all graphlets of size up to k vertices for arbitrary k. This is the first graphlet sampling method that can provably sample every graphlet with positive probability and can sample graphlets of arbitrary size k. We outline variants of lifted graphlet counts, including the ordered, unordered, and shotgun estimators, random walk starts, and parallel vertex starts.
We prove that our graphlet count updates are unbiased for the true graphlet count and have controlled variance for all graphlets. We compare the experimental performance of lifted graphlet counts to state-of-the-art graphlet sampling procedures: Waddling and the pairwise subgraph random walk.
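A much simpler relative of graphlet sampling conveys the flavor of such unbiased estimators. This is a textbook wedge-sampling triangle estimator, not the Lifting method; the graph below is a toy example:

```python
import random

# Toy graph as an adjacency dictionary (edges: 0-1, 0-2, 0-3, 1-2, 2-3).
graph = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1, 3}, 3: {0, 2}}

# Enumerate all wedges (paths of length 2) so we can sample them uniformly.
wedges = []
for center, nbrs in graph.items():
    ns = sorted(nbrs)
    for i in range(len(ns)):
        for j in range(i + 1, len(ns)):
            wedges.append((ns[i], center, ns[j]))

rng = random.Random(0)
samples = 2000
closed = 0
for _ in range(samples):
    a, mid, c = rng.choice(wedges)
    closed += c in graph[a]          # is the sampled wedge closed?

# Each triangle contains exactly 3 wedges, hence the division by 3.
triangle_estimate = len(wedges) * (closed / samples) / 3
```

The estimator is unbiased because the closed-wedge probability times the wedge count equals three times the triangle count; the methods discussed above generalize this kind of reweighting to arbitrary graphlets under neighborhood-query access.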
How can we estimate the importance of nodes in a knowledge graph (KG)? A KG is a multi-relational graph that has proven valuable for many tasks including question answering and semantic search. In this paper, we present GENI, a method for tackling the problem of estimating node importance in KGs, which enables several downstream applications such as item recommendation and resource allocation.
While a number of approaches have been developed to address this problem for general graphs, they do not fully utilize the information available in KGs, or lack the flexibility needed to model complex relationships between entities and their importance. To address these limitations, we explore supervised machine learning algorithms.
Our method performs an aggregation of importance scores, instead of aggregating node embeddings, via a predicate-aware attention mechanism and flexible centrality adjustment. Despite its popularity, it is very challenging to guarantee the feature selection consistency of the Lasso, especially when the dimension of the data is huge. One way to improve feature selection consistency is to select an ideal tuning parameter. Traditional tuning criteria mainly focus on minimizing the estimated prediction error or maximizing the posterior model probability, such as cross-validation and BIC, which may either be time-consuming or fail to control the false discovery rate (FDR) when the number of features is extremely large.
The other way is to introduce pseudo-features to learn the importance of the original ones. Recently, the Knockoff filter was proposed to control the FDR when performing feature selection. However, its performance is sensitive to the choice of the expected FDR threshold. Motivated by these ideas, we propose a new method that uses pseudo-features to obtain an ideal tuning parameter. In particular, we present the Efficient Tuning of Lasso (ET-Lasso) to separate active and inactive features by adding permuted features as pseudo-features in linear models.
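The pseudo-feature idea can be illustrated with a toy example in which marginal correlations stand in for the Lasso solution path (the data, scoring rule, and cutoff below are illustrative assumptions, not the ET-Lasso procedure):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 6
X = rng.normal(size=(n, p))
# Only features 0 and 1 are truly active in this toy model.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=n)

# Pseudo-features: each column independently permuted, breaking any
# association with y, so they are inactive by construction.
X_perm = rng.permuted(X, axis=0)

# Marginal association scores for original and pseudo-features.
scores = np.abs(np.concatenate([X, X_perm], axis=1).T @ y) / n

cutoff = scores[p:].max()        # best score any inactive feature achieves
active = np.flatnonzero(scores[:p] > cutoff)
```

The largest score attained by a pseudo-feature gives a data-driven threshold: original features scoring above it are declared active, which mirrors how the permuted copies locate the tuning parameter separating active from inactive features.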
The pseudo-features are constructed to be inactive by nature, and can be used to obtain a cutoff that selects the tuning parameter separating active and inactive features. Experimental studies on both simulations and real-world data applications show that ET-Lasso can effectively and efficiently select active features under a wide range of scenarios. This paper targets a novel but practical recommendation problem named exact-K recommendation. It differs from traditional top-K recommendation in that it focuses on constrained combinatorial optimization, which recommends a whole set of K items (called a card), rather than ranking optimization, which assumes that "better" items should be put into top positions.
Thus we take the first step of giving a formal problem definition, and innovatively reduce it to a Maximum Clique Optimization problem on a graph. To tackle this combinatorial optimization problem, which is NP-hard, we propose Graph Attention Networks (GAttN) with a multi-head self-attention encoder and a decoder with an attention mechanism.
It learns the joint distribution of the K items end to end and generates an optimal card, rather than ranking individual items by prediction scores. We then propose Reinforcement Learning from Demonstrations (RLfD), which combines the advantages of behavior cloning and reinforcement learning, making training both sufficient and efficient. Extensive experiments on three datasets demonstrate the effectiveness of our proposed GAttN with RLfD method; it outperforms several strong baselines with a relative improvement of 7.
Adaptive learning, also known as adaptive teaching, relies on learning path recommendation, which sequentially recommends personalized learning items (e.g., courses or exercises) to learners. Although it is well known that modeling the cognitive structure, including the knowledge levels of learners and the knowledge structure (e.g., prerequisite relations) of learning items, is important for learning path recommendation, existing methods rarely exploit both. To this end, we propose a Cognitive Structure Enhanced framework for Adaptive Learning (CSEAL). By viewing path recommendation as a Markov Decision Process and applying an actor-critic algorithm, CSEAL can sequentially identify the right learning items for different learners. Specifically, we first utilize a recurrent neural network to trace the evolving knowledge levels of learners at each learning step.
Then, we design a navigation algorithm on the knowledge structure to ensure the logicality of learning paths, which reduces the search space in the decision process. Finally, the actor-critic algorithm is used to determine what to learn next, with its parameters dynamically updated along the learning path.
In this paper, we study the problem of online influence maximization in social networks. In this problem, a learner aims to identify the set of "best influencers" in a network by interacting with it, i.e., by repeatedly selecting seed nodes and observing the resulting diffusion feedback. We capitalize on an important property of the influence maximization problem named network assortativity, which is ignored by most existing works on online influence maximization.
To realize network assortativity, we factorize the activation probability on the edges into latent factors on the corresponding nodes, including influence factor on the giving nodes and susceptibility factor on the receiving nodes. We propose an upper confidence bound based online learning solution to estimate the latent factors, and therefore the activation probabilities.
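The optimistic estimate used by such a bandit solution can be sketched in a LinUCB style (a simplified stand-in for the paper's estimator, not its exact form; here `A_inv` would be the inverse Gram matrix of contexts observed for node u, and all numbers are toy values):

```python
import numpy as np

def edge_ucb(g_hat, s_v, A_inv, alpha):
    """Optimistic activation-probability estimate for edge (u, v):
    point estimate g_u . s_v plus a confidence-ellipsoid bonus."""
    mean = g_hat @ s_v
    bonus = alpha * np.sqrt(s_v @ A_inv @ s_v)
    return min(mean + bonus, 1.0)

g_hat = np.array([0.2, 0.1, 0.4])   # estimated influence factor of node u
s_v = np.array([0.5, 0.3, 0.2])     # susceptibility factor of node v
A_inv = np.eye(3)                   # toy inverse Gram matrix (no data yet)
p_ucb = edge_ucb(g_hat, s_v, A_inv, alpha=0.5)
```

The bonus term shrinks as more activations involving node u are observed, so seed selection gradually shifts from exploration to exploitation.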
Considerable regret reduction is achieved by our factorization based online influence maximization algorithm. Extensive empirical evaluations on two real-world networks showed the effectiveness of our proposed solution. Given a dynamic graph stream, how can we detect the sudden appearance of anomalous patterns, such as link spam, follower boosting, or denial of service attacks? Additionally, can we categorize the types of anomalies that occur in practice, and theoretically analyze the anomalous signs arising from each type?
In this work, we propose AnomRank, an online algorithm for anomaly detection in dynamic graphs. AnomRank uses a two-pronged approach, defining two novel metrics for anomalousness. Each metric tracks the derivatives of its own version of a "node score," or node importance, function. This allows us to detect sudden changes in the importance of any node. We show theoretically and experimentally that the two-pronged approach successfully detects two common types of anomalies: sudden weight changes along an edge, and sudden structural changes to the graph. We also show that AnomRank is fast and accurate in practice. Empirical entropy refers to the information entropy calculated from the empirical distribution of a dataset.
It is a widely used aggregation function for knowledge discovery, as well as the foundation of other aggregation functions such as mutual information. However, computing the exact empirical entropy on a large-scale dataset can be expensive. Using a random subsample, we can compute an approximation of the empirical entropy efficiently. We derive probabilistic error bounds for the approximation, where the error bounds decrease at a near-square-root rate with respect to the subsample size. We further study two applications that can benefit from the error-bounded approximation: feature ranking and filtering based on mutual information.
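A quick illustration of the subsampling idea (the dataset, subsample size, and plug-in estimator below are illustrative; the paper's probabilistic bounds are not reproduced here):

```python
import math
from collections import Counter
import numpy as np

def empirical_entropy(xs):
    """Plug-in entropy of the empirical distribution, in nats."""
    n = len(xs)
    return -sum(c / n * math.log(c / n) for c in Counter(xs).values())

rng = np.random.default_rng(1)
data = rng.integers(0, 10, size=100_000)          # 10 roughly equally likely symbols
sub = rng.choice(data, size=2_000, replace=False)  # small random subsample

exact = empirical_entropy(data.tolist())
approx = empirical_entropy(sub.tolist())
# the approximation error shrinks roughly like 1/sqrt(subsample size)
```

On this toy data both values are close to log(10), the entropy of a uniform 10-symbol source, while the subsample version touches only 2% of the records.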
We develop algorithms to progressively subsample the dataset and return correct answers with high probability, with sample complexity independent of the data size. A social network is an ecosystem, and one of its ultimate goals is to remain sustainable, namely to keep users generating information and staying informed. However, the reasons why some social ecosystems can keep self-sustaining while others end up in non-active or dead states are largely unknown.
In this paper, rather than studying social ecosystems at the population level, we analyze, for the first time, the fates of different microscopic social ecosystems, namely the final states of their collective activity dynamics, in a real-world online social medium with detailed individual-level records. We find huge complexities in microscopic social ecosystems, including complex species types, complex individual interaction networks, and complex dynamics and final states. To capture the observed complexities in real-world data, we propose a microscopic ecological model, which accurately captures the complex fates of heterogeneous microscopic social ecosystems in both synthetic and empirical datasets.
Furthermore, we analyze the driving factors behind the fates of microscopic social ecosystems, including the interaction networks of individuals and the dynamical interaction mechanisms of species, leading to the control of microscopic social ecosystems, that is, the ability to influence their temporal behaviours and steer their final states towards active or dead fates. The process of opinion formation is inherently a network process, with user opinions in a social network being driven to a certain average opinion.
One simple and intuitive incarnation of this opinion attractor is the average of user opinions weighted by the users' eigenvector centralities. This value is a lucrative target for control, as altering it essentially changes the mass opinion in the network. Since any potentially malicious influence upon the opinion distribution in a society is undesirable, it is important to design methods to prevent external attacks upon it.
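The opinion attractor described above can be computed directly. A toy sketch on a star graph (the graph and opinion values are illustrative; the identity shift inside the power iteration is a standard trick that guarantees convergence on bipartite graphs, whose adjacency spectrum is symmetric):

```python
import numpy as np

A = np.array([[0, 1, 1, 1],
              [1, 0, 0, 0],
              [1, 0, 0, 0],
              [1, 0, 0, 0]], dtype=float)     # star graph: node 0 is the hub
opinions = np.array([0.9, 0.1, 0.2, 0.3])

# Power iteration on A + I converges to the leading eigenvector of A.
c = np.ones(4)
for _ in range(200):
    c = (A + np.eye(4)) @ c
    c /= np.linalg.norm(c)

weighted_avg = c @ opinions / c.sum()   # eigencentrality-weighted mass opinion
plain_avg = opinions.mean()
```

Because the hub's centrality exceeds each leaf's by a factor of sqrt(3), the hub's high opinion pulls the weighted average above the plain mean, which is exactly the leverage an adversary would target.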
In this work, we assume that the adversary aims to maliciously change the network's average opinion by altering the opinions of some unknown users. We then state an NP-hard problem of disabling such opinion control attempts by strategically altering the network users' eigencentralities through recommending a limited number of links to the users. Relying on Markov chain theory, we provide a perturbation analysis that shows how eigencentrality, and hence our problem's objective, changes in response to a link's addition to the network.
The latter leads to the design of a pseudo-linear-time heuristic, relying on efficient estimation of mean first passage times in Markov chains. We have confirmed our theoretical and algorithmic findings, and studied effectiveness and efficiency of our heuristic in experiments with synthetic and real networks. Can a system discover what a user wants without the user explicitly issuing a query? A recommender system proposes items of potential interest based on past user history.
On the other hand, active search incites, and learns from, user feedback in order to recommend items that meet a user's current tacit interests, and hence promises to offer up-to-date recommendations going beyond those of a recommender system. Yet extant active search methods require an overwhelming amount of user input, relying solely on such input for each item they pick. In this paper, we propose MF-ASC, a novel active search mechanism that performs well with minimal user input. MF-ASC combines cheap, low-fidelity evaluations in the style of a recommender system with the user's high-fidelity input, using Gaussian process regression with multiple target variables (cokriging).
To our knowledge, this is the first application of cokriging to active search.
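A heavily simplified stand-in for the multi-fidelity idea (this is not MF-ASC itself): treat fidelity as an extra input coordinate, so that a single RBF kernel correlates, without equating, many cheap low-fidelity scores and a few expensive high-fidelity evaluations. All data, the fidelity gap, and kernel settings below are illustrative.

```python
import numpy as np

def rbf(A, B, ls=0.3):
    """Squared-exponential kernel between row sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * ls ** 2))

# Cheap low-fidelity scores (e.g., recommender predictions, biased by +0.3)
# and two expensive high-fidelity user evaluations over a 1-D item space.
x_lo = np.linspace(0, 1, 20)[:, None]
y_lo = np.sin(4 * x_lo[:, 0]) + 0.3
x_hi = np.array([[0.2], [0.7]])
y_hi = np.sin(4 * x_hi[:, 0])

# Fidelity coordinates 0.0 vs 0.3: close enough for the kernel to share
# information across fidelities, far enough to allow a systematic offset.
Z = np.vstack([np.hstack([x_lo, np.zeros((20, 1))]),
               np.hstack([x_hi, np.full((2, 1), 0.3)])])
y = np.concatenate([y_lo, y_hi])

K = rbf(Z, Z) + 1e-6 * np.eye(len(Z))
z_test = np.array([[0.2, 0.3], [0.5, 0.3]])   # query at the high-fidelity level
mu = rbf(z_test, Z) @ np.linalg.solve(K, y)    # GP posterior mean
```

Full cokriging would instead model the fidelities as coupled outputs with their own cross-covariance, but the shape of the computation (one joint kernel system, posterior mean by a linear solve) is the same.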
Our empirical study with synthetic and real-world data shows that MF-ASC outperforms the state of the art in terms of result relevance within a budget of interactions. Precisely evaluating the effect of new policies is an important but challenging problem. Recently, Inverse Propensity Score (IPS) estimators have been proposed to evaluate the effect of a new policy using offline logged data collected under a different policy in the past.
IPS estimators remove the distribution shift induced by the past policy. However, they ignore the distribution shift that would be induced by the new policy, which results in imprecise evaluation. Moreover, their performance relies on accurate estimation of the propensity score, which cannot be guaranteed or validated in practice. In this paper, we propose a non-parametric method, named the Focused Context Balancing (FCB) algorithm, which learns sample weights for context balancing so that the distribution shifts induced by the past policy and the new policy are each eliminated.
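A minimal moment-matching sketch in the spirit of context balancing (this is an entropy-balancing-style weight learner on synthetic data, not the authors' FCB algorithm): find simplex weights on the logged samples so that the weighted source mean matches the target-context mean, with no propensity model involved.

```python
import numpy as np

def balance_weights(X_src, X_tgt, iters=3000, lr=0.1):
    """Exponential-tilting weights w_i ∝ exp(theta . x_i), with theta found
    by gradient descent on the convex dual of the moment-matching problem."""
    mu_t = X_tgt.mean(axis=0)
    theta = np.zeros(X_src.shape[1])
    w = np.full(len(X_src), 1.0 / len(X_src))
    for _ in range(iters):
        w = np.exp(X_src @ theta)
        w /= w.sum()
        theta -= lr * (X_src.T @ w - mu_t)   # gradient: weighted mean - target mean
    return w

rng = np.random.default_rng(2)
X_src = rng.normal(0.0, 1.0, size=(500, 2))   # contexts under the past policy
X_tgt = rng.normal(1.0, 1.0, size=(500, 2))   # contexts under the new policy
w = balance_weights(X_src, X_tgt)
```

After convergence, the reweighted logged contexts have (approximately) the same mean as the target contexts, so a weighted average of logged outcomes evaluates the new policy without an explicit propensity score.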
To validate the effectiveness of our FCB algorithm, we conduct extensive experiments on both synthetic and real world datasets. The experimental results clearly demonstrate that our FCB algorithm outperforms existing estimators by achieving more precise and robust results for offline policy evaluation.
Discovering disease-gene associations is a fundamental and critical biomedical task, which assists biologists and physicians in discovering the pathogenic mechanisms of syndromes. With various clinical biomarkers measuring the similarities among genes and disease phenotypes, network-based semi-supervised learning (NSSL) has been commonly utilized in these studies to address this large-scale, class-imbalanced data issue.
However, most existing NSSL approaches are based on linear models and suffer from two major limitations: (1) they implicitly consider a local-structure representation for each candidate; and (2) they are unable to capture nonlinear associations between diseases and genes. With the help of graph convolutional networks (GCNs), we can capture nonlinear interactions and exploit the measured similarities.
Moreover, we define a margin control loss function to reduce the effect of sparsity. Empirical results demonstrate that the proposed deep learning algorithm outperforms all other state-of-the-art methods on most metrics. Hierarchical clustering is typically performed using algorithmic optimization that searches over the discrete space of trees. While these optimization methods are often effective, their discreteness restricts them from many of the benefits of their continuous counterparts, such as scalable stochastic optimization and the joint optimization of multiple objectives or components of a model.
In this paper, we present an approach to hierarchical clustering that searches over continuous representations of trees in hyperbolic space by running gradient descent. We compactly represent uncertainty over tree structures with vectors in the Poincare ball. We show how these vectors can be optimized using an objective related to recently proposed cost functions for hierarchical clustering (Dasgupta, 2016; Wang and Wang, 2018). Using our method with a mini-batch stochastic gradient descent inference procedure, we are able to outperform prior work on clustering millions of ImageNet images by 15 points of dendrogram purity.
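Two building blocks any such gradient-based optimizer in the Poincare ball needs, sketched with the standard formulas (the retraction threshold is an illustrative choice):

```python
import numpy as np

def poincare_dist(u, v, eps=1e-12):
    """Geodesic distance between points u, v inside the unit (Poincare) ball:
    arccosh(1 + 2*||u-v||^2 / ((1-||u||^2)(1-||v||^2)))."""
    num = 2.0 * np.sum((u - v) ** 2)
    den = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return np.arccosh(1.0 + num / (den + eps))

def project_to_ball(x, max_norm=1.0 - 1e-5):
    """After a gradient step, retract the point back inside the unit ball."""
    n = np.linalg.norm(x)
    return x if n < max_norm else x * (max_norm / n)
```

Distances blow up near the boundary, which is what lets a small ball embed tree-like (exponentially branching) structures: points near the origin act like roots, points near the boundary like leaves.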
Further, our continuous tree representation can be jointly optimized in multi-task learning applications offering a 9 point improvement over baseline methods. Graph neural networks, which generalize deep neural network models to graph structured data, have attracted increasing attention in recent years.
They usually learn node representations by transforming, propagating, and aggregating node features, and have been proven to improve the performance of many graph-related tasks such as node classification and link prediction. To apply graph neural networks to the graph classification task, approaches to generate the graph representation from node representations are needed. A common way is to globally combine the node representations. However, rich structural information is then overlooked. Thus a hierarchical pooling procedure is desired to preserve the graph structure during graph representation learning.
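The contrast between a global readout and one hierarchical pooling step can be sketched as follows (the soft assignment matrix `S` is random here purely for illustration; in DiffPool-style methods it is produced by a learned GNN layer):

```python
import numpy as np

rng = np.random.default_rng(3)
n, d, k = 6, 4, 2
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.triu(A, 1)
A = A + A.T                                   # symmetric adjacency, no self-loops
X = rng.normal(size=(n, d))                   # node representations

# Global readout: one mean over all nodes -- all structure is discarded.
g_global = X.mean(axis=0)

# One hierarchical pooling step: soft-assign nodes to k clusters,
# pool features per cluster, and coarsen the adjacency accordingly.
S = rng.random((n, k))
S /= S.sum(axis=1, keepdims=True)             # rows are cluster distributions
X_pool = S.T @ X                              # cluster-level representations
A_pool = S.T @ A @ S                          # coarsened cluster adjacency
g_hier = X_pool.mean(axis=0)
```

The coarsened pair (A_pool, X_pool) can be fed to further GNN and pooling layers, so the final graph embedding reflects structure at several scales instead of one flat average.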
There are some recent works on hierarchically learning graph representations, analogous to the pooling step in conventional convolutional neural networks (CNNs). However, local structural information is still largely neglected during the pooling process. Random walks are widely adopted in various network analysis tasks ranging from network embedding to label propagation. They capture and convert geometric structures into structured sequences while alleviating the issues of sparsity and the curse of dimensionality.
Though random walks on plain networks have been intensively studied, in real-world systems, nodes are often not pure vertices, but own different characteristics, described by the rich set of data associated with them. These node attributes contain plentiful information that often complements the network, and bring opportunities to the random-walk-based analysis. However, it is unclear how random walks could be developed for attributed networks towards an effective joint information extraction.
Node attributes make node interactions more complicated and are heterogeneous with respect to topological structures. To bridge the gap, we explore performing joint random walks on attributed networks, and utilize them to boost deep node representation learning. The proposed framework, GraphRNA, consists of two major components: a collaborative walking mechanism, AttriWalk, and a tailored deep embedding architecture, GRN. AttriWalk regards node attributes as a bipartite network and uses it to make the walks more diverse, mitigating the tendency of converging to nodes with high centralities.
AttriWalk enables us to advance the prominent deep network embedding model, graph convolutional networks, towards a more effective architecture - GRN. GRN empowers node representations to interact in the same way as nodes interact in the original attributed network. Experimental results on real-world datasets demonstrate the effectiveness of GraphRNA compared with the state-of-the-art embedding algorithms.
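The alternating-walk idea can be sketched on a toy attributed graph (the graph, attributes, and mixing probability `alpha` are illustrative; GraphRNA's actual walk design may differ in its details):

```python
import numpy as np

rng = np.random.default_rng(4)

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}      # node -> neighbor nodes
node_attrs = {0: [0, 1], 1: [1], 2: [1, 2], 3: [2]}     # node -> attribute ids
attr_nodes = {0: [0], 1: [0, 1, 2], 2: [2, 3]}          # attribute -> nodes having it

def attri_walk(start, length, alpha=0.5):
    """With probability alpha, hop node -> attribute -> node through the
    bipartite attribute network; otherwise take a plain topological step."""
    walk, node = [start], start
    for _ in range(length - 1):
        if rng.random() < alpha and node_attrs[node]:
            a = int(rng.choice(node_attrs[node]))        # shared attribute
            node = int(rng.choice(attr_nodes[a]))        # any node carrying it
        else:
            node = int(rng.choice(adj[node]))
        walk.append(node)
    return walk

walk = attri_walk(0, 10)
```

Attribute hops connect nodes that share features even when no edge links them, so the resulting sequences mix topological and attribute proximity before being fed to the embedding model.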
Attention operators have been widely applied in various fields, including computer vision, natural language processing, and network embedding learning. Attention operators on graph data enable learnable weights when aggregating information from neighboring nodes.
However, graph attention operators (GAOs) consume excessive computational resources, preventing their application to large graphs. In addition, GAOs belong to the family of soft attention, instead of hard attention, which has been shown to yield better performance. We propose a hard graph attention operator (hGAO), which attends only to important nodes, thereby improving performance and saving computational cost. To further reduce the requirements on computational resources, we propose a channel-wise graph attention operator (cGAO) that performs attention operations along channels. Experimental results demonstrate that our proposed deep models with the new operators achieve consistently better performance.
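For reference, a single-head soft graph attention aggregation of the kind GAO performs can be sketched as follows (the tanh scoring function, the self-loop, and the toy data are illustrative simplifications):

```python
import numpy as np

def gao_layer(H, adj, W, a):
    """Each node aggregates its neighbors (and itself) with softmax weights
    computed from a learnable scoring vector a over concatenated features."""
    Z = H @ W
    out = np.zeros_like(Z)
    for i in range(len(adj)):
        nbrs = adj[i] + [i]
        scores = np.array([np.tanh(a @ np.concatenate([Z[i], Z[j]]))
                           for j in nbrs])
        w = np.exp(scores - scores.max())   # numerically stable softmax
        w /= w.sum()
        out[i] = w @ Z[nbrs]                # convex combination of neighbors
    return out

rng = np.random.default_rng(5)
H = rng.normal(size=(4, 3))
W = rng.normal(size=(3, 3))
a = rng.normal(size=6)
adj = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}
out = gao_layer(H, adj, W, a)
```

The per-edge score computation is exactly what makes GAOs expensive on large graphs; hard attention prunes the neighbor set before this step, and channel-wise attention avoids per-edge scores altogether.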
Comparison results also indicate that hGAO achieves significantly better performance than GAO on both node and graph embedding tasks. Efficiency comparisons show that cGAO leads to dramatic savings in computational resources, making it applicable to large graphs. We address a fundamental problem in chemistry known as chemical reaction product prediction. Our main insight is that the input reactant and reagent molecules can be jointly represented as a graph, and the process of generating product molecules from reactant molecules can be formulated as a sequence of graph transformations.
To this end, we propose the Graph Transformation Policy Network (GTPN), a novel generic method that combines the strengths of graph neural networks and reinforcement learning to learn reactions directly from data with minimal chemical knowledge. Compared to previous methods, GTPN has appealing properties such as end-to-end learning and making no assumptions about the length or order of graph transformations. To guide model search through the complex discrete space of sets of bond changes effectively, we extend the standard policy gradient loss by adding useful constraints.
We present a graph-based semi-supervised learning (SSL) method for learning edge flows defined on a graph. Specifically, given flow measurements on a subset of edges, we want to predict the flows on the remaining edges. To this end, we develop a computational framework that imposes certain constraints on the overall flows, such as approximate flow conservation.
These constraints render our approach different from classical graph-based SSL for vertex labels, which posits that tightly connected nodes share similar labels and leverages the graph structure accordingly to extrapolate from a few vertex labels to the unlabeled vertices.
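A minimal version of this formulation on a toy graph with hypothetical measurements: penalize divergence ||B f||^2 while softly fitting the labeled edges, which reduces to a single linear solve.

```python
import numpy as np

# Oriented edges of a small graph; B is the node-edge incidence matrix,
# so (B @ f)[i] is the net flow into node i (its divergence).
edges = [(0, 1), (1, 2), (2, 3), (0, 3), (1, 3)]
n, m = 4, len(edges)
B = np.zeros((n, m))
for k, (i, j) in enumerate(edges):
    B[i, k], B[j, k] = -1.0, 1.0

labeled = {0: 2.0, 2: 2.0}    # measured flows on edges (0,1) and (2,3)
lam = 100.0                    # weight on fitting the labeled edges

# minimize ||B f||^2 (approximate conservation) + lam * sum_k (f_k - y_k)^2
D = np.diag([1.0 if k in labeled else 0.0 for k in range(m)])
y = np.array([labeled.get(k, 0.0) for k in range(m)])
f = np.linalg.solve(B.T @ B + lam * D, lam * D @ y)
```

On this example the solver routes the measured 2 units along the path 0-1-2-3 and back through edge (0,3), leaving the chord (1,3) idle, exactly the unique divergence-free flow consistent with the two measurements.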
We derive bounds for our method's reconstruction error and demonstrate its strong performance on synthetic and real-world flow networks from transportation, physical infrastructure, and the Web. Furthermore, we provide two active learning algorithms for selecting informative edges on which to measure flow, which has applications for optimal sensor deployment. The first strategy selects edges to minimize the reconstruction error bound and works well on flows that are approximately divergence-free.
The second approach clusters the graph and selects bottleneck edges that cross cluster boundaries, which works well on flows with global trends. Mapping the human brain, or understanding how certain brain regions relate to specific aspects of cognition, has been and remains an active area of neuroscience research. Functional magnetic resonance imaging (fMRI) data, in the form of images, time series, or graphs, are central to this research, but pose many challenges in phenotype prediction tasks.
Technical accuracy: Table 3 summarizes technical accuracy by type of case-detection algorithm and by medical condition (Table 3: median accuracy by algorithm type and condition). The main benefit of extracting information from text was that case-detection was significantly improved (Table 4: accuracy of case-detection algorithms comparing codes and text). What are the future directions for information extraction from EMR text? Regarding strengths and limitations, this study identified a good range of published papers on the extraction of information from text in EMRs.
Conclusions: A wide range of studies showed that information extracted from EMR text has been used to identify varied conditions with variable degrees of success.