Transforming data into a suitable representation is the first key step of data analysis, and the performance of any data-oriented method depends heavily on it. This raises the question of how we can best learn representations for textual entities that are: 1) precise, 2) robust against noisy terms, 3) transferable over time, and 4) interpretable by human inspection.
In this talk, I will present significant words language models: models of a set of documents that capture all, and only, the significant terms shared by those documents. I will also show how to apply significant words language models to several tasks, including group profiling, relevance feedback, and hierarchical classification, and how they improve performance over state-of-the-art methods.
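To make the idea concrete, below is a minimal toy sketch (not the exact formulation presented in the talk) of estimating such a model: each document in the group is treated as a fixed-weight mixture of a group model (the shared, significant terms we want), a collection-wide background model (generally common terms), and a document-specific model (terms peculiar to one document), and the group model is estimated with EM. The function name, mixture weights, and smoothing here are illustrative assumptions.

```python
from collections import Counter

def estimate_group_model(docs, collection, lambdas=(0.5, 0.3, 0.2), iters=50):
    """Toy EM estimate of the group (shared) term distribution.

    docs:       list of tokenized documents (lists of terms) in the group
    collection: tokenized background corpus (list of terms)
    lambdas:    fixed mixture weights for (group, background, document) parts
    """
    lam_g, lam_b, lam_d = lambdas
    vocab = set(t for d in docs for t in d)

    # Background model from collection counts (add-one smoothing over the vocabulary).
    coll_counts = Counter(collection)
    coll_total = sum(coll_counts.values())
    p_bg = {t: (coll_counts[t] + 1) / (coll_total + len(vocab)) for t in vocab}

    # Document-specific models from each document's own term frequencies.
    p_doc = []
    for d in docs:
        c = Counter(d)
        p_doc.append({t: c[t] / len(d) for t in vocab})

    # Initialize the group model uniformly.
    p_group = {t: 1.0 / len(vocab) for t in vocab}

    for _ in range(iters):
        expected = Counter()
        for d, pd in zip(docs, p_doc):
            for t, n in Counter(d).items():
                # E-step: responsibility of the group component for term t in this document.
                g = lam_g * p_group[t]
                denom = g + lam_b * p_bg[t] + lam_d * pd[t]
                expected[t] += n * g / denom
        # M-step: renormalize the expected counts into a probability distribution.
        total = sum(expected.values())
        p_group = {t: expected[t] / total for t in vocab}

    return p_group

# Hypothetical usage: terms frequent across the group, but neither collection-common
# nor confined to a single document, receive the highest group-model probability.
docs = [
    ["ranking", "model", "query", "relevance", "ranking"],
    ["relevance", "feedback", "query", "model", "user"],
    ["query", "model", "relevance", "evaluation"],
]
collection = ["model", "query", "data", "learning", "system"] * 10
top = sorted(estimate_group_model(docs, collection).items(), key=lambda kv: -kv[1])[:5]
print(top)
```

In this sketch, the background component explains away terms that are common everywhere and the document-specific components explain away idiosyncratic terms, so the estimated group model concentrates on the significant shared terms, which is the intuition behind significant words language models.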
|
Mostafa Dehghani is a PhD student at the University of Amsterdam. His research lies at the intersection of Information Retrieval and Machine Learning.
He has worked on core information retrieval, contributing to the language modeling framework, and is currently working on neural computational models for information retrieval applications.
|