Academic Seminars

Academic Seminar by Dr. Jianhua Yin, Tsinghua University

Published: 2017-05-23

Title: Model-based Algorithms for Text Clustering

Time: May 23, 08:30-09:00

Venue: Conference Room, 2nd Floor, Office Building

 

Abstract

Text clustering is an important technique in data mining and machine learning, and is widely used in applications such as topic detection and tracking, document summarization, and search results clustering. Although many studies have been done on text clustering, there are still many challenging problems to be solved: (1) How to set the number of clusters? Can we learn it from the dataset? (2) How to deal with the high-dimensional problem of text clustering? (3) How to deal with the sparsity problem of short text? (4) How to obtain a good representation of the clusters? (5) How to detect outlier documents? We will introduce several model-based text clustering algorithms that can cope with the above challenges. These algorithms are based on papers we published at SIGKDD'14, ICDE'16, and SIGKDD'16.

  

Bio

Jianhua Yin is currently a doctoral candidate at Tsinghua University in the laboratory of Prof. Jianyong Wang. He received his BS degree from the Department of Computer Science and Technology, Xidian University, in 2012. He visited the data mining research group in computer science at the University of Illinois at Urbana-Champaign, under the supervision of Prof. Jiawei Han, from October 2015 to April 2016. His research interests fall into the fields of data mining and machine learning. He is particularly interested in text mining and probabilistic graphical models.

 
