Invited Talk 1: Web Science, AI and Future of the Internet

Wendy Hall (University of Southampton, UK)

Bio: Dame Wendy Hall, DBE, FRS, FREng is Regius Professor of Computer Science, Pro Vice-Chancellor (International Engagement), and an Executive Director of the Web Science Institute at the University of Southampton.
She became a Dame Commander of the British Empire in the 2009 UK New Year’s Honours list, and is a Fellow of the Royal Society. She was President of the ACM, Senior Vice President of the Royal Academy of Engineering, and Chair of the European Commission’s ISTAG, and has been a member of the UK Prime Minister’s Council for Science and Technology, the European Research Council, the Global Commission on Internet Governance, and the World Economic Forum’s Global Futures Council on the Digital Economy. Dame Wendy was co-Chair of the UK government’s AI Review, which was published in October 2017, and has recently been announced by the UK government as the first Skills Champion for AI in the UK.

Abstract: The Web and Artificial Intelligence have always been interwoven. AI technologies have long been used by Web developers and the major platforms to provide increasingly intelligent services for Web and internet users, and it was always part of Tim Berners-Lee’s original design to develop an intelligent, or semantic, Web that enabled machines to infer knowledge from interconnected documents and data. Web Science studies the evolution of the Web from a sociotechnical perspective and how human intelligence interacts with the artificial intelligence we derive from our use of the Web.

Artificial Intelligence is set to transform society in the coming decades in ways that have long been predicted by science fiction writers but are only now becoming feasible because of recent developments in computing technology, machine learning and the availability of massive amounts of data on which to train the algorithms. The potential is enormous, and governments around the world are considering the impact of AI on society, both in terms of how it will change the world of work and in terms of the potential advantages that the technology can bring. But we must also be very aware of the potential threats to society that such developments might bring and the ethical, accountability and diversity issues we need to address, particularly in the world of software automation. If we don’t lay the groundwork well now, there is huge potential for chaos and confusion in the future as AI starts to become more dominant in all our lives, which is why I argue we need to take a sociotechnical approach to every aspect of the evolution of AI in society, as we have for the study of the Web.

But as a result of all these developments we are facing a time of major change and disruption for the internet – the technology that has underpinned so much societal change over the last fifty years. In this talk we will argue that we must take a sociotechnical approach to our analysis of the evolution of the internet in order to ensure that the internet of the future helps us create a world that we all want to live in.


Haifeng Wang (Baidu)

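Bio: Dr. Haifeng Wang is the CTO of Baidu and the overall head of its AI Technology Platform Group (AIG) and Basic Technology Group (TG). He also serves as Head of Baidu Research and Director of the National Engineering Laboratory of Deep Learning Technology and Application. He is the first person of Chinese descent to have served as president in the more than 50-year history of the ACL (Association for Computational Linguistics), the most influential international academic organization in natural language processing; the only ACL Fellow from mainland China; and the founding chair of the ACL Asia-Pacific chapter. He is a member of the IEEE Industry Advisory Board. He also serves as vice chairman of organizations including the Chinese Information Processing Society of China, the Chinese Institute of Electronics, the Cybersecurity Association of China, the National Engineering Laboratory for Brain-Inspired Intelligence Technology and Application, the New Generation Artificial Intelligence Industry Technology Innovation Strategic Alliance, and the Artificial Intelligence Industry Alliance; as vice director of the technical committee of the National Engineering Laboratory for Big Data System Software; as a fellow of the Chinese Association for Artificial Intelligence; and as a member of the New Generation Artificial Intelligence Strategic Advisory Committee. He has received one second-class National Science and Technology Progress Award (as first contributor) and four first-class Science and Technology Progress Awards from the Chinese Institute of Electronics (as first contributor on each). He was the only recipient from the internet industry of the inaugural National Award for Excellence in Innovation, and the sole recipient of the first Wu Wenjun Artificial Intelligence Outstanding Contribution Award. He receives the State Council Special Government Allowance. He has published more than 120 academic papers and holds more than 100 granted invention patents in China and internationally. He has also received one China Patent Silver Award.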

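Abstract: Driven by breakthroughs in algorithms, computing power, and data, artificial intelligence has made great progress and now shows trends such as the convergence of multiple technologies and large-scale industrial application. As core technologies of artificial intelligence, knowledge graphs and semantic understanding are already widely applied and are delivering increasing value in making industry more intelligent. This talk explains knowledge graph and semantic understanding technologies and their applications in detail, and discusses directions for future development.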


Ming Li (University of Waterloo, Canada)

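Bio: Professor Ming Li received his Ph.D. from Cornell University in the United States and was a postdoctoral fellow at Harvard University. He is currently a University Professor at the University of Waterloo, Canada, a Canada Research Chair, a Fellow of the Royal Society of Canada, and a Fellow of the ACM and the IEEE. In 2010 he received the Killam Prize, Canada's top national science award. He is a leading authority on Kolmogorov complexity and has made contributions to machine learning, the average-case complexity of algorithms, information distance, and bioinformatics. He has published many influential papers in journals and conferences including Nature, PNAS, Scientific American, JACM, CACM, FOCS, and STOC.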

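Abstract: First-generation chatbots rely on templates, big data, and deep learning; they parrot what others say and cannot understand or learn in any real sense. We conjecture, and provide a partial theoretical proof, that simple neural networks are in fact not suited to natural language understanding. Second-generation chatbots should understand and learn. Finally, we propose an architecture for second-generation chatbots and feasible ways to implement it.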

Invited Talk 4: PaperRobot: Automated Scientific Knowledge Graph Construction and Paper Writing

Heng Ji  (University of Illinois at Urbana-Champaign)

Bio: Heng Ji is a professor in the Computer Science Department of the University of Illinois at Urbana-Champaign. She received her B.A. and M.A. in Computational Linguistics from Tsinghua University, and her M.S. and Ph.D. in Computer Science from New York University. Her research interests focus on Natural Language Processing, especially on Information Extraction (IE) and Knowledge Base Population. She was selected as a “Young Scientist” and a member of the Global Future Council on the Future of Computing by the World Economic Forum in 2016 and 2017. She received the “AI’s 10 to Watch” Award from IEEE Intelligent Systems in 2013, an NSF CAREER award in 2009, faculty awards from Google, IBM, Bosch and Tencent, the PACLIC2012 Best Paper Runner-up, an ACL2019 Best Demo Nomination, a “Best of SDM2013” paper, and a “Best of ICDM2013” paper. She has coordinated the NIST TAC Knowledge Base Population task since 2010, and served as the Program Committee Co-Chair of NAACL-HLT2018 and CCL2019. She is an associate editor for IEEE/ACM Transactions on Audio, Speech, and Language Processing.

Abstract: The ambitious goal of this work is to speed up scientific discovery and production by building PaperRobot, which addresses the following three main tasks.

The first task is to read existing papers. Scientists now find it difficult to keep up with the overwhelming number of papers, e.g., more than 500K biomedical papers are published every year, but scientists read, on average, only 264 papers per year (1 out of 5000 available papers). PaperRobot automatically reads existing papers to build background knowledge graphs (KGs) based on entity and relation extraction. Constructing knowledge graphs from scientific literature is generally more challenging than doing so in the general news domain, since it requires broader acquisition of domain-specific knowledge and deeper understanding of complex contexts. To better encode contextual information and external background knowledge, we propose a novel knowledge base (KB)-driven tree-structured long short-term memory network (Tree-LSTM) framework and a graph convolutional network model, incorporating two new types of features: (1) dependency structures to capture wide contexts; (2) entity properties (types and category descriptions) from external ontologies via entity linking.

The second task is to automatically create new ideas. Foster et al. (2015) show that more than 60% of 6.4 million papers in biomedicine and chemistry are about incremental work. This inspires us to automate the incremental creation of new ideas by predicting new links in background KGs, based on a new entity representation that combines KG structure and unstructured contextual text.

Finally, we move on to the last ambitious and fun task: writing a new paper about new ideas. The goal of this final step is to communicate the new ideas to the reader clearly, which is a very difficult thing to do; many scientists are, in fact, bad writers (Pinker, 2014). Using a novel memory-attention network architecture, PaperRobot automatically writes a new paper abstract about an input title along with predicted related entities, then further writes a conclusion and future work based on the abstract, and finally predicts a new title for a future follow-on paper. We choose biomedical science as our target domain due to the sheer volume of available papers. Turing tests show that PaperRobot-generated output strings are sometimes chosen over human-written ones, and most paper abstracts require only minimal edits from domain experts to become highly informative and coherent.

This work is based on collaborations with Kevin Knight (DiDi Labs) and Jiawei Han (UIUC).