New data challenges emerge as data becomes a key element in many successes of machine learning and AI. We use social media data to illustrate some novel data challenges for machine learning and data science. In this talk, we examine some critical issues related to big social media data, e.g., how to make ‘big’ data bigger, how to protect a user’s privacy without loss of service utility, and how to evaluate machine-learned results without ground truth and at scale. With more data and newer tools, we are better equipped than ever to answer challenging and novel research questions and to advance data science with ever-evolving data.
Bio: Dr. Huan Liu is a professor of Computer Science and Engineering at Arizona State University, where he was recognized for excellence in teaching and research. His research interests are in data mining, machine learning, social computing, and artificial intelligence, investigating problems that arise in real-world applications with high-dimensional data of disparate forms. He is a co-author of the textbook Social Media Mining: An Introduction, published by Cambridge University Press. He is a founding organizer of the International Conference Series on Social Computing, Behavioral-Cultural Modeling, and Prediction, and Chief Editor of Frontiers in Big Data. He is a Fellow of ACM, AAAI, AAAS, and IEEE.
In Hollywood, “chameleons” are actors who can play any part. Their versatility keeps them in demand. Interestingly, the movie-going public supports this versatility. These stars have avoided the trap of being “typecast” and have instead leveraged their value – in terms of both the salaries they command and the production and audience needs they satisfy – by virtue of their versatility.
Directors and agents turn to these chameleons to solve their casting dilemmas because they understand the breadth of the actors’ experience. In much the same way, researchers need to be able to retrieve relevant case data from the past to overcome the challenges they face in their contemporary work. This case-based reasoning is accelerated when researchers grasp the context and outcomes of past cases. As with the Hollywood “chameleons,” versatility increases the value of these past experiences.
Stated another way, the more “roles” our data can “play,” the greater the utility our data can supply and the more case-based reasoning our data can support. Yet, as with these “chameleons,” we need some insight into this historical data and its associated context. Researchers need mechanisms for mapping their challenges to previous efforts to ensure that the data is relevant and supports a range of new solutions. This presentation will discuss the need for careful cataloging and tagging of data to aid in retrieval and reuse, positing a strategy researchers can employ to identify data “stars” in their respective fields. Projects are underway that enable context-based retrieval in lieu of tagging. These promise to be less costly, less laborious, and less subject to human cataloging errors, as noted by Autonomy Corporation in a study on the promise of data analytics.
Bio: Dr. Matthew C. Stafford is the Chief Learning Officer for the US Air Force’s Air Education and Training Command, headquartered at Joint Base San Antonio, Texas. In this position, he leads a diverse team of learning professionals exploring contemporary applications of educational technology to enhance learning effectiveness and efficiency. He is part of the Air Force team leading an effort to create a single “Airman’s Learning Record” to capture all that Airmen know and can do, and was recently appointed to a Department of Defense group attempting to create a similar success with a Defense Universal Learning Record. He recently supported the Air Force Science Advisory Board in pursuing a Secretary of the Air Force research project into learning technologies, and is also working with senior Defense officials to design a human capital development approach to ensure that, as advanced technologies are created and delivered, planners, operators, and support staff will be ready to integrate, employ, and maintain these capabilities effectively. Dr. Stafford is also the Military Advisor to the Board of Directors of the Asia-Pacific Research Association on Curriculum Studies. He served as a United Nations Advisor on Human Capital Management to the Central Asian Hub for Civil Service in Astana, Kazakhstan, and supports the Chairman of the Joint Chiefs of Staff Process for Accreditation of Joint Education, leading assessment teams in the certification of Joint Education institutions. Prior to his current position, Dr. Stafford served as Vice President for Academic Affairs at the US Air Force’s Air University, Maxwell Air Force Base, Alabama, and as Dean of Faculty at the Federal Executive Institute in Charlottesville, Virginia. His research interests are diverse, ranging from military history and music to learning theory, educational technology, and cognitive science.
Classical deep learning architectures have achieved great success on Euclidean data. However, in real-world settings, non-Euclidean structured data such as graphs are more ubiquitous. Graphs are fundamental data structures that concisely capture the relational structure in many important real-world domains. In this talk, I will discuss how multi-view learning, graph theory, and deep learning can be integrated with domain knowledge to support knowledge-guided machine learning.
Bio: Dr. Aidong Zhang is a William Wulf Faculty Fellow and Professor of Computer Science in the School of Engineering and Applied Sciences at the University of Virginia (UVA). She is also affiliated with the Department of Biomedical Engineering and the Data Science Institute at the University of Virginia. Her research interests include data mining/data science, machine learning, bioinformatics, and health informatics. Dr. Zhang currently serves as the Editor-in-Chief of the IEEE Transactions on Computational Biology and Bioinformatics (TCBB). She served as the founding Chair of the ACM Special Interest Group on Bioinformatics, Computational Biology and Biomedical Informatics from 2011 to 2015 and is currently the Chair of its advisory board. She is also the founding and steering chair of the ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. Dr. Zhang is a Fellow of ACM and IEEE.