Machine Learning and Language (ML²) at CDS: Moving NLP Forward

NYU Center for Data Science
Sep 28, 2023 · 6 min read

It’s a pivotal time in Natural Language Processing (NLP) research, marked by the emergence of large language models (LLMs) that are reshaping what it means to work with human language technologies. Building on this momentum is a dynamic research group at the heart of CDS called the Machine Learning and Language (ML²) group.

Through a series of interviews with key members of ML², including CDS Assistant Professor of Computer Science and Data Science He He; CDS Associate Professor of Linguistics and Data Science Tal Linzen; and João Sedoc, CDS Affiliated Professor and Assistant Professor in the Department of Technology, Operations, and Statistics at NYU’s Stern School of Business, we delve into the vision, impact, and groundbreaking research of this initiative.

The Genesis of ML²

ML², a subset of the larger CILVR Lab (Computational Intelligence, Learning, Vision, and Robotics), has its roots in the collective vision of CDS Associate Professor of Linguistics and Data Science Sam Bowman and CDS Associate Professor of Computer Science and Data Science Kyunghyun Cho. Cho, the first NLP faculty member at CDS, laid the groundwork, and Bowman created the initiative and greatly expanded it in the group’s early days. By 2020, ML² was a thriving community, primarily known for its recurring speaker series where researchers presented their work to peers. This collaborative atmosphere, combined with individual lab meetings and the broader ML² seminars, fostered a culture of continuous learning and knowledge sharing.

A Vision for ML²

In the beginning, ML² was simply the hub for NLP research at NYU. While this is still true, the introduction of LLMs has forced NLP as a discipline to examine itself, and ML² has been part of that field-wide reflection. What does it mean to work in NLP in the age of LLMs?

On one hand, LLMs offer unparalleled capabilities in terms of data processing and prediction. On the other hand, their complexity and the proprietary nature of leading models such as OpenAI’s ChatGPT can make them enigmatic, even to seasoned researchers. This duality is further complicated by the fact that while these models are advancing rapidly, they often remain black boxes, making scientific work with them a challenging endeavor.

For ML², the arrival of proprietary LLMs underscores the need for originality and innovation, as the conventional methods of tweaking model architectures or input encodings no longer guarantee breakthroughs. In the immediate wake of LLMs, said Linzen, one trend was to simply “take a task, try to solve it with GPT-3, and then write a paper about how GPT-3 did on this task.” Dissatisfied with such incremental approaches, Linzen said there is a common sentiment within ML² that new directions are needed, especially ones that go “beyond the obvious next step.”

Additionally, the goals of companies like OpenAI are not necessarily the same as those of researchers. Linzen, for example, is interested in how human language works, a question that will never be answered simply by developing ever-more-sophisticated AI models. Furthermore, CDS founding director Yann LeCun has recently said that he suspects LLMs are ultimately “an off-ramp” on the “highway towards truly intelligent machines,” implying that much more fundamental research is still needed, rather than simply adding more layers and compute to existing architectures.

An Interdisciplinarily Diverse Group

Original takes on the NLP field are exactly what the researchers at ML² specialize in, and one ingredient in that originality is a diversity of perspectives. The group offers a balanced blend of expertise, with members like Cho deeply rooted in the technical side, Linzen in linguistics, and others like He, Sedoc, and Sam Bowman spanning the spectrum.

If you imagine that spectrum as existing between pure machine language research and pure language research, the distribution of the faculty is “evenly spaced,” said Sedoc, with a laugh. “There’s someone in each specialty, rather than the whole group being heavy on one or the other side of that spectrum.” Sedoc himself is associated with the Stern School of Business.

Sedoc added that this diversity is useful for graduate students, too, as they always have an expert to consult with on any given topic. Furthermore, the cross-pollination that results from this diversity ensures a holistic approach to NLP challenges, making ML² an invaluable asset to the wider data science community.

He made a similar point: with members who work in psychology, business, computer science, and computational social science as well as core NLP research, the group’s diversity leads to interdisciplinary work that offers new insights and perspectives across fields.

Research Highlights

He, Linzen, and Sedoc provided examples of the diversity of research focuses at ML².

Cho’s work on attention mechanisms within deep learning models has been seminal in the field. In an influential 2015 paper titled “Neural Machine Translation by Jointly Learning to Align and Translate,” Cho and his co-authors introduced an attention mechanism that allowed a model to focus on different parts of an input sequence while producing each word of its output sequence. The attention mechanism had a profound impact on neural machine translation, and tasks such as image captioning, speech recognition, and many more soon adopted attention mechanisms to improve their performance.
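As a loose illustration of the core idea (not the paper’s exact formulation, which learns an alignment network rather than using a plain dot product), attention can be sketched as scoring the decoder’s current state against each encoder state, normalizing the scores with a softmax, and taking a weighted sum of the encoder states:

```python
import math

def attention_weights(query, keys):
    """Score the query against each key (dot product), then apply a
    softmax so the weights form a distribution over input positions."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Weighted sum of values: the 'context vector' the decoder
    consumes when producing its next output word."""
    w = attention_weights(query, keys)
    dim = len(values[0])
    return [sum(w[i] * values[i][d] for i in range(len(values)))
            for d in range(dim)]

# Toy encoder states for a three-word input sentence (2-d vectors).
keys = values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.0]  # current decoder state

weights = attention_weights(query, keys)
context = attend(query, keys, values)
print(weights)  # most weight on positions whose keys align with the query
```

Here the first and third input positions align equally well with the query, so they receive equal, larger weights than the second; the model “focuses” on them when producing the next output word.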

Similarly, Sam Bowman’s development of assessment tools for large language models, particularly the crowdsourced General Language Understanding Evaluation (“GLUE”) and “SuperGLUE” benchmarks, has become a staple of the NLP community. These benchmarks are collections of resources for training, evaluating, and analyzing natural language understanding systems, presenting a series of tasks, ranging from question answering to sentiment analysis, that evaluate the performance of models on various NLP challenges.

Sedoc’s work on conversational AI, particularly in the realm of public health and healthcare, is another testament to the group’s innovative spirit. His research delves into the effectiveness, bias, fairness, and safety of conversational AI, exploring the nuances of emotion and empathy in these interactions.

Linzen, with a background spanning computer science, math, and linguistics, offers a unique perspective on the group’s endeavors. His research focuses on the evaluation and analysis of models, aiming for an interpretable cognitive model, using machine learning insights to shed light on human intelligence.

He’s recent work on evaluating the impact of writing with large language models like ChatGPT further illustrates the group’s range. Her research delves into the unintended consequences of these models, particularly how they can reduce content diversity and flatten the unique characteristics of a writer’s voice. He’s research also touches on robustness, truthfulness, alignment, and human collaboration.

A Hub for Collaboration and Learning

Within NYU, ML² serves as a bridge between the university’s larger machine learning community and other communities working on NLP problems. The vision is clear: to continue expanding and establishing ML² as one of the leading groups in NLP, offering a diverse range of expertise from the technical to the linguistic.

Beyond research, ML² is also a hub for learning and collaboration. The group hosts the “Text as Data” seminar series, inviting speakers from the NLP realm. Additionally, there’s an ML² seminar where students present papers and their own research, fostering an environment of continuous learning and knowledge sharing.

He also highlighted the NYU AI School, a volunteer-driven initiative that teaches entry-level artificial intelligence and machine learning skills to early undergraduate students.

Looking Ahead

The future of ML² is bright. With the rapid advancements in open-source models, the increasing importance of understanding and leveraging LLMs, and the group’s drive to keep focused on the fundamental questions at the intersection of machine learning and language, the group is poised to remain at the forefront of NLP research.

ML² is more than just a research initiative. It’s a beacon for innovation, collaboration, and exploration in the world of NLP, ensuring that NYU remains a global leader in data science research and education.

By Stephen Thomas

