Bringing People Together by Training Machines to Better Understand Our Differences

Influential computer scientist Kathy McKeown heads up two multimillion-dollar grants: one to analyze cross-cultural norms and another to better understand grief in the Black community

Aug 03 2022 | By Bernadette Ocampo Young

On TV, it’s a common comedy trope: cultural differences lead to communication breakdown. Here’s a typical scene: government officials from two countries are meeting for the first time. One delegation presents a gift that the other graciously accepts, but the recipients lack the cultural sensitivity to reciprocate, and everyone’s best intentions quickly unravel from there. It may be funny to watch on a TV show, but in real life such situations can be disastrous, and they may well be avoidable.

Columbia researchers, in collaboration with colleagues from the University of California Davis, New York University, the University of Illinois at Urbana-Champaign, and Stony Brook University, are setting out to understand how and why intercultural interactions break down and what can be done to prevent those breakdowns. Led by Columbia Engineering Computer Science Professor Kathleen McKeown, the team will use a $5 million grant from the Defense Advanced Research Projects Agency (DARPA) to develop unsupervised models designed to learn socio-cultural norms across multiple cultures and languages and then analyze how conflicting norms can derail a conversation and cause misunderstandings. They are dubbing this new system “Cross-cultural Harmony through Affect and Response Mediation,” or CHARM.

The three-year CHARM research project will initially focus on Mandarin within Chinese culture and will expand to include other languages and cultures in later years. The main data for the research will be Chinese-language videos taken from the internet: multilingual reality TV shows, interview recordings, and group meeting videos.

The aim is to build a corpus that can help broaden the field and improve existing language models. Even with numerous language models available today such as GPT-3, the technology isn’t there yet. Said McKeown, “Natural language processing has advanced a lot in recent years, but the language models cannot be used straight out of the box. They need to be trained to avoid bias and need to be augmented with new objectives so they do not generate outputs that are surprising.”

McKeown, the Henry and Gertrude Rothschild Professor of Computer Science and an expert in natural language processing (NLP), has led the field in text summarization and established processes and models that automatically summarize large texts, like a news article, into a short and easy-to-read summary.

Recently, her research interests have expanded to include how NLP can be used to address social needs. Since there is a vast amount of data available on the internet and through social media, McKeown has various projects that analyze social media for insights into the world. She recently won the 2023 IEEE Innovation in Societal Infrastructure Award for research that analyzes social media and pushes the boundaries of NLP.

McKeown has collaborated with Desmond Upton Patton, formerly a professor of social work and sociology at Columbia’s School of Social Work and now Penn Integrates Knowledge Professor at the University of Pennsylvania. An expert on gun violence, youth social media use, and qualitative methods, Patton has worked with McKeown on numerous projects in the last few years that focus on Black digital expression on social media. A defining characteristic of their research is their unique methodology for engaging with the community to understand how Black people express grief online. For instance, in one collaboration studying how gang members in Chicago use social media to express grief, they worked directly with local youth and gang members to gain deeper insight into tweets that included emojis and terms particular to this group. For this work, the main contribution has been an approach that highlighted the importance of computational models of language specific to the demographic (gang-involved youth) and the impact of context on the interpretation of the tweet.

Kathleen McKeown and Desmond Upton Patton

Their research focuses on the Black community because most research on understanding grief is based on White Americans, meaning very little is known about how Black people use social media to express, process, and cope with grief. The pandemic, police brutality, and losing loved ones can cause traumatic reactions that are difficult for people to cope with. McKeown and Patton saw an opportunity to create innovative computational tools to help recognize and interpret expressions of grief.

Another of their projects, Identifying and Understanding Digital Expressions of Black Grief, will develop tools that automatically identify digital expressions of grief and that social workers and health professionals could then use for intervention and treatment programs.

For that new $1.2 million grant from the National Science Foundation, which also focuses on social media, they have devised a multi-layered approach to annotating their data. First, they will ask participants to submit diary-like entries about their feelings; then a linguist will review the entries to better understand how participants use language to convey those feelings. An expert on grief disorder, M. Katherine Shear, will sit down with some participants to delve deeper into their feelings and their meaning. They will partner with the non-profit civil rights and faith-based organization, Mobilizing Preachers and Communities (MPAC), to identify up to 50 Black Harlem residents to participate in the study.

The team has created a website where participants can write about their reactions to daily events. Aside from the negative emotional effects of the pandemic, Black people experience racism and other hardships particular to their community. The project will focus on analyzing how this community uses African American English to express grief and which events have prompted those feelings. The hope is that this more nuanced approach will result in a richer corpus of African American English, with the further goal of identifying situations when help or treatment is needed.

One of the reasons the partnership between McKeown and Patton is successful is that they understand each other and their work is synergistic. They each bring their own expertise to their projects. Patton, along with his PhD students from the School of Social Work, thinks about a project through a social work and sociological lens that is more qualitative: how and why people feel and act the way they do. McKeown and students from her NLP group bring the computational expertise and an understanding of machine learning and natural language processing.

The nature of McKeown’s work with Patton has required them to talk about sensitive topics. In their earlier work, they realized that the computer models were categorizing the N-word as an aggressive and negative term. Patton shared that the N-word is not necessarily negative in the Black community and so should not be categorized as such. This led him to explain why and how the Black community uses the word, along with other cultural aspects of the community. Said Patton, “We had to have an uncomfortable conversation about racism and culture, but our work is so much better because of the dialogue we have with each other.”

Patton recalls that when he first came to Columbia in 2015, he reached out to a couple of professors to see how he could collaborate with them on research. McKeown was the only senior professor in computer science who was willing to talk to him. McKeown remembers being so excited about the work Patton presented that she immediately knew she wanted to work with him. From their first meeting, the two hit it off and began their research examining the Black experience. They are one of the few research teams that focus on this area, and Patton credits his transdisciplinary collaboration with McKeown as one of the reasons why he earned tenure at the University.

“I have to say, coming to a new school and being a person of color that is trying to get into the computer science and engineering spaces was very scary,” said Patton. “It was easy to connect with Kathy and to learn from her. She has been such a mentor for me in this space and, I think, the model for how senior faculty should connect with a junior faculty member.”

“We are interested in how the events of our day have impacted people,” said McKeown. “The past two years have been hard for many because of societal issues like the pandemic, social justice, and racist attacks. But we have yet to see the lasting emotional effects of how these everyday events are affecting people.”