AI Lecture Explores New Ideas in Training, Reasoning, and Language Models
Columbia Engineering’s AI Distinguished Lecture Series showcases the innovations shaping the future of artificial intelligence.
Innovation was front and center at Columbia Engineering’s latest AI Distinguished Lecture, where experts and rising leaders gathered Oct. 29 on the Morningside Heights campus to examine the future of AI. Now in its second year, this series has brought together thought leaders and innovators from universities, companies, and organizations to share breakthroughs shaping the rapidly evolving world of artificial intelligence.
The event highlighted Columbia’s growing expertise in AI—both in research and education. From hiring over 20 new faculty members in AI-related areas to launching cross-disciplinary collaborations with Columbia Business School, the Mailman School of Public Health, and Columbia University Irving Medical Center, Columbia continues to push the boundaries of what’s possible through responsible and impactful innovation.
“And that's the mission for us to be a University collaborator and contributor in this field,” said Dean Shih-Fu Chang in his opening remarks at the event. “We have the opportunity to lead the frontier, as well as advocate and develop the framework for responsible AI use.”
This session featured Micah Goldblum and John Hewitt, faculty members who recently joined Columbia Engineering and whose work explores natural language processing, large transformer models, mathematical foundations of deep learning, and 3D generation.
Explorations of optimization and generalization for neural nets
Electrical Engineering Assistant Professor Micah Goldblum first challenged a long-held assumption in the field: that training language models with small batch sizes is too slow or unstable to be useful. He showed that with the right adjustments to the optimizer, small-batch training can be surprisingly effective. This approach not only saves GPU memory but also removes the need for workarounds such as gradient accumulation or LoRA, making advanced AI training more efficient and accessible.
The second half of his talk focused on teaching AI systems to “think longer” when faced with tough problems. When models are allowed to loop through their reasoning steps multiple times, much as people pause and think more deeply when a task gets hard, they can handle more complex problems, such as navigating larger mazes or solving more intricate puzzles. The idea is simple but exciting: if AI systems can spend more time thinking, they might also become better at reasoning and adapting in new situations.
“The ability to think longer to solve harder problems is only one of the amazing features of human cognition. For example, humans can learn one solution strategy for one kind of problem, another solution strategy for another kind of problem, and then fuse these solution strategies to solve new problems they have never seen before,” said Goldblum. “Maybe we can take inspiration from all the amazing features of human cognition and build models with those same capabilities, too.”
An adventure inside a large language model
The talk by Computer Science Assistant Professor John Hewitt explored how AI systems process and understand language. Hewitt highlighted his research showing that large language models (LLMs), even without explicit teaching, can learn the structure of language—syntax, semantics, and context—just by analyzing text from the internet. By looking under the hood at how words are represented mathematically, researchers can see that AI isn’t just guessing the next word; it’s building a kind of abstract “knowledge map” to guide its responses.
The second major insight focused on instruction-following and AI “vocabulary expansion.” Simple rules can transform a base language model into one that answers questions effectively, as a chatbot like ChatGPT does, without fundamentally changing its underlying structure. Hewitt discussed how we can teach AI to follow instructions and even create its own “words” for new concepts. For example, a custom word could tell a model to give very short answers or intentionally wrong ones. AI can understand these new concepts in ways we don’t fully expect, he said, demonstrating that it thinks in ways that are both familiar and unexpected. This work highlights the need for new strategies, and new vocabularies, to communicate effectively with AI as it becomes increasingly intelligent.
“When we think about interpreting what goes on inside LLMs, I think that we should expect to have to have new ideas, to have new words that correspond to the ways in which they process the world,” said Hewitt. “So we can't just connect them back to how we already think about the world. We need to discover new ways of thinking, new concepts that they might have developed that we don't know anything about.”
Lead Photo Caption (L to R): Computer Science Assistant Professor John Hewitt, Vice Dean of Computing and AI Vishal Misra, Dean Shih-Fu Chang, and Electrical Engineering Assistant Professor Micah Goldblum
Lead Photo Credit: David Dini