This ChatGPT update is centered on truly putting GPT-4o’s “all-around” capabilities to work: it not only writes text, but can also listen, see, and converse more naturally. For everyday use, the most noticeable changes are smoother voice interactions, easier cross-language communication, and faster access on desktop.
GPT-4o turns ChatGPT into an assistant that can “see and hear”
GPT-4o is positioned as omni (all-around), so ChatGPT is no longer limited to text Q&A. Instead, it unifies the understanding of text, images, and audio within a single reasoning system. You can upload images or files within the same conversation, allowing ChatGPT to explain, organize, and analyze directly based on the content.
Where you once had to describe what was on the screen, many problems can now be solved by simply showing it: spreadsheets, screenshots, and manual pages all reach a conclusion faster.
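For readers who want the same multimodal behavior programmatically, the flow above maps onto a single request that pairs a text question with an image. The sketch below builds such a request payload, assuming the shape of OpenAI's Chat Completions API (the `gpt-4o` model name and `image_url` content type come from OpenAI's public documentation; the example URL and helper name are hypothetical):

```python
def build_vision_request(question: str, image_url: str) -> dict:
    """Build one chat request that sends a text question alongside an image.

    Assumes the OpenAI Chat Completions payload shape; the image URL
    here is a placeholder, not a real resource.
    """
    return {
        "model": "gpt-4o",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }

# Example: ask about a spreadsheet screenshot instead of describing it.
payload = build_vision_request(
    "Summarize the totals in this spreadsheet screenshot.",
    "https://example.com/sheet.png",  # hypothetical URL
)
```

The point of the structure is that the text and the image travel in the same `content` list of a single user message, so the model reasons over both together rather than in separate turns.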
More natural voice conversations: near real-time interpreting, too
ChatGPT’s voice experience now feels more like a conversation than a “voice input box,” with faster responses and a more coherent tone. Even more useful is language switching: within the same conversation, ChatGPT can follow the context as you move back and forth between Chinese and English, without you repeatedly restating the background.
On business trips, when hosting events, or in online meetings, ChatGPT can handle lightweight interpreting and on-the-fly rewriting: first translating the other party’s words into Chinese, then polishing your reply into more natural English. Going back and forth this way saves time.
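The two-step interpreting flow described above can be sketched as a pair of prompts, one per direction. This is a minimal illustration only: the prompt wording and helper name are assumptions, not an official template, and in practice you would send each prompt as a separate chat message:

```python
def interpreting_prompts(incoming: str, draft_reply_en: str) -> list[str]:
    """Return the two prompts for one round of lightweight interpreting.

    Step 1: translate the other party's words into Chinese.
    Step 2: polish your draft reply into natural English.
    Prompt wording here is illustrative, not an official template.
    """
    return [
        f"Translate the following into natural Chinese:\n{incoming}",
        f"Polish this English reply so it sounds natural and professional:\n{draft_reply_en}",
    ]

# One round trip: understand the incoming sentence, then refine your answer.
prompts = interpreting_prompts(
    "Could we move the deadline to Friday?",
    "Friday is ok for us, we confirm later today.",
)
```

Because the conversation keeps its context, you do not need to restate the background in either prompt; the model carries it across turns.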


