ChatGPT has recently received several updates: the voice interaction mode has been overhauled, and the multimodal capabilities of the GPT‑4o model let it work with speech and visual content as well as text. Instead of a purely text‑based exchange, ChatGPT now behaves more like an intelligent companion that can pick up on tone of voice and interpret what it sees. Below are the key changes worth noting.
Voice Mode Feels More Natural: Speech Pace and Tone Are Almost Human
The new advanced voice feature has been refined in tone and rhythm, removing much of the earlier robotic feel. It also supports switching languages in real time during a conversation, for example translating between Chinese and English, which makes cross‑language communication much smoother. For users who meet with overseas colleagues or are learning a foreign language, it is like having a personal interpreter on hand at all times.
In the future, this voice mode will be integrated more deeply into the Projects mode, creating a more immersive workflow. Imagine speaking aloud and having ChatGPT organize project progress or draft a report by voice, without typing a single word.
GPT‑4o Introduces a New Way to Interact: Screen Sharing and Real‑Time Analysis
GPT‑4o is the highlight of this update. It is no longer limited to text input and can process audio, video, and text together. You can now share your computer or phone screen with ChatGPT and let it make suggestions based on what it sees. For example, if you get stuck while coding, ChatGPT can analyze the code snippet on your screen and tell you by voice where the error is.

