Manus AI vs Perplexity Comet vs AutoGLM: AI Agents for Everyday Tasks

The next wave of artificial‑intelligence tools goes beyond answering questions – they can take real actions for you. Instead of just chatting, these systems can navigate websites, use apps, fill out forms, run code, or even build an entire application. Three notable products illustrate how far these agents have come: Manus AI, Perplexity’s Comet browser and AutoGLM. Each aims to be a digital helper for everyday people, but they take different approaches. This article compares their interfaces, capabilities and strengths.

Manus AI: a general‑purpose autonomous agent

Manus AI positions itself as a “general agent” that turns thoughts into actions. It combines several large language models (reportedly including Anthropic’s Claude, Alibaba’s Qwen and other specialist models) with a library of tools to break a goal down into concrete steps. After you give Manus a prompt, it plans the work and executes it autonomously. You can watch what it is doing in a virtual desktop, intervene if needed and then let it continue. The interface is a clean chat window plus a feed of running agents, where you delegate tasks and track their progress.
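As a rough illustration of the plan-then-execute pattern described above, here is a minimal Python sketch. It is not Manus's actual implementation: the tool names (browse, write_doc), the plan function and the AgentRun class are all hypothetical, and the planner is hard-coded where a real agent would ask a language model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical tools: each takes a string argument and returns a string result.
Tool = Callable[[str], str]


def browse(query: str) -> str:
    """Placeholder for a web-browsing tool."""
    return f"search results for '{query}'"


def write_doc(content: str) -> str:
    """Placeholder for a document-writing tool."""
    return f"document containing: {content}"


TOOLS: Dict[str, Tool] = {"browse": browse, "write_doc": write_doc}


def plan(goal: str) -> List[Tuple[str, str]]:
    """Stand-in planner (assumption): returns (tool name, argument) steps.

    A real agent would generate this list with a language model.
    """
    return [("browse", goal), ("write_doc", f"summary of {goal}")]


@dataclass
class AgentRun:
    goal: str
    log: List[str] = field(default_factory=list)

    def execute(self) -> List[str]:
        # Run each planned step in order and record what happened,
        # so a user watching the run can follow along.
        for tool_name, arg in plan(self.goal):
            result = TOOLS[tool_name](arg)
            self.log.append(f"{tool_name}: {result}")
        return self.log


if __name__ == "__main__":
    for entry in AgentRun("a weekend trip to Lisbon").execute():
        print(entry)
```

The log is what makes the run observable: each step is recorded as it finishes, which is the hook a "watch and intervene" interface would need.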

What makes Manus impressive is the breadth of its capabilities. It includes dozens of integrated tools for browsing, scraping data, writing and executing code, creating documents, building websites and deploying them. In demos, Manus has:

  • built a multi‑page website from a single sentence prompt, complete with HTML/CSS/JS, images and hosting, in minutes;
  • planned a 30‑day trip to New York City with day‑by‑day itineraries, hotel options, packing checklists and cost estimates; and
  • generated a full lead‑generation web application, including login, dashboard and data visualisation, and deployed it live.

Manus can also run several agents in parallel, so you could ask one agent to code an app, another to research a topic and a third to create a presentation at the same time. Once tasks are complete, Manus delivers a finished result rather than just a suggestion. All of this power comes with caveats: the service is in closed beta with long waitlists, and reports suggest it will be priced as a premium subscription. Because it operates in its own cloud environment, it doesn’t have direct access to your personal accounts unless you provide credentials, which limits some tasks such as posting to social media or sending emails on your behalf. Still, for complex multi‑step jobs, Manus is the most autonomous of the three tools compared here.

Perplexity Comet: an AI‑first web browser

Perplexity’s Comet isn’t a separate assistant so much as a web browser with an assistant built in. Built on Chromium, it behaves like the browser you already use while integrating an AI assistant into your browsing session. You can ask questions in natural language and get cited answers from the web, just like using Perplexity’s search engine. More importantly, Comet can act on pages: it can click links, fill forms, apply filters, scroll through long documents and summarise or explain whatever you’re reading.

The UI is familiar – it looks like Chrome or Edge with an additional sidebar that houses the assistant. You might say, “Find me a cheaper price for this laptop,” and Comet will compare your current shopping tabs and show alternatives. If you’re reading a paper, you can highlight text and ask the assistant to summarise it or explain key points. You can also issue multi‑step commands such as, “Book a flight from Lisbon to Taipei on 15 August, avoiding 737 Max aircraft, preferably with KLM or EVA,” and Comet will carry out the search, apply filters and present options inside the browser. It can even interact with your webmail to find unread emails and draft responses when you ask it to “check my Gmail for important messages I missed.”

Comet uses a combination of Perplexity’s own models and third‑party models like GPT‑4 or Claude to interpret queries and generate responses. Some processing happens locally for privacy, while heavier tasks call out to cloud models. The product is designed to be hands‑on: you browse and do your work as usual, but whenever you need help the assistant is right there. The aim is to collapse many of the small steps involved in online tasks into a single instruction. Because it operates within a real browser, Comet inherits the same security and permission models as your own computer; you stay logged into sites and can see exactly what the AI is doing. The product is still in invite‑only beta and is currently available to paying Perplexity subscribers, but it hints at a future where every browser has an integrated agent.

AutoGLM: a mobile and multimodal agent

AutoGLM, developed by China’s Zhipu AI, takes the idea of an agent a step further. Rather than operating only in a browser, AutoGLM is meant to control both mobile apps and desktop software. It does this by running a “cloud phone” and a “cloud computer”: standardised virtual devices with pre‑installed apps (30 or more on the phone, plus office software and a browser on the computer). When you issue a command, AutoGLM interprets it, opens the necessary apps on the cloud device, navigates the interface using computer‑vision models, and performs the actions. You can watch the agent’s progress on your own device and step in if a login or two‑factor code is required.

AutoGLM is pitched as a personal concierge for daily life as well as work. You could tell it to “order 20 bubble teas from the nearest shop using a coupon,” and it will open the delivery app, search for the item, add it to the cart, apply the coupon and proceed to checkout. You could ask it to “book me a flight next Monday evening to Los Angeles,” and it will search a travel app or website, choose flights matching your criteria and hold the booking. In a professional context, AutoGLM can pull information from a Q&A forum, summarise it and create a report or video, then post it directly on social media.
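The loop described above for AutoGLM (interpret the command, look at the screen, act, and pause for the user when a login or two‑factor prompt appears) can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not AutoGLM's real code: the Screen and Action types, the choose_action lookup table and the scripted list of screens are all hypothetical stand-ins for a vision model and a live virtual device.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Screen:
    """Hypothetical snapshot of what the virtual device is showing."""
    name: str
    needs_human: bool = False  # e.g. a login or two-factor prompt


@dataclass
class Action:
    kind: str          # "tap", "type" or "done"
    target: str = ""


def choose_action(screen: Screen, command: str) -> Action:
    """Stand-in for the vision model (assumption): a fixed lookup table."""
    script = {
        "home": Action("tap", "delivery app"),
        "delivery app": Action("type", command),
        "results": Action("tap", "add to cart"),
        "cart": Action("done"),
    }
    return script.get(screen.name, Action("done"))


def run_agent(
    command: str,
    screens: List[Screen],
    ask_human: Callable[[Screen], None],
    max_steps: int = 20,
) -> List[str]:
    """Walk through the observed screens, acting or handing off as needed."""
    trace: List[str] = []
    for screen in screens[:max_steps]:
        if screen.needs_human:
            # Pause and let the user handle credentials themselves.
            ask_human(screen)
            trace.append(f"handoff:{screen.name}")
            continue
        action = choose_action(screen, command)
        trace.append(f"{action.kind}:{action.target}")
        if action.kind == "done":
            break
    return trace


if __name__ == "__main__":
    demo_screens = [
        Screen("home"),
        Screen("login", needs_human=True),
        Screen("delivery app"),
        Screen("results"),
        Screen("cart"),
    ]
    for step in run_agent("bubble tea", demo_screens, lambda s: None):
        print(step)
```

The design point is the handoff branch: the agent never sees or types credentials in this sketch, and the step cap keeps a confused agent from looping indefinitely.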
