The First Year of Large Language Model Productization: Tactics, Operations, and Strategy
28650 words (115 minutes)
|AI score: 94
The article opens with the excitement surrounding the productization of large language models, alongside forecasts that AI industry investment will reach $200 billion by 2025. Vendor APIs have made these models far more accessible, allowing even non-technical individuals to build intelligence into their products. Nevertheless, constructing stable, robust applications on top of them remains challenging. Drawing on the past year, the author shares lessons at the tactical, operational, and strategic levels.
On the tactical front, the article delves into strategies for elevating product quality and reliability, which include refining prompts, optimizing processes, conducting assessments, and implementing monitoring measures. It covers advanced prompting techniques, the application of Retrieval-Augmented Generation (RAG), structuring both inputs and outputs, and orchestrating collaborative workflows between humans and machines.
From an operational standpoint, the article accentuates the critical role of data management, which involves scrutinizing data discrepancies between development and production environments, regularly reviewing samples of inputs and outputs from language models (LLMs), and fostering effective collaboration with the models themselves. Additionally, it addresses the allocation of team roles and responsibilities, outlining approaches to team assembly, user experience design, and mitigating excessive dependence on AI engineers.
Strategically, the article outlines a suite of long-term considerations: deferring GPU investment until after achieving product-market fit (PMF), iteratively adjusting priorities, and calibrating risk tolerance to the application context. It also explores building LLM products that offer more than a thin demo, emphasizing earning user trust incrementally, avoiding building features that can readily be bought, and starting with prompts, evals, and data collection.
In conclusion, the article encapsulates the journey from demonstrating initial concepts (0 to 1) to full-scale productization (1 to N), underscoring the significance of this transformation and forecasting the trajectory of future technological advancements.
Announcing LangGraph v0.1 & LangGraph Cloud: Running agents at scale, reliably
LangChain Blog|blog.langchain.dev
1272 words (6 minutes)
|AI score: 93
The LangGraph v0.1 framework, introduced by LangChain, is engineered for constructing agents and multi-agent applications with meticulous control. It facilitates comprehensive process management for LLM (Large Language Model) applications, encompassing precise oversight of code, prompts, and LLM invocations, along with audit and quality assurance mechanisms. Companies like Klarna and Replit have deployed this framework in practical scenarios, validating its efficacy within intricate systems.
Concurrently, LangGraph Cloud, a novel infrastructure in closed beta testing, is designed to deploy LangGraph agents at scale while ensuring fault tolerance. It incorporates sophisticated capabilities such as stream processing, human-in-the-loop collaboration, and double-texting strategy management. One of the pivotal features of LangGraph Cloud is the LangGraph Studio, which equips developers with user-friendly tools for tracing and debugging agent execution paths, thereby streamlining the application deployment and maintenance workflows.
LangChain posits that while LLMs hold potential for controlling application workflows, the actual construction of such systems mandates a heightened level of precision and control. The advent of the LangGraph framework and LangGraph Cloud is intended to empower developers to surmount these obstacles and facilitate a smooth progression from prototype to production deployment. This initiative lays a foundation for ongoing innovation in the artificial intelligence domain, aiming to fortify the dependability of agent-based applications and to bridge the gap between user expectations and the capabilities of AI agents.
Comparing Pinecone vs Weaviate: Functionality Insights
1005 words (5 minutes)
|AI score: 91
This article provides a detailed comparison of Pinecone and Weaviate, two leading vector databases. It highlights their unique features and use cases, focusing on optimizing performance for high-dimensional data management, particularly in AI applications. Key points include Pinecone's compute and storage separation and static sharding, and Weaviate's contextualized embeddings and flexible deployment options.
What is an agent?
LangChain Blog|blog.langchain.dev
913 words (4 minutes)
|AI score: 95
This article from the LangChain Blog tackles the definition and understanding of 'agents' in LLM applications. The author, a LangChain developer, defines an agent as a system that uses an LLM to guide the control flow of an application, contrasting it with the perception of agents as advanced, human-like entities. The article introduces the concept of 'agentic' capabilities as a spectrum, similar to levels of autonomy in self-driving cars, advocating for its use in guiding the development, execution, and evaluation of LLM systems. It further highlights the need for new tools and infrastructure like LangChain's LangGraph and LangSmith to support increasingly agentic applications. The more 'agentic' an application, the more critical specialized tools become for managing its complexity.
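The definition above (a system in which the LLM's output decides which code path runs) can be sketched minimally. This is a hedged illustration: `call_llm` is a hypothetical stand-in for any chat-completion API, faked deterministically here so the example is self-contained.

```python
# Minimal sketch of "an LLM guiding the control flow": the model's output,
# not hard-coded logic, selects which branch of the program executes.

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model API call; here a fake,
    # deterministic router so the example runs without credentials.
    return "search" if "weather" in prompt else "answer"

def search_tool(query: str) -> str:
    # Hypothetical tool; a real agent might call a search API here.
    return f"[search results for: {query}]"

def run_agent(user_input: str) -> str:
    # The LLM decides the control flow: tool call vs. direct answer.
    decision = call_llm(f"Choose a tool for: {user_input}")
    if decision == "search":
        context = search_tool(user_input)
        return f"Answer based on {context}"
    return "Direct answer"

print(run_agent("What's the weather in Paris?"))
```

More "agentic" systems extend this pattern: the model may loop, pick among many tools, or decide when to stop, which is where frameworks like LangGraph add structure.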
Baidu Releases Wenxin Model 4.0 Turbo: Faster and Better Performance
3454 words (14 minutes)
|AI score: 91
At WAVE SUMMIT 2024, Baidu introduced the Wenxin Model 4.0 Turbo, emphasizing improvements in speed and performance. The model achieved optimization through innovations in data, basic models, alignment technology, knowledge, and dialogue. The conference also presented various innovations, including Agricultural AI, PaddlePaddle Framework 3.0, and the intelligent code assistant 'Wenxin Fast Code' 2.5. These technologies demonstrate Baidu's progress in AI models and applications, promoting the development of AGI. The daily user inquiries for Wenxin Model increased by 78%, and the average inquiry length rose by 89%, indicating its growing application and user demand.
Exploration of Multi-Agent System Applications in Financial Scenarios
8143 words (33 minutes)
|AI score: 91
This article details the speech by Chen Hong, a senior algorithm expert at Ant Group, at the AICon Global AI Development and Application Conference, discussing the application of multi-agent systems in finance. It focuses on the role of multi-agent systems in addressing the information-, knowledge-, and decision-intensive nature of financial work. By examining the technical evolution of large models and agents, the article highlights the stateful nature of agents and their critical role in task execution. It then proposes solutions for multi-agent applications in financial scenarios, especially the PEER Model for enhancing the rigor and professionalism of financial decision-making. Finally, it showcases Ant Group's practical applications built on the agentUniverse multi-agent framework, illustrating how the PEER Model improves analyst productivity across multiple financial scenarios.
Optimizing RAG Through an Evaluation-Based Methodology
3198 words (13 minutes)
|AI score: 91
The article begins by examining the role of AI in knowledge management, highlighting the potential of the Retrieval Augmented Generation (RAG) method to improve the quality of text generation. By enabling Large Language Models (LLMs) to access information from repositories such as vector databases, RAG enhances the accuracy, relevance, and reliability of generated text. The author underscores the critical importance of evaluation strategies in ensuring that AI products achieve success benchmarks. The article illustrates, through an experiment, how the RAG system can be optimized using tools like Qdrant and Quotient. Qdrant functions as an efficient vector database, ideal for the quick and precise retrieval of large datasets necessary for RAG solutions. Quotient provides tools to evaluate and refine RAG implementations, assisting teams in identifying deficiencies and improving their applications' performance.
Through the experiment, the author constructs a RAG pipeline and employs Qdrant and Quotient for assessment, leading to a set of critical insights. These include the identification of irrelevant documents and hallucinations, strategies for optimizing document retrieval, the necessity for adaptive retrieval, the effects of variations in models and prompts on the quality of responses, and the optimization tools offered by Qdrant and Quotient. A series of experiments explores the impact of different parameter settingsโsuch as embedding models, chunk size, chunk overlap, and the number of retrieved documentsโon RAG performance, as well as the influence of various LLMs. The results demonstrate that careful adjustment of these parameters and models can significantly enhance the RAG system's effectiveness.
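The parameter sweep described above can be sketched as a small grid search. This is a hedged sketch under stated assumptions: `evaluate_config`'s score is a synthetic placeholder standing in for Quotient-style faithfulness and relevance metrics, and `chunk` is a generic fixed-size chunker rather than any Qdrant API.

```python
# Sketch of sweeping RAG parameters (chunk size, chunk overlap, number of
# retrieved documents) and keeping the best-scoring configuration.
from itertools import product

def chunk(text: str, size: int, overlap: int) -> list[str]:
    # Generic fixed-size character chunking with overlap.
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def evaluate_config(corpus: str, size: int, overlap: int, top_k: int) -> float:
    chunks = chunk(corpus, size, overlap)
    # Placeholder score: a real pipeline would embed the chunks, retrieve
    # top_k per evaluation question, and measure faithfulness/relevance.
    return top_k / (1 + abs(len(chunks) - 10))

corpus = "..." * 500  # toy corpus standing in for real documents
best = max(
    product([256, 512], [0, 64], [3, 5]),  # (chunk_size, overlap, top_k) grid
    key=lambda cfg: evaluate_config(corpus, *cfg),
)
print("best (chunk_size, overlap, top_k):", best)
```

The point of the experiment stands regardless of the scoring function used: these knobs interact, so they are worth sweeping against a fixed evaluation set rather than set once by intuition.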
Introducing llama-agents: A Powerful Framework for Building Production Multi-Agent AI Systems
1273 words (6 minutes)
|AI score: 90
In the AI domain, LlamaIndex's open-source framework llama-agents is revolutionizing the development process for multi-agent AI systems. It provides developers with a robust toolkit that features a distributed service-oriented architecture, standardized API communication protocols, and adaptable orchestration workflows. This makes the creation of complex AI systems both more efficient and more reliable. Regardless of the application, whether it's question-answering systems, collaborative AI assistants, or distributed AI workflows, llama-agents empowers developers to transform agents into scalable microservices. Additionally, it offers straightforward deployment and real-time monitoring solutions.
Figma AI: Empowering Designers Everywhere
2344 words (10 minutes)
|AI score: 91
At Config 2024, Figma announced a range of new features, including Figma AI, aimed at solving real problems in the design process to enhance efficiency and creativity. Figma AI streamlines designers' workflows with features such as visual and AI-enhanced content search, automatic layer naming, text processing, and visual layout generation. Additionally, Figma made five major optimizations to the UI, making it easier for users to get started. Figma also released a new version of Figma Slides, further strengthening its position in professional environments. Figma has committed to data privacy protections to ensure the security of user data.
Representative Open Source AI Web Scraper Projects
2158 words (9 minutes)
|AI score: 88
This article introduces several AI-integrated data scraping tools and their features, which are essential in the fast-evolving AI landscape where data is a core competitive advantage. The tools discussed include:
- Scrapegraph-ai: A Python library that automates data scraping using large language models (LLMs) and graph-based pipelines.
- llm-scraper: A TypeScript library that converts any webpage into structured data using LLMs.
- Firecrawl: A tool developed by Mendable.ai and the Firecrawl community to scrape and convert websites into Markdown or structured data.
- MediaCrawler: Capable of scraping data from platforms like Xiaohongshu, Douyin, Kuaishou, Bilibili, and Weibo.
- gpt-crawler: A project that scrapes web documents and generates files for creating custom GPTs.
- gpt4V-scraper: A GPT-4V-based web agent for automating webpage data scraping.
- EasySpider: A visual no-code web crawler for automating browser tasks.
- Basic Scraper Frameworks: Common frameworks like Playwright, Cypress, Puppeteer, and Selenium.
DeepSeek-Coder-v2 Tops the Arena as the Strongest Open-Source Coding Model, Surpassing GPT4-Turbo
1526 words (7 minutes)
|AI score: 93
DeepSeek-Coder-v2 has emerged as the strongest open-source coding model in the Arena, surpassing GPT-4 Turbo. It supports 338 programming languages and comes in 236B and 16B parameter sizes. The model excels in coding and mathematics, ranking high across coding and AI performance benchmarks. DeepSeek-Coder-v2 also introduced a feature similar to 'Artifacts', allowing code to be generated and executed directly in the browser.
Open Model Bonanza, Private Benchmarks for Fairer Tests, More Interactive Music Generation, Diffusion + GAN
deeplearning.ai|deeplearning.ai
2864 words (12 minutes)
|AI score: 90
This article from The Batch Newsletter discusses the advancements in AI coding agents, particularly focusing on open-source frameworks like OpenDevin. It highlights research papers that explore multi-agent code generation, code debugging using large language models (LLMs), and the development of efficient agent-computer interfaces. The article emphasizes the importance of automated evaluation using benchmarks like HumanEval and MBPP, contrasting it with the challenges of evaluating web search and article synthesis agents. It concludes by discussing the rapid evolution of coding agents and their potential to make programming more enjoyable and productive.
AIGC Weekly #77
归藏的AI工具箱|mp.weixin.qq.com
6544 words (27 minutes)
|AI score: 91
This week in AIGC, Anthropic releases Claude 3.5 Sonnet with improved performance and a new interactive feature called Artifacts. Runway launches its video generation model Gen-3, boasting high video quality and fine-grained control. DeepSeek unveils its code model and code assistant, DeepSeek-Coder-V2, surpassing GPT-4 Turbo in code capabilities. Ilya Sutskever establishes a new company, SSI, focusing on safe superintelligence. Meta open-sources four models, including the Meta Chameleon language model, the Meta Multi-Token Prediction model for code completion, the Meta JASCO music model, and the AudioSeal audio watermarking technology. Other notable developments include new features from Kuaishou's Kling, Midjourney, and Google Gemini, and the formation of Comfy Org. The report concludes with recommendations for AI-powered tools such as Genspark, Hedra, Dot, Otto, Playmaker Document AI, and selected readings like Andrej Karpathy's LLM 101 course and Lex Fridman's interview with the CEO of Perplexity.
70 Years, 800 AI Models, Global AI Model Data Visualization; The Truth of AI Revealed by 750 Engineers; A Must-Read Manual for Founders Heading to the US | ShowMeAI Daily
ShowMeAI研究中心|mp.weixin.qq.com
4492 words (18 minutes)
|AI score: 90
The daily report from ShowMeAI covers the latest developments in AI: Anthropic's Claude, with its Artifacts feature for generating and previewing code, takes the lead in the programming field, heralding a new era for AI in workflow processes. Global visualizations of AI model data highlight a swift upward trend in training computation and costs, especially for language models.
Survey results indicate that while AI's role in boosting work efficiency is widely acknowledged, there remains a prevalence of AI usage without clear policy direction. The increasing application frequency of AI in chatbots and workflow automation underscores its growing importance in daily operations.
Furthermore, the article underscores that generative AI (GenAI) should not be seen as a substitute for junior programmers; effective engineering teams still hinge on human collaboration. For American startup founders, the report offers practical guidance on company incorporation, stock distribution, and more, assisting them in making wise decisions at the outset of their entrepreneurial journey.
Welcome Gemma 2 - Googleโs new open LLM
Hugging Face Blog|huggingface.co
2261 words (10 minutes)
|AI score: 90
Google has recently unveiled its latest open large language model, Gemma 2, available in two sizes, 9 billion and 27 billion parameters, each with a base version and an instruction-tuned version. Gemma 2 introduces key enhancements in sliding window attention, logit soft-capping, knowledge distillation, and model merging, all designed to improve generation quality and overall model performance. The article offers a detailed account of Gemma 2's architecture, training process, and technical advancements. Gemma 2 was trained on Google Cloud TPUs and is integrated with Hugging Face Transformers, Google Cloud, and Inference Endpoints.
The technical innovations in Gemma 2 encompass: sliding window attention, which combines local and global attention to enhance long-text processing; logit soft-capping, which stabilizes training by curbing the growth of logits; knowledge distillation, which was employed in pre-training the 9 billion parameter model; and model merging, which combines multiple models to enhance performance. Gemma 2 utilizes a merging technique known as WARP, which incorporates exponential moving average, spherical linear interpolation (SLERP), and linear interpolation towards initialization.
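Logit soft-capping, one of the mechanisms listed above, can be illustrated in a few lines: logits are passed through tanh so they saturate near a fixed cap instead of growing unboundedly. The cap value below is an illustrative choice, not necessarily Gemma 2's published setting.

```python
# Illustrative sketch of logit soft-capping: small logits pass through
# nearly unchanged, while large ones saturate smoothly near +/- cap.
import math

def soft_cap(logits: list[float], cap: float = 30.0) -> list[float]:
    # cap * tanh(x / cap) is approximately x for |x| << cap,
    # and approaches +/- cap as |x| grows.
    return [cap * math.tanh(x / cap) for x in logits]

print(soft_cap([1.0, 50.0, -200.0]))
```

Because tanh is smooth, this bounds the logits without the hard discontinuity that simple clipping would introduce.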
Price Wars, Layoffs, and Model Failures: A Busy Q2 in the AI Community
9154 words (37 minutes)
|AI score: 90
The second quarter of 2024 witnessed a flurry of activity in the AI field, highlighting both technological progress and fierce market competition. Meta unveiled the open-source large language model Llama 3, touted as an 'open-source GPT-4.' Meanwhile, Microsoft released the open-source model WizardLM-2, but quickly removed it due to a lack of toxicity testing, sparking widespread discussion. Mobvoi, a company specializing in generative AI and voice interaction technology, successfully listed on the Hong Kong stock market, becoming the first AIGC company to go public. Google's decision to lay off its entire Python team underscored the fact that competition in the AI field extends beyond technology, encompassing factors like labor costs and market strategies.

Companies like Alibaba Cloud demonstrated their technical prowess by releasing new models and upgrading services. OpenAI's GPT-4o model further enhanced the capabilities of generative AI, showcasing advanced abilities in text, vision, and audio. Domestic tech giants like Alibaba, Baidu, and Tencent significantly reduced the prices of large models, drawing market attention. The United States passed the ENFORCE Act, strengthening control over AI technology exports.

Mistral released its first generative AI model for coding, Codestral. Zhipu AI launched MaaS 2.0 and cut prices across the board, leading to a significant increase in API call volume. Apple demonstrated its emphasis on AI at the WWDC24 developer conference. Former OpenAI Chief Scientist Ilya Sutskever established a new company, SSI, forming competition with OpenAI. Chinese large model companies swiftly rolled out migration plans to mitigate the impact of OpenAI's supply cutoff, offering alternative solutions for developers and businesses.
Highly Praised Speech at AGI Conference: Innovation Works' Wang Hua Explains When the AI Application Boom Will Arrive
7113 words (29 minutes)
|AI score: 93
At AGI Playground 2024, Wang Hua, a managing partner at Innovation Works, shared his insights on the future trajectory of AI applications. Despite the current anxiety over the lack of application breakthroughs in the AI field, Wang Hua anticipates an explosion in AI applications in the next four to five years, driven by enhancements in model performance, inference costs, model modalities, and the maturation of the application ecosystem. He predicts a substantial reduction in inference costs, which will enable widespread adoption of AI applications across B2B sectors, productivity tools, and the entertainment industry.
The article underscores that China's AI model capabilities have significantly closed the gap with the United States, laying a robust foundation for a burgeoning AI application landscape in China. The shift in investment from foundational models to the application layer is particularly evident in the consumer application sector, where an uptick in investments is already observable. Wang Hua posits that the proliferation of AI applications is contingent upon four essential conditions: high-performance models, reduced inference costs, multi-modal capabilities, and a comprehensive application ecosystem. He foresees that by year-end, inference costs will have dropped tenfold, and by the end of next year, they will reach just 1% of current levels, catalyzing the mass adoption of AI applications.
Drawing on his experience as an investor and entrepreneur, Wang Hua recommends that aspiring AI entrepreneurs possess a thorough understanding of both product development and technical aspects, emphasizing the importance of delving deeply into user scenarios. With the continued decrease in inference costs, he envisions the approaching "democratization singularity," which will render products with billions of daily active users (DAUs) not only feasible but imminent. Additionally, he forecasts substantial advancements in the realms of intelligence ceilings, multi-modality, and AI agents, which will propel AI applications forward and could potentially transform the human world.
Z Potentials | Luyu Zhang: Serving Millions of Developers and Re-entering Entrepreneurship to Build the Leading Large Model Middleware, Dify, with No.1 Global Monthly Growth and Over 400,000 Installations
16385 words (66 minutes)
|AI score: 90
In the realm of artificial intelligence, Dify has emerged as a leading startup focused on middleware for large-scale models, achieving over 400,000 installations in just one year and securing its position as the fastest-growing provider of open-source middleware for large-scale models globally. In an interview, Dify's founder, Zhang Luyu, delved into his entrepreneurial journey, his perspectives on the evolution of AI technology, and the company's vision. Zhang highlighted the principle of user-centered product design, stressing the importance of balancing user-friendliness with adaptability in a rapidly changing technological landscape. He proposed the concept of LLMOps, noting the consolidation of AI technology stacks and the intricate engineering challenges inherent in middleware. Additionally, he underscored the critical role of open-source practices and globalization.
The article detailed Zhang's insights into the three driving forces behind entrepreneurship: a desire to improve upon the status quo, a passion for creation, and a commitment to altruism. He particularly emphasized the centrality of altruism in his entrepreneurial ethos. Zhang posited that an effective tool should streamline user tasks rather than fabricate demand. While acknowledging the maturation of the AI technology stack, he pointed out that the integration of models and applications continues to pose significant challengesโa niche that Dify is dedicated to addressing. He also highlighted the strategic importance of open source for Dify, crediting it with fostering global contributor engagement, enhancing technical control, and reducing barriers to market-driven promotion.
Zhang expressed a keen interest in fostering a culture of innovation and teamwork within his company, recognizing the pivotal role of team culture in a startup's success and the necessity of continuous innovation for staying competitive. Furthermore, he shared his reflections on the release of ChatGPT, expressing his belief that recent advancements in AI have unlocked unprecedented opportunities for innovators and entrepreneurs, inspiring a surge of creativity and boldness across the industry.
Dialogue with Li Dahai of Mianbi Intelligence: Beyond Scaling Law, Another Key Path for Large Models
6965 words (28 minutes)
|AI score: 91
In this article, Li Dahai from Mianbi Intelligence discusses the future of large models beyond the Scaling Law. Key points include:
- The potential to develop a GPT-4 level edge model by 2026.
- The importance of edge models in being closer to users and more practical.
- The role of AGI and the significance of Agent technology.
- The concept of 'intelligent density' and its impact on the efficiency of large models.
- The challenges and advancements in creating efficient edge models with high performance.
Reflections on the Next Generation AI Hardware: A Conversation Between Two Hardware Entrepreneurs
6404 words (26 minutes)
|AI score: 91
At the AGI Playground 2024 Conference, hardware entrepreneurs Yang Jianbo and Yang Meng engaged in a deep discussion about the future of AI hardware. They believe that AI hardware needs to offer efficient human-computer interaction and emotional value to meet user needs across various scenarios. While smartphones remain the optimal interaction device, AI hardware can provide emotional value in non-work settings, such as AI pets. The emergence of large AI models has revolutionized the traditional development of AI algorithms, enabling more problems to be solved through innovations in underlying models. They also discussed how AI can enhance different roles, particularly the emotional value of pets and intelligent entities. They predict that future intelligent entities might be a single super intelligent agent coordinating all scenarios, or multiple isolated intelligent agents. The development of AI hardware requires a combination of local computing power and cloud models to deliver an enhanced user experience. The organizational structure of hardware companies will not undergo rapid changes in the AI era; they will continue to rely on experienced individuals and existing tools. AI cannot fully replace human resources in hardware companies in the short term, and the foundation of organizational structure remains crucial.
AI Will Reshape Gaming: Marc Andreessen's 15,000-Word Discussion on Game Products and Investment in the AI Era (Video Included)
Web3天空之城|mp.weixin.qq.com
15855 words (64 minutes)
|AI score: 91
Marc Andreessen shared his deep insights on the impact of AI in the gaming sector during his interview. He posited that AI would revolutionize gaming, turning it into an individualized art form that can respond dynamically to players, fostering a collaborative cycle of creation between the users and the system. Andreessen espoused a philosophy of technological optimism, noting that while new technologies might provoke moral panic, their positive transformations are substantial. He likened AI to a novel breed of computer that can creatively generate content, thus paving the way for novel artistic expressions and business paradigms. He also highlighted the critical role of open-source initiatives in advancing the widespread adoption and innovation of AI technologies, expressing optimism about the opportunities for startups in this space. Furthermore, Andreessen predicted that founders in the gaming industry could exert a significant influence on the world over the next few decades, and he explored the potential for gaming technology to permeate other fields, thereby driving social advancement.
DingTalk Announces Openness to All Large-Scale Models, Building China's Most Open AI Ecosystem
1744 words (7 minutes)
|AI score: 90
DingTalk has announced its policy of openness to all large-scale AI model vendors, aiming to construct China's largest and most open AI ecosystem. Besides cooperation with the Tongyi large model, DingTalk has also partnered with six other large model vendors: MiniMax, Zhouzhidao, ZHIPU AI, OrionStar, Zero One Universe, and Baichuan Intelligence. Currently, DingTalk has over 5,600 ecosystem partners, with AI partners exceeding 100. DingTalk's AI is called more than 10 million times daily. The company is exploring three major partnership models with large model vendors.
Why is Feishu the Shared Choice of China's Large Language Model Unicorns?
3855 words (16 minutes)
|AI score: 90
Feishu, a collaboration platform developed by ByteDance, has become the preferred choice for leading Chinese large language model (LLM) companies. This article explores the reasons behind this trend, highlighting the unique challenges faced by LLM startups and how Feishu addresses them. The article details three key aspects of Feishu's appeal:
- Rapid Iteration and Organizational Agility: Feishu's tools and methodology facilitate rapid iteration, enabling LLM companies to adapt quickly to the fast-paced nature of the industry.
- Context Over Control: Feishu's all-in-one approach fosters information flow and efficient collaboration, aligning with the decentralized, goal-driven nature of LLM companies.
- Flexibility and Openness: Feishu's high degree of flexibility and openness, particularly its multi-dimensional table and open platform capabilities, caters to the technical expertise and customization needs of LLM companies.
Podcast Update: An Oral Account of the First Half of the Global Large Model: Perplexity's Sudden Popularity and the Yet-to-Boom AI Application Ecosystem
2476 words (10 minutes)
|AI score: 90
This episode of 'Zhang Xiaojun Jùn | Business Interview', a Tencent News podcast of in-depth business interviews that aims to depict the business, culture, and new knowledge of our era, reviews the progress of global large models in the first half of the year from the perspective of AI applications. It delves into Perplexity, a company in the AI search space, covering its founding, data, competition, and moat; Perplexity's latest valuation has reached $3 billion. The podcast also addresses industry concerns, such as why AI applications have not yet boomed, why GPT-5 is slow to arrive, and what the business models and moats of large models are, and it reviews the past six months at the major U.S. tech giants.