Revealing DeepSeek: A more extreme story of Chinese technological idealism

By Yu Lili

Edited by Liu Jing

Among China's seven large-model startups, DeepSeek is the quietest, yet it always manages to be remembered in unexpected ways.

A year ago, the surprise came from its backer, the quantitative private-fund giant High-Flyer, which was the only company outside the big tech firms to have stockpiled 10,000 A100 chips. A year later, it came from the fact that DeepSeek was the one that set off China's large-model price war.

In May, a month of nonstop AI news, DeepSeek shot to fame. The trigger was an open-source model it released called DeepSeek V2, which offered unprecedented value for money: inference cost dropped to just 1 yuan per million tokens, roughly one-seventh that of Llama3 70B and one-seventieth that of GPT-4 Turbo.
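
To make the price comparison concrete, here is a minimal sketch in Python. It takes only the figure and the two ratios quoted above and derives the per-million-token prices they would imply for the other two models. Those derived prices are back-of-the-envelope values implied by the article's ratios, not published price lists, and the function names and the workload size are illustrative assumptions.

```python
from fractions import Fraction

# Figure quoted in the article: DeepSeek V2 inference at 1 yuan per million tokens.
DEEPSEEK_V2_YUAN_PER_M_TOKENS = Fraction(1)

# Ratios quoted in the article ("roughly one-seventh", "one-seventieth").
# Treated as exact here for illustration only.
PRICE_RATIO_VS_DEEPSEEK = {
    "DeepSeek V2": Fraction(1),
    "Llama3 70B": Fraction(7),
    "GPT-4 Turbo": Fraction(70),
}


def implied_price(model: str) -> Fraction:
    """Price per million tokens (yuan) implied by the article's ratios."""
    return DEEPSEEK_V2_YUAN_PER_M_TOKENS * PRICE_RATIO_VS_DEEPSEEK[model]


def workload_cost(model: str, tokens: int) -> Fraction:
    """Cost in yuan of processing `tokens` tokens at the implied price."""
    return implied_price(model) * Fraction(tokens, 1_000_000)


if __name__ == "__main__":
    hypothetical_tokens = 50_000_000  # assumed workload size, not from the article
    for name in PRICE_RATIO_VS_DEEPSEEK:
        cost = workload_cost(name, hypothetical_tokens)
        print(f"{name}: {float(cost):.0f} yuan")
```

Exact fractions are used instead of floats so that the ratios stay exact when scaled to arbitrary token counts.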

DeepSeek was quickly dubbed "the Pinduoduo of AI", and big companies such as ByteDance, Tencent, Baidu, and Alibaba could not hold back either, cutting prices one after another. China's large-model price war was on.

The smoke of battle actually obscured one fact: unlike many big companies that burn money on subsidies, DeepSeek is profitable.

Behind this is DeepSeek's across-the-board innovation in model architecture. Its new MLA architecture (a new multi-head latent attention mechanism) cuts GPU memory usage to 5%-13% of that of MHA, the most commonly used architecture until now. Its original DeepSeekMoESparse structure also pushes computation to the minimum. Together, these drove the cost down.
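
As a rough illustration of what the 5%-13% figure could mean in practice, the sketch below estimates the key-value cache of a standard multi-head attention (MHA) model and then applies the quoted range. Several things here are assumptions: that the memory figure refers to the key-value cache, that the textbook MHA accounting (one key and one value vector per layer, per head, per token) is the right baseline, and the model dimensions, which are hypothetical placeholders rather than DeepSeek V2's actual configuration.

```python
from typing import Tuple


def mha_kv_cache_bytes(
    n_layers: int,
    n_heads: int,
    head_dim: int,
    seq_len: int,
    bytes_per_value: int = 2,  # e.g. 16-bit floats
) -> int:
    """Textbook KV-cache size for standard multi-head attention.

    Each layer stores one key and one value vector per head per token.
    """
    return 2 * n_layers * n_heads * head_dim * seq_len * bytes_per_value


def scaled_range(
    baseline_bytes: int, low_pct: int = 5, high_pct: int = 13
) -> Tuple[int, int]:
    """Apply the 5%-13% range quoted in the article to a baseline size."""
    return baseline_bytes * low_pct // 100, baseline_bytes * high_pct // 100


if __name__ == "__main__":
    # Hypothetical dimensions chosen only for illustration.
    baseline = mha_kv_cache_bytes(
        n_layers=32, n_heads=32, head_dim=128, seq_len=4096
    )
    low, high = scaled_range(baseline)
    gib = 1024 ** 3
    print(f"MHA baseline: {baseline / gib:.2f} GiB")
    print(f"5%-13% of baseline: {low / gib:.2f} to {high / gib:.2f} GiB")
```

The sketch only scales a baseline by the article's percentages. It does not implement MLA itself, whose internals the article does not describe.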

In Silicon Valley, DeepSeek is called "a mysterious force from the East". SemiAnalysis's chief analyst said the DeepSeek V2 paper "may be the best one this year". Former OpenAI employee Andrew Carr found the paper "full of amazing wisdom" and applied its training setup to his own model. Jack Clark, former policy director at OpenAI and co-founder of Anthropic, said DeepSeek had "hired a group of inscrutable wizards", and that Chinese-made large models "will become a force that cannot be ignored, just like drones and electric vehicles".

In an AI wave whose story is driven largely by Silicon Valley, this is rare. Several industry insiders told us that the strong reaction stems from innovation at the architecture level, something rarely attempted by Chinese large-model companies or even by open-source foundation models worldwide. One AI researcher said that in the years since the Attention architecture was proposed, it has almost never been successfully modified, let alone validated at scale. "This is an idea that would get killed at the decision stage, because most people lack the confidence."

On the other hand, Chinese large models have seldom ventured into architecture-level innovation because few have set out to break a stereotype: that the United States is better at 0-to-1 technological innovation, while China is better at 1-to-10 application innovation. Besides, such an effort looks like a poor bargain: someone will produce a new generation of models in a few months anyway, so Chinese companies need only follow and build good applications. Innovating on model structure means there is no path to follow and many failures to endure, at great cost in time and money.

DeepSeek is clearly going against the current. Amid the clamor that large-model technology will inevitably converge and that following is the smarter shortcut, DeepSeek values what is accumulated on the "detours" and believes that Chinese large-model entrepreneurs can join the torrent of global technological innovation, not just application innovation.

Many of DeepSeek's choices set it apart. To date, among the seven Chinese large-model startups, it is the only one that has given up the "have it both ways" route, focusing on research and technology without building toC applications. It is also the only one that has not fully considered commercialization, has firmly chosen the open-source route, and has not even raised funding. As a result it is often left off the table, yet at the other end its users in the community often spread the word about it unprompted.

How was DeepSeek forged? To find out, we interviewed DeepSeek founder Liang Wenfeng, who rarely appears in public.

This founder, born in the 1980s, has been immersed in technical research behind the scenes since the High-Flyer days. In the DeepSeek era he keeps the same low profile and, like every researcher, spends his days "reading papers, writing code, and joining group discussions".

Many quant fund founders have overseas hedge fund experience and backgrounds in physics, mathematics, and similar fields. Liang Wenfeng, by contrast, has an entirely domestic background: in his early years he studied artificial intelligence in the Department of Electronic Engineering at Zhejiang University.

Several industry insiders and DeepSeek researchers told us that Liang Wenfeng is a very rare figure in China's AI community today: someone who "combines strong infra engineering and model research skills with the ability to mobilize resources", and who "can make precise judgments from a high level and also outdo frontline researchers on the details". He has "a terrifying ability to learn" and is "nothing like a boss, and much more like a geek".

This is a particularly rare interview. In it, this technological idealist offers a voice that is especially scarce in China's tech community today: he is one of the few who put "right and wrong" before "gain and loss", and who remind us to see the inertia of the times and put "original innovation" on the agenda.

A year ago, when DeepSeek had just entered the field, we interviewed Liang Wenfeng for the first time: "Crazy High-Flyer: An Invisible AI Giant's Road to Large Models". If the line "be insanely ambitious, and insanely sincere" was then just a beautiful slogan, a year later it is becoming action.

The following is the conversation:

How was the first shot of the price war fired?

"Undercurrent": The release of DeepSeek V2 quickly set off a bloody large-model price war. Some say you are the catfish of the industry.

Liang Wenfeng: We did not mean to become a catfish. We just became one by accident.

"Undercurrent": Did the outcome surprise you?

Liang Wenfeng: Very much. We did not expect everyone to be so sensitive to price. We simply worked at our own pace and then priced according to our costs. Our principle is neither to subsidize nor to make excessive profits. This price leaves a small margin above cost.

"Undercurrent": Zhipu AI followed five days later, and then came big companies such as ByteDance, Alibaba, Baidu, and Tencent.

Liang Wenfeng: What Zhipu AI cut was the price of an entry-level product; its models at our level are still expensive. ByteDance was the first to truly follow. It dropped its flagship model to our price, which triggered price cuts from the other big companies. Because their model costs are much higher than ours, we did not expect anyone to do this at a loss. In the end it turned into the cash-burning subsidy logic of the internet era.

"Undercurrent": From the outside, a price cut looks a lot like a grab for users, which is how price wars in the internet era usually work.

Liang Wenfeng: Grabbing users is not our main goal. We cut prices partly because our costs came down first as we explored the structure of next-generation models, and partly because we believe that both APIs and AI should be inclusive and affordable for everyone.

"Undercurrent": Until now, most Chinese companies have simply copied the current generation's Llama structure and built applications on it. Why did you start from model structure?

Liang Wenfeng: If the goal is to build applications, then reusing the Llama structure and shipping products quickly is a reasonable choice. But our destination is AGI, which means we need to research new model structures to achieve stronger model capabilities with limited resources. This is part of the basic research needed to scale up to larger models. Beyond model structure, we have done a great deal of other research, including how to construct data and how to make models more human-like, all of which is reflected in the models we release. In addition, we estimate that Llama's structure is already two generations behind the leading international level in training efficiency and inference cost.

"Undercurrent": Where does this generation gap mainly come from?

Liang Wenfeng: First, there is a gap in training efficiency. We estimate that the best domestic level may trail the best international level by a factor of two in model structure and training dynamics, which alone means we need twice the compute to reach the same result. There may also be a factor-of-two gap in data efficiency, meaning we need twice the training data and compute to reach the same result. Together that is four times the compute. What we need to do is keep narrowing these gaps.
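
The arithmetic in this answer can be written out as a short sketch. It assumes, as the answer implies, that the two efficiency gaps are independent and therefore multiply. The function name and the dictionary keys are illustrative.

```python
from math import prod
from typing import Dict


def compute_multiplier(gaps: Dict[str, float]) -> float:
    """Total compute multiplier when independent efficiency gaps stack.

    Each value is how many times more compute (or data, which costs
    compute to train on) is needed to match the leader on that axis.
    """
    return prod(gaps.values())


# The two factor-of-two gaps estimated in the interview.
estimated_gaps = {
    "model structure and training dynamics": 2.0,
    "data efficiency": 2.0,
}

if __name__ == "__main__":
    print(compute_multiplier(estimated_gaps))  # 2.0 * 2.0
```

Because the factors multiply, closing either gap on its own would halve the total under this assumption.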

"Undercurrent": Most Chinese companies choose to do both models and applications. Why has DeepSeek chosen, for now, to do only research and exploration?

Liang Wenfeng: Because we think the most important thing right now is to take part in the global wave of innovation. For many years, Chinese companies have been used to others doing the technological innovation while we take it and monetize it in applications, but that should not be taken for granted. In this wave, our starting point is not to seize the chance to make a quick profit, but to reach the technological frontier and push the whole ecosystem forward.

"Undercurrent": The habitual belief that the internet and mobile internet eras left most people with is that the United States is good at technological innovation and China is better at applications.

Liang Wenfeng: We believe that as the economy develops, China must gradually become a contributor rather than a perpetual free rider. In the IT wave of the past thirty-odd years, we basically did not take part in real technological innovation. We have grown used to Moore's Law falling from the sky: lie at home for 18 months and better hardware and software will appear. The Scaling Law is being treated the same way.

But in fact, these were created by generations of tireless work in the Western-led technology community. Only because we did not take part in that process have we overlooked its existence.

The real gap is not one or two years, but the gap between originality and imitation

"Undercurrent": Why did DeepSeek V2 surprise so many people in Silicon Valley?

Liang Wenfeng: Among the vast number of innovations happening in the United States every day, this is a very ordinary one. They were surprised because this is a Chinese company joining their game as a contributor of innovation. After all, most Chinese companies are used to following, not innovating.

"Undercurrent": But in the Chinese context this choice is a real luxury. Large models are a capital-intensive game, and not every company can afford to pursue research and innovation alone instead of considering commercialization first.

Liang Wenfeng: The cost of innovation is certainly not low, and the old habit of taking what others have built was tied to the national conditions of the time. But now, look at the size of China's economy or the profits of big companies like ByteDance and Tencent: neither is low by global standards. What our innovation lacks is definitely not capital, but confidence and the knowledge of how to organize a high density of talent for effective innovation.

"Undercurrent": Why do Chinese companies, including big ones with no shortage of money, so readily treat fast commercialization as the top priority?

Liang Wenfeng: For the past thirty years we have emphasized only making money and neglected innovation. Innovation is not driven purely by business; it also takes curiosity and the desire to create. We are just bound by the inertia of the past, but that too is a passing phase.

"Undercurrent": But you are ultimately a business, not a public-interest research institution. If you choose to innovate and then share it through open source, where does your moat come from? Won't an innovation like May's MLA architecture be copied by others very quickly?

Liang Wenfeng: In the face of disruptive technology, a moat built on closed source is short-lived. Even if OpenAI stays closed, it cannot stop others from catching up. So we anchor our value in the team: our colleagues grow through this process, accumulate a great deal of know-how, and form an organization and culture capable of innovation. That is our moat.

Open-sourcing and publishing papers do not actually cost us anything. For technical people, being followed is very rewarding. In fact, open source is more a cultural act than a commercial one. Giving is really an additional honor, and a company that does so also has cultural appeal.

"Undercurrent": What do you think of market-faith views like Zhu Xiaohu's?

Liang Wenfeng: Zhu Xiaohu is self-consistent, but his playbook suits companies that want to make money fast. Look at the most profitable companies in the United States: they are all high-tech companies that built up deep foundations before taking off.

"Undercurrent": But with large models, a pure technical lead can hardly become an absolute advantage. What is the bigger thing you are betting on?

Liang Wenfeng: What we see is that Chinese AI cannot stay in a follower's position forever. We often say that Chinese AI is one or two years behind the United States, but the real gap is the difference between originality and imitation. If that does not change, China will always be a follower, so some exploration is unavoidable.

Nvidia's lead is not the work of one company alone, but the result of the joint efforts of the entire Western technology community and industry. They can see the next generation's technology trends and have a roadmap in hand. The development of Chinese AI needs the same kind of ecosystem. Many domestic chips fail to take off partly because they lack a supporting technology community and have only second-hand information. So China inevitably needs people standing at the technological frontier.

More investment does not necessarily produce more innovation

"Undercurrent": DeepSeek today has something of OpenAI's early idealism, and it is open source too. Will you go closed source later? Both OpenAI and Mistral have moved from open to closed.

Liang Wenfeng: We will not go closed source. We believe that building a strong technology ecosystem first matters more.

"Undercurrent": Do you have plans to raise funding? Media reports say High-Flyer plans to spin DeepSeek off for a separate listing, and Silicon Valley's AI startups have ultimately been unable to avoid tying themselves to big companies.

Liang Wenfeng: We have no financing plans in the short term. The problem we face has never been money, but the embargo on high-end chips.

"Undercurrent": Many people think building AGI and doing quant are entirely different things. Quant can be done quietly, but AGI may call for a high profile and alliances, which would let you increase your investment.

Liang Wenfeng: More investment does not necessarily produce more innovation. Otherwise the big companies would have a monopoly on innovation.

"Undercurrent": Is the reason you are not building applications now that operations are not in your DNA?

Liang Wenfeng: We believe the current stage is an explosion of technological innovation, not an explosion of applications. In the long run, we hope to form an ecosystem in which the industry uses our technology and output directly. We are responsible only for foundation models and frontier innovation, and other companies build toB and toC businesses on top of DeepSeek. If a complete upstream and downstream industry chain forms, we will not need to build applications ourselves. Of course, if needed, nothing prevents us from building applications, but research and technological innovation will always be our first priority.

"Undercurrent": But when choosing an API, why pick DeepSeek over a big company?

Liang Wenfeng: The future world is likely to be one of specialized division of labor. Foundation models require continuous innovation, and big companies have limits to their capabilities, so they are not necessarily the right fit.

"Undercurrent": But can technology really open up a gap? You have also said that there are no absolute technical secrets.

Liang Wenfeng: Technology has no secrets, but rebuilding it takes time and money. Nvidia's graphics cards have, in theory, no technical secrets and are easy to copy, but reorganizing a team and catching up with the next generation of technology both take time, so the actual moat is still very wide.

"Undercurrent": After you cut prices, ByteDance was the first to follow, which suggests they felt some kind of threat. What do you make of new ways for startups to compete with big companies?

Liang Wenfeng: To be honest, we do not care much about this. It was just something we did along the way. Providing cloud services is not our main goal. Our goal is still to achieve AGI.

So far we have not seen any new solutions, but the big companies do not have a clear advantage either. They have ready-made users, but their cash-flow businesses are also their baggage and make them liable to be disrupted at any time.

"Undercurrent": How do you see the endgame for the six large-model startups other than DeepSeek?

Liang Wenfeng: Perhaps two or three will survive. All of them are still burning money, so those with a clear sense of their own positioning and more disciplined operations have a better chance. The others may be completely transformed. What is valuable will not vanish, but it will take a different form.

"Undercurrent": Back in the High-Flyer days, your stance toward competition was described as "going your own way", with little concern for comparisons with peers. What is the starting point of your thinking about competition?

Liang Wenfeng: What I often think about is whether something can make society run more efficiently, and whether you can find a position you are good at in its chain of industrial division of labor. As long as the end state makes society more efficient, it holds up. Much of what happens in between is just a passing phase, and paying too much attention to it will only leave you dazzled.

A group of young people doing "inscrutable" things

"Undercurrent": Jack Clark, former policy director at OpenAI and co-founder of Anthropic, said DeepSeek had hired "a group of inscrutable wizards". What kind of people built DeepSeek V2?

Liang Wenfeng: There are no inscrutable wizards. They are fresh graduates from top universities, fourth- and fifth-year PhD students interning before graduation, and some young people only a few years out of school.

"Undercurrent": Many large-model companies are bent on recruiting overseas, and many people think the top 50 people in this field may not be at Chinese companies. Where do your people come from?

Liang Wenfeng: No one who worked on the V2 model came back from overseas; they are all local. The top 50 talents may not be in China, but perhaps we can develop such people ourselves.

"Undercurrent": How did the MLA innovation come about? We heard the idea first came from a young researcher's personal interest.

Liang Wenfeng: After summarizing some of the mainstream patterns in how the Attention architecture has evolved, he had a sudden idea to design an alternative. But getting from idea to implementation was a long process. We formed a team for it and spent several months getting it to work.

"Undercurrent": The birth of such divergent inspiration has a lot to do with the structure of your thoroughly innovation-oriented organization. In the High-Flyer days you rarely assigned goals or tasks top-down. But for frontier exploration as uncertain as AGI, is there more management involved?

Liang Wenfeng: DeepSeek is also entirely bottom-up. And we generally do not assign roles in advance; the division of labor emerges naturally. Everyone has their own unique path of growth and brings their own ideas, so there is no need to push them. When someone runs into a problem while exploring, they pull others in to discuss it. But when an idea shows potential, we do allocate resources top-down.

"Undercurrent": We heard DeepSeek is very flexible about mobilizing GPUs and people.

Liang Wenfeng: There is no cap on the GPUs or people any of us can call on. Anyone with an idea can use the training cluster's GPUs at any time without approval. And because there are no hierarchies or departmental boundaries, anyone can flexibly bring in anyone else, as long as the other person is also interested.

"Undercurrent": Such loose management also depends on having selected people who are driven by strong passion. We heard you are good at spotting talent through details, so that people who excel on non-traditional measures get picked.

Liang Wenfeng: Our criteria for choosing people have always been passion and curiosity, so many of our people have unusual, interesting backgrounds. For many of them, the desire to do research far outweighs any concern for money.

"Undercurrent": The Transformer was born in Google's AI Lab, and ChatGPT was born at OpenAI. How do you think a big company's AI lab and a startup differ in the value they create through innovation?

Liang Wenfeng: Google's labs, OpenAI, and even the AI labs of China's big companies are all valuable. That OpenAI was the one to pull it off in the end was partly historical chance.

"Undercurrent": Is innovation also largely a matter of chance? The row of meeting rooms in the middle of your office has doors on both sides that can be pushed open freely. Your colleagues say this leaves room for chance. The birth of the Transformer included just such a story: someone passing by overheard, joined in, and ultimately helped turn it into a general framework.

Liang Wenfeng: I think innovation is first of all a matter of belief. Why is Silicon Valley so innovative? First, because it dares. When ChatGPT came out, the whole country lacked confidence in frontier innovation. From investors to big companies, everyone felt the gap was too large and that they had better stick to applications. But innovation requires confidence first, and that confidence is usually more visible in young people.

"Undercurrent": But you do not raise funding and rarely speak publicly, so your public profile is surely lower than that of companies active in fundraising. How do you make sure DeepSeek is the first choice for people who want to work on large models?

Liang Wenfeng: Because we are doing the hardest thing. What attracts top talent most is solving the world's hardest problems. In fact, top talent is undervalued in China. There is so little hardcore innovation across society that they have no chance to be recognized. We are doing the hardest thing, and that is what appeals to them.

"Undercurrent": OpenAI's recent launch did not bring GPT5. Many people feel the technology curve is clearly flattening, and many have begun to question the Scaling Law. What is your view?

Liang Wenfeng: We are on the optimistic side. The industry as a whole looks to be in line with expectations. OpenAI is not a god and cannot stay out in front forever.

"Undercurrent": How long do you think it will take to achieve AGI? Before DeepSeek V2, you released code-generation and math models, and you also switched from dense models to MoE. What are the waypoints on your AGI roadmap?

Liang Wenfeng: It could be 2, 5, or 10 years, but it will happen within our lifetimes. As for the roadmap, there is no consensus even inside our company. But we have placed bets in three directions: first, math and code; second, multimodality; third, natural language itself. Math and code are a natural testing ground for AGI, a bit like Go: a closed, verifiable system in which high intelligence may be achievable through self-learning alone. On the other hand, multimodality and learning through participation in the real human world may also be necessary for AGI. We remain open to all possibilities.

"Undercurrent": What do you think the endgame for large models looks like?

Liang Wenfeng: There will be specialized companies providing foundation models and basic services, with a long chain of specialized division of labor. More people will build on top of that to meet the diverse needs of society as a whole.

所有的套路都是上一代的产物  All the playbooks are products of the previous generation

「暗涌」:过去这一年,中国的大模型创业还是有很多变化的,比如去年开头还很活跃的王慧文中场退出了,后来加入的公司也开始呈现出差异化。
"Undercurrent": Over the past year, a lot has changed among China's large-model startups. For example, Wang Huiwen, who was very active at the start of last year, dropped out midway, and the companies that entered later have begun to differentiate themselves.

梁文锋:王慧文自己承担了所有的损失,让其他人全身而退。他做了一个对自己最不利,但对大家都好的选择,所以他做人是很厚道的,这点我很佩服。
Liang Wenfeng: Wang Huiwen bore all the losses himself and let everyone else walk away unscathed. He made the choice that was worst for himself but good for everyone else, so he is a very decent person, and I admire that.

「暗涌」:现在你的精力最多放在哪里?  "Undercurrent": Where do you focus most of your energy now?

梁文锋:主要的精力在研究下一代的大模型。还有很多未解决的问题。
Liang Wenfeng: Mainly on researching the next generation of large models. There are still many unsolved problems.

「暗涌」:其他几家大模型创业公司都是坚持既要又要,毕竟技术不会带来永久领先,抓住时间窗口把技术优势落到产品也很重要,DeepSeek敢于专注在模型研究上是因为模型能力还不够吗?
"Undercurrent": The other large-model startups all insist on pursuing both technology and product. After all, technology does not bring a permanent lead, and seizing the time window to turn a technical advantage into products also matters. Does DeepSeek dare to focus on model research because model capability is still not good enough?

梁文锋:所有的套路都是上一代的产物,未来不一定成立。拿互联网的商业逻辑去讨论未来AI的盈利模式,就像马化腾创业时,你去讨论通用电气和可口可乐一样。很可能是一种刻舟求剑。
Liang Wenfeng: All the playbooks are products of the previous generation and may not hold in the future. Using the business logic of the internet to discuss AI's future profit model is like discussing General Electric and Coca-Cola back when Ma Huateng was starting his business. It is most likely a case of "carving a mark on the boat to find the sword": applying an old frame to a situation that has already moved on.

「暗涌」:过去幻方就有很强的技术和创新基因,成长也比较顺利,这是你偏乐观的原因吗?
"Undercurrent": High-Flyer (幻方) has long had strong technology and innovation in its DNA, and its growth has been relatively smooth. Is that why you lean optimistic?

梁文锋:幻方某种程度上增强了我们对技术驱动型创新的信心,但也不都是坦途。我们经历了一个漫长的积累过程。外部看到的是幻方2015年后的部分,但其实我们做了16年。
Liang Wenfeng: High-Flyer (幻方) has, to some extent, strengthened our confidence in technology-driven innovation, but it has not all been smooth sailing. We went through a long process of accumulation. What outsiders see is the part of High-Flyer after 2015, but in fact we have been at it for 16 years.

「暗涌」:回到关于原创式创新的话题。现在经济开始进入下行,资本也进入冷周期,所以它对原创式创新是否会带来更多抑制?
"Undercurrent": Back to the topic of original innovation. The economy is now entering a downturn and capital is entering a cold cycle. Will that suppress original innovation even further?

梁文锋:我倒觉得未必。中国产业结构的调整,会更依赖硬核技术的创新。当很多人发现过去赚快钱很可能来自时代运气,就会更愿意俯身去做真正的创新。
Liang Wenfeng: I don't think so, necessarily. The restructuring of China's industries will depend more on innovation in hard-core technology. When many people realize that the quick money they made in the past probably came from the luck of the era, they will be more willing to bend down and do real innovation.

「暗涌」:所以你对这件事也是乐观的?  "Undercurrent": So you are optimistic about this as well?

梁文锋:我是八十年代在广东一个五线城市长大的。我的父亲是小学老师,九十年代,广东赚钱机会很多,当时有不少家长到我家里来,基本就是家长觉得读书没用。但现在回去看,观念都变了。因为钱不好赚了,连开出租车的机会可能都没了。一代人的时间就变了。
Liang Wenfeng: I grew up in the 1980s in a fifth-tier city in Guangdong. My father was a primary school teacher. In the 1990s there were many ways to make money in Guangdong, and quite a few parents came to our home, basically to say that they thought schooling was useless. But when I go back now, attitudes have completely changed, because money is no longer easy to make; even the chance to drive a taxi may be gone. It changed within a single generation.

以后硬核创新会越来越多。现在可能还不容易被理解,是因为整个社会群体需要被事实教育。当这个社会让硬核创新的人功成名就,群体性想法就会改变。我们只是还需要一堆事实和一个过程。
There will be more and more hard-core innovation in the future. It may not be easily understood yet, because society as a whole needs to be educated by facts. When this society lets the people who do hard-core innovation achieve success and recognition, collective thinking will change. We just still need a body of facts and a process.
