This false nomenclature of "researcher" and "engineer", which is a thinly-masked way of describing a two-tier engineerin...
Ok this makes me super happy. The "NoFilter" work, paper, and advocacy that @angelinepouget and I argued so hard for is ...
We're excited to have @shengjia_zhao at the helm as Chief Scientist of Meta Superintelligence Labs. Big things are comin...
the openai IMO news hit me pretty heavy this weekend i'm still in the acute phase of the impact, i think i consider myse...
We are seeing much faster AI progress than **Paul Christiano** and **Yudkowsky** predicted, who had gold in 2025 at 8% a...
So, all the models underperform humans on the new International Mathematical Olympiad questions, and Grok-4 is especiall...
🚨 Did you know that small-batch vanilla SGD without momentum (i.e. the first optimizer you learn about in intro ML) is ...
In the current AI talent war, everyone is focused on the big numbers (alleged compensation packages). It misses the bigg...
The code and instruction-tuning data for MetaQuery are now open-sourced! Code: https://github.com/facebookresearch/metaq...
1/ Excited to share that I'm taking on the role of leading Fundamental AI Research (FAIR) at Meta. Huge thanks to Joelle...
We are open-sourcing all the models in Web-SSL, from ViT-L to ViT-7B! It was super fun to train and play with these mass...
OpenRouter announced a price cut for Llama 3.3 70b and now offers six versions of the model along with their corresponding providers.
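OpenRouter announced a price reduction for Llama 3.3 70b and added six models with corresponding providers. The change widens the range of providers available for Llama 3.3 70b and lowers the cost of calling it.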
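Llama 3-70B, the open-source model Meta released on April 18, quickly climbed to the top of the Chatbot Arena leaderboard, taking part in more than 50,000 battles. The model stands out on open-ended writing and creative tasks, with a 60% win rate, but trails GPT-4-Turbo and Claude 3 Opus on closed-ended technical tasks such as math and coding. As prompt difficulty increases, its win rate drops markedly from 50% to 40%. The analysis shows that Llama 3's output style is friendlier and more conversational, which became a key factor in winning user preference.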
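The article fact-checks the New York Times report on the relationship between Yi-34B and Llama 2, clarifying how Yi-34B actually differs from Llama 2 in architecture design, training data, and tokenizer implementation. It also systematically reviews common industry practice in large language model training, emphasizing that iterating on an existing architecture while complying with open-source licenses is standard practice in the AI community.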