Down to Earth AI #1: Musk's AI, LLM Rankings, The state of AI, Nvidia's Latest Updates
Hey! Each week I'll bring you the latest major advancements and news in AI, along with other intriguing insights, all tailored specifically for business leaders.
Intro
First of all, it’s great to have you here! I created this newsletter to translate the latest AI advancements, technicalities, and news into a digestible format for business leaders. My goal is to help you understand how AI is currently impacting the modern workforce and how emerging trends are reshaping our world.
You’ll learn how to make sense of AI and navigate a world that is inevitably becoming more complex.
In this week’s edition:
Elon Musk's xAI secures $6 billion in funding
How to know which LLM or chatbot is best?
$NVDA keynote at Computex 2024
The state of AI in early 2024
xAI funding round
xAI is pleased to announce...
Our Series B funding round of $6 billion with participation from key investors including Valor Equity Partners, Vy Capital, Andreessen Horowitz, Sequoia Capital, Fidelity Management & Research Company, Prince Alwaleed Bin Talal and Kingdom Holding, amongst others.
source: https://x.ai/blog/series-b
OpenAI's GPT, Meta's LLaMA, and other models are facing significant competition. The open-source Grok model, along with Elon Musk's unique approach to business innovation, is set to disrupt the market. Personally, I am a great admirer of Elon Musk's execution skills.
Meta’s decision to make LLaMA open-source and freely available has fundamentally changed the landscape of Large Language Models. I believe xAI will further support this trend, shifting the value of LLMs towards the applications and solutions built on top of them.
How to know which LLM or chatbot is best?
For evaluating general-purpose foundation models such as large language models (LLMs) — which are trained to respond to a large variety of prompts — we have standardized tests like MMLU (multiple-choice questions that cover 57 disciplines like math, philosophy, and medicine) and HumanEval (testing code generation). We also have the LMSYS Chatbot Arena, which pits two LLMs’ responses against each other and asks humans to judge which response is superior, and large-scale benchmarking like HELM. These evaluation tools took considerable effort to build, and they are invaluable for giving LLM users a sense of different models’ relative performance. Nonetheless, they have limitations. For example, leakage of benchmarks datasets’ questions and answers into training data is a constant worry, and human preferences for certain answers does not mean those answers are more accurate.
source: https://www.deeplearning.ai/the-batch/issue-251/
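To make the "pit two responses against each other" idea concrete, here is a minimal sketch of how pairwise human votes can be turned into a ranking using an Elo-style rating, the same family of scoring used for chess players. This is an illustration under my own assumptions, not the Arena's actual code: the model names, the starting rating of 1000, and the update factor `k=32` are all hypothetical.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(ratings: dict, winner: str, loser: str, k: float = 32.0) -> None:
    """Move rating points from the loser to the winner after one vote."""
    surprise = 1.0 - expected_score(ratings[winner], ratings[loser])
    ratings[winner] += k * surprise
    ratings[loser] -= k * surprise


def rank_models(votes, initial: float = 1000.0):
    """votes is a list of (winner, loser) pairs, one per human judgment."""
    ratings = {}
    for winner, loser in votes:
        ratings.setdefault(winner, initial)
        ratings.setdefault(loser, initial)
        update_ratings(ratings, winner, loser)
    # Highest rating first
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)


# Hypothetical votes: each pair means "the first model's answer was preferred"
votes = [("model_a", "model_b"), ("model_a", "model_c"), ("model_b", "model_c")]
for name, rating in rank_models(votes):
    print(f"{name}: {rating:.0f}")
```

Note that the ranking only reflects which answers people preferred, which is exactly the limitation the quote above points out: a preferred answer is not necessarily a more accurate one.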
While LLMs (Large Language Models) offer impressive capabilities, maintaining full control over them presents significant challenges. Due to their general-purpose nature, tailoring LLMs for specific use cases can be difficult. This challenge becomes even more pronounced when developing client-facing solutions. Fine-tuning LLMs demands substantial effort and specialized expertise. Additionally, the frequent emergence of new models complicates the process further.
So, how can businesses harness the power of LLMs effectively? The approach largely depends on the specific use case. One particularly promising method is multi-agent cooperation. For complex tasks like software development, a multi-agent approach can be highly effective. This involves breaking down the task into subtasks and assigning these to different roles, such as a software engineer, product manager, designer, and QA (Quality Assurance) engineer. Different agents can then handle these subtasks collaboratively, enhancing efficiency and productivity. More on this in upcoming newsletters.
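For the technically curious, the role-based hand-off described above can be sketched in a few lines of Python. This is a simplified illustration, not a production framework: each "agent" here is a plain function standing in for an LLM call, and the names `Agent`, `Pipeline`, and the four role functions are hypothetical. Real multi-agent systems usually add feedback loops between roles; this sketch only shows a one-way hand-off.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Agent:
    role: str
    handle: Callable[[str], str]  # stand-in for a call to an LLM


@dataclass
class Pipeline:
    agents: list
    log: list = field(default_factory=list)

    def run(self, task: str) -> str:
        """Pass the work product from one role to the next, in order."""
        artifact = task
        for agent in self.agents:
            artifact = agent.handle(artifact)
            self.log.append((agent.role, artifact))
        return artifact


# Hypothetical role behaviors; a real system would prompt a model here.
def product_manager(task: str) -> str:
    return f"spec({task})"

def designer(spec: str) -> str:
    return f"design({spec})"

def software_engineer(design: str) -> str:
    return f"code({design})"

def qa_engineer(code: str) -> str:
    return f"qa_checked({code})"


def build_demo_pipeline() -> Pipeline:
    return Pipeline(agents=[
        Agent("product manager", product_manager),
        Agent("designer", designer),
        Agent("software engineer", software_engineer),
        Agent("QA engineer", qa_engineer),
    ])


pipeline = build_demo_pipeline()
print(pipeline.run("login page"))
# prints: qa_checked(code(design(spec(login page))))
```

The point of the structure is that each role sees only the output of the previous one, so each subtask stays small and can be checked on its own.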
Nvidia’s keynote at Computex 2024
I think everyone has already heard about Nvidia's soaring stock price, driven more or less by LLMs and the new AI bubble. Its GPUs have proven to be extremely effective at running AI models, and the competition seems to be far behind. Here are a few takeaways from its CEO's latest presentation:
AI and Data Center Advancements - highlighting new GPUs optimized for data centers and AI workloads
Omniverse Platform Enhancements - expanded features for 3D collaboration and simulation, enabling more seamless workflows for developers and enterprises
The days of millions of GPU data centers are coming.
- Jensen Huang, Nvidia CEO
For context, one million GPUs would cost about $40B for the GPUs alone.
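As a quick sanity check on that figure, the arithmetic below shows the per-GPU price it implies. The $40B total and the one million count are the rough numbers from this newsletter, not an official Nvidia price list, and the function name is just for illustration.

```python
def implied_unit_price(total_cost_usd: float, gpu_count: int) -> float:
    """Average price per GPU implied by a total spend and a GPU count."""
    return total_cost_usd / gpu_count


# Rough figures from the text: ~$40B for one million GPUs
price = implied_unit_price(40e9, 1_000_000)
print(f"${price:,.0f} per GPU")  # prints: $40,000 per GPU
```

In other words, the estimate assumes an average of about $40,000 per GPU, before counting networking, power, cooling, or buildings.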
So what does this mean? LLMs are likely to become increasingly affordable. NVIDIA's market dominance, substantial margins, high valuation, and lack of significant competition make it seem unchallenged. However, I believe we can anticipate the emergence of serious competition within the next 1-2 years, which will help drive down the costs of LLMs even further.
Full video: https://www.youtube.com/live/pKXDVsWZmUU?si=LF__SLoWpJH8GSKc
The state of AI in early 2024 by McKinsey
2023 was the year the world discovered generative AI (gen AI), 2024 is the year organizations truly began using—and deriving business value from—this new technology. In the latest McKinsey Global Survey on AI, 65 percent of respondents report that their organizations are regularly using gen AI, nearly double the percentage from our previous survey just ten months ago. Respondents’ expectations for gen AI’s impact remain as high as they were last year, with three-quarters predicting that gen AI will lead to significant or disruptive change in their industries in the years ahead.
In my opinion, most of the 65% who claim to use GenAI primarily rely on ChatGPT for crafting email messages and other forms of communication. Nonetheless, GenAI is a powerful technology. When used properly, it can bring tremendous improvements to businesses, delivering significant and swift ROI. This represents a real transformation that every business leader should monitor and implement as soon as possible, even as a proof of concept in minor areas. Becoming familiar with GenAI and building competencies in leveraging this technology should be a priority for every business today.
Full report here: https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai
If you enjoyed the newsletter, I would greatly appreciate it if you could share it with others! 👇