A host of US-based tech unicorns, companies valued at $1 billion or more, are projected to go public in 2025, according to the Q4 2024 Venture Monitor report from PitchBook/NVCA. PitchBook's VC exit predictor, a tool that combines machine learning with a comprehensive database of companies, funding rounds, and investors, is used to evaluate a startup's probability of a successful exit. The tool assigns each VC-backed company a percentage reflecting the likelihood that it will be acquired, go public, fail to exit, or become self-sustaining. To provide a more accurate picture of the VC market's exit value, the researchers incorporated a proprietary extrapolation methodology typically used for undisclosed M&A transactions in private equity and global M&A reports. Nizar Tarhuni, PitchBook's EVP of Research and Market Intelligence, commented that despite a steady increase in completed financings and aggregate investment in 2024, the VC landscape is grappling with a lack of significant exits. Several factors are driving this, including valuation gaps between buyers and sellers left over from overpriced earlier rounds, and regulatory obstacles dampening deal interest in larger market segments. Tarhuni expressed cautious optimism for 2025, citing a more M&A- and business-friendly environment in Washington and the potential for startups and investors to reset their expectations around valuations, deal structures, and growth. However, he also warned that fundraising might remain tepid as other market segments gain momentum, competing for investment funds and favoring larger venture platforms and established managers. NVCA's CEO, Bobby Franklin, likewise expressed a cautiously optimistic outlook for 2025, following a strong quarter with the highest investment levels since mid-2022. He highlighted the potential for changes in leadership at the FTC, DOJ, and SEC to ease liquidity issues for portfolio companies and reduce regulatory burdens. He also emphasized the opportunity for venture-backed companies to demonstrate their critical role in driving economic growth and maintaining American competitiveness as more VCs take on government roles. Franklin pointed to the significance of the tax bill currently moving through Congress, which could incentivize innovation, reinstate the R&D tax credit, and support the broader ecosystem. Among the tech unicorns predicted to go public in 2025, PitchBook highlights Anduril, an aerospace and defense firm, and Mythical Games, a Web3 gaming company, each with a 97% probability of an IPO. Other companies with a high probability of going public include Ayar Labs, Carbon, Databricks, EquipmentShare, Form Energy, GrubMarket, Mainspring, Sila, and StockX, while Impossible Foods, Groq, and SpaceX are each given a 96% chance. In 2024, the most common deal size was between $1 million and $5 million, with 3,153 startups falling into that range, down from 3,781 in 2023 and 5,310 in 2021.
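The exit predictor described above is, in essence, a multi-class probability model. The sketch below is purely illustrative and is not PitchBook's methodology; the features, data, and class labels are all hypothetical, and only the shape of the output, one probability per exit outcome, mirrors what the article describes.

```python
# Illustrative sketch only -- not PitchBook's actual VC exit predictor.
# Features, companies, and outcomes below are made up for demonstration.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical features per company:
# [total raised ($M), last round size ($M), years since founding, investor count]
X = np.array([
    [1200, 300, 9, 40],
    [85, 25, 4, 12],
    [400, 150, 7, 25],
    [15, 5, 2, 6],
])
# Hypothetical outcomes: 0 = acquired, 1 = IPO, 2 = no exit, 3 = self-sustaining
y = np.array([1, 2, 0, 3])

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)).fit(X, y)

# predict_proba yields one probability per outcome class, analogous to the
# percentages the exit predictor assigns to each VC-backed company.
new_company = np.array([[600, 200, 8, 30]])
labels = ["acquired", "IPO", "no exit", "self-sustaining"]
for label, p in zip(labels, model.predict_proba(new_company)[0]):
    print(f"{label}: {p:.0%}")
```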
European technology firm Vertical Compute recently secured $20.5 million in seed funding to develop its novel memory and compute technology. The funding round was led by Imec.xpand and supported by investors including Eurazeo, XAnge, Vector Gestion, and imec. The funding will advance Vertical Compute's mission of enabling a new generation of Artificial Intelligence (AI) applications. The company says its technology will enable next-generation applications with far greater efficiency and privacy: by reducing data movement and bringing large volumes of data closer to computation, it targets energy savings of up to 80%, and by keeping data local it aims to unlock hyper-personalized AI while eliminating the need for remote data transfers, enhancing user privacy. The tech industry currently faces a significant challenge known as the 'memory wall': processor performance keeps increasing, while the density and performance of memory technologies scale far more slowly. The problem is compounded by the extreme data-access demands of AI workloads. Vertical Compute's approach aims to overcome this hurdle and pave the way for the next wave of AI developments. The company's solution is built on a chiplet-based, modular approach to chip design. It relies on a novel way of storing bits in a high-aspect-ratio vertical structure, potentially outperforming Dynamic Random-Access Memory (DRAM) in density, cost, and energy. By reducing data movements from centimeters to nanometers, the technology promises to address the memory bottleneck in computer systems. Vertical Compute is headquartered in Louvain-la-Neuve, Belgium, with main R&D offices in Leuven, Grenoble, and Nice. The company is actively recruiting an elite team of engineers to support its ambitious R&D goals and to accelerate the development and commercialization of its chiplet-based technology. The seed round underlines confidence both in the leadership team and in the disruptive potential of the technology. With ongoing support from its teams and ecosystem, Vertical Compute is positioned to become a disruptor in the semiconductor industry. Founded in 2024, Vertical Compute is dedicated to solving the memory bottleneck in computer systems, a challenge that has been holding back the advancement of AI inference. The company's commitment to developing a novel solution to this problem points to an exciting future for AI applications and edge computing.
In 2024, we witnessed the rise of thousands of startups leveraging the immense potential of advanced large language models (LLMs). However, statistical trends suggest that only a fifth of these will make it through 2025. To succeed, these companies need more than just cutting-edge technology. As a technology enthusiast, I am thrilled by the prospect of this new tech sector. AI-driven startups promise to revolutionize our world in unimaginable ways, provided they have the right ingredients for success. My recent role as a judge for a startup competition, which recognizes innovators shaping the future of communication and customer engagement, reinforced this notion. Out of more than 500 applicants, we chose 12 honorees, each embodying essential principles for building successful AI solutions. These principles could serve as a starting point for any AI business. AI alone cannot guarantee a competitive edge as it has become increasingly commonplace. The hurdles of launching and running a startup remain largely unchanged in the era of LLMs. Startups need to attract, convert and retain customers, manage costs effectively, and create a sustainable competitive advantage. While AI is becoming more affordable, it can still inflate costs if used to create complex workflows. The power of AI can be harnessed to simplify complex tasks and encapsulate them in an easy-to-use API framework. This approach mirrors what was done for telephony a decade ago and is being done by major AI models today. However, merely creating a 'wrapper' for existing LLMs won't ensure long-term differentiation. For instance, if your goal is to develop a tool for code creation, speech transcription or data extraction from PDFs, a user-friendly interface won't suffice. Major LLMs already excel in these areas. Instead, focus on providing a unique service that offers a compounding advantage through a data flywheel or network effects. Take, for example, one of the honorees from the startup competition that automates voice calls for businesses. By amassing anonymized data from millions of customer calls, they have built a robust database and enhanced analytics. AI is evolving at an unprecedented pace. The number of AI patents per year has increased thirty-one-fold since 2010, with over 62,000 granted in 2022 alone. Therefore, understanding the trajectory of LLM development is crucial to avoid launching an obsolete solution. For instance, major AI labs are currently enhancing the reasoning capabilities of these models. Unless you have significant funding, focusing on advanced reasoning is not advisable. However, one of the competition honorees, a grassroots tech nonprofit, uses AI to improve the accessibility and effectiveness of social and mental health support - an area overlooked by major AI models but serving a significant societal need. While new AI solutions generate considerable interest, maintaining this excitement is crucial. This can be achieved by continuously improving your product based on customer feedback and introducing new features, certifications, and customer-friendly offers. One of the competition honorees, an AI-powered phone interview platform, is a perfect example. After winning an award, they didn't rest on their laurels but continued to secure key technology certifications and offer guarantees of uptime, international support, and reliability. While 2024 was a year of AI experimentation, 2025 will be characterized by AI-powered startups delivering measurable business impact. 
From my experience, the most promising companies are not just creating innovative AI implementations but building robust businesses that can adapt to technological changes. These successful companies demonstrate that enduring success comes from combining AI capabilities with deep domain expertise and solid business fundamentals. We are on the cusp of a new tech boom, and I am confident that some of today's innovators will become tomorrow's tech giants.
In an ambitious project, a team of engineers, in collaboration with Supermicro and NVIDIA, is constructing the world's largest liquid-cooled GPU cluster. This colossal AI supercomputer, designed to power Grok, a generative AI chatbot, will incorporate more than 100,000 NVIDIA HGX H100 GPUs, exabytes of storage, and ultra-fast networking. The state-of-the-art data center, located in Memphis, TN, was transformed from an empty structure into a fully operational AI supercomputer in a mere 122 days. The result, dubbed the xAI Colossus cluster, required the installation of power generators, transformers, and multiple hall structures. VentureBeat, in partnership with Supermicro and ServeTheHome, offers an exclusive video tour of this extraordinary achievement. Building such a massive AI supercomputer required an immense amount of computing power, ready for deployment on day one, and the project was tailored to xAI's unique specifications. The design started with a standard raised-floor data hall, equipped with power systems and liquid-cooling pipes connected to a facility chiller. Each of the four compute halls houses approximately 25,000 NVIDIA GPUs, along with storage, high-speed fiber-optic networking, and power systems. The project's uniqueness lies in its details. Each cluster is built from Supermicro's liquid-cooled rack, the fundamental building block of the Colossus. These racks, each housing eight Supermicro 4U Universal GPU systems with liquid-cooled NVIDIA HGX H100 8-GPU boards and two liquid-cooled x86 CPUs, are arranged in groups of eight, forming mini-clusters within the larger system. These Supermicro servers, considered the densest and most advanced AI servers available, are designed from the ground up to be liquid-cooled; this design approach, combined with their compact power and serviceability, sets Supermicro apart in the industry. Each Coolant Distribution Unit (CDU) features a monitoring system that tracks flow rate, temperature, and other critical functions, along with redundant pumps and power supplies for uninterrupted operation, even during maintenance or replacements. Despite being liquid-cooled, the servers still use fans to cool components like DIMMs, power supplies, baseboard management controllers, NICs, and other electronics. The fans draw cooler air from the front and expel the warm air through liquid-cooled rear-door heat exchangers, resulting in lower power consumption than air-cooled servers. The data center's vast networks run on NVIDIA's Spectrum-X Ethernet networking platform, designed to handle the high demands of AI workloads. Each cluster uses NVIDIA BlueField-3 SuperNICs providing 400-gigabit-per-second networking, with nine links per system delivering 3.6 Tbps of bandwidth per GPU compute server. The CPUs also get a 400GbE connection on a separate switch fabric. The NVIDIA Spectrum SN5600, a 64-port 800Gb Ethernet switch, can be split into 128 400-gigabit Ethernet links to keep the entire cluster operating at maximum performance. In scale, the project surpasses any previous supercomputing endeavor, and as the team, along with Supermicro and NVIDIA, continues to push the boundaries of supercomputing, the world watches with anticipation. The video tour offers a detailed walkthrough of the Colossus and a closer look at this technological marvel.
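The headline figures above can be sanity-checked with some quick arithmetic. The short sketch below simply reproduces them from the numbers cited in the article; the grouping into servers and halls follows the description, not any official specification.

```python
# Back-of-the-envelope check of the cluster figures cited above (a sketch,
# using only the numbers mentioned in the article).
gpus_total = 100_000
gpus_per_server = 8            # Supermicro 4U system with an HGX H100 8-GPU board
halls = 4

servers = gpus_total // gpus_per_server        # 12,500 GPU compute servers
gpus_per_hall = gpus_total // halls            # ~25,000 GPUs per compute hall

link_gbps = 400                                # BlueField-3 SuperNIC link speed
links_per_server = 9
server_bandwidth_tbps = link_gbps * links_per_server / 1000   # 3.6 Tbps per server

switch_ports_800g = 64                         # Spectrum SN5600 port count
links_400g_per_switch = switch_ports_800g * 2  # each 800G port splits into 2x400G = 128

print(servers, gpus_per_hall, server_bandwidth_tbps, links_400g_per_switch)
```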
Cerebras Systems, a leading AI supercomputing firm, has partnered with Mayo Clinic to develop an AI-based genomic foundation model. This model aims to predict optimal treatments for patients with rheumatoid arthritis and could potentially be applied to cancer and cardiovascular disease treatments as well. The announcement was made at the JP Morgan Healthcare Conference in San Francisco, highlighting Mayo Clinic's ongoing commitment to transforming healthcare through AI. Unlike semiconductor companies such as Nvidia that rely on individual AI processors, Cerebras Systems takes a unique approach: it designs an entire silicon wafer that operates as one enormous processor, with its hundreds of thousands of cores working together on complex AI problems and other computing tasks. This method significantly reduces power consumption and accelerates computation. Cerebras' approach enabled the rapid computation of the genomic foundation model, saving time, effort, and resources compared with traditional computing solutions. The genomic foundation model is a significant step forward in precision medicine. It is designed to enhance diagnostics and personalize treatment selection, starting with rheumatoid arthritis. Traditional methods that examine single genetic markers have had limited success in predicting treatment response. This model, trained on a combination of publicly available human reference genome data and Mayo Clinic's patient exome data, shows promising results, and its accuracy is expected to improve as more patient data is incorporated. To evaluate the model's clinical relevance, the team developed new benchmarks that focus on detecting specific medical conditions from DNA data, filling a gap in publicly available benchmarks, which primarily concentrate on identifying structural elements like regulatory or functional regions. The model has demonstrated high accuracy in several key areas, including rheumatoid arthritis benchmarks, cancer predisposition prediction, and cardiovascular phenotype prediction. The model was developed in less than a year, a significant achievement given that such models typically take multiple years to build; this rapid development was enabled by training Mayo Clinic's custom models on the Cerebras AI platform. The Mayo Genomic Foundation Model represents a significant stride towards enhancing clinical decision support and advancing precision medicine. In separate collaborations, Mayo Clinic also announced generative AI partnerships with Microsoft Research and Cerebras Systems; these projects aim to personalize patient care, accelerate diagnostic time, and improve accuracy by developing foundation models that integrate various applications. For instance, Mayo Clinic and Microsoft Research are exploring the use of generative AI in radiology, using Microsoft Research's AI technology and Mayo Clinic's X-ray data. Large language models typically predict words, but genomic models predict nucleotides. When a nucleotide mutates or is copied incorrectly, the change can cause disease or serve as a marker that predicts it. Existing models can only examine single-nucleotide changes, but Cerebras' model can analyze multiple nucleotide changes, resulting in more accurate predictions. The goal is to predict the most effective drug for a specific patient, reducing the trial-and-error process often involved in treatment selection. The model is still in its early stages, but early results show promising accuracy rates in drug prediction for rheumatoid arthritis patients.
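The analogy between word prediction and nucleotide prediction can be made concrete with a toy example. The sketch below is purely illustrative and is not the Mayo/Cerebras model: it builds a tiny context model over a made-up reference sequence and compares the probability of the reference base against a single-nucleotide variant at one position.

```python
# A toy illustration of the idea described above: a genomic model assigns a
# probability to each nucleotide given its surrounding context, and a variant
# the model finds very unlikely may be flagged for further study.
# This is a minimal sketch, not the Mayo/Cerebras foundation model.
from collections import Counter, defaultdict

reference = "ATGGCGTACGTTAGCATGGCGTACGTTAGCATGGCGTACGTTAGC"  # toy "genome"
K = 4  # context length in nucleotides

# Count how often each nucleotide follows each length-K context.
counts = defaultdict(Counter)
for i in range(K, len(reference)):
    counts[reference[i - K:i]][reference[i]] += 1

def p_next(context, nucleotide):
    """P(nucleotide | context) with add-one smoothing over A/C/G/T."""
    c = counts[context]
    return (c[nucleotide] + 1) / (sum(c.values()) + 4)

# Score a single-nucleotide change at position `pos`: reference base vs a variant.
pos, variant = 20, "T"
context = reference[pos - K:pos]
print("P(reference base):", p_next(context, reference[pos]))
print("P(variant base):  ", p_next(context, variant))
```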
As the field of AI continues to evolve, collaborations like these between tech companies and healthcare institutions are crucial in leveraging AI's potential to transform healthcare and improve patient outcomes.
OpenAI's artificial intelligence chatbot, ChatGPT, is evolving into a comprehensive personal assistant with the introduction of a new feature known as Tasks. This enhancement may signal that OpenAI is planning to launch more AI agents in the future. Tasks, currently in beta, allows users of ChatGPT Plus, Team, and Pro to schedule activities in advance: users can set up recurring or one-time reminders for things like project deadlines or daily weather updates. To create a task, users select "4o with scheduled tasks" in the model picker and write a reminder prompt; ChatGPT can also propose tasks based on previous conversations. Tasks works across ChatGPT's desktop, web, and mobile apps and sends notifications on all of them, but managing existing tasks is currently only possible on the web version. OpenAI anticipates that the beta period will provide valuable insight into how users interact with Tasks, enabling the company to refine the feature before rolling it out to all ChatGPT users. Tasks is the latest addition to ChatGPT's growing suite of assistant-like features. OpenAI previously introduced screen sharing during its "12 Days of OpenAI" event, a feature that lets users open ChatGPT while reading a text message and ask for help crafting a response. Speculation about OpenAI launching an AI agent began when some users noticed ChatGPT offering access to scheduled tasks as early as December. The rumored agent, referred to as Operator, would be OpenAI's first. There are theories that Tasks could eventually allow ChatGPT to perform more complex tasks such as searching for specific information, summarizing data, accessing documents, and problem-solving. When asked about Tasks and Operator, OpenAI declined to comment directly, saying only that Tasks is a crucial step towards making ChatGPT a more useful AI companion. OpenAI has already ventured into the realm of AI agents with Swarm, a framework designed to manage AI agents. Despite the abundance of reminder, calendar, and productivity apps available, OpenAI's entry into this space is noteworthy. ChatGPT already simplifies handing coding or writing tasks to the platform and searching the web without leaving the chat interface; with the addition of Tasks, ChatGPT could emerge as a strong contender in the productivity and scheduling app market. As ChatGPT continues to add capabilities, setting up scheduled tasks and reminders through it becomes increasingly practical, making ChatGPT an attractive option in a productivity assistant landscape already populated by applications like Google Calendar, Outlook Calendar, Asana, Trello, and Notion.
Google's Gemini AI has quietly reached a significant milestone: real-time processing of multiple visual streams simultaneously. This breakthrough enables Gemini to analyze live video feeds and static images at the same time, a capability not yet available in Google's own end-user applications. The innovation emerged from an experimental application called AnyChat, demonstrating the untapped potential of Gemini's architecture. Traditionally, AI platforms could manage either live video streams or static photos, but never both concurrently; AnyChat has broken that barrier, expanding the scope of AI's capacity for intricate, multimodal interactions. Gemini's multi-stream processing rests on its underlying architecture, which AnyChat leverages to handle multiple visual inputs without degrading performance. Although this capability exists in Gemini's API, it hasn't been exposed in Google's official applications for end users. By contrast, many AI platforms, such as ChatGPT, are limited to single-stream processing because of the computational demands involved: ChatGPT currently disables live video streaming when an image is uploaded, since even a single video feed strains resources, let alone one combined with static image analysis. This breakthrough has immediate and transformative potential applications. Students can point their camera at a calculus problem while showing Gemini a textbook for step-by-step guidance. Artists can share works-in-progress alongside reference images, receiving nuanced, real-time feedback on composition and technique. AnyChat's achievement is notable not just for the technology itself but because it bypasses the limitations of Gemini's official deployment. This was made possible through specialized permissions from Google's Gemini API, allowing AnyChat to access functionality not yet available in Google's own platforms. With these expanded permissions, AnyChat tunes Gemini's attention mechanisms to track and analyze multiple visual inputs simultaneously while maintaining conversational coherence. Developers can replicate this capability with a few lines of code, as demonstrated by AnyChat's use of Gradio, an open-source platform for building ML interfaces; a minimal sketch of this approach appears below. AnyChat's success is not accidental: its developers worked closely with Gemini's technical architecture to push its boundaries, revealing a side of Gemini that even Google's official tools haven't yet explored. This experimental approach allowed AnyChat to handle simultaneous streams of live video and static images, essentially breaking the "single-stream barrier." The implications extend beyond creative tools and casual AI interactions. In the medical field, professionals could show an AI both live patient symptoms and historical diagnostic scans simultaneously. Engineers could compare real-time equipment performance against technical schematics and receive instant feedback. Quality-control teams could match production-line output against reference standards with unprecedented accuracy and efficiency. In education, the potential is transformative: students can use Gemini in real time to analyze textbooks while working on practice problems, receiving context-aware support that bridges the gap between static and dynamic learning environments.
For artists and designers, the ability to showcase multiple visual inputs simultaneously opens up new avenues for creative collaboration and feedback. For now, AnyChat remains an experimental developer platform, operating with expanded rate limits granted by Gemini's developers. However, its success proves that simultaneous, multi-stream AI vision is no longer a distant aspiration but a present reality, ready for large-scale adoption. The emergence of AnyChat raises intriguing questions. Why hasn't Gemini's official rollout included this capability? Is it an oversight, a deliberate choice in resource allocation, or an indication that smaller, more agile developers are driving the next wave of innovation? As the AI race accelerates, the lesson of AnyChat is clear: The most significant advances may not always come from the sprawling research labs of tech giants. Instead, they may originate from independent developers who see potential in existing technologies — and dare to push them further. With Gemini's groundbreaking architecture now proven capable of multi-stream processing, the stage is set for a new era of AI applications. Whether Google will fold this capability into its official platforms remains uncertain. One thing is clear, however: The gap between what AI can do and what it officially does just got a lot more interesting.
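For readers who want to experiment with the Gradio route mentioned above, a minimal sketch follows. It is not AnyChat's code, just an approximation of the idea: one Gradio app that sends a live webcam frame and a static reference image to Gemini in the same request. It assumes the google-generativeai Python SDK, a GOOGLE_API_KEY environment variable, a recent Gradio version, and the "gemini-1.5-flash" model name.

```python
# A minimal sketch of multi-input visual analysis with Gemini and Gradio.
# Assumptions: google-generativeai SDK, GOOGLE_API_KEY set, Gradio 4.x.
import os
import google.generativeai as genai
import gradio as gr

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")

def analyze(webcam_frame, reference_image, question):
    parts = [question]
    # Gradio hands us PIL images (or None if an input was left empty);
    # the SDK accepts PIL images alongside text parts in one request.
    for image in (webcam_frame, reference_image):
        if image is not None:
            parts.append(image)
    return model.generate_content(parts).text

demo = gr.Interface(
    fn=analyze,
    inputs=[
        gr.Image(sources=["webcam"], type="pil", label="Live camera"),
        gr.Image(type="pil", label="Reference image"),
        gr.Textbox(label="Question"),
    ],
    outputs=gr.Textbox(label="Gemini's answer"),
)

if __name__ == "__main__":
    demo.launch()
```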
Global pharmaceutical leader GSK is pioneering the use of generative artificial intelligence (AI) in healthcare, exploring its potential in areas like drug discovery, genomic analysis, and scientific literature review. A significant challenge it faces, however, is "hallucinations": instances where AI models generate incorrect or misleading information. This issue is particularly critical in healthcare, where errors can lead to serious, potentially life-altering consequences. To address it, GSK has focused on reducing hallucinations at the point where a model is actually run, known as inference time, rather than during its training phase. Techniques employed include self-reflection mechanisms, multi-model sampling, and iterative output evaluation. These strategies, according to GSK's SVP of AI and machine learning, ensure that AI models are robust and reliable, enabling scientists to generate actionable insights faster and more efficiently. The primary way to enhance the performance of generative AI applications at inference time is to allocate more computational resources, which lets the model undertake complex operations such as iterative output refinement or multi-model aggregation, both crucial for reducing hallucinations and improving model performance. GSK leverages these additional computational resources to produce results that are not only faster but also more accurate and reliable. GSK's approach to addressing hallucinations in healthcare AI applications involves two key strategies. The first is self-reflection, in which large language models (LLMs) critique and refine their own outputs to improve quality; this iterative process yields clearer, more detailed final answers that align with strict healthcare standards. The second involves using multiple LLMs, or different configurations of a single model, to cross-check outputs. By comparing these outputs, GSK can confirm the most consistent conclusions, reducing hallucinations and increasing confidence in the final answer, a critical requirement in healthcare. These strategies, however, demand infrastructure that can handle significantly heavier computational loads. In what's been dubbed the "inference wars," AI infrastructure companies compete to deliver hardware improvements that raise token throughput, lower latency, and reduce cost per token. Specialized chips and architectures enable complex inference routines at scale, allowing advanced techniques to function in real-world scenarios. Despite these advancements, scaling computational resources presents challenges: longer inference times can slow workflows, particularly where immediate results are required, and higher computational usage increases costs, necessitating careful resource management. GSK considers these trade-offs necessary for stronger reliability and richer functionality. GSK continues to refine its AI-driven healthcare solutions, with a focus on scaling computational resources at inference time. This approach not only ensures that generative models meet the rigorous demands of clinical environments but also serves as a blueprint for other organizations on how to balance accuracy, efficiency, and scalability. By staying at the forefront of computational innovations and sophisticated inference techniques, GSK is laying the groundwork for future breakthroughs in drug discovery, patient care, and beyond.
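As a rough illustration of those two techniques, the sketch below implements a generic self-reflection loop and a simple consistency vote across multiple models. It is not GSK's pipeline; `ask` is a placeholder for whatever LLM API an organization actually uses, and the dummy model at the end exists only so the example runs end to end.

```python
# A minimal sketch of the two inference-time strategies described above:
# (1) self-reflection, where the model critiques and revises its own draft, and
# (2) sampling several models/configurations and keeping the most consistent answer.
from collections import Counter
from typing import Callable, List

def self_reflect(ask: Callable[[str], str], question: str, rounds: int = 2) -> str:
    answer = ask(question)
    for _ in range(rounds):
        critique = ask(f"Critique this answer for factual errors and omissions:\n{answer}")
        answer = ask(f"Question: {question}\nDraft answer: {answer}\n"
                     f"Critique: {critique}\nWrite an improved answer.")
    return answer

def cross_check(models: List[Callable[[str], str]], question: str) -> str:
    # Query several models (or temperature settings) and return the answer
    # the largest number of them agree on -- a simple consistency vote.
    answers = [m(question) for m in models]
    most_common, _ = Counter(answers).most_common(1)[0]
    return most_common

# Trivial stand-in "model" so the sketch runs without any external API:
dummy = lambda prompt: "Answer: 42"
print(self_reflect(dummy, "What is the dose?"))
print(cross_check([dummy, dummy, dummy], "What is the dose?"))
```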
The Singapore-based tech firm MiniMax, known for its high-resolution generative AI video model Hailuo, has unveiled a new series of models called MiniMax-01. These models are designed to handle ultra-long contexts, supporting the development of AI agents. The MiniMax-01 series includes MiniMax-Text-01, a foundational large language model (LLM), and MiniMax-VL-01, a visual multimodal model. Notably, MiniMax-Text-01 can process up to 4 million tokens in its context window, roughly equivalent to the contents of a small library; the context window is the amount of data the LLM can handle in a single input/output exchange. That capacity is double the previous record held by Google's Gemini 1.5 Pro model. The new models are available for download on platforms like Hugging Face and GitHub, and developers can also access them through MiniMax's application programming interface (API) for integration into their own applications. Compared with competitors like OpenAI's GPT-4o, MiniMax offers text and multimodal processing at competitive rates. To optimize scalability, MiniMax has incorporated a mixture-of-experts (MoE) framework with 32 experts, a design that balances computational and memory efficiency while maintaining high performance. At the core of MiniMax-01 is Lightning Attention, a linear attention mechanism that serves as an alternative to standard transformer attention and significantly reduces computational complexity. The model comprises 456 billion parameters, of which 45.9 billion are activated for each inference. The MiniMax-01 models have shown competitive performance on mainstream text and multimodal benchmarks, rivaling top-tier models such as GPT-4o and Claude 3.5, and MiniMax-Text-01 achieved 100% accuracy on the Needle-In-A-Haystack task with a 4-million-token context. The models also maintain consistent performance as input length increases. Looking ahead, MiniMax plans to regularly update and enhance the models' capabilities, and it sees the open-sourcing of these models as a step towards building foundational AI capabilities for the evolving AI agent landscape. As the need for sustained memory and efficient communication between AI agents grows, MiniMax's innovations are designed to meet these challenges. MiniMax encourages developers and researchers to explore the capabilities of MiniMax-01 and welcomes technical suggestions and collaboration inquiries. The company's commitment to cost-effective and scalable AI positions it as a key player in shaping the AI agent era, and the MiniMax-01 series offers an exciting opportunity for developers to explore the potential of long-context AI.
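Since the weights are published on Hugging Face, a typical way to experiment with them is through the transformers library. The sketch below assumes the repo id "MiniMaxAI/MiniMax-Text-01" and that the custom model code ships with the repo (hence trust_remote_code); the full 456B-parameter model requires substantial multi-GPU hardware, so treat this as an illustrative loading pattern rather than a laptop-ready script.

```python
# A minimal sketch of pulling the open-weights model from Hugging Face with
# transformers. The repo id is an assumption based on the release notes.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "MiniMaxAI/MiniMax-Text-01"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    trust_remote_code=True,   # custom Lightning Attention / MoE code ships with the repo
    device_map="auto",        # shard across whatever GPUs are available
    torch_dtype="auto",
)

prompt = "Summarize the key idea behind linear attention in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```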
As 2025 approaches, it's predicted to be a pivotal year for artificial intelligence (AI), with the promise of delivering tangible benefits to businesses. However, according to a recent report on the state of AI development, we are not quite at that stage. Only a quarter of enterprises have incorporated AI into their operations, and of those, a mere fraction have seen measurable results. This suggests that many businesses are still grappling with identifying practical applications for AI, leaving them in a pre-development phase. Despite the buzz and ongoing discussions about AI, most companies are still in the early stages of understanding how to effectively deploy AI solutions. Insights from over 1,250 AI developers and builders reveal the current landscape of AI implementation. Businesses are at various stages of their AI journey, with the majority (53%) developing and assessing strategies and proofs of concept, followed by beta testing (14%) and user requirement gathering (7.9%). The primary focus for most enterprises is on creating document analysis tools and customer service chatbots, though interest is growing in applications that combine analytics with natural language, content creation, recommendation systems, code generation, automation, and research automation. Developers have reported benefits including a competitive edge (31.6%), time and cost savings (27.1%), and higher user adoption rates (12.6%), but nearly a quarter have yet to see any significant impact from their AI investments. The report emphasizes the importance of identifying use cases from the outset: experimenting with AI may appease investors, but it doesn't necessarily contribute to business growth or efficiency. Establishing clear use cases, achieving measurable ROI, and building internal expertise can help companies move beyond the hype and gain momentum in their AI journey. In terms of AI models, OpenAI is currently leading the pack with GPT-4o and GPT-4o mini. However, 2024 saw an increase in options, either directly from model creators or through platform solutions like Azure or AWS Bedrock. Providers hosting open-source models such as Llama 3.2 70B, including Groq, Fireworks AI, and Together AI, are also gaining popularity. The future of AI is expected to be multimodal, with a surge in the adoption of tools capable of handling a variety of tasks; text remains the most common use case, followed by file creation (PDF or Word), images, audio, and video. Retrieval-augmented generation (RAG) is a popular choice for information retrieval, with over half of developers using vector databases to simplify search; popular options include Pinecone, MongoDB, Qdrant, Elasticsearch, pgvector, Weaviate, and Chroma. AI is becoming democratized across enterprises, with departments beyond IT getting involved in AI projects. While engineering remains the most engaged in AI projects (82.3%), leaders and executives (60.8%), subject matter experts (57.5%), product teams (55.4%), and design departments (38.2%) are also participating. Despite the growing involvement, enterprises continue to face challenges, notably around AI hallucinations and prompts, model speed and performance, data access and security, and securing stakeholder buy-in. Furthermore, while more non-technical users are getting involved, there is still a lack of pure technical expertise in-house.
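To make the RAG pattern mentioned above concrete, here is a minimal sketch using Chroma, one of the vector stores listed. The documents and question are made up, and the final LLM call is omitted so the example runs standalone.

```python
# A minimal retrieval-augmented generation (RAG) sketch using Chroma. It embeds
# a few documents, retrieves the most relevant ones for a question, and
# assembles the prompt a generation model would receive.
import chromadb

client = chromadb.Client()
collection = client.create_collection("docs")
collection.add(
    ids=["1", "2", "3"],
    documents=[
        "Our refund policy allows returns within 30 days of purchase.",
        "Enterprise support is available 24/7 via the customer portal.",
        "The Q3 report showed a 12% increase in active users.",
    ],
)

question = "How long do customers have to return a product?"
hits = collection.query(query_texts=[question], n_results=2)
context = "\n".join(hits["documents"][0])

prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"
print(prompt)  # this assembled prompt would then be sent to the chosen LLM
```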
Many of these challenges can be mitigated through tooling, or platforms and services that assist developers in evaluating complex AI systems. However, nearly 18% of developers are defining prompts and orchestration logic without any tooling at all. Another solution to common problems is performing evaluations, or using specific metrics to test the correctness of responses. However, many developers are not conducting evaluations as consistently as they should be. The report suggests that enterprises need robust evaluation processes, especially when dealing with advanced agentic systems, which call external systems and perform autonomous actions. While some developers are utilizing automated evaluation tools, A/B testing, and open-source evaluation frameworks, more than three-quarters are still conducting manual testing and reviews. In conclusion, the report emphasizes the importance of embracing a mix of systems that work together — from cloud to application programming interfaces (APIs). It suggests treating AI as just one tool in the toolkit, rather than a magic solution for every problem.
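A bare-bones version of the automated evaluation the report calls for might look like the sketch below: a small test set, a simple containment metric, and a pass rate. The `generate` callable and the test cases are placeholders for a real application and its domain-specific checks; production suites would add LLM-graded rubrics, A/B tests, or open-source evaluation frameworks on top.

```python
# A minimal sketch of metric-based response evaluation: run the system over a
# test set and score each answer by whether it contains required keywords.
from typing import Callable, Dict, List

def evaluate(generate: Callable[[str], str], cases: List[Dict]) -> float:
    passed = 0
    for case in cases:
        answer = generate(case["question"]).lower()
        if all(keyword.lower() in answer for keyword in case["must_contain"]):
            passed += 1
        else:
            print("FAILED:", case["question"])
    return passed / len(cases)

test_cases = [
    {"question": "What is our refund window?", "must_contain": ["30 days"]},
    {"question": "Which regions do we ship to?", "must_contain": ["US", "EU"]},
]

# Stand-in system so the sketch runs end to end; swap in the real pipeline.
dummy_system = lambda q: "Returns are accepted within 30 days. We ship to the US and EU."
print(f"pass rate: {evaluate(dummy_system, test_cases):.0%}")
```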