<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:media="http://search.yahoo.com/mrss/"><channel><title>IEEE Spectrum</title><link>https://spectrum.ieee.org/</link><description>IEEE Spectrum</description><atom:link href="https://spectrum.ieee.org/feeds/topic/computing.rss" rel="self"></atom:link><language>en-us</language><lastBuildDate>Mon, 08 Jun 2026 14:49:49 -0000</lastBuildDate><image><url>https://spectrum.ieee.org/media-library/eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpbWFnZSI6Imh0dHBzOi8vYXNzZXRzLnJibC5tcy8yNjg4NDUyMC9vcmlnaW4ucG5nIiwiZXhwaXJlc19hdCI6MTgyNjE0MzQzOX0.N7fHdky-KEYicEarB5Y-YGrry7baoW61oxUszI23GV4/image.png?width=210</url><link>https://spectrum.ieee.org/</link><title>IEEE Spectrum</title></image><item><title>Nvidia’s AI Hardware Comes to Windows in RTX Spark PCs</title><link>https://spectrum.ieee.org/nvidia-rtx-spark-windows-pc</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/3d-rendering-of-a-pc-chip.jpg?id=66865922&width=1200&height=400&coordinates=0%2C292%2C0%2C292"/><br/><br/><p>At Computex 2026, an annual computer trade show held in Taipei, Taiwan, Nvidia made a long-anticipated announcement—a version of the company’s Blackwell GB10 superchip for Windows PCs, called RTX Spark. <a href="https://www.windowscentral.com/hardware/nvidia/nvidia-n1x-opencl-leak-cuda-cores-rtx-5070" rel="noopener noreferrer" target="_blank">Originally rumored to launch in 2025</a>, it was finally introduced at this year’s show.</p><p>It came with full support from Microsoft, which announced two new devices powered by RTX Spark: the <a href="https://www.microsoft.com/en-us/surface/devices/surface-laptop-ultra" rel="noopener noreferrer" target="_blank">Surface Laptop Ultra</a> and the <a href="https://www.microsoft.com/en-us/surface/devices/surface-rtx-spark-dev-box" rel="noopener noreferrer" target="_blank">Surface RTX Spark Dev Box</a>. Asus, Dell, Lenovo, HP, and MSI also announced Windows PCs with RTX Spark.</p><p>If this is triggering déjà vu, that’s for good reason. In June 2024, Qualcomm and Microsoft partnered to launch AI-focused Copilot+ PCs. <a href="https://spectrum.ieee.org/qualcomm-snapdragon-x2" target="_blank">Qualcomm’s Arm-based chips</a> provided an alternative to <em>x</em>86-based chips from AMD and Intel used across dozens of budget and mid-range Windows laptops. The launch met with mixed commercial success, however, and Intel remains the dominant supplier of chips for Windows laptops. But that doesn’t mean RTX Spark will follow the same path, as Nvidia’s involvement is an important part of the equation. 
</p><p>“Nvidia just has more clout and more industry weight to push and make things happen that Qualcomm couldn’t do early on, and that even Microsoft struggled with,” says <a href="https://www.linkedin.com/in/ryanshrout/" rel="noopener noreferrer" target="_blank">Ryan Shrout</a>, president at <a href="https://signal65.com/" rel="noopener noreferrer" target="_blank">Signal65</a>, a third-party testing firm. “They can get game developers on board, and get software developers in the emerging AI space to pay attention.”</p><h2>What is RTX Spark?</h2><p>At its core, <a href="https://www.nvidia.com/en-us/products/rtx-spark/" rel="noopener noreferrer" target="_blank">RTX Spark</a> is an iteration of the hardware found in the <a href="https://www.nvidia.com/en-us/products/workstations/dgx-spark/" rel="noopener noreferrer" target="_blank">DGX Spark</a> mini-workstation, which was released in late 2025. Officially badged N1X, the silicon is Nvidia’s Blackwell GB10 “superchip,” a system-on-a-chip with 20 Arm CPU cores, 6,144 GPU cores, and support for up to 128 gigabytes of LPDDR5X memory. </p><p>There are some small differences between the mini-workstation and PC system, and the most significant is power consumption. The DGX Spark was designed for GB10 to operate with a power consumption up to 140 watts without overheating. RTX Spark laptops are likely to use less power, which may lower performance, though the details will depend on each PC maker’s particular implementation and remain to be seen.</p><p>RTX Spark will also <a href="https://spectrum.ieee.org/ai-models-locally" target="_blank">include a neural processing unit</a> (NPU) that qualifies the system for Microsoft’s Copilot+ certification. The NPU is used for some background AI features, like Windows Recall. 
However, the GPU will remain in the driver’s seat for active AI tasks, including large language models (LLMs) and image generation.</p><p>Though RTX Spark laptops took the spotlight, the news is also relevant to desktop workstations. Currently, DGX Spark ships with a custom version of Linux called DGX OS, not Windows. Nvidia says RTX Spark desktops with Windows are coming in the third quarter of 2026. <a href="https://www.nvidia.com/en-us/products/workstations/dgx-station-for-windows/" rel="noopener noreferrer" target="_blank">Windows is also coming to Nvidia’s DGX Station</a>, the full-sized desktop iteration of Nvidia’s hardware. </p><p>The launch of RTX Spark is of course in part an AI play, and that is taking the lion’s share of attention. But <a href="https://www.linkedin.com/in/anshel-sag-7484127/" rel="noopener noreferrer" target="_blank">Anshel Sag</a>, principal analyst at <a href="https://moorinsightsstrategy.com/" rel="noopener noreferrer" target="_blank">Moor Insights & Strategy</a>, thinks Spark is just as relevant for professional work and gaming. “I think the AI play is mostly to appease investors,” he says. “Creators and gamers are also excited about RTX Spark, and someone like me who does all three is even more excited, because having a machine that can do all three well has been a challenge.”</p><h2>Nvidia’s advantage may lie in software </h2><p>Though Nvidia refers to the GB10 as a “superchip,” it’s similar to other high-performance system-on-a-chip designs, such as Apple’s M-series silicon and AMD’s Ryzen AI Max. All three include a CPU, GPU, and NPU. All three support large amounts of DRAM. And all three have a unified memory architecture (meaning the system memory is a shared resource accessible to the CPU, GPU, and NPU). </p><p>The existing DGX Spark also provides a baseline for performance expectations. 
RTX Spark will likely deliver GPU performance similar to that of an RTX 5070 mobile GPU, which, if correct, would put it ahead of Apple’s and AMD’s competing systems. On the other hand, GB10’s CPU cores <a href="https://www.phoronix.com/review/nvidia-gb10-cpu/6" rel="noopener noreferrer" target="_blank">aren’t as quick as the CPU cores</a> found in leading competitors. </p><p>Nvidia’s biggest edge might stem not from hardware performance, but from software. The company’s GPUs are essentially the industry standard across gaming and professional work, <a href="https://www.pcworld.com/article/3079686/nvidia-dominates-pc-graphics-cards-eating-94-of-the-market.html" rel="noopener noreferrer" target="_blank">with estimates placing Nvidia’s GPU market share above 90 percent</a>. That in turn has made Nvidia the target for most software that benefits from a GPU.</p><p>“Nobody doubts that Nvidia is the leader in GPU capability and the software stack around it,” says Shrout. Sag agrees, saying Nvidia has the advantage of “extremely mature drivers.” </p><h2>Microsoft touts AI, but Windows on Arm remains a question</h2><p>Nvidia announced RTX Spark in lockstep with Microsoft, which held its Build developer conference in San Francisco while Computex was taking place across the Pacific in Taipei.</p><p>Repeating the Copilot+ PC launch, Microsoft’s vision of Windows on the RTX Spark leans heavily on AI. 
But unlike Copilot+ PCs—<a href="https://spectrum.ieee.org/microsoft-copilot" target="_self">which used the NPU to accelerate AI features integrated into the Windows user experience</a>, such as quickly recalling anything you’ve opened or translating live video calls—the pitch for Windows running on RTX Spark seems more focused on using the Spark’s GPU to accelerate LLMs.</p><p>Microsoft announced an “early preview” of a Windows SDK called <a href="https://blogs.windows.com/windowsdeveloper/2026/06/02/windows-platform-security-for-ai-agents/" rel="noopener noreferrer" target="_blank">Microsoft Execution Containers (MXC)</a>, which sandboxes AI agents, allowing them to work autonomously while isolating them from functions the user doesn’t want the agent to access. </p><p>Still, the real test for both Nvidia and Microsoft remains the same challenge Microsoft and Qualcomm faced: establishing Windows on Arm PCs as an alternative to Windows PCs powered by <em>x</em>86 chips from Intel and AMD. Whether RTX Spark will succeed in this remains to be seen.</p><p>“Even with all of the talk from Nvidia and Microsoft about the future of the PC and revolutionizing the PC, everybody understands that it needs to be a great general-purpose PC first,” says Shrout.</p>]]></description><pubDate>Sat, 06 Jun 2026 12:00:01 +0000</pubDate><guid>https://spectrum.ieee.org/nvidia-rtx-spark-windows-pc</guid><category>Nvidia</category><category>Pcs</category><category>Windows</category><category>Arm</category><category>Ai-hardware</category><dc:creator>Matthew S. Smith</dc:creator><media:content medium="image" type="image/jpeg" url="https://spectrum.ieee.org/media-library/3d-rendering-of-a-pc-chip.jpg?id=66865922&amp;width=980"></media:content></item><item><title>NSF Experiments With New Kind of Science Funding</title><link>https://spectrum.ieee.org/nsf-x-labs</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/conceptual-illustration-of-a-futuristic-atomic-particle-core-on-a-digital-hud-data-display.jpg?id=66853198&width=1200&height=400&coordinates=0%2C417%2C0%2C417"/><br/><br/><p>Uncle Sam wants you to solve big <a href="https://www.nature.com/articles/d41586-022-00018-5" rel="noopener noreferrer" target="_blank">scientific and engineering bottlenecks</a> outside the hallowed walls of academia. On 14 May, the U.S. National Science Foundation (NSF) issued a <a href="https://www.nsf.gov/tip/updates/nsf-announces-15b-nsf-x-labs-initiative-pursue-generational" rel="noopener noreferrer" target="_blank">solicitation</a> inviting what it calls “X-Labs,” or independent research organizations, to apply for a total of US $1.5 billion over 10 years. The structure of the X-Labs solicitation is new for the United States government and closely matches an emerging private funding model for what are known as focused-research organizations (FROs), which various think tanks and philanthropies have <a href="https://fas.org/publication/focused-research-organizations-to-accelerate-science-technology-and-medicine/" rel="noopener noreferrer" target="_blank">proposed</a> and <a href="https://issues.org/focused-research-organizations-fro-marblestone-gamick-wang-fridman/" rel="noopener noreferrer" target="_blank">tested</a> during the last six years.</p><p>A focused-research organization is a team of scientists, engineers, and other technology developers that works on a well-defined problem with a target duration of three to seven years and on a budget in the tens of millions of dollars. 
Some examples have sought to build an <a href="https://spectrum.ieee.org/bci-ultrasound" target="_self">ultrasound-based brain-computer interface</a>, <a href="https://spectrum.ieee.org/ocean-carbon-removal" target="_self">quantify marine CO<sub>2</sub> removal</a>, and <a href="https://spectrum.ieee.org/ai-proof-verification" target="_self">improve formal verification in mathematics</a>. The funding for these FROs required a team approach to science that is larger and more agile than a typical academic lab but has a more academic appetite for scientific risk than a commercial venture might have. In some ways, they echo work done by the Defense Advanced Research Projects Agency (DARPA), which is known for helping bridge high-risk research and early-stage commercialization of many technologies. </p><p>“The NSF’s X-Labs announcement is a welcome signal that the global research community is serious about finding new ways to fund ambitious, high-risk science,” says <a href="https://www.linkedin.com/in/pippy-james/" target="_blank">Pippy James</a>, deputy CEO of the <a href="https://aria.org.uk/about-aria/" target="_blank">Advanced Research + Invention Agency (ARIA)</a>, a United Kingdom government funding body in London that uses a similar model to the X-Labs.</p><p>The NSF’s first two X-Lab research areas are <a href="https://sam.gov/workspace/contract/opp/f58da497f6ad4bd9ab7ca021eee479e2/view" rel="noopener noreferrer" target="_blank">scientific instrumentation for sensing and imaging</a> and <a href="https://sam.gov/workspace/contract/opp/cdce081fa4aa4bcaa70003db71919a36/view" rel="noopener noreferrer" target="_blank">interconnects and integrated photonics for quantum systems</a>, and the solicitation says the agency will announce additional topics within weeks. 
</p><p>The funding is structured <a href="https://sam.gov/workspace/contract/opp/0918909712164c78af1ef29055a02f7b/view" rel="noopener noreferrer" target="_blank">in phases</a>, with $1.5 million per project in the first year, then up to $50 million per project over the next two to three years for selected projects, with a third, more open-ended phase after that. That first-year funding is more than seven times the roughly $200,000 awarded to <a href="https://nsf-gov-resources.nsf.gov/files/04_fy2025.pdf?VersionId=p8rlqMsPAwAgX9xJuzDBHV5bmXdImVme" rel="noopener noreferrer" target="_blank">the typical NSF project</a>.</p><p>“Compared to incremental, project-based grants, larger institutional grants and longer-horizon grants let teams take on harder, more infrastructure-heavy problems with the agility to pivot as they learn,” says <a href="https://www.jenngustetic.me" target="_blank">Jenn Gustetic</a>, director of metascience and R&D policy at the <a href="https://ifp.org" rel="noopener noreferrer" target="_blank">Institute for Progress</a> (IFP), a Washington, D.C. think tank that made a <a href="https://ifp.org/how-x-labs-can-unleash-ai-driven-scientific-breakthroughs/" rel="noopener noreferrer" target="_blank">proposal</a> last year for how the U.S. government could support more independent research organizations.</p><p>The NSF solicitation also requires applicants to demonstrate “substantial” independence from any non-X-Lab institutions such as a university or company, which the NSF defined in part to mean the ability to make decisions on research direction, partnerships, and staff in days rather than weeks. It would be difficult for a full-time researcher at a university to qualify, for example, which opens the door to industry researchers or academics willing to take extended leave. 
</p><h2>What should new money for science look like?</h2><p>“People have been kind of wanting to do [focused research organizations] in a fairly bipartisan way since 2020,” says <a href="https://www.adammarblestone.org" target="_blank">Adam Marblestone</a>, an early proponent who now directs <a href="https://www.convergentresearch.org" rel="noopener noreferrer" target="_blank">Convergent Research</a>, a Cambridge, Mass., nonprofit that has spent almost $400 million building <a href="https://www.convergentresearch.org/ecosystem" rel="noopener noreferrer" target="_blank">a dozen FROs</a>. Some larger goal-directed, rather than principal-investigator-centered, funding has existed for decades in other federal agencies in the form of the Advanced Research Projects Agencies for defense, intelligence, energy, and most recently health. ARPA program managers often took a more <a href="https://emergingtechpolicy.org/federal-rd-funding/#funding-models-and-mechanisms" rel="noopener noreferrer" target="_blank">hands-on and flexible approach</a> than typical three-year National Institutes of Health (NIH) or NSF grant managers could. On 2 June, the IFP published an <a href="https://atlasofinnovation.org/" rel="noopener noreferrer" target="_blank">Atlas of Innovation</a> comparing many different research funding structures. </p><p>The NSF announcement comes against a backdrop of administration requests for dramatic cuts to the agency’s budget, though Congress has generally <a href="https://www.aip.org/fyi/fy2025-nsf-budget-and-appropriations" rel="noopener noreferrer" target="_blank">appropriated stable amounts</a>. NSF has, however, not disbursed all its appropriated funding, because the Trump administration has been feuding with many universities, accusing them of discrimination, and suing them. 
Most recently, NSF stopped new funding for several prominent universities, <em>Nature</em> <a href="https://www.nature.com/articles/d41586-026-01667-6" rel="noopener noreferrer" target="_blank">reported</a>. MIT’s president in May said that the university has <a href="https://president.mit.edu/writing-speeches/video-transcript-message-president-kornbluth-about-funding-and-talent-pipeline" rel="noopener noreferrer" target="_blank">won 10 percent less federal money</a> than in the previous year.</p><p>The big-ticket nature of the X-Labs might make administrators and principal investigators at those universities worry about whether it will impact their own funding. “I don’t think it’s a zero-sum game,” says <a href="https://fas.org/expert/erica-goldman/" target="_blank">Erica Goldman</a>, director of policy entrepreneurship at the <a href="https://fas.org" rel="noopener noreferrer" target="_blank">Federation of American Scientists</a>, a Washington, D.C. science-policy think tank, “but the way the timing of the announcements have come out and the rhetoric out there make it very hard to see that.”</p><p>“NSF X-Labs is structured to complement the existing system, not displace it—adding an independent institutional type alongside universities, national labs, small businesses and corporate R&D,” the IFP’s Gustetic says. The X-Labs annual sticker price represents less than 2 percent of the agency’s overall budget of $8.75 billion in 2026.</p><p>The emerging field of <a href="https://scienceplusplus.org/metascience/index.html" rel="noopener noreferrer" target="_blank">metascience</a>, which investigates how best to do science, has been debating how governments should build the proper pipelines for converting blue-sky research into returns for all taxpayers. Metascientists differ over how FROs should fit into the research system. 
Some argue that governments <a href="https://www.nature.com/articles/d41586-021-01878-z" rel="noopener noreferrer" target="_blank">can’t expect to apply the vaunted DARPA model to everything</a>. Others write that FROs are a great idea and that the federal government <a href="https://www.macroscience.org/p/metascience-is-ignoring-the-national?utm_source=substack&utm_medium=email" rel="noopener noreferrer" target="_blank">already has a version of them</a> in the form of the Department of Energy’s National Labs, which have spun off science platforms such as the Human Genome Project and the Protein Data Bank.</p><p>NSF media affairs head <a href="https://www.linkedin.com/in/englandmichael/" target="_blank">Mike England</a> told <em>IEEE Spectrum</em> that “X-labs creates space and provides funding for new institutions to achieve breakthroughs in scientific discovery, research, and translation, and ultimately helps create new platform technologies.” In addition, on 27 May NSF <a href="https://sam.gov/workspace/contract/opp/4998d1c8f414490fb5590752d607f21b/view" rel="noopener noreferrer" target="_blank">requested information</a> for a new funding idea adjacent to X-Labs it calls <a href="https://www.nsf.gov/funding/initiatives/tech-accelerators" rel="noopener noreferrer" target="_blank">Tech Accelerators</a>. These would use NSF money and accelerator expertise to fund and guide “deep-tech” commercialization efforts in agriculture, materials, ocean, and scientific instrumentation.</p><p>Other elements of government are also exploring how to incorporate the FRO funding model. In December 2025, U.S. Representative Josh Harder (D-Calif.) introduced a <a href="https://www.congress.gov/bill/119th-congress/house-bill/6572/all-actions-without-amendments" rel="noopener noreferrer" target="_blank">bill</a> that would apply the X-Labs model to the NIH. 
England says that NSF is having conversations with other government agencies about the model and welcomes more agencies to explore it. </p><p>“We don’t have great evidence comparing how different funding mechanisms perform—individual project grants, milestone-based contracts, prize challenges, etc.,” Gustetic says. “X-Labs is a chance to actually learn something about which institutional designs work for which kinds of problems.”</p><p>Universities will likely have to adapt to the new model. Mid-career academics may well want to take time away from their university homes to participate in an X-Lab or other FRO project, says <a href="https://www.monicadus.com/team" target="_blank">Monica Dus</a>, director of the <a href="https://research.umich.edu/office-of-national-labs/" rel="noopener noreferrer" target="_blank">Office of National Laboratories</a> at the University of Michigan and a holder of NSF grants. Institutions will need to figure out how to cover teaching duties in their absence and how to assess commercial experience for tenure or other internal promotions. “Universities should adapt to make sure the research does really reach the people it is meant for,” she says.</p><p>Academics may also need to change their approach to succeed, Convergent’s Marblestone says. “When we go to academics at Convergent, it takes a few conversations to plan it, because they don’t always know how they’d manage $50 million and a professional engineering team. You really need a CEO.”</p><p><a href="https://fas.org/expert/daniel-correa/" target="_blank">Daniel Correa</a>, CEO of the Federation of American Scientists, which published a <a href="https://fas.org/publication/focused-research-organizations-to-accelerate-science-technology-and-medicine/" rel="noopener noreferrer" target="_blank">2020 call for FROs</a>, is optimistic about the NSF’s ability to get results from X-Labs. 
“The team at NSF did a lot of due diligence talking to folks on the outside, not just policy people but people that are building these labs, and integrated some of the key elements into the contours of the solicitation,” he says.</p>]]></description><pubDate>Thu, 04 Jun 2026 13:00:01 +0000</pubDate><guid>https://spectrum.ieee.org/nsf-x-labs</guid><category>Nsf</category><category>Higher-education</category><category>Darpa</category><category>Science-policy</category><dc:creator>Lucas Laursen</dc:creator><media:content medium="image" type="image/jpeg" url="https://spectrum.ieee.org/media-library/conceptual-illustration-of-a-futuristic-atomic-particle-core-on-a-digital-hud-data-display.jpg?id=66853198&amp;width=980"></media:content></item><item><title>The Classical Advances Needed to Make Quantum Computers Tick</title><link>https://spectrum.ieee.org/quantum-calibration-decoding</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/photo-collage-of-server-wires-overlapping-with-a-quantum-computers-superconducting-lines.jpg?id=66852633&width=1200&height=400&coordinates=0%2C1042%2C0%2C1042"/><br/><br/><p>Quantum computers promise to one day solve problems beyond the most powerful supercomputers imaginable. But it’s often underappreciated how much classical computing it takes just to operate these machines. As qubit counts rise, innovations in this supporting infrastructure will be essential if they’re to live up to their promise.</p><p>To prepare for the scale of quantum computers the industry is working toward, many companies are also gearing up the classical hardware, and software, required to support them. In April, Nvidia <a href="https://developer.nvidia.com/ising?size=n_6_n" rel="noopener noreferrer" target="_blank">announced</a> new AI-based software to accelerate the classical tasks that enable quantum computers. Sydney-based quantum software company <a href="https://q-ctrl.com/" rel="noopener noreferrer" target="_blank">Q-CTRL</a> has developed an <a href="https://q-ctrl.com/technology/quantum-computer-autocalibration" rel="noopener noreferrer" target="_blank">automatic calibration algorithm</a> for quantum computers, and is now <a href="https://q-ctrl.com/blog/scaling-quantum-autonomy-with-nvidia-ising" rel="noopener noreferrer" target="_blank">leveraging</a> Nvidia’s agent-based system. 
Other companies, including <a href="https://www.ibm.com/quantum?utm_content=SRCWW&p1=Search&p4=318569493295&p5=p&p9=196351357812&gclsrc=aw.ds&gad_source=1&gad_campaignid=23423681724&gbraid=0AAAAAD-_QsScZCuULMxdFOkQ5b_2MNwwg&gclid=Cj0KCQjw_vnQBhCxARIsADcZyxJVWxymnYXSvyFK8eNpTpku6HCqSfpon3KvvZKLBk9mYwws0RkQCVkaAtf_EALw_wcB" rel="noopener noreferrer" target="_blank">IBM Quantum</a>, Cambridge, England–based <a href="https://www.riverlane.com/" rel="noopener noreferrer" target="_blank">Riverlane</a>, which develops quantum-error correction, and <a href="https://quantumai.google/quantumcomputer" rel="noopener noreferrer" target="_blank">Google Quantum AI</a>, are developing similar tools. </p><h2>The Role of Classical in Quantum </h2><p>Digital computer chips are marvels of engineering, operating flawlessly out of the box and capable of trillions of operations without error. The quantum bits, or qubits, at the heart of a quantum computer, by contrast, are temperamental and unreliable, requiring regular calibration and complex <a href="https://spectrum.ieee.org/quantum-error-correction" target="_self">error-correcting schemes</a> to keep them on track.</p><p>Calibration and error-correction are fundamentally classical, not quantum, problems, and they require dedicated classical hardware to solve. As quantum computers get bigger, the scale of those resources will need to rise in lockstep. That means that for the foreseeable future, quantum computers are going to be <a href="https://spectrum.ieee.org/ibm-quantum-computer-2668978269" target="_self">hybrid devices</a> with a healthy dose of classical computing on the side.</p><p>“The cheapest and fastest way to execute most computer programs is to run them on a classical computer—even if a quantum computer is available,” says <a href="https://www.linkedin.com/in/adamzalcman/" rel="noopener noreferrer" target="_blank">Adam Zalcman</a>, a quantum software engineer at Google Quantum AI. 
“This is true of most of the information processing involved in running a quantum computer itself.... Therefore, I expect that every practical and efficient quantum-computer architecture will incorporate fast classical devices.”</p><h2>Tuning Quantum Hardware</h2><p>While the transistor has cemented its place as the foundational component of classical chips, the qubits at the heart of a quantum computer come in many flavors—superconducting circuits, <a href="https://spectrum.ieee.org/longlasting-qubits" target="_self">trapped ions</a>, <a href="https://spectrum.ieee.org/neutral-atom-quantum-computing" target="_self">neutral atoms</a>, even individual <a href="https://spectrum.ieee.org/quantum-computers" target="_self">photons</a>. Using them for computation requires a painstaking calibration process to turn the “bare metal” of the underlying hardware into a qubit that can be controlled to run quantum circuits, says <a href="https://www.linkedin.com/in/james-guilmart/" rel="noopener noreferrer" target="_blank">Jay Guilmart</a>, lead product manager at Q-CTRL.</p><p>Calibration has two stages. The first, known as “bring up,” determines the frequency at which each qubit resonates, how long it holds its quantum state, its sensitivity to control pulses, and the strength of its interactions with neighboring qubits. All of these factors determine its error propensity and response to control signals.</p><p>Done by hand, the process still requires someone with a Ph.D. and can take days or even weeks, says Guilmart. This isn’t a scalable solution, so there’s a growing drive to automate the process. This is challenging because every step relies on results from the previous step. Rather than relying on a predefined script, Q-CTRL has therefore built intelligent calibration software that examines the result of each measurement, diagnoses failures, and adjusts the approach before retrying. 
</p><p>“After each step, we analyze that data and we say, are we okay to proceed to the next step? Do we have to go back to the previous step? Do we have to re-recreate this step?” says Guilmart.</p><p>Calibration is also not a one-and-done process: key parameters drift over time, gradually degrading performance. Q-CTRL’s software performs “runtime recalibration” to nudge things back into place, but there’s a limit to how much on-the-fly adjustment is practical. </p><p>“If I’m running a recalibration, I’m not running a circuit,” he says. “Even though I’m maintaining some high system state and high fidelities, if it takes all of my uptime it’s worthless.”</p><h2>Decoding Errors in Real Time</h2><p>Even a well-calibrated quantum computer remains fault-prone, which is why companies are investing heavily in quantum error correction (QEC). This typically involves encoding quantum information across large numbers of physical qubits in their shared state—a “<a href="https://spectrum.ieee.org/logical-qubit" target="_self">logical qubit</a>”—so that errors in individual qubits can be detected and compensated for without destroying the encoded information.</p><p>Because measuring a qubit directly collapses its quantum state, errors are detected via parity checks, which query whether pairs of qubits share the same state. This produces a series of measurements known as a “syndrome,” which classical algorithms called decoders analyze to locate errors.</p><p>The process must happen extremely quickly. While many errors can be logged and corrected mathematically after an operation, some must be fixed immediately before the algorithm can proceed. 
Superconducting and silicon spin qubits can hold their quantum states only for microseconds or milliseconds, so errors must be decoded and corrected within that window.</p><p>These tight requirements mean decoders typically run on specialized silicon like field-programmable gate arrays (FPGAs) or application-specific integrated circuits (ASICs) optimized for speed, says Jerry Chow, CTO of quantum-centric supercomputing at <a href="https://www.ibm.com/quantum?utm_content=SRCWW&p1=Search&p4=318569493295&p5=p&p9=196351357812&gclsrc=aw.ds&gad_source=1&gad_campaignid=23423681724&gbraid=0AAAAAD-_QsScZCuULMxdFOkQ5b_2MNwwg&gclid=Cj0KCQjw_vnQBhCxARIsADcZyxIcIyEy4jC6XdELM49EwyoDo5PSWNjYhr0GAJeokVSKFwzOXynFRmUaAiPJEALw_wcB" rel="noopener noreferrer" target="_blank">IBM</a>. “You need to be able to keep up and you need to be able to effectively decode on the fly,” he says. “The best way to do that is through very tightly integrated FPGA or ASIC decoder capabilities.”</p><h2>To AI or Not to AI</h2><p>There is growing interest in using AI to simplify quantum hardware control. In April, Nvidia released two models targeting calibration and decoding. The first uses a vision-language model to analyze calibration-measurement outputs—typically plotted as graphs—and passes that evaluation to an AI agent that decides how to tweak the processor. The second uses a convolutional neural network to identify the simpler, localized errors that make up the bulk of faults. 
More complex errors are passed to a traditional algorithmic decoder, but the first pass reduces computational load enough to deliver a 2x speedup.</p><p>The attraction of AI for decoding, says <a href="https://www.linkedin.com/in/samstanwyck/" rel="noopener noreferrer" target="_blank">Sam Stanwyck</a>, director of quantum product at Nvidia, is that while models are time-consuming to train, they are extremely fast at inference—and thanks to parallelization across many chips, that speed holds even as qubit counts grow.</p><p>But offloading to a GPU still introduces significant latency, says <a href="https://www.riverlane.com/team/marco-ghibaudi" rel="noopener noreferrer" target="_blank">Marco Ghibaudi</a>, vice president of engineering at Riverlane. “You can have a really fat pipe, but it’s really long,” he says. “Our job [approach] has always been to try to remove as many unnecessary steps and shorten the pipe, and then make every section of the pipe as fast as possible.”</p><p>IBM’s Chow agrees that GPU latency currently makes them infeasible for real-time decoding. He’s also cautious about AI for calibration, given its computational expense. The approach holds promise for understanding the physics of novel architectures or new kinds of circuits. But for well-characterized devices where you’re simply looking for small deviations, simpler physics-informed techniques can be considerably cheaper.</p><p>The two approaches aren’t mutually exclusive, however, says Google’s Zalcman. Neural networks excel at discovering hidden patterns in syndrome data that help identify complex errors algorithmic decoders sometimes miss. Google is therefore developing a hardware architecture that can incorporate both traditional and AI-based decoders, including its AlphaQubit 2 model.</p><p>In the long run, Andi Gu, a Harvard Ph.D. student working on AI decoders, thinks “the bitter lesson” will come for decoding. 
This refers to AI pioneer Richard Sutton’s argument that general-purpose learning methods consistently outperform handcrafted algorithms over time. “If you make the model large enough and you throw enough training data at it, it will learn to capture the hidden correlations better than any other handwritten algorithm,” says Gu.</p><p>Latency remains a barrier, but his group is researching ways to make AI decoders more efficient and smaller so that they can fit on an FPGA, cutting response times. This can degrade accuracy though, so finding the right balance is still a work in progress.</p><p>Regardless of which approach wins out, one thing is certain—future quantum computers will require massive classical support. Decoding is a continuous, computationally expensive process whatever technique you use, says Gu, so you will need a “healthy chunk” of classical hardware dedicated to that task.</p><p>Calibration compute overheads will similarly “blow up” as devices scale to thousands or millions of qubits, says Q-CTRL’s Guilmart. Current techniques are unlikely to scale, he adds, so new approaches will be needed. “We’re going to have to rearchitect and do things differently when we get to even 1,000 qubits,” he says. 
“So no one’s winning the battle today.”</p>]]></description><pubDate>Wed, 03 Jun 2026 20:06:00 +0000</pubDate><guid>https://spectrum.ieee.org/quantum-calibration-decoding</guid><category>Quantum-computers</category><category>Quantum-error-correction</category><category>Internal-calibration</category><category>Nvidia</category><category>Quantum-computing</category><dc:creator>Edd Gent</dc:creator><media:content medium="image" type="image/jpeg" url="https://spectrum.ieee.org/media-library/photo-collage-of-server-wires-overlapping-with-a-quantum-computers-superconducting-lines.jpg?id=66852633&amp;width=980"></media:content></item><item><title>New Server Hopes to Break Through AI’s “Memory Wall”</title><link>https://spectrum.ieee.org/huge-memory-ai-server</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/3d-rendering-of-an-ai-server.jpg?id=66838211&width=1200&height=400&coordinates=0%2C417%2C0%2C417"/><br/><br/><p>Memory is arguably the most serious constraint on modern AI large language models (LLMs). <a href="https://arxiv.org/pdf/2403.14123" rel="noopener noreferrer" target="_blank">According to one influential paper</a>, LLM token generation is an inherently memory-bound task, meaning the rate at which models output text is limited by how quickly data can be read in from memory. The severity of this bottleneck grows with model size. This creates a “memory wall” that holds back LLM inference performance.<br/><br/>AI hardware startup <a href="https://majestic-labs.ai/" rel="noopener noreferrer" target="_blank">Majestic Labs</a> is taking a direct—and comprehensive—approach to solving this problem. It’s developing a new AI server, Prometheus, with up to 128 terabytes of memory. That’s over 60 times more than Nvidia’s <a href="https://resources.nvidia.com/en-us-dgx-systems/dgx-b300-datasheet" rel="noopener noreferrer" target="_blank">DGX B300 server</a>, a cutting-edge AI processing rack. </p><p><a href="https://www.linkedin.com/in/rabii/" rel="noopener noreferrer" target="_blank">Sha Rabii</a>, co-founder and president of Majestic Labs, believes that this drastic increase in memory will provide his company an edge. While he acknowledges that “Nvidia’s done a phenomenal job creating a system that can scale out,” he argues that it becomes less economical as models grow and “ends up greatly over-provisioning on compute and starving on memory.”</p><h2>DRAM-Centric Architecture for LLM Memory</h2><p>Majestic Labs plans to surmount the “memory wall” with an architecture that fundamentally differs from competitors’. </p><p>Nvidia’s current servers have fast <a href="https://spectrum.ieee.org/hbm-on-gpu-imec-iedm" target="_self">high-bandwidth memory</a> (HBM), which is typically used to read in an LLM’s model weights. 
In addition, there’s an often larger but slower pool of dynamic random access memory (DRAM), which handles LLM and server overhead. Majestic instead goes all in on DRAM (specifically LPDDR6) in a unified architecture. </p><p>Rabii says that most memory interfaces are designed to operate over a short physical distance—sometimes only a few millimeters. That limits how much memory can be placed. “You get this shoreline at the compute die where you can put your HBM. If you wanted to put more, you can’t,” Rabii explains. </p><p>To solve that, Majestic uses a proprietary memory interface constructed from miniature copper cables that’s effective up to a meter. This is paired with custom memory aggregation chips that sit physically next to memory modules and coordinate memory across the server. </p><p>“It’s an endpoint for that high-speed interface and fans out to many, many commodity DRAM chips,” explains Rabii. In addition to addressing large pools of memory, Majestic says this design offers memory bandwidth up to 25.6 terabytes per second. </p><h2>Ignite AI Processor for LLM Acceleration</h2><p>More memory is good, but it needs to be paired with AI acceleration, something akin to Nvidia’s GPU. Majestic’s solution to this is Ignite, a custom AI processing unit that serves as the server’s compute engine. The Prometheus server contains 12 Ignite chips. </p><p>Ignite combines data-center-class ARM application cores with RISC-V vector and tensor cores on a single die, all sharing the same memory space. The ARM cores act as an on-chip host processor to orchestrate the AI model. The RISC-V cores carry out the actual LLM processing. The result is a single chip that handles multiple aspects of LLM inference demands without handing off between processors. Majestic Labs has yet to reveal specific metrics for Prometheus’ compute performance.</p><p>Rabii acknowledges that software is important as well, given that many AI frameworks are already entrenched. 
“We’re trying to reduce friction as much as possible in every aspect of our customer adoption, whether it’s physical or software,” he says. Prometheus will support PyTorch, vLLM, and OpenAI’s Triton inference frameworks without requiring code modifications. That means existing models compatible with these frameworks can run as-is.</p><h2>Prometheus Server Design and Pricing</h2><p>All of this combines in the server itself, which is <a href="https://en.wikipedia.org/wiki/Open_Rack" rel="noopener noreferrer" target="_blank">Open Compute Project-compliant</a>. Up to four servers can fit in a server rack; power draw is expected to total up to 120 kilowatts per rack; and heat will be managed with <a href="https://spectrum.ieee.org/data-center-liquid-cooling" target="_self">cold-plate liquid cooling</a>. The server’s memory design is modular, which means servers purchased with less than the maximum of 128 TB of memory can be upgraded at a later date. </p><p>Despite the breadth of the project, Majestic wants to position Prometheus on price, too—which might be a surprise given how much memory each server can contain. Majestic argues that this will be possible because it uses DRAM instead of HBM. Pricing has not yet been announced, as Prometheus is expected to ship in 2027.</p><p>“Our customers’ capital expenditure will come down by, depending on the workload, 10 to 50 times, and the power consumption comes down by a similar amount,” Rabii claims.</p>]]></description><pubDate>Mon, 01 Jun 2026 15:00:01 +0000</pubDate><guid>https://spectrum.ieee.org/huge-memory-ai-server</guid><category>Memory</category><category>Server</category><category>Ai-accelerators</category><category>Performance</category><dc:creator>Matthew S. 
Smith</dc:creator><media:content medium="image" type="image/jpeg" url="https://spectrum.ieee.org/media-library/3d-rendering-of-an-ai-server.jpg?id=66838211&amp;width=980"></media:content></item><item><title>Precision Agriculture Tech Can Address New Fertilizer Shortages</title><link>https://spectrum.ieee.org/fertilizer-shortage-precision-agricultur</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/close-up-of-a-probe-equipped-with-a-spectrometer-and-environmental-sensors.jpg?id=66826391&width=1200&height=400&coordinates=0%2C292%2C0%2C292"/><br/><br/><p>The ongoing instability in the Middle East has put fertilizer availability at the center of global food security concerns. Up to 30 percent of the global fertilizer trade typically passes through the Strait of Hormuz, along with major flows of liquefied natural gas, a key feedstock for its production. Delayed shipments are rippling through agricultural supply chains, and the United Nations’ Food and Agriculture Organization <a href="https://www.fao.org/newsroom/detail/strait-of-hormuz-crisis--fao-director-general-outlines-risks--actions-and-policy-responses/en" target="_blank">has warned</a> that the price of urea, a widely used nitrogen fertilizer, had increased by 52 percent in the United States and by 60 percent in Brazil by mid-April. </p><p>“Traditionally, because fertilizers are relatively cheap, farmers often apply the maximum allowed amount,” says <a href="https://www.athanasiadis.info/" rel="noopener noreferrer" target="_blank">Ioannis Athanasiadis</a>, a professor at <a href="https://www.wur.nl/en" rel="noopener noreferrer" target="_blank">Wageningen University</a> in the Netherlands who works on AI for agriculture and food systems. “It acts as insurance against uncertain weather conditions.” But for farmers already squeezed by the cost of fuel, machinery, and seeds, volatile fertilizer prices are making waste increasingly expensive.</p><p><a href="https://spectrum.ieee.org/john-deere-and-the-birth-of-precision-agriculture" target="_self">Precision agriculture</a> has already helped farmers reduce chemical use and save money, with computer-vision systems able to identify weeds and trigger herbicide nozzles only where needed. But fertilizer is a tougher challenge. 
Nitrogen, the key nutrient in many fertilizers, is invisible, highly mobile in soil, and can be washed below the root zone before crops absorb it.</p><p>“The difficulty is that you never actually know how much nitrogen the plant and the soil have,” says <a href="https://www.linkedin.com/in/chris-padwick-75b5761" rel="noopener noreferrer" target="_blank">Chris Padwick</a>, a technical fellow at <a href="https://www.bluerivertechnology.com/" rel="noopener noreferrer" target="_blank">Blue River Technology</a>, a California-based company that develops computer-vision technologies for agriculture.</p><h2>Precision Fertilizer Technologies</h2><p>Along with its parent company <a data-linked-post="2658609901" href="https://spectrum.ieee.org/5g-network" target="_blank">John Deere</a>, Blue River developed a precision fertilizer technology called <a href="https://www.deere.com/en/technology-products/precision-ag-technology/precision-upgrades/planter-upgrades/exactshot-upgrade/" rel="noopener noreferrer" target="_blank">ExactShot</a>, which can be deployed only at the time of planting. The system detects each seed as it goes into the soil and sprays a few drops of starter fertilizer directly onto it, instead of applying fertilizer continuously along the row. Blue River says the system can cut starter fertilizer use by more than 60 percent and could save more than 93 million gallons annually across the U.S. corn crop.</p><p>The harder task comes later in the plant’s life, when crop needs depend on weather, soil type, previous applications, and what has already happened in that patch of field. Many sprayers for precision herbicide applications can be fitted with attachments in the shape of an inverted-Y that drag hoses near corn plants, dribbling liquid nitrogen close to the row during the growing season. 
But without reliable information on which parts of the field actually need nitrogen, the application remains nonselective.</p><p class="shortcode-media shortcode-media-rebelmouse-image rm-float-left rm-resized-container rm-resized-container-25" data-rm-resized-container="25%" style="float: left;"> <img alt="3D rendering of a wheeled spreader deploying drops of fertilizer onto individual seeds underground. " class="rm-shortcode" data-rm-shortcode-id="b9ada09f67c4f38e01670cb8e7914871" data-rm-shortcode-name="rebelmouse-image" id="4a11c" loading="lazy" src="https://spectrum.ieee.org/media-library/3d-rendering-of-a-wheeled-spreader-deploying-drops-of-fertilizer-onto-individual-seeds-underground.jpg?id=66826439&width=980"/> <small class="image-media media-caption" placeholder="Add Photo Caption...">John Deere’s ExactShot technology detects each seed as it is planted and applies a few drops of starter fertilizer.</small><small class="image-media media-photo-credit" placeholder="Add Photo Credit...">Blue River Technology</small></p><p>Padwick says Blue River is testing whether its crop imaging systems could also become “digital scanners” for plant health, using broader spectral coverage to analyze vegetation as sprayers pass close to the canopy. But getting insights isn’t the same as accurately interpreting them. Yellowing leaves or reduced chlorophyll may point to a nutrient deficiency, but they can also indicate drought stress, disease, or insect damage. That is why some companies are moving below the canopy and into the soil, where direct measurements can provide a more objective picture of what nutrients are available.</p><p>Elsewhere in the United States, Iowa-based <a href="https://n-sense.us/" target="_blank">N-Sense</a> is betting on a mobile machine for soil analysis. Its prototype soil-nitrate sensor can be pulled through the field by a truck to measure nitrate concentration in real time. 
The system uses a ruggedized miniature Fourier-transform infrared spectrometer operating in the mid-infrared, coupled with a diamond interface. Soil is pressed against one end of the diamond while infrared light passes through the other, allowing the instrument to detect nitrate while the diamond protects the optical surface from abrasion.</p><p>“Nitrate is particularly difficult to detect,” says <a href="https://www.linkedin.com/in/david-laird-91669617b" target="_blank">David Laird</a>, N-Sense’s president and CEO. “In the ultraviolet, many things in the soil are responsive, so it is very difficult to separate the nitrate signal. But in the mid-infrared, we are able to isolate the nitrate band and get a strong signal.”</p><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="A truck pulling a soil nitrate sensor across a field of crops." class="rm-shortcode" data-rm-shortcode-id="fd9103f523ea13d224460d6094c1bd96" data-rm-shortcode-name="rebelmouse-image" id="4d500" loading="lazy" src="https://spectrum.ieee.org/media-library/a-truck-pulling-a-soil-nitrate-sensor-across-a-field-of-crops.jpg?id=66826434&width=980"/> <small class="image-media media-caption" placeholder="Add Photo Caption...">N-Sense’s sensor measures nitrate in real time as it moves through a crop field.</small><small class="image-media media-photo-credit" placeholder="Add Photo Credit...">N-Sense</small></p><p>The sensor feeds nitrate data into machine learning software that also taps into soil-survey data, satellite imagery, and yield data to generate fertilizer prescriptions that can be uploaded directly to a tractor.</p><p>“You do not want to look at nitrogen in isolation,” says Laird. 
“The soil may have low nitrogen content, but if water availability is what is limiting productivity, adding nitrogen will not help.”</p><p>In one field tested last year, Laird says the company achieved about a 30 percent reduction in total nitrogen fertilizer applied.</p><h2>Real-Time Soil Analysis</h2><p>In Potsdam, Germany, an agricultural tech company called <a href="https://www.stenon.io/en" target="_blank">Stenon</a> has developed <a href="https://www.stenon.io/en/technology" target="_blank">FarmLab</a>, a mobile soil-analysis device designed to give farmers real-time measurements directly in the field. The handheld probe is pushed into the soil and combines optical spectroscopy, which reads how soil absorbs and reflects light, with impedance-based electrical measurements, which send a small electrical signal through the ground to capture properties affected by moisture, salts, and nutrient ions. Environmental sensors capture the temperature and humidity, and there is also a GPS tag that associates each reading with a location. Cloud computing and machine learning turn the raw signals into usable soil data. The goal is to infer key soil parameters, including nitrate, mineral nitrogen, moisture, and other indicators that can guide fertilizer decisions.</p><p class="shortcode-media shortcode-media-rebelmouse-image rm-float-left rm-resized-container rm-resized-container-25" data-rm-resized-container="25%" rel="float: left;" style="float: left;"> <img alt="A soil analysis device resembling an electric shovel." 
class="rm-shortcode" data-rm-shortcode-id="ab6d58ef184cf1763b7974942363743c" data-rm-shortcode-name="rebelmouse-image" id="6b820" loading="lazy" src="https://spectrum.ieee.org/media-library/a-soil-analysis-device-resembling-an-electric-shovel.jpg?id=66826446&width=980"/> <small class="image-media media-caption" placeholder="Add Photo Caption...">Stenon’s FarmLab uses optical spectroscopy, electrical impedance sensing, and machine learning to generate soil nutrient maps for precision fertilization.</small><small class="image-media media-photo-credit" placeholder="Add Photo Credit...">Stenon</small></p><p>The company’s founder and CEO, <a href="https://de.linkedin.com/in/niels-grabbert-a6645698" target="_blank">Niels Grabbert</a>, says the idea came when Europe began enforcing the <a href="https://environment.ec.europa.eu/topics/water/nitrates_en" target="_blank">Nitrates Directive</a>, a law aimed at protecting groundwater and surface water from agricultural nitrate pollution. Back then, farmers were required to reduce nitrogen losses but lacked real-time information on the nitrogen present in their fields.</p><p>“Depending on the country and on what parameters you are testing for, it can take anywhere from two to eight weeks before you receive soil-testing results,” says Grabbert. “That means farmers do not know in real time how much nutrition the soil itself can provide.”</p><p>FarmLab is meant to replace sparse lab testing with faster, denser field data. 
On a 100-hectare farm, an agronomist could take one reading every two hectares, then use the software to turn those GPS-tagged measurements into nutrient maps and fertilizer rates that can be sent to farm machinery or management platforms.</p><p>Grabbert says the technology can reduce fertilizer use by around 20 percent on average while increasing yields by 2 to 8 percent, depending on the crop and production system.</p><p>For Dutch professor Athanasiadis, these systems point in the right direction, but they are not enough on their own. “There are no magic solutions,” he says. “We need sensors, robotics, AI, government support, and farmer participation all working together.”</p>]]></description><pubDate>Thu, 28 May 2026 15:00:01 +0000</pubDate><guid>https://spectrum.ieee.org/fertilizer-shortage-precision-agricultur</guid><category>Robotics</category><category>Artificial-intelligence</category><category>Machine-vision</category><category>Iran</category><category>Precision-agriculture</category><category>Climate-tech</category><dc:creator>Maurizio Arseni</dc:creator><media:content medium="image" type="image/jpeg" url="https://spectrum.ieee.org/media-library/close-up-of-a-probe-equipped-with-a-spectrometer-and-environmental-sensors.jpg?id=66826391&amp;width=980"></media:content></item><item><title>Junctionless Transistors Show a New Path to 3D Chips</title><link>https://spectrum.ieee.org/3d-chips</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/two-asian-mens-faces-reflected-in-a-silicon-wafer.jpg?id=66822290&width=1200&height=400&coordinates=0%2C729%2C0%2C730"/><br/><br/><p><span>Chipmakers are struggling to shrink the amount of area a transistor takes up, so researchers are trying to build layers of devices on top of each other. However, many experimental 3D chips rely on exotic materials and perform poorly compared with regular silicon devices. But researchers at the University of Illinois Urbana-Champaign have found a new way to build 3D circuits from silicon. The secret is a process that lets them roll multiple layers of nanometers-thin silicon onto a wafer at relatively low temperatures.</span></p><p>Today’s <a href="https://spectrum.ieee.org/quantum-sensors-2674296517" target="_self">3D microchips</a>, such as the <a href="https://spectrum.ieee.org/amd-mi300" target="_blank">AMD MI300 series</a>, stack prefabricated layers on top of each other and connect them with the help of <a href="https://spectrum.ieee.org/next-gen-chips-will-be-powered-from-below" target="_self">metal pillars</a> known as <a href="https://spectrum.ieee.org/amd-3d-stacking-intel-graphcore" target="_self">through-silicon vias</a>. However, the challenge of properly aligning the connections between these layers limits how many links can be made and therefore how useful 3D stacking can be.</p><p>By contrast, in <a href="https://spectrum.ieee.org/the-rise-of-the-monolithic-3d-chip" target="_self">monolithic 3D chips</a>, layers of devices are fabricated directly on top of each other. This enables alignment of these layers with nanometer-scale precision, and with orders of magnitude denser connectivity than today’s 3D chips.</p><p>However, experimental monolithic 3D chips require transistors and other devices in the upper layers to be fabricated at 400 °C or less to preserve the wiring that connects their components together. 
Such 3D chips have been made using a variety of materials, but their performance and reliability all proved much worse than the metal-oxide-semiconductor field-effect transistors (MOSFETs) found in virtually all conventional microchips, erasing most of the gains offered by a monolithic 3D design.</p><p>Now scientists have created monolithic 3D chips from silicon at less than 200 °C. “For years, people assumed monolithic 3D would require exotic new materials such as <a href="https://spectrum.ieee.org/modern-microprocessor-built-using-carbon-nanotubes" target="_self">carbon nanotubes</a>, <a href="https://spectrum.ieee.org/3d-cmos" target="_self">metal-oxide semiconductors</a>, or <a href="https://spectrum.ieee.org/cdimensions-2d-semiconductors" target="_self">2D semiconductors,</a>” says <a href="https://matse.illinois.edu/people/profile/qingcao2" target="_blank">Qing Cao</a>, an associate professor of materials science and engineering at the University of Illinois Urbana-Champaign. “Demonstrating that silicon can do the job means this technology can plug directly into existing manufacturing ecosystems, which dramatically accelerates its path toward real impact.”</p><h2>Low-temperature junctionless transistors</h2><p>Instead of the <a href="https://spectrum.ieee.org/the-highk-solution" target="_self">MOSFETs</a> used in most chips, the new 3D chips rely on <a href="https://ieeexplore.ieee.org/document/10877552" target="_blank">junctionless transistors</a>. Regular MOSFETs are made using both <em>n</em>-type semiconductors, which are doped to contain an excess of electrons, and <em>p</em>-type semiconductors, which are doped to produce a deficit of electrons. Charges enter a transistor through its source terminal, travel down a channel, and exit out the drain terminal. In MOSFETs, if the source and drain are made of <em>p</em>-type silicon, the channel will be made of <em>n</em>-type, and vice versa. 
The <a href="https://spectrum.ieee.org/the-tunneling-transistor" target="_self"><em>p</em>-<em>n</em> junctions</a> where these semiconductor types meet interrupt the flow of current. When a gate electrode applies voltage to the channel, current can flow across. </p><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="Circuit diagram for a 3D chip." class="rm-shortcode" data-rm-shortcode-id="4c4d4eaf13be3226978705151cdac5b3" data-rm-shortcode-name="rebelmouse-image" id="76f20" loading="lazy" src="https://spectrum.ieee.org/media-library/circuit-diagram-for-a-3d-chip.jpg?id=66822309&width=980"/> <small class="image-media media-caption" placeholder="Add Photo Caption...">Each layer of a new kind of 3D chip contains so-called junctionless transistors. The bottom layer is made from silicon with excess mobile electrons, the top from silicon with excess holes. The transistors are linked together vertically to form complementary logic.</small><small class="image-media media-photo-credit" placeholder="Add Photo Credit...">Bao Lam, Yung Man Yu, et al.</small></p><p>In contrast, in junctionless transistors, the source, channel, and drain are all completely either <em>p</em>-type or <em>n</em>-type, and so operate without <em>p</em>-<em>n</em> junctions. When a voltage is applied to the gates, they switch on, allowing current to flow. <a href="https://www.mdpi.com/2079-9292/9/7/1174" target="_blank">First proposed in 1925</a>, they were not built until 2010 due to limits in fabrication technology; they require highly and uniformly doped channels at most about 10 nanometers thick. In MOSFETs, chipmakers use high heat to make sure <a href="https://en.wikipedia.org/wiki/Dopant_activation" target="_blank">dopants are located precisely where they need to be in the silicon crystal</a> to create <em>p</em>-<em>n</em> junctions. Junctionless transistors don’t need these high temperatures. 
<br/><br/>“Junctionless devices also use a simpler process flow, which can reduce costs and improve yield,” Cao says.</p><p>The new 3D chips are made by laying down uniformly doped single-crystal silicon membranes each 10 nm or less thick using a wafer-scale <a href="https://www.nature.com/articles/s41528-021-00116-w" rel="noopener noreferrer" target="_blank">roll-transfer-printing</a> process. “Because the membranes are so thin and flexible, they conform to the underlying surface, avoiding the voids and warpage that often plague wafer bonding between rigid wafers,” Cao says.</p><p>That the nano-membranes can transfer onto surfaces that are not necessarily perfectly flat “is important because the current method typically used in industry requires sub-1-nanometer roughness for the surfaces to be bonded together and extremely flat—only a few microns of variations across the wafer,” says <a href="https://www.ee.iitb.ac.in/web/people/veeresh-deshpande/" rel="noopener noreferrer" target="_blank">Veeresh Deshpande</a>, an associate professor of electrical engineering at the Indian Institute of Technology Bombay, who did not participate in this study. “The proposed method simplifies the process complexity and allows stacking several tiers of transistors, both for advanced computing and memory like DRAM.”</p><p>Cao and his colleagues fabricated three levels of junctionless transistors on a 75-millimeter silicon wafer, with each tier composed of 625 transistors over a 1,600-square-mm area. From these transistors they constructed a variety of logic gates and circuits—including inverters, NAND and NOR gates, and <a href="https://spectrum.ieee.org/sram-intel-tsmc" target="_self">static random access memory (SRAM)</a> cells—using vertical connections between the layers that were aligned with sub-10-nm accuracy.</p><p>The researchers were able to form circuits made up of transistors distributed over all three layers of the 3D chips. 
That led to a six-transistor SRAM cell with a footprint as little as one-third the size of its 2D layout.</p><p>A transistor’s switching speed depends on its current density, and the junctionless transistors showed a current density that could exceed 650 microamperes per micrometer, which is comparable to older commercial silicon MOSFETs. More advanced MOSFETs do show current densities exceeding 1,000 µA per micrometer, but Cao and his colleagues say that future engineering could further improve the performance of their devices.</p><p>“The key implication is that vertical stacking may not have to come with a severe transistor-performance penalty,” says <a href="https://www.matse.psu.edu/directory/saptarshi-das" rel="noopener noreferrer" target="_blank">Saptarshi Das</a>, a professor of engineering science and mechanics at Pennsylvania State University, who did not take part in this research. “If scalable, this could open a practical path to denser, more energy-efficient chips with much shorter interconnects.”</p><h2>Roll-transfer processes</h2><p>The silicon wafers Cao’s team used are much smaller than the 300-mm ones most fabs use today. But transferring and stacking silicon membranes even across a 75-mm wafer without cracks, wrinkles, or defects “required a series of engineering innovations,” Cao says. These included adding <a href="https://en.wikipedia.org/wiki/Surfactant" rel="noopener noreferrer" target="_blank">surfactants</a> during certain etching steps to reduce surface tension; adding polymer support layers for mechanical stability and surface protection; and adopting a roll-lamination process to apply uniform pressure during transfer.</p><p>“We began in 2019,” Cao says. “By 2024, we realized we had solved the fundamental barriers. 
The following year and a half was spent refining the process and demonstrating multilayered devices at wafer scale and 3D logic circuits.”</p><p>Beyond computing, integrating silicon with other materials in monolithic 3D devices may open up new applications “that were previously out of reach,” Cao says. “For example, vertically stacking different types of single-crystalline semiconductors could enable ultrasensitive X-ray-detector panels or compact multispectral imaging systems.”</p><p class="shortcode-media shortcode-media-rebelmouse-image rm-float-left rm-resized-container rm-resized-container-25" data-rm-resized-container="25%" rel="float: left;" style="float: left;"> <img alt="STEM micrograph showing three tiers of stacked junctionless transistor arrays separated by approximately 90 nanometers." class="rm-shortcode" data-rm-shortcode-id="fa76d859b20d7865c30a6822c51e9de2" data-rm-shortcode-name="rebelmouse-image" id="0a95f" loading="lazy" src="https://spectrum.ieee.org/media-library/stem-micrograph-showing-three-tiers-of-stacked-junctionless-transistor-arrays-separated-by-approximately-90-nanometers.jpg?id=66822314&width=980"/> <small class="image-media media-caption" placeholder="Add Photo Caption...">A new 3D chip has three layers of silicon transistors separated by about 90 nanometers of dielectric.</small><small class="image-media media-photo-credit" placeholder="Add Photo Credit...">Bao Lam, Yung Man Yu, et al.</small></p><p>One challenge monolithic devices will face is yield. “When you stack devices vertically, the traditional assumption is that every transistor in every layer must work perfectly, which can reduce overall chip yield,” Cao says. “We are working with circuit designers on defect-tolerant architectures that can absorb imperfections with minimal area and power overhead.”</p><p>Another hurdle is the way these 3D chips increase power density, concentrating heat. 
“We are collaborating with circuit and architecture teams on solutions such as dynamic voltage and frequency scaling and AI-assisted on-chip power regulation to actively manage heat,” Cao says.</p><p>Cao suggests the new approach is initially only promising for research and low-volume prototyping applications. “Once the benefits of monolithic 3D integration are clearly established, we can work toward high-volume manufacturing,” Cao says. “We simply want to be realistic and avoid over-claiming before the technology has been validated in those settings with full cost analysis.”</p><p>The scientists now want to partner with semiconductor foundries to demonstrate and refine the technology in a manufacturing environment, Cao says. Ultimately, “because our approach is silicon based and compatible with foundry processes, it has a realistic path to adoption,” he notes. “It will be especially valuable for AI workloads that are increasingly limited by communication bottlenecks, which is directly addressed by this technology by bringing compute layers physically closer together.”</p>Cao and his colleagues detailed <a href="https://www.nature.com/articles/s41586-026-10496-6" target="_blank">their findings</a> in the 28 May <em><em>Nature</em></em><span>.</span>]]></description><pubDate>Wed, 27 May 2026 15:00:02 +0000</pubDate><guid>https://spectrum.ieee.org/3d-chips</guid><category>3d-chips</category><category>Junctionless-transistors</category><category>3d-integration</category><dc:creator>Charles Q. Choi</dc:creator><media:content medium="image" type="image/jpeg" url="https://spectrum.ieee.org/media-library/two-asian-mens-faces-reflected-in-a-silicon-wafer.jpg?id=66822290&amp;width=980"></media:content></item><item><title>South Africa Has AI Leverage. Its Draft Policy Leaves It Unused</title><link>https://spectrum.ieee.org/south-africa-ai-policy</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/aerial-view-of-an-industrial-mining-complex-with-reddish-brown-processing-facilities-contrasted-by-a-distant-green-landscape.jpg?id=66784945&width=1200&height=400&coordinates=0%2C417%2C0%2C417"/><br/><br/><p><em><em>This article is adapted by the author with permission from </em><a href="https://www.techpolicy.press/" rel="noopener noreferrer" target="_blank"><em>Tech Policy Press</em></a><em>. Read the </em><a href="https://www.techpolicy.press/south-africa-has-ai-leverage-its-draft-policy-leaves-it-unused/" rel="noopener noreferrer" target="_blank">original article</a><em>.</em></em></p><p>South Africa is not just another developing country struggling to govern artificial intelligence; it is the exception with leverage, and the window to act on it is closing. It holds <a href="https://www.statista.com/statistics/273624/platinum-metal-reserves-by-country/" rel="noopener noreferrer" target="_blank">approximately 88 percent of global platinum-group metal reserves</a>, critical inputs to parts of the semiconductor and data-center supply chains that make AI infrastructure possible. It hosts the <a href="https://www.arizton.com/market-reports/south-africa-data-center-market-investment-analysis" rel="noopener noreferrer" target="_blank">largest data-center market</a> on the continent. Its <a href="https://africadca.org/en/data-centres-in-africa-focus-report-2024" rel="noopener noreferrer" target="_blank">existing hyperscaler relationships</a> give it procurement leverage that <a href="https://spectrum.ieee.org/ai-for-good" target="_blank">most African states will never have</a>. 
And a major <a href="https://techcentral.co.za/draft-ai-policy-south-africa-too-dependent-on-us-china/280253/" rel="noopener noreferrer" target="_blank">geopolitical contest</a> over AI infrastructure is being fought on its soil right now, between Chinese and American technology companies competing for control of the systems that will underpin an entire continent’s public sector.</p><p>In physics, leverage requires three things: a fulcrum, a lever arm, and the ability to apply force. The Bushveld Complex, <a href="https://pubs.usgs.gov/periodicals/mcs2025/mcs2025-platinum-group.pdf" rel="noopener noreferrer" target="_blank">the world’s largest platinum-group metal deposit</a>, is the fulcrum: a mineral endowment that gives South Africa a position in the semiconductor supply chain that no other African state holds. The <a href="https://www.sanews.gov.za/south-africa/minister-announces-withdrawal-draft-ai-policy" rel="noopener noreferrer" target="_blank">since-withdrawn</a> <a href="https://www.gov.za/sites/default/files/gcis_document/202604/54477gen3880.pdf" rel="noopener noreferrer" target="_blank">draft policy</a> is the lever arm. The unresolved “OPTION” provisions in the policy are where force would be applied. Without a policy that specifies what South Africa wants in return for market access, the lever arm sits unused, and the weight of two of the world’s largest technology ecosystems settles exactly where those ecosystems want it to settle.</p><p>This makes South Africa a global test case. Not because its proposed means of governance is exemplary, but because it is the one developing country with enough structural leverage to negotiate <a href="https://spectrum.ieee.org/responsible-ai" target="_blank">genuinely different terms</a>, and the one that is choosing, through inaction, not to. 
The recent <a href="https://techcentral.co.za/malatsi-moves-to-rescue-south-africas-botched-ai-policy/281299/" rel="noopener noreferrer" target="_blank">announcement</a> of a new panel to <a href="https://www.reuters.com/world/africa/south-africa-targets-january-2027-revised-ai-policy-after-earlier-withdrawal-2026-05-26/" target="_blank">update the draft policy by January 2027</a> is an important opportunity. But the deeper failure is not that an AI policy contained bad references. It is that no verification process caught them before the document entered the public domain. That is a systems problem, not merely a political one. It points to a missing layer in how governments are adopting AI.</p><h2>The contest already underway</h2><p>Last year, Huawei <a href="https://www.bloomberg.com/news/features/2025-10-22/china-s-deepseek-pushes-into-africa-making-ai-accessible-to-millions" rel="noopener noreferrer" target="_blank">pitched an emerging-product bundle</a> to tech executives across the continent. The company was now bundling access to DeepSeek’s large language model with its own cloud and storage infrastructure. The price differential was stark: in some cases more than 90 percent.</p><p>At the same time, Microsoft announced plans to spend <a href="https://news.microsoft.com/source/emea/features/microsoft-invests-zar-5-4bn-in-south-africa/" rel="noopener noreferrer" target="_blank">ZAR 5.4 billion ($300 million)</a> by the end of 2027 on cloud and AI infrastructure in South Africa, building on a prior ZAR 20.4 billion investment. Google, Amazon Web Services, and Oracle already have cloud regions in the country. According to one analysis, the country’s data-center market was valued at US <a href="https://www.arizton.com/market-reports/south-africa-data-center-market-investment-analysis" rel="noopener noreferrer" target="_blank">$2.16 billion in 2024, the largest in Africa</a>.</p><p>These are not commercially neutral investments. 
Huawei’s infrastructure reach has been explicitly linked to <a href="https://www.congress.gov/crs-product/IF11735" rel="noopener noreferrer" target="_blank">Chinese strategic objectives</a>, including a <a href="https://www.csis.org/analysis/watching-huaweis-safe-cities" rel="noopener noreferrer" target="_blank">documented track record</a> of providing governments with surveillance infrastructure through its Safe Cities network. U.S. hyperscaler investment comes with its own dependency structure: closed models, pricing set unilaterally, and terms of access that no African government has meaningfully shaped. South Africa is being asked to choose between these dependency models without a policy that specifies what it wants in return.</p><h2>The leverage it has</h2><p>There is a particular irony in South Africa’s position. The country whose mines supply platinum-group metals essential to semiconductor manufacturing, and through them to AI compute, has drafted a policy that treats it as a consumer of AI systems rather than a stakeholder in their governance. South Africa digs up the minerals that make AI possible. It has no say over the AI built from them.</p><p>The <a href="https://cset.georgetown.edu/publication/the-ai-triad-and-what-it-means-for-national-security-strategy/" rel="noopener noreferrer" target="_blank">AI triad framework</a> covers algorithms, compute, and data. South Africa has no frontier model development capacity. South Africa holds significant data assets in financial services, health care, and agriculture, with no clear framework for their sovereign management. <a href="https://elements.visualcapitalist.com/charted-the-minerals-powering-the-ai-boom/" rel="noopener noreferrer" target="_blank">South Africa possesses PGM (Platinum Group Metals) leverage</a> of global significance on the compute axis, currently being transferred without meaningful condition. 
It also has <a href="https://datacatalog.worldbank.org/search/dataset/0039068/south-africa-solar-irradiation-and-pv-power-potential-maps" rel="noopener noreferrer" target="_blank">exceptionally high solar irradiance</a> and <a href="https://datacatalog.worldbank.org/search/dataset/0039068/south-africa-solar-irradiation-and-pv-power-potential-maps" rel="noopener noreferrer" target="_blank">significant renewable-energy potential</a>. A country that can offer both critical mineral inputs and the energy to power the infrastructure those minerals help build occupies a negotiating position of unusual strength.</p><p>The Draft Policy proposes no minimum terms for hyperscaler investment, no data sovereignty requirements, no technology transfer conditions and no compute visibility mechanism. Multiple provisions are explicitly left unresolved, marked “OPTION,” including the most consequential choices about how governance will function. Infrastructure decisions made now determine what is renegotiable later, and the answer is: very little.</p><h2>Three futures, one default</h2><p>The three infrastructure futures on offer each create a structurally different form of dependency, and only one creates sovereign capability. The Huawei-hosted DeepSeek integration offers low cost and open-source weights, but with data stored on infrastructure potentially accessible under Chinese legal frameworks, creating surveillance dependency in a pattern <a href="https://carnegieendowment.org/2019/09/17/global-expansion-of-ai-surveillance-pub-79847" rel="noopener noreferrer" target="_blank">already documented</a> across Africa. The second is U.S. closed-model dependency: higher capability, more reliable data protection, but complete API dependency on developers abroad. 
The third is locally hosted open-weight infrastructure: models governed under <a href="https://www.gov.za/sites/default/files/gcis_document/202406/50741gen2533.pdf" rel="noopener noreferrer" target="_blank">South African data-sovereignty rules</a>, on infrastructure subject to minimum terms, developed with South African data. As <a href="https://www.interconnects.ai/p/open-models-in-perpetual-catch-up" rel="noopener noreferrer" target="_blank">Nathan Lambert at Interconnects</a> has observed, open-weight models are likely the only realistic way to get sovereign AI off the ground as a real effort, enabling local communities and economies to integrate meaningfully with the technology. But this requires procurement conditions, not goodwill.</p><h2>What binding governance looks like</h2><p>The <a href="https://www.governance.ai/research-paper/governing-through-the-cloud" rel="noopener noreferrer" target="_blank">GovAI “Governing Through the Cloud” framework</a> identifies four roles compute providers should accept as conditions of operating at scale: securers (protecting model weights and training data), record keepers (maintaining infrastructure usage logs), verifiers (confirming customer compliance with safety standards) and enforcers (restricting access when violations occur). 
These are operational requirements, not theoretical categories—specific, enforceable, and well within the bargaining power of a market of South Africa’s size and mineral position.</p><p>A <a href="https://itlawco.com/sa-national-ai-policy-submission-2026/" rel="noopener noreferrer" target="_blank">detailed policy analysis</a> submitted to the Department of Communications and Digital Technologies (DCDT) identifies the specific provisions the final policy must contain: mandatory minimum terms for foreign compute infrastructure investments above ZAR 500 million (~$30 million); a compute reporting threshold; a National AI Safety Institute mandate covering defensive monitoring of AI capability accumulation; and National AI Champion Sector designations to create data assets for domestic model development. Each provision converts a structural advantage into a governance instrument before that advantage is foreclosed by market reality. Just as modern software security increasingly depends on knowing what components are inside a system—model provider, training data, compute environment, evaluation methods, update cadence, human review points, and failure-reporting procedures—public-sector AI governance requires a clear account of the stack before deployment, not after a problem surfaces. A public institution that cannot verify the sources in its own AI policy is unlikely to be ready to verify the AI systems it procures, deploys, or regulates.</p><h2>Why this is the continental test case</h2><p>South Africa’s choices will establish a regional precedent for what is commercially negotiable in AI infrastructure. If South Africa negotiates data-sovereignty guarantees and technology-transfer conditions as requirements for hyperscaler investment, it creates a replicable model. If Microsoft’s $300 million investment and Huawei’s infrastructure expansion proceed on standard commercial terms, as they are currently, it normalizes extractive AI infrastructure across the continent. 
The lesson is not specific to Africa. Governments everywhere are producing AI strategies while lacking AI assurance infrastructure. South Africa is an early warning, not an isolated case.</p><p>The public comment period closed when the policy was withdrawn. But a parallel process remains live: the <a href="https://www.treasury.gov.za/public%20comments/ProcReg/Draft%20General%20Public%20Procurement%20Regulations%202026%20for%20consultation%20ito%20section%2063(3)%20of%20Act.pdf" rel="noopener noreferrer" target="_blank">National Treasury’s Draft General Public Procurement Regulations</a>—the legal instrument that will govern every government AI contract—closes for comment on June 15. Those regulations contain no AI-specific provisions.</p><p>South Africa has more AI leverage than any country on the continent. Some argue, with force, that <a href="https://www.dailymaverick.co.za/article/2026-04-19-sa-risks-missing-critical-global-ai-window-through-well-intentioned-policy-misalignment/" rel="noopener noreferrer" target="_blank">governance requirements risk deterring the infrastructure investment</a> South Africa urgently needs: compute capacity, reliable energy, venture capital, and talent retention. That concern deserves a direct answer. Minimum procurement terms, compute reporting thresholds, and technology transfer conditions are not barriers to investment. They are the conditions under which investment serves the host country rather than extracting from it. Infrastructure built without minimum terms produces dependency. Infrastructure built with them produces leverage. 
To serve the public interest, its AI policy must use that leverage.</p>When late last month News24 <a href="https://www.news24.com/business/tech/govts-draft-ai-policy-cites-fictitious-references-experts-believe-are-ai-hallucinations-20260424-1085" rel="noopener noreferrer" target="_blank">reported</a> AI-hallucinated references in the draft AI policy, Minister of Communications and Digital Technologies Solly Malatsi <a href="https://www.sanews.gov.za/south-africa/minister-announces-withdrawal-draft-ai-policy" rel="noopener noreferrer" target="_blank">withdrew the draft policy</a>. That was a <a href="https://www.linkedin.com/pulse/why-withdrawing-south-africas-draft-ai-policy-wrong-call-adams-4arzf/?trackingId=p1G8Vk1DBwSwD550j8ym2A%3D%3D" rel="noopener noreferrer" target="_blank">mistake</a> that could cost South Africa and the rest of the continent the initiative on this urgent issue. His more recent constitution of an <a href="https://techcentral.co.za/malatsi-moves-to-rescue-south-africas-botched-ai-policy/281299/" rel="noopener noreferrer" target="_blank">independent panel</a> is a belated step in the right direction, if it can turn South Africa’s leverage into policy. The panel—chaired by Professor Benjamin Rosman of the Wits Machine Intelligence and Neural Discovery Institute, and including Professors Vukosi Marivate and Alison Gillwald of Research ICT Africa and Dr. Jabu Mtsweni of the Council for Scientific and Industrial Research—has the technical and governance credibility to produce a stronger document. A <a href="https://www.reuters.com/world/africa/south-africa-targets-january-2027-revised-ai-policy-after-earlier-withdrawal-2026-05-26/" target="_blank">revised draft</a> is due to be ready for public comment by January 2027. 
South Africa remains without a formal AI governance framework in the interim.]]></description><pubDate>Wed, 27 May 2026 13:00:01 +0000</pubDate><guid>https://spectrum.ieee.org/south-africa-ai-policy</guid><category>Ai</category><category>Artificial-intelligence</category><category>Microsoft</category><category>South-africa</category><category>Huawei</category><category>Ai-policy</category><dc:creator>Nathan-Ross Adams</dc:creator><media:content medium="image" type="image/jpeg" url="https://spectrum.ieee.org/media-library/aerial-view-of-an-industrial-mining-complex-with-reddish-brown-processing-facilities-contrasted-by-a-distant-green-landscape.jpg?id=66784945&amp;width=980"></media:content></item><item><title>What It Takes to Preserve Floppy Disks</title><link>https://spectrum.ieee.org/floppy-disk-data-preservation-archives</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/person-in-floppy-disk-sweater-sits-behind-scattered-floppy-disks-on-table.png?id=66763716&width=1200&height=400&coordinates=0%2C312%2C0%2C313"/><br/><br/><p><a data-linked-post="2667647674" href="https://spectrum.ieee.org/3m-floppy" target="_blank">Floppy disks</a> are several decades old—many of the disks are degrading and the data stored on them is at risk of being lost. In response, <a href="https://www.cdh.cam.ac.uk/about/people/leontien-talboom/" rel="noopener noreferrer" target="_blank">Leontien Talboom</a>, a technical analyst at Cambridge University Libraries and Archives, led a roughly year-long project preserving <a href="https://spectrum.ieee.org/3m-floppy" target="_self">floppy disks</a> called “<a href="https://www.lib.cam.ac.uk/future-nostalgia" rel="noopener noreferrer" target="_blank">Future Nostalgia</a>,” which concluded in January.</p><h3>Leontien Talboom</h3><br/><p><a href="https://www.cdh.cam.ac.uk/about/people/leontien-talboom/" rel="noopener noreferrer" target="_blank">Leontien Talboom</a> is a technical analyst at Cambridge University Libraries and Archives, where she transfers material from a wide range of storage media to make them accessible to archivists. </p><p><em><em>IEEE Spectrum</em></em> spoke to Talboom about her work <a href="https://www.digipres.org/the-floppy-guide/" rel="noopener noreferrer" target="_blank">preserving data</a> from Cambridge’s collection of floppy disks and <a href="https://www.repository.cam.ac.uk/items/154ad280-7c47-49eb-9cbf-24b6762f6c1c" rel="noopener noreferrer" target="_blank">collecting knowledge</a> about the disks themselves.</p><p><strong>Why is it important to preserve floppy disks now?</strong></p><p><strong>Leontien Talboom: </strong>Two reasons. First, the physical media is starting to degrade. Floppy disks are made from plastic, but they’ve got a magnetic layer of iron oxide, and that’s deteriorating. 
A lot of floppy disks are found in attics or garages, which means they also suffer from mold.</p><p>Second, a lot of people who developed floppy disks and systems that use floppy disks are starting to retire or pass away, which means that a lot of tacit knowledge is disappearing.</p><p><strong>Whom did you go to for that tacit knowledge?</strong></p><p><strong>Talboom: </strong>I went to the retro computing community. Their work is more around preserving these machines to keep them running [than] the data that lives on the floppy disk. But they know their stuff about floppy disks.</p><p>For example, they know that in a lot of the older disks, the inside of the disk—the doughnut—gets stuck to the top. So if you flex the casing, the doughnut falls down again. If I hadn’t known that, I would have assumed that those disks in our collection were broken or corrupt.</p><p><strong>What is the most difficult part of working with floppy disks?</strong></p><p><strong>Talboom: </strong>Accessing the files can be quite challenging if we don’t understand the file system. Within libraries and archives, we get a lot of material from machines that are not as well loved. Many of the personal computers that you had at home, such as the <a href="https://amstrad.com/product-category/computer/" rel="noopener noreferrer" target="_blank">Amstrad</a> or <a href="https://www.bbc.com/news/articles/cpvzp80jv07o" rel="noopener noreferrer" target="_blank">ZX Spectrum</a> or <a href="https://computerhistory.org/blog/the-bbc-micro/" rel="noopener noreferrer" target="_blank">BBC Micro</a>, are very well documented. But a bunch of our material comes from business or research systems. They’re not as nostalgic for people, so there’s not as big a community preserving this type of material.</p><p><strong>Do you have a favorite type of floppy disk?</strong></p><p><strong>Talboom: </strong>Five and a quarter. The weirder the system, the more frustrating and fun it is. 
I quite like doing that detective work.</p><p>The Amstrad disk has also really stolen my heart. The popularity of floppy disks is very geographically dependent. Our library, for example, has these Amstrad 3-inch disks. But if you go to the U.S., they’re really uncommon. They weren’t able to manufacture enough of these drives, and [3.5-inch disks] took over at a certain point. But they’re really cute.</p><p><strong>What’s the best method for sustainably storing data?</strong></p><p><strong>Talboom: </strong>The main thing is actively looking after it. A lot of the floppy disks we get in the library haven’t been accessed for 20 or 30 years, which means that you need certain special hardware to actually read them, and then work with emulators or other tools to make these file formats accessible.</p><p>Now that we’ve done that work and transferred it, we can monitor it and make sure it’s not suffering from anything like bit rot. We can also make decisions around migrating it to other file formats or working on specific file systems or unknown file formats in more detail.</p><p><em>This article appears in the June 2026 print issue as “Leontien Talboom.”</em></p>]]></description><pubDate>Tue, 26 May 2026 13:00:00 +0000</pubDate><guid>https://spectrum.ieee.org/floppy-disk-data-preservation-archives</guid><category>Archives</category><category>5-questions</category><category>Data-preservation</category><category>Type-departments</category><dc:creator>Gwendolyn Rak</dc:creator><media:content medium="image" type="image/png" url="https://spectrum.ieee.org/media-library/person-in-floppy-disk-sweater-sits-behind-scattered-floppy-disks-on-table.png?id=66763716&amp;width=980"></media:content></item><item><title>Pavona Launches Open-Hardware Ecosystem for Secure Chips</title><link>https://spectrum.ieee.org/pavona-open-source-hardware</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/3d-rendering-of-several-layers-comprising-a-single-computer-chip.jpg?id=66785309&width=1200&height=400&coordinates=0%2C417%2C0%2C417"/><br/><br/><p><span>Open-source software is ubiquitous: </span><a href="https://www.linux.org/" target="_blank">Linux</a><span> is the dominant operating system on servers and supercomputers worldwide; </span><a href="https://wordpress.com/" target="_blank">WordPress</a><span> powers over 40 percent of all websites, among other major projects. Open-source hardware has </span><a href="https://lists.debian.org/debian-announce/1997/msg00026.html" target="_blank">existed</a><span> since the late 1990s, but it hasn’t seen nearly the same level of interest or adoption as its software-focused cousin.</span></p><p><a href="https://www.linkedin.com/in/dominic-rizzo-b353a628/" target="_blank">Dominic Rizzo</a>, CEO and founder of the startup <a href="https://www.zerorisc.com/" target="_blank">zeroRISC</a>, aims to change that. Today, the nonprofit global security standards consortium <a href="https://globalplatform.org/" target="_blank">GlobalPlatform</a> launched <a href="https://www.pavona.org" target="_blank">Pavona</a>, where Rizzo will be a governing board chair. The goal of Pavona is to facilitate the adoption of open hardware into all kinds of applications, including tiny IoT devices and massive data centers, by making the elements modular, standardized, and trusted.</p><p>Pavona is a new open-hardware ecosystem. It provides a starting kit of hardware modules, coupled with reference designs and a set of software tools to streamline adoption and ease integration in different types of chips. 
It also has a governance structure aimed at lowering the barrier to entry for adding new open-hardware designs and collaborating on development.</p><p>“I think it’s foundational,” says <a href="https://en.wikipedia.org/wiki/Andrew_Huang_(hacker)" target="_blank">Andrew “bunnie” Huang</a>, hacker and founder of <a href="https://baochip.com/" target="_blank">Baochip</a>, which is a founding member of Pavona. “We are now at the point where we finally have enough of a nugget of something open that we can spread it around. The outcome of this experiment is going to determine the shape of how we interact with hardware and open source for a long time.”</p><h2>How open-source hardware differs from open-source software</h2><p><span><strong></strong>The main reason open-source hardware hasn’t seen as much of a boom as software is almost too obvious to name: hardware needs to be manufactured, and manufacturing costs money. “Hardware, when it’s built, requires atoms,” Huang says, “which requires logistics and payment.”</span></p><p>At bottom, manufacturing itself is closed source. Because of this, open-sourcing hardware is inherently layered: While the chip fabrication, physical design kit, and foundry process remain closed, the layers on top of that, such as the design verification, system architecture, instruction-set architecture, and firmware, may be open source.</p><p>The Pavona ecosystem isn’t meant to deepen the penetration of open source through the layers. Instead, it’s meant to take the available open-source layers and facilitate their adoption and repurposing into as broad an application set as possible. 
“A lot of the work we’re putting into Pavona has to do with the infrastructure and the architecture that connects all this stuff together,” Rizzo says, “so it becomes much more like Legos, so you can use it in one configuration for a small IoT device and in another configuration for some large data-center system-on-a-chip.”</p><p>Part of making the hardware components more modular is software. Rizzo and his team built what they call an architectural composition engine that serves as a wrapper around the hardware, allowing it to interact with different types of computing cores, be they ARM or RISC-V. This way, a company can integrate the open hardware into their existing architecture without changing the software stack.</p><h2>Pavona begins with security chip OpenTitan</h2><p>Pavona’s starting kit of open-hardware designs includes <a href="https://spectrum.ieee.org/open-titan-chip" target="_self">components of OpenTitan</a>, a chip that provides a “hardware root-of-trust,” a chip-level source of security that serves as a foundation for all secure operations in a computer. They also include extensions of the OpenTitan design that <a href="https://www.zerorisc.com/blog/accelerating-post-quantum-cryptography-on-opentitan-based-designs-flexible-hardware-for-a-secure-future" target="_blank">incorporate</a> efficient cryptography that’s safe against possible future attacks from a large-scale quantum computer.</p><p>According to OpenTitan’s proponents, security hardware benefits from openness more than other chips, because if anyone can inspect and verify the design, and there is an active community of people stress-testing the hardware, it can become more trustworthy, and therefore more secure. It also makes the process of proving compliance with various regulatory requirements more straightforward.</p><p>Rizzo is counting on three factors to drive adoption of these open-security chips. 
The first is the AI boom, which has caused a massive increase in demand for chips of all kinds, not only the GPUs but also less well-known components like networking cards, monitors, and more. The second is the regulatory push toward transitioning to <a href="https://spectrum.ieee.org/post-quantum-cryptography-standards-nist" target="_self">postquantum security</a>, which both the <a href="https://bidenwhitehouse.archives.gov/briefing-room/statements-releases/2022/05/04/national-security-memorandum-on-promoting-united-states-leadership-in-quantum-computing-while-mitigating-risks-to-vulnerable-cryptographic-systems/" target="_blank">U.S</a>. and <a href="https://digital-strategy.ec.europa.eu/en/news/eu-reinforces-its-cybersecurity-post-quantum-cryptography" target="_blank">European</a> governments have legislated to happen by the end of 2030. And third is new regulatory requirements in the <a href="https://digital-strategy.ec.europa.eu/en/policies/cyber-resilience-act" target="_blank">European Cyber Resilience Act</a>, which adds new security verification and reporting requirements for products sold in the European market.</p><p>“I think those three things together are all driving people in this direction of using secure, open-source silicon,” Rizzo says.</p><p>Security hardware may be just the beginning. Pavona is designed to make it as easy as possible to pull in new hardware modules. One need not be a paying member of Pavona to contribute new designs. “We absolutely are rejecting gatekeeping,” Rizzo says.</p><p>To increase trust from both individual contributors and large companies, Rizzo and his team developed a governance structure based on large open-source projects from the software world, such as <a href="https://www.yoctoproject.org/about/project-overview/" target="_blank">Yocto</a>. Contributing-member companies get representation on Pavona’s governing board. However, an independent technical committee makes the high-level technical decisions. 
This separation of managerial and technical oversight is meant to increase trust and transparency. “People get very discouraged when they feel like, ‘Hey, I made a contribution, and then someone made a decision in a hallway somewhere and told us later.’ So this is more consensus based, it’s more discussion based. And so those discussions have to be open,” Rizzo says.</p><p><a href="https://ide.mit.edu/people/frank-nagle/" target="_blank">Frank Nagle</a>, the Linux Foundation’s advising chief economist and a research scientist at MIT, says compliance with standards and transparent governance are the keys to adoption of open-source technologies. “Having that type of structure in place will hopefully give it a fighting chance and allow it to reach scale, without people being concerned that it’s controlled by any one company.”</p><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="A flow chart containing several colored boxes representing parts of a computer chip" class="rm-shortcode" data-rm-shortcode-id="cc95bbba8452d217b00e6a053fe7dd97" data-rm-shortcode-name="rebelmouse-image" id="84aac" loading="lazy" src="https://spectrum.ieee.org/media-library/a-flow-chart-containing-several-colored-boxes-representing-parts-of-a-computer-chip.jpg?id=66785377&width=980"/> <small class="image-media media-caption" placeholder="Add Photo Caption...">Pavona’s architectural-composition engine allows hardware to interact with different types of computing cores, so a company can integrate open hardware into its existing architecture without changing the software stack.</small><small class="image-media media-photo-credit" placeholder="Add Photo Credit...">Dominic Rizzo</small></p><h2>The open-hardware future</h2><p>Nagle argues that an underappreciated benefit of open source is that it allows private companies to work together, collaborating on core technology while still competing on specialized 
implementations.</p><p>“My favorite example of this I heard from a car manufacturer,” Nagle says. “The seat in your car has a little button that slides your seat backward and forward. Nobody’s buying one car rather than another car because that little toggle is better. But if you didn’t have one of those in your car, then somebody might not buy your car.”</p><p>Many technologies fall into the same category as the car seat’s button—technologies that are necessary but not a differentiator of the product. Security chips are a great example: Every piece of hardware needs security; however, few have it as their main function. These are the parts that benefit from open source, Nagle explains.</p><p>Collaborating on such hardware may enable cost savings for chip manufacturers and their customers, making the AI boom more economically sustainable.</p><p>Perhaps even more important, open sourcing some hardware development can lower the barrier to entry for new people to enter the field. To aid in this quest, Pavona also provides multiple “getting started” guides, software emulation tools, and FPGA code that anyone can download onto a board and get up and running in under 10 minutes.</p><p>“I want to get more people involved,” says bunnie Huang. “Particularly young people, particularly new people. 
Because we need a more robust ecosystem, more new ideas to ensure that we have the ability to maintain these technologies we depend upon.”</p>]]></description><pubDate>Mon, 25 May 2026 14:00:01 +0000</pubDate><guid>https://spectrum.ieee.org/pavona-open-source-hardware</guid><category>Open-source-hardware</category><category>Open-source</category><category>Hardware-security</category><category>Embedded-security</category><dc:creator>Dina Genkina</dc:creator><media:content medium="image" type="image/jpeg" url="https://spectrum.ieee.org/media-library/3d-rendering-of-several-layers-comprising-a-single-computer-chip.jpg?id=66785309&amp;width=980"></media:content></item><item><title>Reclaiming Social Engineering for Good</title><link>https://spectrum.ieee.org/social-engineering-good</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/a-photo-illustration-of-a-person-inside-a-swirling-tunnel-of-colorful-digital-shapes-and-screens.jpg?id=66742827&width=1200&height=400&coordinates=0%2C833%2C0%2C834"/><br/><br/><p>“Social engineering” sounds like something out of a conspiracy thriller, charged with totalitarian control and fringe paranoia. More mundanely, it’s come to be associated with phishing and other scams, in which fraudsters manipulate people into disclosing personal information. </p><p>Yet the concept is older and more benign: it is the deliberate shaping of human behavior, often at scale. It predates silicon—and became pervasive, and ungoverned, especially once its practitioners learned to hide it. Authoritarian regimes and more recently scammers and big companies have profited from it. To defend ourselves from bad actors, and to benefit from social engineering’s good side, we need to reclaim the name, and <a href="https://spectrum.ieee.org/why-engineers-must-try-to-save-the-world" target="_blank">govern it prudently</a>.</p><h2> The roots of engineering</h2><p>In 1894, Dutch entrepreneur Jacques van Marken urged companies to hire “social engineers” to manage human systems such as insurance, education, and profit sharing for workers as carefully as they did mechanical ones. Fifteen years later, reformer William H. Tolman published <em>Social Engineering</em>, describing how U.S. industrialists optimized workers’ conditions alongside manufacturing methods. If industrialists could shape steel and electricity on demand, why not society itself?</p><p> By the 1920s, that confidence had spread. The architect Le Corbusier declared that dwellings were “machines for living in,” imagining cities as orderly lattices where people moved like parts on a conveyor belt. Civilization would run like a Swiss watch.</p><p>The idea soon darkened. 
Authoritarian regimes pushed it to extremes, promising to fashion “<a href="https://www.jstor.org/stable/20719929" rel="noopener noreferrer" target="_blank">the New Man</a>.” In Nazi Germany, engineer Fritz Todt founded Organization Todt, a vast state engineering enterprise that emerged from the autobahn highway system and later operated concentration camps using slave labor. </p><p>In the Soviet Union, leaders adopted U.S. scientific management techniques to plan factory-worker movements and classify populations through centralized records, feeding both rapid industrialization drives and the gulag system of forced labor. The same tools and managerial methods used to build highways and enact five-year plans worked for repression and mass control.</p><p>By the 1950s, “social engineering” had become a contaminated phrase. The revelations of Nazi and Soviet abuses, along with Cold War <a href="https://en.dialektika.org/society-politics/politics/karl-popper-and-the-social-engineering-utopian-vs-piecemeal/" rel="noopener noreferrer" target="_blank">critiques of grand social planning</a> turned the term from a progressive slogan into a warning label. Banishing the words pushed the practice underground, making it harder to recognize when it resurfaced in new forms—such as organizational psychology and systems management that still relied on classification and behavioral influence techniques but under softer, less loaded labels.</p><h2>Social engineering’s more subtle spread</h2><p>In the postwar years, the new social-engineering lexicon included “human factors” and “urban planning,” all promising integration rather than command. As computing advanced, the language shifted again: “customer journey mapping” to track interactions, “user experience” to script them. Engineering, which began as a means of reshaping physical space, set its sights on shaping behavior. 
Digital design features embedded in our smartphones now target our attention and desire.</p><p>Language helps conceal these modern forms of social engineering. “Data analytics” sounds neutral beside “surveillance.” “Personalization” flatters individuality while still sorting users into predictable categories. “Behavioral nudges” guide decisions without the sense of intrusion. We attach “social” as a favorable modifier to sciences, capital, and media, yet recoil when it meets “engineering.”</p><p>That discomfort is a clue. Engineering implies control, and control prompts us to ask who directs whom, toward what ends, and with whose permission.</p><p>Not all social engineering these days is hidden. Hackers don’t need to break a firewall if someone hands over their password. Romance scammers cultivate intimacy the way farmers cultivate crops. They succeed not through force but by exploiting trust. If even these obvious attacks work, the invisible kind, with roots in social engineering, is a shoo-in. </p><p>Most of the social engineering we encounter is proprietary and beyond our control. Firms build recommendation algorithms tuned to boost engagement and profit with no hearings or right of appeal. Browser and cookie defaults decide what data we surrender. A single autoplay toggle can cost users hours and build unhealthy habits. These are acts of engineering as deliberate as laying a road or redrawing an electoral district. They create a kind of curated itch by which boredom never settles, and satisfaction never arrives. The results are predictable—users click on targeted ads, make purchases, form habits, and lock in opinions. </p><p>Consent has transformed along with it. Once straightforward and revocable, it is now subtle and persistent, buried in defaults or opaque terms of service too quickly accepted. You remain free to opt out, much as you are free to refuse roads or electricity. 
Consent has become the preselected setting of modern life.</p><p>When social engineering operated more in the open, citizens could contest it, at least in societies with responsive government. Today’s invisible version diffuses accountability so thoroughly that scrutiny becomes hard to direct. Despite recent <a href="https://www.judiciary.senate.gov/committee-activity/hearings/social-media-and-the-teen-mental-health-crisis" rel="noopener noreferrer" target="_blank">congressional hearings</a> on social media’s impact on youth mental health and juries agreeing that <a href="https://spectrum.ieee.org/social-media-trial" target="_self">firms are knowingly designing algorithms that cause harm</a>, pinpointing responsibility remains elusive. When the mechanism is buried inside a system used by billions, we cannot easily point to a single decision-maker or trace the precise moment of manipulation. </p><p>Today’s social engineering is less overt and theatrical than its predecessors. Earlier versions arrived on public posters and loudspeakers for mass audiences. Today’s version is more intimate, delivered through personal devices and constant feeds tailored to the individual. The model succeeds because participation feels like freedom, not control. </p><p>Not all social engineering is dystopian. Well-kept parks foster community, accessible buildings extend dignity, vaccines and seatbelts save lives. Even in the digital realm, positive examples exist: browser extensions that automatically block hidden trackers, search engines that refuse to build personalized surveillance profiles, and decentralized social platforms that give users greater control over their own data and feeds. </p><p> The term “social engineering” still unsettles, though. But “asocial” engineering, which ignores human consequences entirely, is worse. Recognition of the human dimension to engineering is the beginning of repair. 
Only by seeing the machinery clearly and naming it honestly can we decide who engineers what and why. The machinery will not dismantle itself. Once named, it becomes subject to choice. That negotiation of purpose, power, and process is the defining political question of any real democracy. We cannot ensure that social engineering serves and sustains society so long as we dodge the words.</p>]]></description><pubDate>Mon, 25 May 2026 13:00:01 +0000</pubDate><guid>https://spectrum.ieee.org/social-engineering-good</guid><category>Social-engineering</category><category>User-experience</category><category>Policy</category><category>Security</category><dc:creator>Guru Madhavan</dc:creator><media:content medium="image" type="image/jpeg" url="https://spectrum.ieee.org/media-library/a-photo-illustration-of-a-person-inside-a-swirling-tunnel-of-colorful-digital-shapes-and-screens.jpg?id=66742827&amp;width=980"></media:content></item><item><title>Bolt Challenges Nvidia With a Focus on Cutting-Edge Graphics</title><link>https://spectrum.ieee.org/bolt-graphics-zeus-gpu</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/bolt-graphicss-zeus-gpu-comes-in-as-a-pcie-card-for-pcs-and-workstations-and-in-a-multi-gpu-version-for-server-racks.jpg?id=66764156&width=1200&height=400&coordinates=0%2C729%2C0%2C730"/><br/><br/><p>Darwesh Singh thinks Nvidia has a weakness.</p><p>The last decade of Nvidia’s history was among the most consequential stories in technology ever. The company’s stock price has increased more than 200-fold since 2016, and <a href="https://epoch.ai/data-insights/ai-chip-production" rel="noopener noreferrer" target="_blank">deployed Nvidia AI compute capacity has surged to 225 times</a> its level in the first quarter of 2021, according to data tracked by Epoch AI.</p><p>Yet Nvidia may in some ways be a victim of its own success. Its dominance in AI has led to GPU designs that prioritize tensor units and low-precision math. These decisions make sense for AI, but less so for some creative, scientific, and industrial work. </p><p><a href="https://www.linkedin.com/in/darweshsingh/" target="_blank">Singh’s</a> five-year-old startup, <a href="https://bolt.graphics/about-us/" rel="noopener noreferrer" target="_blank">Bolt Graphics</a>, sees an opportunity to build a GPU specifically for these use cases. <a href="https://www.linkedin.com/in/jill-mueller-allied-asid-191511107/" rel="noopener noreferrer" target="_blank">Jill Mueller,</a> Bolt’s chief marketing officer, puts it bluntly. Nvidia has “a fundamental lack of understanding of their customer,” she says. 
“They just throw stuff at you, and there you go.”</p><p>Bolt aims for this potential weak spot with Zeus, a GPU that will be sold as both a PCIe (peripheral component interconnect express) card for desktop workstations and, for those who require more performance, a rack-mountable server containing four Zeus GPUs (for up to 96 per rack).</p><h2>While Nvidia goes low-precision, Bolt goes high</h2><p><a href="https://www.linkedin.com/in/feldgoise/" rel="noopener noreferrer" target="_blank">Jacob Feldgoise</a>, senior data research analyst at <a href="https://cset.georgetown.edu/" rel="noopener noreferrer" target="_blank">Georgetown University’s Center for Security and Emerging Technology</a>, has also noticed a shift in Nvidia’s recent hardware.</p><p>“AI is sucking the computational units used for high-precision workloads out of that hardware,” he says. “If you look at Nvidia’s highest performance GPUs, generation to generation, a greater share of the hardware has been allocated to <a href="https://spectrum.ieee.org/nvidia-gpu" target="_blank">low-precision compute</a>, as opposed to high-precision compute, which is generally needed for scientific computing.”</p><p>Precision refers to how many bits a GPU uses to represent each number. High-precision formats like FP64 (64-bit floating point) preserve more digits and a wider range, while FP16 and INT8 sacrifice precision for speed. Recently, Nvidia introduced a new 4-bit number format, <a href="https://developer.nvidia.com/blog/introducing-nvfp4-for-efficient-and-accurate-low-precision-inference/" rel="noopener noreferrer" target="_blank">NVFP4</a>, to accelerate AI workloads, which generally tolerate low-precision math.</p><p>But some tasks require precision. Singh cited geographical information systems, such as <a href="https://www.esri.com/en-us/arcgis/geospatial-platform/overview" rel="noopener noreferrer" target="_blank">Esri’s ArcGIS</a>, as an example. 
When rendering the planet on a GPU, low-precision arithmetic applied to large coordinate values can introduce errors that cause objects to drift.</p><p>Because Zeus, unlike so many other GPUs, is not designed primarily for AI, its design makes FP64-native vector cores a focus and allocates a large share of silicon to them. </p><p>“[Nvidia and AMD] make a conscious trade-off to allocate more die space to matrix multiplication and tensor units and less towards fixed function hardware,” Singh says. “We decided to allocate the die space a bit differently.” </p><h2>Rasterization is out, path tracing is in</h2><p>A focus on FP64 isn’t the only way Bolt differs from the norm. Zeus is also built to render graphics with path tracing instead of <a href="https://spectrum.ieee.org/story-behind-pixars-cgi-software" target="_blank">rasterization</a>. </p><p>Rasterization is the traditional method of high-performance 3D rendering. It projects 3D triangles onto a pixel grid and uses mathematical abstractions to determine the correct color for each pixel. Path tracing instead does the equivalent of shooting rays from a camera to simulate how light should bounce and interact. It delivers more accurate lighting but is computationally expensive.</p><p>As with high-precision math, Bolt believes it can find an edge by placing more emphasis on path tracing than do today’s GPUs. Rasterization is supported by Zeus but significantly scaled back; Singh estimates that Zeus’s raster performance is about half that of a comparable Nvidia card. </p><p>Bolt’s fresh arrival to the GPU arena also allows the company to take a clean-sheet approach unburdened by legacy support. 
This differs from Nvidia and AMD, which must integrate path tracing alongside rasterization in a way that can support numerous existing applications and application programming interfaces (APIs).</p><p>Bolt claims that a server rack with 28 Zeus GPUs will deliver real-time path traced performance equivalent to 280 Nvidia RTX 5090 GPUs. The aim is for this configuration of Zeus hardware to support real-time path tracing that simulates up to 20 “bounces”—a reflection or collision of the simulated light—at 4K resolution and 30 frames per second. This is the high degree of accuracy required for professional rendering workloads; for comparison, even the most graphically attractive path traced games simulate just a few bounces. </p><h2>Can a startup really launch a new GPU?</h2><p>There’s a logic to Bolt’s approach. Nvidia and AMD are focused on AI, but GPUs are still useful for many tasks besides AI. However, Bolt will need to overcome two key technical hurdles. </p><p>The first is production. Cutting-edge silicon production is in short supply, and leaders like Nvidia have most leading-edge production capacity tied up. The Zeus GPU will instead be fabricated on TSMC’s older N5 process node. Bolt is betting that an older process node will keep Zeus competitive with Nvidia on price.</p><p>Bolt may also find it challenging to convince users that an unproven GPU is a safe bet. Driver support for software is always a headache in the GPU arena—<a href="https://www.youtube.com/watch?v=MjYSeT-T5uk" rel="noopener noreferrer" target="_blank">just ask Intel</a>—and the use cases that might benefit from Zeus’s high precision and path tracing will also require reliable drivers. </p><p>Bolt plans to address this by launching with support only for specific applications. “We know that PC gaming is a huge segment,” Singh says. 
“But our approach is we want to target professional, creative, and high-performance compute first.” The company is working with software companies including <a href="https://www.blender.org/about/" rel="noopener noreferrer" target="_blank">Blender</a>, <a href="https://www.autodesk.com" rel="noopener noreferrer" target="_blank">Autodesk</a>, and <a href="https://www.sidefx.com/" rel="noopener noreferrer" target="_blank">SideFX</a>.</p><p>Bolt <a href="https://www.prnewswire.com/news-releases/bolt-graphics-completes-tape-out-of-test-chip-for-its-high-performance-zeus-gpu-a-major-milestone-in-reducing-computing-costs-by-17x-302750442.html" rel="noopener noreferrer" target="_blank">announced the tape-out of the first Zeus test chips on 22 April 2026</a>, and it is now focused on bringing the GPU to production by the fourth quarter of 2027.</p>]]></description><pubDate>Thu, 21 May 2026 13:00:02 +0000</pubDate><guid>https://spectrum.ieee.org/bolt-graphics-zeus-gpu</guid><category>Computer-graphics</category><category>Gpus</category><category>Nvidia</category><dc:creator>Matthew S. Smith</dc:creator><media:content medium="image" type="image/jpeg" url="https://spectrum.ieee.org/media-library/bolt-graphicss-zeus-gpu-comes-in-as-a-pcie-card-for-pcs-and-workstations-and-in-a-multi-gpu-version-for-server-racks.jpg?id=66764156&amp;width=980"></media:content></item><item><title>SEM-Guided Low-kV FIB Finishing for Leading-Edge Semiconductor Failure Analysis</title><link>https://events.bizzabo.com/868497/home</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/zeiss-logo-above-the-slogan-seeing-beyond-on-a-dark-curved-rectangle.png?id=66728517&width=980"/><br/><br/><p>Discover how the ZEISS Crossbeam 750 FIB-SEM sets a new benchmark for precise TEM lamella prep, tomography, and advanced nanofabrication. It delivers better resolution, better SNR, larger usable FOV, and shorter acquisition times. Learn how uninterrupted FIB milling will reduce damage and rework, accelerate time to TEM, and increase first-pass success—so your FA, yield, and materials teams make faster, confident, data-driven decisions.</p><p><span>Join us to discover how the new ZEISS Crossbeam 750 with its see-while-you-mill capability delivers precision and clarity—every time—for demanding FIB-SEM workflows. </span>Designed for extremely challenging TEM lamella preparation, tomography, advanced nanofabrication, and APT‑ready lift‑out, Crossbeam 750 combines a new Gemini 4 SEM objective lens, a double deflector, and a next‑generation scan generator to elevate both image quality and process confidence. You’ll learn how better resolution and better SNR translate into more image detail and shorter acquisition times, while the low‑kV FIB performance enables more precise lamella prep.</p><p>We’ll demonstrate High Dynamic Range (HDR) Mill + SEM—an interwoven SEM/FIB scanning mode that suppresses FIB‑generated background. This enables immediate, clean visual feedback, even while nudging the FIB pattern live during milling. The result: confident endpointing with uninterrupted FIB milling and pristine, metrology‑grade surfaces with the lowest possible sample damage. </p><p><span><span>This session is ideal for semiconductor failure analysts, yield teams, and materials scientists seeking faster time‑to‑TEM, higher first‑pass success, and consistent outcomes at low kV. 
See how Crossbeam 750 empowers you to make earlier stop‑milling decisions, cut rework, and reliably plan turnaround time—so you can move from sample to insight with confidence.</span></span></p><p><span><span></span><a href="https://events.bizzabo.com/868497/home" target="_blank">Register now for this free webinar!</a></span></p>]]></description><pubDate>Thu, 21 May 2026 10:00:02 +0000</pubDate><guid>https://events.bizzabo.com/868497/home</guid><category>Type-webinar</category><category>Semiconductors</category><category>Nanofabrication</category><category>Optics</category><dc:creator>Zeiss</dc:creator><media:content medium="image" type="image/png" url="https://assets.rbl.ms/66728517/origin.png"></media:content></item><item><title>The Future of Physical AI Isn’t Smarter Robots, It’s Smarter Interfaces</title><link>https://spectrum.ieee.org/wetour-robotics-physical-ai-human-interfaces</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/hands-controlling-speaker-light-bulb-and-drone-against-minimalist-white-walls.jpg?id=66718902&width=1200&height=400&coordinates=0%2C139%2C0%2C140"/><br/><br/><p><em>This sponsored article is brought to you by <a href="https://wetourrobotics.com/" target="_blank">Wetour Robotics</a>.</em></p><p>A field technician on a wind turbine, harness clipped, both hands on a wrench, needs to send a command to the diagnostic device hanging at her belt. A logistics worker on a loading dock, gloves on, eyes on the pallet, needs to redirect a connected lift. A person using an assistive mobility device on a crowded street wants to nudge it forward without taking out a phone or speaking aloud. None of these moments call for a smarter robot. They call for a smarter way to be heard by the machines that already exist.</p><h2>The industry has been building from one side</h2><p>The past three years of Physical AI have been a story of remarkable progress on the robot side of the loop. Companies like Boston Dynamics, Figure, and Unitree have advanced actuators, locomotion, and dexterity to a level that would have seemed implausible a decade ago. Google DeepMind’s Gemini Robotics has redefined what vision-language-action models can do in unstructured settings. The trajectory of the hardware and the foundation models is real, and it is accelerating.</p><p>But there is another side to this loop, and it has been treated as a solved problem for too long. The interface between humans and machines has defaulted, for 40 years, to three input modalities: screens, buttons, and voice. Each of those assumes the user can stop, look down, and translate intent into structured commands. That assumption breaks the moment the work moves into a real environment. On a turbine. On a dock. On a sidewalk. 
In any setting where hands are occupied, eyes are committed, or speaking is impractical, the conventional interface stack quietly fails.</p><p class="pull-quote">Spatial Intent Fusion is the simultaneous processing of three streams of human-centered information, namely spatial position, visual context, and gestural intent: Your body is the interface.<br/></p><p>The bottleneck on the human side of the loop is becoming as important as the one on the machine side. And solving it requires a different question. Not how do we make the robot more capable, but how do we let the human participate in the computing system as naturally as the robot already does.</p><h2>Wetour Robotics’ bet: put the human back into the computing loop</h2><p><a href="https://wetourrobotics.com/" target="_blank">Wetour Robotics</a> is betting that the next architectural leap in Physical AI is not about making the robot more capable. It is about making the human a first-class node in the computing network, with the same kind of low-latency, high-fidelity participation that connected devices already enjoy.</p><p>Wetour Robotics’ engineers frame the problem this way: a wristband that recognizes a gesture is not enough. A camera that recognizes a scene is not enough. The information a human carries about what they are about to do is distributed across multiple channels, including where their body is in space, what their eyes are attending to, and what their muscles are preparing to do, and any single channel observed in isolation is ambiguous. Reconstructing intent reliably means fusing those channels at the operating system level, with latency low enough that the loop feels closed rather than mediated.</p><p>This approach has a name. Wetour Robotics calls it Spatial Intent Fusion: the simultaneous processing of three streams of human-centered information, namely spatial position, visual context, and gestural intent, fused into a single real-time command for any connected physical device. 
It is the technical implementation behind a simpler positioning statement the company uses externally: your body is the interface.</p><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="Sleek silver rectangular electronic device labeled “ORCHESTRA” on a light gray background." class="rm-shortcode" data-rm-shortcode-id="bb58b16b7b8b65030fe32d2ff82e4ee2" data-rm-shortcode-name="rebelmouse-image" id="1af08" loading="lazy" src="https://spectrum.ieee.org/media-library/sleek-silver-rectangular-electronic-device-labeled-u201corchestra-u201d-on-a-light-gray-background.png?id=66718892&width=980"/> <small class="image-media media-caption" placeholder="Add Photo Caption...">Orchestra is a portable intelligent hub running the operating system that handles sensor fusion, intent inference, command translation, and safety arbitration. The reference compute platform is NVIDIA Jetson Orin Nano Super, which provides enough on-device inference capacity to keep the entire control loop at the edge, with no cloud dependency on the critical path. </small><small class="image-media media-photo-credit" placeholder="Add Photo Credit...">Wetour Robotics</small></p><h2>The architecture: three layers, four engines, one loop</h2><p>Orchestra is not a single device but a layered platform, designed from the start to be sensor-flexible and actuator-agnostic. The architecture decomposes into three perception layers and four coordination engines.</p><p><strong>Orchestra</strong> itself is the local compute and orchestration core: a portable intelligent hub running the operating system that handles sensor fusion, intent inference, command translation, and safety arbitration. The reference compute platform is NVIDIA Jetson Orin Nano Super, which provides enough on-device inference capacity to keep the entire control loop at the edge, with no cloud dependency on the critical path. Edge inference is non-negotiable for this application. 
Full-chain latency from biosignal acquisition to actuator command is held under 100 milliseconds, the envelope inside which closed-loop control feels natural rather than laggy.</p><p><strong>VisionLink</strong> handles visual and spatial perception. Cameras feed into vision models that identify objects, estimate distances, and track environmental context. VisionLink is designed not as a passive recognition layer but as a real-time command generator: its outputs feed directly into Orchestra OS to be fused with biosignal data.</p><p><strong>Conductor</strong> is the biosignal pipeline. It ingests raw surface electromyographic (sEMG) data from a wrist-worn device, classifies temporal patterns into discrete gestures or continuous control signals, and outputs actuator commands. The technically interesting property of sEMG for this use case is that the signal precedes visible motion. Motor unit action potentials appear at the skin surface roughly 50 to 80 milliseconds before a finger completes the corresponding gesture. Wetour Robotics calls this property pre-motion intent sensing, and it is what allows Orchestra to anticipate user intent rather than react to it.</p><p>On top of the three perception layers, Orchestra OS runs four coordination engines. The <strong>Perception Engine</strong> ingests and normalizes raw sensor streams. The <strong>Intent Engine </strong>performs Spatial Intent Fusion across modalities, resolving what the user is trying to do given where they are, what they are looking at, and what their hand is signaling. The <strong>Orchestration Engine</strong> translates intent into device-specific command sequences for any connected actuator. 
The <strong>Safety Engine</strong> arbitrates conflicting commands, enforces operational envelopes, and gates execution against runtime safety conditions.</p><p class="shortcode-media shortcode-media-youtube"> <span class="rm-shortcode" data-rm-shortcode-id="cadc408927185275af6d15b314d998a0" style="display:block;position:relative;padding-top:56.25%;"><iframe frameborder="0" height="auto" lazy-loadable="true" scrolling="no" src="https://www.youtube.com/embed/WOUjWM4hIko?rel=0" style="position:absolute;top:0;left:0;width:100%;height:100%;" width="100%"></iframe></span></p><h2>The trade-offs we’re honest about</h2><p>No system that bridges the human body and the digital world is finished. Three engineering challenges remain open, and the company addresses each with a deliberate trade-off rather than a claim of having fully solved it.</p><p><strong>Baseline stability of sEMG under motion.</strong> In a stationary user, continuous gesture recognition from sEMG is reliable. Once the user is walking, climbing, or otherwise moving, motion artifacts and electrode drift degrade the signal in ways that are difficult to fully compensate for. Rather than overpromise on continuous control in dynamic settings, Orchestra defaults to a smaller set of robust discrete gestures in complex operating environments, and reserves continuous control modes for contexts where the signal-to-noise ratio supports them.</p><p><strong>Miniaturization of edge AI compute.</strong> Running the Orchestra control loop entirely at the edge requires real on-device inference, which has historically meant trading off between compute capacity, battery life, and form factor. Wetour Robotics’ approach has been a compact carrier board paired with a thermal design and a battery module sized for all-day wearability. 
The result is a hub that travels with the user rather than tethering them to a desk, and that performs the full perception-to-actuation loop without offloading to the cloud.</p><p><strong>Heterogeneity of third-party device protocols.</strong> The actuator side of the loop is a fragmented landscape. Different manufacturers expose different command interfaces, different communication stacks, and different safety conventions, and a Physical AI operating system has to integrate with all of them. Wetour Robotics uses an AI-agent layer to negotiate connection and protocol translation adaptively, so that Orchestra OS can ingest data from a wide range of devices, run them through neural network models that infer human intent, and emit the right command on the right protocol for the device on the other end.</p><h2>Why this matters, and why it helps the rest of the field</h2><p>The history of computing is a history of interface revolutions. Command lines gave way to graphical user interfaces, which gave way to touch, which gave way to voice. Each transition expanded who could participate in the system and what they could do with it. The next transition is not about a new screen or a new microphone. It is about treating the human body itself as a participant in the computing network, capable of contributing intent at the same speed and fidelity that any other connected node can.</p><p class="pull-quote">The history of computing is a history of interface revolutions. The next transition is not about a new screen or a new microphone — it is about treating the human body itself as a participant in the computing network.</p><p>This path is not a competitor to the work being done on humanoid robots, foundation models for embodied AI, and dexterous manipulation. It is the missing complement to that work. 
The hardest open problem for humanoid systems is the data: every natural interaction between a human and the physical world is a potential training signal, and most of those interactions are currently invisible to any computing system. As more humans become first-class nodes in the loop, those interactions become observable, structured, and ultimately useful for training the next generation of embodied AI, including the humanoid robots being developed today.</p><p>In other words: putting the human back into the computing loop is not just about better interfaces for individual users. It is about generating the kind of grounded, in-the-wild human-machine interaction data that the broader Physical AI ecosystem will need to keep advancing. The robot side and the human side of the loop are not two competing futures. They are two halves of the same one.</p><p>That is what Wetour Robotics means when it says: <em>Your body is the interface.</em></p><p>Learn more at <a href="https://wetourrobotics.com/" target="_blank">wetourrobotics.com</a>.</p>]]></description><pubDate>Thu, 21 May 2026 10:00:02 +0000</pubDate><guid>https://spectrum.ieee.org/wetour-robotics-physical-ai-human-interfaces</guid><category>Interfaces</category><category>Physical-ai</category><category>Robot-hardware</category><category>Smarter-robots</category><dc:creator>Wetour Robotics</dc:creator><media:content medium="image" type="image/jpeg" url="https://spectrum.ieee.org/media-library/hands-controlling-speaker-light-bulb-and-drone-against-minimalist-white-walls.jpg?id=66718902&amp;width=980"></media:content></item><item><title>Manchester Code Made Bits Behave</title><link>https://spectrum.ieee.org/manchester-code-ieee-milestone</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/black-and-white-photograph-of-a-well-dressed-young-man-working-on-a-magnetic-drum-store-in-a-lab.jpg?id=66734376&width=1200&height=400&coordinates=0%2C350%2C0%2C350"/><br/><br/><p>In the late 1940s—when computer engineers were grappling with unreliable hardware and noisy transmission environments—a team of engineers inside a modest lab at the <a href="https://www.manchester.ac.uk/" rel="noopener noreferrer" target="_blank">University of Manchester</a>, England, confronted a problem so fundamental that it threatened the viability of <a href="https://spectrum.ieee.org/topic/computing/" target="_self">digital computing</a> itself. Machines could generate bits, but they could not reliably read them back.</p><p>The problem did not initially present itself as a grand theoretical challenge. It showed up as something more mundane: inconsistent computing results.</p><p>Engineers including <a href="https://en.wikipedia.org/wiki/Frederic_C._Williams" rel="noopener noreferrer" target="_blank">Frederic C. Williams</a>, <a href="https://computerhistory.org/profile/tom-kilburn/" rel="noopener noreferrer" target="_blank">Tom Kilburn</a>, and <a href="https://www.historicalporttalbot.com/dr-g-e-tommythomas.html" rel="noopener noreferrer" target="_blank">G. E. (Tommy) Thomas</a> traced the failures not to logic errors but to the physical behavior of the machines themselves. The team devised a technique for keeping a transmitter and a receiver synchronized without relying on a separate clock signal. Their innovation, known as <em><em>Manchester code</em></em> or <em><em>phase encoding</em></em>, encoded each bit with a transition in the middle of the bit period, embedding timing information directly into the data stream and making the signal self-clocking. 
So, even if the signal degraded or the timing drifted slightly, the receiver could continually keep time based on those regular transitions.</p><p>By eliminating the need for separate clocks and reducing synchronization errors, Manchester code made data transfer more robust across cables and circuits.</p><p>Those qualities later made it a natural fit for technologies such as <a href="https://spectrum.ieee.org/ethernet-ieee-milestone" target="_self">Ethernet</a> and early data storage systems. Its self-clocking nature helped standardize how machines communicate, and it laid the groundwork for modern networking and digital communication protocols.</p><p>On 13 April 2026, this breakthrough was honored with an <a href="https://ieeemilestones.ethw.org/Main_Page" rel="noopener noreferrer" target="_blank">IEEE Milestone</a> plaque during a ceremony at the University of Manchester. Dignitaries from IEEE and the university attended the ceremony.</p><h2>Embedding timing in signals</h2><p>Those 1940s Manchester University engineers were working on systems that fed into the <a href="https://www.britannica.com/technology/Manchester-Mark-I" rel="noopener noreferrer" target="_blank">Manchester Mark I</a>, one of the first practical stored-program machines.</p><p>When troubles arose, they used oscilloscopes to probe signals. They found that electrical pulses did not arrive with consistent timing. Memory signals also blurred over time, making them harder to read, and when long runs of identical bits occurred, the waveform flattened into stretches with no transitions.</p><p>That led to a crucial insight: The problem was not just detecting whether a signal was high or low; the system also lost track of when to sample the signal. Without reliable timing markers, even correctly formed signals were misread. Bits could effectively be lost or miscounted because the system fell out of sync.</p><p>At first, the engineers tried to tame the hardware. 
They experimented with stabilizing circuits and more consistent pulse generation, attempting to impose a regular rhythm on an inherently unstable system. But the fixes proved fragile, and the electronics of the day could not maintain the required precision. So the Manchester group took a different approach.</p><p>If the hardware could not provide a dependable clock, the signal itself would have to carry one. Instead of representing data as static levels, each bit changed state, with a guaranteed transition in the middle.</p><p>Embedding timing in the signal reduced erratic behavior. Machines were suddenly able to reliably transmit, store, and read back data—an essential step toward practical stored-program computing.</p><h2>Making signals unmistakable</h2><p>The Manchester code addressed several issues at once. Regular transitions allowed continuous timing recovery. Transitions proved easier to detect than static levels, and long runs of identical bits no longer produced flat, ambiguous waveforms. 
Rather than fighting the imperfections of early electronics, the design worked with them.</p><h2>From lab curiosity to a global standard</h2><p>What began as a local solution in Manchester shaped digital communication systems for decades, including early <a href="https://spectrum.ieee.org/ethernet-ieee-milestone" target="_self">Ethernet</a> technology, for which timing and shared-medium communication were central challenges.</p><p><a href="https://www.invent.org/inductees/robert-m-metcalfe" rel="noopener noreferrer" target="_blank">Robert Metcalfe</a>, a member of the team that built the first Ethernet system at <a href="https://spectrum.ieee.org/xerox-parc" target="_self">Xerox PARC</a> in 1973, recalls that he and his colleagues relied on Manchester code.</p><p>“Manchester code solved a fundamental problem for us: timing,” Metcalfe says, explaining that each bit carried its own clock, which removed the need for a global synchronized signal.</p><p>That self-clocking property wasn’t the only benefit of the encoding scheme. On a shared <a href="https://en.wikipedia.org/wiki/Coaxial_cable" rel="noopener noreferrer" target="_blank">coaxial cable</a>, each transceiver left the medium undriven—effectively “off”—most of the time, allowing packets from other machines to pass without interference. Even during transmission, a station drove the signal only about half the time, leaving the line undriven during the other half of each bit cycle.</p><p>This distinction—between a driven signal and an undriven line, rather than simple 1s and 0s—allowed receivers to recover both data and clock timing while also monitoring the cable for other activity. If a transceiver detected a signal when it expected the line to be undriven, that indicated another station was transmitting at the same time. 
In other words, the system could detect collisions in real time and respond accordingly.</p><p>The idea has proven durable far beyond local networks. Manchester code is being used aboard the <a href="https://science.nasa.gov/mission/voyager/" rel="noopener noreferrer" target="_blank">Voyager</a> spacecraft, which are now cruising through interstellar space—underscoring its reliability in extreme environments.</p><p>The code also has found its way into everyday consumer electronics. Infrared remote controls for televisions and audio equipment commonly rely on Manchester code through protocols such as <a href="https://ieeexplore.ieee.org/document/5204459" rel="noopener noreferrer" target="_blank">RC-5</a>, developed by <a href="https://www.usa.philips.com/?srsltid=AfmBOoo3NzmHCA6YkyyN491Ap-CAL5zRLLrSMfq00xJZS-gNs1Mdf4JV" rel="noopener noreferrer" target="_blank">Philips</a> in the early 1980s. The protocol encodes commands as timed infrared signals transmitted by a handset’s <a href="https://spectrum.ieee.org/tag/integrated-circuits" target="_self">integrated circuit</a> and <a href="https://spectrum.ieee.org/tag/leds" target="_self">LED</a>, allowing devices to reliably interpret button presses even through noise and signal distortion. Manufacturers across Europe—and many in the United States—adopted the approach, extending Manchester code into the home.</p><h2>Why the Milestone matters</h2><p>An IEEE Milestone designation recognizes technologies with enduring impact. Manchester code qualifies because it solved a foundational timing problem at a critical moment in computing history.</p><p>Without a way to embed timing in the data itself, early digital systems would have remained fragile and unreliable. 
Manchester code helped transform them into dependable machines, and it enabled much of today’s digital communication.</p><p class="pull-quote">“Manchester code solved a fundamental problem for us: timing.” <strong>—Robert Metcalfe, an Ethernet inventor</strong></p><p>Key participants at the plaque dedication ceremony included <a href="https://iwrc.ieeeusa.org/blog/portfolio-items/thomas-coughlin/" target="_blank">Tom Coughlin</a>, 2024 IEEE president; <a href="https://www.manchester.ac.uk/about/people/university-executive-team/president-vice-chancellor/" target="_blank">Duncan Ivison</a>, University of Manchester president and vice chancellor; and <a href="https://www.uwl.ac.uk/staff/nagham-saeed" target="_blank">Nagham Saeed</a>, chair of the <a href="https://www.ieee-ukandireland.org/" target="_blank">IEEE U.K. and Ireland Section</a>.</p><p>Talks by <a href="https://spectrum.ieee.org/kees-immink-the-man-who-put-compact-discs-on-track" target="_self">Kees Schouhamer Immink</a> (the 2017 <a href="https://corporate-awards.ieee.org/ieee-medal-of-honor/" rel="noopener noreferrer" target="_blank">IEEE Medal of Honor</a> laureate, probably best known for his work that made compact discs and other high-density digital media practical) and <a href="https://www.staffnet.manchester.ac.uk/news/display/?id=32657" rel="noopener noreferrer" target="_blank">Peter Green</a> (Manchester’s deputy dean for the engineering faculty) highlighted the code’s lasting impact on digital data storage and communications.</p><p>The IEEE <a href="https://ethw.org/Milestones:Manchester_Code,_1948%E2%80%931949" rel="noopener noreferrer" target="_blank">Milestone plaque</a> for the Manchester code reads:</p><p>“<em><em>At this site in 1948–1949, Manchester code was invented for reliably encoding digital data stored 
on the Manchester Mark I computer’s magnetic drum. It became a standard for computer magnetic tapes and floppy disks and was used in digital communications, including the Voyager 1 and 2 spacecraft and early Ethernet networks. It found wide use in domestic remote controllers, radio frequency identification (RFID) tags, and many control network standards</em></em>.”</p><p>Administered by the <a href="https://www.ieee.org/about/history-center" rel="noopener noreferrer" target="_blank">IEEE History Center</a> and supported by donors, the Milestone program recognizes outstanding technical developments worldwide. The IEEE U.K. and Ireland Section sponsored the nomination.</p>]]></description><pubDate>Mon, 18 May 2026 18:00:01 +0000</pubDate><guid>https://spectrum.ieee.org/manchester-code-ieee-milestone</guid><category>Type-ti</category><category>Ieee-history</category><category>Ieee-milestone</category><category>Computing</category><category>Ethernet</category><category>Digital-communication</category><dc:creator>Willie D. Jones</dc:creator><media:content medium="image" type="image/jpeg" url="https://spectrum.ieee.org/media-library/black-and-white-photograph-of-a-well-dressed-young-man-working-on-a-magnetic-drum-store-in-a-lab.jpg?id=66734376&amp;width=980"></media:content></item><item><title>How Melbourne’s AI and Data Center Flywheel Is Accelerating Research Innovation</title><link>https://spectrum.ieee.org/melbourne-ai-data-center-innovation</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/blue-lit-server-room-featuring-the-large-monash-maveric-supercomputer-installation.jpg?id=66718014&width=1200&height=400&coordinates=0%2C573%2C0%2C574"/><br/><br/><p><em>This sponsored article is brought to you by </em><a href="https://www.melbournecb.com.au/?utm_source=ieee&utm_medium=editorial&utm_campaign=discover-melbourne-2026&utm_term=maveric&utm_content=link" rel="noopener noreferrer" target="_blank"><em><span>Melbourne Convention Bureau (MCB)</span></em></a><em> supported by </em><a href="https://businessevents.australia.com/en" target="_blank"><em><span>Business Events Australia</span></em></a><em>.</em></p><p>Melbourne’s reputation as a global events city, from the Australian Open tennis and Formula 1 Australian Grand Prix to hosting NFL regular season games, now intersects with a different form of scale: large-scale compute, data-intensive research, and advanced engineering. Long recognized for delivering complex international events, the city is applying the same organisational capability to the infrastructure that underpins modern AI research, positioning Melbourne at the convergence of global convening and high-performance digital systems.</p><p>Consistently ranked among the world’s most livable cities, Melbourne was named <a href="https://www.timeout.com/travel/best-cities-2026" target="_blank"><span>Time Out’s Best City in the World in 2026</span></a>, the first Australian city to hold the title.</p><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="Hot-air balloons over a riverside city skyline at sunrise with parks in the foreground" class="rm-shortcode" data-rm-shortcode-id="a4fff87a9da2d309d0278846fedc28ac" data-rm-shortcode-name="rebelmouse-image" id="79f98" loading="lazy" src="https://spectrum.ieee.org/media-library/hot-air-balloons-over-a-riverside-city-skyline-at-sunrise-with-parks-in-the-foreground.jpg?id=66723254&width=980"/> <small class="image-media media-caption" 
placeholder="Add Photo Caption...">Melbourne, Australia’s premier conference destination.</small> <small class="image-media media-photo-credit" placeholder="Add Photo Credit...">Tourism Australia</small> </p><p>More materially for research and innovation, Melbourne is also the nation’s <a href="https://www.invest.vic.gov.au/understand-the-market/discover-consumer-and-business-markets/consumer-and-business-markets" target="_blank"><span>fastest‑growing capital</span></a>, attracting increasing concentrations of engineering and technology talent, investment and international engagement.</p><p><span>Australia’s artificial intelligence (AI) ecosystem is entering a new phase, defined less by isolated initiatives and more by the convergence of compute infrastructure, research intensity and international collaboration. Melbourne sits at this intersection.</span></p><p class="pull-quote"><span>Melbourne’s trajectory highlights what enables research at scale: access to frontier-grade compute, proximity to industry-ready infrastructure, and repeated opportunities for global research communities to convene.</span></p><p>Sovereign AI compute, expanding hyperscale data center campuses and a growing pipeline of international research-led conferences are reshaping the city’s research landscape. Together, these elements position Melbourne as a focal point for applied AI research, advanced engineering and data-intensive science.</p><p>The growing global influence of AI engineering, underscored by <a href="https://spectrum.ieee.org/2026-ieee-medal-of-honor" target="_self"><span>NVIDIA CEO Jensen Huang receiving the 2026 IEEE Medal of Honor</span></a>, reflects the scale of this shift. 
In Melbourne, these factors form a reinforcing research flywheel linking infrastructure, discovery and collaboration.</p><p>Rather than focusing on startup density or short-term commercial output, Melbourne’s trajectory highlights what enables research at scale: access to frontier-grade compute, proximity to industry-ready infrastructure, and repeated opportunities for global research communities to convene.</p><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="Person in tuxedo holding an IEEE award plaque on a lit stage with floral decor" class="rm-shortcode" data-rm-shortcode-id="5bc615cd12804379152b4c13d064cf9e" data-rm-shortcode-name="rebelmouse-image" id="e6a57" loading="lazy" src="https://spectrum.ieee.org/media-library/person-in-tuxedo-holding-an-ieee-award-plaque-on-a-lit-stage-with-floral-decor.png?id=66718012&width=980"/> <small class="image-media media-caption" placeholder="Add Photo Caption...">NVIDIA CEO Jensen Huang received the 2026 IEEE Medal of Honor.</small><small class="image-media media-photo-credit" placeholder="Add Photo Credit...">IEEE</small></p><h2>Sovereign AI foundations</h2><p>The most recent cornerstone of Melbourne’s AI capability is <a href="https://www.monash.edu/maveric" target="_blank"><span>MAVERIC</span></a> (Monash AdVanced Environment for Research and Intelligent Computing), Australia’s largest university-based AI supercomputer. Built and deployed by Monash University in partnership with NVIDIA, Dell Technologies, and CDC Data Centres, MAVERIC has been engineered specifically for large scale AI and data intensive science, with medical research representing a key priority. 
To that end, MAVERIC has been designed to function as a Next Generation Trusted Research Environment, providing a state-of-the-art, safe and secure framework for the analysis of large, sensitive datasets.</p><p>Designed to support research projects ranging from cancer and neurodegenerative disease detection, clinical trial analysis and drug discovery to materials science and engineering, MAVERIC enables Australian researchers to train and evaluate large models domestically while keeping highly sensitive datasets secure and under national jurisdiction. This sovereign design is particularly relevant in fields such as medical research, where privacy, regulation or intellectual property constraints limit the use of offshore cloud resources.</p><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="Professionals in business attire stand in a modern, arched lobby formation." class="rm-shortcode" data-rm-shortcode-id="85a14575d9b9f4808fb2757cf1b06fef" data-rm-shortcode-name="rebelmouse-image" id="0aaca" loading="lazy" src="https://spectrum.ieee.org/media-library/professionals-in-business-attire-stand-in-a-modern-arched-lobby-formation.png?id=66718020&width=980"/> <small class="image-media media-caption" placeholder="Add Photo Caption...">Monash University Vice-Chancellor and President Professor Sharon Pickering with researchers [left to right] Professor Anton Peleg, Professor Victoria Mar, Professor James Whisstock, Vice-President (Strategy and Major Projects) Teresa Finlayson, and Professor Patrick Kwan.</small><small class="image-media media-photo-credit" placeholder="Add Photo Credit...">Eamon Gallagher (Australian Financial Review)</small></p><p><span>Technically, the system reflects the latest shifts in high performance AI architecture. 
Built on NVIDIA GB200 NVL72 platforms and integrated using Dell’s rack scale infrastructure, MAVERIC employs closed loop liquid cooling to reduce water consumption compared with conventional air-cooled systems, aligning large scale compute growth with sustainability objectives while supporting high density, high throughput workloads.</span></p><p><a href="https://research.monash.edu/en/persons/james-whisstock/" target="_blank">Professor James Whisstock</a>, Deputy Dean Research of Monash’s Faculty of Medicine, Nursing, and Health Sciences commented, “MAVERIC provides a huge leap forward in our compute capability that will revolutionize our researchers’ ability to address the most challenging and important research questions across the fields of medical research, information technology, and STEM disciplines. It will seed wonderful new cross-disciplinary collaborations, underpin the work of our best and brightest young researchers and will allow our scientists to continue to make major discoveries that positively impact the Australian and global population more broadly.”</p><p class="pull-quote">“MAVERIC provides a huge leap forward in our compute capability that will revolutionize our researchers’ ability to address the most challenging and important research questions across the fields of medical research, information technology, and STEM disciplines.” <strong>—Professor James Whisstock, Deputy Dean Research of Monash’s Faculty of Medicine, Nursing, and Health Sciences</strong></p><p>Monash University frames MAVERIC not as a standalone asset, but as part of the national research infrastructure, intended to strengthen collaboration across academia, healthcare, government and industry. This approach positions Melbourne at the forefront of sovereign AI enabled research in the region.</p><h2>Data center scale as research infrastructure</h2><p>The infrastructure demands of modern AI research extend well beyond individual systems. 
Melbourne’s expanding data center footprint now supports hyperscale compute, applied AI deployment and large-scale research workloads simultaneously.</p><p class="shortcode-media shortcode-media-rebelmouse-image rm-float-left rm-resized-container rm-resized-container-25" data-rm-resized-container="25%" style="float: left;"> <img alt="Bar chart of 2024 data center investment; US leads, Australia second, then Japan, Singapore, UK, Canada." class="rm-shortcode" data-rm-shortcode-id="8d6a38b90c4ef69cae0f70753b193c91" data-rm-shortcode-name="rebelmouse-image" id="a796d" loading="lazy" src="https://spectrum.ieee.org/media-library/bar-chart-of-2024-data-center-investment-us-leads-australia-second-then-japan-singapore-uk-canada.jpg?id=66718033&width=980"/> <small class="image-media media-caption" placeholder="Add Photo Caption...">Total data center investment, US$ billions.</small><small class="image-media media-photo-credit" placeholder="Add Photo Credit...">Source: <a href="https://content.knightfrank.com/research/2982/documents/en/data-centres-global-report-2025-12054.pdf" target="_blank">Data Centres Global Report 2025</a></small></p><p><span>In February 2026, CDC Data Centres opened its first Melbourne campus in Brooklyn, with two live facilities and a third in planning. Together with CDC’s Laverton campus, the site is projected to give Melbourne more than 800 megawatts of sovereign digital capacity, critical for AI workloads requiring sustained access to high-density power, cooling and secure environments.</span></p><p>Parallel investment is underway in Fishermans Bend, where <a href="https://www.nextdc.com/news/nextdc-announces-2-billion-ai-factory-and-technology-campus-at-fishermans-bend" target="_blank"><span>NEXTDC is developing an AUD $2 billion AI and digital infrastructure hub</span></a> adjacent to the Innovation Precinct. 
Planned facilities include an AI Factory, a Mission Critical Operations Center and a Technology Center of Excellence, enabling sovereign AI, high-performance computing and cross-sector collaboration across health, defence and finance.</p><p>Melbourne hosts Australia’s <a href="https://www.invest.vic.gov.au/explore-your-sector/digital-technology/data-centres" target="_blank"><span>largest cluster of AI firms</span></a>, with 188 companies, and more than 40 data centers currently operate across Victoria. The Victorian Government has complemented this growth with an initial AUD $5.5 million investment in the Sustainable Data Center Action Plan.</p><p>Together, these developments reinforce Melbourne’s role as a national and increasingly global hub for high-performance AI infrastructure as model complexity and infrastructure dependency continue to accelerate.</p><h2>Applied AI research at scale</h2><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="People talking beside colorful cone sculpture outside modern campus building on College Walk" class="rm-shortcode" data-rm-shortcode-id="530f3f9e56c4a0636eda723d331bae6f" data-rm-shortcode-name="rebelmouse-image" id="8a533" loading="lazy" src="https://spectrum.ieee.org/media-library/people-talking-beside-colorful-cone-sculpture-outside-modern-campus-building-on-college-walk.jpg?id=66718043&width=980"/> <small class="image-media media-caption" placeholder="Add Photo Caption...">Monash University is home to MAVERIC, Australia’s largest university-based AI supercomputer, built and deployed by Monash in partnership with NVIDIA, Dell Technologies, and CDC Data Centres.</small><small class="image-media media-photo-credit" placeholder="Add Photo Credit...">Monash University</small></p><p>Melbourne’s research strength is underpinned by a dense university network with deep capability across AI, data science and engineering. 
Institutions including Monash University, the University of Melbourne, Deakin University, La Trobe University, RMIT University and Swinburne University of Technology collectively support research across machine learning, robotics, human-computer interaction, extended reality and advanced manufacturing.</p><p><span>This concentration fosters applied collaboration where AI intersects with medicine, sustainability, cognitive systems and immersive technologies. For visiting researchers, it provides access not only to academic expertise but also to live infrastructure environments where research can be tested and validated, reinforcing Melbourne’s position as one of the Asia-Pacific’s most integrated AI research ecosystems.</span></p><h2>Conferences as research accelerators</h2><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="Large audience in modern auditorium watching speaker on brightly lit conference stage" class="rm-shortcode" data-rm-shortcode-id="06b667e88bd471c9f85038ed575e2b64" data-rm-shortcode-name="rebelmouse-image" id="5b02f" loading="lazy" src="https://spectrum.ieee.org/media-library/large-audience-in-modern-auditorium-watching-speaker-on-brightly-lit-conference-stage.jpg?id=66718051&width=980"/> <small class="image-media media-caption" placeholder="Add Photo Caption...">Plenary session at Melbourne Convention and Exhibition Center.</small><small class="image-media media-photo-credit" placeholder="Add Photo Credit...">Melbourne Convention Bureau</small></p><p>Melbourne’s selection as host city for a growing number of international technology conferences reflects the convergence of research capability and infrastructure maturity.</p><p>In September 2026, <a href="https://datacenterworldaustralia.com/" target="_blank">Data Center World Australia</a> and <a href="https://australia.theaisummit.com/" target="_blank">The AI Summit Australia</a> will be co-located at the Melbourne Convention and Exhibition Center, bringing together global 
leaders across AI, digital infrastructure and enterprise technology. The pairing highlights a broader reality: advances in AI are inseparable from the infrastructure that enables them.</p><p class="pull-quote">Melbourne’s expanding data center footprint now supports hyperscale compute, applied AI deployment and large-scale research workloads simultaneously.</p><p>Research-led conferences are also expanding Melbourne’s global footprint. <a href="https://www.iconip2026.org/" target="_blank">ICONIP 2026</a>, hosted by Deakin University, will bring up to 700 researchers in neural networks and machine learning, followed in 2027 by <a href="https://ieeevr.org/2027/" target="_blank">IEEE VR</a>, the leading conference on virtual reality and 3D user interfaces, attracting up to 1,000 delegates.</p><p>In this context, conferences function not simply as events, but as infrastructure for knowledge transfer, supporting standards exchange, collaboration and system-level learning at global scale.</p><h2>A global platform for advancing research</h2><p>Sovereign compute, data center scale and a strong conference pipeline create a reinforcing cycle, enabling researchers to engage directly with infrastructure and industry well beyond the event itself.</p><p>By closing the gap between theory and deployment, Melbourne supports deeper technical exchange and more enduring global research networks.</p><p>This role was recognized in 2025, when IEEE named Melbourne Convention Bureau an Organisational Supporting Friend of <a href="https://www.ieee.org/communities/geographic-activities" target="_blank">IEEE Member and Geographic Activities (MGA)</a>. MCB was the first convention bureau in the Asia Pacific region to receive the acknowledgement, a result of its longstanding partnership with the <a href="https://r10.ieee.org/victorian/" target="_blank">IEEE Victorian Section</a>.</p><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="Two people hold an IEEE award in 
front of a 60 years Melbourne Convention banner" class="rm-shortcode" data-rm-shortcode-id="44ff9d49df420ad51c849d43c6875556" data-rm-shortcode-name="rebelmouse-image" id="d8d38" loading="lazy" src="https://spectrum.ieee.org/media-library/two-people-hold-an-ieee-award-in-front-of-a-60-years-melbourne-convention-banner.jpg?id=66718119&width=980"/> <small class="image-media media-caption" placeholder="Add Photo Caption...">Melbourne Convention Bureau (MCB) representative Fatima Aboudrar, Senior Business Development Manager, with Vijay S. Paul, Immediate Past Chair, IEEE Victorian Section, receiving Supporting Friend Member recognition in 2025.</small></p><p><span>As AI research becomes increasingly dependent on infrastructure scale, sovereign capability, and global collaboration, Melbourne is moving beyond hosting conversations to actively enabling the systems that advance AI and data‑driven research at global scale.</span></p><h2>Conference support in Melbourne</h2><p class="shortcode-media shortcode-media-html5_video"> <video caption="Why host a conference in Melbourne, Australia." class="rm-shortcode" controls="" data-rm-shortcode-id="dec941a36c92df0006fae2798936edb4" expand="1" feedbacks="true" id="c9d60" mime_type="video/mp4" photo_credit="Melbourne Convention Bureau" shortcode_id="1778594450027" site_id="20265424" url="https://roar-assets-auto.rbl.ms/runner%2FBIDs_WhyMelbourne_and_MCB.mp4" videocontrols="true" width="100%"> <source src="https://roar-assets-auto.rbl.ms/runner%2FBIDs_WhyMelbourne_and_MCB.mp4" type="video/mp4"/> Your browser does not support the video tag. 
</video> <small class="image-media media-caption" placeholder="Add Photo Caption...">Why host a conference in Melbourne, Australia.</small><small class="image-media media-photo-credit" placeholder="add photo credit...">Melbourne Convention Bureau</small> </p><p>This ecosystem is underpinned by Melbourne’s highly accessible city center, where world-class venues, research institutions and industry hubs are located in close proximity. Free public transport and a compact city footprint enable seamless movement from conference floor to real-world application.</p><p><a href="https://www.melbournecb.com.au/" target="_blank">Melbourne Convention Bureau (MCB)</a> is a not-for-profit state government agency with more than 60 years’ experience that provides IEEE and its members with free support to bring international conferences to Melbourne, Australia. MCB’s support ranges from early-stage exploration and international bidding to securing government funding, connecting organizers with venues, accommodation and event suppliers, and providing destination support for conference planning and delivery. Organizations considering a conference in Australia are encouraged to connect with MCB’s dedicated team, which supports IEEE conferences in Melbourne. 
Enquiries can be directed to info@melbournecb.com.au.</p>]]></description><pubDate>Mon, 18 May 2026 10:00:01 +0000</pubDate><guid>https://spectrum.ieee.org/melbourne-ai-data-center-innovation</guid><category>Australia</category><category>Artificial-intelligence</category><category>Research-centers</category><category>Applied-ai</category><category>Conferences</category><dc:creator>Melbourne Convention Bureau</dc:creator><media:content medium="image" type="image/jpeg" url="https://spectrum.ieee.org/media-library/blue-lit-server-room-featuring-the-large-monash-maveric-supercomputer-installation.jpg?id=66718014&amp;width=980"></media:content></item><item><title>Accelerating Chipmaking Innovation for the Energy-Efficient AI Era</title><link>https://spectrum.ieee.org/applied-materials-epic-center</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/modern-glass-office-complex-labeled-epic-center-with-trees-and-walkways-outside.jpg?id=66659351&width=1200&height=400&coordinates=0%2C92%2C0%2C92"/><br/><br/><p><em>This sponsored article is brought to you by <a href="https://www.appliedmaterials.com/us/en.html" target="_blank">Applied Materials</a>.</em></p><p>At pivotal moments in history, progress has required more than individual brilliance. The most consequential breakthroughs — such as those achieved under the Human Genome Project — required a new operating paradigm: Concentrate the world’s best talent around a single mission, establish a common platform, share critical infrastructure, and collapse feedback loops. When stakes are high and timelines are compressed, sequential and siloed innovation simply cannot keep pace.</p><p>Today’s AI era is creating an engineering race with similar demands. Every company is pushing to deliver higher-performance AI systems, faster. But performance is no longer defined by compute alone. AI workloads are increasingly dominated by the movement of data: In many cases, moving bits consumes as much — or more — energy than compute itself. 
As a result, reducing energy per bit can extend system‑level performance alongside gains in peak compute.</p><p><span>The path to energy‑efficient AI therefore runs through system‑level engineering, spanning three tightly interconnected domains:</span></p><ul><li><strong>Logic</strong>, where performance per watt depends on efficient transistor switching, low‑loss power, and signal delivery through dense wiring stacks.</li><li><strong>Memory</strong>, where surging bandwidth and capacity demands expose the memory wall, with processor capability advancing faster than memory access.</li><li><strong>Advanced packaging</strong>, where 3D integration, chiplet architectures, and high‑density interconnects bring compute and memory closer together — enabling system designs monolithic scaling can no longer sustain.</li></ul><p>These domains can no longer be optimized independently. Gains in logic efficiency stall without sufficient memory bandwidth. Advances in memory bandwidth fall short if packaging cannot deliver proximity within thermal and mechanical constraints. Packaging, in turn, is constrained by the precision of both front‑end device fabrication and back‑end integration processes.</p><p>In the angstrom era, the hardest problems arise at the boundaries — between compute and memory in the package, front‑end and back‑end integration, and the tightly coupled process steps needed for precise 3D fabrication. And it is precisely this boundary‑driven complexity where the traditional innovation model breaks down.</p><h2>The Traditional R&D Workflow Is Too Slow for Angstrom‑Era AI</h2><p>For decades, the semiconductor industry’s R&D model has resembled a relay race. Capabilities are developed in one part of the ecosystem, handed off downstream through integration and manufacturing, evaluated by chip and system designers, and only then fed back for the next iteration. 
That model worked when progress was dominated by relatively modular steps that could be scaled independently and simply dropped into the manufacturing flow.</p><p>But the AI timeline has upended these rules. At angstrom‑scale dimensions, the physics enforces inescapable coupling across the entire stack: materials choices shape integration schemes; integration defines design rules; design rules dictate power delivery; wiring sets thermal budgets; and thermals ultimately constrain packaging scaling. System architects simply cannot wait 10–15 years for each major semiconductor technology inflection to mature.</p><p class="pull-quote">Representing a roughly $5 billion investment, EPIC is the largest commitment to advanced semiconductor equipment R&D in U.S. history.</p><p>A long‑term perspective is essential to align materials innovation with emerging device architectures — and to develop the tools and processes required to integrate both with manufacturable precision. At <a href="https://www.appliedmaterials.com/" target="_blank">Applied Materials</a>, together with our customers, we are charting a course across the next 3–4 generations, extending as far as 10 years down the roadmap.</p><p>The angstrom era demands that we break down silos and bring together the industry’s best minds — from leading companies to leading academic institutions. If the problem is coupled, the solution must be coupled. If the timeline is compressed, the learning loop must be compressed. It’s not enough to just innovate — we must innovate <em>how </em>we innovate.</p><h2>EPIC: A Center and Platform for High‑Velocity Co‑Innovation</h2><p>This is the challenge that Applied Materials EPIC Center is designed to solve.</p><p>Representing a roughly US $5 billion investment, EPIC is the largest commitment to advanced semiconductor equipment R&D in U.S. history. 
When it opens in 2026, it will deliver state‑of‑the‑art cleanroom capabilities built from the ground up to shorten the path from early‑stage research to full‑scale manufacturing. But the facilities are only one component of the model. EPIC is also a platform, an operating system for high-velocity co‑innovation that revolutionizes how ideas move from the lab to the fab.</p><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="Diagram comparing traditional and EPIC chip innovation timelines showing 2x faster path" class="rm-shortcode" data-rm-shortcode-id="96015591a65db61b8276debbf07572cd" data-rm-shortcode-name="rebelmouse-image" id="65b06" loading="lazy" src="https://spectrum.ieee.org/media-library/diagram-comparing-traditional-and-epic-chip-innovation-timelines-showing-2x-faster-path.png?id=66661836&width=980"/> <small class="image-media media-caption" placeholder="Add Photo Caption...">EPIC is a platform, an operating system for high-velocity co‑innovation that revolutionizes how ideas move from the lab to the fab.</small><small class="image-media media-photo-credit" placeholder="Add Photo Credit...">Applied Materials</small></p><p><span>The EPIC model compresses the traditional workflow. Customer engineers work side‑by‑side with Applied technologists from day one — moving beyond isolated process optimization and downstream handoffs. Within a shared, secure environment, EPIC tightly integrates atomistic modeling, test vehicles, process development, validation, and metrology feedback. 
Constraints that once surfaced late in development are identified and addressed early.</span></p><p>The result is a potentially 2x faster path that benefits the entire ecosystem under one roof:</p><ul><li><strong>Chipmakers</strong> gain earlier access to Applied’s R&D portfolio, faster learning cycles, and accelerated transfer of next‑generation technologies into high‑volume manufacturing.</li><li><strong>Ecosystem partners</strong> gain earlier access to advanced manufacturing technology and collaboration opportunities that expand what is possible through materials innovation.</li><li><strong>Academic institutions</strong> gain opportunities to strengthen the lab‑to‑fab pipeline and help develop future semiconductor talent.</li></ul><p>Building on decades of co‑development, we are reinventing the innovation pipeline with our partners across logic, memory, and advanced packaging to deliver the next leap in energy‑efficient AI.</p><h2>Accelerating Advanced Logic</h2><p>Logic remains the engine of AI compute. In the angstrom era, however, system‑level gains are increasingly constrained by power and energy. 
Extending AI performance now depends on architectures that deliver more performance per watt — accelerating the move to 3D devices such as gate‑all‑around (GAA) transistors, which boost density within a compact footprint while preserving power efficiency.</p><div class="ieee-sidebar-large"><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="Evolution from FinFET to GAA, backside power, isolated GAA, and CFET transistors" class="rm-shortcode" data-rm-shortcode-id="d66597919442799fa477cfc8aafcaa01" data-rm-shortcode-name="rebelmouse-image" id="dd920" loading="lazy" src="https://spectrum.ieee.org/media-library/evolution-from-finfet-to-gaa-backside-power-isolated-gaa-and-cfet-transistors.jpg?id=66659734&width=980"/> <small class="image-media media-caption" placeholder="Add Photo Caption...">Architectures that deliver more performance per watt are accelerating the move to 3D devices such as gate‑all‑around (GAA) transistors, and further out, complementary FETs (CFETs), which push density scaling even more.</small><small class="image-media media-photo-credit" placeholder="Add Photo Credit...">Applied Materials</small></p></div><p><span>These architectural shifts are unfolding at unprecedented scale, with the logic roadmap already extending beyond first‑generation GAA toward more advanced designs. One key example is GAA with backside power delivery, which relocates thick power lines to the backside of the wafer, reducing resistive losses and freeing front‑side routing for tighter logic cell integration. Another example brings adjacent GAA PMOS and NMOS transistors closer together while inserting a dielectric isolation wall between them to minimize electrical interference. 
Further out, complementary FETs (CFETs) push density scaling even more by stacking PMOS and NMOS devices directly atop one another.</span></p><p>While these architectures deliver compelling gains in performance per watt and logic density without relying solely on tighter lithography, they significantly raise integration complexity. Manufacturing a single GAA device today can involve more than 2,000 tightly interdependent process steps. At the same time, wiring stacks continue to grow taller and denser to connect these advanced logic devices. Modern leading‑edge GPUs now in development pack more than 300 billion transistors into an area little larger than a postage stamp, interconnected by over 2,000 miles of wiring.</p><div class="ieee-sidebar-large"><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="Diagram of advanced AI chip showing layered wiring and 3D stack of copper interconnects." class="rm-shortcode" data-rm-shortcode-id="0ac1f5771ed9d3d6daa81708a2feba6d" data-rm-shortcode-name="rebelmouse-image" id="5adf6" loading="lazy" src="https://spectrum.ieee.org/media-library/diagram-of-advanced-ai-chip-showing-layered-wiring-and-3d-stack-of-copper-interconnects.jpg?id=66659736&width=980"/> <small class="image-media media-caption" placeholder="Add Photo Caption...">Modern leading‑edge GPUs now in development pack more than 300 billion transistors into an area little larger than a postage stamp, interconnected by over 2,000 miles of wiring.</small><small class="image-media media-photo-credit" placeholder="Add Photo Credit...">Applied Materials</small></p></div><p><span>At this level of complexity, the process steps used to create these precise 3D devices and wiring stacks cannot be optimized independently. Design and process must evolve in lockstep, and materials innovation and fabrication methods must advance alongside device architecture. 
EPIC’s co‑innovation model is designed to accelerate exactly this convergence — enabling logic compute to continue advancing the frontiers of AI at the pace the roadmap demands.</span></p><h2>Powering the Memory Roadmap</h2><p>At the same time, the AI computing era is fundamentally reshaping how data is generated, moved, and processed — making memory technologies, especially DRAM, central to delivering the energy‑efficient performance AI systems require. As models grow larger and more data‑hungry, the DRAM roadmap is shifting toward architectures that deliver higher density, greater bandwidth, and faster access per watt.</p><div class="ieee-sidebar-large"><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="Diagram of DRAM cell scaling from 8F² to stacked 3D DRAM architecture." class="rm-shortcode" data-rm-shortcode-id="4a15a67c9e3fc19ccc59866774ef7f6c" data-rm-shortcode-name="rebelmouse-image" id="107e7" loading="lazy" src="https://spectrum.ieee.org/media-library/diagram-of-dram-cell-scaling-from-8f-u00b2-to-stacked-3d-dram-architecture.jpg?id=66659766&width=980"/> <small class="image-media media-caption" placeholder="Add Photo Caption...">At the DRAM cell level, AI performance requirements are driving a transition from 6F² buried‑channel array transistors (BCAT) to more compact 4F², and beyond that, architectures that move past what 2D scaling alone can deliver. </small><small class="image-media media-photo-credit" placeholder="Add Photo Credit...">Applied Materials</small></p></div><p>At the DRAM cell level, this shift is driving a transition from 6F² buried‑channel array transistors (BCAT) to more compact 4F² architectures, which orient the transistor vertically to boost density and reduce chip area. Looking beyond 4F², sustaining gains in performance per watt will require moving past what 2D scaling alone can deliver. 
The industry is therefore turning to 3D DRAM, stacking memory cells vertically to add capacity within a constrained footprint. As these structures grow taller and aspect ratios intensify, high-mobility materials engineering in three dimensions becomes increasingly critical to performance and reliability.</p><p>Beyond the memory cell array, another powerful lever for DRAM scaling is shrinking the peripheral circuitry, which includes logic transistors and interconnect wiring. One emerging approach places select periphery functions beneath the DRAM array by bonding two wafers — one optimized for the DRAM cells and the other for CMOS logic — using multiple wiring layers.</p><div class="ieee-sidebar-large"><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="Diagram of transistor and interconnect technology progressing to FinFET and advanced Cu links" class="rm-shortcode" data-rm-shortcode-id="6c6c6ebbda58b4b241b326cf5f2514b5" data-rm-shortcode-name="rebelmouse-image" id="f2f52" loading="lazy" src="https://spectrum.ieee.org/media-library/diagram-of-transistor-and-interconnect-technology-progressing-to-finfet-and-advanced-cu-links.jpg?id=66659784&width=980"/> <small class="image-media media-caption" placeholder="Add Photo Caption...">Beyond the memory cell array, another powerful lever for DRAM scaling is shrinking the peripheral circuitry, which includes logic transistors and interconnect wiring.</small><small class="image-media media-photo-credit" placeholder="Add Photo Credit...">Applied Materials</small></p></div><p>In parallel, DRAM performance is being extended by leveraging logic‑proven enhancers in the memory periphery. These include mobility boosters such as embedded silicon germanium and stress films, along with wiring upgrades like improved low‑k dielectrics and advanced copper interconnects. 
Memory manufacturers are also transitioning periphery transistors from planar devices to FinFET architectures, following the logic roadmap to further improve I/O speed. These valuable inflections are central to EPIC’s mission — where they can be co-developed and rapidly validated for next‑generation memory systems.</p><h2>Driving System Scaling With Advanced Packaging</h2><p>As data movement becomes the dominant energy cost in AI systems, advanced packaging has emerged as a critical lever for improving system‑level efficiency—shortening interconnect distances, increasing bandwidth density, and reducing the power required to move data between logic and memory.</p><div class="ieee-sidebar-medium"><p class="shortcode-media shortcode-media-rebelmouse-image rm-float-left rm-resized-container rm-resized-container-25" data-rm-resized-container="25%" style="float: left;"> <img alt="Diagram of AI accelerator with surrounding HBM chips and enlarged stacked HBM memory." class="rm-shortcode" data-rm-shortcode-id="57ca5bd0a4fb3c9caafdd046322814ee" data-rm-shortcode-name="rebelmouse-image" id="8d42b" loading="lazy" src="https://spectrum.ieee.org/media-library/diagram-of-ai-accelerator-with-surrounding-hbm-chips-and-enlarged-stacked-hbm-memory.jpg?id=66659903&width=980"/> <small class="image-media media-caption" placeholder="Add Photo Caption...">The rise of 3D packages such as high‑bandwidth memory (HBM) underscores why advanced packaging is becoming central to the AI era.</small><small class="image-media media-photo-credit" placeholder="Add Photo Credit...">Applied Materials</small></p></div><p>High‑bandwidth memory (HBM) marks a major inflection along this path. By stacking DRAM dies — scaling to 16 layers and beyond — and placing memory much closer to the processor, HBM enables rapid access to ever‑larger working datasets. 
This delivers step‑function gains in both bandwidth and energy efficiency.</p><p>More broadly, the rise of 3D packages such as HBM underscores why advanced packaging is becoming central to the AI era. Packaging now addresses system‑level constraints that logic and memory device scaling alone can no longer overcome. It also enables a move away from monolithic systems‑on‑chip toward chiplet‑based architectures, as AI workloads increasingly demand flexible designs that combine logic, memory, and specialized accelerators optimized for specific tasks.</p><p>A vital technology powering this roadmap is hybrid bonding. With interconnect pitches approaching those of on‑chip wiring, conventional bumps and microbumps run into fundamental limits in density, power, and signal integrity. Hybrid bonding removes these barriers by allowing dramatically higher interconnect and I/O density, supporting a broad range of chiplet architectures — from memory stacking to tighter compute‑memory integration.</p><div class="ieee-sidebar-large"><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="Colorful 3D cross-section of a stacked computer chip package with connectors" class="rm-shortcode" data-rm-shortcode-id="803f8a53c6b07244ec4f34b4165fd65e" data-rm-shortcode-name="rebelmouse-image" id="623bc" loading="lazy" src="https://spectrum.ieee.org/media-library/colorful-3d-cross-section-of-a-stacked-computer-chip-package-with-connectors.jpg?id=66659905&width=980"/> <small class="image-media media-caption" placeholder="Add Photo Caption...">EPIC tackles high‑value advanced‑packaging challenges through early, parallel co‑innovation across materials, integration, and manufacturing.</small><small class="image-media media-photo-credit" placeholder="Add Photo Credit...">Applied Materials</small></p></div><p>As bonded structures like HBM stacks grow larger and more complex, warpage control, die placement, stack alignment, and thermal management become first‑order challenges. 
EPIC tackles these and other high‑value advanced‑packaging challenges through early, parallel co‑innovation across materials, integration, and manufacturing.</p><h2>Bringing It All Together</h2><p>Across logic, memory, and advanced packaging, our industry faces an ambitious roadmap that promises significant gains in energy efficiency for AI systems. But realizing that potential demands breakthrough materials innovation at a time when feature sizes are shrinking, interfaces are multiplying, and process interdependencies are escalating. These challenges cannot be solved on 10–15‑year timelines under the traditional relay‑race model. We must break down silos, align earlier across the ecosystem, and parallelize learning to keep pace with AI’s demands.</p><p>In the AI era, progress will be defined by the speed at which lightbulb moments turn into manufacturing and commercialization reality. The only viable path forward is a new innovation model — and EPIC is how we are driving it.</p>]]></description><pubDate>Thu, 14 May 2026 10:00:01 +0000</pubDate><guid>https://spectrum.ieee.org/applied-materials-epic-center</guid><category>Chipmaking</category><category>Artificial-intelligence</category><category>Materials-science</category><category>Semiconductors</category><dc:creator>Prabu Raja</dc:creator><media:content medium="image" type="image/jpeg" url="https://spectrum.ieee.org/media-library/modern-glass-office-complex-labeled-epic-center-with-trees-and-walkways-outside.jpg?id=66659351&amp;width=980"></media:content></item><item><title>Your Next AI Query May Travel Where the Power Is</title><link>https://spectrum.ieee.org/distributed-inference-data-centers</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/illustration-of-a-stylized-ai-search-bar-and-nested-rectangles.jpg?id=66667694&width=1200&height=400&coordinates=0%2C292%2C0%2C292"/><br/><br/><p>The rise of electricity-guzzling data centers has forced the artificial intelligence industry to get creative about finding power. One of the latest ideas: Build micro data centers next to utility substations and operate them in concert, shifting the computation around based on power availability.</p><p>That’s the approach <a href="https://www.nvidia.com/en-us/" rel="noopener noreferrer" target="_blank">Nvidia</a> and its collaborators are taking in a new pilot project they plan to build later this year. They’ll construct about 25 of these small data centers, each ranging from 5 to 20 megawatts, across five utilities in the United States. If one substation is overloaded with power demand, or if there’s an outage, the compute will be shifted to a different data center near a substation that has spare capacity. </p><p>To develop the fleet, Nvidia is partnering with data center builder <a href="https://infrapartners.llc/" rel="noopener noreferrer" target="_blank">InfraPartners</a>, real estate service provider <a href="https://www.prologis.com/" rel="noopener noreferrer" target="_blank">Prologis</a>, and the nonprofit <a href="https://www.epri.com/" rel="noopener noreferrer" target="_blank">EPRI</a> (formerly known as the Electric Power Research Institute).</p><p>The project aims to demonstrate a new way for data centers to be more flexible and accommodating of electricity availability. 
It’s also a way for data center developers to quickly secure power from the grid—an increasingly precious commodity, even in small chunks.</p><p>“We started looking at how much [unused] power is available at individual substations, and what we found was that on average, like 5 MW is nominally available…max 20 MW,” says <a href="https://www.linkedin.com/in/bensooter/" rel="noopener noreferrer" target="_blank">Ben Sooter</a>, director of Agentic AI Initiatives and Distributed AI Architecture at EPRI.</p><p>That’s too small to interest most data center operators, but building several at that size and operating them as if they’re one larger one is useful, Sooter says. Plus, shifting compute away from overburdened substations to those with more headroom can double the overall available power, he says.</p><p>“There are 55,000 substations in the U.S., and if they each have 5, 10, or 20 MW of spare capacity, that number adds up pretty fast,” adds <a href="https://www.linkedin.com/in/spieler/" rel="noopener noreferrer" target="_blank">Marc Spieler</a>, senior director of energy at Nvidia.</p><h2>Building energy flexibility into data centers</h2><p>Squeezing every spare megawatt out of the grid will become increasingly important as data center construction continues to ramp up. In the United States, where <a href="https://spectrum.ieee.org/data-center-growth" target="_self">half of all new data centers are being built</a>, data centers could consume <a href="https://powering-intelligence.epri.com/" rel="noopener noreferrer" target="_blank">9 to 17 percent of electricity generation by 2030</a>. That’s more than double the current use, according to EPRI’s estimates. Facilities that train AI models are being built at the <a href="https://spectrum.ieee.org/5gw-data-center" target="_self">gigawatt scale</a>, drawing about the same amount of power as a midsize U.S. 
city.</p><p>As grid operators figure out how to accommodate such massive new loads, data center developers sometimes end up waiting up to a decade to get approved for a grid connection. In response, the developers are making incredibly bold decisions around power—moves that would have been unthinkable just two years ago.</p><p>Many are <a href="https://spectrum.ieee.org/5gw-data-center" target="_self">building their own gas power plants on site</a>. Some are offering to pay for the cost of new transmission lines and other grid infrastructure. And a few are even <a href="https://spectrum.ieee.org/nuclear-powered-data-center" target="_self">investing in startup companies</a> that are developing fusion and next-generation nuclear fission reactors, in the hope of meeting power needs a decade from now.</p><p>But there’s a lot more power available on the grid than is used day to day. <a href="https://nicholasinstitute.duke.edu/sites/default/files/publications/rethinking-load-growth.pdf" rel="noopener noreferrer" target="_blank">U.S. grid operators use only about 53 percent</a> of their generation capacity on average, according to a landmark 2025 report from Duke University’s Nicholas Institute for Energy, Environment and Sustainability.</p><p>That’s because the U.S. electricity supply was built to meet peak demand—periods of the highest energy use of the year, such as the hottest days of the summer. Those peak loads can be almost double the load on a mild-temperature day and typically occur for less than 200 hours a year. The rest of the time, whole power plants sit idle.</p><p>If AI data centers can find a way to reduce or shift power consumption during these periods of peak demand, the extraordinary measure of building on-site power generation may not always be necessary. U.S. 
grids could provide an additional 76 GW—about 10 percent of peak demand—if large loads like data centers curtailed their power use just 0.25 percent of the time, according to the Nicholas Institute report.</p><p>Energy flexibility could also allow data centers to connect to the grid faster because they wouldn’t have to wait for new power plants to be built. And placing small data centers right next to substations reduces the need for new grid infrastructure, such as power lines and poles, and upgraded transformers and switch gear. As a bonus, these substations already have fiber-optic lines for high-speed internet, Nvidia’s Spieler points out. So the small data center can connect to those existing lines. </p><h2>The inference advantage</h2><p>The type of flexibility data centers can offer depends, in part, on the workload. The two main types of workload are AI training (the process of developing, say, a large language model or image generation model) and inference (using that model to, say, generate responses to users’ chatbot questions and requests for images).</p><p>Training requires huge data centers with tightly interconnected GPUs. For example, Meta’s <a href="https://huggingface.co/meta-llama/Llama-3.1-405B-Instruct" rel="noopener noreferrer" target="_blank">Llama 3.1 405B</a> model took about two and a half months to train on 16,000 GPUs. During training, adjusting all the model weights at once at each step requires the GPUs to be connected via high-speed links, such as Nvidia’s <a href="https://www.nvidia.com/en-us/data-center/nvlink/" rel="noopener noreferrer" target="_blank">NVLink</a> and <a href="https://www.nvidia.com/en-us/networking/products/infiniband/" rel="noopener noreferrer" target="_blank">InfiniBand</a> interconnects. It wouldn’t be practical to spread out AI training workloads among a fleet of mini data centers. 
On the bright side, because training takes months, it’s possible to pause for short periods of time to curtail energy use during peak demand.</p><p>Inference doesn’t require as many GPUs or as much fancy networking. Instead of a huge corpus of data, a single user’s query is fed into the model, and the model spits out the answer. No backpropagation is involved—that is, no large-scale coordination between different chunks of input data is needed. And so inference is amenable to smaller data centers. However, timing is key. When you ask an image generator for a picture of your face pasted onto a cute cat, you understandably expect to see the result right away. So rather than briefly pausing compute during peak demand, the energy flexibility can come through creatively shifting the workload to a different location.</p><p>“Inference is one of the few workloads that can be dynamically routed,” says <a href="https://www.linkedin.com/in/valerie-crafton-phd-mba-leed-ap-six-sigma-gb-0362b816/" rel="noopener noreferrer" target="_blank">Valerie Crafton</a>, senior vice president of strategy and operations at modular data center company <a href="https://www.mod42llc.com/" rel="noopener noreferrer" target="_blank">Mod42</a>. “Which means that you can align the compute with wherever the power is actually available. That’s one unique piece that’s really driving the push for a lot of these smaller data centers where the power exists.”</p><p>Both Nvidia and EPRI have been on a tear to demonstrate different kinds of data center flexibility. They’re calling their substation-based strategy “distributed inference.” <a href="https://www.epri.com/about/media-resources/press-release/dzagwmfxgarse2g2s9ma4telm5gxqsbt" rel="noopener noreferrer" target="_blank">Announced in February</a>, the project aims to begin construction of the pilot fleet of small data centers by the end of 2026. 
Nvidia and EPRI estimate that compute workloads will need to be moved to a different substation only about 0.1 percent of the time.</p><p><a href="https://spectrum.ieee.org/modular-data-center" target="_self">Going micro in data center size</a> is an idea that’s picking up speed. “We’re in this compute wave currently where everybody’s building these really large data centers—5 gigawatt, mammoth things,” says Sooter. But “there’s a second compute wave coming,” involving much smaller data centers handling inference, he says. Tech companies are “really beating the drum on this because they see demand for inference compute really picking up in 2027,” he says.</p><p><em>This story was updated on 13 May, 2026 to correct the source of the 76-GW figure.</em> </p>]]></description><pubDate>Tue, 12 May 2026 12:00:01 +0000</pubDate><guid>https://spectrum.ieee.org/distributed-inference-data-centers</guid><category>Ai-data-centers</category><category>Nvidia</category><category>Epri</category><category>Power-generation</category><dc:creator>Dina Genkina</dc:creator><media:content medium="image" type="image/jpeg" url="https://spectrum.ieee.org/media-library/illustration-of-a-stylized-ai-search-bar-and-nested-rectangles.jpg?id=66667694&amp;width=980"></media:content></item><item><title>Startup Wants to Run AI Inference From Space</title><link>https://spectrum.ieee.org/orbital-inference-data-center</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/illustration-of-a-satellite-with-a-compact-rectangular-payload-and-two-narrow-solar-panels.jpg?id=66691214&width=1200&height=400&coordinates=0%2C417%2C0%2C417"/><br/><br/><p><span>The rapid advancement of large language models (LLMs) is fueling a global </span><a href="https://spectrum.ieee.org/5gw-data-center" target="_self">data center boom</a><span> and driving a surge in energy demand. But the electricity required to power data centers is straining the grid, pushing infrastructure operators to search for </span><a href="https://spectrum.ieee.org/ai-data-centers" target="_self">alternative</a><span> sources of power. Some are even looking beyond Earth.</span></p><p>One company that’s looking to the stars for energy is <a href="https://orbital.inc/" target="_blank">Orbital</a> Inc. In mid-April, the Los Angeles–based startup emerged from stealth and announced plans to build space data centers. Backed by <a href="https://a16z.com/" target="_blank">Andreessen Horowitz</a> (A16z), Orbital is designing infrastructure for AI inference, where trained models generate outputs. Much like other companies advocating for space-based data centers, Orbital is banking on the “<a href="https://spectrum.ieee.org/orbital-data-centers" target="_self">free</a>” energy generated by the sun to power compute for workloads such as chatbots and agents, sidestepping terrestrial energy constraints.</p><p>“There simply isn’t enough capacity here [on Earth], and the only way is up,” says <a href="https://en.wikipedia.org/wiki/Euwyn_Poon" target="_blank">Euwyn Poon</a>, Orbital’s founder and CEO. “There’s actually abundant solar energy that’s not being harnessed.”</p><p>Orbital’s vision is a mesh constellation of small satellites in low Earth orbit. Each satellite would be equipped with a GPU server rack powered by solar panels roughly the size of a tennis court, plus radiative cooling panels of comparable size. 
The long-term goal is up to 10,000 fridge-sized satellites—each with 100 kilowatts of power—forming a distributed cloud, similar to <a href="https://www.spacex.com/" target="_blank">SpaceX</a>’s proposed <a href="https://www.basenor.com/blogs/news/spacex-ai-sat-mini-bigger-than-starship-100kw-of-power?srsltid=AfmBOoryoatVofhwx9MacY1ex-keRNsrC1FUpkIO8a8a8ep9JyS3mfEJ" target="_blank">AI Sat Mini</a>.</p><p>Orbital’s first test will come in 2027, when it plans to launch a prototype satellite aboard a SpaceX Falcon 9 to validate its GPU operations in orbit and run commercial inference workloads. Another company, <a href="https://www.starcloud.com/" target="_blank">Starcloud</a>, <a href="https://spectrum.ieee.org/nvidia-h100-space#:~:text=%E2%80%9CThe%20H%2D100%20is%20about,flown%20and%20operated%20in%20orbit.%E2%80%9D" target="_self">already ran</a> a similar test last year. Orbital’s differentiator is its plan to match the solution with a problem: Small satellites equipped specifically to run inference workloads could benefit from lower launch costs. However, the company faces the same difficulties as other space data center hopefuls. Every watt of “free” energy must be dissipated as heat via large <a href="https://www.eetimes.com/the-hidden-physics-of-running-data-centers-in-orbit/" rel="noopener noreferrer" target="_blank">radiative coolers</a>; radiation in low Earth orbit <a href="https://thebreakthrough.org/issues/energy/data-centers-wont-be-in-space-anytime-soon" rel="noopener noreferrer" target="_blank">degrades</a> compute equipment; and regular maintenance in space is difficult and costly. </p><h2>Orbital’s inference focus</h2><p>Poon says Orbital’s focus on a distributed network of smaller satellites designed to run inference workloads across independent GPU nodes, rather than large, tightly-coupled systems, makes the execution more feasible. </p><p>That idea shapes Orbital’s design. 
Training large AI models typically relies on tightly-coupled GPU clusters optimized for massive compute throughput. Inference workloads, by contrast, are generally less compute-intensive per request and can often run on smaller numbers of GPUs, making them easier to distribute across systems. Capping each satellite at roughly 100 kilowatts, Poon says, greatly simplifies the design. “It’s very simple,” Poon says, referring to the concept behind the satellites’ engineering. “Engineers would appreciate this.” </p><p>In Orbital’s design, a user request—like, say, asking ChatGPT to analyze a data set—is routed from a data center on Earth to a ground station, a terrestrial relay that connects satellites to the internet, then transmits the request to a satellite. Satellites communicate through optical interlinks, which use lasers to pass data between nodes. That routes the request to an available GPU, which processes the user’s query and generates the output before sending the result back through the network to the user. These links rely on ground stations that only communicate with satellites when they pass within range.</p><p>If the satellites are proven to work, Orbital is set on tapping “big model labs” as customers, including firms like OpenAI and Anthropic that run massive inference workloads. Orbital plans to serve them through direct API access for buying tokens and enterprise deals that shift inference demand into its network in space.</p><h3>Engineering challenges</h3><p>Poon recognizes that running data centers in space introduces major technical hurdles. </p><p>Radiation can strike GPUs and cause bit flips or other errors. Thermal management is also difficult. Without air, systems must rely on radiating heat into space rather than conventional cooling. Maintenance is another constraint, as satellites cannot be easily repaired or replaced if they malfunction in space. It’s why Poon says the test launch will be critical to identify and troubleshoot these issues. 
“Part of the mission is to figure out the unknowns,” he says.</p><p>Dr. Amit Verma, an electrical engineering professor at Texas A&M University–Kingsville, who researches semiconductor device modeling, raised similar concerns. Deploying thousands of satellites, Dr. Verma says, increases failure risk with limited repair options. He added that operational feasibility depends on the applications performed on the satellites. While some workloads like chatbots or algorithmic recommendations can tolerate added delays (data traveling to low Earth orbit takes tens of milliseconds to return), others like real-time stock trading cannot.</p><p>“Outer space data centers that involve heavy use of AI-related processing certainly do need to overcome power and deployment and reliability issues to be meaningful,” Verma says.</p><p>Orbital plans to test extensively before launch. Poon says his company is exploring radiation hardening for GPUs and ammonia-based liquid cooling loops to transfer heat to external radiators. Reducing system weight is also top of mind to lower launch costs. </p><p>Even with these mitigations, the timeline is ambitious. In a <a href="https://andercot.substack.com/p/do-orbital-data-centers-make-sense" rel="noopener noreferrer" target="_blank">Substack post</a> on space data centers, Andrew Côté, an engineering physicist, predicts that space data centers won’t be operational for at least another 10 to 20 years. Orbital, however, expects to finalize the satellite designs by 2026, launch in 2027, and build a manufacturing facility in Los Angeles by 2028. </p><p>With the engineering challenges complex and the costs of launch high, the ability of Orbital’s satellite systems to operate reliably at scale remains an open question.</p><p>Despite those uncertainties, Poon remains laser focused on the long-term opportunity. 
</p><p>“I trust that our engineering efforts can start making progress towards solving these problems,” he says.</p>]]></description><pubDate>Sun, 10 May 2026 13:00:01 +0000</pubDate><guid>https://spectrum.ieee.org/orbital-inference-data-center</guid><category>Data-center</category><category>Space</category><category>Ai</category><category>Inferencing</category><dc:creator>Aaron Mok</dc:creator><media:content medium="image" type="image/jpeg" url="https://spectrum.ieee.org/media-library/illustration-of-a-satellite-with-a-compact-rectangular-payload-and-two-narrow-solar-panels.jpg?id=66691214&amp;width=980"></media:content></item><item><title>Learn What It Takes to Become a Cybersecurity Consultant</title><link>https://spectrum.ieee.org/ieee-guide-cybersecurity-consultant</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/a-young-south-asian-woman-explaining-detailed-computer-code-to-a-colleague-using-an-office-presentation-screen.jpg?id=66689798&width=1200&height=400&coordinates=0%2C730%2C0%2C730"/><br/><br/><p>Cybersecurity consultants have never been more in demand. Information security analyst roles are projected to grow <a href="https://www.bls.gov/ooh/computer-and-information-technology/information-security-analysts.htm" rel="noopener noreferrer" target="_blank">nearly 30 percent between now and 2034</a>, according to the U.S. <a href="https://www.bls.gov/" rel="noopener noreferrer" target="_blank">Bureau of Labor Statistics</a>. More than <a href="https://www.statista.com/forecasts/1485031/cyberattacks-annual-worldwide/" rel="noopener noreferrer" target="_blank">15 million cybercrime incidents</a> occurred worldwide in 2024, <a href="https://www.statista.com/" rel="noopener noreferrer" target="_blank">Statista</a> reported.</p><p>Data breaches are costly and pose direct safety risks. Statista reported that more than <a href="https://www.statista.com/study/203640/cybercrime-worldwide/" rel="noopener noreferrer" target="_blank">US $10 trillion is spent annually repairing the damage</a> caused by cybercrime, <a href="https://www.statista.com/statistics/184083/commonly-reported-types-of-cyber-crime-us/" rel="noopener noreferrer" target="_blank">most commonly</a> phishing, spoofing, extortion, and data breaches. 
In one example in the United States, <a href="https://spectrum.ieee.org/connected-vehicle-risks" target="_self">breathalyzer devices</a> installed in vehicles became disabled, leaving hundreds of drivers stranded, as detailed in an <a href="https://spectrum.ieee.org/" target="_self"><em><em>IEEE Spectrum</em></em> article</a>.</p><p>To help you acquire the skills you need to distinguish yourself from other cybersecurity job candidates, the <a href="https://www.computer.org/" rel="noopener noreferrer" target="_blank">IEEE Computer Society</a> offers a “<a href="https://join.computer.org/become-a-cybersecurity-consultant/?Campaign_ID=103" rel="noopener noreferrer" target="_blank">What Makes a Great Cybersecurity Consultant</a>” guide. The 23-page PDF includes hard and soft skills you need, a list of certifications to pursue, and key IEEE cybersecurity conferences for staying updated on developments in the field.</p><p>The guide includes advice from two cybersecurity experts. <a href="https://www.linkedin.com/in/nullsession/" rel="noopener noreferrer" target="_blank">John D. Johnson</a>, an IEEE senior member, is the founder and CEO of <a href="https://www.linkedin.com/company/aligned-security/" rel="noopener noreferrer" target="_blank">Aligned Security</a> in Bettendorf, Iowa. <a href="https://webdiis.unizar.es/~ricardo/" rel="noopener noreferrer" target="_blank">Ricardo J. Rodriguez</a> is an associate professor of computer science and systems engineering at the <a href="https://www.unizar.es/" rel="noopener noreferrer" target="_blank">Universidad de Zaragoza</a>, in Spain, who researches digital forensics and other cybersecurity topics.</p><p>“Technology, remote work, and a shortage of skilled workers make this the ideal time to consider becoming a cybersecurity consultant,” Johnson says in the guide. 
“Consulting can give you the flexibility, variety, and control over where you want your career to go.”</p><h2>Hard and soft skills</h2><p>At a minimum, cybersecurity professionals should have a general understanding of IT including operating systems, communication protocols, network architecture, and <a href="https://spectrum.ieee.org/top-programming-languages-2025" target="_self">programming languages such as C++, Java, and Python</a>. They also should be well-versed in security auditing, firewall management, penetration testing, and encryption technologies.</p><p>The principles of ethical hacking and coding would be handy as well.</p><p>“To be able to defend a system well, you first have to know how to attack it,” Rodriguez says.</p><p>The guide explains that there are now more technologies available to help cybersecurity consultants monitor threats and protect systems. They include <a href="https://www.ibm.com/think/topics/security-orchestration-automation-response" rel="noopener noreferrer" target="_blank">security orchestration, automation, and response</a> (SOAR) platforms, which automate workflows to collect security data, streamline incident response, and automate repetitive tasks.</p><p>Rodriguez points to advances in <a href="https://spectrum.ieee.org/the-fight-over-encrypted-dns-boils-over" target="_self">domain name system security extensions</a> (DNSSEC), which uses digital signatures based on public-key cryptography to strengthen the authentication of the <a href="https://spectrum.ieee.org/fresh-phish" target="_self">domain name system</a>. 
By validating data authenticity, DNSSEC safeguards against attacks such as DNS spoofing and guarantees that users connect to the correct IP address.</p><p>Technologies such as <a href="https://spectrum.ieee.org/topic/artificial-intelligence/" target="_self">artificial intelligence</a>, <a href="https://spectrum.ieee.org/tag/blockchain" target="_self">blockchain</a>, and <a href="https://spectrum.ieee.org/quantum-safe-crypto" target="_self">quantum computing</a> will increasingly be used to help thwart cyberattacks, the guide suggests. AI is expected to enhance the quality of data analysis, Rodriguez says.</p><p>Although hard skills are important, soft skills are just as crucial, according to the guide. Critical thinking, project management, flexibility, teamwork, and organizational and <a href="https://spectrum.ieee.org/5-tips-technical-presentations" target="_self">presentation skills</a> are essential.</p><p>It’s not enough to be good at analyzing security vulnerabilities; you also need to clearly describe the situation and explain possible solutions.</p><p>“Soft skills are important to achieve good team cohesion,” Rodriguez says, “because consultants often lead diverse teams from within their client’s organization.”</p><p>“It’s essential,” Johnson adds, “that you demonstrate to clients you’re a team player and a capable communicator, and that you meet your commitments.”</p><h2>Security certifications</h2><p>Possessing security-specific credentials is a valuable way to demonstrate your expertise to potential clients, according to the guide. Because hundreds of certifications are available, Johnson says, pinpointing the most relevant ones can be challenging. Some people focus on theoretical knowledge, while others want to cover practical applications of technology.</p><p>“Survey the industry and compare it to your skills,” Johnson recommends. 
“Decide what you want to do, and identify where you have gaps in your skills and experience.”</p><p>Here are four of the nine certifications listed in the guide that are frequently cited as being important. All the providers are cybersecurity organizations.</p><ul><li><a href="https://www.isaca.org/credentialing/cism" rel="noopener noreferrer" target="_blank"><strong>Certified information security manager.</strong></a> This globally recognized certification from <a href="https://www.isaca.org/" rel="noopener noreferrer" target="_blank">ISACA</a> is for professionals managing enterprise information security.</li><li><a href="https://www.isc2.org/certifications/ccsp" rel="noopener noreferrer" target="_blank"><strong>Certified cloud security professional.</strong></a> Offered by <a href="https://www.isc2.org/certifications/ccsp" rel="noopener noreferrer" target="_blank">ISC2</a>, this credential validates advanced technical skills in designing, managing, and securing cloud infrastructure.</li><li><a href="https://ethicalhacking.eccouncil.org/certified-ethical-hacker-cehv13-usa?utm_source=ecc_paid&utm_medium=GooglePmax&utm_campaign=ecc-usa_googlepmax_cehv13&utm_source=ecc_paid&utm_medium=GooglePmax&utm_campaign=ecc-usa_googlepmax_cehv13&utm_id=21927959183&gad_source=1&gad_campaignid=22071110617&gbraid=0AAAAAD1MC3Kh3KdmDA1YocxnrPE7TBc3e&gclid=CjwKCAjwnZfPBhAGEiwAzg-VzNdRi99sTWedxsM5rkvIDi0o-8O64x8C5dgJxLuh90A9MEx6B5nObxoC-G8QAvD_BwE" rel="noopener noreferrer" target="_blank"><strong>Certified ethical hacker.</strong></a> This certification from the <a href="https://www.eccouncil.org/" rel="noopener noreferrer" target="_blank">International Council of E-Commerce Consultants (EC-Council)</a> confirms proficiency in using methods commonly employed by malicious hackers to detect vulnerabilities.</li><li><a href="https://www.offsec.com/blog/oscp-vs-oswe/" rel="noopener noreferrer" target="_blank"><strong>Offensive security certified professional.</strong></a> A hands-on, 
24-hour certification exam offered by <a href="https://www.offsec.com/" rel="noopener noreferrer" target="_blank">OffSec</a> covers practical testing skills.</li></ul><p>Additional industry-specific certifications might be required for organizations in finance, government, health care, or manufacturing.</p><p>Sound general knowledge—backed by experience, training, and certification—is an essential foundation for being a specialist, Johnson says.</p><h2>Conferences and networking opportunities</h2><p>Events sponsored by the IEEE Computer Society can help you learn about the latest research and advancements in cybersecurity:</p><ul><li><a href="https://sp2026.ieee-security.org/" rel="noopener noreferrer" target="_blank">IEEE Symposium on Security and Privacy</a><span>, from 18 to 21 May in San Francisco.<br/></span></li><li><a href="https://eurosp2026.ieee-security.org/" target="_blank">IEEE European Symposium on Security and Privacy</a><span>, from 6 to 10 July in Lisbon.<br/></span></li><li><a href="https://www.ieee-csr.org/" target="_blank">IEEE International Conference on Cyber Security and Resilience</a><span>, from 3 to 5 August in Lisbon.<br/></span></li><li><a href="https://secdev.ieee.org/2025/home/" target="_blank">IEEE Secure Development Conference</a><span>, from 14 to 16 October in Indianapolis.</span></li></ul><p>Conferences can give you insight into the field and let you do some networking, but it’s important to network elsewhere as well, experts say. 
Consider joining the <a href="https://www.ieee-security.org/" target="_blank">IEEE Technical Community on Security and Privacy</a>, which connects experts and professionals advancing research in areas such as encryption, operating system security, and data privacy.</p><p>Learning and meeting people keeps your knowledge sharp and can lead to mentorship opportunities with established cybersecurity consultants, Johnson says.</p><h2>Other IEEE resources</h2><p>The IEEE Computer Society’s <a href="https://www.computer.org/resources/cybersecurity#Cybersecurity" target="_blank">cybersecurity resources page</a> offers a wealth of information including fundamentals, possible career paths, and standards development. To keep you updated on trends, the society publishes <a href="https://www.computer.org/csdl/journal/pr" rel="noopener noreferrer" target="_blank"><em><em>IEEE Transactions on Privacy</em></em></a> and the <a href="https://www.computer.org/csdl/magazine/sp" rel="noopener noreferrer" target="_blank"><em><em>IEEE Security and Privacy</em></em></a><em> </em>magazine.</p><p>In addition to the guide, the <a href="https://iln.ieee.org/public/trainingcatalog.aspx" rel="noopener noreferrer" target="_blank">IEEE Learning Network</a> offers <a href="https://iln.ieee.org/public/searchresults?q=&ty=ML.BASE.DV.SearchAnyWords&at=T&cy=&ln=&CTGYLCL_CATEGORY_ID=F45DE82A63AB48B7A3AB4BEBB6F2E293" rel="noopener noreferrer" target="_blank">nearly 30 courses on cybersecurity</a>. 
And you can find research papers in the <a href="https://ieeexplore.ieee.org/search/searchresult.jsp?newsearch=true&queryText=cybersecurity" rel="noopener noreferrer" target="_blank">IEEE Xplore Digital Library</a>.</p>]]></description><pubDate>Wed, 06 May 2026 18:00:01 +0000</pubDate><guid>https://spectrum.ieee.org/ieee-guide-cybersecurity-consultant</guid><category>Ieee-products-and-services</category><category>Cybersecurity</category><category>Ieee-computer-society</category><category>Careers</category><category>Computing</category><category>Career-advice</category><category>Type-ti</category><dc:creator>Kathy Pretz</dc:creator><media:content medium="image" type="image/jpeg" url="https://spectrum.ieee.org/media-library/a-young-south-asian-woman-explaining-detailed-computer-code-to-a-colleague-using-an-office-presentation-screen.jpg?id=66689798&amp;width=980"></media:content></item><item><title>A Bit of Data Center Heat Can Be Turned Back Into Electricity</title><link>https://spectrum.ieee.org/data-center-thermoelectric-generation</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/a-block-of-metal-with-t-shaped-crystals-inside-on-the-left-and-with-holes-where-the-crystals-were-on-the-right.jpg?id=66668556&width=1200&height=400&coordinates=0%2C292%2C0%2C292"/><br/><br/><p>Managing heat in AI data centers is a growing challenge. As hyperscalers cram more and more high-power computing systems into <a href="https://spectrum.ieee.org/5gw-data-center" target="_self">huge facilities</a>, they generate more and more heat. Data center designs are switching from fan-based systems to <a href="https://spectrum.ieee.org/data-center-liquid-cooling" target="_self">liquid ones</a>, which pipe water near electronics to gather up waste heat. That hot water is then cooled, dissipating the heat into the environment.</p><p><a href="https://pyrodelta.com/management/" rel="noopener noreferrer" target="_blank">Michael Abdelmaseh</a> has a different idea: What if some of that waste heat could be captured and converted back into usable electricity? The <a href="https://spectrum.ieee.org/thermoelectric-effect-liquid" target="_self">thermoelectric effect</a>, by which certain materials can convert thermal energy into electrical energy and vice versa, has been known for about 200 years. The company <a href="https://phononic.com/" rel="noopener noreferrer" target="_blank">Phononic</a> uses thermoelectrics to cool data centers with electricity. </p><p>Reversing that process, <a href="https://en.wikipedia.org/wiki/Thermoelectric_generator" rel="noopener noreferrer" target="_blank">thermoelectric generators</a> harvest heat to produce electricity, but they currently aren’t very durable or versatile. Abdelmaseh, founder and head engineer of <a href="https://pyrodelta.com/" rel="noopener noreferrer" target="_blank">PyroDelta Energy</a>, wants to make thermoelectric generators that can easily be integrated into data center liquid cooling systems, engines, and drones. 
This will not replace traditional cooling methods, since thermoelectric materials are currently limited in efficiency, but it could introduce some heat reuse. PyroDelta is a subsidiary of Vancouver-based First Tellurium.</p><p>The main thermoelectric material in commercial use today is bismuth telluride. The material is grown in large crystals—which is necessary, because the quality of the crystal contributes to its ultimate thermoelectric performance—and then sawed into smaller pieces that can be soldered together to make devices. Slicing and dicing the crystals generates waste material, which increases costs. And bismuth telluride cylinders can only be hewn into a limited number of shapes, typically tiny cubes, says Abdelmaseh. The crystals themselves are also prone to cracking. Another drawback is the assembly of the devices themselves. They are typically soldered together and melt when exposed to very high temperatures.</p><p>Abdelmaseh, previously an engineer at Toyota, wanted to make a more versatile thermoelectric generator. He developed a way to grow bismuth telluride crystals into a variety of shapes. This eliminates process steps and materials waste, he says. Instead of growing large crystals, PyroDelta relies on the capillary effect to draw the raw materials into molds during crystallization. </p><p>“Based on the cavity where the crystal grows, you can decide the final size and shape of the crystal,” he says. Using these methods, it’s possible to make curved shapes, not just cubes. These curved designs can be made in the shape of rings to create a tube-shaped thermoelectric generator that goes around a water pipe in a cooling system, for instance. 
And the materials are less brittle than those made by sawing, which improves durability.</p><p class="shortcode-media shortcode-media-rebelmouse-image rm-float-left rm-resized-container rm-resized-container-25" data-rm-resized-container="25%" style="float: left;"> <img alt="A metal plate with a pattern of rectangular holes filled with thermoelectric semiconductor material. " class="rm-shortcode" data-rm-shortcode-id="5503444a76420a11ca52da333b1b575c" data-rm-shortcode-name="rebelmouse-image" id="f58e7" loading="lazy" src="https://spectrum.ieee.org/media-library/a-metal-plate-with-a-pattern-of-rectangular-holes-filled-with-thermoelectric-semiconductor-material.jpg?id=66668570&width=980"/> <small class="image-media media-caption" placeholder="Add Photo Caption...">Thermoelectric generators that can convert some of the waste heat in data centers back into electricity. </small><small class="image-media media-photo-credit" placeholder="Add Photo Credit...">Michael Abdelmaseh</small></p><p>Abdelmaseh says this capillary casting method leads to a 60-80 percent reduction in materials waste, and approximately 10 times longer durability. </p><p>The company has developed a <a href="https://firsttellurium.com/pyrodelta-ready-to-test-prototype-thermoelectric-generator-for-ai-data-centers/" target="_blank">prototype</a> energy harvester for data centers. While the electricity generated with the prototype is not nearly enough to run an AI data center, Abdelmaseh says it should be sufficient to power temperature sensors, security cameras, and other sensors within data centers. The company has also developed a prototype thermoelectric car radiator that gathers heat to produce energy to run the electrical systems in gas-powered cars. 
Abdelmaseh says this could improve the efficiency of internal combustion engines by 5 percent.</p><p>The company is also <a href="https://pyrodelta.com/first-tellurium-subsidiary-pyrodelta-energy-competing-in-us-department-of-defense-drone-innovation-contest/" target="_blank">competing</a> in the <a href="https://www.darpa.mil/research/challenges/lift" target="_blank">DARPA Lift Challenge</a> this summer. Competitors are tasked with demonstrating a drone that can lift loads two or more times greater than its own weight. Abdelmaseh says a thermoelectric system helps make their drone more powerful at a lighter weight by scavenging thermal energy.</p>]]></description><pubDate>Sat, 02 May 2026 13:00:01 +0000</pubDate><guid>https://spectrum.ieee.org/data-center-thermoelectric-generation</guid><category>Thermoelectrics</category><category>Ai-data-centers</category><category>Heat-management</category><category>Crystals</category><dc:creator>Katherine Bourzac</dc:creator><media:content medium="image" type="image/jpeg" url="https://spectrum.ieee.org/media-library/a-block-of-metal-with-t-shaped-crystals-inside-on-the-left-and-with-holes-where-the-crystals-were-on-the-right.jpg?id=66668556&amp;width=980"></media:content></item><item><title>AI Processing of Earth Images Can Now Run in Space</title><link>https://spectrum.ieee.org/ai-earth-observation-in-space</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/satellite-image-of-an-airport-with-all-visible-planes-highlighted-by-ai-recognition-boxes.jpg?id=66663058&width=1200&height=400&coordinates=0%2C1041%2C0%2C1042"/><br/><br/><p>AI image processing aboard satellites in space has been a goal of the <a href="https://spectrum.ieee.org/earth-observation-satellites-small-constellations" target="_self">Earth observation</a> industry for years. Now it has finally been achieved. <a href="https://www.planet.com/" rel="noopener noreferrer" target="_blank">Planet Labs</a>, based in California, released an <a href="https://www.businesswire.com/news/home/20260407165913/en/Planet-Successfully-Runs-AI-in-Space" rel="noopener noreferrer" target="_blank">image</a> captured by its Pelican-4 multispectral satellite showing an airport in Alice Springs, Australia. On the tarmac, more than a dozen aircraft are scattered, each highlighted in a neat green box, identified by an AI model running aboard the satellite. </p><p>Planet Labs’ engineers worked for 18 months to accomplish reliable autonomous object classification from space. They hope the technology will put <a href="https://spectrum.ieee.org/commercial-satellite-imagery" target="_self">Earth observation on steroids</a>, enabling autonomous tasking and real-time sharing of insights with users on Earth.</p><p>“The entire remote-sensing industry has been known to put exotic sensors in space,” said <a href="https://www.linkedin.com/in/kiruthikadevaraj/" rel="noopener noreferrer" target="_blank">Kiruthika Devaraj,</a> vice president of engineering at Planet Labs. “We have very good eyes in space looking at everything that’s going on. But then, we collect so much data and have to wait six to 12 hours to get the information out. So, you’re essentially looking at the past.”</p><p>Planet Labs currently operates a constellation of several hundred Dove and SuperDove CubeSats, each only 30 centimeters long. 
These low-cost space cameras scan the entire surface of Earth multiple times a day at a resolution of around 5 meters. The company is also building up a fleet of larger satellites, called Pelicans, which image the planet’s surface in 30-centimeter detail. The fourth of these, <a href="https://investors.planet.com/news/news-details/2025/Planet-Launches-Two-Additional-High-Resolution-Pelican-Satellites/default.aspx" rel="noopener noreferrer" target="_blank">deployed</a> into orbit in 2025, ran the airplane-recognition algorithm. </p><p>All Planet’s satellites combined generate 30 terabytes of data per day—equivalent to 10,000 hours of high-definition video, which gets beamed to the ground for processing and analysis via tens of radio stations scattered all over the world.</p><p>Transferring the downloaded data into the cloud for processing and subsequent AI analysis takes hours, leading to delays, which could mean that a sparked wildfire gets noticed only when it’s too large to quickly contain.</p><p>“Minutes matter in some sectors,” Devaraj said. “And real-time insights really enable us to provide answers to problems as they’re unfolding.”</p><p>The AI image-recognition algorithms developed by Devaraj and her team analyze a single Pelican image comprising 16,000 pixels in half a second, using onboard GPUs. 
The results can be in the hands of users in minutes from the moment the image was taken.</p><p class="shortcode-media shortcode-media-youtube"> <span class="rm-shortcode" data-rm-shortcode-id="65c458adf45bc3f108a5ed4741dec90e" style="display:block;position:relative;padding-top:56.25%;"><iframe frameborder="0" height="auto" lazy-loadable="true" scrolling="no" src="https://www.youtube.com/embed/e8fjkuetzLE?rel=0" style="position:absolute;top:0;left:0;width:100%;height:100%;" width="100%"></iframe></span> <small class="image-media media-photo-credit" placeholder="Add Photo Credit...">Planet Labs</small> </p><p>So far, only the Pelican satellites are fitted with AI-capable processors—the Nvidia Jetson Orin GPU modules frequently used in autonomous drones. But Devaraj says Planet plans to augment the SuperDove constellation with a new type of satellite, called the <a href="https://www.planet.com/pulse/introducing-owl-planet-s-most-advanced-satellite-mission-yet/" rel="noopener noreferrer" target="_blank">Owl</a>. The satellite will provide daily revisits with a higher resolution of up to 1 meter and will also be fitted with Nvidia’s Jetson processors, which are capable of AI detection. 
</p><p>The new fleet would enable the company to begin working on what Devaraj describes as “planetary intelligence.” Working as a single intelligent-satellite network, the Owls would constantly monitor the planet and autonomously flag potential problems directly to the higher-resolution Pelicans to revisit without the need for human intervention.</p><p>“We want to put the brain, all the compute, right next to the sensors,” Devaraj said, “so that the system of satellites we build acts like a biological network that is responding to stimuli in real time.”</p><p>In the future, the company wants to switch to more-powerful Nvidia Jetson Thor processors and eventually run large language models (LLMs) in space.</p><p>“In five or 10 years, when we all get used to just accepting what Gemini and Claude and other LLMs give you, we may train some generic LLM on satellite imagery and just get text answers to what it sees,” said Devaraj. “You could just get a text message on your phone that says, ‘Three minutes ago, I detected this ship without an AIS transmitter, so it’s an illegal ship, and these are the specific coordinates.’ ”</p><p>The Earth-observation industry has been talking about onboard AI processing for almost a decade. But until recently, the technology wasn’t ready to run AI algorithms in space fast enough and reliably enough.</p><p>“We started with the early Nvidia Jetson processors, but until the Orin iteration, they didn’t have enough compute power,” Devaraj said.</p><p>To run onboard AI image analysis in space, the algorithms need to be able to handle unprocessed raw data that hasn’t been smoothed out and corrected, unlike data crunched by AI algorithms on Earth.</p><p>“There’s a lot of satellite-level uncertainties,” said Devaraj. “The satellite’s moving, the satellite’s wobbling, vibrating. 
On the ground, the processing takes hours to correct all of that.”</p><p>It took Planet engineers 18 months to achieve 80 percent detection reliability with the onboard AI model, Devaraj said. The team hopes the next iteration of their algorithm will increase that accuracy to over 95 percent.</p><p>The space-based real-time AI-detection service will only be made available to customers in the next six to nine months.</p><p>Devaraj thinks that when it comes to AI in space, this is only a start. Planet is collaborating with Google on the <a href="https://blog.google/innovation-and-ai/technology/research/google-project-suncatcher/" rel="noopener noreferrer" target="_blank">Suncatcher project</a>, which intends to deploy a vast constellation of data-processing satellites into Earth’s orbit. The project is one of a plethora of recently discussed ventures that envision moving Earth-based data-crunching infrastructure off the planet. Proponents, including tech giants SpaceX and Amazon, believe that in Earth’s orbit, power-hungry computers will be able to run on free solar power and be easily cooled without straining water supplies. But critics question whether large-scale computing infrastructure could ever be launched cheaply enough to compete with technology on Earth.</p><p>Google and Planet plan to fly two prototype satellites in 2027.</p><p><em>This story was updated on 4 May, 2026 to correct the number of Pelican satellites that Planet Labs is planning to launch. 
The original version of this story said 32 satellites, but the company has not committed to a final specific number at this time.</em><br/></p>]]></description><pubDate>Fri, 01 May 2026 14:00:01 +0000</pubDate><guid>https://spectrum.ieee.org/ai-earth-observation-in-space</guid><category>Earth-observation</category><category>Ai</category><category>Computer-vision</category><category>Satellites</category><dc:creator>Tereza Pultarova</dc:creator><media:content medium="image" type="image/jpeg" url="https://spectrum.ieee.org/media-library/satellite-image-of-an-airport-with-all-visible-planes-highlighted-by-ai-recognition-boxes.jpg?id=66663058&amp;width=980"></media:content></item><item><title>The Fog, a New Encrypted Cloud Platform, Rolls In</title><link>https://spectrum.ieee.org/the-fog-cloud-encryption</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/illustration-of-a-workflow-application-floating-on-a-cloud-while-tethered-to-an-encrypted-chip-below.jpg?id=65521062&width=1200&height=400&coordinates=0%2C1042%2C0%2C1042"/><br/><br/><p>Most cloud computing services encrypt data in transit and at rest. But that data still needs to be decrypted before cloud servers or virtual machines can perform any kind of computation on it. This risks exposing data—especially sensitive information such as financial transactions or medical records—during processing. This is where <a href="https://niobium.co/platform" rel="noopener noreferrer" target="_blank">the Fog</a> comes in.</p><p>Launched in early April by chip startup <a href="https://niobium.co/" rel="noopener noreferrer" target="_blank">Niobium</a>, the Fog is an encrypted cloud platform. It follows a client-server architecture, where a person or organization (the client) can encrypt data or workloads locally using their own private keys and deploy the encrypted data or workloads to the Fog (the server) without sharing their keys. These private keys remain with data owners, and only they can decrypt any results from the platform.</p><p>Much as actual fog obscures everything it envelops, so does the encrypted cloud platform named after it. Yet unlike physical fog that eventually lifts, the Fog keeps data opaque at all times—even as computation happens.</p><p>“The data in our cloud will never be exposed—it’s always encrypted,” says <a href="https://www.linkedin.com/in/barrus" rel="noopener noreferrer" target="_blank">John Barrus</a>, vice president of product at Niobium. “It’s a new category of cloud.”</p><h2>Fully Homomorphic Encryption keeps the Fog secure</h2><p>Beneath the Fog lies a cryptographic technique known as <a href="https://spectrum.ieee.org/tag/homomorphic-encryption" target="_self">fully homomorphic encryption</a>, or FHE, which allows for computing on encrypted data without the need to decrypt it. 
But FHE is often slow and requires a lot of computing power and memory. </p><p>Niobium aims to address these bottlenecks using Mistic, its <a href="https://spectrum.ieee.org/tag/fpga" target="_self">FPGA</a> (field-programmable gate array) chip, which is configured to run FHE. For some applications the company is testing, this <a href="https://spectrum.ieee.org/homomorphic-encryption" target="_self">accelerator hardware</a> runs FHE about twice as fast as today’s GPUs, Barrus says.</p><p>To demonstrate the usability of its encrypted cloud platform, Niobium has developed a handful of template applications “that solve typical problems where you might want to hide the data or keep it encrypted, so people can start there and just try it out,” says Barrus. One such template application involves encrypted semantic search, which queries databases or datasets and returns relevant results based on the context or meaning of the search terms rather than keywords that match them. Both the query and the data source are encrypted, helping ensure data privacy.</p><p>“Let’s say you’re a legal firm, and you have sensitive case documents. You encrypt all those documents and store them encrypted in the cloud,” Barrus says. In this scenario, you can ask questions about the documents using encrypted semantic search “and get pointers to those documents back, and then just download and decrypt the documents you need.”</p><h2>Niobium takes FHE from theory to practice</h2><p><a href="https://www.linkedin.com/in/kurt-rohloff/" rel="noopener noreferrer" target="_blank">Kurt Rohloff</a>, cofounder and CTO at <a href="https://dualitytech.com/" rel="noopener noreferrer" target="_blank">Duality Technologies</a>, is excited about the prospect of running his company’s privacy-enhancing software products on the Fog. Duality provides software that uses FHE, including an <a href="https://spectrum.ieee.org/homomorphic-encryption-llm" target="_self">LLM inference framework</a>. 
Without a platform like the Fog, users may need to purchase dedicated FHE acceleration hardware, he says. But “the Niobium encrypted cloud platform allows users to rapidly scale their use of FHE-protected computing [and] get much more value from their data,” he says.</p><p>Echoing the sentiment is <a href="https://www.linkedin.com/in/rashmi-agrawal-9a0601133" rel="noopener noreferrer" target="_blank">Rashmi Agrawal</a>, cofounder and CTO at <a href="https://www.ciphersoniclabs.io/" rel="noopener noreferrer" target="_blank">CipherSonic Labs</a>, a company building FHE-powered encrypted AI infrastructure. “Platforms like Niobium are important because they help move FHE from theory into deployable infrastructure,” she says. “An encrypted cloud platform built on FHE fundamentally changes the trust model of cloud computing. This significantly reduces exposure to data leakage, insider threats, and compliance risks while enabling organizations to safely process highly sensitive data in the cloud.”</p><p>However, Agrawal points out that despite FHE’s rapid progress, there are still practical challenges. These include performance overheads for complex tasks or workloads that need to be completed with low latency, as well as filling in skills gaps for software developers who have no FHE knowledge or experience. “Building FHE-compatible applications often requires rethinking traditional approaches. The ecosystem is still maturing as tooling, standards, and interoperability continue to evolve,” she adds.</p><p>Barrus acknowledges these hurdles. “I think the real challenge is large language models with a lot of matrix and vector multiplications. We have to be fast enough that you’re not waiting minutes for every token but seconds or so. 
That’s going to be much harder to solve,” he says.</p><p>In terms of equipping developers without any FHE background, Niobium hopes to make the Fog more accessible by providing a tech stack composed of a compiler, software development kit, documentation, and other training materials. “If we can bring FHE computation to more people, then more people can develop privacy-preserving applications,” says Barrus.</p><p>The Fog is currently available in private beta, with Niobium targeting May or June for a public launch. The company is also developing an application-specific integrated circuit for its encrypted cloud platform that Barrus says will be up to 25 times as fast as a GPU, depending on the application.</p><p>“What we’re trying to do is create value from encrypted data,” he says. “Our vision is that data never has to be exposed to be useful.”</p><p><em>This article was updated on May 7 to clarify the nature of the Mistic FPGA.</em></p>]]></description><pubDate>Thu, 30 Apr 2026 13:00:01 +0000</pubDate><guid>https://spectrum.ieee.org/the-fog-cloud-encryption</guid><category>Homomorphic-encryption</category><category>Data-privacy</category><category>Cloud-computing</category><category>Hardware-acceleration</category><dc:creator>Rina Diane Caballar</dc:creator><media:content medium="image" type="image/jpeg" url="https://spectrum.ieee.org/media-library/illustration-of-a-workflow-application-floating-on-a-cloud-while-tethered-to-an-encrypted-chip-below.jpg?id=65521062&amp;width=980"></media:content></item><item><title>Power Buffer Protects Grid From Data Centers’ Wild Load Swings</title><link>https://spectrum.ieee.org/data-center-power-fluctuation</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/staff-working-behind-computers-in-the-control-center.jpg?id=66648750&width=1200&height=400&coordinates=0%2C292%2C0%2C292"/><br/><br/><p>As more AI data centers come online, concerns are rising about their effects on the grid, and it’s not just the amount of power they consume. They tend to have huge swings in power use, surging up and down by 70 percent or more in milliseconds. Traditional electricity infrastructure isn’t designed to deal with that kind of load fluctuation. </p><p>To address the problem, researchers are developing power electronics systems that sit between the data center and the grid to act as a buffer and even as a grid helper in times of need. One such system, developed by the Miami-based company <a href="https://www.on.energy/" rel="noopener noreferrer" target="_blank">ON.energy</a>, is being implemented across 3 gigawatts’ worth of projects, and has sailed through a battery of tests at the U.S. National Laboratory of the Rockies (NLR).</p><p>In the tests, ON.energy’s system sat between a simulated data center and a simulated grid. The system successfully protected the data center from grid instability and also safeguarded the grid from the major load swings generated by the data center. The company’s technology involves a bidirectional uninterruptible power supply (UPS) that it calls AI UPS.</p><p>Such grid buffers are becoming increasingly important as AI facilities expand to gigawatt scale and beyond. Utilities have major concerns about both the amount of power demanded by these data centers and their potential to create system instability due to wild variations in loads. Innovations are needed to help data centers become better grid citizens, and shorten the amount of time they must wait to connect to the grid.</p><h2>AI Data Centers and Grid Stability</h2><p>UPS systems have been used for decades to protect data centers from grid events. 
If frequency varies suddenly or power is lost, these unidirectional systems provide almost instantaneous, short-term backup power to the equipment inside the data center. Because servers can’t tolerate more than minor deviations, UPS electronics also clean up low-quality power, such as voltage spikes or sags and frequency deviation.</p><p>UPS has served data centers well. But the scale of modern facilities packed with graphics processing units (GPUs) changes the game. Instead of data centers whose size is measured in tens of megawatts, <a href="https://spectrum.ieee.org/5gw-data-center" target="_self">AI facilities are reaching up to 5 GW</a>. They still require the type of protection afforded by UPS, but their massive scale and load volatility pose dangers to the grid. </p><p>During a minor grid fault in Virginia in 2025, for example, several data centers tripped offline, causing <a href="https://www.datacenterdynamics.com/en/news/virginia-narrowly-avoided-power-cuts-when-60-data-centers-dropped-off-the-grid-at-once/" rel="noopener noreferrer" target="_blank">1.5 GW to drop off the grid</a> simultaneously. This caused panic for the system operator, who had to act fast to balance the system and avoid a major power outage.</p><p>In addition to major changes in overall load, AI data centers can generate short-lived, high-voltage, or high-current disturbances known as grid transients. They may only last microseconds, but they can break down insulation, overheat transformers, cause electrical arcing, start fires, and destabilize an entire grid.</p><p>“The scale of modern data centers could lead to load swings of 1 GW multiple times per minute, which creates frequency variations and oscillations that the grid can’t handle,” says <a href="https://www.linkedin.com/in/ricardodeazevedo/" rel="noopener noreferrer" target="_blank">Ricardo de Azevedo</a>, CTO at ON.energy. </p><p>These problems have given utilities and government authorities pause. 
Some authorities in the United States and parts of Europe are implementing moratoriums on new data centers or instituting rules that place responsibility for grid conditions onto the data center.</p><p><a href="https://www.gtlaw.com/en/insights/2026/3/texas-senate-bill-6-update-what-data-centers-large-load-customers-should-know-about-proposed-interconnection-standards" rel="noopener noreferrer" target="_blank">Texas Senate Bill 6</a>, for example, requires new data centers to pay a share of any new grid infrastructure needed by their facilities. Additional requirements for voltage ride-through—the equipment’s ability to continue operating during power disruptions—are currently being formulated in accordance with this bill. Such rules aim to prevent large data centers from tripping offline suddenly or overwhelming the grid due to severe load variability from AI workloads. </p><p>“One of our customers in Texas that is building a 1-GW campus is now being required by the local grid authority to include voltage ride-through,” Azevedo says. </p><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="Two men monitoring power grid data displayed on a video wall." 
class="rm-shortcode" data-rm-shortcode-id="3d7f1a685653b94a940be5dca341e269" data-rm-shortcode-name="rebelmouse-image" id="f5d69" loading="lazy" src="https://spectrum.ieee.org/media-library/two-men-monitoring-power-grid-data-displayed-on-a-video-wall.jpg?id=66648905&width=980"/> <small class="image-media media-caption" placeholder="Add Photo Caption...">NLR engineers Przemyslaw Koralewicz [left] and Shahil Shah monitor the results of a simulation of ON.energy’s AI UPS in the control center at the NLR Flatirons Campus.</small><small class="image-media media-photo-credit" placeholder="Add Photo Credit...">Agata Bogucka/NLR</small></p><h2>Bidirectional UPS </h2><p>ON.energy’s 3.5-megawatt units consist of a power conversion system (PCS), batteries to store energy and act as an energy reservoir or buffer, another PCS, and a transformer. The batteries can provide up to eight hours of backup power, depending on the size of the data center. ON.energy sources this equipment from established manufacturers and adds its own software and controls.</p><p>The latest PCS units are bidirectional, acting as the interface between the grid, the batteries, and the data center. They convert between the alternating current (AC) from the grid, the direct current (DC) stored in the batteries, and the AC delivered to the data-center load, ensuring power quality and optimal flow when feeding an AI facility. In the other direction, the PCS absorbs and smooths transients caused by sudden load swings in the data center that would otherwise disrupt the grid. </p><p>“The batteries act like a reservoir of energy as well as a shock absorber, should there be any disturbances on the grid or from the data center,” says Azevedo.</p><p>ON.energy’s system is housed outside the data center, rather than inside as most UPS systems are, which frees up space internally for more compute resources. Being outside also allows the system to harness more advanced power electronics fed by medium voltage. 
Traditional UPS, on the other hand, operates on the low voltages needed by data-center computers for safety reasons. </p><p>The company has about 3 GW of these bidirectional AI UPS units either operating or under construction. It expects to commission a system in May for a 1.5-GW AI data center in Texas, according to Azevedo. For such a facility, hundreds of these 3.5-MW units would be required. </p><h2>NLR’s Data Center-Grid Simulator</h2><p>To test its system, ON.energy turned to NLR (formerly known as the National Renewable Energy Laboratory). The facility is likely the <a href="https://www.nlr.gov/news/detail/program/2026/could-a-new-kind-of-power-supply-help-make-data-centers-grid-friendly" target="_blank">only one in the world</a> that can do full-load, bidirectional testing that simulates both grid conditions and variable data-center loads. The facility can test up to 20 MW with voltage levels reaching 13.2 kilovolts. The test setup consisted of a 7-MW grid simulator that replicates disturbances and voltage ride-through events, and a 20-MW load simulator that reproduces real-world demand dynamics such as those created by an AI data center.</p><p>Systems like ON.energy’s could become the norm in the coming years. Pilot projects for similar technologies are ongoing in Ireland. Another <a href="https://spectrum.ieee.org/dcflex-data-center-flexibility" target="_self">project in France</a> coordinated by the Electric Power Research Institute (EPRI) is assessing the capabilities of UPS systems through its DC Flex initiative. Results are expected in the coming weeks. Lower-voltage versions of this type of bidirectional technology are also under development by <a href="https://www.eaton.com/us/en-us/products/backup-power-ups-surge-it-power-distribution/backup-power-ups/dual-purpose-ups-technology.html" target="_blank">Eaton and Microsoft</a>.</p>
<p><br/></p><em><em>This story was updated on 29 April, 2026 to clarify how ON.energy’s power conversion system works.</em></em>]]></description><pubDate>Wed, 29 Apr 2026 14:00:01 +0000</pubDate><guid>https://spectrum.ieee.org/data-center-power-fluctuation</guid><category>Data-center-energy</category><category>Ai-data-centers</category><category>Power-quality</category><category>Power-electronics</category><dc:creator>Drew Robb</dc:creator><media:content medium="image" type="image/jpeg" url="https://spectrum.ieee.org/media-library/staff-working-behind-computers-in-the-control-center.jpg?id=66648750&amp;width=980"></media:content></item><item><title>Why the Ideal Magnet Remains Out of Reach</title><link>https://spectrum.ieee.org/rare-earth-free-magnets</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/photo-of-a-technician-working-on-a-cylindrical-machine-containing-brass-colored-components.jpg?id=66525633&width=1200&height=400&coordinates=0%2C1042%2C0%2C1042"/><br/><br/><p>All over the world, researchers are working on an urgent and surprisingly difficult challenge: creating a cost-effective yet powerful <a href="https://spectrum.ieee.org/best-rare-earth-elements-2025" target="_self">permanent magnet</a> that doesn’t use <a href="https://spectrum.ieee.org/rare-earth-elements-2670490876" target="_self">rare earth elements</a>. Rare earth magnets are essential components of the motors for electric vehicles, heating and cooling systems, robots, tools, and appliances, and they’re also essential for wind turbines, audio speakers, and other systems. A strong magnet that doesn’t use rare earths would be of almost incalculable value, because it would free its users from China’s near-monopoly on rare earth elements and magnets. By circumventing that monopoly, it would almost certainly alter geostrategic calculations and global supply chains in short order.</p><p>Tantalizingly, no physics theories preclude the existence of a powerful and rare-earth-free magnet. And yet, after more than a decade of intensive efforts by many exceptionally bright people, no such magnet has been discovered.</p><p>Now, a small group of researchers in France and the United States has set out to test an intriguing hypothesis—that the problem can be solved with quantum computers. “You need the math of quantum mechanics to solve a problem that lives in the quantum realm,” declares <a href="https://www.linkedin.com/in/theau-peronnin/" rel="noopener noreferrer" target="_blank">Théau Peronnin</a>, CEO of <a href="https://alice-bob.com/" rel="noopener noreferrer" target="_blank">Alice & Bob</a>, a Paris-based quantum computer startup. 
Alice & Bob is collaborating with <a href="https://www.lanl.gov/" rel="noopener noreferrer" target="_blank">Los Alamos National Laboratory</a> and <a href="https://www.gevernova.com/" rel="noopener noreferrer" target="_blank">GE Vernova</a>, with US $3.9 million in funding from the U.S. Department of Energy’s ARPA-E <a href="https://arpa-e.energy.gov/programs-and-initiatives/view-all-programs/qc3" rel="noopener noreferrer" target="_blank">Quantum Computing for Computational Chemistry</a> program.</p><h2><strong>Why Rare Earth Magnets Still Dominate</strong></h2><p><a href="https://nemad.org/" rel="noopener noreferrer" target="_blank">More than 67,000 compounds</a> are known to have some degree of permanent magnetism. None, however, come close to the reigning permanent-magnet champ, <a href="https://spectrum.ieee.org/the-men-who-made-the-magnet-that-made-the-modern-world" target="_self">neodymium iron boron</a> (NdFeB), which dominates high-power applications.</p><p>For more than 15 years, researchers have used conventional high-performance computers to search for new and powerful magnets. But no commercially successful magnets have come out of that work. Even the best conventional computers aren’t powerful enough to simulate the detailed magnetic properties of a hypothetical permanent magnet.</p><p>To understand why, start with the basics. Permanent magnetism arises in certain crystalline materials when the spins of <a href="https://spectrum.ieee.org/tag/electrons" target="_self">electrons</a> of some of the atoms in the crystal are forced to point in the same direction, either “up” or “down.” The more of these aligned spins, the stronger the magnetism. The ideal atoms are ones that have unpaired electrons swarming around the nucleus in what are known as <a href="https://winter.group.shef.ac.uk/orbitron/atomic_orbitals/3d/index.html" rel="noopener noreferrer" target="_blank">3d orbitals</a>. 
Tops are iron, with four unpaired 3d electrons, and <a href="https://spectrum.ieee.org/tag/cobalt" target="_self">cobalt</a>, with three.</p><p>But 3d electrons alone are not enough to make superstrong magnets. As researchers discovered decades ago, magnetic strength can be greatly improved by adding to the crystalline lattice atoms with unpaired electrons in the 4f orbital—notably the rare earth elements <a href="https://spectrum.ieee.org/tag/neodymium" target="_self">neodymium</a>, praseodymium, and dysprosium. These 4f electrons enhance a characteristic of the crystalline lattice called magnetic <a href="https://www.stanfordmagnets.com/what-is-magnetic-anisotropy.html" rel="noopener noreferrer" target="_blank">anisotropy</a>—in effect, they promote adherence of the magnetic moments of the atoms to the desired directions in the crystal lattice. That, in turn, can be exploited to achieve high <a href="https://en.wikipedia.org/wiki/Coercivity" rel="noopener noreferrer" target="_blank">coercivity</a>, the essential property that lets a permanent magnet stay magnetized.</p><p class="pull-quote">“The combinatorial space is just ridiculously large. It’s 2 to the—I don’t know—40th or 50th power. It’s absolutely tremendous.”<br/></p><p>The point is that being able to accurately simulate a hypothetical magnet means not only accounting for all those electron orbitals and spin states but also simulating the <em><em>interaction</em></em> of all those electron orbitals and spin states. And that’s really, really hard.</p><p>“Let’s say you have a chain of atoms, each with a single electron in the 1d orbital,” explains Peronnin. “And then you want to understand: If the spin of this one electron is down, how does it affect its neighbors? Would they be more likely to be up or down? And you need to do so for all the electrons in your chain. And then see if the total system has a tendency to align all its electron spins. 
Or, once you’ve added a bit of thermal noise and an external magnetic field, for example, how much disorder would there be in that chain? And so those are exactly the properties you want to predict.</p><p>“The emergent global properties [such as magnetism] arise from the local behavior of each electron. But each electron’s behavior is highly, highly correlated with how its neighbors behave. And this is what makes the problem extremely difficult, because you cannot treat each of those electrons individually. You need to treat the whole system with all its possible configurations all at once to predict the global properties. And this is where the computing space explodes.</p><p>“You have to consider all the possible superpositions of states of those electrons,” Peronnin continues. “And so here, the combinatorial space is just ridiculously large. It’s 2 to the, I don’t know, 40th or 50th power. It’s absolutely tremendous.”</p><h2>Why Quantum Computers Might Finally Solve This Problem</h2><p>The great potential advantage of quantum computers here is <a href="https://arxiv.org/abs/2405.07222" rel="noopener noreferrer" target="_blank">quantum parallelism</a>, a capability that emerges directly from the qubits that are the heart of a quantum computer. In such a machine, these qubits are entangled with one another. The qubits are also in a state of <a href="https://scienceexchange.caltech.edu/topics/quantum-science-explained/quantum-superposition" rel="noopener noreferrer" target="_blank">superposition</a>, which means that they can embody, in the macro world, certain quantum characteristics of subatomic particles. Namely, they can represent a binary 0 or 1 and also exist in a continuous range of states, each with an associated pair of probabilities—a probability that the bit is 0 and a corresponding probability that it’s 1. 
And the more of these superposed qubits that are entangled, the more states those qubits can represent: A collection of <em><em>n</em></em> entangled qubits can represent 2<em><em><span><sup>n</sup></span></em></em> states simultaneously. The upshot is that with enough qubits, a quantum computer could handle the stupendous computational challenge of accurately simulating a hypothetical magnetic material.</p><p>How many qubits are enough? Peronnin figures things will start getting interesting when he and his colleagues can build a machine that has 100 logical qubits furnished with a proprietary type of error correction that they have pioneered. He figures that will happen around 2030. (IBM and others have already built quantum computers with <a href="https://spectrum.ieee.org/ibm-condor" target="_self">over 1,000 <em><em>physical</em></em> qubits</a>, but these machines did not have the <a href="https://spectrum.ieee.org/ibm-quantum-error-correction-starling" target="_self">error correction</a> that is the defining characteristic of logical qubits, and none of them ever performed useful work.)</p><p class="pull-quote">A strong magnet that doesn’t use rare earths would be of almost incalculable value.</p><p>Magnetics researchers not involved with the ARPA-E effort are mostly supportive of the project, while noting that progress on quantum computers is notoriously difficult to predict. “This is an interesting approach,” says <a href="https://ceps.unh.edu/person/jiadong-zang" target="_blank">Jiadong Zang</a>, a professor of materials science and director of the materials science program at the University of New Hampshire. “You need some extraordinary approach to find some new structures,” he adds. 
Zang is part of a group that has been using a large language model to search the magnetics literature for the purpose of creating a database of experimental magnets, called the <a href="https://pubmed.ncbi.nlm.nih.gov/41136402/" target="_blank">Northeast Materials Database for Magnetic Materials</a>.</p><p>“This might be a task that quantum computers could do well,” agrees <a href="https://www.ameslab.gov/directory/matthew-kramer" target="_blank">Matthew Kramer</a>, Distinguished Scientist at <a href="https://www.ameslab.gov/" target="_blank">Ames National Laboratory</a>, in Iowa. (Kramer is working on a project with the U.S. Department of Energy and Fermilab aimed at improving a certain class of qubits.) He cautions, however, that efforts to use conventional computers to identify new magnet materials have often identified new candidates that could not possibly be built in the real world.</p><h2>Microsoft’s Imaginary Magnets Will Probably Stay That Way</h2><p>A recent and highly <a href="https://www.microsoft.com/en-us/research/blog/mattergen-a-new-paradigm-of-materials-design-with-generative-ai/" target="_blank">ambitious project at Microsoft</a>, for example, resulted in a system called <a href="https://www.nature.com/articles/s41586-025-08628-5" target="_blank">MatterGen</a>, which the researchers used to design a range of magnets with “low supply-chain risk.” However, the researchers simplified the problem greatly by focusing on “high magnetic density” alone, without trying to incorporate any of the many other characteristics needed for a magnet to be useful. Taking into account such characteristics, including high coercivity, chemical stability, and cost effectiveness, is a big reason why the challenge quickly becomes computationally intractable. In the end, the researchers did not fabricate any of the magnets identified; it’s not even clear that they could.</p><p>“They had a lot of unusual structures,” Kramer notes. 
“The real question there is, can any of those actually be synthesized?”</p><p>At GE Vernova, <a href="https://scholar.google.com/citations?user=E2xhYPAAAAAJ&hl=en" target="_blank">senior scientist Jonathan Owens</a> says a likely best outcome would be for quantum computing to become part of a larger experimental system. “Quantum will be a piece of probably a much larger pipeline where you’re using machine learning or traditional methods to kind of guide what quantum calculations you need to run,” Owens says. “You’ll feed that back into your larger workflow and sort of iterate. But you can explore any space because you’re not restricted to only chemistries you know.”</p>]]></description><pubDate>Wed, 29 Apr 2026 12:00:01 +0000</pubDate><guid>https://spectrum.ieee.org/rare-earth-free-magnets</guid><category>Permanent-magnets</category><category>Quantum-computers</category><category>Rare-earth-metals</category><category>Rare-earths</category><category>Electric-motors</category><dc:creator>Glenn Zorpette</dc:creator><media:content medium="image" type="image/jpeg" url="https://spectrum.ieee.org/media-library/photo-of-a-technician-working-on-a-cylindrical-machine-containing-brass-colored-components.jpg?id=66525633&amp;width=980"></media:content></item><item><title>Better Hardware Could Turn Zeros into AI Heroes</title><link>https://spectrum.ieee.org/sparse-ai</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/abstract-gradient-artwork-of-a-stylized-robot-head-with-circuits-and-binary-code-patterns.jpg?id=65862907&width=1200&height=400&coordinates=0%2C990%2C0%2C990"/><br/><br/><p><strong>When it comes to</strong> AI models, size matters.</p><p>Even though some artificial-intelligence experts <a href="https://spectrum.ieee.org/chain-of-thought-prompting" target="_self">warn</a> that scaling up large language models (LLMs) is hitting diminishing performance returns, companies are still coming out with ever larger AI tools. Meta’s latest Llama release had a staggering <a href="https://ai.meta.com/blog/llama-4-multimodal-intelligence/" rel="noopener noreferrer" target="_blank">2 trillion</a> parameters that define the model.</p><p>As models grow in size, their <a href="https://arxiv.org/abs/2001.08361" rel="noopener noreferrer" target="_blank">capabilities</a> increase. But so do the energy demands and the time it takes to run the models, which increases their <a href="https://spectrum.ieee.org/ai-index-2025" target="_self">carbon footprint</a>. To mitigate these issues, people have turned to <a href="https://spectrum.ieee.org/large-language-models-size" target="_self">smaller, less capable models</a> and to using <a href="https://spectrum.ieee.org/1-bit-llm" target="_self">lower-precision</a> numbers whenever possible for the model parameters.</p><p>But there is another path that may retain a staggeringly large model’s high performance while reducing both the time it takes to run and its energy footprint. This approach involves befriending the zeros inside large AI models.</p><p>For many models, most of the parameters—the weights and activations—are actually zero, or so close to zero that they could be treated as such without losing accuracy. This quality is known as sparsity. 
Sparsity offers a significant opportunity for computational savings: Instead of wasting time and energy adding or multiplying zeros, these calculations could simply be skipped; rather than storing lots of zeros in memory, one need only store the nonzero parameters.</p><p>Unfortunately, today’s popular hardware, like multicore CPUs and GPUs, does not naturally take full advantage of sparsity. To fully leverage sparsity, researchers and engineers need to rethink and re-architect each piece of the design stack, including the hardware, low-level firmware, and application software.</p><p>In our research group at Stanford University, we have developed the first (to our knowledge) piece of hardware that’s capable of calculating all kinds of sparse and traditional workloads efficiently. The energy savings varied widely over the workloads, but on average our chip consumed one-seventieth the energy of a CPU, and performed the computation on average eight times as fast. To do this, we had to engineer the hardware, low-level firmware, and software from the ground up to take advantage of sparsity. We hope this is just the beginning of hardware and model development that will allow for more energy-efficient AI.</p><h2>What is sparsity?</h2><p>Neural networks, and the data that feeds into them, are represented as arrays of numbers. These arrays can be one-dimensional (vectors), two-dimensional (matrices), or more (tensors). A sparse vector, matrix, or tensor has mostly zero elements. The level of sparsity varies, but when zeros make up more than 50 percent of any type of array, it can stand to benefit from sparsity-specific computational methods. In contrast, an object that is not sparse—that is, it has few zeros compared with the total number of elements—is called dense.</p><p>Sparsity can be naturally present, or it can be induced. 
For example, a <a href="https://arxiv.org/abs/2005.00687" rel="noopener noreferrer" target="_blank">social-network graph</a> will be naturally sparse. Imagine a graph where each node (point) represents a person, and each edge (a line segment connecting the points) represents a friendship. Since most people are not friends with one another, a matrix representing all possible edges will be mostly zeros. Other popular applications of AI, such as other forms of graph learning and <a href="https://arxiv.org/abs/1906.03109" rel="noopener noreferrer" target="_blank">recommendation models</a>, contain naturally occurring sparsity as well.</p><br/><img alt="Diagram mapping a sparse matrix to a fibertree and compressed storage format" class="rm-shortcode" data-rm-shortcode-id="d0cc84749a0f0fb374e27ea2ba2041c3" data-rm-shortcode-name="rebelmouse-image" id="3b584" loading="lazy" src="https://spectrum.ieee.org/media-library/diagram-mapping-a-sparse-matrix-to-a-fibertree-and-compressed-storage-format.jpg?id=65866445&width=980"/><br/><p>Beyond naturally occurring sparsity, sparsity can also be induced within an AI model in several ways. Two years ago, a team at <a href="https://spectrum.ieee.org/cerebras-wafer-scale-engine" target="_self">Cerebras</a> <a href="https://www.cerebras.ai/blog/introducing-sparse-llama-70-smaller-3x-faster-full-accuracy" target="_blank">showed</a> that one can set 70 to 80 percent of the parameters in an LLM to zero without losing any accuracy. Cerebras demonstrated these results specifically on Meta’s open-source Llama 7B model, but the ideas extend to other LLMs like ChatGPT and Claude.</p><h2>The case for sparsity</h2><p>Sparse computation’s efficiency stems from two fundamental properties: the ability to compress away zeros and the convenient mathematical properties of zeros. 
Both the algorithms used in sparse computation and the hardware dedicated to them leverage these two basic ideas.</p><p>First, sparse data can be compressed, making it more memory efficient to store “sparsely”—that is, in something called a sparse data type. Compression also makes it more energy efficient to move data when dealing with large amounts of it. This is best understood by an example. Take a four-by-four matrix with three nonzero elements. Traditionally, this matrix would be stored in memory as is, taking up 16 spaces. This matrix can also be compressed into a sparse data type, getting rid of the zeros and saving only the nonzero elements. In our example, this results in 13 memory spaces as opposed to 16 for the dense, uncompressed version. These savings in memory increase with increased sparsity and matrix size.</p><br/><img alt="Diagram comparing dense and sparse matrix–vector multiplication step by step." class="rm-shortcode" data-rm-shortcode-id="3d04f283be99eec83a4206f10d0394ca" data-rm-shortcode-name="rebelmouse-image" id="f523b" loading="lazy" src="https://spectrum.ieee.org/media-library/diagram-comparing-dense-and-sparse-matrix-u2013vector-multiplication-step-by-step.jpg?id=66499008&width=980"/><p><br/></p><p>In addition to the actual data values, compressed data also requires metadata. The row and column locations of the nonzero elements also must be stored. 
This is usually thought of as a “fibertree”: The row labels containing nonzero elements are listed and linked to the column labels of the nonzero elements, which are then linked to the values stored in those elements.</p><p>In memory, things get a bit more complicated still: The row and column labels for each nonzero value must be stored as well as the “segments” that indicate how many such labels to expect, so the metadata and data can be clearly delineated from one another.</p><p>In a dense, noncompressed matrix data type, values can be accessed either one at a time or in parallel, and their locations can be calculated directly with a simple equation. However, accessing values in sparse, compressed data requires looking up the coordinates of the row index and using that information to “indirectly” look up the coordinates of the column index before finally reaching the value. Depending on the actual locations of the sparse data values, these indirect lookups can be extremely random, making the computation data-dependent and requiring the allocation of memory lookups on the fly.</p><p>Second, two mathematical properties of zero let software and hardware skip a lot of computation. Multiplying any number by zero will result in a zero, so there’s no need to actually do the multiplication. Adding zero to any number will always return that number, so there’s no need to do the addition either.</p><p>In matrix-vector multiplication, one of the most common operations in AI workloads, all computations except those involving two nonzero elements can simply be skipped. Take, for example, the four-by-four matrix from the previous example and a vector of four numbers. In dense computation, each element of the vector must be multiplied by the corresponding element in each row and then added together to compute the final vector. 
In this case, that would take 16 multiplication operations and 16 additions (or four accumulations).</p><p>In sparse computation, only the nonzero elements of the vector need be considered. For each nonzero vector element, indirect lookup can be used to find any corresponding nonzero matrix element, and only those need to be multiplied and added. In the example shown here, only two multiplication steps will be performed, instead of 16.</p><h2>The trouble with GPUs and CPUs</h2><p>Unfortunately, modern hardware is not well suited to accelerating sparse computation. For example, say we want to perform a matrix-vector multiplication. In the simplest case, in a single CPU core, each element in the vector would be multiplied sequentially and then written to memory. This is slow, because we can do only one multiplication at a time. So instead people use CPUs with vector support or GPUs. With this hardware, all elements would be multiplied in parallel, greatly speeding up the application. Now, imagine that both the matrix and vector contain extremely sparse data. The vectorized CPU and GPU would spend most of their efforts multiplying by zero, performing completely ineffectual computations.</p><p><a href="https://developer.nvidia.com/blog/accelerating-inference-with-sparsity-using-ampere-and-tensorrt/" target="_blank">Newer generations</a> of GPUs are capable of taking some advantage of sparsity in their hardware, but only a particular kind, called structured sparsity. Structured sparsity assumes that two out of every four adjacent parameters are zero. However, some models benefit more from unstructured sparsity—the ability for any parameter (weight or activation) to be zero and compressed away, regardless of where it is and what it is adjacent to. GPUs can run unstructured sparse computation in software, for example, through the use of the <a href="https://docs.nvidia.com/cuda/cusparse/" target="_blank">cuSPARSE GPU library</a>. 
However, the support for sparse computations is often limited, and the GPU hardware gets underutilized, wasting energy-intensive computations on overhead.</p><p class="shortcode-media shortcode-media-rebelmouse-image rm-float-left rm-resized-container rm-resized-container-25" data-rm-resized-container="25%" rel="float: left;" style="float: left;"> <img alt="Neon pixel art of a glowing portal framed by geometric stairs and circuitry lines" class="rm-shortcode" data-rm-shortcode-id="7edb9085f930de797a7c401b9485d3ea" data-rm-shortcode-name="rebelmouse-image" id="012af" loading="lazy" src="https://spectrum.ieee.org/media-library/neon-pixel-art-of-a-glowing-portal-framed-by-geometric-stairs-and-circuitry-lines.jpg?id=65863062&width=980"/> <small class="image-media media-photo-credit" placeholder="Add Photo Credit..."><a href="https://petrapeterffy.com/" target="_blank">Petra Péterffy</a></small></p><p>When doing sparse computations in software, modern CPUs may be a better alternative to GPU computation, because they are designed to be more flexible. Yet, sparse computations on the CPU are often bottlenecked by the indirect lookups used to find nonzero data. CPUs are designed to “prefetch” data based on what they expect they’ll need from memory, but for randomly sparse data, that process often fails to pull in the right stuff from memory. When that happens, the CPU must waste cycles calling for the right data.</p><p>Apple was the <a href="https://ieeexplore.ieee.org/document/9833570" target="_blank">first</a> to speed up these indirect lookups by supporting a method called an array-of-pointers access pattern in the prefetcher of their A14 and M1 chips. 
Although innovations in prefetching make Apple CPUs more competitive for sparse computation, CPU architectures still have fundamental overheads that a dedicated sparse computing architecture would not, because they need to handle general-purpose computation.</p><p>Other companies have been developing <a href="https://spectrum.ieee.org/nvidia-ai" target="_self">hardware</a> that accelerates sparse machine learning as well. These include Cerebras’s <a href="https://spectrum.ieee.org/cerebras-chip-cs3" target="_self">Wafer Scale Engine</a> and <a href="https://ai.meta.com/blog/next-generation-meta-training-inference-accelerator-AI-MTIA/" target="_blank">Meta’s Training and Inference Accelerator (MTIA)</a>. The Wafer Scale Engine, and its corresponding sparse programming framework, have <a href="https://www.cerebras.ai/blog/introducing-sparse-llama-70-smaller-3x-faster-full-accuracy" target="_blank">demonstrated</a> up to 70 percent sparsity on LLMs. However, the company’s hardware and software solutions support only weight sparsity, not activation sparsity, which is important for many applications. The second version of the MTIA <a href="https://ai.meta.com/blog/next-generation-meta-training-inference-accelerator-AI-MTIA/" target="_blank">claims</a> a sevenfold sparse compute performance boost over the <a href="https://doi-org.stanford.idm.oclc.org/10.1145/3579371.3589348" target="_blank">MTIA v1</a>. However, the only publicly available information regarding sparsity support in the MTIA v2 is for matrix multiplication, not for vectors or tensors.</p><p>Although matrix multiplications take up the majority of computation time in most modern ML models, it’s important to have sparsity support for other parts of the process. 
To avoid switching back and forth between sparse and dense data types, all of the operations should be sparse.</p><h2>Onyx</h2><p>Instead of these halfway solutions, our team at Stanford has developed a hardware accelerator, <a href="https://ieeexplore.ieee.org/document/10631383" target="_blank">Onyx</a>, that can take advantage of sparsity from the ground up, whether it’s structured or unstructured. Onyx is the first programmable accelerator to support both sparse and dense computation; it’s capable of accelerating key operations in both domains.</p><p>To understand Onyx, it is useful to know what a coarse-grained reconfigurable array (CGRA) is and how it compares with more familiar hardware, like CPUs and field-programmable gate arrays (FPGAs).</p><p>CPUs, CGRAs, and FPGAs represent a trade-off between efficiency and flexibility. Each individual logic unit of a CPU is designed for a specific function that it performs efficiently. On the other hand, since each individual bit of an FPGA is configurable, these arrays are extremely flexible, but very inefficient. The goal of CGRAs is to achieve the flexibility of FPGAs with the efficiency of CPUs.</p><p>CGRAs are composed of efficient and configurable units, typically memory and compute, that are specialized for a particular application domain. This is the key benefit of this type of array: Programmers can reconfigure the internals of a CGRA at a high level, making it more efficient than an FPGA but more flexible than a CPU.</p><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="Two circuit boards and a pen showing a chip shrinking from large to tiny size." 
class="rm-shortcode" data-rm-shortcode-id="b8111010f181900745167f0ffb5617f3" data-rm-shortcode-name="rebelmouse-image" id="f394d" loading="lazy" src="https://spectrum.ieee.org/media-library/two-circuit-boards-and-a-pen-showing-a-chip-shrinking-from-large-to-tiny-size.jpg?id=65970072&width=980"/> <small class="image-media media-caption" placeholder="Add Photo Caption...">The Onyx chip, built on a coarse-grained reconfigurable array (CGRA), is the first (to our knowledge) to support both sparse and dense computations. </small><small class="image-media media-photo-credit" placeholder="Add Photo Credit...">Olivia Hsu</small></p><p>Onyx is composed of flexible, programmable processing element (PE) tiles and memory (MEM) tiles. The memory tiles store compressed matrices and other data formats. The processing element tiles operate on compressed matrices, eliminating all unnecessary and ineffectual computation.</p><p>The Onyx compiler handles conversion from software instructions to CGRA configuration. First, the input expression—for instance, a sparse vector multiplication—is translated into a graph of abstract memory and compute nodes. In this example, there are memories for the input vectors and output vectors, a compute node for finding the intersection between nonzero elements, and a compute node for the multiplication. The compiler figures out how to map the abstract memory and compute nodes onto MEMs and PEs on the CGRA, and then how to route them together so that they can transfer data between them. 
Finally, the compiler produces the instruction set needed to configure the CGRA for the desired purpose.</p><p>Since Onyx is programmable, engineers can map many different operations, such as vector-vector element multiplication, or the key tasks in AI, like matrix-vector or matrix-matrix multiplication, onto the accelerator.</p><p>We evaluated the efficiency gains of our hardware by looking at the product of energy used and the time it took to compute, called the energy-delay product (EDP). This metric captures the trade-off of speed and energy. Minimizing just energy would lead to very slow devices, and minimizing just delay would lead to high-area, high-power devices.</p><p>Onyx achieves an energy-delay product up to 565 times better than that of CPUs running dedicated sparse libraries (we used a 12-core Intel Xeon CPU). Onyx can also be configured to accelerate regular, dense applications, similar to the way a GPU or TPU would. If the computation is sparse, Onyx is configured to use sparse primitives, and if the computation is dense, Onyx is reconfigured to take advantage of parallelism, similar to how GPUs function. This architecture is a step toward a single system that can accelerate both sparse and dense computations on the same silicon.</p><p>Just as important, Onyx enables new algorithmic thinking. Sparse acceleration hardware will not only make AI faster and more energy efficient but also enable researchers and engineers to explore new algorithms that have the potential to dramatically improve AI.</p><h2>The future with sparsity</h2><p>Our team is already working on next-generation chips built on Onyx. Beyond matrix multiplication operations, machine learning models perform other types of math, like nonlinear layers, normalization, the softmax function, and more. We are adding support for the full range of computations on our next-gen accelerator and within the compiler. 
Since sparse machine learning models may have both sparse and dense layers, we are also working on integrating the dense and sparse accelerator architecture more efficiently on the chip, allowing for fast transformation between the different data types. We’re also looking at ways to manage memory constraints by breaking up the sparse data more effectively so we can run computations on several sparse accelerator chips.</p><p>We are also working on systems that can predict the performance of accelerators such as ours, which will help in designing better hardware for sparse AI. Longer term, we’re interested in seeing whether high degrees of sparsity throughout AI computation will catch on with more model types, and whether sparse accelerators are adopted at a larger scale.</p><p>Building the hardware to support unstructured sparsity and optimally take advantage of zeros is just the beginning. With this hardware in hand, AI researchers and engineers will have the opportunity to explore new models and algorithms that leverage sparsity in novel and creative ways. We see this as a crucial research area for managing the ever-increasing runtime, costs, and environmental impact of AI. <span class="ieee-end-mark"></span></p><p><em>This article appears in the June 2026 print issue.</em></p>]]></description><pubDate>Tue, 28 Apr 2026 18:03:40 +0000</pubDate><guid>https://spectrum.ieee.org/sparse-ai</guid><category>Ai-models</category><category>Gpus</category><category>Energy-efficiency</category><category>Data-compression</category><dc:creator>Olivia Hsu</dc:creator><media:content medium="image" type="image/jpeg" url="https://spectrum.ieee.org/media-library/abstract-gradient-artwork-of-a-stylized-robot-head-with-circuits-and-binary-code-patterns.jpg?id=65862907&amp;width=980"></media:content></item><item><title>The Chip That Made Hardware Rewriteable</title><link>https://spectrum.ieee.org/fpga-chip-ieee-milestone</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/die-photo-of-an-integrated-circuit-with-an-8-by-8-array.jpg?id=66633028&width=1200&height=400&coordinates=0%2C1042%2C0%2C1042"/><br/><br/><p>Many of the world’s most advanced electronic systems—including <a href="https://spectrum.ieee.org/tag/routers" target="_self">Internet routers</a>, <a href="https://spectrum.ieee.org/6g-wireless" target="_self">wireless base stations</a>, <a href="https://spectrum.ieee.org/mri" target="_self">medical imaging scanners</a>, and <a href="https://www.ibm.com/think/topics/ai-accelerator" rel="noopener noreferrer" target="_blank">some artificial intelligence tools</a>—depend on <a href="https://spectrum.ieee.org/tag/fpga" target="_self">field-programmable gate arrays</a>. Computer chips with internal hardware circuits, the FPGAs can be reconfigured after manufacturing.</p><p>On 12 March, an <a href="https://ieeemilestones.ethw.org/Main_Page" rel="noopener noreferrer" target="_blank">IEEE Milestone</a> plaque recognizing the first FPGA was dedicated at the <a href="https://www.amd.com/en.html" rel="noopener noreferrer" target="_blank">Advanced Micro Devices</a> campus in San Jose, Calif., the former <a href="https://en.wikipedia.org/wiki/Xilinx" rel="noopener noreferrer" target="_blank">Xilinx</a> headquarters and the birthplace of the technology.</p><p>The FPGA earned the Milestone designation because it introduced iteration to semiconductor design. Engineers could redesign hardware repeatedly without fabricating a new chip, dramatically reducing development risk and enabling faster innovation at a time when semiconductor costs were rising rapidly.</p><p>The ceremony, which was organized by the <a href="https://ieeescv.org/" rel="noopener noreferrer" target="_blank">IEEE Santa Clara Valley Section</a>, brought together professionals from across the semiconductor industry and IEEE leadership. 
Speakers at the event included <a href="https://www.seti.org/people/stephen-trimberger/" rel="noopener noreferrer" target="_blank">Stephen Trimberger</a>, an IEEE and <a href="https://www.acm.org/" rel="noopener noreferrer" target="_blank">ACM</a> Fellow whose technical contributions helped shape modern FPGA architecture. Trimberger reflected on how the invention enabled software-programmable hardware.</p><h2>Solving computing’s flexibility-performance tradeoff</h2><p>FPGAs emerged in the 1980s to address a core limitation in computing. A microprocessor executes software instructions sequentially, making it flexible but sometimes too slow for workloads requiring many operations at once.</p><p>At the other extreme, <a href="https://spectrum.ieee.org/lowbudget-chip-design-how-hard-is-it" target="_self">application-specific integrated circuits</a> are chips designed to do only one task. ASICs achieve high efficiency but require lengthy development cycles and nonrecurring engineering costs, which are large, upfront investments. Expenses include designing the chip and preparing it for manufacturing—a process that involves creating detailed layouts, building <a href="https://spectrum.ieee.org/leading-chipmakers-eye-euv-lithography-to-save-moores-law" target="_self">masks for the fabrication machines</a>, and setting up production lines to handle the tiny circuits.</p><p>“ASICs can deliver the best performance, but the development cycle is long and the nonrecurring engineering cost can be very high,” says <a href="https://vast.cs.ucla.edu/people/faculty/jason-cong" rel="noopener noreferrer" target="_blank">Jason Cong</a>, an IEEE Fellow and professor of computer science at the <a href="https://samueli.ucla.edu/" rel="noopener noreferrer" target="_blank">University of California, Los Angeles</a>. 
“FPGAs provide a sweet spot between processors and custom silicon.”</p><p>Cong’s foundational work in FPGA design automation and high-level synthesis transformed how reconfigurable systems are programmed. He developed synthesis tools that translate <a href="https://spectrum.ieee.org/top-programming-languages-2025" target="_self">C/C++</a> into hardware designs, for example.</p><p>At the heart of his work is an underlying principle first espoused by electrical engineer <a href="https://www.invent.org/inductees/ross-freeman" rel="noopener noreferrer" target="_blank">Ross Freeman</a>: By configuring hardware using programmable memory embedded inside the chip, FPGAs combine hardware-level speed with the adaptability traditionally associated with software.</p><h2>Silicon Valley origins: the first FPGA</h2><p>The FPGA architecture originated in the mid-1980s at Xilinx, a Silicon Valley company founded in 1984. The invention is widely credited to Freeman, a Xilinx cofounder and the startup’s CTO. He envisioned a chip with circuitry that could be configured after fabrication rather than fixed permanently during creation.</p><p>Articles about the <a href="https://www.eejournal.com/article/how-the-fpga-came-to-be-part-5/" rel="noopener noreferrer" target="_blank">history of the FPGA</a> emphasize that he saw it as a deliberate break from conventional chip design.</p><p>At the time, semiconductor engineers treated <a href="https://spectrum.ieee.org/special-reports/the-transistor-at-75/" target="_self">transistors</a> as scarce resources. Custom chips were carefully optimized so that nearly every transistor served a specific purpose.</p><p>Freeman proposed a different approach. He figured <a href="https://spectrum.ieee.org/special-reports/50-years-of-moores-law/" target="_self">Moore’s Law</a> would soon change chip economics. The principle holds that transistor counts roughly double every two years, making computing cheaper and more powerful. 
Freeman posited that as transistors became abundant, flexibility would matter more than perfect efficiency.</p><p>He envisioned a device composed of programmable logic blocks connected through configurable routing—a chip filled with what he described as “open gates,” ready to be defined by users after manufacturing. Instead of fixing hardware in silicon permanently, engineers could configure and reconfigure circuits as requirements evolved.</p><p>Freeman sometimes compared the concept to a blank cassette tape: Manufacturers would supply the medium, while engineers determined its function. The analogy captured a profound shift in who controls the technology, shifting hardware design flexibility from chip fabrication facilities to the system designers themselves.</p><p>In 1985 Xilinx introduced the first FPGA for commercial sale: the <a href="https://spectrum.ieee.org/chip-hall-of-fame-xilinx-xc2064-fpga" target="_self">XC2064</a>. The device contained 64 configurable logic blocks—small digital circuits capable of performing logical operations—arranged in an 8-by-8 grid. Programmable routing channels allowed engineers to define how signals moved between blocks, effectively wiring a custom circuit with software.</p><p>Fabricated using a 2-micrometer process (meaning that 2 µm was the minimum size of the features that could be patterned onto silicon using <a href="https://www.micron.com/content/dam/micron/educatorhub/fabrication/photolithography/micron-fabrication-intro-to-photolithography-presentation.pdf" rel="noopener noreferrer" target="_blank">photolithography</a>), the XC2064 implemented a few thousand logic gates. Modern FPGAs can contain hundreds of millions of gates, enabling vastly more complex designs. 
Yet the XC2064 established a design workflow still used today: Engineers describe the hardware behavior digitally and then “compile the design,” a process that automatically translates the plans into the instructions the FPGA needs to set its logic blocks and wiring, according to <a href="https://www.amd.com/en.html" rel="noopener noreferrer" target="_blank">AMD</a>. Engineers then load that configuration onto the chip.</p><h2>The breakthrough: hardware defined by memory</h2><p>Earlier <a href="https://tessellatedcircuits.com/pld_hist.php" rel="noopener noreferrer" target="_blank">programmable logic devices</a>, such as erasable programmable read-only memory, or EPROM, allowed limited customization but relied on largely fixed wiring structures that <a href="https://medium.com/@najamhassan569/understanding-plds-the-building-blocks-of-modern-digital-systems-dbefd69fbc21" rel="noopener noreferrer" target="_blank">did not scale well</a> as circuits grew more complex, Cong says.</p><p>FPGAs introduced programmable interconnects—networks of electronic switches controlled by memory cells distributed across the chip. 
When powered on, the device loads a <a href="https://spectrum.ieee.org/computing-with-random-pulses-promises-to-simplify-circuitry-and-save-power" target="_self">bitstream</a> configuration file that determines how its internal circuits behave.</p><p>“As process technology improved and transistor counts increased, the cost of programmability became much less significant,” Cong says.</p><h2>From “glue logic” to essential infrastructure</h2><p>“Initially, FPGAs were used as what engineers called <a href="https://www.pcmag.com/encyclopedia/term/glue-logic" rel="noopener noreferrer" target="_blank">glue logic</a>,” Cong says.</p><p><em><em>Glue logic</em></em> refers to simple circuits that connect processors, memory, and peripheral devices so the system works reliably, according to <a href="https://www.pcmag.com/encyclopedia/term/glue-logic" rel="noopener noreferrer" target="_blank"><em><em>PC Magazine</em></em></a>. In other words, it “glues” different components together, especially when interfaces change frequently.</p><p>Early adopters recognized the advantage of hardware that could adapt as standards evolved. 
In “<a href="https://cacm.acm.org/practice/the-history-status-and-future-of-fpgas/" rel="noopener noreferrer" target="_blank">The History, Status, and Future of FPGAs</a>,” published in <a href="https://cacm.acm.org/" rel="noopener noreferrer" target="_blank"><em><em>Communications of the ACM</em></em></a>, engineers at Xilinx and organizations such as <a href="https://spectrum.ieee.org/7-bell-labs-ieee-milestones" target="_self">Bell Labs</a>, <a href="https://computerhistory.org/blog/fairchild-semiconductor-the-60th-anniversary-of-a-silicon-valley-legend/" rel="noopener noreferrer" target="_blank">Fairchild Semiconductor</a>, <a href="https://www.ibm.com/us-en" rel="noopener noreferrer" target="_blank">IBM</a>, and <a href="https://www.britannica.com/money/Sun-Microsystems-Inc" rel="noopener noreferrer" target="_blank">Sun Microsystems</a> said the earliest uses of <a href="https://www.eetimes.com/transfer-from-fpgas-for-prototype-to-asics-for-production/" rel="noopener noreferrer" target="_blank">FPGAs were for prototyping ASICs</a>. They also used them for <a href="https://www.synopsys.com/glossary/what-is-hav-emulation.html" rel="noopener noreferrer" target="_blank">validating complex systems</a> by running their software before fabrication, allowing the companies to deploy specialized products manufactured in modest volumes.</p><p>Those uses revealed a broader shift: Hardware no longer needed to remain fixed once deployed.</p><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="A group dressed in business casual attire smiling and posing together around an outdoor bench adorned with a plaque." 
class="rm-shortcode" data-rm-shortcode-id="d28b2fa5d3ac1b68dd9ced85e46da61a" data-rm-shortcode-name="rebelmouse-image" id="c3363" loading="lazy" src="https://spectrum.ieee.org/media-library/a-group-dressed-in-business-casual-attire-smiling-and-posing-together-around-an-outdoor-bench-adorned-with-a-plaque.jpg?id=66633157&width=980"/><small class="image-media media-caption" placeholder="Add Photo Caption...">Attendees at the Milestone plaque dedication ceremony included (seated L to R) 2025 IEEE President Kathleen Kramer, 2024 IEEE President Tom Coughlin, and Santa Clara Valley Section Milestones Chair Brian Berg.</small><small class="image-media media-photo-credit" placeholder="Add Photo Credit...">Douglas Peck/AMD</small></p><h2>Semiconductor economics changed the equation</h2><p>The rise of FPGAs closely followed changes in semiconductor economics, Cong says.</p><p>Developing a custom chip requires a large upfront investment before production begins. As fabrication costs increased, products had to ship in large quantities to make ASIC development economically viable, according to <a href="https://anysilicon.com/the-economics-of-asic/" target="_blank">a post</a> published by <a href="https://anysilicon.com/" target="_blank">AnySilicon</a>.</p><p>FPGAs allowed designers to move forward without that larger monetary commitment.</p><p>ASIC development typically requires 18 to 24 months from conception to silicon, while FPGA implementations often can be completed within three to six months using modern design tools, Cong says. 
The shorter cycle and the ability to reconfigure the hardware enabled startups, universities, and equipment manufacturers to experiment with advanced architectures that were previously accessible mainly to large chip companies.</p><h2>Lookup tables and the rise of reconfigurable computing</h2><p>A popular technique for implementing mathematical functions in hardware is the <a href="https://ieeexplore.ieee.org/document/10013797" target="_blank">lookup table</a> (LUT). A LUT is a small memory element that stores the results of logical operations, according to “<a href="https://arxiv.org/abs/2511.06174" rel="noopener noreferrer" target="_blank">LUT-LLM: Efficient Large Language Model Inference with Memory-based Computations on FPGAs</a>,” a paper selected for presentation next month at the 34th <a href="https://www.fccm.org/" rel="noopener noreferrer" target="_blank">IEEE International Symposium on Field-Programmable Custom Computing Machines</a> (FCCM).</p><p>Instead of repeatedly recalculating outcomes, the chip retrieves answers directly from memory. Cong compares the approach to consulting multiplication tables rather than recomputing the arithmetic each time.</p><p>Research led by Cong and others helped develop efficient methods for mapping digital circuits onto LUT-based architectures, shaping routing and layout strategies used in modern devices.</p><p>As transistor budgets expanded, FPGA vendors integrated memory blocks, digital signal-processing units, high-speed communication interfaces, <a href="https://spectrum.ieee.org/tag/cryptography" target="_self">cryptographic engines</a>, and embedded processors, transforming the devices into versatile computing platforms.</p><h2>Why the gate arrays are distinct from CPUs, GPUs, and ASICs</h2><p>FPGAs coexist with other processors because each is optimized for different priorities. Central processing units excel at general computing. 
Graphics processing units, designed to perform many calculations simultaneously, dominate large parallel workloads such as AI training. ASICs provide maximum efficiency when designs remain stable and production volumes are high.</p><p class="pull-quote">“ASICs can deliver the best performance, but the development cycle is long, and the nonrecurring engineering cost can be very high. FPGAs provide a sweet spot between processors and custom silicon.” <strong>—Jason Cong, IEEE Fellow and professor of computer science at UCLA.</strong></p><p>“FPGAs are not replacements for CPUs or GPUs,” Cong says. “They complement those processors in <a href="https://docs.lib.purdue.edu/cgi/viewcontent.cgi?article=1209&context=ecetr" target="_blank">heterogeneous computing</a> systems.”</p><p>Modern computing platforms increasingly combine multiple types of processors to balance flexibility, performance, and energy efficiency.</p><h2>A Milestone for an idea, not just a device</h2><p>This IEEE Milestone recognizes more than a successful semiconductor product. It also acknowledges a shift in how engineers innovate.</p><p>Reconfigurable hardware allows designers to test ideas quickly, refine architectures, and deploy systems while standards and markets evolve.</p><p>“Without FPGAs,” Cong says, “the pace of hardware innovation would likely be much slower.”</p><p>Four decades after the first FPGA appeared, the technology’s enduring legacy reflects Freeman’s insight: Hardware did not need to remain fixed. 
By accepting a small amount of unused silicon in exchange for adaptability, engineers transformed chips from static products into platforms for continuous experimentation, turning silicon itself into a medium engineers could rewrite.</p><p>Among those who attended the Milestone ceremony were 2025 IEEE President <a href="https://www.linkedin.com/in/kathleenkramer" target="_blank">Kathleen Kramer</a>; 2024 IEEE President <a href="https://corporate-awards.ieee.org/speaker/tom-coughlin/" rel="noopener noreferrer" target="_blank">Tom Coughlin</a>; <a href="https://www.linkedin.com/in/averylu" rel="noopener noreferrer" target="_blank">Avery Lu</a>, chair of the <a href="https://ieeescv.org/" rel="noopener noreferrer" target="_blank">IEEE Santa Clara Valley Section</a>; and <a href="https://ieeetv.ieee.org/speaker/brian-berg" rel="noopener noreferrer" target="_blank">Brian Berg</a>, history and milestones chair of <a href="https://ieee-region6.org/" rel="noopener noreferrer" target="_blank">IEEE Region 6</a>. They joined AMD’s chief executive, <a href="https://www.amd.com/en/corporate/leadership/lisa-su.html" rel="noopener noreferrer" target="_blank">Lisa Su</a>, and <a href="https://www.amd.com/en/corporate/leadership/salil-raje.html#:~:text=Salil%20Raje%20is%20senior%20vice,with%20an%20emphasis%20on%20growing" rel="noopener noreferrer" target="_blank">Salil Raje</a>, senior vice president and general manager of adaptive and embedded computing at AMD.</p><p>The <a href="https://ethw.org/Milestones:Field_Programmable_Gate_Array" rel="noopener noreferrer" target="_blank">IEEE Milestone plaque</a> honoring the field-programmable gate array reads:</p><p><em><em>“</em></em><em><em>The FPGA is an integrated circuit with user-programmable Boolean logic functions and interconnects. 
FPGA inventor Ross Freeman cofounded Xilinx to productize his 1984 invention, and in 1985 the XC2064 was introduced with 64 programmable 4-input logic functions. Xilinx’s FPGAs helped accelerate a dramatic industry shift wherein ‘fabless’ companies could use software tools to design hardware while engaging ‘foundry’ companies to handle the capital-intensive task of manufacturing the software-defined hardware.”</em></em></p><p>Administered by the <a href="https://www.ieee.org/about/history-center?check_logged_in=1" rel="noopener noreferrer" target="_blank">IEEE History Center</a> and supported by donors, the IEEE Milestone program recognizes outstanding technical developments worldwide that are at least 25 years old.</p><p>Check out <em><em>Spectrum</em></em>’s <a href="https://connect.ieee.org/NzU2LUdQSC04OTkAAAGhT7-QweL2i3BmX2b-_PBdiukfOVwCR2UPcYg1G4khUu5odaR3T07IAVEY5ylL-hWj7LNbRKU=" rel="noopener noreferrer" target="_blank">History of Technology</a> channel to read more stories about key engineering achievements.</p>]]></description><pubDate>Tue, 28 Apr 2026 18:00:02 +0000</pubDate><guid>https://spectrum.ieee.org/fpga-chip-ieee-milestone</guid><category>Ieee-history</category><category>Fpga</category><category>Xilinx</category><category>Ieee-milestone</category><category>Amd</category><category>History-of-technology</category><category>Type-ti</category><dc:creator>Willie D. Jones</dc:creator><media:content medium="image" type="image/jpeg" url="https://spectrum.ieee.org/media-library/die-photo-of-an-integrated-circuit-with-an-8-by-8-array.jpg?id=66633028&amp;width=980"></media:content></item><item><title>GPU Renters Are Playing a Silicon Lottery</title><link>https://spectrum.ieee.org/gpu-performance-comparison</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/bar-chart-comparing-tesla-t4-a10g-a100-l4-and-h100-gpu-performance-ranges.png?id=65814435&width=980"/><br/><br/><p>Think one GPU is very much like another? Think again. It turns out that there’s surprising variability in the performance delivered by chips of the same model. That can make getting your money’s worth by renting time on a GPU from a cloud provider a real roll of the dice, according to research from the College of William & Mary, Jefferson Lab, and <a href="https://www.silicondata.com/" rel="noopener noreferrer" target="_blank">Silicon Data</a>.</p><p>“It’s called the silicon lottery,” says <a href="https://www.linkedin.com/in/carmenrli/" rel="noopener noreferrer" target="_blank">Carmen Li,</a> founder and CEO of Silicon Data, which tracks <a href="https://spectrum.ieee.org/gpu-prices" target="_self">GPU rental prices</a> and <a href="https://spectrum.ieee.org/mlperf-trends" target="_self">benchmarks</a> cloud-computing performance.</p><p>The <a href="https://www.computer.org/csdl/proceedings-article/sc/2022/544400a937/1I0bT7vc6B2" rel="noopener noreferrer" target="_blank">silicon lottery’s existence</a> has been known since at least 2022, when researchers at the University of Wisconsin tied it to variations in the performance of GPU-dependent supercomputers. Li and her colleagues figured that the effect would be even more pronounced for AI cloud customers.</p><h3>Performance varies for GPU models in the cloud</h3><br/><img alt="Chart comparing GPU models by 16-bit TFLOPS and median hourly rental prices." 
class="rm-shortcode" data-rm-shortcode-id="14114673d2c672cde525bd4d147097b7" data-rm-shortcode-name="rebelmouse-image" id="b5d4e" loading="lazy" src="https://spectrum.ieee.org/media-library/chart-comparing-gpu-models-by-16-bit-tflops-and-median-hourly-rental-prices.png?id=65816885&width=980"/><h3></h3><br/><p>So they ran 6,800 instances of the index firm’s benchmark test on 3,500 randomly selected GPUs operated by 11 cloud-computing providers. The 3,500 GPUs comprised <a href="https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units" target="_blank">11 models of Nvidia GPU</a>, the most advanced being the <a href="https://spectrum.ieee.org/ai-benchmark-mlperf-llama-stablediffusion" target="_self">Nvidia H200</a> SXM. (The team wasn’t just picking on <a href="https://www.nvidia.com/en-us/" target="_blank">Nvidia</a>; the GPU giant makes up most of the rental cloud market.)</p><p>The benchmark, called <a href="https://www.silicondata.com/products/silicon-mark" target="_blank">SiliconMark</a>, is intended to provide a snapshot of a GPU’s ability to run large language models, or LLMs. It tests 16-bit floating-point computing performance, measured in trillions of operations per second, and a GPU’s internal-memory bandwidth, measured in gigabytes per second. <a href="https://downloads.silicondata.com/documents/GPGPU26_SiliconData.pdf" rel="noopener noreferrer" target="_blank">The results</a> showed that the computing performance varied for all models, but for the 259 H100 PCIe GPUs it differed by as much as 34.5 percent, and the memory bandwidth of the 253 H200 SXM GPUs varied by as much as 38 percent.</p><h3></h3><br/><img alt="Chart comparing GPU internal memory bandwidth by model, from Tesla T4 to H200 SXM." 
class="rm-shortcode" data-rm-shortcode-id="b5cdb54f4666983523d50b7fc5968cbe" data-rm-shortcode-name="rebelmouse-image" id="b818b" loading="lazy" src="https://spectrum.ieee.org/media-library/chart-comparing-gpu-internal-memory-bandwidth-by-model-from-tesla-t4-to-h200-sxm.png?id=65816932&width=980"/><p><span>Differences in how the GPU is cooled, how cloud operators configure their computers, and how much use the chip has seen can all contribute to variations in performance of otherwise identical chips. But Silicon Data’s analysis showed that the real culprit was variations in the chips themselves, likely due to manufacturing issues.</span></p><p>Such randomness has real dollars-and-cents consequences, the researchers argue, because there’s a chance that a pricier, more advanced GPU won’t deliver better performance than an older model chip.</p><p>So what should GPU renters do? “The most practical approach is to benchmark the actual rental they receive,” says <a href="https://www.linkedin.com/in/jcornick/" target="_blank">Jason Cornick</a>, head of infrastructure at Silicon Data. “Running a benchmark tool [such as SiliconMark] allows them to compare their specific instance’s performance against a broader corpus of data.”</p>]]></description><pubDate>Thu, 23 Apr 2026 18:06:01 +0000</pubDate><guid>https://spectrum.ieee.org/gpu-performance-comparison</guid><category>Artificial-intelligence</category><category>Cloud-computing</category><category>Nvidia</category><category>Gpus</category><category>Gpu</category><category>Hyperscalers</category><category>Graphics-processing-units</category><category>Benchmarking</category><category>Large-language-models</category><dc:creator>Samuel K. 
Moore</dc:creator><media:content medium="image" type="image/png" url="https://assets.rbl.ms/65814435/origin.png"></media:content></item><item><title>What Anthropic’s Mythos Means for the Future of Cybersecurity</title><link>https://spectrum.ieee.org/ai-cybersecurity-mythos</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/a-cgi-image-of-a-translucent-padlock-filled-with-0s-and-1s-one-spot-is-broken-and-the-numbers-are-spraying-out-of-that-spot.jpg?id=65714765&width=1200&height=400&coordinates=0%2C729%2C0%2C730"/><br/><br/><p>Two weeks ago, Anthropic <a href="https://red.anthropic.com/2026/mythos-preview/" rel="noopener noreferrer" target="_blank">announced</a> that its new model, Claude Mythos Preview, can autonomously find and weaponize software vulnerabilities, turning them into working exploits without expert guidance. These were vulnerabilities in key software like operating systems and internet infrastructure that thousands of software developers working on those systems failed to find. This capability will have major security implications, compromising the devices and services we use every day. As a result, <a href="https://spectrum.ieee.org/tag/anthropic" target="_blank">Anthropic</a> is not releasing the model to the general public, but instead to a <a href="https://www.anthropic.com/glasswing" rel="noopener noreferrer" target="_blank">limited number</a> of companies.</p><div class="rm-embed embed-media"><iframe height="110px" id="noa-web-audio-player" src="https://embed-player.newsoveraudio.com/v4?key=q5m19e&id=https://spectrum.ieee.org/ai-cybersecurity-mythos&bgColor=F5F5F5&color=1b1b1c&playColor=1b1b1c&progressBgColor=F5F5F5&progressBorderColor=bdbbbb&titleColor=1b1b1c&timeColor=1b1b1c&speedColor=1b1b1c&noaLinkColor=556B7D&noaLinkHighlightColor=FF4B00&feedbackButton=true" style="border: none" width="100%"></iframe></div><p><span>The news rocked the internet security community. There were few details in Anthropic’s announcement, </span><a href="https://srinstitute.utoronto.ca/news/the-mythos-question-who-decides-when-ai-is-too-dangerous" target="_blank">angering</a><span> many observers. 
Some speculate that Anthropic </span><a href="https://kingy.ai/ai/too-dangerous-to-release-or-just-too-expensive-the-real-reason-anthropic-is-hiding-its-most-powerful-ai/" target="_blank">doesn’t have</a><span> the GPUs to run the thing, and that cybersecurity was the excuse to limit its release. Others argue Anthropic is holding to its AI safety mission. </span><a href="https://www.nytimes.com/2026/04/07/opinion/anthropic-ai-claude-mythos.html" target="_blank">There’s</a><span> </span><a href="https://www.axios.com/2026/04/08/anthropic-mythos-model-ai-cyberattack-warning" target="_blank">hype</a><span> and </span><a href="https://www.artificialintelligencemadesimple.com/p/anthropics-claude-mythos-launch-is" target="_blank">counter</a><a href="https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jagged-frontier" target="_blank">hype</a><span>, </span><a href="https://www.aisi.gov.uk/blog/our-evaluation-of-claude-mythos-previews-cyber-capabilities" target="_blank">reality</a><span> and marketing. It’s a lot to sort out, even if you’re an expert.</span></p><p>We see Mythos as a real but incremental step, one in a long line of incremental steps. But even incremental steps can be important when we look at the big picture.</p><h2>How AI Is Changing Cybersecurity</h2><p>We’ve <a href="https://spectrum.ieee.org/online-privacy" target="_self">written about</a> shifting baseline syndrome, a phenomenon that leads people—the public and experts alike—to discount massive long-term changes that are hidden in incremental steps. It has happened with online privacy, and it’s happening with AI. Even if the vulnerabilities found by Mythos could have been found using AI models from last month or last year, they couldn’t have been found by AI models from five years ago.</p><p>The Mythos announcement reminds us that AI has come a long way in just a few years: The baseline really has shifted. 
Finding vulnerabilities in source code is the type of task that today’s large language models excel at. Regardless of whether it happened last year or will happen next year, it’s been clear for a <a href="https://sockpuppet.org/blog/2026/03/30/vulnerability-research-is-cooked/" target="_blank">while</a> that this kind of capability was coming soon. The question is how we <a href="https://labs.cloudsecurityalliance.org/mythos-ciso/" target="_blank">adapt to it</a>.</p><p>We don’t believe that an AI that can hack autonomously will create permanent asymmetry between offense and defense; it’s likely to be more <a href="https://danielmiessler.com/blog/will-ai-help-moreattackers-defenders" rel="noopener noreferrer" target="_blank">nuanced</a> than that. Some vulnerabilities can be found, verified, and patched automatically. Some vulnerabilities will be hard to find but easy to verify and patch; consider generic cloud-hosted web applications built on standard software stacks, where updates can be deployed quickly. Still others will be easy to find (even without powerful AI) and relatively easy to verify, but harder or impossible to patch, as in IoT appliances and industrial equipment that are rarely updated or can’t be easily modified.</p><p>Then there are systems whose vulnerabilities will be easy to find in code but difficult to verify in practice. For example, complex distributed systems and cloud platforms can be composed of thousands of interacting services running in parallel, making it difficult to distinguish real vulnerabilities from false positives and to reliably reproduce them.</p><p>So we must separate the patchable from the unpatchable, and the easy to verify from the hard to verify. This taxonomy also provides us guidance for how to protect such systems in an era of powerful AI vulnerability-finding tools.</p><p>Unpatchable or hard-to-verify systems should be protected by wrapping them in more restrictive, tightly controlled layers. 
You want your fridge or thermostat or industrial control system behind a restrictive and constantly updated firewall, not freely talking to the internet.</p><p>Distributed systems that are fundamentally interconnected should be traceable and should follow the principle of least privilege, where each component has only the access it needs. These are bog-standard security ideas that we might have been tempted to throw out in the era of AI, but they’re still as relevant as ever.</p><h2>Rethinking Software Security Practices</h2><p>This also raises the salience of best practices in software engineering. Automated, thorough, and continuous testing was always important. Now we can take this practice a step further and use defensive AI agents to <a href="https://www.secwest.net/ai-triage" rel="noopener noreferrer" target="_blank">test exploits</a> against a real stack, over and over, until the false positives have been weeded out and the real vulnerabilities and fixes are confirmed. This kind of <a href="https://www.csoonline.com/article/4069075/autonomous-ai-hacking-and-the-future-of-cybersecurity.html" rel="noopener noreferrer" target="_blank">VulnOps</a> is likely to become a standard part of the development process.</p><p>Documentation becomes more valuable, as it can guide an AI agent on a bug-finding mission just as it does developers. And following standard practices and using standard tools and libraries allows AI and engineers alike to recognize patterns more effectively, even in a world of individual and ephemeral <a href="https://www.csoonline.com/article/4152133/cybersecurity-in-the-age-of-instant-software.html" rel="noopener noreferrer" target="_blank">instant software</a>—code that can be generated and deployed on demand.</p><p>Will this favor <a href="https://www.schneier.com/essays/archives/2018/03/artificial_intellige.html" rel="noopener noreferrer" target="_blank">offense or defense</a>? 
The defense eventually, probably, especially in systems that are easy to patch and verify. Fortunately, that includes our phones, web browsers, and major internet services. But today’s cars, electrical transformers, fridges, and lampposts are connected to the internet. Legacy banking and airline systems are networked.</p><p>Not all of those are going to get patched as fast as needed, and we may see a few years of constant hacks until we arrive at a new normal, one where verification is paramount and software is patched continuously.</p>]]></description><pubDate>Thu, 23 Apr 2026 14:00:01 +0000</pubDate><guid>https://spectrum.ieee.org/ai-cybersecurity-mythos</guid><category>Cybersecurity</category><category>Anthropic</category><category>Agentic-ai</category><category>Hacking</category><dc:creator>Bruce Schneier</dc:creator><media:content medium="image" type="image/jpeg" url="https://spectrum.ieee.org/media-library/a-cgi-image-of-a-translucent-padlock-filled-with-0s-and-1s-one-spot-is-broken-and-the-numbers-are-spraying-out-of-that-spot.jpg?id=65714765&amp;width=980"></media:content></item><item><title>AI Agent Designs a RISC-V CPU Core From Scratch</title><link>https://spectrum.ieee.org/ai-chip-design</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/a-graphic-design-system-plot-of-a-risc-v-cpu-core-it-resembles-a-square-grid-covered-in-colorful-vertical-and-horizontal-scratc.jpg?id=65519361&width=1200&height=400&coordinates=0%2C1042%2C0%2C1042"/><br/><br/><p>In 2020, researchers fine-tuned a GPT-2 model to <a href="https://arxiv.org/html/2411.11856v2" rel="noopener noreferrer" target="_blank">design fragments of logic circuits</a>; in 2023, researchers used GPT-4 <a href="https://arxiv.org/abs/2305.13243" rel="noopener noreferrer" target="_blank">to help design an 8-bit processor</a> with a novel instruction set; by 2024, a variety of LLMs could <a href="https://arxiv.org/pdf/2405.02326" rel="noopener noreferrer" target="_blank">design and test chips</a> with basic functionality, like dice rolls (though often these were flawed).</p><p>Now Verkor.io, an <a href="https://spectrum.ieee.org/chip-design-ai" target="_blank">AI chip design</a> startup, claims a bigger milestone: a <a href="https://spectrum.ieee.org/risc-v-laptops" target="_blank">RISC-V</a> CPU core designed entirely by an agentic AI system. The CPU, dubbed VerCore, has a clock speed of roughly 1.5 gigahertz and performance similar to a 2011-era laptop CPU.</p><p><a href="https://www.linkedin.com/in/suresh-krishna-793506158" rel="noopener noreferrer" target="_blank">Suresh Krishna</a>, cofounder at <a href="https://verkor.io/" rel="noopener noreferrer" target="_blank">Verkor.io</a>, says the team’s key claim is that this approach is more effective than using only specialized AI systems for specialized tasks within the overall design process. “What we learned is that the better approach is to let the AI agent solve the whole problem,” he says.</p><h2>Bringing Human Workflows to Agentic AI</h2><p>Verkor.io’s agentic system is called <a href="https://arxiv.org/pdf/2603.08716" rel="noopener noreferrer" target="_blank">Design Conductor</a>, and it’s not itself an AI model. 
It’s a harness for large language models (LLMs). A harness is software that forces an AI agent to proceed through structured steps. In this case, the steps are like those a team of human chip architects would follow: design, implementation, testing, and so on. The harness also manages subagents and a database of related files.</p><p>That means it can work autonomously with only an initial prompt—in this case a 219-word design specification—from the user. (<a href="https://arxiv.org/pdf/2603.08716" target="_blank">The prompt is published in the Design Conductor paper</a>.) It outputs <a href="https://en.wikipedia.org/wiki/GDSII" rel="noopener noreferrer" target="_blank">a Graphic Design System II (GDSII) file</a>, which can be used in existing electronic design automation (EDA) software.</p><p><a href="https://www.synopsys.com/ai/agentic-ai.html" rel="noopener noreferrer" target="_blank">Synopsys</a> and <a href="https://www.cadence.com/en_US/home/ai/ai-for-design.html" rel="noopener noreferrer" target="_blank">Cadence</a>, two major players in EDA software, also have agentic AI tools. These allow chip architects to automate some tasks with AI agents. Design Conductor is different because it’s built to handle chip design from spec to completion with full autonomy, something major EDA companies have not yet touted.</p><p><a href="https://www.linkedin.com/in/ravi-k-a10287122/" target="_blank">Ravi Krishna</a>, founding engineer at Verkor.io, says Design Conductor’s workflow is “mirrored after the traditional process a human engineer might use.” It analyzes the specification, then writes and debugs a register-transfer level, or RTL, file (an abstraction of the CPU’s data flow) before iterating through subtasks like power delivery, signal timings, and layout, which are again checked against the specification. Some tasks, like layout, <a href="https://theopenroadproject.org/" target="_blank">call tools</a> to assist the agent. 
“It’s an iterative system.”</p><p>The system took 12 hours to create the VerCore design. That’s not long, but, because it uses AI agents, you might imagine it taking more or less time based on the number of agents thrown at it. However, Ravi Krishna says it’s not that simple, because some design tasks aren’t easily parallelized.</p><p>Even so, the general improvement of AI models over time has proven essential. “I remember that around the middle of last year, we tried to build a floating-point multiplier with the models of that time. It was slightly beyond what they could do,” says Ravi Krishna. VerCore, designed in December 2025, represents an increase in capability since then. “If it can’t do it today, it’ll do it in six months,” he says. “I don’t know if that’s a scary thing or a good thing.”</p><h2>A First for AI Chip Design</h2><p>VerCore uses the RISC-V instruction set architecture (ISA), a popular open-standard ISA that’s beginning to break out of niche applications, like storage controllers, into systems on a chip (SoCs) that can power <a href="https://spectrum.ieee.org/risc-v-laptops" target="_self">laptops or smartphones</a>. The CPU’s exact clock speed is 1.48 GHz, and it achieved a score of 3,261 on the <a href="https://www.eembc.org/coremark/" rel="noopener noreferrer" target="_blank">CoreMark</a> processor core benchmark.</p><p>Verkor says this puts VerCore’s performance in line with the CPU core performance of <a href="https://www.notebookcheck.net/Intel-Celeron-Dual-Core-SU2300-Notebook-Processor.33847.0.html" rel="noopener noreferrer" target="_blank">Intel’s Celeron SU2300</a>. Whether that sounds impressive depends on your perspective. 
The Celeron SU2300, which arrived in 2011, uses Intel’s <a href="https://www.intel.com/content/dam/doc/white-paper/45nm-next-generation-core-microarchitecture-white-paper.pdf" rel="noopener noreferrer" target="_blank">Penryn CPU architecture</a>, which debuted in November of 2007.<br/><br/>In other words, VerCore is no threat to leading-edge CPUs, but it’s notable for two reasons.<br/><br/>VerCore is the first RISC-V CPU core designed by an AI agent. Previous examples of AI chip design presented portions of a design but didn’t present a complete core. Ravi Krishna says the company wanted to target a design that an AI agent hadn’t previously accomplished. “From the perspective of trying to push the limits of what AI models can do, that was interesting to us,” he says.</p><p>And while VerCore’s theoretical performance has limits, it’s enough to suggest the design could be useful. Indeed, RISC-V is popular because it provides an ISA that’s free to use (RISC-V is an open standard). RISC-V chips generally aren’t as quick as their <em>x</em>86 and Arm peers, but they’re less expensive.</p><p>There’s one final caveat worth mentioning: The chip has not been physically produced. VerCore was verified in simulation with <a href="https://github.com/riscv-software-src/riscv-isa-sim" rel="noopener noreferrer" target="_blank">Spike</a>, the reference RISC-V ISA simulator, and laid out using the open-source <a href="https://github.com/The-OpenROAD-Project/asap7" rel="noopener noreferrer" target="_blank">ASAP7 PDK</a>, an academic design kit that simulates a 7-nanometer production node. Both tools are commonly used for RISC-V design. Verkor says its CPU can run a variant of <a href="https://en.wikipedia.org/wiki/%CE%9CClinux" rel="noopener noreferrer" target="_blank">uCLinux</a> in simulation.</p><p>Skeptics will have a chance to judge for themselves. Verkor.io plans to release design files at the end of April. 
This will include the VerCore CPU and several other designs recently completed by the AI agent system. Verkor also plans to show an FPGA implementation of VerCore at <a href="https://dac.com/2026" rel="noopener noreferrer" target="_blank">DAC</a>, the leading electronic design automation conference.</p><h2>Should Chip Designers Worry about AI Agents Taking Their Jobs?</h2><p>An AI chip designer that can bang out a CPU in 12 hours might seem like troubling news for flesh-and-blood engineers, but Design Conductor has its limitations. The team at Verkor.io say that despite improvements, LLMs still lack the intuition a human can bring.</p><p>Design Conductor can fall down rabbit holes that a human engineer would avoid. In one instance the agent made a mistake in timing, meaning that data was not moved across the CPU in agreement with its clock cycle. The model didn’t recognize the cause and made broad changes while hunting for the fix. It did eventually find a fix, but only after reaching many dead ends. “Basically, we are trading off experience for compute,” says <a href="https://www.linkedin.com/in/david-chin-a5092a/" rel="noopener noreferrer" target="_blank">David Chin</a>, vice president of engineering at the startup.<br/><br/>Suresh Krishna concurs and adds that Design Conductor’s brute-force approach is likely to become less efficient as agentic systems tackle more complex designs. “It’s a nonlinear design space, so the compute grows very quickly,” he says. “As a practical matter, expert guidance and common sense helps a lot.”</p><p>Despite such issues, agentic systems like Design Conductor might accelerate chip design by accelerating iteration. They may also make design accessible to small teams that otherwise lack the resources or head count to pull off a project.</p><p>“It’s not at the point where you can have one person. I would say you still need five to ten, all experts in different areas,” says Ravi Krishna. 
“That team could get you to [a production-ready chip design] at this point.”</p>]]></description><pubDate>Wed, 22 Apr 2026 11:00:01 +0000</pubDate><guid>https://spectrum.ieee.org/ai-chip-design</guid><category>Eda</category><category>Chip-design</category><category>Agentic-ai</category><category>Risc-v</category><category>Cpu</category><dc:creator>Matthew S. Smith</dc:creator><media:content medium="image" type="image/jpeg" url="https://spectrum.ieee.org/media-library/a-graphic-design-system-plot-of-a-risc-v-cpu-core-it-resembles-a-square-grid-covered-in-colorful-vertical-and-horizontal-scratc.jpg?id=65519361&amp;width=980"></media:content></item><item><title>Designing Broadband LPDA-Fed Reflector Antennas With Full-Wave EM Simulation</title><link>https://content.knowledgehub.wiley.com/efficient-design-and-simulation-of-lpda-fed-parabolic-reflector-antennas/</link><description><![CDATA[
<img src="https://spectrum.ieee.org/media-library/wipl-d-logo.png?id=26851496&width=980"/><br/><br/><p>A practical guide to designing log-periodic dipole array fed parabolic reflector antennas using advanced 3D MoM simulation — from parametric modeling to electrically large structures.</p><p><strong>What Attendees will Learn</strong></p><ol><li>How to set design requirements for LPDA-fed reflector antennas — Understand the key specifications including bandwidth ratio, gain targets, and VSWR matching constraints across the full operating range from 100 MHz to 1 GHz.</li><li>Why advanced 3D EM solvers enable simulation of electrically large multiscale structures — Learn how higher order basis functions, quadrilateral meshing, geometrical symmetry, and CPU/GPU parallelization extend MoM simulation capability by an order of magnitude.</li><li>How to apply a systematic three-step design strategy: follow a proven workflow that first optimizes the stand-alone LPDA for VSWR and gain, then integrates the reflector, and finally tunes parameters to satisfy all performance requirements, including gain and impedance matching.</li><li>How parametric CAD modeling accelerates LPDA design — Discover how self-scaling geometry, automated wire-to-solid conversion, and multiple-copy-with-scaling features enable fully parametrized antenna models that streamline optimization across dozens of design variants.</li></ol><div><span><a href="https://content.knowledgehub.wiley.com/efficient-design-and-simulation-of-lpda-fed-parabolic-reflector-antennas/" target="_blank">Download this free whitepaper now!</a></span></div>]]></description><pubDate>Fri, 17 Apr 2026 14:00:50 +0000</pubDate><guid>https://content.knowledgehub.wiley.com/efficient-design-and-simulation-of-lpda-fed-parabolic-reflector-antennas/</guid><category>Type-whitepaper</category><category>Broadband</category><category>Antennas</category><category>Simulation</category><dc:creator>WIPL-D</dc:creator><media:content medium="image" 
type="image/png" url="https://assets.rbl.ms/26851496/origin.png"></media:content></item></channel></rss>