Beyond Tape: The Future of Archival Data Storage in Glass and Molecules

8:41 AM   |   15 June 2025

Time to stop giving cold storage the cold shoulder

The relentless explosion of digital data presents a monumental challenge: how do we store vast quantities of information reliably and affordably for decades, centuries, or even millennia? This is the domain of archival storage, a critical but often overlooked component of the data lifecycle. For years, the workhorse of this field has been magnetic tape, specifically the Linear Tape-Open (LTO) format. While tape offers a compelling cost advantage per terabyte compared to hard disk drives (HDDs) and solid-state drives (SSDs), its inherent limitations are becoming increasingly apparent in the face of exponential data growth driven by high-resolution media, scientific research, and the burgeoning field of artificial intelligence.

The function of archival storage is singular and crucial: to preserve data for the long term – think decades and beyond – with unwavering reliability and at a manageable cost. Currently, LTO tape dominates this space. However, its characteristics, including a finite lifespan and sequential access, are beginning to strain under the weight of modern data demands. The need for a viable, scalable replacement is growing, but as yet, no single technology is ready to step into tape's shoes entirely. The most promising contenders appear to be optical storage technologies based on glass, while molecular methods like DNA storage remain fascinating but distant prospects.

The Growing Inadequacy of Tape Storage

Streaming tape, despite its continued evolution through generations such as LTO-10, looks increasingly inadequate for future archival needs. Its primary appeal is a significantly lower cost per terabyte than disk or flash storage. However, this cost advantage comes with operational overheads and technical constraints.

One of the most significant limitations is the media's lifespan. Magnetic tape degrades over time, a phenomenon known as "bit rot." To combat this, the content of a tape cartridge typically needs to be copied onto fresh media (a process usually called migration or media refresh) every five to ten years. This process is not only labor-intensive but also consumes considerable energy and resources.

Ant Rowstron, a Distinguished Engineer at Microsoft Project Silica, highlights this challenge: "Magnetic technology has a finite lifetime. You must keep copying it over to new generations of media. A hard disk drive might last five years. A tape, well, if you're brave, it might last ten years. But once that lifetime is up, you've got to copy it over. And that, frankly, is both difficult and tremendously unsustainable if you think of all that energy and resource we're using."

Symply LTO-10 tabletop and rackmount tape libraries. Image Credit: The Register

Another drawback is tape's sequential access nature. Unlike HDDs and SSDs, which offer random access, tape must be read or written linearly, so the tape has to be spooled to the correct position first. This results in a much longer time to first byte, making tape unsuitable for applications requiring rapid data retrieval.

While LTO capacity continues to grow with each generation, throughput is not keeping pace. LTO-10 offers a raw throughput of 400 MBps, the same as LTO-9, despite having a higher capacity (30 TB raw, 75 TB compressed at 2.5:1). This means reading a full LTO-10 tape takes longer than reading a full LTO-9 tape. Tape capacity is also being outpaced by flash. Current disk drives are reaching capacities in the 32-36 TB range, with 40 TB on the horizon, while SSDs are already far beyond, with 122 TB drives available and 256 TB forecast. The next tape generation, LTO-11, expected around 2027/28, is projected to reach 72 TB raw capacity. By then, disk drives could be around 50 TB, and SSDs well over 300 TB. The capacity gap with flash is widening.
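The effect of flat throughput on full-cartridge read time can be checked with a little arithmetic. The sketch below uses the LTO-10 figures quoted above; the 18 TB raw capacity for LTO-9 comes from the published LTO-9 specification rather than from this article, and the function name is purely illustrative.

```python
# Time to stream a full cartridge end to end at the native (uncompressed) rate.
# Assumption: LTO-9 raw capacity is 18 TB (published spec, not stated in the article).

def full_read_hours(capacity_tb: float, throughput_mb_per_s: float) -> float:
    """Hours needed to read capacity_tb terabytes at throughput_mb_per_s (decimal units)."""
    capacity_mb = capacity_tb * 1_000_000  # 1 TB = 1,000,000 MB
    return capacity_mb / throughput_mb_per_s / 3600

lto9_hours = full_read_hours(18, 400)
lto10_hours = full_read_hours(30, 400)

print(f"LTO-9  full read: {lto9_hours:.1f} h")
print(f"LTO-10 full read: {lto10_hours:.1f} h")
```

Under these assumptions a full LTO-10 read takes roughly two-thirds longer than a full LTO-9 read, in line with the capacity ratio of 30 to 18.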

Despite these limitations, tape remains the best archival medium we currently have at scale due to its affordability. The LTO roadmap extends for another four generations, providing a degree of predictability until approximately 2035/36. However, the need for a more durable, higher-capacity, and potentially more sustainable alternative is clear. Two potential replacements are gaining attention: glass-based and molecular technologies.

The Glass Archive Game: Project Silica

One of the most prominent efforts in glass-based archival storage is Microsoft's Project Silica. This initiative leverages technology developed in collaboration with the University of Southampton in the UK. The core concept involves storing data within square tablets of silica glass by creating polarization-based nanostructures using ultra-fast femtosecond infrared laser pulses. Silica glass is an attractive medium because it is incredibly durable and resistant to environmental factors that degrade traditional media, such as heat, boiling water, electromagnetic radiation, various chemicals, and even surface scratches.

A Project Silica glass tablet. Image Credit: The Register

The data is encoded in nanoscale structures within the glass, defined by their position, orientation, size, and how they refract light. Early work in 2019 demonstrated storing 75.6 GB of data in multiple layers within a 75 x 75 x 2 mm glass tablet. Researchers at Southampton later advanced this with a 5D system, utilizing two optical and three spatial dimensions. This involved burning nanoscale voids (around 130 nm) with a femtosecond laser and then shaping them into nanolamellas (460 x 50 nm) to create voxels, each capable of storing four bits.

Microsoft states that Project Silica glass tablets, roughly the size of a drink coaster, can now hold 7 TB of raw data across 100 or more layers and are designed to preserve this data for thousands of years. The system uses Azure AI to decode the data, which Microsoft claims improves reading and writing speeds and enables higher data density.

The data storage process involves four key steps:

  • Writing: Using an ultra-fast femtosecond laser to create the nanostructures.
  • Reading: Employing a computer-controlled, polarization-sensitive microscope that shines polarized light through the glass.
  • Decoding: Machine learning algorithms interpret the patterns of polarized light to reconstruct the data.
  • Storing: Tablets are housed in a library system, similar in concept to tape libraries.

The Project Silica library utilizes battery-powered robots that charge while idle. These robots retrieve tablets from shelves and transport them to reader stations. A crucial design feature is that stored tablets are intended to be immutable; they are not taken to the writer station, preventing accidental data corruption by the laser pulses.

Two Project Silica library robots. Image Credit: The Register

While exact read and write throughput figures have not been publicly detailed, Microsoft suggests that system-level aggregate write throughputs are "comparable to current archival systems," stopping short of claiming to be faster than tape libraries. Project Silica is still in development, and commercial availability is estimated to be two to five years away.

Microsoft emphasizes the low power consumption of Project Silica libraries, aligning with sustainability goals. The project is primarily viewed as a means to develop advanced archival storage for Microsoft's own Azure cloud service, suggesting it may remain proprietary and not be offered commercially to competing cloud providers or other customers.

Interestingly, Project Silica is already finding a niche application. Elire, a venture group focused on sustainability, is collaborating with Microsoft Research to use this technology for their Global Music Vault, initially located in Svalbard, Norway. Elire plans to establish additional, more accessible locations worldwide for this musical archive.

Cerabyte: Another Player in the Glass Game

Another company pursuing glass-based archival storage is Cerabyte. Their approach differs from Project Silica in the specific medium used: ceramic-coated glass. Instead of writing directly into the glass substrate, Cerabyte's method involves burning nanoscale pits into a ceramic layer deposited on a glass tablet using femtosecond laser pulses.

Cerabyte diagram illustrating the ceramic-coated glass concept. Image Credit: The Register

Unlike Project Silica's multi-layer approach, Cerabyte currently employs a single-layer technology, resulting in lower capacity per tablet – around 1 GB per surface. However, the fundamental advantages of durability and resistance to physical, chemical, and electromagnetic damage are shared with Project Silica. Cerabyte also envisions storing its tablets in robot-accessed libraries with separate stations for writing and reading, mirroring the tape library model.

A key difference lies in the data encoding and reading method. Cerabyte stores data in QR (quick response) codes, which are two-dimensional barcodes. Data is read using a scanning microscope. The single-layer design simplifies both the writing and reading processes compared to the complex multi-layer navigation required by Project Silica's system.

Cerabyte claims its technology can "write up to 2,000,000 bits with one laser pulse, enabling ultra-fast data storage and reading with high-speed cameras." However, concrete throughput numbers comparable to tape or other media have not been widely disclosed.

A Cerabyte glass tablet. Image Credit: The Register

Similar to Project Silica and a significant improvement over tape, Cerabyte's stored tablets do not require periodic rewriting, offering a true "write once, read many" (WORM) capability with extreme longevity.

Cerabyte has attracted notable investment from organizations like In-Q-Tel (the venture capital arm of the US intelligence community), Pure Storage, and Western Digital. Shantnu Sharma, WD's Chief Strategy and Corporate Development Officer, expressed interest in a technology partnership for commercialization. Locating their office in Boulder, Colorado, close to established tape library vendors like SpectraLogic and Quantum, positions Cerabyte strategically within the archival storage ecosystem.

Like Project Silica, Cerabyte is still some years away from commercial product availability, estimated to be within the next two to five years. However, unlike Microsoft's potentially proprietary solution, Cerabyte is explicitly aiming for commercial availability, which could make it a viable alternative for a broader market.

Comparing the Glass Contenders

While both Project Silica and Cerabyte represent significant steps towards durable, long-term archival storage using glass, they employ different technical approaches and have different potential market strategies.

  • Medium and Writing: Project Silica writes directly into the glass substrate using femtosecond lasers to create multi-layer nanostructures. Cerabyte writes pits into a ceramic layer on glass, currently in a single layer, also using femtosecond lasers.
  • Density: Project Silica currently claims higher density per tablet (7 TB in 100+ layers) compared to Cerabyte's 1 GB per surface (single layer). However, Cerabyte's approach might be simpler to manufacture and scale initially.
  • Reading: Project Silica uses a polarization-sensitive microscope and AI decoding for multi-layer structures. Cerabyte uses a scanning microscope to read QR codes on a single layer, potentially a simpler reading mechanism.
  • Maturity and Availability: Both are estimated to be two to five years from a commercial product.
  • Market Focus: Project Silica appears primarily focused on Microsoft's internal Azure needs, potentially remaining proprietary. Cerabyte is actively seeking commercial partnerships and aims for broader market availability.
  • Investment: Cerabyte has attracted external investment from key industry players like Western Digital and Pure Storage, signaling market validation and potential integration into existing storage ecosystems.

Both technologies share the critical advantage of extreme media longevity (thousands of years) and resistance to environmental damage, eliminating the need for periodic data migration required by tape. They also propose library-based access models familiar to current archival users. The race is on to see which approach, or perhaps both, can successfully transition from research project to commercially viable product capable of challenging tape's dominance.

The Distant Dream of Molecular Storage: DNA and Beyond

Beyond optical methods, scientists have long been captivated by the potential of molecular storage, particularly using DNA. The appeal is immense: DNA is nature's ultimate information storage molecule, capable of storing vast amounts of data at incredibly high densities. Supplier Biomemory, for instance, claims that a staggering 45 zettabytes of data could theoretically be stored in just 1 gram of DNA. Furthermore, DNA can remain stable and preserve information for hundreds or even thousands of years under the right conditions, as evidenced by the successful sequencing of ancient DNA.

Biomemory rackable DNA data storage concept. Image Credit: The Register

The concept involves encoding digital data (0s and 1s) into sequences of the four DNA nucleobases: adenine (A), thymine (T), cytosine (C), and guanine (G). These sequences are then synthesized into artificial DNA strands. However, translating this theoretical potential into a practical, scalable archival storage technology faces fiendishly difficult challenges.
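As a minimal sketch of the encoding idea, the snippet below maps each pair of bits to one nucleobase. The specific mapping (00 to A, 01 to C, 10 to G, 11 to T) and the function names are assumptions chosen for illustration; practical schemes add error correction and avoid sequences that are hard to synthesize or sequence, which this toy version ignores.

```python
# Toy 2-bits-per-base codec. The bit-pair-to-base table is an illustrative
# assumption, not the scheme used by any particular DNA storage vendor.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

def encode(data: bytes) -> str:
    """Turn bytes into a string of nucleobase letters, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

def decode(strand: str) -> bytes:
    """Invert encode(): read two bits per base and regroup them into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

Each byte becomes four bases, so the synthesized strand length grows linearly with the data, before any redundancy is added.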

The primary hurdles lie in the mechanisms for writing and reading DNA data, which are currently deplorably slow and complex. These processes involve chemical reactions, not the rapid electrical or magnetic manipulations used in conventional storage. Furthermore, a DNA storage medium would not be a neatly organized reel or tablet but rather a vast collection of millions of tiny DNA fragments mixed together, typically suspended in a liquid.

Writing data involves synthesizing these specific DNA sequences, a process that is currently slow and expensive. Reading requires retrieving these fragments, sequencing them to determine their base order, and then computationally reassembling the original data from these sequenced fragments – akin to receiving millions of disordered packets from a network and putting them back together in the correct order. Written DNA is generally immutable, but the reading process itself can be destructive, requiring multiple copies of the data to ensure successful retrieval.
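The packet analogy can be made concrete with a short sketch. It assumes, hypothetically, that every fragment carries an index alongside its payload and that several copies of each fragment sit in the pool; real DNA storage systems use their own addressing and error-correction schemes, which are not described here.

```python
import random

# Hypothetical model: each fragment is an (index, payload) pair, and the pool
# holds several unordered copies of every fragment.

def fragment(payload: bytes, chunk_size: int) -> list[tuple[int, bytes]]:
    """Split payload into indexed chunks of at most chunk_size bytes."""
    return [(i // chunk_size, payload[i:i + chunk_size])
            for i in range(0, len(payload), chunk_size)]

def reassemble(reads: list[tuple[int, bytes]]) -> bytes:
    """Deduplicate reads by index, check for gaps, and join chunks in order."""
    by_index: dict[int, bytes] = {}
    for index, chunk in reads:
        by_index.setdefault(index, chunk)  # keep the first copy seen
    missing = [i for i in range(max(by_index) + 1) if i not in by_index]
    if missing:
        # Only gaps below the highest index seen are detectable here.
        raise ValueError(f"missing fragments: {missing}")
    return b"".join(by_index[i] for i in sorted(by_index))

payload = b"archival data in a test tube"
pool = fragment(payload, 4) * 3   # three copies of every fragment
random.shuffle(pool)              # the pool has no inherent order
restored = reassemble(pool)
print(restored == payload)        # True
```

Keeping multiple copies is what lets the read succeed even if some individual fragments are consumed or lost during sequencing.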

The speed disparity between current DNA writing/reading and conventional storage is immense. Recent work by Chinese researchers using a methylated DNA technique achieved a write speed of just 40 bits per second. Compare this to LTO-9 tape, which writes at 400 MBps, equivalent to 3,200,000,000 bits per second – 80 million times faster. Achieving an 80 million-fold improvement in speed is an astronomical challenge that is simply not feasible in the foreseeable future, whether that's ten or twenty years out. DNA storage, while scientifically fascinating, remains firmly in the realm of a scientific daydream for practical, large-scale data archiving.
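The ratio is easy to verify, and it helps to translate it into wall-clock time. The sketch below uses only the two rates quoted above; the 1 GB example size and the variable names are illustrative choices.

```python
# Compare the quoted DNA write rate with LTO-9's native rate.
dna_bits_per_s = 40
tape_bits_per_s = 400 * 1_000_000 * 8   # 400 MBps expressed in bits per second

ratio = tape_bits_per_s // dna_bits_per_s

# Time to write 1 GB (8,000,000,000 bits) at each rate.
bits_in_gb = 8_000_000_000
tape_seconds_per_gb = bits_in_gb / tape_bits_per_s
dna_years_per_gb = bits_in_gb / dna_bits_per_s / (365.25 * 24 * 3600)

print(f"tape is {ratio:,}x faster")
print(f"1 GB on tape: {tape_seconds_per_gb} s")
print(f"1 GB as DNA:  {dna_years_per_gb:.1f} years")
```

At 40 bits per second, a single gigabyte would take a little over six years to write, against 2.5 seconds on tape.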

Exploring Other Molecular Avenues: Sequence-Defined Polymers

Beyond DNA, researchers are exploring other molecular storage concepts. One such area involves sequence-defined polymers (SDPs), a form of plastic. Researchers at the University of Texas at Austin are investigating the potential of SDPs for data storage.

As noted in a Cell paper by the researchers, SDPs offer potential advantages over DNA. "For example, DNA is limited to four monomers, yet SDPs can use a much larger set – eight, sixteen, or even more – allowing for greater information density." This larger alphabet could potentially lead to more efficient encoding.
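The density gain from a larger alphabet follows from basic information theory: a position that can hold one of n equally likely monomers carries at most log2(n) bits. The sketch below tabulates that upper bound; it ignores any redundancy needed for error correction, and the function name is illustrative.

```python
import math

def bits_per_monomer(alphabet_size: int) -> float:
    """Upper bound on information per position for a given monomer alphabet."""
    return math.log2(alphabet_size)

for n in (4, 8, 16):
    bits = bits_per_monomer(n)
    monomers_per_byte = 8 / bits
    print(f"{n:>2} monomers: {bits:.0f} bits each, {monomers_per_byte:.2f} per byte")
```

Going from DNA's four bases to a sixteen-monomer alphabet doubles the bits per position from two to four, halving the chain length needed for the same data.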

Recognizing the limitations of DNA sequencing for data retrieval, the Texas researchers have focused on developing an electrochemical method for decoding SDPs. In this approach, sequence components representing data (such as ASCII characters) are read via their individual electrical signals, potentially offering a faster and more electronically compatible reading mechanism than DNA sequencing.

Praveen Pasupathy, a corresponding author and electrical engineer at the University of Texas, commented in an announcement: "Molecules can store information for very long periods without needing power. Nature has given us the proof of principle that this works. This is the first attempt to write information in a building block of a plastic that can then be read back using electrical signals, which takes us a step closer to storing information in an everyday material."

The research group successfully demonstrated a proof-of-concept, reading and decoding an 11-character password in 2.5 hours. While this is a significant scientific achievement, it underscores just how early-stage this technology is. Senior paper author and chemist Eric Anslyn added: "Our approach has the potential to be scaled down to smaller, more economical devices compared to traditional spectrometry-based systems. It opens exciting prospects for interfacing chemical encoding with modern electronic systems and devices."
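To put the proof-of-concept readout in perspective, the figures above can be converted to a rate. The sketch assumes 8 bits per character, which is an assumption made here for comparison; the paper's actual encoding may use a different number of bits per character.

```python
# Readout rate implied by "11 characters in 2.5 hours".
characters = 11
read_seconds = 2.5 * 3600

minutes_per_character = read_seconds / 60 / characters

# Assumption: 8 bits per character (not confirmed by the article).
assumed_bits_per_character = 8
bits_per_second = characters * assumed_bits_per_character / read_seconds

print(f"{minutes_per_character:.1f} minutes per character")
print(f"{bits_per_second:.4f} bits per second (assuming 8-bit characters)")
```

That works out to under a hundredth of a bit per second under the 8-bit assumption, which shows how far the method is from archival-scale throughput.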

Despite the promise of electrical reading, molecular storage technologies like SDPs are still very much at the frontier of fundamental research. They face immense challenges in terms of writing speed, scalability, error correction, and integration into practical storage systems. They are likely decades away from any form of commercial viability for large-scale data archiving.

Conclusion: The Road Ahead for Archival Storage

The landscape of archival data storage is at a critical juncture. The current king, LTO tape, while cost-effective, is increasingly challenged by the sheer volume and longevity requirements of modern data. Its need for periodic migration, its sequential access, and its slow rate of improvement are significant limitations that will only become more pronounced.

Looking ahead, glass-based optical storage technologies represent the most promising near-term alternatives. Microsoft's Project Silica and Cerabyte are leading the charge, developing media that offer multi-millennial data retention and extreme durability. While they differ in their technical implementation (multi-layer vs. single-layer, decoding methods) and market strategy (proprietary cloud vs. commercial availability), both are targeting the core weaknesses of tape: limited lifespan and the need for refreshing. These technologies are still under development, with commercial products likely 2-5 years away, but they hold the potential to revolutionize long-term data preservation.

Molecular storage technologies, such as DNA and sequence-defined polymers, offer tantalizing visions of ultra-high density and longevity. However, the fundamental scientific and engineering challenges associated with writing and reading data at practical speeds and scales are immense. Current speeds are millions of times slower than tape, placing these technologies firmly in the realm of long-term research rather than near-term commercial solutions.

As data continues to proliferate, the demand for reliable, sustainable, and cost-effective archival storage will only grow. While tape will likely remain relevant for some time, the future of truly long-term, immutable data preservation appears to lie in durable physical media like glass. The race between Project Silica and Cerabyte, and potentially other optical technologies, will shape the next era of cold storage, leaving the molecular frontier as an exciting, but distant, chapter in the evolution of data archiving.