Starfish Storage Celebrates a Decade of Leadership in Metadata-Driven Unstructured Data Management

Mar 13, 2024 | News

Marking the Milestone: Managing Well Over an Exabyte in the World’s Most Demanding and Complex Data Environments

Starfish Storage, the leader in metadata-driven unstructured data management, proudly celebrates its 10-year anniversary. Coinciding with this milestone, the company also announces it now manages well over an exabyte of capacity across its client base. This includes eight of the world’s top 10 pharmaceutical firms, seven of the eight Ivy League universities, Department of Energy supercomputing sites, and leading corporations in nearly every industry, including semiconductor, oil and gas, fintech, automotive, healthcare, consumer products, and media-entertainment.

Starfish services the largest and most demanding file environments in the world. The typical Starfish customer uses high-performance parallel file systems and scale-out NAS to service the needs of scientific computing, AI/ML, engineering, rendering, and other highly demanding production workloads. These environments contain billions of files and tens, sometimes hundreds, of petabytes of capacity, and they present a myriad of data management challenges.

Starfish’s metadata-driven approach sets it apart from traditional file management solutions that rely on timestamps and coarse-grained analytics to make policy decisions. At the heart of the Starfish platform is a data catalog specifically designed for unstructured data. The metadata and analytics capabilities of the catalog allow Starfish to service a wide variety of use cases (well beyond the table-stakes use cases of archiving and data movement) and address the nuanced needs of each set of stakeholders. Most importantly, Starfish enables the end users who create and consume files to manage their own data with the appropriate security and safeguards.

Company founder Jacob Farmer explains, “In these large, diverse computing facilities where Starfish plays there are a myriad of use cases, often with nuanced implementation details. There are also many stakeholders including the users who create and consume the files as well as those who are responsible for paying for storage capacity, various aspects of compliance, and ensuring proper data curation.” The use cases Farmer refers to include archiving, data protection, migrations, cloud bursting, cost accounting, data disposition, ROT (Redundant, Obsolete, and Trivial) cleanup, FAIR data management, and AI/ML workflows.

About Starfish Storage

Starfish is the unstructured data management platform for high performance computing (HPC), AI/ML, and other demanding file-based workloads. Starfish provides a unified index and a common API for addressing, managing, and moving files across a diversity of file storage systems, including HPC file systems, scale-out NAS, conventional file servers, and S3-style object stores. Starfish is vendor-agnostic, supporting devices and services from all hardware, software, and cloud vendors. Common use cases include data classification, protection, archiving, migration, ROT cleanup, cost accounting, and workflow automation.

Starfish supports this wide variety of use cases through a flexible architecture that combines a metadata-rich catalog with a scale-out data mover and batch processor. The feedback loop between the catalog and the batch processor enables automated data classification, metadata-driven storage management policies, user self-service, and file-processing workflows. For more information about how your organization can unlock the power of its data, please visit Starfish Storage at https://starfishstorage.com/
