The holistic approach to unstructured data management

Starfish is a unique software application for managing files and objects at any scale, ranging from departmental file shares to the world's largest supercomputing file systems.

billions of files · 100s of petabytes · metadata-driven reporting & workflows · multi-server parallelized data movement · storage agnostic

USER FRIENDLY

Enables users to participate in storage management and content classification

EXPERT FRIENDLY

Power tools for DevOps and sysadmins; designed for scale and automation

A simple but powerful paradigm
Starfish combines DISCOVERY with EXECUTION

DISCOVERY

A METADATA CATALOG for unstructured data

The File Catalog is a database that enumerates all of your files and directories along with their version histories. Extensible metadata in the form of tags and key-value pairs adds color to your files, facilitating discoverability, reporting, and analytics.
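To make the idea concrete, here is a minimal sketch of a catalog whose entries carry tags and key-value pairs. It is an illustration only and not Starfish's schema or API: the `CatalogEntry` class, the `find` helper, and the sample paths are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One file record in a hypothetical in-memory catalog (not Starfish's schema)."""
    path: str
    size_bytes: int
    tags: set[str] = field(default_factory=set)
    attributes: dict[str, str] = field(default_factory=dict)


def find(entries, tag=None, **attributes):
    """Return entries that carry `tag` (if given) and match every key-value pair."""
    matches = []
    for entry in entries:
        if tag is not None and tag not in entry.tags:
            continue
        if all(entry.attributes.get(k) == v for k, v in attributes.items()):
            matches.append(entry)
    return matches


# Sample data, invented for illustration.
catalog = [
    CatalogEntry("/proj/a/run1.dat", 4096, {"raw"}, {"project": "a"}),
    CatalogEntry("/proj/a/notes.txt", 120, {"docs"}, {"project": "a"}),
    CatalogEntry("/proj/b/run7.dat", 8192, {"raw"}, {"project": "b"}),
]

# Discoverability: which raw files belong to project "a"?
raw_in_a = find(catalog, tag="raw", project="a")

# Reporting: how much capacity do raw files consume overall?
total_raw_bytes = sum(e.size_bytes for e in find(catalog, tag="raw"))
```

The same metadata answers both a search question and a capacity question, which is the point of attaching tags and key-value pairs to every entry.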

Know everything about your files that is knowable

Can you afford NOT KNOWING what's going on in your file storage systems?
  • reporting/analytics
  • metadata
  • cost accounting
  • user portal

EXECUTION

A scale-out data mover & batch processor

The output of a query to the file catalog serves as instructions for Starfish's parallelized data movers and batch processors. Copy, move, or delete files, or run any other operation on them, at very large scale with minimal effort.
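The pattern of feeding query results to parallel workers can be sketched in a few lines. This is a generic illustration under stated assumptions, not Starfish code: the `query` and `execute` functions, the dictionary-based catalog, and the "cold" and "hot" tags are hypothetical, and a thread pool on one machine stands in for a multi-server data mover.

```python
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def query(catalog, predicate):
    """Discovery: select the catalog rows that satisfy `predicate`."""
    return [row for row in catalog if predicate(row)]


def execute(rows, action, workers=4):
    """Execution: apply `action` to every selected row using a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(action, rows))


# Build a throwaway source tree so the sketch runs anywhere.
root = Path(tempfile.mkdtemp())
source, archive = root / "source", root / "archive"
source.mkdir()
archive.mkdir()

catalog = []
for name, tag in [("a.dat", "cold"), ("b.dat", "hot"), ("c.dat", "cold")]:
    path = source / name
    path.write_text(name)
    catalog.append({"path": path, "tag": tag})


def copy_to_archive(row):
    """Copy one file into the archive directory and report its name."""
    target = archive / row["path"].name
    shutil.copy2(row["path"], target)
    return target.name


# The query result is the work list for the batch step.
copied = execute(query(catalog, lambda row: row["tag"] == "cold"), copy_to_archive)
```

Swapping `copy_to_archive` for a move, delete, or checksum function changes what happens to the files without touching the selection logic, which is why separating discovery from execution keeps workflows simple.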

Do anything to your files that is doable

Can you afford DOING NOTHING except spending on additional storage?
  • archiving
  • ROT cleanup
  • data movement
  • automation

Solutions

Metadata and Content Classification

Starfish allows users and applications to associate metadata with files and directories. Features include:

  • Tagging

  • Key-value pairs

  • Inheritable directory metadata

  • Enhanced search/find

  • Visual query builder

  • Easy automation in CLI, GUI, and API

Reporting and Analytics

Starfish is the industry leader in reporting and file system analytics. Features include:

  • Metadata for greater specificity

  • Engineered for massive scale

  • Retention and analysis of historical data

  • Customizable with SQL

  • Support for third-party BI tools such as Tableau

  • More than just analytics

  • Orphaned file discovery

  • Quota management, chargeback, and showback

  • Identify candidate files for archive and deletion

Data Protection / Preservation

Starfish supports a number of data protection strategies, giving you complete control over the files you protect and how you protect them. Features include:

  • Data replication

  • Data mirroring

  • Backup/restore

  • Hash comparisons

  • Content addresses

  • DOIs and UUIDs

Tiered Storage and Capacity Optimization

Starfish often pays for itself simply by reclaiming wasted storage space. Starfish is superior to other solutions in that it enables users to participate in storage management. Capabilities include:

  • Data ROT (Redundant, Obsolete, and Trivial) cleanup

  • Automated storage tiering

  • Data disposition policies and workflows

  • Duplicate file detection and remediation

  • Orphaned file detection and remediation

  • Compression/decompression

Data Migration

Starfish is the ultimate data migration solution, and it can be licensed just for this purpose. Features include:

  • Parallel data movement

  • Support for NFS, SMB, and object

  • The ability to detect incremental file-system changes

  • Awareness of character encoding issues

  • Sensitivity to hard and soft links

  • Ability to do dry runs and simulations of complex migrations

Archive and Recovery

Starfish provides comprehensive and flexible archiving and recovery. Features include:

  • Leverage metadata to decide what to archive and what to recover

  • User self-service

  • Manager approval workflows

  • Support for POSIX, Windows, object, deep archives, and tape

Workflow Automation

Starfish combines metadata with batch processing, making it very simple to automate pipelines and workflows that involve file processing and data movement.

  • Metadata-driven workflows

  • Data movement

  • Metadata extraction from file headers

  • Hash comparisons

  • Content addresses

  • DOIs and UUIDs

Data Governance

Fine-tune detection logic across files; analyze permissions, file headers, and the content of individual files; and design remediation strategies. Capabilities include:

  • PII and data anonymization

  • File access permissions

  • Confidentiality/export controls

Industries

High Performance Computing

Starfish is the only reporting and unstructured data management solution in the industry today that works at HPC scale. Starfish has been vetted in some of the world’s largest and most demanding HPC centers. These environments are characterized by extremely high file counts, rapid and widespread churn, and enormous capacities. Starfish provides life cycle management and essential insights into capacity consumption.

Pharmaceutical / Biotech / Life Sciences

Starfish has been very successful in the life sciences. Our clients include top-10 pharmaceutical companies as well as leading biotech firms, biomedical research labs, government labs, and EDU labs. Starfish is great for instrumentation workflows, downstream data classification and management, data ROT (Redundant, Obsolete, and Trivial) cleanup, and archive automation. Starfish is also great for cloud bursting and AI workflow management.

Higher Education

Starfish was initially built for scientific research data management; in fact, many of our early customers were grant-funded EDU research facilities. Today, research organizations rely on Starfish to maintain good storage hygiene, conform to data management plans, and facilitate open access to raw data that supports published findings. With Starfish, data curation begins when files are first created.

Government Labs

Starfish is in production in government labs in the U.S. and abroad. U.S. government clients include DOE labs, NIH divisions, NIST, CDC, USDA, and the Federal Reserve. Starfish is also deployed at the European Bioinformatics Institute and the Australian CSIRO.

Financial Services

Starfish is deployed at several of the world’s largest hedge funds. Starfish tracks dependencies, runs data protection workflows, manages archives, and provides detailed capacity reporting on billions of files spread across NAS and HPC file systems. These file systems are often quite complex, containing deep directory trees and even symlinks that point to other symlinks.

Oil and Gas / Energy

Starfish is great at managing data upstream in the oil and gas discovery process. Starfish excels at high-volume data transfer, data ROT cleanup, compliance for licensed data sets, content classification, and archiving.

Museums and Libraries

Starfish is ideal for managing digital preservation workflows and is the ultimate storage management foundation for digital asset management. Starfish automates data integrity checks, replicates objects based on policies, provides a system of UUIDs or DOIs, and extracts metadata from file headers. Starfish can also serve as a global namespace for organizations with multiple asset management systems and content catalogs.

Semiconductor / Electronic Design Automation

Unlike conventional enterprise file-management solutions, Starfish is designed for high-churn environments that have billions of small files and extensive use of symbolic links. These are the hallmarks of large-scale electronic design automation sites. Starfish is also great for mapping directory trees to specific projects and enforcing data retention rules.

Careers

Want to be part of the Starfish team?

Check out our open positions. If you don’t see a match for yourself, please contact us anyway. We are always open to accommodating talented people.

Starfish is a great place to work. We have a small team of exceptionally talented people.

We have an enthusiastic, engaged customer base who are using our products to cure disease, preserve cultural heritage, drive innovative engineering, and make awesome movies!

We are flexible in our work culture. We respect work-life balance. We embrace work-from-home.

Starfish Storage provides equal employment opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, disability, genetic information, marital status, amnesty, military service, or veteran status in accordance with applicable federal, state and local laws. Know Your Rights: Workplace Discrimination is Illegal. Starfish Storage’s EEO Policy.

If you need assistance or an accommodation during the application process because of a disability, it is available upon request (Phone: +1-781-301-7500, Mail: Starfish Storage, Attn: Recruiting, 271 Waverley Oaks Road, Suite 301, Waltham, MA 02452, USA). The company is pleased to provide such assistance, and no applicant will be penalized as a result of such a request.

Contact Us

  • For a quicker response, please use an email address associated with the organization you represent.