BioMicro:FAIR

From OpenWetWare

What issues are we trying to solve?

In recent years, the lack of reproducibility in scientific studies has drawn particular attention, as researchers realized how many studies could not be reproduced because their metadata was missing. In the wake of this crisis, there has been a shift in how metadata is managed and released at publication. Storing rich metadata helps ensure that a project can be replicated; for example, a raw sequencing file on its own is far less useful than one associated with the sample species, the tissue, the type of sequencing used, and so on. In response, the NIH has mandated that metadata be released to the public for a study to be published or to receive federal grants. This helps ensure that published studies can be reproduced, which matters because reproduction confirms that the results and conclusions are accurate.

We manage and store data in a manner that meets publication standards. One way we do so is by entering data in real time, which avoids loss of metadata. If there is an issue with the metadata, it is much easier to recall the details of an experiment performed recently than one performed months or years ago. Uploading data on this timeline also avoids a last-minute rush at the time of publication. We also aim to minimize the activation energy required of researchers: we handle everything related to data storage, management, and publication, making the data release process as hassle-free as possible and allowing researchers to focus on their primary research.


How do we implement FAIR-compliant standards?

FAIR (Findable, Accessible, Interoperable, Reusable) compliance is important because it ensures the quality of the data being shared. NExtSEEK was built on FAIR data attributes, which sets it apart in how rigorously it incorporates FAIR data standards. All data stored in NExtSEEK is assigned a unique ID number, making it easy to search for a specific sample. Each sample is also stored with rich metadata, making it possible to search for samples by their metadata attributes. This search system is standardized: it is not specific to any one kind of data, and it can be extended to numerous data types. Retrieving data requires authentication and authorization, so the data is accessible while remaining secure and protected. NExtSEEK achieves interoperability by storing data in a broadly applicable manner and by allowing different data types to reference each other. Along with rich metadata, NExtSEEK data is stored with detailed connections and a clear point of origin, making it simple to see how different samples are related.
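The ideas in the paragraph above (unique IDs, metadata-based search, and links back to a point of origin) can be illustrated with a minimal Python sketch. This is not NExtSEEK's actual schema or API: the `Sample` and `Registry` classes, the field names, and the example IDs are all hypothetical, and authentication and authorization are left out entirely.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Sample:
    """Hypothetical sample record; not NExtSEEK's real data model."""
    uid: str                          # unique ID assigned to the sample
    metadata: dict[str, str]          # rich metadata attributes
    parent_uid: Optional[str] = None  # sample this one was derived from


class Registry:
    """Hypothetical in-memory store keyed by unique ID."""

    def __init__(self) -> None:
        self._samples: dict[str, Sample] = {}

    def add(self, sample: Sample) -> None:
        # Reject duplicate IDs so every ID stays unique.
        if sample.uid in self._samples:
            raise ValueError(f"duplicate ID: {sample.uid}")
        # A derived sample must point at a sample that already exists.
        if sample.parent_uid is not None and sample.parent_uid not in self._samples:
            raise KeyError(f"unknown parent: {sample.parent_uid}")
        self._samples[sample.uid] = sample

    def get(self, uid: str) -> Sample:
        # Direct lookup by unique ID.
        return self._samples[uid]

    def search(self, **criteria: str) -> list[Sample]:
        # Return samples whose metadata matches every given attribute.
        return [
            s for s in self._samples.values()
            if all(s.metadata.get(k) == v for k, v in criteria.items())
        ]

    def lineage(self, uid: str) -> list[str]:
        # Walk parent links back to the point of origin.
        chain: list[str] = []
        current: Optional[str] = uid
        while current is not None:
            chain.append(current)
            current = self._samples[current].parent_uid
        return chain


# Example data; IDs and attribute values are made up for illustration.
registry = Registry()
registry.add(Sample("S-0001", {"species": "Mus musculus", "tissue": "liver"}))
registry.add(Sample(
    "S-0002",
    {"species": "Mus musculus", "tissue": "liver", "assay": "RNA-seq"},
    parent_uid="S-0001",
))
registry.add(Sample("S-0003", {"species": "Homo sapiens", "tissue": "blood"}))

liver_ids = [s.uid for s in registry.search(tissue="liver")]
origin_chain = registry.lineage("S-0002")
```

Because the search works on arbitrary attribute names rather than a fixed set of fields, the same query mechanism applies to any data type that carries metadata, which is the sense in which the paragraph calls the search system standardized.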

NExtSEEK data meets community standards with regard to the metadata stored, making it easy for the scientific community to interpret and reuse the data. Because of the existing database architecture, SEEK is also adaptable after implementation, which supports the scalability and long-term capabilities of the system.


The importance of FAIR-compliant data

Data is the next big thing in science. Publishing data is now a requirement, so storing and managing data efficiently is more important than ever. The data managers at the BioMicro Center assist with this by storing data in a FAIR-compliant manner in NExtSEEK; this, along with timely data deposition and rich metadata, ensures that the published data is of high quality. This all ties into our mission to make the data side of submitting for publication as easy as possible, since it is almost guaranteed that all of the data required for submission will be correct and already stored in the database. Once a paper is ready to be published, the data stored in the database can be published to FAIRDOMHub, where it is accessible to the public. The data stored in NExtSEEK is also secure while still allowing for collaboration across groups, making it well suited for scientists to share their data prior to publication.