Image Data Management Resources: “On-site” visits
From September 2015 to January 2016, the National Coordination led a pilot survey to list the resources, equipment (including dedicated IT equipment), tools and expertise in image data management and bioimage informatics that actually exist at the different FBI sites. The survey also aimed to identify bottlenecks, in order to establish needs and anticipate potential projects. Perrine Paul-Gilloteaux, from the IPDM FBI node, was assigned to collect this information by visiting FBI platforms and R&D laboratories on site and interviewing the staff in charge, mainly engineers and researchers. The overall picture and the main proposals for action resulting from this survey are presented below.
When it comes to IT infrastructure for image data, most FBI nodes are disconnected from one another. Even sites within the same node do not share IT infrastructure, use different data repositories (where repositories exist at all), and have access to different network levels.
However, Core Facilities are facing a deluge of data from novel imaging technologies (see below), notably those acquired within the framework of the FBI program. Associated Research & Development teams would consider sharing their data through dedicated tools (an Image Data Repository) to facilitate the development of image processing tools, the validation and cross-comparison of data, or exchanges within new collaborations.
As mentioned by multiple sites and nodes, a large jump in data production, with its inherent difficulties, is expected from innovative approaches (SPIM, Serial Block Face …). So far, however, no clear solution, let alone a commonly agreed one, has been proposed for adequate storage and analysis. Image data storage requires a dedicated infrastructure, and the software currently in use is inadequate for processing and visualizing large (3D) data sets. OMERO appears to be the most widely used centralized storage/database system, but others coexist, and in any case there are no bridges between them within FBI. Yet centralized storage remains underused at most sites.
A data management plan may be needed to remove practical obstacles and improve the service, as part of a national quality approach. This would involve the following actions:
- Establish a data structure based on common semantics
- Develop interoperable software and tools adapted to human assimilation of big data
- Organize meetings between local IT engineers and technicians to discuss the current hardware infrastructure for data storage and transfer
- Define a common policy across FBI nodes regarding responsibility for data
- Set up a centralized repository to publish data from FBI working groups and users' gold-standard data, in order to facilitate exchange between users of multiple nodes and to present new data modalities to image processing teams
It will also be important to communicate about data management systems and teach how to use them, so as to remove behavioral barriers and engage the research community toward a better understanding of the challenges ahead:
- Big data valorization (dissemination with proper curation, exploitation, convenient visualization)
- Training of facility staff in data curation and annotation
- Metrology and facility monitoring based on the image database
- Coding parties and tagging of tools to facilitate access to software; development and dissemination of user-friendly tools; interfacing of software tools with databases; use of grid computing
- On-demand, focused training on specific image processing topics or on more general software platforms (Icy or others)