Data Management Plan
The CBB data management plan (DMP) was developed with multiple goals in mind. These are 1) promoting highly collaborative interdisciplinary research within CBB; 2) sharing the fundamental knowledge gained by CBB with the public and scientific community; and 3) ensuring that CBB methods and approaches are put to use in operating accelerators at national labs, at universities, and in industry.
In Phase I, CBB’s main scientific product was the fundamental knowledge needed to increase beam brightness. This was distributed through mechanisms such as archival journal publications, including supplemental information when appropriate, conference presentations and proceedings, and invention disclosures and patent applications. In addition, CBB produced educational material, which is available through publicly accessible websites.
In Phase II, CBB will pursue the most productive of the Phase I directions and will continue to distribute the resulting knowledge through the vehicles listed above. In addition, Phase II will prepare CBB methods and approaches for implementation in operating accelerators; its outputs will therefore also include engineering designs, technical drawings, and software packages.
CBB data and products
The types of data, samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project.
As an interdisciplinary center, CBB scientific research generates a wide variety of data. These include raw experimental data, laboratory logbooks, and the products of computer-based simulations and data analysis code.
Experimental data include 1) the output of beam instrumentation such as beam position monitors, Turbo-ICT diagnostics for beam charge, and knife-edge emittance diagnostics; 2) characterization of devices and processes, such as Haidinger interferometry for tuning a delay-plate holder for optical stochastic cooling, pressure and temperature data from Nb3Sn coatings, and RF signal, temperature, and other data from vertical performance testing of SRF cavities; and 3) materials characterization, including STM/AFM/TEM surface morphologies, optical and electron micrographs, X-ray photoelectron spectroscopy (XPS), Auger electron spectroscopy (AES), scanning tunneling microscopy/spectroscopy (STM/STS), and energy-dispersive X-ray spectroscopy (EDX/EDS).
Simulations and software produced by CBB include the simulation of beamlines and beam phenomena in the context of standard accelerator software packages such as ELEGANT and GPT; point-to-point computations of space charge; density functional theory (DFT) to study doping, growth and properties of SRF cavity and photocathode surfaces; solutions to Maxwell’s equations in vacuum coupled to time-dependent Ginzburg-Landau theory to study field enhancement in SRF cavities; and a genetic algorithm for structure and phase prediction (GASP) interfaced to GULP, LAMMPS and VASP. Outputs include material structures and associated properties, predictions of material characteristics, and simulated beamline performance.
CBB produces kits for use in middle school classrooms that address the Next Generation Science Standards. It also produces educational material related to particle accelerators.
Data Formatting and Content
The standards to be used for data and metadata format and content.
Experimental and simulation data. Most experimental data are recorded in an automated fashion using LabVIEW VIs, MATLAB scripts, or proprietary control/data acquisition software for the respective instruments. Every effort is made to store experimental data in formats that can be read by many software packages and/or have well-known file format definitions. These include ASCII and CSV, or binary formats (sometimes using typical MATLAB I/O standards and functions) for large data sets. Images are saved in TIFF, JPEG, or PNG format. Where practical, datasets generated in proprietary data formats (e.g., Omicron STM data) are translated into a non-proprietary format at the transfer stage. Data snapshots are stored in individual directories with appropriate time stamps. An index of the experimental data (metadata) is available in each directory and in the main parent directory, along with documentation in electronic or hardcopy laboratory notebooks. Custom metadata standards are developed by the investigators as necessary to unambiguously associate data with specimens, procedures, instruments, dates and times, analysis techniques, and any other information deemed necessary to fully understand the data analysis flow and results.
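The snapshot layout described above (one time-stamped directory per snapshot, a metadata record in that directory, and an index in the parent directory) could be sketched as follows. This is only an illustration under stated assumptions: the function name save_snapshot, the file names data.csv, metadata.json, and index.csv, and the metadata keys instrument and specimen are hypothetical and are not taken from CBB practice, which varies by investigator and instrument.

```python
import csv
import json
from datetime import datetime, timezone
from pathlib import Path


def save_snapshot(parent, rows, header, metadata, now=None):
    """Write one data snapshot into its own time-stamped directory.

    All file names and metadata keys here are illustrative assumptions.
    """
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")  # sortable UTC time stamp
    snapshot_dir = Path(parent) / stamp
    # Refuse to overwrite an existing snapshot with the same time stamp.
    snapshot_dir.mkdir(parents=True, exist_ok=False)

    # Data go into a plain CSV file that most packages can read.
    with open(snapshot_dir / "data.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)

    # Metadata record stored next to the data it describes.
    record = {"timestamp_utc": now.isoformat(), "data_file": "data.csv", **metadata}
    with open(snapshot_dir / "metadata.json", "w") as f:
        json.dump(record, f, indent=2)

    # One line per snapshot is appended to the index in the parent directory.
    with open(Path(parent) / "index.csv", "a", newline="") as f:
        csv.writer(f).writerow(
            [stamp, metadata.get("instrument", ""), metadata.get("specimen", "")]
        )
    return snapshot_dir
```

A call such as save_snapshot("run_data", rows, ["t_s", "charge_pC"], {"instrument": "demo-bpm", "specimen": "none"}) would create one new directory under run_data and add one line to run_data/index.csv; the argument values are placeholders.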
The file formats used for materials structures are standard for ab initio simulations. The structures will be stored in the Crystallographic Information File (CIF) format and the commonly used VASP and Quantum ESPRESSO file formats for easy access and compatibility. Properties will be stored in ASCII or XML file formats. These formats will ensure the longevity of the structure data and make them useful for other researchers who wish to study these materials for other applications.
Laboratory Notebooks. Hardcopy or electronic notebooks such as LabArchives (made by LabArchives, LLC) and ELOG (developed at the Paul Scherrer Institute) will be maintained to document daily activities and experiments. Entries are date- and time-stamped and include sufficient narrative to enable reproduction of all experiments or simulations, as well as clearly labeled references to any computer-recorded datasets. Logbooks will be preserved for at least ten years.
Engineering Designs. Mechanical drawings may be produced at Cornell, UCLA, NIU, or U Chicago. They are maintained in standard CAD formats. At Cornell, they are archived in a lab-wide ‘Vault’ database and can be made available on request in any desired CAD format. UCLA maintains an extensive library of SolidWorks models of beamline components and radiofrequency cavity designs. U Chicago and NIU have similar practices.
The practices for backing up servers storing experimental or simulation data vary with institution and are the responsibility of the individual investigator.
Software and Algorithms. CBB investigators use repositories such as GitHub, SourceForge, and SciDrive for storing software and algorithms. Algorithms implemented in popular software, such as ELEGANT, are incorporated into releases to the user base.
Materials samples. Physical specimens (e.g., etched or grown surfaces) are typically not preserved after study. Instead, they are disposed of in a manner consistent with university practices.
Pedagogical material. CBB produces a variety of pedagogical material, including material for US Particle Accelerator School courses, videos on topics in accelerator science, and an ontology of accelerator science. CBB members regularly teach at the US Particle Accelerator School, producing material such as lectures and homework assignments, which are available online at uspas.fnal.gov. For example, CBB members recently taught a two-week course on high-brightness beams: one week on photocathode physics and one week on beam dynamics. CBB also maintains a YouTube channel (https://www.youtube.com/channel/UCLLGLeOr-UUnws3RxbTK1NQ) with pedagogical videos, which have had over 6000 views. Finally, CBB has created an ontology of accelerator science using the WebProtege platform (webprotege.cbb.cornell.edu).
CBB activities do not include the use of human subjects or vertebrate animals, so they are not subject to IRB protocols. They do not collect ‘personal data’ as defined by the Data Protection Act 1998 (the DPA) or the equivalent HIPAA requirements.
Data Access and Sharing
Policies for access and sharing, including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements.
Data Sharing within CBB. The success of CBB depends on the free internal sharing of ideas and methods. The research themes typically meet biweekly, and most meetings consist of a presentation by a CBB participant or affiliate on work-in-progress. Comfort with the sharing of unpublished work is possible because of an Intellectual Property Agreement (IPA) (see below), which permits the sharing of ideas outside CBB only by the creator or with the creator’s explicit permission.
To provide ready access to the presentations at theme and collaboration meetings, CBB uses Indico (https://getindico.io), a CERN product that is in widespread use in the accelerator community. Each CBB meeting has its own Indico site with the agenda, slide presentations, supporting material, and Zoom links. By default, CBB Indico sites are accessible only to CBB members.
Reports, spreadsheets, meeting minutes and other documents are stored on a Cornell-hosted Box site. Box allows individualized access control for each document and folder.
The public CBB website shares information such as the CBB participant directory, the list of active CBB projects, recent results, publications, conference proceedings, and upcoming events. Additional information of primarily internal interest is available on an internal CBB website. The internal website includes information such as the CBB Strategic Plan, the CBB Handbook, CBB logos and templates, instructions for acknowledging NSF, safety guidelines, the ontology of accelerator science, and the link to Indico.
All CBB researchers are required to complete training in the Responsible Conduct of Research.
Intellectual Property Agreement (IPA). A CBB IPA governs ownership of intellectual property including patent applications, patents, copyrights, trademarks, mask works, trade secrets, software, and any other legally protectable information made in CBB. Unless otherwise agreed in writing, this IP is owned by the institution whose employees make or generate it. Jointly made or generated IP is jointly owned by the institutions of the creators unless otherwise agreed in writing. The institutions of all CBB participants and affiliates are signatories to the IPA.
Public sharing of CBB results. The results of CBB research will be presented at conferences and submitted for publication in peer-reviewed scientific journals. These publications are listed on the CBB website. CBB’s expectation is that publication will occur within two years of the generation of the data. Data are typically published in the form of graphs, images, and tables. Additional details are included directly as supplementary information or made available upon request after publication.
Whenever possible, after publication of an article, the data and code needed to interpret, verify, and extend the reported research are archived in a permanent, publicly accessible repository. For example, the group at Arizona State University uses the ASU Digital Repository, and Northern Illinois University uses Huskie Commons (https://commons.lib.niu.edu/). Others may use repositories such as Zenodo, Dryad, or the NSF-funded PARADIM data collective. All provide a citable DOI so that the data sets and code can be properly acknowledged and referenced in the publication.
Software and algorithms will be shared through repositories or included in releases of standard accelerator software packages such as ELEGANT or GPT. Links to repositories are available from the primary investigators, whose sites may be accessed from the main CBB web site.
CBB practice is to freely share technical information with scientists at national labs and universities. Ownership of the idea remains with the institution of the creator, and the receiving institution may not share it with others without express consent. Sharing with companies is encouraged, subject to terms negotiated by the creator’s institution.
Cornell freely shares technical information related to SRF cavities according to the guidelines of the TESLA Technology Collaboration (TTC), of which it is a member. The mission of TTC is “to advance superconducting radio frequency technology research and development and related accelerator studies across the broad diversity of scientific applications, and to keep open and provide a bridge for communication and sharing of ideas, developments, and testing across associated projects. To this end the Collaboration supports and encourages free and open exchange of scientific and technical knowledge, expertise, engineering designs, and equipment.”
If an investigator believes that an invention was generated through the proposed research, the investigator will file an invention disclosure, and a patent application will be prepared if appropriate. Jointly created IP will be handled in accordance with the rules set forth in the IPA.
Educational kits are available to teachers and the public through the CBB website.
Pedagogical videos and course material on accelerator science and related fields are publicly available on YouTube and at uspas.fnal.gov.
Policies and provisions for re-use, re-distribution, and the production of derivatives.
Published CBB data may be reused without express consent from the PIs, but reuse requires a statement of attribution and a disclaimer that the originators of the data are not responsible in any way for the reuse or for novel interpretations or results.
Plans for archiving data, samples, and other research products, and for preservation of access to them.
The analyzed data will be published in archival journals. Archival longevity depends on the continued existence of the respective journals, web resources, and data repositories.
All data generated for these publications, whether of broad utility or not, will be kept for at least three years beyond the end of the award, as recommended by NSF, and for a minimum of 10 years after publication.
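The two retention periods can be combined into a single date, as in the short sketch below. It assumes that both conditions must be satisfied, so the earliest date on which a data set could be discarded is the later of the two. The function names and the example dates are hypothetical and serve only to illustrate the arithmetic.

```python
from datetime import date


def add_years(d, years):
    """Return the same calendar day a number of years later."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use 28 February.
        return d.replace(year=d.year + years, day=28)


def earliest_disposal_date(award_end, publication_date):
    """Later of (award end + 3 years) and (publication + 10 years).

    Assumes both retention conditions apply to the same data set.
    """
    return max(add_years(award_end, 3), add_years(publication_date, 10))


# Hypothetical example: award ends 2026-09-30, article published 2022-05-01.
print(earliest_disposal_date(date(2026, 9, 30), date(2022, 5, 1)))  # 2032-05-01
```

In this example the 10-year publication rule governs, because 2032-05-01 falls after 2029-09-30.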