Import your data into the Instrument Data Wizard

To add new data to your existing Projects, Experiments, and Datasets in Instrument Data Service, you first need to download a YAML file from the service. This file lists your projects and contains the information the Wizard needs about your existing objects.

  • Log in to the Instrument Data Service with two-factor authentication (2FA).

  • Click your username in the top right and go to the “Get Instrument Data Wizard” page.

../../_images/import-00.png
  • Head to the ‘Starter file’ section and download the starter file ingestion.yaml.

../../_images/import-0.png
  • Move the downloaded ingestion.yaml file to the root folder of your data in the BIRU share drive.

In this tutorial, we have already prepared ingestion.yaml for you in the tutorial data folder, so there is no need to download it. The file contains the metadata for your existing projects in Instrument Data Service.

To get started, open the Wizard and click the Open button. Select the provided YAML file. Your projects from Instrument Data Service will be listed.

../../_images/metadata-41.png

The file includes one project called “Breast Cancer Drug Treatment Genomics” with ID “BREAST04”, one experiment named “Keytruda” with ID “Keytruda”, and one dataset named “Raw” with ID “Keytruda-Raw”.
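As a rough mental model, the hierarchy described above can be sketched as nested data. The key names below (`id`, `name`, `experiments`, `datasets`) are assumptions made for illustration only; they are not the actual ingestion.yaml schema, which is not shown in this tutorial.

```python
# Hypothetical in-memory view of the Project > Experiment > Dataset hierarchy.
# Key names are illustrative and do NOT reflect the real ingestion.yaml layout.
hierarchy = {
    "id": "BREAST04",
    "name": "Breast Cancer Drug Treatment Genomics",
    "experiments": [
        {
            "id": "Keytruda",
            "name": "Keytruda",
            "datasets": [{"id": "Keytruda-Raw", "name": "Raw"}],
        }
    ],
}


def render_tree(project):
    """Return the hierarchy as indented lines, one per object."""
    lines = [f'Project: {project["name"]} ({project["id"]})']
    for experiment in project["experiments"]:
        lines.append(f'  Experiment: {experiment["name"]} ({experiment["id"]})')
        for dataset in experiment["datasets"]:
            lines.append(f'    Dataset: {dataset["name"]} ({dataset["id"]})')
    return lines


print("\n".join(render_tree(hierarchy)))
```

Each Dataset belongs to exactly one Experiment, and each Experiment to one Project, which is the same nesting the Wizard shows after you open the file.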

To add new data, you can click the Import data files button, and the step-by-step wizard will prompt you to add files and ask how you would like to organise them.

You can also right-click on the Project, Experiment or Dataset you would like to add more data to, and select the Add Experiment, Add Dataset or Add files options.

As Sarah, you have some new raw data from the Herceptin trial that you would like to import.

After you click the Import data files button and go through the initial explanation screen, you will be asked whether to add files to an existing Project or create a new Project.

../../_images/import-41.png

Since this is data for the same project, choose the “Breast Cancer Drug Treatment Genomics” Project.

Then proceed through the rest of the wizard using this setup.

  • Create a new Experiment with the name “Herceptin” and ID “Herceptin”.

  • Create a new Dataset with the name “Raw” and ID “Herceptin-Raw”.

  • Data files: Add the .fastq files in the tutorial data folder, under tutorial/herceptin/.
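If you want to check beforehand which files the last step will pick up, a small script can list them. This is a minimal sketch that is independent of the Wizard: the helper name `find_fastq_files` and the sample file names are hypothetical, and the demo runs on a throwaway folder rather than the real tutorial data.

```python
import tempfile
from pathlib import Path


def find_fastq_files(data_root, subfolder):
    """Return .fastq paths under data_root/subfolder, relative to data_root."""
    root = Path(data_root)
    return sorted(
        path.relative_to(root).as_posix()
        for path in (root / subfolder).rglob("*.fastq")
        if path.is_file()
    )


# Demo on a temporary folder; the file names are made up for illustration.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "tutorial" / "herceptin"
    folder.mkdir(parents=True)
    for name in ("sample_1.fastq", "sample_2.fastq", "notes.txt"):
        (folder / name).touch()
    print(find_fastq_files(tmp, "tutorial/herceptin"))
```

To use it on your own data, pass the root folder of your data and the subfolder you intend to add in the Wizard.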

Once finished, your editor should look like this.

../../_images/import-51.png

Save your progress

Instrument Data Wizard keeps all your data structure and annotations in a YAML-formatted ingestion file. The Instrument Data Service ingestion process reads this file to find all your data files, so it needs to be saved in the root folder of your data.

Click the Save button, and save your ingestion file under the tutorial data folder. Use the same name, ingestion.yaml.
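Because the ingestion file has to sit in the root folder of your data, it can help to confirm that every data file lies somewhere beneath that folder. The following is a minimal sketch of such a check; the function name `files_outside_root` and the example paths are hypothetical, and the comparison is purely textual (it does not touch the disk or resolve `..` segments).

```python
from pathlib import PurePosixPath


def files_outside_root(ingestion_file, data_files):
    """Return the data files that are NOT under the ingestion file's folder."""
    root = PurePosixPath(ingestion_file).parent
    outside = []
    for data_file in data_files:
        try:
            # relative_to raises ValueError when data_file is not below root.
            PurePosixPath(data_file).relative_to(root)
        except ValueError:
            outside.append(data_file)
    return outside


# Hypothetical paths for illustration only.
misplaced = files_outside_root(
    "/data/tutorial/ingestion.yaml",
    ["/data/tutorial/herceptin/sample_1.fastq", "/data/other/sample_2.fastq"],
)
print(misplaced)
```

An empty list means every file is reachable from the folder that holds ingestion.yaml.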

Save as you go!

Remember to save your changes as you work! As the Instrument Data Wizard is still being developed, bugs and crashes may happen at inopportune moments. After a crash, you can reopen the file using the Open button.

Exercise: Add even more data

Try to re-create the hierarchy in the Instrument Data Wizard as described in the example data structure plan.

Once finished, your editor should look like this.

../../_images/import-exercise1.png