Data Goodness: Mostly in Black and White by Dom

The document discusses the importance of properly storing and managing research data. It notes that lost data results in no research, publications, jobs or PhDs. It recommends storing large data on servers which provide backups, mirroring and parity protection to prevent data loss from desktop computer failures. The document emphasizes curating data as work is done by using meaningful folder names, README files and keeping data tidy. It stresses testing software and using versioning tools to ensure scripts work properly and data is safely stored.
Copyright
© Attribution Non-Commercial (BY-NC)

Data goodness

Mostly in black and white By Dom

You must love your data!


Lost data:
no research
no publications
no jobs
no PhDs!
Sad Dom

Current imaging data in BRIC cost ~5.1M, just for scanning costs! (2011)

Look after your data!


It looks after you
Happy Dom

Data Storage
Home directories:
ISIS home, U Home
Not for large amounts of imaging data

Projects directory
ISIS, V:
Big stuff goes here

If you require large amounts of space (e.g. > 50 GB):
LET ME KNOW IN ADVANCE!

Server goodness
Why is the server a good place to store data?
Mirror and parity - some errors - data can be easily recovered

BACKUPS:
Tape backups, daily - 1 month retention
If you have funding, processed data can be mirrored off site
Raw data is always mirrored off site (ECDF) by default

Desktop PCs
Not reliable - no mirroring, no parity - some errors - data is lost (often all of it)
Network backups often fail:
Machines turned off, network busy
Moving to a new system when I get time!

Data love
Curation: Do this as you work!
Plan your data use

Use meaningful folder names
Make 'README.txt' files with dates, names of students/employees involved, references to software, scripts and versions, purpose of experiment/processing.
Be tidy with your data - tidy up occasionally
Friday afternoon - quick tidy up
Big tidy up at end of experiment/project/phase/year
BE CAREFUL, don't rush
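
One way to make the README habit cheap is to generate the file from a script. The sketch below is only an illustration: the function name `write_readme`, its parameters and the layout of the file are assumptions, not something the slides prescribe. It writes the fields listed above (date, people, software and versions, purpose) into a 'README.txt' in the given folder.

```python
from datetime import date
from pathlib import Path


def write_readme(folder, people, software, purpose, today=None):
    """Write README.txt into `folder` with date, people, software and purpose.

    All names here are hypothetical; adapt the fields to your own project.
    """
    today = today or date.today()
    lines = [
        f"Date: {today.isoformat()}",
        f"People: {', '.join(people)}",
        "Software, scripts and versions:",
    ]
    # One line per tool or script, with the version that was actually used.
    lines += [f"  - {name} {version}" for name, version in software.items()]
    lines.append(f"Purpose: {purpose}")

    readme = Path(folder) / "README.txt"
    readme.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return readme
```

Calling it at the start of each processing run, with the versions of the tools actually used, keeps the README in step with the data instead of relying on memory at tidy-up time.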

Data, spreadsheets, databases


Anonymisation *** Repatriation keys***

Code and Scripts


Coding:
Testing
Make sure that the software you are using does exactly what you think it does! Check every step for every image!
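
Part of "check every step for every image" can be automated. The sketch below assumes, purely for illustration, that each processing step writes one output file per input image under the same file name, and that images match the pattern `*.nii`; the function name `missing_outputs` is also hypothetical. It lists every input image whose output is missing or empty.

```python
from pathlib import Path


def missing_outputs(input_dir, output_dir, pattern="*.nii"):
    """Return names of input images with no non-empty output of the same name.

    Assumes one output per input, same file name (an illustrative convention).
    """
    missing = []
    for image in sorted(Path(input_dir).glob(pattern)):
        result = Path(output_dir) / image.name
        # A missing or zero-byte file usually means the step failed silently.
        if not result.is_file() or result.stat().st_size == 0:
            missing.append(image.name)
    return missing
```

A check like this only catches steps that failed silently. It does not show that the output is correct, so it complements looking at the images rather than replacing it.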

Do not use hard coded paths
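
A common way to avoid hard coded paths is to pass the data folder to the script on the command line. The sketch below uses Python's standard `argparse` module; the argument names `data_dir` and `--out`, and the default output folder `processed`, are assumptions chosen for the example.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Read the data and output folders from the command line."""
    parser = argparse.ArgumentParser(description="Process one study folder.")
    parser.add_argument("data_dir", type=Path,
                        help="folder holding the input images")
    parser.add_argument("--out", type=Path, default=None,
                        help="output folder (default: <data_dir>/processed)")
    args = parser.parse_args(argv)
    if args.out is None:
        # Derive the output location from the input, never from a fixed path.
        args.out = args.data_dir / "processed"
    return args
```

The same script then runs unchanged on a home directory, the projects directory or a colleague's machine: only the argument changes, not the code.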


Use versioning software (ECDF)

Safe data is Happy data!
