During the project

During the project, collect data in formats that are long-lasting. The Library provides training in EndNote, and can help you with information about format obsolescence, digital data preservation and documentation organisation. Data that is organised and well-documented is easier to find and use. Regularly assess your options for storing your data and moving it around. If your data is lost, stolen or misused, you could lose valuable work and damage your reputation as a researcher. Contact your Librarian for assistance.

Assess the durability of the file formats you will use by considering if the format is:

  • endorsed and published by standards agencies such as Standards Australia or ISO
  • publicly documented, i.e. complete authoritative specifications are available
  • the product of collaborative development and consultative processes
  • widely used and accepted as best practice within your discipline or other user communities.

You should also assess the long-term accessibility of any hardware and software used to create and manipulate research data.

If you develop software as part of your research, follow available best practice guidelines for developing, releasing and licensing your software.

Resources:
File formats and File formats (working with data) (Australian National Data Service)
Tips and tricks for sustainable software and development (Software Sustainability Institute, UK)

Seek advice from Technology Services if required.

Digital data
You should only store master copies of digital research data on:

  • SCU systems e.g. Enterprise systems like O365 OneDrive
  • SCU approved storage services for the Australian research sector
    • CloudStor+ (AARNet)
    • Space (Intersect)

Consult Technology Services if you need advice about secure storage options. Technology Services can refer you to SCU storage experts and authorised off-site providers. Gathering the following information will help you explain your needs to Technology Services staff:

  • current data volume - total size in MB/GB/TB - and likely rate of growth
  • number of files and folders, and how they are organised
  • location of your workspace/s, e.g. office, lab, home, in the field
  • platform - Mac / Windows / Linux
  • applications used to access and work with your data
  • frequency of update, e.g. working data that changes daily, or data from a completed project that needs to be retained but would not be used often
  • data type/s: spreadsheets, database, documents, images, datasets, etc.
  • any special security needs, e.g. clinical data, personal data, commercial potential
  • access control: Who needs access? Are they from SCU? If not, are they based in Australia or overseas? At universities or at other types of organisations?
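The first two items in the list above (data volume, and number of files and folders) can be gathered with a short script. The following is a minimal sketch, assuming your data sits under a single folder; the function names summarise_data and human_size are illustrative only and are not part of any SCU tool.

```python
import os
from collections import Counter
from pathlib import Path


def summarise_data(root):
    """Walk a folder tree and report its size, file and folder counts, and file types."""
    total_bytes = 0
    file_count = 0
    folder_count = 0
    extensions = Counter()
    for dirpath, dirnames, filenames in os.walk(root):
        folder_count += len(dirnames)
        for name in filenames:
            path = Path(dirpath) / name
            if path.is_symlink():
                continue  # skip links so linked files are not counted twice
            total_bytes += path.stat().st_size
            file_count += 1
            extensions[path.suffix.lower() or "(no extension)"] += 1
    return {
        "total_bytes": total_bytes,
        "files": file_count,
        "folders": folder_count,
        "types": dict(extensions),
    }


def human_size(num_bytes):
    """Format a byte count in decimal units (1 MB = 1,000,000 bytes)."""
    size = float(num_bytes)
    for unit in ("B", "kB", "MB", "GB"):
        if size < 1000:
            return f"{size:.1f} {unit}"
        size /= 1000
    return f"{size:.1f} TB"
```

Calling summarise_data on your project folder and passing the "total_bytes" value to human_size gives figures you can quote directly. Running the script again a few weeks later gives a rough indication of the rate of growth.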

Resources:
(SCU)
CloudStor+ (AARNet)
Space (Intersect).

Seek advice from Technology Services if required.

Desktop and laptop computers
You should not store master copies of digital data on individual desktop or laptop computers.
You should treat these as convenient working areas but not as primary stores.
Local drives fail and are often not backed up. Local machines are regularly replaced, upgraded or allocated to other people, and are sometimes stolen, so data is at risk of being lost or inappropriately accessed.
If you store additional working copies on local computers, schedule automatic synchronisation and/or backups and password-protect and physically secure the machines.

Removable media
You should not store master copies of digital data on removable media like CDs and DVDs, flash memory devices (e.g. USB sticks), and portable hard drives.
These are:

  • not always long-lasting, especially if they are not stored correctly (CDs/DVDs)
  • easy to damage physically (e.g. through magnetism or shocks)
  • prone to errors in writing to the media ('burning')
  • a risk in terms of data security - they are easy to misplace or lose, usually are not password-protected and are an easy target for viruses and malware.

If you store additional working copies on removable media, schedule automatic synchronisation and/or regular backups. You should password-protect and encrypt the media and ensure they are as physically secure as possible.

Choose high quality products, and follow the instructions provided by the manufacturer for care and handling, including environmental conditions and labelling.

Regularly check the media to make sure that they are not failing, and periodically 'refresh' the data (i.e. copy to a new disk, USB stick, or portable drive).
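One way to check that media are not failing is to record a checksum for every file when the data is first copied, then recompute and compare the checksums at each check. The following is a minimal sketch, assuming SHA-256 checksums and a manifest that you keep somewhere other than the media being checked; the function names are illustrative only.

```python
import hashlib
from pathlib import Path


def file_checksum(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): file_checksum(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def find_problems(manifest, root):
    """Compare a stored manifest with the files now present under root."""
    current = build_manifest(root)
    missing = sorted(set(manifest) - set(current))
    changed = sorted(
        name
        for name in manifest
        if name in current and manifest[name] != current[name]
    )
    return {"missing": missing, "changed": changed}
```

The same comparison is useful after 'refreshing' the data: build a manifest from the old media, copy the files, and run find_problems against the new media to confirm that nothing was lost or altered in the copy.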

Resources:
Caring for CDs and DVDs (British Library - UK National Preservation Office)

Seek advice from Technology Services if required.

Cloud services
With the exception of the research sector and enterprise solutions noted above, you must not store research data using services that are provided or managed externally to SCU by third parties. The reasons for this include:

  • Protection of intellectual property: Some cloud services assert their ownership of the intellectual property in anything that is uploaded by users.
  • Legal requirements: Storage of data that contains personal information outside Australia could be a breach of the Privacy Act.
  • Risk management: The Terms and Conditions of some cloud services state that they will take no responsibility for data loss and that they can withdraw the service at any time. There are also documented security breaches of many of these systems.

Resources:
Cloud computing and the privacy principles (Office of the Information Commissioner Queensland)

Seek advice from Technology Services if required.

CloudStor
You should use the CloudStor service run by AARNet (Australia's Academic and Research Network) to transfer research data, particularly data that contains sensitive or personal information. CloudStor:

  • can be used to transfer files to/from collaborators at other Australian universities as well as "external" users
  • encrypts your data before submission
  • is accessible using your SCU login details
  • can accommodate large files.

Email
You should avoid using email for data transfer. Some of the limitations of email include:

  • size restrictions - most institutions have limits on the size of emails and attachments (the SCU MS Outlook service restricts you to 25 MB)
  • security risks - particularly if you are working with data that is personally or commercially sensitive and/or utilising personal accounts that may not meet legal and ethical requirements around privacy and confidentiality, and
  • version control issues.

You should create and maintain sufficient documentation or metadata (i.e. structured information about the data) so that research data can be identified, discovered, associated with its owners and creators, linked to other related data or publications, and contextualised in time and space, and so that the quality of the data can be assessed and research results validated.

If you document your data poorly, it will be difficult (or impossible) to find and manage in the longer term. Even if you (or others, in future) can find the data, its value will be diminished if it is hard to interpret.

Practices will differ depending on your discipline, but you should always ensure that protocols are agreed early in the project and adopted by all researchers consistently.
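As one possible illustration of structured metadata, a small record can be saved next to each data file as a JSON 'sidecar' file. The following is a minimal sketch only: the field names and the ".metadata.json" suffix are assumptions made for this example, not an SCU or disciplinary standard, and you should substitute the fields agreed for your project.

```python
import json
from pathlib import Path

# Hypothetical field list; replace with the fields agreed for your project.
REQUIRED_FIELDS = (
    "title",           # identifies the data
    "creators",        # who created it
    "owner",           # who owns it
    "date_collected",  # context in time
    "location",        # context in space
    "related",         # links to related data or publications
    "quality_notes",   # information for assessing data quality
)


def write_metadata(data_path, record):
    """Write a metadata record beside a data file, refusing incomplete records."""
    missing = [field for field in REQUIRED_FIELDS if field not in record]
    if missing:
        raise ValueError("metadata is missing: " + ", ".join(missing))
    sidecar = Path(str(data_path) + ".metadata.json")
    sidecar.write_text(
        json.dumps(record, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return sidecar
```

Rejecting incomplete records at the point of writing is one way to make sure a protocol is applied consistently by everyone on the project.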

File naming for digital files
File names are important for identifying and finding digital files. You should develop file naming conventions early in a research project, and agree on these with colleagues and collaborators before data is created.

Conventions will differ depending on the nature and size of a research project. In all cases, filenames should be unique, persistent and consistently applied, if they are to be useful for finding and retrieving data.
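Once a convention is agreed, it can be checked automatically. The following is a minimal sketch, assuming a purely hypothetical convention of project_YYYYMMDD_description_vNN.extension in lower case (for example soilsurvey_20240315_site-a-readings_v02.csv); the pattern and the function name are illustrative, and your own convention will differ.

```python
import re
from datetime import datetime

# Hypothetical convention: project_YYYYMMDD_description_vNN.extension
FILENAME_PATTERN = re.compile(
    r"^(?P<project>[a-z0-9]+)_"
    r"(?P<date>\d{8})_"
    r"(?P<description>[a-z0-9-]+)_"
    r"v(?P<version>\d{2})"
    r"\.(?P<extension>[a-z0-9]+)$"
)


def parse_filename(name):
    """Return the parts of a conforming file name, or None if it does not conform."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        return None
    parts = match.groupdict()
    try:
        # Reject impossible dates such as 30 February.
        datetime.strptime(parts["date"], "%Y%m%d")
    except ValueError:
        return None
    return parts
```

Running a check like this over a shared folder from time to time shows whether the convention is being applied consistently, and the parsed parts can be reused when searching for or sorting files.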

Identifiers
An identifier is a reference number or name for a data object and forms a key part of your documentation and metadata. To be useful over the long term, identifiers need to be:

    • unique - globally unique if possible, but at the very least unique within your particular systems and processes, and
    • persistent - the identifier should not change over time.

The emerging identifier standard for publicly available datasets is the Digital Object Identifier (DOI). Although DOIs have traditionally been used for electronically published journal articles, they can now be assigned to datasets. SCU can assign a DOI to a collection that you make available through the institutional repository ePublications@SCU.
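For working data objects that will not receive a DOI, randomly generated UUIDs are one way to obtain identifiers that are unique and persistent within your own systems. The following is a minimal sketch under that assumption; the IdentifierRegistry class is hypothetical and keeps its records in memory only, so a real project would need to store the registry somewhere durable.

```python
import uuid


class IdentifierRegistry:
    """Keep a permanent mapping from identifiers to data object descriptions."""

    def __init__(self):
        self._records = {}

    def mint(self, description):
        """Create a new random identifier and record what it refers to."""
        identifier = str(uuid.uuid4())
        self._records[identifier] = description
        return identifier

    def update_description(self, identifier, description):
        """Change the description; the identifier itself never changes."""
        if identifier not in self._records:
            raise KeyError(identifier)
        self._records[identifier] = description

    def resolve(self, identifier):
        """Look up the description recorded for an identifier."""
        return self._records[identifier]
```

The design choice here is that identifiers are only ever minted, never edited or reused: the description of a data object can be corrected later, but anything that already cites the identifier continues to resolve.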

Controlled vocabularies
A vocabulary sets out the common language a discipline has agreed to use to refer to concepts of interest in that discipline. It models the concepts in a discipline by applying labels to the concepts and relating the concepts to each other in a formal structure.

Vocabularies take many forms. They include glossaries, dictionaries, gazetteers, code lists, taxonomies, subject headings, thesauri, semantic networks and ontologies.

Wherever possible, you should use an existing controlled vocabulary. Even if you need to adapt or customise an existing standard, this is preferable to creating something from scratch.

CC-BY

Adapted from Best practice guidelines for researchers: Managing research data and primary materials by Griffith University which is licensed under a Creative Commons Attribution 4.0 International License.