Tagging Documents and using Controlled Vocabularies

Overview

Documents can be tagged, and the tag values can be free text or autopopulated from a controlled vocabulary/ontology. PIs can decide to enforce the use of ontologies within their lab group. Enforcing ontologies removes the option to enter tags as free text and requires that all sources of controlled vocabulary for tags are files shared with a Group. This is described in detail below.

General use of tags

You can add tags to any document you can edit – just click on the 'edit' symbol and add them as a comma-separated list in the ‘Tags’ textbox at the top-right-hand side of the document view. Current tags are displayed in this location.

When searching for a document that has a given tag, RSpace autopopulates the possible search terms from existing tags. Searching for tags is a very powerful way to organise and aggregate your research documents in multiple ways – for example by grant number, project, or publication, or simply as a collective name for a related or themed set of documents. To view only a selection of documents marked with a particular tag, choose ‘Tag’ in the Workspace search drop-down and enter the identifying term you have tagged a series of entries or documents with – the search returns a table of results which only shows items containing the particular tag. Tagging documents makes them easier to find and collects them in search results with similarly tagged related content.

Stop words and Tagging

There are various words that the search engine will not find. These are generally short, common words such as 'of', 'and', 'this', 'that' etc. The search engine cannot find tags that consist solely of these stop words. It will, however, find tags that include stop words alongside other words. For example, it will not find a document tagged with the value 'of', but it will find a document tagged with 'Ides of March'.
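The stop-word rule can be sketched in a few lines of Python. This is an illustration only: the `STOP_WORDS` set contains just the four words quoted in this section (RSpace's full list is not given here), and `is_searchable_tag` is a hypothetical helper, not part of RSpace.

```python
# Hypothetical stop-word list: only the examples quoted in this section.
STOP_WORDS = {"of", "and", "this", "that"}


def is_searchable_tag(tag: str) -> bool:
    """Return True if at least one word in the tag is not a stop word."""
    words = tag.lower().split()
    return any(word not in STOP_WORDS for word in words)


print(is_searchable_tag("of"))             # False
print(is_searchable_tag("Ides of March"))  # True
```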

When creating a new tag, RSpace suggests autopopulated values from existing tags AND from any ontology files in your workspace or shared with you (see below).

You may use free text instead. Simply click in the textbox for tags to see the suggestions. If you type in the tag text box, your text is used to filter the autopopulated suggestions. Suggestions are in alphabetical order.

Tags can be created as key=value pairs: just enter the tag as 'key=value' (no surrounding quotes are required) using free text or in an ontology file (see below). A search for a tag 'key=value1' returns only documents tagged with 'key=value1', not documents tagged with 'key=value2' etc.

When ontologies are enforced, RSpace will only allow tag values from ontology files shared with a group you belong to.

RSpace displays at most 1000 suggestions at a time. When there are more than 1000 possible values, RSpace will display an initial value of

'============CLICK_HERE_FOR_NEXT_DATA============'.

Click on this value in the suggestions dropdown to load the next 1000 suggestions. When there is no more suggested data, RSpace will display:

'================BACK_TO_START================'

Click on this value in the suggestions dropdown to cycle back to the original suggestions.

When there are too many possible values, RSpace requires you to narrow them down by entering some text and will display: 'Too many results, please enter a specific search term'.
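The paging behaviour described above can be modelled roughly as follows. This is a sketch under stated assumptions, not RSpace's implementation: the filter is assumed to be a case-insensitive substring match, both markers are assumed to appear first in the list, and `suggestion_page` is a hypothetical name. The 'Too many results' case is not modelled because its threshold is not stated.

```python
PAGE_SIZE = 1000
NEXT_MARKER = "============CLICK_HERE_FOR_NEXT_DATA============"
BACK_MARKER = "================BACK_TO_START================"


def suggestion_page(values, typed="", page=0):
    """Return one page of suggestions for the text typed so far.

    Assumptions: substring filtering, marker shown as the first entry.
    """
    needle = typed.lower()
    matches = sorted((v for v in values if needle in v.lower()), key=str.lower)
    if len(matches) <= PAGE_SIZE:
        return matches  # everything fits, so no marker is needed
    start = page * PAGE_SIZE
    chunk = matches[start:start + PAGE_SIZE]
    has_more = start + PAGE_SIZE < len(matches)
    return [NEXT_MARKER if has_more else BACK_MARKER] + chunk


values = [f"term{i:04d}" for i in range(2500)]
print(suggestion_page(values)[0] == NEXT_MARKER)          # True
print(suggestion_page(values, page=2)[0] == BACK_MARKER)  # True
print(len(suggestion_page(values, typed="term00")))       # 100
```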

Whenever you save (or delete) a tag, RSpace creates/updates a file inside an 'Ontologies' folder in your workspace with an icon representing its purpose:

This file is an example of a controlled vocabulary/ontology file. Do not edit this file by hand as it will be overwritten on any future saving of your tags.

Forbidden characters in tags

The following characters are forbidden in tags and will be rejected if entered as free text or from a tag suggestion term: '<', '>', '/', '\'
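If you prepare tag lists outside RSpace, a quick check for these four characters can catch problems before you enter them. A minimal sketch follows; `forbidden_chars_in` is a hypothetical helper name.

```python
# The four characters listed above as forbidden in tags.
FORBIDDEN_CHARS = set("<>/\\")


def forbidden_chars_in(tag: str) -> list[str]:
    """Return the forbidden characters found in a tag (empty list if none)."""
    return sorted(set(tag) & FORBIDDEN_CHARS)


print(forbidden_chars_in("grant_42"))  # []
print(forbidden_chars_in("a/b<c"))     # ['/', '<']
```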

Controlled vocabularies/ontologies

Note - as described above, you have an autogenerated ontology file created for you whenever you save/delete a tag in any document. (The file does not exist until you save/delete a tag). You may want to share or export this file, allowing other users access to the controlled vocabulary which you have created as document tags.

You may create ontology files to ensure that an agreed set of terms is used for tags. Although referred to as ontologies/controlled vocabularies throughout this documentation, these files are really just controlled vocabularies, as they have no concept of nested terms or hierarchies. However, they do support key=value pairs, which can be useful for e.g. namespacing.

To create an ontology file:

Click create -> From Form -> RSpace Tags from Ontologies

The generated ontology file contains 20 fields, each called 'Ontologies for Tag creation'.

This file is just a normal RSpace file but it will be used to generate tag 'suggestions' following some simple rules as follows:

Ontology terms should be comma separated. There can be one key per line of text, separated by an '=' from the values it matches.

For example, I create a controlled vocabulary to describe experiments, so I edit 'Ontologies for Tag creation' and enter: "started,finished,phase1,phase2". Whenever I choose to tag a document, these 4 values will appear as 4 separate suggested terms for the new tag. If I wished to namespace this, I would enter: "experiment_stage=started,finished,phase1,phase2". Whenever I choose to tag a document, I will now have 4 separate suggested key=value pairs for the tag. These will be 'experiment_stage=started', 'experiment_stage=finished' etc.
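The rules above (comma-separated terms, at most one key per line, key separated from its values by '=') can be expressed as a short Python sketch. It only models the rules as described here; `suggestions_from_ontology_text` is a hypothetical function, and trimming whitespace around terms is an assumption.

```python
def suggestions_from_ontology_text(text: str) -> list[str]:
    """Expand the text of an ontology field into suggested tags."""
    suggestions = []
    for line in text.splitlines():
        # Split off an optional key; sep is "" when the line has no '='.
        key, sep, values = line.strip().partition("=")
        if not sep:
            key, values = "", key
        prefix = f"{key.strip()}=" if sep else ""
        for value in values.split(","):
            if value.strip():  # assumption: blank terms are ignored
                suggestions.append(prefix + value.strip())
    return suggestions


print(suggestions_from_ontology_text("started,finished,phase1,phase2"))
# ['started', 'finished', 'phase1', 'phase2']
print(suggestions_from_ontology_text("experiment_stage=started,finished"))
# ['experiment_stage=started', 'experiment_stage=finished']
```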

There can only be one key per line of text, so to create further key=value pairs you must enter the values on a separate line in the file:

After saving the ontology document, whenever I create a new tag for any document I will see the following suggestions:

Viewing ontology files

It can be useful to see all ontology files that you own or that are shared with you. There is an 'ontologies' view in the workspace:

This shows your ontology files and ontology files shared with you. Note that if ontologies are enforced, only those files which have been shared with a Group will be making any contribution to the controlled vocabulary you can use to create new tags. (Clicking on the info, 'i', button for a file will show you whether it has been shared with a Group).

Uploading ontology files in csv format

External ontologies can now be used in RSpace - for example https://bioportal.bioontology.org/ allows download of ontologies in the required CSV format.

If an external ontology file is available in CSV format, it can now be uploaded to RSpace. Go to the export-import page under the My RSpace tab and choose a file using the dialog under 'Import an ontology file - csv format'.

As explained, RSpace actually uses controlled vocabularies in a 'flat' format, so using an external CSV-formatted ontology consists of choosing a single column that will be used as terms for the controlled vocabulary. Tell RSpace which column to use by editing the CSV ontology file and adding a new line at the top with the following text: 'USE_COLUMN_X', where 'X' is replaced by the number of the column of interest.

For example, having downloaded the 'BRENDA Tissue and Enzyme Source Ontology' from https://bioportal.bioontology.org/ontologies/BTO in CSV form, I decide to use the 'preferred label' column, which is column 2 in the CSV file (starting numbering at 1, not 0). I edit the CSV file, adding a new line USE_COLUMN_2 as the first line in the file.

After a successful upload, RSpace will open the workspace, showing the new file. The ontology terms have been comma separated and will all be on one 'line', with up to 10,000 terms per 'Ontologies for Tag creation' field in the document. There are 20 fields in the document, which means an uploaded CSV file can contain at most 200,000 ontology terms (20 × 10,000).

The 'USE_COLUMN_X' text must be the only text on the first line of your csv file. It is not recommended to use Excel/Word etc to edit ontology files - for example Excel will insert commas after the 'USE_COLUMN_X'. RSpace will ignore commas on the first line of the file but any other changes can cause the file upload to fail.
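One way to avoid spreadsheet side effects is to add the first line with a small script instead. The Python sketch below works on the CSV text directly; `add_use_column_line` and `preview_column` are hypothetical helpers, and the sample data is invented for illustration. How RSpace treats a header row is not stated here, so the preview simply returns every row's value in the chosen column.

```python
import csv
import io

MAX_TERMS = 20 * 10_000  # 20 fields x 10,000 terms per field


def add_use_column_line(csv_text: str, column: int) -> str:
    """Prepend 'USE_COLUMN_<column>' as the only text on the first line."""
    if column < 1:
        raise ValueError("columns are numbered from 1, not 0")
    return f"USE_COLUMN_{column}\n{csv_text}"


def preview_column(csv_text: str, column: int) -> list[str]:
    """Return the values in the chosen 1-based column, skipping short rows."""
    rows = csv.reader(io.StringIO(csv_text))
    return [row[column - 1] for row in rows if len(row) >= column]


# Invented sample data, not taken from a real ontology download.
sample = "Class ID,Preferred Label\nid-1,liver\nid-2,kidney\n"
print(preview_column(sample, 2))  # ['Preferred Label', 'liver', 'kidney']
print(add_use_column_line(sample, 2).splitlines()[0])  # USE_COLUMN_2
print(len(preview_column(sample, 2)) <= MAX_TERMS)     # True
```

To apply this to a file, read it with `pathlib.Path.read_text`, pass the text through `add_use_column_line`, and write the result to a new file so the original download stays untouched.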

If the upload of csv ontologies creates a large number of suggested ontology terms when creating new tags, RSpace will require you to filter the suggested tags options by typing a value into the tags text box.

External ontology files may not be uploaded on the RSpace Community server.

Troubleshooting: If your .csv file fails to import, or imports but does not seem to create a corresponding RSpace ontology document, check the .csv carefully for extra commas or other hidden content that may have been introduced, especially if the file has been edited or created with an application such as MS Excel or OpenOffice.

If your ontology file has only ONE column of data, specify 'USE_COLUMN_1' as the first line. In this specific case, you do not need commas in the file, as there are no columns of data to separate.

You can copy/paste small amounts of data into an existing RSpace ontology file rather than doing a CSV upload, so why use the CSV method? The CSV upload method is useful because 1) you can build, edit or acquire the CSV outside of RSpace, and 2) if you attempt to paste more than a few hundred lines of text, you will likely cause your browser to freeze or crash.
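For the single-column case, the upload file can be generated from a plain list of terms. This is a minimal sketch: `single_column_ontology_csv` is a hypothetical helper, and rejecting terms that contain commas is an assumption made here because RSpace stores ontology terms as a comma-separated list.

```python
def single_column_ontology_csv(terms) -> str:
    """Build upload-ready text: 'USE_COLUMN_1' first, then one term per line."""
    cleaned = [term.strip() for term in terms if term.strip()]
    for term in cleaned:
        if "," in term:
            # Assumption: a comma inside a term would split it into two terms.
            raise ValueError(f"term contains a comma: {term!r}")
    return "\n".join(["USE_COLUMN_1", *cleaned]) + "\n"


print(single_column_ontology_csv(["liver", "kidney "]))
# USE_COLUMN_1
# liver
# kidney
```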

Sharing ontologies

Ontology files can be shared as with any RSpace file and recipients will be able to use them to create tags. This includes sharing with collaboration groups.

You may export an ontology file as an RSpace archive. Any recipient can then re-import it (as an RSpace archive, not as a CSV file!).

Enforcing ontologies

In order to guarantee all group members use a shared vocabulary for tagging, a Lab Group PI now has the option to enforce ontologies on the My LabGroups page:

If ontologies are enforced, the following applies to all members of the Lab Group/Collaboration Group (including the PI):

New Tags cannot be entered as free text.

New Tag suggestions will not be autopopulated from existing tag values.

New Tag suggestions will only be autopopulated from ontology files which have been shared with a Group.

As a user, if any group I am a member of has 'enforce ontologies' turned on, then these rules take effect in my workspace. This includes collaboration groups.

Some examples:

  • I am group PI and I turn on Enforce Ontologies. No ontology files have been shared with any group I belong to. I will not be able to create any new tags until an ontology file is shared with a group I belong to.
  • I am group PI and I belong to a collaboration group. Another PI in the collaboration group turns on 'enforce ontologies' for the collaboration group. No ontology files have been shared with any group I belong to. I will not be able to create any new tags until an ontology file is shared with a group I belong to.
  • I am a member of a group with enforced ontologies and I wish to use my own ontology file to create tags. I share it with the PI. I will not be able to use the ontology file to create tags and neither will the PI. I must share the file with a Group before I, or any member of the group, may use the file to create tags.
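The enforcement rules and the examples above can be captured in a small Python model. It is a sketch of the rules as described in this section, not RSpace's code: the `Group` and `OntologyFile` classes and the `allowed_suggestions` function are hypothetical, and `ontology_files` stands for the files a user owns or has been given.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    name: str
    enforce_ontologies: bool = False


@dataclass
class OntologyFile:
    terms: list[str]
    shared_with_groups: set[str] = field(default_factory=set)  # group names


def allowed_suggestions(my_groups, ontology_files, existing_tags):
    """Return (suggestions, free_text_allowed) for one user."""
    enforced = any(g.enforce_ontologies for g in my_groups)
    my_group_names = {g.name for g in my_groups}
    # Existing tags only feed suggestions when nothing is enforced.
    terms = set() if enforced else set(existing_tags)
    for f in ontology_files:
        # Under enforcement, only files shared with one of my groups count.
        if not enforced or f.shared_with_groups & my_group_names:
            terms.update(f.terms)
    return sorted(terms), not enforced


lab = Group("lab", enforce_ontologies=True)
mine = OntologyFile(["started", "finished"])       # not shared with a group
print(allowed_suggestions([lab], [mine], ["old"]))  # ([], False)
mine.shared_with_groups.add("lab")                  # now shared with the group
print(allowed_suggestions([lab], [mine], ["old"]))  # (['finished', 'started'], False)
```

Note that sharing a file with an individual, such as the PI, leaves `shared_with_groups` empty in this model, which matches the third example: the file contributes nothing until it is shared with a Group.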

