Deposit Manual TLA
The repository system of The Language Archive (TLA) features an integrated web-based deposit system that allows users to archive their data. This manual describes the use of this deposit system. Because it is an integral part of the archive, there is no separate URL for it; the deposit functionality becomes visible automatically to logged-in users of the archive who have been granted deposit permissions (see section 2.1). TLA hosts research data from researchers at the Max Planck Institute for Psycholinguistics (MPI), as well as from certain external depositors. This document describes the workflow for external depositors.
In order to facilitate long-term preservation of its holdings, TLA accepts a limited number of file formats, which are listed on this page. That page also contains any further conditions that apply to some of the accepted file types. File names can only contain alphanumeric characters (without accents/diacritics), dots, hyphens and underscores. Spaces in file names are not allowed. For each accepted format, only specific file extensions are allowed, as listed in the table. File extensions should all be lower case with the exception of ’TextGrid’.
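The file-name rules above can be checked locally before uploading. The following is a minimal sketch, not part of the deposit system: the function name `file_name_problems` is hypothetical, and the check covers only the character and letter-case rules stated here. It does not know which extensions the archive accepts, so the accepted-formats page remains the authority.

```python
import re

# Characters the manual allows: unaccented letters, digits, dots,
# hyphens and underscores. Spaces are therefore rejected as well.
ALLOWED_NAME = re.compile(r"[A-Za-z0-9._-]+")


def file_name_problems(name):
    """Return a list of problems found in a file name (empty if none).

    Hypothetical helper for illustration; it checks only the rules
    stated in this manual, not the list of accepted formats.
    """
    problems = []
    if not ALLOWED_NAME.fullmatch(name):
        problems.append(
            "contains characters other than A-Z, a-z, 0-9, '.', '-' or '_'"
        )
    stem, dot, extension = name.rpartition(".")
    if not dot or not stem or not extension:
        problems.append("has no file extension")
    elif extension != "TextGrid" and extension != extension.lower():
        # 'TextGrid' is the one extension that keeps its capitals.
        problems.append("extension should be lower case")
    return problems
```

For example, `file_name_problems("my recording.WAV")` reports both the space and the upper-case extension, while `file_name_problems("tones.TextGrid")` returns an empty list.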
Below we explain a number of important concepts that are used throughout the deposit system:
Metadata: Metadata is information about the archived materials that allows others to discover and re-use them. TLA uses the CMDI metadata framework as a standard for its descriptive metadata. We support a select number of CMDI profiles that are listed here. At the moment, the deposit system includes web forms for editing metadata using the MPI_Bundle, MPI_Collection, lat-corpus and lat-session profiles. Metadata in one of the other supported profiles cannot be edited online but can be uploaded as files.
Bundle: a Bundle in the archive contains one or more data files (e.g. audio and video data) and their associated metadata. Typically, all files that are linked to the same Bundle should have a logical relation to one another, e.g. a video recording and its transcript, all trials for a given experimental task for a given subject, all photographs of a given event, etc. This typically means that the metadata for the Bundle applies to all files within the Bundle, e.g. in terms of date, location, participants, etc. There is currently a limit of 50 files that can be attached to one Bundle. For certain exceptional use cases where a larger number of files needs to be attached to one Bundle, those files can be zipped together in one zip file. Please check with the archive staff before doing so. For language corpora, this is typically not allowed.
Collection: a Collection in the archive is used to group Bundles, or other Collections. This enables us to create hierarchical Collection structures. A Collection also has descriptive metadata.
Nextcloud: external depositors need a Nextcloud space in order to archive research data. This service is hosted at the MPI and will be created upon request.
2 Deposit permissions, user account & dashboard
2.1 Deposit permissions
Before you can use the archive’s deposit system, you need a registered account. If you have not archived with The Language Archive before, the collection you would like to archive also has to be approved. For more info about our collection development policy, see this page. After approval, deposit permissions will be assigned to your account.
2.2 Registration & Login
To register an account, or log in to the archive, go to the account login page found here and select the ’create new account’ or ’Log in’ tab. If your institution is part of one of the supported Identity Federations, you may be able to use your institutional account credentials to log in to the archive via the "Login with Shibboleth" link. To register an account, fill out the required details.
Enter a username
Enter a valid email address
Enter your full name
Enter your affiliation (e.g. institution or department)
After logging in, you will be taken to the ’My account’ page. You can also go to ’My dashboard’ (see My dashboard) from the top navigation-menu once your deposit permissions have been assigned.
2.3 My account
In the ’My account’ section, you can view and edit your account information. You can change your password, add or edit a profile picture, and set your timezone and affiliation.
2.4 My dashboard
’My dashboard’ is the central hub for the archiving activities: the ’My Bundles’ section displays a list of Bundles that are currently in progress. ’My Collections’ lists the active Collections, to which you can add Bundles. ’My reports’ contains reports about Bundle validation and archiving actions.
2.4.1 My Bundles
The ’My Bundles’ tab displays a list of all the Bundles that you are currently working on. These are either new Bundles that have not been archived yet, or updated Bundles that have not yet been submitted for archiving.
The table on this page contains some information about each Bundle. You can see to which Collection a Bundle belongs (if any), what the status of the Bundle is (see below), whether metadata have been created for it, and when it was initiated. You can also delete Bundles by clicking the ’delete’ link. (Note that deleting Bundles that are updates of existing Bundles in the archive will only delete the update, not the original version in the archive.)
The ’status’ for a Bundle can be:
Open: the Bundle can be edited (either metadata or adding/removing resources)
Validating: the Bundle is being checked for valid metadata and resources.
Processing: the Bundle is being archived.
Failed: the Bundle validation or archive action failed (see the report (My reports) for more info). It may be possible to remedy the issue by re-opening the Bundle, editing it and submitting it again. In case of continued problems, contact the archive staff.
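The statuses above, together with the actions described later in this manual, can be read as a small state table. The sketch below is one interpretation of the manual and not the deposit system's actual implementation. The `VALID` status is an assumption taken from the later sections on validation (a validated Bundle is closed until it is archived or re-opened), and the action names are hypothetical labels.

```python
from enum import Enum


class BundleStatus(Enum):
    OPEN = "Open"
    VALIDATING = "Validating"
    VALID = "Valid"  # assumption: implied by the validation sections, not listed above
    PROCESSING = "Processing"
    FAILED = "Failed"


# Actions a depositor can take in each status, as read from this manual.
# The action names are illustrative labels, not identifiers from the system.
ALLOWED_ACTIONS = {
    BundleStatus.OPEN: {"edit", "validate"},
    BundleStatus.VALIDATING: set(),   # wait for validation to finish
    BundleStatus.VALID: {"archive", "re-open"},
    BundleStatus.PROCESSING: set(),   # wait for archiving to finish
    BundleStatus.FAILED: {"re-open"},
}


def can(status, action):
    """Return True if the manual describes `action` as available in `status`."""
    return action in ALLOWED_ACTIONS[status]
```

Read this way, a failed Bundle always has to pass through ’Open’ and a fresh validation before it can be archived.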
2.4.2 My Collections
’My Collections’ displays all of your ’active’ Collections. These can be Collections you have added via the ’activate’ tab in the archive browser (see Activate Collection), or newly created Collections, which are automatically added.
You can click on an active Collection for more details. There is also a shortcut to view the Collection in the archive. Removal of the active Collection from the list can be done with the ’delete’ function. It will then be put on the list of inactive Collections. You can view the list of your inactive Collections via the link below the active Collections.
When viewing an active Collection in more detail, you can see whether there are any Bundles in progress that are associated with the Collection. You can work on them by clicking on their name. You can also validate, archive and re-open one or more Bundles by checking the box in the ’select’ column and clicking the desired action button. Note that for these actions, metadata for the Bundle should already be available. See the image below for a detailed overview.
2.4.3 My reports
The ’My reports’ tab consists of an overview of notifications sent to you by the deposit system. These will mostly be validation and archiving reports of your Bundles. You can see whether a Bundle was validated or archived successfully, or whether it failed. You can view detailed information by clicking the report.
If a Bundle failed to validate, or if archiving failed, you can view the report and try to find out what went wrong, or you can contact the archive staff with the included report and inform them that something went wrong.
You can also delete all reports older than 2 weeks, or choose to delete them all by clicking the appropriate button. This action cannot be reversed!
3 Archiving data
3.1 Activate Collection
When you are ready to archive, inform the data-management team (firstname.lastname@example.org) that you wish to archive a project. They will then create a new empty Collection within your (department’s) section of the archive, based on a metadata form which will be sent to you.
In order to add Bundles or Collections to a Collection, it needs to be ’Activated’. To do so, locate the Collection in the archive browser and click the ’Activate’ tab on the Collection page. The Collection will then be listed in the ’My Collections’ overview of your dashboard (see My Collections). You will only see the ’Activate’ tab on Collections to which you’ve been assigned deposit permissions.
Once your Collection is active, you can start adding sub-Collections and/or Bundles to it. Bundles can either be added from the ’My Bundles’ overview or from the browser-tab shown below. Additional Collections can be added from the archive browser tabs only.
3.2 Add Collection
After activating your Collection, you can choose to add a new (sub)Collection (via the Add Collection tab in the archive browser), or to update the metadata of the current Collection (section 3.4.1).
To add a Collection to an active Collection, browse to the Collection in the archive browser and click the ’Add Collection’ tab. You can also reach the Collection via ’My dashboard>My Collections’ and clicking the ’Collection in archive’ link there. Next, click the ’Add Collection’ tab.
Select the metadata profile you wish to use. For most language corpora, this should be lat-corpus. For most other types of Collections, you should use MPI_Collection. Collection metadata in other supported profiles can be uploaded as a file. Once you’ve selected the appropriate profile, fill out all the mandatory fields and any other metadata fields you wish to add. The ’Add’ button found under certain metadata fields allows you to add multiple instances of a certain element.
You also need to select an initial access policy for the Collection. In exceptional cases where the Collection needs to be temporarily invisible (in cases where the title alone would give away too much information), you can make the Collection "private" and tick the "hide metadata" box. This will make the Collection only visible to you when searching and browsing the archive, and to no one else. The access policy can be refined later if necessary.
Note that most of the metadata values that you enter for the MPI_Collection will become the default values for any MPI_Bundle or MPI_Collection that you add to the Collection later on, such that you don’t need to enter the same information again, but only modify what is relevant.
When you have filled out all of the metadata, click ’Submit’. The Collection will be created and added to your list of active Collections. If you made a mistake while filling out any metadata, you will be notified by the system and you will need to correct the error and try the submission again. The submission of the Collection can take a bit of time. Please wait for it to finish.
Once the Collection has been created, you can update its metadata at a later stage if necessary, see Update Collection for additional info.
3.3 Add Bundle
3.3.1 Create Bundle
To add actual data files to your Collections, you will need to create Bundles. A Bundle should contain one or more files, in file formats that are accepted by the archive as listed on this page. See the remarks about file names and extensions in the introduction of this manual.
You will need to upload the data files you wish to add to a Bundle into a Nextcloud folder. See Nextcloud data upload on how to do so.
To create a Bundle, go to ’My dashboard>My Bundles’ and click ’Initiate new Bundle’. Alternatively, you can initiate a new Bundle via the archive browser. Browse to an active Collection (Activate Collection) and choose ’Add Bundle’ from the tabs displayed.
On the page that appears, give the Bundle a name, choose the Collection it belongs to, set the access rights that apply and select the Nextcloud folder that contains the resource data.
Below you find a brief explanation of the fields you will need to fill out:
Title: enter the title of your Bundle. This title will also be copied into the metadata and will be the label for the Bundle that is displayed in the browser. Titles need to be unique within a given parent Collection.
Parent Collection: select the Collection you wish to add the Bundle to. (already set in case you initiate the Bundle via the "add Bundle" tab of the Collection)
Access policies: select which access policy should be applied to the files within this Bundle.
"Open" materials can be accessed by anyone without having to log in.
"Registered Users" means any user with a valid account for the archive.
"Academic Users" are users that log in with an academic account or whose academic status has been verified.
"Restricted" means that the materials are only accessible to the depositor and authenticated users. With this option, the metadata can be either visible or hidden.
How will you provide metadata: this can be done by filling out a web-form, or by uploading a CMDI file.
Enter using form: metadata can be entered through an online form.
Upload a CMDI file: Upload a CMDI metadata file that you've created with a different tool. In case there are links to resources in your CMDI file, you will need to provide those files in the Nextcloud folder you select in the next step (number of files and filenames need to match). The metadata can be further modified using an online form once the bundle has been created.
Upload a CMDI file as a template: Upload an existing CMDI metadata file to be used for this bundle, as a template. Any existing resource links will be removed and instead the resources in the Nextcloud folder that you will choose in the next step will be added to the bundle. The metadata can be further modified using an online form once the bundle has been created.
How will you provide files: choose ’Select a Nextcloud folder’. See Nextcloud data upload below for more information.
Subfolder: select the appropriate folder inside the ’Nextcloud’ folder. This can be repeated to select further subfolders, until you’ve reached the final folder that contains the files for your Bundle. As noted, the finally selected folder cannot contain any subfolders and can contain at most 50 files. The current path you’ve selected is displayed at the bottom of the form.
When you have filled out everything, click ’save’ to continue. The next page will display the created Bundle containing the resources from the selected Nextcloud folder.
3.3.2 Nextcloud data upload
To upload data for archiving, you will need to use the Nextcloud instance hosted at the MPI. This is a cloud-based storage system, which is linked to the depositing system of the archive. The folders you create in Nextcloud can be selected when creating new Bundles (see Create Bundle).
To be able to create Bundles in conjunction with the Nextcloud server, please follow these steps:
Open a new browser tab and go to the TLA Nextcloud service. Login with the same username or email address and password you used to register with the Language Archive.
From the main overview page, you can start creating folders by clicking the ’+’ icon found on the top half of the page. Choose the ’folder’ option from the context menu that appears. Once you have created the folders, you can add data to them.
To add data to a folder, first click on the appropriate folder to enter it. Next, click the ’+’ icon again and choose the ’Upload file’ function from the context menu. Select the file(s) you wish to upload from your local computer. Alternatively, you can also just drag and drop files (as well as complete folders in most browsers) from your computer to the web page.
Nextcloud also has clients available that you can download and install on your own computer. These work very similarly to "Dropbox" and comparable tools, where you can select a local folder that will be synchronised with the cloud. To use those clients with the MPI Nextcloud, you need to enter the URL of the server: https://archive.mpi.nl/nextcloud along with your login credentials.
Once you have uploaded the data you wish to archive, you can return to the Language Archive page and start the creation of a Bundle, see Create Bundle. Choose the option ’Select a Nextcloud folder’ as the way to provide data for the Bundle you are creating. You can repeatedly select subfolders, until you’ve reached the one you want to use. As noted, the finally selected folder for the Bundle cannot contain any further subfolders and can contain at most 50 files.
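If you prepare your data in a local folder (for instance one that is synchronised by a Nextcloud client), the two folder constraints above can be checked before you create the Bundle. This is a minimal sketch under that assumption: the function name `bundle_folder_problems` is hypothetical, and the check looks only at the local folder, not at what has actually arrived on the server.

```python
from pathlib import Path

MAX_FILES_PER_BUNDLE = 50  # limit stated in this manual


def bundle_folder_problems(folder):
    """Return a list of problems with a prospective Bundle folder.

    Hypothetical helper for illustration; it checks only the two folder
    rules from this manual: no subfolders and at most 50 files.
    """
    entries = sorted(Path(folder).iterdir())
    subfolders = [entry.name for entry in entries if entry.is_dir()]
    files = [entry.name for entry in entries if entry.is_file()]

    problems = []
    if subfolders:
        problems.append("contains subfolders: " + ", ".join(subfolders))
    if len(files) > MAX_FILES_PER_BUNDLE:
        problems.append(
            f"contains {len(files)} files; the limit is {MAX_FILES_PER_BUNDLE}"
        )
    return problems
```

An empty list means the folder satisfies both rules; the validation step of the deposit system still has to confirm that the files themselves are in accepted formats.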
3.3.3 Bundle overview
The Bundle overview displays the files that will be included in the Bundle.
Fill in/edit metadata for Bundle: This allows you to fill in the metadata for the data included in the Bundle (Fill in metadata for Bundle). Only enabled in case you’ve opted to fill in the metadata by using a form. If you uploaded a CMDI file, you can edit the metadata here.
Validate Bundle: Here you can check if your resource data and metadata are valid before archiving it (Validate Bundle). The data folders of valid Bundles are moved away from the workspace location, such that they can no longer be modified.
Re-open Bundle: When a Bundle is valid, it will be closed so you can archive it. Clicking ’Re-open’ will allow you to work on the Bundle again. Data folders will be moved back to their original location, such that they can be modified. After re-opening a Bundle, validation is required again before the Bundle can be archived.
Archive Bundle: Click this to archive your valid Bundle. The resource data will be moved to the archive together with the metadata (Archive Bundle).
Edit Bundle properties: You can edit some of the Bundle properties for Bundles that are still open, e.g. change the name, access policy or select a different data folder.
Delete Bundle: Click this if you want to delete the Bundle completely. (Selected files in your workspace will not be deleted).
3.3.4 Fill in metadata for Bundle
After creating the Bundle, choose the ’Fill in metadata for Bundle’ button to start creating metadata describing the resource data inside the Bundle. This step is not necessary when you chose to upload a CMDI file containing the metadata. You can still edit the metadata here, or you can immediately validate the Bundle (Validate Bundle).
On the page that will be displayed, you’ll have to choose a profile to use for the metadata. After this selection, you will be presented with a form based on the chosen profile that you will have to fill out.
The ’Add’ button allows you to add multiple instances of a certain element. For instance, if your Bundle contains multiple types of data (e.g. photographs and videos), click ’Add’ to add an additional data type.
Notice that some values of the form will have already been filled in. For the MPI_Bundle profile, these values are inherited from the Collection that you have added the Bundle to. You should, however, modify these values where that is appropriate for the Bundle.
When all of the (required) metadata has been filled in, click ’Submit’. If there were errors (an empty required field), you will be prompted to correct them. Otherwise, you will be taken back to the Bundle overview page where you can further process your Bundle.
3.3.5 Validate Bundle
Once you have successfully created or uploaded metadata for the resource data inside your Bundle, you must validate the Bundle prior to archiving it. The system will then check both the resource data and metadata to make sure the data is acceptable and contains no errors. You can only validate a Bundle after successfully creating a Bundle containing both resources (Create Bundle) and metadata (Fill in metadata for Bundle).
To validate a Bundle, go to the Bundle overview page (’My dashboard>My Bundles’), and click the ’validate’ button. Alternatively, and if you would like to validate multiple Bundles at once, you can go to ’My dashboard>My Collections’. Select the Collection to which the Bundle(s) belong(s), and check the ones you wish to validate.
Once validation starts, you will be taken to the ’My dashboard>My Bundles’ page and the progress will be indicated (My Bundles). When the validation process is done, the ’Status’ column will tell you the outcome (i.e. if it is valid, or if it failed and requires further attention).
In case the Bundle failed to validate, you will have to make corrections to it. You can check the validation report under ’My reports’ for more info regarding the error(s) (My reports). You can find more information on how to correct errors in the next section.
If the Bundle is valid, you may proceed with archiving it (see Archive Bundle).
3.3.6 Correcting failed Bundles
If a Bundle failed to validate, you will have to correct the error(s). To do so, check the report of the failed Bundle (My dashboard>My reports).
A Bundle may fail to validate due to issues with resource data or metadata. In both cases, you can read from the report what the source of the error is.
Invalid resource data: In the example picture above, a specific file did not validate. This happens when a file is not in one of the accepted file formats (see the page of accepted file types). You can choose to remove the file from the workspace folder, or convert the file to an accepted file format. Alternatively, contact email@example.com for help.
Invalid metadata: If you entered metadata for the resources in the Bundle, it may happen that something you entered is invalid. In this case, you will have to remove the invalid metadata file and re-create it. To do so, click on the Bundle to go to the Bundle overview page and click ’Edit Bundle properties’. Next, remove the attached metadata file by clicking ’remove’. Finally, click ’Save’ to save the Bundle.
Back in the Bundle overview page, you can now fill out the new metadata for the Bundle. See Fill in metadata for Bundle for more info.
Once you have corrected the error(s), you may validate the Bundle again by clicking on the failed Bundle. You will be taken to the Bundle overview, where you can click ’Validate Bundle’ again (Validate Bundle).
In case you’re seeing different errors or in case your Bundle fails to validate repeatedly, please contact the archive staff.
3.3.7 Archive Bundle
Archiving a Bundle is easy, once it has been validated by the system. Simply go to the ’Bundle overview’ page by clicking on the Bundle in the ’My dashboard>My Bundles’ page. From there, click the ’Archive Bundle’ button.
Alternatively, or if you want to archive multiple Bundles at once, go to ’My dashboard>My Collections’, click the Collection to which the Bundle belongs, check the boxes in the ’select’ column and click ’Archive Bundle(s)’.
The Bundle(s) will be processed and you will be taken to the ’My dashboard>My Bundles’ page. After archiving is complete, a report will be placed in the ’My reports’ tab. If anything went wrong during archiving, it will be stated there, so please check the report(s) after archiving your data.
3.4 Update content
After archiving your data, you can revise it at any given time. You can update the metadata of a Collection or a Bundle, or add/remove resources from an existing Bundle.
3.4.1 Update Collection
To update the metadata of a Collection, browse to the Collection in the archive (via ’My dashboard>My Collections>link to active Collection’). Next, click the tab ’Update’ to start updating the metadata. When done, click ’Submit’ and the metadata for the Collection will be updated.
3.4.2 Update Bundle
You can update a previously archived Bundle, by adding or removing resources, but also by updating the metadata. Go to the archived Bundle, and either select the ’Update Bundle’ tab (to add/remove resources) or the ’Update metadata’ tab (to update metadata).
3.4.2.1 Update Bundle resources
To update the resources in the Bundle, fill out the form displayed on the screen. Select the appropriate folder in your workspace containing the additional resource files (If you only want to remove specific resource files, you will currently need to select an empty folder there!). When done, click ’Submit’ and you’ll be taken to the Bundle overview page.
In the overview, you will see the list of files currently in the archive, which you can select to remove from the Bundle. Below it is the list of files that will be added to the current Bundle.
Once you have updated your Bundle, click ’Validate’ to check if all data is valid. If everything is valid you may archive the updated Bundle. If you made a mistake or you noticed not all files that you wish to add are in the new overview, you can click ’delete Bundle’ to start over. This does not delete the Bundle from the archive!
3.4.2.2 Update Bundle metadata
If you only want to update the metadata of a Bundle, you can do so by selecting the ’Update metadata’ tab of the archived Bundle. You can either upload a CMDI file containing the updated metadata, or edit via the form. The form will display the currently filled out metadata for the Bundle, which can be edited. Once done, click ’Submit’.