TG-crud

Overview

TG-crud is the CREATE, RETRIEVE, UPDATE, and DELETE service of TextGrid. It is responsible for creating, retrieving, updating, and deleting TextGrid resources, i.e. TextGrid objects including their TextGrid metadata. A TextGrid object or resource consists of a data file, e.g. a TEI XML file or an image, and a metadata file conforming to the TextGrid Metadata Schema. TG-crud also generates TextGrid URIs (see TG-noid) and can be used to LOCK and UNLOCK objects. All available methods are described below. TG-crud offers its methods via REST and SOAP; exceptions are noted in the method descriptions below.

Technical Information

The class TGCrudServiceImpl implements the service interface created by the CXF web service implementation. The service is used to ingest, access, and delete TextGrid objects, each of which has two components: the metadata file and the data file. It is meant to be used by the TextGridLab software and by other TextGrid services that the TextGridLab uses. All service methods can be called in one of three ways: with the Java client classes provided in the middleware.tgcrud.clients.tgcrudclient folder (using the JAXB data binding), with any client built against the TG-crud’s WSDL or WADL file, or with the tgcrud-client Maven module provided with the parent tgcrud module.

GIT Repository

The productive TextGrid Release can be found here:

The TG-crud GIT repository contains the current development version, which is usually runnable and may be deployed somewhere as a test version. It can be found here:

All the modules of the TG-crud are available in the DARIAH Nexus Maven Repository:

Version

This page is valid for the current TG-crud service version. To check the current productive TG-crud version, simply try:

More Documentation

You can find more information in the current productive TG-crud WSDL file here:

Installation

Getting the WAR File from Source

You can check out the TG-crud from the GIT repository (trunk or tagged productive version) and use Maven to build the TG-crud service WAR file.

First go to your favorite home directory, and create a folder for the code:

mkdir src/
cd src

Now check out the TG-crud service code via GIT:

git clone git@gitlab.gwdg.de:dariah-de/dariah-de-crud-services.git

Build the package:

cd service
mvn package

You will get a TG-crud WAR file in the folder ./tgcrud-base/target.

Deploying the TG-crud Service

Simply deploy this WAR file, tgcrud.war, to your favorite application server, e.g. Apache Tomcat.

Configuration

TG-crud is configured and set up by Puppet.

Dependencies

API Documentation

TG-crud can be addressed via SOAP and REST. Nearly all TG-crud calls need an RBAC session ID and accept an optional log parameter. You can get the session ID from the TG-auth* Web-Auth service if you have a TextGrid or Shibboleth account.

Parameter Overview

Parameter Description
sessionId The RBAC session ID used for the access checks. You can get it from the TG-auth* Web-Auth service.
logParameter A deprecated parameter; the log service no longer exists.
uri If no URI is given here, TG-crud will create a new one. If a URI is given, it must have been created via TG-crud’s GETURI method and must not have been used before. This parameter is mostly used when objects are imported and prepared automatically and the URIs are needed for internal references, link rewriting, etc. For some methods, such as reading and deleting, the URI is mandatory. The READ and READMETADATA methods can resolve not only TextGrid URIs but also Handle PIDs, which are created when objects are published to the TextGrid Repository. Published objects can therefore mostly be referred to by both TextGrid URIs (textgrid:1234) and Handle PIDs (hdl:11858/00-1734-0000-0005-1424-B). If you use TextGrid URIs, you can choose to give either (a) the direct revision URI (textgrid:1234.5) or (b) the base URI (textgrid:1234), in which case the latest revision is addressed. PIDs always address revision URIs!
createRevision Set to TRUE if a new revision of the given object shall be created (the base URI is mandatory then): the revision number is increased and the base URI stays the same. Set to FALSE (default) if a new object shall be created using a new URI.
projectId The Project ID of the project the new object shall be created in.
tgObjectMetadata The TextGrid metadata as XML object, see WSDL file and XML metadata schema.
tgObjectData The data object.
howMany The number of TextGrid URIs to generate (an integer value), see GETURI.
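
The distinction between base URIs and revision URIs described for the uri parameter can be illustrated with a small helper. This is only a sketch: the class TextGridUriSketch is hypothetical and not part of TG-crud or its client, and it assumes that a revision is always the integer after the last dot, as in the examples textgrid:1234.5 and textgrid:vqmz.0.

```java
import java.util.OptionalInt;

/** Hypothetical helper (not part of TG-crud): splits a TextGrid URI into base URI and revision. */
final class TextGridUriSketch {
    final String baseUri;        // e.g. "textgrid:1234"
    final OptionalInt revision;  // empty for a base URI, which addresses the latest revision

    private TextGridUriSketch(String baseUri, OptionalInt revision) {
        this.baseUri = baseUri;
        this.revision = revision;
    }

    /** Parses "textgrid:1234" or "textgrid:1234.5"; assumes the revision follows the last dot. */
    static TextGridUriSketch parse(String uri) {
        if (uri == null || !uri.startsWith("textgrid:")) {
            throw new IllegalArgumentException("Not a TextGrid URI: " + uri);
        }
        int dot = uri.lastIndexOf('.');
        if (dot < 0) {
            return new TextGridUriSketch(uri, OptionalInt.empty());
        }
        int rev = Integer.parseInt(uri.substring(dot + 1));
        return new TextGridUriSketch(uri.substring(0, dot), OptionalInt.of(rev));
    }

    /** True if the URI names one specific revision, as UPDATE, DELETE, LOCK, and UNLOCK require. */
    boolean isRevisionUri() {
        return revision.isPresent();
    }

    public static void main(String[] args) {
        TextGridUriSketch u = parse("textgrid:1234.5");
        System.out.println(u.baseUri + " revision " + u.revision.getAsInt());
    }
}
```

A check such as isRevisionUri() could be run on the client side before calling a method for which the revision URI is mandatory.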

#GETVERSION

Just returns the current version of the TG-crud.

Parameters

  • NONE

RESTful access

#CREATE

The CREATE method of the TG-crud service is used to create TextGrid objects and store them TO THE GRID. It does the following in the given order:

  • Check if publish access is granted to the given RBAC session ID (ONLY if the DIRECTLY PUBLISH option is set to TRUE).
  • Check if create access is granted to the project resource using the given RBAC session ID.
  • Compute the revision number to use.
  • Create new TextGrid URI if needed, check URI if given.
  • Create the generated part of the metadata (the generated metadata type).
  • Get some public data out of the metadata (ONLY if the DIRECTLY PUBLISH option is set to TRUE)
  • STORE AGGREGATION, EDITION, or COLLECTION DATA to the INDEX DATABASE.
  • STORE ORIGINAL XML DATA to the INDEX DATABASE.
  • Call the AdaptorManager, process data if needed.
    • If an aggregation object is ingested: Add the subject’s URI to the aggregation ORE file.
    • Put the namespaces of XSD files into the RDF database.
    • Put the relations extracted with the AdaptorManager into the RDF database.
    • Add warnings to the metadata, if existing.
  • STORE METADATA and DATA TO THE GRID.
  • STORE RELATIONS to the RDF DATABASE.
  • STORE METADATA to the INDEX DATABASE.
  • REGISTER RESOURCE to the TG-auth.
  • Set the isPublic flag in TG-auth (ONLY if the DIRECTLY PUBLISH option is set to TRUE)
  • Add permissions to the metadata.
  • Return the complete metadata element.

Parameters

Name Status Type
sessionId mandatory String
logParameter optional or empty String
uri optional or null URI (revision URIs are ignored, the new revision always is computed from base URI!)
createRevision mandatory Boolean
projectId mandatory String
tgObjectMetadata mandatory MetadataContainerType
tgObjectData mandatory DataHandler

RESTful access

  • HTTP POST https://textgridlab.org/1.0/tgcrud/rest/create
  • Parameters as stated above
  • tgObjectMetadata and tgObjectData as Multipart
  • Special header information provided
    • Location: TextGrid URI
    • Last-Modified date
  • Response: 200 OK, MetadataContainerType delivered in body (text/xml)
  • Errors
    • MetadataParseFault: 400 BAD REQUEST
    • WebApplicationException: 500 INTERNAL SERVER ERROR
    • ObjectNotFoundFault: 404 NOT FOUND
    • AuthFault: 401 UNAUTHORIZED
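
The RESTful CREATE call above can be sketched with the standard Java HTTP client. This is a minimal sketch under several assumptions that the text above does not confirm: that sessionId, projectId, and createRevision are passed as query parameters, that the multipart parts are named like the parameters tgObjectMetadata and tgObjectData, and that multipart/form-data is the expected multipart type. The class name and the fixed boundary are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of a CREATE request; parameter passing and part names are assumptions. */
final class TgcrudCreateSketch {
    static final String CREATE_URL = "https://textgridlab.org/1.0/tgcrud/rest/create";
    // A real client should generate a random boundary that cannot occur in the content.
    static final String BOUNDARY = "tgcrud-sketch-boundary";

    /** Appends one multipart part (headers, blank line, content) to the body buffer. */
    static void addPart(ByteArrayOutputStream body, String boundary, String name,
                        String contentType, byte[] content) {
        String header = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"" + name + "\"\r\n"
                + "Content-Type: " + contentType + "\r\n\r\n";
        body.writeBytes(header.getBytes(StandardCharsets.UTF_8));
        body.writeBytes(content);
        body.writeBytes("\r\n".getBytes(StandardCharsets.UTF_8));
    }

    /** Builds the multipart body with the metadata part first and the data part second. */
    static byte[] multipartBody(String boundary, byte[] metadataXml, byte[] data, String dataType) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        addPart(body, boundary, "tgObjectMetadata", "text/xml", metadataXml);
        addPart(body, boundary, "tgObjectData", dataType, data);
        body.writeBytes(("--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8));
        return body.toByteArray();
    }

    /** Builds (but does not send) the POST request; query parameter passing is an assumption. */
    static HttpRequest createRequest(String sessionId, String projectId, boolean createRevision,
                                     byte[] metadataXml, byte[] data, String dataType) {
        String query = "?sessionId=" + URLEncoder.encode(sessionId, StandardCharsets.UTF_8)
                + "&projectId=" + URLEncoder.encode(projectId, StandardCharsets.UTF_8)
                + "&createRevision=" + createRevision;
        return HttpRequest.newBuilder(URI.create(CREATE_URL + query))
                .header("Content-Type", "multipart/form-data; boundary=" + BOUNDARY)
                .POST(HttpRequest.BodyPublishers.ofByteArray(
                        multipartBody(BOUNDARY, metadataXml, data, dataType)))
                .build();
    }
}
```

The request could then be sent with java.net.http.HttpClient; on success the TextGrid URI of the new object would be expected in the Location header, as listed above.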

#CREATEMETADATA

This method is not implemented yet!

It is intended for creating TextGrid objects that hold the object’s metadata and only an HTTP reference to the object’s data. TG-crud would then fetch the data via HTTP and deliver it via TG-crud#READ.

Parameters

Name Status Type
sessionId mandatory String
logParameter optional or empty String
uri optional or null URI (revision URIs are ignored, the new revision always is computed from base URI!)
projectId mandatory String
externalReference mandatory String
tgObjectMetadata mandatory MetadataContainerType

#READ

The READ method of the TG-crud service reads TextGrid objects (including metadata) FROM THE GRID, and does the following in the given order:

  • Check if read access is granted to the given URI using the given RBAC session ID.
  • Read the metadata and data files FROM THE GRID.
  • Fill dataContributor and permissions tags with the information provided by the checkAccess query.
  • Return the complete data and metadata elements.

Parameters

Name Status Type
sessionId optional or empty for public resources String
logParameter optional or empty String
uri mandatory URI – TextGrid URI or PID
tgObjectMetadata mandatory for SOAP requests for delivering the metadata MetadataContainerType
tgObjectData mandatory for SOAP requests for delivering the data DataHandler

RESTful access

  • HTTP GET https://textgridlab.org/1.0/tgcrud/rest/textgrid:vqmz.0/data
  • Special header information provided
    • Last-Modified date
  • Response: 200 OK, Object delivered in body (mimetype depending on object type)
  • Errors
    • ObjectNotFoundFault: 404 NOT FOUND
    • MetadataParseFault: 400 BAD REQUEST
    • IoFault: 500 INTERNAL SERVER ERROR
    • ProtocolNotImplementedFault: 400 BAD REQUEST
    • AuthFault: 401 UNAUTHORIZED
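
As an illustration of the RESTful READ call, the following sketch builds the GET request for an object's data. It assumes that a session ID, when needed, is passed as a query parameter named sessionId; the text above does not state how the REST interface expects it. The class name is hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of a READ request; passing sessionId as a query parameter is an assumption. */
final class TgcrudReadSketch {
    static final String REST_BASE = "https://textgridlab.org/1.0/tgcrud/rest/";

    /**
     * Builds (but does not send) a GET request for the data of the given object.
     * The sessionId may be null or empty for public resources.
     */
    static HttpRequest readRequest(String textGridUri, String sessionId) {
        String url = REST_BASE + textGridUri + "/data";
        if (sessionId != null && !sessionId.isEmpty()) {
            url += "?sessionId=" + URLEncoder.encode(sessionId, StandardCharsets.UTF_8);
        }
        return HttpRequest.newBuilder(URI.create(url)).GET().build();
    }
}
```

Sending the request with java.net.http.HttpClient and HttpResponse.BodyHandlers.ofByteArray() would keep binary objects such as images intact.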

#READMETADATA

The READMETADATA method of the TG-crud service reads the metadata of a TextGrid object (metadata only) FROM THE GRID, and does the following in the given order:

  • Check if read access is granted to the given URI using the given RBAC session ID.
  • Read the metadata file FROM THE GRID.
  • Fill dataContributor and permissions tags with the information provided by the checkAccess query.
  • Return the complete metadata element.

Parameters

Name Status Type
sessionId optional or empty for public resources String
logParameter optional or empty String
uri mandatory URI – TextGrid URI or PID

RESTful access

  • HTTP GET https://textgridlab.org/1.0/tgcrud/rest/textgrid:vqmz.0/metadata
  • Special header information provided: Last-Modified Date
  • Response: 200 OK, MetadataContainerType delivered in body (text/xml)
  • Errors:
    • ObjectNotFoundFault: 404 NOT FOUND
    • MetadataParseFault: 400 BAD REQUEST
    • IoFault: 500 INTERNAL SERVER ERROR
    • AuthFault: 401 UNAUTHORIZED

#UPDATE

The UPDATE method of the TG-crud service updates a TextGrid object including metadata and data. Furthermore, it does the following in the given order:

  • Retrieve the metadata FROM THE GRID.
  • Store aggregation, edition, and collection data to the INDEX database.
  • Store the original data (if XML) to the INDEX database.
  • Delete the relations for the given object from the RDF database.
  • Call the Adaptor Manager.
  • Store the relations again to the RDF database.
  • Store metadata and data TO THE GRID.
  • Store the metadata to the Index database.
  • Return the updated metadata element.

User locking is involved here (please see LOCK and UNLOCK).

Parameters

Name Status Type
sessionId mandatory String
logParameter optional or empty String
uri mandatory for RESTful access! Not used for SOAP access, the URI is included in the metadata object involved! URI (use of revision URI is mandatory here!)
tgObjectMetadata mandatory MetadataContainerType
tgObjectData mandatory DataHandler

RESTful access

  • HTTP POST https://textgridlab.org/1.0/tgcrud/rest/textgrid:1234.5/update
  • Parameters as stated above
  • tgObjectMetadata and tgObjectData as Multipart
  • Special header information provided
    • Location: TextGrid URI
    • Last-Modified date
  • Response: 200 OK, MetadataContainerType delivered in body (text/xml)
  • Errors:
    • MetadataParseFault: 400 BAD REQUEST
    • IoFault: 500 INTERNAL SERVER ERROR
    • ObjectNotFoundFault: 404 NOT FOUND
    • AuthFault: 401 UNAUTHORIZED
    • UpdateConflictFault: 409 CONFLICT

#READTECHMD

The READMETADATA method of the TG-crud service also provides access to technical metadata. Technical metadata is only available for published objects!

Parameters

Name Status Type
uri mandatory URI – TextGrid URI or PID

RESTful access ONLY

#UPDATEMETADATA

The UPDATEMETADATA method of the TG-crud service updates the metadata of a TextGrid object. Furthermore, it does the following in the given order:

  • Retrieve the metadata FROM THE GRID.
  • Retrieve Adaptor data FROM THE GRID.
  • Delete the relations for the given object from the RDF database.
  • Store the relations again to the RDF database.
  • Store metadata TO THE GRID.
  • Store the metadata to the INDEX database.
  • Return the updated metadata element.

User locking is involved here (please see LOCK and UNLOCK).

Parameters

Name Status Type
sessionId mandatory String
logParameter optional or empty String
uri mandatory for RESTful access! Not used for SOAP access, the URI is included in the metadata object involved! URI (use of revision URI is mandatory here!)
tgObjectMetadata mandatory MetadataContainerType

RESTful access

  • HTTP POST https://textgridlab.org/1.0/tgcrud/rest/textgrid:1234.5/updateMetadata
  • Parameters as stated above
  • tgObjectMetadata as Multipart
  • Special header information provided
    • Location: TextGrid URI
    • Last-Modified date
  • Response: 200 OK, MetadataContainerType delivered in body (text/xml)
  • Errors:
    • MetadataParseFault: 400 BAD REQUEST
    • IoFault: 500 INTERNAL SERVER ERROR
    • ObjectNotFoundFault: 404 NOT FOUND
    • AuthFault: 401 UNAUTHORIZED
    • UpdateConflictFault: 409 CONFLICT

#DELETE

The DELETE method of the TG-crud service does the following in the given order:

  • Delete metadata, original, aggregation, and baseline data from the XML database.
  • Unregister object from the TG-auth*.
  • Add a deleted relation to the RDF database.
  • Delete data and metadata FROM THE GRID.

Parameters

Name Status Type
sessionId mandatory String
logParameter optional or empty String
uri mandatory URI (use of revision URI is mandatory here!)

RESTful access

#GETURI

Given a valid RBAC session ID, the GETURI method generates the requested number of TextGrid URIs, e.g. to prepare and then import a batch of files via the Import Tool External (koLibRI) or the copy workflow of the TG-publish service.

Parameters

Name Status Type
sessionId mandatory String
logParameter optional or empty String
howMany mandatory Integer

RESTful access

  • HTTP GET https://textgridlab.org/1.0/tgcrud/rest/getUri
  • Response 200 OK, TextGrid URI list separated by newline (text/plain)
  • Errors:
    • ObjectNotFoundFault: 404 NOT FOUND
    • IoFault: 500 INTERNAL SERVER ERROR
    • AuthFault: 401 UNAUTHORIZED
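
A GETURI call and the handling of its newline-separated response could look like the following sketch. It assumes that sessionId and howMany are passed as query parameters, which the text above does not confirm, and the class name is hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical sketch for GETURI; the query parameter names are assumptions. */
final class TgcrudGetUriSketch {
    static final String GETURI_URL = "https://textgridlab.org/1.0/tgcrud/rest/getUri";

    /** Builds (but does not send) the GET request for the given number of URIs. */
    static HttpRequest getUriRequest(String sessionId, int howMany) {
        if (howMany < 1) {
            throw new IllegalArgumentException("howMany must be positive");
        }
        String query = "?sessionId=" + URLEncoder.encode(sessionId, StandardCharsets.UTF_8)
                + "&howMany=" + howMany;
        return HttpRequest.newBuilder(URI.create(GETURI_URL + query)).GET().build();
    }

    /** Splits the text/plain response body into one TextGrid URI per non-empty line. */
    static List<String> parseUriList(String body) {
        return body.lines()
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .collect(Collectors.toList());
    }
}
```

The returned URIs can then be passed as the uri parameter of CREATE, for example when references between imported objects have to be rewritten before ingest.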

#LOCK

The implementation uses the NOID (see TG-noid) to lock TextGrid objects (in practice, their URIs) depending on the user ID (write access is needed) and an “automagic unlocking time”, which is currently set to 30 minutes. If a URI is not yet locked, any user with write access can lock it. Only the user who locked the object can (a) update the object and its metadata (save object and save metadata) and (b) re-lock it to keep it locked, which should be done before the 30 minutes are over. Once the unlocking time has elapsed, any other user with write access can lock the object again. Updating an object does not re-lock it; the existing lock simply stays alive.

Published objects cannot be locked or unlocked.

Parameters

Name Status Type
sessionId mandatory String
logParameter optional or empty String
uri mandatory URI (use of revision URI is mandatory here!)

Returns a boolean value that states if the locking succeeded or not: FALSE in case of an error only, TRUE if locking succeeded, and it throws an IoFault in case another user already holds a lock. The user ID of that specific user is included as exception message then.

RESTful access

#UNLOCK

The implementation uses the NOID (see TG-noid) to unlock TextGrid objects. Unlocking is permitted for the user who locked the object and, once the automagic unlocking time has elapsed, for any user who has write access.

Parameters

Name Status Type
sessionId mandatory String
logParameter optional or empty String
uri mandatory URI (use of revision URI is mandatory here!)

Returns a boolean value that states if the unlocking succeeded or not: FALSE in case of an error only, TRUE if unlocking succeeded, and it throws an IoFault in case another user already holds a lock. The user ID of that user is included as exception message then.
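
The locking rules described for LOCK and UNLOCK can be modelled in memory as follows. This is only an illustrative sketch and not the TG-crud implementation: the class LockModelSketch is hypothetical, write access checks and the rule for published objects are left out, and an IllegalStateException carrying the lock holder's user ID stands in for the IoFault.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of the locking rules; not the TG-crud implementation. */
final class LockModelSketch {
    /** The unlocking time after which other users may take over a lock. */
    static final Duration UNLOCK_AFTER = Duration.ofMinutes(30);

    private static final class Lock {
        final String userId;
        final Instant lockedAt;

        Lock(String userId, Instant lockedAt) {
            this.userId = userId;
            this.lockedAt = lockedAt;
        }
    }

    private final Map<String, Lock> locks = new HashMap<>();

    /** True if a different user holds a lock on the URI that has not yet expired. */
    private boolean heldByOther(String uri, String userId, Instant now) {
        Lock lock = locks.get(uri);
        return lock != null
                && !lock.userId.equals(userId)
                && now.isBefore(lock.lockedAt.plus(UNLOCK_AFTER));
    }

    /** Locks or re-locks the URI; throws with the holder's user ID if another user holds the lock. */
    boolean lock(String uri, String userId, Instant now) {
        if (heldByOther(uri, userId, now)) {
            throw new IllegalStateException(locks.get(uri).userId);
        }
        locks.put(uri, new Lock(userId, now));
        return true;
    }

    /** Removes the lock; allowed for the holder, or for anyone once the lock has expired. */
    boolean unlock(String uri, String userId, Instant now) {
        if (heldByOther(uri, userId, now)) {
            throw new IllegalStateException(locks.get(uri).userId);
        }
        locks.remove(uri);
        return true;
    }
}
```

Passing the current time as a parameter keeps the model deterministic, so the 30 minute expiry can be exercised without waiting.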

RESTful access

#MOVEPUBLIC

Moves data from the non-public repository storage location to the public repository storage location. Needs special authentication and is only used from within TG-publish.

No RESTful access is provided.

TextGrid Fault Codes

No Description
1 AuthFault
2 IoFault
3 ObjectNotFoundFault
4 MetadataParseFault
5 RelationsExistFault
6 UpdateConflictFault
7 ProtocolNotImplementedFault
8 NoConfigurationFault
9 InternalServiceError
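
For client code it can be handy to have the fault numbers and the HTTP status codes listed in the RESTful access sections in one place. The enum below is a hypothetical sketch and not part of the TG-crud client. It records an HTTP status only for the faults for which this page lists one and leaves the others empty.

```java
import java.util.Optional;

/** Hypothetical lookup of TextGrid fault numbers and the HTTP status codes listed on this page. */
enum TgFaultSketch {
    AUTH_FAULT(1, "AuthFault", 401),
    IO_FAULT(2, "IoFault", 500),
    OBJECT_NOT_FOUND_FAULT(3, "ObjectNotFoundFault", 404),
    METADATA_PARSE_FAULT(4, "MetadataParseFault", 400),
    RELATIONS_EXIST_FAULT(5, "RelationsExistFault", null),
    UPDATE_CONFLICT_FAULT(6, "UpdateConflictFault", 409),
    PROTOCOL_NOT_IMPLEMENTED_FAULT(7, "ProtocolNotImplementedFault", 400),
    NO_CONFIGURATION_FAULT(8, "NoConfigurationFault", null),
    INTERNAL_SERVICE_ERROR(9, "InternalServiceError", null);

    final int number;
    final String faultName;
    private final Integer httpStatus; // null where this page lists no HTTP status

    TgFaultSketch(int number, String faultName, Integer httpStatus) {
        this.number = number;
        this.faultName = faultName;
        this.httpStatus = httpStatus;
    }

    /** The HTTP status documented for this fault, if any. */
    Optional<Integer> httpStatus() {
        return Optional.ofNullable(httpStatus);
    }

    /** Looks up a fault by its TextGrid fault number. */
    static Optional<TgFaultSketch> byNumber(int number) {
        for (TgFaultSketch fault : values()) {
            if (fault.number == number) {
                return Optional.of(fault);
            }
        }
        return Optional.empty();
    }
}
```

Note that 400 BAD REQUEST is listed for two different faults, so an HTTP status alone does not always identify the fault.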

The TG-crud client

A simple way to work with the TG-crud service is to add the tgcrud-client Maven module as a dependency to your own Maven module. Please use the most recent productive version:

<dependency>
        <groupId>info.textgrid.middleware</groupId>
        <artifactId>tgcrud-client</artifactId>
        <version>[please see https://textgridlab.org/1.0/tgcrud/rest/version]</version>
</dependency>

You can then call the method TGCrudClientUtilities.getTgcrud() with a TG-crud service endpoint and simply use the methods of the returned client. For example:

// Create TG-crud service client, use MTOM.
TGCrudService tgcrud = TGCrudClientUtilities.getTgcrud("https://textgridlab.org/1.0/tgcrud/TGCrudService", true);

// Get TG-crud's version.
System.out.println("TG-crud version is " + tgcrud.getVersion());

// Read the metadata of a TextGrid object from the Digitale Bibliothek (No sessionId is needed because the object is public).
MetadataContainerType metadata = tgcrud.readMetadata("", "", "textgrid:vqmw");

// Print out the JaxB metadata object.
JAXB.marshal(metadata, System.out);

Online JUnit Tests

The productive TextGrid JUnit tests tag can be found here:

Sources

See tgcrud_sources

Bugtracking

See tgcrud_bugtracking

Licence

See LICENCE