TQE Tools

This page contains documentation for each tool used in the TQE Tutorial. These tools are provided as free and open-source resources for the MQM community, and may be used as templates for more sophisticated tools.

The tools are divided among the three stages of a TQE:

  1. Preliminary Stage: STS Generation, Metric Generation, Bitext Generation
  2. Error Annotation Stage: TRG Annotation Tool, JSON-TEI Converter, TEI Annotation Reconstructor
  3. Automatic Calculation & Follow-Up Stage: TQE Calculator and downloadable spreadsheet

1. Preliminary stage

1.1 STS Generation

The structured translation specifications (STS) file is an XML file that represents the specifications for the original translation job through a series of <parameter> elements, organized in <section> elements. Each <parameter> has various <subparameter> children, each of which has one or more <value> elements detailing the specifications. The parameters reflect those outlined in ASTM F2575.
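To make the nesting concrete, here is a hypothetical sketch of the structure just described. The element names (<section>, <parameter>, <subparameter>, <value>) come from the description above, but the root element name, the name attributes, and the example values are illustrative assumptions, not taken from ASTM F2575 or an actual STS file:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Illustrative sketch only: section/parameter names are assumptions. -->
<sts>
  <section name="Source content information">
    <parameter name="textType">
      <subparameter name="genre">
        <value>technical documentation</value>
      </subparameter>
    </parameter>
  </section>
</sts>
```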

1.2 Metric Generation

A discussion of the parts of an MQM metric may be found here.

This section presents two methods for producing a file such as concrete-example-metric.xml: by hand, or with a tool.

The basic workflow for both methods is the same:

  • You will need a copy of the MQM Error Typology. This can be the full or core version, in XML or spreadsheet form.
  • You will select a subset of the Typology to be the errors that are relevant to the specifications of your TQE.
  • You will add the other information relevant to the scoring model.
[Here is the typology file.] Selecting a subset of errors and adding scoring model information can be done by hand or with a tool. Both approaches are detailed below.

By Hand

This method can be time consuming, especially for those unfamiliar with XML, but can be performed in any text editor.

A metric XML file always has an <issues> element at its root. It carries attributes with data relevant to the scoring model, including:

  • Severity level penalty point multipliers for neutral, minor, major, and critical errors. Here, we use the defaults: 0, 1, 5, and 25, respectively.
  • The maximum overall quality score that the sample may receive. Here, the score is out of 100.
  • The minimum overall quality score that a sample must receive in order to receive a passing quality rating. Here, a translation must receive a score of at least 80.
  • Whether or not the scoring model will be calibrated (discussed elsewhere). Here, we use an uncalibrated scoring model.
  • A reference word count, to be used in normalization. The actual number is not important, but it must be consistent across the scoring process, and so is provided as part of the metric.

With all of this information filled in, the metric in our scenario begins with these lines:

<?xml version="1.0" encoding="utf-8"?>
<issues neutralSeverityMult="0" minorSeverityMult="1" majorSeverityMult="5" criticalSeverityMult="25" maximumScoreValue="100" cutscore="80" scoringModel="not-calibrated" referenceWordCount="1000">

From there, the <issues> element contains <issue> elements, which are nested inside of each other to reflect the dimensional hierarchy of the MQM Error Typology. Each <issue> has the following attributes:

  • type. This is the identifier for the error type, and should match the id attribute of the corresponding <errorType> element in the Typology.
  • level. This is the level at which the error is nested. It should match the level attribute of the corresponding <errorType> element in the Typology.

With the <issue> elements added, the full metric XML for our scenario ends up being the following:

<?xml version="1.0" encoding="utf-8"?>
<issues neutralSeverityMult="0" minorSeverityMult="1" majorSeverityMult="5" criticalSeverityMult="25" 
cutscore="80" scoringModel="not-calibrated" maximumScoreValue="100" referenceWordCount="1000">
  <issue type="Accuracy" level="0" display="yes">
    <issue type="Mistranslation" level="1" display="yes"/>
    <issue type="Addition" level="1" display="yes" />
    <issue type="Omission" level="1" display="yes" />
  </issue>
  <issue type="Linguistic conventions" level="0" display="yes">
    <issue type="Grammar" level="1" display="yes"/>
    <issue type="Punctuation" level="1" display="yes" />
    <issue type="Whitespace" level="1" display="yes" />
    <issue type="Spelling" level="1" display="yes"/>
  </issue>
  <issue type="Style" level="0" display="yes">
    <issue type="Unidiomatic style" level="1" display="yes" />
    <issue type="Awkward style" level="1" display="yes" />
    <issue type="Organizational style" level="1" display="yes" />
  </issue>
</issues>
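Downstream tools need to read both the scoring-model attributes and the error-type hierarchy out of this file. As a minimal sketch (not the actual implementation of any tool on this page), the metric above can be parsed with Python's standard library, pulling the severity multipliers from the root attributes and the (type, level) pairs from the nested <issue> elements:

```python
import xml.etree.ElementTree as ET

# A trimmed-down version of the metric XML shown above.
METRIC_XML = b"""<?xml version="1.0" encoding="utf-8"?>
<issues neutralSeverityMult="0" minorSeverityMult="1" majorSeverityMult="5"
        criticalSeverityMult="25" maximumScoreValue="100" cutscore="80"
        scoringModel="not-calibrated" referenceWordCount="1000">
  <issue type="Accuracy" level="0" display="yes">
    <issue type="Mistranslation" level="1" display="yes"/>
  </issue>
</issues>"""

root = ET.fromstring(METRIC_XML)

# Scoring-model data lives in attributes on the <issues> root.
multipliers = {
    "minor": int(root.get("minorSeverityMult")),
    "major": int(root.get("majorSeverityMult")),
    "critical": int(root.get("criticalSeverityMult")),
}

# Each <issue> carries its typology id ("type") and nesting depth ("level").
error_types = [(i.get("type"), int(i.get("level"))) for i in root.iter("issue")]
```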

Generating a Metric Automatically

This tool can be used to select error types and build the XML file for a metric:

Metric Generator

1.3 Bitext Generation

The tools used later in this workflow require the source and target samples under evaluation to be combined into a single file in a specific format: a plain text file in which each line holds one translation unit, with the source-text version and the target-text version of that unit separated by a tab character. The bitext file therefore contains exactly as many lines as the sample has translation units. To produce it:

  • Make 2 text files, one for each language. Each translation unit should be on its own line, and the line numbers should correspond between the same translation unit across the source and target text files. This process is segmentation.
    • NB: Use a find-and-replace tool to ensure that there are no \r characters anywhere in these files. Newlines should be represented by \n only, or this method may fail.
  • Do the Unix command:
    • $ paste [source_text].txt [target_text].txt > bitext.txt
  • “bitext.txt” is the name of the output bitext file. The paste command concatenates the two files line by line, using a tab character as the delimiter by default; if your version does not, use the “-d” option to set the delimiter to a tab explicitly.
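The steps above can also be done without the Unix paste command. The following sketch is an equivalent in Python (file names are illustrative); using splitlines() also sidesteps the \r caveat noted above, since it discards \n and \r\n line endings alike:

```python
def make_bitext(source_path, target_path, out_path):
    """Join two segmented text files into a tab-delimited bitext file."""
    with open(source_path, encoding="utf-8") as src, \
         open(target_path, encoding="utf-8") as tgt:
        # splitlines() strips \n and \r\n alike, so stray \r characters
        # at line ends cannot leak into the bitext.
        src_lines = src.read().splitlines()
        tgt_lines = tgt.read().splitlines()
    if len(src_lines) != len(tgt_lines):
        raise ValueError("source and target must have the same number of segments")
    with open(out_path, "w", encoding="utf-8", newline="\n") as out:
        for s, t in zip(src_lines, tgt_lines):
            out.write(f"{s}\t{t}\n")
```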

2. Error Annotation Stage

The evaluator, a trained translator, uses an annotation tool to identify errors and produce metadata. They then use data converters (here, a JSON-TEI converter and a TEI inspector for manual validation) for the data handoff into stage 3.

2.1 TRG Annotation Tool

This tool requires the metric, STS, and bitext files to be uploaded in order to create a new project. Before creating any projects, the user must also upload a typology.xml file. The typology only needs to be uploaded once, and provides data on all the possible errors (called <issue>s in this file) that may be included in any metric file in any project.

[info on how to deploy an instance here]

https://mqm-scorecard-dev-582b03cd5905.herokuapp.com

2.2 JSON-TEI Converter

The TRG Annotation Tool exports JSON files, which can then be converted into a TEI file format for ease of data interchange.
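The exact JSON schema the TRG Annotation Tool exports is not documented on this page, so the following is only a hypothetical sketch of what such a conversion involves: it assumes a flat list of {"segment", "errorType", "severity"} records (an invented schema) and emits a minimal TEI-namespaced document, with annotation metadata carried as attributes on <seg> elements (also an illustrative choice, not the converter's actual output):

```python
import json
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

def json_to_tei(json_str):
    """Hypothetical sketch: assumes an invented export schema of
    {"segment", "errorType", "severity"} records."""
    records = json.loads(json_str)
    ET.register_namespace("", TEI_NS)
    tei = ET.Element(f"{{{TEI_NS}}}TEI")
    body = ET.SubElement(
        ET.SubElement(tei, f"{{{TEI_NS}}}text"), f"{{{TEI_NS}}}body")
    for rec in records:
        seg = ET.SubElement(body, f"{{{TEI_NS}}}seg")
        seg.text = rec["segment"]
        # Annotation metadata as attributes (illustrative only).
        seg.set("ana", rec["errorType"])
        seg.set("type", rec["severity"])
    return ET.tostring(tei, encoding="unicode")
```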

https://linport.net/mqm-tools/json-to-tei

2.3 TEI Annotation Reconstructor

To visually inspect the TEI conversion of the JSON export for data integrity, the following tool converts the TEI into an HTML format similar to the TRG Annotation Tool’s annotation interface: https://linport.net/mqm-tools/mqm-reconstructor.

3. Automatic Calculation & Follow-Up Stage

The TQE Calculator, or alternatively a spreadsheet, is used by a technician (or by the evaluator using tools developed by the translator, or by the translator themselves if they are filling both roles) to produce an overall quality score and pass/fail decision based on the data from stage 2.
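The exact formula the TQE Calculator applies is not given on this page, but one plausible reading of the scoring-model attributes from section 1.2 (severity multipliers, maximum score, cutscore, and reference word count normalization) can be sketched as follows; the function name and the normalization step are assumptions for illustration:

```python
def quality_score(error_counts, sample_word_count,
                  max_score=100, cutscore=80, reference_word_count=1000):
    """Sketch of a score and pass/fail calculation, assuming penalty
    points are normalized to the reference word count. Defaults mirror
    the example metric in section 1.2."""
    mults = {"neutral": 0, "minor": 1, "major": 5, "critical": 25}
    penalty = sum(mults[sev] * n for sev, n in error_counts.items())
    # Scale the penalty to the reference word count so that samples of
    # different sizes are comparable.
    normalized = penalty * reference_word_count / sample_word_count
    score = max_score - normalized
    return score, score >= cutscore
```

For example, 2 minor errors and 1 major error in a 1000-word sample would give a penalty of 2*1 + 1*5 = 7 and a score of 93, a pass against the cutscore of 80.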

Recommendations are then synthesized from the quality rating in parallel with the error annotation metadata. The evaluator is the one most qualified for this interpretation stage, as the analytic score will make reference to the error annotation data, which they themselves produced.

3.1 TQE Calculator

https://linport.net/mqm-tools

3.2 Downloadable Spreadsheet

Link to excel export