Create transcript formats

If you want to add plain text transcripts (.txt files) to a case, you must tell the application how to divide each transcript file into multiple transcript pages. To do this, you create a transcript format. A transcript format indicates the location of page breaks in a transcript file.

The pages in a transcript typically contain a similar pattern of text where each page break occurs. The pattern can occur in the header or footer of each page. For example, each page in a transcript might include a page footer with the name of the transcription company. A transcript format describes this page break pattern.

When you add a plain text transcript to a case, you associate the transcript file with a transcript format. This allows the application to process and display the transcript properly. For information about how to add transcripts to a case, see Add transcripts.

Note: Transcript files in portable transcript format (PTF) or portable case format (PCF) include pagination information as part of the file. Because of this, you do not need to create transcript formats for PTF or PCF files.

Before you start, examine the transcript and identify patterns in how page breaks are formatted. Look for the following information:

Are page breaks marked by headers or by footers?

What consistent pattern of words or characters appears on every page where the page break occurs?

Does the transcript include multiple sections that contain different page break patterns?
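If a transcript is long, a short script can help you answer these questions before you create a format. The following Python sketch is only an illustration and is not part of the application: the function name candidate_break_lines and the sample text are hypothetical. It replaces each run of digits with # so that numbered headers collapse into one shape, and then counts which line shapes repeat most often.

```python
import re
from collections import Counter

def candidate_break_lines(text, top=3):
    """Return the most frequently repeated line shapes in a transcript."""
    shapes = Counter()
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines say nothing about page breaks
        # Replace each run of digits with '#' so that page numbers
        # such as 001 and 002 collapse into the same shape.
        shapes[re.sub(r"\d+", "#", stripped)] += 1
    return shapes.most_common(top)

# Hypothetical two-page transcript used only for this illustration.
sample = (
    "      001\n"
    "Q. Where were you?\n"
    "A. At home.\n"
    "   ABC Transcription Company, Inc. 800-555-0123\n"
    "      002\n"
    "Q. Who was there?\n"
    "A. Nobody.\n"
    "   ABC Transcription Company, Inc. 800-555-0123\n"
)

for shape, count in candidate_break_lines(sample, top=2):
    print(count, shape)
```

Line shapes that occur about once per page are good candidates for a header or footer pattern.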

Administrators and group leaders can create transcript formats.

To create transcript formats:

On the Case Home page, on the Case Home menu, select Transcripts.

In the navigation pane, click Formats.

Tip: You can base a new transcript format on an existing transcript format if you create the new format while you are adding a transcript to the application.

Click Add.

In the Name box, type a unique name for the transcript format.

In the Page break area, specify the location of page breaks, as follows:

Below: Page breaks are indicated by footers.

Above: Page breaks are indicated by headers.

Tip: If the location of the page breaks is indicated both by headers and by footers, choose the location that has an easier page break pattern to describe.

In the Pattern box, type the pattern of words or characters that identifies where page breaks occur, using regular expression syntax. For more information about regular expressions, see Describe page break patterns using regular expressions.

Note: If a page break pattern is longer than one line, use a regular expression to describe the pattern on the first line, and then specify the total number of lines.

In the Number of lines list, select the number of lines that are used by the headers or footers that indicate page breaks.

To remove blank lines from the transcripts, select the Strip blank lines check box.

If the transcript includes line numbers, select the Embedded line numbers check box. If you clear this check box, the application automatically generates line numbers for the transcript.

Click Save.
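To make the settings above more concrete, the following Python sketch models how a page break location, a pattern, a number of lines, and the Strip blank lines option could work together to split a text file into pages. This is an assumption-based illustration, not the application's implementation: the function split_pages, its parameters, and the sample text are hypothetical, and the application's actual behavior may differ.

```python
import re

def split_pages(text, pattern, location="below", num_lines=1, strip_blank=False):
    """Split transcript text into pages (hypothetical model of a transcript format).

    location="below": the pattern marks a footer, so a page ends after it.
    location="above": the pattern marks a header, so a page starts at it.
    num_lines: how many lines the header or footer occupies, counted
               from the line that matches the pattern.
    """
    regex = re.compile(pattern)
    lines = text.splitlines()
    if strip_blank:
        lines = [ln for ln in lines if ln.strip()]

    pages, current, i = [], [], 0
    while i < len(lines):
        if regex.search(lines[i]):
            block = lines[i:i + num_lines]
            if location == "above":
                # A header opens a new page: close the previous one first.
                if current:
                    pages.append(current)
                current = list(block)
            else:
                # A footer closes the current page.
                current.extend(block)
                pages.append(current)
                current = []
            i += len(block)
        else:
            current.append(lines[i])
            i += 1
    if current:
        pages.append(current)
    return pages

# Hypothetical two-page transcript used only for this illustration.
sample = "\n".join([
    "      001",
    "Q. Where were you?",
    "",
    "   ABC Transcription Company, Inc. 800-555-0123",
    "      002",
    "A. At home.",
    "   ABC Transcription Company, Inc. 800-555-0123",
])

by_footer = split_pages(sample, r"^\s+ABC Transcription", location="below")
by_header = split_pages(sample, r"^\s+\d\d\d", location="above", strip_blank=True)
print(len(by_footer), len(by_header))
```

In this model, either location produces the same two pages for the sample, which is why you can choose whichever pattern is easier to describe.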

Describe page break patterns using regular expressions

When you create a transcript format, you use a syntax called regular expressions to describe where the page breaks in a transcript occur.

For example, consider the transcript that is shown in the following figure. The transcript contains page numbers in the header, and the transcription company's name in the footer. For this example, assume that every page in the transcript contains the same pattern of text in the header and the footer.

Sample transcript with header and footer patterns

In this example, you can describe the page break pattern in the following ways:

Header page break pattern: The example header contains spaces followed by a three-digit page number.

To describe the page break pattern, use the following regular expression: ^\s+\d\d\d

The regular expression describes each element of the pattern, as follows:

^ indicates that the pattern starts at the beginning of a line.

\s+ indicates one or more whitespace characters, such as spaces or tabs, on the line.

\d\d\d indicates three consecutive digits on the line.
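You can try the header expression on a few sample lines before you save a format. The following sketch uses Python's re module as a stand-in; this is an assumption, because the application's regular expression engine may differ in details, and the sample lines are hypothetical.

```python
import re

header = re.compile(r"^\s+\d\d\d")

lines = [
    "                    012",            # leading spaces, three digits
    "012",                                # no leading whitespace, so \s+ fails
    "      12",                           # only two digits
    "  1  Q. Please state your name.",    # one digit, then text
    "    1234 Main Street",               # begins with spaces and three digits
]
matches = [bool(header.search(line)) for line in lines]
print(matches)  # [True, False, False, False, True]
```

The last line matches because the expression describes only the beginning of the line. If ordinary transcript lines can begin this way, describe more of the header, for example by ending the expression with $.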

Footer page break pattern: The example footer contains spaces followed by the text ABC Transcription Company, Inc. 800-555-0123.

To describe the page break pattern, use the following regular expression: ^\s+ABC Transcription Company, Inc. 800-555-0123

The regular expression describes each element of the pattern, as follows:

^ indicates that the pattern starts at the beginning of a line.

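\s+ indicates one or more whitespace characters, such as spaces or tabs, on the line.

ABC Transcription Company, Inc. 800-555-0123 indicates the text that is repeated on every page. Because an unescaped . (period) represents any single character, the period after Inc matches the literal period in the footer as well as any other character in that position. To match only a literal period, type \. instead.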
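The following sketch tests the footer expression, again using Python's re module as an assumed stand-in for the application's engine. It also compares the expression with a stricter, hypothetical variant in which the period is escaped as \. so that it matches only a literal period. The lookalike line is invented for the comparison.

```python
import re

footer = re.compile(r"^\s+ABC Transcription Company, Inc. 800-555-0123")
strict = re.compile(r"^\s+ABC Transcription Company, Inc\. 800-555-0123")

page_footer = "        ABC Transcription Company, Inc. 800-555-0123"
lookalike = "        ABC Transcription Company, Incx 800-555-0123"

results = {
    "footer/page_footer": bool(footer.search(page_footer)),
    "footer/lookalike": bool(footer.search(lookalike)),
    "strict/page_footer": bool(strict.search(page_footer)),
    "strict/lookalike": bool(strict.search(lookalike)),
}
print(results)
```

Both expressions find the real footer. Only the unescaped version also accepts the lookalike line, which is rarely a problem in practice but is worth knowing when a pattern contains special characters.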

To determine the appropriate page break pattern to describe, consider the following information:

You can choose to describe a page break pattern that occurs in either the header or the footer of every page. If the location of page breaks is indicated both by headers and by footers, choose the location that has an easier pattern to describe.

You can describe a pattern that occurs at the beginning of a line or at the end of a line. Use ^ to indicate the beginning of a line and $ to indicate the end of a line.

You do not need to describe the entire line of a pattern. This can be useful if a transcript includes multiple sections that contain slightly different page break patterns, or if a pattern includes special characters.

If a page break pattern is longer than one line, use a regular expression to describe the pattern on the first line. You can specify the total number of lines when you create a transcript format.
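As an illustration of these points, the following sketch shows a partial pattern that covers two slightly different footers, and a pattern that is anchored to the end of a line. It uses Python's re module as an assumed stand-in for the application's engine. The second footer and the right-aligned header line are hypothetical.

```python
import re

# Describe only the part of the footer that every section shares.
partial = re.compile(r"^\s+ABC Transcription")

footers = [
    "     ABC Transcription Company, Inc. 800-555-0123",
    "     ABC Transcription Co. (Volume II) 800-555-0123",  # hypothetical variant
]
partial_matches = [bool(partial.search(f)) for f in footers]

# Describe a three-digit page number at the end of a line.
line_end = re.compile(r"\s\d\d\d$")
end_matches = [
    bool(line_end.search("DEPOSITION OF J. SMITH          014")),
    bool(line_end.search("014 DEPOSITION OF J. SMITH")),
]
print(partial_matches, end_matches)  # [True, True] [True, False]
```

A shorter pattern is easier to write but matches more lines, so confirm that it does not also match ordinary transcript text.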

The following table lists common character patterns and the corresponding regular expressions.

Regular expression syntax    Pattern of characters

^                            Indicates that the pattern occurs at the beginning of a line. Place the ^ (caret) at the beginning of a regular expression string, such as ^\d.
\s                           One whitespace character, such as a space or tab, in the pattern.
\s+                          One or more whitespace characters in the pattern.
\s*                          Zero or more whitespace characters in the pattern.
\d                           One digit in the pattern.
\d+                          One or more digits in the pattern.
\d*                          Zero or more digits in the pattern.
\w                           One alphanumeric character or underscore in the pattern.
\w+                          One or more alphanumeric characters or underscores in the pattern.
\w*                          Zero or more alphanumeric characters or underscores in the pattern.
\f                           Page break or form feed character. In a text editor, a form feed character may appear as a vertical bar (|).
$                            Indicates that the pattern occurs at the end of a line. Place the $ (dollar sign) at the end of a regular expression string, such as \d$.
. (period)                   Represents any single character except a new line character.
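The following sketch checks several entries from the table against short sample strings. It uses Python's re module as an assumed stand-in, so the application's engine may differ in details, and the sample strings are hypothetical.

```python
import re

# Each entry: (regular expression, sample text, whether a match is expected)
examples = [
    (r"^\d", "7 of 9", True),     # digit at the beginning of the line
    (r"^\d", "page 7", False),    # line begins with a letter
    (r"\s+", "a   b", True),      # run of spaces
    (r"\d*", "abc", True),        # zero digits still counts as a match
    (r"\w+", "___", True),        # underscores are matched by \w
    (r"\f", "\x0cPage", True),    # \x0c is the form feed character
    (r"\d$", "Page 7", True),     # digit at the end of the line
    (r"a.c", "a\nc", False),      # . does not match a new line character
]

results = [bool(re.search(pattern, text)) for pattern, text, _ in examples]
for (pattern, text, expected), found in zip(examples, results):
    print(f"{pattern!r:8} {text!r:12} expected={expected} found={found}")
```

Note that expressions ending in * can match an empty string, so \s*, \d*, or \w* on their own match every line. Combine them with other elements of the pattern.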