View and troubleshoot ingestions jobs
You can view ingestions job details and troubleshoot jobs that encounter errors.
View ingestions jobs
You can view the following information about ingestions jobs:
View properties of an ingestions job
You can view detailed properties of an ingestions job after it has processed. The properties appear for all job statuses, regardless of whether a job finished successfully.
To access information about an ingestions job, including file counts, filters, and levels:
On the Case Home page, under Manage Documents, click Ingestions.
Click the name of an ingestions job.
The following table describes the information that appears on an ingestions job's Properties page.
Section |
Description |
Description |
Information entered by the creator of the job to describe or identify it. |
Job ID |
Unique identification number for each job. You can use this number to track and report issues with a job in a case. |
Started |
Date and time that the ingestions job began. |
Duration |
The amount of time it took to process the ingestions job. |
Status |
The status of the ingestions job. Status messages include the following: In progress, Completed, Error, Completed with Warnings, and Completed with Exceptions. |
Processed Files |
Total expanded documents: Total number of files after expanding compressed archive files, such as .zip files.
Suppressed documents: Number and percentage of total documents that the application excluded after processing.
Duplicates: Number and percentage of total documents that were excluded through deduplication. Click the Master Document link to run a search for the master documents of all documents that were suppressed by deduplication. The search results appear on the Documents page.
Outside date range: Number and percentage of total documents with a Document Date that is outside of the specified range.
Excluded NIST files: Number and percentage of total files that the application excluded because the files appear on the NIST list. Note: For information about hash codes, see the National Software Reference Library (NSRL), maintained by the National Institute of Standards and Technology (NIST).
Excluded by category or extension: Number and percentage of total files that the application excluded because the file types were selected for exclusion in Default Settings.
Does not match search term family: Number and percentage of total files that the application excluded because the files do not match any terms in the selected search term family. |
Unsuppressed documents |
Number and percentage of total documents available in the application after processing. Click the number to view the search results for all documents in the job. The search results are equivalent to what you would expect to see if you searched using the Evidence Job ID field. |
Suppressed Document Ingestion Exceptions and Unsuppressed Document Ingestion Exceptions |
Breakout of exceptions to the suppressed document settings that were encountered in the [Meta] Processing Exceptions field for suppressed and unsuppressed documents. Click the number under the Unsuppressed Document Ingestion Exceptions heading to view the search results for exceptions in the job. The search results are equivalent to what you would expect to see if you searched using the Evidence Job ID field AND [Meta] Processing Exceptions / has a value. Separate links to search results for the Evidence Job ID and the specific [Meta] Processing Exceptions list items appear underneath this heading. |
Folders |
Folder structure: The Folder depth settings selected in the Default Settings. Source files: Indicates the folder structure of the source data that was ingested. |
Filters |
Duplication: Indicates whether the documents are deduplicated at the case level, the custodian level, or not at all.
Deduplication by top parent: If this option is selected, the application compares only the hash values of top parent documents. Attachments are ignored. This means families with different numbers of attachments may be identified as duplicates.
Date range: The earliest start date and latest end date in the Document Date fields of the documents.
Exclude NIST files: Indicates whether NIST files are excluded after processing.
Exclude by category or extension: Indicates whether files with selected file types are excluded after processing. |
Keywords |
The search term family the application uses to filter data. |
Levels |
User-defined levels in the application under which processed data is organized. |
Document ID |
Document prefix: The user-defined prefix for each document ID. |
Ingestion Details |
For more information about these settings, see Configure default ingestion settings.
Time zone: The time zone setting for the job.
Suppressed documents: Indicates whether native files of suppressed documents are retained.
Indexing: Status messages include Indexing only or Indexing and Enrichment.
Personal information: Indicates whether credit card or personal identification numbers were identified and tagged as part of the ingestions job.
Language: Indicates whether the primary language of the documents was identified as part of the ingestions job.
Duplicate coding: If this option is selected, documents in the ingestion take the All Custodians field values from other duplicates in the case and pass those All Custodians values to duplicates. If this option is not selected, documents in the ingestion retain their own Custodian value in the All Custodians field. |
Source encoding |
The source encoding value selected in the Advanced settings window. |
Password bank |
If the administrator did not select a password bank for ingestions in the Advanced settings window, the value displayed in this row is No. If the administrator selected a password bank, the value is Yes. |
Chat Settings |
The following information is available in this section:
Idle time: Threads are broken into separate documents if the difference in sent times between two messages is equal to or greater than this number.
Minimum messages: Threads containing fewer messages than this number are not broken out into separate documents.
Maximum messages: Threads containing more messages than this number are broken out into separate documents. |
Email Files |
The type of file the user selected to be available in the viewer for imported email files. |
To download a report of the Properties page, click Download report.
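The Chat Settings values described in the table above can be read as a set of thread-splitting rules. The following Python sketch shows one possible interpretation of those rules. It is not the application's implementation: the function name `split_thread`, the use of `timedelta` for the idle time (the table does not state a unit), and the choice to cut oversized segments into chunks of at most `max_messages` are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import List


def split_thread(sent_times: List[datetime], idle_time: timedelta,
                 min_messages: int, max_messages: int) -> List[List[datetime]]:
    """Split one chat thread into documents (hypothetical helper, illustration only)."""
    times = sorted(sent_times)
    if not times:
        return []
    # Minimum messages: a thread with fewer messages than this is not broken out.
    if len(times) < min_messages:
        return [times]
    # Idle time: start a new document when the gap between two consecutive
    # messages is equal to or greater than the idle time.
    segments = [[times[0]]]
    for previous, current in zip(times, times[1:]):
        if current - previous >= idle_time:
            segments.append([current])
        else:
            segments[-1].append(current)
    # Maximum messages: assumed here to mean that a segment with more messages
    # than this is cut into chunks of at most max_messages.
    documents = []
    for segment in segments:
        for start in range(0, len(segment), max_messages):
            documents.append(segment[start:start + max_messages])
    return documents
```

For example, five messages sent at minutes 0, 1, 2, 120, and 121 with a one-hour idle time would yield two documents under this reading, one with three messages and one with two.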
View ingestions job progress
To view the details of the ingestions job's progress:
On the Case Home page, under Manage Documents, click Ingestions.
Click the name of a job.
In the navigation pane, click Progress.
The Progress Summary page appears. A progress bar and details for each submission of an ingestions job are included on this page, with the most recent job at the top of the page.
The following table describes the information that appears on an ingestions job's Progress Summary page.
Column |
Description |
RPF Job ID |
Unique identification number of each job in the Processing Framework (RPF). |
Start |
Date and time that the ingestions job began. |
Progress bar |
If the progress bar is green, the steps performed so far have completed successfully. Red indicates that a step failed. The entire bar is yellow when a job completes with warnings. Percentages appear underneath the progress bar to indicate the progress of the current job. |
Status |
The status of the ingestions job. Status messages include the following: In progress, Completed, Error, Completed with Exceptions, and Completed with Warnings. |
For more information about the job, click View Progress Details.
The following table describes the information that appears on an ingestions job's Progress Details page.
Column |
Description |
Name |
This column contains the following task status names.
Processing: Creates tables, begins processing, standardizes data.
File Inventory: Identifies and catalogs files.
Pre Processing: Verifies Nuix engine.
Creating RDX Guid Tables: Prepares data tables.
Batching: Breaks data into batches for processing.
Processing: Expands files, gathers metadata, exports data and files.
Standardizing Data: Assigns document IDs, levels, and export_extra types.
Import Files and Metadata: Creates case database entries and copies files to the agent server.
Load Data: Loads the data into the case database.
Cleanup: Deletes the data staging tables.
Hashes: Runs a Hashes job.
Update Field Counts: Updates document counts for fields populated during the job.
Group Coding: Runs the All Custodians stage.
All Custodians: Populates All Custodians values.
Data Filtering: Filters data by date range, removes files on the NIST list, and deduplicates files.
Filter by Date Range: Filters data by the user-defined date range.
Filter by excluded Files: Suppresses files of the types selected in the settings.
Filter by NIST: Removes files on the NIST list, if indicated by the user.
Filter by De-duplication: Removes duplicate files, if indicated by the user.
Transfer Unsuppressed Files: Copies the files that were not filtered out of processing.
File Copy Batching: Divides the files into batches for transfer.
File Transfer: Copies the files from the ingest folder to the images folder.
File Transfer Confirmation: Verifies that files have been successfully copied.
File Copy Rebatching: If any files failed to copy, divides remaining files into batches for transfer.
Retry File Transfer: Copies the remaining files from the ingest folder to the images folder.
Finalize File Copy: Verifies that files have been successfully copied.
Indexing: Creates and updates indexes.
Search Term Family: If a search term family was selected by the user, this column includes files that correspond to the selected search term family.
Transfer Suppressed Files: Copies the files that were filtered out of processing.
File Copy Batching: Divides the files into batches for transfer.
File Transfer: Copies the files from the ingest folder to the suppressed folder.
File Transfer Confirmation: Verifies that files have been successfully copied.
File Copy Rebatching: If any files failed to copy, divides remaining files into batches for transfer.
Retry File Transfer: Copies the remaining files from the ingest folder to the suppressed folder.
Finalize File Copy: Verifies that files have been successfully copied.
Gathering Report Data: Generates data for job-specific reporting.
Finalize Job: Deletes temporary files and tables.
Cleanup: Removes temporary tables and sets final job status. |
Tasks Completed |
Number of subtasks completed out of the total number of subtasks. |
Duration |
Amount of time taken to complete each task. |
Start |
Date and time that the task began. |
Progress |
Task's percentage of completion. |
To view the Tasks page for each status, in the Name column, click the stage.
The following information appears on the Tasks page.
Column |
Description |
(Task status icon) |
Hover over the icon to view information about the status. |
Task ID |
The unique number for the task. |
Start |
The date and time that the task started processing. |
Duration |
How long it took for the task to complete. |
Supervisor |
The name of the supervisor executing the task. |
Status |
The status of the task. |
Progress |
The task's percentage of completion. |
To view a task's input, output, and error detail, click a Task ID.
The XML page appears, displaying the input, output, and error detail.
Note: A task's XML input is the XML set that provides instructions for a task to do its work. The task output is the XML output that the application creates if the task succeeds with warnings. You can view this information and error data if the task encounters an error during processing.
View ingestions reports
The following reports are available for ingestions jobs: File type by custodian and Files processed. The reports list details of each processed file in an ingestion, including a link to the documents in the case that were generated from each file.
To download a report about each ingestion:
On the Case Home page, under Manage Documents, click Ingestions.
Click a job name.
In the navigation pane, click Report.
The following basic job information appears at the top of the page.
Job ID: Unique identification number of each job.
Total exceptions: The total number of documents with exceptions.
Date range of documents: The minimum to maximum Document Date.
Total files: The number and size of files included in the ingestion.
Document ID range: The minimum to maximum Document ID.
Click the File type by custodian tab or the Files processed tab. To save the report as a spreadsheet (.csv file), click Download report.
The following table describes the information that appears in the File type by custodian report.
Column |
Description |
File type by custodians |
Lists each custodian and the types and number of files belonging to each custodian. |
Expanded |
Total number of files extracted from compressed archive files, such as .zip files. |
Duplicates |
Number of duplicate files. |
Suppressed |
Number of files that were excluded after processing. |
Unsuppressed |
Number of files available in the application after processing. |
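Once the File type by custodian report is downloaded as a .csv file, its columns can be totaled outside the application. The sketch below is a minimal Python example. It assumes that the downloaded file has a header row whose column names match the table above and that every count is a plain integer; the actual layout of the export may differ. The function name `suppression_summary` is hypothetical, and treating Suppressed plus Unsuppressed as the total is an assumption as well.

```python
import csv
import io


def suppression_summary(csv_text: str) -> dict:
    """Total the count columns of a File type by custodian export (assumed layout)."""
    totals = {"Expanded": 0, "Duplicates": 0, "Suppressed": 0, "Unsuppressed": 0}
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column in totals:
            # Blank cells are treated as zero.
            totals[column] += int(row[column] or 0)
    # Assumption: suppressed + unsuppressed files make up the whole job.
    processed = totals["Suppressed"] + totals["Unsuppressed"]
    totals["Suppressed %"] = (
        round(100 * totals["Suppressed"] / processed, 1) if processed else 0.0
    )
    return totals
```

To use it with a saved report, read the file contents into a string (for example with `open(path, newline="").read()`) and pass the string to the function.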
The following table describes the information that appears in the Files processed report.
Column |
Description |
File ID |
Number assigned to each file that is processed. This value is stored in a field called [RT] DPM File Id. |
Path |
Path to the processed file. |
Name |
Name of the processed file. |
Extension |
File extension of the processed file. |
Related Files |
Subsequent parts of a multi-part file, such as a .rar file. |
Size (bytes) |
Size of the processed file. |
Suppressed |
Number of files that were excluded during processing. |
Unsuppressed |
Number of files available in the application after processing. |
View link |
Opens the document on the Documents page. |
Troubleshoot ingestions jobs
You can perform the following troubleshooting tasks for ingestions jobs:
Unsuppress documents in an ingestions job
Troubleshoot unprocessed files
View a table with common [Meta] processing exceptions and possible resolutions
Unsuppress documents in an ingestions job
You can unsuppress all suppressed documents from a completed ingestions job.
Note: If no documents are suppressed in an ingestions job, the button is unavailable. If the Retain suppressed files option is not selected for the ingestions job, the files for these documents are not available in the application.
To unsuppress documents:
On the Case Home page, under Manage Documents, click Ingestions.
Click the name of a completed job.
Click the Unsuppress documents button.
A message appears with the number of documents that the application will unsuppress in the job.
Click OK.
Troubleshoot unprocessed files
You can resubmit unprocessed files for processing or export the entire list as a .csv file for further troubleshooting.
To view unprocessed files:
On the Case Home page, under Manage Documents, click Ingestions.
Click the name of a job with a status of Completed with Warnings.
In the navigation pane, click Unprocessed files.
The following information appears on the Unprocessed files page.
Column |
Description |
Batch ID |
Identification number of the batch containing the unprocessed file. |
File ID |
Identification number of the unprocessed file. |
File name |
File name of the unprocessed file. |
File size |
Size of the unprocessed file. |
File path |
File location of the unprocessed file. |
To view the XML output, click the number in the Batch ID column.
Resubmit unprocessed files
To resubmit unprocessed files:
To access the Unprocessed files page, do the following:
On the Case Home page, under Manage Documents, click Ingestions.
Click a job name.
In the navigation pane, click Unprocessed files.
On the Unprocessed files page, select the check box next to the Batch ID for the file or files to resubmit. To resubmit all files, skip this step.
Click Resubmit.
Select All files to resubmit all unprocessed files, or Selected files to resubmit the files that you selected.
Change the value in the Max files per batch box, if needed.
Click Save.
Resubmitted files appear on the Progress Details page under the original batch ID.
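The Max files per batch value caps how many files go into a single batch. As a rough illustration of what that cap means, the Python sketch below divides a list of file IDs into batches of at most that size. How the application actually forms batches is not described here, so the function name `make_batches` and the simple in-order chunking are assumptions.

```python
from typing import List, Sequence


def make_batches(file_ids: Sequence[int], max_files_per_batch: int) -> List[List[int]]:
    """Divide file IDs into batches of at most max_files_per_batch (illustration only)."""
    if max_files_per_batch < 1:
        raise ValueError("max_files_per_batch must be at least 1")
    return [
        list(file_ids[start:start + max_files_per_batch])
        for start in range(0, len(file_ids), max_files_per_batch)
    ]
```

With five files and a maximum of two files per batch, this produces three batches, the last of which holds a single file.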
Download the list of unprocessed files
To download the list of unprocessed files as a comma-separated values (.csv) file:
To access the Unprocessed files page, do the following:
On the Case Home page, under Manage Documents, click Ingestions.
Click a job name.
In the navigation pane, click Unprocessed files.
On the Unprocessed files page, select the check box next to the Batch ID for the file or files.
Click Download report, and then click OK.
Open or save the report.
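For further troubleshooting, the downloaded list can be grouped by batch to see which batches contain the most unprocessed files. The Python sketch below assumes that the .csv file has a header row with the column names shown on the Unprocessed files page (Batch ID and File path, among others); the real export may name its columns differently. The function name `files_by_batch` is hypothetical.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List


def files_by_batch(csv_text: str) -> Dict[str, List[str]]:
    """Group unprocessed file paths by batch ID (assumed column names)."""
    batches = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        batches[row["Batch ID"]].append(row["File path"])
    return dict(batches)
```

A batch with many unprocessed files that share a folder or file type may point to a common cause, such as a damaged archive.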
Resolve file copy errors
Ingestions automatically retries copying any files that failed to copy into the application during the initial ingestions job. If files still fail to copy after multiple attempts, the job's status is Completed with Warnings.
You can attempt to resolve any remaining copy errors by manually rerunning the file copy steps.
To manually rerun the file copy steps:
On the Case Home page, under Manage Documents, click Ingestions.
Click the name of a job with a status of Completed with Warnings. A message indicating that some files failed to copy also appears for the job.
Click Retry job.
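The retry behavior described in this section follows a common pattern: attempt the copy, collect the failures, and try only those again. The Python sketch below illustrates that pattern in general terms. It is not the application's code: the function names, the default of three attempts, and the injected `copy_file` callable are all assumptions chosen to keep the example self-contained.

```python
from typing import Callable, Iterable, List


def copy_with_retries(paths: Iterable[str], copy_file: Callable[[str], None],
                      max_attempts: int = 3) -> List[str]:
    """Try to copy every path, retrying only the failures; return paths still failing."""
    remaining = list(paths)
    for _ in range(max_attempts):
        failed = []
        for path in remaining:
            try:
                copy_file(path)
            except OSError:
                # Keep the file for the next attempt.
                failed.append(path)
        remaining = failed
        if not remaining:
            break
    return remaining


def job_status(still_failing: List[str]) -> str:
    """Map leftover copy failures to a job status, mirroring the text above."""
    return "Completed with Warnings" if still_failing else "Completed"
```

Any paths returned by `copy_with_retries` correspond to the files that a manual Retry job would attempt to copy again.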
Common [Meta] processing exceptions and possible resolutions
The following table provides common [Meta] processing exceptions, descriptions of the exceptions, and possible resolutions.
List item |
Description |
Possible resolution |
Corrupted |
The application is unable to open the file during ingestion. When opening the file, there is some type of failure or the application is otherwise unable to process the file. |
Obtain a new copy of the file, if possible, and reprocess the file. |
Data Type Conversion Failed |
This indicates an invalid date that was extracted for a date value in processing. If this is coded, additional detail is in the [RT] Ingestion Exception Detail field. |
No resolution. Refer to the [RT] Ingestion Exception Detail field. |
Databases |
Items where the Kind = "Databases" from the supported file types list. |
No further action. (This is just an informational flag.) |
Deleted Item |
Permanently deleted items that were recovered from slack space. |
No further action. (This is just an informational flag.) |
Empty File |
Items that are 0 KB in size. |
No further action. (This is just an informational flag.) |
Encrypted |
Items that the application has determined to contain encrypted content. |
Obtain the password for the file, apply the password to the file, and reprocess or replace the file in Nuix Discover. Alternatively, use password-cracking software or a consulting solution to obtain a decrypted copy of the file, and then reprocess or replace the file in Nuix Discover. Note: Reprocessing applies only to container documents. |
Export Failed |
Items flagged by the application as "poison" files, OR items of MIME types application/vnd.ms-outlook-activity, application/vnd.ms-outlook-journal or application/vnd.ms-outlook-task that the application was unable to write to a native file, OR items where no binary data is available to create a native file. |
No further action. Note: This is just an informational flag. |
Export Slipsheet |
No longer used. |
No further action. |
Extracted Text Only |
If Ingestions is unable to obtain a native file for a document, and extracted text is available, the text is used in lieu of the native file. This exception is coded if the text is used. |
No further action. |
Field Data Extraction Error |
This indicates an issue with extracting data from the application for a specific value. If this is coded, additional detail is in the [RT] Ingestion Exception Detail field. |
No resolution. Refer to the [RT] Ingestion Exception Detail field. |
Field Data Truncated |
This indicates data that was too long for the target field. The field and the entire value are found in the [RT] Ingestion Exception Detail field. |
No resolution. Refer to the [RT] Ingestion Exception Detail field. |
File copy failed |
The file for this document could not be copied to Nuix Discover. |
Note: Nuix Discover no longer codes “File Copy Failed”. If the job fails, that is reported in the job itself. The user can click the Retry Job button on the Properties page for the ingestion and try to copy the file again. |
Inaccessible Content |
Items with the common name of "Inaccessible Content" from the supported file types list. |
No further action. Note: This is just an informational flag. |
License Restricted |
Nuix license restricted flag. This applies if the Nuix license does not cover processing of a certain file type. |
Contact Nuix for licensing options. |
Missing Hash Value |
Items that do not have an MD5 Hash value. |
No further action. Note: This indicates that the application was unable to obtain a hash value for a file, for example, for a file that was corrupt or inaccessible, or for a file with a 0 KB file size. |
Multimedia |
Items where the Kind = "Multimedia" from the supported file types list. |
No further action. Note: This is just an informational flag. |
NIST Item |
Items with an MD5 Hash value that matches an MD5 Hash value of a known file from the NSRL Reference Data Set, also referred to as the NIST list. |
No further action. Note: This is just an informational flag. |
Non-Business Document |
No longer used. |
No further action. |
Non-Searchable PDF |
Items determined to be a PDF but that do not contain any indexable text. |
N/A - If the case option OCR documents without content files is selected, Optical Character Recognition (OCR) will be run on non-searchable PDF files. Note: If you do not have this case option selected, you may want to OCR these documents. |
Renamed Extension |
Items where the true file extension determined by header analysis is different from the original file extension. |
No further action. Note: This is just an informational flag. |
System File |
Items where the Kind = "System File" from the supported file types list. |
No further action. Note: This is just an informational flag. |
Text Stripped |
Items where the application recognized the file type, but the text and metadata cannot be cleanly extracted. The result is an item that is searchable, but the text may be distorted or not properly formatted. |
No further action. Note: This is just an informational flag. |
Unknown Binary |
Items where Document Kind = "Unrecognized" or items where the MIME type is application/octet-stream. |
No further action. Note: This is just an informational flag. |
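When reviewing a long list of exception values, it can help to separate the ones that call for follow-up from the ones that are informational only. The Python sketch below encodes the table above as lookup sets. The grouping names and the `triage` function are hypothetical, placing Non-Searchable PDF in the follow-up group is a judgment call, and the retired File copy failed item is left out because the application no longer codes it.

```python
from typing import Dict, Iterable, List

# Exceptions whose listed resolution asks the user to do something.
FOLLOW_UP = {"Corrupted", "Encrypted", "License Restricted", "Non-Searchable PDF"}

# Exceptions whose resolution points to the [RT] Ingestion Exception Detail field.
SEE_DETAIL_FIELD = {
    "Data Type Conversion Failed",
    "Field Data Extraction Error",
    "Field Data Truncated",
}

# Exceptions listed with no further action.
INFORMATIONAL = {
    "Databases", "Deleted Item", "Empty File", "Export Failed", "Export Slipsheet",
    "Extracted Text Only", "Inaccessible Content", "Missing Hash Value", "Multimedia",
    "NIST Item", "Non-Business Document", "Renamed Extension", "System File",
    "Text Stripped", "Unknown Binary",
}


def triage(exceptions: Iterable[str]) -> Dict[str, List[str]]:
    """Sort exception names into review groups (illustrative grouping only)."""
    groups: Dict[str, List[str]] = {
        "follow_up": [], "see_detail_field": [], "informational": [], "unrecognized": [],
    }
    for name in exceptions:
        if name in FOLLOW_UP:
            groups["follow_up"].append(name)
        elif name in SEE_DETAIL_FIELD:
            groups["see_detail_field"].append(name)
        elif name in INFORMATIONAL:
            groups["informational"].append(name)
        else:
            groups["unrecognized"].append(name)
    return groups
```

Values that land in the unrecognized group are ones this sketch does not know about, for example list items added in a later release.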