Ranged Image Imports, Including Produced Data


The steps and workflow tips below are recommended by the Nextpoint Data Strategy Team for importing produced data or data being migrated from another platform. You can also watch our Advanced Import training video for additional information.

All steps outlined are based on the assumption of a ranged image import.  The typical format for such an import is single-page tiff/jpg image files named by their Bates number, together with document-level text files, any included natives, and document breaks/metadata contained in a load file. 

See below for a breakdown of what a ranged image import typically looks like.

WHAT DOES A RANGED IMAGE IMPORT LOOK LIKE?

The data set will contain up to three folders. At the very least, this type of data set must contain an IMAGES folder:

IMAGES - this folder contains the document pages, each a one-page image file

TEXT - this folder contains the OCR text information, and can be either one text file per page, or one text file per document

NATIVES - this folder contains any native files that accompany each document

Image file pages will be in .tif or .jpg format, and the files will be named by their Bates numbers. If included, the OCR text and native files will also be named by the corresponding Bates numbers. Here is what a common single-page image data set looks like:

[Screenshots: 1.png, 2.png, 3.png]
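To make the naming convention concrete, here is a minimal Python sketch of how a begin/end Bates pair from a load file corresponds to a run of single-page image files. It assumes each Bates number is a fixed prefix followed by a zero-padded number (the value ABC0000001 is made up for illustration); real productions may use other schemes. The function name expand_bates_range is hypothetical and is not part of Nextpoint.

```python
import re

# Assumption: a Bates number is any prefix followed by a run of digits,
# e.g. "ABC0000001". Adjust the pattern if your production differs.
_BATES = re.compile(r"^(.*?)(\d+)$")


def expand_bates_range(begin: str, end: str) -> list[str]:
    """Return every Bates number from begin to end, inclusive."""
    m_begin, m_end = _BATES.match(begin), _BATES.match(end)
    if not m_begin or not m_end:
        raise ValueError("Bates numbers must end in digits")
    prefix, first = m_begin.groups()
    end_prefix, last = m_end.groups()
    if end_prefix != prefix:
        raise ValueError("begin and end must share the same prefix")
    if int(last) < int(first):
        raise ValueError("end must not come before begin")
    width = len(first)  # keep the zero padding used by the begin number
    return [f"{prefix}{n:0{width}d}" for n in range(int(first), int(last) + 1)]
```

For example, expand_bates_range("ABC0000001", "ABC0000003") returns three Bates numbers. Under the naming convention described above, those would correspond to the page images ABC0000001.tif through ABC0000003.tif in the IMAGES folder.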


INSTRUCTIONS
    1. Download metadata file and convert from DAT to CSV.
      1. See instructions in Exhibit A
    2. Open the CSV and review the first two or three columns to confirm whether they contain Bates numbers or a general control ID.
      1. These are typically bates_start/bates_end or beg_doc/end_doc columns.

    3. Open the source production folder and navigate to the IMAGES folder. Confirm that it contains individual TIFF/JPG files.
      1. If the folder is named something else (e.g., TIFFS), rename it to IMAGES.
    4. Open Settings >> Coding >> Fields in your Nextpoint database.
    5. Load File Configuration / Document Boundaries: Check whether the load file contains begin/end columns such as begattach/endattach or production begin/production end.

      1. These beginning and ending numbers identify which range of single-page TIFF/JPG files should be pulled to create each document (one document per row).
      2. If there is not already a begattach and endattach field under Settings >> Coding >> Fields, add them as Freeform fields.
    6. Load File Configuration / Metadata fields: Review column headers and compare to the Fields in the database.
      1. Set up Fields: For any header in your load file, you will need to have a corresponding Field.

        • If there is already a field set up in the database, use that.
        • If not, create a new Field named the same as your load file headers.
      2. Be aware of default Fields and Document Attributes which exist (and do not need to be set up).
        • See list in Exhibit B.
        • If a header value in your load file matches a header in Exhibit B, you do not need to set up a corresponding field, but do make sure they match exactly.
      3. Be aware of Protected System Fields!
        • For any of the fields listed in Exhibit C, it is necessary to rename the fields in your load file and set up a corresponding Field.
      4. Text and Native Paths: Nextpoint needs to know which text and native files to grab and line up with their respective document image(s). This is accomplished with the text_file and native_file column headers, which contain the path to and name of the text and native files, respectively.

        • IMPORTANT: These two columns MUST be named text_file and native_file for the import to work correctly.
        • Check that the paths are correct. Each path should start from where the load file is going to be saved, most likely the parent production folder, alongside IMAGES/TEXT/NATIVES. For example, a path should start with TEXT/ or NATIVES/, NOT ./ or /
      5. Specific Field Notes
        • Subject: This refers to an email, so we recommend changing it to Email Subject.
        • Title: This refers to a document, so we recommend changing it to Efile Title.
    7. After all Fields are squared away, it is critical to the import’s success to save your load file as nextpoint_load_file.csv
    8. Check your database SETTINGS for Deduplication
    9. Place the load file at the root of your production folder, then upload the production folder to the Nextpoint File Room:

      [Screenshot: FileRoomImportPlacement.png]
    10. Import your Production Folder
    11. Once the import finishes, complete "Family Linking" from your Batch Summary page. This is vital for visually establishing parent-email relationships in your import.
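For step 1, Exhibit A describes the recommended way to convert a DAT file to CSV. Purely as an illustration, the sketch below shows one way that conversion could be done in Python. It assumes a Concordance-style DAT file that uses ASCII character 20 as the column delimiter and þ (thorn) as the text qualifier. Those delimiters, the default encoding, and the dat_to_csv name are all assumptions, so inspect your own file before relying on them.

```python
import csv

# Assumed Concordance-style delimiters; confirm against your own DAT file.
FIELD_SEP = "\x14"  # ASCII 20, assumed column delimiter
QUOTE = "\u00fe"    # thorn character, assumed text qualifier


def dat_to_csv(dat_path, csv_path, encoding="utf-8-sig"):
    """Rewrite a delimited DAT load file as a comma-separated CSV.

    Returns the number of rows written, including the header row.
    The default encoding is an assumption; some DAT files use UTF-16
    or a Windows code page instead.
    """
    rows = 0
    with open(dat_path, newline="", encoding=encoding) as src, \
            open(csv_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src, delimiter=FIELD_SEP, quotechar=QUOTE)
        writer = csv.writer(dst)
        for row in reader:
            writer.writerow(row)
            rows += 1
    return rows
```

After converting, compare the returned row count with the number of documents you expect (plus one for the header) as a quick sanity check before moving on to step 2.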
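Before uploading (step 9), it can help to check the text_file and native_file paths from step 6 automatically. The following is a minimal Python sketch under these assumptions: the load file is already saved as nextpoint_load_file.csv at the root of the production folder, it is comma-separated with a header row, and paths use forward slashes. The check_load_file function is a hypothetical helper, not a Nextpoint tool.

```python
import csv
from pathlib import Path

LOAD_FILE_NAME = "nextpoint_load_file.csv"
PATH_COLUMNS = ("text_file", "native_file")


def check_load_file(production_dir):
    """Return a list of problems found; an empty list means none were found."""
    root = Path(production_dir)
    load_file = root / LOAD_FILE_NAME
    if not load_file.is_file():
        return [f"{LOAD_FILE_NAME} not found at the root of {root}"]

    problems = []
    with open(load_file, newline="", encoding="utf-8-sig") as fh:
        reader = csv.DictReader(fh)
        headers = reader.fieldnames or []
        # Only check the path columns that this load file actually has.
        present = [col for col in PATH_COLUMNS if col in headers]
        # Row numbers assume one line per row, with the header on line 1.
        for row_num, row in enumerate(reader, start=2):
            for col in present:
                value = (row[col] or "").strip()
                if not value:
                    continue  # not every document has text or a native
                if value.startswith(("./", "/")):
                    problems.append(
                        f"row {row_num}: {col} starts with './' or '/': {value}"
                    )
                elif not (root / value).is_file():
                    problems.append(f"row {row_num}: {col} file not found: {value}")
    return problems
```

Calling check_load_file on the production folder and printing each returned line gives a short list of rows to fix in the load file before you upload.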

 

Have Questions?

Produced data and migration imports can be difficult. After reviewing the above Produced Data Import Approach, if you need additional training or assistance from the Nextpoint Data Strategy team, please contact your Account Director or the Nextpoint Support Team.
