Nextpoint EDA
Reporting and Exporting
During a Nextpoint EDA project, you may need to run reports to share with the court, your team, and/or Opposing Counsel. You will likely also need to export culled data for review in your Nextpoint database. Here is how you can export from the Nextpoint EDA tool:
Exporting Reports
- Navigate to the "Export" tab in your Nextpoint EDA instance.
- You have the option to toggle between "S3 Exports" and "Reports." Select the "Reports" option.
- Click on the "Generate Report" button.
- Name the report you intend to export.
- Select the "Report Type" you would like to produce. Currently, the report options are "Import Summary Report," "Search Hit Report," and "Error Breakdown Report."
- Click the "Next" button.
- If you selected the "Search Hit Report," you will be taken to a list of the slices created for your data set. You can select one slice to populate the search hit report.
- Then click the “Generate” button.
- If you selected the "Import Summary Report" or the "Error Summary Report," you will be taken to a list of previous import batches. You can select one or more import batches to generate either of these reports.
- Then click the "Generate" button.
- Once the report finishes processing, it will appear in the Report List (it will remain greyed out while processing). You can access the report by clicking on the hyperlinked name of the report.
- From there you can download the report as a PDF.
Report Options:
- Search Term Hit Report – A report that contains all project searches to which the files in the data set are responsive. Users have the ability to isolate searches to only report back on the specific searches they want included by slicing data to include those searches. See the Sample Search Term Hit Report below.
- Import Summary Report - A report of basic information on a list of selected import batches including data on size, total documents per batch, dates that the import batches were run and processing time. The report also includes size and file counts for the overall data set and the combined selected import batches. See the Sample Import Summary Report below.
- Error Summary Report - A list of selected batches with a summary of processing errors by type, a list of archives that failed to import, and summary of each import batch including size, document count, the number of archive errors in that import batch and the processing date and time. The report also includes size and file counts for the overall data set and the combined selected import batches. See the Sample Error Breakdown Report below.
Exporting Data
- Navigate to the "Export" tab.
- Select the “S3 Exports” Tab at the top of the page. On this page you also have access to information about all of your previous exports.
- To generate a new export, click on the “New Export” button.
- A window will appear asking you to name your export.
- You can then select the “Slices” radio button. You also have the ability to export a previous import by selecting the “Imports” radio button in the rare case that you may need to export a full set of files you imported.
- Select the slice(s) you want to export.
- Click the “Next” button.
- You will see the name you’ve chosen and the slice(s) you’ve selected to export. There you can assign the destination location of the export from a list of Nextpoint database file rooms already entered into the Nextpoint EDA platform. You can also add a new S3 location (e.g. an additional database’s file room) by selecting the “Add New Location” option and entering the AWS S3 Keys.
- Click the “Next” button.
- Select the folder to which you want to export the data set. Currently, you must create the folder/subfolder in your database’s file room prior to exporting to it (as you cannot create a folder during the export process).
- Click the "Export" button and your export will begin to run.
- The “Status” column will indicate when your export is complete. Once complete, you can click on the hamburger (3 dot) menu next to the export to view details about it (size, file count, timing, exporting user, and export location, as well as the slice(s) included in the export). From this window, you may also edit the name of the export, download an error report about the export, or delete it from your list of exports. Deleting the export from this list will NOT delete it from the location it was exported to. Once the "Status" shows as "Complete," you should have access to your data in the file room or S3 location chosen for the export.
Next up: Nextpoint EDA - Glossary
Or view one of the other support resources in the Nextpoint EDA series:
Nextpoint EDA – Getting Started
Nextpoint EDA - Project Dashboard
Nextpoint EDA – Uploading and Importing Data
Nextpoint EDA - Searches and Search Groups
Nextpoint EDA - Exporting Reports and Data
Searches and Slices
The primary function of Nextpoint EDA is to run search terms on large sets of documents. Both Searches and Slices allow users to narrow their data sets in order to only review the most relevant and useful data. Searches will focus primarily on the terms found in the text of your data while slices will allow you to group search groups and restrict your final sets by the properties of your data. Here is how "Search" works in Nextpoint EDA.
Begin by clicking onto the "Search" tab.
Search Builder
- Building your search:
- In the search builder input field (1), you can manually enter your searches or paste from your external documentation into the field.
- In the search builder, each line item is equal to one search. If you would like to start a new search within the input field simply press enter/return on your keyboard.
- Note: We strongly suggest running searches in sets together, rather than individually when possible. This will be the most time and cost efficient way to search.
- For example, it is considerably faster to run 100 searches together than it is to run them individually.
- You can also work out the syntax in any outside text editor and copy/paste them into the search builder.
- Currently, only the following metadata fields can be searched via the search page:
mailbox_path
author
subject
email_from
email_to
email_cc
email_bcc
email_subject
You can filter all other available metadata and file properties via the slice builder when creating your slice.
- Assign to a search group:
- While you can allow searches to run individually, it is usually best to assign them to a search group. This way you can run reports on the set or include them in later slices and exports.
- To add to a search group, click on the dropdown menu below "Add to Search Group" (2) and select an existing group or choose the "Create New" option at the top of the menu.
- Save Searches:
- You can also put a date restriction on individual searches at this point or you can add date restrictions in the "slice" step. If you restrict dates in this step, the date range will be connected with each individual search term. If you want to be able to adjust this date restriction on later iterations of your project, it is best to restrict your dates in the "slice" section.
- If you choose to include a Date Restriction, note that searches are inclusive of the input dates (so 10/29/2022 to 10/31/2022 would include 3 total days).
- Once your searches are in the input box and you have grouped them as you want, click on the "Save Searches" button (3). Then, the searches will transfer to the "Saved Searches" tab (4).
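The inclusive date behavior described above can be illustrated with a short Python sketch. The function names here are hypothetical and are not part of Nextpoint EDA; the sketch only shows how a range that includes both endpoints is counted and tested.

```python
from datetime import date


def inclusive_day_count(start: date, end: date) -> int:
    """Count the days in a range that includes both endpoints."""
    if end < start:
        raise ValueError("end date must not precede start date")
    return (end - start).days + 1


def in_inclusive_range(day: date, start: date, end: date) -> bool:
    """Return True when the day falls on or between the two endpoints."""
    return start <= day <= end


# The example from the text: 10/29/2022 to 10/31/2022.
print(inclusive_day_count(date(2022, 10, 29), date(2022, 10, 31)))  # 3
```

A file dated 10/31/2022 would therefore fall inside the restriction, while a file dated 11/01/2022 would not.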
Saved Searches
The Saved Search table showcases a list of all of the searches you have created within this project.
- Here you can see how each line item within the search builder (as mentioned in the previous step) appears as its own row with related data, conditions applied, and slice assigned. Clicking on the search term will show the term along with any date restrictions applied to it.
- Each line item includes specific data relating to that search including file and family count, uniqueness and search proportion. See Glossary.
- "0" results means that there were no hits for that search.
- Empty results means that the search has not run yet. The user should hit the “calculate results” button (2) to view the results of the search.
- The "Calculate Results" button refreshes both new and old searches with updated hit counts based on all documents currently in the database. If you have multiple searches (or even multiple slices) to run you should add them all to the search table before clicking the "Calculate Results" button.
- "Error calculating results" means that an internal error occurred on this search. Users should reach out to the support team to identify the issue and possible next steps. If you would like to retry these searches, copy them to the builder, edit as needed, and run them again.
- If you selected a search group in the “Search Groups” column of the search page, you will be shown the Search Group Details modal, which gives you more detailed insight into your search group, including the search terms. In this modal, you can also remove searches from a search group.
- You can review, compare, and contrast these search groups and the data that they yield for context as to what you may want to export later on.
- You can add new search terms to a group on the fly by selecting the "+Add" button next to unassigned search terms.
- The "Copy to Builder" button (4) will copy selected terms to the builder where they can be edited and rerun with modified conditions. This button will only be active when one or more terms is selected.
- To clear out your list of saved searches, you can select the ones you want to remove from the table and click on the "Archive/Unarchive" button (5). This action is reversible and you can review archived searches at the bottom of your saved searches chart (they will have "True" in the "Archived" column of the chart).
- Search hit counts only refresh after the "Calculate Results" button is pushed. The "Last Updated" date/time (6) lets the user know the last time that the searches were updated. If you import new document sets and want prior search sets to include the new documents, you need to click the "Calculate Results" button to recalculate the results.
In Nextpoint EDA, your data can not only be searched for keywords, but it can also be sliced to combine complex file filters and parameters with your searches so that you can pull out very specific data sets for your review.
Slice Builder
- In the "Slice" tab, you can access a number of options to build your slice:
- Name your slice (1).
- Select from the "Slice Field" options below the editor (2).
- Select the specific criterion within that slice field to add to your slice (3).
- Apply a date restriction (4) filter (if applicable). Note that the date restriction is slice field specific. If you apply a date restriction to one part of your slice but use "OR" connectors, it is possible that your resulting data set will include files outside your date range.
- Click on the "Add to Slice" button at the bottom of your "Slice Fields" chart.
- Add additional slice field options as necessary.
- Adjust your connectors (6) to fit the specific needs of your data set. Note that all connectors inside each grouping must be identical. Mixing AND, OR, and NOT connectors within the same slice grouping separated by parentheses could result in unpredictable and inconsistent results.
- When your slice is ready, click on the "Create" button to slice your data set.
Saved Slices
Once you create a slice, it saves to your "Saved Slices" (1) and then calculates your results. The time this processing takes depends on the number of documents in your database, but when it finishes running, you can view the file count of specific hits and the family count of your hits (2) including their full families. If you want to review the specific syntax of a slice, click on the name of the slice (3).
Slice Troubleshooting
As your slices become increasingly complex, the connector restrictions described above may limit your ability to isolate a specific data set. In these cases (specifically if you need to mix connectors but cannot adjust the slice builder to fit your needs), try breaking your slice into separate slices. For example, let's say you want to include all hits from a specific search group OR all spreadsheets from a specific custodian in the same slice. You may want your slice to look something like:
(Search Group:"2-9-24 set") OR ((file_type:xl* OR file_type:csv) AND (custodian:"Benjamin Rogers"))
The current iteration of the slice builder cannot use double parentheses or different connectors within the same group. To run this search you need to create 2 slices. First slice:
(file_type:xl* OR file_type:csv) AND (custodian:"Benjamin Rogers")
Then run that slice OR your search group to create your final slice:
(Search Group:"2-9-24 set") OR (Slice:"Rogers Spreadsheets")
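To see why the two-step approach yields the same files as the single nested expression, here is a minimal Python sketch that treats each criterion as a set of file IDs. The IDs and variable names are made up for illustration; the slice builder itself does not expose anything like this.

```python
# Hypothetical file IDs standing in for the hits of each criterion.
search_group_hits = {1, 2, 3}   # Search Group:"2-9-24 set"
excel_files = {3, 4, 5}         # file_type:xl*
csv_files = {6, 7}              # file_type:csv
rogers_files = {4, 6, 8}        # custodian:"Benjamin Rogers"

# The single nested expression that the slice builder cannot express directly.
desired = search_group_hits | ((excel_files | csv_files) & rogers_files)

# Step 1: the intermediate slice ("Rogers Spreadsheets").
rogers_spreadsheets = (excel_files | csv_files) & rogers_files

# Step 2: the final slice combines the search group with the saved slice.
final_slice = search_group_hits | rogers_spreadsheets

print(sorted(rogers_spreadsheets))  # [4, 6]
print(final_slice == desired)       # True
```

Saving the inner grouping as its own slice simply gives the parenthesized part a name, so the final slice only needs one connector type.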
If you need help creating searches or slices to meet your needs, reach out to support@nextpoint.com for support.
Nextpoint EDA uses a powerful search syntax called dtSearch. There are differences between dtSearch and the search syntax employed in Nextpoint databases, so some translation may be required.
Documents are searchable with scans after processing is completed. Consultation on terms and syntax is available for an additional hourly charge.
As in a Nextpoint database, Nextpoint EDA uses boolean searching for text searches. A "boolean" search request consists of a group of words or phrases linked by connectors such as AND and OR that indicate the relationship between them.
Examples:
Search Request | Meaning |
apple and pear | both words must be present |
apple or pear | either word can be present |
apple w/5 pear | "apple" must occur within 5 words of "pear" |
apple not w/12 pear | "apple" must occur, but not within 12 words of "pear" |
apple and not pear | "apple" must be present and "pear" cannot be present |
name contains smith | the field name must contain smith |
apple w/5 xfirstword | apple must occur in the first five words of the document |
apple w/5 xlastword | apple must occur in the last five words of the document |
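As a rough illustration of the w/N connector, the Python sketch below checks whether two words occur within a given number of words of each other. This is not how dtSearch is implemented; the function names are hypothetical, and the exact way words are tokenized and counted at the boundary is an assumption.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lower-case the text and split it into words (letters and digits only)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def within(text: str, first: str, second: str, distance: int) -> bool:
    """Return True if `first` occurs within `distance` words of `second`."""
    words = tokenize(text)
    first_positions = [i for i, w in enumerate(words) if w == first]
    second_positions = [i for i, w in enumerate(words) if w == second]
    return any(abs(i - j) <= distance
               for i in first_positions
               for j in second_positions)


# Rough equivalent of: apple w/5 pear
print(within("An apple fell beside the ripe pear", "apple", "pear", 5))       # True
print(within("apple one two three four five six pear", "apple", "pear", 5))   # False
```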
Warning
Exact phrases should be set off by quotation marks.
"test phrase" OR single OR word
If you use more than one connector (and, or, contains, etc.), you should use parentheses to indicate precisely what you want to search for. For example, apple and pear or orange could mean (apple and pear) or orange, or it could mean apple and (pear or orange). For best results, always enclose expressions with connectors in parentheses. Example:
(apple and pear) or (name contains smith)
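The difference between the two groupings is easy to check with a small Python sketch. The sample document and the helper `contains` are made up for illustration; the point is only that the same three terms give different results depending on where the parentheses go.

```python
def contains(text: str, word: str) -> bool:
    """Return True if the word appears in the text as a whole word."""
    return word in text.lower().split()


document = "fresh orange juice"

apple = contains(document, "apple")
pear = contains(document, "pear")
orange = contains(document, "orange")

grouped_left = (apple and pear) or orange    # (apple and pear) or orange
grouped_right = apple and (pear or orange)   # apple and (pear or orange)

print(grouped_left)   # True
print(grouped_right)  # False
```

A document that mentions only "orange" is a hit under the first grouping but not under the second, which is why explicit parentheses are recommended.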
Field Filtering (in the Slice Section)
The following metadata fields can be used to filter for hits, but they must be applied in the "Slices" section of your Nextpoint EDA app. Fields can vary based on the data type. For emails, we generally extract the following field information (if available):
import_path
ancestry
file_type
file_size
md5
s3_path
status
project_id
batch_id
searchability
content_type
creation_date
creator
language
email_date
email_content_type
email_message_id
has_children
family_date
file_id
family_id
- Any file extracted from another file (loose files from a zip, attachments from emails, etc.) will have an ancestry field
- Any file that has other files extracted from it will have a has_children field (value is true/false)
- Any file directly or indirectly extracted from a mailbox will have a mailbox_path field
- Most (if not all) text-based files will have author, content_type, creation_date, and language fields.
- Emails and their attachments will have a family_date field. This is like Nextpoint's "master_date" field. It uses the family parent's creation_date value, but it's inherited by all children in the family.
All files should have the following metadata fields which can be searched on:
import_path
file_type
file_size
md5
s3_path
status
searchability
project_id
batch_id
file_id
family_id
Search terms may include the following special characters:
Character | Meaning |
? | matches any character |
= | matches any single digit |
* | matches any number of characters |
% | fuzzy search |
# | phonic search |
~ | stemming |
& | synonym search |
~~ | numeric range |
## | regular expression |
Fuzzy Searching
Fuzzy searching will find a word even if it is misspelled. For example, a fuzzy search for apple will find appple. Fuzzy searching can be useful when you are searching text that may contain typographical errors (such as emails), or for text that has been scanned using optical character recognition (OCR).
Add fuzziness selectively using the % character. The number of % characters you add determines the number of differences dtSearch will ignore when searching for a word. The position of the % characters determines how many letters at the start of the word have to match exactly. Examples:
ba%nana
Word must begin with ba and have at most one difference between it and banana.
b%%anana
Word must begin with b and have at most two differences between it and banana.
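The two rules above (an exact prefix plus a limited number of differences) can be approximated with the Python sketch below. It is only an approximation: it assumes that a "difference" is a single-character insertion, deletion, or substitution (Levenshtein distance), which this guide does not confirm, and the function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def fuzzy_match(pattern: str, word: str) -> bool:
    """Check a word against a pattern such as 'ba%nana' or 'b%%anana'."""
    prefix = pattern.split("%", 1)[0]    # letters that must match exactly
    allowed = pattern.count("%")         # number of differences tolerated
    target = pattern.replace("%", "")    # the word being searched for
    return word.startswith(prefix) and edit_distance(word, target) <= allowed


print(fuzzy_match("ba%nana", "bannana"))   # True
print(fuzzy_match("ba%nana", "bonana"))    # False
print(fuzzy_match("b%%anana", "bonana"))   # True
```

Moving the % characters toward the front of the word loosens the prefix requirement, and adding more of them tolerates more differences.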
Phonic Searching
Phonic searching looks for a word that sounds like the word you are searching for and begins with the same letter. For example, a phonic search for Smith will also find Smithe and Smythe.
To ask dtSearch to search for a word phonically, put a # in front of the word in your search request. Examples:
#smith
#johnson
Stemming
Stemming extends a search to cover grammatical variations on a word. For example, a search for fish would also find fishing. A search for applied would also find applying, applies, and apply.
To add stemming selectively, add a ~ at the end of words that you want stemmed in a search. Example: apply~
The stemming rules included with dtSearch are designed to work with the English language.
Synonym Searching
Synonym searching finds synonyms of a word that you include in a search request. For example, a search for fast would also find quickly. You can enable synonym searching selectively by adding the & character after certain words in your request. Example:
improve& w/5 search
Numeric Range Searching
A numeric range search is a search for any numbers that fall within a specified range. To add a numeric range component to a search request, enter the upper and lower bounds of the search separated by ~~ like this:
apple w/5 12~~17
This request would find any document containing apple within 5 words of a number between 12 and 17.
Notes
- A numeric range search includes the upper and lower bounds (so 12 and 17 would be retrieved in the above example).
- Numeric range searches only work with integers greater than or equal to zero, and less than 2,147,483,648.
- For purposes of numeric range searching, decimal points and commas are treated as spaces and minus signs are ignored. For example, -123,456.78 would be interpreted as: 123 456 78 (three numbers).
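The notes above can be sketched in Python as follows. This is an illustration of the stated rules only, not dtSearch's implementation; the function names are hypothetical, and treating every non-digit character as a separator is a simplifying assumption that happens to reproduce the -123,456.78 example.

```python
import re

UPPER_LIMIT = 2_147_483_648  # searchable values must be less than this


def extract_numbers(text: str) -> list[int]:
    """Pull out runs of digits, so '-123,456.78' yields three numbers."""
    values = [int(run) for run in re.findall(r"[0-9]+", text)]
    return [value for value in values if value < UPPER_LIMIT]


def has_number_in_range(text: str, low: int, high: int) -> bool:
    """Return True if any number in the text lies in the inclusive range."""
    return any(low <= value <= high for value in extract_numbers(text))


print(extract_numbers("-123,456.78"))                    # [123, 456, 78]
print(has_number_in_range("apple costs 17", 12, 17))     # True
print(has_number_in_range("apple costs 18", 12, 17))     # False
```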
Regular Expressions
Regular expression searching provides a way to search for advanced combinations of characters. A regular expression included in a search request must be quoted and must begin with ##.
Examples:
Apple and "##199[0-9]"
This would hit on a file containing the word "Apple" and the number 1994 (or 1990, 1991...1999).
Apple and "##19[0-9]+"
This would hit on a file containing the word "Apple" and the number 194 (or 1964 or 1983302002...).
Special characters in a regular expression are:
Regular expression |
Effect |
. (period) |
Matches any single character. Example: "sampl." would match "sample" or "samplZ" |
\ |
Treat next character literally. Example: in "\$100", the \ indicates that the pattern is "$100", not end-of-line ($) followed by "100" |
[abc] |
Brackets indicate a set of characters, one of which must be present. For example, "sampl[ae]" would match "sample" or "sampla", but not "samplx" |
[a-z] |
Inside brackets, a dash indicates a range of characters. For example, "[a-z]" matches any single lower-case letter. |
[^a-z] |
Indicates any character except the ones in the bracketed range. |
.* (period, asterisk) |
An asterisk means "0 or more" of something, so .* would match any string of characters, or nothing |
.+ (period, plus) |
A plus means "1 or more" of something, so .+ would match any string of at least one character |
[a-z]+ |
Any sequence of one or more lower-case letters. |
Limitations
- A regular expression must match a single whole word. For example, a search for "##app.*ie" would not find "apple pie".
- Only letters and numbers are searchable. Characters that are not indexed as letters are not searchable even using regular expressions, because the index does not contain any information about them.
- Because the dtSearch index does not store information about line breaks, searches that include beginning-of-line or end-of-line regular expression criteria (^ and $) will not work.
- No case or other conversion is done on regular expressions, so a regular expression must match the case of the information stored in the index. If an index is case-insensitive, all letters in the regular expression must be lower-case. If a character is not searchable in the index, then it cannot be included as a searchable character in the regular expression. Non-searchable characters in a regular expression are not ignored as they are in other search expressions.
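The whole-word limitation can be illustrated with Python's `re` module. Python's regular expression engine is not the one dtSearch uses, and the helper `regex_hits` is hypothetical; the sketch only shows what it means for an expression to be tested against one word at a time.

```python
import re


def regex_hits(pattern: str, text: str) -> list[str]:
    """Return the words in `text` that the pattern matches in full."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    return [word for word in words if re.fullmatch(pattern, word)]


print(regex_hits(r"199[0-9]", "Apple shipped in 1994"))    # ['1994']
print(regex_hits(r"19[0-9]+", "lot 194 and 1983302002"))   # ['194', '1983302002']
print(regex_hits(r"app.*ie", "apple pie"))                 # []
```

The last call returns nothing because neither "apple" nor "pie" matches the expression on its own, even though the phrase as a whole would. Note that this sketch does not model the case rules described above.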
Performance
A regular expression is like the * wildcard character in its effect on search speed: the closer to the front of a word the expression is, the more it will slow searching. "appl.*" will be nearly as fast as "apple", while ".*pple" will be much slower.
Searching for numbers
The = wildcard, which matches a single digit, is faster than regular expressions for matching patterns of numbers. For example, to search for a social security number, you could use "=== == ====" instead of the equivalent regular expression.
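For comparison, the following Python sketch spells out the regular expression that a pattern of = wildcards corresponds to. The helper name is hypothetical, and the conversion is shown with Python's `re` module rather than dtSearch syntax.

```python
import re


def digit_pattern_to_regex(pattern: str) -> re.Pattern:
    """Turn each '=' into a single-digit class and keep other characters literal."""
    parts = ["[0-9]" if char == "=" else re.escape(char) for char in pattern]
    return re.compile("".join(parts))


ssn_like = digit_pattern_to_regex("=== == ====")

print(bool(ssn_like.fullmatch("123 45 6789")))   # True
print(bool(ssn_like.fullmatch("123 45 678")))    # False
```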
For additional information about dtSearch syntax, review the following documentation (from which this search guide was adapted): https://support.dtsearch.com/webhelp/dtsearch/search_requests_overview.htm
Importing
The first step in mining your data is to upload and import it into the Nextpoint EDA tool.
Step 1: Getting Started with Imports
- First Time User without imported data: As a first time user, you will be immediately prompted to import your data from the Dashboard.
- Returning User with existing data: To import new data, simply navigate to the import tab and select “New Import”.
Step 2: Naming your Import and Selecting Your Source for Nextpoint EDA
In order to import your files directly into the Nextpoint EDA project, users have the ability to add any outside S3 sources, including their Nextpoint database(s). Once a location has been added and successfully verified, the source is saved and the user will be able to access that for all future imports. Additionally, each Nextpoint EDA project comes with a Nextpoint EDA Repository pre-created for that project which can be used directly to house source data.
Name the Import and Selecting an Existing Source
- Name your import. This name will appear later on your import batch list, so make the name clear and unique to this import data set.
- Select the source of the data (your Nextpoint EDA s3 repository, a Nextpoint File Room, or an external s3 Repository). If you need to add a new source location, check out the next section "Adding a New Source Location".
- Click the "Next" button.
Adding a New Source Location
Amazon s3 sources are virtual data storage locations used for housing large data sets. Your Discovery or Litigation Nextpoint File Room is an example of an s3 location. To add a non-Nextpoint EDA s3 location (like a Nextpoint File Room) to your Nextpoint EDA project:
- Click on “Add New” at the bottom of a new import window.
- Name your new s3 location (e.g. "Hoven v. Enron Discovery Database").
- Copy and paste your AWS Access Key ID into the text box below that option. In a Nextpoint database, all of these can be found in the “Settings” tab under “Import” in the "File Room" section. For more information about accessing your AWS keys and File Room Path, visit this support article.
- Copy and paste your Secret Access Key into the text box below that option.
- Copy and paste your File Room Path into the text box below that option.
- Click “Add” and confirm that the system was able to verify your credentials. You should see the word "Success" in green with a checkmark next to it by the new source you added.
Nextpoint EDA S3 Repository
If you choose to import directly from your Nextpoint EDA s3 repository, a tool tip on the import screen labeled “How do I transfer files into my Nextpoint EDA repository?” will guide you through how to pull your data into your Nextpoint EDA repository. The required AWS Access Key ID, Secret Access Key, and File Path will be provided here for input into your external sources.
Using these keys you can use any of the tools listed below to transfer your data into your repository.
If you get an error when adding an external s3 location after adding your keys, it could be because of a CORS error. If this occurs, take the following steps to add a CORS configuration to an s3 bucket:
- Sign in to the AWS Management Console and open the Amazon S3 console at https://console.aws.amazon.com/s3/.
- In the Buckets list, choose the name of the bucket that you want to create a bucket policy for.
- Choose Permissions.
- In the Cross-origin resource sharing (CORS) section, choose Edit.
- In the CORS configuration editor text box, type or copy and paste a "new CORS configuration", or "edit an existing configuration":
[
  {
    "AllowedHeaders": [
      "authorization",
      "content-length",
      "content-md5",
      "content-type",
      "host",
      "origin",
      "x-amz-acl",
      "x-amz-content-sha256",
      "x-amz-date",
      "x-amz-meta-path",
      "x-amz-meta-qqfilename",
      "x-amz-security-token",
      "x-amz-server-side-encryption",
      "x-amz-user-agent",
      "amz-sdk-invocation-id",
      "amz-sdk-request",
      "x-amz-bucket-region",
      "x-amz-expected-bucket-owner"
    ],
    "AllowedMethods": ["GET", "POST", "PUT", "HEAD"],
    "AllowedOrigins": ["*"],
    "ExposeHeaders": ["ETag"],
    "MaxAgeSeconds": 3000
  }
]
- The CORS configuration is a JSON file. The text that you type in the editor must be valid JSON. For more information, see CORS configuration.
- Choose Save changes.
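Because the editor only accepts valid JSON, it can help to check the text locally before pasting it in. The Python sketch below is one way to do that; the function name is hypothetical, and the check for "AllowedMethods" and "AllowedOrigins" is an assumption based on the keys used in the configuration above, not a full validation of what AWS accepts.

```python
import json

# Keys assumed to be needed in every rule (taken from the sample configuration).
EXPECTED_KEYS = {"AllowedMethods", "AllowedOrigins"}


def check_cors_config(text: str) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    try:
        rules = json.loads(text)
    except json.JSONDecodeError as error:
        return [f"not valid JSON: {error.msg} (line {error.lineno})"]
    if not isinstance(rules, list):
        return ["top level must be a JSON array of rules"]
    problems = []
    for index, rule in enumerate(rules):
        if not isinstance(rule, dict):
            problems.append(f"rule {index} is not a JSON object")
            continue
        for key in sorted(EXPECTED_KEYS - rule.keys()):
            problems.append(f"rule {index} is missing {key}")
    return problems


sample = '[{"AllowedMethods": ["GET"], "AllowedOrigins": ["*"]}]'
print(check_cors_config(sample))   # []
```

A trailing comma or a missing bracket, two common copy-and-paste mistakes, would be reported as "not valid JSON" along with the line number.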
Still Getting Errors?
The following AWS IAM policy grants permission to list and download objects from an S3 bucket, and it could help you set up AWS permissions. Note: this will work only for imports, not for exports.
- Sign in to the AWS Management Console and open the Amazon S3 console at https://console.aws.amazon.com/s3/.
- In the Buckets list, choose the name of the bucket that you want to create a bucket policy for.
- Choose Permissions.
- In the Bucket Policy Section, choose Edit.
- In the editor text box, type or copy and paste the following, updated to include information about your bucket:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::bucketname/*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::bucketname"
    }
  ]
}
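If you need to prepare this policy for more than one bucket, a small Python sketch like the one below can fill in the bucket name for you. The function name and the bucket name "example-bucket" are placeholders for illustration; the policy content itself is copied from the text above.

```python
import json


def build_read_only_policy(bucket_name: str) -> str:
    """Return the policy above as JSON text with the bucket name filled in."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:GetObject",
                # Objects inside the bucket (needed to download files).
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
            {
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                # The bucket itself (needed to list its contents).
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
        ],
    }
    return json.dumps(policy, indent=2)


print(build_read_only_policy("example-bucket"))
```

Note that the two statements use different resources: the object-level action ends in /* while the bucket-level action does not.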
If uploading the data yourself is not possible or you have questions about your specific situation, reach out to your client success manager for other options.
Note
Linking data from your file room to the Nextpoint EDA tool will create a copy of the data in the Nextpoint EDA repository. If you are uploading new data, we recommend placing it directly into your Nextpoint EDA repository. If the data has already been uploaded to your database's file room, it is fine to utilize this option for Nextpoint EDA.
Step 3: Selecting Data for Import
- Review the selections from the previous step, such as "Import Name" and "Source Selected." The data within the selected source can be seen in the table above.
- You have the option (not a required field) to assign a custodian(s) to the import (if applicable). At this time, custodians added to a batch are assigned to all files in the batch (so a custodian cannot be assigned to only certain parts of a batch). Custodians can be edited or added after the import completes as well (see the custodians section below for details).
- Select the folder or file for import. At this time, only a single folder or file is eligible for import in one batch.
- Click “Import.”
- The import list will show the batch as "queued" (waiting in line to start processing) and then “processing” until the batch is “complete.” On very rare occasions a batch will show as “failed,” at which point you should contact the support team to identify the issue with the import.
- To download a csv of your import batch list, click on “Export CSV” at the bottom of the batch list.
- If you click on the hamburger (3 dot) menu next to any import batch, you have the option to "Edit Import Details" or "Download CSV Error Report."
Considerations for Importing
Full text + Metadata | Metadata only | Can be identified, not processed |
pst zip mbox eml msg jpg png tiff bmp gif rtf txt doc docx xls xlsx ppt pptx dat data csv htm html mht mhtml xml | mp3 wav flac mp4 m4v m4a mov mpg | ics vcf flv pnm pbm pgm ppm ps svg emlx mbx anything encrypted |
Universal Fields | File Type Based* |
import_path ancestry file_type file_size md5 s3_path status searchability project_id batch_id file_id family_id unique_id | mailbox_path author content_type creation_date creator subject language email_from email_to email_cc email_bcc email_subject email_date email_content_transfer_encoding email_content_type email_in_reply_to email_message_id email_thread_index email_thread_topic has_children family_date |
Nextpoint will assign custodians upon request. Please note that the custodian of a piece of data is not intrinsic to that data, rather it is an employee or other person or group with ownership, custody, or control over potentially relevant information. For example, an individual custodian's electronically stored information (ESI) usually includes their mail file, whereas a group custodian's ESI may include a shared network folder. Due to this, custodians cannot be assigned without direction as to how the data was collected.
Email archives collected and combined into a single PST file with multiple folders can be split among multiple custodians after processing has been completed. Assignment of more than 10 custodians in a single import may be billed as an additional hourly charge.
To add or edit the custodian(s) assigned to an import batch, click on the 3 dots next to the import and select the option to "Edit Assigned Custodians". Then click on the "-" next to an existing custodian to remove them or click on the "+ Assign Custodian" link to add a new or existing custodian. Once you select a new custodian for the batch, click the "Assign", and their name should appear on the list of existing custodians. Click the "Save" button to add the new custodian to the data from this batch.
Documents are standardized and processed into coordinated universal time (UTC) unless otherwise requested. This time zone will be used for all date filters and to standardize any datetime metadata fields. A time zone offset can be provided in document metadata, expressed as the offset from GMT in which the data was processed. For example, if the data was processed in GMT-5, this would be populated with -5.00.
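The effect of standardizing to UTC can be seen in the short Python sketch below. The function name and the sample timestamp are made up for illustration; the sketch only shows how a local time with a -5.00 offset maps to UTC, which matters because date filters are applied in UTC.

```python
from datetime import datetime, timedelta, timezone


def to_utc(local_time: datetime, offset_hours: float) -> datetime:
    """Convert a naive local timestamp to UTC using an offset such as -5.00."""
    local_zone = timezone(timedelta(hours=offset_hours))
    return local_time.replace(tzinfo=local_zone).astimezone(timezone.utc)


# An email sent at 10:30 PM on October 31 in a GMT-5 time zone.
sent = datetime(2022, 10, 31, 22, 30)
print(to_utc(sent, -5.00).isoformat())  # 2022-11-01T03:30:00+00:00
```

In this example the email falls on November 1 once converted, so a date restriction ending on 10/31/2022 would not include it.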
Master date of the document is the date used for filtering and date restrictions. Master date will be generated from the date sent of parent email for emails and their attachments and the last modified date for efiles.
When applying date restrictions, the kept documents are inclusive of the chosen date (master date as described above).
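The master date rule and the inclusive date restriction can be sketched as below. This is a minimal illustration, not Nextpoint's implementation: the `Doc` class, its field names, and the assumption that a restriction is a start/end range are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Doc:
    name: str
    last_modified: Optional[date] = None  # used for efiles
    parent_sent: Optional[date] = None    # sent date of the parent email, if any


def master_date(doc: Doc) -> Optional[date]:
    """Emails and their attachments use the parent email's sent date;
    efiles fall back to the last modified date."""
    return doc.parent_sent if doc.parent_sent is not None else doc.last_modified


def within_restriction(doc: Doc, start: date, end: date) -> bool:
    """Inclusive on both ends: a master date equal to start or end is kept."""
    d = master_date(doc)
    return d is not None and start <= d <= end


docs = [
    Doc("email.msg", parent_sent=date(2020, 1, 1)),
    # The attachment inherits the parent's sent date, not its own modified date.
    Doc("attachment.xlsx", last_modified=date(2019, 6, 1), parent_sent=date(2020, 1, 1)),
    Doc("memo.docx", last_modified=date(2020, 1, 31)),
    Doc("late.docx", last_modified=date(2020, 2, 1)),
]

kept = [d.name for d in docs if within_restriction(d, date(2020, 1, 1), date(2020, 1, 31))]
print(kept)
```

Here the attachment is kept even though it was last modified before the range, because its master date comes from the parent email.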
We deduplicate documents based on email message ID or MD5 hash (if no email message ID is available). Any files with matching email message IDs (or MD5 hashes) will be deduplicated: only one native copy will be stored in the system, and their metadata will be merged by default. That said, documents will not be deduplicated if doing so would split up a document family, so attachments with matching MD5 hashes that are attached to two different emails will be retained as separate documents. Deduplication is done globally within each project, across all batches and custodians.
Currently, this feature cannot be turned off or customized.
Upon import into the Discovery platform, Nextpoint dedupes email families and loose files globally across all custodians. To do so, an MD5 hash value is generated. For emails, the hash is generated from Date Sent, Sender Name, Sender Email Address, Recipient Email Addresses, Display To, Display CC, Display BCC, Subject, Body, Attachment Names, and Attachment Size. For loose files, it is generated from the bit stream of the file.
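To make the hashing approach more concrete, here is a rough Python sketch of global deduplication for top-level items (email families and loose files). The dictionary keys, the field separator, the order in which fields are joined, and the merging of custodians are assumptions made for illustration; the actual normalization used by Nextpoint is not documented here.

```python
import hashlib

# Field names are hypothetical stand-ins for the email fields listed above.
EMAIL_HASH_FIELDS = [
    "date_sent", "sender_name", "sender_email", "recipient_emails",
    "display_to", "display_cc", "display_bcc", "subject", "body",
    "attachment_names", "attachment_size",
]


def email_hash(email: dict) -> str:
    """MD5 over the joined email fields (join order and separator are assumed)."""
    joined = "\x1f".join(str(email.get(f, "")) for f in EMAIL_HASH_FIELDS)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def file_hash(data: bytes) -> str:
    """MD5 over the raw bit stream of a loose file."""
    return hashlib.md5(data).hexdigest()


def dedup_key(item: dict) -> tuple:
    """Prefer the email message ID; fall back to an MD5 hash."""
    if item["kind"] == "email":
        return ("email", item.get("message_id") or email_hash(item))
    return ("file", file_hash(item["data"]))


def deduplicate(items: list) -> list:
    """Keep one copy per key across all custodians, merging custodian names."""
    kept = {}
    for item in items:
        key = dedup_key(item)
        if key not in kept:
            kept[key] = {**item, "custodians": set(item.get("custodians", []))}
        else:
            kept[key]["custodians"].update(item.get("custodians", []))
    return list(kept.values())


items = [
    {"kind": "file", "data": b"quarterly report", "custodians": ["Alice"]},
    {"kind": "file", "data": b"quarterly report", "custodians": ["Bob"]},
    {"kind": "email", "message_id": "<abc@example.com>", "custodians": ["Alice"]},
    {"kind": "email", "message_id": "<abc@example.com>", "custodians": ["Bob"]},
    {"kind": "email", "subject": "Lunch", "body": "Noon?", "custodians": ["Bob"]},
]

unique = deduplicate(items)
print(len(unique))
```

Because this sketch only deduplicates top-level items, attachments travel with their parent email and are never removed on their own, which is consistent with keeping document families intact.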
Archives with zero extracted files or mismatched expected file count (coming soon) will be addressed on import in a quality control pass. Individual file processing and indexing errors will not be addressed, only reported upon.
- Video/Audio Transcription*
- Language Detection (will occur on all imports) and Translation*
- Image Recognition*
- Entity Recognition/PII*
*These services may incur additional costs. Reach out to your client success representative for details.
Next up: Nextpoint EDA - Project Dashboard
Or view one of the other support resources in the Nextpoint EDA series:
Nextpoint EDA – Getting Started
Nextpoint EDA - Searches and Search Groups
Nextpoint EDA – Uploading and Importing Data
Logging-in and Project Setup
Why Nextpoint EDA?
As data volumes and discoverable sources continue to grow, getting a quick understanding of what is in your data before fully processing it will speed up your overall review process and create fewer headaches during your review. Much as your email inbox may be set up to filter out spam so that you only view pertinent information, Nextpoint EDA helps attorneys eliminate the noise before they even lay eyes on their information.
Because each firm’s environment for Nextpoint EDA is unique and independent, Nextpoint EDA is highly secure and fast. We ingest data for Nextpoint EDA at roughly ½ to 1 terabyte per hour.
Getting Started
-
Sign in:
- (New Users) At this time, new users to Nextpoint EDA must be added by Nextpoint. Reach out to your Client Success Director or support@nextpoint.com for more information or to get started.
- Similar to your Nextpoint database login, Nextpoint EDA uses two-factor authentication. The first time you log in, after you type in your email address and password, you will be asked to type in a verification PIN sent to your email address.
- Should you need to reset your password, click on the "Forgot Password" link above the password field. You will then receive an email with a link to reset your password and return to the login screen.
-
Create/Select Project:
- Once you log in, you will be prompted to “Select an Existing Project” (if any exist) or “Create a New Project.” Selecting an existing project will take you to that project's dashboard.
- If you select “Create a New Project,” you will be prompted to type in the Project Name, Client Number, and Matter Number. Note: All data will be processed in UTC.
- Then click “Create.”
-
Importing Data:
- You will then be taken to the project's dashboard where you will be prompted to “Import Data.”
Updating Settings
In the Settings tab, you can also update your settings, manage users, add database file rooms, and change your password.
The General tab allows you to edit some of your profile information, email your CS director, or change the time zone of your profile. Changing the time zone will change how the dates are viewed within the app. All data will continue to be processed in UTC time.
The S3 Locations tab allows you to view current file rooms and other S3 storage instances that this project has access to for imports and exports. You can add new locations here by clicking on the "Add New Location" button and inserting the access keys and location information in the resulting pop-up.
The location name can be anything that will help you reference this location. For a Nextpoint database, the other information can be found under Settings > Imports. If you have not used your S3 credentials in a Nextpoint database, you may need to reach out to support@nextpoint.com so that we can set your credentials.
You can also add new locations on the fly from within an import or export.
The Security tab allows you to update your password.
The Users tab allows you to view all of the users who have accessed your account and projects, including the date they were added and the date they last accessed it. If you are your firm's Nextpoint EDA administrator (usually the first person added on a Nextpoint EDA account), you are also able to add new users in this tab by clicking the "Invite New User" button. Adding the new user's name and email address will automatically send them an email inviting them to the account. If they are a new user, they will be prompted to set up their account before accessing the project.
Next up: Nextpoint EDA – Uploading and Importing Data
Or view one of the other support resources in the Nextpoint EDA series:
Nextpoint EDA - Project Dashboard
Nextpoint EDA - Searches and Search Groups
Nextpoint EDA - Exporting Reports and Data
Nextpoint EDA – Getting Started
The dashboard gives a high-level visual overview of the data that has been imported into your project. You can choose to see how your data breaks down for a specific import, or holistically across all of your data. This can help you make informed decisions for future imports prior to creating searches and slices for export.
Project Dashboard Cards
Valuable aspects of the overview include breakdowns of your data within cards such as:
Imported documents are categorized as follows:
- Fully Searchable – The metadata was successfully extracted from these files and their content was OCR'd to create searchable text.
- Metadata Only – The metadata was successfully extracted from these files, but they are of a type that would not generate searchable OCR'd text, so only the metadata is searchable.
- Unsupported – These file types are not supported by Nextpoint's EDA app, so neither the metadata nor any text in these files will be searchable.
- Unknown – These file types are unknown/unreadable to Nextpoint's EDA app and could not be processed. Neither their metadata nor any text in these files will be searchable. This may also include files that have readability issues (e.g. a corrupt eml file or an encrypted spreadsheet).
Individual documents may contain multiple languages. Each document is categorized based on the "dominant" language – or most prevalent language – detected in the text.
Importing errors are categorized as follows:
- Unsupported Type – This file type is not supported by the Nextpoint EDA app and cannot be processed.
- Unsupported Size – The file size is too large to process.
- DeNIST – Computer system files identified in the NIST National Software Reference Library (NSRL), which are not generally user generated and are therefore often not relevant to litigation.
- Error – An error occurred while processing this file and/or extracting its metadata.
- Other – File specific issues that make them unreadable (e.g. corruption, encryption, empty files...)
The data timeline shows the frequency of documents through your data's date range. It will include custodian frequency along the timeline (if applicable).
Next up: Nextpoint EDA – Uploading and Importing Data
Or view one of the other support resources in the Nextpoint EDA series:
Nextpoint EDA – Getting Started
Nextpoint EDA - Searches and Search Groups
Nextpoint EDA – Project Dashboard
Assigned Slice: The slice(s) that a specific search in the Search Table is assigned to.
Conditions: Specific parameters placed on a slice to limit the return of a search (e.g. custodians or a date range to include)
Created On: The date that a Search was created and added to the Search Table.
Doc Hit Count: The number of documents that directly have one or more search hits. In the context of a report with multiple lines of syntax, overlap may occur if a document directly hits on more than one term.
Early Case Assessment (ECA): The process of evaluating the strengths and weaknesses of a case prior to investing substantial resources in litigation.
Early Data Assessment (EDA): A subset of ECA that involves using data analytics and advanced eDiscovery filtering techniques to understand the contents of digital data at the outset of a matter. The primary goal of an Early Data Assessment project is to conduct an initial review and identify specific documents or other pieces of evidence that establish the strengths and weaknesses of the litigation position.
Family Count: The number of total documents included when full families (parent emails and all of their attachments) are included with the search hits.
File Count: The number of direct hits on a search or included in a slice. This number does not include family members of hits that do not also hit on the search or were included in the slice.
Noise Words (also known as Stop words): A list of common words that are not indexed and therefore not searchable. (e.g. “the project” would only search on the word “project”)
S3 Repository: The data storage location for Nextpoint EDA projects (exactly like the file room in a Nextpoint database).
Search Proportion: The percentage of documents that hit directly on a line of syntax (a search). This is calculated out of the total number of searchable documents in the slice. Search proportion is also known as "inclusiveness."
Search/Searches: A search is a line of terms (words), phrases, boolean connectors, and specialized syntax used to find specific documents. Each search can also have conditions that limit the reach of that search (e.g. a date range or a specific set of custodians).
Slice: A group of searches run in bulk. A slice can also have conditions that will be applied to each search line it contains.
Term: A single word or phrase within the syntax of a search.
Uniqueness: The number of direct document hits where only the current term is hit and no other.
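To show how Doc Hit Count, Search Proportion, and Uniqueness relate to one another, here is a small Python sketch. It assumes each search consists of a single term and that hits are available as sets of document IDs; the function name, search names, and data are hypothetical and do not reflect how Nextpoint EDA computes its reports internally.

```python
from typing import Dict, Set


def search_metrics(hits_by_search: Dict[str, Set[str]],
                   searchable_docs: Set[str]) -> Dict[str, dict]:
    """Compute per-search metrics from sets of directly hit document IDs."""
    metrics = {}
    for name, hits in hits_by_search.items():
        # Documents hit by any other search in the same slice.
        others = set().union(
            *(h for n, h in hits_by_search.items() if n != name)
        )
        metrics[name] = {
            # Documents that directly hit on this search.
            "doc_hit_count": len(hits),
            # Percentage of the searchable documents in the slice.
            "search_proportion": (
                100 * len(hits) / len(searchable_docs) if searchable_docs else 0.0
            ),
            # Hits on this search that no other search also hits.
            "uniqueness": len(hits - others),
        }
    return metrics


searchable = {f"doc{i}" for i in range(1, 11)}  # 10 searchable documents
hits = {
    "contract": {"doc1", "doc2", "doc3"},
    "invoice": {"doc3", "doc4"},
}

report = search_metrics(hits, searchable)
print(report)
```

In this example doc3 hits on both searches, so it counts toward each search's Doc Hit Count but toward neither search's Uniqueness. This is the overlap that the Doc Hit Count definition describes.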
View one of the other support resources in the Nextpoint EDA series:
Nextpoint EDA – Getting Started
Nextpoint EDA - Project Dashboard
Nextpoint EDA – Uploading and Importing Data
Nextpoint EDA - Searches and Search Groups
Nextpoint EDA - Exporting Reports and Data