Reveal Processing

Search & Indexing Module

Unlike other modules, the Search & Indexing module has two ribbon-level tabs that separate its Search and Indexing functionality. The Search tab creates, manages, and deletes search terms, whereas the Indexing tab creates, manages, and deletes indexes within the project. Once a Search Group has been created, search terms can be organized within it. The Search Group is represented in the Search Module Navigation with the 60391ae70b6cc.png icon.

Note

Indexing can be automated or it can be a manual build. Depending on the project setting for indexing, an index may need to be built before search is available. See Creating Indexes below for information on this process.

Important

When building or rebuilding an Index, the old Indexes must first be deleted.

Enabling Search or Indexing Tab

  1. Search – To create, manage, and delete Keyword searches, click the Search tab.

    60391aea3b1aa.png
  2. Indexing – To create, manage, and delete Indexes, click the Indexing tab.

    60391aed28c47.png

All counts displayed within the Search tab are total counts, meaning they include both original and duplicate files. There are two types of counts within the Search Module: Doc Count and Family Count.

  • Doc Count is the result of a term on a file level.

  • Family Count is the result of a term on an entire family.

For example, the Search term ‘document’ may be responsive to the attachment of an email or an embedded object of an efile, but not the parent email or efile itself. In this example, the Doc Count will be 1 and the Family Count will be 2.
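The relationship between the two counts can be sketched in a few lines of Python. This is only an illustration of the counting rule described above, not how Discovery Manager computes it; the file IDs, family IDs, and the `doc_and_family_counts` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (file_id, family_id).
# An email and its attachment share one family.
files = [
    ("email-1", "fam-A"),
    ("attach-1", "fam-A"),
    ("efile-2", "fam-B"),
]

# File IDs that a term hit directly (assumed search output).
hits = {"attach-1"}


def doc_and_family_counts(files, hits):
    """Return (doc_count, family_count) for a single term."""
    members = defaultdict(set)
    for file_id, family_id in files:
        members[family_id].add(file_id)

    # Doc Count: files that are themselves responsive.
    doc_count = sum(1 for file_id, _ in files if file_id in hits)

    # Family Count: every member of a family counts as soon as
    # any one member is responsive.
    family_count = sum(len(ids) for ids in members.values() if ids & hits)
    return doc_count, family_count


doc_count, family_count = doc_and_family_counts(files, hits)
print(doc_count, family_count)  # 1 2
```

Here the term hits only the attachment, so the Doc Count is 1, while the Family Count of 2 also includes the parent email.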

Only original (unique) files are added to an Index. All counts displayed within the Indexing tab are unique counts, meaning they include only original files.

Running Keyword Terms

Two types of searches can nominally be executed in the Search Module: Keyword and Concept searches. Keyword Terms use dtSearch to return files via search types such as Boolean, proximity, wildcard, and stemming. Concept Searching has been superseded by advancements in Reveal Review analytics and is in the process of being deprecated in Discovery Manager. Keyword Terms automatically update at the end of the Import process, which updates the counts in the Search Module. Run the Keyword Term Hits Report on the new imports to see the effect on the Keyword Terms. Note that the Search & Indexing Module ribbon is collapsed in the illustration below to increase work space.

60391af07f5ff.png
  1. New Search Group – In the Search tab, choose New Search Group to collect Keyword Search Terms into a Keyword Search Group. Terms entered using Add Keyword Terms (Item 3 below) are added to the selected group if Add to Group is checked.

  2. Keyword – Choose the Keyword tab to add and run Keyword Terms.

  3. Add Keyword Terms – Search allows for literal, wildcard, proximity and fuzzy searches. All Keyword Terms must be written using the appropriate syntax, and each term must be entered on a separate line in the Add Keyword Search box. To view the Search Syntax guide, see APPENDIX F - dtSearch Syntax Guide. There are two ways to add a Keyword Term to the project:

    • Typing a Keyword – Type a freeform Keyword Term and click the Add Keyword Terms button. To add multiple Keyword Terms, type each term on its own line, pressing the Enter key after each one, and then click the Add Keyword Terms button.

    • Dragging and Dropping a List of Terms – Create a list of Keyword Term(s) by typing the term(s) into a text file using the same method as above of one term per line, and drag and drop the text file into the Keyword tab and click the Add Keyword Terms button.

      Note

      This text file must be directly accessible from your Discovery Manager load machine (e.g., in its Documents folder or an available network location) to drag and drop. An alternative for a list in a file on a user's desktop would be to copy from the original file and paste into the Add Keyword Search box.

    • As the terms are added, a search is run. If a term yields no results, you may check 60391af55fec6.png to select the term(s) and click Delete Selected Terms on the toolbar. The term will be removed after you confirm.

      Delete_selected_keyword_search_terms.png
  4. Add Term To Group – To add the Keyword Term(s) to a Keyword Group, check 60391af55fec6.png and choose the Keyword Group. The terms will be automatically added to the target Keyword Group as well as to the ALL TERMS Keyword Group.

    • If a Search Term Group has not yet been created, click New Search Group from the toolbar and enter a name for the group, ideally indicating the focus of the search group.

    • You will then be able to add selected keyword search terms to this or any other available search group.

    Note

    By default, Reveal Discovery Platform indexes and searches the fields FULLTEXT, SENDER, RECIPIENTS, TO, FROM, CC, BCC. The sender and recipient email address fields contain both the display name and the fully qualified email address. Because of this, a Keyword Search Term may hit on one of the email address fields even though the fully qualified email address is not visible in the extracted text (FULLTEXT). To search only the extracted text, use the syntax //text contains (<Term>). This is the only fielded search that requires the // syntax. Alternatively, within the Project Settings, the sender and recipient fields can be excluded from the dtSearch Index, leaving only the FULLTEXT.

    Search syntax guidance in this module applies only to dtSearch. Different indexing engines may require different specification syntax for field searches.

  5. Keyword Search Terms Table – After the Add Keyword Terms button is clicked, the Keyword Term(s) are displayed in the Keyword Search Terms table. The Keyword Search Terms table has six columns in addition to a sequentially assigned ID:

    • Term – This is the Keyword search term that was added to the Keyword Terms table.

      • Term Derivatives – All Keyword Terms are displayed with a tree view. Once expanded, the tree view shows all derivatives for the parent term, for example, counts for individual connected terms or expansions of a wildcard term. The Doc Count for the parent term is the result of combining the derivatives' Doc Hits using the term's operation. The parent term's Doc Count will likely not equal the sum of the derivatives' Doc Hits, as several derivatives may occur within one file.

    • Doc Count – This is the number of files responsive to the Keyword Term.

    • Family Count – This is the total number of files within a family one or more of whose members are responsive to a Keyword Term. For email, Doc Count and Family Count may be different depending on the situation. For example, the Keyword Term ‘document’ may be responsive to the attachment of an email but not the email itself. In this example, the Doc Count will be 1 and the Family Count will be 2.

    • Uniqueness – This is the number of files that uniquely and only hit on the particular Keyword Term with no other overlapping Keyword Terms responsive to the file. This means that if this Keyword Term were deleted from the case, these unique files would be removed from the responsive set. This is calculated on the document level.

    • Inclusiveness – This is the percentage of Doc Hits relative to Indexed Files. If the percentage is high for a particular Keyword Term, that term may be overinclusive and need to be revised. This is calculated on the document level.

    • Group Membership – This is the Keyword Group(s) to which the term has been assigned.

  6. Keyword Search Groups – This table lists the totals for all terms within all defined Keyword Search Groups. The table displays three columns:

    • Group Name – The name given to the Keyword Search Group before Keyword Search terms were added.

      • Group Terms – All Keyword Search Groups are displayed with a tree view. Once expanded, the Keyword Search Terms for the Group are displayed in this sub-table with Doc Count and Family Count.

    • Doc Count – This is the total number of files responsive to the Keyword Terms in the Keyword Search Group.

    • Family Count – This is the total number of files within a family one or more of whose members are responsive to a Keyword Term in the Keyword Search Group. As noted above, Doc Count and Family Count may be different depending on members of a family having or not having one or more of the terms in the group.
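As a rough illustration of the Uniqueness and Inclusiveness columns described above, the sketch below computes both from per-term hit sets. The term names, file IDs, and the `indexed_files` total are made-up sample values, and the two helper functions are hypothetical; the product's own calculation may differ in detail.

```python
# Hypothetical hit sets: term -> file IDs responsive to that term.
term_hits = {
    "contract": {"f1", "f2", "f3"},
    "invoice": {"f3", "f4"},
}
indexed_files = 10  # assumed number of files in the index


def uniqueness(term, term_hits):
    """Count files hit by `term` and by no other term."""
    others = set()
    for name, ids in term_hits.items():
        if name != term:
            others |= ids
    return len(term_hits[term] - others)


def inclusiveness(term, term_hits, indexed_files):
    """Doc hits as a percentage of indexed files."""
    return 100.0 * len(term_hits[term]) / indexed_files


print(uniqueness("contract", term_hits))                    # 2
print(inclusiveness("contract", term_hits, indexed_files))  # 30.0
```

In this sample, file f3 is shared with another term, so removing "contract" would drop only f1 and f2 from the responsive set.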

After Running Searches

60391aff4f5f2.png
  1. Refresh – Click the Refresh button to refresh the Search tab to show new Search Terms and Search Groups created/removed on different machines in a distributed environment, as well as to update the Search Group’s statistics.

  2. Delete Selected Terms – Select the Search Term(s) from the Keyword Search table that need to be deleted and click the Delete Selected Terms button. Optionally this can also be done via a right click menu after selecting the term(s).

  3. New Search Group – A Search Group is a simple way to combine Keyword Searches. To create a Search Group, click the New Search Group button, fill out the New Group form, and click OK. The 60391b00eb97a.png icon will then appear in the Search Module and in the Keyword Search Groups section within the Module Form.

  4. Assign Terms to Group – To add one or more terms to a Search Group, click the checkbox(es) next to the term(s), click the Assign Terms to Group button, and choose the target Search Group. The Doc Count for the Search Group is the result of combining the terms' Doc Counts with the OR operator, so a file responsive to several terms in the group is counted once.

    Note

    By default, all Search Terms added to the project will be added to the ALL TERMS Keyword Group.

  5. Search Module Navigation – The Search Module Navigation displays the various Keyword Search Groups. Each Search Group has an icon and has a tree view which displays the following counts:

    • Term Count – The total number of Search Terms assigned to the Search Group.

    • Doc Count – The total individual files responsive to the Search Terms within the Search Group.

    • Family Count – This is the total number of files within a family when one or more of its members is responsive to a Search Term. For email, Doc Count and Family Count can be different depending on the situation. For example, the Keyword Term ‘document’ may be responsive to the attachment of an email but not the email itself. In this example, the Doc Count will be 1 and the Family Count will be 2.

  6. Delete Search Group – To delete a Search Group, first click on the Keyword Group in the Search Module Navigation, and then click the Delete Search Group button in the Search Ribbon.

    Note

    This will only delete the Keyword Group from the project, but will not delete the Search Terms from the project.

  7. Launch to Preview – A Preview allows a user to see the files that are responsive to the chosen Search Term(s). To preview the results of a Search Term(s), select one or more Search Terms from the Keyword Search table, click the Launch To Preview button, and choose either Document or Family Level. To see more information about using Previews, please see Appendix G.
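Because a Search Group combines its terms with the OR operator, its Doc Count behaves like the size of a set union rather than a sum. The following sketch shows this with made-up hit sets; `group_doc_count` is a hypothetical helper, not part of Discovery Manager.

```python
# Hypothetical hit sets: term -> file IDs responsive to that term.
term_hits = {
    "contract": {"f1", "f2", "f3"},
    "invoice": {"f3", "f4"},
}


def group_doc_count(group_terms, term_hits):
    """Doc Count for a group: files hit by term A OR term B OR ..."""
    combined = set()
    for term in group_terms:
        combined |= term_hits[term]
    return len(combined)


summed = sum(len(term_hits[t]) for t in ("contract", "invoice"))
print(summed)                                             # 5
print(group_doc_count(["contract", "invoice"], term_hits))  # 4
```

The individual Doc Counts add up to 5, but the group reports 4 because f3 is responsive to both terms.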

Creating Indexes

Every time an Indexing Job is created with a scope, Discovery Manager will only pull back the original files that are within the scope that have not been indexed in any prior Indexing Jobs. When creating Indexing Jobs, the user will choose the scope of files to create the Index, click the Launch Indexing Job button, and an Indexing Job will be created and sent out to the Discovery Agents. The Indexing Job can be monitored in the Search & Indexing Module and the Environment Module.
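The selection rule described above (only originals within the scope that no prior Indexing Job has covered) amounts to a set difference. The sketch below is a simplified model under that assumption; the file IDs and the `files_to_index` function are hypothetical and do not reflect the product's internal data model.

```python
def files_to_index(scope_files, duplicates, already_indexed):
    """Return originals in scope that no prior Indexing Job has covered."""
    remaining = set(scope_files) - set(duplicates) - set(already_indexed)
    return sorted(remaining)


scope_files = ["f1", "f2", "f3", "f4"]  # files in the chosen scope
duplicates = {"f2"}                     # duplicate files are never indexed
already_indexed = {"f1"}                # covered by an earlier Indexing Job

print(files_to_index(scope_files, duplicates, already_indexed))  # ['f3', 'f4']
```

Launching a second job over the same scope would, under this model, find nothing left to index.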

60391b0316417.png

Important

When building or rebuilding an Index, the old Indexes must first be deleted.

  1. Indexing – Choose the Indexing tab to create an Index.

  2. Index Scope – There are three scopes that can be used to create an Index for the project:

    • Project – If no checkbox is selected in Imports or Selective Sets and the Launch Indexing Job button is selected, the system will look across the entire project to see if there are any files available for indexing.

    • Imports – To create an Index from one or more Imports, select the checkbox next to the applicable Import(s), and click the Launch Indexing Job button.

    • Selective Sets – To create an Index from one or more Selective Sets, select the checkbox next to the applicable Selective Set(s), and click the Launch Indexing Job button.

  3. Launch Indexing Job – To launch an Indexing Job to the Discovery Agents, select the Index Scope, and click the Launch Indexing Job button.

    Note

    The Reveal Discovery Manager uses only accent-insensitive indexes. This is done so that the same keyword term does not need to be added with and without accents to be a search hit. For example the Keyword Search of ‘uber’ would return ‘uber’ and ‘über’.

  4. Monitoring Indexing Jobs – Indexing Jobs can be monitored in the Indexing tab by clicking the Refresh button, or within the Environment Module.
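To illustrate what accent-insensitive matching means in practice, the sketch below folds accented characters to their base letters using Python's standard `unicodedata` module. It is only an analogy for the behavior described in the note above; it is not how the dtSearch index is implemented, and `fold_accents` is a hypothetical helper.

```python
import unicodedata


def fold_accents(text):
    """Strip combining accent marks so 'über' compares equal to 'uber'."""
    # NFKD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(fold_accents("über"))                          # uber
print(fold_accents("über") == fold_accents("uber"))  # True
```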

After Creating an Index

60391b06d3999.png
  1. Refresh – Click the Refresh button to refresh the Indexing tab to show the current counts.

  2. Project Indexes – To delete or update an Index, select the checkbox next to the Index(es) and choose the applicable button in the Indexing Ribbon. The Project Indexes table contains the following values:

    • Index ID – The ID of the Index within the project.

    • Index Status – The status of the Index. If the Index Status is ERRORED, the Index should be deleted and a new Index should be built for the applicable scope(s).

    • Index Scope – The content selected in generating the Index, either PROJECT, IMPORTS or SELECTIVE SETS.

    • Index Type – There are two Index Types: EXTRACTED TEXT and OCR. While both extracted text and OCR text can be added to the dtSearch Index and made searchable, this is done through different processes, so they are separated into different Index Types.

    • Actual Count – This is the actual number of items added to the dtSearch Index. If Actual Count does not equal Expected Count, the Index should be deleted, and a new Index should be built for the applicable scope(s).

    • Expected Count – This is the expected number of items that should be added to the dtSearch Index. If Actual Count does not equal Expected Count, the Index should be deleted, and a new Index should be built for the applicable scope(s).

    • Fragmentation – Fragmentation of an Index increases the size of the Index and slows searching, but the effect is generally not noticeable unless the fragmentation is severe. If the fragmentation of an Index is high and search results are taking a long time to complete, the Index should be deleted and a new Index should be built for the applicable scope(s).

    • Job ID – The Distributed Job ID of the Index.

  3. Delete Index – To delete an Index from the project select the checkbox next to the Index, and click the Delete Index button. Any file(s) deleted from the Index may be part of a future Index Scope and will be available for indexing. Note that null indexes from the prior illustration (having 0 documents to index) have been deleted here.

  4. Update Index Properties – To update Actual Count and Fragmentation to the most current states for one or more Indexes, select the Index(es) and click the Update Index Properties button.
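The rebuild guidance in the Project Indexes table can be summarized as a simple check: an Index is a candidate for deletion and rebuild when its status is ERRORED or its Actual Count differs from its Expected Count. The sketch below encodes that rule; the `IndexRow` class and the sample rows are hypothetical, and the "COMPLETED" status value is an assumption used only for illustration (only ERRORED is named in this guide).

```python
from dataclasses import dataclass


@dataclass
class IndexRow:
    index_id: int
    status: str          # "ERRORED" per this guide; other values assumed
    actual_count: int
    expected_count: int


def needs_rebuild(row):
    """True when the Index should be deleted and rebuilt."""
    return row.status == "ERRORED" or row.actual_count != row.expected_count


rows = [
    IndexRow(1, "COMPLETED", 100, 100),  # healthy
    IndexRow(2, "ERRORED", 0, 50),       # errored job
    IndexRow(3, "COMPLETED", 48, 50),    # count mismatch
]

print([r.index_id for r in rows if needs_rebuild(r)])  # [2, 3]
```

Fragmentation is left out of the check because the guide treats it as a judgment call tied to observed search performance rather than a fixed threshold.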