
Reveal Review Publication

3. Searching and Filtering Documents

Story Engine provides many options for finding relevant documents. A user can search by traditional keywords, topics, and people, or use AI-augmented filtering methods such as COSMIC scores and Emotional Intelligence.

Users can either type directly into the natural language search bar, assisted by our artificial intelligence, or use one of the dropdown menus to return more specific results. You can use Term reports to run multiple string searches and reports.

60424d0833222.png
B. Search by Person

When searching for a person in the global search, users can specify how the person of interest relates to a document: To, From, To or From, To and From, or Discussed. This feature is useful for pinpointing communications between specific people or for finding documents that discuss specific people.

60424d0f0df61.png
  • From: filter to return only emails written by the person.

  • To: filter to return only emails received by the person.

  • To or From: filter to return emails written or received by the person.

  • To and From: filter to return emails written and received by the same person.

  • Discussed: filter returns threads where the person is discussed within the content of the document.

To search communications between two specific people, first add one person with the To option, which retrieves emails received by that individual, and then click the Apply button.

60424d108e7b1.png

Then, add another person to the search bar using the Person filter, specifying From. Using the AND connector means that you will return results sent to person A and from person B.

60424d1240472.png

In the graphic below we visualize documents sent TO Vincent Kaminski AND FROM Shirley Crenshaw.

60424d1444f6e.png

You can add another person to the To field. By default, Story Engine uses the OR operator to connect each of the specified To persons.
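The way these Person filters combine can be sketched in a few lines of Python. This is only an illustration of the logic described above, not Story Engine's implementation: the `Email` record, the `matches_person` and `search` helpers, and the substring test used for Discussed are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Email:
    sender: str
    recipients: List[str]
    body: str = ""

def matches_person(email, person, mode):
    """Return True if the email satisfies one Person filter mode."""
    sent = email.sender == person
    received = person in email.recipients
    if mode == "From":
        return sent
    if mode == "To":
        return received
    if mode == "To or From":
        return sent or received
    if mode == "To and From":
        return sent and received
    if mode == "Discussed":
        # Assumption: a plain substring match stands in for "discussed".
        return person.lower() in email.body.lower()
    raise ValueError(f"unknown mode: {mode}")

def search(emails, to_people, from_person):
    """To persons are OR-ed together, then AND-ed with the From person."""
    return [
        e for e in emails
        if any(matches_person(e, p, "To") for p in to_people)
        and matches_person(e, from_person, "From")
    ]

emails = [
    Email("Shirley Crenshaw", ["Vincent Kaminski"]),
    Email("Vincent Kaminski", ["Shirley Crenshaw"]),
    Email("Shirley Crenshaw", ["Another Person"]),
]
hits = search(emails, ["Vincent Kaminski"], "Shirley Crenshaw")
wider = search(emails, ["Vincent Kaminski", "Another Person"], "Shirley Crenshaw")
```

Here `hits` holds only the first email, while `wider` also picks up the third one because the two To persons are joined by OR.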

C. Search by Date

Click the Date dropdown to open the date search popup.

603e72b4c01ae.png

Date operators: use one of the following search operators.

  • Before: Return all documents dated before the selected date.

  • On: Return all documents dated on the selected date.

  • After: Return all documents dated after the selected date.

  • Between: Return all documents between the specified start and ending dates.

  • Work shift: Users can select one of the work shift options to further limit search results. The options available are All shifts, Business Hours, Evening Business Hours or After Hours.
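The Before, On, After, and Between operators listed above behave like simple date comparisons. The sketch below is hypothetical: the function name `date_matches` is invented for illustration, and treating both Between endpoints as included is an assumption, since the text does not say how the endpoints are handled.

```python
from datetime import date

def date_matches(doc_date, operator, start, end=None):
    """Apply one date operator to a single document date."""
    if operator == "Before":
        return doc_date < start
    if operator == "On":
        return doc_date == start
    if operator == "After":
        return doc_date > start
    if operator == "Between":
        # Assumption: both the start and the end date are included.
        return start <= doc_date <= end
    raise ValueError(f"unknown operator: {operator}")

doc = date(2001, 5, 14)
in_may = date_matches(doc, "Between", date(2001, 5, 1), date(2001, 5, 31))
before_may = date_matches(doc, "Before", date(2001, 5, 1))
```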

603e72b6bbb78.png

Inclusive/Non-Inclusive:

By default, the search is set to return more precise results (fewer documents) by searching inclusive documents only.

603e72b8a9c1a.png

Users can include non-inclusive documents in the search results by moving the precision slider to the right (less precise).

603e72ba55c15.png

Customize date searching options:

603e72bbc10cd.png
  • Search Metadata: This option lets users specify whether to search metadata at the segment level or the thread level.

  • Search Mentions: Users can also search for mentions of the specific date in the body by selecting Search Mentions checkbox.

  • Partial ranges: Use this option to include documents whose date mentions represent a range that falls only partially within the target search dates.
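A partial range is easiest to picture as an interval overlap test. The following is a minimal sketch under that reading; the function name `range_relation` and its three labels are hypothetical and are not taken from the product.

```python
from datetime import date

def range_relation(mention_start, mention_end, target_start, target_end):
    """Classify a mentioned date range against the target search dates."""
    if mention_end < target_start or mention_start > target_end:
        return "outside"
    if target_start <= mention_start and mention_end <= target_end:
        return "inside"
    # The ranges overlap, but the mention extends past the target dates.
    return "partial"

# A mention of "mid-January to mid-February" against a February search.
relation = range_relation(
    date(2001, 1, 15), date(2001, 2, 15),
    date(2001, 2, 1), date(2001, 2, 28),
)
```

Under this sketch, a mention classified as "partial" would be returned only when the Partial ranges option is selected.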

D. Search by COSMIC Score

When filtering by COSMIC Score, documents are returned based on how well they fit a custom Machine Learning model. Users train the model by tagging a set of documents for COSMIC groups; the algorithm then works through the remaining documents based on that user-tagged data, returning each document's relevance probability as a number between 0 and 100. The COSMIC Mission Control Statistics table underlying this process looks like this:

603e72bdd801e.png

This enables Story Engine to return similarly classified documents based on the probability of matching the model.

The COSMIC Score drop-down button is used to interactively weigh COSMIC scores by their relative probability. In the Settings, users can choose Low, Medium or High probability as well as no score, errors, and custom ranges.

603e72bfcb461.png

Users can define any number of thresholds for the COSMIC models in their storybook. For example, a user can check the ALL option and combine a High probability of “Issues: Finance” with a Low probability of “Issues: HR” to find only documents that meet both thresholds. Alternatively, the ANY option returns a document when at least one of the selected thresholds is met, whereas ALL requires every selected threshold to be met.
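The difference between ALL and ANY can be shown with a small sketch. Everything here is illustrative: the band boundaries in `BANDS` are hypothetical (the actual Low, Medium, and High cut-offs are not stated in this guide), and `in_band` and `matches` are invented helper names.

```python
# Hypothetical band boundaries on the 0-100 scale (lower bound included,
# upper bound excluded); the product's real cut-offs may differ.
BANDS = {"Low": (0, 34), "Medium": (34, 67), "High": (67, 101)}

def in_band(score, band):
    """True if the score exists and falls inside the named band."""
    low, high = BANDS[band]
    return score is not None and low <= score < high

def matches(doc_scores, conditions, mode="ALL"):
    """Combine several (model, band) thresholds with ALL or ANY."""
    results = [in_band(doc_scores.get(model), band) for model, band in conditions]
    return all(results) if mode == "ALL" else any(results)

conditions = [("Issues: Finance", "High"), ("Issues: HR", "Low")]
doc_a = {"Issues: Finance": 91, "Issues: HR": 12}
doc_b = {"Issues: Finance": 91, "Issues: HR": 75}

a_all = matches(doc_a, conditions, "ALL")
b_all = matches(doc_b, conditions, "ALL")
b_any = matches(doc_b, conditions, "ANY")
```

In this example `doc_a` satisfies both thresholds, while `doc_b` fails under ALL but is still returned under ANY because its Finance score is High.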

E. Search by Emotions

When searching by Emotions, users can find models by keyword searching. Users can also specify minimum or maximum thresholds for Intent Score, Opportunity Score, Pressure Score, Rationalization Score, Positivity Score and Negativity Score.

603e72c1a65de.png

For each emotion, users can specify one or a range of threshold options:

  • Any score

  • No score

  • Low

  • Medium

  • High

  • Custom

By combining these filter options, users can search for documents with, for example, a medium intent score and a high negativity score.

F. Search by Thread Intelligence
603e72c39f95f.png

Users can select Threads to search by Thread Intelligence to find documents and threads with a certain number of recipients, number of email segments (length of the conversation), or reciprocal ratio (social status).

The “Social status” condition represents emails sent or received by a person with high or low social status as determined by reciprocal ratio. A large reciprocal ratio (>5) often indicates that the person associated with the email address may be mass marketing or spamming, while a reciprocal ratio of less than 1 means that the individual receives more messages than they send.
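The guide does not give the formula for the reciprocal ratio. The sketch below assumes it is messages sent divided by messages received, which is consistent with the two thresholds described above; the function names and the "balanced" label for the middle range are hypothetical.

```python
def reciprocal_ratio(sent, received):
    """Assumed definition: messages sent divided by messages received."""
    if received == 0:
        return float("inf") if sent else 0.0
    return sent / received

def describe(ratio):
    """Map a ratio onto the two cases called out in the text."""
    if ratio > 5:
        return "possible mass mailer"
    if ratio < 1:
        return "receives more than sends"
    return "balanced"  # hypothetical label for the range in between

newsletter = reciprocal_ratio(600, 100)
executive = reciprocal_ratio(20, 80)
```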

G. Search Metadata

In the global search bar, users click on the Metadata tab to search by metadata.

603e72c59ad43.png

In the dropdown, click Select… and choose from Control number, ID, Group, Thread, Random, External IDs, Record type, and Inclusive Only.

Record type: The Record Type search distinguishes documents by the kind of file and the manner in which it was sent. Search options include emails, attachments, and e-files such as PDFs.

603e72c78d93a.png
  • Email: parent email.

  • Attachment: attachment to an email. Note that an email attached to another email would be treated as an attachment not an email.

  • EFile: stand-alone parent document such as MS Office docs, PDF files, etc.
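The three record types, including the rule that an attached email counts as an attachment, can be expressed as a small decision function. This is a sketch only; `record_type` and its two boolean parameters are hypothetical and do not describe how Story Engine stores documents.

```python
def record_type(is_email, attached_to_email):
    """Classify a document using the rules in the list above."""
    if attached_to_email:
        # An email attached to another email is still an attachment.
        return "Attachment"
    return "Email" if is_email else "EFile"

parent_email = record_type(is_email=True, attached_to_email=False)
forwarded_msg = record_type(is_email=True, attached_to_email=True)
loose_pdf = record_type(is_email=False, attached_to_email=False)
```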

H. Search Domain

Searching by domain is used to find out which people associated with which domains were sending and receiving messages. Domain is accessible from the “All filters” dropdown menu.

603e72c951b1b.png

Exclude a domain by placing your mouse over the “Include/Exclude” button. In the dropdown menu, select “Exclude”.

  • Include/Exclude: exclusively return or filter out specific domains.

  • From: returns emails written by the person associated with the domain on the baseball card.

  • To: returns emails received by the person associated with the domain on the baseball card.

  • To or From: returns emails written or received by the person associated with the domain on the baseball card.

These options are available at any time from the global search and filtering bar. Similar to communication filtering, users can use domain filtering to find emails sent from domain A to domain B. The graphic below shows emails from people associated with the citicorp.com domain sent to people associated with the enron.com domain:

603e72cb1a513.png
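Domain filtering from domain A to domain B can be sketched as follows. The addresses are made up for the example, the helper names are hypothetical, and the Exclude behavior shown (dropping any email that involves the domain as sender or recipient) is an assumption about what the option does.

```python
def domain_of(address):
    """Return the lowercase domain part of an email address."""
    return address.rsplit("@", 1)[-1].lower()

def from_to_domains(emails, from_domain, to_domain):
    """Emails sent from one domain to at least one recipient in another."""
    return [
        e for e in emails
        if domain_of(e["from"]) == from_domain
        and any(domain_of(a) == to_domain for a in e["to"])
    ]

def exclude_domain(emails, domain):
    """Assumed Exclude behavior: drop emails that involve the domain at all."""
    return [
        e for e in emails
        if domain_of(e["from"]) != domain
        and all(domain_of(a) != domain for a in e["to"])
    ]

# Hypothetical addresses for illustration only.
emails = [
    {"from": "alice@citicorp.com", "to": ["bob@enron.com"]},
    {"from": "bob@enron.com", "to": ["alice@citicorp.com"]},
    {"from": "carol@example.com", "to": ["bob@enron.com"]},
]
citi_to_enron = from_to_domains(emails, "citicorp.com", "enron.com")
without_citi = exclude_domain(emails, "citicorp.com")
```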
I. Search Tags

Searching by tag is available from the “Tags” dropdown search bar. Once accessed, users can search tags by name, or simply choose from tags listed in the dropdown menu.

603e72ccb93d6.png
  • Document Choice Filter: click the “plus sign” to select the document choice to be filtered.

603e72ce7a06b.png
  • Operators available: Any of these, None of these, Is set, Is Not Set.

  • Choices available: Yes, No, Skip, Control Set.

Under the “Control Set” drop-down, users can select whether to return documents “In control set” or “Not in control set”.

603e72cfc5374.png

Under the Model Options drop-down, users can select whether to return documents “Included in model” or “Excluded from model”.

603e72d10d136.png
J. Search Entity

In addition to the filtering options provided in the filtering panel, a user can also search entities or their mentions using the entity searching function listed under the “Entities” dropdown menu:

603e72d315262.png

Which entities are displayed is determined by admin configuration.

You may also create new Custom Entity types meeting your requirements and keying off your data. See Section 5 - Custom Entity Types for details.

Click any of the entity types to bring up the search window. Notice that underneath the search window is the Detection filter.

603e72d4ca059.png

Choose +Detection to see the following list of methods through which entities have been discovered:

603e72d6186f2.png

Entities may be found by:

  • Term Report

  • Entity Model

  • User

Each of these is described in Section 5 - Custom Entity Types.

Users can type any keywords in the search window to search mentions associated with the targeted entity type.

603e72d7aff02.png

An entity can have multiple mentions. For example, the Topic entity “audit” might contain mentions like “audit”, “audit committee”, “audit consideration”. Once an entity is selected in the search bar, the user can choose “+Filter mentions” and then further narrow down the search results by checking/unchecking the mentions associated with the entity.

603e72d94b03f.png

Click on the Apply button to confirm the selection.
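The relationship between an entity and its mentions can be pictured as a mapping from the entity to a list of mention strings, with “+Filter mentions” narrowing that list. The sketch below uses the “audit” example from this section; the data layout, the document IDs, and the `docs_for_entity` helper are hypothetical.

```python
# One entity with several mentions, as in the "audit" example above.
entity_mentions = {
    "audit": ["audit", "audit committee", "audit consideration"],
}

def docs_for_entity(docs, entity, checked=None):
    """Return IDs of documents containing any checked mention of the entity.

    docs maps a document ID to the mentions found in that document.
    With checked=None, every mention of the entity stays selected.
    """
    mentions = entity_mentions[entity]
    selected = set(mentions if checked is None else checked) & set(mentions)
    return [doc_id for doc_id, found in docs.items() if selected & set(found)]

# Hypothetical document IDs.
docs = {
    "DOC-1": ["audit committee"],
    "DOC-2": ["audit"],
    "DOC-3": ["audit consideration"],
}
all_hits = docs_for_entity(docs, "audit")
narrowed = docs_for_entity(docs, "audit", checked=["audit committee"])
```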

K. Search Languages

Users can search for the different languages that appear in documents.

60424d280e9ef.png
603e72dcc46a6.png

From the search bar, users can immediately see the list of different languages detected, as well as the number of documents detected in each language.

603e72de6dd1f.png

Users can narrow down search results by selecting different levels of prevalence of a language within the documents. Prevalence of a language is calculated by the number of characters within a document, and is assigned at the document level.

603e72e01604b.png
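Character-based prevalence can be illustrated with a short calculation. This sketch assumes the text of a document has already been split into spans labeled with a language by some upstream detection step; that step, the span format, and the `language_prevalence` name are assumptions for illustration only.

```python
from collections import Counter

def language_prevalence(spans):
    """Share of a document's characters written in each language.

    spans is a list of (language, text) pairs for a single document.
    """
    counts = Counter()
    for language, text in spans:
        counts[language] += len(text)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {language: n / total for language, n in counts.items()}

# A 100-character document: 75 characters of English, 25 of French.
spans = [("English", "a" * 75), ("French", "b" * 25)]
prevalence = language_prevalence(spans)
```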

See Appendix E for specifics on language capabilities.

L. Save Search Results

Search results can be saved to a Saved Search. Click the 603e72e1ae23a.png button to open the Save Search dropdown and select Save this search to open the saved search dialog box:

603e72e35d76d.png

In the dialog box, type the name you want to use for this saved search. If you want to replace an existing search, choose Overwrite a saved search and click Save to save the search.

The “Make available for training” option is available if you want to use this search for a training queue. (See Story Engine 2.0 COSMIC Guide for more details.)

The “Make available for Insights modules” option indicates you want this search to appear as a choice for criteria in creating an Insights graphic. See Section 2 - Story Engine Exploring Page > C. Insights for more details.

M. Term Report

You can use Term reports for complex searching and reporting. Start by choosing the term reports wand icon and then press Create report.

603e72e51801d.png

The following panel appears:

603e72e696e18.png

By selecting the question mark icon next to “Term report type” you may review the descriptions of your two choices:

  • Search term reports (detailed in this section) are based on NexLP’s search technology. Understand where your hits exist in your data and drill down into documents to see their context.

Search for:

  • Keywords

  • Wildcard words

  • Proximity expressions

  • Boolean expressions

  • Entity search and extract reports (detailed in Section 5 - Custom Entity Types) extract entities from search lists, enable entity annotation across documents for hits, and create entity models from hits.

Search for:

In this section we will describe the Search term reports. The other choice, Entity search and extract, is used in Custom entity creation described in Section 5 - Custom Entity Types.

Choose Search term.

603e72e845d66.png

Name*: Name your search

Term report type*: You have chosen Search term. (The other choice, Entity search and extract, is used in custom entity creation explained in Section 5 - Custom Entity Types.)

Enter search term. One search string per line*: Each discrete search string for which you want specific reporting should be on a separate line. Each line is connected by an implied Boolean OR.

Run based on saved search: Click arrow for drop down and choose from:

Top Five Custodians

All Custodians

Global View

[Any saved searches]

Notes: Add notes as needed.

You may choose either Create or Create and run report.

Choose the Create button to create a search without running a report.

603e72e9d7a9f.png

The created search(es) along with search specifications will be displayed.

603e72eb7255d.png

To run a report, choose the triangle icon under Run full report.

603e72ecbddb0.png

Notifications of search status are accessed through the bell icon next to your name…

60424d31481e3.png

which provides information such as the term report title, the terms, the storybook and the status:

60424d32db839.png

Alternately, to create a search and run the report immediately choose Create and run report.

603e72f00990b.png

Whether you run the report immediately or later, clicking on the resulting report name…

603e72f1b207b.png

will provide the hit report per search string.

603e72f30efdf.png

The following options and information are provided:

+ Add Term: Choose this to add additional lines to your search.

Run Full Report: This runs a report on all search terms.

The report columns:

Keyword: The separate search strings that comprise the search.

Documents: The number of documents found.

Documents with group: The number of documents in the families of the documents that were found.

Unique Hits: The number of documents which were returned for only this search string and for no other search string.

NOTE: “Unique hits” may be helpful in assessing whether a particular search string is over-inclusive. For example, in an anti-trust dispute between competing pharmaceutical companies there may be a list of search strings which in various combinations might indicate objectionable intention. One of those might be “the pharma industry” as in the sentence “There is too much competition in the pharma industry.” But the documents where the phrase “the pharma industry” is the only hit might be reasonably suspected of being false positives. Therefore, the “unique hits” report may be used to negotiate a refinement in the search terms in the interest of efficiency and cost control.

Status: Successful, Never Run, Error or Out-of-date.

Last run time: The last time this search string was run.

Run report: You can rerun the report per line.

Actions: You may delete an individual search string.
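The Documents, Documents with group, and Unique Hits columns can be reproduced with set arithmetic. The sketch below is illustrative only: the document IDs, the family mapping, the search string “price fixing”, and the `term_report` function are hypothetical, and it is not a description of how the report is actually computed.

```python
def term_report(hits_by_term, family_of):
    """Build one report row per search string.

    hits_by_term maps a search string to the set of document IDs it hit.
    family_of maps a document ID to the IDs in its family (group).
    """
    rows = []
    for term, docs in hits_by_term.items():
        # Documents hit by any other search string in the report.
        others = set().union(*(d for t, d in hits_by_term.items() if t != term))
        # Each hit document plus the members of its family.
        with_group = set().union(*(family_of.get(d, set()) | {d} for d in docs))
        rows.append({
            "Keyword": term,
            "Documents": len(docs),
            "Documents with group": len(with_group),
            "Unique Hits": len(docs - others),
        })
    return rows

hits_by_term = {
    "price fixing": {1, 2},
    "the pharma industry": {2, 3, 4},
}
family_of = {1: {1, 10}, 3: {3, 11, 12}}
rows = term_report(hits_by_term, family_of)

# Because each line is joined by an implied OR, the report as a whole
# covers the union of the per-line hits.
all_documents = set().union(*hits_by_term.values())
```

In this example “the pharma industry” has two unique hits (documents 3 and 4), which are the documents a reviewer might inspect first when judging whether the string is over-inclusive.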