Users can access COSMIC Settings via Applied AI in the Flyout Menu.
Click Report or Settings to open COSMIC Mission Control.
The COSMIC report tab shows the status of COSMIC, including the Cycle Info, Statistics table and the control set.
Cycle Info: shows the current cycle info.
Cycle: Current cycle number, or Pre-Training.
Progress: Number of positive samples identified for the current cycle so far.
Classifier Status: Current status of the classifier.
Classifier Detailed Status: Status of classifier with any available details.
Current Stability Status: Unknown, Unstable, or Stable.
Referenced Models: If the storybook uses any external model, it will show up in the “Referenced Models” line.
Training Mode: Active Learning or Infinite Learning.
Model Type: Identifies where the model is assigned.
Last Refreshed: The last time the classifier was run.
Print: Open the COSMIC report in a printable format.
Statistics (Display Mode): displays the current status of documents in the storybook. The user can select to display All Docs or Inclusive Docs. Detailed statistics are provided for the following Ranges:
High (60% to 100%)
Medium (40% to 60%)
Low (0% to 40%)
Scored (Total) (0% to 100%)
Uncertain – No Model Features (-500): The document has some metadata or text features, but none of them appear in the current run of the COSMIC classification model. Because the model does not contain the features the document has, the document cannot be scored against the model, so it is marked as Empty.
Unclassified (-100): These are the documents that get a probability score of 50 from the COSMIC classifier. This means they are classified as neither positive nor negative as per the current round of COSMIC classification.
No Score – Errored Documents (-200): An error occurred during the second pass of processing for these documents.
No Score – Empty Documents (-300): The document has no text or metadata representation in the form of vectors. This almost never happens for emails, because they carry metadata in fields such as From, Sent, and Subject, but it can happen for an attachment that has a unique filename.
Tip
In order to reduce this number, tag more documents and train the model. As the model grows stronger it will reduce the number of empty documents by being able to recognize the metadata and text features in order to give the documents a proper score.
No Score – Missing Text Vectors (-400): The number of documents for which the system cannot locate text vectors. This can occur if something abnormal happened during processing or if any vector files were moved at any point.
Threshold Slider: Adjusts the threshold for the relevancy cut-off. It affects the Scored Set numbers and the graph below.
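The ranges and special codes listed above can be summarized as a small lookup. The following is a minimal Python sketch for illustration only, not COSMIC's own implementation. The function name `describe_score` is hypothetical, scores are assumed to be on a 0 to 100 scale, and the treatment of the boundary values 40 and 60 (which appear in two ranges each) is an assumption.

```python
# Special (negative) codes and their labels, as listed in the Statistics table.
SPECIAL_CODES = {
    -100: "Unclassified",
    -200: "No Score - Errored Documents",
    -300: "No Score - Empty Documents",
    -400: "No Score - Missing Text Vectors",
    -500: "Uncertain - No Model Features",
}


def describe_score(score: float) -> str:
    """Return the Statistics row a score would be counted under (illustrative)."""
    if score in SPECIAL_CODES:
        return SPECIAL_CODES[score]
    if not 0 <= score <= 100:
        raise ValueError(f"unexpected score: {score}")
    # Assumption: the boundary values 40 and 60 are counted in the upper range.
    if score >= 60:
        return "High"
    if score >= 40:
        return "Medium"
    return "Low"
```

For example, `describe_score(-400)` returns the Missing Text Vectors label, while `describe_score(75)` returns "High".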
Control Set: shows the control set, how the control set documents are tagged, and if they are currently scored as Positive/Negative by the system.
Download Control Set History: Reviewers can keep track of the Control Set history by clicking the Download Control Set History button. The downloaded CSV report contains the following for each cycle: Cycle, Precision, Recall, F1, Richness Labeled, Precision Range (+/-), Recall Range (+/-), Richness Labeled Range (+/-), Labeled Positive, Labeled Negative, Labeled Skip, and Had Gap.
Precision/Recall Rates: shows Precision, Recall, F1 scores and Richness of the data.
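For readers who want to sanity-check these figures, the sketch below computes Precision, Recall and F1 from control set counts using the standard textbook definitions. It is an illustration under stated assumptions: the function name and its parameters are hypothetical, and treating Richness as the share of control set documents labeled positive is an assumption, since this guide does not define how COSMIC calculates it.

```python
def control_set_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard precision/recall/F1 from control set counts.

    tp: labeled positive and scored positive
    fp: labeled negative but scored positive
    fn: labeled positive but scored negative
    tn: labeled negative and scored negative
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    total = tp + fp + fn + tn
    # Assumption: richness = share of labeled documents that are positive.
    richness = (tp + fn) / total if total else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "richness": richness}
```

With 8 true positives, 2 false positives, 2 false negatives and 88 true negatives, this gives a precision and recall of 0.8 and a richness of 0.1.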
The SETTINGS tab provides the settings used to configure COSMIC. The settings available on this page include:
COSMIC Functions:
Run Full Process: Use this function to force the system to re-classify the whole document set at any time.
Run Warmup Process: Sends a request to the COSMIC service to warm up the target storybook. Administrators should run this once at the start of a COSMIC project. This step reduces the time required to run the initial classification.
Reset COSMIC Group: Sends a request to COSMIC service to clean current COSMIC tags and scores assigned to the documents.
Warning
Use this option with caution; the system does not keep a copy of the existing COSMIC tags and scores. We recommend creating a backup of the tags and scores before reset.
Update Metadata Vectors: Allows you to update metadata vectors for this storybook. This button gives the user the option to incorporate all new entity examples into the COSMIC metadata vector, even if they are not in a model that has been run.
Note
The metadata vectors are updated whenever an entity model is run.
Available in Storybook: Check/uncheck this option to enable/disable the COSMIC group in the storybook.
COSMIC Configuration:
Name: Labeled Name of COSMIC Tag set.
Training Mode: Active Learning or Infinite Learning. Infinite Learning reduces the need to review non-relevant documents. Active Learning aims at limiting the human coding of even relevant documents.
Checkout Size: Number of documents sent to reviewers at one time when entering the queue.
Retraining Interval: The number of non-control-set documents that must be coded before the document universe is re-classified.
Training Queue Size: The approximate number of documents to be added to the training queue after reclassification.
Minimum Positive Examples: The minimum number of documents tagged as Yes before a reviewer can start training COSMIC.
Control Set Percentage: The portion of random sample documents to be set aside per training queue for the purpose of statistical measurement. The Percentage can range from 0% to 100% of the training queue.
Is Inclusive Only: Tells the classifier to classify and select for training All documents or Inclusive (Inclusive emails, attachments, loose eFiles) documents only.
* See Appendix A for more details about “Is Inclusive Only” setting.
Autotune Enabled: Allows the system to automatically adjust weights based on current results to achieve best results (by default: On).
Autotune Cutoff Threshold: Maximum number of documents that will be used for autotuning (by default: 1000).
Stability Threshold: Minimum number of document tags added or deleted that will trigger stability to recalculate.
Auto Submit Status:
Enabled: System will automatically submit newly coded documents to the classifier when the retraining interval is reached.
Disabled: System will not submit newly coded documents to the classifier.
Infinite | Override - Continue Training with Stable Model: Continue submitting new documents to classifier even after stability is reached.
Standby: System will automatically set Auto Submit Status to “Standby” once it reaches stability. Under this status, the classifier will only run when a previously reviewed document is selected as a Control Set document.
Warning
If a user sets the option to “Infinite | Override - Continue Training with Stable Model”, the newly coded documents may set the system back to “Still Learning” status.
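The four Auto Submit statuses described above can be read as a simple decision rule. The sketch below is one possible interpretation in Python, not the product's logic. The function name, the parameter names and the short status strings are hypothetical, and the assumption that the Infinite option still waits for the retraining interval is not confirmed by this guide.

```python
def should_submit(status: str,
                  coded_since_last_run: int,
                  retraining_interval: int,
                  reviewed_doc_moved_to_control_set: bool = False) -> bool:
    """Decide whether newly coded documents go to the classifier (illustrative)."""
    if status == "Disabled":
        # Newly coded documents are never submitted.
        return False
    if status == "Standby":
        # Only runs when a previously reviewed document becomes a
        # Control Set document.
        return reviewed_doc_moved_to_control_set
    if status in ("Enabled", "Infinite"):
        # Assumption: both statuses wait for the retraining interval.
        return coded_since_last_run >= retraining_interval
    raise ValueError(f"unknown Auto Submit Status: {status}")
```

Under this reading, the difference between Enabled and Infinite is only that the system does not move an Infinite setting to Standby once stability is reached.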
Labels:
Training Queue Tag Name: Name of the training queue to be used for COSMIC.
Positive Tag Name: This label is for the Responsive coding button.
Negative Tag Name: This label is for the not-responsive coding button.
Skip Tag Name: This label is for the skip button to allow reviewers to temporarily “pass” on a document.
Classifier Configuration Weights:
The Classifier Configuration Weights table displays the weights given to each feature (beyond the word content as such) of each document segment. The features include enriched natural language processing (NLP) and metadata often implying dynamics outside the literal content of the documents themselves – such as emotionally charged communications, indications of pressure applied or endured, or the over-use of capitalization or personal pronouns. Also, the feature list includes both standard and custom entities.
The weights range from 0 to 1. By default, each feature is given a 0.1 weight. If a feature is given a higher weight, it will have more impact on the classifier. The Autotune feature may assign a different weight to the feature if it is enabled, or a user can manually adjust the weights.
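As an illustration of the rules above (a default of 0.1 and a valid range of 0 to 1), the following Python sketch builds a weight table and applies manual overrides. The function name and the feature names in the usage note are hypothetical placeholders, not actual COSMIC feature names.

```python
DEFAULT_WEIGHT = 0.1  # default weight per feature, as stated above


def apply_weight_overrides(features: list, overrides: dict) -> dict:
    """Start every feature at the default weight, then apply manual changes."""
    weights = {name: DEFAULT_WEIGHT for name in features}
    for name, value in overrides.items():
        if name not in weights:
            raise KeyError(f"unknown feature: {name}")
        if not 0 <= value <= 1:
            raise ValueError(f"weight for {name} must be between 0 and 1")
        weights[name] = value
    return weights
```

For example, `apply_weight_overrides(["sentiment", "pressure"], {"pressure": 0.5})` leaves the first placeholder feature at 0.1 and raises the second to 0.5, giving it more impact on the classifier.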
Package: See Section 9.E. -- COSMIC Model Library below.
Stability History
The STABILITY HISTORY tab shows the results each time the system recalculates stability. The results can be Still Learning, Approaching Stability, Stabilized, or Still Learning (Data Load).
For the system to reach stability, it has to encounter Approaching Stability 3 times in a row.
The system automatically sets stability to Still Learning (Data Load) each time new data has been added to the storybook.
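The "three times in a row" rule above can be expressed as a check on the most recent results. This is a minimal Python sketch under the assumption that each stability calculation produces one of the result names listed above; the function name `reached_stability` is hypothetical.

```python
APPROACHING = "Approaching Stability"


def reached_stability(results: list, required_in_a_row: int = 3) -> bool:
    """True if the latest results end with enough consecutive
    'Approaching Stability' entries (oldest result first)."""
    streak = 0
    for result in results:
        # Any other result, including 'Still Learning (Data Load)',
        # resets the streak.
        streak = streak + 1 if result == APPROACHING else 0
    return streak >= required_in_a_row
```

A data load between two Approaching Stability results therefore restarts the count, which matches the behavior described above.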
The following example shows the stability history of a COSMIC review project. “Download Stability History” allows a user to download a copy of the history report, including Status, Pos+Neg Count, Cycle and Stability Score for each calculation. The View Details link shows the documents tagged in that cycle.
Agreement Report:
The AGREEMENT tab shows the level of agreement between the model and the reviewers for each cycle.
Cycle Start: the starting review cycle for analyzing agreement.
Cycle End: the ending review cycle for analyzing agreement.
Threshold: the score threshold for positive (Yes).
Display type: choose either “List” or “Matrix”.
Analyze Button: press to run new report.
When the display type is set to use “List”, the report below shows the following columns:
Cycle: the review cycle under examination.
Run Date: the time when COSMIC runs classification for the cycle.
Stability Cycle: the stability cycle which contains this review cycle.
Stability Cycle Status: the status of the stability cycle which contains this coding cycle.
Stability Cycle Score: the stability score of the stability cycle which included this coding cycle.
Reviewer Yes/Model Yes: the number of documents tagged as Yes by reviewers and, among these, how many also have a COSMIC score equal to or above the threshold.
Agreement Yes Rate: the number of documents tagged as Yes by reviewers that also have a COSMIC score equal to or above the threshold, as a percentage of all documents tagged as Yes by reviewers in this cycle. For example, in cycle 15 above, 5 documents are tagged as Yes by reviewers and 4 of them have a COSMIC score equal to or above the threshold, so the Agreement Yes Rate is 80% (4 out of 5).
Reviewer No/Model No: the number of documents tagged as No by reviewers and, among these, how many also have a COSMIC score below the threshold.
Agreement No Rate: the number of documents tagged as No by reviewers that also have a COSMIC score below the threshold, as a percentage of all documents tagged as No by reviewers in this cycle. For example, in cycle 15 above, 4 documents are tagged as No by reviewers and 3 of them have a COSMIC score below the threshold, so the Agreement No Rate is 75% (3 out of 4).
Overall Labelled Count: total documents tagged in the review cycle.
Overall Agreement Rate: the average of Agreement Yes Rate and Agreement No Rate.
View Details: provides a link to show detailed reviewer actions.
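The List columns above can be reproduced with a few lines of arithmetic. The Python sketch below is illustrative only: the function name is hypothetical, scores are assumed to be on a 0 to 1 scale, and the individual scores in the usage note are invented so that the counts match the cycle 15 example (5 Yes with 4 agreeing, 4 No with 3 agreeing).

```python
def agreement_rates(docs: list, threshold: float) -> dict:
    """docs is a list of (reviewer_tag, cosmic_score) pairs for one cycle."""
    yes_scores = [score for tag, score in docs if tag == "Yes"]
    no_scores = [score for tag, score in docs if tag == "No"]
    # Model agrees with Yes when the score is at or above the threshold,
    # and with No when the score is below it.
    yes_agree = sum(score >= threshold for score in yes_scores)
    no_agree = sum(score < threshold for score in no_scores)
    yes_rate = yes_agree / len(yes_scores) if yes_scores else None
    no_rate = no_agree / len(no_scores) if no_scores else None
    overall = None
    if yes_rate is not None and no_rate is not None:
        overall = (yes_rate + no_rate) / 2  # average of the two rates
    return {"yes_rate": yes_rate, "no_rate": no_rate, "overall_rate": overall}
```

Calling it with five Yes documents scored 0.9, 0.8, 0.7, 0.55 and 0.3, four No documents scored 0.1, 0.2, 0.4 and 0.6, and a threshold of 0.55 yields an Agreement Yes Rate of 80%, an Agreement No Rate of 75%, and an Overall Agreement Rate of 77.5%.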
When the display type is set to use “Matrix”, the following additional settings become available:
Score Width: the interval of scores to display in matrix.
Cycle Group size: how many review cycles to group into one cycle for analyzing.
Below is an example of the agreement report using the “Matrix” display type, with the score width set to “10” and every 3 review cycles grouped into one (to access this, change the Display Type from “List” to “Matrix”):
Each cell in the report displays the number of documents in the score range that were tagged by a reviewer, how many of these are scored as Yes or No (based on the 55% threshold set above), and the percentage of agreement between human coding and the COSMIC score.
For example, note the indicated cell in the report above. For cycles 14-16, under column “0.40”, it shows “67% (2/3).” This indicates that from review cycle 14 to 16, three documents with a COSMIC score between 0.3 and 0.4 were tagged, two of which were tagged No by reviewers. Because the score range of 0.3 to 0.4 is below the 0.55 threshold, this shows a 67% agreement between the reviewers and the COSMIC score.
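The cell arithmetic in this example can be sketched as follows. This is an illustrative Python reading of the Matrix view, not the product's code: the function names are hypothetical, each column is assumed to be labeled by the upper end of its score range, a score that falls exactly on a boundary is assumed to belong to the lower column, and cycle grouping is left out. The scores in the usage note are invented to match the counts in the example.

```python
def agreement_matrix_row(docs: list, threshold: float, score_width: int = 10) -> dict:
    """Group (reviewer_tag, score) pairs into score columns.

    Returns {column_upper_bound_in_percent: (agreeing_docs, total_docs)}.
    """
    cells = {}
    for tag, score in docs:
        if tag not in ("Yes", "No"):
            continue  # only Yes/No tags can agree or disagree with a score
        pct = round(score * 100)
        # Round up to the next multiple of score_width (ceiling division).
        upper = max(score_width, -(-pct // score_width) * score_width)
        agrees = (tag == "Yes") == (score >= threshold)
        agree, total = cells.get(upper, (0, 0))
        cells[upper] = (agree + int(agrees), total + 1)
    return cells


def format_cell(agree: int, total: int) -> str:
    """Render a cell in the same style as the report, e.g. '67% (2/3)'."""
    return f"{agree / total:.0%} ({agree}/{total})"
```

Three documents scored 0.32, 0.38 and 0.35, tagged No, No and Yes, with a threshold of 0.55 all land in the 0.40 column, and `format_cell(2, 3)` renders the cell as "67% (2/3)".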
Reviewer Actions:
The REVIEWER ACTIONS tab shows the associated reviewer tagging actions:
Reviewer: the name of the reviewer.
Tag: Positive, Negative, Skip, Exclude.
Score: score assigned to the document at the time of tagging.
Agreed?
Yes: system agrees with the tag assigned.
No: system does not agree with the tag assigned.
Date: date and time of tagging.
Cycle: the cycle number at the time of tagging.
Copy Id: ID of the document tagged.
Control Number: control number of the document tagged.
Added or Removed from Training Set:
Added: tagged Yes.
Removed: tagged No.
Removed (Auto added to Control set): when a document was first tagged as a seed document and later selected as a Control Set document, the system automatically removes the document from training.
Included (Exclude tag removed): User tagged a previously Excluded document as Included.
Excluded (Exclude tag added): User tagged a document as Excluded.
In Training Queue?
Yes: tagged from within any COSMIC queue.
No: not tagged from within a COSMIC queue.
COSMIC Group Queue
Not in COSMIC Queue: when tags are assigned from outside a COSMIC queue.
[COSMIC GROUP]: primary COSMIC Group when tagging occurred.
Queue Type: one of the COSMIC queue types, see Section 9.B > COSMIC Queues for possible COSMIC queue types.
The ASSIGNED TO REPORT tab shows documents currently assigned to any reviewers.
In case a reviewer is removed from the review team, the System Administrator can use the Unassign button to check in documents originally assigned to that reviewer so they can be returned to the queue.
Correlation Matrix provides a cross examination of COSMIC scores from all COSMIC groups.
Score Range ([COSMIC GROUP]): the scope to examine; a user can select a Low, Medium or High scope to examine.
Display Mode: Inclusive Docs or All Docs.
Click the Analyze button to run the reports based on the two options above.
The matrix reports the percentage and number of documents with Low, Medium and High scores for each COSMIC group. A user can also click the link to load the corresponding documents into the thread viewer.
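One way to picture this cross examination is as a count of score bands per group for the documents that fall in the selected range. The Python sketch below is a rough illustration, not the report's actual logic. The function names and the group names in the usage note are hypothetical, scores are assumed to be on a 0 to 100 scale with the Low, Medium and High ranges from the Statistics table, and documents with special negative codes are assumed to have been filtered out beforehand.

```python
def band(score: float) -> str:
    """Map a 0-100 score to Low, Medium or High (boundaries are an assumption)."""
    if score >= 60:
        return "High"
    if score >= 40:
        return "Medium"
    return "Low"


def cross_examine(docs: list, selected_group: str, selected_band: str) -> dict:
    """docs is a list of dicts mapping COSMIC group name -> score.

    Returns, for every other group, the count and share of Low, Medium and
    High scores among documents in the selected band of the selected group.
    """
    selected = [d for d in docs if band(d[selected_group]) == selected_band]
    other_groups = sorted({g for d in docs for g in d} - {selected_group})
    report = {}
    for group in other_groups:
        counts = {"Low": 0, "Medium": 0, "High": 0}
        for d in selected:
            counts[band(d[group])] += 1
        total = len(selected)
        report[group] = {b: (n, n / total if total else 0.0)
                         for b, n in counts.items()}
    return report
```

For example, with two groups named "A" and "B", selecting the High range of "A" reports how many of those documents are Low, Medium or High in "B", together with each share.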