Reveal AI Common Errors

Summary: This document provides information on some common errors observed in Reveal AI.

  1. Processing is running slow or has failed due to low memory allocation

    • Error Message:

      java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space at nexlp.process.LoopingQueueProcess.process(LoopingQueueProcess.java:97) at nexlp.process.LoopingProcess.runInLoopWithSleepUntilStop(LoopingProcess.java:130) at nexlp.process.LoopingProcess.run(LoopingProcess.java:49) at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.util.concurrent.FutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source)

    • Meaning:

      Data ingestion ran out of Java heap memory. If ingestion is slow or fails with this error, check the Java heap allocation and the thread count used for processing on the processing server; they may have been set too low.

    • Resolution:

      We recommend allocating at least 52 GB of memory and 6 or more threads, if possible.
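
      To confirm what the JVM actually received, the minimal sketch below (not part of Reveal AI; the class name is hypothetical) prints the maximum heap and the processor count visible to a Java process. Run it with the same JVM options as the processing service, for example java -Xmx52g HeapCheck.

      // HeapCheck.java - hypothetical helper, not shipped with Reveal AI.
      public class HeapCheck {
          public static void main(String[] args) {
              // Compare these values against the recommended minimums: 52 GB heap, 6 threads.
              long maxHeapGb = Runtime.getRuntime().maxMemory() / (1024L * 1024 * 1024);
              int cpus = Runtime.getRuntime().availableProcessors();
              System.out.println("Max JVM heap (GB): " + maxHeapGb);
              System.out.println("Available processors: " + cpus);
          }
      }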

  2. Processing is running slow due to bulk insert failure

    • Error Message:

      2018-08-25-12-20-Exception:ItemID: (none). Exception in BulkInsertWriter. Processing will continue using the ONEBYONE method. See BulkInsertWriter.log for more details.

    • Meaning:

      Story Engine uses Bulk Insert mode to populate data into the SQL database during processing. If Bulk Insert fails, Story Engine reverts to one-by-one insertion mode (ONEBYONE), which is quite slow; the sketch below illustrates why.
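
      The generic JDBC sketch below (not Story Engine code) illustrates the difference: batched inserts send many rows per round trip, while one-by-one inserts pay a network round trip per row. The connection string, table, and column names are hypothetical, and the Microsoft JDBC Driver for SQL Server must be on the classpath.

      import java.sql.Connection;
      import java.sql.DriverManager;
      import java.sql.PreparedStatement;

      // InsertComparison.java - hypothetical illustration of batched vs. one-by-one inserts.
      public class InsertComparison {
          public static void main(String[] args) throws Exception {
              String url = "jdbc:sqlserver://sqlhost;databaseName=DemoDb;integratedSecurity=true";
              try (Connection con = DriverManager.getConnection(url);
                   PreparedStatement ps = con.prepareStatement(
                           "INSERT INTO Items (ItemId, Title) VALUES (?, ?)")) {
                  for (int i = 0; i < 10_000; i++) {
                      ps.setInt(1, i);
                      ps.setString(2, "doc-" + i);
                      ps.addBatch();          // batched path: many rows per round trip
                      // ps.executeUpdate();  // ONEBYONE path: one round trip per row
                  }
                  ps.executeBatch();          // flush the batch in one call
              }
          }
      }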

    • Resolution:

      To check whether Bulk Insert is working properly, open nexlp_exception.log and search for Bulk Insert related errors (see the example error message above). You can also open BulkInsertWriter.log and check whether it contains any error messages. This log is located at …\StoryBooks\{project}\ProcessOutput\log\processing\BulkInsertWriter\BulkInsertWriter.log

      To fix the Bulk Insert issue, confirm that the Bulk Insert path is set properly. This setting is located on the SQL Output server; click the Edit button to view or update it:

      image1.png

      Note that the Bulk Insert Folder should typically be located on a local drive on the SQL server. If it is set to point to a network share, it can cause security issues.

  3. Processing failed due to transaction lock

    • Error Message:

      The following error message appears in nexlp_exception.log:

      2018-08-25-12-20-Exception:ItemID: (none). TransactionManager: Transaction could not be closed. . 2018/08/25 12:20:10.743: ItemID: (none). java.lang.Exception: TransactionManager: Transaction could not be closed.

           at nexlp.persistence.transaction.TransactionManager.close(TransactionManager.java:94)

           at nexlp.neo4j.DatabaseGateway.processOneByOne(DatabaseGateway.java:611)

           at nexlp.neo4j.DatabaseGateway.executeOneByOneImport(DatabaseGateway.java:554)

           at nexlp.neo4j.DatabaseGateway.Import(DatabaseGateway.java:355)

           at nexlp.processing.dataconnector.Neo4JImportWrapper.runImport(Neo4JImportWrapper.java:343)

           at nexlp.processing.dataconnector.Neo4JImportWrapper.pushDocsToNeo4J(Neo4JImportWrapper.java:231)

           at nexlp.processing.dataconnector.Neo4JImportWrapper.update(Neo4JImportWrapper.java:87)

           at nexlp.processing.dataconnector.Neo4jUpdater.updateItemWithExceptionLogging(Neo4jUpdater.java:97)

           at nexlp.processing.dataconnector.Neo4jUpdater.updateItems(Neo4jUpdater.java:87)

           at nexlp.processing.dataconnector.Neo4jUpdater.runNonSyncronized(Neo4jUpdater.java:80)

           at nexlp.processing.dataconnector.Neo4jUpdater.run(Neo4jUpdater.java:29)

           at java.lang.Thread.run(Unknown Source)

    • Meaning:

      In rare situations, processing might fail if a user kicks it off a second time without cleaning up the earlier failed process, which causes processing to fail again.

    • Resolution:

      This is typically caused by an earlier failed run. Follow the steps below to fix the problem:

      1. Stop the currently running nexlp_processing service.

      2. Open Windows Task Manager and check for any Java process: if one is running and no other project has been kicked off on the same server, kill it (a command-line alternative to Task Manager is sketched after these steps).

      3. Restart the processing service and kick off processing again.
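
      For step 2, if you prefer a scriptable check over Task Manager, the sketch below (Java 9 or later; purely illustrative, not Reveal AI code) lists running Java processes with their PIDs and start times so you can decide whether one left over from the failed run needs to be killed.

      // ListJavaProcesses.java - hypothetical helper for step 2 above.
      public class ListJavaProcesses {
          public static void main(String[] args) {
              ProcessHandle.allProcesses()
                  .filter(ph -> ph.info().command()
                          .map(cmd -> cmd.toLowerCase().endsWith("java.exe"))
                          .orElse(false))
                  .forEach(ph -> System.out.println(
                          "PID " + ph.pid() + ", started " + ph.info().startInstant().orElse(null)));
          }
      }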

  4. COSMIC Service stopped

    • Error Message:

      Front-end reviewers will see “Error-service Stopped” in the thread view, as shown below:

      image2.jpeg
    • Meaning:

      COSMIC runs as a Windows service. Occasionally, if the server hosting the AI service (the AI server) is rebooted and the AI service runs into a problem connecting to the SQL server, the service will stop running.

    • Resolution:

      Follow the steps below to fix this problem:

      1. RDP into the AI server.

      2. Open Windows Services and find the NexLPServiceAi_[VERSION] service:

        image3.png
      3. If the service is not running, right-click it and start the service (or start it from a command line, as sketched below).
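
        If you need to check or start the service from a script rather than the Services console, the sketch below shells out to the standard Windows sc command via ProcessBuilder. The class name is hypothetical, and [VERSION] must be replaced with your installed version (as shown in the Services list).

        // ServiceCheck.java - hypothetical helper; wraps the built-in Windows "sc" command.
        public class ServiceCheck {
            public static void main(String[] args) throws Exception {
                String service = "NexLPServiceAi_[VERSION]";  // replace [VERSION] with your version
                new ProcessBuilder("sc", "query", service).inheritIO().start().waitFor();
                // To start the service if it is stopped:
                // new ProcessBuilder("sc", "start", service).inheritIO().start().waitFor();
            }
        }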

  5. COSMIC Failed because of missing vector -300

    This issue only occurs in older versions (below 1.14.05.30).

    • Error Message:

      In the COSMIC document view, if a user reports a large number of documents that were not classified successfully and also sees a red warning message in the thread viewer (see below), it may be because these documents are empty and their metadata features need to be improved when the numeric vectors are created.

      image4.png
      image5.png
    • Meaning:

      This typically happens when the vector files are hosted on a non-Windows server.

    • Resolution:

      To fix the issue, rebuild the numeric vectors. See the online document G. Rebuild Data (revealdata.com) for more details.

  6. COSMIC Failed because of missing vector -200 or -400

    • Error Message:

      See the example below, with the outlined message that displays when hovering over the red caution symbol.

      image6.png
    • Meaning:

      When viewing the COSMIC report, there is an exceptionally high number of unscored documents due to errors and missing text vectors. When the missing-text-vector population is reviewed, the documents appear to have text, and saved searches of both sets can be opened and navigated in the review application without issue.

    • Resolution:

      The best option is to regenerate the text vectors. See the online document G. Rebuild Data (revealdata.com) for more details.

  7. COSMIC errored with “Invalid Data Source” message

    • Error Message:

      java.lang.RuntimeException: java.lang.Exception: Invalid source data. Check logs for more info. at nexlp.process.LoopingProcess.process(LoopingProcess.java:120) at nexlp.process.LoopingQueueProcess.process(LoopingQueueProcess.java:91) at nexlp.process.LoopingProcess.runInLoopWithSleepUntilStop(LoopingProcess.java:130) at nexlp.process.LoopingProcess.run(LoopingProcess.java:49) at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.util.concurrent.FutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Caused by: java.lang.Exception: Invalid source data. Check logs for more info. at nexlp.classifiers.manager.database.BiGramsClassifierManagerDatabase.process(BiGramsClassifierManagerDatabase.java:643) at nexlp.classifiers.manager.service.ClassifiersManagerLoopingProcess.mainProcess(ClassifiersManagerLoopingProcess.java:69) at nexlp.process.LoopingProcess.process(LoopingProcess.java:97) ... 8 more

    • Meaning:

      We have seen this happen in two scenarios:

      1. Reviewers kicked off “Run Full Process” in the COSMIC settings without realizing there were not enough positive/negative seed documents (by default, at least 4 positive samples and 1 negative sample are required).

      2. After stability had been reached, a new model was published and more documents were then added to the case.

    • Resolution:

      To fix this issue, first check the number of seed documents. If there are fewer than required, ask the reviewer to tag more positive/negative documents and run Full Process again (a simple seed-count pre-check is sketched after these steps).

      If it is the second scenario, follow the steps below to force COSMIC to run a round:

      1. Change Auto Submit Status to Override or Enabled.

      2. Tag enough docs to get COSMIC to run a round.
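
      For the first scenario, the minimal sketch below shows the kind of seed-count pre-check an administrator could run before kicking off Run Full Process, using the default thresholds stated above (4 positive, 1 negative). It is not a Reveal AI API; the class and tag values are hypothetical and the counts would need to come from your own review tags.

      import java.util.List;

      // SeedCheck.java - hypothetical pre-check against the stated default seed minimums.
      public class SeedCheck {
          static final int MIN_POSITIVE = 4;
          static final int MIN_NEGATIVE = 1;

          static boolean enoughSeeds(List<String> tags) {
              long pos = tags.stream().filter("positive"::equalsIgnoreCase).count();
              long neg = tags.stream().filter("negative"::equalsIgnoreCase).count();
              return pos >= MIN_POSITIVE && neg >= MIN_NEGATIVE;
          }

          public static void main(String[] args) {
              // Example tag list; in practice these counts would come from the review platform.
              System.out.println(enoughSeeds(List.of("positive", "positive", "negative")));  // false
          }
      }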

  8. Documents Are Timing Out On The Front-End

    • Error Message:

      The wait operation timed out

      System.ComponentModel.Win32Exception

      The wait operation timed out

      ---> System.Data.SqlClient.SqlException: Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.ComponentModel.Win32Exception: The wait operation timed out

    • Meaning:

      Documents are taking a very long time to load after selecting them from the thread navigation.

    • Resolution:

      Check the “Systems Error” tab on the Admin menu in the software. If you see an error similar to the one above, try to find the SQL query that was submitted. Use a tool such as SQL Query Analyzer to analyze the query and find out why it takes so long (for example, the indexes might be too fragmented; a fragmentation check is sketched below).
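
      As one way to spot fragmented indexes, the sketch below runs a standard query against SQL Server's sys.dm_db_index_physical_stats view via JDBC. It is illustrative only: the connection string and database name are placeholders, the 30% threshold is a common rule of thumb rather than a Reveal requirement, and the Microsoft JDBC Driver for SQL Server must be on the classpath.

      import java.sql.Connection;
      import java.sql.DriverManager;
      import java.sql.ResultSet;
      import java.sql.Statement;

      // FragmentationCheck.java - hypothetical helper; lists heavily fragmented indexes.
      public class FragmentationCheck {
          public static void main(String[] args) throws Exception {
              String url = "jdbc:sqlserver://sqlhost;databaseName=YourReviewDb;integratedSecurity=true";
              String sql =
                  "SELECT OBJECT_NAME(ips.object_id) AS table_name, i.name AS index_name, " +
                  "       ips.avg_fragmentation_in_percent " +
                  "FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') ips " +
                  "JOIN sys.indexes i ON ips.object_id = i.object_id AND ips.index_id = i.index_id " +
                  "WHERE ips.avg_fragmentation_in_percent > 30 " +
                  "ORDER BY ips.avg_fragmentation_in_percent DESC";
              try (Connection con = DriverManager.getConnection(url);
                   Statement st = con.createStatement();
                   ResultSet rs = st.executeQuery(sql)) {
                  while (rs.next()) {
                      System.out.printf("%s / %s : %.1f%% fragmented%n",
                          rs.getString("table_name"), rs.getString("index_name"),
                          rs.getDouble("avg_fragmentation_in_percent"));
                  }
              }
          }
      }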

      Also confirm your version of Microsoft SQL Server. If you are using SQL Server 2016 without Service Pack 2, you will not be able to add all of the required indexes to improve load times on the front end.

  9. COSMIC Model Error

    • Error Message:

      COSMIC model errors often appear as shown below (in the classifier status).

      Error message example:

      An error has occurred and processing has stopped. Contact your administrator to resolve this issue. java.lang.Exception: Could not find vocab files in \\nexlp-aiworker1\NEXLP_SHARE\Vectors\1000132\ActiveLearning\NumericVectors_1612510586898\Vocab at nexlp.activelearning.ActiveLearningClassifierDataManager.loadVocabAndReIndexSubjectFilenmIfNeeded(ActiveLearningClassifierDataManager.java:529)

      image7.png
    • Meaning:

      The classifier could not locate the vocabulary (Vocab) files that are created alongside the numeric vectors on the vector share (see the path in the error message), so the COSMIC model cannot be built or updated until they are regenerated.

    • Resolution:

      Navigate to Cosmic Settings in Reveal AI Story Engine: Reveal AI Story Engine site > find the Storybook > click the Cosmic settings tab > click Settings.

      - Under Settings you will find reports, where the error message and details normally appear.

      • Option 1 (Re-run full process) - Under the Settings tab, Run full process will reprocess the model.

        Note

        This will take some time to reprocess.

        image8.png
      • Option 2 (Regenerate vectors) - If option 1 did not fix the issue, regenerate the vectors in the storybook.

        Note

        This takes quite a bit more time to regenerate, depending on the number of documents.

        image9.png
  10. COSMIC Index (Reference model error)

    • Error Message:

      An error has occurred and processing has stopped. Contact your administrator to resolve this issue. java.lang.Exception: Vectors don't exist on disk for timestamp:1602130175067 and Vectors Creation Service Errored. at nexlp.classifiers.manager.BiGramsClassifierManager.queueVectorsIfNeeded(BiGramsClassifierManager.java:347) at nexlp.classifiers.manager.BiGramsClassifierManager.call(BiGramsClassifierManager.java:222) at nexlp.classifiers.manager.database.BiGramsClassifierManagerDatabase.call(BiGramsClassifierManagerDatabase.java:321) at nexlp.classifiers.manager.database.BiGramsClassifierManagerDatabase.process(BiGramsClassifierManagerDatabase.java:696) at nexlp.classifiers.manager.service.ClassifiersManagerLoopingProcess.mainProcess(ClassifiersManagerLoopingProcess.java:72) at nexlp.process.LoopingProcess.process(LoopingProcess.java:111) at nexlp.process.LoopingQueueProcess.callSuperProcess(LoopingQueueProcess.java:184) at nexlp.process.RunMainProcess.lambda$runMainProcess$0(RunMainProcess.java:17) at java.util.concurrent.FutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source

    • Meaning:

      Caused by the hard drive on the AI server being full.

    • Resolution:

      Clear out old vector files, then re-run or regenerate the vectors in Reveal AI.

      On the AI server for this instance, you can locate the vector folders corresponding to the storybook ID (the path below is an example, but the layout should be the same on all AI servers): D:\NexLP\Logs\Service\NexlpServiceAi_2.90.06\VECTORS
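
      Because this error comes from the drive filling up, the sketch below (illustrative only; the path is the example above and should be adjusted to your AI server) reports how much free space remains on the drive that holds the vector folders.

      import java.io.File;

      // DiskSpaceCheck.java - hypothetical helper; prints free space on the vectors drive.
      public class DiskSpaceCheck {
          public static void main(String[] args) {
              // Returns 0 GB if the path does not exist on this machine.
              File vectors = new File("D:\\NexLP\\Logs\\Service\\NexlpServiceAi_2.90.06\\VECTORS");
              long freeGb = vectors.getUsableSpace() / (1024L * 1024 * 1024);
              long totalGb = vectors.getTotalSpace() / (1024L * 1024 * 1024);
              System.out.println("Free space: " + freeGb + " GB of " + totalGb + " GB");
          }
      }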

      After selecting the Storybook in Reveal AI, click Rebuild data, then Generate COSMIC vectors (if the error message states that vectors do not exist).

      image10.png

      Once vector creation is complete, re-run Full Process. This will re-run the COSMIC job.

      Under the COSMIC group tab in Reveal AI, a new tab will open > under the COSMIC job, click Settings, then Run Full Process.

      image11.png
  11. Error Building COSMIC AI model

    • Error Message:

      An error has occurred and processing has stopped. Contact your adminstrator to resolve this issue. java.lang.NullPointerException at nexlp.classifiers.manager.database.BiGramsClassifierManagerDatabase.process(BiGramsClassifierManagerDatabase.java:870) at nexlp.classifiers.manager.service.ClassifiersManagerLoopingProcess.mainProcess(ClassifiersManagerLoopingProcess.java:72) at nexlp.process.LoopingProcess.process(LoopingProcess.java:111) at nexlp.process.LoopingQueueProcess.callSuperProcess(LoopingQueueProcess.java:186) at nexlp.process.RunMainProcess.lambda$runMainProcess$0(RunMainProcess.java:17) at java.util.concurrent.FutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source)  

      The issue from the system error log is as follows:  

      Failed to process the job '17921': an exception occurred.  

      Message Details

      System.Exception: Java Service Error : java.lang.RuntimeException: java.lang.NullPointerException   

           at nexlp.process.LoopingProcess.process(LoopingProcess.java:134)

           at nexlp.process.LoopingQueueProcess.callSuperProcess(LoopingQueueProcess.java:186)

           at nexlp.process.RunMainProcess.lambda$runMainProcess$0(RunMainProcess.java:17)

           at java.util.concurrent.FutureTask.run(Unknown Source)  

           at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)

           at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)

           at java.lang.Thread.run(Unknown Source)

      Caused by: java.lang.NullPointerException

           at nexlp.classifiers.manager.database.BiGramsClassifierManagerDatabase.process(BiGramsClassifierManagerDatabase.java:870)

           at nexlp.classifiers.manager.service.ClassifiersManagerLoopingProcess.mainProcess(ClassifiersManagerLoopingProcess.java:72)

           at nexlp.process.LoopingProcess.process(LoopingProcess.java:111)

           ... 6 more

           at NexLP.StoryEngine.Core.Jobs.Cosmic.WatchActiveLearningProcessQueueJob.Watch(PerformContext context, Int32 activeLearningProcessQueueId, IJobCancellationToken cancellationToken) in D:\TeamCity\buildAgent-2\work\1a441ced843e34ae\NexLP.StoryEngine.Core\Jobs\Cosmic\WatchActiveLearningProcessQueueJob.cs:line 61

    • Meaning:

    • Resolution:

      • Step 1: Log into Reveal AI > System Admin > (Select Case) > Cosmic Groups

        image12.png
      • Step 2: Go to Settings (of the model in question) > check Available in Storybook, then Run Full Process.

        image13.png

Revised 8/16/2021 11:16 (A. Kass)