Failure Modes & Partial Results
Virza’s document processing pipeline is designed to be resilient. When a stage fails, the system either retries automatically or degrades gracefully, preserving as much extracted data as possible.
How failure handling works
The pipeline divides stages into critical and non-critical:
- Critical stages (text extraction, section splitting, file validation): if these fail, the document cannot be used and is marked Failed
- Non-critical stages (summaries, embeddings, citations, figures, tables): if these fail, the document is marked Ready with warnings and remains fully usable for core features
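A non-critical stage is retried automatically up to 5 times with exponential backoff. Only if every retry fails is the stage marked as failed and the document marked Ready with warnings. Some transient errors in critical stages, such as parse timeouts, are retried in the same way; the error reference lists which errors are retried.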
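The classification and retry rules above can be sketched as follows. This is a minimal illustration, not Virza's implementation. The stage identifiers, the 1-second base delay, the doubling factor, the plain `Ready` status, and the reading of "up to 5 times" as five retries after the first attempt are all assumptions.

```python
import time
from typing import Callable, Dict

# Stage groups taken from the lists above; the identifiers are hypothetical.
CRITICAL = {"file_validation", "text_extraction", "section_splitting"}
NON_CRITICAL = {"summaries", "embeddings", "citations", "figures", "tables"}


def run_with_backoff(
    stage_fn: Callable[[], None],
    max_retries: int = 5,        # from the text: retried up to 5 times
    base_delay: float = 1.0,     # assumption: the real base delay is not documented
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run stage_fn, retrying on any exception. Return True on success."""
    for attempt in range(max_retries + 1):
        try:
            stage_fn()
            return True
        except Exception:
            if attempt == max_retries:
                return False  # retries exhausted: the stage is marked failed
            # Exponential backoff: the wait doubles after every failed attempt.
            sleep(base_delay * 2 ** attempt)
    return False


def document_status(results: Dict[str, bool]) -> str:
    """Map per-stage success flags to a document status."""
    if any(not ok for name, ok in results.items() if name in CRITICAL):
        return "Failed"
    if any(not ok for name, ok in results.items() if name in NON_CRITICAL):
        return "Ready with warnings"
    return "Ready"  # assumption: status name when every stage succeeds
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. A production retry loop would usually also add random jitter to the delay so that many failing documents do not retry at the same moment.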
Error reference
| Error | What happened | Is it retried? | What to do |
|---|---|---|---|
| Size limit exceeded | File is larger than your plan’s maximum upload size | No | Compress the PDF or upgrade your plan. See Upload Limits. |
| Page limit exceeded | Document has more pages than your plan allows | No | Split the document into smaller parts or upgrade. |
| Encryption detected | PDF is password-protected or encrypted | No | Remove the password using a PDF editor and re-upload. |
| MIME invalid | File type is not supported; only PDF, DOCX, and TXT files are accepted | No | Convert to a supported format and re-upload. |
| Parse timeout | The document parser exceeded its 300-second timeout | Yes (up to 5 times) | Try re-uploading. Very complex layouts may take multiple attempts. If it persists, contact support. |
| OCR failed | Scanned PDF text recognition produced unreliable results | Yes (up to 5 times) | Use the original text-based PDF if available. For scanned documents, ensure the scan is clean and high-resolution. |
| Model error | An AI model (summarization, embedding, or vision) returned an error | Yes (up to 5 times) | Usually resolves automatically on retry. If it persists after re-uploading, contact support. |
| Budget exceeded | Your workspace’s AI credit budget has been reached | No | Wait for your monthly credit reset or upgrade your plan. |
| Storage error | The file storage service was temporarily unavailable | Yes (up to 5 times) | Usually resolves automatically. Re-upload if the issue persists. |
| Infected | Virus or malware was detected in the file | No | The file has been quarantined and deleted. Do not re-upload. Scan the original file with antivirus software. |
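If you handle these errors in your own tooling, the table can be encoded as a lookup. The sketch below is only an illustration: the snake_case keys are hypothetical identifiers derived from the Error column, and the codes Virza actually reports may differ.

```python
from typing import Dict, NamedTuple


class ErrorInfo(NamedTuple):
    retried: bool   # "Is it retried?" column
    action: str     # shortened "What to do" column


# Hypothetical keys; values follow the error reference table.
ERRORS: Dict[str, ErrorInfo] = {
    "size_limit_exceeded": ErrorInfo(False, "Compress the PDF or upgrade your plan."),
    "page_limit_exceeded": ErrorInfo(False, "Split the document or upgrade."),
    "encryption_detected": ErrorInfo(False, "Remove the password and re-upload."),
    "mime_invalid": ErrorInfo(False, "Convert to a supported format and re-upload."),
    "parse_timeout": ErrorInfo(True, "Try re-uploading; contact support if it persists."),
    "ocr_failed": ErrorInfo(True, "Use a text-based PDF or a cleaner scan."),
    "model_error": ErrorInfo(True, "Usually resolves on retry; otherwise contact support."),
    "budget_exceeded": ErrorInfo(False, "Wait for the credit reset or upgrade."),
    "storage_error": ErrorInfo(True, "Usually resolves; re-upload if it persists."),
    "infected": ErrorInfo(False, "Do not re-upload; scan the original file."),
}


def needs_user_action(error: str) -> bool:
    """True when the error is not retried automatically, so the user must act."""
    return not ERRORS[error].retried
```

For example, `needs_user_action("encryption_detected")` is true, while `needs_user_action("model_error")` is false because the pipeline retries that error on its own first.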
Understanding “Ready with warnings”
A document marked Ready with warnings means:
- ✅ Full text was extracted successfully
- ✅ Sections were identified
- ✅ The document is searchable
- ⚠️ One or more optional enrichment stages failed
Common causes:
- Embedding provider temporarily unavailable: semantic search for this document may be limited until embeddings are generated
- Summary generation failed: the AI summary is missing; you can still read the full document
- Citation parsing encountered issues: some references may not have been extracted; you can view the References section of the PDF directly
You can retry failed stages from the document menu (⋯ → Reprocess).
Understanding “Failed”
A document marked Failed means a critical stage could not complete:
- Text extraction produced no usable text (for example, the document is image-only and OCR could not recover readable text)
- The file could not be parsed at all (corrupted file, unsupported internal format)
- File validation failed (encryption, unsupported MIME type, size/page limits)
What to check:
- Can you select text in the PDF using your local PDF reader? If not, it is probably an image-only (scanned) PDF.
- Is the PDF password-protected? Some PDFs carry DRM or permission restrictions that are not obvious when you open them.
- Is the file corrupted? Try opening it locally first.
- Does the file exceed your plan’s size or page limits?
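Some of these checks can be run locally before uploading. The sketch below is a rough heuristic built on the PDF file format, not Virza's validation logic. `max_bytes` is a hypothetical parameter, since your plan's actual limit is listed elsewhere. The sketch does not count pages or detect image-only PDFs, and the `/Encrypt` test can occasionally give a false positive.

```python
from pathlib import Path
from typing import List


def check_pdf_bytes(data: bytes, max_bytes: int) -> List[str]:
    """Return a list of likely problems; an empty list means none were found."""
    problems = []
    if len(data) > max_bytes:
        problems.append("file is larger than the size limit")
    # A PDF starts with a "%PDF-" header (readers tolerate a short preamble).
    if b"%PDF-" not in data[:1024]:
        problems.append("missing PDF header (corrupted or not a PDF)")
    # A complete PDF ends with an "%%EOF" marker.
    if b"%%EOF" not in data[-1024:]:
        problems.append("missing end-of-file marker (possibly truncated)")
    # Encrypted PDFs reference an encryption dictionary via an /Encrypt entry.
    if b"/Encrypt" in data:
        problems.append("encryption entry found (password or DRM restrictions)")
    return problems


def check_pdf_file(path: str, max_bytes: int) -> List[str]:
    """Convenience wrapper that reads the file from disk."""
    return check_pdf_bytes(Path(path).read_bytes(), max_bytes)
```

A call such as `check_pdf_file("paper.pdf", max_bytes=50_000_000)` returns an empty list when none of the heuristics trigger; the 50 MB figure here is only a placeholder.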
Search degradation
Virza’s search system also handles failures gracefully. If part of the search infrastructure is temporarily unavailable, search degrades as follows:
| Situation | What happens | Impact on your results |
|---|---|---|
| Semantic search unavailable | Falls back to keyword-only search | Results based on exact text matches only; you may miss conceptually related papers |
| Reranking unavailable | Uses statistical fusion ranking | Slightly less precise result ordering |
| External databases slow | Discover tab results may be incomplete | Some providers may not return results; others still work |
| Full pipeline timeout (rare) | Returns best results available within 4 seconds | You may see fewer results than usual |
You will never see an error page for search. Virza always returns the best results it can with the available infrastructure. If search quality is degraded, the response includes a signal that results may be less precise than usual.
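The fallback behaviour for semantic search and reranking can be sketched as below. This is an illustration under assumptions rather than Virza's search code: the function names and the `degraded` flags are hypothetical, reciprocal rank fusion stands in for the unspecified "statistical fusion ranking", and the 4-second deadline and external providers are not modelled.

```python
from typing import Callable, Dict, List, Tuple

Retriever = Callable[[str], List[str]]
Reranker = Callable[[str, List[str]], List[str]]


def rrf(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Reciprocal rank fusion: each list contributes 1 / (k + rank) per document."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties are broken by document id for stable output.
    return sorted(scores, key=lambda d: (-scores[d], d))


def search(
    query: str, keyword: Retriever, semantic: Retriever, rerank: Reranker
) -> Tuple[List[str], List[str]]:
    """Return (results, degraded), where degraded lists the fallbacks used."""
    degraded: List[str] = []
    rankings = [keyword(query)]
    try:
        rankings.append(semantic(query))
    except Exception:
        degraded.append("semantic_unavailable")  # keyword-only results
    fused = rrf(rankings)
    try:
        results = rerank(query, fused)
    except Exception:
        degraded.append("rerank_unavailable")    # keep the fused ordering
        results = fused
    return results, degraded
```

The caller always receives a result list. A non-empty `degraded` list plays the role of the signal that results may be less precise than usual.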