| Key | Summary | Assignee | Reporter | Status | Resolution |
|---|---|---|---|---|---|
| TIKA-2373 | Fix licenses via rat before 1.15 release | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2370 | Close Font in TrueTypeParser | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2367 | Avoid npe in wmf | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2364 | Clean up printstacktrace | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2361 | Upgrade to PDFBox 2.0.6 | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2360 | Handle SentimentParser resource failure more robustly | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2358 | Avoid bundling dl4j with tika-app and tika-server | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2357 | Allow Tesseract PSM up to 13 | Dave Meikle | Dave Meikle | Resolved | Fixed |
| TIKA-2356 | Temporarily prevent duplication of sheets in some xlsx POI-61034 | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2354 | Missing many embedded images in .doc files | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2352 | Incorrect EOF exception in WordPerfect parser | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2350 | Add catch block when opening Action on document open in PDFParser | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2349 | Try to match digests when finding equivalent embedded files in tika-eval Compare | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2348 | Improve error reporting in wmf/emf | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2345 | TikaConfigSerializer should expose EncodingDetector details | Unassigned | Nick Burch | Resolved | Fixed |
| TIKA-2343 | --text-main in tika-server | Unassigned | Nino Skopac | Resolved | Fixed |
| TIKA-2339 | Remove test file flagged by anti-virus code | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2334 | Upgrade SQLite to 3.16.1 | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2333 | TIKA-2330 Upgrade commons-compress to 1.13 | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2331 | Upgrade RTFParser to allow configuration of max bytes per embedded object | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2330 | Prevent preventable OOM in CompressorInputStream | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2329 | Upgrade to POI 3.16-final | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2325 | Allow specification of default lang for common words | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2323 | Improve commandline parameterization of thresholds | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2322 | Video labeling using existing ObjectRecognition | Chris A. Mattmann | Madhav Sharan | Resolved | Fixed |
| TIKA-2311 | Preserve "x-tika-ooxml" mime value for truncated ooxml files | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2309 | New Detector and Parser classes for Time Stamped Data Envelope file format | Unassigned | Fabio | Resolved | Fixed |
| TIKA-2307 | Accidentally swallowing UnsupportedZipFeatureException in rare cases | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2305 | REST api documentation can't be viewed on the website because your MireDot license has expired | Konstantin Gribov | Laszlo Marai | Closed | Fixed |
| TIKA-2300 | Can't tell if a zip file is encrypted | Tim Allison | Aeham Abushwashi | Resolved | Fixed |
| TIKA-2297 | Add Lingo24 Language Detector | Dave Meikle | Dave Meikle | Resolved | Fixed |
| TIKA-2295 | Image not extracted via -z or -J in ODT | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2292 | Update CXF version to 3.0.12 | Dave Meikle | Sergey Beryozkin | Resolved | Fixed |
| TIKA-2291 | REST API documentation is down | Lewis John McGibbney | Mike Liu | Closed | Fixed |
| TIKA-2290 | PDFParser 'ocr' properties cannot be set via headers when using Tika JAXRS | Tim Allison | Kevin Oberlag | Resolved | Fixed |
| TIKA-2287 | Allow general jdbc connectivity for tika-eval | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2286 | Add parameterization for image quality when rendering PDF page for OCR | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2281 | Let's extract the MAPI subtype (NOTE, STICKY, etc.) for msg files | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2279 | Simplify token counting in tika-eval | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2277 | Remove ParseContext field from AbstractParser | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2276 | Try to be more parsimonious creating TikaConfigs and ParseContexts | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2275 | EmbeddedDocumentUtil should check parseContext for a TikaConfig | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2269 | NPE with FeedParser | Unassigned | Julien Nioche | Closed | Fixed |
| TIKA-2267 | Add common tokens files for tika-eval | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2255 | Test files for SAS mimetypes | Unassigned | Nick Burch | Resolved | Fixed |
| TIKA-2253 | Obtain new Miredot license key and upgrade plugin version in tika-server | Lewis John McGibbney | Lewis John McGibbney | Closed | Fixed |
| TIKA-2250 | Remove the x- prefix for some Microsoft image format mimetypes, eg BMP | Unassigned | Nick Burch | Resolved | Fixed |
| TIKA-2247 | Extract text from WMF/EMF files | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2246 | Extract files embedded within EMF files | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2245 | Standardise logging | Konstantin Gribov | Matthew Caruana Galizia | Resolved | Fixed |
| TIKA-2244 | excessive memory usage when parsing a large nested package file | Unassigned | Joshua Hight | Resolved | Fixed |
| TIKA-2242 | opendocument parsing produces malformed xml | Tim Allison | Jan Van Raemdonck | Resolved | Fixed |
| TIKA-2240 | MS Write File | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2238 | Add mime detection for embedded MSEquation files | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2237 | UnsupportedOperationException due to SingletonList.set in ProbabilisticMimeDetectionSelector | Unassigned | Jasper Hafkenscheid | Resolved | Fixed |
| TIKA-2236 | Upgrade to PDFBox 2.0.5 when available | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2235 | Use Tesseract's recommended DPI for PDF images | Unassigned | Matthew Caruana Galizia | Resolved | Fixed |
| TIKA-2234 | Remove ThreadLocal from dateformat | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2232 | Add JBIG2 image parsing support | Tim Allison | Pascal Essiembre | Resolved | Fixed |
| TIKA-2231 | Invalid language code exception | Unassigned | Peter Weiss | Resolved | Fixed |
| TIKA-2230 | Add paragraph markup to WordPerfect parser(s) | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2229 | NullPointerException at org.apache.tika.parser.microsoft.ooxml.XWPFListManager.getFormattedNumber(XWPFListManager.java:64) | Unassigned | Jorge Spinsanti | Resolved | Fixed |
| TIKA-2228 | WordPerfect parser update to support 5.x | Unassigned | Pascal Essiembre | Resolved | Fixed |
| TIKA-2226 | Add UnsupportedFormatException (extends TikaException) | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2223 | Extra ß characters in some WordPerfect files | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2221 | poi.EncryptedDocumentException not wrapped in tika.exception.EncryptedDocumentException | Unassigned | Matthew Caruana Galizia | Resolved | Fixed |
| TIKA-2220 | Refactor/merge new experimental docx/pptx components | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2219 | CharsetDetector no longer detects windows-1252 charset | Unassigned | Pascal Essiembre | Resolved | Fixed |
| TIKA-2218 | Add a few more places where PPTX relationships might include an attachment | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2215 | TikaException about "Invalid embedded resource" on a valid PPT file | Unassigned | Seva Alekseyev | Resolved | Fixed |
| TIKA-2212 | Update mimes for OOXMLParser | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2211 | ePub formatting instructions appear in plain text output | Unassigned | Adam Carroll | Resolved | Fixed |
| TIKA-2210 | Add experimental SAX/Streaming XSLF/pptx extractor | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2209 | Update PDFBox to 2.0.4 | Konstantin Gribov | Konstantin Gribov | Closed | Fixed |
| TIKA-2208 | Catch missing libraries | Unassigned | David Pilato | Resolved | Fixed |
| TIKA-2207 | ArrayIndexOutOfBoundsException on a valid Excel file | Unassigned | Seva Alekseyev | Resolved | Fixed |
| TIKA-2204 | IndexOutOfBoundsException on a valid Powerpoint file | Unassigned | Seva Alekseyev | Resolved | Fixed |
| TIKA-2202 | StringIndexOutOfBoundsException on a valid Word document | Unassigned | Seva Alekseyev | Resolved | Cannot Reproduce |
| TIKA-2198 | NullPointerException on a valid Word file | Unassigned | Seva Alekseyev | Resolved | Fixed |
| TIKA-2195 | Consolidate MockParser's service loading file and custom-mimetype entry into tika-core's tests jar | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2193 | java.io.NotSerializableException while using ForkParser | Unassigned | Michal Hlavac | Closed | Duplicate |
| TIKA-2192 | Extract embedded files from headers, footers, footnotes, etc from docx/m | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2191 | Apply current .docx unit tests to experimental SAX parser and fix or document as necessary | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2190 | Add "preserve_interword_spaces" option of tesseract | Tim Allison | Bipul Kumar | Resolved | Fixed |
| TIKA-2189 | Default value mismatch for "enableImageProcessing" in TesseractOCRConfig.properties and TesseractOCRConfig.java | Unassigned | Bipul Kumar | Resolved | Fixed |
| TIKA-2187 | Align default behavior of experimental docx parser with that of doc parser in handling delText | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2181 | Upgrade to POI 3.16-beta2 when available | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-2179 | WordMLParser fails to parse a word xml file | Tim Allison | Sean Story | Resolved | Fixed |
| TIKA-2175 | Enable extraction of inlined jp2/jpx from PDF | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2174 | Too few formats in support declared by TesseractOCRParser | Unassigned | Matthew Caruana Galizia | Resolved | Fixed |
| TIKA-2171 | Upgrade SQLite to 3.15.1 | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2170 | Tika 1.13 ForkParser fails intermittently with very large MS Word docx | Unassigned | Tim Kingsbury | Resolved | Fixed |
| TIKA-2169 | Fix xhtml in combination OCR+metadata extraction from images | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2167 | Image processing causes OCR to fail | Unassigned | Matthew Caruana Galizia | Resolved | Fixed |
| TIKA-2166 | TaggedIOException from a ZipException on a valid Word file | Unassigned | Seva Alekseyev | Resolved | Fixed |
| TIKA-2164 | HSLFException from ZipException "invalid stored block lengths" on a valid Powerpoint file | Unassigned | Seva Alekseyev | Resolved | Fixed |
| TIKA-2162 | "Unknown compression method" on a Powerpoint file | Unassigned | Seva Alekseyev | Resolved | Fixed |
| TIKA-2161 | EOFException on a valid Powerpoint file | Unassigned | Seva Alekseyev | Resolved | Fixed |
| TIKA-2160 | POIXMLException from NullPointerException on a valid Word file | Unassigned | Seva Alekseyev | Resolved | Fixed |
| TIKA-2159 | Handle pre-parse embedded object exceptions uniformly and more robustly | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2158 | NullPointerException on a valid Word file | Unassigned | Seva Alekseyev | Resolved | Fixed |
| TIKA-2155 | IndexOutOfBoundsException on a valid Excel file | Unassigned | Seva Alekseyev | Resolved | Fixed |
| TIKA-2153 | TaggedIOException on a valid Powerpoint file | Unassigned | Seva Alekseyev | Resolved | Fixed |
| TIKA-2152 | NullPointerException on a valid Word file | Unassigned | Seva Alekseyev | Resolved | Fixed |
| TIKA-2145 | InvalidFormatException on a valid Word file | Unassigned | Seva Alekseyev | Resolved | Fixed |
| TIKA-2142 | ArrayIndexOutOfBoundsException | Unassigned | Seva Alekseyev | Resolved | Fixed |
| TIKA-2137 | NullPointerException on a valid Word file | Unassigned | Seva Alekseyev | Resolved | Fixed |
| TIKA-2136 | External file links in PPTX misparsed | Unassigned | Seva Alekseyev | Resolved | Fixed |
| TIKA-2134 | Different NullPointerException on a valid Excel file | Unassigned | Seva Alekseyev | Resolved | Fixed |
| TIKA-2132 | NullPointerException on a valid Excel file | Unassigned | Seva Alekseyev | Resolved | Fixed |
| TIKA-2130 | TaggedIOException from ZipException on a valid PowerPoint file | Unassigned | Seva Alekseyev | Resolved | Fixed |
| TIKA-2129 | IllegalArgumentException/"Unknown shape type" on a valid Powerpoint file | Unassigned | Seva Alekseyev | Resolved | Fixed |
| TIKA-2127 | NullPointerException on a valid PPTX | Unassigned | Seva Alekseyev | Resolved | Fixed |
| TIKA-2125 | XmlValueOutOfRangeException on a good Word document | Unassigned | Seva Alekseyev | Resolved | Fixed |
| TIKA-2118 | Misleading exception on a password protected XLS | Unassigned | Seva Alekseyev | Resolved | Fixed |
| TIKA-2117 | NullPointerException on PDF (fixed in PDFBox) | Unassigned | Seva Alekseyev | Resolved | Fixed |
| TIKA-2116 | Upgrade to POI 3.16-beta1 when available | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2115 | OOM caused by corrupt embedded OLE object | Unassigned | Thomas Galla | Resolved | Fixed |
| TIKA-2111 | Executable Parser adds Content-Type instead of setting | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2109 | OutOfMemory when parsing 5MB word document | Unassigned | Julian | Resolved | Not A Bug |
| TIKA-2104 | Upgrade to a version of POI that fixes common bugs in macro extraction, when available | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2099 | Tar files without magic bytes are sporadically detected as text | Tim Allison | Robin Schimpf | Resolved | Fixed |
| TIKA-2096 | Supply AutoDetectParser for embedded documents if user forgets to pass it in via ParseContext | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2090 | Extract javascript from PDActions in PDFs | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-2056 | Installing exiftool causes ForkParserIntegration test errors | Konstantin Gribov | Chris A. Mattmann | Resolved | Fixed |
| TIKA-2016 | A parser that combines Apache OpenNLP and Apache Tika and provides facilities for automatically deriving sentiment from text. | Chris A. Mattmann | Anastasija Mensikova | Resolved | Fixed |
| TIKA-1946 | Add mime detection and parser for WordPerfect | Unassigned | Nick C | Resolved | Fixed |
| TIKA-1879 | Extract recipient information in MSG files with more granularity | Unassigned | Tim Allison | Resolved | Fixed |
| TIKA-1865 | Save sender email address in Outlook MSG metadata | Unassigned | Luís Filipe Nassif | Resolved | Fixed |
| TIKA-1822 | NullPointerException when parsing a .doc file | Tim Allison | Panagiotis Mpailis | Resolved | Fixed |
| TIKA-1815 | Text content from parser is empty when NamedEntityParser is enabled | Chris A. Mattmann | Thamme Gowda | Resolved | Fixed |
| TIKA-1658 | unable to parse microsoft visio files with tika | Unassigned | senthil | Resolved | Fixed |
| TIKA-1631 | OutOfMemoryException in ZipContainerDetector | Unassigned | Pavel Micka | Resolved | Fixed |
| TIKA-1508 | Add uniformity to parser parameter configuration | Chris A. Mattmann | Tim Allison | Resolved | Fixed |
| TIKA-1343 | Create a Tika Translator implementation that uses JoshuaDecoder | Lewis John McGibbney | Chris A. Mattmann | Resolved | Fixed |
| TIKA-1332 | TIKA-1302 Create tika-eval module | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-1321 | Add experimental SAX/Streaming XWPF/docx extractor | Tim Allison | Tim Allison | Resolved | Fixed |
| TIKA-1195 | XLSB support | Unassigned | Frederic Ronny | Resolved | Fixed |
| TIKA-456 | Support timeouts for parsers | Tim Allison | Kenneth William Krugler | Resolved | Fixed |