public abstract class AbstractDocumentExtractor extends AbstractCloneableSerializable implements DocumentExtractor

An abstract implementation of the DocumentExtractor interface. It chains together the extraction calls so that subclasses only have to handle the URLConnection calls.

Constructor and Description |
---|
AbstractDocumentExtractor() - Creates a new AbstractDocumentExtractor. |
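
The chaining described in the class summary can be sketched roughly as follows. This is an illustration only, not the library's source: the class names ChainingExtractorSketch and PlainTextExtractorSketch are hypothetical, documents are reduced to plain strings, and the File to URI to URLConnection delegation order is an assumption inferred from the method signatures.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for an abstract extractor. Only the shape of the
// delegation matters; the real class works with Document objects.
abstract class ChainingExtractorSketch {
    // The single method a concrete subclass has to supply.
    abstract List<String> extractAll(URLConnection connection) throws IOException;

    // File-based call: convert to a URI and delegate (assumed behavior).
    List<String> extractAll(File file) throws IOException {
        return extractAll(file.toURI());
    }

    // URI-based call: open a connection and delegate (assumed behavior).
    List<String> extractAll(URI uri) throws IOException {
        return extractAll(uri.toURL().openConnection());
    }
}

// Hypothetical subclass that treats the whole stream as one text document.
class PlainTextExtractorSketch extends ChainingExtractorSketch {
    @Override
    List<String> extractAll(URLConnection connection) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
            // Join the lines back together into a single document string.
            String text = reader.lines().collect(Collectors.joining("\n"));
            return Collections.singletonList(text);
        }
    }
}
```

With this layout, calling extractAll with a File or a URI on the subclass ends up in the one URLConnection method it overrides, so format-specific parsing lives in a single place.
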
Modifier and Type | Method and Description |
---|---|
boolean | canExtract(java.io.File file) - Determines if the given file can be extracted by this extractor. |
java.lang.Iterable<? extends Document> | extractAll(java.io.File file) - Attempts to extract all of the documents from the given file. |
java.lang.Iterable<? extends Document> | extractAll(java.net.URI uri) - Attempts to extract all of the documents from the file at the given URI. |
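
Methods inherited from class AbstractCloneableSerializable: clone
Methods inherited from class java.lang.Object: equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface DocumentExtractor: canExtract, canExtract, extractAll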

public AbstractDocumentExtractor()
Creates a new AbstractDocumentExtractor.

public boolean canExtract(java.io.File file) throws java.io.IOException
Description copied from interface: DocumentExtractor
Determines if the given file can be extracted by this extractor.
Specified by:
canExtract in interface DocumentExtractor
Parameters:
file - The file to extract.
Throws:
java.io.IOException - If there is an IO error.

public java.lang.Iterable<? extends Document> extractAll(java.io.File file) throws DocumentExtractionException, java.io.IOException
Description copied from interface: DocumentExtractor
Attempts to extract all of the documents from the given file.
Specified by:
extractAll in interface DocumentExtractor
Parameters:
file - The file to extract.
Throws:
DocumentExtractionException - If there is an error extracting data from the file.
java.io.IOException - If there is an IO error.

public java.lang.Iterable<? extends Document> extractAll(java.net.URI uri) throws DocumentExtractionException, java.io.IOException
Description copied from interface: DocumentExtractor
Attempts to extract all of the documents from the file at the given URI.
Specified by:
extractAll in interface DocumentExtractor
Parameters:
uri - The URI of the file to extract.
Throws:
DocumentExtractionException - If there is an error extracting data from the file.
java.io.IOException - If there is an IO error.
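
A caller will typically check canExtract before calling extractAll and handle the two declared exceptions separately. The sketch below shows one way to do that. It uses simplified stand-ins rather than the library types: FileExtractorSketch, ExtractionFailureSketch, and ExtractorCallerSketch are hypothetical names, documents are plain strings, and treating the extraction exception as a checked exception unrelated to IOException is an assumption.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Stand-in for DocumentExtractionException. Assumed here to be a checked
// exception that is not a subclass of IOException.
class ExtractionFailureSketch extends Exception {
    ExtractionFailureSketch(String message) {
        super(message);
    }
}

// Stand-in exposing only the two File-based methods documented above.
interface FileExtractorSketch {
    boolean canExtract(File file) throws IOException;

    Iterable<String> extractAll(File file) throws ExtractionFailureSketch, IOException;
}

class ExtractorCallerSketch {
    // Returns the extracted documents, or an empty list when the extractor
    // does not support the file or the content cannot be parsed.
    // IO errors are left to propagate to the caller.
    static List<String> extractOrEmpty(FileExtractorSketch extractor, File file)
            throws IOException {
        if (!extractor.canExtract(file)) {
            return Collections.emptyList();
        }
        List<String> documents = new ArrayList<>();
        try {
            for (String document : extractor.extractAll(file)) {
                documents.add(document);
            }
        } catch (ExtractionFailureSketch e) {
            // The file was readable but its content could not be extracted.
            System.err.println("Could not extract " + file + ": " + e.getMessage());
            return Collections.emptyList();
        }
        return documents;
    }
}
```

Letting IOException propagate while absorbing the extraction failure is only one possible policy; a batch job might prefer to log both and continue with the next file.
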