public class TextDocumentExtractor extends AbstractSingleDocumentExtractor
Modifier and Type | Field and Description |
---|---|
static java.lang.String | CONTENT_TYPE: The content type is "text/plain". |
static java.util.List<java.lang.String> | DEFAULT_TEXT_FILE_EXTENSIONS: The default set of file extensions for text files. |
Constructor and Description |
---|
TextDocumentExtractor(): Creates a new TextDocumentExtractor. |
Modifier and Type | Method and Description |
---|---|
boolean | canExtract(java.net.URI uri): Determines if the given file can be extracted by this extractor. |
boolean | canExtract(java.net.URLConnection connection): Determines if the given file can be extracted by this extractor. |
Document | extractDocument(java.net.URLConnection connection): Attempts to extract a document from the given file. |
Methods inherited from superclasses and interfaces |
---|
extractAll, extractAll, extractAll, extractDocument, extractDocument |
canExtract |
clone |
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait (from java.lang.Object) |
canExtract |
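The field DEFAULT_TEXT_FILE_EXTENSIONS suggests that canExtract(java.net.URI) may decide by file extension, although this page does not say so. The sketch below shows one way such a check could look. It is not the library's implementation: the class TextExtensionCheck, the method hasTextExtension, and the extension list ASSUMED_TEXT_EXTENSIONS are hypothetical, and the real contents of DEFAULT_TEXT_FILE_EXTENSIONS are not given here.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical stand-in for illustration; not part of the documented library. */
final class TextExtensionCheck {

    // Assumed values only. The actual DEFAULT_TEXT_FILE_EXTENSIONS list
    // is not shown in this documentation.
    static final List<String> ASSUMED_TEXT_EXTENSIONS =
        Arrays.asList("txt", "text");

    /** Returns true if the last path segment ends in an assumed text extension. */
    static boolean hasTextExtension(URI uri) {
        String path = uri.getPath();
        if (path == null) {
            return false; // opaque URIs such as mailto: have no path
        }
        String name = path.substring(path.lastIndexOf('/') + 1);
        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            return false; // no extension, or a trailing dot
        }
        // Compare case-insensitively so "NOTES.TXT" also matches.
        String extension = name.substring(dot + 1).toLowerCase(Locale.ROOT);
        return ASSUMED_TEXT_EXTENSIONS.contains(extension);
    }

    public static void main(String[] args) {
        System.out.println(hasTextExtension(URI.create("file:///tmp/notes.txt"))); // true
        System.out.println(hasTextExtension(URI.create("file:///tmp/photo.png"))); // false
    }
}
```

A check like this never opens the resource, which is one plausible reason to offer a URI overload alongside the URLConnection overload.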
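public static final java.lang.String CONTENT_TYPE
The content type is "text/plain".

public static final java.util.List<java.lang.String> DEFAULT_TEXT_FILE_EXTENSIONS
The default set of file extensions for text files.

public TextDocumentExtractor()
Creates a new TextDocumentExtractor.

public boolean canExtract(java.net.URI uri) throws java.io.IOException
Description copied from: DocumentExtractor
Determines if the given file can be extracted by this extractor.
Parameters:
uri - The URI of the file to extract.
Throws:
java.io.IOException - If there is an IO error.

public boolean canExtract(java.net.URLConnection connection) throws java.io.IOException
Description copied from: DocumentExtractor
Determines if the given file can be extracted by this extractor.
Parameters:
connection - The connection to the file to extract.
Throws:
java.io.IOException - If there is an IO error.

public Document extractDocument(java.net.URLConnection connection) throws java.io.IOException
Description copied from: SingleDocumentExtractor
Attempts to extract a document from the given file.
Parameters:
connection - The connection to the file to extract.
Throws:
java.io.IOException - If there is an IO error.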
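
To make the extractDocument(java.net.URLConnection) contract more concrete, the sketch below reads the full body of a connection as plain text using only the JDK. It is an illustration under stated assumptions, not the library's code: PlainTextReader and readText are hypothetical names, UTF-8 is an assumed encoding, and a String stands in for the Document return type, whose structure this page does not describe.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for illustration; not part of the documented library. */
final class PlainTextReader {

    /** Reads the whole response body and decodes it as UTF-8 (an assumption). */
    static String readText(URLConnection connection) throws IOException {
        try (InputStream in = connection.getInputStream()) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int count;
            // Copy until end of stream; read() returns -1 when done.
            while ((count = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, count);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Use a temporary local file so the example needs no network access.
        Path file = Files.createTempFile("sample", ".txt");
        try {
            Files.write(file, "hello, text".getBytes(StandardCharsets.UTF_8));
            URLConnection connection = file.toUri().toURL().openConnection();
            System.out.println(readText(connection));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

As in the documented method, any I/O failure surfaces as a java.io.IOException rather than being swallowed, so the caller decides how to handle an unreadable file.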