AbstractDocumentExtractor (Cognitive Foundry)

java.lang.Object
- gov.sandia.cognition.util.AbstractCloneableSerializable
- - gov.sandia.cognition.text.document.extractor.AbstractDocumentExtractor

All Implemented Interfaces:

DocumentExtractor, CloneableSerializable, java.io.Serializable, java.lang.Cloneable

Direct Known Subclasses:

AbstractSingleDocumentExtractor
```
public abstract class AbstractDocumentExtractor
extends AbstractCloneableSerializable
implements DocumentExtractor
```
An abstract implementation of the DocumentExtractor interface. It chains together the extraction calls so that subclasses only have to handle the URLConnection calls.
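The chaining described above can be illustrated with a minimal, self-contained sketch. It does not use the Cognitive Foundry classes: `SketchExtractor`, `AbstractSketchExtractor`, and `PlainTextSketchExtractor` are hypothetical stand-ins, `String` stands in for `Document`, and the interface is simplified. The sketch assumes that the `File` overloads forward to the `URI` overloads and that the `URI` overload opens a `URLConnection`; the library's actual implementation may differ.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Collections;

// Hypothetical, simplified stand-in for the DocumentExtractor interface.
// String is used in place of Document to keep the sketch self-contained.
interface SketchExtractor {
    boolean canExtract(File file) throws IOException;
    boolean canExtract(URI uri) throws IOException;
    Iterable<String> extractAll(File file) throws IOException;
    Iterable<String> extractAll(URI uri) throws IOException;
    Iterable<String> extractAll(URLConnection connection) throws IOException;
}

// Assumed shape of the chaining: File -> URI -> URLConnection.
abstract class AbstractSketchExtractor implements SketchExtractor {
    @Override
    public boolean canExtract(File file) throws IOException {
        // Forward the File overload to the URI overload.
        return canExtract(file.toURI());
    }

    @Override
    public Iterable<String> extractAll(File file) throws IOException {
        // Forward the File overload to the URI overload.
        return extractAll(file.toURI());
    }

    @Override
    public Iterable<String> extractAll(URI uri) throws IOException {
        // Open a connection and hand it to the subclass.
        return extractAll(uri.toURL().openConnection());
    }
}

// Hypothetical subclass: only the URI check and the URLConnection
// extraction are written here; the other overloads are inherited.
class PlainTextSketchExtractor extends AbstractSketchExtractor {
    @Override
    public boolean canExtract(URI uri) {
        String path = uri.getPath();
        return path != null && path.endsWith(".txt");
    }

    @Override
    public Iterable<String> extractAll(URLConnection connection) throws IOException {
        // Treat the whole resource as a single UTF-8 text document.
        try (InputStream in = connection.getInputStream()) {
            String text = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            return Collections.singletonList(text);
        }
    }
}

class ChainingSketchDemo {
    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("sketch", ".txt");
        file.deleteOnExit();
        Files.write(file.toPath(), "hello".getBytes(StandardCharsets.UTF_8));

        SketchExtractor extractor = new PlainTextSketchExtractor();
        System.out.println(extractor.canExtract(file));
        for (String document : extractor.extractAll(file)) {
            System.out.println(document);
        }
    }
}
```

In this arrangement a caller can pass a `File` or a `URI` and reach the same subclass code, which is why a concrete extractor only needs to deal with the connection itself.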

Since:

3.0

Author:

Justin Basilico

See Also:

Serialized Form

Constructor Summary

Constructors
Constructor and Description
`AbstractDocumentExtractor()` Creates a new `AbstractDocumentExtractor`.

Method Summary

Modifier and Type	Method and Description
`boolean`	`canExtract(java.io.File file)` Determines if the given file can be extracted by this extractor.
`java.lang.Iterable<? extends Document>`	`extractAll(java.io.File file)` Attempts to extract all of the documents from the given file.
`java.lang.Iterable<? extends Document>`	`extractAll(java.net.URI uri)` Attempts to extract all of the documents from the resource at the given URI.

Methods inherited from class gov.sandia.cognition.util.AbstractCloneableSerializable
clone

Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface gov.sandia.cognition.text.document.extractor.DocumentExtractor
canExtract, canExtract, extractAll

- Constructor Detail
  - AbstractDocumentExtractor
```
public AbstractDocumentExtractor()
```
    Creates a new AbstractDocumentExtractor.
- Method Detail
  - canExtract
```
public boolean canExtract(java.io.File file)
                   throws java.io.IOException
```
    Description copied from interface: DocumentExtractor
    
    Determines if the given file can be extracted by this extractor.
    
    Specified by:
    
    canExtract in interface DocumentExtractor
    
    Parameters:
    
    file - The file to extract.
    
    Returns:
    
    `true` if this extractor can extract the file; `false` otherwise.
    
    Throws:
    
    java.io.IOException - If there is an IO error.
  - extractAll
```
public java.lang.Iterable<? extends Document> extractAll(java.io.File file)
                                                  throws DocumentExtractionException,
                                                         java.io.IOException
```
    Description copied from interface: DocumentExtractor
    
    Attempts to extract all of the documents from the given file.
    
    Specified by:
    
    extractAll in interface DocumentExtractor
    
    Parameters:
    
    file - The file to extract.
    
    Returns:
    
    The documents extracted from the given file.
    
    Throws:
    
    DocumentExtractionException - If there is an error extracting data from the file.
    
    java.io.IOException - If there is an IO error.
  - extractAll
```
public java.lang.Iterable<? extends Document> extractAll(java.net.URI uri)
                                                  throws DocumentExtractionException,
                                                         java.io.IOException
```
    Description copied from interface: DocumentExtractor
    
    Attempts to extract all of the documents from the resource at the given URI.
    
    Specified by:
    
    extractAll in interface DocumentExtractor
    
    Parameters:
    
    uri - The URI of the file to extract.
    
    Returns:
    
    The documents extracted from the resource at the given URI.
    
    Throws:
    
    DocumentExtractionException - If there is an error extracting data from the file.
    
    java.io.IOException - If there is an IO error.
