Example 1 with DocumentMetadata

Use of software.amazon.awssdk.services.textract.model.DocumentMetadata in project aws-doc-sdk-examples by awsdocs.

From the class DetectDocumentText, method detectDocText.

public static void detectDocText(TextractClient textractClient, String sourceDoc) {
    // try-with-resources closes the stream even when the Textract call fails
    try (InputStream sourceStream = new FileInputStream(new File(sourceDoc))) {
        // Read the whole document into memory as bytes
        SdkBytes sourceBytes = SdkBytes.fromInputStream(sourceStream);
        Document myDoc = Document.builder().bytes(sourceBytes).build();
        DetectDocumentTextRequest detectDocumentTextRequest = DetectDocumentTextRequest.builder().document(myDoc).build();
        // Invoke the synchronous DetectDocumentText operation
        DetectDocumentTextResponse textResponse = textractClient.detectDocumentText(detectDocumentTextRequest);
        // Print the type of every block that Textract returned
        for (Block block : textResponse.blocks()) {
            System.out.println("The block type is " + block.blockType());
        }
        DocumentMetadata documentMetadata = textResponse.documentMetadata();
        System.out.println("The number of pages in the document is " + documentMetadata.pages());
    } catch (FileNotFoundException e) {
        System.err.println("File not found: " + e.getMessage());
        System.exit(1);
    } catch (TextractException | IOException e) {
        // java.io.IOException can be thrown when the stream is closed
        System.err.println(e.getMessage());
        System.exit(1);
    }
}
Also used : FileInputStream(java.io.FileInputStream) InputStream(java.io.InputStream) TextractException(software.amazon.awssdk.services.textract.model.TextractException) FileNotFoundException(java.io.FileNotFoundException) Document(software.amazon.awssdk.services.textract.model.Document) DetectDocumentTextRequest(software.amazon.awssdk.services.textract.model.DetectDocumentTextRequest) DetectDocumentTextResponse(software.amazon.awssdk.services.textract.model.DetectDocumentTextResponse) SdkBytes(software.amazon.awssdk.core.SdkBytes) DocumentMetadata(software.amazon.awssdk.services.textract.model.DocumentMetadata) Block(software.amazon.awssdk.services.textract.model.Block) File(java.io.File)
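
The example above loads the document through a FileInputStream. As a possible alternative, the bytes could be read with java.nio.file.Files, which opens and closes the file on its own. The sketch below uses only the JDK; the class DocumentBytesLoader and the method readDocumentBytes are hypothetical names and are not part of aws-doc-sdk-examples. The returned array could then be wrapped with SdkBytes.fromByteArray(...) before building the Document.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class DocumentBytesLoader {

    // Hypothetical helper: returns the full content of the file as a byte array.
    // Files.readAllBytes opens and closes the file itself, so no stream stays open.
    static byte[] readDocumentBytes(Path sourceDoc) throws IOException {
        if (!Files.isRegularFile(sourceDoc)) {
            throw new IOException("Not a regular file: " + sourceDoc);
        }
        return Files.readAllBytes(sourceDoc);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in document: a temporary file with three bytes
        Path tmp = Files.createTempFile("sample-doc", ".bin");
        try {
            Files.write(tmp, new byte[] {1, 2, 3});
            byte[] bytes = readDocumentBytes(tmp);
            System.out.println("Read " + bytes.length + " bytes"); // prints "Read 3 bytes"
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Reading the whole file into memory is only reasonable for small documents; larger inputs are usually better referenced from Amazon S3, as in the next example.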

Example 2 with DocumentMetadata

Use of software.amazon.awssdk.services.textract.model.DocumentMetadata in project aws-doc-sdk-examples by awsdocs.

From the class DetectDocumentTextS3, method detectDocTextS3.

public static void detectDocTextS3(TextractClient textractClient, String bucketName, String docName) {
    try {
        S3Object s3Object = S3Object.builder().bucket(bucketName).name(docName).build();
        // Create a Document object and reference the s3Object instance
        Document myDoc = Document.builder().s3Object(s3Object).build();
        // Create a DetectDocumentTextRequest object
        DetectDocumentTextRequest detectDocumentTextRequest = DetectDocumentTextRequest.builder().document(myDoc).build();
        // Invoke the detectDocumentText method
        DetectDocumentTextResponse textResponse = textractClient.detectDocumentText(detectDocumentTextRequest);
        // Print the type of every block that Textract returned
        for (Block block : textResponse.blocks()) {
            System.out.println("The block type is " + block.blockType());
        }
        DocumentMetadata documentMetadata = textResponse.documentMetadata();
        System.out.println("The number of pages in the document is " + documentMetadata.pages());
    } catch (TextractException e) {
        System.err.println(e.getMessage());
        System.exit(1);
    }
}
Also used : DetectDocumentTextResponse(software.amazon.awssdk.services.textract.model.DetectDocumentTextResponse) DocumentMetadata(software.amazon.awssdk.services.textract.model.DocumentMetadata) TextractException(software.amazon.awssdk.services.textract.model.TextractException) Block(software.amazon.awssdk.services.textract.model.Block) S3Object(software.amazon.awssdk.services.textract.model.S3Object) Document(software.amazon.awssdk.services.textract.model.Document) DetectDocumentTextRequest(software.amazon.awssdk.services.textract.model.DetectDocumentTextRequest)
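
Both examples print one line per block, which gets long for multi-page documents. A short summary per block type may be easier to read. The sketch below shows one way to tally type names with the JDK only. BlockTypeTally and tally are hypothetical names, and the input list is stand-in data; with the SDK, the names would presumably come from block.blockTypeAsString() for each returned Block.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class BlockTypeTally {

    // Hypothetical helper: counts how often each block type name occurs.
    // A TreeMap keeps the keys sorted, so the summary prints in a stable order.
    static Map<String, Integer> tally(List<String> blockTypeNames) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : blockTypeNames) {
            counts.merge(name, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Stand-in data instead of a real Textract response
        List<String> names = Arrays.asList("PAGE", "LINE", "WORD", "WORD", "LINE", "WORD");
        System.out.println(tally(names)); // prints {LINE=2, PAGE=1, WORD=3}
    }
}
```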

Aggregations

Block (software.amazon.awssdk.services.textract.model.Block) 2
DetectDocumentTextRequest (software.amazon.awssdk.services.textract.model.DetectDocumentTextRequest) 2
DetectDocumentTextResponse (software.amazon.awssdk.services.textract.model.DetectDocumentTextResponse) 2
Document (software.amazon.awssdk.services.textract.model.Document) 2
DocumentMetadata (software.amazon.awssdk.services.textract.model.DocumentMetadata) 2
TextractException (software.amazon.awssdk.services.textract.model.TextractException) 2
File (java.io.File) 1
FileInputStream (java.io.FileInputStream) 1
FileNotFoundException (java.io.FileNotFoundException) 1
InputStream (java.io.InputStream) 1
SdkBytes (software.amazon.awssdk.core.SdkBytes) 1
S3Object (software.amazon.awssdk.services.textract.model.S3Object) 1