In this lab, you will verify that the previous lab produced no image processing results, and fix that by adding image analysis skills to our pipeline.
The provided dataset contains png and jpg images, and if you brought your own data, you were encouraged to include images as well. However, we have not yet added any predefined skills for image analysis. That is exactly what you will do now. First, though, let's use steps 1 and 2 to see the kind of problems to expect when only the Language Detection, Text Split, Named Entity Recognition, and Key Phrase Extraction skills are applied to images.
Let's check the indexer status again; it contains valuable information about our "images problem". You can use the same command from the previous lab (pasted below for convenience). If you used a different indexer name, just change it in the URL.
GET https://[your-service-name].search.windows.net/indexers/demoindexer/status?api-version=2019-05-06
Content-Type: application/json
api-key: [api-key]
If you check the response messages for any of the png or jpg files in the results, you will see warnings about missing text for the images.
Let's repeat another request from the previous lab, this time with a different analysis in mind: you will re-execute the step that verifies the content.
GET https://[your-service-name].search.windows.net/indexes/demoindex/docs?search=*&$select=blob_uri,organizations,languageCode,keyPhrases&api-version=2019-05-06
api-key: [api-key]
Send the request and look at any result for an image file (jpg or png). Note that no organizations, languageCode, or keyPhrases values are returned for these files. That's because they were never created.
Tip: to locate an image result in the response, use the magnifying glass button in the top right of the results screen. Click it to open the Search box, type jpg or png in the text box, and click the Find Next button.
The next steps will guide you through a challenge. Don't worry if you get stuck (that's why it's a challenge!); we will share the finished solution, too.
Two of the nine predefined skills are related to image analysis. Your first assignment is to read about how to use them at this link.
You will add OCR to the cognitive search pipeline; this skill reads text from the images within our dataset. Here is a link where you can read more details.
Note: For now, Cognitive Search uses OCR V2 (preview) for English and V1 for other languages. This may change in the future.
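For reference, a minimal OCR skill definition looks roughly like the sketch below, based on the public skill reference. The output name myOcrText matches the field name used in the tips later in this lab; adjust it if you pick another name.
{
  "@odata.type": "#Microsoft.Skills.Vision.OcrSkill",
  "description": "Extract text from images in the documents",
  "context": "/document/normalized_images/*",
  "defaultLanguageCode": "en",
  "detectOrientation": true,
  "inputs": [
    { "name": "image", "source": "/document/normalized_images/*" }
  ],
  "outputs": [
    { "name": "text", "targetName": "myOcrText" }
  ]
}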
Image skills, like OCR and Image Analysis, are more computationally expensive than text skills. Behind the scenes, Microsoft runs deep learning algorithms on your data, so expect the indexer to run longer than it did with the text-only skillset.
Note: Currently OCR only works with the "/document/normalized_images" field, produced by the Azure Blob indexer when imageAction is set to generateNormalizedImages. As part of document cracking, there is a new set of indexer configuration parameters for handling image files or images embedded in files. These parameters are used to normalize images for further downstream processing. Normalizing makes images more uniform: large images are resized to a maximum height and width to make them consumable, and for images that provide orientation metadata, rotation is adjusted for vertical loading. Metadata adjustments are captured in a complex type created for each image.
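In practice, this means your new indexer definition needs a configuration block like the sketch below. The parameter names follow the blob indexer documentation; the max width/height values shown here are the documented defaults.
"parameters": {
  "configuration": {
    "dataToExtract": "contentAndMetadata",
    "imageAction": "generateNormalizedImages",
    "normalizedImageMaxWidth": 2000,
    "normalizedImageMaxHeight": 2000
  }
}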
Click here and review how the Enrichment Pipeline works. This content will help you with the challenge below. At the end of the page there is a link to return to this lab.
You need to prepare the environment for the image analysis you will add. The most practical approach is to delete the objects from Azure Search and rebuild them. This avoids redundancy of similar information and also reduces cost, since two replicated/similar indexes would consume space on the service. Last but not least, teaching about DELETEs is also an objective of this training. You will delete everything except the data source. Resource names are unique, so after deleting an object you can recreate it using the same name.
Save all the scripts (API calls) you've used up to this point, including the definition JSON files from the request bodies. Let's start by deleting the index, the indexer, and the skillset. You can use the Azure Portal or API calls:
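If you kept the default names from the previous lab, the calls look like this (demoskillset is a guess for your skillset name; replace any name you changed):
DELETE https://[your-service-name].search.windows.net/indexes/demoindex?api-version=2019-05-06
api-key: [api-key]
DELETE https://[your-service-name].search.windows.net/indexers/demoindexer?api-version=2019-05-06
api-key: [api-key]
DELETE https://[your-service-name].search.windows.net/skillsets/demoskillset?api-version=2019-05-06
api-key: [api-key]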
Status code 204 is returned on a successful deletion. The deletion order doesn't matter; when creating these objects, however, the indexer must come last, since it references the others.
Tip: You could also update the index instead of deleting and recreating it. Adding a new field is one of the situations where this method works. Click here to learn more about it.
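As a sketch, an in-place update is a PUT of the full index definition with the new field appended. You must repeat all existing fields, since Azure Search does not allow removing fields from an existing index; the placeholder line below stands for them.
PUT https://[your-service-name].search.windows.net/indexes/demoindex?api-version=2019-05-06
Content-Type: application/json
api-key: [api-key]
{
  "name": "demoindex",
  "fields": [
    ...all the existing field definitions, unchanged...,
    { "name": "myOcrText", "type": "Collection(Edm.String)" }
  ]
}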
In this challenge, you will perform the following steps:
- Create the services at the portal - Not required, we did not delete them.
- Create the Data Source - Not required, we did not delete it.
- Recreate the Skillset
- Recreate the Index
- Recreate the Indexer
- Check the Indexer Status - if the result is the same as before, something went wrong.
- Check the Index Fields - check the image field you just created.
- Check the data - if the results are the same as before, something went wrong.
Use the same skillset definition from the previous lab, adding the OCR image analysis skill to your skillset. The objectives are:
- Save the text extracted from OCR into the index.
- Submit the text extracted from OCR, and also the content field (extracted by default from all text documents), to language detection, key phrases, and entity detection. You will need another predefined skill to merge the text, since you can't use the same skill twice in the same skillset. Finding the correct skill and how to use it is part of the challenge.
Skipping the services and the data source creation, repeat the other steps of the previous lab, in the same order. Use the same scripts as a reference.
TIP 1: What you need to do:
- Create a new index exactly like the one from the previous lab, but with an extra field for the OCR text from the images. Name the new field myOcrText. You can use the same JSON body and add the new OCR field at the end. If you decide to use a different name, you will need to change the Bot code to make it work.
- Create a new indexer exactly like the one from the previous lab, but with an extra mapping for the new skill and the new field listed above. You can use the same JSON body and add the new OCR mapping at the end.
- Check the indexer execution status as you did in the previous lab.
TIP 2: Your new field in the Index must have the Collection Data Type.
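In the index definition's fields array, that could look like the entry below. The attribute settings are a reasonable default for this lab; note that a collection field cannot be sortable.
{ "name": "myOcrText", "type": "Collection(Edm.String)", "searchable": true, "filterable": false, "sortable": false, "facetable": false }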
TIP 3: Your indexer sourceFieldName for the OCR text field has to be /document/normalized_images/*/myOcrText if your field is named myOcrText.
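In the indexer definition, that becomes one more entry in outputFieldMappings, alongside the mappings you already have (a sketch):
"outputFieldMappings": [
  { "sourceFieldName": "/document/normalized_images/*/myOcrText", "targetFieldName": "myOcrText" }
]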
TIP 4: Now your skillset has image skills. The indexer processing time will be longer than what you saw in the last lab; expect it to take up to 10 minutes.
Run the same query from Step 2; the URL is pasted below. Now you should see organizations, languageCode, and keyPhrases for most of the images.
GET https://[your-service-name].search.windows.net/indexes/demoindex/docs?search=*&$select=blob_uri,organizations,languageCode,keyPhrases&api-version=2019-05-06
Content-Type: application/json
api-key: [api-key]
Now run the query below to check the OCR text extracted from the images. You should see text for most of the images.
GET https://[your-service-name].search.windows.net/indexes/demoindex/docs?search=*&$select=blob_uri,myOcrText&api-version=2019-05-06
Content-Type: application/json
api-key: [api-key]
Log into the Azure portal and verify the creation of the skillset, index, and indexer in the Azure Search dashboard. If nothing is missing, use the Search Explorer to run the searches below. Click the file URLs (Ctrl+click) to check whether the AI services created the metadata as expected.
- Search for "linux"
search=myOcrText:linux&querytype=full
- Search for "microsoft"
search=myOcrText:microsoft&querytype=full
- Search for "Learning", which will show you an image of the portal of the LearnAI Team, who created this training.
search=myOcrText:Learning&querytype=full
If you could not complete the challenge, here is the solution. You just need to follow the steps.