CaptionBot is software that, according to the developers at Microsoft Cognitive Services, uses Computer Vision and Natural Language to describe the content of images chosen by users. I submitted some of my own artworks to CaptionBot for analysis. These are the results.


I was created to showcase some of the new capabilities of Microsoft Cognitive Services. These new capabilities are the result of years of research advancements (some of them summarized here). Specifically, I use Computer Vision and Natural Language to describe contents of images. I am still learning, so sometimes I get things wrong.

Computer Vision API

This feature returns information about visual content found in an image. Use tagging, descriptions and domain-specific models to identify content and label it with confidence. Apply the adult/racy settings to enable automated restriction of adult content. Identify image types and color schemes in pictures.
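As a rough illustration of "label it with confidence", the sketch below filters tags by a confidence threshold and picks the most confident caption. The response layout (a `tags` list of name/confidence pairs and a `description.captions` list) is an assumption about what such a service returns, the helper names are hypothetical, and the sample values are invented for illustration only; no real service is called.

```python
from typing import Any, Optional


def confident_tags(analysis: dict[str, Any], min_confidence: float = 0.5) -> list[str]:
    """Return tag names at or above the threshold, most confident first."""
    tags = analysis.get("tags", [])
    ranked = sorted(tags, key=lambda t: t["confidence"], reverse=True)
    return [t["name"] for t in ranked if t["confidence"] >= min_confidence]


def best_caption(analysis: dict[str, Any]) -> Optional[str]:
    """Return the text of the highest-confidence caption, or None if absent."""
    captions = analysis.get("description", {}).get("captions", [])
    if not captions:
        return None
    return max(captions, key=lambda c: c["confidence"])["text"]


# Invented sample in the ASSUMED response shape (not real service output).
sample = {
    "tags": [
        {"name": "person", "confidence": 0.18},
        {"name": "painting", "confidence": 0.92},
        {"name": "abstract", "confidence": 0.71},
    ],
    "description": {
        "captions": [
            {"text": "a blurry picture", "confidence": 0.21},
            {"text": "a close up of a painting", "confidence": 0.64},
        ]
    },
}

print(confident_tags(sample))
print(best_caption(sample))
```

Raising `min_confidence` trades recall for precision: for artworks, where the description is often uncertain, a lower threshold shows more of what the model "thinks" it sees.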

Emotion API

The Emotion API takes a facial expression in an image as an input, and returns the confidence across a set of emotions for each face in the image, as well as a bounding box for the face, using the Face API. If a user has already called the Face API, they can submit the face rectangle as an optional input. The emotions detected are anger, contempt, disgust, fear, happiness, neutral, sadness, and surprise. These emotions are understood to be cross-culturally and universally communicated with particular facial expressions.
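To make "confidence across a set of emotions for each face" concrete, here is a minimal sketch that reduces each face to its bounding box and its highest-scoring emotion. The per-face layout (a `faceRectangle` dict plus a `scores` dict keyed by emotion name) is an assumption, the function names are hypothetical, and the sample numbers are invented for illustration.

```python
from typing import Any

# The eight emotions listed in the description above.
EMOTIONS = (
    "anger", "contempt", "disgust", "fear",
    "happiness", "neutral", "sadness", "surprise",
)


def dominant_emotion(scores: dict[str, float]) -> str:
    """Return the emotion with the highest confidence (missing ones count as 0)."""
    return max(EMOTIONS, key=lambda name: scores.get(name, 0.0))


def summarize_faces(faces: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Pair each face's bounding box with its dominant emotion."""
    return [
        {"box": face["faceRectangle"], "emotion": dominant_emotion(face["scores"])}
        for face in faces
    ]


# Invented sample in the ASSUMED response shape (not real service output).
sample_faces = [
    {
        "faceRectangle": {"left": 10, "top": 20, "width": 100, "height": 100},
        "scores": {"happiness": 0.83, "neutral": 0.12, "surprise": 0.05},
    },
    {
        "faceRectangle": {"left": 200, "top": 40, "width": 90, "height": 90},
        "scores": {"sadness": 0.55, "neutral": 0.40, "fear": 0.05},
    },
]

for summary in summarize_faces(sample_faces):
    print(summary["box"], summary["emotion"])
```

Taking only the top score discards information: a face scored 0.55 sadness and 0.40 neutral is ambiguous, so keeping the full score dictionary is usually preferable when the results are interpreted rather than just displayed.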

Bing Image API

Scour the web for images. Results include thumbnails, full image URLs, publishing website info, image metadata, and more. Try out the demo. Submit a query via the search box or click on one of the provided examples.

The full description of the project is here.

