Image Recognition Microservice
This is a standalone microservice for image recognition and question answering. It can analyze an image and respond to specific questions about its content. The server is built using Python and moondream, a small vision language model designed to run efficiently on edge devices (more on Hugging Face).
- Recognizes objects and scenes in images.
- Answers questions related to the image's content.
- Simple REST API for integration.
> The image depicts the animated character Homer Simpson in a room, pointing to a drawing of a car on a whiteboard.
Request:
```bash
curl -X POST http://127.0.0.1:5000/ -H "Authorization: Bearer 123" -F "image=@./assets/example.png"
```
Response:
```json
{
  "answer": "The image depicts the animated character Homer Simpson in a room, pointing to a drawing of a car on a whiteboard.",
  "question": "Describe this image"
}
```
> The image depicts a scene from the animated television series "The Simpsons". The central figure is Homer Simpson, a renowned character known for his love of cars. He is standing in front of a whiteboard, which displays a drawing of a car. Homer is pointing towards the drawing, suggesting he is explaining or admiring it. The background is a vibrant purple color, providing a contrast to the whiteboard and the yellow figure of Homer.
Request:
```bash
curl -X POST http://127.0.0.1:5000/ -H "Authorization: Bearer 123" -F "image=@./assets/example.png" -F "question=Describe in detail this image"
```
Response:
```json
{
  "answer": "The image depicts a scene from the animated television series \"The Simpsons\". The central figure is Homer Simpson, a renowned character known for his love of cars. He is standing in front of a whiteboard, which displays a drawing of a car. Homer is pointing towards the drawing, suggesting he is explaining or admiring it. The background is a vibrant purple color, providing a contrast to the whiteboard and the yellow figure of Homer.",
  "question": "Describe in detail this image"
}
```
> The color of skin in the image is yellow.
Request:
```bash
curl -X POST http://127.0.0.1:5000/ -H "Authorization: Bearer 123" -F "image=@./assets/example.png" -F "question=What color is the skin?"
```
Response:
```json
{
  "answer": "The color of skin in the image is yellow.",
  "question": "What color is the skin?"
}
```
- Image Recognition Microservice
- Features
- Brief Example
- Describe this image (default question)
- Describe in detail this image
- What color is the skin?
- Table of Contents
- Server
- Prerequisites
- Installation
- Configuration
- Start the server
- Dependencies
- API Usage
- Endpoint
- Headers
- Form Data
- Example Request
- Example Response
- Example Response with Error
- Node.js Image Recognition Client
- Installation
- Usage
- API
- Notes
- Project Structure
- License
- Contributing
- Support
- Created by
1. Install Docker and Docker Compose.
2. Prepare an .env file for API token configuration.
1. Clone this repository:
   ```bash
   git clone
   cd
   ```
2. Edit the .env file and set the API_TOKEN:
   ```plaintext
   API_TOKEN=your_secret_token
   ```
3. Build and run the service using Docker Compose:
   ```bash
   docker-compose up --build
   ```
4. The API will be available at http://127.0.0.1:5000/.
Set the API_TOKEN in the .env file to secure the API. Example:
```plaintext
API_TOKEN=your_secret_token
```
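A caller has to send the same token back in the Authorization header. The following is a minimal sketch of reading API_TOKEN from the text of a .env file and turning it into that header. The helper names `parseEnv` and `bearerHeader` are hypothetical and not part of this project, and the parser only handles plain `KEY=VALUE` lines.

```typescript
// Parse simple KEY=VALUE lines, skipping blank lines and # comments.
function parseEnv(text: string): Record<string, string> {
  const vars: Record<string, string> = {};
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (line === '' || line.startsWith('#')) continue;
    const eq = line.indexOf('=');
    if (eq <= 0) continue; // no key before '=', or no '=' at all
    vars[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return vars;
}

// Build the header the server expects: "Authorization: Bearer <token>".
function bearerHeader(env: Record<string, string>): { Authorization: string } {
  const token = env['API_TOKEN'];
  if (!token) throw new Error('API_TOKEN is not set');
  return { Authorization: `Bearer ${token}` };
}
```

For example, `bearerHeader(parseEnv('API_TOKEN=your_secret_token'))` yields a header whose value is `Bearer your_secret_token`.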
To start the server, run the following commands:
```bash
sudo docker compose build
sudo docker compose up
```
After running the commands, wait for the following message to appear in the terminal:
```plaintext
api-1 | * Serving Flask app 'server'
api-1 | * Debug mode: off
api-1 | WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
api-1 | * Running on all addresses (0.0.0.0)
api-1 | * Running on http://127.0.0.1:5000
api-1 | * Running on http://172.21.0.2:5000
api-1 | Press CTRL+C to quit
```
The server will be available at http://localhost:5000/.
- Python 3.10
- Flask
- PyTorch
- Pillow
- Transformers
- Einops
POST /
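- Authorization: Bearer token, where the token is the API_TOKEN value from the .env file (for example, Bearer your_secret_token).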
- `image`: The image file to be analyzed (e.g., .jpg, .png).
- `question`: A string containing the question about the image. Optional; the default value is `Describe this image.`
Request:
```bash
curl -X POST http://127.0.0.1:5000/ \
  -H "Authorization: Bearer your_secret_token" \
  -F "image=@./assets/example.png" \
  -F "question=Describe this image."
```
Response:
```json
{
  "question": "Describe this image.",
  "answer": "A close-up image of a pile of ripe, red strawberries with green leaves."
}
```
Response with error:
```json
{
  "question": "Describe this image.",
  "error": "Invalid image format."
}
```
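The same request can be made from TypeScript without the bundled client. The following is a minimal sketch, assuming a runtime with global `fetch`, `FormData`, and `Blob` (Node.js 18+ or a browser). The helper names `buildForm`, `isError`, and `ask` are hypothetical and not part of this project, and the sketch assumes the server always replies with a JSON body shaped like the examples above.

```typescript
// Shape of the JSON body shown in the example responses.
type RecognitionResponse = { question: string; answer?: string; error?: string };

// Build the multipart form: `image` is required, `question` is optional.
function buildForm(image: Blob, question?: string): FormData {
  const form = new FormData();
  form.append('image', image, 'image.png'); // file name is an arbitrary placeholder
  if (question !== undefined) form.append('question', question);
  return form;
}

// True when the server reported a failure instead of an answer.
function isError(res: RecognitionResponse): boolean {
  return typeof res.error === 'string';
}

// POST the form to the endpoint with the bearer token.
// Assumption: error replies are JSON too; a non-JSON body makes json() reject.
async function ask(
  baseUrl: string,
  token: string,
  image: Blob,
  question?: string,
): Promise<RecognitionResponse> {
  const response = await fetch(baseUrl, {
    method: 'POST',
    headers: { Authorization: `Bearer ${token}` },
    body: buildForm(image, question),
  });
  return (await response.json()) as RecognitionResponse;
}
```

Leaving `question` out of the form lets the server fall back to its default question, which mirrors the first curl example.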
This client provides a simple interface to the Image Recognition server. It lets you recognize objects and scenes in images and ask questions about them.
See the Server section to set up the server.
To install the image recognition client, use npm:
`bash`
npm i -S image-recognition-microservice
Here's a basic example of how to use the image recognition client:
```typescript
import { readFile } from 'node:fs/promises';
import ImageRecogniton from 'image-recognition-microservice';

// Initialize the client with the server URL
const imageRecognition = new ImageRecogniton('http://localhost:5000');

// Read the image from disk and wrap it in a File object
const imageBuffer = await readFile('path/to/your/image.jpg');
const image = new File([imageBuffer], 'image.jpg', { type: 'image/jpeg' });

// Ask a question about the image
const result = await imageRecognition.recognize(image, 'Describe this image.');

// Log the result
console.log(result);
```
In this example, we're reading an image file from disk, creating a File object, and passing it to the recognize method along with a question. The method returns a promise that resolves to an object with the answer to the question.
The ImageRecogniton class provides the following method:
- `recognize(file: File | Blob, question: string): Promise<{ question: string, answer?: string, error?: string }>`
Sends the provided file and question to the server. Returns a promise that resolves to an object with:
- `answer`: a string with the answer to the question asked about the image.
- `error`: a string with an error message if the file is not recognized.
- `question`: the question that was asked.
- Make sure the [Image Recognition Server](#server) is running and accessible at the URL you provide when initializing the ImageRecogniton client.
- The client works with both `File` and `Blob` objects, making it flexible for various use cases.
- Error handling is built into the client. If there's an error communicating with the server, the `recognize` method will return `{ error: 'Error message', question: 'The question asked' }`.
For more information on setting up and using the server, refer to the Image Recognition Server documentation above.
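Because failures come back as a value rather than a thrown exception, callers need to check `error` before reading `answer`. A minimal sketch of that check follows. The `summarize` helper is hypothetical, and the result type is copied from the `recognize` signature in the API section.

```typescript
// Result shape documented for `recognize` in the API section.
type RecognizeResult = { question: string; answer?: string; error?: string };

// Turn a result into a single printable line, checking `error` first.
function summarize(result: RecognizeResult): string {
  if (result.error !== undefined) {
    return `Failed to answer "${result.question}": ${result.error}`;
  }
  // `answer` is optional in the type, so fall back to a placeholder.
  return result.answer ?? '(no answer returned)';
}
```

With the client from the usage example, this could be called as `console.log(summarize(await imageRecognition.recognize(image, 'Describe this image.')))`.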
- `Dockerfile`: Docker configuration for the service.
- `docker-compose.yaml`: Docker Compose configuration.
- `requirements.txt`: Python dependencies.
- `src/server.py`: Server implementation.
- `src/client.js`: Node.js client.
- `.env.example`: Example of environment variables.
This project is licensed under the MIT License. See the `LICENSE` file for details.
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
If you encounter any problems or have questions, please open an issue in the project repository.
Dimitry Ivanov <2@ivanoff.org.ua> # curl -A cv ivanoff.org.ua