PharmaScan with Gemini

Aashi Dutt
2 min read · Jan 1, 2024

in collaboration with Nitin Tiwari (ML GDE)

Imagine waking up feeling under the weather. The thought of digging through your medicine box sends shivers down your spine: you check every medicine in it, not knowing which one suits your symptoms best.

Introducing the PharmaScan app, powered by the Gemini Pro Vision API, which lets you scan your medicines and analyze them for instant prescription information.

👀Gemini Pro Vision Model

Gemini Pro Vision is a versatile multimodal model designed to comprehend both textual and visual inputs, including images and videos. It excels in tasks like visual comprehension, classification, summarization, and generating textual content based on visual and textual information. This model is proficient at analyzing various forms of data, such as photos, documents, infographics, and screenshots, making it a valuable tool for a wide range of applications.

Gemini models are easy to use, almost plug-and-play. Once you configure your API key, the vision model takes in images and returns text. Here is a small code snippet to get you started in Google Colab.

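# Import the Generative AI module and PIL for loading the image
import google.generativeai as genai
import PIL.Image

# Authenticate with your Gemini API key
genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel('gemini-pro-vision')

# Load the image to analyze (replace the path with your own file)
image = PIL.Image.open("medicine.jpg")

# Image-to-text generation
response = model.generate_content(["YOUR PROMPT GOES HERE", image])
print(response.text)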

📱About the App

The app uses the Gemini Pro Vision API to read the text printed on the medicine package. It then draws on the model's training as a large language model to draft a prescription-style summary with the vital information about the medicine: its name, the symptoms it treats, diagnosis, dosage, and method of use.
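To make the idea concrete, here is a minimal sketch of how such a prompt could be built and how the model's reply could be turned into structured fields. This is not the app's actual prompt or parsing logic: the field list is taken from the description above, while the names `FIELDS`, `build_prompt`, and `parse_response` and the "Field: value" reply format are assumptions made for illustration.

```python
# Hypothetical field list, based on the information the app is described as returning.
FIELDS = ["Medicine name", "Symptoms it treats", "Diagnosis", "Dosage", "Method of use"]


def build_prompt(fields=FIELDS):
    """Build a text prompt asking the model to report each field on its own line."""
    lines = [
        "You are shown a photo of a medicine package.",
        "Read the text on the package and report the following,",
        "one per line, in the form 'Field: value'.",
        "If a field cannot be determined from the image, write 'unknown'.",
    ]
    lines += [f"- {field}" for field in fields]
    return "\n".join(lines)


def parse_response(text, fields=FIELDS):
    """Parse 'Field: value' lines into a dict; fields not found stay 'unknown'."""
    result = {field: "unknown" for field in fields}
    lookup = {field.lower(): field for field in fields}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # Drop list bullets or bold markers the model may have added around the key.
        key = key.strip().strip("-* \t").lower()
        if sep and key in lookup and value.strip():
            result[lookup[key]] = value.strip()
    return result


# Example with a made-up reply (no API call involved).
sample_reply = "Medicine name: Paracetamol\n- Dosage: 500 mg\nSome unrelated line"
print(parse_response(sample_reply))
```

With the snippet from the previous section, the prompt would be passed together with the image, for example `model.generate_content([build_prompt(), image])`, and `response.text` handed to `parse_response`. Real replies may not follow the requested format, so anything that cannot be parsed is left as 'unknown' rather than guessed.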

Simply upload the image of the medicine you want a prescription for and click submit. 🚀

PharmaScan Android App

Gone are the days of squinting at tiny print on pill labels or deciphering cryptic doctor’s handwriting.

Note: This is just a proof of concept and should not replace an actual doctor's prescription.

🔮Future Impacts

The impact of Gemini Pro Vision extends far beyond individual convenience.

  • Accessibility: This app can bridge the gap for those with visual impairments or difficulty reading, empowering them to manage their own medication needs.
  • Combating counterfeit drugs: The API’s image recognition could help flag suspected fake medications, safeguarding user health.
  • Remote healthcare: In areas with limited medical access, it can offer a telemedicine option, allowing patients to receive consultation and prescriptions based on image scans.

📝Resources for you

  1. Code: https://github.com/NSTiwari/Medicine-Scan-with-Gemini
  2. Demo video: https://www.youtube.com/watch?v=Q06ABLwFGTQ
  3. What more to do with Gemini: https://www.kaggle.com/code/prathameshbang/gemini-api-starter-notebook
