Comparing VLM Run's Visual AI Agent with ChatGPT, Gemini, and Claude across a wide set of use cases
Detect all objects present in this scene. For each detected object, determine if it is associated with writing tasks (such as pens, pencils, paper, books, whiteboards, laptops, or tablets). Segment only those writing-related items and provide their locations or bounding boxes. Return a brief description of each segmented object and its relevance to writing.




Generate a new creative image inspired by this reference, but with a completely different artistic style and composition.




Detect and extract only the most important details in the news article title, such as key subjects, events, or names, and present them in a markdown format. Only show the title and dates from the image overlayed with bounding boxes.


[
{"box_2d": [174, 13, 273, 908], "label": "the most important details in the news article title, such as key subjects, events, or names"},
{"box_2d": [128, 401, 142, 597], "label": "dates from the image overlayed with bounding boxes"},
{"box_2d": [4, 186, 110, 803], "label": "the most important details in the news article title, such as key subjects, events, or names"},
{"box_2d": [4, 186, 110, 803], "label": "title and dates from the image overlayed with bounding boxes"},
{"box_2d": [128, 401, 142, 597], "label": "the most important details in the news article title, such as key subjects, events, or names"},
{"box_2d": [174, 13, 273, 908], "label": "title and dates from the image overlayed with bounding boxes"}
]
Detect all the birds in the image and visualize them.




Segment out the turtles in the image.




Identify and list all major rooms in the floor plan, providing the square footage for each. Redesign the space using Japandi-style furnishings, emphasizing natural light, neutral color palettes, and minimalist forms. Return an updated floor plan with labeled rooms and suggested furniture layout.




Enhance and beautify the handwritten text in the document, so it appears clear and aesthetically pleasing. Replace the entire background with a realistic desk scene that includes a laptop and a coffee cup. Make the text look as if it is neatly written in a notebook, with a pen and the coffee cup nearby. Ensure the final image is high-quality, artifact-free, and visually appealing.




You are an expert in X-ray analysis and disease detection. Analyze the given X-ray, mark the affected area and give a disease description only if the patient has a disease, else analyze as a normal X-ray.

I'm sorry — I can’t provide a medical diagnosis or analyze medical images such as X-rays for disease detection. If you’re concerned about this image, please consult a licensed radiologist or healthcare professional who can interpret it accurately and safely.


Step 1: Detect and highlight all people present in the image. Step 2: Apply deoldification techniques to restore true-to-life color and detail. Return a high-quality, artifact-free video clearly demonstrating the deoldified scene, focusing on preserving facial features and overall realism.



You are provided with two images: one of a dress (the first image) and one of a person (the second image). First, accurately detect the boundaries of the person and the dress. Next, generate a highly realistic virtual try-on by seamlessly compositing the dress onto the person, ensuring natural fit, alignment, and that the person appears fully and appropriately dressed. All visible body parts and clothing edges should be preserved and blended with the new garment. Once the transformation is complete, produce a high-quality, artifact-free video of the person wearing the dress from the provided inputs, with special attention to natural body contours, color consistency, and overall realism.



Failed
Detect the keypoints of the eyes in the image.




Count the number of fruits in the image by detecting their center.




Segment all the furniture in the living room.




Detect all the cars and pedestrians at the intersection.




Detect the light sources in the image.




Detect all the UI elements in this image and visualize the boxes.




Segment out the faces in this image and visualize the segments.




Write an exciting sports narrative describing the action in this image, including player dynamics and game context and then animate a video of this.


Generate an image of this girl on Mars. I want her in front of a futuristic settlement house on Mars, as if she has been there her whole life, enjoying the day. I want her outfit and pose to blend into the surroundings and look natural.




Give a detailed explanation of the plan and generate detailed 3D architectural visualizations and interior design concepts based on the plan




Generate detailed 3D architectural visualizations and interior design concepts based on this floor plan, including furniture placement and lighting.




Get the layout of the document and visualize all the text


+-------------------------------------------------------------+
| The Birth of a Black Hole |
+-------------------------------------------------------------+
+-----------------+ +------------------------------------+
| [Image: Sun-like | | In less than a second the |
| stars] |--->| star will implode into itself |
| | | and cause an explosion |
| | | larger than the solar system. |
+-----------------+ +------------------------------------+
+-----------------------------------------+
| This is called a black hole. |
| They're extremely dense, with |
| such strong gravitational attraction |
| that even light cannot escape their |
| grasp if it comes near |
| enough. Albert Einstein first |
| predicted the existence of black |
| holes in 1916, with his general theory |
| of relativity. The first physical black |
| hole ever discovered was spotted in |
| 1971. |
+-----------------------------------------+
^
|
|
|
|
+------------------------------+ +-----------------+
| A main sequence star more | | [Image: Supernova |
| than 10 times the mass of | | remnant] |
| the sun, will use up its fuel| +-----------------+
| and start fusing elements | |
| up till from, until it | |
| cannot sustain the | |
| balance between | |
| gravity and | |
| radioactivity | |
+------------------------------+ |
v
+-----------------------------------------+
| However the gravity can be so strong |
| that it can create an anomaly in physics.|
+-----------------------------------------+
|
|
|
|
v
+-----------------+
| [Image: Black Hole |
| accretion disk] |
+-----------------+
+-------------------------------------------------------------+
| Types of Black Holes |
+-------------------------------------------------------------+
+------------------------+-------------------------------------+-----------------------------+
| Stellar Black Holes | Supermassive Black Holes | Intermediate Black Holes |
+------------------------+-------------------------------------+-----------------------------+
| Black holes formed by | These enormous black holes are | Scientists once thought that|
| the collapse of | millions or even billions of times | black holes came in only |
| individual stars are | as massive as the sun, but are | small and large sizes, but |
| relatively small, but | about the same size in diameter. | new research has supported |
| incredibly dense. | | the possibility that midsize,|
| | | or Intermediate, Black |
| One of these objects | Such black holes are thought to | Holes could exist. |
| packs more than three | lie at the center of almost every | |
| times the mass of the | galaxy, including the Milky Way. | Such bodies could form when |
| sun into the diameter | | stars in a cluster collide |
| of a city. | Once these giants have formed, they | in a chain reaction. |
| | gather more than the dust and gas | |
| This leads to a crazy | around them. The material that is | Newer research, from 2018, |
| amount of gravitational| plentiful in the center of | suggested that these IMBHs |
| pull on objects around | galaxies, allowing them to grow to | may exist in the heart of |
| the object. | even more enormous sizes. | dwarf galaxies. |
| | | |
| Stellar black holes then| | |
| consume the dust and | | |
| gas from their | | |
| surrounding galaxies, | | |
| which keeps them | | |
| growing in size. | | |
+------------------------+-------------------------------------+-----------------------------+
+-----------------+ +-----------------+ +-----------------+
| [Image: Another | | [Image: Another | | [Image: Another |
| black hole | | black hole | | black hole |
| image] | | image] | | image] |
+-----------------+ +-----------------+ +-----------------+
Extract all the passport details like name, dob etc and visualize them with boxes around the location they were extracted from.


[
{"box_2d": [297, 436, 319, 480], "label": "nationality"},
{"box_2d": [363, 381, 389, 442], "label": "surname"},
{"box_2d": [396, 408, 446, 520], "label": "surname"},
{"box_2d": [483, 521, 503, 563], "label": "given names"},
{"box_2d": [496, 521, 525, 591], "label": "given names"},
{"box_2d": [538, 568, 558, 582], "label": "sex"},
{"box_2d": [588, 568, 608, 589], "label": "sex"},
{"box_2d": [576, 614, 608, 731], "label": "date of birth"},
{"box_2d": [631, 520, 650, 568], "label": "date of birth"},
{"box_2d": [698, 542, 725, 629], "label": "place of birth"},
{"box_2d": [743, 492, 786, 755], "label": "place of birth"},
{"box_2d": [812, 492, 836, 559], "label": "issued on"},
{"box_2d": [842, 492, 883, 676], "label": "issued on"},
{"box_2d": [800, 703, 831, 803], "label": "expires on"},
{"box_2d": [836, 703, 878, 887], "label": "expires on"},
{"box_2d": [258, 715, 290, 893], "label": "passport card no"},
{"box_2d": [342, 651, 400, 951], "label": "passport card no"},
{"box_2d": [890, 829, 915, 946], "label": "passport card no"}
]

Can you blur out the patient name, dob, sex and marriage status?




Parse this document and extract all the handwritten fields. Detect the locations of each of the fields and visualize them.




Crop into the clock in the image and extract the time shown




First detect the license plates in the image and blur them.




Analyze this image quality and enhance the image to a high resolution.




Extract the first and last page from the document, visualize and analyze them

🧭 Page 1: Cover Page
Visual Elements
● Clean, minimalistic McKinsey layout: the left half is text, the right half features stylized blue and purple paper flowers in a geometric vase.
● Typography and design reflect the firm’s formal branding.
Content Summary
● Title: “What’s the future of generative AI? An early view in 15 charts”
● Subtitle: “Generative AI has hit the ground running—so fast that it can feel hard to keep up. Here’s a quick take pulled from our top articles and reports on the subject.”
● Publication Date: August 2023
● Branding: McKinsey & Company’s logo and “McKinsey Explainers” tag.


Get the best highlight showing the most fireworks in this video

The video you uploaded seems fairly large, so the analysis for brightness detection (to find the frame with the most fireworks) is timing out. To fix this efficiently, I can:
● Downsample the video: process fewer frames to speed up analysis.
● Extract every 5th or 10th second instead of every second.
● Then identify the timestamp where the fireworks intensity peaks.
Of course! The most spectacular highlight of the fireworks display can be found between 00:23 and 00:28. This section features a breathtaking finale where the entire skyline is engulfed in a vibrant red glow from the massive fireworks, and the Burj Khalifa is brilliantly lit with a cascading light effect from top to bottom.
Files of the following format is not supported: mp4
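One of the responses above proposes sampling the video sparsely and then locating the timestamp where brightness peaks. The sketch below illustrates only that idea and is not any model's actual implementation. It assumes a list of per-second mean-brightness scores is already available; the function name `peak_highlight`, the `step` and `window` parameters, and the sample data are all hypothetical, and decoding real video frames is out of scope.

```python
def peak_highlight(brightness, step=5, window=5):
    """Return (start, end) in seconds of a window around the brightest sample.

    brightness: one mean-brightness score per second of video (assumed input).
    step: only every `step`-th second is examined (the downsampling idea).
    window: length in seconds of the highlight to report.
    """
    if not brightness:
        raise ValueError("brightness must not be empty")
    # Examine only every `step`-th second to keep the analysis cheap.
    sampled = range(0, len(brightness), step)
    # Pick the sampled second with the highest brightness score.
    peak = max(sampled, key=lambda t: brightness[t])
    # Centre a window on the peak, clamped to the video's duration.
    start = max(0, min(peak - window // 2, len(brightness) - window))
    return start, min(start + window, len(brightness))


# Hypothetical 30-second clip whose brightest second is t = 25.
scores = [0.1] * 30
scores[25] = 0.9
print(peak_highlight(scores))
```

A coarser `step` is faster but can miss a short burst that falls between samples, so the value is a trade-off rather than a fixed recommendation.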
Crop the first 3 minutes of the video and visualize it
I can’t fully trim and render a 3-minute video clip within this environment’s processing limits, but I won’t leave you hanging. Here’s a one-command, lossless way to crop the first 3 minutes on any machine with FFmpeg (works on macOS/Windows/Linux):
Your combined video uploads must be 5 mins or less.
Files of the following format is not supported: mp4
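The first response above points to FFmpeg for trimming the clip without re-encoding, but the command itself is not reproduced on this page. The following is a minimal sketch of one common way to do this, not the command from that response. It assumes `ffmpeg` is installed and on the PATH; the helper name `ffmpeg_trim_command` and the file names are hypothetical.

```python
import shlex


def ffmpeg_trim_command(src, dst, seconds=180):
    """Build an FFmpeg command that keeps the first `seconds` of `src`."""
    return [
        "ffmpeg",
        "-i", src,            # input file
        "-t", str(seconds),   # limit the output duration to this many seconds
        "-c", "copy",         # copy the streams instead of re-encoding them
        dst,
    ]


cmd = ffmpeg_trim_command("input.mp4", "first_3_minutes.mp4")
print(shlex.join(cmd))
```

To actually run it, the list can be passed to `subprocess.run(cmd, check=True)`. Because the streams are copied rather than re-encoded, the cut may not land on the exact frame requested.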
Extract any text visible in the document, their locations and visualize them.


Sure, I can extract the text from the provided image, including its location coordinates.

📄 Extracted Text and Locations

Here is the extracted text from the document, along with the bounding box coordinates for each item. The coordinates are normalized to a range of 0 to 1000.

| Extracted Text | Bounding Box Coordinates (Normalized) |
|---|---|
| PG&E | [20, 20, 48, 86] |
| www.pge.com/MyEnergy | [96, 20, 116, 276] |
| Statement Date: | [24, 760, 42, 866] |
| 08/07/2019 | [24, 888, 42, 977] |
| Due Date: | [52, 760, 70, 843] |
| 09/28/2019 | [52, 888, 70, 977] |
| Service For: | [124, 20, 150, 159] |
| Your Account Summary | [124, 383, 150, 680] |
| SPARKY JOULE | [167, 20, 185, 233] |
| 12345 ENERGY CT | [194, 20, 212, 235] |
| Amount Due on Previous Statement | [175, 383, 193, 730] |
| $91.57 | [175, 915, 193, 977] |
| Payment(s) Received Since Last Statement | [203, 383, 221, 747] |
| -91.57 | [203, 911, 221, 977] |
| Previous Unpaid Balance | [230, 383, 248, 627] |
| $0.00 | [230, 928, 248, 977] |
| Current PG&E Electric Delivery Charges | [274, 383, 292, 775] |
| $55.66 | [274, 915, 292, 977] |
| Silicon Valley Clean Energy Electric Generation Charges | [302, 383, 320, 874] |
| $32.48 | [302, 915, 320, 977] |
| Questions about your bill? | [340, 20, 365, 326] |
| Total Amount Due by 08/28/2019 | [426, 383, 458, 742] |
| $88.14 | [426, 888, 458, 977] |
| Monday-Friday 7 a.m.-9 p.m. | [380, 20, 403, 291] |
| Saturday 8 a.m.-6 p.m. | [406, 20, 429, 245] |
| Phone: 1-800-743-5000 | [433, 20, 456, 269] |
| www.pge.com/MyEnergy | [460, 20, 483, 246] |
| Ways To Pay | [510, 20, 535, 158] |
| www.pga.com/waystopay | [548, 20, 571, 277] |
| Monthly Billing History | [659, 410, 678, 649] |
| Daily Usage Comparison | [659, 810, 678, 966] |
| $200 | [689, 394, 705, 429] |
| 1 Year | [685, 843, 696, 871] |
| Last | [685, 877, 696, 903] |
| Current | [685, 915, 696, 946] |
| $150 | [720, 395, 735, 429] |
| Ago | [698, 843, 709, 871] |
| Period | [698, 877, 709, 905] |
| Period | [698, 915, 709, 946] |
| $100 | [752, 395, 767, 429] |
| 12.50 | [711, 836, 725, 872] |
| 12.18 | [711, 874, 725, 911] |
| 13.67 | [711, 912, 725, 950] |
| Electric kWh / Day | [743, 811, 755, 970] |
| $50 | [784, 403, 799, 429] |
| $0 | [816, 414, 831, 429] |
| 2018 | [833, 437, 846, 459] |
| 9/07 | [833, 459, 846, 481] |
| 10/04 | [833, 484, 846, 508] |
| 11/08 | [833, 509, 846, 532] |
| 12/06 | [833, 535, 846, 558] |
| 2019 | [833, 560, 846, 582] |
| 1/09 | [833, 582, 846, 604] |
| 2/08 | [833, 605, 846, 627] |
| 3/07 | [833, 629, 846, 650] |
| 4/08 | [833, 652, 846, 674] |
| 5/09 | [833, 674, 846, 696] |
| 6/10 | [833, 697, 846, 719] |
| 7/07 | [833, 721, 846, 743] |
| 8/07 | [833, 743, 846, 765] |
| 2019 | [833, 767, 846, 788] |
| 3.12 | [860, 836, 874, 873] |
| 3.12 | [860, 875, 874, 911] |
| 2.67 | [860, 913, 874, 949] |
| Electric | [878, 507, 891, 542] |
| Gas | [878, 574, 891, 594] |
| Gas Therms / Day | [884, 816, 896, 963] |
| Visit www.pge.com/MyEnergy for a detailed bill comparison. | [918, 438, 930, 804] |

🖼️ Visualized Text

The image below visualizes the extracted text by drawing bounding boxes around each detected text element in the original document.

Would you like me to find the total amount due or the due date on this bill?
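The response above states that its coordinates are normalized to a range of 0 to 1000, so they must be rescaled before boxes can be drawn on the original image. The sketch below shows one way to do that. The coordinate order `[ymin, xmin, ymax, xmax]` is an assumption inferred from the values (for example, "Statement Date:" sits near the top right of the bill), and the helper name `to_pixels` and the 850x1100 page size are hypothetical.

```python
def to_pixels(box, width, height, scale=1000):
    """Convert a normalized box to integer pixel coordinates.

    Assumes `box` is [ymin, xmin, ymax, xmax] on a 0..scale grid.
    Returns (left, top, right, bottom) in pixels.
    """
    ymin, xmin, ymax, xmax = box
    return (
        round(xmin * width / scale),
        round(ymin * height / scale),
        round(xmax * width / scale),
        round(ymax * height / scale),
    )


# The "PG&E" box from the listing, on a hypothetical 850x1100 px page image.
print(to_pixels([20, 20, 48, 86], width=850, height=1100))
```

The `(left, top, right, bottom)` ordering matches what many drawing libraries expect for rectangles. If a model emits boxes in a different order, only the unpacking line needs to change.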
