
VLM Run Showdown: Orion vs. Frontier Models

Comparing VLM Run's Visual AI Agent with ChatGPT, Gemini, and Claude across a wide range of use cases

Desk Object Segmentation

Detect all objects present in this scene. For each detected object, determine if it is associated with writing tasks (such as pens, pencils, paper, books, whiteboards, laptops, or tablets). Segment only those writing-related items and provide their locations or bounding boxes. Return a brief description of each segmented object and its relevance to writing.

VLM Run Orion
[Input and output images]
ChatGPT 5
[Input and output images]
Gemini 2.5
[Input and output images]
Claude 4.5
[Input and output images]

Image Generation

Generate a new creative image inspired by this reference, but with a completely different artistic style and composition.

VLM Run Orion
[Input and output images]
ChatGPT 5
[Input and output images]
Gemini 2.5
[Input and output images]
Claude 4.5
[Input and output images]

Newspaper Key Details Extraction

Detect and extract only the most important details in the news article title, such as key subjects, events, or names, and present them in a markdown format. Only show the title and dates from the image overlayed with bounding boxes.

VLM Run Orion
[Input and output images]
ChatGPT 5
[Input and output images]
Gemini 2.5
No visual output
[
  {"box_2d": [174, 13, 273, 908], "label": "the most important details in the news article title, such as key subjects, events, or names"},
  {"box_2d": [128, 401, 142, 597], "label": "dates from the image overlayed with bounding boxes"},
  {"box_2d": [4, 186, 110, 803], "label": "the most important details in the news article title, such as key subjects, events, or names"},
  {"box_2d": [4, 186, 110, 803], "label": "title and dates from the image overlayed with bounding boxes"},
  {"box_2d": [128, 401, 142, 597], "label": "the most important details in the news article title, such as key subjects, events, or names"},
  {"box_2d": [174, 13, 273, 908], "label": "title and dates from the image overlayed with bounding boxes"}
]
Claude 4.5
[Input and output images]
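
Gemini's reply in this example is a list of box_2d entries rather than an annotated image, and several boxes repeat under different labels. A rough sketch of how such a list could be de-duplicated and converted to pixel coordinates before drawing is shown here. It assumes each box_2d is ordered [ymin, xmin, ymax, xmax] on a 0 to 1000 scale (the page does not state the ordering) and uses a hypothetical 2000x1000 image; the function names are illustrative only.

```python
TITLE_LABEL = ("the most important details in the news article title, "
               "such as key subjects, events, or names")
DATE_LABEL = "dates from the image overlayed with bounding boxes"


def to_pixel_box(box_2d, width, height):
    """Map [ymin, xmin, ymax, xmax] on a 0-1000 scale (assumed ordering)
    to (left, top, right, bottom) in pixels."""
    ymin, xmin, ymax, xmax = box_2d
    return (
        round(xmin * width / 1000),
        round(ymin * height / 1000),
        round(xmax * width / 1000),
        round(ymax * height / 1000),
    )


def unique_pixel_boxes(detections, width, height):
    """Keep the first label seen for each distinct box, then convert to pixels."""
    first_label = {}
    for det in detections:
        key = tuple(det["box_2d"])
        first_label.setdefault(key, det["label"])
    return [(to_pixel_box(key, width, height), label)
            for key, label in first_label.items()]


# First three entries of the reply, plus one of its repeated boxes.
detections = [
    {"box_2d": [174, 13, 273, 908], "label": TITLE_LABEL},
    {"box_2d": [128, 401, 142, 597], "label": DATE_LABEL},
    {"box_2d": [4, 186, 110, 803], "label": TITLE_LABEL},
    {"box_2d": [128, 401, 142, 597], "label": TITLE_LABEL},
]

# 2000x1000 is a hypothetical image size chosen for easy arithmetic.
boxes = unique_pixel_boxes(detections, width=2000, height=1000)
```

Keeping the first label per box is an arbitrary choice; any rule that picks one label per distinct box would work equally well for drawing.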

Bird Detection

Detect all the birds in the image and visualize them.

VLM Run Orion
[Input and output images]
ChatGPT 5
[Input and output images]
Gemini 2.5
[Input and output images]
Claude 4.5
[Input and output images]

Turtle Segmentation

Segment out the turtles in the image.

VLM Run Orion
[Input and output images]
ChatGPT 5
[Input and output images]
Gemini 2.5
[Input and output images]
Claude 4.5
[Input and output images]

Floor Plan Room Identification

Identify and list all major rooms in the floor plan, providing the square footage for each. Redesign the space using Japandi-style furnishings, emphasizing natural light, neutral color palettes, and minimalist forms. Return an updated floor plan with labeled rooms and suggested furniture layout.

VLM Run Orion
[Input and output images]
ChatGPT 5
[Input and output images]
Gemini 2.5
[Input and output images]
Claude 4.5
[Input and output images]

Handwriting Beautification

Enhance and beautify the handwritten text in the document, so it appears clear and aesthetically pleasing. Replace the entire background with a realistic desk scene that includes a laptop and a coffee cup. Make the text look as if it is neatly written in a notebook, with a pen and the coffee cup nearby. Ensure the final image is high-quality, artifact-free, and visually appealing.

VLM Run Orion
[Input and output images]
ChatGPT 5
[Input and output images]
Gemini 2.5
[Input and output images]
Claude 4.5
[Input and output images]

X-ray Analysis and Disease Detection

You are an expert in X-ray analysis and disease detection. Analyze the given X-ray, mark the affected area and give a disease description only if the patient has a disease, else analyze as a normal X-ray.

VLM Run Orion
[Input and output images]
ChatGPT 5
No visual output
I'm sorry — I can’t provide a medical diagnosis or analyze medical images such as X-rays for disease detection. If you’re concerned about this image, please consult a licensed radiologist or healthcare professional who can interpret it accurately and safely.
Gemini 2.5
[Input and output images]
Claude 4.5
[Input and output images]

Image to Video

Step 1: Detect and highlight all people present in the image. Step 2: Apply deoldification techniques to restore true-to-life color and detail. Return a high-quality, artifact-free video clearly demonstrating the deoldified scene, focusing on preserving facial features and overall realism.

VLM Run Orion
[Input image; output]
ChatGPT 5
[Input and output images]
Gemini 2.5
[Input and output images]
Claude 4.5
[Input and output images]

Virtual Try-On

You are provided with two images: one of a dress (the first image) and one of a person (the second image). First, accurately detect the boundaries of the person and the dress. Next, generate a highly realistic virtual try-on by seamlessly compositing the dress onto the person, ensuring natural fit, alignment, and that the person appears fully and appropriately dressed. All visible body parts and clothing edges should be preserved and blended with the new garment. Once the transformation is complete, produce a high-quality, artifact-free video of the person wearing the dress from the provided inputs, with special attention to natural body contours, color consistency, and overall realism.

VLM Run Orion
[Dress and output images]
ChatGPT 5
[Dress and output images]
Gemini 2.5
[Dress and output images]
Claude 4.5
No visual output
Failed

Spider Eye Keypoint Detection

Detect the keypoints of the eyes in the image.

VLM Run Orion
[Input and output images]
ChatGPT 5
[Input and output images]
Gemini 2.5
[Input and output images]
Claude 4.5
[Input and output images]

Fruit Counting

Count the number of fruits in the image by detecting their center.

VLM Run Orion
[Input and output images]
ChatGPT 5
[Input and output images]
Gemini 2.5
[Input and output images]
Claude 4.5
[Input and output images]
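
The prompt asks for a count based on detected centers. As a minimal sketch of that idea, assuming a detector has already returned pixel bounding boxes, the code below reduces each box to its center and counts centers that are far enough apart. The box format (left, top, right, bottom), the min_distance threshold, and the function names are illustrative assumptions, not anything the page specifies.

```python
import math


def box_center(box):
    """Center (x, y) of a (left, top, right, bottom) box."""
    left, top, right, bottom = box
    return ((left + right) / 2, (top + bottom) / 2)


def count_by_centers(boxes, min_distance=10.0):
    """Count objects by their centers, treating centers closer than
    min_distance pixels as duplicate detections of the same object."""
    centers = []
    for box in boxes:
        cx, cy = box_center(box)
        if all(math.hypot(cx - ox, cy - oy) >= min_distance
               for ox, oy in centers):
            centers.append((cx, cy))
    return len(centers), centers


# Two overlapping detections of one fruit, plus a second fruit elsewhere.
count, centers = count_by_centers([
    (0, 0, 20, 20),
    (2, 2, 22, 22),
    (100, 100, 140, 140),
])
```

The distance threshold stands in for proper duplicate suppression; a real pipeline would tune it to the typical fruit size in the image.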

Segment Furniture

Segment all the furniture in the living room.

VLM Run Orion
[Input and output images]
ChatGPT 5
[Input and output images]
Gemini 2.5
[Input and output images]
Claude 4.5
[Input and output images]

Detect Cars and Pedestrians

Detect all the cars and pedestrians at the intersection.

VLM Run Orion
[Input and output images]
ChatGPT 5
[Input and output images]
Gemini 2.5
[Input and output images]
Claude 4.5
[Input and output images]

Detect Light Sources

Detect the light sources in the image.

VLM Run Orion
[Input and output images]
ChatGPT 5
[Input and output images]
Gemini 2.5
[Input and output images]
Claude 4.5
[Input and output images]

UI Element Detection

Detect all the UI elements in this image and visualize the boxes

VLM Run Orion
[Input and output images]
ChatGPT 5
[Input and output images]
Gemini 2.5
[Input and output images]
Claude 4.5
[Input and output images]

Segment Faces in News Image

Segment out the faces in this image and visualize the segments.

VLM Run Orion
[Input and output images]
ChatGPT 5
[Input and output images]
Gemini 2.5
[Input and output images]
Claude 4.5
[Input and output images]

Tennis Match Animation and Narration

Write an exciting sports narrative describing the action in this image, including player dynamics and game context and then animate a video of this.

VLM Run Orion
[Input image; output]
ChatGPT 5
[Input and output images]
Gemini 2.5
[Input and output images]
Claude 4.5
No visual output

Teleport To Mars

Generate an image of this girl on Mars. I want her in front of a futuristic settlement house on Mars, as if she has been there her whole life, enjoying the day. I want her outfit and pose to blend into the surroundings and look natural.

VLM Run Orion
[Input and output images]
ChatGPT 5
[Input and output images]
Gemini 2.5
[Input and output images]
Claude 4.5
[Input and output images]

Architectural Visualization from Floor Plan

Give a detailed explanation of the plan and generate detailed 3D architectural visualizations and interior design concepts based on the plan

VLM Run Orion
[Input and output images]
ChatGPT 5
[Input and output images]
Gemini 2.5
[Input and output images]
Claude 4.5
[Input and output images]

Architectural 3D Visualization

Generate detailed 3D architectural visualizations and interior design concepts based on this floor plan, including furniture placement and lighting.

VLM Run Orion
[Input and output images]
ChatGPT 5
[Input and output images]
Gemini 2.5
[Input and output images]
Claude 4.5
[Input and output images]

Poster Layout Understanding

Get the layout of the document and visualize all the text

VLM Run Orion
[Input and output images]
ChatGPT 5
[Input and output images]
Gemini 2.5
No visual output
+-------------------------------------------------------------+
|                     The Birth of a Black Hole               |
+-------------------------------------------------------------+

+-----------------+    +------------------------------------+
| [Image: Sun-like |    | In less than a second the          |
| stars]          |--->| star will implode into itself      |
|                 |    | and cause an explosion             |
|                 |    | larger than the solar system.      |
+-----------------+    +------------------------------------+

                                      +-----------------------------------------+
                                      | This is called a black hole.            |
                                      | They're extremely dense, with           |
                                      | such strong gravitational attraction    |
                                      | that even light cannot escape their     |
                                      | grasp if it comes near                  |
                                      | enough. Albert Einstein first           |
                                      | predicted the existence of black        |
                                      | holes in 1916, with his general theory  |
                                      | of relativity. The first physical black |
                                      | hole ever discovered was spotted in     |
                                      | 1971.                                   |
                                      +-----------------------------------------+
                                            ^
                                            |
                                            |
                                            |
                                            |
+------------------------------+     +-----------------+
| A main sequence star more    |     | [Image: Supernova |
| than 10 times the mass of    |     | remnant]        |
| the sun, will use up its fuel|     +-----------------+
| and start fusing elements    |             |
| up till from, until it       |             |
| cannot sustain the           |             |
| balance between              |             |
| gravity and                  |             |
| radioactivity                |             |
+------------------------------+             |
                                             v
                                      +-----------------------------------------+
                                      | However the gravity can be so strong    |
                                      | that it can create an anomaly in physics.|
                                      +-----------------------------------------+
                                            |
                                            |
                                            |
                                            |
                                            v
                                      +-----------------+
                                      | [Image: Black Hole |
                                      | accretion disk]   |
                                      +-----------------+

+-------------------------------------------------------------+
|                       Types of Black Holes                  |
+-------------------------------------------------------------+

+------------------------+-------------------------------------+-----------------------------+
| Stellar Black Holes    | Supermassive Black Holes            | Intermediate Black Holes    |
+------------------------+-------------------------------------+-----------------------------+
| Black holes formed by  | These enormous black holes are      | Scientists once thought that|
| the collapse of        | millions or even billions of times  | black holes came in only    |
| individual stars are   | as massive as the sun, but are      | small and large sizes, but  |
| relatively small, but  | about the same size in diameter.    | new research has supported  |
| incredibly dense.      |                                     | the possibility that midsize,|
|                        |                                     | or Intermediate, Black      |
| One of these objects   | Such black holes are thought to     | Holes could exist.          |
| packs more than three  | lie at the center of almost every   |                             |
| times the mass of the  | galaxy, including the Milky Way.    | Such bodies could form when |
| sun into the diameter  |                                     | stars in a cluster collide  |
| of a city.             | Once these giants have formed, they | in a chain reaction.        |
|                        | gather more than the dust and gas   |                             |
| This leads to a crazy  | around them. The material that is   | Newer research, from 2018,  |
| amount of gravitational| plentiful in the center of          | suggested that these IMBHs  |
| pull on objects around | galaxies, allowing them to grow to  | may exist in the heart of   |
| the object.            | even more enormous sizes.           | dwarf galaxies.             |
|                        |                                     |                             |
| Stellar black holes then|                                     |                             |
| consume the dust and   |                                     |                             |
| gas from their         |                                     |                             |
| surrounding galaxies,  |                                     |                             |
| which keeps them       |                                     |                             |
| growing in size.       |                                     |                             |
+------------------------+-------------------------------------+-----------------------------+

+-----------------+ +-----------------+ +-----------------+
| [Image: Another | | [Image: Another | | [Image: Another |
| black hole      | | black hole      | | black hole      |
| image]          | | image]          | | image]          |
+-----------------+ +-----------------+ +-----------------+
Claude 4.5
No visual output

Passport Details Extraction

Extract all the passport details like name, dob etc and visualize them with boxes around the location they were extracted from.

VLM Run Orion
[Input and output images]
ChatGPT 5
[Input and output images]
Gemini 2.5
No visual output
[
  {"box_2d": [297, 436, 319, 480], "label": "nationality"},
  {"box_2d": [363, 381, 389, 442], "label": "surname"},
  {"box_2d": [396, 408, 446, 520], "label": "surname"},
  {"box_2d": [483, 521, 503, 563], "label": "given names"},
  {"box_2d": [496, 521, 525, 591], "label": "given names"},
  {"box_2d": [538, 568, 558, 582], "label": "sex"},
  {"box_2d": [588, 568, 608, 589], "label": "sex"},
  {"box_2d": [576, 614, 608, 731], "label": "date of birth"},
  {"box_2d": [631, 520, 650, 568], "label": "date of birth"},
  {"box_2d": [698, 542, 725, 629], "label": "place of birth"},
  {"box_2d": [743, 492, 786, 755], "label": "place of birth"},
  {"box_2d": [812, 492, 836, 559], "label": "issued on"},
  {"box_2d": [842, 492, 883, 676], "label": "issued on"},
  {"box_2d": [800, 703, 831, 803], "label": "expires on"},
  {"box_2d": [836, 703, 878, 887], "label": "expires on"},
  {"box_2d": [258, 715, 290, 893], "label": "passport card no"},
  {"box_2d": [342, 651, 400, 951], "label": "passport card no"},
  {"box_2d": [890, 829, 915, 946], "label": "passport card no"}
]
Claude 4.5
[Input and output images]
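
Gemini's passport reply lists most labels two or three times. One way to turn that into a single region per field is to take the union of all boxes that share a label, sketched below. The helper name is hypothetical, and the coordinate order is assumed to be [ymin, xmin, ymax, xmax]; the page does not say whether repeated boxes are a field caption and its value or simply duplicate detections.

```python
from collections import defaultdict


def merge_by_label(detections):
    """Union all boxes that share a label into one enclosing box.
    Boxes are assumed to be [ymin, xmin, ymax, xmax]."""
    grouped = defaultdict(list)
    for det in detections:
        grouped[det["label"]].append(det["box_2d"])
    return {
        label: [
            min(b[0] for b in boxes),
            min(b[1] for b in boxes),
            max(b[2] for b in boxes),
            max(b[3] for b in boxes),
        ]
        for label, boxes in grouped.items()
    }


# A few entries copied from the reply.
detections = [
    {"box_2d": [297, 436, 319, 480], "label": "nationality"},
    {"box_2d": [363, 381, 389, 442], "label": "surname"},
    {"box_2d": [396, 408, 446, 520], "label": "surname"},
    {"box_2d": [538, 568, 558, 582], "label": "sex"},
    {"box_2d": [588, 568, 608, 589], "label": "sex"},
]

fields = merge_by_label(detections)
```

A union works when the repeated boxes sit next to each other, as with "surname". For "passport card no", whose three boxes are spread across the card, the union would cover most of the document, so those boxes are better kept separate.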

Blur Patient Details

Can you blur out the patient name, dob, sex and marriage status?

VLM Run Orion
[Input and output images]
ChatGPT 5
[Input and output images]
Gemini 2.5
[Input and output images]
Claude 4.5
[Input and output images]

Extract Handwritten Fields

Parse this document and extract all the handwritten fields. Detect the locations of each of the fields and visualize them.

VLM Run Orion
[Input and output images]
ChatGPT 5
[Input and output images]
Gemini 2.5
[Input and output images]
Claude 4.5
[Input and output images]

Clock Time Extraction

Crop into the clock in the image and extract the time shown

VLM Run Orion
[Input and output images]
ChatGPT 5
[Input and output images]
Gemini 2.5
[Input and output images]
Claude 4.5
[Input and output images]

Blur License Plates

First detect the license plates in the image and blur them.

VLM Run Orion
[Input and output images]
ChatGPT 5
[Input and output images]
Gemini 2.5
[Input and output images]
Claude 4.5
[Input and output images]
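
The task has two steps: detect the plates, then blur them. The second step can be sketched on its own, assuming detection has already produced a pixel box. The example below applies a simple mean (box) blur inside one region of a grayscale image stored as a list of rows. It is a dependency-free illustration with hypothetical names, not the method any of the compared models uses.

```python
def blur_region(image, box, radius=1):
    """Return a copy of a grayscale image (list of rows of ints) with a
    mean blur applied inside box = (left, top, right, bottom).
    The right and bottom edges are exclusive."""
    left, top, right, bottom = box
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # leave the input untouched
    for y in range(max(top, 0), min(bottom, height)):
        for x in range(max(left, 0), min(right, width)):
            ys = range(max(y - radius, 0), min(y + radius + 1, height))
            xs = range(max(x - radius, 0), min(x + radius + 1, width))
            window = [image[j][i] for j in ys for i in xs]
            out[y][x] = sum(window) // len(window)
    return out


# 4x4 test image: one bright pixel inside the box, one outside it.
image = [[0] * 4 for _ in range(4)]
image[1][1] = 90
image[3][3] = 200

blurred = blur_region(image, box=(0, 0, 2, 2), radius=1)
```

For real plates a larger radius (or repeated passes) would be needed so that the characters are no longer legible.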

Image Enhancement

Analyze this image quality and enhance the image to a high resolution.

VLM Run Orion
[Input and output images]
ChatGPT 5
[Input and output images]
Gemini 2.5
[Input and output images]
Claude 4.5
[Input and output images]

PowerPoint Page Extraction

Extract the first and last page from the document, visualize and analyze them

VLM Run Orion
[Output image]
ChatGPT 5
No visual output
🧭 Page 1: Cover Page

Visual Elements
● Clean, minimalistic McKinsey layout — left half is text, right half features stylized blue and purple paper flowers in a geometric vase.
● Typography and design reflect the firm’s formal branding.

Content Summary
● Title: “What’s the future of generative AI? An early view in 15 charts”
● Subtitle: “Generative AI has hit the ground running—so fast that it can feel hard to keep up. Here’s a quick take pulled from our top articles and reports on the subject.”
● Publication Date: August 2023
● Branding: McKinsey & Company’s logo and “McKinsey Explainers” tag.
Gemini 2.5
[Output image]
Claude 4.5
[Output image]

Fireworks Highlight

Get the best highlight showing the most fireworks in this video

VLM Run Orion
[Input; output image]
ChatGPT 5
No visual output
The video you uploaded seems fairly large, so the analysis for brightness detection (to find the frame with the most fireworks) is timing out.
To fix this efficiently, I can:
Downsample the video — process fewer frames to speed up analysis.
Extract every 5th or 10th second instead of every second.
Then identify the timestamp where the fireworks intensity peaks.
Gemini 2.5
No visual output
Of course! The most spectacular highlight of the fireworks display can be found between 00:23 and 00:28.
This section features a breathtaking finale where the entire skyline is engulfed in a vibrant red glow from the massive fireworks, and the Burj Khalifa is brilliantly lit with a cascading light effect from top to bottom.
Claude 4.5
No visual output
Files of the following format is not supported: mp4
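
ChatGPT's reply in this example outlines an approach (sample the video sparsely, then find where the fireworks intensity peaks) without carrying it out. A minimal sketch of the peak-finding step follows. It assumes the per-sample brightness values have already been extracted from the video by some other tool; the function name, the window length, and the sample values are all hypothetical.

```python
def best_highlight(brightness, window, step=1.0):
    """Find the run of `window` consecutive samples with the highest total
    brightness. Samples are assumed to be `step` seconds apart.
    Returns (start_seconds, end_seconds)."""
    if window <= 0 or window > len(brightness):
        raise ValueError("window must be between 1 and len(brightness)")
    current = sum(brightness[:window])
    best_sum, best_start = current, 0
    # Slide the window one sample at a time, updating the running sum.
    for start in range(1, len(brightness) - window + 1):
        current += brightness[start + window - 1] - brightness[start - 1]
        if current > best_sum:
            best_sum, best_start = current, start
    return best_start * step, (best_start + window) * step


# Made-up brightness samples, one every 5 seconds.
samples = [1, 2, 9, 9, 8, 1, 1]
start_s, end_s = best_highlight(samples, window=3, step=5.0)
```

Mean brightness is only a proxy for "most fireworks"; a bright but static scene would score just as well, so a real pipeline would likely combine it with motion or color cues.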

Keynote Video Trimming

Crop the first 3 minutes of the video and visualize it

VLM Run Orion
[Input; output]
ChatGPT 5
No visual output
I can’t fully trim and render a 3-minute video clip within this environment’s processing limits, but I won’t leave you hanging. Here’s a one-command, lossless way to crop the first 3 minutes on any machine with FFmpeg (works on macOS/Windows/Linux):
Gemini 2.5
No visual output
Your combined video uploads must be 5 mins or less.
Claude 4.5
No visual output
Files of the following format is not supported: mp4
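
ChatGPT's reply in this example points to FFmpeg, but the command it gave is not reproduced on the page. For illustration only, a common way to keep the first 3 minutes without re-encoding is sketched below. The file names are hypothetical, and this is not necessarily the command ChatGPT suggested.

```python
import subprocess


def build_trim_command(src, dst, seconds=180):
    """Build an ffmpeg command that keeps the first `seconds` of src.
    -t limits the output duration; -c copy copies the streams as-is
    instead of re-encoding them."""
    return ["ffmpeg", "-i", src, "-t", str(seconds), "-c", "copy", dst]


def trim(src, dst, seconds=180):
    """Run the trim; requires ffmpeg to be installed and on PATH."""
    subprocess.run(build_trim_command(src, dst, seconds), check=True)


# Hypothetical file names; call trim(...) to actually run ffmpeg.
command = build_trim_command("keynote.mp4", "keynote_first_3min.mp4")
```

Because the streams are copied rather than re-encoded, the cut is fast and lossless, though the exact end point may shift slightly depending on how the source was encoded.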

Energy Bill Text Extraction

Extract any text visible in the document, their locations and visualize them.

VLM Run Orion
[Input and output images]
ChatGPT 5
[Input and output images]
Gemini 2.5
No visual output
Sure, I can extract the text from the provided image, including its location coordinates.

📄 Extracted Text and Locations

Here is the extracted text from the document, along with the bounding box coordinates for each item. The coordinates are normalized to a range of 0 to 1000.

Extracted Text | Bounding Box Coordinates (Normalized)
PG&E | [20, 20, 48, 86]
www.pge.com/MyEnergy | [96, 20, 116, 276]
Statement Date: | [24, 760, 42, 866]
08/07/2019 | [24, 888, 42, 977]
Due Date: | [52, 760, 70, 843]
09/28/2019 | [52, 888, 70, 977]
Service For: | [124, 20, 150, 159]
Your Account Summary | [124, 383, 150, 680]
SPARKY JOULE | [167, 20, 185, 233]
12345 ENERGY CT | [194, 20, 212, 235]
Amount Due on Previous Statement | [175, 383, 193, 730]
$91.57 | [175, 915, 193, 977]
Payment(s) Received Since Last Statement | [203, 383, 221, 747]
-91.57 | [203, 911, 221, 977]
Previous Unpaid Balance | [230, 383, 248, 627]
$0.00 | [230, 928, 248, 977]
Current PG&E Electric Delivery Charges | [274, 383, 292, 775]
$55.66 | [274, 915, 292, 977]
Silicon Valley Clean Energy Electric Generation Charges | [302, 383, 320, 874]
$32.48 | [302, 915, 320, 977]
Questions about your bill? | [340, 20, 365, 326]
Total Amount Due by 08/28/2019 | [426, 383, 458, 742]
$88.14 | [426, 888, 458, 977]
Monday-Friday 7 a.m.-9 p.m. | [380, 20, 403, 291]
Saturday 8 a.m.-6 p.m. | [406, 20, 429, 245]
Phone: 1-800-743-5000 | [433, 20, 456, 269]
www.pge.com/MyEnergy | [460, 20, 483, 246]
Ways To Pay | [510, 20, 535, 158]
www.pga.com/waystopay | [548, 20, 571, 277]
Monthly Billing History | [659, 410, 678, 649]
Daily Usage Comparison | [659, 810, 678, 966]
$200 | [689, 394, 705, 429]
1 Year | [685, 843, 696, 871]
Last | [685, 877, 696, 903]
Current | [685, 915, 696, 946]
$150 | [720, 395, 735, 429]
Ago | [698, 843, 709, 871]
Period | [698, 877, 709, 905]
Period | [698, 915, 709, 946]
$100 | [752, 395, 767, 429]
12.50 | [711, 836, 725, 872]
12.18 | [711, 874, 725, 911]
13.67 | [711, 912, 725, 950]
Electric kWh / Day | [743, 811, 755, 970]
$50 | [784, 403, 799, 429]
$0 | [816, 414, 831, 429]
2018 | [833, 437, 846, 459]
9/07 | [833, 459, 846, 481]
10/04 | [833, 484, 846, 508]
11/08 | [833, 509, 846, 532]
12/06 | [833, 535, 846, 558]
2019 | [833, 560, 846, 582]
1/09 | [833, 582, 846, 604]
2/08 | [833, 605, 846, 627]
3/07 | [833, 629, 846, 650]
4/08 | [833, 652, 846, 674]
5/09 | [833, 674, 846, 696]
6/10 | [833, 697, 846, 719]
7/07 | [833, 721, 846, 743]
8/07 | [833, 743, 846, 765]
2019 | [833, 767, 846, 788]
3.12 | [860, 836, 874, 873]
3.12 | [860, 875, 874, 911]
2.67 | [860, 913, 874, 949]
Electric | [878, 507, 891, 542]
Gas | [878, 574, 891, 594]
Gas Therms / Day | [884, 816, 896, 963]
Visit www.pge.com/MyEnergy for a detailed bill comparison. | [918, 438, 930, 804]

🖼️ Visualized Text

The image below visualizes the extracted text by drawing bounding boxes around each detected text element in the original document.

Would you like me to find the total amount due or the due date on this bill?
Claude 4.5
[Input and output images]