VLM Run Showdown: Orion vs. Frontier Models

Comparing VLM Run's Visual AI Agent with ChatGPT, Gemini, and Claude across a wide set of use cases

Desk Object Segmentation

Detect all objects present in this scene. For each detected object, determine if it is associated with writing tasks (such as pens, pencils, paper, books, whiteboards, laptops, or tablets). Segment only those writing-related items and provide their locations or bounding boxes. Return a brief description of each segmented object and its relevance to writing.

Image Generation

Generate a new creative image inspired by this reference, but with a completely different artistic style and composition.

Newspaper Key Details Extraction

Detect and extract only the most important details in the news article title, such as key subjects, events, or names, and present them in a markdown format. Only show the title and dates from the image overlaid with bounding boxes.

Bird Detection

Detect all the birds in the image and visualize them.

Turtle Segmentation

Segment out the turtles in the image.

Floor Plan Room Identification

Identify and list all major rooms in the floor plan, providing the square footage for each. Redesign the space using Japandi-style furnishings, emphasizing natural light, neutral color palettes, and minimalist forms. Return an updated floor plan with labeled rooms and suggested furniture layout.

Handwriting Beautification

Enhance and beautify the handwritten text in the document, so it appears clear and aesthetically pleasing. Replace the entire background with a realistic desk scene that includes a laptop and a coffee cup. Make the text look as if it is neatly written in a notebook, with a pen and the coffee cup nearby. Ensure the final image is high-quality, artifact-free, and visually appealing.

X-ray Analysis and Disease Detection

You are an expert in X-ray analysis and disease detection. Analyze the given X-ray, mark the affected area and give a disease description only if the patient has a disease, else analyze as a normal X-ray.

Image to Video

Step 1: Detect and highlight all people present in the image. Step 2: Apply deoldification techniques to restore true-to-life color and detail. Return a high-quality, artifact-free video clearly demonstrating the deoldified scene, focusing on preserving facial features and overall realism.

Virtual Try-On

You are provided with two images: one of a dress (the first image) and one of a person (the second image). First, accurately detect the boundaries of the person and the dress. Next, generate a highly realistic virtual try-on by seamlessly compositing the dress onto the person, ensuring natural fit, alignment, and that the person appears fully and appropriately dressed. All visible body parts and clothing edges should be preserved and blended with the new garment. Once the transformation is complete, produce a high-quality, artifact-free video of the person wearing the dress from the provided inputs, with special attention to natural body contours, color consistency, and overall realism.

Spider Eye Keypoint Detection

Detect the keypoints of the eyes in the image.

Fruit Counting

Count the number of fruits in the image by detecting their center.
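As a rough illustration of what counting by centers can mean, the sketch below takes bounding boxes, computes each box's center point, and counts centers that are not near-duplicates of one another. The box format `(x_min, y_min, x_max, y_max)`, the function names, and the `min_dist` merge threshold are all assumptions made for this example; they do not describe how Orion or any of the compared models works internally.

```python
from typing import List, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]
Point = Tuple[float, float]


def box_centers(boxes: List[Box]) -> List[Point]:
    """Return the center point of each bounding box."""
    return [((x0 + x1) / 2.0, (y0 + y1) / 2.0) for x0, y0, x1, y1 in boxes]


def count_by_centers(boxes: List[Box], min_dist: float = 0.0) -> Tuple[int, List[Point]]:
    """Count objects by their centers.

    A center is kept only if it lies farther than min_dist from every
    center already kept, so overlapping duplicate detections of the
    same fruit are counted once.
    """
    kept: List[Point] = []
    for cx, cy in box_centers(boxes):
        if all((cx - kx) ** 2 + (cy - ky) ** 2 > min_dist ** 2 for kx, ky in kept):
            kept.append((cx, cy))
    return len(kept), kept


if __name__ == "__main__":
    # Two overlapping detections of one fruit plus a separate fruit.
    detections = [(0, 0, 10, 10), (1, 1, 11, 11), (100, 100, 120, 120)]
    count, centers = count_by_centers(detections, min_dist=5.0)
    print(count, centers)
```

The greedy merge depends on detection order and is only meant to show the idea; a real pipeline would more likely rely on non-maximum suppression or the detector's own deduplication.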

Segment Furniture

Segment all the furniture in the living room.

Detect Cars and Pedestrians

Detect all the cars and pedestrians at the intersection.

Detect Light Sources

Detect the light sources in the image.

UI Element Detection

Detect all the UI elements in this image and visualize the boxes.

Segment Faces in News Image

Segment out the faces in this image and visualize the segments.

Tennis Match Animation and Narration

Write an exciting sports narrative describing the action in this image, including player dynamics and game context, and then animate a video of it.

Teleport To Mars

Generate an image of this girl on Mars. I want her in front of a futuristic settlement house on Mars, as if she has been there for her whole life, enjoying the day. I want her outfit and pose to blend into the surroundings and look natural.

Architectural Visualization from Floor Plan

Give a detailed explanation of the plan and generate detailed 3D architectural visualizations and interior design concepts based on the plan

Architectural 3D Visualization

Generate detailed 3D architectural visualizations and interior design concepts based on this floor plan, including furniture placement and lighting.

Poster Layout Understanding

Get the layout of the document and visualize all the text

Passport Details Extraction

Extract all the passport details, such as name, DOB, etc., and visualize them with boxes around the locations they were extracted from.

Blur Patient Details

Can you blur out the patient name, dob, sex and marriage status?

Extract Handwritten Fields

Parse this document and extract all the handwritten fields. Detect the locations of each of the fields and visualize them.

Clock Time Extraction

Crop into the clock in the image and extract the time shown

Blur License Plates

First detect the license plates in the image and then blur them.

Image Enhancement

Analyze the quality of this image and enhance it to a high resolution.

PowerPoint Page Extraction

Extract the first and last page from the document, visualize and analyze them

Fireworks Highlight

Get the best highlight showing the most fireworks in this video

Keynote Video Trimming

Crop the first 3 minutes of the video and visualize it

Energy Bill Text Extraction

Extract any text visible in the document along with its location, and visualize the results.