dwb2023 committed
Commit 7d59523 · verified · 1 Parent(s): 97ec49f

Update README.md

Files changed (1)
  1. README.md +7 -5
README.md CHANGED
@@ -10,12 +10,12 @@ pinned: false
  license: openrail
  ---
 
- # Use of Landing AI for brain tumor detection
+ # Using Landing AI's Vision Agent to architect an app for brain tumor detection
 
  - a quick overview of the inner workings of LandingAI's Vision Agent, how it breaks down an initial user requirement to identify candidate components in the application architecture.
- - the diagram below captures what I had in mind for a multi-agent system but LandingAI's vision agent starts this much earlier, taking a fresh approach on old school architecture trade-off analysis.
- - if you want a deeper understanding of the run-time flow of the application I encourage you to instrument it with Weave. Additional information in [this GitHub repo](https://github.com/donbr/vision-agent).
- - the flow in the most recent version of the official [Vision Agent](https://va.landing.ai/agent) app has shifted somewhat, but the number of concepts it helped bring together for me was amazing.
+ - the diagram below captures what I had in mind for a multi-agent system implementation -- but LandingAI's vision agent starts this much earlier, taking a fresh approach on old school architecture trade-off analysis.
+ - the design-time flow in the most recent version of the official [Vision Agent](https://va.landing.ai/agent) app has shifted somewhat, but the number of concepts it helped bring together for me was amazing.
+ - if you want a deeper understanding of the run-time flow of the application I encourage you to instrument it with Weave. Additional information on how to instrument the app can be found in [this GitHub repo](https://github.com/donbr/vision-agent).
 
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/653d62fab16f657d28ce2cf2/KPV1Szj6IkY457n3Hqjl6.png)
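The bullet above suggests instrumenting the run-time flow with Weave. A minimal sketch of what that instrumentation looks like, assuming the `weave` package is installed and using a hypothetical `detect_brain_tumor` function as a stand-in for a real Vision Agent tool call:

```python
import weave

# Hypothetical W&B project name; traces are grouped under it.
weave.init("vision-agent-demo")

@weave.op()  # records inputs, outputs, and latency for every call
def detect_brain_tumor(image_path: str) -> list:
    # Stand-in for the real tool invocation; returns a fake detection.
    return [{"label": "tumor", "score": 0.91, "bbox": [120, 80, 210, 160]}]

detect_brain_tumor("scan_001.png")  # this call now shows up in the Weave UI
```

Decorating each tool function this way yields a per-call trace tree without changing the application logic.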
@@ -271,6 +271,8 @@ overlay_segmentation_masks(image: numpy.ndarray, masks: List[Dict[str, Any]]) ->
 
  ## Vision Agent Tools - model summary
 
+ - any mistakes in the following table are mine; it reflects my quick reverse engineering to identify the target models.
+
  | Model Name | Hugging Face Model | Primary Function | Use Cases |
  |---------------------|-------------------------------------|-------------------------------|--------------------------------------------------------------|
  | OWL-ViT v2 | google/owlv2-base-patch16-ensemble | Object detection and localization | - Open-world object detection<br>- Locating specific objects based on text prompts |
@@ -279,7 +281,7 @@ overlay_segmentation_masks(image: numpy.ndarray, masks: List[Dict[str, Any]]) ->
  | CLIP | openai/clip-vit-base-patch32 | Image-text similarity | - Zero-shot image classification<br>- Image-text matching |
  | BLIP | Salesforce/blip-image-captioning-base | Image captioning | - Generating text descriptions of images |
  | LOCA | Custom implementation | Object counting | - Zero-shot object counting<br>- Object counting with visual prompts |
- | GIT v2 | microsoft/git-base-textcaps | Visual question answering and image captioning | - Answering questions about image content<br>- Generating text descriptions of images |
+ | GIT v2 | microsoft/git-base-vqav2 | Visual question answering and image captioning | - Answering questions about image content<br>- Generating text descriptions of images |
  | Grounding DINO | groundingdino/groundingdino-swint-ogc | Object detection and localization | - Detecting objects based on text prompts |
  | SAM | facebook/sam-vit-huge | Instance segmentation | - Text-prompted instance segmentation |
  | DETR | facebook/detr-resnet-50 | Object detection | - General object detection |
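To sanity-check the table, each Hugging Face checkpoint can be exercised directly. A sketch for the OWL-ViT v2 row using the `transformers` zero-shot object detection pipeline; the image path and candidate labels are placeholders:

```python
from PIL import Image
from transformers import pipeline

# Checkpoint taken from the OWL-ViT v2 row of the table above.
detector = pipeline(
    "zero-shot-object-detection",
    model="google/owlv2-base-patch16-ensemble",
)

image = Image.open("brain_scan.png")  # hypothetical input image
for hit in detector(image, candidate_labels=["tumor", "lesion"]):
    print(hit["label"], round(hit["score"], 3), hit["box"])
```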
 
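The hunk headers above quote the signature of `overlay_segmentation_masks`. Its real implementation lives in the vision-agent codebase; the sketch below is only one plausible version, assuming each mask dict carries a boolean `mask` array with the same height and width as the image:

```python
from typing import Any, Dict, List

import numpy as np

def overlay_segmentation_masks(
    image: np.ndarray, masks: List[Dict[str, Any]]
) -> np.ndarray:
    """Alpha-blend a distinct translucent color over each mask region."""
    out = image.astype(np.float32).copy()
    rng = np.random.default_rng(0)  # fixed seed -> stable colors across runs
    for m in masks:
        color = rng.integers(0, 256, size=3).astype(np.float32)
        region = m["mask"].astype(bool)
        out[region] = 0.5 * out[region] + 0.5 * color
    return out.astype(np.uint8)
```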