Jason Corkill

jasoncorkill

AI & ML interests

Human data annotation

Recent Activity

View all activity

Organizations

Blog-explorers's profile picture Hugging Face Discord Community's profile picture mlo-data-collab's profile picture Rapidata's profile picture

jasoncorkill's activity

reacted to their post with ❀️πŸ”₯πŸš€ about 11 hours ago
view post
Post
1639
πŸš€ Building Better Evaluations: 32K Image Annotations Now Available

Today, we're releasing an expanded version: 32K images annotated with 3.7M responses from over 300K individuals which was completed in under two weeks using the Rapidata Python API.

Rapidata/text-2-image-Rich-Human-Feedback-32k

A few months ago, we published one of our most liked dataset with 13K images based on the @data-is-better-together 's dataset, following Google's research on "Rich Human Feedback for Text-to-Image Generation" (https://arxiv.org/abs/2312.10240). It collected over 1.5M responses from 150K+ participants.

Rapidata/text-2-image-Rich-Human-Feedback

In the examples below, users highlighted words from prompts that were not correctly depicted in the generated images. Higher word scores indicate more frequent issues. If an image captured the prompt accurately, users could select [No_mistakes].

We're continuing to work on large-scale human feedback and model evaluation. If you're working on related research and need large, high-quality annotations, feel free to get in touch: [email protected].
posted an update about 12 hours ago
view post
Post
1639
πŸš€ Building Better Evaluations: 32K Image Annotations Now Available

Today, we're releasing an expanded version: 32K images annotated with 3.7M responses from over 300K individuals which was completed in under two weeks using the Rapidata Python API.

Rapidata/text-2-image-Rich-Human-Feedback-32k

A few months ago, we published one of our most liked dataset with 13K images based on the @data-is-better-together 's dataset, following Google's research on "Rich Human Feedback for Text-to-Image Generation" (https://arxiv.org/abs/2312.10240). It collected over 1.5M responses from 150K+ participants.

Rapidata/text-2-image-Rich-Human-Feedback

In the examples below, users highlighted words from prompts that were not correctly depicted in the generated images. Higher word scores indicate more frequent issues. If an image captured the prompt accurately, users could select [No_mistakes].

We're continuing to work on large-scale human feedback and model evaluation. If you're working on related research and need large, high-quality annotations, feel free to get in touch: [email protected].
replied to their post 5 days ago
reacted to their post with ❀️πŸ”₯πŸš€ 17 days ago
view post
Post
3249
πŸš€ We tried something new!

We just published a dataset using a new (for us) preference modality: direct ranking based on aesthetic preference. We ranked a couple of thousand images from most to least preferred, all sampled from the Open Image Preferences v1 dataset by the amazing @data-is-better-together team.

πŸ“Š Check it out here:
Rapidata/2k-ranked-images-open-image-preferences-v1

We're really curious to hear your thoughts!
Is this kind of ranking interesting or useful to you? Let us know! πŸ’¬

If it is, please consider leaving a ❀️ and if we hit 30 ❀️s, we’ll go ahead and rank the full 17k image dataset!
Β·
replied to their post 17 days ago
posted an update 17 days ago
view post
Post
3249
πŸš€ We tried something new!

We just published a dataset using a new (for us) preference modality: direct ranking based on aesthetic preference. We ranked a couple of thousand images from most to least preferred, all sampled from the Open Image Preferences v1 dataset by the amazing @data-is-better-together team.

πŸ“Š Check it out here:
Rapidata/2k-ranked-images-open-image-preferences-v1

We're really curious to hear your thoughts!
Is this kind of ranking interesting or useful to you? Let us know! πŸ’¬

If it is, please consider leaving a ❀️ and if we hit 30 ❀️s, we’ll go ahead and rank the full 17k image dataset!
Β·
reacted to their post with πŸ”₯πŸ‘€πŸš€ 19 days ago
view post
Post
3042
πŸ”₯ Yesterday was a fire day!
We dropped two brand-new datasets capturing Human Preferences for text-to-video and text-to-image generations powered by our own crowdsourcing tool!

Whether you're working on model evaluation, alignment, or fine-tuning, this is for you.

1. Text-to-Video Dataset (Pika 2.2 model):
Rapidata/text-2-video-human-preferences-pika2.2

2. Text-to-Image Dataset (Reve-AI Halfmoon):
Rapidata/Reve-AI-Halfmoon_t2i_human_preference

Let’s train AI on AI-generated content with humans in the loop.
Let’s make generative models that actually get us.
posted an update 19 days ago
view post
Post
3042
πŸ”₯ Yesterday was a fire day!
We dropped two brand-new datasets capturing Human Preferences for text-to-video and text-to-image generations powered by our own crowdsourcing tool!

Whether you're working on model evaluation, alignment, or fine-tuning, this is for you.

1. Text-to-Video Dataset (Pika 2.2 model):
Rapidata/text-2-video-human-preferences-pika2.2

2. Text-to-Image Dataset (Reve-AI Halfmoon):
Rapidata/Reve-AI-Halfmoon_t2i_human_preference

Let’s train AI on AI-generated content with humans in the loop.
Let’s make generative models that actually get us.