Forum Post - e6AI

Topic: A lukewarm defense of the quality standards

Silvicultor

Member

sharpy said:
Soooo it's a kind of honeypot, give us training data and we'll let you see stuff others post? Not sure if I like it but okay.

I’m not saying that the e6ai team is doing this. But from what I’ve heard, the team had the training data issue in mind when setting up the new rules.
You know how furry AI models were made? Someone took a “normal”, censored image gen model like base-SD1.5 or base-SDXL, scraped tons of images from e621 and finetuned it with these images. That’s how furry AI was born.
And these attempted signatures come from the real signatures from all the artists on e621, they are sort of an echo of these.
Future AI models will be much bigger than current ones, so “real data” won’t be enough to train them, because there isn’t enough (high quality) “real data”. So you need good synthetic data. And the better this synthetic data is the better the future furry AI models will be. We will all benefit from this, no matter if you use Civitai or gen locally.

sharpy said:
It would not take any more effort from the mod team than deleting takes currently. Not even a little bit. Maybe even add these to default blacklist so logged-in users need to enable them manually if the mods don't want to show that stuff to outsiders?

Adding these to the blacklist wouldn’t really help, because the problem is the posts that are not correctly tagged. We would need many extra tags to cover all the quality issues. And many users are not aware themselves of what is wrong with their image. But six-fingered hands, attempted signatures and GAN artifacts are all poison to the training.
So we would need many tags like 6_fingers, GAN_artifacts, melting_fingers, double_knee etc. And all these tags would have to be applied correctly by the uploader, which wouldn’t happen. So instead the staff team would have to add these tags. But doing so would consume much more time than deleting the post. And remember, the janitors are doing all this in their free time, nobody is paying them for it!
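To make the tagging problem concrete, here is a minimal sketch of what a tag-based filter for training data might look like. Everything in it is made up for illustration: the post records, the has_defect field and the filter_for_training helper are hypothetical, and the tag names are just the examples from above, not real site tags. It only shows that such a filter catches defects that were tagged and lets untagged ones through.

```python
# Hypothetical defect tags, taken from the examples in this post (not real site tags).
DEFECT_TAGS = {"6_fingers", "GAN_artifacts", "melting_fingers", "double_knee"}


def filter_for_training(posts):
    """Keep only posts that carry none of the defect tags."""
    return [p for p in posts if not (set(p["tags"]) & DEFECT_TAGS)]


# Made-up posts; "has_defect" is ground truth a real scraper would NOT have.
posts = [
    {"id": 1, "tags": ["wolf", "forest"], "has_defect": False},
    {"id": 2, "tags": ["fox", "6_fingers"], "has_defect": True},
    {"id": 3, "tags": ["cat"], "has_defect": True},  # defect present but never tagged
]

kept = filter_for_training(posts)
kept_ids = [p["id"] for p in kept]

# Posts that passed the filter even though they contain a defect.
leaked = [p["id"] for p in kept if p["has_defect"]]

print(kept_ids)
print(leaked)
```

In this toy data, post 2 is filtered out because it was tagged, while post 3 ends up in the training set anyway. That is the case the tags can’t solve unless someone applies them to every post.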

tyto4tme4l said:
Wouldn't examples of various issues (properly tagged) be useful for training? They could be put in the negative prompt to hopefully enhance the quality of the output.

Not really. It is less problematic if they are tagged, but it’s better if they aren’t there in the first place. I know that from my own LoRA training.

sharpy said:
Finding the perfect balance isn't. But currently the mods overcorrected so ridiculously far away from the "sweet spot" that making the situation much better than it is would take no effort at all. Not perfect. But much better.

I don’t dare to judge what the right balance is. But I can really understand that someone feels frustrated because they can’t really participate. It would be great if everyone could join the fun while avoiding the above-mentioned pitfalls. Sadly, I don’t really know how to achieve this.