So I was thinking about making some kind of neural network to cure cancer...
but then I stumbled upon this video by Avery Miller and thought "Hey, furries would be fun!"
He tried training a GAN to generate fursonas, but his results were not so great. And I thought maybe I could improve it.
So here it is! The fursona generator in all its glory. The download link and source code are in the description as always.
So now, let's jump down this rabbit hole and figure out how I got this result.
Collecting data wasn't too difficult. There's so much furry stuff online!
I just searched for "furry headshot" on DeviantART and wrote a scraper to download the first 15,000 results.
Just looking through the data, I can already tell this is gonna be really challenging.
Furry art is just insanely colorful and crazy, with so many different styles.
It's a much harder problem than something like human faces. Like, there's hand-drawn stuff, cartoony styles, Disney style,
here's literally someone's dog.
And they're not even all headshots, despite my search. So to get any kind of decent result
I needed to remove some of the outliers.
I decided to remove all black and white drawings and non-headshots,
but I couldn't think of an automated way to do it reliably. So guess what I spent the next three hours doing?
Yes, I manually sifted through
15,000 furry images.
Oh, this was definitely one of those times I began to question my sanity halfway through.
Anyway, I pared it down to 10,000 images and then started training.
I'm using the same hybrid GAN technique that I used in my Garfield video, but instead of using an embedding encoder,
I'm using a regular auto-encoder, because it would be nice to also convert images into fursonas. So here are my initial results.
I mean, they definitely have furry-like qualities, but maybe through the eyes of Picasso or something.
It just seems like there's still too much variety for the network to really generalize any features,
so I had to take drastic measures yet again. This time,
I decided to really clamp down on a single style, which I call the Disney style.
And again, I went through the images keeping only those I deemed perfect.
This time I got the data set all the way down to
250 images. That's going to be a challenge, though, because you usually need a lot more training samples to make a good network.
But luckily, there's a trick I can use called data augmentation that I've used in most of my past projects
but haven't really talked about yet.
Basically, when you don't have enough data,
sometimes you can just make more by augmenting what you have. One easy way is mirroring images.
Adding horizontally flipped images creates new and valid fursonas. So that's an instant doubling to 500.
Sometimes you can even use vertical flips and rotations, but that isn't helpful here.
Another thing to notice is that the fursonas can have basically any color and still look good.
So I also added a copy with the green and blue channels swapped. So double it again to a thousand. Lastly,
I also included random translations of all the images which brings it into the tens of thousands. Although the images are technically unique,
they're not as unique as if they were completely handcrafted.
But it's a hell of a lot better than the 250 we started out with.
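Here's a rough sketch of those three augmentations in Python with NumPy. It isn't the code from the project. It assumes each image is a height x width x 3 RGB array, and the function names, the number of shifts per image, the maximum shift, and the white fill color for uncovered pixels are all made up for illustration.

```python
import numpy as np

def mirror(img):
    """Horizontal flip: reverse the column (width) axis."""
    return img[:, ::-1, :]

def swap_green_blue(img):
    """Reorder the channels from RGB to RBG, i.e. swap green and blue."""
    return img[:, :, [0, 2, 1]]

def translate(img, dx, dy, fill=255):
    """Shift the image dx pixels right and dy pixels down.

    Pixels uncovered by the shift are set to `fill` (white is an assumption).
    """
    h, w = img.shape[:2]
    if abs(dx) >= w or abs(dy) >= h:
        raise ValueError("shift must be smaller than the image")
    out = np.full_like(img, fill)
    out[max(0, dy):min(h, h + dy), max(0, dx):min(w, w + dx)] = \
        img[max(0, -dy):min(h, h - dy), max(0, -dx):min(w, w - dx)]
    return out

def augment(images, shifts_per_image=10, max_shift=4, seed=0):
    """Return flipped, recolored, and randomly shifted copies of each image.

    shifts_per_image and max_shift are illustrative values, not the
    settings used in the video.
    """
    rng = np.random.default_rng(seed)
    out = []
    for img in images:
        flipped = [img, mirror(img)]                                  # x2
        recolored = flipped + [swap_green_blue(v) for v in flipped]   # x4
        for variant in recolored:
            for _ in range(shifts_per_image):
                dx, dy = rng.integers(-max_shift, max_shift + 1, size=2)
                out.append(translate(variant, int(dx), int(dy)))
    return out
```

With 250 images, the flip and the channel swap give 250 x 2 x 2 = 1,000 variants, and ten random shifts of each would give 10,000. The actual number of translations per image isn't stated, so treat that last figure as an example only.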
This is when I finally started to get some good results. And I have to admit, some of these results were just too good.
I was very suspicious that
I was overfitting. But the auto-encoded images actually deviated significantly from the ground truth, and
random fursonas don't appear to match anything in the training set either,
so at least I'm confident it's not mimicking any training images. It really did learn to just copy this style. They're not all great, though,
OH NO
It can go from cute-looking to absolutely horrifying demon. A common thing to see is multiple eyes in the wrong places or
double faces. For the most part, though, this project really exceeded my expectations. And
remember how I said we have a full encoder too? Well, that means it can also convert human faces to fursonas.
It didn't really work as well. Not too surprising, since it's never actually seen anything other than furries.
But it gets the colors right at least. And
Holy crap
My channel got into the YouTube
recommendation algorithm somehow, so a big welcome to the 95% of you that joined in the last couple of weeks.
I was not expecting to grow so quickly. As a thank you, I want to do a Q&A video soon.
So if you have any questions
you want me to answer, leave them in the comments, upvote the ones you like, and I'll try to answer a good chunk of them.
Thanks for watching!
*playful music*