AI LORAs for Wildeer's Lara Croft (Development Thread)

Sepheyer · Sep 7, 2023

If you are also obsessed with Wildeer's Lara Croft, this thread might be for you.

Current AI tools, such as Stable Diffusion image generators, let users capture the likeness of a particular character via the LORA technique. Many AI fakes of celebrities probably use this same technique to generate fan art of them.

In this thread we'll attempt to create the world's best LORA for Wildeer's Lara Croft model.

You are welcome to contribute either with the original images, with crops, with tags, with Kohya settings or by generating the final LORAs using either of the aforementioned ingredients.

Images (originals, all resolutions)	Credit to these guys. link 1 link 2 link 3
Cropped Images for Training Datasets
512x512 (WxH)	link 1 link 2 link 3
Final LORA Links	v1.0.2 link v1.0.0 link

If you have never heard of Stable Diffusion, here are two links:

Art created with stable diffusion: link

Stable Diffusion learning thread: link

Sepheyer · Sep 7, 2023

Set 1. Credit to these guys.

(Missing 35-39.)

Mr-Fox · Sep 7, 2023

A quick tip: don't use images that look too washed out. Adjust them in Photoshop with the Camera Raw filter. Raise "clarity", but be very careful with it, because too much can make the image too harsh. Also adjust "dehaze" to remove the haziness.
With adjustments in Photoshop you can make a more consistent final Lora.

Resized original	Camera raw filter and smart sharpen

I have uploaded the psd file to my gofile folder so that you can see the settings I have used by simply clicking on the added filters in the image layer. You don't have to use these exact settings, it's only an example.
Play around with them and see what you like and prefer.

Sepheyer · Sep 7, 2023

Set 2. Credit to these guys.

Sepheyer · Sep 7, 2023

Set 3. Credit to these guys.

(This is all the images I have.)

Sepheyer · Sep 8, 2023

Crops 512x512, 001 - 064.

Sepheyer · Sep 8, 2023

Crops 512x512, 065 - 128.

Sepheyer · Sep 8, 2023

Crops 512x512, 129 - 155.

Mr-Fox · Sep 8, 2023

off topic kinda...

There are many different versions and creations of Lara. I found a guy on DeviantArt who makes AI-generated images of Lara.

Most 3D artists who make renders with Daz Studio have created at least a couple of renders of Lara.

There have been several releases of Lara for Daz Studio, either ported from or based on the games being released.
3D artists might alter them or put their own spin on them.

Sepheyer · Sep 10, 2023

Get the LORA here: [link, registration required]

Activation token: WildeerLaraCroft.
Settings for the attached images are: 0.7 / 1.0.

I also attached the Kohya SS settings file. The tags were WD1.4 automatic tags except the following were removed: "1girl, solo, realistic" (I think).
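The caption cleanup described above can be sketched in a few lines of Python. This is only an illustration, not the script actually used for the LORA: the function names and the one-.txt-caption-per-image layout are assumptions, while the activation token and the three removed tags come from the post.

```python
from pathlib import Path

ACTIVATION_TOKEN = "WildeerLaraCroft"           # from the post
REMOVED_TAGS = {"1girl", "solo", "realistic"}   # from the post ("I think")


def clean_caption(caption: str) -> str:
    """Drop the removed tags and put the activation token first."""
    tags = [t.strip() for t in caption.split(",")]
    kept = [
        t for t in tags
        if t and t not in REMOVED_TAGS and t != ACTIVATION_TOKEN
    ]
    return ", ".join([ACTIVATION_TOKEN] + kept)


def clean_caption_dir(folder: str) -> int:
    """Rewrite every .txt caption in a folder in place (hypothetical layout)."""
    count = 0
    for path in sorted(Path(folder).glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        path.write_text(clean_caption(text), encoding="utf-8")
        count += 1
    return count


print(clean_caption("1girl, solo, long hair, ponytail, realistic"))
```

Running a captioned copy of the dataset through `clean_caption_dir` before training keeps the original WD1.4 output around for comparison.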

Mr-Fox · Sep 10, 2023

Good job! These are only portraits though. Have you tried different compositions like full body, full torso, half torso, etc.? Can you also post an example of a typical caption so we can use it for testing? Which model did you train it on, and which checkpoint did you use for the example images? I don't use Comfy and I would rather not start Kohya. Did you use clip skip 2 for the training, or is there anything else we need to know about the training? How many images did you use?

Sepheyer · Sep 10, 2023

Yeaa, the LORA might have been a dud. The 1:1 images are superb, but 1:1.5 have 99% failure rate producing WTFs. Might be that I haven't found the settings yet, so imma experiment for a bit longer.

But I gotta say, this LORA nailed, nailed I am telling you, the 512x512 portraits. With the prompt from the spoiler (not preserved here) you get these lookers (same seed 592805119372563 but different models):

Sepheyer · Sep 10, 2023

I am trying to wrap my head around what went wrong with the LORA training. The 512x512 images are great, the 1024x1024 are aight. 1024x1024:

The tall images are trash.

I wonder what went wrong. Now, my other LORA, Sina's, was trained at nearly the same settings as Lara's (too bad I was a moron and forgot to save the Kohya settings there). I think the major difference is that for Lara I picked ~150 images to train on (every 512x512 crop in the original post of this thread) vs ~50 images for Sina. Another change I made was setting Lara's max token length to 225 instead of the 75 used for Sina. And finally, while both had their images automatically captioned via WD1.4, Lara's captioning had the following tags excluded: "1girl, solo, realistic".

Now, I reiterate that both LORAs were trained on 512x512, and yet Sina's can do very tall images no problem. You can see how the LORA captures not only the character it was trained on but also the general details of the HoneySelect2 models that generated Sina in the first place.

So, yea, see? Sina's LORA was trained exclusively on 512x512 and it has no trouble doing long-ass images. Also, Sina's LORA is rather crap when it needs to render 512x512 but delivers on anything larger than that. So, something went wrong, and Lara's LORA is the inverse, in usability, of Sina's. Too many images? Welp, Wildeer's training set turned out to be less diverse than I thought it would be: the backgrounds are mostly dark gradients with lots of ... (FML what's the name of that gfx light effect, sounds like "bleech"). So yea, I thought having ~160 images would be a pro, not a con. Was the learning rate too high for that many images? Should I halve the rate next time?

So, as I am debating, I think the next reasonable step is to change the settings to the exact ones that Sina was trained on and to lower the learning rate by half. Naturally, I am open to ideas as imma sit on my hands for the next few days. On the other hand, I hate the idea of running the training for ~40 hours. Maybe I'll do a 20-image training to see how well the merge works. Anything to avoid that ~40 hour-long training PTSD.
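For the quick 20-image trial, a reproducible way to pick the subset could look like the sketch below. The file names (`001.png` through `155.png`) are hypothetical stand-ins for the 512x512 crops, and the fixed seed is only there so the same subset can be re-used when comparing settings.

```python
import random


def pick_subset(filenames, k=20, seed=0):
    """Return k file names chosen at random, the same ones for a given seed."""
    ordered = sorted(filenames)
    if k >= len(ordered):
        return ordered
    rng = random.Random(seed)  # local generator, does not touch global state
    return sorted(rng.sample(ordered, k))


# Hypothetical names for the 155 crops posted earlier in the thread.
crops = [f"{i:03d}.png" for i in range(1, 156)]
subset = pick_subset(crops)
print(len(subset), subset[:3])
```

Keeping the subset fixed means a change in results between two short runs is more likely down to the setting that was changed than to the images.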

Mr-Fox · Sep 10, 2023

I have seen the same as you. Got to use a low-ish weight of 0.4 and with clipskip 2 and some prompting you can get something half decent.

How many images did you use for the training? I think you need to be more selective in which images you include in the dataset.
Don't use images with multiple subjects, such as animals. Don't use images with overly complicated poses, strange facial expressions, or hands covering the face. If she is holding a gun or an item, it needs to be tagged in the captions. In fact, everything in the image needs to be described and tagged in the captions, so that it has to be described and tagged in the prompt for it to show up when generating images. This is how you get a controllable Lora; otherwise things get baked into it, and then they are very difficult to get rid of with prompting. Having too large a dataset is also something to avoid.
It's recommended to use 35-100 images for a character Lora. Only one epoch is not going to be enough; I used 3 epochs for my Lora. Set it to 5 and have it save a Lora each epoch so you can test which one is the best. Most likely 2-3 epochs is going to be good.
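To see how dataset size and epoch count trade off, here is a rough step-count estimate. It assumes the usual Kohya-style arithmetic of images times repeats per epoch, divided by batch size; the repeat count of 10 is purely an assumption for illustration, since the thread does not state it, and bucketing can shift the real numbers slightly.

```python
import math


def total_steps(num_images, repeats, epochs, batch_size=1):
    """Approximate optimizer steps for a run (ignores bucketing effects)."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs


# 155 crops for one epoch vs. a 50-image set for three epochs,
# both with an assumed 10 repeats per image.
print(total_steps(155, repeats=10, epochs=1))
print(total_steps(50, repeats=10, epochs=3))
```

Under these assumptions a smaller, more selective dataset run for several epochs costs about the same number of steps as the full set run once, while also giving a checkpoint per epoch to compare.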

Sepheyer · Sep 10, 2023

I used all of the images from the 512x512 crops linked to the original post. Indeed, I read Rentry's guide, tested it out a bit, probably messed up something somewhere, and decided to avoid his advice entirely instead building on top of what already worked. Given how well 512x512 captured Wildeer's Lara, I am convinced the process is on the right track, it is a mere one or two settings somewhere. And yes, I do like the idea of adding more epochs and lowering the learning rate.

Mr-Fox · Sep 10, 2023

Did you use the same model for the training? Which one did you use? If the model has flaws, they are likely to be inherited by the Lora as well. Either use the base model or pick the most consistent one you can find that is also responsive. I used elegance for Kendra. I tried clarity, experience and deliberate; elegance worked the best in my case.
A really slow learning rate is the ticket. The rentry guide also talks about ways to dampen the training, such as adding a very small amount of noise offset. I used 0.1 I think; it should be in the settings I posted.
Noise offset: "Increases dynamic range (brighter brights, darker darks). May "deep fry" if set too high."
Use AdamW8bit if you can.
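Pulling the suggestions from this thread into one place, the sketch below builds an argument list for Kohya's `train_network.py`. The flag names are written as I recall them from the kohya-ss sd-scripts and should be checked against your installed version. The paths and the learning rate of 5e-5 are placeholders, not the settings anyone in this thread actually used. The offset of 0.1, AdamW8bit, 5 epochs with a save per epoch, clip skip 2, and the 225 token length come from the posts.

```python
def build_training_args(base_model, data_dir, output_dir, learning_rate=5e-5):
    """Assemble a command-line argument list; nothing is executed here."""
    options = {
        "pretrained_model_name_or_path": base_model,  # placeholder path
        "train_data_dir": data_dir,                   # placeholder path
        "output_dir": output_dir,                     # placeholder path
        "network_module": "networks.lora",
        "resolution": "512,512",
        "learning_rate": learning_rate,   # assumed value, "halved" is relative
        "optimizer_type": "AdamW8bit",
        "noise_offset": 0.1,
        "max_train_epochs": 5,
        "save_every_n_epochs": 1,         # keep one Lora per epoch for testing
        "clip_skip": 2,
        "max_token_length": 225,
    }
    args = ["train_network.py"]
    for flag, value in options.items():
        args += [f"--{flag}", str(value)]
    return args


print(" ".join(build_training_args("base.safetensors", "train", "out")))
```

The list form can be passed to `subprocess.run` (usually behind `accelerate launch`) once the flags have been verified, which avoids shell-quoting problems with paths.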

Sepheyer · Sep 10, 2023

Good call! It completely escaped me that Lara was off Zovya's Photoreal 2, while Sina was off Photogen 3.4. Duh. Good call, thanks!!!

me3 · Sep 10, 2023

Sepheyer said:
I used all of the images from the 512x512 crops linked to the original post. Indeed, I read Rentry's guide, tested it out a bit, probably messed up something somewhere, and decided to avoid his advice entirely instead building on top of what already worked. Given how well 512x512 captured Wildeer's Lara, I am convinced the process is on the right track, it is a mere one or two settings somewhere. And yes, I do like the idea of adding more epochs and lowering the learning rate.

Just to throw you a curve ball about applying "logic" to training. As you pointed out, my initial training kept the leotard. To try to fix another issue I deleted one image from my set, a nude one, and left everything else unchanged. Images from that training had no leotard, and it was pretty hard to get it back with prompting. Judging what impact something has can be a nightmare, so while it might seem you're on the right track, it might just be a nicely made dead end.
Currently sitting on >300 deleted LORAs from trainings on this, I can tell you there are a lot of nicely paved, glittering roads with hard endings... and a lot of GPU-generated heating.

Mr-Fox · Sep 10, 2023

wow dude.. 300

I guess that's also a way of heating your house..

Sepheyer · Sep 10, 2023

So, using WD1.4 for tagging. There's a threshold setting for items and for characters.

Let's see how it treats this photo using different settings of General threshold (Adjust `general_threshold` for pruning tags (less tags, less flexible)) and Character threshold (useful if you want to train with character).

SmilingWolf/wd-v1-4-convnextv2-tagger-v2

General threshold	Character threshold	Tag
0.00	1.00	Too much to be useful: see file 00-10.txt
0.25	0.75	WildeerLaraCroft, 1girl, solo, long hair, breasts, looking at viewer, simple background, brown hair, black hair, gloves, bare shoulders, brown eyes, ponytail, ass, parted lips, black gloves, looking back, fingerless gloves, from behind, nail polish, leotard, lips, shiny skin, thigh strap, blue background, bent over, realistic, kneepits, hand on own ass, mole on ass
0.50	0.50	WildeerLaraCroft, 1girl, solo, long hair, looking at viewer, brown hair, gloves, brown eyes, ponytail, ass, parted lips, looking back, fingerless gloves, blue background, bent over, realistic
0.75	0.25	WildeerLaraCroft, 1girl, solo, long hair, looking at viewer, brown hair, brown eyes, ass, fingerless gloves
1.00	0.00	Too much to be useful: see file 10-00.txt
---	---	---
0.05	0.05	WildeerLaraCroft, 1girl, solo, long hair, breasts, looking at viewer, blush, smile, open mouth, bangs, large breasts, simple background, brown hair, shirt, black hair, gloves, bare shoulders, brown eyes, medium breasts, underwear, standing, panties, ponytail, ass, thighs, parted lips, teeth, sleeveless, black gloves, shiny, looking back, artist name, fingerless gloves, from behind, nail polish, mole, leotard, lips, fingernails, head tilt, gradient, legs, one-piece swimsuit, shiny skin, parted bangs, bare arms, gradient background, tattoo, thigh strap, leaning forward, anus, feet out of frame, cameltoe, watermark, blue background, bent over, tan, blue nails, tanlines, thong, ass grab, realistic, nose, ass focus, partially visible vulva, kneepits, blue leotard, hand on own ass, grabbing own ass, spread ass, anus peek, spanked, mole on ass, slap mark, hands on ass, hands on own ass
0.10	0.10	WildeerLaraCroft, 1girl, solo, long hair, breasts, looking at viewer, bangs, simple background, brown hair, black hair, gloves, bare shoulders, brown eyes, medium breasts, underwear, standing, panties, ponytail, ass, thighs, parted lips, sleeveless, black gloves, shiny, looking back, fingerless gloves, from behind, nail polish, leotard, lips, fingernails, head tilt, shiny skin, tattoo, thigh strap, leaning forward, feet out of frame, cameltoe, blue background, bent over, tan, ass grab, realistic, nose, ass focus, kneepits, hand on own ass, grabbing own ass, spanked, mole on ass, slap mark
0.90	0.90	WildeerLaraCroft, 1girl, solo, ass
0.95	0.95	WildeerLaraCroft, 1girl, solo

So, around 0.10 the tagging is sensitive enough to pick up the slap mark and the leotard. Although the setting has to be at 0.05 to tell the color of the leotard apart.
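The thresholding itself is simple to picture: the tagger gives every tag a confidence score, and only tags at or above the threshold are written to the caption. The sketch below uses made-up scores, chosen only to be consistent with the table above (for example, "blue leotard" shows up at 0.05 but not at 0.10). The real values from the tagger are not in the thread, and whether the tagger compares with `>=` or `>` is an assumption.

```python
def tags_above(scores, threshold):
    """Keep tags whose confidence is at or above the threshold."""
    return [tag for tag, score in scores.items() if score >= threshold]


# Hypothetical confidences, ordered from most to least certain.
scores = {
    "solo": 0.97,
    "long hair": 0.80,
    "ponytail": 0.60,
    "leotard": 0.30,
    "blue leotard": 0.07,
}

for threshold in (0.05, 0.10, 0.50, 0.90):
    print(threshold, tags_above(scores, threshold))
```

A lower threshold gives richer captions but also lets in more doubtful tags, so it is worth reading a few captions by hand before training on them.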

PS. Naturally, this is not definitive, just a directional test.
