AI LORAs for Wildeer's Lara Croft (Development Thread)

Sepheyer

Well-Known Member
Dec 21, 2020
1,517
3,568
dfff14l-1a57b42f-bea6-4a93-b330-e3b242685b73.jpg

If you are also obsessed with Wildeer's Lara Croft, this thread might be for you.

Current AI tools, such as Stable Diffusion image generators, let users capture the likeness of a particular character via the LORA technique. Many AI fakes of celebrities probably use this same technique to generate AI fan art of the respective celebrities.

In this thread we'll attempt to create the world's best LORA for Wildeer's Lara Croft model.

You are welcome to contribute original images, crops, tags, or Kohya settings, or to generate the final LORAs from any of these ingredients.

Images (originals, all resolutions)
Credit to these guys.
link 1
link 2
link 3

Cropped Images for Training Datasets
512x512 (WxH)
link 1
link 2
link 3

Final LORA Links
v1.0.2 link
v1.0.0 link

If you have never heard of Stable Diffusion, here are two links:
Art created with stable diffusion: link
Stable Diffusion learning thread: link
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,400
3,786
A quick tip: don't use images that look too washed out. Adjust them in Photoshop with the Camera Raw filter. Raise "clarity", but be very careful with it, because too much can make the image too harsh. Also adjust "dehaze" to remove the haziness.
With adjustments in Photoshop you can make a more consistent final Lora.
Left: resized original. Right: Camera Raw filter and smart sharpen.
dfbaiaq-ce57f8f2-bd93-4acf-8e5d-3258a493f15c copy.png dfbaiaq-ce57f8f2-bd93-4acf-8e5d-3258a493f15c copy2.png

I have uploaded the PSD file to my gofile folder so that you can see the settings I used by simply clicking on the added filters in the image layer. You don't have to use these exact settings; it's only an example.
Play around with them and see what you prefer.
 
  • Red Heart
Reactions: Sepheyer

Mr-Fox

Well-Known Member
Jan 24, 2020
1,400
3,786
off topic kinda...

There are many different versions and creations of Lara. I found a guy on DeviantArt who makes AI-generated images of Lara.


1694185417982.png 1694186789789.png 1694186860819.png

Most 3D artists who make renders with Daz Studio have created at least a couple of renders of Lara.


1694185579806.png


1694188090340.png 1694188125786.png


1694188807270.png 1694188847595.png 1694188898225.png

There have been several releases of Lara for Daz Studio, either ported from or based on the games being released.
3D artists might alter them or put their own spin on them.





1694185948225.png 1694185829166.png 1694185861798.png


1694186382469.png 1694186406874.png


1694186496850.png 1694186658190.png


1694187107269.png 1694187133336.png 1694187163531.png 1694187180572.png
 

Sepheyer

Well-Known Member
Dec 21, 2020
1,517
3,568
Get the LORA here: .

Activation token: WildeerLaraCroft.
Settings for the attached images are: 0.7 / 1.0.

I also attached the Kohya SS settings file. The captions are WD1.4 automatic tags, except that the following tags were removed: "1girl, solo, realistic" (I think).
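
For anyone who wants to repeat the tag removal on their own captions, here is a minimal Python sketch. It assumes each caption is a comma-separated tag list like the WD1.4 output; the name `strip_tags` is made up for illustration and is not part of Kohya or the tagger.

```python
# Tags dropped from the automatic captions, as listed in the post above.
EXCLUDED_TAGS = {"1girl", "solo", "realistic"}


def strip_tags(caption: str, excluded=EXCLUDED_TAGS) -> str:
    """Remove the excluded tags from a comma-separated caption string."""
    tags = [t.strip() for t in caption.split(",")]
    kept = [t for t in tags if t and t not in excluded]
    return ", ".join(kept)


example = "WildeerLaraCroft, 1girl, solo, long hair, brown hair, realistic"
print(strip_tags(example))
```

To process a whole dataset you would read each caption .txt file, pass its contents through `strip_tags`, and write the result back.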

a_16674_.png a_16676_.png a_16677_.png
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,400
3,786
Good job! These are only portraits though. Have you tried different compositions, like full body, full torso, half torso, etc.? Can you also post an example of a typical caption so we can use it for testing? Which model did you train it on, and which checkpoint did you use for the example images? I don't use Comfy and I would rather not start Kohya. Did you use clipskip 2 for the training, or is there anything else we need to know about the training? How many images did you use?
 
  • Like
Reactions: Sepheyer

Sepheyer

Well-Known Member
Dec 21, 2020
1,517
3,568
Yeaa, the LORA might have been a dud. The 1:1 images are superb, but the 1:1.5 ones have a 99% failure rate, producing WTFs. Might be that I haven't found the settings yet, so imma experiment for a bit longer.

But I gotta say, this LORA nailed, nailed I am telling you, the 512x512 portraits. With a prompt like this:
[Spoiler: prompt not captured]
you get these lookers (same seed 592805119372563 but different models):

a_16729_.png a_16730_.png a_16731_.png a_16732_.png a_16733_.png a_16734_.png a_16735_.png a_16736_.png a_16737_.png
 
  • Like
Reactions: Mr-Fox

Sepheyer

Well-Known Member
Dec 21, 2020
1,517
3,568
I am trying to wrap my head around what went wrong with the LORA training. The 512x512 images are great, the 1024x1024 are aight. 1024x1024:

a_16864_.png

The tall images are trash.

I wonder what went wrong. Now, my other LORA, Sina's, was trained at nearly the same settings as Lara's (too bad I was a moron and forgot to save the Kohya settings there). I think the major difference is that for Lara I picked ~150 images to train on (every 512x512 crop in the original post of this thread) vs ~50 images for Sina. Another change I made was setting Lara's token length to 225 instead of the 75 used for Sina. And finally, while both had their images automatically captioned via WD1.4, Lara's captions had the following tags excluded: "1girl, solo, realistic".

Now, I reiterate that both LORAs were trained on 512x512, and yet Sina's can do very tall images no problem. You can see how the LORA captures not only the character it was trained on but also the general details of the HoneySelect2 models that generated Sina in the first place:
[Spoiler: content not captured]
So, yea, see? The LORA here was trained exclusively on 512x512 and it has no trouble doing long-ass images. Also, Sina's LORA is rather crap when it needs to render 512x512 but delivers on anything larger than that. So, something went wrong, and Lara's LORA is the inverse, in usability, of Sina's. Too many images? Welp, Wildeer's training set turned out to be less diverse than I thought it would be: the backgrounds are mostly dark gradients with lots of ... (FML what's the name of that gfx light effect, sounds like "bleech"). So yea, I thought having ~160 images would be a pro, not a con. Was the learning rate too high for that many images? Should I halve the rate next time?

So, as I am debating, I think the next reasonable step is to change the settings to the exact ones that Sina was trained on and to lower the learning rate by half. Naturally, I am open to ideas, as imma sit on my hands for the next few days. On the other hand, I hate the idea of running the training for ~40 hours. Maybe I'll do a 20-image training to see how well the merge works. Anything to avoid that ~40 hour-long training PTSD.
 
  • Love
Reactions: Mr-Fox

Mr-Fox

Well-Known Member
Jan 24, 2020
1,400
3,786
I have seen the same as you. You've got to use a low-ish weight of 0.4, and with clipskip 2 and some prompting you can get something half decent.

00093-3173840622.png 00095-360379744.png

How many images did you use for the training? I think you need to be more selective about which images you include in the dataset.
Don't use images with multiple subjects, such as animals. Don't use images with overly complicated poses, strange facial expressions, or hands covering the face. If she is holding a gun or another item, it needs to be tagged in the captions. In fact, everything in the image needs to be described and tagged in the captions, so that it has to be named in the prompt to show up when generating images. This is how you get a controllable Lora; otherwise things get baked into it and are then very difficult to get rid of with prompting. A dataset that is too large is also something to avoid.
It's recommended to use 35-100 images for a character Lora. One epoch is not going to be enough; I used 3 epochs for my Lora. Set it to 5 and have it save a Lora each epoch so you can test which one is best. Most likely 2-3 epochs is going to be good.
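
To get a feel for how long such a run is before starting it, a rough step count helps. This is only a sketch, assuming the common convention that one epoch covers every image times its repeat count, split into batches. The repeat count and batch size below are hypothetical values, not settings from this thread, and `total_steps` is a made-up helper name.

```python
import math


def total_steps(num_images: int, repeats: int, epochs: int, batch_size: int = 1) -> int:
    """Rough optimizer step count: images x repeats per epoch, divided into batches."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs


# Hypothetical example: 50 images, 10 repeats, 5 epochs, batch size 2.
print(total_steps(50, 10, 5, batch_size=2))
```

Saving a Lora after every epoch then gives one checkpoint per `steps_per_epoch` steps to compare against each other.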

Amount of images for character Lora.png
 
  • Like
Reactions: Sepheyer

Sepheyer

Well-Known Member
Dec 21, 2020
1,517
3,568
I used all of the images from the 512x512 crops linked in the original post. Indeed, I read the Rentry guide, tested it out a bit, probably messed up something somewhere, and decided to skip that advice entirely and instead build on top of what already worked. Given how well the 512x512 training captured Wildeer's Lara, I am convinced the process is on the right track and it is a matter of one or two settings somewhere. And yes, I do like the idea of adding more epochs and lowering the learning rate.
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,400
3,786
Did you use the same model for the training? Which one did you use? If the model has flaws, they are likely to be inherited by the Lora. Either use the base model or pick the most consistent one you can find that is also responsive. I used elegance for Kendra. I tried clarity, experience and deliberate; elegance worked best in my case.
A really slow learning rate is the ticket. The rentry guide also talks about ways to dampen the training, such as adding a very small amount of noise offset. I used 0.1 I think; it should be in the settings I posted.
Noise offset: "Increases dynamic range (brighter brights, darker darks). May "deep fry" if set too high."
Use AdamW8bit if you can.

Learn Dampening.png
 
  • Red Heart
Reactions: Sepheyer

Sepheyer

Well-Known Member
Dec 21, 2020
1,517
3,568
Good call! It completely escaped me that Lara was off Zovya's Photoreal 2, while Sina was off Photogen 3.4. Duh. Good call, thanks!!!
 
  • Like
Reactions: Mr-Fox

me3

Member
Dec 31, 2016
316
708
Just to throw you a curve ball about applying "logic" to training: as you pointed out, my initial training kept the leotard. To try and fix another issue I deleted one image from my set, a nude one; everything else was unchanged. Images from that training had no leotard, and it was pretty hard to get it back with prompting. Judging what impact something has can be a nightmare, so while it might seem you're on the right track, it might just be a nicely made dead end.
Currently sitting with >300 deleted loras from trainings on this, I can tell you there are a lot of nicely paved glittering roads with hard endings... and a lot of GPU-generated heating :p
 
  • Haha
  • Like
Reactions: Sepheyer and Mr-Fox

Mr-Fox

Well-Known Member
Jan 24, 2020
1,400
3,786
wow dude.. 300:eek: I guess that's also a way of heating your house.. :LOL:
 
  • Like
Reactions: Sepheyer

Sepheyer

Well-Known Member
Dec 21, 2020
1,517
3,568
So, using WD1.4 for tagging. There's a threshold setting for items and for characters.

Let's see how it treats this photo using different settings of General threshold (Adjust `general_threshold` for pruning tags (less tags, less flexible)) and Character threshold (useful if you want to train with character).

000001.png

SmilingWolf/wd-v1-4-convnextv2-tagger-v2

General threshold | Character threshold | Tag
0.00 | 1.00 | Too much to be useful: see file 00-10.txt
0.25 | 0.75 | WildeerLaraCroft, 1girl, solo, long hair, breasts, looking at viewer, simple background, brown hair, black hair, gloves, bare shoulders, brown eyes, ponytail, ass, parted lips, black gloves, looking back, fingerless gloves, from behind, nail polish, leotard, lips, shiny skin, thigh strap, blue background, bent over, realistic, kneepits, hand on own ass, mole on ass
0.50 | 0.50 | WildeerLaraCroft, 1girl, solo, long hair, looking at viewer, brown hair, gloves, brown eyes, ponytail, ass, parted lips, looking back, fingerless gloves, blue background, bent over, realistic
0.75 | 0.25 | WildeerLaraCroft, 1girl, solo, long hair, looking at viewer, brown hair, brown eyes, ass, fingerless gloves
1.00 | 0.00 | Too much to be useful: see file 10-00.txt
---------
0.05 | 0.05 | WildeerLaraCroft, 1girl, solo, long hair, breasts, looking at viewer, blush, smile, open mouth, bangs, large breasts, simple background, brown hair, shirt, black hair, gloves, bare shoulders, brown eyes, medium breasts, underwear, standing, panties, ponytail, ass, thighs, parted lips, teeth, sleeveless, black gloves, shiny, looking back, artist name, fingerless gloves, from behind, nail polish, mole, leotard, lips, fingernails, head tilt, gradient, legs, one-piece swimsuit, shiny skin, parted bangs, bare arms, gradient background, tattoo, thigh strap, leaning forward, anus, feet out of frame, cameltoe, watermark, blue background, bent over, tan, blue nails, tanlines, thong, ass grab, realistic, nose, ass focus, partially visible vulva, kneepits, blue leotard, hand on own ass, grabbing own ass, spread ass, anus peek, spanked, mole on ass, slap mark, hands on ass, hands on own ass
0.10 | 0.10 | WildeerLaraCroft, 1girl, solo, long hair, breasts, looking at viewer, bangs, simple background, brown hair, black hair, gloves, bare shoulders, brown eyes, medium breasts, underwear, standing, panties, ponytail, ass, thighs, parted lips, sleeveless, black gloves, shiny, looking back, fingerless gloves, from behind, nail polish, leotard, lips, fingernails, head tilt, shiny skin, tattoo, thigh strap, leaning forward, feet out of frame, cameltoe, blue background, bent over, tan, ass grab, realistic, nose, ass focus, kneepits, hand on own ass, grabbing own ass, spanked, mole on ass, slap mark
0.90 | 0.90 | WildeerLaraCroft, 1girl, solo, ass
0.95 | 0.95 | WildeerLaraCroft, 1girl, solo

So, around 0.10 the tagging is sensitive enough to pick up the slap mark and the leotard, although the threshold has to go down to 0.05 before it picks up the color of the leotard.

PS. Naturally, this is not definitive, just a directional test.
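
For the curious, the thresholding itself boils down to a simple cut-off. Here is a minimal Python sketch, assuming the tagger gives each tag a confidence between 0 and 1 and keeps those at or above the threshold. The scores below are made up for illustration (they are not real tagger output), and `filter_tags` is a hypothetical helper, not the tagger's API.

```python
def filter_tags(scores: dict, threshold: float) -> list:
    """Keep the tags whose confidence is at or above the threshold, in input order."""
    return [tag for tag, score in scores.items() if score >= threshold]


# Made-up confidences, chosen only to mimic the pattern in the table above.
scores = {
    "1girl": 0.99,
    "ass": 0.92,
    "leotard": 0.30,
    "slap mark": 0.12,
    "blue leotard": 0.07,
}

for threshold in (0.95, 0.10, 0.05):
    print(threshold, filter_tags(scores, threshold))
```

Lowering the threshold only ever adds tags, which is why the 0.05 row is a superset of the 0.10 row.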
 
  • Like
Reactions: Mr-Fox