Immediately after switching pages, it works with CSR. Reload your browser to see how it works.
Another common constraint in vision vs. language is that the long tail is very long in the visual world. There are a number of domains where you have very few examples to learn from (defects are designed to happen infrequently; rare species for identification show up, well, rarely). And pulling from the blog: "But small models ... benefit greatly from the exact type of experiment outlined in this post: strong augmentation with limited data trained across many epochs."
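To make the "strong augmentation with limited data" idea concrete, here's a minimal sketch of generating many distinct training views from one image. This is purely illustrative (random flip + random crop in NumPy, not the blog's exact recipe, and real pipelines would use something like torchvision transforms):

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img, crop=24):
    """One augmentation pass: random horizontal flip, then a random crop.
    `img` is an HxWxC array. Illustrative only."""
    if rng.random() < 0.5:
        img = img[:, ::-1]  # horizontal flip
    h, w = img.shape[:2]
    top = rng.integers(0, h - crop + 1)
    left = rng.integers(0, w - crop + 1)
    return img[top:top + crop, left:left + crop]

# One scarce example...
img = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
# ...seen across many epochs becomes many distinct views.
views = [augment(img) for _ in range(4)]
print([v.shape for v in views])
```

The point is that each epoch sees a different view of the same scarce example, which is why many epochs over limited data can still help a small model.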
One thing you could do is add semantic search, so that when a user searches for "red shoes," the index returns images that look like red shoes even if the metadata says nothing about color or item type. To do this, I'd use a model like CLIP. Here's an example of using CLIP and Supabase to do semantic image search: https://blog.roboflow.com/how-to-use-semantic-search-supabas...
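The core retrieval step is just cosine similarity between a text embedding and image embeddings in a shared space. Here's a minimal sketch of that ranking step, assuming you already have CLIP embeddings precomputed (the toy vectors below are hypothetical stand-ins; in practice you'd get them from a CLIP model and store them in something like Supabase/pgvector):

```python
import numpy as np

def cosine_rank(query_emb, image_embs):
    """Rank image embeddings by cosine similarity to a query embedding.
    Returns (indices most-similar-first, sorted similarities)."""
    q = query_emb / np.linalg.norm(query_emb)
    imgs = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    sims = imgs @ q
    order = np.argsort(-sims)
    return order, sims[order]

# Hypothetical embeddings standing in for CLIP outputs.
image_embs = np.array([
    [0.9, 0.1, 0.0],  # image of red shoes
    [0.1, 0.9, 0.0],  # image of a blue hat
    [0.8, 0.2, 0.1],  # image of red sneakers
])
query = np.array([1.0, 0.0, 0.0])  # embedding of the text "red shoes"

order, sims = cosine_rank(query, image_embs)
print(order.tolist())  # → [0, 2, 1]: both red-footwear images outrank the hat
```

Because text and images live in the same embedding space, the query matches on visual content rather than metadata, which is exactly what makes "red shoes" work without a color tag.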