⬅️ CoreNet: A library for training deep neural networks

gbickford 11 daysReload

> Relationship with CVNets

> CoreNet evolved from CVNets, to encompass a broader range of applications beyond computer vision. Its expansion facilitated the training of foundational models, including LLMs.

We can expect it to have grown from here: https://apple.github.io/ml-cvnets/index.html

It looks like a mid-level implementations of training and inference. You can see in their "default_trainer.py"[1] that the engine uses Tensors from torch but implements its own training method. They implement their own LR scheduler and optimizer; the caller can optionally use Adam from torch.

It's an interesting (maybe very Apple) choice to build from the ground up instead of partnering with existing frameworks to provide first class support in them.

The MLX examples seem to be inference only at this point. It does look like this might be a landing ground for more MLX specific implementations: e.g. https://github.com/apple/corenet/blob/5b50eca42bc97f6146b812...

It will be interesting to see how it tracks over the next year; especially with their recent acquisitions:

Datakalab https://news.ycombinator.com/item?id=40114350

DarwinAI https://news.ycombinator.com/item?id=39709835

1: https://github.com/apple/corenet/blob/main/corenet/engine/de...

ipsum2 11 daysReload

It's interesting that Apple also actively develops https://github.com/apple/axlearn, which is a library on top of Jax. Seems like half the ML teams at Apple use PyTorch, and the other half uses Jax. Maybe its split between Google Cloud and AWS?

coder543 11 daysReload

They also mention in the README:

> CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data

This is the first I’m hearing of that, and the link seems broken.

mxwsn 11 daysReload

Built on top of pytorch.

leodriesch 11 daysReload

How does this compare to MLX? As far as I understand MLX is equivalent to PyTorch but optimized for Apple Silicon.

Is this meant for training MLX models in a distributed manner? Or what is its purpose?